- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- None
- Affects Version/s: 4.5.1, 4.4.0-rc10
- Component/s: Index Maintenance
- None
- Storage Execution
- ALL
- Execution Team 2021-07-26
- 18
This is a very unlikely bug that has only been reproduced by building an index on a tiny capped collection (maxSize=1) under a high number of concurrent inserts.
Update: This was also observed in SERVER-56062 on collections that were not trivially small.
If an index build's collection scan resumes after yielding and cannot restore its cursor because the saved position was deleted, the index build crashes at this invariant with a CappedPositionLost error.
Example:
Invariant failure","attr":{"expr":"status.isA<ErrorCategory::Interruption>() || status.isA<ErrorCategory::ShutdownError>()","msg":"Unnexpected error code during index build cleanup: CappedPositionLost: CollectionScan died due to position in capped collection being deleted. Last seen record id: RecordId(1)
It wouldn't be a complete solution to just abort the index build, because a secondary could hit this error independently of a primary and still crash.
I think we can safely restart the collection scan if we hit a CappedPositionLost error. While this poses a liveness issue (a scan that keeps losing its position could in principle restart repeatedly), I think the circumstances of hitting this bug are extreme enough to warrant this solution.
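To illustrate the proposal, here is a minimal toy sketch in Python of restarting a scan after the saved position is lost. It is not server code: `CappedCollection`, `build_index`, `make_writer`, the `on_yield` hook, and the `max_restarts` bound are all hypothetical names made up for this example, and the bound on restarts is an assumption added here to make the liveness concern visible rather than something this ticket specifies.

```python
from collections import OrderedDict


class CappedPositionLost(Exception):
    """Hypothetical stand-in for the server's CappedPositionLost error."""


class CappedCollection:
    """Toy capped collection: keeps at most max_docs records, evicting the oldest."""

    def __init__(self, max_docs):
        self.max_docs = max_docs
        self.records = OrderedDict()  # record id -> document, in insertion order
        self.next_id = 1

    def insert(self, doc):
        self.records[self.next_id] = doc
        self.next_id += 1
        while len(self.records) > self.max_docs:
            self.records.popitem(last=False)  # evict the oldest record

    def scan(self, on_yield=None):
        """Yield (record_id, doc) in order; fail if the saved position was evicted."""
        last_seen = None
        while True:
            if on_yield is not None:
                on_yield()  # simulate a yield during which writers may run
            # "Restore" the cursor: the last seen record must still exist.
            if last_seen is not None and last_seen not in self.records:
                raise CappedPositionLost(f"last seen record id: {last_seen}")
            later = [rid for rid in self.records
                     if last_seen is None or rid > last_seen]
            if not later:
                return
            last_seen = later[0]
            yield last_seen, self.records[last_seen]


def build_index(coll, key, on_yield=None, max_restarts=10):
    """Build {key value: [record ids]}, restarting the scan on CappedPositionLost."""
    for restarts in range(max_restarts + 1):
        index = {}  # discard any partial result from a failed attempt
        try:
            for rid, doc in coll.scan(on_yield):
                index.setdefault(doc[key], []).append(rid)
            return index, restarts
        except CappedPositionLost:
            continue  # restart the collection scan from the beginning
    raise RuntimeError("index build kept losing its capped position")


def make_writer(coll, count):
    """Return a hook that inserts one document per call, for the first `count` calls."""
    remaining = [count]

    def writer():
        if remaining[0] > 0:
            remaining[0] -= 1
            coll.insert({"x": coll.next_id})

    return writer


coll = CappedCollection(max_docs=1)
coll.insert({"x": 1})
index, restarts = build_index(coll, "x", on_yield=make_writer(coll, 3))
print(index, restarts)
```

In this toy model the first attempt loses its position once concurrent inserts evict the record it last saw, and the restarted scan completes after the writer goes quiet. If the writer never goes quiet, the loop only ends because of the assumed `max_restarts` bound, which is the liveness trade-off mentioned above.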
- duplicates
- SERVER-56062 Restart index builds after CappedPositionLost errors (Closed)