-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Storage Execution
-
Execution Team 2022-02-21, Execution Team 2022-03-07
-
20
To guarantee cache eviction progress, WiredTiger requires MongoDB to use one WT_SESSION per thread. This is because a transaction can only be rolled back for eviction when an API call is made into its session. When each thread has only one session, WiredTiger can guarantee forward eviction progress without blocking: every operation eventually makes an API call, at which point it can be rolled back if it is blocking eviction.
Using more than one session per thread risks the following deadlock:
- Operation writes using session S1
- Operation reads using session S2
- S2 is blocked on cache eviction. S1 holds the oldest transaction that is pinning content, and therefore needs to be rolled back. Because the operation is blocked in S2 and is not making calls into S1, the rollback is never delivered and we reach a deadlock.
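The scenario above can be sketched as a toy model. This is not WiredTiger or MongoDB code: `ModelSession`, `threadIsBlocked`, and `hasEvictionDeadlock` are hypothetical names, and the model assumes only what the description states, namely that a pinning transaction can be rolled back only when its owning thread makes an API call into that session.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a session, for illustration only.
struct ModelSession {
    int ownerThread;         // thread that owns this session
    bool pinsContent;        // holds a transaction pinning cache content (S1)
    bool blockedOnEviction;  // currently waiting inside an API call (S2)
};

// A thread that is blocked inside one session cannot call into its others.
bool threadIsBlocked(const std::vector<ModelSession>& sessions, int thread) {
    for (const auto& s : sessions) {
        if (s.ownerThread == thread && s.blockedOnEviction)
            return true;
    }
    return false;
}

// Returns true if some pinning session can never be rolled back: it is not
// inside an API call itself, and its owning thread is stuck in another session.
bool hasEvictionDeadlock(const std::vector<ModelSession>& sessions) {
    for (const auto& s : sessions) {
        if (s.pinsContent && !s.blockedOnEviction &&
            threadIsBlocked(sessions, s.ownerThread))
            return true;
    }
    return false;
}

// Replays the list above: one thread, S1 pins content, S2 waits on eviction.
void checkTicketScenario() {
    std::vector<ModelSession> sessions{{1, true, false}, {1, false, true}};
    assert(hasEvictionDeadlock(sessions));
}
```

In this model a thread with a single session never deadlocks, because the session that pins content is the same one that is inside the API call and can therefore be rolled back there.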
Note that this is only a problem when a read-only session is used while also holding onto a session that has performed writes. This is not problematic with two writing sessions, two reading sessions, or when a writing session holds onto a read-only session.
Edit: my previous claim that this is only a problem with read-only sessions is incorrect. Every session that wishes to write must first open a cursor, which involves a cache eviction check. So the deadlock scenario is still possible.
We should audit the code and add assertions that an operation in a WriteUnitOfWork (i.e. a write transaction) cannot open any new sessions. For cases where this happens, we should either find a way to stop it, or add the "cache_max_wait_ms" option to allow the operation to time out.
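One possible shape for such an assertion is sketched below. It is a standalone illustration, not the server's implementation: `SessionGuard` and all of its methods are hypothetical names, and the sketch assumes one guard instance per thread that is notified when a write unit of work begins or ends and when a session is opened or closed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical per-thread bookkeeping; none of these names exist in MongoDB.
class SessionGuard {
public:
    void beginWriteUnitOfWork() { _inWriteUnitOfWork = true; }
    void endWriteUnitOfWork() { _inWriteUnitOfWork = false; }

    // Call at every site that would open a new session for this thread.
    void onOpenSession() {
        if (_inWriteUnitOfWork) {
            // The operation already holds a writing session; opening another
            // one risks the eviction deadlock described in this ticket.
            throw std::logic_error(
                "cannot open a new session inside a write unit of work");
        }
        ++_openSessions;
    }

    void onCloseSession() {
        assert(_openSessions > 0);
        --_openSessions;
    }

    int openSessions() const { return _openSessions; }

private:
    bool _inWriteUnitOfWork = false;
    int _openSessions = 0;
};
```

Throwing an exception keeps the sketch testable; a real audit would more likely use the server's own invariant or assertion facility, and call sites that cannot be changed would fall back to the "cache_max_wait_ms" timeout instead.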
- is related to
-
SERVER-67514 SizeStorer load() can get stuck in page eviction
- Closed
-
SERVER-61097 SizeStorer can cause deadlocks with cache eviction
- Closed
-
SERVER-62650 RecordStore RecordId initialization can deadlock transactions with cache eviction
- Closed
-
WT-9035 Asynchronously roll back transactions due to cache pressure
- Closed
-
WT-7203 Add WT diagnostic mode test for conflicting session use by a thread
- Backlog
-
WT-8864 Document operations that can timeout due to cache_max_wait_ms
- Closed
- related to
-
WT-8245 Fix eviction hang during importCollection
- Closed
-
WT-9330 Add observability on the last thread that accessed a session
- Open
-
SERVER-64856 Explore reusing the caller's WT_SESSION in getLatestOplogTimestamp
- Closed