- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: 4.4.9
- Component/s: None
- None
- Replication
- ALL
- Repl 2022-05-16, Repl 2022-05-30, Repl 2022-06-13, Repl 2022-06-27, Repl 2022-07-11, Repl 2022-07-25, Repl 2022-08-08, Repl 2022-08-22, Repl 2022-09-05, Repl 2022-09-19, Repl 2022-10-03
It looks like MongoNotPrimaryException (or whatever the protocol response is that triggers this error in the Java driver) might actually be an indefinite error, rather than a definite failure. Consider this pair of operations from a Jepsen list-append test:
{:type :fail, :f :txn, :value [[:append 855 3]], :time 36272337272, :process 36, :error :not-primary, :index 56335}
{:type :ok, :f :txn, :value [[:r 855 [3]]], :time 38283284542, :process 42, :index 57897}
In this case both "transactions" are actually single-document operations. The first operation performs a single findAndModify to $push the number 3 onto a list in document 855; that write threw a MongoNotPrimaryException. The second is a read of document 855, which observed that write of 3.
The documentation for MongoNotPrimaryException says that the server "refused to execute... a write operation", which seems fairly plain: the write of 3 must not have happened. Since we go on to read 3, this looks like an aborted read.
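The reasoning above can be sketched as a small check over a history. This is only an illustration, not Jepsen's actual checker: the `Op` class, its field names, and `abortedReads` are hypothetical, and the sketch assumes each value is appended to a given key at most once, as in a list-append test.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AbortedReadCheck {
    /** One history entry, reduced to the fields this check needs (hypothetical shape). */
    public static final class Op {
        public final String type;          // "ok", "fail", or "info"
        public final String f;             // "append" or "r"
        public final int key;              // document id, e.g. 855
        public final List<Integer> values; // appended value, or the list that was read
        public final long index;           // position in the history

        public Op(String type, String f, int key, List<Integer> values, long index) {
            this.type = type;
            this.f = f;
            this.key = key;
            this.values = values;
            this.index = index;
        }
    }

    /** Returns one message per value that a successful read saw from a failed append. */
    public static List<String> abortedReads(List<Op> history) {
        // Collect "key:value" pairs whose append was reported as a definite failure.
        Set<String> failedAppends = new HashSet<>();
        for (Op op : history) {
            if (op.type.equals("fail") && op.f.equals("append")) {
                for (int v : op.values) {
                    failedAppends.add(op.key + ":" + v);
                }
            }
        }
        // Any successful read that contains such a value contradicts the failure report.
        List<String> anomalies = new ArrayList<>();
        for (Op op : history) {
            if (op.type.equals("ok") && op.f.equals("r")) {
                for (int v : op.values) {
                    if (failedAppends.contains(op.key + ":" + v)) {
                        anomalies.add("read at index " + op.index + " observed failed append of "
                                + v + " to key " + op.key);
                    }
                }
            }
        }
        return anomalies;
    }

    public static void main(String[] args) {
        // The two operations from this report, reduced to the hypothetical Op shape.
        List<Op> history = List.of(
                new Op("fail", "append", 855, List.of(3), 56335),
                new Op("ok", "r", 855, List.of(3), 57897));
        abortedReads(history).forEach(System.out::println);
    }
}
```

If the failed append were instead recorded as indeterminate (type "info"), the same history would produce no anomaly, which is why the definite/indefinite distinction matters here.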
This problem occurs with MongoDB 4.4.9 and Java driver 4.6.0, write concern majority, read concern snapshot/majority, and is reproducible using network partitions.
MongoWriteConcernWithResponseException with a message containing "InterruptedDueToReplStateChange" may do the same thing, but I'm less sure whether this error should be interpreted as definite or not.
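Until the definite/indefinite question is settled, a test client could take the conservative route and record every thrown write error as indeterminate unless it carries explicit evidence that nothing was written. The sketch below is one way to express that, not the driver's behavior: `OutcomeClassifier`, `Outcome`, and `classify` are hypothetical names, the label string is taken from the title of the linked DRIVERS-2327, and whether a given server or driver version actually attaches that label is an assumption.

```java
import java.util.Set;

public class OutcomeClassifier {
    /** How a client would record an operation in its history. */
    public enum Outcome { OK, FAIL, INFO }

    // Label name taken from the linked DRIVERS-2327 title; its presence on any
    // particular error is an assumption of this sketch.
    static final String NO_WRITES_PERFORMED = "NoWritesPerformed";

    /**
     * Classifies a write attempt.
     *
     * @param threw       whether the write threw an exception
     * @param errorLabels labels attached to the error (empty if none)
     */
    public static Outcome classify(boolean threw, Set<String> errorLabels) {
        if (!threw) {
            return Outcome.OK;
        }
        // Only an explicit "no writes performed" signal counts as a definite failure;
        // everything else might have taken effect and is recorded as indeterminate.
        return errorLabels.contains(NO_WRITES_PERFORMED) ? Outcome.FAIL : Outcome.INFO;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, Set.of()));
        System.out.println(classify(true, Set.of()));
        System.out.println(classify(true, Set.of(NO_WRITES_PERFORMED)));
    }
}
```

Under this policy a not-primary error without the label would be recorded as indeterminate, so a later read observing the write would not count as an anomaly.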
- depends on
  - DRIVERS-2327 Propagate Original Error for Write Errors Labeled NoWritesPerformed (Implementing)
  - SERVER-66479 Create an error label indicating if a retryable error is "definite". (Closed)