- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Server Security
- ALL
- 4
This error surfaced while testing FLE state collections residing on arbitrary data shards, but it looks like a generic problem: batched command responses that contain write errors never serialize error labels, even though they are able to deserialize them.
Scenario:
- Create the FLE collection on the primary shard and the FLE state collections on a different shard (this causes FLE CRUD operations to be executed in distributed transactions targeting 2 different shards)
- Issue a write within a transaction on the encrypted collection
- A StaleConfig error is returned. Despite being a transient transaction error, the client will not retry the transaction because the TransientTransactionError label is not attached to the reply.
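To illustrate why the missing label matters, here is a minimal sketch of a label-driven retry loop. It assumes the client retries a transaction only when the reply carries the TransientTransactionError label; `CommandError`, `is_transient` and `run_with_retry` are hypothetical names used for illustration, not a real driver API.

```python
TRANSIENT_LABEL = "TransientTransactionError"


class CommandError(Exception):
    """Hypothetical error type that carries the raw server reply."""

    def __init__(self, reply):
        super().__init__(reply.get("errmsg", "command failed"))
        self.reply = reply


def is_transient(reply):
    # Assumption: the retry decision looks only at the label, not the code.
    return TRANSIENT_LABEL in reply.get("errorLabels", [])


def run_with_retry(txn_body, max_attempts=3):
    """Run txn_body, retrying the whole transaction on transient errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body()
        except CommandError as err:
            # Without the label the error is surfaced on the first attempt.
            if not is_transient(err.reply) or attempt == max_attempts:
                raise
```

Under this model, a StaleConfig reply without `errorLabels` is raised to the caller immediately, which matches the behaviour described in the scenario.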
Attaching a reproducer to apply on top of commit 2f3dd3bb7d; run any test with FLE transactions to verify that the reply is missing "errorLabels":["TransientTransactionError"].
For example, executing this:
./buildscripts/resmoke.py run --suite fle2_sharding --storageEngine=wiredTiger --jobs=1 --storageEngineCacheSizeGB=0.5 --dbpath=/tmp/testpath --runAllFeatureFlagTests src/mongo/db/modules/enterprise/jstests/fle2/txn_insert.js
This is the returned error:
{ "nInserted" : 0, "writeError" : { "code" : 13388, "errmsg" : "Transaction 92f84da6-dbee-42b5-8df7-362955a88b51 - 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= - - :0 was aborted on statement 2 due to: an error from cluster data placement change :: caused by :: Encountered error from localhost:20001 during a transaction :: caused by :: sharding status of collection txn_insert.enxcol_.basic.ecoc is not currently known and needs to be recovered" } }
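A reply like the one above can be checked mechanically. The sketch below is only an illustration: `label_missing` is a hypothetical helper, the `errmsg` is shortened for readability, and it assumes the label would appear as a top-level `errorLabels` array in the reply.

```python
import json

STALE_CONFIG = 13388  # error code shown in the reply above
TRANSIENT_LABEL = "TransientTransactionError"


def label_missing(reply):
    """True if the reply has a StaleConfig write error but no transient label."""
    write_error = reply.get("writeError") or {}
    has_label = TRANSIENT_LABEL in reply.get("errorLabels", [])
    return write_error.get("code") == STALE_CONFIG and not has_label


# errmsg shortened here; the code and shape follow the reply above.
observed = json.loads(
    '{"nInserted": 0, "writeError": {"code": 13388, '
    '"errmsg": "was aborted on statement 2"}}'
)
print(label_missing(observed))
```

A reply that did include `"errorLabels":["TransientTransactionError"]` would make the helper return `False`.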
- is related to SERVER-89931 StaleConfig error is not retried as a first statement in a txn with FLE sharded collections (Closed)