- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: Sharding
- Fully Compatible
- ALL
- v4.4
- Sharding 2020-04-06
- 15
A change in SERVER-45779 (server version 4.5.0-34-g45341f9) is causing one of Python's transactions documentation tests to fail on sharded clusters with the following error (PYTHON-2154):
"Transaction 58ec29bc-f257-43ce-81ed-2db10a7da770:3 was aborted on statement 0 due to: an error from cluster data placement change :: caused by :: Encountered error from localhost:27217 during a transaction :: caused by :: sharding status of collection hr.employees is not currently available for filtering and needs to be recovered from the config server"
Here's a link to the source of the failing "update_employee_info" transaction test: https://github.com/mongodb/mongo-python-driver/blob/1fdfb9864b3c0614f0f46add0bb7a25ebb8d2e77/test/test_examples.py#L902
The failure happens on the first command in the transaction on line 875.
The test does not shard the collection or enable sharding on the database, so hr.employees is an unsharded collection.
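For context, here is a minimal sketch of the transaction pattern the test exercises, reconstructed from the MongoDB transactions documentation example that the linked test mirrors; the connection string, seed documents, and employee value are assumptions, and the exact code is in the linked test source above.
{code:python}
# Minimal sketch of the failing pattern, assuming a mongos at localhost:27017.
# Neither enableSharding nor shardCollection is run, so hr.employees stays an
# unsharded collection on the hr database's primary shard.
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern
from pymongo.read_preferences import ReadPreference
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")  # assumed mongos address
employees = client.hr.employees
events = client.reporting.events

# Create the collections outside the transaction (they must already exist).
employees.insert_one({"employee": 3, "status": "Active"})
events.insert_one({"employee": 3, "status": {"new": "Active", "old": None}})

with client.start_session() as session:
    with session.start_transaction(
        read_concern=ReadConcern("snapshot"),
        write_concern=WriteConcern(w="majority"),
        read_preference=ReadPreference.PRIMARY,
    ):
        # First statement of the transaction; this is where the
        # "sharding status ... not currently available" abort is reported.
        employees.update_one(
            {"employee": 3}, {"$set": {"status": "Inactive"}}, session=session
        )
        events.insert_one(
            {"employee": 3, "status": {"new": "Inactive", "old": "Active"}},
            session=session,
        )
{code}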
Here are the mongos logs for the failing test:
2020-03-05T21:58:17.855+0000 I NETWORK [listener] connection accepted from 127.0.0.1:42732 #685 (19 connections now open)
2020-03-05T21:58:17.857+0000 I NETWORK [conn685] received client metadata from 127.0.0.1:42732 conn685: {"driver":{"name":"PyMongo","version":"3.11.0.dev1"},"os":{"type":"Linux","name":"Red Hat Enterprise Linux Server 6.2 Santiago","architecture":"x86_64","version":"2.6.32-220.el6.x86_64"},"platform":"CPython 2.7.17.final.0"}
2020-03-05T21:58:17.860+0000 I NETWORK [listener] connection accepted from 127.0.0.1:42733 #686 (20 connections now open)
2020-03-05T21:58:17.861+0000 I NETWORK [conn686] received client metadata from 127.0.0.1:42733 conn686: {"driver":{"name":"PyMongo","version":"3.11.0.dev1"},"os":{"type":"Linux","name":"Red Hat Enterprise Linux Server 6.2 Santiago","architecture":"x86_64","version":"2.6.32-220.el6.x86_64"},"platform":"CPython 2.7.17.final.0"}
2020-03-05T21:58:17.862+0000 I NETWORK [ConfigServerCatalogCacheLoader-0] ignoring minOpTime for { mode: "nearest" }
2020-03-05T21:58:17.863+0000 I NETWORK [ConfigServerCatalogCacheLoader-0] ignoring minOpTime for { mode: "primary" }
2020-03-05T21:58:17.864+0000 I SH_REFR [ConfigServerCatalogCacheLoader-0] Refresh for database hr took 1 ms and failed :: caused by :: NamespaceNotFound: database hr not found
2020-03-05T21:58:17.864+0000 I NETWORK [conn686] ignoring minOpTime for { mode: "primary" }
2020-03-05T21:58:17.875+0000 I NETWORK [ConfigServerCatalogCacheLoader-0] ignoring minOpTime for { mode: "nearest" }
2020-03-05T21:58:17.876+0000 I SH_REFR [ConfigServerCatalogCacheLoader-0] Refresh for database hr from version {} to version {"uuid":{"$uuid":"4f3f02c9-58eb-41fa-978c-8c8f78cb99c8"},"lastMod":1} took 0 ms
2020-03-05T21:58:17.876+0000 I NETWORK [conn686] ignoring minOpTime for { mode: "nearest" }
2020-03-05T21:58:17.890+0000 I NETWORK [ConfigServerCatalogCacheLoader-0] ignoring minOpTime for { mode: "nearest" }
2020-03-05T21:58:17.891+0000 I NETWORK [ConfigServerCatalogCacheLoader-0] ignoring minOpTime for { mode: "primary" }
2020-03-05T21:58:17.892+0000 I SH_REFR [ConfigServerCatalogCacheLoader-0] Refresh for database reporting took 1 ms and failed :: caused by :: NamespaceNotFound: database reporting not found
2020-03-05T21:58:17.892+0000 I NETWORK [conn686] ignoring minOpTime for { mode: "primary" }
2020-03-05T21:58:17.904+0000 I NETWORK [ConfigServerCatalogCacheLoader-0] ignoring minOpTime for { mode: "nearest" }
2020-03-05T21:58:17.905+0000 I SH_REFR [ConfigServerCatalogCacheLoader-0] Refresh for database reporting from version {} to version {"uuid":{"$uuid":"89d001c7-f571-47cf-885e-5e7e4d682360"},"lastMod":1} took 0 ms
2020-03-05T21:58:17.905+0000 I NETWORK [conn686] ignoring minOpTime for { mode: "nearest" }
2020-03-05T21:58:17.933+0000 I NETWORK [conn686] ignoring minOpTime for { mode: "primary" }
2020-03-05T21:58:18.039+0000 I - [conn686] slow query{"type":"command","ns":"reporting.$cmd","command":{"dropDatabase":1,"writeConcern":{"w":"majority"},"lsid":{"id":{"$uuid":"05f6af3a-f73b-472e-8924-1517ee7094f4"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1583445497,"i":167}},"signature":{"keyId":0,"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}}}},"$db":"reporting","$readPreference":{"mode":"primary"}},"numYields":0,"reslen":186,"protocol":"op_msg","durationMillis":106}
2020-03-05T21:58:18.041+0000 I NETWORK [conn686] ignoring minOpTime for { mode: "primary" }
2020-03-05T21:58:18.142+0000 I - [conn686] slow query{"type":"command","ns":"hr.$cmd","command":{"dropDatabase":1,"writeConcern":{"w":"majority"},"lsid":{"id":{"$uuid":"05f6af3a-f73b-472e-8924-1517ee7094f4"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1583445498,"i":5}},"signature":{"keyId":0,"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}}}},"$db":"hr","$readPreference":{"mode":"primary"}},"numYields":0,"reslen":179,"protocol":"op_msg","durationMillis":100}
The mongod logs for this time are large, so I've attached them: db.zip. In them I do see this error:
2020-03-05T21:58:17.919+0000 D1 - [conn44] User Assertion: StaleConfig{ ns: "hr.employees", vReceived: Timestamp(0, 0), vReceivedEpoch: ObjectId('000000000000000000000000'), shardId: "sh01" }: sharding status of collection hr.employees is not currently available for filtering and needs to be recovered from the config server src/mongo/db/s/collection_sharding_runtime.cpp 140
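For what it's worth, here is a hedged sketch of how this assertion surfaces to the Python client: the first statement of the transaction raises an OperationFailure carrying the message quoted above, and its error labels can be inspected to decide whether the whole transaction should be rerun. The connection string is an assumption and whether this particular abort actually carries the TransientTransactionError label is not shown in these logs, so treat the check as illustrative.
{code:python}
# Illustrative only: inspect the error PyMongo raises when the shard reports
# "sharding status ... needs to be recovered from the config server".
from pymongo import MongoClient
from pymongo.errors import OperationFailure

client = MongoClient("mongodb://localhost:27017")  # assumed mongos address

with client.start_session() as session:
    try:
        with session.start_transaction():
            # First statement of the transaction, as in the failing test.
            client.hr.employees.update_one(
                {"employee": 3}, {"$set": {"status": "Inactive"}}, session=session
            )
    except OperationFailure as exc:
        print("code:", exc.code)
        print("codeName:", (exc.details or {}).get("codeName"))
        print("errorLabels:", (exc.details or {}).get("errorLabels"))
        # Applications typically rerun the whole transaction when this label
        # is present; whether this abort carries it is not shown above.
        if exc.has_error_label("TransientTransactionError"):
            print("transient error; the whole transaction can be retried")
{code}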
- is depended on by
  - SERVER-32198 Missing collection metadata on the shard implies both UNSHARDED and "metadata not loaded yet" (Closed)
  - CDRIVER-3564 Transactions test in /Samples test frequently fails on 4.5-latest sharded clusters (Closed)
  - PYTHON-2154 test_examples.TestTransactionExamples fails on 4.5-latest sharded clusters (Closed)
- related to
  - SERVER-74195 Transaction failed after version upgrade (Closed)
  - PYTHON-2154 test_examples.TestTransactionExamples fails on 4.5-latest sharded clusters (Closed)
  - SERVER-45779 Throw Stale Shard Version when collection is unknown on getOwnershipFilter (Closed)