- Type: Bug
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 5.0.3
- Component/s: None
- Server Triage
- ALL
I run a sharded cluster with 4 shards, each shard deployed as a PSA replica set (primary + secondary + arbiter). The config server (CSRS) is deployed with 3 members (PSS). The application has a fairly high load: up to 100,000 documents are inserted every second, resulting in approximately 5 billion documents inserted per day.
The documents are inserted into (non-sharded) raw-data collections. Once per hour I run a bucketing job which performs the following steps:
- Check the health status of the replica sets. If a secondary member is not available, reconfigure that replica set member to { votes: 0, priority: 0 }.
- Rename the (non-sharded) raw-data collection. Due to continuous inserts, a new raw-data collection is automatically created instantly.
- Create some indexes on the new raw-data collection.
- Run some aggregation pipelines with $count, $group, etc. on the "old" raw-data collection. The aggregated data is merged into a sharded final collection.
- Drop the "old" raw-data collection (see the mongosh sketch after this list).
When you run renameCollection through mongos, the write concern is set to "majority". That is the reason for doing the health check at the beginning and reconfiguring the replica set in case of an outage.
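A sketch of that health check / reconfiguration step, run against the affected shard's primary (not through mongos). It is illustrative only; the actual check logic is not included in this report. Demoting an unreachable data-bearing member to { votes: 0, priority: 0 } keeps w:"majority" achievable on the PSA shard while the secondary is down:

// mongosh, connected to the shard's primary (illustrative sketch).
const status = rs.status();
const cfg    = rs.conf();

let changed = false;
for (const member of cfg.members) {
  const st = status.members.find(s => s.name === member.host);
  // Demote any data-bearing member that is currently unreachable.
  if (st && st.health === 0 && !member.arbiterOnly) {
    member.votes    = 0;
    member.priority = 0;
    changed = true;
  }
}
if (changed) {
  rs.reconfig(cfg);
}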
Now the problem: if a secondary member goes down, everything works as intended. The member is reconfigured as passive and the application continues to work. But when the member recovers, the database sometimes gets corrupted. I can drop any collection or even the entire database (i.e. db.dropDatabase()), but a new "raw-data" collection cannot be created. I can create any new collection with a different name, but the "raw-data" collection name is blocked.
Restarting all nodes in the sharded cluster does not help.
Dropping all collections from the database and even db.dropDatabase() does not help.
So far, the only solution has been to drop the entire sharded cluster and deploy it from scratch.
Currently the application runs as a "proof of concept"; of course this would be no solution for a production environment.
It seems to be a bit difficult to reproduce this problem; however, it has already appeared twice. The first time I faced the issue was while following the "Upgrade a Sharded Cluster" procedure.
Perhaps based on this information you already have an idea of where the problem could be. I will try to hit this issue again and properly collect all log files. In general, replication seems to have a problem when you drop and re-create a collection with the same name. I also ran into problems while a replica set was running an initial synchronization. My current workaround is to suspend the bucketing job whenever I need to run any kind of recovery/restore.
Kind Regards
Wernfried
- duplicates
  - SERVER-65930 DDL coordinators and rename participant initial checkpoint may incur in DuplicateKey error - Closed
- related to
  - SERVER-60266 Retry WriteConcernError exceptions in DDL coordinators - Closed
  - SERVER-61416 Indefinitely retry errors in rename coordinator - Closed