-
Type: Improvement
-
Resolution: Done
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: Querying, Replication, Sharding
-
Environment:ubuntu lucid 64 bit
-
Fully Compatible
-
Sharding 2017-08-21
-
(copied to CRM)
Secondary reads in MongoDB are only eventually consistent - the state of the system will not reflect the latest changes. When balancing, the state of the cluster is changing implicitly, and so secondary reads are inconsistent. This means that duplicate, stale, or missing data can be observed when balancing operations are active, along with orphaned data from aborted balancer operations.
Issues with orphaned data affecting results from primary reads are different problems - see SERVER-3645 for example.
Original description:
Mongo may return too many documents in a sharded system. This may occur when a document is located on more than one shard. We don't know yet why some documents are located on more than one shard because we never access shards directly. We always access mongoDB through mongos (router). Perhaps these documents result from a failed chunk migration?
In any case, even if these documents exist on more than one shard, mongo should be clever enough to return only those, which are tracked by the config servers.
Let me show you a test case (documents are sharded by _id):
mongos> db.offer.find({shopId:100}).count() 0 ## no doc of shopId:100 exist yet, so let add one through the router: mongos> db.offer.save({"_id" : 100, "shopId" : 100, "version": 1}) mongos> exit bye ## let's add an document on another shard (this time by accessing it directly to beeing able to reproduce) > mongo localhost:20017/offerStore MongoDB shell version: 2.0.5 connecting to: localhost:20017/offerStore PRIMARY> db.offer.find({shopId:100}).count() 0 PRIMARY> db.offer.save({"_id" : 100, "shopId" : 100, "version": 2}) PRIMARY> db.offer.find({shopId:100}).count() 1 PRIMARY> exit bye ## let's check what mongos thinks how many docs of shopId:100 it has: MongoDB shell version: 2.0.5 connecting to: localhost:20021/offerStore mongos> db.offer.find({shopId:100}).count() 2 ## this is a bug, because mongos should find only 1 doc since the 2nd doc is a an orphan, not beeing referenced by config servers: mongos> db.printShardingStatus(true) --- Sharding Status --- sharding version: { "_id" : 1, "version" : 3 } shards: { "_id" : "shard1", "host" : "shard1/localhost:20017" } { "_id" : "shard2", "host" : "shard2/localhost:20018" } { "_id" : "shard3", "host" : "shard3/localhost:20019" } databases: { "_id" : "admin", "partitioned" : false, "primary" : "config" } { "_id" : "offerStore", "partitioned" : true, "primary" : "shard1" } offerStore.offer chunks: shard3 6 shard1 7 shard2 7 { "_id" : { $minKey : 1 } } -->> { "_id" : NumberLong(538697491) } on : shard3 { "t" : 4000, "i" : 2 } { "_id" : NumberLong(538697491) } -->> { "_id" : NumberLong(538748351) } on : shard3 { "t" : 4000, "i" : 4 } { "_id" : NumberLong(538748351) } -->> { "_id" : NumberLong(538827239) } on : shard3 { "t" : 5000, "i" : 4 } { "_id" : NumberLong(538827239) } -->> { "_id" : NumberLong(538893516) } on : shard3 { "t" : 6000, "i" : 2 } { "_id" : NumberLong(538893516) } -->> { "_id" : NumberLong(591546899) } on : shard3 { "t" : 6000, "i" : 3 } { "_id" : NumberLong(591546899) } -->> { "_id" : NumberLong(647519529) } on : shard1 { "t" : 6000, "i" : 1 } { "_id" : NumberLong(647519529) } -->> { "_id" : NumberLong(660087036) } on : shard1 { "t" : 3000, "i" : 2 } { "_id" : NumberLong(660087036) } -->> { "_id" : NumberLong(675320121) } on : shard1 { "t" : 3000, "i" : 6 } { "_id" : NumberLong(675320121) } -->> { "_id" : NumberLong(691204023) } on : shard1 { "t" : 3000, "i" : 7 } { "_id" : NumberLong(691204023) } -->> { "_id" : NumberLong(706454221) } on : shard1 { "t" : 3000, "i" : 4 } { "_id" : NumberLong(706454221) } -->> { "_id" : NumberLong(751548202) } on : shard1 { "t" : 3000, "i" : 5 } { "_id" : NumberLong(751548202) } -->> { "_id" : NumberLong(799095936) } on : shard1 { "t" : 7000, "i" : 0 } { "_id" : NumberLong(799095936) } -->> { "_id" : NumberLong(844050111) } on : shard2 { "t" : 7000, "i" : 1 } { "_id" : NumberLong(844050111) } -->> { "_id" : NumberLong(896132956) } on : shard2 { "t" : 6000, "i" : 8 } { "_id" : NumberLong(896132956) } -->> { "_id" : NumberLong(937716362) } on : shard2 { "t" : 6000, "i" : 10 } { "_id" : NumberLong(937716362) } -->> { "_id" : NumberLong(960061623) } on : shard2 { "t" : 6000, "i" : 11 } { "_id" : NumberLong(960061623) } -->> { "_id" : NumberLong(995515056) } on : shard2 { "t" : 5000, "i" : 2 } { "_id" : NumberLong(995515056) } -->> { "_id" : NumberLong(1021076450) } on : shard2 { "t" : 6000, "i" : 4 } { "_id" : NumberLong(1021076450) } -->> { "_id" : NumberLong(1035798084) } on : shard2 { "t" : 6000, "i" : 5 } { "_id" : NumberLong(1035798084) } -->> { "_id" : { $maxKey : 1 } } on : shard3 { "t" : 5000, "i" : 0 } mongos> db.offer.find({shopId:100}) { "_id" : 100, "shopId" : 100, "version" : 1 } ## this is correct (only 1 doc found) BUT see the next one: mongos> rs.slaveOk() mongos> db.offer.find({shopId:100}) { "_id" : 100, "shopId" : 100, "version" : 2 } { "_id" : 100, "shopId" : 100, "version" : 1 } ## this is a bug since mongo queries all shards without ever asking whether they return orphan docs or not mongos> db.offer.find({_id:100}) { "_id" : 100, "shopId" : 100, "version" : 1 } ## When searching by sharding key, mongo get it correct.
- is duplicated by
-
SERVER-14644 Retrieving duplicate records with the same _id from secondaries.
- Closed
-
SERVER-31663 Inconsistent query results between primary and secondary
- Closed
-
SERVER-8948 Count() can be wrong in sharded collections
- Closed
-
SERVER-9858 After a chunk migration, requests on secondaries return multiple objects with same _id
- Closed
-
SERVER-21650 Duplicate _id when reading from secondaries on a sharded cluster
- Closed
-
SERVER-6563 Improve consistency of non-primary reads in a sharded cluster
- Closed
- is related to
-
SERVER-30708 _id index returning more than one document with same _id in aggregations and counts.
- Closed
-
SERVER-3645 Sharded collection counts (on primary) can report too many results
- Closed
-
SERVER-8598 Add command to cleanup orphaned data created by failed chunk migrations
- Closed
- related to
-
SERVER-8598 Add command to cleanup orphaned data created by failed chunk migrations
- Closed
-
SERVER-20782 Support causal consistency with secondary reads in sharded, replicated MongoDB clusters
- Closed
-
SERVER-23917 splitVector can't be run against secondary
- Closed