Type: Bug
Resolution: Duplicate
Priority: Critical - P2
Fix Version/s: None
Affects Version/s: 2.2.2
Component/s: Sharding
Labels: None
Environment: All nodes running Ubuntu 12.10 on Rackspace Cloud Server instances
  Config servers on their own nodes
  mongos running on the same node as the app server (php + apache)
  2 mongod shards are 3-node replica sets, all running on separate nodes
Operating System: ALL
We are doing pre-production load/scale testing and have a sharded collection with the following configuration:
db.createCollection('sitemedia');
db.sitemedia.ensureIndex({ "user.id" : 1, "_id" : 1 });
sh.shardCollection("vsco_1.sitemedia", { "user.id" : 1, "_id" : 1 });
db.sitemedia.ensureIndex( ... );
db.sitemedia.ensureIndex( ... );
db.sitemedia.ensureIndex( ... );
After several hundred thousand inserts we end up with the following:
mongos> sh.status()
--- Sharding Status ---
  sharding version:
  shards:
    { "_id" : "s0", "host" : "s0/x.x.x.x:27017,x.x.x.x:27017,x.x.x.x:27017" }
    { "_id" : "s1", "host" : "s1/x.x.x.x:27017,x.x.x.x:27017,x.x.x.x:27017" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "vsco_1", "partitioned" : true, "primary" : "s1" }
        vsco_1.sitemedia chunks:
            s1  5
            s0  5
        { "user.id" : { $minKey : 1 }, "_id" : { $minKey : 1 } } -->> { "user.id" : 1, "_id" : ObjectId("5100315283ce40cc3c000000") } on : s1 Timestamp(6000, 0)
        { "user.id" : 1, "_id" : ObjectId("5100315283ce40cc3c000000") } -->> { "user.id" : 21346, "_id" : ObjectId("50fdca5583ce40427d0000a1") } on : s0 Timestamp(6000, 1)
        { "user.id" : 21346, "_id" : ObjectId("50fdca5583ce40427d0000a1") } -->> { "user.id" : 21498, "_id" : ObjectId("50fdcd7783ce40cd7d0001f7") } on : s0 Timestamp(5000, 5)
        { "user.id" : 21498, "_id" : ObjectId("50fdcd7783ce40cd7d0001f7") } -->> { "user.id" : 21795, "_id" : ObjectId("50fdca1083ce409d7d000078") } on : s0 Timestamp(3000, 0)
        { "user.id" : 21795, "_id" : ObjectId("50fdca1083ce409d7d000078") } -->> { "user.id" : 22093, "_id" : ObjectId("50fdc24183ce402d30000230") } on : s0 Timestamp(4000, 0)
        { "user.id" : 22093, "_id" : ObjectId("50fdc24183ce402d30000230") } -->> { "user.id" : 22152, "_id" : ObjectId("50fdd19383ce40d07d0003f3") } on : s0 Timestamp(5000, 0)
        { "user.id" : 22152, "_id" : ObjectId("50fdd19383ce40d07d0003f3") } -->> { "user.id" : 22179, "_id" : ObjectId("50fdf58883ce40751e0004f5") } on : s1 Timestamp(5000, 1)
        { "user.id" : 22179, "_id" : ObjectId("50fdf58883ce40751e0004f5") } -->> { "user.id" : 22204, "_id" : ObjectId("50fdd4db83ce40d67d000560") } on : s1 Timestamp(4000, 4)
        { "user.id" : 22204, "_id" : ObjectId("50fdd4db83ce40d67d000560") } -->> { "user.id" : "22215", "_id" : ObjectId("50fde18a83ce40d50b00031c") } on : s1 Timestamp(4000, 5)
        { "user.id" : "22215", "_id" : ObjectId("50fde18a83ce40d50b00031c") } -->> { "user.id" : { $maxKey : 1 }, "_id" : { $maxKey : 1 } } on : s1 Timestamp(3000, 3)
    { "_id" : "sitemedia", "partitioned" : false, "primary" : "s1" }
    { "_id" : "test", "partitioned" : false, "primary" : "s1" }

Our application workflow is as follows:
1) Insert a new document into the collection with a UserId.
2) Take the returned MongoId and update the document with additional metadata, including where the actual image is stored and other relational data (see the sketch below).
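A rough sketch of that two-step flow in the mongo shell; the "image" and "status" fields are hypothetical stand-ins for our actual metadata, and the user id is just one of the values from the chunk dump above:

// step 1: insert a new document carrying only the user id; the driver
// generates the _id (MongoId) and returns it to the app
var id = ObjectId();
db.sitemedia.insert({ "_id" : id, "user" : { "id" : 21346 } });
// step 2: update the same document by _id with the additional metadata
db.sitemedia.update(
    { "_id" : id },
    { $set : { "image" : { "bucket" : "media", "path" : "some/key" }, "status" : "complete" } }
);

Note that the update criteria contain only _id, so mongos cannot target a single shard from the shard key { "user.id" : 1, "_id" : 1 }.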
What we are seeing is that after the update, running db.sitemedia.find({_id:ObjectId($idhash)}) in a mongo shell connected to mongos returns an empty result, while db.sitemedia.find({_id:ObjectId($idhash)}).count() returns 1.
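For example, with the ObjectId of one affected document standing in for $idhash, the mongos session looks roughly like this (a sketch, not a verbatim capture; the id below is taken from the chunk dump above):

mongos> db.sitemedia.find({ "_id" : ObjectId("50fde18a83ce40d50b00031c") })
mongos> db.sitemedia.find({ "_id" : ObjectId("50fde18a83ce40d50b00031c") }).count()
1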
Connecting to the two shards directly and running the same find in the mongo shell shows the document living on the wrong shard. Attaching a Pastebin of our log file showing the insert and update results and what is being pushed into the system.
What we are noticing is that on the first insert, since we don't have a MongoId yet, mongos simply inserts the document on the default shard. However, because the shard key also includes user.id, the document should have been placed on shard s0, not s1. The subsequent update shows that mongos sends the update to both shards and returns success because s1 performed the update. On a find, however, mongos can't locate the document: it looks on s0, which owns the chunk range, while the document is actually living on s1.
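One way to confirm which shard should own the document is to compare its shard key against the chunk metadata; a sketch, assuming a mongos connection and the namespace above:

// on mongos: list each chunk's range and owning shard for the collection
use config
db.chunks.find(
    { "ns" : "vsco_1.sitemedia" },
    { "min" : 1, "max" : 1, "shard" : 1, "_id" : 0 }
).sort({ "min" : 1 })
// then connect to each shard's primary directly (bypassing mongos), e.g.
//   mongo x.x.x.x:27017/vsco_1
// and check where the document physically lives:
db.sitemedia.find({ "_id" : ObjectId("50fde18a83ce40d50b00031c") })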
Duplicates: SERVER-7379 Immutable shardkey becomes mutable (Closed)