Type: Bug
Resolution: Done
Priority: Critical - P2
Fix Version/s: 1.5.0
Affects Version/s: 1.4.0, 1.4.1
Component/s: None
Labels: None
Environment:
MongoHQ, semi-dedicated environment, 2 replicas and an arbiter (orchid)
Ruby 1.9.2, mongo/bson/bson_ext 1.4.1, Mongoid 2.0.2

Confidence Status: None
Backwards Compatibility: Major Change
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

The short story is that upgrading to 1.4.0 and then 1.4.1 made our production environment (almost) toast.

new-relic.png: shows query performance right after deploying 1.4.0 and then 1.4.1; the problem went away only after downgrading to 1.3.1
mongohq-conncount.png: shows the number of connections from the Rails app to mongo varying significantly, up to 77; downgrading to 1.3.1 brought it back to a stable 11
mongostat.png: shows nothing unusual while queries time out from Ruby

A random sampler (30.times { puts Benchmark.realtime { Mongoid.master.connection.active? }; sleep(1) }), which executes db.runCommand({ ping: 1 }), gives:

0.024342775344848633
0.08080220222473145
2.113878011703491 <-------- not ok
0.023059368133544922
0.03187060356140137
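
For anyone trying to reproduce this, the sampler above can be wrapped in a small helper that flags slow samples automatically. This is only a sketch: `sample_latency`, `format_sample`, and the 1-second threshold are hypothetical names and values chosen for illustration, not part of the mongo or Mongoid API. The only driver call involved is the `Mongoid.master.connection.active?` call already used in this report.

```ruby
require 'benchmark'

# Hypothetical helper (not part of mongo/Mongoid): times the given block
# `count` times, sleeping `interval` seconds between runs, and marks any
# sample slower than `threshold` seconds (an arbitrary cutoff) as slow.
def sample_latency(count: 30, interval: 1.0, threshold: 1.0)
  Array.new(count) do |i|
    elapsed = Benchmark.realtime { yield }
    sleep(interval) if interval > 0 && i < count - 1
    { seconds: elapsed, slow: elapsed > threshold }
  end
end

# Formats one sample in the same style as the output listed above.
def format_sample(sample)
  line = format('%.6f', sample[:seconds])
  sample[:slow] ? "#{line} <-------- not ok" : line
end

# Usage against the driver, as in this report:
#   samples = sample_latency { Mongoid.master.connection.active? }
#   samples.each { |s| puts format_sample(s) }
```

Running the same helper under 1.3.1 and 1.4.1 and comparing the number of flagged samples should make the regression easier to quantify.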

At the same time we're experiencing timeouts between replicas, but with 1.3.1 they don't affect performance. MongoDB log:

Fri Oct 21 20:50:01 [ReplSetHealthPollTask] EINTR retry
Fri Oct 21 20:50:01 [ReplSetHealthPollTask] DBClientCursor::init call() failed
Fri Oct 21 20:50:01 [ReplSetHealthPollTask] replSet info arbiter0.orchid.mongohq.com:10001 is down (or slow to respond): DBClientBase::findOne: transport error: arbiter0.orchid.mongohq.com:10001 query: { replSetHeartbeat: "orchid_1", v: 3, pv: 1, checkEmpty: false, from: "node0.orchid.mongohq.com:10001" }

Fri Oct 21 20:50:05 [ReplSetHealthPollTask] replSet info arbiter0.orchid.mongohq.com:10001 is up
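
To line up these heartbeat failures with the slow pings, it may help to extract the down/up windows from the log. The following is a rough sketch, assuming the log line shape shown above; `HEALTH_LINE` and `replset_outages` are hypothetical names, and because the log timestamps carry no year or zone, the year (2011 here) is passed in and UTC is assumed.

```ruby
require 'time'

# Matches the "is down" / "is up" replSet info lines in the shape shown above.
HEALTH_LINE = /\A\w{3} (\w{3} +\d+ \d\d:\d\d:\d\d) \[ReplSetHealthPollTask\] replSet info (\S+) is (down|up)\b/

# Hypothetical helper: pairs each "is down" line with the next "is up" line
# for the same host and returns the length of each outage in seconds.
def replset_outages(lines, year: 2011)
  down_since = {}
  outages = []
  lines.each do |line|
    m = HEALTH_LINE.match(line)
    next unless m
    # Assumption: timestamps are treated as UTC; only differences matter here.
    at = Time.strptime("#{year} #{m[1]} +0000", '%Y %b %d %H:%M:%S %z')
    host = m[2]
    if m[3] == 'down'
      down_since[host] ||= at
    elsif (start = down_since.delete(host))
      outages << { host: host, seconds: (at - start).to_i }
    end
  end
  outages
end

# Usage (the log path is a placeholder):
#   replset_outages(File.readlines('mongod.log'))
```

Comparing the resulting windows with the timestamps of the slow samples would show whether the driver stalls coincide with the arbiter being marked down.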


mongohq-conncount.png
31 kB
Oct 21 2011 09:39:56 PM UTC
mongostat.png
89 kB
Oct 21 2011 09:39:56 PM UTC
new-relic.png
23 kB
Oct 21 2011 09:39:56 PM UTC

Assignee: Kyle Banker (Inactive)
Reporter: Daniel Doubrovkine
Votes: 3
Watchers: 4

Created: Oct 21 2011 09:39:56 PM UTC
Updated: Feb 01 2013 06:59:28 PM UTC
Resolved: Nov 28 2011 06:52:03 PM UTC
