- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 2.6.0-rc2
- Component/s: None
- None
- ALL
We have observed three separate instances of the mongod process dying unexpectedly without writing any error to the log file. In each instance, the last message in the log file was "about to log metadata event".
In each episode, the mongod in question was a member of a shard in a cluster. The episodes were observed on three separate physical servers.
Log files attached.
In the third episode, dmesg reported the following:
INFO: task mongod:29664 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mongod D 0000000000000001 0 29664 29663 0x00000080
 ffff88066032bdf8 0000000000000086 ffff88066032bdc0 ffff88066032bdbc
 ffff88066032bd88 ffff88063fc24500 ffff880028215f80 0000000000000400
 ffff88008217fab8 ffff88066032bfd8 000000000000f4e8 ffff88008217fab8
Call Trace:
 [<ffffffffa00745c5>] jbd2_log_wait_commit+0xc5/0x140 [jbd2]
 [<ffffffff81090d30>] ? autoremove_wake_function+0x0/0x40
 [<ffffffffa0074676>] ? __jbd2_log_start_commit+0x36/0x40 [jbd2]
 [<ffffffffa009409c>] ext4_sync_file+0x13c/0x250 [ext4]
 [<ffffffff811a57a1>] vfs_fsync_range+0xa1/0xe0
 [<ffffffff811a584d>] vfs_fsync+0x1d/0x20
 [<ffffffff811a588e>] do_fsync+0x3e/0x60
 [<ffffffff811a58e0>] sys_fsync+0x10/0x20
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
is duplicated by
- SERVER-7790 Segfault in splitchunk following dropDatabase (Closed)
related to
- SERVER-13429 Replace writes to cout/cerr or stdout/stderr in server with log operations (Closed)