Core Server / SERVER-56784

Replication threads on a secondary hang

    • Type: Bug
    • Resolution: Community Answered
    • Priority: Major - P3
    • Affects Version/s: 4.0.9, 4.0.19
    • Component/s: None
    • Operating System: ALL

      Recently, we encountered a strange phenomenon on some MongoDB 4.0 sharded clusters: replication on a secondary hangs, so the replication lag between the primary and the secondary grows very large.
       
      I have collected pstack data from the mongod process.
       
      From it we can see that all 16 replWriterThread workers are waiting for tasks, meaning they are idle:
      ```
      #0  futex_wait_cancelable (private=0, expected=0, futex_word=0x5580fc7dd458) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
      #1  __pthread_cond_wait_common (abstime=0x0, mutex=0x5580fc7dd400, cond=0x5580fc7dd430) at pthread_cond_wait.c:502
      #2  __pthread_cond_wait (cond=0x5580fc7dd430, mutex=0x5580fc7dd400) at pthread_cond_wait.c:655
      #3  0x00005580f5f7ceec in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
      #4  0x00005580f5632750 in mongo::ThreadPool::_consumeTasks() ()
      #5  0x00005580f5632e86 in mongo::ThreadPool::_workerThreadBody(mongo::ThreadPool*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
      #6  0x00005580f56331be in std::thread::_Impl<std::_Bind_simple<mongo::stdx::thread::thread<mongo::ThreadPool::_startWorkerThread_inlock()::{lambda()#1}, , 0>(mongo::ThreadPool::_startWorkerThread_inlock()::{lambda()#1})::{lambda()#1} ()> >::_M_run() ()
      #7  0x00005580f5f7ff60 in execute_native_thread_routine ()
      #8  0x00007fd5151a2fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
      #9  0x00007fd5150d14cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
      ```
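       
      For context, here is a minimal C++ sketch of what a thread-pool worker loop like mongo::ThreadPool::_consumeTasks typically looks like. The names (SimplePool, _workAvailable, _poolIsIdle, _active) are my own illustrative assumptions, not MongoDB's actual implementation; frames #0-#4 above correspond to the wait at the top of this loop.
      ```
      #include <condition_variable>
      #include <deque>
      #include <functional>
      #include <mutex>

      // Illustrative sketch only; names and structure are assumptions.
      class SimplePool {
      public:
          void workerLoop();
          void waitForIdle();  // sketched after the next trace

      private:
          std::mutex _mutex;                       // mutex=0x5580fc7dd400 in both traces
          std::condition_variable _workAvailable;  // cond=0x...430: idle workers wait here
          std::condition_variable _poolIsIdle;     // cond=0x...460: waitForIdle() waits here
          std::deque<std::function<void()>> _tasks;
          int _active = 0;                         // tasks currently executing
          bool _shutdown = false;
      };

      void SimplePool::workerLoop() {
          std::unique_lock<std::mutex> lk(_mutex);
          for (;;) {
              // Frames #0-#4: an idle worker sleeps here until a task is
              // queued or shutdown is requested.
              _workAvailable.wait(lk, [&] { return _shutdown || !_tasks.empty(); });
              if (_shutdown && _tasks.empty())
                  return;
              auto task = std::move(_tasks.front());
              _tasks.pop_front();
              ++_active;  // the pool is not idle while a task is in flight
              lk.unlock();
              task();
              lk.lock();
              --_active;
              if (_tasks.empty() && _active == 0)
                  _poolIsIdle.notify_all();  // wake any waitForIdle() caller
          }
      }
      ```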
       
      But the batcher thread is stuck in ThreadPool::waitForIdle(), waiting on the replication writer threads:
      ```
      #0  futex_wait_cancelable (private=0, expected=0, futex_word=0x5580fc7dd48c) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
      #1  __pthread_cond_wait_common (abstime=0x0, mutex=0x5580fc7dd400, cond=0x5580fc7dd460) at pthread_cond_wait.c:502
      #2  __pthread_cond_wait (cond=0x5580fc7dd460, mutex=0x5580fc7dd400) at pthread_cond_wait.c:655
      #3  0x00005580f5f7ceec in std::condition_variable::wait(std::unique_lock<std::mutex>&) ()
      #4  0x00005580f56311bb in mongo::ThreadPool::waitForIdle() ()
      #5  0x00005580f4816d91 in mongo::repl::SyncTail::multiApply(mongo::OperationContext*, std::vector<mongo::repl::OplogEntry, std::allocator<mongo::repl::OplogEntry> >) ()
      #6  0x00005580f48186e3 in mongo::repl::SyncTail::_oplogApplication(mongo::repl::OplogBuffer*, mongo::repl::ReplicationCoordinator*, mongo::repl::SyncTail::OpQueueBatcher*) ()
      #7  0x00005580f48198c3 in mongo::repl::SyncTail::oplogApplication(mongo::repl::OplogBuffer*, mongo::repl::ReplicationCoordinator*) ()
      ```
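       
      The dependency is then explicit: waitForIdle() can return only once the task queue is drained and no task is in flight, so it relies entirely on a notification from a worker. Note that both traces block on the same mutex (0x5580fc7dd400) but on different condition variables (0x...430 vs 0x...460), which is consistent with the shape sketched above. Again, this is an assumption about the shape of the code, not MongoDB's implementation:
      ```
      // Continuation of the illustrative SimplePool sketch above.
      void SimplePool::waitForIdle() {
          std::unique_lock<std::mutex> lk(_mutex);
          // Frame #4 of this trace: returns only when the queue is empty AND
          // no task is still running; otherwise it sleeps until notified.
          _poolIsIdle.wait(lk, [&] { return _tasks.empty() && _active == 0; });
      }
      ```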
       
      So I suspect there is a bug here, but I have not been able to find its root cause.
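       
      For what it's worth, one generic way a pool can reach exactly this state (all workers parked on the task condvar, the waiter parked on the idle condvar, nothing in flight) is a lost wakeup, for example if idle-tracking state were ever mutated outside the mutex. The snippet below is purely a hypothetical illustration of that failure shape, not a claim about where the bug is in MongoDB's code:
      ```
      #include <condition_variable>
      #include <mutex>

      // HYPOTHETICAL lost-wakeup shape; not MongoDB's code.
      std::mutex m;
      std::condition_variable idleCv;
      int active = 1;  // one task still "in flight"

      void waiter() {
          std::unique_lock<std::mutex> lk(m);
          // Checks the predicate under m, then atomically releases m and sleeps.
          idleCv.wait(lk, [] { return active == 0; });
      }

      void buggyTaskFinish() {
          active = 0;           // BUG: state changed without holding m, so this
          idleCv.notify_all();  // notify can fire after the waiter saw "not idle"
                                // but before it blocked; the wakeup is lost and
                                // nothing ever notifies again.
      }
      ```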

  Attachments:
    1. ps1 (5.21 MB), uploaded by lipengchong
    2. ps2 (5.23 MB), uploaded by lipengchong

            Assignee: Dmitry Agranat
            Reporter: lipengchong (lpc)
            Votes: 0
            Watchers: 5
