Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: 3.2.3, 3.3.2
Affects Version/s: None
Component/s: Storage, WiredTiger
Labels:
- WTplaybook
- code-only

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Completed:

3.2.3
Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

Here's an example:

Checkpoint from K to N
Drop command at L
This results in a complete stall from M to N (including the ftdc thread, which explains the lack of ftdc samples from M to N)

Stack traces collected during the stall show why: multiple operations are all stuck in dropAllQueued which in turn is blocked in __wt_session_drop:

00000066`976bf198 00007ffc`4b93dd39 ntdll!ZwWaitForSingleObject+0xa
00000066`976bf1a0 00007ffc`4b93b704 ntdll!RtlLockHeap+0x169
00000066`976bf270 00007ff7`ef7e839d ntdll!RtlEnterCriticalSection+0xe4
00000066`976bf2b0 00007ff7`ef7e4d07 mongod!__wt_session_drop+0x8d
00000066`976bf2e0 00007ff7`ef18cfd2 mongod!__session_drop+0xe7
00000066`976bf340 00007ff7`ef19f2ad mongod!mongo::WiredTigerKVEngine::dropAllQueued+0x162
00000066`976bf4c0 00007ff7`ef19c507 mongod!mongo::WiredTigerSessionCache::releaseSession+0x19d
00000066`976bf520 00007ff7`ef19c584 mongod!mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit+0x57
00000066`976bf560 00007ff7`eef1d6c9 mongod!mongo::WiredTigerRecoveryUnit::`scalar deleting destructor'+0x14
00000066`976bf590 00007ff7`eef1d724 mongod!mongo::OperationContextImpl::~OperationContextImpl+0x79
00000066`976bf5d0 00007ff7`ef0fb839 mongod!mongo::OperationContextImpl::`scalar deleting destructor'+0x14
00000066`976bf600 00007ff7`ef812245 mongod!mongo::ServiceContext::OperationContextDeleter::operator()+0x59
00000066`976bf640 00007ff7`ef2e57f7 mongod!mongo::MyMessageHandler::process+0xf5
00000066`976bf860 00007ff7`ef016bdd mongod!mongo::PortMessageServer::handleIncomingMsg+0x4a7

The FTDC thread is similarly stuck:

00000063`1823f498 00007ffc`4b93dd39 ntdll!ZwWaitForSingleObject+0xa
00000063`1823f4a0 00007ffc`4b93b704 ntdll!RtlLockHeap+0x169
00000063`1823f570 00007ff7`ef7e839d ntdll!RtlEnterCriticalSection+0xe4
00000063`1823f5b0 00007ff7`ef7e4d07 mongod!__wt_session_drop+0x8d
00000063`1823f5e0 00007ff7`ef18cfd2 mongod!__session_drop+0xe7
00000063`1823f640 00007ff7`ef19f2ad mongod!mongo::WiredTigerKVEngine::dropAllQueued+0x162
00000063`1823f7c0 00007ff7`ef19c507 mongod!mongo::WiredTigerSessionCache::releaseSession+0x19d
00000063`1823f820 00007ff7`ef19c584 mongod!mongo::WiredTigerRecoveryUnit::~WiredTigerRecoveryUnit+0x57
00000063`1823f860 00007ff7`eef1d6c9 mongod!mongo::WiredTigerRecoveryUnit::`scalar deleting destructor'+0x14
00000063`1823f890 00007ff7`eef1d724 mongod!mongo::OperationContextImpl::~OperationContextImpl+0x79
00000063`1823f8d0 00007ff7`ef0fb839 mongod!mongo::OperationContextImpl::`scalar deleting destructor'+0x14
00000063`1823f900 00007ff7`eee7a490 mongod!mongo::ServiceContext::OperationContextDeleter::operator()+0x59
00000063`1823f940 00007ff7`eee7bf1d mongod!mongo::FTDCCollectorCollection::collect+0x300
00000063`1823fb80 00007ff7`ef016bdd mongod!mongo::FTDCController::doLoop+0x37d

I believe the sequence is:

at L the drop command attempts to drop the collection but gets an EBUSY from __wt_session_drop due to the checkpoint, so it queues the drop command for later execution
every second from L to M an attempt is made to drop the collection again but again gets an EBUSY from __wt_session_drop so it remains queued and operations continue
but starting at M __wt_session_drop no longer returns EBUSY but instead gets stuck waiting on the checkpoint, resulting in the stall

It looks like there are two problems:

due to the logic in dropAllQueued in the integration layer, every completing operation on all threads attempts to execute the queued drop; this is why the stall is a complete stall, instead of affecting only one thread
starting at M, during the critical phase of the check point (coinciding with the "pinned transactions" stat), WT blocks in __wt_session_drop instead of returning EBUSY

- - Sort By Name
  - Sort By Date
  - Ascending
  - Descending
  - Thumbnails
  - List
  - Download All

drop-stall.png
Jan 15 2016 06:13:46 PM UTC
105 kB
Bruce Lucas

depends on

WT-2333 Add a flag so drop doesn't block

Closed

WT-2334 Add support for background drops

Closed

Assignee:: Michael Cahill (Inactive)
Reporter:: Bruce Lucas (Inactive)
Participants:: Bruce Lucas, Githook User, Michael Cahill, Michael Dolinsky, Ramon Fernandez Marina
Votes:: 1 Vote for this issue
Watchers:: 18 Start watching this issue

Created:: Jan 15 2016 06:11:38 PM UTC
Updated:: Nov 18 2016 01:38:32 AM UTC
Resolved:: Feb 02 2016 05:19:40 AM UTC

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates