Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 3.0.12, 3.2.10, 3.4.0-rc0
Component/s: Stability
Labels:
None

Operating System:
ALL
Steps To Reproduce:

Unpredictable. It seems to occur on secondaries (not certain; seen 3 times on a secondary replica in two weeks).
Sprint:
Sharding 2016-09-19, Sharding 2016-10-10, Sharding 2016-10-31
Linked BF Score:
5
Confidence Status:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

All threads hang waiting on locks. The state of the LockHead shows an inconsistency between grantedList and grantedCounts: the granted list is empty (no granted requests), but the granted counts are non-zero:

(gdb) p $18
$38 = {
  resourceId = {
    _fullHash = 2305843009213693953
  },
  grantedList = {
    _front = 0x0,
    _back = 0x0
  },
  grantedCounts = {0, 1, 0, 0, 0},
  grantedModes = 2,
  conflictList = {
    _front = 0x7f08028,
    _back = 0x5902cf628
  },
  conflictCounts = {0, 1490, 0, 0, 1},
  conflictModes = 18,
  partitions = {
    <std::_Vector_base<mongo::LockManager::Partition*, std::allocator<mongo::LockManager::Partition*> >> = {
      _M_impl = {
        <std::allocator<mongo::LockManager::Partition*>> = {
          <__gnu_cxx::new_allocator<mongo::LockManager::Partition*>> = {<No data fields>}, <No data fields>},
        members of std::_Vector_base<mongo::LockManager::Partition*, std::allocator<mongo::LockManager::Partition*> >::_Vector_impl:
        _M_start = 0x7470640,
        _M_finish = 0x7470640,
        _M_end_of_storage = 0x7470680
      }
    }, <No data fields>},
  conversionsCount = 0,
  compatibleFirstCount = 0
}
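
The inconsistency in the dump above can be checked mechanically. The following is a minimal Python sketch, not MongoDB code: it assumes that index i of each counts array corresponds to lock mode i, that each modes field is a bitmask with bit i set when counts[i] is non-zero, and that the mode order is MODE_NONE, MODE_IS, MODE_IX, MODE_S, MODE_X. The helper names `modes_from_counts` and `check_queue` are hypothetical. The values are copied from the gdb output.

```python
# Hypothetical model of the LockHead fields from the gdb dump; not MongoDB source.
# Assumed mode order (index == bit position in the modes mask).
MODE_NAMES = ["MODE_NONE", "MODE_IS", "MODE_IX", "MODE_S", "MODE_X"]


def modes_from_counts(counts):
    """Rebuild the mode bitmask: bit i is set when counts[i] > 0."""
    mask = 0
    for mode, count in enumerate(counts):
        if count > 0:
            mask |= 1 << mode
    return mask


def check_queue(name, list_is_empty, counts, modes):
    """Return a list of inconsistencies found for one request queue."""
    problems = []
    total = sum(counts)
    if list_is_empty and total != 0:
        problems.append(f"{name}: list is empty but counts sum to {total}")
    if not list_is_empty and total == 0:
        problems.append(f"{name}: list is non-empty but all counts are zero")
    expected = modes_from_counts(counts)
    if expected != modes:
        problems.append(f"{name}: modes is {modes}, counts imply {expected}")
    return problems


# grantedList._front == 0x0, so the granted list is empty.
granted = check_queue("granted", True, [0, 1, 0, 0, 0], 2)
# conflictList._front != 0x0, so the conflict list is non-empty.
conflict = check_queue("conflict", False, [0, 1490, 0, 0, 1], 18)

for problem in granted + conflict:
    print(problem)
```

Under these assumptions the conflict queue is self-consistent (1490 requests in mode 1 and one in mode 4 give the mask 18), while the granted queue reports one problem: a count of 1 for mode 1 with no request in the list.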

The attachments show the stacks of all threads and the output of db.currentOp().

Here is our cluster info:

3 shards, each with 3 replicas
both range-based and hash-based sharding
collection sizes from 50 GB to 500 GB


Attachments:
stacks.txt (6.14 MB), uploaded Sep 08 2016 06:05:59 AM UTC by xiaost
gdb.withsymbols.out (6.60 MB), uploaded Sep 13 2016 07:54:59 AM UTC by xiaost
currentOp.out (910 kB), uploaded Sep 21 2016 06:32:21 AM UTC by xiaost
LockMgrInvariants.diff (14 kB), uploaded Sep 27 2016 12:37:39 PM UTC by Kaloian Manassiev
server_status.txt (24 kB), uploaded Sep 28 2016 01:01:27 PM UTC by xihui he

is related to

SERVER-26578 Add startup warning for Intel CPUs which might have TSX bugs

Closed

Assignee: Kaloian Manassiev
Reporter: xiaost
Participants: Geert Bosch, Githook User, Kaloian Manassiev, Teemu Sirkiä, xiaost, xihui he
Votes: 1
Watchers: 13
Created: Sep 08 2016 06:05:59 AM UTC
Updated: May 10 2023 06:48:51 PM UTC
Resolved: Oct 11 2016 06:32:37 PM UTC
Confidence Status Last Update: 28/Sep/16 3:12 PM
GA Target Date: None
Public Preview Target Date: None
Private Preview Target Date: None
Experiment Target Date: None
