Type: Bug
Resolution: Fixed
Priority: Critical - P2
Fix Version/s: 4.4.2
Affects Version/s: 4.4.0
Component/s: Cluster Management
Labels:
- external-user

Case:
Confidence Status: None
Backwards Compatibility: Fully Compatible
Documentation Changes: Not Needed
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

Several times, shortly after an election in our replica set of 3 MongoDB instances, we found blocked threads in some of our Java processes. The blocked threads make our services unresponsive.

The thread dumps look similar in all processes with blocked threads.

First blocked thread:

cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-3:27017 id=24 state=BLOCKED
    - waiting to lock <0x0b2400cb> (a com.mongodb.internal.connection.MultiServerCluster)
     owned by cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-1:27017 id=20
    at com.mongodb.internal.connection.AbstractMultiServerCluster.onChange(AbstractMultiServerCluster.java:175)
    at com.mongodb.internal.connection.AbstractMultiServerCluster.access$100(AbstractMultiServerCluster.java:50)
    at com.mongodb.internal.connection.AbstractMultiServerCluster$DefaultServerDescriptionChangedListener.serverDescriptionChanged(AbstractMultiServerCluster.java:139)
    at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.updateDescription(DefaultSdamServerDescriptionManager.java:127)
    at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.update(DefaultSdamServerDescriptionManager.java:81)
    at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:165)
    at java.base@11.0.12/java.lang.Thread.run(Thread.java:829)
    Locked synchronizers: count = 1
      - java.util.concurrent.locks.ReentrantLock$NonfairSync@32e56896

Second blocked thread:

cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-2:27017 id=22 state=BLOCKED
    - waiting to lock <0x0b2400cb> (a com.mongodb.internal.connection.MultiServerCluster)
     owned by cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-1:27017 id=20
    at com.mongodb.internal.connection.AbstractMultiServerCluster.onChange(AbstractMultiServerCluster.java:175)
    at com.mongodb.internal.connection.AbstractMultiServerCluster.access$100(AbstractMultiServerCluster.java:50)
    at com.mongodb.internal.connection.AbstractMultiServerCluster$DefaultServerDescriptionChangedListener.serverDescriptionChanged(AbstractMultiServerCluster.java:139)
    at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.updateDescription(DefaultSdamServerDescriptionManager.java:127)
    at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.update(DefaultSdamServerDescriptionManager.java:81)
    at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:165)
    at java.base@11.0.12/java.lang.Thread.run(Thread.java:829)
    Locked synchronizers: count = 1
      - java.util.concurrent.locks.ReentrantLock$NonfairSync@166cbc02

We renamed the hosts in the thread dump but kept the instance numbers.

The only workaround we have found is to restart the affected processes.

We have seen this problem since we upgraded from MongoDB Driver 4.2.3 (Spring Boot 2.5.6) to MongoDB Driver 4.4.0 (Spring Boot 2.6.2), and it still occurs with MongoDB Driver 4.4.1 (Spring Boot 2.6.3).

Our MongoDB Server instances run version 4.2.2, which should be compatible according to https://docs.mongodb.com/drivers/java/sync/current/compatibility/

How to Reproduce

Unfortunately, we are not able to reproduce this behavior on demand. It occurs at irregular intervals, and each time we have to restart our services.
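
Because the blocking is intermittent, an in-process check could capture the state at the moment it occurs. The following is only a minimal sketch under assumptions, not part of the driver and not something we run today: it uses the JDK's `ThreadMXBean` to list threads in the BLOCKED state together with the lock they wait for, the owning thread, and the number of synchronizers they hold, which is the same information shown in the thread dump above. The class and method names (`BlockedThreadReport`, `blockedThreads`, `deadlockedThreadIds`) are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
public class BlockedThreadReport {

    /** Returns one line per thread that is currently BLOCKED waiting for a monitor. */
    public static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // (true, true): also collect locked monitors and locked ownable synchronizers
        // such as ReentrantLock.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
                continue;
            }
            report.add(String.format(
                    "%s id=%d waiting to lock %s owned by %s id=%d, locked synchronizers: %d",
                    info.getThreadName(), info.getThreadId(), info.getLockName(),
                    info.getLockOwnerName(), info.getLockOwnerId(),
                    info.getLockedSynchronizers().length));
        }
        return report;
    }

    /** Returns the ids of threads that form a lock cycle, or an empty array if there is none. */
    public static long[] deadlockedThreadIds() {
        // Considers both object monitors and ownable synchronizers.
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        blockedThreads().forEach(System.out::println);
        System.out.println("deadlocked threads: " + deadlockedThreadIds().length);
    }
}
```

Such a check could be called from a scheduled task or a health endpoint and its output logged, so that the state is preserved before the process is restarted. A thread that is only briefly BLOCKED is normal, so a single hit is not proof of a hang; repeated hits for the same thread and lock would be the signal to look for.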

Additional Background

- Synchronous Java driver
- Java 11
- Linux

We can provide more information if needed.

Thanks for your help!

Attachments

- image-2022-02-03-17-15-21-102.png (58 kB), Feb 03 2022 04:15:22 PM UTC, Robert Zilke
- image-2022-02-03-17-15-43-755.png (21 kB), Feb 03 2022 04:15:44 PM UTC, Robert Zilke
- threaddump-anonymized.tdump (431 kB), Feb 04 2022 08:23:43 AM UTC, Robert Zilke

Issue Links

is caused by: JAVA-3928 Connection pool paused state (Closed)

links to:
- PR1
- PR1 port to 4.4.x
- PR2

Assignee: Valentin Kavalenka
Reporter: Robert Zilke
Reviewers: None
Votes: 0
Watchers: 3

Created: Feb 03 2022 04:49:00 PM UTC
Updated: Oct 28 2023 11:20:51 AM UTC
Resolved: Feb 07 2022 07:56:31 PM UTC
Confidence Status Last Update: 04/Feb/22 4:34 PM
GA Target Date: None
Public Preview Target Date: None
Private Preview Target Date: None
Experiment Target Date: None
