Java Driver / JAVA-4471

Election in replica set can cause deadlock in monitor threads, leading to unavailability

    • Fully Compatible
    • Not Needed

      We have observed blocked threads several times in some of our Java processes shortly after an election in our replica set of 3 MongoDB instances. The blocked threads make our services unresponsive.

      The thread dumps look similar in all processes with blocked threads.

      First blocked thread:

       

      cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-3:27017 id=24 state=BLOCKED
          - waiting to lock <0x0b2400cb> (a com.mongodb.internal.connection.MultiServerCluster)
           owned by cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-1:27017 id=20
          at com.mongodb.internal.connection.AbstractMultiServerCluster.onChange(AbstractMultiServerCluster.java:175)
          at com.mongodb.internal.connection.AbstractMultiServerCluster.access$100(AbstractMultiServerCluster.java:50)
          at com.mongodb.internal.connection.AbstractMultiServerCluster$DefaultServerDescriptionChangedListener.serverDescriptionChanged(AbstractMultiServerCluster.java:139)
          at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.updateDescription(DefaultSdamServerDescriptionManager.java:127)
          at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.update(DefaultSdamServerDescriptionManager.java:81)
          at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:165)
          at java.base@11.0.12/java.lang.Thread.run(Thread.java:829)
          Locked synchronizers: count = 1
            - java.util.concurrent.locks.ReentrantLock$NonfairSync@32e56896
      

      Second blocked thread:

       

       

      cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-2:27017 id=22 state=BLOCKED
          - waiting to lock <0x0b2400cb> (a com.mongodb.internal.connection.MultiServerCluster)
           owned by cluster-ClusterId{value='61f29e20622d627cc0e99c19', description='null'}-mongodb-instance-1:27017 id=20
          at com.mongodb.internal.connection.AbstractMultiServerCluster.onChange(AbstractMultiServerCluster.java:175)
          at com.mongodb.internal.connection.AbstractMultiServerCluster.access$100(AbstractMultiServerCluster.java:50)
          at com.mongodb.internal.connection.AbstractMultiServerCluster$DefaultServerDescriptionChangedListener.serverDescriptionChanged(AbstractMultiServerCluster.java:139)
          at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.updateDescription(DefaultSdamServerDescriptionManager.java:127)
          at com.mongodb.internal.connection.DefaultSdamServerDescriptionManager.update(DefaultSdamServerDescriptionManager.java:81)
          at com.mongodb.internal.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:165)
          at java.base@11.0.12/java.lang.Thread.run(Thread.java:829)
          Locked synchronizers: count = 1
            - java.util.concurrent.locks.ReentrantLock$NonfairSync@166cbc02
      

       

      We renamed the hosts in the thread dump but kept the instance numbers.

      The only way we have found to recover is to restart the affected processes.

      We have seen this problem since we upgraded from MongoDB Driver 4.2.3 (Spring Boot 2.5.6) to MongoDB Driver 4.4.0 (Spring Boot 2.6.2), and later to MongoDB Driver 4.4.1 (Spring Boot 2.6.3).

      Our MongoDB Server instances run version 4.2.2, which should be compatible according to https://docs.mongodb.com/drivers/java/sync/current/compatibility/

      How to Reproduce

      Unfortunately, we cannot reproduce the problem on demand. It occurs at irregular intervals, and each time we have to restart our services.
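
      Since the problem cannot be triggered on demand, one option for capturing evidence the next time it occurs is to poll the JVM's built-in deadlock detection. The following is only a minimal sketch and is not part of the driver: the class name DeadlockProbe and the method findDeadlocks are hypothetical. It uses ThreadMXBean.findDeadlockedThreads(), which considers both object monitors and ownable synchronizers such as ReentrantLock, the two kinds of lock that appear in the thread dump. Whether the JVM would report the blocked monitor threads as a deadlock cycle is an assumption, because the excerpt does not show what the owning thread (id=20) is waiting for.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the MongoDB driver.
public class DeadlockProbe {

    /** Returns one line per deadlocked thread, or an empty list if none are found. */
    public static List<String> findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Checks object monitors and ownable synchronizers (e.g. ReentrantLock).
        // Returns null when no deadlock exists; may throw
        // UnsupportedOperationException on JVMs without synchronizer monitoring.
        long[] ids = mx.findDeadlockedThreads();
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        // true, true: also collect locked monitors and locked synchronizers.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // thread terminated between the two calls
            }
            report.add(info.getThreadName()
                    + " state=" + info.getThreadState()
                    + " waitingOn=" + info.getLockName()
                    + " ownedBy=" + info.getLockOwnerName());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> deadlocks = findDeadlocks();
        if (deadlocks.isEmpty()) {
            System.out.println("no deadlock detected");
        } else {
            deadlocks.forEach(System.out::println);
        }
    }
}
```

      Calling findDeadlocks() periodically, for example from a ScheduledExecutorService, and logging any non-empty result would record which threads and locks are involved before the process is restarted.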

      Additional Background

      • We use the synchronous Java Driver
      • Java 11
      • Linux 

      We can provide more information if needed.

      Thanks for your help!

       

        1. threaddump-anonymized.tdump (431 kB)
        2. image-2022-02-03-17-15-43-755.png (21 kB)
        3. image-2022-02-03-17-15-21-102.png (58 kB)

            Assignee:
            valentin.kovalenko@mongodb.com Valentin Kavalenka
            Reporter:
            robert.zilke@outlook.com Robert Zilke