-
Type: Bug
-
Resolution: Cannot Reproduce
-
Priority: Unknown
-
None
-
Affects Version/s: 4.5.0
-
Component/s: Connection Management
We're facing a connection issue with our MongoDb replica set and our Java application and I do not know where to search for the problem.
The setup is: MongoDb running in Version 5.0.6 on all of our four Debian 11 root servers (no docker or any other containerization).
One server runs a MongoDb instance separately in standalone mode.
The other three servers are all put together as replica set, one is primary, one secondary, one arbiter.
rs.status() is:
{ "set" : "wisdom", "date" : ISODate("2022-03-29T20:22:00.525Z"), "myState" : 2, "term" : NumberLong(1), "syncSourceHost" : "h1.example.com:27017", "syncSourceId" : 0, "heartbeatIntervalMillis" : NumberLong(2000), "majorityVoteCount" : 2, "writeMajorityCount" : 2, "votingMembersCount" : 3, "writableVotingMembersCount" : 2, "optimes" : { "lastCommittedOpTime" : { "ts" : Timestamp(1648585320, 1), "t" : NumberLong(1) }, "lastCommittedWallTime" : ISODate("2022-03-29T20:22:00.228Z"), "readConcernMajorityOpTime" : { "ts" : Timestamp(1648585320, 1), "t" : NumberLong(1) }, "appliedOpTime" : { "ts" : Timestamp(1648585320, 1), "t" : NumberLong(1) }, "durableOpTime" : { "ts" : Timestamp(1648585320, 1), "t" : NumberLong(1) }, "lastAppliedWallTime" : ISODate("2022-03-29T20:22:00.228Z"), "lastDurableWallTime" : ISODate("2022-03-29T20:22:00.228Z") }, "lastStableRecoveryTimestamp" : Timestamp(1648585298, 17), "members" : [ { "_id" : 0, "name" : "h1.example.com:27017", "health" : 1, "state" : 1, "stateStr" : "PRIMARY", "uptime" : 337679, "optime" : { "ts" : Timestamp(1648585319, 3), "t" : NumberLong(1) }, "optimeDurable" : { "ts" : Timestamp(1648585319, 3), "t" : NumberLong(1) }, "optimeDate" : ISODate("2022-03-29T20:21:59Z"), "optimeDurableDate" : ISODate("2022-03-29T20:21:59Z"), "lastAppliedWallTime" : ISODate("2022-03-29T20:21:59.397Z"), "lastDurableWallTime" : ISODate("2022-03-29T20:21:59.397Z"), "lastHeartbeat" : ISODate("2022-03-29T20:21:59.889Z"), "lastHeartbeatRecv" : ISODate("2022-03-29T20:21:59.970Z"), "pingMs" : NumberLong(25), "lastHeartbeatMessage" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "", "electionTime" : Timestamp(1648246078, 2), "electionDate" : ISODate("2022-03-25T22:07:58Z"), "configVersion" : 4, "configTerm" : 1 }, { "_id" : 1, "name" : "h2.example.com:27017", "health" : 1, "state" : 2, "stateStr" : "SECONDARY", "uptime" : 338620, "optime" : { "ts" : Timestamp(1648585320, 1), "t" : NumberLong(1) }, "optimeDate" : ISODate("2022-03-29T20:22:00Z"), "lastAppliedWallTime" : ISODate("2022-03-29T20:22:00.228Z"), "lastDurableWallTime" : ISODate("2022-03-29T20:22:00.228Z"), "syncSourceHost" : "h2.example.com:27017", "syncSourceId" : 0, "infoMessage" : "", "configVersion" : 4, "configTerm" : 1, "self" : true, "lastHeartbeatMessage" : "" }, { "_id" : 2, "name" : "h3.example.com:27017", "health" : 1, "state" : 7, "stateStr" : "ARBITER", "uptime" : 121552, "lastHeartbeat" : ISODate("2022-03-29T20:21:59.116Z"), "lastHeartbeatRecv" : ISODate("2022-03-29T20:21:59.733Z"), "pingMs" : NumberLong(25), "lastHeartbeatMessage" : "", "syncSourceHost" : "", "syncSourceId" : -1, "infoMessage" : "", "configVersion" : 4, "configTerm" : 1 } ], "ok" : 1, "$clusterTime" : { "clusterTime" : Timestamp(1648585320, 1), "signature" : { "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="), "keyId" : NumberLong(0) } }, "operationTime" : Timestamp(1648585320, 1) }
Our main application is running on other Debian servers within docker and uses tomcat:8.5.73-jdk11-corretto as basis. We use the following mongo java driver version:
<dependency> <groupId>org.mongodb</groupId> <artifactId>mongodb-driver-sync</artifactId> <version>4.5.0</version> </dependency>
Within our application we have two database connection managers which both hold their own MongoClient. One is connecting to the standalone MongoDb, the other one to the replica set. Both use mongodb URIs which look like this:
mongodb://<IP-HERE>:27017/?waitQueueMultiple=10
Today we were facing the situation that suddenly our application had a memory hiccup, an OOM exception occurred:
Exception in thread "Thread-130174" java.lang.OutOfMemoryError: Java heap space
Directly 5sec later, the application wasn't able to connect to the replicaSet any more and the following error messages were given out again and again (here two example logs which were directly following each other):
Exception in thread "Thread-129358" com.mongodb.MongoInternalException: The responseTo (13430453) in the response does not match the requestId (13430461) in the request
at com.mongodb.internal.connection.ReplyMessage.<init>(ReplyMessage.java:65)
at com.mongodb.internal.connection.ReplyMessage.<init>(ReplyMessage.java:43)
at com.mongodb.internal.connection.InternalStreamConnection.getCommandResult(InternalStreamConnection.java:540)
at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:425)
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:342)
at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:116)
at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:643)
at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:71)
at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:240)
at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:226)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:126)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:116)
at com.mongodb.internal.connection.DefaultServer$OperationCountTrackingConnection.command(DefaultServer.java:345)
at com.mongodb.internal.operation.CommandOperationHelper.createReadCommandAndExecute(CommandOperationHelper.java:230)
at com.mongodb.internal.operation.FindOperation.lambda$execute$1(FindOperation.java:666)
at com.mongodb.internal.operation.OperationHelper.lambda$withSourceAndConnection$2(OperationHelper.java:575)
at com.mongodb.internal.operation.OperationHelper.withSuppliedResource(OperationHelper.java:600)
at com.mongodb.internal.operation.OperationHelper.lambda$withSourceAndConnection$3(OperationHelper.java:574)
at com.mongodb.internal.operation.OperationHelper.withSuppliedResource(OperationHelper.java:600)
at com.mongodb.internal.operation.OperationHelper.withSourceAndConnection(OperationHelper.java:573)
at com.mongodb.internal.operation.FindOperation.lambda$execute$2(FindOperation.java:661)
at com.mongodb.internal.async.function.RetryingSyncSupplier.get(RetryingSyncSupplier.java:65)
at com.mongodb.internal.operation.FindOperation.execute(FindOperation.java:692)
at com.mongodb.internal.operation.FindOperation.execute(FindOperation.java:86)
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:191)
at com.mongodb.client.internal.MongoIterableImpl.execute(MongoIterableImpl.java:135)
at com.mongodb.client.internal.MongoIterableImpl.iterator(MongoIterableImpl.java:92)
at com.mongodb.client.internal.MongoIterableImpl.forEach(MongoIterableImpl.java:121)
at com.mongodb.client.internal.MappingIterable.forEach(MappingIterable.java:59)
at com.mongodb.client.internal.MappingIterable.into(MappingIterable.java:69)
at com.example.persistence.readManyAsList(SxDao.java:113)
and
Exception in thread "Thread-129766" com.mongodb.MongoInternalException: The responseTo (13430461) in the response does not match the requestId (13430469) in the request
at com.mongodb.internal.connection.ReplyMessage.<init>(ReplyMessage.java:65)
at com.mongodb.internal.connection.ReplyMessage.<init>(ReplyMessage.java:43)
at com.mongodb.internal.connection.InternalStreamConnection.getCommandResult(InternalStreamConnection.java:540)
at com.mongodb.internal.connection.InternalStreamConnection.receiveCommandMessageResponse(InternalStreamConnection.java:425)
at com.mongodb.internal.connection.InternalStreamConnection.sendAndReceive(InternalStreamConnection.java:342)
at com.mongodb.internal.connection.UsageTrackingInternalConnection.sendAndReceive(UsageTrackingInternalConnection.java:116)
at com.mongodb.internal.connection.DefaultConnectionPool$PooledConnection.sendAndReceive(DefaultConnectionPool.java:643)
at com.mongodb.internal.connection.CommandProtocolImpl.execute(CommandProtocolImpl.java:71)
at com.mongodb.internal.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:240)
at com.mongodb.internal.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:226)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:126)
at com.mongodb.internal.connection.DefaultServerConnection.command(DefaultServerConnection.java:116)
at com.mongodb.internal.connection.DefaultServer$OperationCountTrackingConnection.command(DefaultServer.java:345)
at com.mongodb.internal.operation.CommandOperationHelper.createReadCommandAndExecute(CommandOperationHelper.java:230)
at com.mongodb.internal.operation.FindOperation.lambda$execute$1(FindOperation.java:666)
at com.mongodb.internal.operation.OperationHelper.lambda$withSourceAndConnection$2(OperationHelper.java:575)
at com.mongodb.internal.operation.OperationHelper.withSuppliedResource(OperationHelper.java:600)
at com.mongodb.internal.operation.OperationHelper.lambda$withSourceAndConnection$3(OperationHelper.java:574)
at com.mongodb.internal.operation.OperationHelper.withSuppliedResource(OperationHelper.java:600)
at com.mongodb.internal.operation.OperationHelper.withSourceAndConnection(OperationHelper.java:573)
at com.mongodb.internal.operation.FindOperation.lambda$execute$2(FindOperation.java:661)
at com.mongodb.internal.async.function.RetryingSyncSupplier.get(RetryingSyncSupplier.java:65)
at com.mongodb.internal.operation.FindOperation.execute(FindOperation.java:692)
at com.mongodb.internal.operation.FindOperation.execute(FindOperation.java:86)
at com.mongodb.client.internal.MongoClientDelegate$DelegateOperationExecutor.execute(MongoClientDelegate.java:191)
at com.mongodb.client.internal.MongoIterableImpl.execute(MongoIterableImpl.java:135)
at com.mongodb.client.internal.MongoIterableImpl.iterator(MongoIterableImpl.java:92)
at com.mongodb.client.internal.MongoIterableImpl.forEach(MongoIterableImpl.java:121)
at com.mongodb.client.internal.MappingIterable.forEach(MappingIterable.java:59)
at com.mongodb.client.internal.MappingIterable.into(MappingIterable.java:69)
at com.example.persistence.readManyAsList(SxDao.java:113)
Please notice the responseTo and requestId numbers of the subsequent message:
responseTo (13430453) in the response does not match the requestId (13430461) responseTo (13430461) in the response does not match the requestId (13430469)
The responseTo of the following log message was always referencing the requestId from before.
This type of error message was given out since that point for any try to get or store data within the replica set via its database connection manager.
Interestingly the other database connection managers, which connects to the standalone MongoDb instance, wasn't affected by the OOM exception. It was still reading and writing from/to it without any problem.
In the end we restarted our application and everything was working again. But of course we need to find out what the main problem is.
I assume it has something to do with our replica set and the OOM hiccup our app had. Does anyone has a hint what we might need to look into to get this fixed?