- Type: Question
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 4.4.5
- Component/s: None
We run a 4-node replica set. Replication lag on a secondary keeps growing and has reached about 6 hours.
Previously (on 4.4.2/4.4.4), stopping mongod and restarting the host worked as a workaround.
Steps:
- stop mongod service with
sudo systemctl stop mongod
- restart host
After the restart, the mongod service fails to start and logs the following:
{"t":{"$date":"2021-04-29T10:06:47.728+03:00"},"s":"I", "c":"FTDC", "id":20625, "ctx":"initandlisten","msg":"Initializing full-time diagnostic data capture","attr":{"dataDirectory":"/opt/mongodb/data/diagnostic.data"}}
{"t":{"$date":"2021-04-29T10:06:47.729+03:00"},"s":"I", "c":"REPL", "id":21529, "ctx":"initandlisten","msg":"Initializing rollback ID","attr":{"rbid":13}}
{"t":{"$date":"2021-04-29T10:06:47.729+03:00"},"s":"I", "c":"REPL", "id":501401, "ctx":"initandlisten","msg":"Incrementing the rollback ID after unclean shutdown"}
{"t":{"$date":"2021-04-29T10:06:47.729+03:00"},"s":"I", "c":"REPL", "id":21532, "ctx":"initandlisten","msg":"Incremented the rollback ID","attr":{"rbid":14}}
{"t":{"$date":"2021-04-29T10:06:47.730+03:00"},"s":"I", "c":"REPL", "id":21544, "ctx":"initandlisten","msg":"Recovering from stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1619665481,"i":5791}},"topOfOplog":{"ts":{"$timestamp":{"t":1619670282,"i":569}},"t":262},"appliedThrough":{"ts":{"$timestamp":{"t":1619665481,"i":5791}},"t":262},"oplogTruncateAfterPoint":{"$timestamp":{"t":0,"i":0}}}}
{"t":{"$date":"2021-04-29T10:06:47.730+03:00"},"s":"I", "c":"REPL", "id":21545, "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1619665481,"i":5791}}}}
{"t":{"$date":"2021-04-29T10:06:47.730+03:00"},"s":"I", "c":"REPL", "id":21550, "ctx":"initandlisten","msg":"Replaying stored operations from startPoint (inclusive) to endPoint (inclusive)","attr":{"startPoint":{"$timestamp":{"t":1619665481,"i":5791}},"endPoint":{"$timestamp":{"t":1619670282,"i":569}}}}
{"t":{"$date":"2021-04-29T10:06:48.011+03:00"},"s":"I", "c":"FTDC", "id":20631, "ctx":"ftdc","msg":"Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost","attr":{"error":{"code":0,"codeName":"OK"}}}
{"t":{"$date":"2021-04-29T10:07:38.482+03:00"},"s":"F", "c":"REPL", "id":21238, "ctx":"ReplWriterWorker-14","msg":"Writer worker caught exception","attr":{"error":"DuplicateKey{ keyPattern: { _id: 1 }, keyValue: { _id: { id: UUID(\"6fc79d14-fbfd-4dbb-9119-f4055647bd7d\"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) } } }: E11000 duplicate key error collection: config.transactions index: _id_ dup key: { _id: { id: UUID(\"6fc79d14-fbfd-4dbb-9119-f4055647bd7d\"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) } }","oplogEntry":{"ts":{"$timestamp":{"t":1619665666,"i":9781}},"t":262,"v":2,"op":"u","ns":"config.transactions","wall":{"$date":"2021-04-29T03:07:46.794Z"},"fromMigrate":false,"o":{"_id":{"id":{"$uuid":"6fc79d14-fbfd-4dbb-9119-f4055647bd7d"},"uid":{"$binary":{"base64":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","subType":"0"}}},"txnNum":134438,"lastWriteOpTime":{"ts":{"$timestamp":{"t":1619665666,"i":9781}},"t":262},"lastWriteDate":{"$date":"2021-04-29T03:07:46.794Z"}},"o2":{"_id":{"id":{"$uuid":"6fc79d14-fbfd-4dbb-9119-f4055647bd7d"},"uid":{"$binary":{"base64":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","subType":"0"}}}},"b":true}}}
How can we make it work again?
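For triage, the fatal entry and the size of the oplog replay window can be pulled out of the startup log with a short script. The following is a minimal sketch in Python, assuming the log is in the one-JSON-document-per-line format that mongod 4.4 writes. The function names `fatal_entries` and `replay_window_seconds` are hypothetical helpers, and `SAMPLE` holds abbreviated copies of two entries from the log above, not the full lines.

```python
import json


def fatal_entries(lines):
    """Yield parsed log entries whose severity field "s" is "F" (fatal)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not a complete JSON document
        if entry.get("s") == "F":
            yield entry


def replay_window_seconds(entry):
    """Seconds between stableTimestamp and topOfOplog in a log id 21544 entry."""
    attr = entry["attr"]
    stable = attr["stableTimestamp"]["$timestamp"]["t"]
    top = attr["topOfOplog"]["ts"]["$timestamp"]["t"]
    return top - stable


# Abbreviated copies of two entries from the log above (most fields omitted).
SAMPLE = [
    '{"s":"I","c":"REPL","id":21544,"msg":"Recovering from stable timestamp",'
    '"attr":{"stableTimestamp":{"$timestamp":{"t":1619665481,"i":5791}},'
    '"topOfOplog":{"ts":{"$timestamp":{"t":1619670282,"i":569}},"t":262}}}',
    '{"s":"F","c":"REPL","id":21238,"ctx":"ReplWriterWorker-14",'
    '"msg":"Writer worker caught exception",'
    '"attr":{"oplogEntry":{"op":"u","ns":"config.transactions"}}}',
]

if __name__ == "__main__":
    for entry in fatal_entries(SAMPLE):
        oplog_entry = entry["attr"]["oplogEntry"]
        print(entry["id"], entry["ctx"], entry["msg"], oplog_entry["ns"])
    print("replay window (s):", replay_window_seconds(json.loads(SAMPLE[0])))
```

On the entries above, this reports one fatal entry (id 21238, on `config.transactions`) and a replay window of 4801 seconds, about 80 minutes, between the stable timestamp and the top of the oplog. To run it against a real log, pass an open file object to `fatal_entries` instead of `SAMPLE`.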
- related to: WT-7426 Set write generation number when the page image gets created (Closed)