-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.4.6
-
Component/s: Performance, Querying
-
None
-
ALL
Every 60 seconds, Mongo stops processing all requests and pauses for 12-15 seconds.
Running on Amazon EC2 with Windows Server 2012. In the attached screenshot, you can see the pause with the drop in bandwidth every 60 seconds.
Here's the mongostat output. The pausing occurs from 17 seconds until 30 seconds of every minute. That time period doesn't appear in the mongostat output at all.
insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 232 702 535 22 325 957|0 0 508g 1017g 65.1g 8532 troidb2:49.6% 0 14|1 1|1 443k 33m 363 troi-cluster PRI 11:23:06 366 931 777 19 415 1298|0 0 508g 1017g 65.2g 19986 troidb2:66.3% 0 2|0 1|0 591k 25m 358 troi-cluster PRI 11:23:07 170 538 397 16 216 726|0 0 508g 1017g 65.2g 43289 troidb2:42.8% 0 6|0 1|0 309k 25m 362 troi-cluster PRI 11:23:08 140 499 331 20 231 599|0 0 508g 1017g 65.1g 10240 troidb2:46.9% 0 2|0 0|1 310k 33m 359 troi-cluster PRI 11:23:09 189 555 430 22 254 730|0 0 508g 1017g 65.1g 10777 troidb2:36.4% 0 0|0 0|0 345k 20m 359 troi-cluster PRI 11:23:10 189 550 435 11 259 720|0 0 508g 1017g 65.1g 12600 troidb2:47.3% 0 2|0 0|1 414k 23m 361 troi-cluster PRI 11:23:11 134 454 299 38 252 523|0 0 508g 1017g 65.1g 11131 troidb2:52.0% 0 0|0 0|1 244k 24m 359 troi-cluster PRI 11:23:12 228 632 505 19 281 911|0 0 508g 1017g 65.1g 9248 troidb2:43.9% 0 0|0 0|0 398k 19m 357 troi-cluster PRI 11:23:14 214 554 441 17 257 770|0 0 508g 1017g 65.1g 10309 troidb2:42.0% 0 0|0 1|0 338k 17m 360 troi-cluster PRI 11:23:15 134 401 290 25 184 620|0 0 508g 1017g 65.2g 4737 troidb2:84.7% 0 1|0 0|1 229k 16m 362 troi-cluster PRI 11:23:16 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 82 266 193 7 80 411|0 0 508g 1017g 65.1g 2619 troidb2:24.5% 0 42|1 0|1 189k 11m 361 troi-cluster PRI 11:23:32 318 1555 810 27 392 2770|0 1 508g 1017g 65.2g 24367 .:96.3% 0 57|0 0|1 1m 42m 354 troi-cluster PRI 11:23:33 446 1180 995 21 471 1632|0 0 508g 1017g 65.1g 19895 troidb2:55.1% 0 76|0 2|0 783k 48m 351 troi-cluster PRI 11:23:34 519 1224 1088 22 433 1797|0 0 508g 1017g 65.1g 16865 troidb2:53.9% 0 83|0 0|1 825k 40m 357 troi-cluster PRI 11:23:35 478 1253 1066 30 421 1780|0 0 508g 1017g 65.2g 13288 troidb2:46.8% 0 52|0 0|1 802k 66m 350 troi-cluster PRI 11:23:36 371 981 812 22 320 1375|0 0 508g 1017g 65.2g 16197 troidb2:48.3% 0 45|0 0|1 628k 51m 357 troi-cluster PRI 11:23:37 544 1323 1156 22 466 1880|0 0 508g 1017g 65.1g 18658 troidb2:60.9% 0 11|0 0|1 889k 46m 359 troi-cluster PRI 11:23:38 285 689 548 16 284 933|0 0 508g 1017g 65.1g 21010 troidb2:44.6% 0 10|0 0|0 419k 36m 355 troi-cluster PRI 11:23:39 200 532 438 13 307 760|0 0 508g 1017g 65.2g 24531 troidb2:36.6% 0 0|0 1|0 294k 21m 354 troi-cluster PRI 11:23:40 196 515 435 22 373 757|0 0 508g 1017g 65.1g 11106 troidb2:51.7% 0 0|0 0|1 419k 18m 356 troi-cluster PRI 11:23:41 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 182 494 398 13 307 668|0 0 508g 1017g 65.1g 10926 troidb2:39.9% 0 0|0 0|0 367k 16m 355 troi-cluster PRI 11:23:43 102 319 232 18 331 502|0 0 508g 1017g 65.1g 7334 troidb2:47.5% 0 5|0 0|1 224k 17m 358 troi-cluster PRI 11:23:44 176 510 364 17 371 669|0 0 508g 1017g 65.1g 13205 troidb2:55.9% 0 0|0 1|0 300k 20m 353 troi-cluster PRI 11:23:45 190 546 430 13 323 763|0 0 508g 1017g 65.1g 11549 troidb2:64.4% 0 0|0 0|0 330k 22m 358 troi-cluster PRI 11:23:46 167 541 388 24 324 715|0 0 508g 1017g 65.1g 7781 troidb2:47.9% 0 0|0 0|0 335k 25m 355 troi-cluster PRI 11:23:47 139 430 310 32 317 588|0 0 508g 1017g 65.1g 8335 troidb2:45.8% 0 1|0 0|1 269k 16m 355 troi-cluster PRI 11:23:48 302 814 658 38 422 1102|0 0 508g 1017g 65.1g 16742 troidb2:60.0% 0 0|0 1|0 577k 30m 350 troi-cluster PRI 11:23:49 192 668 461 20 345 812|0 0 508g 1017g 65.1g 12057 troidb2:47.4% 0 15|0 1|0 373k 32m 356 troi-cluster PRI 11:23:50 230 669 521 32 360 860|0 0 508g 1017g 65.1g 14420 troidb2:58.9% 0 0|0 1|0 398k 33m 353 troi-cluster PRI 11:23:51 164 477 385 14 291 670|0 0 508g 1017g 65.1g 10433 troidb2:57.3% 0 0|0 2|0 304k 24m 353 troi-cluster PRI 11:23:52 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 122 395 279 29 334 545|0 0 508g 1017g 65.1g 9525 troidb2:52.3% 0 0|0 0|0 245k 15m 356 troi-cluster PRI 11:23:53 173 557 398 32 424 715|0 0 508g 1017g 65.1g 9649 troidb2:49.3% 0 0|0 1|0 391k 24m 352 troi-cluster PRI 11:23:54 229 591 497 12 372 903|0 0 508g 1017g 65.1g 12203 troidb2:45.3% 0 1|0 0|1 358k 17m 352 troi-cluster PRI 11:23:55 156 486 352 11 314 719|0 0 508g 1017g 65.1g 11549 troidb2:48.7% 0 3|0 0|1 319k 18m 356 troi-cluster PRI 11:23:56 198 531 437 15 299 734|0 0 508g 1017g 65.1g 8411 troidb2:44.5% 0 0|0 0|0 382k 14m 353 troi-cluster PRI 11:23:57 217 607 496 22 345 848|0 0 508g 1017g 65.1g 11926 troidb2:53.8% 0 0|0 0|0 358k 27m 356 troi-cluster PRI 11:23:58 120 403 269 23 306 534|0 0 508g 1017g 65.1g 8570 troidb2:47.9% 0 4|0 1|0 235k 14m 354 troi-cluster PRI 11:23:59 138 460 323 39 328 601|0 0 508g 1017g 65.1g 7531 troidb2:44.5% 0 14|1 0|1 345k 21m 351 troi-cluster PRI 11:24:01 162 579 393 28 319 722|0 0 508g 1017g 65.1g 16108 troidb2:51.9% 0 0|0 0|0 338k 29m 354 troi-cluster PRI 11:24:02 144 494 329 21 251 583|0 0 508g 1017g 65.1g 12439 troidb2:57.8% 0 2|0 0|1 290k 17m 355 troi-cluster PRI 11:24:03 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 159 503 363 23 279 677|0 0 508g 1017g 65.1g 11716 troidb2:51.4% 0 1|0 1|0 346k 26m 354 troi-cluster PRI 11:24:04 263 770 632 16 316 1019|0 0 508g 1017g 65.1g 8721 troidb2:38.4% 0 4|0 0|1 533k 28m 352 troi-cluster PRI 11:24:05 220 589 437 13 258 823|0 0 508g 1017g 65.1g 10337 troidb2:41.5% 0 3|0 0|1 405k 29m 355 troi-cluster PRI 11:24:06 226 640 500 21 271 882|0 0 508g 1017g 65.1g 12170 troidb2:47.9% 0 0|0 1|0 423k 23m 350 troi-cluster PRI 11:24:07 170 523 397 15 273 686|0 0 508g 1017g 65.1g 8325 troidb2:38.7% 0 0|0 0|1 299k 31m 354 troi-cluster PRI 11:24:08 191 539 425 16 226 746|0 0 508g 1017g 65.1g 14228 troidb2:49.0% 0 2|0 1|0 364k 25m 353 troi-cluster PRI 11:24:09 166 496 387 12 263 681|0 0 508g 1017g 65.1g 12670 troidb2:51.1% 0 0|0 0|0 377k 21m 353 troi-cluster PRI 11:24:10 70 245 171 10 147 326|0 0 508g 1017g 65.1g 26227 troidb2:28.3% 0 17|0 0|0 237k 11m 350 troi-cluster PRI 11:24:11 175 554 394 16 269 698|0 0 508g 1017g 65.2g 16431 troidb2:47.1% 0 0|0 0|0 393k 25m 348 troi-cluster PRI 11:24:12 202 569 458 9 251 786|0 0 508g 1017g 65.1g 11302 troidb2:42.1% 0 1|0 1|0 323k 22m 347 troi-cluster PRI 11:24:13 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 266 752 591 16 273 994|0 0 508g 1017g 65.1g 13990 troidb2:43.0% 0 0|0 0|0 482k 30m 347 troi-cluster PRI 11:24:14 167 530 383 11 214 742|0 0 508g 1017g 65.1g 7869 troidb2:81.4% 0 21|0 4|0 314k 23m 351 troi-cluster PRI 11:24:15 173 538 404 16 185 796|0 0 508g 1017g 65.1g 5682 troidb2:82.3% 0 33|2 0|2 414k 33m 346 troi-cluster PRI 11:24:31 390 1634 924 21 454 2898|0 1 508g 1017g 65.1g 22906 .:94.6% 0 33|0 0|1 851k 47m 348 troi-cluster PRI 11:24:32 399 1081 917 33 390 1489|0 0 508g 1017g 65.1g 21923 troidb2:59.7% 0 54|0 0|1 778k 46m 354 troi-cluster PRI 11:24:33 425 1151 951 39 505 1578|0 0 508g 1017g 65.1g 17306 troidb2:61.0% 0 41|0 0|1 714k 59m 342 troi-cluster PRI 11:24:35 436 1220 1002 40 488 1704|0 0 508g 1017g 65.2g 13924 troidb2:52.2% 0 13|0 0|1 738k 76m 347 troi-cluster PRI 11:24:36 199 570 435 12 266 863|0 0 508g 1017g 65.2g 13468 troidb2:34.7% 0 0|0 1|0 340k 30m 351 troi-cluster PRI 11:24:37 255 686 576 28 428 948|0 0 508g 1017g 65.1g 13341 troidb2:58.9% 0 0|0 0|0 466k 25m 352 troi-cluster PRI 11:24:38 109 336 245 45 374 523|0 0 508g 1017g 65.1g 11436 troidb2:56.4% 0 5|0 0|1 253k 14m 351 troi-cluster PRI 11:24:39 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 177 499 392 15 385 691|0 0 508g 1017g 65.1g 10683 troidb2:49.8% 0 0|0 0|1 321k 16m 352 troi-cluster PRI 11:24:40 166 549 411 21 342 679|0 0 508g 1017g 65.1g 14133 troidb2:50.1% 0 0|0 0|0 377k 28m 349 troi-cluster PRI 11:24:41 97 325 218 8 190 416|0 0 508g 1017g 65.2g 27273 troidb2:19.0% 0 17|0 1|0 182k 11m 348 troi-cluster PRI 11:24:42 257 785 585 35 385 1009|0 0 508g 1017g 65.3g 17501 troidb2:41.0% 0 0|0 0|0 509k 44m 348 troi-cluster PRI 11:24:43 152 527 365 19 420 685|0 0 508g 1017g 65.1g 13711 troidb2:57.6% 0 4|0 1|0 361k 22m 349 troi-cluster PRI 11:24:44 226 665 509 22 360 941|0 0 508g 1017g 65.1g 14229 troidb2:51.9% 0 0|0 2|0 403k 30m 356 troi-cluster PRI 11:24:46 138 472 317 10 322 682|0 0 508g 1017g 65.1g 10797 troidb2:45.9% 0 0|0 0|0 248k 26m 350 troi-cluster PRI 11:24:47 269 792 615 27 479 1016|0 0 508g 1017g 65.2g 12482 troidb2:48.3% 0 0|0 0|0 469k 31m 354 troi-cluster PRI 11:24:48 206 627 486 34 393 828|0 0 508g 1017g 65.2g 13540 troidb2:56.2% 0 1|0 0|2 438k 29m 352 troi-cluster PRI 11:24:49 108 367 252 25 374 505|0 0 508g 1017g 65.1g 11120 troidb2:62.6% 0 0|0 1|0 218k 16m 354 troi-cluster PRI 11:24:50 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 161 492 365 32 419 644|0 0 508g 1017g 65.1g 13899 troidb2:65.8% 0 5|0 0|1 302k 19m 352 troi-cluster PRI 11:24:51 136 438 316 39 381 575|0 0 508g 1017g 65.2g 11234 troidb2:55.3% 0 13|0 1|0 378k 20m 353 troi-cluster PRI 11:24:52 177 505 391 30 358 752|0 0 508g 1017g 65.2g 14456 troidb2:44.3% 0 1|0 1|0 313k 22m 352 troi-cluster PRI 11:24:54 231 636 503 33 387 883|0 0 508g 1017g 65.1g 13599 troidb2:56.5% 0 0|0 0|0 353k 21m 351 troi-cluster PRI 11:24:55 238 673 538 21 448 1001|0 0 508g 1017g 65.1g 13095 troidb2:39.0% 0 3|0 1|1 450k 30m 348 troi-cluster PRI 11:24:56 145 462 330 40 387 678|0 0 508g 1017g 65.2g 8857 troidb2:55.5% 0 1|0 0|1 290k 17m 351 troi-cluster PRI 11:24:57 127 415 279 24 311 538|0 0 508g 1017g 65.2g 13621 troidb2:43.6% 0 0|0 0|1 262k 17m 353 troi-cluster PRI 11:24:58 54 245 145 8 117 352|0 0 508g 1017g 65.1g 17035 troidb2:56.5% 0 0|0 0|0 194k 17m 352 troi-cluster PRI 11:24:59 101 351 247 15 335 471|0 0 508g 1017g 65.2g 8898 troidb2:48.1% 0 0|0 0|0 244k 19m 352 troi-cluster PRI 11:25:01 163 541 371 24 340 678|0 0 508g 1017g 65.2g 14071 troidb2:51.3% 0 0|0 0|0 319k 21m 353 troi-cluster PRI 11:25:02 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 98 314 209 18 212 407|0 0 508g 1017g 65.2g 10360 troidb2:54.1% 0 4|0 0|1 193k 11m 352 troi-cluster PRI 11:25:03 139 455 322 18 285 626|0 0 508g 1017g 65.2g 11383 troidb2:58.9% 0 0|0 0|0 367k 20m 353 troi-cluster PRI 11:25:04 138 432 310 28 259 565|0 0 508g 1017g 65.2g 8179 troidb2:44.1% 0 1|0 0|1 242k 17m 356 troi-cluster PRI 11:25:05 115 397 276 22 249 626|0 0 508g 1017g 65.2g 12772 troidb2:54.1% 0 0|0 0|0 228k 17m 353 troi-cluster PRI 11:25:06 164 490 363 37 298 706|0 0 508g 1017g 65.2g 13933 troidb2:58.5% 0 0|0 0|0 324k 16m 352 troi-cluster PRI 11:25:07 290 836 645 21 311 1043|0 0 508g 1017g 65.2g 11125 troidb2:51.6% 0 0|0 0|0 453k 37m 355 troi-cluster PRI 11:25:08 102 354 248 15 243 468|0 0 508g 1017g 65.2g 10409 troidb2:47.4% 0 0|0 0|0 184k 15m 353 troi-cluster PRI 11:25:09 313 781 669 10 354 1109|0 0 508g 1017g 65g 11966 troidb2:48.0% 0 0|0 1|0 556k 16m 353 troi-cluster PRI 11:25:10 200 569 445 24 297 772|0 0 508g 1017g 65.1g 6835 troidb2:45.0% 0 0|0 0|0 398k 23m 351 troi-cluster PRI 11:25:11 158 510 355 10 211 647|0 0 508g 1017g 65g 13309 troidb2:45.9% 0 0|0 0|1 274k 20m 354 troi-cluster PRI 11:25:12 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 125 403 295 13 134 529|0 0 508g 1017g 65.1g 22431 troidb2:33.5% 0 14|0 1|0 243k 15m 358 troi-cluster PRI 11:25:13 223 570 472 7 204 817|0 0 508g 1017g 65.1g 24146 troidb2:38.0% 0 15|0 0|1 358k 14m 354 troi-cluster PRI 11:25:14 173 512 399 20 198 721|0 0 508g 1017g 65.1g 10073 troidb2:125.2% 0 25|0 1|1 344k 21m 357 troi-cluster PRI 11:25:15 299 883 708 30 306 1233|0 0 508g 1017g 65.2g 10175 troidb2:89.4% 0 37|0 0|2 645k 30m 352 troi-cluster PRI 11:25:17 69 211 143 3 66 376|0 0 508g 1017g 65g 2177 troidb2:41.1% 0 60|1 0|1 138k 9m 358 troi-cluster PRI 11:25:30 344 1388 831 29 478 2558|0 1 508g 1017g 65.1g 24745 .:98.3% 0 38|0 0|1 803k 32m 356 troi-cluster PRI 11:25:31 388 997 851 27 465 1383|0 0 508g 1017g 65g 17451 troidb2:62.2% 0 61|0 2|1 807k 38m 357 troi-cluster PRI 11:25:32 525 1330 1157 30 472 1788|0 0 508g 1017g 65g 12265 troidb2:51.1% 0 52|1 0|1 930k 61m 352 troi-cluster PRI 11:25:33 410 1057 842 27 417 1453|0 0 508g 1017g 65.1g 14832 troidb2:49.4% 0 7|0 1|0 840k 64m 352 troi-cluster PRI 11:25:34 455 1156 984 28 477 1592|0 0 508g 1017g 65g 15225 troidb2:63.3% 0 1|0 0|1 805k 49m 352 troi-cluster PRI 11:25:35 insert query update delete getmore command flushes mapped vsize res faults locked db idx miss % qr|qw ar|aw netIn netOut conn set repl time 166 527 383 19 349 773|0 0 508g 1017g 65g 9959 troidb2:43.2% 0 7|1 0|1 417k 25m 356 troi-cluster PRI 11:25:36 201 543 428 52 406 802|0 0 508g 1017g 65.1g 14264 troidb2:60.5% 0 0|0 0|1 388k 20m 358 troi-cluster PRI 11:25:37 70 217 164 8 256 345|0 0 508g 1017g 65g 11081 troidb2:50.9% 0 0|0 4|0 139k 6m 355 troi-cluster PRI 11:25:38 122 373 270 20 284 549|0 0 508g 1017g 65g 10036 troidb2:46.2% 0 1|0 0|1 202k 13m 356 troi-cluster PRI 11:25:39
- depends on
-
SERVER-13681 MongoDB stalls during background flush on Windows
- Closed
-
SERVER-13725 Reads & Writes are blocked during Memory Map File Remap on win/solaris
- Closed
-
SERVER-13729 Reads & Writes are blocked during data file allocation on Windows
- Closed
- is duplicated by
-
SERVER-10591 Database is locked (or freezed) when database executes a Datafilesync
- Closed
- is related to
-
SERVER-7973 Mongo should be able to reply to client requests for data in RAM while flushing to disk
- Closed
-
SERVER-13444 Long locked flush without inserts and updates
- Closed
- related to
-
SERVER-9868 heartbeats not responded to during mmap flushing on Windows
- Closed