Type: Task
Resolution: Won't Fix
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: sharding-product-sync
Assigned Teams: Sharding NYC
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name: None
Goal Link: None

Background

We have a HELP ticket describing a scenario in which a faulty disk blocked all disk I/O indefinitely, which put the mongod process into the uninterruptible sleep state. The main problem with this state is that a process blocked in a syscall is not killed when SIGKILL is issued. The user sent `kill -9` to the primary mongod server, but the process did not die. Thirteen minutes after the SIGKILL, the user had to shut down the Amazon EC2 instance to break the hung sessions from multiple mongos proxies to the faulty primary. This happened with v4.0.

More background on why `kill -9` does not kill a process in the uninterruptible sleep state:
https://askubuntu.com/questions/59811/kill-pid-not-really-killing-the-process-why

Various tricks people use to simulate the uninterruptible sleep state:
https://unix.stackexchange.com/questions/134888/simulate-an-unkillable-process-in-d-state

More background on why the kernel prevents killing a process in this state:
https://stackoverflow.com/questions/223644/what-is-an-uninterruptible-process

and LWN article: https://lwn.net/Articles/288056/
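To confirm that a process is actually in the uninterruptible sleep state described above, one option on Linux is to read the state field from `/proc/<pid>/stat`, where `D` denotes uninterruptible sleep. The following is a minimal sketch; the helper names `process_state` and `is_uninterruptible` are illustrative and not part of any existing test facility.

```python
def process_state(stat_line: str) -> str:
    """Extract the one-letter state field from a /proc/<pid>/stat line."""
    # The second field (comm) is wrapped in parentheses and may itself
    # contain spaces or ')', so split on the LAST ')' before parsing.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def is_uninterruptible(pid: int) -> bool:
    """Return True if the process is currently in state 'D' (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return process_state(stat_file.read()) == "D"
```

A test could poll `is_uninterruptible(pid)` for the mongod under test after the disk is suspended, before asserting on how the rest of the cluster reacts.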

Implementation details

Setting up the device mapper takes multiple steps:
1. Create a file and map the file to a new loopback device using `losetup`
2. Use `dmsetup` to map the loopback device to a device mapper device
3. Use `mkfs.ext4` to format it
4. Create the directory structure for the replica set
5. Mount the mapper device on the data directory of one of the replicas (rs1)
6. Use `dmsetup suspend` to simulate the disk outage

This procedure was already done manually and fully reproduced the production outage.
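The steps above could be scripted roughly as follows. This is only a sketch under stated assumptions, not the commands used in the manual reproduction: the backing file path, loop device, mapper name, mount point, and the choice of a `linear` device mapper target are all hypothetical. The function only builds the command lines; actually running them requires root.

```python
import shlex
import subprocess


def outage_commands(backing_file, size_mb, loop_dev, dm_name, mount_point):
    """Return the argv lists for steps 1-6, in order (nothing is executed)."""
    # Device mapper tables are expressed in 512-byte sectors.
    sectors = size_mb * 1024 * 1024 // 512
    # Assumption: a simple 'linear' target that passes I/O straight through.
    table = f"0 {sectors} linear {loop_dev} 0"
    dm_path = f"/dev/mapper/{dm_name}"
    return [
        ["truncate", "-s", f"{size_mb}M", backing_file],   # step 1: backing file
        ["losetup", loop_dev, backing_file],               # step 1: loop device
        ["dmsetup", "create", dm_name, "--table", table],  # step 2
        ["mkfs.ext4", "-q", dm_path],                      # step 3
        ["mkdir", "-p", mount_point],                      # step 4
        ["mount", dm_path, mount_point],                   # step 5
        ["dmsetup", "suspend", dm_name],                   # step 6: outage begins
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first failure (needs root)."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical values; print the plan instead of executing it.
    plan = outage_commands("/tmp/rs1-disk.img", 1024, "/dev/loop7",
                           "rs1_disk", "/data/db/rs1")
    for argv in plan:
        print(shlex.join(argv))
```

In a real test the final `dmsetup suspend` would be issued only after the replica set is up and rs1 is primary, and `dmsetup resume` with the same name would end the simulated outage.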

Not the same as network proxy

Please note that we already have mongobridge to simulate network errors, but this is not the same thing. mongobridge cannot cause an outage inside the mongod; it can only make the client think that the mongod has an outage, which is very different from the scenario in the HELP ticket.

is depended on by

SERVER-55487 Use new device mapper testing facility to reproduce primary disk outage

Closed

Assignee: [DO NOT USE] Backlog - Sharding NYC
Reporter: Andrew Shuvalov (Inactive)
Participants: [DO NOT USE] Backlog - Sharding NYC, Andrew Shuvalov
Votes: 0
Watchers: 8

Created: Mar 24 2021 03:16:57 PM UTC
Updated: Dec 06 2022 01:29:25 AM UTC
Resolved: Apr 23 2021 02:14:18 PM UTC
