Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 1.2.0
Affects Version/s: None
Component/s: None
Labels:
- size-small


Each monitor clones the topology under the read lock, releases the lock, updates its clone, and then acquires the write lock and replaces the shared topology wholesale. This creates a race condition in which one monitor can overwrite the results of another, leaving a ServerDescription out of date until the next heartbeat.
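The clone-then-replace pattern described above can be sketched with standard-library types only. This is an illustration under assumptions, not the driver's code: `Topology`, `ServerType`, `snapshot`, `replace_wholesale`, and `update_in_place` are hypothetical names, and the interleaving is replayed sequentially so the lost update is deterministic. `update_in_place` shows one possible way to avoid the race (applying only the monitor's own change while holding the write lock); it is not necessarily the fix that shipped in 1.2.0.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical stand-ins for the driver's topology types.
#[derive(Clone, Debug, PartialEq)]
enum ServerType {
    Unknown,
    Mongos,
}

type Topology = HashMap<String, ServerType>;

fn new_topology() -> RwLock<Topology> {
    let mut servers = Topology::new();
    servers.insert("localhost:27017".to_string(), ServerType::Unknown);
    servers.insert("localhost:27018".to_string(), ServerType::Unknown);
    RwLock::new(servers)
}

// Racy pattern, step 1: clone the topology under the read lock, then release it.
fn snapshot(topology: &RwLock<Topology>) -> Topology {
    topology.read().unwrap().clone()
}

// Racy pattern, step 2: update the private clone, then take the write lock
// and replace the shared topology wholesale.
fn replace_wholesale(topology: &RwLock<Topology>, mut copy: Topology, address: &str, ty: ServerType) {
    copy.insert(address.to_string(), ty);
    *topology.write().unwrap() = copy;
}

// One possible alternative: apply only this monitor's change under the write lock.
fn update_in_place(topology: &RwLock<Topology>, address: &str, ty: ServerType) {
    topology.write().unwrap().insert(address.to_string(), ty);
}

// Both monitors snapshot before either one writes, as in the reported race.
fn racy_interleaving() -> Topology {
    let topology = new_topology();
    let copy_a = snapshot(&topology);
    let copy_b = snapshot(&topology);
    replace_wholesale(&topology, copy_a, "localhost:27017", ServerType::Mongos);
    // copy_b still lists localhost:27017 as Unknown, so this write reverts it.
    replace_wholesale(&topology, copy_b, "localhost:27018", ServerType::Mongos);
    let result = topology.read().unwrap().clone();
    result
}

// The same two updates applied in place; neither one discards the other.
fn fixed_interleaving() -> Topology {
    let topology = new_topology();
    update_in_place(&topology, "localhost:27017", ServerType::Mongos);
    update_in_place(&topology, "localhost:27018", ServerType::Mongos);
    let result = topology.read().unwrap().clone();
    result
}
```

Calling `racy_interleaving()` returns a map in which localhost:27017 is back to `Unknown`, matching the behaviour reported here, while `fixed_interleaving()` keeps both servers as `Mongos`.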

Repro:

#[cfg_attr(feature = "tokio-runtime", tokio::test(threaded_scheduler))]
#[cfg_attr(feature = "async-std-runtime", async_std::test)]
#[function_name::named]
async fn repro() {
    let _guard: RwLockWriteGuard<()> = LOCK.run_exclusively().await;

    let client = EventClient::new().await;
    for _ in 0..5 {
        client
            .database("test")
            .run_command(doc! { "ping": 1 }, None)
            .await
            .unwrap();
    }

    let mut tallies: HashMap<StreamAddress, u32> = HashMap::new();
    for event in client.get_command_started_events("find") {
        *tallies.entry(event.connection.address.clone()).or_insert(0) += 1;
    }

    assert_eq!(tallies.len(), 2);
}

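Here is some debug output from running this test. In each "got lock" block, the second server listing appears to show the state after that monitor's update. Note how the first mongos (localhost:27017) flips back to Unknown after the second monitor updates the topology. It remains that way for the rest of the test, so only one mongos is ever selected.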

running 1 test
localhost:27017: performing check
localhost:27018: performing check
localhost:27018: check done, updating
localhost:27017: check done, updating
got lock, updating state
pre update servers:
localhost:27017 Unknown
localhost:27018 Unknown
pre update servers:
localhost:27017 Mongos
localhost:27018 Unknown
got lock, updating state
pre update servers:
localhost:27017 Mongos
localhost:27018 Unknown
pre update servers:
localhost:27017 Unknown
localhost:27018 Mongos
thread 'test::coll::repro' panicked at 'assertion failed: `(left == right)`
  left: `0`,
 right: `2`', src/lib.rs:1:1

Assignee: Patrick Freed
Reporter: Patrick Freed
Votes: 0
Watchers: 1

Created: Nov 13 2020 02:13:06 AM UTC
Updated: Oct 28 2023 11:00:43 AM UTC
Resolved: Nov 24 2020 04:59:47 PM UTC
Confidence Status Last Update: 16/Nov/20 8:12 PM
