Core Server / SERVER-88159

mongo::Mutex masks TSAN's ability to detect a lock order inversion

    • Service Arch
    • Fully Compatible
    • ALL
    • v8.0, v7.0, v6.0, v5.0
    • Steps to reproduce:

      Add a unittest like this that contains a simple lock-order-inversion:

      +TEST(GeorgeTest, LockOrderInversion) {
      +    // Print which latch implementation this build uses.
      +#ifndef MONGO_CONFIG_USE_RAW_LATCHES
      +    std::cout << "YYYY Using mongo::Mutex" << std::endl;
      +#else
      +    std::cout << "YYYY Using raw std::mutex" << std::endl;
      +#endif
      +    LOGV2(24148, "XXXX IN George Test");
      +    auto m1 = MONGO_MAKE_LATCH("m1");
      +    auto m2 = MONGO_MAKE_LATCH("m2");
      +
      +    // t1 acquires m1, then m2.
      +    stdx::thread t1([&] {
      +        stdx::lock_guard lg(m1);
      +        LOGV2(24148, "XXXX Got m1 t1");
      +        stdx::lock_guard lg2(m2);
      +        LOGV2(24148, "XXXX Got m2 t1");
      +    });
      +    // t1 finishes before t2 starts, so the test cannot actually deadlock.
      +    t1.join();
      +
      +    // t2 acquires the same latches in the opposite order: m2, then m1.
      +    stdx::thread t2([&] {
      +        stdx::lock_guard lg(m2);
      +        LOGV2(24148, "XXXX Got m2 t2");
      +        stdx::lock_guard lg2(m1);
      +        LOGV2(24148, "XXXX Got m1 t2");
      +    });
      +    t2.join();
      +}

      Then, create two TSAN compile configurations, one with and one without diagnostic latches:

      ./buildscripts/scons.py --dbg=on --opt=on --use-libunwind=off --link-model=dynamic --variables-files=./etc/scons/mongodbtoolchain_stable_clang.vars --ninja ICECC=icecc CCACHE=ccache --sanitize=thread --allocator=system --use-diagnostic-latches=on NINJA_PREFIX=tsan-latches
      

      and

      ./buildscripts/scons.py --dbg=on --opt=on --use-libunwind=off --link-model=dynamic --variables-files=./etc/scons/mongodbtoolchain_stable_clang.vars --ninja ICECC=icecc CCACHE=ccache --sanitize=thread --allocator=system NINJA_PREFIX=tsan
      

      Then, run the test under each configuration. On my VWS, with the no-diagnostic-latches variant (raw std::mutex), TSAN deterministically identifies the lock-order inversion and aborts the program. With the diagnostic-latches variant (mongo::Mutex), running the test multiple times produces only successful runs, and TSAN does not report any issues.

    • Service Arch 2024-04-01, Service Arch 2024-04-15, Service Arch 2024-04-29, Workload Scheduling 2024-05-27, Workload Scheduling 2024-06-10, Workload Scheduling 2024-06-24, Workload Scheduling 2024-07-08

      Using mongo::Mutex instead of raw std::mutex appears to inhibit TSAN's ability to detect lock order inversions.

      We should either fix the issue or only run TSAN on --use-diagnostic-latches=off variants.

      Reproducer attached in 'steps to reproduce'.
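
For comparison outside the MongoDB tree, the same inversion pattern can be sketched with plain std::mutex. This is a minimal illustration, not code from the ticket: the helper name runInvertedLockOrder and the acquisition counter are hypothetical. When built with a TSAN-enabled compiler (for example clang++ with -fsanitize=thread), a call to this function would be expected to produce a lock-order-inversion report, which is the behavior the raw std::mutex build shows above.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper (not from the ticket): two threads take two plain
// std::mutex objects in opposite orders, one thread after the other.
// Returns the number of successful lock acquisitions.
int runInvertedLockOrder() {
    std::mutex m1;
    std::mutex m2;
    int acquisitions = 0;

    // First thread: m1, then m2.
    std::thread t1([&] {
        std::lock_guard<std::mutex> first(m1);
        ++acquisitions;
        std::lock_guard<std::mutex> second(m2);
        ++acquisitions;
    });
    // t1 finishes before t2 starts, so no real deadlock can occur and the
    // counter is never accessed concurrently.
    t1.join();

    // Second thread: m2, then m1 (the inverted order).
    std::thread t2([&] {
        std::lock_guard<std::mutex> first(m2);
        ++acquisitions;
        std::lock_guard<std::mutex> second(m1);
        ++acquisitions;
    });
    t2.join();

    return acquisitions;
}
```

Because the threads never overlap, the function always returns normally. Any report therefore comes from the sanitizer's lock-order tracking rather than from an actual hang.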

            Assignee:
            george.wangensteen@mongodb.com George Wangensteen
            Reporter:
            george.wangensteen@mongodb.com George Wangensteen