  1. Core Server
  2. SERVER-36400

Explicitly destroy the client on exiting the run body of each BackgroundJob

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 4.0.3, 4.1.2
    • Affects Version/s: None
    • Component/s: Storage
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Storage NYC 2018-08-13
    • 3

      SERVER-34798 requires all clients to be destroyed before the ServiceContext is destroyed. However, WiredTigerCheckpointThread destroys its client asynchronously, which can race with the main thread because of the following code in background.cpp:

          {
              // It is illegal to access any state owned by this BackgroundJob after leaving this
              // scope, with the exception of the call to 'delete this' below.
              stdx::unique_lock<stdx::mutex> l(_status->mutex);
              _status->state = Done;
              _status->done.notify_all();
          }
      
          if (selfDelete)
              delete this;
      }
      

      The state is set to "Done" before the thread_local client is destroyed, because at that point the thread is still running and its thread_local client has not yet been torn down. However, setting the state to "Done" and notifying unblocks the main thread, which can then proceed all the way to the destructor of ServiceContext. As a result, the client of WiredTigerCheckpointThread can be destroyed by its own thread after the main thread has already destroyed the ServiceContext.

      To reproduce BF-10032, add a long sleep here.

      The fix should be similar to SERVER-35985: add an ON_BLOCK_EXIT in the run() function of WiredTigerCheckpointThread that explicitly destroys the client before run() returns.
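
      The ordering this fix aims for can be sketched in standalone C++. This is only an illustration under assumptions, not the server code: ScopeGuard is a hypothetical stand-in for ON_BLOCK_EXIT, and FakeClient, EventLog, currentClient, and runJobAndCollectEvents are names invented for this sketch. It shows a guard in the run body releasing the thread_local client before the job publishes "Done", so a waiter can never be unblocked while the client is still alive.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scope guard such as ON_BLOCK_EXIT:
// runs the callback when the enclosing scope is left.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> fn) : _fn(std::move(fn)) {}
    ~ScopeGuard() { _fn(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> _fn;
};

// Thread-safe event log so the ordering of events can be inspected.
struct EventLog {
    std::mutex mutex;
    std::vector<std::string> events;
    void add(const std::string& event) {
        std::lock_guard<std::mutex> lk(mutex);
        events.push_back(event);
    }
};

// Hypothetical client that records its own destruction.
struct FakeClient {
    explicit FakeClient(EventLog& log) : _log(log) {}
    ~FakeClient() { _log.add("client destroyed"); }
    EventLog& _log;
};

// Without an explicit reset, this would only be destroyed at thread exit.
thread_local std::unique_ptr<FakeClient> currentClient;

std::vector<std::string> runJobAndCollectEvents() {
    EventLog log;
    std::mutex stateMutex;
    std::condition_variable done;
    bool isDone = false;

    std::thread job([&] {
        {
            // Sketch of the run() body: create the client, then register
            // a guard that destroys it when this scope is left.
            currentClient.reset(new FakeClient(log));
            ScopeGuard destroyClient([] { currentClient.reset(); });
            log.add("job ran");
        }  // The guard fires here, before "Done" is published.
        {
            std::lock_guard<std::mutex> lk(stateMutex);
            isDone = true;
            log.add("state set to Done");
        }
        done.notify_all();
    });

    {
        std::unique_lock<std::mutex> lk(stateMutex);
        done.wait(lk, [&] { return isDone; });
    }
    log.add("waiter unblocked");
    job.join();
    return log.events;
}
```

      In this sketch, "client destroyed" is always logged before "state set to Done". Removing the ScopeGuard line would defer the client's destruction to thread exit, after the waiter may already have been unblocked, which is the race described above.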

      We should check other BackgroundJobs which create clients in their run() function.

            Assignee:
            xiangyu.yao@mongodb.com Xiangyu Yao (Inactive)
            Reporter:
            xiangyu.yao@mongodb.com Xiangyu Yao (Inactive)
            Votes:
            0
            Watchers:
            4
