Core Server / SERVER-62457

Lock-free reads cause query subsystem to treat unsharded collection as sharded when collection is dropped and re-created (ABA problem)

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 6.0.0-rc0
    • Affects Version/s: 5.1.1, 5.2.0-rc4
    • Component/s: Catalog
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Execution Team 2022-01-24, Execution Team 2022-02-07, Execution Team 2022-02-21, Execution Team 2022-03-07, Execution Team 2022-03-21
    • 19

      At a minimum, this can lead to a server crash with slot-based execution.

      The query subsystem uses CollectionPtr::isSharded() to decide whether to add a plan stage that performs ownership filtering. While constructing the sbe::FilterStage for ownership filtering, ShardFiltererImpl invariants that the CollectionShardingState actually has a shard key pattern associated with it.

      The special-purpose ShardFilterStage used by the classic executor has no such invariant at construction and simply forwards all documents via kUnshardedCollection. This is why I believe only slot-based execution is affected.
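
      The difference between the two paths can be sketched as follows. This is a simplified model, not the server code: ShardingMetadata, strictKeyPattern, and lenientFilter are hypothetical names, and the strict path throws instead of aborting the process so the behaviour can be observed. Only kUnshardedCollection is taken from the description above.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the sharding metadata a filterer consults.
struct ShardingMetadata {
    std::optional<std::string> keyPattern;  // empty when the collection is unsharded
};

// Models the strict (slot-based) path: requesting the key pattern of an
// unsharded collection is treated as a fatal condition. The real server
// hits an invariant and aborts; this sketch throws instead.
const std::string& strictKeyPattern(const ShardingMetadata& md) {
    if (!md.keyPattern) {
        throw std::logic_error("Invariant failure: _keyPattern");
    }
    return *md.keyPattern;
}

enum class Belongs { kBelongs, kDoesNotBelong, kUnshardedCollection };

// Models the lenient (classic) path: with no key pattern, every document
// is let through under kUnshardedCollection.
Belongs lenientFilter(const ShardingMetadata& md, bool ownedByThisShard) {
    if (!md.keyPattern) {
        return Belongs::kUnshardedCollection;
    }
    return ownedByThisShard ? Belongs::kBelongs : Belongs::kDoesNotBelong;
}
```

      With an empty ShardingMetadata, lenientFilter returns kUnshardedCollection while strictKeyPattern fails, which is the asymmetry described above.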

      The problematic sequence involves an unsharded collection being sharded and then dropped and then re-created as an unsharded collection:

      1. Collection exists and is unsharded.
      2. Mongos attaches shard version UNSHARDED to the request.
      3. Collection becomes sharded from some other client running shardCollection.
      4. AutoGetCollectionLockFree checks and sees the collection is now sharded.
      5. Collection is dropped from some other client running drop.
      6. Collection is created again as unsharded from some other client running create.
      7. AutoGetCollectionForReadCommandBase checks and sees the collection is unsharded. In particular, CollectionShardingState::checkShardVersionOrThrow() passes because the request was sent with shard version UNSHARDED and the collection is (once again) unsharded.
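
      The sequence above can be replayed in a small single-threaded model to show why the second check cannot catch the problem. Everything here is a hypothetical sketch: CollectionState, ReaderView, and replayRace are illustrative names, the shard key "{x: 1}" is made up, and the mapping of the two checks onto real server calls is an assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical model of a collection's sharding state.
struct CollectionState {
    std::optional<std::string> shardKey;  // empty means unsharded
    bool isSharded() const { return shardKey.has_value(); }
};

// What the reader has observed after its two separate checks.
struct ReaderView {
    bool sawSharded;           // observation made at step 4
    bool versionCheckPassed;   // outcome of the check at step 7
    bool keyPatternAvailable;  // whether a key pattern exists at step 7
};

ReaderView replayRace() {
    CollectionState coll;                      // step 1: exists, unsharded
    const bool requestSaysUnsharded = true;    // step 2: UNSHARDED attached
    coll.shardKey = "{x: 1}";                  // step 3: shardCollection
    const bool sawSharded = coll.isSharded();  // step 4: first check
    coll = CollectionState{};                  // steps 5-6: drop, then create
    // Step 7: the version check compares the request against the current
    // state only, so it cannot tell that the state changed in between.
    const bool versionCheckPassed = (requestSaysUnsharded == !coll.isSharded());
    return {sawSharded, versionCheckPassed, coll.shardKey.has_value()};
}
```

      In this model the reader ends up with sawSharded set, the version check passed, and no key pattern available. That combination matches the state in which the _keyPattern invariant in the log below fires.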

      {"t":{"$date":"2022-01-08T01:39:33.733+00:00"},"s":"F",  "c":"ASSERT",   "id":23079,   "ctx":"conn23","msg":"Invariant failure","attr":{"expr":"_keyPattern","file":"src/mongo/db/exec/shard_filterer_impl.cpp","line":88}}
      
      /home/ubuntu/mongo-v52/src/mongo/util/stacktrace_posix.cpp:263:40: LibunwindStepIteration
      /home/ubuntu/mongo-v52/src/mongo/util/stacktrace_posix.cpp:434:36: mongo::stack_trace_detail::(anonymous namespace)::printStackTraceImpl(mongo::stack_trace_detail::(anonymous namespace)::Options const&, mongo::StackTraceSink*) (.constprop.360)
      /home/ubuntu/mongo-v52/src/mongo/util/stacktrace_posix.cpp:485:44: mongo::printStackTrace()
      /home/ubuntu/mongo-v52/src/mongo/util/signal_handlers_synchronous.cpp:232:28: abruptQuit
      ??:0:0: ??
      /build/glibc-S9d2JN/glibc-2.27/signal/../sysdeps/unix/sysv/linux/nptl-signals.h:80:0: __libc_signal_restore_set
      /build/glibc-S9d2JN/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:48:0: gsignal
      /build/glibc-S9d2JN/glibc-2.27/stdlib/abort.c:79:0: abort
      /home/ubuntu/mongo-v52/src/mongo/util/assert_util.cpp:121:15: mongo::invariantFailed(char const*, char const*, unsigned int)
      /home/ubuntu/mongo-v52/src/mongo/util/invariant.h:71:33: void mongo::invariantWithLocation<boost::optional<mongo::ShardKeyPattern> >(boost::optional<mongo::ShardKeyPattern> const&, char const*, char const*, unsigned int)
      /home/ubuntu/mongo-v52/src/mongo/db/exec/shard_filterer_impl.cpp:88:35: mongo::ShardFiltererImpl::getKeyPattern() const
      /home/ubuntu/mongo-v52/src/mongo/db/exec/shard_filterer_impl.cpp:87:19: mongo::ShardFiltererImpl::getKeyPattern() const (.cold.555)
      /home/ubuntu/mongo-v52/src/mongo/db/query/sbe_stage_builder.cpp:2728:57: mongo::stage_builder::SlotBasedStageBuilder::buildShardFilter(mongo::QuerySolutionNode const*, mongo::stage_builder::PlanStageReqs const&)
      ...
      /home/ubuntu/mongo-v52/src/mongo/db/query/sbe_stage_builder.cpp:2888:77: mongo::stage_builder::SlotBasedStageBuilder::build(mongo::QuerySolutionNode const*, mongo::stage_builder::PlanStageReqs const&)
      /home/ubuntu/mongo-v52/src/mongo/db/query/sbe_stage_builder.cpp:695:45: mongo::stage_builder::SlotBasedStageBuilder::build(mongo::QuerySolutionNode const*)
      /home/ubuntu/mongo-v52/src/mongo/db/query/stage_builder_util.cpp:76:47: mongo::stage_builder::buildSlotBasedExecutableTree(mongo::OperationContext*, mongo::CollectionPtr const&, mongo::CanonicalQuery const&, mongo::QuerySolution const&, mongo::PlanYieldPolicy*)
      /home/ubuntu/mongo-v52/src/mongo/db/query/get_executor.cpp:881:62: buildExecutableTree
      /home/ubuntu/mongo-v52/src/mongo/db/query/get_executor.cpp:647:18: prepare
      /home/ubuntu/mongo-v52/src/mongo/db/query/get_executor.cpp:1100:52: mongo::(anonymous namespace)::getSlotBasedExecutor(mongo::OperationContext*, mongo::CollectionPtr const*, std::unique_ptr<mongo::CanonicalQuery, std::default_delete<mongo::CanonicalQuery> >, std::function<void (mongo::CanonicalQuery*)>, mongo::PlanYieldPolicy::YieldPolicy, unsigned long)
      /home/ubuntu/mongo-v52/src/mongo/db/query/get_executor.cpp:1164:88: mongo::getExecutor(mongo::OperationContext*, mongo::CollectionPtr const*, std::unique_ptr<mongo::CanonicalQuery, std::default_delete<mongo::CanonicalQuery> >, std::function<void (mongo::CanonicalQuery*)>, mongo::PlanYieldPolicy::YieldPolicy, unsigned long)
      /home/ubuntu/mongo-v52/src/mongo/db/query/get_executor.cpp:1190:38: mongo::getExecutorFind(mongo::OperationContext*, mongo::CollectionPtr const*, std::unique_ptr<mongo::CanonicalQuery, std::default_delete<mongo::CanonicalQuery> >, std::function<void (mongo::CanonicalQuery*)>, bool, unsigned long)
      /home/ubuntu/mongo-v52/src/mongo/db/commands/find_cmd.cpp:548:69: mongo::(anonymous namespace)::FindCmd::Invocation::run(mongo::OperationContext*, mongo::rpc::ReplyBuilderInterface*)
      

            Assignee: Dianna Hohensee (Inactive) (dianna.hohensee@mongodb.com)
            Reporter: Max Hirschhorn (max.hirschhorn@mongodb.com)
            Votes: 0
            Watchers: 17
