  Core Server / SERVER-92106

SEGV in vtable lookup inside CollectionRef::restoreCollection()

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Catalog and Routing
    • ALL
    • CAR Team 2024-07-08, CAR Team 2024-07-22
    • 200
    • 2

      Hit with a run here: https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_amazon_linux2_arm64_all_feature_flags_concurrency_simultaneous_3_linux_enterprise_patch_a325ed8cca04771c6fe4d945c82a25b042a3327d_66845f828e0a8a0007a85b52_24_07_02_20_14_19/tests?execution=0&sortBy=STATUS&sortDir=ASC

      The stack dump gives the faulting function as (something)__gen_vtable_impl(something), and gdb gives it as "typeinfo for mongo::CollectionImpl ()", both indicating that we got a bad call address from the object vtable. I'd take the CollectionImpl bit with a grain of salt, given that the vtable is messed up, but since the caller is CollectionRef::restoreCollection(), I think we can assume the instance whose vtable we're reading is derived from mongo::Collection.

      The run is against a patch branch (SERVER-92060), but the only diff is improvements to print more information in abruptQuitWithAddrSignal(), which doesn't run until after the SEGV has already happened. The additional output does show that the address in the pc register matches the fault address, confirming that we got our bad address from the vtable.

      My best guess is that this is a use-after-free of a Collection instance, with the typeinfo/vtable/whatever pointer in the instance having been modified since the memory was freed.

      Also, you'll likely need to turn off pretty printers to keep gdb from crashing. That is probably also why the core analysis job for this failure failed.

      My best guess for the vtable lookup that triggered the fault is the getPtr()->ns() call in restoreCollection(). gdb lists a line number in boost::optional::ptr_ref() as the call site, which indicates both inlined code and proximity to one of the getPtr() calls.
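
      To illustrate the use-after-free hypothesis, here is a minimal, self-contained model of why a stale object pointer can send a virtual call to the wrong address. It is not MongoDB code: FakeCollection, VTable, callNs() and reuseSlot() are hypothetical names, and the vtable is hand-rolled so the sketch stays well-defined instead of performing a real use-after-free. The overwritten first word here points at another valid table; in the failure above it would presumably be arbitrary data, which is what turns the indirect call into a SEGV.

```cpp
#include <cstring>
#include <string>

// Hand-rolled stand-in for a compiler-generated vtable (illustrative only).
struct VTable {
    std::string (*ns)(const void* self);
};

// Stand-in for a polymorphic object: the first word is the vtable pointer.
struct FakeCollection {
    const VTable* vptr;
    const char* name;
};

// The function the caller expects to reach through the vtable.
std::string realNs(const void* self) {
    return static_cast<const FakeCollection*>(self)->name;
}

// Whatever the overwritten vtable pointer happens to lead to.
std::string strayTarget(const void*) {
    return "<wrong function>";
}

const VTable kRealVTable{&realNs};
const VTable kStrayVTable{&strayTarget};

// Models a virtual call such as ptr->ns(): load the vtable pointer from the
// object, load the function address from the table, then call it.
std::string callNs(const FakeCollection* c) {
    return c->vptr->ns(c);
}

// Models the memory being reused after the object was freed: a later owner
// overwrites the first word of the slot, which is where the vptr lived.
void reuseSlot(void* slot, const VTable* newFirstWord) {
    std::memcpy(slot, &newFirstWord, sizeof(newFirstWord));
}
```

      Calling callNs() on a pointer taken before reuseSlot() runs returns the stored name; calling it on the same pointer afterwards goes through the replaced table and reaches strayTarget(). The holder of the stale pointer has no way to notice the difference, which fits a fault whose pc equals the fault address.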

            Assignee:
            aitor.esteve@mongodb.com Aitor Esteve Alvarado
            Reporter:
            ronald.steinke@mongodb.com Ronald Steinke
            Votes:
            0
            Watchers:
            8
