Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-81876

Improve IDL code generation for command parsers

    • Service Arch
    • Service Arch 2024-01-22

      Let's take the "find" command as an example. The command is defined here, and includes 33 fields at the time of writing this ticket. The generated parser uses a series of if / else if conditional statements to parse a BSON, by going through every element in the object:

      void parseProtected(const IDLParserContext&, const OpMsgRequest& request) {
          ...
          for (const auto& element : request.body) {
              ...
              if (fieldName == kFilterFieldName) { ... }
              else if (fieldName == kProjectionFieldName) { ... }
              else if (fieldName == kSortFieldName) { ... }
              ...
          }
      }
      

      This will translate into multiple string comparisons for each element in the BSON object (on average 16 string comparisons). As a result, parsing commands shows up as a hot-spot in performance profiles, especially for low-latency workloads (e.g. YCSB 100% read). The performance implications scales with the number of fields in a command, and the order of if/else statements.

      We can reduce the cost of parsing commands (both number of instructions and CPI) by either of the following:

      • Reordering if/else statements based on the likelihood of taking each branch to reduce the total number of string comparisons.
      • Avoid parsing the BSON object multiple times if possible (less pressure on the cache).
      • Hashing filed names at compile-time, and using the hashed values to parse commands (my PoC).

      Here are some performance numbers for my PoC using micro-benchmarks:

      Running build/install/bin/find_command_bm
      Run on (8 X 3501.93 MHz CPU s)
      CPU Caches:
        L1 Data 48 KiB (x4)
        L1 Instruction 32 KiB (x4)
        L2 Unified 1280 KiB (x4)
        L3 Unified 55296 KiB (x1)
      Load Average: 0.86, 0.77, 0.54
      --------------------------------------------------------------------------------------
      Benchmark                            Time             CPU   Iterations UserCounters...
      --------------------------------------------------------------------------------------
      BM_ParseFindCommand                510 ns          510 ns      1373844 items_per_second=1.96008M/s
      BM_ParseHeavyFindCommand           566 ns          566 ns      1262140 items_per_second=1.76706M/s
      BM_ParseFindCommandPoC             287 ns          287 ns      2440443 items_per_second=3.48677M/s
      BM_ParseHeavyFindCommandPoC        232 ns          232 ns      3011081 items_per_second=4.30192M/s
      

            Assignee:
            dominic.hernandez@mongodb.com Dominic Hernandez
            Reporter:
            amirsaman.memaripour@mongodb.com Amirsaman Memaripour
            Votes:
            1 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: