  1. Core Server
  2. SERVER-1272

mongo::BufBuilder::grow fails to be inlined, makes appending slow

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 1.5.6
    • Affects Version/s: 1.5.3
    • Component/s: Internal Client
    • None
    • Environment:
      Ubuntu 10.04 x86_64, gcc-4.4.3

      When profiling a program making heavy use of the BSON library, I noticed that a large percentage of time was spent in mongo::BufBuilder::grow. That was odd since I had pre-configured BufBuilder objects with a generous chunk of memory, and I confirmed that realloc was not being called, so the buffer was never actually growing.

      It turns out that the compiler was not inlining mongo::BufBuilder::grow. The function was emitted out-of-line and was being called through the PLT even just to append an integer to an already-allocated buffer.

      I was able to improve throughput a good bit by partitioning BufBuilder::grow into a hot inlined function to do the space check, and an explicitly non-inlined cold function to handle reallocating on overflow:

      /* returns the pre-grow write position */
      inline char* grow(int by) {
          int oldlen = l;
          l += by;
          if ( l > size ) {
              grow_reallocate();
          }
          return data + oldlen;
      }

      /* slow path: reallocates on overflow; deliberately kept out of line */
      void __attribute__((noinline)) grow_reallocate() {
          int a = size * 2;
          if ( a == 0 )
              a = 512;
          if ( l > a )
              a = l + 16 * 1024;
          if ( a > 64 * 1024 * 1024 )
              msgasserted(10000, "BufBuilder grow() > 64MB");
          data = (char *) realloc(data, a);
          size = a;
      }

      After this change, the available-space check was inlined during BufBuilder::append calls, and grow_reallocate was emitted out of line (and never called). This seemed to buy me about a 20% improvement in throughput while constructing complex BSON documents.
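      To make the "never called" observation easy to reproduce, here is a minimal, self-contained sketch of the same split. SketchBufBuilder, appendInt, reallocations and countReallocations are hypothetical names invented for this example, not part of the MongoDB sources, and std::runtime_error stands in for msgasserted. It assumes GCC or Clang for the noinline attribute.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>
#include <stdexcept>

// Hypothetical stand-in for mongo::BufBuilder; only what the example needs.
class SketchBufBuilder {
public:
    explicit SketchBufBuilder(int initialSize)
        : data_(nullptr), len_(0), size_(initialSize), reallocations_(0) {
        if (initialSize > 0) {
            data_ = static_cast<char*>(std::malloc(initialSize));
            if (!data_) throw std::bad_alloc();
        }
    }
    ~SketchBufBuilder() { std::free(data_); }
    SketchBufBuilder(const SketchBufBuilder&) = delete;
    SketchBufBuilder& operator=(const SketchBufBuilder&) = delete;

    void appendInt(std::int32_t v) {
        std::memcpy(grow(static_cast<int>(sizeof v)), &v, sizeof v);
    }
    int len() const { return len_; }
    int reallocations() const { return reallocations_; }

private:
    // Hot path: a bounds check and pointer arithmetic, small enough to inline.
    inline char* grow(int by) {
        int oldlen = len_;
        len_ += by;
        if (len_ > size_) growReallocate();
        return data_ + oldlen;
    }

    // Cold path: same sizing policy as in the report, kept out of line.
    __attribute__((noinline)) void growReallocate() {
        int a = size_ * 2;
        if (a == 0) a = 512;
        if (len_ > a) a = len_ + 16 * 1024;
        if (a > 64 * 1024 * 1024)
            throw std::runtime_error("SketchBufBuilder grow() > 64MB");
        char* p = static_cast<char*>(std::realloc(data_, a));
        if (!p) throw std::bad_alloc();
        data_ = p;
        size_ = a;
        ++reallocations_;  // instrumentation for this example only
    }

    char* data_;
    int len_;
    int size_;
    int reallocations_;
};

// Appends `count` 4-byte ints to a builder pre-sized to `initialSize` bytes
// and returns how many times the cold path ran.
int countReallocations(int initialSize, int count) {
    assert(initialSize >= 0 && count >= 0);
    SketchBufBuilder b(initialSize);
    for (int i = 0; i < count; ++i) b.appendInt(i);
    return b.reallocations();
}
```

      With a builder pre-sized to 4096 bytes, appending 1000 ints (4000 bytes) should never enter growReallocate, which matches the situation described above: every append pays only for the inlined check.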

      The __attribute__((noinline)) is necessary here, since I don't want grow_reallocate inlined into 'grow'. Ideally, grow_reallocate would be at .cc scope, not in the header at all, but perhaps you want to keep the BSON library 'header only'. If that is the case, consider wrapping __attribute__((noinline)) in a macro and marking cold functions, or functions that make expensive library calls like 'realloc', as non-inlinable, since there is little or no benefit to inlining them. Similarly, some in-header functions may be complicated enough that the compiler could reasonably choose not to inline them; splitting those into an inlined hot/fast path and a non-inlined slow/cold path could allow the compiler to inline the common cases more aggressively.
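      As a rough illustration of the macro idea, the sketch below wraps the compiler-specific spelling in one place and applies it to the cold half of a split function. SKETCH_NOINLINE, throwIndexOutOfRange and checkedAt are hypothetical names made up for this example; they are not an existing MongoDB macro or API.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical macro: one spelling for "never inline this function".
#if defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE  // unknown compiler: fall back to no annotation
#endif

// Cold path: builds a message and throws. Inlining this into callers would
// only bloat them, so it is marked non-inlinable.
SKETCH_NOINLINE void throwIndexOutOfRange(std::size_t index, std::size_t size) {
    throw std::out_of_range("index " + std::to_string(index) +
                            " is not below size " + std::to_string(size));
}

// Hot path: one comparison and one load, cheap for the compiler to inline.
inline int checkedAt(const int* values, std::size_t size, std::size_t index) {
    if (index >= size) throwIndexOutOfRange(index, size);
    return values[index];
}
```

      This is written as a single source file. In a header included from several translation units, the cold function would also need inline linkage, for example by being a member function defined inside its class.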

            Assignee:
            alerner Alberto Lerner
            Reporter:
            andrew.morrow@mongodb.com Andrew Morrow (Inactive)
