Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-595

Track how many bytes match in each lexical comparison and don't check prefix bytes we know will match.

    • Type: Icon: Task Task
    • Resolution: Done
    • WT1.6.4
    • Affects Version/s: None
    • Component/s: None
    • None

      @michaelcahill, @agorrod:

      A change that went into Berkeley DB awhile back was to track how many bytes matched when searching through keys on a page, that is, if you're searching through a page for the user key "abXXX", where every key on the page has the prefix "ab", you quickly figure it out and start future comparisons 2 bytes into the strings.

      I wrote a test program that loads 5M records in sorted order, where the key is a 0-padded 10 character integer, preceded by some number of prefix bytes. So, the keys are 0000000001 to 0005000000, and then I add some number of prefix bytes, which would make the keys be something like abc0000000001 to abc0005000000. (Since all of the keys have at least 3 leading 0's, there are always 3 matching bytes.)

      The timed phase of the program is a WT_CURSOR.search call for every key in the table.

      Results from pixiebob:

       0 additional prefix bytes: old  5.88, new 5.83 (0.85%)
       5 additional prefix bytes: old  6.52, new 6.42 (1.53%)
      10 additional prefix bytes: old  6.96, new 6.81 (2.16%)
      20 additional prefix bytes: old  7.92, new 7.67 (3.16%)
      25 additional prefix bytes: old  8.57, new 8.22 (4.08%)
      50 additional prefix bytes: old 10.77, new 9.91 (7.99%)
      

      Obviously, it's no big deal if there aren't strings of prefix bytes, if someone stores URL's in a table, this will pay off.

      What I think is happening is that by starting the search further into the string we're avoiding some number of memory fetches – I can't imagine the number of comparisons we make matters.

      Probabilistically, even starting the search a few bytes into the string should mean we average fewer memory fetches to find the mismatched bytes, but I'm not sure if that's always true, based on what the compiler and/or hardware are doing with byte alignment. (I also asked fedorova if she could shed some light here.)

      I'm inclined to pull this change into the tree, because it's really cheap (3 new uint32_t variables, and a couple of additions/subtractions and a comparison or two in each loop iteration), I don't think it will slow the loop down even when it's not gaining us performance. Sasha may have something useful to say there as well.

      We could probably figure out if this optimization will win on a page-by-page basis, when we read pages into memory, but I didn't look at that in any detail.

            Assignee:
            michael.cahill@mongodb.com Michael Cahill (Inactive)
            Reporter:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated:
              Resolved: