On big-endian platforms, there are situations in which large classes of unequal mongo::Value objects can hash to the exact same hash code. The consequences can be severe for the performance of the aggregation framework. For example, $group is implemented as a hash aggregation, meaning that the Value objects on which we are aggregating serve as keys in a hash table. If all such keys have the same hash, then during the $group build phase each insertion into the hash table will degrade from O(1) to O(n). The build phase as a whole therefore degrades from O(n) to O(n^2). Presumably there could be other code paths where we use Value objects as keys in a hash table which could also be subject to a performance problem.
The specifics of the bug are a bit subtle, so I will explain it by example. Imagine we have two values which represent arrays of strings:
- Value 1: ["a", "x"]
- Value 2: ["b", "x"]
The hashes should be different because the first elements of the two arrays differ. The computation of the hash involves the following steps:
- boost::hash_combine() the Array type tag with the initial seed.
- boost::hash_combine() the first String type tag.
- Hash the first string using MurmurHash3.
- boost::hash_combine() the second String type tag.
- Hash the second string using MurmurHash3.
Imagine that we are hashing ["a", "x"] on a big-endian platform and we have already performed steps 1 and 2. We have some 64-bit hash at this point which we can represent as 4 high-order bytes followed by 4 low-order bytes: 0xhhhhhhhhllllllll. The problem lies with how we call into MurmurHash here. This uses a uint32_t as the seed, which means that the seed becomes 0xllllllll. We pass a pointer to the 64-bit hash (&seed) as the location into which the 32-bit result should be written. On a big-endian machine, this means that we write the resulting 32-bit hash into the 4 high-order bytes. Let's use 0xgggggggg to notate the resulting hash code. The means that the resulting hash is 0xggggggggllllllll. The 32-low order bits do not change at all and remain 0xllllllll.
Here's where things get even more interesting. Step 4 is to hash_combine() the second String type tag. The implementation of boost::hash_combine() is such that the 32 low-order bits of the result are a function of the low-order bits of the incoming seed. This means that the low order of the result are some function of 0xllllllll which we can notate as f(0xllllllll). The final step is to MurmurHash the second string. As before, f(0xllllllll) becomes the seed to the hash function and the resulting 32-bit hash is written into the high-order bits. The critical observation here is that the value of f(0xllllllll) does not depend at all on the contents of the first string ("a")! It is only a function of the initial seed and type tags. The means that ["a", "x"] and ["b", "x"] will have the same hash. Indeed, the hash does not depend whatsoever on the value of the first string in the array!
The short version of all this is that we use MurmurHash in such a way that on big-endian platforms the seed is the low-order 32-bits and the resulting hash is written into the high-order 32-bits. If you attempt to compose multiple such calls to MurmurHash, this means that the resulting hash does not depend at all on the first of the two hashed values. This leads to tons of hash collisions.
I've provided a repro script in the "steps to reproduce" section which demonstrates how this can cause a performance problem for a $group query on a big-endian platform. I've verified that the problem exists on HEAD of the v4.4 branch. I still need to check that the bug exists on the master branch, but I think it does given that the code hasn't changed much.
- is related to
-
SERVER-77702 Replace MurmurHash3 with absl::Hash if possible in Value::hash_combine
- Closed
-
SERVER-77703 Replace MurmurHash3 with absl::Hash if possible in oplog_applier_utils.cpp
- Closed