- Type: Improvement
- Resolution: Won't Do
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Btree, Not Applicable
- Storage Engines
- 8
- StorEng - Defined Pipeline
Note this must be done after WT-12361 is merged into develop. Motivating discussion is here.
In __lex_compare_lt_16, when we compare keys between 32 and 63 bits long, we do the following:
    uint32_t ta32, tb32, ua32, ub32;
    memcpy(&ua32, ustartp, sizeof(uint32_t));
    memcpy(&ta32, tstartp, sizeof(uint32_t));
    memcpy(&ub32, uendp - sizeof(uint32_t), sizeof(uint32_t));
    memcpy(&tb32, tendp - sizeof(uint32_t), sizeof(uint32_t));
    ua = ua32;
    ta = ta32;
    ub = ub32;
    tb = tb32;
which copies the bytes to be compared into two separate uint32_t variables per key before comparison. I think we can make this more efficient by instead copying all of a key's bytes into a single uint64_t variable. See this Compiler Explorer link for the example changes. The same approach should also apply to the subsequent branches, where we copy into uint16_t and uint8_t variables.
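For illustration, here is a minimal sketch of the single-variable idea for the 4 to 7 byte case. It is not the code behind the Compiler Explorer link, which is not reproduced in this ticket. The names pack_4_to_7 and compare_4_to_7 are hypothetical, and the byte-by-byte assembly into big-endian order is an assumption made to keep the sketch portable; a real change would presumably reuse the existing byte-swap handling in __lex_compare_lt_16.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * pack_4_to_7 --
 *     Hypothetical helper: pack the leading and trailing 4 bytes of a key that is 4 to 7 bytes
 *     long into one uint64_t whose numeric order matches the lexicographic order of the bytes.
 */
static uint64_t
pack_4_to_7(const uint8_t *p, size_t len)
{
    uint8_t buf[8];
    uint64_t v;
    int i;

    memcpy(buf, p, 4);               /* Leading 4 bytes. */
    memcpy(buf + 4, p + len - 4, 4); /* Trailing 4 bytes, may overlap the leading ones. */

    /* Assemble in big-endian order so the first key byte is the most significant. */
    v = 0;
    for (i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return (v);
}

/*
 * compare_4_to_7 --
 *     Hypothetical stand-in for the 4 to 7 byte branch: one 64-bit comparison per key pair
 *     instead of two 32-bit comparisons. Returns lencmp when the compared bytes are equal.
 */
static int
compare_4_to_7(const uint8_t *u, const uint8_t *t, size_t len, int lencmp)
{
    uint64_t ua, ta;

    ua = pack_4_to_7(u, len);
    ta = pack_4_to_7(t, len);
    return (ua < ta ? -1 : ua > ta ? 1 : lencmp);
}
```

The overlap between the two 4-byte copies is harmless: if the leading 4 bytes differ, the high half of the packed value decides the result, and if they match, the overlapping bytes in the low half are equal too, so only the remaining trailing bytes can decide it.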
This ticket is to investigate the change and see if there are observable perf improvements. It only applies to comparisons shorter than 64 bits, so we need to make sure the performance tests are using sufficiently short keys.
- depends on: WT-12361 Revisit __wt_lex_compare_skip impl for arm (Closed)