-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: None
-
Storage Engines
As wt_binary_decode.py evolves, it is becoming more useful for viewing the data in .wt files. WT-13167 extends that by adding BSON dumping.
We should have an option to redact information about keys and/or data, and redaction should possibly be the default. When redaction is on, the tool should not show BSON dump output, or even decoded bytes.
Two points. First, we should still go to the trouble of decoding internally (for example, calling the BSON dumper when it is asked for) and simply not print the result. That would verify that the information is not corrupted. Second, instead of showing any bytes, we could print an MD5 hash of the bytes. That might allow us to compare two sets of data and verify that they are the same without revealing anything. For example, we could look at two equivalent leaf blocks from two checkpoints to see whether items were inserted or deleted.
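A minimal sketch of how the two points could fit together, assuming a helper that formats one key or value for output. The names `render_bytes`, `redact`, and `decoder` are hypothetical and are not part of the current wt_binary_decode.py; the output format is also only an illustration.

```python
import hashlib


def render_bytes(data: bytes, redact: bool = True, decoder=None) -> str:
    """Return a printable form of `data` (hypothetical helper, not the tool's API).

    `decoder` is any callable that takes bytes and returns a printable object,
    for example a BSON dumper. It is an assumption made for this sketch.
    """
    decoded = None
    if decoder is not None:
        # Decode even when redacting, so that corrupted data still raises an error.
        decoded = decoder(data)

    if redact:
        # Show only the length and an MD5 digest, never the bytes themselves.
        digest = hashlib.md5(data).hexdigest()
        return f"<redacted len={len(data)} md5={digest}>"

    # Unredacted: prefer the decoded form, otherwise fall back to a hex dump.
    return str(decoded) if decoded is not None else data.hex()


if __name__ == "__main__":
    # Identical items from two checkpoints produce identical redacted output.
    print(render_bytes(b"customer value"))
    print(render_bytes(b"customer value"))
    print(render_bytes(b"customer value", redact=False))
```

Because the digest depends only on the bytes, the redacted output of two leaf blocks can be compared line by line to spot inserted or deleted items. One design consideration: an unsalted hash of a very short or easily guessed value can be matched by brute force, so a keyed or salted hash may be worth considering if that matters.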
With this facility, we could ask TSEs or customers to run the decoder and send us the results, while customer information remains protected.