- Type: Task
- Resolution: Fixed
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- QO 2022-11-28, QO 2022-12-12, QO 2022-12-26
Cardinality estimation (CE) accuracy testing requires a variety of datasets against which to test how accurate the different estimation methods are.
Currently there are two candidates to generate random datasets:
- buildscripts/cost_model/data_generator.py
- src/mongo/db/query/ce/rand_utils.cpp and rand_utils_new.cpp
Neither data generation tool can be called easily from a JS test script. The goal of this task is to investigate whether and how we can use the Python data generator so that a dataset is generated and stored as a JSON file, which a JS test can later load and run various queries against.
It should be straightforward to regenerate those input datasets, or generate new ones. This could be achieved via a script that takes some dataset descriptor(s) and produces a JSON file, which is then stored in a well-known location to be used by the JS test.
This task is about figuring out how to do the above and providing one or two examples. A subsequent task will implement the generation of a variety of datasets.
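The descriptor-to-JSON flow described above could look roughly like the following sketch. It does not use the actual API of buildscripts/cost_model/data_generator.py; the descriptor format, the function names (`generate_dataset`, `write_dataset`), and the output directory are all hypothetical and only illustrate the idea of a seeded, reproducible generator that writes one JSON file per dataset.

```python
import json
import random
from pathlib import Path

# Hypothetical dataset descriptor: the keys and field "types" below are
# assumptions made for this sketch, not the format used by data_generator.py.
DESCRIPTOR = {
    "collection": "ce_data_small",
    "size": 5,
    "seed": 42,
    "fields": {
        "a": {"type": "int", "min": 0, "max": 100},
        "b": {"type": "choice", "values": ["x", "y", "z"]},
    },
}


def generate_value(spec, rng):
    """Produce one random value for a single field specification."""
    if spec["type"] == "int":
        return rng.randint(spec["min"], spec["max"])
    if spec["type"] == "choice":
        return rng.choice(spec["values"])
    raise ValueError(f"unsupported field type: {spec['type']}")


def generate_dataset(descriptor):
    """Generate a list of documents; a fixed seed makes the output reproducible."""
    rng = random.Random(descriptor["seed"])
    return [
        {name: generate_value(spec, rng) for name, spec in descriptor["fields"].items()}
        for _ in range(descriptor["size"])
    ]


def write_dataset(descriptor, out_dir):
    """Write the dataset as a JSON array to <out_dir>/<collection>.json."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{descriptor['collection']}.json"
    with path.open("w") as f:
        json.dump(generate_dataset(descriptor), f, indent=2)
    return path
```

Calling `write_dataset(DESCRIPTOR, "some/well-known/dir")` would regenerate the same file on every run because the seed is part of the descriptor. A JS test could then read that file from the agreed location and insert the documents into a collection before running its queries; how the test loads the file is left open here.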
- depends on
  - SERVER-72036 Implement data generation and loading into JS CE accuracy tests (Closed)