Core Server / SERVER-85034

Investigate data generation and loading into JS CE accuracy tests

    • Type: Task
    • Resolution: Fixed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • QO 2022-11-28, QO 2022-12-12, QO 2022-12-26

      CE accuracy testing requires a variety of datasets against which to test how accurate different estimation methods are.

      Currently there are two candidates to generate random datasets:

      • buildscripts/cost_model/data_generator.py
      • src/mongo/db/query/ce/rand_utils.cpp and rand_utils_new.cpp

      Neither data generation tool can be called easily from a JS test script. The goal of this task is to investigate whether and how we can use the Python data generator to generate a dataset and store it as a JSON file, so that a JS test can later load that file and run various queries against it.

      It should be straightforward to regenerate those input datasets, or generate new ones. This could be achieved via a script that takes some dataset descriptor(s) and produces a JSON file, which is then stored in a well-known location to be used by the JS test.
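
      The workflow described above could look roughly like the following sketch. It does not use the API of buildscripts/cost_model/data_generator.py; the descriptor format, the function names, and the output file naming are all hypothetical and only illustrate the descriptor-to-JSON-file step with a fixed seed so that datasets can be regenerated reproducibly.

```python
import json
import random
import string
from pathlib import Path

# Hypothetical dataset descriptor: this format is an assumption made for
# illustration, not the one used by data_generator.py.
DESCRIPTOR = {
    "collection": "ce_uniform_int",
    "size": 1000,
    "seed": 42,
    "fields": {
        "a": {"type": "int", "min": 0, "max": 100},
        "b": {"type": "str", "length": 5},
    },
}


def generate_value(spec, rng):
    """Generate one random value for a field spec."""
    if spec["type"] == "int":
        return rng.randint(spec["min"], spec["max"])
    if spec["type"] == "str":
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(spec["length"]))
    raise ValueError(f"unsupported field type: {spec['type']}")


def generate_dataset(descriptor):
    """Build the list of documents described by the descriptor."""
    # A seeded generator makes regeneration deterministic.
    rng = random.Random(descriptor["seed"])
    return [
        {name: generate_value(spec, rng) for name, spec in descriptor["fields"].items()}
        for _ in range(descriptor["size"])
    ]


def write_dataset(descriptor, out_dir):
    """Write the dataset as a JSON array to <out_dir>/<collection>.json."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{descriptor['collection']}.json"
    with path.open("w") as f:
        json.dump(generate_dataset(descriptor), f)
    return path
```

      Calling write_dataset(DESCRIPTOR, some_directory) with the same descriptor always produces the same file, so the JSON output would not necessarily need to be checked in if the descriptors are. How the JS test reads the file is left open here.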

      This task is about figuring out how to do the above and providing one or two examples. A subsequent task will implement the generation of a variety of datasets.

            Assignee:
            timour.katchaounov@mongodb.com Timour Katchaounov
            Reporter:
            timour.katchaounov@mongodb.com Timour Katchaounov
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: