  1. Core Server
  2. SERVER-89632

$queryStats: Track size of server-side js code blocks used in queries.

    • Type: Improvement
    • Resolution: Won't Do
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Query Integration

      As we plan to eventually remove support for server-side JavaScript, one prerequisite is making sure users can easily rewrite their existing JS-based queries as native agg, so we may want to introduce some new expressions.
      Today, when JS is used in a query, the $queryStats output captures it as

      filter: { '$where': '?javascript' }

      It would be very helpful to better categorize which native JS methods were used within that JavaScript (e.g. JSON.stringify(), hex_md5()) for queries with $where, $function, or $accumulator.

      Initially, we would have liked to capture this information as part of the query shape. However, the query shape is not the appropriate mechanism for doing so: different functions used within a $where clause are not necessarily indicative of new workloads, and distinguishing them would result in a very high cardinality of query stats entries that all execute some generic JavaScript snippet.
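The cardinality concern can be illustrated with a small sketch. This is not how $queryStats computes shapes; the helper names `shape_opaque` and `shape_with_functions`, the regex-based call extraction, and the sample snippets are all assumptions made for illustration.

```python
import re

# Hypothetical sample $where bodies; not taken from any real workload.
SNIPPETS = [
    "return this.a == this.b",
    "return hex_md5(this.name) == this.hash",
    "return JSON.stringify(this.doc).length > 10",
    "return hex_md5(JSON.stringify(this.doc)) == this.hash",
]

# Naive pattern for "identifier(" call sites. A real implementation would
# need a proper JavaScript parser; this is only good enough for the samples.
CALL_RE = re.compile(r"([A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*)\s*\(")


def shape_opaque(js: str) -> str:
    """Behaviour described in the ticket: the code is fully abstracted away."""
    return "?javascript"


def shape_with_functions(js: str) -> str:
    """Hypothetical alternative: embed the set of called functions in the shape."""
    calls = sorted(set(CALL_RE.findall(js)))
    return "?javascript(" + ",".join(calls) + ")"


opaque_shapes = {shape_opaque(s) for s in SNIPPETS}
function_shapes = {shape_with_functions(s) for s in SNIPPETS}

# Every snippet collapses into one opaque shape, but each distinct
# combination of functions would produce its own query stats entry.
print(len(opaque_shapes), len(function_shapes))
```

With these four samples the opaque placeholder yields a single shape, while the function-aware variant yields one shape per snippet, which is the growth in entries the ticket wants to avoid.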

      The important requirement we should be trying to capture as part of this ticket is how complex the JavaScript issued by each of these queries is.

      Historically, a large proportion of MongoDB customers using server-side JavaScript adopted it because they wrote their applications at a time when the expr() language was not available in find() commands. Server-side JS provided that functionality for them. These customers will typically be running fairly small snippets of JavaScript, usually performing simple field comparison/filtering, which can now easily be expressed in MQL. Instead of changing the query shape to reflect which built-in JavaScript functions are used in the query, it would be more meaningful to identify which customers are running simple vs. complex JavaScript. The size of the JavaScript snippet used in a query is likely a very good indicator of whether the user's JavaScript is simple (read: easily translatable to MQL) or complex.

      With this in mind, we should simply change the query shape either to provide the length of the JavaScript snippet or to categorize customers into two buckets:

      a. Customers that run large amounts of JavaScript (e.g. more than 50 characters, an arbitrary threshold) -> serialize as ?javascript_large.

      b. Customers that run small amounts of JavaScript (e.g. 50 characters or fewer) -> serialize as ?javascript_small.
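The two-bucket proposal could be sketched roughly as follows. This is a minimal illustration, not the server's implementation: the names `SMALL_JS_MAX_CHARS`, `js_placeholder`, and `shapify_where` are hypothetical, the 50-character threshold is the arbitrary number from the ticket, and treating a snippet of exactly 50 characters as small is an assumption.

```python
# Arbitrary threshold taken from the ticket; not a tuned value.
SMALL_JS_MAX_CHARS = 50


def js_placeholder(code: str, threshold: int = SMALL_JS_MAX_CHARS) -> str:
    """Return the placeholder that would stand in for a JS snippet.

    Assumption: a snippet of exactly `threshold` characters counts as small.
    """
    return "?javascript_large" if len(code) > threshold else "?javascript_small"


def shapify_where(filter_doc: dict) -> dict:
    """Replace a top-level $where string with its size bucket (hypothetical helper).

    Other keys are passed through untouched; real query shapes abstract
    them as well, which this sketch does not attempt.
    """
    return {
        key: js_placeholder(value) if key == "$where" and isinstance(value, str) else value
        for key, value in filter_doc.items()
    }


small = shapify_where({"$where": "this.a == this.b"})
large = shapify_where(
    {"$where": "function() { return hex_md5(JSON.stringify(this.doc)) == this.hash; }"}
)
print(small)
print(large)
```

Keeping only two placeholders bounds the number of extra entries per query shape at two, whereas serializing the exact length would add one entry per distinct snippet size.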

            Assignee:
            santiago.roche@mongodb.com Santiago Roche
            Reporter:
            kateryna.kamenieva@mongodb.com Katya Kamenieva
            Votes:
            0
            Watchers:
            6