Spark Connector / SPARK-64

Sampling then projecting in the MongoSamplePartitioner is slow

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Fix Version/s: 1.0.0
    • Affects Version/s: None
    • Component/s: Performance
    • Labels: None

      For example, with the MovieLens dataset (~1 million documents):

      Pipeline: sample, project _id: 76120 ms
      Pipeline: project _id, sample: 1124 ms
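
      The two stage orderings timed above could be sketched as follows. This is a minimal illustration, not the connector's actual code: the function names, the `sample_size` parameter, and the `time_pipeline` helper are hypothetical, and the sample size used for the measurements is not stated in this issue. `collection` is assumed to be a PyMongo `Collection` (any object with an `aggregate(pipeline)` method works).

```python
import time


def sample_then_project(sample_size):
    """Ordering 1: sample whole documents first, then keep only _id."""
    return [
        {"$sample": {"size": sample_size}},
        {"$project": {"_id": 1}},
    ]


def project_then_sample(sample_size):
    """Ordering 2: reduce documents to _id first, then sample."""
    return [
        {"$project": {"_id": 1}},
        {"$sample": {"size": sample_size}},
    ]


def time_pipeline(collection, pipeline):
    """Run the pipeline to completion.

    Returns (elapsed milliseconds, number of documents returned).
    `collection` is assumed to expose aggregate(pipeline), as a
    PyMongo Collection does.
    """
    start = time.perf_counter()
    results = list(collection.aggregate(pipeline))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return elapsed_ms, len(results)
```

      Calling `time_pipeline(collection, sample_then_project(n))` and `time_pipeline(collection, project_then_sample(n))` against the same collection would give a comparison of the kind reported above; actual timings depend on the server version, the data, and the chosen `n`.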

            Assignee: Ross Lawley (ross@mongodb.com)
            Reporter: Ross Lawley (ross@mongodb.com)
            Votes: 0
            Watchers: 1
