  Mongoid / MONGOID-5846

Perf degradation on distinct when legacy_pluck_distinct is disabled

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Unknown
    • Affects Version/s: 8.0.9, 8.1.9
    • Component/s: Performance
    • Ruby Drivers

      When the `legacy_pluck_distinct` flag is disabled, as it is by default, `distinct` is slower than when the flag is enabled. This happens because Mongoid::Contextual::Mongo uses recursive_demongoize, which looks up the field again for each record instead of once per call.

      Here is the code I used for the benchmark:
       

      require 'bundler/inline'
      require 'benchmark'
      
      gemfile do
        source 'https://rubygems.org'
      
        gem 'mongoid', '~> 8.1.0'
      end
      
      Mongoid.configure do |config|
        config.clients.default = {
          hosts: ['mongo:27017'],
          database: 'mongoid_test',
        }
      end
      Mongoid.purge!
      
      class Category
        include Mongoid::Document
      
        has_many :articles
      end
      
      class Article
        include Mongoid::Document
      
        field :body, type: String
      
        belongs_to :category
      end
      
      category = Category.create!
      1000.times do |i|
        Article.create!(body: "Article #{i}", category: category)
      end
      
      Benchmark.bm(30) do |x|
        Mongoid::Config.legacy_pluck_distinct = true
        x.report('legacy_pluck_distinct enabled') { 100.times { category.articles.distinct(:id) } }
        Mongoid::Config.legacy_pluck_distinct = false
        x.report('legacy_pluck_distinct disabled') { 100.times { category.articles.distinct(:id) } }
      
        context = category.articles.context
        name = '_id'
      
        # Look up `field` only once per distinct call
        x.report('field found once') do
          100.times do
            field = Article.traverse_association_tree(name)
            context.view.distinct(name).map do |value|
              context.send(:demongoize_with_field, field, value, false)
            end
          end
        end
      
        # Look up `field` again for each record
        x.report('field found for each record') do
          100.times do
            context.view.distinct(name).map do |value|
              field = Article.traverse_association_tree(name)
              context.send(:demongoize_with_field, field, value, false)
            end
          end
        end
      end
      

      The result of the benchmark:

                                          user     system      total        real
      legacy_pluck_distinct enabled   0.092737   0.005985   0.098722 (  0.175196)
      legacy_pluck_distinct disabled  0.212732   0.002981   0.215713 (  0.298624)
      field found once                0.102693   0.001005   0.103698 (  0.178545)
      field found for each record     0.167417   0.003969   0.171386 (  0.246786)
      

      The difference is not dramatic here, but it will obviously matter more for bigger collections.
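
      The gap between the 'field found once' and 'field found for each record' cases comes down to hoisting the field lookup out of the loop. Here is a minimal sketch of that idea written without Mongoid. `Field`, `FieldRegistry`, `demongoize_each` and `demongoize_hoisted` are hypothetical stand-ins for illustration only, not Mongoid APIs, and the real lookup is likely more expensive than a hash fetch.

```ruby
# Hypothetical stand-in for a field definition (not Mongoid's real class).
Field = Struct.new(:name) do
  # Convert a raw database value to its Ruby form; here simply a string.
  def demongoize(value)
    value.to_s
  end
end

# Hypothetical stand-in for the field lookup; it counts how often it runs.
class FieldRegistry
  attr_reader :lookups

  def initialize
    @fields = { '_id' => Field.new('_id') }
    @lookups = 0
  end

  def find(name)
    @lookups += 1
    @fields.fetch(name)
  end
end

# Shape of the slow path: the field is looked up once per value.
def demongoize_each(registry, name, values)
  values.map { |value| registry.find(name).demongoize(value) }
end

# Shape of the suggested path: the field is looked up once per call.
def demongoize_hoisted(registry, name, values)
  field = registry.find(name)
  values.map { |value| field.demongoize(value) }
end
```

      Both methods return the same values. For 1000 input values the first performs 1000 lookups and the second performs one, which is why the saving grows with the number of distinct values returned.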

            Assignee: Unassigned
            Reporter: Guirec Corbel (guirec.corbel@gmail.com)
            Votes: 0
            Watchers: 4