Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-34687

Aggregation Index Performance

    • Type: Icon: Bug Bug
    • Resolution: Incomplete
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Aggregation Framework
    • None
    • ALL
    • Hide

      The following aggregation:

      db.Collection.aggregate([
          {
              $match: {
                  "SimulationDate": {$gte: ISODate("2018-03-21 00:00:00")},
                  "ParentId": null
              }
          },
          {
              $project: {
                  "CPF": "$User.CPF",
                  "Data": "$SimulationDate",
                  "Email": "$User.Email",
                  "TelCelular": {$concat: ["$User.MobilePhone.AreaCode","-","$Borrower.MobilePhone.Number"]},
                  "TelOutro": {$concat: ["$User.HomePhone.AreaCode","-","$Borrower.HomePhone.Number"]},
              }
          },
          {
              $lookup: {
                  from: "Collection",
                  let: {
                      id: "$Email",
                      data: "$Data"
                  },
                  pipeline: [
                      {
                          $match: {
                              $expr: {$and: [
                                  {$eq: ["$$id", "$User.Email"]},
                                  {$gte: ["$SimulationDate", ISODate("2018-03-21 00:00:00")]},
                                  {$lte: ["$SimulationDate", "$$data"]},
                              ]}
                          }
                      },
                      {
                          $project: {
                              "CPF_Email": "$User.CPF",
                              "Data_Email": "$SimulationDate"
                          }
                      }
                  ],
                  as: "Email"
              }
          }
      ])
      

      seems to run much slower (I couldn't even wait for the first line to return) than the (similar?) find+Javascript script:

      var ds = new Date(2018,2,21)
      db.Collection.find({"SimulationDate": {$gte: ds},"ParentId": null},{'SimulationDate':1,'User.Email':1}).forEach(p=>{
          db.Collection.find({'User.Email':p.User.Email,SimulationDate:{$gte:ds, $lte:p.SimulationDate}}).forEach(pb=>{
              print (pb.User.Email+","+pb.SimulationDate)
          })
      })
      
      Show
      The following aggregation: db.Collection.aggregate([ { $match: { "SimulationDate": {$gte: ISODate("2018-03-21 00:00:00")}, "ParentId": null } }, { $project: { "CPF": "$User.CPF", "Data": "$SimulationDate", "Email": "$User.Email", "TelCelular": {$concat: ["$User.MobilePhone.AreaCode","-","$Borrower.MobilePhone.Number"]}, "TelOutro": {$concat: ["$User.HomePhone.AreaCode","-","$Borrower.HomePhone.Number"]}, } }, { $lookup: { from: "Collection", let: { id: "$Email", data: "$Data" }, pipeline: [ { $match: { $expr: {$and: [ {$eq: ["$$id", "$User.Email"]}, {$gte: ["$SimulationDate", ISODate("2018-03-21 00:00:00")]}, {$lte: ["$SimulationDate", "$$data"]}, ]} } }, { $project: { "CPF_Email": "$User.CPF", "Data_Email": "$SimulationDate" } } ], as: "Email" } } ]) seems to run much slower (I couldn't even wait for the first line to return) than the (similar?) find+Javascript script: var ds = new Date(2018,2,21) db.Collection.find({"SimulationDate": {$gte: ds},"ParentId": null},{'SimulationDate':1,'User.Email':1}).forEach(p=>{ db.Collection.find({'User.Email':p.User.Email,SimulationDate:{$gte:ds, $lte:p.SimulationDate}}).forEach(pb=>{ print (pb.User.Email+","+pb.SimulationDate) }) })

      It seems that Index usage during $lookup phases is not fully optimized for $pipeline+$expr lookups.

            Assignee:
            asya.kamsky@mongodb.com Asya Kamsky
            Reporter:
            danielp@bompracredito.com.br Daniel Polistchuck
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

              Created:
              Updated:
              Resolved: