Node.js Driver / NODE-220

Usage of ensureIndex in GridStore on files and chunk collections.

    • Type: Task
    • Resolution: Done
    • Priority: Minor - P4
    • Fix Version/s: 1.4.8
    • Affects Version/s: 1.3, 1.4, 1.4.8
    • Component/s: None

      GridStore.prototype.open ensures an index on the filename field of the files collection, as well as one on the files_id and n fields of the chunks collection. Setting aside the obvious performance benefits, these indices are not required. I am curious as to why they are embedded in the driver.

      The GridFS index documentation mentions indexing only with regard to the chunks collection, although it does recommend consulting the specific driver.

      How these calls came into existence is readily visible in the commit history and in the HISTORY change log. Both mention a ticket #649, but I have been unable to locate that issue, which is why I'm asking here about the rationale behind the enhancement. It appears to have originated in version 1.1.4, but JIRA only goes back to 1.3, and I can't find an "Issues" section for the project on GitHub.

      The specific source in question is as follows:

      lib/mongodb/gridfs/gridstore.js
      var collection = self.collection();
      // Put index on filename
      collection.ensureIndex([['filename', 1]], writeConcern, function(err, index) {
        if(err) return callback(err);
      
        // Get chunk collection
        var chunkCollection = self.chunkCollection();
        // Ensure index on chunk collection
        chunkCollection.ensureIndex([['files_id', 1], ['n', 1]], writeConcern, function(err, index) {
          if(err) return callback(err);
          _open(self, writeConcern, callback);
        });
      });
      

      Because open calls back immediately when ensureIndex returns an error, this is not a passive implementation. The operation could fail with the MongoDB errors IndexOptionsConflict (85) or IndexKeySpecsConflict (86), which are unrelated to the legitimacy of the open method, if either of the indices has been tailored differently, e.g. with a unique index on filename rather than the non-unique one ensured by the driver.
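
      To illustrate the failure mode, here is a minimal sketch that runs without a database. The makeStubCollection helper and the simplified open function are hypothetical stand-ins, not driver code; the stub only mimics the server rejecting an index whose options differ from an existing index of the same name. The { w: 1 } write concern is likewise just an example value.

```javascript
// Hypothetical stand-in for a driver collection (assumption, not the real API).
// existingIndexes maps an index name to the options it was created with.
function makeStubCollection(existingIndexes) {
  return {
    ensureIndex: function (fields, options, callback) {
      // Default index names join field and direction, e.g. "filename_1".
      var name = fields.map(function (f) { return f[0] + '_' + f[1]; }).join('_');
      var existing = existingIndexes[name];
      var wantUnique = Boolean(options && options.unique);
      if (existing && Boolean(existing.unique) !== wantUnique) {
        var err = new Error('Index ' + name + ' already exists with different options');
        err.code = 85; // IndexOptionsConflict
        return callback(err);
      }
      existingIndexes[name] = { unique: wantUnique };
      callback(null, name);
    }
  };
}

// Simplified version of the control flow quoted above; the final callback
// stands in for the call to _open.
function open(files, chunks, writeConcern, callback) {
  files.ensureIndex([['filename', 1]], writeConcern, function (err) {
    if (err) return callback(err);
    chunks.ensureIndex([['files_id', 1], ['n', 1]], writeConcern, function (err) {
      if (err) return callback(err);
      callback(null, 'opened');
    });
  });
}

// The files collection already carries a unique index on filename.
var files = makeStubCollection({ filename_1: { unique: true } });
var chunks = makeStubCollection({});

var result = {};
open(files, chunks, { w: 1 }, function (err, state) {
  result.err = err;
  result.state = state;
});
// result.err.code is 85 and result.state is undefined: the store never opens.
```

      In this sketch the store never opens, even though nothing is wrong with the file being requested.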

      My questions are:

      1. What is the rationale behind this embedded index assurance? Is this simply included to guarantee that the developer doesn't forget this critical step?
      2. Shouldn't collection indexing be up to the developer/DBA/etc.?
      3. Should this have been implemented without the control flow (explicit callback on error, flow deviation), or should the ensureIndex invocations be removed completely?
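
      To make question 3 concrete, here is a rough sketch of what a more passive variant might look like. The names ensureIndexPassively and openPassively and the stub collections are hypothetical and exist only for illustration; only the ensureIndex(fields, options, callback) call shape and error codes 85 and 86 come from the text above. Whether index conflicts should be swallowed like this is exactly the open question.

```javascript
// Server error codes named in this ticket.
var INDEX_OPTIONS_CONFLICT = 85;
var INDEX_KEY_SPECS_CONFLICT = 86;

// Hypothetical helper: ensure an index, but treat a conflict with an
// existing, differently configured index as non-fatal.
function ensureIndexPassively(collection, fields, options, callback) {
  collection.ensureIndex(fields, options, function (err, indexName) {
    var isConflict = err &&
      (err.code === INDEX_OPTIONS_CONFLICT || err.code === INDEX_KEY_SPECS_CONFLICT);
    if (err && !isConflict) return callback(err);
    callback(null, err ? null : indexName);
  });
}

// Hypothetical stand-in for open(); `proceed` plays the role of _open.
function openPassively(files, chunks, writeConcern, proceed, callback) {
  ensureIndexPassively(files, [['filename', 1]], writeConcern, function (err) {
    if (err) return callback(err);
    ensureIndexPassively(chunks, [['files_id', 1], ['n', 1]], writeConcern, function (err) {
      if (err) return callback(err);
      proceed(callback);
    });
  });
}

// Stub collections, for illustration only.
function stubFailingWith(code) {
  return {
    ensureIndex: function (fields, options, cb) {
      var err = new Error('stub index error');
      err.code = code;
      cb(err);
    }
  };
}
var okStub = {
  ensureIndex: function (fields, options, cb) { cb(null, 'files_id_1_n_1'); }
};

// Try one index conflict (85) and one unrelated error (13, Unauthorized).
var outcomes = [];
[85, 13].forEach(function (code) {
  openPassively(stubFailingWith(code), okStub, { w: 1 },
    function (cb) { cb(null, 'opened'); },
    function (err, state) { outcomes.push(err ? 'error ' + err.code : state); });
});
// outcomes is ['opened', 'error 13']
```

      With this shape, a tailored index no longer blocks open, while unrelated failures still surface to the caller.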

      If a change will be considered, I can create a new improvement ticket, or simply update this one.

            Assignee: christkv Christian Amor Kvalheim
            Reporter: zammit.andrew@gmail.com Andrew Zammit
            Votes: 0
            Watchers: 1