Type: Bug
Resolution: Works as Designed
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 3.11.2, 3.12.4, 4.0.3
Component/s: API
Labels:
None
Environment:
mongodb 3.6.2 has been used, others not checked

Confidence Status:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

The background:

In our system we have two GridFS buckets:

I realised that the second one is getting slower and slower the longer we use it (as more and bigger files are added). The delete operation is also terribly slow on some nodes, taking up to 20 seconds per file.

When searching for the problem I also checked the indices, because in most cases wrong or missing indices are the reason for slow DB results. And I found this:

While one of the buckets has the indices expected by the GridFS specification, the other does not. This is the case on several, but not all, of our server instances. These instances are independent of each other: they run the same software but do not share the same data.
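A rough sketch of how such an index check could be automated. It assumes the index key documents of both collections have already been fetched (for example via the driver's index listing) and converted to ordered maps. The class `GridFsIndexCheck`, its methods, and the bucket name `fs` are hypothetical names used only for illustration. The two expected key patterns are the ones created by the constructor quoted in this report. The unique flag on the chunks index is not checked here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reports which GridFS-spec indexes are absent from a bucket. */
public class GridFsIndexCheck {

    /** Builds an ordered ascending key spec such as {files_id: 1, n: 1}. */
    static Map<String, Integer> keys(String first, String second) {
        Map<String, Integer> spec = new LinkedHashMap<>();
        spec.put(first, 1);
        spec.put(second, 1);
        return spec;
    }

    /** True if one of the existing indexes has exactly the expected keys in the same order. */
    static boolean containsOrdered(List<Map<String, Integer>> existing,
                                   Map<String, Integer> expected) {
        List<Map.Entry<String, Integer>> wanted = new ArrayList<>(expected.entrySet());
        for (Map<String, Integer> index : existing) {
            // Compare entry lists, because Map.equals ignores key order
            if (new ArrayList<>(index.entrySet()).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the collections of the bucket whose expected index is missing. */
    static List<String> missing(String bucket,
                                List<Map<String, Integer>> filesIndexes,
                                List<Map<String, Integer>> chunksIndexes) {
        List<String> result = new ArrayList<>();
        if (!containsOrdered(filesIndexes, keys("filename", "uploadDate"))) {
            result.add(bucket + ".files");
        }
        if (!containsOrdered(chunksIndexes, keys("files_id", "n"))) {
            result.add(bucket + ".chunks");
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> idIndex = Collections.singletonMap("_id", 1);

        // Example input: the files collection is indexed, the chunks collection only has _id
        List<Map<String, Integer>> files = new ArrayList<>();
        files.add(idIndex);
        files.add(keys("filename", "uploadDate"));
        List<Map<String, Integer>> chunks = new ArrayList<>();
        chunks.add(idIndex);

        System.out.println("Missing GridFS indexes: " + missing("fs", files, chunks));
    }
}
```

Running a check like this against every instance would show which buckets are affected without inspecting each one by hand.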

The problem:

I analysed the source code of the java driver in the version I use (3.11) and the newer ones (3.12, 4.*, master) and found out that the indices are only created under the following conditions:

(GridFS.java)

public GridFS(final DB db, final String bucket) {

.......

// ensure standard indexes as long as collections are small
try {
    if (filesCollection.count() < 1000) {
        filesCollection.createIndex(new BasicDBObject("filename", 1).append("uploadDate", 1));
    }
    if (chunksCollection.count() < 1000) {
        chunksCollection.createIndex(new BasicDBObject("files_id", 1).append("n", 1),
                                     new BasicDBObject("unique", true));
    }
} catch (MongoException e) {
    //TODO: Logging
}

This means: when a GridFS object is created for a bucket that holds fewer than 1000 items, these indices should be created, but that is not the case, as you can see in the screenshots.

Currently I don't know why they are created for one DB and not for the other. My speculation is that it has to do with the fact that on some instances the bucket that is missing the index is filled with many files directly after creation. So it could be the case that...

1) the index is not created, because the bucket does not exist at startup

2) the index is not created on the second connect, because the db already contains more than 1000 chunks.

==> the index is never created
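The suspected sequence can be written down as a toy model. This is only a sketch of the speculation above, not driver code: the class `IndexGuardModel` and its methods are hypothetical, and the rule that nothing is indexed while the bucket does not exist is assumption 1) from the list, not confirmed driver behaviour. Only the `< 1000` guard is taken from the quoted constructor.

```java
/** Toy model of the count-guarded index creation quoted above; not driver code. */
public class IndexGuardModel {

    // Limit used in the quoted GridFS constructor
    static final int THRESHOLD = 1000;

    private boolean collectionExists = false;
    private long chunkCount = 0;
    private boolean indexed = false;

    /** Mimics constructing a GridFS object for this bucket. */
    void connect() {
        // Assumption 1) from the report (unverified): no index while the bucket does not exist
        if (!collectionExists) {
            return;
        }
        // Guard from the quoted code: create the index only while the collection is small
        if (chunkCount < THRESHOLD) {
            indexed = true;
        }
    }

    /** Mimics uploading files, which creates the bucket on first use. */
    void upload(long chunks) {
        collectionExists = true;
        chunkCount += chunks;
    }

    boolean isIndexed() {
        return indexed;
    }

    /** First connect on an empty DB, then an upload, then a second connect. */
    static boolean simulate(long chunksBeforeSecondConnect) {
        IndexGuardModel bucket = new IndexGuardModel();
        bucket.connect();
        bucket.upload(chunksBeforeSecondConnect);
        bucket.connect();
        return bucket.isIndexed();
    }

    public static void main(String[] args) {
        System.out.println("small bucket indexed: " + simulate(10));
        System.out.println("bulk-loaded bucket indexed: " + simulate(5000));
    }
}
```

Under these assumptions a bucket that receives 1000 or more chunks between the first and the second connect stays without an index on every later connect as well, which would match what I see on the affected instances.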

I will try to investigate further and provide updates. But I think this is quite an important issue, because it drastically affects performance.


Bildschirmfoto 2020-05-15 um 23.47.06.png
26 kB
May 15 2020 09:48:32 PM UTC
Bildschirmfoto 2020-05-15 um 23.47.16.png
79 kB
May 15 2020 09:51:14 PM UTC

Assignee: Unassigned

Reporter: Andreas Filler

Reviewers: None

Votes: 0

Watchers: 2

Created: May 15 2020 10:08:29 PM UTC

Updated: Oct 27 2023 01:20:59 PM UTC

Resolved: May 18 2020 12:05:38 PM UTC
