- Type: Improvement
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 4.0.5
- Component/s: Sharding
- Fully Compatible
- Query 2020-10-05, Query 2020-10-19
- (copied to CRM)
To filter out orphaned documents during counts, shards in a sharded cluster avoid COUNT_SCAN in favor of an IXSCAN and examine the matching documents. This may be difficult to avoid, especially for queries that do not include the shard key, but the impact can be noticeable for some workloads.
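The difference is visible in the explain output as well as in the server logs. Below is a minimal mongo shell sketch, assuming the database, collection, and field names from the report that follows; the host names and ports are placeholders.
{code:javascript}
// Run through mongos: each shard's winning plan typically contains a
// SHARDING_FILTER stage (with a FETCH over the IXSCAN), because each
// document's ownership has to be checked to exclude orphans, which is why
// docsExamined grows with the result size.
var mongos = new Mongo("mongos-host:27017");
mongos.getDB("segmentation").getCollection("contacts")
      .explain("executionStats")
      .count({ COUNTRY: "US" });

// Run directly against one shard's mongod: no orphan filtering is applied,
// so the same predicate can be answered from the index alone via COUNT_SCAN
// and docsExamined stays at 0.
var shard = new Mongo("shard0-host:27018");
shard.getDB("segmentation").getCollection("contacts")
     .explain("executionStats")
     .count({ COUNTRY: "US" });
{code}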
Original report follows:
I have a simple collection with 40M documents distributed over 2 shards. The documents contain a field named COUNTRY with a sparse index on it.
I am trying to count documents with a specific value for COUNTRY. When counting on the shards directly, the plan looks correct: it uses COUNT_SCAN with the index and returns a count of about 6M documents on each shard without loading them.
{{2019-01-11T16:03:02.643+0000 I COMMAND [conn102] command segmentation.contacts appName: "MongoDB Shell" command: count { count: "contacts", query:{ COUNTRY: "US" }, fields: {}, lsid: { id: UUID("60be11b5-6299-416b-9ead-8c58fd3656b8") }, $clusterTime: { clusterTime: Timestamp(1547221085, 2), signature:{ hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 }}, $db: "segmentation" } planSummary: COUNT_SCAN { COUNTRY: 1 } keysExamined:6923922 docsExamined:0 numYields:54093 reslen:340 locks:{ Global: { acquireCount:{ r: 54094 }}, Database: { acquireCount:{ r: 54094 }}, Collection: { acquireCount:{ r: 54094 }} } protocol:op_msg 2747ms}}
Running the exact same query on MONGOS, the count is distributed to the shards, but I can see in the logs of each shard that the plan changed: it used IXSCAN instead and loaded all the documents, which was much slower.
{{2019-01-11T16:04:13.104+0000 I COMMAND [conn111] command segmentation.contacts appName: "MongoDB Shell" command: count { count: "contacts", query:{ COUNTRY: "US" }, allowImplicitCollectionCreation: false, shardVersion: [ Timestamp(474, 1),ObjectId('5c37917aa0f162a86b270897') ], lsid: { id: UUID("9be8fb16-f158-4099-bf25-a3a0c97a33c2"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, $clusterTime: { clusterTime: Timestamp(1547222609, 1), signature:{ hash: BinData(0, 1F79735A80E06C35800DE9CDC4E92608B2759F1F), keyId: 6644928741854150685 }}, $client: { application:{ name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.0.5" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" }, mongos: { host: "ip-192-168-169-237:27019", client: "127.0.0.1:59430", version: "4.0.5" } }, $configServerState: { opTime:{ ts: Timestamp(1547222609, 1), t: 1 }}, $db: "segmentation" } planSummary: IXSCAN { COUNTRY: 1 } keysExamined:6923921 docsExamined:6923921 numYields:54093 reslen:320 locks:{ Global: { acquireCount:{ r: 54094 }}, Database: { acquireCount:{ r: 54094 }}, Collection: { acquireCount:{ r: 54094 }} } protocol:op_msg 34784ms}}
Can someone explain to me why the plan changed, or why the documents have to be scanned? The performance impact is huge...
The execution plan from MONGOS is attached. This issue was also referenced on StackExchange.
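For reference, the setup described in the report above can be reproduced roughly as follows (mongo shell, connected to mongos). The shard key is not stated in the report, so the hashed _id key below is only a hypothetical placeholder, and the data load is omitted.
{code:javascript}
// Hypothetical shard key; the report does not say how the collection is sharded.
sh.enableSharding("segmentation");
sh.shardCollection("segmentation.contacts", { _id: "hashed" });

// Sparse index on COUNTRY, as described in the report.
var coll = db.getSiblingDB("segmentation").getCollection("contacts");
coll.createIndex({ COUNTRY: 1 }, { sparse: true });

// The count in question. Issued directly against a shard's mongod it is
// answered with a COUNT_SCAN (docsExamined:0 in the first log line above);
// issued through mongos, the shards fall back to IXSCAN and fetch documents
// so that orphans can be filtered out.
coll.count({ COUNTRY: "US" });
{code}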
- causes
  - SERVER-78071 A targeted query with a concurrent yield and chunk migration can miss results - Backlog
  - SERVER-94315 Shard filtering bug with collation - Closed
- is caused by
  - SERVER-3645 Sharded collection counts (on primary) can report too many results - Closed
- is related to
  - SERVER-50857 Improve count() performance in sharded clusters - Closed
- related to
  - SERVER-39241 Plan scoring incorrectly applies noFetchBonus bonus to all plans when no projection is provided - Closed
  - SERVER-23017 Fast approximate count with predicate - Backlog