-
Type: Bug
-
Resolution: Duplicate
-
Priority: Major - P3
-
None
-
Affects Version/s: 2.1.1
-
Component/s: Aggregation Framework
-
Environment:Linux vm001 2.6.32-41-server #88-Ubuntu SMP Thu Mar 29 14:32:47 UTC 2012 x86_64 GNU/Linux
Ubuntu 10.04.3 LTS
System information as of Wed May 9 17:27:51 CEST 2012
System load: 0.08 Processes: 83
Usage of /: 50.4% of 37.98GB Users logged in: 1
Memory usage: 48% IP address for eth2: 192.168.11.206
Swap usage: 0%
Used software:
Java
Spring 3.1
Spring data
Mongo driver 2.7.3
Linux vm001 2.6.32-41-server #88-Ubuntu SMP Thu Mar 29 14:32:47 UTC 2012 x86_64 GNU/Linux Ubuntu 10.04.3 LTS System information as of Wed May 9 17:27:51 CEST 2012 System load: 0.08 Processes: 83 Usage of /: 50.4% of 37.98GB Users logged in: 1 Memory usage: 48% IP address for eth2: 192.168.11.206 Swap usage: 0% Used software: Java Spring 3.1 Spring data Mongo driver 2.7.3
-
ALL
I have a collection with 3 million of documents. I'm trying to do a query similar to the following SQL statement: select count(id), hdr_event_number from audits group by hdr_event_number;
In order to do this query i'm using mongo 2.1.1 nigtly version (mongodb-linux-x86_64-2012-05-01) and i'm using its aggregation framework.
I'm trying to execute the following command: { "aggregate" : "audits" , "pipeline" : [ { "$group" : { "_id" : "$hdr_event_number" , "numbEvent" :
}}]}
This is the command result (from the java log file):
17:21:23,906 INFO [MongoOperationsTest] Task [trying to group 3068871 audit messages] finished in 887641 milliseconds; executed command: { "aggregate" : "audits" , "pipeline" : [ { "$group" : { "_id" : "$hdr_event_number" , "numbEvent" :
{ "$sum" : 1}}}]}
17:21:23,906 INFO [MongoOperationsTest] CommandResult: { "serverUsed" : "192.168.11.206:27017" , "result" : [
,
{ "_id" : "XDAS_AE_DISABLE_ACCOUNT" , "numbEvent" : 15792},
{ "_id" : "XDAS_AE_SEND_DATA_VIA_ASSOC" , "numbEvent" : 19526},
{ "_id" : "XDAS_AE_TERMINATE_SESSION" , "numbEvent" : 36454},
{ "_id" : "XDAS_AE_RESOURCE_EXHAUST" , "numbEvent" : 43281},
{ "_id" : "XDAS_AE_QUERY_ACCOUNT" , "numbEvent" : 93545},
{ "_id" : "XDAS_AE_ENABLE_ACCOUNT" , "numbEvent" : 146137},
{ "_id" : "XDAS_AE_MODIFY_ACCOUNT" , "numbEvent" : 21635},
{ "_id" : "XDAS_AE_QUERY_DATA_ITEM_CONTENTS" , "numbEvent" : 28526},
{ "_id" : "XDAS_AE_DELETE_DATA_ITEM" , "numbEvent" : 61315},
{ "_id" : "XDAS_AE_DELETE_ACCOUNT" , "numbEvent" : 511476},
{ "_id" : "XDAS_AE_NOT_SPECIFIED" , "numbEvent" : 543894},
{ "_id" : "XDAS_AE_MODIFY_DATA_ITEM_ASSOC_CONTEXT" , "numbEvent" : 10},
{ "_id" : "XDAS_AE_CREATE_ACCOUNT" , "numbEvent" : 1534432}] , "ok" : 1.0}
As you can see the time in milliseconds is 887641 for counting and grouping 3Million of file.
On my mongo server i launched the top command wile the command was in xecution; this is the result:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
961 root 20 0 24.1g 462m 447m S 82.5 15.3 2:05.29 mongod
1 root 20 0 23676 1896 1292 S 0.0 0.1 0:09.31 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 0:00.01 ksoftirqd/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
6 root 20 0 0 0 0 S 0.0 0.0 0:00.68 events/0
7 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuset
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 sync_supers
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 bdi-default
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/0
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kblockd/0
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpid
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_notify
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_hotplug
19 root 20 0 0 0 0 S 0.0 0.0 0:00.01 ata/0
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata_aux
21 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksuspend_usbd
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khubd
23 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kseriod
24 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kmmcd
26 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
28 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0
29 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
30 root 20 0 0 0 0 S 0.0 0.0 0:00.00 aio/0
31 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ecryptfs-kthrea
32 root 20 0 0 0 0 S 0.0 0.0 0:00.00 crypto/0
35 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pciehpd
36 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0
37 root 20 0 0 0 0 S 0.0 0.0 0:00.03 scsi_eh_1
39 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kstriped
40 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kmpathd/0
41 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kmpath_handlerd
Here i list the mongostat:
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 11.9g 24.1g 1.71g 0 0 0 0|0 1|0 62b 1k 3 17:16:35
0 0 0 0 0 1 0 11.9g 24.1g 1.71g 0 0 0 0|0 1|0 62b 1k 3 17:16:36
0 0 0 0 0 3 0 11.9g 24.1g 1.71g 0 0 0 0|1 1|0 178b 2k 3 17:16:38
0 0 0 0 0 1 0 11.9g 24.1g 1.71g 0 0 0 0|0 1|0 62b 1k 3 17:16:39
0 0 0 0 0 1 0 11.9g 24.1g 1.71g 0 0 0 0|0 1|0 62b 1k 3 17:16:40
0 0 0 0 0 1 0 11.9g 24.1g 1.72g 0 0 0 0|0 1|0 62b 1k 3 17:16:41
0 0 0 0 0 1 0 11.9g 24.1g 1.72g 0 0 0 0|0 1|0 62b 1k 3 17:16:42
0 0 0 0 0 3 0 11.9g 24.1g 1.72g 0 0 0 0|0 1|0 178b 2k 3 17:16:44
0 0 0 0 0 1 0 11.9g 24.1g 1.72g 0 0 0 0|0 1|0 62b 1k 3 17:16:45
0 0 0 0 0 1 0 11.9g 24.1g 1.72g 0 0 0 0|1 1|0 62b 1k 3 17:16:46
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 3 0 11.9g 24.1g 1.73g 0 0 0 0|0 1|0 178b 2k 3 17:16:48
0 0 0 0 0 1 0 11.9g 24.1g 1.73g 0 0 0 0|0 1|0 62b 1k 3 17:16:49
0 0 0 0 0 1 0 11.9g 24.1g 1.73g 0 0 0 0|0 1|0 62b 1k 3 17:16:50
0 0 0 0 0 1 0 11.9g 24.1g 1.73g 0 0 0 0|0 1|0 62b 1k 3 17:16:51
0 0 0 0 0 3 0 11.9g 24.1g 1.74g 0 0 0 0|0 1|0 178b 2k 3 17:16:52
0 0 0 0 0 1 0 11.9g 24.1g 1.74g 0 0 0 0|0 1|0 62b 1k 3 17:16:54
0 0 0 0 0 1 0 11.9g 24.1g 1.74g 0 0 0 0|0 1|0 62b 1k 3 17:16:55
0 0 0 0 0 1 0 11.9g 24.1g 1.74g 0 0 0 0|0 1|0 62b 1k 3 17:16:56
0 0 0 0 0 1 0 11.9g 24.1g 1.75g 0 0 0 0|0 1|0 62b 1k 3 17:16:57
0 0 0 0 0 3 0 11.9g 24.1g 1.75g 0 0 0 0|0 1|0 178b 2k 3 17:16:58
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 11.9g 24.1g 1.75g 0 0 0 0|0 1|0 62b 1k 3 17:16:59
0 0 0 0 0 1 0 11.9g 24.1g 1.75g 0 0 0 0|0 1|0 62b 1k 3 17:17:00
0 0 0 0 0 1 0 11.9g 24.1g 1.75g 0 0 0 0|0 1|0 62b 1k 3 17:17:01
0 0 0 0 0 1 1 11.9g 24.1g 1.75g 0 0 0 0|1 1|0 62b 1k 3 17:17:02
0 0 0 0 0 3 0 11.9g 24.1g 1.76g 0 0 0 0|0 1|0 178b 2k 3 17:17:03
0 0 0 0 0 1 0 11.9g 24.1g 1.76g 0 0 0 0|0 1|0 62b 1k 3 17:17:04
0 0 0 0 0 1 0 11.9g 24.1g 1.76g 0 0 0 0|0 1|0 62b 1k 3 17:17:06
0 0 0 0 0 1 0 11.9g 24.1g 1.76g 0 0 0 0|0 1|0 62b 1k 3 17:17:07
0 0 0 0 0 3 0 11.9g 24.1g 1.77g 0 0 0 0|0 1|0 178b 2k 3 17:17:08
0 0 0 0 0 1 0 11.9g 24.1g 1.77g 0 0 0 0|0 1|0 62b 1k 3 17:17:09
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn time
0 0 0 0 0 1 0 11.9g 24.1g 1.77g 0 0 0 0|0 1|0 62b 1k 3 17:17:10
0 0 0 0 0 1 0 11.9g 24.1g 1.77g 0 0 0 0|0 1|0 62b 1k 3 17:17:11
0 0 0 0 0 3 0 11.9g 24.1g 1.77g 0 0 0 0|0 1|0 178b 2k 3 17:17:13
0 0 0 0 0 1 0 11.9g 24.1g 1.78g 0 0 0 0|0 1|0 62b 1k 3 17:17:14
- depends on
-
SERVER-4961 $group is taking 2x as long as collection.group()
- Closed
-
SERVER-4507 aggregation: optimize $group to take advantage of sorted sequences
- Backlog