-
Type: Task
-
Resolution: Gone away
-
Priority: Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Querying
-
Query Execution
-
Query Execution 2021-05-31
In December 2020 / January 2021, I wrote a diff that enables SBE mode by default and then ran all of the sys-perf benchmarks against it. Diff:
diff --git a/src/mongo/db/query/query_knobs.idl b/src/mongo/db/query/query_knobs.idl
index 52651caa6f..f6653beb03 100644
--- a/src/mongo/db/query/query_knobs.idl
+++ b/src/mongo/db/query/query_knobs.idl
@@ -410,7 +410,7 @@ server_parameters:
     set_at: [ startup, runtime ]
     cpp_varname: "internalQueryEnableSlotBasedExecutionEngine"
     cpp_vartype: AtomicWord<bool>
-    default: false
+    default: true
 
   internalQueryDefaultDOP:
     description: "Default degree of parallelism. This an internal experimental parameter and should not be changed on live systems."
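Since the diff shows the knob is settable at startup (set_at: [ startup, runtime ]), a local repro attempt could pass it on the mongod command line instead of rebuilding with a patched default. The following is a minimal sketch under that assumption; the helper name, paths, and replica set name are hypothetical, and it is not confirmed that setting the parameter this way behaves identically to changing the compiled-in default.

```python
def mongod_args(dbpath, logpath, repl_set, enable_sbe=True):
    """Build a mongod argv list with the SBE knob set at startup.

    dbpath, logpath and repl_set are placeholders chosen by the caller;
    nothing here is taken from the sys-perf configuration.
    """
    knob = "internalQueryEnableSlotBasedExecutionEngine"
    value = "true" if enable_sbe else "false"
    return [
        "mongod",
        "--dbpath", dbpath,
        "--logpath", logpath,
        "--replSet", repl_set,
        "--fork",  # the Evergreen log shows mongod being started as a forked child
        "--setParameter", f"{knob}={value}",
    ]


# Hypothetical local values, for illustration only.
args = mongod_args("/tmp/sbe-repro/db", "/tmp/sbe-repro/mongod.log", "rs0")
print(" ".join(args))
```

The list can be handed to subprocess.run(args) once the directories exist; it is only printed here so the sketch has no side effects.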
sys-perf Evergreen run: https://spruce.mongodb.com/version/5fb8666657e85a0819d36b59/tasks
Under SBE mode, for the "industry_benchmarks", "industry_benchmarks_wmajority", "linkbench", and "ycsb_60GB" tasks, the loading phase succeeds ("ycsb_load" or "linkbench_load"), but then the subsequent phase fails ("ycsb_100read", "ycsb_95read5update_w_majority", or "linkbench_request").
In the main Evergreen log for each of these tasks, I see the following error:
[2020/11/21 02:13:34.496] 02:13:34Z> about to fork child process, waiting until server is ready for connections.
[2020/11/21 02:13:35.721] 02:13:34Z> forked process: 35542
[2020/11/21 02:13:35.721] 02:13:35Z> ERROR: child process failed, exited with 4
When I looked at the "mongod.log" file for the phase that failed ("ycsb_100read", "ycsb_95read5update_w_majority", or "linkbench_request"), it appeared that mongod hit an error during startup and then gave up and shut down. Here is a snippet from the "mongod.log" file:
{"t":{"$date":"2020-11-21T02:51:19.012+00:00"},"s":"E", "c":"CONTROL", "id":20539, "ctx":"initandlisten","msg":"Failed to verify auth schema version","attr":{"minSchemaVersion":3,"error":{"code":13436,"codeName":"NotPrimaryOrSecondary","errmsg":"not master or secondary; cannot currently read from this replSet member"}}}
{"t":{"$date":"2020-11-21T02:51:19.012+00:00"},"s":"I", "c":"CONTROL", "id":20540, "ctx":"initandlisten","msg":"To manually repair the 'authSchema' document in the admin.system.version collection, start up with --setParameter startupAuthSchemaValidation=false to disable validation"}
{"t":{"$date":"2020-11-21T02:51:19.012+00:00"},"s":"I", "c":"REPL", "id":4784900, "ctx":"initandlisten","msg":"Stepping down the ReplicationCoordinator for shutdown","attr":{"waitTimeMillis":15000}}
..
{"t":{"$date":"2020-11-21T02:51:19.140+00:00"},"s":"I", "c":"CONTROL", "id":23138, "ctx":"initandlisten","msg":"Shutting down","attr":{"exitCode":4}}
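Because each entry in the snippet above is one JSON document per line, the startup failure can be pulled out of a "mongod.log" mechanically. This is a minimal sketch, assuming that format holds for the whole file; the function names are hypothetical, and the sample lines are copied from the snippet above rather than read from a real log.

```python
import json


def parse_log_lines(lines):
    """Yield one dict per structured log line, skipping anything that is not JSON."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue


def startup_failures(lines):
    """Return (error entries, exit code) found in the given log lines."""
    errors = []
    exit_code = None
    for entry in parse_log_lines(lines):
        # "s" is the severity field; "E" and "F" mark error and fatal entries.
        if entry.get("s") in ("E", "F"):
            errors.append(entry)
        # Matching on the message text is an assumption based on the snippet.
        if entry.get("msg") == "Shutting down":
            exit_code = entry.get("attr", {}).get("exitCode")
    return errors, exit_code


sample = [
    '{"t":{"$date":"2020-11-21T02:51:19.012+00:00"},"s":"E", "c":"CONTROL", "id":20539, "ctx":"initandlisten","msg":"Failed to verify auth schema version","attr":{"minSchemaVersion":3,"error":{"code":13436,"codeName":"NotPrimaryOrSecondary","errmsg":"not master or secondary; cannot currently read from this replSet member"}}}',
    "..",
    '{"t":{"$date":"2020-11-21T02:51:19.140+00:00"},"s":"I", "c":"CONTROL", "id":23138, "ctx":"initandlisten","msg":"Shutting down","attr":{"exitCode":4}}',
]

errors, exit_code = startup_failures(sample)
for e in errors:
    print(e["id"], e["msg"], e["attr"]["error"]["codeName"])
print("exit code:", exit_code)
```

To run it against a real file, pass an open file handle instead of the sample list, for example startup_failures(open("mongod.log")).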
The goal of this task is to investigate and understand precisely why mongod is hitting the "not master or secondary; cannot currently read from this replSet member" error during startup for these benchmarks, to develop a repro that can be done on a developer's local machine, and to open a new task (or update this task) with these findings.
Below are links to the Evergreen runs for each of the failing benchmarks:
industry_benchmarks: https://spruce.mongodb.com/task/sys_perf_linux_1_node_replSet_industry_benchmarks_patch_73d1a6f368b04161dce7c0afbcea23efb52e2070_5fb8666657e85a0819d36b59_20_11_21_01_00_30/
industry_benchmarks_wmajority: https://spruce.mongodb.com/task/sys_perf_linux_1_node_replSet_industry_benchmarks_wmajority_patch_73d1a6f368b04161dce7c0afbcea23efb52e2070_5fb8666657e85a0819d36b59_20_11_21_01_00_30/
- depends on
-
SERVER-54423 Re-run the sys-perf/bestbuy benchmarks
- Closed
- is duplicated by
-
SERVER-53074 [SBE] Add support for $map aggregation pipeline operator
- Closed
- is related to
-
SERVER-51655 Investigate sys-perf benchmark performance in SBE
- Closed