- Type: Bug
- Resolution: Gone away
- Priority: Major - P3
- Affects Version/s: None
- Component/s: libmongoc
Since a mongoc_cursor_t ties itself to a server_id, it can currently attempt to send a command using a server description that has been marked as UNKNOWN.
I have been able to reproduce such a situation by modifying example-client.c. Here is the salient bit:
printf ("initializing agg, this ties cursor to the server, but does not send the command\n");
cursor = mongoc_collection_aggregate (collection, MONGOC_QUERY_NONE, &empty_doc, NULL /* opts */, NULL /* read_prefs */);
if (mongoc_cursor_error (cursor, &error)) {
   printf ("agg error: %s\n", error.message);
}
printf ("simulating a server being marked as UNKNOWN\n");
mongoc_topology_invalidate_server (client->topology, cursor->server_id, &error);
printf ("done\n");
printf ("sending aggregate command\n");
mongoc_cursor_next (cursor, &doc);
if (mongoc_cursor_error (cursor, &error)) {
   printf ("error: %s\n", error.message);
}
This prints:
initializing agg, this ties cursor to the server, but does not send the command
simulating a server being marked as UNKNOWN
sending aggregate command
error: "aggregate" command does not support readConcern with wire version 0, wire version 4 is required
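The failure can be modelled without a server. The following is a minimal sketch, not libmongoc code: every type and function name in it (model_server_description_t, model_cursor_t, model_invalidate_server, model_cursor_check) is hypothetical, and it assumes that invalidating a description resets its wire version to 0, which is what the error message suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not libmongoc's real structs. */
typedef struct {
   uint32_t id;
   int32_t max_wire_version; /* assumed to be 0 while the description is Unknown */
} model_server_description_t;

typedef struct {
   uint32_t server_id;            /* pinned when the cursor is created */
   int32_t required_wire_version; /* minimum the pending command needs */
} model_cursor_t;

/* Models invalidation: the description loses what the handshake learned. */
static void
model_invalidate_server (model_server_description_t *sd)
{
   assert (sd != NULL);
   sd->max_wire_version = 0;
}

/* Returns true if the pinned server's description allows the command.
 * On failure, writes a message into err and returns false. */
static bool
model_cursor_check (const model_cursor_t *cursor,
                    const model_server_description_t *sd,
                    char *err,
                    size_t errlen)
{
   assert (cursor != NULL && sd != NULL && err != NULL && errlen > 0);
   assert (cursor->server_id == sd->id);
   if (sd->max_wire_version < cursor->required_wire_version) {
      snprintf (err,
                errlen,
                "command requires wire version %d, but description has wire version %d",
                (int) cursor->required_wire_version,
                (int) sd->max_wire_version);
      return false;
   }
   err[0] = '\0';
   return true;
}
```

In this model, a cursor that requires wire version 4 passes the check against a description with wire version 8 (an arbitrary example value), then fails the same check after model_invalidate_server is called, even though nothing about the real server has changed.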
The test case forces a server to be marked as UNKNOWN. But I believe this can happen in two situations:
1. The driver receives a "not master" error from a server newer than 4.0. The server description is marked as unknown, but the connections are left open. For servers at 4.0 or older I don't believe this is an issue, since the connections are also reset, so the next attempt to send a command on the cursor will recreate the connection and do another handshake.
2. The background monitor receives a network error and marks the server as unknown. Because of CDRIVER-3529, this is more likely to happen on a transient network error.
This bug is extremely similar to CDRIVER-3404. I think it is worth investigating whether this could surface in other parts of the codebase, since we have other wire version checks. It may also be worth rethinking how we invalidate server descriptions and perform wire version checks.
is related to:
- NODE-3648 AbstractCursor getMores should run serverSelection to ensure monitoring updates are respected (Closed)
- CDRIVER-3404 Assertion hit when handshake runs against unknown server type (Closed)
- CDRIVER-3653 Connections should use server descriptions from handshake, not monitoring (Closed)
- CDRIVER-3529 Do not mark server as Unknown during topology scan until after retry has failed (Closed)