  1. Core Server
  2. SERVER-9788

mongos does not re-evaluate read preference once a valid replica set member is chosen

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.6.4, 2.7.3
    • Affects Version/s: 2.4.3
    • Component/s: Sharding
    • None
    • Environment:
      All
    • ALL
    • Steps to Reproduce:

      1) Create and start 3 member repset (primary, secondary, arbiter)
      2) Start mongos
      3) Send reads to mongos, verify they go to SEC
      4) Kill SEC
      5) Send reads to mongos, verify they go to PRI
      6) Restart SEC
      7) Send reads to mongos, verify they go to SEC (they don't).

      Issue Status as of Jul 22, 2014

      ISSUE SUMMARY
      When reading from a sharded cluster via mongos with a specific read preference, mongos never re-evaluates the preference as long as the member it is connected to remains valid. In certain circumstances this causes mongos to read, for prolonged periods, from nodes that do not match the user's intention and expectation.

      Example:

      When the "secondaryPreferred" read preference is set, mongos connects to an available secondary on a new connection for reads. If no secondary is available any longer, mongos correctly switches to the primary. However, when a secondary becomes available again, mongos does not switch back to reading from it. The connection is pinned to the primary because, under "secondaryPreferred", the primary is a valid target to read from, and no re-evaluation is carried out until the target becomes invalid or unreachable.
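      The pinning behavior described above can be sketched with a small toy model. This is not mongos code: the class, its method names, and the node dictionary are hypothetical and exist only to illustrate why a connection that has fallen back to the primary stays there.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class PinnedConnection:
    """Toy model of a pre-fix replica set connection (hypothetical, not mongos code)."""
    nodes: Dict[str, dict]          # name -> {"role": "primary" | "secondary", "up": bool}
    pinned: Optional[str] = None    # node chosen by the last selection

    def _is_valid(self, name: str) -> bool:
        # Under secondaryPreferred, any reachable data-bearing member is a valid target.
        return self.nodes[name]["up"]

    def _select(self) -> str:
        # Prefer a reachable secondary; fall back to the primary.
        for role in ("secondary", "primary"):
            for name, node in self.nodes.items():
                if node["role"] == role and node["up"]:
                    return name
        raise RuntimeError("no reachable member")

    def read(self) -> str:
        # Selection runs only when nothing is pinned or the pinned node is no longer valid.
        if self.pinned is None or not self._is_valid(self.pinned):
            self.pinned = self._select()
        return self.pinned


nodes = {
    "PRI": {"role": "primary", "up": True},
    "SEC": {"role": "secondary", "up": True},
}
conn = PinnedConnection(nodes)

targets = [conn.read()]        # healthy set: read goes to SEC
nodes["SEC"]["up"] = False
targets.append(conn.read())    # SEC killed: read falls back to PRI
nodes["SEC"]["up"] = True
targets.append(conn.read())    # SEC restarted: read stays on PRI
print(targets)                 # ['SEC', 'PRI', 'PRI']
```

      The third read is the bug: the primary is still a valid target, so the model never asks whether a better one has come back.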

      USER IMPACT
      Reads can go to the primary for prolonged periods even though the user specified a preference for secondary reads. Users may not be aware of this unless they closely monitor the state of their replica sets at all times. Depending on the application architecture, this can degrade read and write throughput.

      WORKAROUNDS
      The only workaround is to forcibly unpin the connection by specifying a different readPreference on said connection.

      AFFECTED VERSIONS
      All previous production releases are affected by this issue.

      FIX VERSION
      The fix is included in the 2.6.4 production release.

      RESOLUTION DETAILS

      1. Secondary connections are now drawn from the global pool.
      2. For mongos, the active ReplicaSet connection releases its secondary connection back to the pool at the end of the query/command. This has the side effect of 'unpinning' the read preference settings: when the connection is reused, node selection is evaluated again according to the read preference.
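      The two points above can be illustrated with a toy model in which every read selects a node, checks a connection out of a shared pool, and releases it when the read ends. All names here (GlobalPool, select_secondary_preferred, read) are hypothetical and do not correspond to mongos internals.

```python
from collections import defaultdict


class GlobalPool:
    """Toy shared connection pool keyed by host (hypothetical, not mongos code)."""

    def __init__(self):
        self.idle = defaultdict(list)   # host -> idle connections
        self.created = 0                # number of connections ever opened

    def checkout(self, host):
        if self.idle[host]:
            return self.idle[host].pop()
        self.created += 1
        return {"host": host, "id": self.created}

    def release(self, conn):
        self.idle[conn["host"]].append(conn)


def select_secondary_preferred(nodes):
    # Prefer a reachable secondary; fall back to the primary.
    for role in ("secondary", "primary"):
        for name, node in nodes.items():
            if node["role"] == role and node["up"]:
                return name
    raise RuntimeError("no reachable member")


def read(pool, nodes):
    # Node selection is evaluated on every read because nothing stays pinned.
    host = select_secondary_preferred(nodes)
    conn = pool.checkout(host)
    try:
        return conn["host"]             # stand-in for running the query
    finally:
        pool.release(conn)              # give the connection back after the query


nodes = {
    "PRI": {"role": "primary", "up": True},
    "SEC": {"role": "secondary", "up": True},
}
pool = GlobalPool()

targets = [read(pool, nodes)]           # SEC
nodes["SEC"]["up"] = False
targets.append(read(pool, nodes))       # PRI while SEC is down
nodes["SEC"]["up"] = True
targets.append(read(pool, nodes))       # back to SEC, reusing the pooled connection
print(targets, pool.created)
```

      In this sketch the third read returns to the secondary and no third connection is opened, since the idle connection to SEC is still in the pool.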

      As these changes could not be backported to 2.6, a different fix was implemented specifically for 2.6: a new mongos server parameter, internalDBClientRSReselectNodePercentage. It accepts any value from 0 to 100 (default 0) and represents the probability, expressed as a percentage, that a replica set connection in mongos re-evaluates replica set node selection from scratch, even if the last-used secondary node is still compatible with the current read preference. Take extra care when setting it: reselecting a replica set node destroys the old connection and creates a new one, so in extreme cases (for example, 100%) mongos can be creating and destroying connections for every read request.
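      To get a feel for how a given percentage trades connection churn against time spent pinned, the parameter can be modeled as an independent coin flip per read. That independence is an assumption made for illustration, and both function names below are hypothetical; the actual mongos logic may differ.

```python
import random


def should_reselect(percentage, rng):
    """Return True with probability percentage/100 (toy model of the parameter)."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # rng.random() is in [0.0, 1.0), so 0 never reselects and 100 always does.
    return rng.random() * 100 < percentage


def chance_still_pinned(percentage, reads):
    """Probability that no reselection has happened after `reads` reads."""
    return (1 - percentage / 100) ** reads


rng = random.Random(0)
never = any(should_reselect(0, rng) for _ in range(1000))
always = all(should_reselect(100, rng) for _ in range(1000))
print(never, always)

for pct in (1, 10, 50):
    print(pct, round(chance_still_pinned(pct, 20), 4))
```

      Under this model the expected number of reads before a reselection is 100 divided by the percentage, so a setting of 10 would unpin a connection after about ten reads on average while recreating a connection on roughly one read in ten.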

      Original description

      During lab tests with 1 primary, 1 secondary and 1 arbiter, I'm running into the following issue when using the Java driver's "secondaryPreferred" read preference:

      We start with a healthy 3 member repset and start our load tests. This load test connects to mongos. All reads go to the secondary member. We kill the secondary and reads are correctly routed to the primary. We restart the secondary but reads continue to go to the primary indefinitely.

      This might be a Java driver issue, since I was not able to reproduce it in the shell due to the lack of support for this read mode there (I think?).

            Assignee: Randolph Tan (randolph@mongodb.com)
            Reporter: Remon van Vliet (remonvv)
            Votes: 4
            Watchers: 19

              Created:
              Updated:
              Resolved: