  Go Driver / GODRIVER-2109

Race condition in staleness checks if connection has not finished connecting

    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical - P2
    • Fix Version/s: 1.7.2, 1.6.2
    • Affects Version/s: 1.7.0
    • Component/s: None
    • None
    • Needed

      After upgrading to 1.7.0, we're seeing a data race in tests that set MinPoolSize. Relevant pieces of the race detector output:

      WARNING: DATA RACE
       [2021/07/28 18:53:25.035]        | Write at 0x00c000481300 by goroutine 66:
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*connection).connect()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/topology/connection.go:231 +0x106b
       [2021/07/28 18:53:25.035]        |
       [2021/07/28 18:53:25.035]        | Previous read at 0x00c000481300 by goroutine 115:
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).stale()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/topology/pool.go:199 +0x235
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.connectionExpiredFunc()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/topology/pool.go:97 +0x1e5
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*resourcePool).Get()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/topology/resource_pool.go:125 +0x20c
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*pool).get()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/topology/pool.go:388 +0x3f9
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).Connection()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/topology/server.go:266 +0xf4
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*SelectedServer).Connection()
       [2021/07/28 18:53:25.035]        |       <autogenerated>:1 +0x78
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver.Operation.getServerAndConnection()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/operation.go:246 +0x113
       [2021/07/28 18:53:25.035]        |   go.mongodb.org/mongo-driver/x/mongo/driver.Operation.Execute()
       [2021/07/28 18:53:25.035]        |       /data/mci/d42eeec3f9680fb6eb9086010f77b346/src/github.com/10gen/pkg/mod/go.mongodb.org/mongo-driver@v1.7.0/x/mongo/driver/operation.go:301 +0x117
      

      My understanding is that a connection resource will be added to the resource pool in the foreground while the actual establishment is done in the background. When resourcePool#Get is called, it may iterate over this resource and check if it's expired even though it hasn't finished connecting. This can probably be mitigated by a mutex/atomic, but it brings up a larger question about whether we should even be checking connections for expiration if we know they're not done connecting yet.
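
      To make the mutex/atomic idea concrete, here is a minimal sketch, not the driver's code. It assumes the field written by connect() is a generation-like value that the staleness check reads. The conn type, the state constants, and the stale helper are hypothetical names used only for illustration. The background goroutine publishes the field with an atomic state store, and the staleness check skips any connection that has not finished connecting.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Connection states (hypothetical names for this sketch).
const (
	stateConnecting int32 = iota
	stateConnected
)

// conn is a stand-in for a pooled connection; it is not the driver's type.
type conn struct {
	state      int32  // accessed only through sync/atomic
	generation uint64 // written by connect, published by the state store
}

// connect simulates background establishment: it sets the generation and
// then publishes it by atomically switching the state to connected.
func (c *conn) connect(poolGeneration uint64) {
	c.generation = poolGeneration
	atomic.StoreInt32(&c.state, stateConnected)
}

// stale reports whether c belongs to an older pool generation. A connection
// that is still connecting is never reported stale, so the plain generation
// field is read only after the atomic load has observed stateConnected.
func stale(c *conn, poolGeneration uint64) bool {
	if atomic.LoadInt32(&c.state) != stateConnected {
		return false
	}
	return c.generation < poolGeneration
}

func main() {
	c := &conn{} // zero value starts in stateConnecting

	var wg sync.WaitGroup
	wg.Add(2)
	// Establish the connection in the background while another goroutine
	// concurrently runs the staleness check, as in the reported trace.
	go func() { defer wg.Done(); c.connect(1) }()
	go func() { defer wg.Done(); _ = stale(c, 1) }()
	wg.Wait()

	fmt.Println("stale after connect:", stale(c, 1))
}
```

      Running this with go run -race should report no data race, because the atomic store and load order the write to generation before any read of it. It also reflects the larger question above: a connection that is still connecting is simply never treated as expired.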

            Assignee: Matt Dale (matt.dale@mongodb.com)
            Reporter: Divjot Arora (divjot.arora@mongodb.com, Inactive)
            Votes: 2
            Watchers: 6
