When one or two of the config servers listed in the mongos --configdb option are unavailable, mongos fails to start.
Below are reproduction scripts for a single host, followed by the mongos log: the failure messages with two config servers down and with one down, then a successful start with all three up.
–
# Setup
mkdir /tmp/logs
mkdir /tmp/config
mkdir /tmp/config2
mkdir /tmp/config3
killall mongod mongos
# Start a configsvr
./mongod --port 28000 --logpath /tmp/logs/config.log --dbpath /tmp/config --directoryperdb --configsvr --quiet --logappend --fork
# Give the configsvr time to start listening.
sleep 2
# Start a mongos
./mongos --port 27000 --logpath /tmp/logs/router.log --configdb "localhost:28000,localhost:28001,localhost:28002" --quiet --logappend --fork
# Observe that mongos failed to start
sleep 2; pgrep mongos || echo "no mongos"
# Try again with 2 config servers.
killall mongod mongos
./mongod --port 28000 --logpath /tmp/logs/config.log --dbpath /tmp/config --directoryperdb --configsvr --quiet --logappend --fork
./mongod --port 28001 --logpath /tmp/logs/config2.log --dbpath /tmp/config2 --directoryperdb --configsvr --quiet --logappend --fork
# Give the configsvrs time to start listening.
sleep 2
./mongos --port 27000 --logpath /tmp/logs/router.log --configdb "localhost:28000,localhost:28001,localhost:28002" --quiet --logappend --fork
sleep 2; pgrep mongos || echo "no mongos"
# Try again with 3 config servers
killall mongod mongos
./mongod --port 28000 --logpath /tmp/logs/config.log --dbpath /tmp/config --directoryperdb --configsvr --quiet --logappend --fork
./mongod --port 28001 --logpath /tmp/logs/config2.log --dbpath /tmp/config2 --directoryperdb --configsvr --quiet --logappend --fork
./mongod --port 28002 --logpath /tmp/logs/config3.log --dbpath /tmp/config3 --directoryperdb --configsvr --quiet --logappend --fork
# Give the configsvrs time to start listening.
sleep 2
./mongos --port 27000 --logpath /tmp/logs/router.log --configdb "localhost:28000,localhost:28001,localhost:28002" --quiet --logappend --fork
sleep 2; pgrep mongos || echo "no mongos"
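The three runs above differ only in how many config servers are started, so they can be sketched as one loop. This is a hedged sketch, not part of the original report: the `DRY_RUN` switch and the helper names `run`, `suffix` and `repro` are hypothetical, while the ports, paths and flags are copied from the script above. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: repeat the reproduction for 1, 2 and 3 running config servers.
# DRY_RUN, run, suffix and repro are illustrative names, not from the report.
DRY_RUN="${DRY_RUN:-1}"
CONFIGDB="localhost:28000,localhost:28001,localhost:28002"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

suffix() {
    # Config server 1 uses "config", servers 2 and 3 use "config2", "config3".
    if [ "$1" -eq 1 ]; then echo ""; else echo "$1"; fi
}

repro() {
    # $1 = number of config servers to start (1, 2 or 3)
    local n="$1" i s
    run killall mongod mongos
    for i in $(seq 1 "$n"); do
        s="$(suffix "$i")"
        run ./mongod --port "$((27999 + i))" --logpath "/tmp/logs/config$s.log" \
            --dbpath "/tmp/config$s" --directoryperdb --configsvr --quiet --logappend --fork
    done
    # Give the config servers time to start listening.
    run sleep 2
    run ./mongos --port 27000 --logpath /tmp/logs/router.log \
        --configdb "$CONFIGDB" --quiet --logappend --fork
    run sleep 2
    if [ "$DRY_RUN" != "1" ]; then
        pgrep mongos || echo "no mongos with $n config server(s)"
    fi
}

for n in 1 2 3; do
    repro "$n"
done
```

Running it with `DRY_RUN=0` from the directory containing `mongod` and `mongos` would execute the same commands as the manual script; the directories from the setup step are assumed to exist already.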
–
***** SERVER RESTARTED *****
Fri Apr 22 18:19:24 ./mongos db version v1.9.0-pre-, pdfile version 4.5 starting (--help for usage)
Fri Apr 22 18:19:24 git version: df30db031cd1531705f23fa6ed75a88e2116568f
Fri Apr 22 18:19:24 build sys info: Darwin Richard-Kreuters-MacBook-Pro.localdomain 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_41
Fri Apr 22 18:19:24 warning: couldn't check on config server:localhost:28001 ok for now : 11002 socket exception [6] server [localhost:28001] mongos connectionpool error: couldn't connect to server localhost:28001
Fri Apr 22 18:19:24 warning: couldn't check on config server:localhost:28002 ok for now : 11002 socket exception [6] server [localhost:28002] mongos connectionpool error: couldn't connect to server localhost:28002
Fri Apr 22 18:19:24 warning: only 1 config server reachable, continuing
Fri Apr 22 18:19:24 SyncClusterConnection connecting to [localhost:28000]
Fri Apr 22 18:19:24 SyncClusterConnection connecting to [localhost:28001]
Fri Apr 22 18:19:24 SyncClusterConnection connect fail to: localhost:28001 errmsg: couldn't connect to server localhost:28001
Fri Apr 22 18:19:24 SyncClusterConnection connecting to [localhost:28002]
Fri Apr 22 18:19:24 SyncClusterConnection connect fail to: localhost:28002 errmsg: couldn't connect to server localhost:28002
Fri Apr 22 18:19:24 trying reconnect to localhost:28001
Fri Apr 22 18:19:24 reconnect localhost:28001 failed couldn't connect to server localhost:28001
Fri Apr 22 18:19:24 trying reconnect to localhost:28002
Fri Apr 22 18:19:24 reconnect localhost:28002 failed couldn't connect to server localhost:28002
Fri Apr 22 18:19:24 ~ScopedDbConnection: _conn != null
uncaught exception in mongos main:
8003 SyncClusterConnection::insert prepare failed: 9001 socket exception [6] localhost:28001:{}9001 socket exception [6] localhost:28002:{}
***** SERVER RESTARTED *****
Fri Apr 22 18:19:28 ./mongos db version v1.9.0-pre-, pdfile version 4.5 starting (--help for usage)
Fri Apr 22 18:19:28 git version: df30db031cd1531705f23fa6ed75a88e2116568f
Fri Apr 22 18:19:28 build sys info: Darwin Richard-Kreuters-MacBook-Pro.localdomain 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_41
Fri Apr 22 18:19:28 warning: couldn't check on config server:localhost:28002 ok for now : 11002 socket exception [6] server [localhost:28002] mongos connectionpool error: couldn't connect to server localhost:28002
Fri Apr 22 18:19:28 SyncClusterConnection connecting to [localhost:28000]
Fri Apr 22 18:19:28 SyncClusterConnection connecting to [localhost:28001]
Fri Apr 22 18:19:28 SyncClusterConnection connecting to [localhost:28002]
Fri Apr 22 18:19:28 SyncClusterConnection connect fail to: localhost:28002 errmsg: couldn't connect to server localhost:28002
Fri Apr 22 18:19:28 trying reconnect to localhost:28002
Fri Apr 22 18:19:28 reconnect localhost:28002 failed couldn't connect to server localhost:28002
Fri Apr 22 18:19:28 ~ScopedDbConnection: _conn != null
uncaught exception in mongos main:
8003 SyncClusterConnection::insert prepare failed: 9001 socket exception [6] localhost:28002:{}
***** SERVER RESTARTED *****
Fri Apr 22 18:19:32 ./mongos db version v1.9.0-pre-, pdfile version 4.5 starting (--help for usage)
Fri Apr 22 18:19:32 git version: df30db031cd1531705f23fa6ed75a88e2116568f
Fri Apr 22 18:19:32 build sys info: Darwin Richard-Kreuters-MacBook-Pro.localdomain 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386 BOOST_LIB_VERSION=1_41
Fri Apr 22 18:19:32 SyncClusterConnection connecting to [localhost:28000]
Fri Apr 22 18:19:32 SyncClusterConnection connecting to [localhost:28001]
Fri Apr 22 18:19:32 SyncClusterConnection connecting to [localhost:28002]
Fri Apr 22 18:19:32 [Balancer] about to contact config servers and shards
Fri Apr 22 18:19:32 [mongosMain] waiting for connections on port 27000
Fri Apr 22 18:19:32 [websvr] web admin interface listening on port 28000
Fri Apr 22 18:19:32 [websvr] listen(): bind() failed errno:48 Address already in use for socket: 0.0.0.0:28000
Fri Apr 22 18:19:32 [websvr] addr already in use
Fri Apr 22 18:19:32 [Balancer] config servers and shards contacted successfully
Fri Apr 22 18:19:32 [Balancer] balancer id: Richard-Kreuters-MacBook-Pro.localdomain:27000 started at Apr 22 18:19:32
Fri Apr 22 18:19:32 [Balancer] created new distributed lock for balancer on localhost:28000,localhost:28001,localhost:28002 ( lock timeout : 900000, legacy timeout : 0, ping interval : 30000, process : 0, legacy : 0 )
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28000]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28001]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28002]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28000]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28001]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28002]
Fri Apr 22 18:19:32 [LockPinger] creating distributed lock ping thread for localhost:28000,localhost:28001,localhost:28002 and process Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807 (sleeping for 30000ms)
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28000]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28001]
Fri Apr 22 18:19:32 [Balancer] SyncClusterConnection connecting to [localhost:28002]
Fri Apr 22 18:19:32 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' acquired, now :
Fri Apr 22 18:19:42 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' unlocked.
Fri Apr 22 18:19:42 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' acquired, now :
Fri Apr 22 18:19:53 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' unlocked.
Fri Apr 22 18:19:53 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' acquired, now :
Fri Apr 22 18:20:03 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' unlocked.
Fri Apr 22 18:20:03 [Balancer] distributed lock 'balancer/Richard-Kreuters-MacBook-Pro.localdomain:27000:1303510772:16807' acquired, now :
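Because all three attempts append to the same router log, it can help to summarize the outcome of each start. The following is a minimal sketch under assumptions: the function name `summarize_router_log` is hypothetical, and it relies only on three strings that appear in the log above ("SERVER RESTARTED", "uncaught exception in mongos main", "waiting for connections on port").

```shell
#!/usr/bin/env bash
# Sketch: report, per mongos start recorded in a log, whether it failed or came up.
# summarize_router_log is an illustrative helper, not part of mongos.
summarize_router_log() {
    # $1 = path to the mongos log (the report uses /tmp/logs/router.log)
    awk '
        function report() { if (n > 0) printf "start %d: %s\n", n, status }
        # A log that does not begin with a restart banner still counts as start 1.
        n == 0 && !/SERVER RESTARTED/      { n = 1; status = "unknown" }
        /SERVER RESTARTED/                 { report(); n++; status = "unknown" }
        /uncaught exception in mongos main/ { status = "failed" }
        /waiting for connections on port/  { status = "listening" }
        END { report() }
    ' "$1"
}

# Only summarize the log from the reproduction if it exists on this host.
if [ -f /tmp/logs/router.log ]; then
    summarize_router_log /tmp/logs/router.log
fi
```

Each start is reported as `failed`, `listening`, or `unknown` when neither marker string was seen for that start.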
- is related to: SERVER-5240 mongos fails to connect to active cluster with config hosts down (Closed)
- related to: SERVER-1936 can't start mongos if only a subset of config servers are up (Closed)