Secure CockroachDB cluster on Kubernetes: CockroachDB containers in CrashLoopBackOff

I am trying to create a CockroachDB cluster on AWS EKS. I can do it in insecure mode using the YAML files provided in the documentation, but when I try to create it in secure mode, I get the error below.

Error: grpc: addrConn.createTransport failed to connect to {cockroachdb-0.cockroachdb:26257 0 }. Err :connection error: desc = "transport: Error while dialing cannot reuse client connection". Reconnecting...

I have not made any changes to the default YAML files.

Here are the container logs:

I180711 13:38:33.142903 1 cli/start.go:789 using local environment variables: COCKROACH_CHANNEL=kubernetes-secure
I180711 13:38:33.142941 1 cli/start.go:796 process identity: uid 0 euid 0 gid 0 egid 0
I180711 13:38:33.142978 1 cli/start.go:461 starting cockroach node
I180711 13:38:33.145061 10 storage/engine/rocksdb.go:552 opening rocksdb instance at "/cockroach/cockroach-data/cockroach-temp107022697"
I180711 13:38:33.168828 10 server/server.go:734 [n?] monitoring forward clock jumps based on server.clock.forward_jump_check_enabled
I180711 13:38:33.169476 10 storage/engine/rocksdb.go:552 opening rocksdb instance at "/cockroach/cockroach-data"
I180711 13:38:33.182480 10 server/config.go:538 [n?] 1 storage engine initialized
I180711 13:38:33.182526 10 server/config.go:541 [n?] RocksDB cache size: 1.8 GiB
I180711 13:38:33.182554 10 server/config.go:541 [n?] store 0: RocksDB, max size 0 B, max open file limit 60536
W180711 13:38:33.183018 10 gossip/gossip.go:1293 [n?] no incoming or outgoing connections
I180711 13:38:33.183098 10 server/server.go:1306 [n?] no stores bootstrapped and --join flag specified, awaiting init command.
W180711 13:38:33.191535 82 vendor/google.golang.org/grpc/server.go:563 grpc: Server.Serve failed to complete security handshake from "x.x.x.x:46044": remote error: tls: bad certificate
W180711 13:38:33.191698 76 vendor/google.golang.org/grpc/clientconn.go:830 Failed to dial cockroachdb-0.cockroachdb:26257: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for node, not cockroachdb-0.cockroachdb"; please retry.
W180711 13:38:33.191849 72 gossip/client.go:123 [n?] failed to start gossip client to cockroachdb-0.cockroachdb:26257: initial connection heartbeat failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
W180711 13:38:34.195112 62 vendor/google.golang.org/grpc/clientconn.go:830 Failed to dial cockroachdb-1.cockroachdb:26257: connection error: desc = "transport: authentication handshake failed: x509: certificate is valid for node, not cockroachdb-1.cockroachdb"; please retry.
W180711 13:38:34.195213 86 gossip/client.go:123 [n?] failed to start gossip client to cockroachdb-1.cockroachdb:26257: initial connection heartbeat failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
W180711 13:38:35.192517 91 vendor/google.golang.org/grpc/clientconn.go:1158 grpc: addrConn.createTransport failed to connect to {cockroachdb-2.cockroachdb:26257 0 }. Err :connection error: desc = "transport: Error while dialing dial tcp x.x.x.x:26257: connect: connection refused". Reconnecting...
W180711 13:38:35.192632 102 gossip/client.go:123 [n?] failed to start gossip client to cockroachdb-2.cockroachdb:26257: initial connection heartbeat failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure
W180711 13:38:36.189890 91 vendor/google.golang.org/grpc/clientconn.go:1158 grpc: addrConn.createTransport failed to connect to {cockroachdb-2.cockroachdb:26257 0 }. Err :connection error: desc = "transport: Error while dialing cannot reuse client connection". Reconnecting...
W180711 13:38:36.189994 91 vendor/google.golang.org/grpc/clientconn.go:830 Failed to dial cockroachdb-2.cockroachdb:26257: context canceled; please retry.
W180711 13:38:38.199255 127 vendor/google.golang.org/grpc/clientconn.go:1158 grpc: addrConn.createTransport failed to connect to {cockroachdb-2.cockroachdb:26257 0 }. Err :connection error: desc = "transport: Error while dialing dial tcp x.x.x.x:26257: connect: connection refused". Reconnecting...
W180711 13:38:38.728043 133 vendor/google.golang.org/grpc/server.go:563 grpc: Server.Serve failed to complete security handshake from "x.x.x.x:56090": remote error: tls: bad certificate
W180711 13:38:39.196718 127 vendor/google.golang.org/grpc/clientconn.go:1158 grpc: addrConn.createTransport failed to connect to {cockroachdb-2.cockroachdb:26257 0 }. Err :connection error: desc = "transport: Error while dialing cannot reuse client connection". Reconnecting...

Pods:

NAME                        READY   STATUS             RESTARTS   AGE
cluster-init-secure-99b8f   0/1     CrashLoopBackOff   221        18h
cockroachdb-0               0/1     CrashLoopBackOff   395        20h
cockroachdb-1               0/1     CrashLoopBackOff   389        20h
cockroachdb-2               0/1     CrashLoopBackOff   391        20h

1 answer

  • answered 2018-07-11 03:40 Alex Robinson

    1. This isn't enough information to know for sure what problem you're facing.
    2. It looks like you changed the names of some of the resources in the config file (or else you'd be seeing cockroachdb-0.cockroachdb:26257, not db-0.cockroachdb:26257). It seems likely you've messed something up by not changing everything necessary.

    I'd suggest re-examining any modifications you've made, following the documentation more closely, or using the Helm chart, which makes customization a little easier.
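
When re-examining the setup, the log in the question does give one concrete clue: the x509 lines report that the certificate is valid for `node` but not for the pod hostname being dialed. As a rough sketch only (the function name `find_cert_mismatches` and the regular expression are illustrative assumptions, not part of CockroachDB or its tooling), the following Python pulls those hostname mismatches out of a saved log so you can see which names the certificate would need to cover:

```python
import re

# Matches the x509 hostname-mismatch text seen in the question's log:
#   x509: certificate is valid for <names>, not <host>
# The pattern is an assumption based on those log lines only.
MISMATCH = re.compile(
    r'x509: certificate is valid for (?P<valid>.+?), not (?P<wanted>[^\s";]+)'
)


def find_cert_mismatches(log_text):
    """Return {dialed_host: set of names the certificate was reported valid for}."""
    mismatches = {}
    for match in MISMATCH.finditer(log_text):
        names = {name.strip() for name in match.group("valid").split(",")}
        mismatches.setdefault(match.group("wanted"), set()).update(names)
    return mismatches


if __name__ == "__main__":
    # Two lines copied from the log in the question.
    sample = (
        'W180711 13:38:33.191698 76 vendor/google.golang.org/grpc/clientconn.go:830 '
        'Failed to dial cockroachdb-0.cockroachdb:26257: connection error: desc = '
        '"transport: authentication handshake failed: x509: certificate is valid '
        'for node, not cockroachdb-0.cockroachdb"; please retry.\n'
        'W180711 13:38:34.195112 62 vendor/google.golang.org/grpc/clientconn.go:830 '
        'Failed to dial cockroachdb-1.cockroachdb:26257: connection error: desc = '
        '"transport: authentication handshake failed: x509: certificate is valid '
        'for node, not cockroachdb-1.cockroachdb"; please retry.\n'
    )
    for host, names in sorted(find_cert_mismatches(sample).items()):
        print(f"{host}: certificate only covers {sorted(names)}")
```

If every dialed hostname shows up here with only `node` as a covered name, that would point at the hostnames requested for the node certificates rather than at networking, though the output alone cannot confirm the root cause.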