
p2p: panic on transport error (#2968)

* p2p: panic on transport error

Addresses #2823. Currently, the acceptRoutine exits if the transport returns
an error trying to accept a new connection. Once this happens, the node
can't accept any new connections. So here, we panic instead. While we
could potentially be more intelligent by rerunning the acceptRoutine, the
error may indicate something more fundamental (e.g. file descriptor limit)
that requires a restart anyway. We can leave it to process managers to
handle that restart, and notify operators about the panic.

* changelog
Ethan Buchman 6 years ago
committed by GitHub
parent
commit
1bb7e31d63
No known key found for this signature in database GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 8 additions and 0 deletions
  1. CHANGELOG.md (+2, -0)
  2. p2p/switch.go (+6, -0)

CHANGELOG.md (+2, -0)

@@ -56,6 +56,8 @@ key types that can be used by validators.
 - keep accums averaged near 0
 - [types] [\#2941](https://github.com/tendermint/tendermint/issues/2941) Preserve val.Accum during ValidatorSet.Update to avoid it being
 reset to 0 every time a validator is updated
+- [p2p] \#2968 Panic on transport error rather than continuing to run but not
+accept new connections
 ## v0.26.4


p2p/switch.go (+6, -0)

@@ -505,6 +505,12 @@ func (sw *Switch) acceptRoutine() {
 "err", err,
 "numPeers", sw.peers.Size(),
 )
+// We could instead have a retry loop around the acceptRoutine,
+// but that would need to stop and let the node shutdown eventually.
+// So might as well panic and let process managers restart the node.
+// There's no point in letting the node run without the acceptRoutine,
+// since it won't be able to accept new connections.
+panic(fmt.Errorf("accept routine exited: %v", err))
 }
 break

