This is a little coarse, but the idea is that we'll send information
about the channels a peer has upon the peer-up event that we send to
reactors that we can then use to reject peers (if neeeded) from reactors.
This solves the problem where statesync would hang in test networks
(and presumably real) where we would attempt to statesync from seed
nodes, thereby hanging silently forever.
This continues the push of plumbing contexts through tendermint. I
attempted to find all goroutines in the production code (non-test) and
made sure that these threads would exit when their contexts were
canceled, and I believe this PR does that.
The main (and minor) win of this PR is that the transport is fully the
responsibility of the router and the node doesn't need to be responsible for its lifecylce.
This pull request adds a new "mesage_type" label to the send/recv bytes metrics calculated in the p2p code.
Below is a snippet of the updated metrics that includes the updated label:
```
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 652
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="4b1068420ef739db63377250553562b9a978708a"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 631
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 393
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="4b1068420ef739db63377250553562b9a978708a"} 357
tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 386
```
This code hasn't been battle tested, and seems to have grown
increasingly flaky int tests. Given our general direction of reducing
queue complexity over the next couple of releases I think it makes
sense to remove it.
A few notes:
- this is not all the deletion that we can do, but this is the most
"simple" case: it leaves in shims, and there's some trivial
additional cleanup to the transport that can happen but that
requires writing more code, and I wanted this to be easy to review
above all else.
- This should land *after* we cut the branch for 0.35, but I'm
anticipating that to happen soon, and I wanted to run this through
CI.
I put this error log in here because I thought it might be a helpful indicator to see when a reactor sends a message to a peer that doesn't have that channel open but it turns out this is happening all the time and it's kind of annoying
This cleans up the `Router` code and adds a bunch of tests. These sorts of systems are a real pain to test, since they have a bunch of asynchronous goroutines living their own lives, so the test coverage is decent but not fantastic. Luckily we've been able to move all of the complex peer management and transport logic outside of the router, as synchronous components that are much easier to test, so the core router logic is fairly small and simple.
This also provides some initial test tooling in `p2p/p2ptest` that automatically sets up in-memory networks and channels for use in integration tests. It also includes channel-oriented test asserters in `p2p/p2ptest/require.go`, but these have primarily been written for router testing and should probably be adapted or extended for reactor testing.
This renames `PeerAddress` to `NodeAddress`, moves it and `NodeID` into a separate file `address.go`, adds tests for them, and fixes a bunch of bugs and inconsistencies.
This revises the new P2P `Transport` interface and does some preliminary code cleanups and simplifications.
The major change here is to add `Connection.Handshake()` for performing node handshakes (once the stream transport API is implemented, this can be done entirely independent of the transport). This moves most of the handshaking logic into the `Router`, such as prevention of head-of-line blocking, validation of peer's `NodeInfo`, controlling timeouts, and so on. This significantly simplifies transports, completely removes the need for internal goroutines, and shares common logic across all transports. This also allows varying the handshake `NodeInfo` across peers, e.g. to vary `ListenAddr`. Similarly, connection filtering is also moved into the switch/router so that it can be shared between transports.