This cleans up the `Router` code and adds a bunch of tests. These sorts of systems are a real pain to test, since they have a bunch of asynchronous goroutines living their own lives, so the test coverage is decent but not fantastic. Luckily we've been able to move all of the complex peer management and transport logic outside of the router, as synchronous components that are much easier to test, so the core router logic is fairly small and simple.
This also provides some initial test tooling in `p2p/p2ptest` that automatically sets up in-memory networks and channels for use in integration tests. It also includes channel-oriented test asserters in `p2p/p2ptest/require.go`, but these have primarily been written for router testing and should probably be adapted or extended for reactor testing.
This renames `PeerAddress` to `NodeAddress`, moves it and `NodeID` into a separate file `address.go`, adds tests for them, and fixes a bunch of bugs and inconsistencies.
This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog.
The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack.
The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely.
There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
closes: #5770closes: #5769
also, include node ID in the output (#5769) and modify NodeKey to use
value semantics (it makes perfect sense for NodeKey to not be a
pointer).
While working on tendermint my colleague @jinmannwong fixed a few of the unit tests that we found to be flaky in our CI. We thought that you might find this useful, see below for comments.
Attempt to fix this the below test failure by waiting for the listener to get ready. I am not at all convinced that this is the correct fix - the below indicates that the TCP socket was closed after it was set up - but I'm unable to come up with an actionable hypothesis for what caused it.
```
2020/05/14 17:25:11 Failed to accept conn: accept tcp 127.0.0.1:44737: use of closed network connection
2020/05/14 17:25:11 Failed to accept conn: accept tcp 127.0.0.1:42589: use of closed network connection
2020/05/14 17:25:11 Failed to accept conn: accept tcp 127.0.0.1:40905: use of closed network connection
2020/05/14 17:25:12 Failed to accept conn: accept tcp 127.0.0.1:39847: use of closed network connection
2020/05/14 17:25:12 Failed to accept conn: accept tcp 127.0.0.1:39989: use of closed network connection
2020/05/14 17:25:12 Failed to accept conn: accept tcp 127.0.0.1:43587: use of closed network connection
2020/05/14 17:25:12 Failed to accept conn: accept tcp 127.0.0.1:35415: use of closed network connection
2020/05/14 17:25:12 Failed to accept conn: accept tcp 127.0.0.1:38657: use of closed network connection
2020/05/14 17:25:12 Failed to accept conn: accept tcp 127.0.0.1:38217: use of closed network connection
2020/05/14 17:25:13 Failed to accept conn: accept tcp 127.0.0.1:42247: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:39705: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:39491: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:37107: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:39909: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:37987: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:41505: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:39121: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:46569: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:45643: use of closed network connection
2020/05/14 17:25:16 Failed to accept conn: accept tcp 127.0.0.1:35289: use of closed network connection
--- FAIL: TestTransportMultiplexAcceptMultiple (0.43s)
transport_test.go:200: auth failure: handshake failed: EOF
FAIL
```
* p2p: log error in transport tests
* p2p: exit from fast peer only when handshake is done
TestTransportMultiplexAcceptNonBlocking
fixes panic: write to a closed channel
* p2p: increase timeout in TestTransportMultiplexConnFilterTimeout
Fixes https://github.com/tendermint/tendermint/issues/4854#issuecomment-630739200
* p2p: yield control to another goroutine manually
* increase timeout in TestTransportMultiplexAcceptNonBlocking
Fixes#3521
The function NewNetAddressStringWithOptionalID is from a time when peer
IDs were optional. They're not anymore. So this should be renamed to
NewNetAddressString and should ensure the ID is provided.
* update changelog
* use NewNetAddress in transport tests
* use NewNetAddress in TestTransportMultiplexAcceptMultiple
* OriginalAddr -> SocketAddr
OriginalAddr records the originally dialed address for outbound peers,
rather than the peer's self reported address. For inbound peers, it was
nil. Here, we rename it to SocketAddr and for inbound peers, set it to
the RemoteAddr of the connection.
* use SocketAddr
Numerous places in the code call peer.NodeInfo().NetAddress().
However, this call to NetAddress() may perform a DNS lookup if the
reported NodeInfo.ListenAddr includes a name. Failure of this lookup
returns a nil address, which can lead to panics in the code.
Instead, call peer.SocketAddr() to return the static address of the
connection.
* remove nodeInfo.NetAddress()
Expose `transport.NetAddress()`, a static result determined
when the transport is created. Removing NetAddress() from the nodeInfo
prevents accidental DNS lookups.
* fixes from review
* linter
* fixes from review
ref: [#3010 (comment)](https://github.com/tendermint/tendermint/issues/3010#issuecomment-464287627)
> I tried searching for code where we authenticate a peer against its NetAddress.ID and couldn't find it. I don't see a reason to switch to Noise, but a need to ensure that the node's ID is authenticated e.g. after dialing from the address book.
* p2p: check secret conn id matches dialed id
* Fix all p2p tests & make code compile
* add simple test for dialing with wrong ID
* update changelog
* address review comments
* yet another place where to use IDAddressString and fix
testSetupMultiplexTransport
* p2p: NodeInfo is an interface
* (squash) fixes from review
* (squash) more fixes from review
* p2p: remove peerConn.HandshakeTimeout
* p2p: NodeInfo is two interfaces. Remove String()
* fixes from review
* remove test code from peer.RemoteIP()
* p2p: remove peer.OriginalAddr(). See #2618
* use a mockPeer in peer_set_test.go
* p2p: fix testNodeInfo naming
* p2p: remove unused var
* remove testRandNodeInfo
* fix linter
* fix retry dialing self
* fix rpc
This is the implementation for the design described in ADR 12[0]. It's
the first step of a larger refactor of the p2p package as tracked in
interface bundling all concerns of low-level connection handling and
isolating the rest of peer lifecycle management from the specifics of
the low-level internet protocols. Even if the swappable implementation
will never be utilised, already the isolation of conn related code in
one place will help with the reasoning about execution path and
addressation of security sensitive issues surfaced through bounty
programs and audits.
We deliberately decided to not have Peer filtering and other management
in the Transport, its sole responsibility is the translation of
connections to Peers, handing those to the caller fully setup. It's the
responsibility of the caller to reject those and or keep track. Peer
filtering will take place in the Switch and can be inspected in a the
following commit.
This changeset additionally is an exercise in clean separation of logic
and other infrastructural concerns like logging and instrumentation. By
leveraging a clean and minimal interface. How this looks can be seen in
a follow-up change.
Design #2069[2]
Refs #2067[3]
Fixes #2047[4]
Fixes #2046[5]
changes:
* describe Transport interface
* implement new default Transport: MultiplexTransport
* test MultiplexTransport with new constraints
* implement ConnSet for concurrent management of net.Conn, synchronous
to PeerSet
* implement and expose duplicate IP filter
* implemnt TransportOption for optional parametirisation
[0] https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-012-peer-transport.md
[1] https://github.com/tendermint/tendermint/issues/2067
[2] https://github.com/tendermint/tendermint/pull/2069
[3] https://github.com/tendermint/tendermint/issues/2067
[4] https://github.com/tendermint/tendermint/issues/2047
[5] https://github.com/tendermint/tendermint/issues/2046