The main (and minor) win of this PR is that the transport is fully the
responsibility of the router and the node doesn't need to be responsible for its lifecylce.
A few notes:
- this is not all the deletion that we can do, but this is the most
"simple" case: it leaves in shims, and there's some trivial
additional cleanup to the transport that can happen but that
requires writing more code, and I wanted this to be easy to review
above all else.
- This should land *after* we cut the branch for 0.35, but I'm
anticipating that to happen soon, and I wanted to run this through
CI.
This change aims to keep versions of mockery consistent across developer laptops.
This change adds mockery to the `tools.go` file so that its version can be managed consistently in the `go.mod` file.
Additionally, this change temporarily disables adding mockery's version number to generated files. There is an outstanding issue against the mockery project related to the version string behavior when running from `go get`. I have created a pull request to fix this issue in the mockery project.
see: https://github.com/vektra/mockery/issues/397
This cleans up the `Router` code and adds a bunch of tests. These sorts of systems are a real pain to test, since they have a bunch of asynchronous goroutines living their own lives, so the test coverage is decent but not fantastic. Luckily we've been able to move all of the complex peer management and transport logic outside of the router, as synchronous components that are much easier to test, so the core router logic is fairly small and simple.
This also provides some initial test tooling in `p2p/p2ptest` that automatically sets up in-memory networks and channels for use in integration tests. It also includes channel-oriented test asserters in `p2p/p2ptest/require.go`, but these have primarily been written for router testing and should probably be adapted or extended for reactor testing.
This renames `PeerAddress` to `NodeAddress`, moves it and `NodeID` into a separate file `address.go`, adds tests for them, and fixes a bunch of bugs and inconsistencies.
This revises the new P2P `Transport` interface and does some preliminary code cleanups and simplifications.
The major change here is to add `Connection.Handshake()` for performing node handshakes (once the stream transport API is implemented, this can be done entirely independent of the transport). This moves most of the handshaking logic into the `Router`, such as prevention of head-of-line blocking, validation of peer's `NodeInfo`, controlling timeouts, and so on. This significantly simplifies transports, completely removes the need for internal goroutines, and shares common logic across all transports. This also allows varying the handshake `NodeInfo` across peers, e.g. to vary `ListenAddr`. Similarly, connection filtering is also moved into the switch/router so that it can be shared between transports.
This improves the `peerStore` prototype by e.g.:
* Using a database with Protobuf for persistence, but also keeping full peer set in memory for performance.
* Simplifying the API, by taking/returning struct copies for safety, and removing errors for in-memory operations.
* Caching the ranked peer set, as a temporary solution until a better data structure is implemented.
* Adding `PeerManagerOptions.MaxPeers` and pruning the peer store (based on rank) when it's full.
* Rewriting `PeerAddress` to be independent of `url.URL`, normalizing it and tightening semantics.
Early but functional prototype of the new `p2p.Router`, see its GoDoc comment for details on how it works. Expect much of this logic to change and improve as we evolve the new P2P stack.
There is a simple test that sets up an in-memory network of four routers with reactors and passes messages between them, but otherwise no exhaustive tests since this is very much a work-in-progress.
The `NodeInfo` interface does not appear to serve any purpose at all, so I removed it and renamed the `DefaultNodeInfo` struct to `NodeInfo` (including the Protobuf representations). Let me know if this is actually needed for anything.
Only the Protobuf rename is listed in the changelog, since we do not officially support API stability of the `p2p` package (according to `README.md`). The on-wire protocol remains compatible.
This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog.
The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack.
The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely.
There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
Closes#4603
Commands used (VIM):
```
:args `rg -l errors.Wrap`
:argdo normal @q | update
```
where q is a macros rewriting the `errors.Wrap` to `fmt.Errorf`.
## Description
Previously only outbound peers can be persistent.
Now, even if the peer is inbound, if it's marked as persistent, when/if conn is lost,
Tendermint will try to reconnect. This part is actually optional and can be reverted.
Plus, seed won't disconnect from inbound peer if it's marked as
persistent. Fixes#3362
## Commits
* make persistent prop independent of conn direction
Previously only outbound peers can be persistent. Now, even if the peer
is inbound, if it's marked as persistent, when/if conn is lost,
Tendermint will try to reconnect.
Plus, seed won't disconnect from inbound peer if it's marked as
persistent. Fixes#3362
* fix TestPEXReactorDialPeer test
* add a changelog entry
* update changelog
* add two tests
* reformat code
* test UnsafeDialPeers and UnsafeDialSeeds
* add TestSwitchDialPeersAsync
* spec: update p2p/config spec
* fixes after Ismail's review
* Apply suggestions from code review
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* fix merge conflict
* remove sleep from TestPEXReactorDoesNotDisconnectFromPersistentPeerInSeedMode
We don't need it actually.
* OriginalAddr -> SocketAddr
OriginalAddr records the originally dialed address for outbound peers,
rather than the peer's self reported address. For inbound peers, it was
nil. Here, we rename it to SocketAddr and for inbound peers, set it to
the RemoteAddr of the connection.
* use SocketAddr
Numerous places in the code call peer.NodeInfo().NetAddress().
However, this call to NetAddress() may perform a DNS lookup if the
reported NodeInfo.ListenAddr includes a name. Failure of this lookup
returns a nil address, which can lead to panics in the code.
Instead, call peer.SocketAddr() to return the static address of the
connection.
* remove nodeInfo.NetAddress()
Expose `transport.NetAddress()`, a static result determined
when the transport is created. Removing NetAddress() from the nodeInfo
prevents accidental DNS lookups.
* fixes from review
* linter
* fixes from review
ref: [#3010 (comment)](https://github.com/tendermint/tendermint/issues/3010#issuecomment-464287627)
> I tried searching for code where we authenticate a peer against its NetAddress.ID and couldn't find it. I don't see a reason to switch to Noise, but a need to ensure that the node's ID is authenticated e.g. after dialing from the address book.
* p2p: check secret conn id matches dialed id
* Fix all p2p tests & make code compile
* add simple test for dialing with wrong ID
* update changelog
* address review comments
* yet another place where to use IDAddressString and fix
testSetupMultiplexTransport
* close peer's connection to avoid fd leak
Fixes#2967
* rename peer#Addr to RemoteAddr
* fix test
* fixes after Ethan's review
* bring back the check
* changelog entry
* write a test for switch#acceptRoutine
* increase timeouts? :(
* remove extra assertNPeersWithTimeout
* simplify test
* assert number of peers (just to be safe)
* Cleanup in OnStop
* run tests with verbose flag on CircleCI
* spawn a reading routine to prevent connection from closing
* get port from the listener
random port is faster, but often results in
```
panic: listen tcp 127.0.0.1:44068: bind: address already in use [recovered]
panic: listen tcp 127.0.0.1:44068: bind: address already in use
goroutine 79 [running]:
testing.tRunner.func1(0xc0001bd600)
/usr/local/go/src/testing/testing.go:792 +0x387
panic(0x974d20, 0xc0001b0500)
/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/tendermint/tendermint/p2p.MakeSwitch(0xc0000f42a0, 0x0, 0x9fb9cc, 0x9, 0x9fc346, 0xb, 0xb42128, 0x0, 0x0, 0x0, ...)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/test_util.go:182 +0xa28
github.com/tendermint/tendermint/p2p.MakeConnectedSwitches(0xc0000f42a0, 0x2, 0xb42128, 0xb41eb8, 0x4f1205, 0xc0001bed80, 0x4f16ed)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/test_util.go:75 +0xf9
github.com/tendermint/tendermint/p2p.MakeSwitchPair(0xbb8d20, 0xc0001bd600, 0xb42128, 0x2f7, 0x4f16c0)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/switch_test.go:94 +0x4c
github.com/tendermint/tendermint/p2p.TestSwitches(0xc0001bd600)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/switch_test.go:117 +0x58
testing.tRunner(0xc0001bd600, 0xb42038)
/usr/local/go/src/testing/testing.go:827 +0xbf
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:878 +0x353
exit status 2
FAIL github.com/tendermint/tendermint/p2p 0.350s
```
* p2p: NodeInfo is an interface
* (squash) fixes from review
* (squash) more fixes from review
* p2p: remove peerConn.HandshakeTimeout
* p2p: NodeInfo is two interfaces. Remove String()
* fixes from review
* remove test code from peer.RemoteIP()
* p2p: remove peer.OriginalAddr(). See #2618
* use a mockPeer in peer_set_test.go
* p2p: fix testNodeInfo naming
* p2p: remove unused var
* remove testRandNodeInfo
* fix linter
* fix retry dialing self
* fix rpc
* stop node upon receiving SIGTERM or CTRL-Ceven during genesis sleep by setting up interrupt before starting a node
Closes#2434
* call Start, not OnStart when starting a component to avoid:
```
E[09-24|10:13:15.805] Not stopping PubSub -- have not been started yet module=pubsub impl=PubSub
```
being printed on exit
This is the implementation for the design described in ADR 12[0]. It's
the first step of a larger refactor of the p2p package as tracked in
interface bundling all concerns of low-level connection handling and
isolating the rest of peer lifecycle management from the specifics of
the low-level internet protocols. Even if the swappable implementation
will never be utilised, already the isolation of conn related code in
one place will help with the reasoning about execution path and
addressation of security sensitive issues surfaced through bounty
programs and audits.
We deliberately decided to not have Peer filtering and other management
in the Transport, its sole responsibility is the translation of
connections to Peers, handing those to the caller fully setup. It's the
responsibility of the caller to reject those and or keep track. Peer
filtering will take place in the Switch and can be inspected in a the
following commit.
This changeset additionally is an exercise in clean separation of logic
and other infrastructural concerns like logging and instrumentation. By
leveraging a clean and minimal interface. How this looks can be seen in
a follow-up change.
Design #2069[2]
Refs #2067[3]
Fixes #2047[4]
Fixes #2046[5]
changes:
* describe Transport interface
* implement new default Transport: MultiplexTransport
* test MultiplexTransport with new constraints
* implement ConnSet for concurrent management of net.Conn, synchronous
to PeerSet
* implement and expose duplicate IP filter
* implemnt TransportOption for optional parametirisation
[0] https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-012-peer-transport.md
[1] https://github.com/tendermint/tendermint/issues/2067
[2] https://github.com/tendermint/tendermint/pull/2069
[3] https://github.com/tendermint/tendermint/issues/2067
[4] https://github.com/tendermint/tendermint/issues/2047
[5] https://github.com/tendermint/tendermint/issues/2046