This improves the prototype peer manager by:
* Exporting `PeerManager`, making it accessible by e.g. reactors.
* Replacing `Router.SubscribePeerUpdates()` with `PeerManager.Subscribe()`.
* Tracking address/peer connection statistics, and retrying dial failures with exponential backoff.
* Prioritizing peers, with persistent peers configuration.
* Limiting simultaneous connections.
* Evicting peers and upgrading to higher-priority peers.
* Tracking peer heights, as a workaround for legacy shared peer state APIs.
This is getting to a point where we need to determine precise semantics and implement tests, so we should figure out whether it's a reasonable abstraction that we want to use. The main questions are around the API model (i.e. synchronous method calls with the router polling the manager, vs. an event-driven model using channels, vs. the peer manager calling methods on the router to connect/disconnect peers), and who should have the responsibility of managing actual connections (currently the router, while the manager only tracks peer state).
This adds a prototype peer lifecycle manager, `peerManager`, which stores peer data in an internal `peerStore`. The overall idea here is to have methods for peer lifecycle events which exchange a very narrow subset of peer data, and to keep all of the peer metadata (i.e. the `peerInfo` struct) internal, to decouple this from the router and simplify concurrency control. See `peerManager` GoDoc for more information.
The router is still responsible for actually dialing and accepting peer connections, and routing messages across them, but the peer manager is responsible for determining which peers to dial next, preventing multiple connections being established for the same peer (e.g. both inbound and outbound), and making sure we don't dial the same peer several times in parallel. Later it will also track retries and exponential backoff, as well as peer and address quality. It also assumes responsibility for peer updates subscriptions.
It's a bit unclear to me whether we want the peer manager to take on the responsibility of actually dialing and accepting connections as well, or if it should only be tracking peer state for the router while the router is responsible for all transport concerns. Let's revisit this later.
Early but functional prototype of the new `p2p.Router`, see its GoDoc comment for details on how it works. Expect much of this logic to change and improve as we evolve the new P2P stack.
There is a simple test that sets up an in-memory network of four routers with reactors and passes messages between them, but otherwise no exhaustive tests since this is very much a work-in-progress.
This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog.
The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack.
The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely.
There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
## Description
When downloading mockery I ran into an issue where we were using the old version. This PR updates to a more recent version.
changelog?
Closes: #XXX
## Description
partially cleanup in preparation for errcheck
i ignored a bunch of defer errors in tests but with the update to go 1.14 we can use `t.Cleanup(func() { if err := <>; err != nil {..}}` to cover those errors, I will do this in pr number two of enabling errcheck.
ref #5059
Fixes#828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (https://github.com/cosmos/cosmos-sdk/pull/5803) and Gaia (https://github.com/cosmos/gaia/pull/327).
This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704.
* Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested.
* Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default.
* Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.
* libs/common: Refactor libs/common 5
- move mathematical functions and types out of `libs/common` to math pkg
- move net functions out of `libs/common` to net pkg
- move string functions out of `libs/common` to strings pkg
- move async functions out of `libs/common` to async pkg
- move bit functions out of `libs/common` to bits pkg
- move cmap functions out of `libs/common` to cmap pkg
- move os functions out of `libs/common` to os pkg
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* fix testing issues
* fix tests
closes#41417
woooooooooooooooooo kill the cmn pkg
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* add changelog entry
* fix goimport issues
* run gofmt
* libs/common: refactor libs common 3
- move nil.go into types folder and make private
- move service & baseservice out of common into service pkg
ref #4147
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* add changelog entry
* OriginalAddr -> SocketAddr
OriginalAddr records the originally dialed address for outbound peers,
rather than the peer's self reported address. For inbound peers, it was
nil. Here, we rename it to SocketAddr and for inbound peers, set it to
the RemoteAddr of the connection.
* use SocketAddr
Numerous places in the code call peer.NodeInfo().NetAddress().
However, this call to NetAddress() may perform a DNS lookup if the
reported NodeInfo.ListenAddr includes a name. Failure of this lookup
returns a nil address, which can lead to panics in the code.
Instead, call peer.SocketAddr() to return the static address of the
connection.
* remove nodeInfo.NetAddress()
Expose `transport.NetAddress()`, a static result determined
when the transport is created. Removing NetAddress() from the nodeInfo
prevents accidental DNS lookups.
* fixes from review
* linter
* fixes from review
* close peer's connection to avoid fd leak
Fixes#2967
* rename peer#Addr to RemoteAddr
* fix test
* fixes after Ethan's review
* bring back the check
* changelog entry
* write a test for switch#acceptRoutine
* increase timeouts? :(
* remove extra assertNPeersWithTimeout
* simplify test
* assert number of peers (just to be safe)
* Cleanup in OnStop
* run tests with verbose flag on CircleCI
* spawn a reading routine to prevent connection from closing
* get port from the listener
random port is faster, but often results in
```
panic: listen tcp 127.0.0.1:44068: bind: address already in use [recovered]
panic: listen tcp 127.0.0.1:44068: bind: address already in use
goroutine 79 [running]:
testing.tRunner.func1(0xc0001bd600)
/usr/local/go/src/testing/testing.go:792 +0x387
panic(0x974d20, 0xc0001b0500)
/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/tendermint/tendermint/p2p.MakeSwitch(0xc0000f42a0, 0x0, 0x9fb9cc, 0x9, 0x9fc346, 0xb, 0xb42128, 0x0, 0x0, 0x0, ...)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/test_util.go:182 +0xa28
github.com/tendermint/tendermint/p2p.MakeConnectedSwitches(0xc0000f42a0, 0x2, 0xb42128, 0xb41eb8, 0x4f1205, 0xc0001bed80, 0x4f16ed)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/test_util.go:75 +0xf9
github.com/tendermint/tendermint/p2p.MakeSwitchPair(0xbb8d20, 0xc0001bd600, 0xb42128, 0x2f7, 0x4f16c0)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/switch_test.go:94 +0x4c
github.com/tendermint/tendermint/p2p.TestSwitches(0xc0001bd600)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/switch_test.go:117 +0x58
testing.tRunner(0xc0001bd600, 0xb42038)
/usr/local/go/src/testing/testing.go:827 +0xbf
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:878 +0x353
exit status 2
FAIL github.com/tendermint/tendermint/p2p 0.350s
```
* p2p/conn: FlushStop. Use in pex. Closes#2092
In seed mode, we call StopPeer immediately after Send.
Since flushing msgs to the peer happens in the background,
the peer connection is often closed before the messages are
actually sent out. The new FlushStop method allows all msgs
to first be written and flushed out on the conn before it is closed.
* fix dummy peer
* typo
* fixes from review
* more comments
* ensure pex doesn't call FlushStop more than once
FlushStop is not safe to call more than once,
but we call it from Receive in a go-routine so Receive
doesn't block.
To ensure we only call it once, we use the lastReceivedRequests
map - if an entry already exists, then FlushStop should already have
been called and we can return.
* p2p: NodeInfo is an interface
* (squash) fixes from review
* (squash) more fixes from review
* p2p: remove peerConn.HandshakeTimeout
* p2p: NodeInfo is two interfaces. Remove String()
* fixes from review
* remove test code from peer.RemoteIP()
* p2p: remove peer.OriginalAddr(). See #2618
* use a mockPeer in peer_set_test.go
* p2p: fix testNodeInfo naming
* p2p: remove unused var
* remove testRandNodeInfo
* fix linter
* fix retry dialing self
* fix rpc
* Disable transitioning to new round upon 2/3+ of Precommit nils
Pull in ensureVote test function from https://github.com/tendermint/tendermint/pull/2132
* Add several ensureX test methods to wrap channel read with timeout
* Revert panic in tests
We are swapping the exisiting listener implementation with the newly
introduced Transport and its default implementation MultiplexTransport,
removing a large chunk of old connection setup and handling scattered
over the Peer and Switch code. The Switch requires a Transport now and
handles externally passed Peer filters.
Instead of mutating the passed in MConnConfig part of P2PConfig we just
use the default and override the values, the same as before as it was
always the default version. This is yet another good reason to not embed
information and access to config structs in our components and will go
away with the ongoing refactoring in #1325.
As both configs are concerned with the p2p packaage and PeerConfig is
only used inside of the package there is no good reason to keep the
couple of fields separate, therefore it is collapsed into the more
general P2PConifg. This is a stepping stone towards a setup where the
components inside of p2p do not have any knowledge about the config.
follow-up to #1325
As we didn't hear any voices requesting this feature, we removed the
option to disable it and always have peer connection auth encrypted.
closes#1518
follow-up #1325