This cleans up the `Router` code and adds a bunch of tests. These sorts of systems are a real pain to test, since they have a bunch of asynchronous goroutines living their own lives, so the test coverage is decent but not fantastic. Luckily we've been able to move all of the complex peer management and transport logic outside of the router, as synchronous components that are much easier to test, so the core router logic is fairly small and simple.
This also provides some initial test tooling in `p2p/p2ptest` that automatically sets up in-memory networks and channels for use in integration tests. It also includes channel-oriented test asserters in `p2p/p2ptest/require.go`, but these have primarily been written for router testing and should probably be adapted or extended for reactor testing.
This renames `PeerAddress` to `NodeAddress`, moves it and `NodeID` into a separate file `address.go`, adds tests for them, and fixes a bunch of bugs and inconsistencies.
This revises the new P2P `Transport` interface and does some preliminary code cleanups and simplifications.
The major change here is to add `Connection.Handshake()` for performing node handshakes (once the stream transport API is implemented, this can be done entirely independent of the transport). This moves most of the handshaking logic into the `Router`, such as prevention of head-of-line blocking, validation of peer's `NodeInfo`, controlling timeouts, and so on. This significantly simplifies transports, completely removes the need for internal goroutines, and shares common logic across all transports. This also allows varying the handshake `NodeInfo` across peers, e.g. to vary `ListenAddr`. Similarly, connection filtering is also moved into the switch/router so that it can be shared between transports.
This test occasionally fails because the peer is already stopped. It is unclear to me exactly what this test is supposed to do, since calling `FlushStop()` will stop the peer, but the test asserts that the peer shouldn't have been stopped by `FlushStop()` since calling `Stop()` afterwards will error in that case.
The current PEX reactor will be removed in the new P2P stack anyway.
This changes the new prototype PEX reactor to resolve peer address URLs into IP/port PEX addresses itself. Branched off of #5974.
I've spent some time thinking about address handling in the P2P stack. We currently use `PeerAddress` URLs everywhere, except for two places: when dialing a peer, and when exchanging addresses via PEX. We had two options:
1. Resolve addresses to endpoints inside `PeerManager`. This would introduce a lot of added complexity: we would have to track connection statistics per endpoint, have goroutines that asynchronously resolve and refresh these endpoints, deal with resolve scheduling before dialing (which is trickier than it sounds since it involves multiple goroutines in the peer manager and router and messes with peer rating order), handle IP address visibility issues, and so on.
2. Resolve addresses to endpoints (IP/port) only where they're used: when dialing, and in PEX. Everywhere else we use URLs.
I went with 2, because this significantly simplifies the handling of hostname resolution, and because I really think the PEX reactor should migrate to exchanging URLs instead of IP/port numbers anyway -- this allows operators to use DNS names for validators (and can easily migrate them to new IPs and/or load balance requests), and also allows different protocols (e.g. QUIC and `MemoryTransport`). Happy to discuss this.
Fixes#5899 by renaming a bunch of P2P Protobuf entities (while maintaining wire compatibility):
* `Message` to `PexMessage` (as it's only used for PEX messages).
* `PexAddrs` to `PexResponse`.
* `PexResponse.Addrs` to `PexResponse.Addresses`.
* `NetAddress` to `PexAddress` (as it's only used by PEX).
Fixes#5941.
Not entirely sure that this will fix the problem (couldn't reproduce), but in any case this is an artifact of a hack in the P2P transport refactor to make it work with the legacy P2P stack, and will be removed when the refactor is done anyway.
This implements a new `Transport` interface and related types for the P2P refactor in #5670. Previously, `conn.MConnection` was very tightly coupled to the `Peer` implementation -- in order to allow alternative non-multiplexed transports (e.g. QUIC), MConnection has now been moved below the `Transport` interface, as `MConnTransport`, and decoupled from the peer. Since the `p2p` package is not covered by our Go API stability, this is not considered a breaking change, and not listed in the changelog.
The initial approach was to implement the new interface in its final form (which also involved possible protocol changes, see https://github.com/tendermint/spec/pull/227). However, it turned out that this would require a large amount of changes to existing P2P code because of the previous tight coupling between `Peer` and `MConnection` and the reliance on subtleties in the MConnection behavior. Instead, I have broadened the `Transport` interface to expose much of the existing MConnection interface, preserved much of the existing MConnection logic and behavior in the transport implementation, and tried to make as few changes to the rest of the P2P stack as possible. We will instead reduce this interface gradually as we refactor other parts of the P2P stack.
The low-level transport code and protocol (e.g. MConnection, SecretConnection and so on) has not been significantly changed, and refactoring this is not a priority until we come up with a plan for QUIC adoption, as we may end up discarding the MConnection code entirely.
There are no tests of the new `MConnTransport`, as this code is likely to evolve as we proceed with the P2P refactor, but tests should be added before a final release. The E2E tests are sufficient for basic validation in the meanwhile.
After a reactor has failed to parse an incoming message, it shouldn't output the "bad" data into the logs, as that data is unfiltered and could have anything in it. (We also don't think this information is helpful to have in the logs anyways.)
*testing.T.TempDir() causes test cases to fail when
it is unable to remove the temporary directory once
the test case execution terminates. This seems to
happen often with pex reactor test cases.
Replace defer with t.Cleanup().
Replace the combination of ioutil.TempDir, error checking
and defer os.RemoveAll() with Go testing.T's new TempDir()
helper.
Mark auxiliary functions as test helpers.
## Description
Add test vectors for all reactors
- [x] state-sync
- [x] privval
- [x] mempool
- [x] p2p
- [x] evidence
- [ ] light?
this PR is primarily oriented at testvectors for things going over the wire. should we expand the testvectors into types as well?
Closes: #XXX
Closes#1581
This fixes the error in #1581, and also documents the purpose of this line. It ensures that if a peer tells us an address we know about, whose ID is the same as our current ID, we ignore it.
This removes the previous case where the ID's matched, but the IP's did not, which could yield a potential overwrite of the IP associated with the address later on. (This then would yield an eclipse attack)
This was not a vulnerability before though, thanks to a defensive check here 95fc7e58ee/p2p/pex/addrbook.go (L522))
## Description
This PR wraps the stdlib sync.(RW)Mutex & godeadlock.(RW)Mutex. This enables using go-deadlock via a build flag instead of using sed to replace sync with godeadlock in all files
Closes: #3242
## Description
partially cleanup in preparation for errcheck
i ignored a bunch of defer errors in tests but with the update to go 1.14 we can use `t.Cleanup(func() { if err := <>; err != nil {..}}` to cover those errors, I will do this in pr number two of enabling errcheck.
ref #5059
Closes#4603
Commands used (VIM):
```
:args `rg -l errors.Wrap`
:argdo normal @q | update
```
where q is a macros rewriting the `errors.Wrap` to `fmt.Errorf`.
to prevent malicious nodes from sending us large messages (~21MB, which
is the default `RecvMessageCapacity`)
This allows us to remove unnecessary `maxMsgSize` check in `decodeMsg`. Since each channel has a msg capacity set to `maxMsgSize`, there's no need to check it again in `decodeMsg`.
Closes#1503
in TestPEXReactorDialsPeerUpToMaxAttemptsInSeedMode
Closes#4668
______
For contributor use:
- [x] Wrote tests
- [ ] ~~Updated CHANGELOG_PENDING.md~~
- [x] Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
- [ ] ~~Updated relevant documentation (`docs/`) and code comments~~
- [x] Re-reviewed `Files changed` in the Github PR explorer
* mark unsolicited and too frequent messaged as bad
* add tests
* update changelog and fix error
* revised error types
Co-authored-by: Alexander Bezobchuk <alexanderbez@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Previously, many reactors were initialized with the name "Reactor," which made it difficult to log which reactor was doing what. This changes those reactors' names to something more descriptive.
* format: add format cmd & goimport repo
- replaced format command
- added goimports to format command
- ran goimports
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* fix outliers & undo proto file changes