* p2p/conn: FlushStop. Use in pex. Closes #2092
In seed mode, we call StopPeer immediately after Send.
Since flushing msgs to the peer happens in the background,
the peer connection is often closed before the messages are
actually sent out. The new FlushStop method allows all msgs
to first be written and flushed out on the conn before it is closed.
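A minimal sketch of the idea, not the actual p2p/conn code: the `conn` type, its queue and the `sendAndStop` helper are hypothetical names chosen for illustration. Send only queues a message, a background goroutine writes it, and FlushStop waits until everything queued has been written and flushed.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// conn is a hypothetical stand-in for a peer connection. Send only queues a
// message; a background goroutine writes it to the underlying writer.
type conn struct {
	queue chan []byte
	w     *bufio.Writer
	done  chan struct{}
}

func newConn(wire *bytes.Buffer) *conn {
	c := &conn{
		queue: make(chan []byte, 16),
		w:     bufio.NewWriter(wire),
		done:  make(chan struct{}),
	}
	go c.sendRoutine()
	return c
}

// Send queues msg and returns immediately; nothing has reached the wire yet.
func (c *conn) Send(msg []byte) { c.queue <- msg }

// sendRoutine writes queued messages until the queue is closed, then flushes.
// Write errors are ignored because a bytes.Buffer never returns one.
func (c *conn) sendRoutine() {
	defer close(c.done)
	for msg := range c.queue {
		c.w.Write(msg)
	}
	c.w.Flush()
}

// FlushStop stops accepting messages and blocks until everything queued so
// far has been written and flushed. Like the method described above, it must
// not be called twice: closing an already closed channel panics.
func (c *conn) FlushStop() {
	close(c.queue)
	<-c.done
}

// sendAndStop queues two messages and returns what reached the wire.
func sendAndStop() string {
	var wire bytes.Buffer
	c := newConn(&wire)
	c.Send([]byte("addr1;"))
	c.Send([]byte("addr2;"))
	c.FlushStop() // without this wait, wire could still be empty here
	return wire.String()
}

func main() {
	fmt.Println(sendAndStop())
}
```

In this sketch the caller would close the underlying connection only after FlushStop returns.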
* fix dummy peer
* typo
* fixes from review
* more comments
* ensure pex doesn't call FlushStop more than once
FlushStop is not safe to call more than once,
but we call it from Receive in a go-routine so Receive
doesn't block.
To ensure we only call it once, we use the lastReceivedRequests
map - if an entry already exists, then FlushStop should already have
been called and we can return.
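The guard can be sketched roughly like this. The `reactor` type below is a simplified, hypothetical stand-in (the real map may store other values and use a different locking scheme); only the "entry already exists, so return" logic is taken from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// reactor is a hypothetical, simplified stand-in for the PEX reactor.
type reactor struct {
	mtx                  sync.Mutex
	lastReceivedRequests map[string]struct{} // keyed by peer ID
	flushStops           int                 // how often the stop path ran
	wg                   sync.WaitGroup
}

// firstRequest records a request from peerID and reports whether it is the
// first one seen from that peer.
func (r *reactor) firstRequest(peerID string) bool {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	if _, seen := r.lastReceivedRequests[peerID]; seen {
		return false
	}
	r.lastReceivedRequests[peerID] = struct{}{}
	return true
}

// Receive must not block, so the stop path runs in its own goroutine. The
// map check guarantees that goroutine is started at most once per peer.
func (r *reactor) Receive(peerID string) {
	if !r.firstRequest(peerID) {
		return // the stop path was already triggered for this peer
	}
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		r.mtx.Lock()
		r.flushStops++ // stand-in for the FlushStop call
		r.mtx.Unlock()
	}()
}

// countStops delivers n requests from the same peer and returns how many
// times the stop path ran.
func countStops(n int) int {
	r := &reactor{lastReceivedRequests: make(map[string]struct{})}
	for i := 0; i < n; i++ {
		r.Receive("peer-1")
	}
	r.wg.Wait()
	return r.flushStops
}

func main() {
	fmt.Println(countStops(5))
}
```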
* use READ lock/unlock in ConsensusState#GetLastHeight
Refs #2721
* do not use defers when there's no need
* fix peer formatting (output its address instead of the pointer)
```
[54310]: E[11-02|11:59:39.851] Connection failed @ sendRoutine module=p2p peer=0xb78f00 conn=MConn{74.207.236.148:26656} err="pong timeout"
```
https://github.com/tendermint/tendermint/issues/2721#issuecomment-435326581
* panic if peer has no state
https://github.com/tendermint/tendermint/issues/2721#issuecomment-435347165
It's confusing that sometimes we check if peer has a state, but most of
the time we expect it to be there
1. add79700b5/mempool/reactor.go (L138)
2. add79700b5/rpc/core/consensus.go (L196)
I will change everything to always assume peer has a state and panic
otherwise.
That should help identify issues earlier.
* abci/localclient: extend lock on app callback
App callback should be protected by lock as well (note this was already
done for InitChainAsync, why not for others???). Otherwise, when we
execute the block, tx might come in and call the callback at the same
time we're updating it in execBlockOnProxyApp => DATA RACE
Fixes #2721
Consensus state is locked
```
goroutine 113333 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00180009c, 0xc0000c7e00)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*RWMutex).RLock(0xc001800090)
/usr/local/go/src/sync/rwmutex.go:50 +0x4e
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).GetRoundState(0xc001800000, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:218 +0x46
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).queryMaj23Routine(0xc0017def80, 0x11104a0, 0xc0072488f0, 0xc0072489c0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/reactor.go:735 +0x16d
created by github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).AddPeer
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/reactor.go:172 +0x236
```
because localClient is locked
```
goroutine 1899 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00003363c, 0xc0000cb500)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc000033638)
/usr/local/go/src/sync/mutex.go:134 +0xff
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client.(*localClient).SetResponseCallback(0xc0001fb560, 0xc007868540)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:32 +0x33
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy.(*appConnConsensus).SetResponseCallback(0xc00002f750, 0xc007868540)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:57 +0x40
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state.execBlockOnProxyApp(0x1104e20, 0xc002ca0ba0, 0x11092a0, 0xc00002f750, 0xc0001fe960, 0xc000bfc660, 0x110cfe0, 0xc000090330, 0xc9d12, 0xc000d9d5a0, ...)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state/execution.go:230 +0x1fd
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc002c2a230, 0x7, 0x0, 0xc000eae880, 0x6, 0xc002e52c60, 0x16, 0x1f927, 0xc9d12, 0xc000d9d5a0, ...)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state/execution.go:96 +0x142
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).finalizeCommit(0xc001800000, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1339 +0xa3e
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).tryFinalizeCommit(0xc001800000, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1270 +0x451
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit.func1(0xc001800000, 0x0, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1218 +0x90
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit(0xc001800000, 0x1f928, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1247 +0x6b8
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote(0xc001800000, 0xc003d8dea0, 0xc000cf4cc0, 0x28, 0xf1, 0xc003bc7ad0, 0xc003bc7b10)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1659 +0xbad
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote(0xc001800000, 0xc003d8dea0, 0xc000cf4cc0, 0x28, 0xf1, 0xf1, 0xf1)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1517 +0x59
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg(0xc001800000, 0xd98200, 0xc0070dbed0, 0xc000cf4cc0, 0x28)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:660 +0x64b
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine(0xc001800000, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:617 +0x670
created by github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).OnStart
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:311 +0x132
```
tx comes in and CheckTx is executed right when we execute the block
```
goroutine 111044 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00003363c, 0x0)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc000033638)
/usr/local/go/src/sync/mutex.go:134 +0xff
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client.(*localClient).CheckTxAsync(0xc0001fb0e0, 0xc002d94500, 0x13f, 0x280, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:85 +0x47
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy.(*appConnMempool).CheckTxAsync(0xc00002f720, 0xc002d94500, 0x13f, 0x280, 0x1)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:114 +0x51
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/mempool.(*Mempool).CheckTx(0xc002d3a320, 0xc002d94500, 0x13f, 0x280, 0xc0072355f0, 0x0, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/mempool/mempool.go:316 +0x17b
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/core.BroadcastTxSync(0xc002d94500, 0x13f, 0x280, 0x0, 0x0, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/core/mempool.go:93 +0xb8
reflect.Value.call(0xd85560, 0x10326c0, 0x13, 0xec7b8b, 0x4, 0xc00663f180, 0x1, 0x1, 0xc00663f180, 0xc00663f188, ...)
/usr/local/go/src/reflect/value.go:447 +0x449
reflect.Value.Call(0xd85560, 0x10326c0, 0x13, 0xc00663f180, 0x1, 0x1, 0x0, 0x0, 0xc005cc9344)
/usr/local/go/src/reflect/value.go:308 +0xa4
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.makeHTTPHandler.func2(0x1102060, 0xc00663f100, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/handlers.go:269 +0x188
net/http.HandlerFunc.ServeHTTP(0xc002c81f20, 0x1102060, 0xc00663f100, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:1964 +0x44
net/http.(*ServeMux).ServeHTTP(0xc002c81b60, 0x1102060, 0xc00663f100, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:2361 +0x127
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.maxBytesHandler.ServeHTTP(0x10f8a40, 0xc002c81b60, 0xf4240, 0x1102060, 0xc00663f100, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:219 +0xcf
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.RecoverAndLogHandler.func1(0x1103220, 0xc00121e620, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:192 +0x394
net/http.HandlerFunc.ServeHTTP(0xc002c06ea0, 0x1103220, 0xc00121e620, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:1964 +0x44
net/http.serverHandler.ServeHTTP(0xc001a1aa90, 0x1103220, 0xc00121e620, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:2741 +0xab
net/http.(*conn).serve(0xc00785a3c0, 0x11041a0, 0xc000f844c0)
/usr/local/go/src/net/http/server.go:1847 +0x646
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2851 +0x2f5
```
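The locking rule from this change (the callback is read and replaced under the same mutex that guards the app call) can be sketched as below. `localClient` here is a stripped-down, hypothetical type, and the `run` helper exists only to exercise it concurrently.

```go
package main

import (
	"fmt"
	"sync"
)

// localClient is a hypothetical, stripped-down in-process client. One mutex
// guards both the application call and the response callback.
type localClient struct {
	mtx sync.Mutex
	cb  func(res string)
}

// SetResponseCallback swaps the callback under the lock.
func (c *localClient) SetResponseCallback(cb func(res string)) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.cb = cb
}

// CheckTxAsync holds the lock while it runs the application and invokes the
// callback, so the callback cannot be replaced in the middle of a call.
func (c *localClient) CheckTxAsync(tx string) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	res := "ok:" + tx // stand-in for the application's CheckTx
	if c.cb != nil {
		c.cb(res)
	}
}

// run issues n CheckTxAsync calls while n other goroutines replace the
// callback, and returns how many responses were delivered.
func run(n int) int {
	delivered := 0 // only touched inside callbacks, i.e. under c.mtx
	count := func(string) { delivered++ }
	c := &localClient{cb: count}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); c.CheckTxAsync("tx") }()
		go func() { defer wg.Done(); c.SetResponseCallback(count) }()
	}
	wg.Wait()
	return delivered
}

func main() {
	fmt.Println(run(100))
}
```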
* consensus: use read lock in Receive#VoteMessage
* use defer to unlock mutex because application might panic
* use defer in every method of the localClient
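Why defer matters here, as a small sketch: the `client` type and `survivePanic` helper are hypothetical. If the application panics and the panic is recovered further up, a deferred Unlock has still run, so later calls can take the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical wrapper that serialises calls into an application.
type client struct {
	mtx   sync.Mutex
	calls int
}

// call runs app under the lock. The deferred Unlock runs even when app
// panics, so a recovered panic does not leave the mutex locked forever.
func (c *client) call(app func()) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.calls++
	app()
}

// survivePanic makes one panicking call followed by a normal one.
func survivePanic() (recovered bool, calls int) {
	c := &client{}
	func() {
		defer func() { recovered = recover() != nil }()
		c.call(func() { panic("application bug") })
	}()
	// With a plain Unlock placed after app(), this second call would block.
	c.call(func() {})
	return recovered, c.calls
}

func main() {
	fmt.Println(survivePanic())
}
```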
* add a changelog entry
* drain channels before Unsubscribe(All)
Read 55362ed766/libs/pubsub/pubsub.go (L13)
for the detailed explanation of the issue.
We'll need to fix it someday. Make sure to keep an eye on
https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-033-pubsub.md
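A rough sketch of the draining pattern, under the assumption that the server delivers messages with a blocking send and handles unsubscribe requests on the same loop. The `cmd`, `serve` and `drainAndUnsubscribe` names are hypothetical and this is not the libs/pubsub API.

```go
package main

import "fmt"

// cmd is either a publish (unsub == nil) or an unsubscribe request.
type cmd struct {
	msg   string
	unsub chan struct{} // closed by the server once the unsubscribe is done
}

// serve is a hypothetical single-goroutine pubsub loop. Delivery on out is a
// blocking send, so the loop cannot pick up the next command until the
// subscriber has read the current message.
func serve(cmds <-chan cmd, out chan<- string) {
	subscribed := true
	for c := range cmds {
		switch {
		case c.unsub != nil:
			subscribed = false
			close(c.unsub)
		case subscribed:
			out <- c.msg
		}
	}
}

// drainAndUnsubscribe keeps reading out while the unsubscribe command is in
// flight. Without the draining goroutine, the server could be stuck in
// `out <- c.msg` and never see the unsubscribe.
func drainAndUnsubscribe(cmds chan<- cmd, out <-chan string) int {
	stop := make(chan struct{})
	drained := make(chan int)
	go func() {
		n := 0
		for {
			select {
			case <-out:
				n++
			case <-stop:
				drained <- n
				return
			}
		}
	}()
	done := make(chan struct{})
	cmds <- cmd{unsub: done}
	<-done
	close(stop)
	return <-drained
}

// demo publishes three messages that nobody reads, then unsubscribes.
// It returns how many messages the drain discarded (between 0 and 3).
func demo() int {
	cmds := make(chan cmd)
	out := make(chan string)
	go serve(cmds, out)
	published := make(chan struct{})
	go func() {
		defer close(published)
		for _, m := range []string{"a", "b", "c"} {
			cmds <- cmd{msg: m}
		}
	}()
	n := drainAndUnsubscribe(cmds, out)
	<-published
	close(cmds)
	return n
}

func main() {
	fmt.Println("discarded while unsubscribing:", demo())
}
```

The exact number of discarded messages depends on goroutine scheduling; the point of the sketch is only that the unsubscribe completes instead of deadlocking.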
* retry instead of panic when peer has no state in reactors other than consensus
in /dump_consensus_state RPC endpoint, skip a peer with no state
* rpc/core/mempool: simplify error messages
* rpc/core/mempool: use time.After instead of timer
also, do not log DeliverTx result (to be consistent with other methods)
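For a one-shot wait, time.After keeps the select short. The sketch below is generic; `waitForResult` and the error text are hypothetical and not the actual rpc/core/mempool code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForResult waits for one value on results or gives up after timeout.
// time.After replaces an explicit time.NewTimer plus Stop for this
// one-shot wait.
func waitForResult(results <-chan string, timeout time.Duration) (string, error) {
	select {
	case res := <-results:
		return res, nil
	case <-time.After(timeout):
		return "", errors.New("timed out waiting for result")
	}
}

// demo returns the outcome for a channel that already holds a value and
// reports whether a channel that never delivers timed out.
func demo() (string, bool) {
	ready := make(chan string, 1)
	ready <- "committed"
	res, _ := waitForResult(ready, time.Second)
	_, err := waitForResult(make(chan string), 10*time.Millisecond)
	return res, err != nil
}

func main() {
	fmt.Println(demo())
}
```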
* unlock before calling the callback in reqRes#SetCallback
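A sketch of why the unlock has to come first, using a hypothetical `reqRes` type: if the callback itself calls back into the object, invoking it with the mutex held would deadlock.

```go
package main

import (
	"fmt"
	"sync"
)

// reqRes is a hypothetical request/response pair whose callback may be set
// after the response has already arrived.
type reqRes struct {
	mtx  sync.Mutex
	done bool
	res  string
	cb   func(res string)
}

// IsDone reports whether the response has arrived.
func (r *reqRes) IsDone() bool {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	return r.done
}

// SetCallback stores cb, or runs it right away if the response is already in.
// The mutex is released before cb runs, so cb may call back into reqRes.
func (r *reqRes) SetCallback(cb func(res string)) {
	r.mtx.Lock()
	if r.done {
		res := r.res
		r.mtx.Unlock()
		cb(res)
		return
	}
	r.cb = cb
	r.mtx.Unlock()
}

// demo sets a callback that re-enters reqRes. If SetCallback still held the
// lock while calling cb, IsDone would block forever.
func demo() string {
	r := &reqRes{done: true, res: "ok"}
	got := ""
	r.SetCallback(func(res string) {
		if r.IsDone() { // takes r.mtx again
			got = res
		}
	})
	return got
}

func main() {
	fmt.Println(demo())
}
```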
* p2p: re-check after sleeps
* use NodeInfo as an interface
* Revert "use NodeInfo as an interface"
This reverts commit 5f7d055e6c.
* Revert "p2p: re-check after sleeps"
This reverts commit 7f41070da0.
* preserve dial to itself
* ignore ensured connections while re-connecting
* re-check after sleep
* keep protocol definition on net addresses
* decrease log level
* Revert "preserve dial to itself"
This reverts commit 0c6e0fc58d.
* correct func comment according to modification
Co-Authored-By: mgurevin <mehmet@gurevin.net>
* Require addressbook to only store addresses with valid ID
* Do not shut down peer immediately after sending pex addrs in SeedMode
* p2p: fix #2773
* seed mode: use go-routine to sleep before stopping peer
* validate reactor messages
Refs #2683
* validate blockchain messages
Refs #2683
* validate evidence messages
Refs #2683
* todo
* check ProposalPOL and signature sizes
* add a changelog entry
* check addr is valid when we add it to the addrbook
* validate incoming netAddr (not just nil check!)
* fixes after Bucky's review
* check timestamps
* beef up block#ValidateBasic
* move some checks into bcBlockResponseMessage
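The validation commits above follow a common shape: each message gets a stateless ValidateBasic check that runs before any reactor logic. The sketch below illustrates that shape only; the message types, the `maxSignatureSize` value and the error strings are assumptions, not the real definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// maxSignatureSize is an assumed upper bound used only for this sketch.
const maxSignatureSize = 64

// message is the hypothetical interface every reactor message implements.
type message interface {
	ValidateBasic() error
}

// blockRequest asks a peer for the block at Height.
type blockRequest struct{ Height int64 }

func (m blockRequest) ValidateBasic() error {
	if m.Height < 0 {
		return errors.New("negative Height")
	}
	return nil
}

// vote carries a signature whose size is bounded.
type vote struct {
	Height    int64
	Signature []byte
}

func (m vote) ValidateBasic() error {
	if m.Height < 0 {
		return errors.New("negative Height")
	}
	if len(m.Signature) == 0 {
		return errors.New("signature is missing")
	}
	if len(m.Signature) > maxSignatureSize {
		return errors.New("signature is too big")
	}
	return nil
}

// receive runs the stateless checks before any reactor logic sees the message.
func receive(m message) string {
	if err := m.ValidateBasic(); err != nil {
		return "rejected: " + err.Error()
	}
	return "accepted"
}

func main() {
	fmt.Println(receive(blockRequest{Height: -1}))
	fmt.Println(receive(vote{Height: 1, Signature: make([]byte, 65)}))
	fmt.Println(receive(vote{Height: 1, Signature: make([]byte, 64)}))
}
```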
* update Gopkg.lock
Fix
```
grouped write of manifest, lock and vendor: failed to export github.com/tendermint/go-amino: fatal: failed to unpack tree object 6dcc6ddc14
```
by running `dep ensure -update`
* bump year since now we check it
* generate test/p2p/data on the fly using tendermint testnet
* allow sync chains older than 1 year
* use full path when creating a testnet
* move testnet gen to test/docker/Dockerfile
* relax LastCommitRound check
Refs #2737
* fix conflicts after merge
* add small comment
* some ValidateBasic updates
* fixes
* AppHash length is not fixed
* Introduce EventValidBlock for informing peer about wanted block
* Merge with develop
* Add isCommit flag to NewValidBlock message
- Add test for the case of +2/3 Precommit from the previous round
* p2p: add protocol Version to NodeInfo
* update node pkg. remove extraneous version files
* update changelog and docs
* fix test
* p2p: Version -> ProtocolVersion; more ValidateBasic and tests
* p2p: NodeInfo is an interface
* (squash) fixes from review
* (squash) more fixes from review
* p2p: remove peerConn.HandshakeTimeout
* p2p: NodeInfo is two interfaces. Remove String()
* fixes from review
* remove test code from peer.RemoteIP()
* p2p: remove peer.OriginalAddr(). See #2618
* use a mockPeer in peer_set_test.go
* p2p: fix testNodeInfo naming
* p2p: remove unused var
* remove testRandNodeInfo
* fix linter
* fix retry dialing self
* fix rpc
* require block.Time of the first block to be genesis time
Refs #2587:
```
We only start validating block.Time when Height > 1, because there is no
commit to compute the median timestamp from for the first block. This
means a faulty proposer could make the first block with whatever time
they want.
Instead, we should require the timestamp of block 1 to match the genesis
time.
I discovered this while refactoring the ValidateBlock tests to be
table-driven while working on tests for #2560.
```
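The rule quoted above could look roughly like this. `validateBlockTime` is a hypothetical helper; `medianTime` is assumed to be computed elsewhere from the previous commit, and the exact comparison used for heights above 1 is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// validateBlockTime is a hypothetical helper. medianTime stands for the
// median timestamp computed from the previous commit; it is unused at
// height 1 because no such commit exists yet.
func validateBlockTime(height int64, blockTime, genesisTime, medianTime time.Time) error {
	if height == 1 {
		if !blockTime.Equal(genesisTime) {
			return errors.New("block 1 time must equal genesis time")
		}
		return nil
	}
	if !blockTime.Equal(medianTime) {
		return errors.New("block time must equal median time of the last commit")
	}
	return nil
}

// valid wraps validateBlockTime with Unix seconds for easy experimentation.
func valid(height, block, genesis, median int64) bool {
	return validateBlockTime(height,
		time.Unix(block, 0), time.Unix(genesis, 0), time.Unix(median, 0)) == nil
}

func main() {
	fmt.Println(valid(1, 100, 100, 0)) // block 1 at genesis time
	fmt.Println(valid(1, 101, 100, 0)) // block 1 with an arbitrary time
}
```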
* do not accept blocks with negative height
* update changelog and spec
* nanos precision for test genesis time
* Fix failing test (#2607)
* Switch nodeID to be tmhash.Size, add test names for net addr tests
Both of these came up when locally trying to change tmhash size.
* fix error introduced by merge
* Disable transitioning to new round upon 2/3+ of Precommit nils
Pull in ensureVote test function from https://github.com/tendermint/tendermint/pull/2132
* Add several ensureX test methods to wrap channel read with timeout
* Revert panic in tests
* stop node upon receiving SIGTERM or CTRL-C even during genesis sleep by setting up interrupt before starting a node
Closes #2434
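The ordering can be sketched as follows: register the signal handler first, then start the (possibly sleeping) node. `startNode` and `simulate` are hypothetical helpers; the sketch injects a signal value into the channel rather than signalling the process.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// startNode is a hypothetical start routine that sleeps until genesis time
// but gives up as soon as a signal arrives on sigs.
func startNode(sigs <-chan os.Signal, untilGenesis time.Duration) string {
	select {
	case <-sigs:
		return "stopped"
	case <-time.After(untilGenesis):
		return "started"
	}
}

// simulate runs startNode with or without a pending SIGTERM.
func simulate(interrupted bool) string {
	sigs := make(chan os.Signal, 1)
	if interrupted {
		sigs <- syscall.SIGTERM // injected directly instead of signalling the process
	}
	return startNode(sigs, 20*time.Millisecond)
}

func main() {
	// Register the handler before starting, so a signal that arrives during
	// the genesis sleep is handled by this code path.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)
	fmt.Println(startNode(sigs, 20*time.Millisecond))
	fmt.Println(simulate(true))
}
```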
* call Start, not OnStart when starting a component to avoid:
```
E[09-24|10:13:15.805] Not stopping PubSub -- have not been started yet module=pubsub impl=PubSub
```
being printed on exit
We are swapping the existing listener implementation with the newly
introduced Transport and its default implementation MultiplexTransport,
removing a large chunk of old connection setup and handling scattered
over the Peer and Switch code. The Switch requires a Transport now and
handles externally passed Peer filters.
This is the implementation for the design described in ADR 12[0]. It's
the first step of a larger refactor of the p2p package as tracked in #2067[1].
It introduces an interface bundling all concerns of low-level connection
handling and isolating the rest of peer lifecycle management from the
specifics of the low-level internet protocols. Even if the swappable
implementation will never be utilised, already the isolation of conn
related code in one place will help with reasoning about execution paths
and with addressing security sensitive issues surfaced through bounty
programs and audits.
We deliberately decided to not have Peer filtering and other management
in the Transport; its sole responsibility is the translation of
connections to Peers, handing those to the caller fully set up. It's the
responsibility of the caller to reject those and/or keep track. Peer
filtering will take place in the Switch and can be inspected in the
following commit.
This changeset additionally is an exercise in clean separation of logic
and other infrastructural concerns like logging and instrumentation, by
leveraging a clean and minimal interface. How this looks can be seen in
a follow-up change.
Design #2069[2]
Refs #2067[3]
Fixes #2047[4]
Fixes #2046[5]
changes:
* describe Transport interface
* implement new default Transport: MultiplexTransport
* test MultiplexTransport with new constraints
* implement ConnSet for concurrent management of net.Conn, synchronous
to PeerSet
* implement and expose duplicate IP filter
* implement TransportOption for optional parametrisation
[0] https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-012-peer-transport.md
[1] https://github.com/tendermint/tendermint/issues/2067
[2] https://github.com/tendermint/tendermint/pull/2069
[3] https://github.com/tendermint/tendermint/issues/2067
[4] https://github.com/tendermint/tendermint/issues/2047
[5] https://github.com/tendermint/tendermint/issues/2046
* ignore existing peers in DialPeersAsync
Fixes #2253
* rename HasPeerWithAddress to IsDialingOrExistingAddress
[breaking] remove Switch#IsDialing
* check if addrBook is nil
to be consistent with other usages of addrBook across Switch
* different log messages for 2 use-cases
* update secret connection to use a little endian encoded nonce
* update encoding of chunk length to be little endian, too
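A small sketch of little-endian encoding with encoding/binary. The nonce size, the position of the counter inside the nonce and the 4-byte length header are assumptions made for illustration, not a statement of the actual wire format.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// nonceSize and the position of the counter inside the nonce are assumptions
// for this sketch: a 12-byte AEAD nonce whose last 8 bytes hold the counter.
const nonceSize = 12

// incrNonce increments the little-endian counter stored in nonce[4:].
func incrNonce(nonce *[nonceSize]byte) {
	counter := binary.LittleEndian.Uint64(nonce[4:])
	counter++
	binary.LittleEndian.PutUint64(nonce[4:], counter)
}

// chunkHeader encodes a chunk length as 4 little-endian bytes.
func chunkHeader(length uint32) [4]byte {
	var h [4]byte
	binary.LittleEndian.PutUint32(h[:], length)
	return h
}

// nonceAfter returns a zero nonce incremented n times.
func nonceAfter(n int) [nonceSize]byte {
	var nonce [nonceSize]byte
	for i := 0; i < n; i++ {
		incrNonce(&nonce)
	}
	return nonce
}

func main() {
	// The least significant byte of the counter comes first.
	fmt.Printf("%x\n", nonceAfter(1))
	fmt.Printf("%x\n", nonceAfter(256))
	fmt.Printf("%x\n", chunkHeader(1024))
}
```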
* update comment
* Change comment slightly to trigger circleci
* [p2p/pex] connect to more than 10 peers
also, remove DefaultMinNumOutboundPeers because a) I am not sure it's
needed b) it's super confusing
look closely
```
maxPeers := sw.config.MaxNumPeers - DefaultMinNumOutboundPeers
if maxPeers <= sw.peers.Size() {
sw.Logger.Info("Ignoring inbound connection: already have enough peers", "address", inConn.RemoteAddr().String(), "numPeers", sw.peers.Size(), "max", maxPeers)
```
we print maxPeers = config.MaxNumPeers - DefaultMinNumOutboundPeers. So we
may not have enough peers even though we say we have enough.
Refs #2130
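The arithmetic behind the complaint, as a tiny sketch. The helper names are hypothetical and the numbers 50 and 10 are illustrative values, not taken from the commit.

```go
package main

import "fmt"

// inboundLimit reproduces the arithmetic from the quoted snippet.
func inboundLimit(maxNumPeers, minNumOutboundPeers int) int {
	return maxNumPeers - minNumOutboundPeers
}

// acceptInbound mirrors the quoted check: an inbound connection is refused
// once the current peer count reaches the reduced limit.
func acceptInbound(maxNumPeers, minNumOutboundPeers, currentPeers int) bool {
	return inboundLimit(maxNumPeers, minNumOutboundPeers) > currentPeers
}

func main() {
	// 50 and 10 are illustrative values, not taken from the commit.
	fmt.Println(inboundLimit(50, 10))      // 40
	fmt.Println(acceptInbound(50, 10, 39)) // true
	fmt.Println(acceptInbound(50, 10, 40)) // false, although 10 slots remain
}
```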
* update spec
* replace MaxNumPeers with MaxNumInboundPeers/MaxNumOutboundPeers
Refs #2130
* update changelog
* make max rpc conns formula visible to users
* update spec
* docs: note max outbound peers excludes persistent
* p2p/pex: Allow configured seed nodes to be offline
Previously you couldn't start up tendermint if a seed node was offline.
This now allows you to start up tendermint, as long as all seed node addresses
are formatted correctly. In the event that all seed nodes are down
and the address book is empty, it crashes with an informative error msg.
(This case doesn't occur if no seeds were specified.)
Closes #1716
* (Squash this) Address melekes' comments
* (squash this) fix package imports
* (squash this) fix pex_reactor comment
* (squash this) add a test case
This is to reduce wait times when initially connecting. This still runs checks
such as whether you still want additional peers.
A test case has been created, which fails if this change is not included.
This now uses one hkdf on the X25519 shared secret to create
a key for the sender and receiver.
The hkdf call is now just called upon the computed shared
secret, since the shared secret is a function of the pubkeys.
The nonces now start at 0, as we are using chacha as a stream
cipher, and the sender and receiver now have different keys.
Generate keys with HKDF instead of hash functions, which provides better security properties.
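A self-contained sketch of deriving a send key and a receive key from one shared secret with HKDF (RFC 5869) over HMAC-SHA256, using only the standard library. The info label, the 64-byte output split and the `localIsLeast` flag are assumptions for illustration; the real code may order and label things differently.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// hkdfSHA256 is a small RFC 5869 HKDF (extract, then expand) built on
// HMAC-SHA256. It is meant for short outputs such as the 64 bytes used below.
func hkdfSHA256(secret, salt, info []byte, length int) []byte {
	if len(salt) == 0 {
		salt = make([]byte, sha256.Size)
	}
	// Extract: PRK = HMAC(salt, secret).
	ext := hmac.New(sha256.New, salt)
	ext.Write(secret)
	prk := ext.Sum(nil)

	// Expand: T(i) = HMAC(PRK, T(i-1) || info || i).
	var out, prev []byte
	for i := byte(1); len(out) < length; i++ {
		exp := hmac.New(sha256.New, prk)
		exp.Write(prev)
		exp.Write(info)
		exp.Write([]byte{i})
		prev = exp.Sum(nil)
		out = append(out, prev...)
	}
	return out[:length]
}

// deriveKeys expands the shared secret into two 32-byte keys. Which half a
// side sends with depends on an ordering both sides agree on (here the
// hypothetical flag localIsLeast), so one side's send key is the other
// side's receive key.
func deriveKeys(sharedSecret []byte, localIsLeast bool) (recvKey, sendKey [32]byte) {
	okm := hkdfSHA256(sharedSecret, nil, []byte("EXAMPLE_KEY_GEN"), 64)
	if localIsLeast {
		copy(recvKey[:], okm[:32])
		copy(sendKey[:], okm[32:])
	} else {
		copy(sendKey[:], okm[:32])
		copy(recvKey[:], okm[32:])
	}
	return recvKey, sendKey
}

func main() {
	secret := []byte("stand-in for the X25519 shared secret")
	aRecv, aSend := deriveKeys(secret, true)
	bRecv, bSend := deriveKeys(secret, false)
	fmt.Println(aSend == bRecv, bSend == aRecv, aSend != aRecv)
}
```

Because each direction has its own key, both sides can start their nonces at 0 without reusing a (key, nonce) pair.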
Add xchacha20poly1305 to secret connection. (Due to rebasing, this code has been removed)