@p4u from vocdoni.io reported that the mempool might behave incorrectly under
high load. The consequences can range from pauses between blocks to peers
disconnecting from this node.
My current theory is that the flowrate lib we're using to control flow
(multiplexing over a single TCP connection) was not designed with large blobs
(a 1MB batch of txs) in mind.
I've tried decreasing the mempool reactor priority, but that did not
have any visible effect. What actually worked was adding a time.Sleep
into mempool.Reactor#broadcastTxRoutine after each successful send, i.e.
a manual form of flow control.
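A rough sketch of that workaround (the helper, the sender interface, and the 10ms pause are illustrative assumptions, not the actual broadcastTxRoutine code):

```go
// Pause after each successful send: crude manual flow control so a large batch
// does not saturate the single multiplexed TCP connection.
type txSender interface {
	Send(chID byte, msgBytes []byte) bool
}

func broadcastWithPause(peer txSender, chID byte, txs [][]byte) {
	for _, txBytes := range txs {
		if peer.Send(chID, txBytes) {
			time.Sleep(10 * time.Millisecond)
		}
	}
}
```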
As a temporary remedy (until the mempool package
is refactored), max-batch-bytes was disabled. Transactions will now be sent
one by one, without batching.
Closes #5796
When set to true, an invalid transaction will be kept in the cache (this may help some applications protect against spam).
NOTE: this is a temporary config option. A more correct solution would be to add a TTL to each transaction (i.e. CheckTx may return a TTL in ResponseCheckTx).
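A minimal sketch of the cache behavior described above (the helper and parameter names are illustrative, not the actual mempool code):

```go
// An invalid tx is evicted from the cache only when the option is off. With the
// option on, a repeated spam tx is rejected by the cache before reaching CheckTx.
type txCache interface {
	Remove(tx []byte)
}

func afterCheckTx(cache txCache, tx []byte, code uint32, keepInvalidTxsInCache bool) {
	const codeTypeOK = 0
	if code != codeTypeOK && !keepInvalidTxsInCache {
		cache.Remove(tx)
	}
}
```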
Closes: #5751
time_iota_ms is intended to ensure that an honest validator always generates timestamps
that increase monotonically. For this purpose, it always suffices to set this parameter
to `1ms`. Allowing users to choose different values increases the bug surface area.
Thus the code now ignores the user-provided time_iota_ms parameter (marking it as unused)
and uses 1ms internally.
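Illustratively, the rule this enforces boils down to the following sketch (not the exact consensus code):

```go
// The next block's timestamp must be strictly greater than the previous one;
// a fixed 1ms increment is always sufficient to guarantee monotonicity.
func minNextBlockTime(lastBlockTime time.Time) time.Time {
	return lastBlockTime.Add(1 * time.Millisecond)
}
```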
The `NodeInfo` interface does not appear to serve any purpose at all, so I removed it and renamed the `DefaultNodeInfo` struct to `NodeInfo` (including the Protobuf representations). Let me know if this is actually needed for anything.
Only the Protobuf rename is listed in the changelog, since we do not officially support API stability of the `p2p` package (according to `README.md`). The on-wire protocol remains compatible.
Closes #5766
* memoize the scSchedulerFail error to avoid printing it on every scheduleFreq tick (see the sketch after this list)
* blockchain/v2: modify switchIO funcs to accept peer instead of peerID
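A sketch of the memoization from the first item (struct and field names are illustrative):

```go
// Log the scheduler-failure error only when it changes, rather than on every
// scheduleFreq tick.
type schedulerErrLog struct {
	lastErr string
	logger  log.Logger
}

func (s *schedulerErrLog) maybeLog(err error) {
	if err == nil || err.Error() == s.lastErr {
		return
	}
	s.logger.Error("scheduler failure", "err", err)
	s.lastErr = err.Error()
}
```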
Closes: #5770, closes: #5769
Also, include the node ID in the output (#5769) and modify NodeKey to use
value semantics (it makes perfect sense for NodeKey not to be a
pointer).
I introduced a new variable, syncEnded, which is now used to prevent
sending new events to channels (which would otherwise block) once the
reactor has finished syncing.
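A minimal sketch of that guard (the struct, the atomic flag, and the channel name are assumptions):

```go
type reactor struct {
	syncEnded uint32 // set to 1 atomically once syncing is done
	eventsCh  chan interface{}
}

func (r *reactor) sendEvent(ev interface{}) {
	// Once syncing has ended, nothing drains eventsCh anymore, so a send would
	// block forever; drop the event instead.
	if atomic.LoadUint32(&r.syncEnded) == 1 {
		return
	}
	r.eventsCh <- ev
}
```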
Closes #4591
`abci.Client`:
- Sync and Async methods now accept a context for cancellation (see the usage sketch after this list)
* grpc client uses context to cancel both Sync and Async requests
* local client ignores context parameter
* socket client uses the context to cancel Sync requests and to drop Async requests before sending them if the context was cancelled beforehand
- Async methods return an error
* socket client returns an error immediately if the queue is full for Async requests
* local client always returns nil error
* grpc client returns an error if the context was cancelled before we got a response or before the receiving queue had space for the response (not to be confused with the sending queue of the socket client)
- specify client semantics in [doc.go](https://raw.githubusercontent.com/tendermint/tendermint/27112fffa62276bc016d56741f686f0f77931748/abci/client/doc.go)
`mempool.TxInfo`
- add an optional `Context` to `TxInfo`, which can be used to cancel the `CheckTx` request
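A usage sketch of the new semantics; the authoritative method names and signatures live in doc.go, so the ones below (e.g. `CheckTxSync(ctx, req)`, the two-second timeout) are assumptions for illustration, with `abcicli` and `types` standing for the abci client and abci types packages:

```go
// checkTxWithTimeout aborts the request if no response arrives in time.
func checkTxWithTimeout(c abcicli.Client, tx []byte) (*types.ResponseCheckTx, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// Sync call: returns the response, or an error if the context is cancelled
	// first. For Async calls, the returned error also reports a full queue
	// (socket client) before anything is sent.
	return c.CheckTxSync(ctx, types.RequestCheckTx{Tx: tx})
}
```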
Closes #5190
Closes #5444
Now we record the fact that a peer does not have a requested block and later use this information to make a new request for the same block from another peer.
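A sketch of the bookkeeping this implies (types and names are illustrative):

```go
// Track, per height, which peers reported not having that block, and skip them
// when re-requesting it.
type missingTracker struct {
	missing map[int64]map[string]struct{} // height -> IDs of peers without the block
}

func (t *missingTracker) markMissing(height int64, peerID string) {
	if t.missing == nil {
		t.missing = make(map[int64]map[string]struct{})
	}
	if t.missing[height] == nil {
		t.missing[height] = make(map[string]struct{})
	}
	t.missing[height][peerID] = struct{}{}
}

func (t *missingTracker) canRequestFrom(height int64, peerID string) bool {
	_, reportedMissing := t.missing[height][peerID]
	return !reportedMissing
}
```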
Removes `p2p.FuzzedConnection`, since it does not appear to be in use. While these sorts of test wrappers may be useful, they should be injected directly instead of bleeding through into the main application configuration. We'll implement something similar if and when necessary, for the new P2P abstractions in #2067.
Before: the scheduler receives a psBlockProcessed event, but does not mark the block as processed because the peer timed out (or was removed for other reasons) and all associated blocks were rescheduled.
After: the scheduler receives a psBlockProcessed event and marks the block as processed in any case (even if the peer who provided this block has errored).
Closes #5387
When a peer is stopped due to some network issue, the Reactor calls scheduler#handleRemovePeer, which removes the peer from the scheduler. But the peer stays in the processor, which can sometimes lead to a "duplicate block enqueued by processor" panic when the same block is requested by the scheduler again from a different peer. The solution is to return scPeerError, which will be propagated to the processor. The processor will clean up the blocks associated with the peer in purgePeer.
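A sketch of the cleanup on the processor side (types and field names are illustrative; purgePeer is the method named above):

```go
// On a peer error, drop every block queued from that peer so the same height
// can be re-requested from another peer without triggering the
// "duplicate block enqueued by processor" panic.
type queuedBlock struct {
	peerID string
}

type processor struct {
	queue map[int64]queuedBlock // height -> block awaiting processing
}

func (p *processor) purgePeer(peerID string) {
	for height, b := range p.queue {
		if b.peerID == peerID {
			delete(p.queue, height)
		}
	}
}
```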
Closes #5513, #5517
Fixes #5439. This is really a workaround for #5519 (unless we require async implementations to return ordered responses, but that kind of defeats the purpose of having an async API).