tendermint

Commit Graph

Author	SHA1	Message	Date
JayT106	84ffaaaf37	statesync/rpc: metrics for the statesync and the rpc SyncInfo (#6795)	3 years ago
Sam Kleinman	9dfdc62eb7	proxy: move proxy package to internal (#6953)	3 years ago
Callum Waters	bda948e814	statesync: implement p2p state provider (#6807)	3 years ago
JayT106	e70445f942	statesync/event: emit statesync start/end event (#6700)	3 years ago
Callum Waters	6dd0cf92c8	router/statesync: add helpful log messages (#6724)	3 years ago
Sam Kleinman	ab5c63eff3	statesync: increase dispatcher timeout (#6714)	3 years ago
Callum Waters	a12e2bbb60	statesync: use initial height as a floor to backfilling (#6709)	3 years ago
William Banfield	cabd916517	Revert "statesync: keep peer despite lightblock query fail (#6692)" (#6696) This reverts commit 50b00dff71.	3 years ago
William Banfield	50b00dff71	statesync: keep peer despite lightblock query fail (#6692) When a peer responds with no lightblock for the height we queried, we call the [removePeer method](https://github.com/tendermint/tendermint/blob/master/internal/statesync/reactor.go#L339). This removes the peer from the dispatcher's list of called peers (`ad65883152/internal/statesync/dispatcher.go`, L159). When the dispatcher then receives responses from the removed peer, it drops their responses (`ad65883152/internal/statesync/dispatcher.go`, L130). These responses may be meaningful or contain a block or data that will help statesync proceed. [The logs](https://gist.github.com/tychoish/34a1f61eaae3c36c23efc7d0001e805c), when this change is applied, show an additional 3 networking testnets passing. Addresses: #6691	3 years ago
Callum Waters	051e127d38	light: correctly handle contexts (#6687)	3 years ago
Aleksandr Bezobchuk	1dec3e139a	add stacktrace to panic logs (#6662)	3 years ago
Callum Waters	a1e1e6c290	test: fix non-deterministic backfill test (#6648)	3 years ago
Sam Kleinman	917180dfd2	p2p: reduce buffering on channels (#6609) Having smaller buffers in each reactor/channel will mean that there will be fewer stale messages.	3 years ago
Callum Waters	6e238b5b9d	statesync: make fetching chunks more robust (#6587)	3 years ago
Callum Waters	25bb556fee	p2p: increase queue size to 16MB (#6588)	3 years ago
Aleksandr Bezobchuk	7d961b55b2	state sync: tune request timeout and chunkers (#6566)	3 years ago
Callum Waters	74af343f28	statesync: tune backfill process (#6565) This PR makes some tweaks to backfill after running e2e tests: - Separates sync and backfill into two distinct processes that the node calls. The reason is that if sync fails the node should fail, but if backfill fails it is still possible to proceed. - Removes peers that don't have the block at a height from the local peer list. As the process goes backwards, a node that doesn't have a block at a height is likely pruning blocks and thus won't have any prior ones either. - Sleeps when we've run out of peers, then tries again.	3 years ago
Callum Waters	6f6ac5c04e	state sync: reverse sync implementation (#6463)	3 years ago
Sam Kleinman	a855f96946	p2p: renames for reactors and routing layer internal moves (#6547)	3 years ago
Marko	719e028e00	libs: internalize some packages (#6366) ## Description Internalize some libs. This reduces the amount of public API Tendermint is supporting. The moved libraries are mainly ones that are used within Tendermint-core.	3 years ago
Sam Kleinman	d36a5905a6	statesync: improve e2e test outcomes (#6378) In my testing, this seems to help the e2e state-sync tests complete more reliably by fixing some potential range-related slice-building issues, as well as the way the test app hashes snapshots. Additionally, and I'm not sure if we want to do this, I added a hook to the reactor that re-sends the request for snapshots during the retry. This helps prevent systems from getting stuck in tests, but in reality it might create more traffic, and operators would just restart a state-syncing node to get a similar effect.	4 years ago
Aleksandr Bezobchuk	a554005136	p2p: revised router message scheduling (#6126)	4 years ago
Aleksandr Bezobchuk	16bbe8c862	consensus: p2p refactor (#5969)	4 years ago
Erik Grinaker	9b6d6a3ad0	p2p: tighten up Router and add tests (#6044) This cleans up the `Router` code and adds a bunch of tests. These sorts of systems are a real pain to test, since they have a bunch of asynchronous goroutines living their own lives, so the test coverage is decent but not fantastic. Luckily we've been able to move all of the complex peer management and transport logic outside of the router, as synchronous components that are much easier to test, so the core router logic is fairly small and simple. This also provides some initial test tooling in `p2p/p2ptest` that automatically sets up in-memory networks and channels for use in integration tests. It also includes channel-oriented test asserters in `p2p/p2ptest/require.go`, but these have primarily been written for router testing and should probably be adapted or extended for reactor testing.	4 years ago
Erik Grinaker	2aad26e2f1	p2p: tighten up and test PeerManager (#6034) This tightens up the `PeerManager` and related code, adds a ton of tests, and fixes a bunch of inconsistencies and bugs.	4 years ago
Aleksandr Bezobchuk	68bd2116f0	mempool: p2p refactor (#5919)	4 years ago
Aleksandr Bezobchuk	62d7a5d028	blockchain v0: p2p refactor (#5858)	4 years ago
Aleksandr Bezobchuk	e986602649	evidence: p2p refactor (#5747)	4 years ago
Aleksandr Bezobchuk	8bf77d9b1a	statesync: do not recover panic on peer updates (#5869)	4 years ago
Erik Grinaker	1b6df6783d	p2p: replace PeerID with NodeID	4 years ago
Anton Kaliaev	aef1ac7ba5	modify Reactor priorities (#5826) blockchain/vX reactor priority was decreased because during the normal operation (i.e. when the node is not fast syncing) blockchain priority can't be the same as consensus reactor priority. Otherwise, it's theoretically possible to slow down consensus by constantly requesting blocks from the node. NOTE: ideally blockchain/vX reactor priority would be dynamic. e.g. when the node is fast syncing, the priority is 10 (max), but when it's done fast syncing - the priority gets decreased to 5 (only to serve blocks for other nodes). But it's not possible now, therefore I decided to focus on the normal operation (priority = 5). evidence and consensus critical messages are more important than the mempool ones, hence priorities are bumped by 1 (from 5 to 6). statesync reactor priority was changed from 1 to 5 to be the same as blockchain/vX priority. Refs https://github.com/tendermint/tendermint/issues/5816	4 years ago
Aleksandr Bezobchuk	0565eb5943	state sync: cleanup (#5776)	4 years ago
Aleksandr Bezobchuk	a879eb444d	p2p: state sync reactor refactor (#5671)	4 years ago
Tess Rinearson	79890d8393	reactors: omit incoming message bytes from reactor logs (#5743) After a reactor has failed to parse an incoming message, it shouldn't output the "bad" data into the logs, as that data is unfiltered and could have anything in it. (We also don't think this information is helpful to have in the logs anyways.)	4 years ago
Anton Kaliaev	e13b4386ff	abci: modify Client interface and socket client (#5673) `abci.Client`: - Sync and Async methods now accept a context for cancellation * grpc client uses context to cancel both Sync and Async requests * local client ignores context parameter * socket client uses context to cancel Sync requests and to drop Async requests before sending them if context was cancelled prior to that - Async methods return an error * socket client returns an error immediately if queue is full for Async requests * local client always returns nil error * grpc client returns an error if context was cancelled before we got response or the receiving queue had a space for response (do not confuse with the sending queue from the socket client) - specify client semantics in [doc.go](https://raw.githubusercontent.com/tendermint/tendermint/27112fffa62276bc016d56741f686f0f77931748/abci/client/doc.go) `mempool.TxInfo` - add optional `Context` to `TxInfo`, which can be used to cancel `CheckTx` request Closes #5190	4 years ago
Anton Kaliaev	f2f6a78809	docs: warn developers about calling blocking funcs in Receive (#5679) Refs #2888	4 years ago
Erik Grinaker	f83ecdad1d	config: add state sync discovery_time setting (#5399) Reduces the state sync discovery time from 20 to 15 seconds, and makes it configurable.	4 years ago
Erik Grinaker	2f4c1f60c7	statesync: broadcast snapshot request to all peers on startup (#5320) On startup, the peer-to-peer stack may have peers connected before the state sync process begins, causing these to not trigger `AddPeer` events and thus not be used for snapshot discovery. Broadcasting a snapshot request to these explicitly makes sure we discover snapshots from existing peers as well.	4 years ago
Marko	2ac5a559b4	libs: wrap mutexes for build flag with godeadlock (#5126) ## Description This PR wraps the stdlib sync.(RW)Mutex & godeadlock.(RW)Mutex. This enables using go-deadlock via a build flag instead of using sed to replace sync with godeadlock in all files Closes: #3242	4 years ago
Marko	dedf0d2350	proto: folder structure adhere to buf (#5025)	4 years ago
Marko	4e6a844d6f	statesync: use Protobuf instead of Amino for p2p traffic (#4943) ## Description Closes: #XXX	4 years ago
Erik Grinaker	511ab6717c	add state sync reactor (#4705) Fixes #828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (https://github.com/cosmos/cosmos-sdk/pull/5803) and Gaia (https://github.com/cosmos/gaia/pull/327). This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704. * Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested. * Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default. * Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.	5 years ago

19 Commits (84ffaaaf37f2456f37d19e71cd8a311eede40e7c)
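
Commit e13b4386ff above describes two behaviours of the socket client for Async requests: an error is returned immediately if the queue is full, and a request is dropped before sending if its context was already cancelled. The Go sketch below illustrates only that pattern, using a buffered channel and a non-blocking `select`. It is not Tendermint's code: `asyncQueue`, `Enqueue`, `Flush`, `errQueueFull` and `demo` are hypothetical names chosen for this illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errQueueFull is a hypothetical sentinel error for this sketch only.
var errQueueFull = errors.New("async request queue is full")

// request pairs a payload with the context it was submitted under.
type request struct {
	ctx     context.Context
	payload string
}

// asyncQueue is an illustrative stand-in for a client's send queue.
type asyncQueue struct {
	ch chan request
}

func newAsyncQueue(size int) *asyncQueue {
	return &asyncQueue{ch: make(chan request, size)}
}

// Enqueue never blocks: it returns errQueueFull right away
// when the buffered channel has no free slot.
func (q *asyncQueue) Enqueue(ctx context.Context, payload string) error {
	select {
	case q.ch <- request{ctx: ctx, payload: payload}:
		return nil
	default:
		return errQueueFull
	}
}

// Flush drains the queue and returns the payloads that would be sent.
// Requests whose context was cancelled before sending are dropped.
func (q *asyncQueue) Flush() []string {
	var sent []string
	for {
		select {
		case r := <-q.ch:
			if r.ctx.Err() != nil {
				continue // cancelled before sending: drop it
			}
			sent = append(sent, r.payload)
		default:
			return sent // queue is empty
		}
	}
}

// demo records the result of three Enqueue calls on a queue of size 2,
// followed by the payloads that survive Flush after one context is cancelled.
func demo() []string {
	q := newAsyncQueue(2)
	ctx, cancel := context.WithCancel(context.Background())

	var out []string
	out = append(out, fmt.Sprint(q.Enqueue(context.Background(), "a")))
	out = append(out, fmt.Sprint(q.Enqueue(ctx, "b")))
	out = append(out, fmt.Sprint(q.Enqueue(context.Background(), "c"))) // queue full
	cancel() // "b" is cancelled while still queued
	return append(out, q.Flush()...)
}

func main() {
	fmt.Println(demo())
}
```

In this sketch the third `Enqueue` fails because both slots are taken, and `Flush` keeps only `"a"` because the context for `"b"` was cancelled while it was still queued. A real client would write to a connection instead of collecting payloads in a slice.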