tendermint

Commit Graph

Author	SHA1	Message	Date
William Banfield	177850a2c9	statesync: remove deadlock on init fail (#7029) When statesync is stopped during shutdown, it has the possibility of deadlocking. A dump of goroutines reveals that this is related to the peerUpdates channel not returning anything on its `Done()` channel when `OnStop` is called. As this is occurring, `processPeerUpdate` is attempting to acquire the reactor lock. It appears that this lock can never be acquired. I looked for the places where the lock may remain locked accidentally and cleaned them up in hopes to eradicate the issue. Dumps of the relevant goroutines may be found below. Note that the line numbers below are relative to the code in the `v0.35.0-rc1` tag. ``` goroutine 36 [chan receive]: github.com/tendermint/tendermint/internal/statesync.(*Reactor).OnStop(0xc00058f200) github.com/tendermint/tendermint/internal/statesync/reactor.go:243 +0x117 github.com/tendermint/tendermint/libs/service.(*BaseService).Stop(0xc00058f200, 0x0, 0x0) github.com/tendermint/tendermint/libs/service/service.go:171 +0x323 github.com/tendermint/tendermint/node.(*nodeImpl).OnStop(0xc0001ea240) github.com/tendermint/tendermint/node/node.go:769 +0x132 github.com/tendermint/tendermint/libs/service.(*BaseService).Stop(0xc0001ea240, 0x0, 0x0) github.com/tendermint/tendermint/libs/service/service.go:171 +0x323 github.com/tendermint/tendermint/cmd/tendermint/commands.NewRunNodeCmd.func1.1() github.com/tendermint/tendermint/cmd/tendermint/commands/run_node.go:143 +0x62 github.com/tendermint/tendermint/libs/os.TrapSignal.func1(0xc000629500, 0x7fdb52f96358, 0xc0002b5030, 0xc00000daa0) github.com/tendermint/tendermint/libs/os/os.go:26 +0x102 created by github.com/tendermint/tendermint/libs/os.TrapSignal github.com/tendermint/tendermint/libs/os/os.go:22 +0xe6 goroutine 188 [semacquire]: sync.runtime_SemacquireMutex(0xc00026b1cc, 0x0, 0x1) runtime/sema.go:71 +0x47 sync.(*Mutex).lockSlow(0xc00026b1c8) sync/mutex.go:138 +0x105 sync.(*Mutex).Lock(...) sync/mutex.go:81 sync.(*RWMutex).Lock(0xc00026b1c8) sync/rwmutex.go:111 +0x90 github.com/tendermint/tendermint/internal/statesync.(*Reactor).processPeerUpdate(0xc00026b080, 0xc000650008, 0x28, 0x124de90, 0x4) github.com/tendermint/tendermint/internal/statesync/reactor.go:849 +0x1a5 github.com/tendermint/tendermint/internal/statesync.(*Reactor).processPeerUpdates(0xc00026b080) github.com/tendermint/tendermint/internal/statesync/reactor.go:883 +0xab created by github.com/tendermint/tendermint/internal/statesync.(*Reactor).OnStart github.com/tendermint/tendermint/internal/statesync/reactor.go:219 +0xcd ```	3 years ago
Sam Kleinman	1c4950dbd2	state: move package to internal (#6964)	3 years ago
JayT106	84ffaaaf37	statesync/rpc: metrics for the statesync and the rpc SyncInfo (#6795)	3 years ago
Sam Kleinman	9dfdc62eb7	proxy: move proxy package to internal (#6953)	3 years ago
Sam Kleinman	ae5f98881b	p2p: make NodeID and NetAddress public (#6583)	3 years ago
Callum Waters	6e238b5b9d	statesync: make fetching chunks more robust (#6587)	3 years ago
Sam Kleinman	a855f96946	p2p: renames for reactors and routing layer internal moves (#6547)	3 years ago
Marko	1709e49813	version: revert version through ldflag only (#6494) ## Description Add version back to versions, but allow it to be overridden via an ldflag. Reason: Many users are not setting the ldflag, causing issues with tooling that relies on it (cosmjs) closes #6488 cc @webmaster128	3 years ago
Marko	719e028e00	libs: internalize some packages (#6366) ## Description Internalize some libs. This reduces the amount of public API tendermint is supporting. The moved libraries are mainly ones that are used within Tendermint-core.	3 years ago
Sam Kleinman	399c366185	statesync: sort snapshots by commonness (#6385)	4 years ago
Sam Kleinman	d36a5905a6	statesync: improve e2e test outcomes (#6378) In my testing, this seems to help the e2e state-sync tests complete more reliably by fixing some potential range-related slice-building issues, as well as the way the test app hashes snapshots. Additionally, though I'm not sure we want to do this, I added a hook to the reactor that re-sends the request for snapshots during the retry. In tests this helps prevent systems from getting stuck, but in reality it might create more traffic, and operators would just restart a state-syncing node to get a similar effect.	4 years ago
Marko	70bb8cc8b7	proto: separate native and proto types (#5994) ## Description Separate protobuf and domain types. We should avoid using protobuf in our core logic. ref #5460	4 years ago
Aleksandr Bezobchuk	c75dee5a02	state sync: Fix TestSyncer_SyncAny (#5835)	4 years ago
Erik Grinaker	1b6df6783d	p2p: replace PeerID with NodeID	4 years ago
Aleksandr Bezobchuk	a879eb444d	p2p: state sync reactor refactor (#5671)	4 years ago
Anton Kaliaev	e13b4386ff	abci: modify Client interface and socket client (#5673) `abci.Client`: - Sync and Async methods now accept a context for cancellation * grpc client uses context to cancel both Sync and Async requests * local client ignores context parameter * socket client uses context to cancel Sync requests and to drop Async requests before sending them if context was cancelled prior to that - Async methods return an error * socket client returns an error immediately if queue is full for Async requests * local client always returns nil error * grpc client returns an error if context was cancelled before we got response or the receiving queue had a space for response (do not confuse with the sending queue from the socket client) - specify client semantics in [doc.go](https://raw.githubusercontent.com/tendermint/tendermint/27112fffa62276bc016d56741f686f0f77931748/abci/client/doc.go) `mempool.TxInfo` - add optional `Context` to `TxInfo`, which can be used to cancel `CheckTx` request Closes #5190	4 years ago
Anton Kaliaev	85a4be87a7	rpc/client: take context as first param (#5347) Closes #5145 also applies to light/client	4 years ago
Marko	2ac5a559b4	libs: wrap mutexes for build flag with godeadlock (#5126) ## Description This PR wraps the stdlib sync.(RW)Mutex & godeadlock.(RW)Mutex. This enables using go-deadlock via a build flag instead of using sed to replace sync with godeadlock in all files. Closes: #3242	4 years ago
Marko	6ccccb0933	lint: errcheck (#5091) ## Description add more error checks to tests. gonna do a third PR that tackles the non-test cases	4 years ago
Erik Grinaker	59a17b28a7	proto: improve enums (#5099) Fixes some minor issues with Protobuf enums, not likely to break anything. Branched off of #5096, rebase to `master` before merging.	4 years ago
Erik Grinaker	bf3c87c864	test: deflake TestAddAndRemoveListenerConcurrency and TestSyncer_SyncAny (#5101) Fixes #5094.	4 years ago
Marko	dedf0d2350	proto: folder structure adhere to buf (#5025)	4 years ago
Marko	7a8224f8a3	state: proto migration (#4972) ## Description the second part of state proto migration Closes: #XXX	4 years ago
Marko	b9af87c4ea	state: proto migration (#4951)	4 years ago
Marko	4e6a844d6f	statesync: use Protobuf instead of Amino for p2p traffic (#4943) ## Description Closes: #XXX	4 years ago
Erik Grinaker	81c2798df0	abci: fix protobuf lint issues Fix some linter issues to conform with the Protobuf style guide. The state sync enum changes are ok to break since it's not released yet. Personally I find the uppercase kind of ugly, but that's what the guide says. Couldn't find a way to generate camel case in Go, short of specifying custom names for each and every enum variant. Another option would be to simply disable the enum case lint.	5 years ago
Erik Grinaker	511ab6717c	add state sync reactor (#4705) Fixes #828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (https://github.com/cosmos/cosmos-sdk/pull/5803) and Gaia (https://github.com/cosmos/gaia/pull/327). This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704. * Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested. * Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default. * Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.	5 years ago

7 Commits (ce89292712288be9b7baa57e7d84ca4bd79025a1)
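
Note on commit 177850a2c9: the message describes a reactor lock that "may remain locked accidentally", which leaves `processPeerUpdate` blocked in `RWMutex.Lock`. The Go sketch below illustrates that general failure mode and one common way to avoid it, releasing the lock with `defer` so that early error returns cannot leak it. It is not the code from the commit: the `reactor` type, its fields, `startSync`, and `errAlreadySyncing` are hypothetical names chosen for illustration, and only `processPeerUpdate` echoes a name from the log.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// reactor is a hypothetical stand-in, NOT the tendermint statesync Reactor.
type reactor struct {
	mtx    sync.RWMutex
	syncer *string // non-nil while a sync is in progress (assumed field)
}

// errAlreadySyncing is a hypothetical error used only in this sketch.
var errAlreadySyncing = errors.New("sync already in progress")

// startSync takes the write lock and releases it via defer, so every
// return path, including the early error return, unlocks the mutex.
func (r *reactor) startSync(name string) error {
	r.mtx.Lock()
	defer r.mtx.Unlock()

	if r.syncer != nil {
		// Without the defer, returning here while still holding the
		// lock would block every later Lock call forever.
		return errAlreadySyncing
	}
	r.syncer = &name
	return nil
}

// processPeerUpdate needs the same lock. It can only make progress if
// startSync released the lock on all of its return paths.
func (r *reactor) processPeerUpdate(peer string) string {
	r.mtx.Lock()
	defer r.mtx.Unlock()

	if r.syncer == nil {
		return peer + ": idle"
	}
	return peer + ": syncing " + *r.syncer
}

func main() {
	r := &reactor{}
	fmt.Println(r.startSync("snapshot-1"))     // first call succeeds
	fmt.Println(r.startSync("snapshot-2"))     // early error return
	fmt.Println(r.processPeerUpdate("peer-a")) // still able to lock
}
```

The design choice here is simply to pair each `Lock` with a deferred `Unlock` at the top of the function, which keeps the unlock next to the lock and independent of how many return statements are added later.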