tendermint

Commit Graph

Author	SHA1	Message	Date
Erik Grinaker	c900303ac6	test: fix flaky router broadcast test (#6006 ) Fixes #6004 by reordering test to avoid race condition. Will redesign router tests to be resistant to this later.	4 years ago
Erik Grinaker	363804ac21	test: fix TestRouter to take into account PeerManager reconnects (#6002 ) Fixes #5981, which was caused by changes in Router behavior after the introduction of the peer manager, leading to a race condition that could halt the test. This is a temporary measure, I'll start tightening up the new P2P core tomorrow and write "real" tests with better test infrastructure.	4 years ago
Erik Grinaker	5a9b740acb	test: fix TestSwitchAcceptRoutine by ignoring spurious error (#6001 ) Another fix for `TestSwitchAcceptRoutine` following from #6000, since the `SetDeadline()` call also errors when the connection has been closed.	4 years ago
Erik Grinaker	aead4ab555	test: fix test data race in p2p.MemoryTransport with logger (#5995 ) This patches over a test data race where the logger would try to read struct internals via `reflect` while these were concurrently modified (specifically `MemoryTransport.closeOnce`).	4 years ago
Aleksandr Bezobchuk	bd8a9372d2	consensus: Groom Logs (#5917 ) Executed a local network using simapp and looked for logs that seemed superfluous. This isn't by any means an exhaustive grooming, but should drastically help legibility of logs. ref: #5912	4 years ago
Marko	70bb8cc8b7	proto: separate native and proto types (#5994 ) ## Description Separate protobuf and domain types. We should avoid using protobuf in our core logic. ref #5460	4 years ago
Erik Grinaker	4dca066aab	test: disable TestPEXReactorSeedModeFlushStop due to flake (#5996 ) This test occasionally fails because the peer is already stopped. It is unclear to me exactly what this test is supposed to do, since calling `FlushStop()` will stop the peer, but the test asserts that the peer shouldn't have been stopped by `FlushStop()` since calling `Stop()` afterwards will error in that case. The current PEX reactor will be removed in the new P2P stack anyway.	4 years ago
Erik Grinaker	6e3c58204a	test: fix TestSwitchAcceptRoutine flake by ignoring error type (#6000 ) Fixes #5998. Sometimes the connection returns "use of closed network connection" instead, so for now we just accept any error. The switch is not long for this world anyway.	4 years ago
Erik Grinaker	f54f80bf0d	test: don't use foo-bar.net in TestHTTPClientMakeHTTPDialer (#5997 ) This test relied on connecting to the external site `foo-bar.net`, and (predictably) the site went down and broke all of our CI runs. This changes it to use local HTTP servers instead.	4 years ago
Anton Kaliaev	8ce254cdb7	CONTRIBUTING.md: update testing section (#5979 ) [✌️ RENDERED](ad5a2ec28b/CONTRIBUTING.md) Closes #5874	4 years ago
Anton Kaliaev	a2e684e51f	.github: archive crashers and fix set-crashers-count step (#5992 )	4 years ago
Sergey	3759bc511b	docs: fix typo in state sync example (#5989 )	4 years ago
Erik Grinaker	06de7459c9	p2p: use stopCtx when dialing peers in Router (#5983 ) This ensures we don't leak dial goroutines when shutting down the router.	4 years ago
Aleksandr Bezobchuk	642ecc3f5c	mempool: fix mempool tests timeout (#5988 )	4 years ago
Aleksandr Bezobchuk	b19acfb605	mempool: fix TestReactorNoBroadcastToSender (#5984 ) ## Description Looks like I missed a test in the original PR when fixing the tests. Closes: #5956	4 years ago
Erik Grinaker	937a18468a	test/p2p: close transports to avoid goroutine leak failures (#5982 )	4 years ago
Erik Grinaker	fe5b312337	p2p: resolve PEX addresses in PEX reactor (#5980 ) This changes the new prototype PEX reactor to resolve peer address URLs into IP/port PEX addresses itself. Branched off of #5974. I've spent some time thinking about address handling in the P2P stack. We currently use `PeerAddress` URLs everywhere, except for two places: when dialing a peer, and when exchanging addresses via PEX. We had two options: 1. Resolve addresses to endpoints inside `PeerManager`. This would introduce a lot of added complexity: we would have to track connection statistics per endpoint, have goroutines that asynchronously resolve and refresh these endpoints, deal with resolve scheduling before dialing (which is trickier than it sounds since it involves multiple goroutines in the peer manager and router and messes with peer rating order), handle IP address visibility issues, and so on. 2. Resolve addresses to endpoints (IP/port) only where they're used: when dialing, and in PEX. Everywhere else we use URLs. I went with 2, because this significantly simplifies the handling of hostname resolution, and because I really think the PEX reactor should migrate to exchanging URLs instead of IP/port numbers anyway -- this allows operators to use DNS names for validators (and can easily migrate them to new IPs and/or load balance requests), and also allows different protocols (e.g. QUIC and `MemoryTransport`). Happy to discuss this.	4 years ago
Erik Grinaker	7ea8746ed1	proto/p2p: rename PEX messages and fields (#5974 ) Fixes #5899 by renaming a bunch of P2P Protobuf entities (while maintaining wire compatibility): * `Message` to `PexMessage` (as it's only used for PEX messages). * `PexAddrs` to `PexResponse`. * `PexResponse.Addrs` to `PexResponse.Addresses`. * `NetAddress` to `PexAddress` (as it's only used by PEX).	4 years ago
Erik Grinaker	51aca684b8	p2p: add prototype PEX reactor for new stack (#5971 ) This adds a prototype PEX reactor for the new P2P stack.	4 years ago
Anton Kaliaev	8718f6f5ff	terminate go-fuzz gracefully (w/ SIGINT) (#5973 ) and preserve exit code. ``` 2021/01/26 03:34:49 workers: 2, corpus: 4 (8m28s ago), crashers: 0, restarts: 1/9976, execs: 11013732 (21596/sec), cover: 121, uptime: 8m30s make: *** [fuzz-mempool] Terminated Makefile:5: recipe for target 'fuzz-mempool' failed Error: Process completed with exit code 124. ``` https://github.com/tendermint/tendermint/runs/1766661614 `continue-on-error` should make GH ignore any error codes.	4 years ago
Marko	91823eba32	tests: fix `make test` (#5966 ) ## Description - bump deadlock dep to master - fixes `make test` since we now use `deadlock.Once` Closes: #XXX	4 years ago
Aleksandr Bezobchuk	b3aae970d8	blockchain v0: fix waitgroup data race (#5970 ) ## Description Fixes the data race in usage of `WaitGroup`. Specifically, the case where we invoke `Wait` _before_ the first delta `Add` call when the current waitgroup counter is zero. See https://golang.org/pkg/sync/#WaitGroup.Add. Still not sure how this manifests itself in a test since the reactor has to be stopped virtually immediately after being started (I think?). Regardless, this is the appropriate fix. closes: #5968	4 years ago
Erik Grinaker	13e772c916	p2p: add PeerManager.Advertise() (#5957 ) Adds a naïve `PeerManager.Advertise()` method that the new PEX reactor can use to fetch addresses to advertise, as well as some other `FIXME`s on address advertisement.	4 years ago
Erik Grinaker	81daaacae9	p2p: simplify PeerManager upgrade logic (#5962 ) Follow-up from #5947, branched off of #5954. This simplifies the upgrade logic by adding explicit eviction requests, which can also be useful for other use-cases (e.g. if we need to ban a peer that's misbehaving). Changes: * Add `evict` map which queues up peers to explicitly evict. * `upgrading` now only tracks peers that we're upgrading via dialing (`DialNext` → `Dialed`/`DialFailed`). * `Dialed` will unmark `upgrading`, and queue `evict` if still beyond capacity. * `Accepted` will pick a random lower-scored peer to upgrade to, if appropriate, and doesn't care about `upgrading` (the dial will fail later, since it's already connected). * `EvictNext` will return a peer scheduled in `evict` if any, otherwise if beyond capacity just evict the lowest-scored peer. This limits all of the `upgrading` logic to `DialNext`, `Dialed`, and `DialFailed`, making it much simpler, and it should generally do the right thing in all cases I can think of.	4 years ago
Erik Grinaker	a741314c97	p2p: improve peerStore prototype (#5954 ) This improves the `peerStore` prototype by e.g.: * Using a database with Protobuf for persistence, but also keeping full peer set in memory for performance. * Simplifying the API, by taking/returning struct copies for safety, and removing errors for in-memory operations. * Caching the ranked peer set, as a temporary solution until a better data structure is implemented. * Adding `PeerManagerOptions.MaxPeers` and pruning the peer store (based on rank) when it's full. * Rewriting `PeerAddress` to be independent of `url.URL`, normalizing it and tightening semantics.	4 years ago
Aleksandr Bezobchuk	9e158839f6	mempool: fix reactor tests (#5967 ) ## Description Update the faux router to either drop channel errors or handle them based on an argument. This prevents deadlocks in tests where we try to send an error on the mempool channel but there is no reader. Closes: #5956	4 years ago
Callum Waters	aecfb0ecf0	e2e: add control over the log level of nodes (#5958 )	4 years ago
Anton Kaliaev	680fb18414	.github: fix fuzz-nightly job (#5965 ) outputs is a property of the job, not an individual step.	4 years ago
Marko	962a82c06e	docs: log level docs (#5945 ) ## Description add section on configuring log levels Closes: #XXX	4 years ago
Anton Kaliaev	d76add65a6	libs/log: format []byte as hexadecimal string (uppercased) (#5960 ) Closes: #5806 Co-authored-by: Lanie Hei <heixx011@umn.edu>	4 years ago
Erik Grinaker	7e0436c6e6	p2p: make PeerManager.DialNext() and EvictNext() block (#5947 ) See #5936 and #5938 for background. The plan was initially to have `DialNext()` and `EvictNext()` return a channel. However, implementing this became unnecessarily complicated and error-prone. As an example, the channel would be both consumed and populated (via method calls) by the same driving method (e.g. `Router.dialPeers()`) which could easily cause deadlocks where a method call blocked while sending on the channel that the caller itself was responsible for consuming (but couldn't since it was busy making the method call). It would also require a set of goroutines in the peer manager that would interact with the goroutines in the router in non-obvious ways, and fully populating the channel on startup could cause deadlocks with other startup tasks. Several issues like these made the solution hard to reason about. I therefore simply made `DialNext()` and `EvictNext()` block until the next peer was available, using internal triggers to wake these methods up in a non-blocking fashion when any relevant state changes occurred. This proved much simpler to reason about, since there are no goroutines in the peer manager (except for trivial retry timers), nor any blocking channel sends, and it instead relies entirely on the existing goroutine structure of the router for concurrency. This also happens to be the same pattern used by the `Transport.Accept()` API, following Go stdlib conventions, so all router goroutines end up using a consistent pattern as well.	4 years ago
odidev	cd3ebe8754	docker: release Linux/ARM64 image (#5925 ) Co-authored-by: Marko <marbar3778@yahoo.com>	4 years ago
Marko	b958ba3440	docker: dont login when in PR (#5961 )	4 years ago
Anton Kaliaev	df22e7354c	test/fuzz: move fuzz tests into this repo (#5918 ) Co-authored-by: Emmanuel T Odeke <emmanuel@orijtech.com> Closes #5907 - add init-corpus to blockchain reactor - remove validator-set FromBytes test (now that we have proto, we don't need to test it; bye amino) - simplify mempool test (do we want to test remote ABCI app?) - do not recreate mux on every crash in jsonrpc test - update p2p pex reactor test - remove p2p/listener test (the API has changed, and I did not understand what it tested anyway) - update secretconnection test - add readme and makefile - list inputs in readme - add nightly workflow - remove blockchain fuzz test (EncodeMsg / DecodeMsg no longer exist)	4 years ago
Erik Grinaker	9c98af4277	test: fix TestPEXReactorRunning data race (#5955 ) Fixes #5941. Not entirely sure that this will fix the problem (couldn't reproduce), but in any case this is an artifact of a hack in the P2P transport refactor to make it work with the legacy P2P stack, and will be removed when the refactor is done anyway.	4 years ago
Erik Grinaker	ac49ea8bb7	Makefile: always pull image in proto-gen-docker. (#5953 ) The `proto-gen-docker` target didn't pull an updated Docker image, and would use a local image if present which could be outdated and produce wrong results.	4 years ago
Marko	a72fb2fbad	docs: change v0.33 version (#5950 ) ## Description - change version for v0.33.x Closes: #XXX	4 years ago
Aleksandr Bezobchuk	68bd2116f0	mempool: p2p refactor (#5919 )	4 years ago
Erik Grinaker	670e9b427b	p2p: improve PeerManager prototype (#5936 ) This improves the prototype peer manager by: * Exporting `PeerManager`, making it accessible by e.g. reactors. * Replacing `Router.SubscribePeerUpdates()` with `PeerManager.Subscribe()`. * Tracking address/peer connection statistics, and retrying dial failures with exponential backoff. * Prioritizing peers, with persistent peers configuration. * Limiting simultaneous connections. * Evicting peers and upgrading to higher-priority peers. * Tracking peer heights, as a workaround for legacy shared peer state APIs. This is getting to a point where we need to determine precise semantics and implement tests, so we should figure out whether it's a reasonable abstraction that we want to use. The main questions are around the API model (i.e. synchronous method calls with the router polling the manager, vs. an event-driven model using channels, vs. the peer manager calling methods on the router to connect/disconnect peers), and who should have the responsibility of managing actual connections (currently the router, while the manager only tracks peer state).	4 years ago
Aleksandr Bezobchuk	15c1936b85	p2p: revise shim log levels (#5940 ) Downgrade some noisy logs to DEBUG.	4 years ago
Marko	c63854f732	proto: docker deployment (#5931 )	4 years ago
Jack Yeh	527550f372	Update metrics.md (#5930 )	4 years ago
Marko	64961e2267	e2e: releases nightly (#5906 )	4 years ago
Tess Rinearson	eff1b16a0c	changelog: update changelog for v0.34.3 (#5927 ) (No changelog pending updates, since we all forgot to update the changelog pending with all these changes 🤡 )	4 years ago
Tess Rinearson	ea77360ecf	.github/workflows: enable manual dispatch for some workflows (#5929 )	4 years ago
Tess Rinearson	5972105b06	docs: update package-lock.json (#5928 )	4 years ago
Callum	af723eca8a	use correct source of evidence time Conflicting votes are now sent to the evidence pool to form duplicate vote evidence only once the height of the evidence is finished and the time of the block finalised.	4 years ago
Aleksandr Bezobchuk	62d7a5d028	blockchain v0: p2p refactor (#5858 )	4 years ago
Erik Grinaker	96215a06ed	p2p: add prototype peer lifecycle manager (#5882 ) This adds a prototype peer lifecycle manager, `peerManager`, which stores peer data in an internal `peerStore`. The overall idea here is to have methods for peer lifecycle events which exchange a very narrow subset of peer data, and to keep all of the peer metadata (i.e. the `peerInfo` struct) internal, to decouple this from the router and simplify concurrency control. See `peerManager` GoDoc for more information. The router is still responsible for actually dialing and accepting peer connections, and routing messages across them, but the peer manager is responsible for determining which peers to dial next, preventing multiple connections being established for the same peer (e.g. both inbound and outbound), and making sure we don't dial the same peer several times in parallel. Later it will also track retries and exponential backoff, as well as peer and address quality. It also assumes responsibility for peer updates subscriptions. It's a bit unclear to me whether we want the peer manager to take on the responsibility of actually dialing and accepting connections as well, or if it should only be tracking peer state for the router while the router is responsible for all transport concerns. Let's revisit this later.	4 years ago
Tess Rinearson	3ef0b90afd	readme: add security mailing list (#5916 ) No one knows we have this mailing list 🙈	4 years ago
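
Commit b3aae970d8 above describes a `WaitGroup` data race in which `Wait` could run before the first `Add` while the counter was zero. The snippet below is a minimal sketch of the usual safe ordering, not the actual reactor change: `runWorkers` is a hypothetical helper, and the only point it illustrates is that `Add` is called in the spawning goroutine before `Wait` can be reached.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper (not taken from the tendermint code).
// It starts n goroutines and sums the values 1..n.
func runWorkers(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 1; i <= n; i++ {
		// Add runs in the spawning goroutine, so it always happens
		// before Wait below. Calling Add inside the worker instead
		// could let Wait observe a zero counter and return early.
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			total += v
			mu.Unlock()
		}(i)
	}
	wg.Wait() // safe: every Add has already happened
	return total
}

func main() {
	fmt.Println(runWorkers(10))
}
```

Running a variant that moves `wg.Add(1)` into the goroutine under `go test -race` is one way to observe the kind of report the commit refers to.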
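
Commit 7e0436c6e6 above explains why `DialNext()` and `EvictNext()` block until a peer is available, with internal triggers that wake them without blocking the caller that changed the state. One common way to build such a trigger in Go is a channel of capacity 1 with a non-blocking send. The sketch below assumes that approach; `peerQueue`, `wake`, and the `done` channel parameter are all hypothetical names, and it assumes a single consumer goroutine. It is not the `PeerManager` implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errStopped = errors.New("peer queue stopped")

// peerQueue is a hypothetical stand-in for a peer manager.
type peerQueue struct {
	mtx    sync.Mutex
	peers  []string
	wakeCh chan struct{} // capacity 1: holds at most one pending wake-up
}

func newPeerQueue() *peerQueue {
	return &peerQueue{wakeCh: make(chan struct{}, 1)}
}

// wake signals a state change without ever blocking the caller.
func (q *peerQueue) wake() {
	select {
	case q.wakeCh <- struct{}{}:
	default: // a wake-up is already pending, nothing to do
	}
}

// Add records a peer and wakes a waiting DialNext call, if any.
func (q *peerQueue) Add(id string) {
	q.mtx.Lock()
	q.peers = append(q.peers, id)
	q.mtx.Unlock()
	q.wake()
}

// tryNext pops the next peer if one is available.
func (q *peerQueue) tryNext() (string, bool) {
	q.mtx.Lock()
	defer q.mtx.Unlock()
	if len(q.peers) == 0 {
		return "", false
	}
	id := q.peers[0]
	q.peers = q.peers[1:]
	return id, true
}

// DialNext blocks until a peer is available or done is closed.
// State is always re-checked after a wake-up, so a stale wake-up
// only costs one extra loop iteration.
func (q *peerQueue) DialNext(done <-chan struct{}) (string, error) {
	for {
		if id, ok := q.tryNext(); ok {
			return id, nil
		}
		select {
		case <-q.wakeCh:
		case <-done:
			return "", errStopped
		}
	}
}

func main() {
	q := newPeerQueue()
	go q.Add("peer-a")         // state change arrives from another goroutine
	id, err := q.DialNext(nil) // nil done channel: wait indefinitely
	fmt.Println(id, err)
}
```

Because the producer never blocks on the wake-up send, the deadlock described in the commit message (a method call blocking on a channel that its own caller must consume) cannot arise in this shape.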
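
Commit 670e9b427b above mentions retrying dial failures with exponential backoff. As a rough illustration of that idea only, the sketch below doubles the delay after each consecutive failure and caps it at a maximum. The function name `retryDelay` and its parameters are hypothetical, and the real peer manager may compute its delays differently (for example with jitter or per-peer settings).

```go
package main

import (
	"fmt"
	"time"
)

// retryDelay is a hypothetical helper: it returns how long to wait
// before the next dial attempt after the given number of consecutive
// failures. The delay starts at base, doubles per failure, and never
// exceeds max.
func retryDelay(failures int, base, max time.Duration) time.Duration {
	if failures <= 0 {
		return 0 // no failures yet: dial immediately
	}
	if base >= max {
		return max
	}
	delay := base
	for i := 1; i < failures; i++ {
		// Compare against max/2 before doubling to avoid overflow.
		if delay > max/2 {
			return max
		}
		delay *= 2
	}
	return delay
}

func main() {
	for n := 0; n <= 8; n++ {
		fmt.Println(n, retryDelay(n, time.Second, time.Minute))
	}
}
```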


9494 Commits (80d3765ebfff1e028550e3d6989e76a73cae6f7c) All Branches Search
