If we are low on addresses for peering, we need to discover more peers. The
previous behavior would query existing peers; however, if an existing peer
does not participate in peer exchange, then our node will not discover more peers.
This change consults both existing peers and seeds when there is a deficit
of address book addresses. This allows peers to be discovered through existing
channels as well as via seeds if existing peers do not share addresses.
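As a rough sketch of that dispatch (the names `NeedMoreAddrs`, `requestAddrsFromPeers`, and `dialSeeds` are illustrative stand-ins, not the reactor's actual API):

```go
package main

import "fmt"

// addrBook is a trimmed stand-in for the real address book.
type addrBook struct{ have, want int }

func (b *addrBook) NeedMoreAddrs() bool { return b.have < b.want }

type reactor struct{ book *addrBook }

// Stand-ins for the two discovery paths described above.
func (r *reactor) requestAddrsFromPeers() { fmt.Println("asking existing peers for addresses") }
func (r *reactor) dialSeeds()             { fmt.Println("dialing configured seeds") }

// ensureAddrs consults both existing peers and seeds whenever there is a
// deficit of addresses, so discovery still works when peers don't do PEX.
func (r *reactor) ensureAddrs() {
	if !r.book.NeedMoreAddrs() {
		return
	}
	r.requestAddrsFromPeers()
	r.dialSeeds()
}

func main() {
	r := &reactor{book: &addrBook{have: 3, want: 10}}
	r.ensureAddrs()
}
```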
* Remove deprecated PanicXXX functions from codebase
As per discussion over
[here](https://github.com/tendermint/tendermint/pull/3456#discussion_r278423492),
we need to remove these `PanicXXX` functions and eliminate our
dependence on them. In this PR, each and every `PanicXXX` function call
is replaced with a simple `panic` call.
* add a changelog entry
* use dialPeer function in seed mode
Fixes #3532
by storing the number of connection attempts in memory and
removing the address from the addrbook when the number of attempts exceeds 16
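A rough sketch of that bookkeeping (the names and the `addrBook` interface here are hypothetical; only the in-memory counter and the > 16 cutoff come from the description above):

```go
package main

import (
	"fmt"
	"time"
)

const maxAttemptsToDial = 16

// addrBook is a minimal stand-in for the real address book.
type addrBook interface {
	RemoveAddress(addr string)
}

type attempt struct {
	number int
	last   time.Time
}

// dialAttempts lives only in memory; it is not persisted across restarts.
var dialAttempts = make(map[string]attempt)

// recordFailedDial bumps the per-address counter and drops the address from
// the book once the number of attempts exceeds the cap.
func recordFailedDial(book addrBook, addr string) {
	a := dialAttempts[addr]
	a.number++
	a.last = time.Now()
	if a.number > maxAttemptsToDial {
		book.RemoveAddress(addr)
		delete(dialAttempts, addr)
		return
	}
	dialAttempts[addr] = a
}

type fakeBook struct{}

func (fakeBook) RemoveAddress(addr string) { fmt.Println("removed", addr) }

func main() {
	for i := 0; i < 17; i++ {
		recordFailedDial(fakeBook{}, "1.2.3.4:26656")
	}
}
```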
A prior change to address accidental DNS lookups introduced the
SocketAddr on the peer, which was then used to add it to the address book.
This in turn swallowed the self-reported port of the peer, which is
important on a reconnect. This change revives the NetAddress on the NodeInfo
carried by the Peer, but now returns an error to avoid nil
dereferencing, another issue observed in the past. Additionally, this could
potentially address #3532, yet the original problem statement of that
issue stands.
As a drive-by optimisation, `MarkAsGood` now takes only a `p2p.ID`, which
makes its interface a bit stricter and leaner.
* add actionable advice for ErrAddrBookNonRoutable err
Should replace https://github.com/tendermint/tendermint/pull/3463
* reorder checks in addrbook#addAddress so
ErrAddrBookPrivate is returned first
and do not log error in DialPeersAsync if the address is private
because it's not an error
ListOfKnownAddresses is removed
panic if addrbook size is less than zero
CrawlPeers does not attempt to connect to existing peers or peers we're currently dialing
various perf. fixes
improved tests (though not complete)
move IsDialingOrExistingAddress check into DialPeerWithAddress (Fixes #2716)
* addrbook: preallocate memory when saving addrbook to file
* addrbook: remove oldestFirst struct and check for ID
* oldestFirst replaced with sort.Slice
* ID is now mandatory, so no need to check
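For illustration, replacing a dedicated `sort.Interface` type with `sort.Slice` looks roughly like this (the `knownAddress` type is trimmed to the one field needed for ordering):

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// knownAddress is a trimmed stand-in for the address book entry type.
type knownAddress struct {
	Addr        string
	LastAttempt time.Time
}

func main() {
	addrs := []*knownAddress{
		{"a", time.Now()},
		{"b", time.Now().Add(-time.Hour)},
	}
	// sort.Slice replaces the dedicated oldestFirst type that implemented
	// sort.Interface: oldest attempt first.
	sort.Slice(addrs, func(i, j int) bool {
		return addrs[i].LastAttempt.Before(addrs[j].LastAttempt)
	})
	fmt.Println(addrs[0].Addr) // "b"
}
```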
* addrbook: remove ListOfKnownAddresses
GetSelection is used instead in seed mode.
* addrbook: panic if size is less than 0
* rewrite addrbook#saveToFile to not use a counter
* test AttemptDisconnects func
* move IsDialingOrExistingAddress check into DialPeerWithAddress
* save and cleanup crawl peer data
* get rid of DefaultSeedDisconnectWaitPeriod
* make linter happy
* fix TestPEXReactorSeedMode
* fix comment
* add a changelog entry
* Apply suggestions from code review
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* rename ErrDialingOrExistingAddress to ErrCurrentlyDialingOrExistingAddress
* lowercase errors
* do not persist seed data
pros:
- no extra files
- less IO
cons:
- if the node crashes, seed might crawl a peer too soon
* fixes after Ethan's review
* add a changelog entry
* we should only consult Switch about peers
checking addrbook size does not make sense since only PEX reactor uses
it for dialing peers!
https://github.com/tendermint/tendermint/pull/3011#discussion_r270948875
* OriginalAddr -> SocketAddr
OriginalAddr records the originally dialed address for outbound peers,
rather than the peer's self reported address. For inbound peers, it was
nil. Here, we rename it to SocketAddr and for inbound peers, set it to
the RemoteAddr of the connection.
* use SocketAddr
Numerous places in the code call peer.NodeInfo().NetAddress().
However, this call to NetAddress() may perform a DNS lookup if the
reported NodeInfo.ListenAddr includes a name. Failure of this lookup
returns a nil address, which can lead to panics in the code.
Instead, call peer.SocketAddr() to return the static address of the
connection.
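A simplified illustration of the difference (the `Peer` interface and `NetAddress` type here are trimmed stand-ins, not the real p2p types):

```go
package main

import (
	"fmt"
	"net"
)

// NetAddress is a trimmed stand-in for the p2p address type.
type NetAddress struct {
	IP   net.IP
	Port uint16
}

// Peer is a trimmed stand-in: SocketAddr returns the static address recorded
// when the connection was established, with no name resolution involved.
type Peer interface {
	SocketAddr() *NetAddress
}

// addrForBook shows the safe pattern: use the static SocketAddr rather than
// re-deriving an address from NodeInfo.ListenAddr, which may require a DNS
// lookup and can come back nil on failure.
func addrForBook(p Peer) *NetAddress {
	return p.SocketAddr()
}

type fakePeer struct{ addr *NetAddress }

func (p fakePeer) SocketAddr() *NetAddress { return p.addr }

func main() {
	p := fakePeer{addr: &NetAddress{IP: net.ParseIP("10.0.0.1"), Port: 26656}}
	fmt.Printf("%s:%d\n", addrForBook(p).IP, addrForBook(p).Port)
}
```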
* remove nodeInfo.NetAddress()
Expose `transport.NetAddress()`, a static result determined
when the transport is created. Removing NetAddress() from the nodeInfo
prevents accidental DNS lookups.
* fixes from review
* linter
* fixes from review
* mempool: add a safety check, write tests for mempoolIDs
and document 65536 limit in the mempool reactor spec
follow-up to https://github.com/tendermint/tendermint/pull/2778
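A sketch of the safety check (modelled on the description above; exact field and method names are assumptions, not the reactor's actual code):

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// Peer IDs in the mempool reactor are uint16, so only 65536 values exist
// (including the reserved 0); cap active IDs accordingly.
const maxActiveIDs = math.MaxUint16

type mempoolIDs struct {
	mtx       sync.Mutex
	peerMap   map[string]uint16
	activeIDs map[uint16]struct{}
	nextID    uint16
}

func newMempoolIDs() *mempoolIDs {
	return &mempoolIDs{
		peerMap:   make(map[string]uint16),
		activeIDs: map[uint16]struct{}{0: {}}, // 0 is reserved for unknown peers
		nextID:    1,
	}
}

// ReserveForPeer hands out the next unused ID and panics if the node somehow
// has the maximum number of peer IDs active already.
func (ids *mempoolIDs) ReserveForPeer(peerID string) uint16 {
	ids.mtx.Lock()
	defer ids.mtx.Unlock()

	if len(ids.activeIDs) >= maxActiveIDs {
		panic(fmt.Sprintf("node has %d active peer IDs and cannot accept a new peer", maxActiveIDs))
	}
	for {
		if _, used := ids.activeIDs[ids.nextID]; !used {
			break
		}
		ids.nextID++
	}
	id := ids.nextID
	ids.nextID++
	ids.activeIDs[id] = struct{}{}
	ids.peerMap[peerID] = id
	return id
}

func main() {
	ids := newMempoolIDs()
	fmt.Println(ids.ReserveForPeer("peer-1")) // 1
}
```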
* rename the test
* fixes after Ismail's review
Why submit this PR:
We have suffered from an infinite-loop bug in the addrbook, and it took us a long time to find out why the process had become a zombie peer. That was fixed in #3232, but the ADDRS_LOOP is still there, so the risk of an infinite loop still exists.
The algorithm that randomly picks a bucket is not stable, which means a peer may unluckily keep choosing the wrong bucket for a long time, wasting time and CPU.
A simple improvement:
shuffle bucketsNew and bucketsOld, and pick the necessary number of addresses from them. A stable
algorithm.
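An illustrative version of the bounded selection (buckets simplified to slices of address strings; the real buckets are maps of knownAddress):

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickAddrs walks the buckets in a shuffled but fixed order and takes
// addresses until it has enough, so the loop is bounded and stable even
// when many buckets are empty.
func pickAddrs(buckets [][]string, need int) []string {
	out := make([]string, 0, need)
	for _, i := range rand.Perm(len(buckets)) {
		for _, addr := range buckets[i] {
			out = append(out, addr)
			if len(out) == need {
				return out
			}
		}
	}
	return out
}

func main() {
	buckets := [][]string{{"a", "b"}, {}, {"c"}}
	fmt.Println(pickAddrs(buckets, 2))
}
```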
* p2p: refactor Switch#Broadcast func
- call wg.Add only once
- do not call peers.List twice!
* bad for performance
* peers list can change in between calls!
Refs #3306
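A sketch of the refactored shape (trimmed `Peer` interface; in the real switch the list comes from the peer set):

```go
package main

import (
	"fmt"
	"sync"
)

// Peer is a trimmed stand-in for the p2p peer.
type Peer interface {
	Send(chID byte, msg []byte) bool
}

// broadcast snapshots the peer list once and calls wg.Add once with the
// total count, instead of re-listing peers and calling Add(1) per peer.
func broadcast(peers []Peer, chID byte, msg []byte) chan bool {
	successChan := make(chan bool, len(peers))
	var wg sync.WaitGroup
	wg.Add(len(peers))
	for _, p := range peers {
		go func(p Peer) {
			defer wg.Done()
			successChan <- p.Send(chID, msg)
		}(p)
	}
	go func() {
		wg.Wait()
		close(successChan)
	}()
	return successChan
}

type okPeer struct{}

func (okPeer) Send(byte, []byte) bool { return true }

func main() {
	for ok := range broadcast([]Peer{okPeer{}, okPeer{}}, 0x01, []byte("hi")) {
		fmt.Println("sent:", ok)
	}
}
```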
* p2p: use time.Ticker instead of RepeatTimer
no need for RepeatTimer since we don't Reset them
Refs #3306
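For example, a periodic loop needs nothing more than the standard ticker when the interval is never changed (illustrative code, not the reactor's):

```go
package main

import (
	"fmt"
	"time"
)

// runPeriodic runs work at a fixed interval until quit is closed; a plain
// time.Ticker is enough because the interval is never Reset.
func runPeriodic(interval time.Duration, quit <-chan struct{}, work func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			work()
		case <-quit:
			return
		}
	}
}

func main() {
	quit := make(chan struct{})
	go func() {
		time.Sleep(120 * time.Millisecond)
		close(quit)
	}()
	runPeriodic(50*time.Millisecond, quit, func() { fmt.Println("tick") })
}
```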
* libs/common: remove RepeatTimer (also TimerMaker and Ticker interface)
"ancient code that’s caused no end of trouble" Ethan
I believe there's a much simpler way to write a ticker that can be reset
https://medium.com/@arpith/resetting-a-ticker-in-go-63858a2c17ec
Refs #3262
This fixes two small bugs:
1) lite/dbprovider: return `ok` instead of true in parse* functions. It's weird that we were ignoring the `ok` value before.
2) consensus/state: previously, because of the shadowing, we almost never output "Error with msg". Now we declare both `added` and `err` at the beginning of the function, so there's no shadowing.
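A stripped-down illustration of the shadowing problem and the fix (the handler and message here are invented; only the `added`/`err` pattern mirrors the change):

```go
package main

import (
	"errors"
	"log"
)

func addVote(vote string) (bool, error) { return false, errors.New("invalid vote") }

func handleMsg(vote string) {
	// Declare both up front. Writing "added, err := addVote(vote)" inside a
	// nested block would shadow an outer err and silently drop the error,
	// so the "Error with msg" log below would almost never fire.
	var (
		added bool
		err   error
	)
	added, err = addVote(vote)
	if err != nil {
		log.Printf("Error with msg (added=%v): %v", added, err)
	}
}

func main() { handleMsg("vote-1") }
```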
* fix dirty data in peerset
startInitPeer was called before the PeerSet added the peer;
once the MConnection started, if a Reactor's Receive failed,
the switch would try to remove the peer from the PeerSet
while the PeerSet did not yet contain it. Fix
this by changing the order.
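A sketch of the reordering (trimmed types; the real code lives in the switch):

```go
package main

import "fmt"

type peer struct{ id string }

type peerSet struct{ m map[string]*peer }

func (ps *peerSet) Add(p *peer) error { ps.m[p.id] = p; return nil }
func (ps *peerSet) Remove(p *peer)    { delete(ps.m, p.id) }

type swtch struct{ peers *peerSet }

// startInitPeer stands in for starting the MConnection and reactors; if a
// reactor's Receive fails here, the error path below must find the peer in
// the set in order to remove it cleanly.
func (s *swtch) startInitPeer(p *peer) error { return nil }

// addPeer registers the peer in the set *before* starting it, so a failure
// during startup can remove exactly what was added (no dirty leftovers).
func (s *swtch) addPeer(p *peer) error {
	if err := s.peers.Add(p); err != nil {
		return err
	}
	if err := s.startInitPeer(p); err != nil {
		s.peers.Remove(p)
		return err
	}
	return nil
}

func main() {
	s := &swtch{peers: &peerSet{m: map[string]*peer{}}}
	fmt.Println(s.addPeer(&peer{id: "abc"}))
}
```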
* fix test FilterDuplicate
* fix start/stop race
* fix err
* reject the shared secret if it is all zeros, in case the blacklist was not
sufficient
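A sketch of the extra guard, assuming a 32-byte X25519 shared secret (the surrounding handshake code is omitted):

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
)

// checkSharedSecret rejects an all-zero shared secret, which is what a
// low-order (blacklisted) X25519 public key produces, as a second line of
// defense behind the blacklist itself.
func checkSharedSecret(secret *[32]byte) error {
	var zero [32]byte
	if subtle.ConstantTimeCompare(secret[:], zero[:]) == 1 {
		return errors.New("shared secret is all zeros")
	}
	return nil
}

func main() {
	var bad [32]byte // never negotiated: all zeros
	fmt.Println(checkSharedSecret(&bad))
}
```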
* Add test that verifies lower order pub-keys are rejected at the DH step
* Update changelog
* fix typo in test-comment
ref: [#3010 (comment)](https://github.com/tendermint/tendermint/issues/3010#issuecomment-464287627)
> I tried searching for code where we authenticate a peer against its NetAddress.ID and couldn't find it. I don't see a reason to switch to Noise, but a need to ensure that the node's ID is authenticated e.g. after dialing from the address book.
* p2p: check secret conn id matches dialed id
* Fix all p2p tests & make code compile
* add simple test for dialing with wrong ID
* update changelog
* address review comments
* yet another place to use IDAddressString; fix
testSetupMultiplexTransport
* failing test
* fix infinite loop in addrbook
There are cases where we only have a small number of addresses marked
good ("old"), but the selection mechanism keeps trying to select more of these
addresses, and hence ends up in an infinite loop. Here we fix this to
only try and select such "old" addresses if we have enough of them. Note this
means, if we don't have enough of them, we may return more "new"
addresses than otherwise expected by the newSelectionBias.
This whole GetSelectionWithBias method probably needs to be rewritten,
but this is a quick fix for the issue.
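An illustrative (not verbatim) version of the bounded selection loop: it only draws from the "old" set while old addresses remain, and falls back to "new" ones otherwise, even if that skews the requested bias:

```go
package main

import (
	"fmt"
	"math/rand"
)

// selectWithBias picks num addresses, preferring "old" (good) addresses with
// probability biasTowardsOld (in percent), but it never spins waiting for an
// old address once the old set is exhausted.
func selectWithBias(oldAddrs, newAddrs []string, num, biasTowardsOld int) []string {
	out := make([]string, 0, num)
	oldIdx, newIdx := 0, 0
	for len(out) < num && (oldIdx < len(oldAddrs) || newIdx < len(newAddrs)) {
		pickOld := rand.Intn(100) < biasTowardsOld && oldIdx < len(oldAddrs)
		if !pickOld && newIdx >= len(newAddrs) {
			pickOld = true // new set exhausted: take old even against the bias
		}
		if pickOld {
			out = append(out, oldAddrs[oldIdx])
			oldIdx++
		} else {
			out = append(out, newAddrs[newIdx])
			newIdx++
		}
	}
	return out
}

func main() {
	oldOnes := []string{"o1"}
	newOnes := []string{"n1", "n2", "n3"}
	fmt.Println(selectWithBias(oldOnes, newOnes, 3, 70)) // may return more "new" than the bias implies
}
```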
* changelog
* fix infinite loop if not enough new addrs
* fix another potential infinite loop
if a.nNew == 0 -> pickFromOldBucket=true, but we don't have enough items
(a.nOld > len(oldBucketToAddrsMap) false)
* Revert "fix another potential infinite loop"
This reverts commit 146540c112.
* check num addresses instead of buckets, new test
* fixed the int division
* add slack to bias % in test, lint fixes
* Added checks for selection content in test
* test cleanup
* Apply suggestions from code review
Co-Authored-By: ebuchman <ethan@coinculture.info>
* address review comments
* change after Anton's review comments
* use the same docker image we use for testing
when building a binary for localnet
* switch back to circleci classic
* more review comments
* more review comments
* refactor addrbook_test
* build linux binary inside docker
in an attempt to fix
```
--> Running dep
+ make build-linux
GOOS=linux GOARCH=amd64 make build
make[1]: Entering directory `/home/circleci/.go_workspace/src/github.com/tendermint/tendermint'
CGO_ENABLED=0 go build -ldflags "-X github.com/tendermint/tendermint/version.GitCommit=`git rev-parse --short=8 HEAD`" -tags 'tendermint' -o build/tendermint ./cmd/tendermint/
p2p/pex/addrbook.go:373:13: undefined: math.Round
```
* change dir from /usr to /go
* use concrete Go version for localnet binary
* check for nil addresses just to be sure
* p2p/conn: fix deadlock in FlushStop/OnStop
* makefile: set_with_deadlock
* close doneSendRoutine at end of sendRoutine
* conn: initialize channs in OnStart
* node: decrease retry conn timeout in test
Should fix #3256
The retry timeout was set to the default, which is the same as the
accept timeout, so it's no wonder this would fail. Here we decrease the
retry timeout so we can try many times before the accept timeout.
* p2p: increase handshake timeout in test
This fails sometimes, presumably because the handshake timeout is so low
(only 50ms). So increase it to 1s. Should fix #3187
* privval: fix race with ping. closes #3237
Pings happen in a go-routine and can happen concurrently with other
messages. Since we use a request/response protocol, we expect to send a
request and get back the corresponding response. But with pings
happening concurrently, this assumption could be violated. We were using
a mutex, but only an RWMutex, where the RLock was being held for sending
messages - this was to allow the underlying connection to be replaced if
it fails. Turns out we actually need to use a full lock (not just a read
lock) to prevent multiple requests from happening concurrently.
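A sketch of the locking change (types and method are simplified stand-ins for the signer connection, not the actual privval code):

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// remoteSigner is a trimmed stand-in for the privval connection wrapper.
type remoteSigner struct {
	mtx  sync.Mutex // was an RWMutex with only RLock held around sends
	conn net.Conn
}

// call serializes a whole request/response exchange under a full lock, so a
// concurrent ping cannot interleave and consume another request's response.
func (rs *remoteSigner) call(req []byte) ([]byte, error) {
	rs.mtx.Lock()
	defer rs.mtx.Unlock()

	if _, err := rs.conn.Write(req); err != nil {
		return nil, err
	}
	buf := make([]byte, 1024)
	n, err := rs.conn.Read(buf)
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

func main() {
	c1, c2 := net.Pipe()
	go func() {
		buf := make([]byte, 16)
		n, _ := c2.Read(buf)
		c2.Write(buf[:n]) // echo back as the "response"
	}()
	rs := &remoteSigner{conn: c1}
	resp, err := rs.call([]byte("ping"))
	fmt.Println(string(resp), err)
}
```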
* node: fix test name. DelayedStop -> DelayedStart
* autofile: Wait() method
In the TestWALTruncate in consensus/wal_test.go we remove the WAL
directory at the end of the test. However the wal.Stop() does not
properly wait for the autofile group to finish shutting down. Hence it
was possible that the group's go-routine is still running when the
cleanup happens, which causes a panic since the directory disappeared.
Here we add a Wait() method to properly wait until the go-routine exits
so we can safely clean up. This fixes #2852.
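A sketch of the Wait() pattern (trimmed type; the real group rotates and flushes WAL files):

```go
package main

import (
	"fmt"
	"time"
)

// group is a trimmed stand-in for the autofile group.
type group struct {
	quit chan struct{}
	done chan struct{}
}

func newGroup() *group {
	return &group{quit: make(chan struct{}), done: make(chan struct{})}
}

func (g *group) Start() {
	go func() {
		defer close(g.done) // signals that the goroutine has fully exited
		for {
			select {
			case <-g.quit:
				return
			case <-time.After(10 * time.Millisecond):
				// ... check head size, rotate files, flush ...
			}
		}
	}()
}

func (g *group) Stop() { close(g.quit) }

// Wait blocks until the processing goroutine has exited, so callers can
// safely remove the WAL directory afterwards.
func (g *group) Wait() { <-g.done }

func main() {
	g := newGroup()
	g.Start()
	g.Stop()
	g.Wait()
	fmt.Println("safe to remove the directory now")
}
```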
> Implement a check for the blacklisted low order points, ala the X25519 has_small_order() function in libsodium
(#3010 (comment))
resolves first half of #3010
* close peer's connection to avoid fd leak
Fixes #2967
* rename peer#Addr to RemoteAddr
* fix test
* fixes after Ethan's review
* bring back the check
* changelog entry
* write a test for switch#acceptRoutine
* increase timeouts? :(
* remove extra assertNPeersWithTimeout
* simplify test
* assert number of peers (just to be safe)
* Cleanup in OnStop
* run tests with verbose flag on CircleCI
* spawn a reading routine to prevent connection from closing
* get port from the listener
random port is faster, but often results in
```
panic: listen tcp 127.0.0.1:44068: bind: address already in use [recovered]
panic: listen tcp 127.0.0.1:44068: bind: address already in use
goroutine 79 [running]:
testing.tRunner.func1(0xc0001bd600)
/usr/local/go/src/testing/testing.go:792 +0x387
panic(0x974d20, 0xc0001b0500)
/usr/local/go/src/runtime/panic.go:513 +0x1b9
github.com/tendermint/tendermint/p2p.MakeSwitch(0xc0000f42a0, 0x0, 0x9fb9cc, 0x9, 0x9fc346, 0xb, 0xb42128, 0x0, 0x0, 0x0, ...)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/test_util.go:182 +0xa28
github.com/tendermint/tendermint/p2p.MakeConnectedSwitches(0xc0000f42a0, 0x2, 0xb42128, 0xb41eb8, 0x4f1205, 0xc0001bed80, 0x4f16ed)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/test_util.go:75 +0xf9
github.com/tendermint/tendermint/p2p.MakeSwitchPair(0xbb8d20, 0xc0001bd600, 0xb42128, 0x2f7, 0x4f16c0)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/switch_test.go:94 +0x4c
github.com/tendermint/tendermint/p2p.TestSwitches(0xc0001bd600)
/home/vagrant/go/src/github.com/tendermint/tendermint/p2p/switch_test.go:117 +0x58
testing.tRunner(0xc0001bd600, 0xb42038)
/usr/local/go/src/testing/testing.go:827 +0xbf
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:878 +0x353
exit status 2
FAIL github.com/tendermint/tendermint/p2p 0.350s
```
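A sketch of the fix: bind to port 0 and read the assigned port back from the listener, instead of guessing a random free port:

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// Bind to port 0 so the kernel picks a free port, then read the port
	// back from the listener; this avoids "address already in use" races.
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer l.Close()
	fmt.Println("listening on port", l.Addr().(*net.TCPAddr).Port)
}
```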
* more proposer priority tests
- test that we don't reset to zero when updating / adding
- test that same power validators alternate
* add another test to track / simulate similar behaviour as in #2960
* address some of Chris' review comments
* address some more of Chris' review comments
* temporarily pushing branch with the following changes:
The total power might change if:
- a validator is added
- a validator is removed
- a validator is updated
Decrement the accums (of all validators) directly after any of these events
(by the inverse of the change)
* Fix 2960 by re-normalizing / scaling priorities to be in bounds of total
power, additionally:
- remove heap where it doesn't make sense
- avg. only at the end of IncrementProposerPriority instead of each
iteration
- update (and slightly improve)
TestAveragingInIncrementProposerPriorityWithVotingPower to reflect
above changes
* fix tests
* add comment
* update changelog pending & some minor changes
* comment about division will floor the result & fix typo
* Update TestLargeGenesisValidator:
- remove TODO and increase large genesis validator's voting power
accordingly
* move changelog entry to P2P Protocol
* Ceil instead of flooring when dividing & update test
* quickly fix failing TestProposerPriorityDoesNotGetResetToZero:
- divide by Ceil((maxPriority - minPriority) / 2*totalVotingPower)
* fix typo: rename getValWitMostPriority -> getValWithMostPriority
* test proposer frequencies
* return absolute value for diff. keep testing
* use for loop for div
* cleanup, more tests
* spellcheck
* get rid of using floats: manually ceil where necessary
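For instance, the integer ceiling used instead of a float division can be written as follows (valid for a non-negative numerator and positive denominator; a generic sketch, not the exact helper in the code):

```go
package main

import "fmt"

// ceilDiv computes ceil(a/b) for a >= 0 and b > 0 without using floats,
// matching the "manually ceil where necessary" change above.
func ceilDiv(a, b int64) int64 {
	return (a + b - 1) / b
}

func main() {
	fmt.Println(ceilDiv(7, 2))  // 4
	fmt.Println(ceilDiv(10, 5)) // 2
}
```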
* Remove float, simplify, fix tests to match chris's proof (#3157)
* crypto: revert to mainline Go crypto lib
We used to use a fork for a modified bcrypt so we could pass our own
randomness but this was largely unnecessary, unused, and a burden.
So now we just use the mainline Go crypto lib.
* changelog
* fix tests
* version and changelog
Fixes #2715
In crawlPeersRoutine, which runs in seed mode, there is
logic that disconnects a peer after a 3-hour interval,
based on a duration value. The duration value is calculated by
referring to the `created` value of the MConnection. When the MConnection is
created for the first time, the `created` value is never initialized, so the
peer is not disconnected after 3 hours but immediately. As a result,
normal nodes connect to the seed node and are disconnected right away, so
address exchange does not work properly.
https://github.com/tendermint/tendermint/blob/master/p2p/pex/pex_reactor.go#L629
This check does not work correctly.
I think that at
https://github.com/tendermint/tendermint/blob/master/p2p/conn/connection.go#L148
the `created` variable is missing the current-time setting.
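A sketch of the suggested fix (trimmed struct; field names follow the description above, not the real MConnection):

```go
package main

import (
	"fmt"
	"time"
)

// MConnection here is trimmed to the single field the 3-hour check relies on.
type MConnection struct {
	created time.Time
}

// NewMConnection records the creation time; previously `created` was left as
// the zero time.Time, so "time since created" was always huge and the seed
// disconnected peers immediately.
func NewMConnection() *MConnection {
	return &MConnection{created: time.Now()}
}

// shouldDisconnect reports whether the connection has lived long enough for
// the seed to drop it (3 hours in the description above).
func (c *MConnection) shouldDisconnect(wait time.Duration) bool {
	return time.Since(c.created) >= wait
}

func main() {
	c := NewMConnection()
	fmt.Println(c.shouldDisconnect(3 * time.Hour)) // false right after creation
}
```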
* p2p: panic on transport error
Addresses #2823. Currently, the acceptRoutine exits if the transport returns
an error trying to accept a new connection. Once this happens, the node
can't accept any new connections. So here, we panic instead. While we
could potentially be more intelligent by rerunning the acceptRoutine, the
error may indicate something more fundamental (e.g. file descriptor limit)
that requires a restart anyways. We can leave it to process managers to
handle that restart, and notify operators about the panic.
* changelog
* p2p/conn: FlushStop. Use in pex. Closes #2092
In seed mode, we call StopPeer immediately after Send.
Since flushing msgs to the peer happens in the background,
the peer connection is often closed before the messages are
actually sent out. The new FlushStop method allows all msgs
to first be written and flushed out on the conn before it is closed.
* fix dummy peer
* typo
* fixes from review
* more comments
* ensure pex doesn't call FlushStop more than once
FlushStop is not safe to call more than once,
but we call it from Receive in a go-routine so Receive
doesn't block.
To ensure we only call it once, we use the lastReceivedRequests
map - if an entry already exists, then FlushStop should already have
been called and we can return.
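An illustrative sketch of that guard (trimmed types; the real map also stores request times for rate limiting):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// peerConn is a stand-in exposing only the method we care about.
type peerConn interface {
	FlushStop()
}

type reactor struct {
	mtx                  sync.Mutex
	lastReceivedRequests map[string]time.Time
}

// handleSeedRequest replies to an address request and then flushes and stops
// the connection, but only once per peer: an existing map entry means the
// FlushStop path has already been taken.
func (r *reactor) handleSeedRequest(peerID string, p peerConn) {
	r.mtx.Lock()
	_, seen := r.lastReceivedRequests[peerID]
	if !seen {
		r.lastReceivedRequests[peerID] = time.Now()
	}
	r.mtx.Unlock()
	if seen {
		return // FlushStop was already scheduled for this peer
	}
	// ... send the addresses to the peer here ...
	go p.FlushStop() // run in a goroutine so Receive doesn't block
}

type fakeConn struct{}

func (fakeConn) FlushStop() { fmt.Println("flushed and stopped") }

func main() {
	r := &reactor{lastReceivedRequests: make(map[string]time.Time)}
	r.handleSeedRequest("peer-1", fakeConn{})
	r.handleSeedRequest("peer-1", fakeConn{}) // second call is a no-op
	time.Sleep(50 * time.Millisecond)         // let the goroutine print
}
```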
* use READ lock/unlock in ConsensusState#GetLastHeight
Refs #2721
* do not use defers when there's no need
* fix peer formatting (output its address instead of the pointer)
```
[54310]: E[11-02|11:59:39.851] Connection failed @ sendRoutine module=p2p peer=0xb78f00 conn=MConn{74.207.236.148:26656} err="pong timeout"
```
https://github.com/tendermint/tendermint/issues/2721#issuecomment-435326581
* panic if peer has no state
https://github.com/tendermint/tendermint/issues/2721#issuecomment-435347165
It's confusing that sometimes we check whether the peer has a state, but most of
the time we expect it to be there
1. add79700b5/mempool/reactor.go (L138)
2. add79700b5/rpc/core/consensus.go (L196) (edited)
I will change everything to always assume the peer has a state and panic
otherwise;
that should help identify issues earlier
* abci/localclient: extend lock on app callback
The app callback should be protected by the lock as well (note this was already
done for InitChainAsync, why not for others???). Otherwise, when we
execute the block, a tx might come in and call the callback at the same
time we're updating it in execBlockOnProxyApp => DATA RACE
Fixes #2721
Consensus state is locked
```
goroutine 113333 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00180009c, 0xc0000c7e00)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*RWMutex).RLock(0xc001800090)
/usr/local/go/src/sync/rwmutex.go:50 +0x4e
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).GetRoundState(0xc001800000, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:218 +0x46
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).queryMaj23Routine(0xc0017def80, 0x11104a0, 0xc0072488f0, 0xc007248
9c0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/reactor.go:735 +0x16d
created by github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).AddPeer
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/reactor.go:172 +0x236
```
because localClient is locked
```
goroutine 1899 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00003363c, 0xc0000cb500)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc000033638)
/usr/local/go/src/sync/mutex.go:134 +0xff
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client.(*localClient).SetResponseCallback(0xc0001fb560, 0xc007868540)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:32 +0x33
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy.(*appConnConsensus).SetResponseCallback(0xc00002f750, 0xc007868540)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:57 +0x40
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state.execBlockOnProxyApp(0x1104e20, 0xc002ca0ba0, 0x11092a0, 0xc00002f750, 0xc0001fe960, 0xc000bfc660, 0x110cfe0, 0xc000090330, 0xc9d12, 0xc000d9d5a0, ...)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state/execution.go:230 +0x1fd
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc002c2a230, 0x7, 0x0, 0xc000eae880, 0x6, 0xc002e52c60, 0x16, 0x1f927, 0xc9d12, 0xc000d9d5a0, ...)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state/execution.go:96 +0x142
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).finalizeCommit(0xc001800000, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1339 +0xa3e
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).tryFinalizeCommit(0xc001800000, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1270 +0x451
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit.func1(0xc001800000, 0x0, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1218 +0x90
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit(0xc001800000, 0x1f928, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1247 +0x6b8
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote(0xc001800000, 0xc003d8dea0, 0xc000cf4cc0, 0x28, 0xf1, 0xc003bc7ad0, 0xc003bc7b10)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1659 +0xbad
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote(0xc001800000, 0xc003d8dea0, 0xc000cf4cc0, 0x28, 0xf1, 0xf1, 0xf1)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1517 +0x59
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg(0xc001800000, 0xd98200, 0xc0070dbed0, 0xc000cf4cc0, 0x28)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:660 +0x64b
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine(0xc001800000, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:617 +0x670
created by github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).OnStart
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:311 +0x132
```
tx comes in and CheckTx is executed right when we execute the block
```
goroutine 111044 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00003363c, 0x0)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc000033638)
/usr/local/go/src/sync/mutex.go:134 +0xff
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client.(*localClient).CheckTxAsync(0xc0001fb0e0, 0xc002d94500, 0x13f, 0x280, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:85 +0x47
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy.(*appConnMempool).CheckTxAsync(0xc00002f720, 0xc002d94500, 0x13f, 0x280, 0x1)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:114 +0x51
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/mempool.(*Mempool).CheckTx(0xc002d3a320, 0xc002d94500, 0x13f, 0x280, 0xc0072355f0, 0x0, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/mempool/mempool.go:316 +0x17b
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/core.BroadcastTxSync(0xc002d94500, 0x13f, 0x280, 0x0, 0x0, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/core/mempool.go:93 +0xb8
reflect.Value.call(0xd85560, 0x10326c0, 0x13, 0xec7b8b, 0x4, 0xc00663f180, 0x1, 0x1, 0xc00663f180, 0xc00663f188, ...)
/usr/local/go/src/reflect/value.go:447 +0x449
reflect.Value.Call(0xd85560, 0x10326c0, 0x13, 0xc00663f180, 0x1, 0x1, 0x0, 0x0, 0xc005cc9344)
/usr/local/go/src/reflect/value.go:308 +0xa4
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.makeHTTPHandler.func2(0x1102060, 0xc00663f100, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/handlers.go:269 +0x188
net/http.HandlerFunc.ServeHTTP(0xc002c81f20, 0x1102060, 0xc00663f100, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:1964 +0x44
net/http.(*ServeMux).ServeHTTP(0xc002c81b60, 0x1102060, 0xc00663f100, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:2361 +0x127
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.maxBytesHandler.ServeHTTP(0x10f8a40, 0xc002c81b60, 0xf4240, 0x1102060, 0xc00663f100, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:219 +0xcf
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.RecoverAndLogHandler.func1(0x1103220, 0xc00121e620, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:192 +0x394
net/http.HandlerFunc.ServeHTTP(0xc002c06ea0, 0x1103220, 0xc00121e620, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:1964 +0x44
net/http.serverHandler.ServeHTTP(0xc001a1aa90, 0x1103220, 0xc00121e620, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:2741 +0xab
net/http.(*conn).serve(0xc00785a3c0, 0x11041a0, 0xc000f844c0)
/usr/local/go/src/net/http/server.go:1847 +0x646
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2851 +0x2f5
```
* consensus: use read lock in Receive#VoteMessage
* use defer to unlock the mutex because the application might panic
* use defer in every method of the localClient
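A simplified sketch of both changes (invented `Application` interface and string payloads; the real client works with ABCI request/response types):

```go
package main

import (
	"fmt"
	"sync"
)

// Application is a trimmed stand-in for the ABCI application.
type Application interface {
	CheckTx(tx string) string
}

type localClient struct {
	mtx      sync.Mutex
	app      Application
	callback func(req, res string)
}

// SetResponseCallback and CheckTxAsync both take the same lock, and both use
// defer so a panicking application cannot leave the mutex held.
func (c *localClient) SetResponseCallback(cb func(req, res string)) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.callback = cb
}

func (c *localClient) CheckTxAsync(tx string) {
	c.mtx.Lock()
	defer c.mtx.Unlock() // released even if c.app.CheckTx panics

	res := c.app.CheckTx(tx)
	if c.callback != nil {
		c.callback(tx, res) // the callback is invoked under the same lock
	}
}

type echoApp struct{}

func (echoApp) CheckTx(tx string) string { return "ok:" + tx }

func main() {
	c := &localClient{app: echoApp{}}
	c.SetResponseCallback(func(req, res string) { fmt.Println(req, "->", res) })
	c.CheckTxAsync("tx1")
}
```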
* add a changelog entry
* drain channels before Unsubscribe(All)
Read 55362ed766/libs/pubsub/pubsub.go (L13)
for the detailed explanation of the issue.
We'll need to fix it someday. Make sure to keep an eye on
https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-033-pubsub.md
* retry instead of panic when peer has no state in reactors other than consensus
in /dump_consensus_state RPC endpoint, skip a peer with no state
* rpc/core/mempool: simplify error messages
* rpc/core/mempool: use time.After instead of timer
also, do not log DeliverTx result (to be consistent with other methods)
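For a one-shot wait, `time.After` is simpler than creating and stopping a timer by hand (generic sketch, not the RPC handler itself):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForResult waits once for a result or a timeout.
func waitForResult(resCh <-chan string, timeout time.Duration) (string, error) {
	select {
	case res := <-resCh:
		return res, nil
	case <-time.After(timeout):
		return "", errors.New("timed out waiting for result")
	}
}

func main() {
	ch := make(chan string, 1)
	ch <- "ok"
	fmt.Println(waitForResult(ch, time.Second))
}
```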
* unlock before calling the callback in reqRes#SetCallback