This commit contains commit messages from the 52 commits from Tendermint 0.32.0 to 0.32.7. This is a result of creating releases from our security advisories, rather than merging these advisories back into the main repo before creating releases. In the future, we will adopt a git workflow that will reduce these commits to only the commits that make up RC2 for (for example) Tendermint 0.32.8.
* docs: fix consensus spec formatting (#3804)
* abci/server: recover from app panics in socket server (#3809)
fixes#3800
* abci/client: fix DATA RACE in gRPC client (#3798)
* Remove go func {}()
closes#357
- Remove go func(){}() that caused race condiditon
- To reproduce
- add -race in make file to `install_abci`
- Remove `CGO_ENABLED=0` & add -race to `install`
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* remove -race
* fix data race
also, reorder callbacks similarly to socket client
* docs: "Writing a built-in Tendermint Core application in Go" guide (#3608)
* docs: go built-in guide
* fix package imports, add badger db, simplify Query
* newTendermint function
* working example
* finish the first guide
* add one more note
* add the second Golang guide - external ABCI app
* fix typos
* libs: Remove db from tendermint in favor of tendermint/tm-cmn (#3811)
* Remove db from tendemrint in favor of tendermint/tm-cmn
- remove db from `libs`
- update dependancy, there have been no breaking changes in the updated deps
- https://github.com/grpc/grpc-go/releases
- https://github.com/golang/protobuf/releases
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* changelog add
* gofmt
* more gofmt
* docs: add A TOC to the Readme.md of ADR Section (#3820)
* ADR TOC in readme.md
* Added A TOC to the Readme.md of ADR Section
- Added table of contents to the Readme of the architecture section.
- Easier to traverse and when you know what is there.
- If the Adr's become viewable online it would help guide the user
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* add tm-cmn to subprojects
* normalize word
* rpc: make max_body_bytes and max_header_bytes configurable (#3818)
* rpc: make max_body_bytes and max_header_bytes configurable
* update changelog pending
* p2p/conn: Add Bufferpool (#3664)
* use byte buffer pool to decreass allocs
* wrap to put buffer in defer
* wapper defer
* add dependency
* remove Gopkg,*
* add change log
* rpc: /broadcast_evidence (#3481)
* implement broadcast_duplicate_vote endpoint
* fix test_cover
* address comments
* address comments
* Update abci/example/kvstore/persistent_kvstore.go
Co-Authored-By: mossid <torecursedivine@gmail.com>
* Update rpc/client/main_test.go
Co-Authored-By: mossid <torecursedivine@gmail.com>
* address comments in progress
* reformat the code
* make linter happy
* make tests pass
* replace BroadcastDuplicateVote with BroadcastEvidence
* fix test
* fix endpoint name
* improve doc
* fix TestBroadcastEvidenceDuplicateVote
* Update rpc/core/evidence.go
Co-Authored-By: Thane Thomson <connect@thanethomson.com>
* add changelog entry
* fix TestBroadcastEvidenceDuplicateVote
* mempool: make max_msg_bytes configurable (#3826)
* mempool: make max_msg_bytes configurable
* apply suggestions from code review
* update changelog pending
* apply suggestions from code review again
* rpc: return err if page is incorrect (less than 0 or greater than tot… (#3825)
* rpc: return err if page is incorrect (less than 0 or greater than total pages)
Fixes#3813
* fix rpc_test
* blockchain: Reorg reactor (#3561)
* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design
added block requests under peer
moved the request trigger in the reactor poolRoutine, triggered now by a ticker
in general moved everything required for making block requests smarter in the poolRoutine
added a simple map of heights to keep track of what will need to be requested next
added a few more tests
* send errors to FSM in a different channel than blocks
send errors (RemovePeer) from switch on a different channel than the
one receiving blocks
renamed channels
added more pool tests
* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()
If we tried to send a request to a peer not present in the switch, a
missing continue statement caused the request to be blackholed in a peer
that was removed and never retried.
While this bug was manifesting, the reactor kept asking for other
blocks that would be stored and never consumed. Added the number of
unconsumed blocks in the math for requesting blocks ahead of current
processing height so eventually there will be no more blocks requested
until the already received ones are consumed.
* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvments
* suggestion to changelog entry
* Renamed wire.go to codec.go (#3827)
* Renamed wire.go to codec.go
- Wire was the previous name of amino
- Codec describes the file better than `wire` & `amino`
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* ide error
* rename amino.go to codec.go
* docs: add guides to docs (#3830)
* add staticcheck linting (#3828)
cleanup to add linter
grpc change:
https://godoc.org/google.golang.org/grpc#WithContextDialerhttps://godoc.org/google.golang.org/grpc#WithDialer
grpc/grpc-go#2627
prometheous change:
due to UninstrumentedHandler, being deprecated in the future
empty branch = empty if or else statement
didn't delete them entirely but commented
couldn't find a reason to have them
could not replicate the issue #3406
but if want to keep it commented then we should comment out the if statement as well
* types: move MakeVote / MakeBlock functions (#3819)
to the types package
Paritally Fixes#3584
* p2p: Fix error logging for connection stop (#3824)
* p2p: fix false-positive error logging when stopping connections
This changeset fixes two types of false-positive errors occurring during
connection shutdown.
The first occurs when the process invokes FlushStop() or Stop() on a
connection. While the previous behavior did properly wait for the sendRoutine
to finish, it did not notify the recvRoutine that the connection was shutting
down. This would cause the recvRouting to receive and error when reading and
log this error. The changeset fixes this by notifying the recvRoutine that
the connection is shutting down.
The second occurs when the connection is terminated (gracefully) by the other side.
The recvRoutine would get an EOF error during the read, log it, and stop the connection
with an error. The changeset detects EOF and gracefully shuts down the connection.
* bring back the comment about flushing
* add changelog entry
* listen for quitRecvRoutine too
* we have to call stopForError
Otherwise peer won't be removed from the peer set and maybe readded
later.
* p2p: Do not write 'Couldn't connect to any seeds' if there are no seeds (#3834)
* Do not write 'Couldn't connect to any seeds' if there are no seeds
* changelog
* remove privValUpgrade
* Fix typo in changelog
* Update CHANGELOG_PENDING.md
Co-Authored-By: Marko <marbar3778@yahoo.com>
I'm setting up all peers dynamically by calling dial_peers, so p2p.seeds in configs is empty, and I'm seeing error log a lot in logs.
* docs: add a footer to guides (#3835)
* docs: "Writing a Tendermint Core application in Kotlin (gRPC)" guide (#3838)
* add abci grpc kotlin guide
* Update docs/guides/kotlin.md
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update docs/guides/kotlin.md
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update docs/guides/kotlin.md
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update kotlin.md
* node: allow replacing existing p2p.Reactor(s) (#3846)
* node: allow replacing existing p2p.Reactor(s)
using [`CustomReactors`
option](https://godoc.org/github.com/tendermint/tendermint/node#CustomReactors).
Warning: beware of accidental name clashes. Here is the list of existing
reactors: MEMPOOL, BLOCKCHAIN, CONSENSUS, EVIDENCE, PEX.
* check the absence of "CUSTOM" prefix
* merge 2 tests
* add doc.go to node package
* gocritic (1/2) (#3836)
Add gocritic as a linter
The linting is not complete, but should i complete in this PR or in a following.
23 files have been touched so it may be better to do in a following PR
Commits:
* Add gocritic to linting
- Added gocritic to linting
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* gocritic
* pr comments
* remove switch in cmdBatch
* tm-cmn to tm-db (#3850)
* tm-cmn to tm-db
* go.mod changes
* go.mod changes
* more go.mod
* fix tm-db
* ci fix, pending change
* version tmdb (#3854)
* txindexer: Refactor Tx Search Aggregation (#3851)
- Replace the previous intersect call, which was called at each query condition, with a map intersection.
- Replace fmt.Sprintf with string()
closes: #3076
Benchmarks
```
Old
goos: darwin
goarch: amd64
pkg: github.com/tendermint/tendermint/state/txindex/kv
BenchmarkTxSearch-4 200 103641206 ns/op 7998416 B/op 71171 allocs/op
PASS
ok github.com/tendermint/tendermint/state/txindex/kv 26.019s
New
goos: darwin
goarch: amd64
pkg: github.com/tendermint/tendermint/state/txindex/kv
BenchmarkTxSearch-4 1000 38615024 ns/op 13515226 B/op 166460 allocs/op
PASS
ok github.com/tendermint/tendermint/state/txindex/kv 53.618s
```
~62% performance improvement
Commits:
* Refactor tx search
* Add pending changelog entry
* Add tx search benchmarking
* remove intermediate hashes list
also reset timer in BenchmarkTxSearch
and fix other benchmark
* fix import
* Add test cases
* Fix searching
* Replace fmt.Sprintf with string
* Update state/txindex/kv/kv.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Rename params
* Cleanup
* Check error in benchmarks
* release for v0.32.2
* Merge PR #3860: Update log v0.32.2
* changelog updates
* pr comments
* Fix for panic in signature verification if a peer sends a nil public key.
* update version.go
* Changelog update
* Update CHANGELOG.md
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* update changelog
* p2p: only allow ed25519 pubkeys when connecting
also, recover from any possible failures in acceptPeers
Refs #4030
* update changelog and bump version to v0.32.6
* set the date to today
* cs: panic only when WAL#WriteSync fails
- modify WAL#Write and WAL#WriteSync to return an error
* types: validate Part#Proof
add ValidateBasic to crypto/merkle/SimpleProof
* cs: limit max bit array size and block parts count
* cs: test new limits
* cs: only assert important stuff
* update changelog and bump version to 0.32.7
* fixes after Ethan's review
* align max wal msg and max consensus msg sizes
* fix tests
* fix test
* Rc2 v0.32.8
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* move issue to big fix
Some linting/cleanup missed from the initial events refactor
Don't panic; instead, return false, error when matching breaks unexpectedly
Strip non-numeric chars from values when attempting to match against query values
Have the server log during send upon error
* cleanup/lint Query#Conditions and do not panic
* cleanup/lint Query#Matches and do not panic
* cleanup/lint matchValue and do not panic
* rever to panic in Query#Conditions
* linting
* strip alpha chars when attempting to match
* add pending log entries
* Update libs/pubsub/query/query.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* build: update variable names
* update matchValue to return an error
* update Query#Matches to return an error
* update TestMatches
* log error in send
* Fix tests
* Fix TestEmptyQueryMatchesAnything
* fix linting
* Update libs/pubsub/query/query.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update libs/pubsub/query/query.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update libs/pubsub/query/query.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update libs/pubsub/query/query.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update libs/pubsub/query/query.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update libs/pubsub/pubsub.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* add missing errors pkg import
* update Query#Conditions to return an error
* update query pkg unit tests
* update TxIndex#Search
* update pending changelog
* Fix long line errors in abci, crypto, and libs packages
* Fix long lines in p2p and rpc packages
* Fix long lines in abci, state, and tools packages
* Fix long lines in behaviour and blockchain packages
* Fix long lines in cmd and config packages
* Begin fixing long lines in consensus package
* Finish fixing long lines in consensus package
* Add lll exclusion for lines containing URLs
* Fix long lines in crypto package
* Fix long lines in evidence package
* Fix long lines in mempool and node packages
* Fix long lines in libs package
* Fix long lines in lite package
* Fix new long line in node package
* Fix long lines in p2p package
* Ignore gocritic warning
* Fix long lines in privval package
* Fix long lines in rpc package
* Fix long lines in scripts package
* Fix long lines in state package
* Fix long lines in tools package
* Fix long lines in types package
* Enable lll linter
## PR
This PR introduces a fundamental breaking change to the structure of ABCI response and tx tags and the way they're processed. Namely, the SDK can support more complex and aggregated events for distribution and slashing. In addition, block responses can include duplicate keys in events.
Implement new Event type. An event has a type and a list of KV pairs (ie. list-of-lists). Typical events may look like:
"rewards": [{"amount": "5000uatom", "validator": "...", "recipient": "..."}]
"sender": [{"address": "...", "balance": "100uatom"}]
The events are indexed by {even.type}.{even.attribute[i].key}/.... In this case a client would subscribe or query for rewards.recipient='...'
ABCI response types and related types now include Events []Event instead of Tags []cmn.KVPair.
PubSub logic now publishes/matches against map[string][]string instead of map[string]string to support duplicate keys in response events (from #1385). A match is successful if the value is found in the slice of strings.
closes: #1859closes: #2905
## Commits:
* Implement Event ABCI type and updates responses to use events
* Update messages_test.go
* Update kvstore.go
* Update event_bus.go
* Update subscription.go
* Update pubsub.go
* Update kvstore.go
* Update query logic to handle slice of strings in events
* Update Empty#Matches and unit tests
* Update pubsub logic
* Update EventBus#Publish
* Update kv tx indexer
* Update godocs
* Update ResultEvent to use slice of strings; update RPC
* Update more tests
* Update abci.md
* Check for key in validateAndStringifyEvents
* Fix KV indexer to skip empty keys
* Fix linting errors
* Update CHANGELOG_PENDING.md
* Update docs/spec/abci/abci.md
Co-Authored-By: Federico Kunze <31522760+fedekunze@users.noreply.github.com>
* Update abci/types/types.proto
Co-Authored-By: Ethan Buchman <ethan@coinculture.info>
* Update docs/spec/abci/abci.md
Co-Authored-By: Ethan Buchman <ethan@coinculture.info>
* Update libs/pubsub/query/query.go
Co-Authored-By: Ethan Buchman <ethan@coinculture.info>
* Update match function to match if ANY value matches
* Implement TestSubscribeDuplicateKeys
* Update TestMatches to include multi-key test cases
* Update events.go
* Update Query interface godoc
* Update match godoc
* Add godoc for matchValue
* DRY-up tx indexing
* Return error from PublishWithEvents in EventBus#Publish
* Update PublishEventNewBlockHeader to return an error
* Fix build
* Update events doc in ABCI
* Update ABCI events godoc
* Implement TestEventBusPublishEventTxDuplicateKeys
* Update TestSubscribeDuplicateKeys to be table-driven
* Remove mod file
* Remove markdown from events godoc
* Implement TestTxSearchDeprecatedIndexing test
* limit number of /subscribe clients and queries per client
Add the following config variables (under [rpc] section):
* max_subscription_clients
* max_subscriptions_per_client
* timeout_broadcast_tx_commit
Fixes#2826
new HTTPClient interface for subscriptions
finalize HTTPClient events interface
remove EventSubscriber
fix data race
```
WARNING: DATA RACE
Read at 0x00c000a36060 by goroutine 129:
github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe.func1()
/go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:168 +0x1f0
Previous write at 0x00c000a36060 by goroutine 132:
github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
/go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:191 +0x4e0
github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
/go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
/go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
Goroutine 129 (running) created at:
github.com/tendermint/tendermint/rpc/client.(*Local).Subscribe()
/go/src/github.com/tendermint/tendermint/rpc/client/localclient.go:164 +0x4b7
github.com/tendermint/tendermint/rpc/client.WaitForOneEvent()
/go/src/github.com/tendermint/tendermint/rpc/client/helpers.go:64 +0x178
github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync.func1()
/go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:139 +0x298
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
Goroutine 132 (running) created at:
testing.(*T).Run()
/usr/local/go/src/testing/testing.go:878 +0x659
github.com/tendermint/tendermint/rpc/client_test.TestTxEventsSentWithBroadcastTxSync()
/go/src/github.com/tendermint/tendermint/rpc/client/event_test.go:119 +0x186
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
==================
```
lite client works (tested manually)
godoc comments
httpclient: do not close the out channel
use TimeoutBroadcastTxCommit
no timeout for unsubscribe
but 1s Local (5s HTTP) timeout for resubscribe
format code
change Subscribe#out cap to 1
and replace config vars with RPCConfig
TimeoutBroadcastTxCommit can't be greater than rpcserver.WriteTimeout
rpc: Context as first parameter to all functions
reformat code
fixes after my own review
fixes after Ethan's review
add test stubs
fix config.toml
* fixes after manual testing
- rpc: do not recommend to use BroadcastTxCommit because it's slow and wastes
Tendermint resources (pubsub)
- rpc: better error in Subscribe and BroadcastTxCommit
- HTTPClient: do not resubscribe if err = ErrAlreadySubscribed
* fixes after Ismail's review
* Update rpc/grpc/grpc_test.go
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* fix dirty data in peerset
startInitPeer before PeerSet add the peer,
once mconnection start and Receive of one
Reactor faild, will try to remove it from PeerSet
while PeerSet still not contain the peer. Fix
this by change the order.
* fix test FilterDuplicate
* fix start/stop race
* fix err
* green pubsub tests :OK:
* get rid of clientToQueryMap
* Subscribe and SubscribeUnbuffered
* start adapting other pkgs to new pubsub
* nope
* rename MsgAndTags to Message
* remove TagMap
it does not bring any additional benefits
* bring back EventSubscriber
* fix test
* fix data race in TestStartNextHeightCorrectly
```
Write at 0x00c0001c7418 by goroutine 796:
github.com/tendermint/tendermint/consensus.TestStartNextHeightCorrectly()
/go/src/github.com/tendermint/tendermint/consensus/state_test.go:1296 +0xad
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
Previous read at 0x00c0001c7418 by goroutine 858:
github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote()
/go/src/github.com/tendermint/tendermint/consensus/state.go:1631 +0x1366
github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote()
/go/src/github.com/tendermint/tendermint/consensus/state.go:1476 +0x8f
github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg()
/go/src/github.com/tendermint/tendermint/consensus/state.go:667 +0xa1e
github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine()
/go/src/github.com/tendermint/tendermint/consensus/state.go:628 +0x794
Goroutine 796 (running) created at:
testing.(*T).Run()
/usr/local/go/src/testing/testing.go:878 +0x659
testing.runTests.func1()
/usr/local/go/src/testing/testing.go:1119 +0xa8
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
testing.runTests()
/usr/local/go/src/testing/testing.go:1117 +0x4ee
testing.(*M).Run()
/usr/local/go/src/testing/testing.go:1034 +0x2ee
main.main()
_testmain.go:214 +0x332
Goroutine 858 (running) created at:
github.com/tendermint/tendermint/consensus.(*ConsensusState).startRoutines()
/go/src/github.com/tendermint/tendermint/consensus/state.go:334 +0x221
github.com/tendermint/tendermint/consensus.startTestRound()
/go/src/github.com/tendermint/tendermint/consensus/common_test.go:122 +0x63
github.com/tendermint/tendermint/consensus.TestStateFullRound1()
/go/src/github.com/tendermint/tendermint/consensus/state_test.go:255 +0x397
testing.tRunner()
/usr/local/go/src/testing/testing.go:827 +0x162
```
* fixes after my own review
* fix formatting
* wait 100ms before kicking a subscriber out
+ a test for indexer_service
* fixes after my second review
* no timeout
* add changelog entries
* fix merge conflicts
* fix typos after Thane's review
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* reformat code
* rewrite indexer service in the attempt to fix failing test
https://github.com/tendermint/tendermint/pull/3227/#issuecomment-462316527
* Revert "rewrite indexer service in the attempt to fix failing test"
This reverts commit 0d9107a098.
* another attempt to fix indexer
* fixes after Ethan's review
* use unbuffered channel when indexing transactions
Refs https://github.com/tendermint/tendermint/pull/3227#discussion_r258786716
* add a comment for EventBus#SubscribeUnbuffered
* format code
* use READ lock/unlock in ConsensusState#GetLastHeight
Refs #2721
* do not use defers when there's no need
* fix peer formatting (output its address instead of the pointer)
```
[54310]: E[11-02|11:59:39.851] Connection failed @ sendRoutine module=p2p peer=0xb78f00 conn=MConn{74.207.236.148:26656} err="pong timeout"
```
https://github.com/tendermint/tendermint/issues/2721#issuecomment-435326581
* panic if peer has no state
https://github.com/tendermint/tendermint/issues/2721#issuecomment-435347165
It's confusing that sometimes we check if peer has a state, but most of
the times we expect it to be there
1. add79700b5/mempool/reactor.go (L138)
2. add79700b5/rpc/core/consensus.go (L196) (edited)
I will change everything to always assume peer has a state and panic
otherwise
that should help identify issues earlier
* abci/localclient: extend lock on app callback
App callback should be protected by lock as well (note this was already
done for InitChainAsync, why not for others???). Otherwise, when we
execute the block, tx might come in and call the callback in the same
time we're updating it in execBlockOnProxyApp => DATA RACE
Fixes#2721
Consensus state is locked
```
goroutine 113333 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00180009c, 0xc0000c7e00)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*RWMutex).RLock(0xc001800090)
/usr/local/go/src/sync/rwmutex.go:50 +0x4e
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).GetRoundState(0xc001800000, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:218 +0x46
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).queryMaj23Routine(0xc0017def80, 0x11104a0, 0xc0072488f0, 0xc007248
9c0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/reactor.go:735 +0x16d
created by github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusReactor).AddPeer
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/reactor.go:172 +0x236
```
because localClient is locked
```
goroutine 1899 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00003363c, 0xc0000cb500)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc000033638)
/usr/local/go/src/sync/mutex.go:134 +0xff
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client.(*localClient).SetResponseCallback(0xc0001fb560, 0xc007868540)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:32 +0x33
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy.(*appConnConsensus).SetResponseCallback(0xc00002f750, 0xc007868540)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:57 +0x40
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state.execBlockOnProxyApp(0x1104e20, 0xc002ca0ba0, 0x11092a0, 0xc00002f750, 0xc0001fe960, 0xc000bfc660, 0x110cfe0, 0xc000090330, 0xc9d12, 0xc000d9d5a0, ...)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state/execution.go:230 +0x1fd
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc002c2a230, 0x7, 0x0, 0xc000eae880, 0x6, 0xc002e52c60, 0x16, 0x1f927, 0xc9d12, 0xc000d9d5a0, ...)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/state/execution.go:96 +0x142
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).finalizeCommit(0xc001800000, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1339 +0xa3e
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).tryFinalizeCommit(0xc001800000, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1270 +0x451
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit.func1(0xc001800000, 0x0, 0x1f928)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1218 +0x90
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).enterCommit(0xc001800000, 0x1f928, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1247 +0x6b8
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote(0xc001800000, 0xc003d8dea0, 0xc000cf4cc0, 0x28, 0xf1, 0xc003bc7ad0, 0xc003bc7b10)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1659 +0xbad
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).tryAddVote(0xc001800000, 0xc003d8dea0, 0xc000cf4cc0, 0x28, 0xf1, 0xf1, 0xf1)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:1517 +0x59
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).handleMsg(0xc001800000, 0xd98200, 0xc0070dbed0, 0xc000cf4cc0, 0x28)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:660 +0x64b
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).receiveRoutine(0xc001800000, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:617 +0x670
created by github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus.(*ConsensusState).OnStart
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/consensus/state.go:311 +0x132
```
tx comes in and CheckTx is executed right when we execute the block
```
goroutine 111044 [semacquire, 309 minutes]:
sync.runtime_SemacquireMutex(0xc00003363c, 0x0)
/usr/local/go/src/runtime/sema.go:71 +0x3d
sync.(*Mutex).Lock(0xc000033638)
/usr/local/go/src/sync/mutex.go:134 +0xff
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client.(*localClient).CheckTxAsync(0xc0001fb0e0, 0xc002d94500, 0x13f, 0x280, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/abci/client/local_client.go:85 +0x47
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy.(*appConnMempool).CheckTxAsync(0xc00002f720, 0xc002d94500, 0x13f, 0x280, 0x1)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:114 +0x51
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/mempool.(*Mempool).CheckTx(0xc002d3a320, 0xc002d94500, 0x13f, 0x280, 0xc0072355f0, 0x0, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/mempool/mempool.go:316 +0x17b
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/core.BroadcastTxSync(0xc002d94500, 0x13f, 0x280, 0x0, 0x0, 0x0)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/core/mempool.go:93 +0xb8
reflect.Value.call(0xd85560, 0x10326c0, 0x13, 0xec7b8b, 0x4, 0xc00663f180, 0x1, 0x1, 0xc00663f180, 0xc00663f188, ...)
/usr/local/go/src/reflect/value.go:447 +0x449
reflect.Value.Call(0xd85560, 0x10326c0, 0x13, 0xc00663f180, 0x1, 0x1, 0x0, 0x0, 0xc005cc9344)
/usr/local/go/src/reflect/value.go:308 +0xa4
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.makeHTTPHandler.func2(0x1102060, 0xc00663f100, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/handlers.go:269 +0x188
net/http.HandlerFunc.ServeHTTP(0xc002c81f20, 0x1102060, 0xc00663f100, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:1964 +0x44
net/http.(*ServeMux).ServeHTTP(0xc002c81b60, 0x1102060, 0xc00663f100, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:2361 +0x127
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.maxBytesHandler.ServeHTTP(0x10f8a40, 0xc002c81b60, 0xf4240, 0x1102060, 0xc00663f100, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:219 +0xcf
github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server.RecoverAndLogHandler.func1(0x1103220, 0xc00121e620, 0xc0082d7900)
/root/go/src/github.com/MinterTeam/minter-go-node/vendor/github.com/tendermint/tendermint/rpc/lib/server/http_server.go:192 +0x394
net/http.HandlerFunc.ServeHTTP(0xc002c06ea0, 0x1103220, 0xc00121e620, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:1964 +0x44
net/http.serverHandler.ServeHTTP(0xc001a1aa90, 0x1103220, 0xc00121e620, 0xc0082d7900)
/usr/local/go/src/net/http/server.go:2741 +0xab
net/http.(*conn).serve(0xc00785a3c0, 0x11041a0, 0xc000f844c0)
/usr/local/go/src/net/http/server.go:1847 +0x646
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2851 +0x2f5
```
* consensus: use read lock in Receive#VoteMessage
* use defer to unlock mutex because application might panic
* use defer in every method of the localClient
* add a changelog entry
* drain channels before Unsubscribe(All)
Read 55362ed766/libs/pubsub/pubsub.go (L13)
for the detailed explanation of the issue.
We'll need to fix it someday. Make sure to keep an eye on
https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-033-pubsub.md
* retry instead of panic when peer has no state in reactors other than consensus
in /dump_consensus_state RPC endpoint, skip a peer with no state
* rpc/core/mempool: simplify error messages
* rpc/core/mempool: use time.After instead of timer
also, do not log DeliverTx result (to be consistent with other memthods)
* unlock before calling the callback in reqRes#SetCallback
Currently the top level directory contains basically all of the code
for the crypto package. This PR moves the crypto code into submodules
in a similar manner to what `golang/x/crypto` does. This improves code
organization.
Ref discussion: https://github.com/tendermint/tendermint/pull/1966Closes#1956
Refs #1755
I started with writing a test for wsConnection (WebsocketManager) where
I:
- create a WS connection
- do a simple echo call
- close it
No leaking goroutines, nor any leaking memory were detected.
For useful shortcuts see my blog post
https://blog.cosmos.network/debugging-the-memory-leak-in-tendermint-210186711420
Then I went to the rpc tests to see if calling Subscribe results in
memory growth. It did.
I used a slightly modified version of TestHeaderEvents function:
```
func TestHeaderEvents(t *testing.T) {
// memory heap before
f, err := os.Create("/tmp/mem1.mprof")
if err != nil {
t.Fatal(err)
}
pprof.WriteHeapProfile(f)
f.Close()
for i := 0; i < 100; i++ {
c := getHTTPClient()
err = c.Start()
require.Nil(t, err)
evtTyp := types.EventNewBlockHeader
evt, err := client.WaitForOneEvent(c, evtTyp, waitForEventTimeout)
require.Nil(t, err)
_, ok := evt.(types.EventDataNewBlockHeader)
require.True(t, ok)
c.Stop()
c = nil
}
runtime.GC()
// memory heap before
f, err = os.Create("/tmp/mem2.mprof")
if err != nil {
t.Fatal(err)
}
pprof.WriteHeapProfile(f)
f.Close()
// dump all running goroutines
time.Sleep(10 * time.Second)
pprof.Lookup("goroutine").WriteTo(os.Stdout, 1)
}
```
```
Showing nodes accounting for 35159.16kB, 100% of 35159.16kB total
Showing top 10 nodes out of 48
flat flat% sum% cum cum%
32022.23kB 91.08% 91.08% 32022.23kB 91.08% github.com/tendermint/tendermint/libs/pubsub/query.(*QueryParser).Init
1056.33kB 3.00% 94.08% 1056.33kB 3.00% bufio.NewReaderSize
528.17kB 1.50% 95.58% 528.17kB 1.50% bufio.NewWriterSize
528.17kB 1.50% 97.09% 528.17kB 1.50% github.com/tendermint/tendermint/consensus.NewConsensusState
512.19kB 1.46% 98.54% 512.19kB 1.46% runtime.malg
512.08kB 1.46% 100% 512.08kB 1.46% syscall.ByteSliceFromString
0 0% 100% 512.08kB 1.46% github.com/tendermint/tendermint/consensus.(*ConsensusState).(github.com/tendermint/tendermint/consensus.defaultDecideProposal)-fm
0 0% 100% 512.08kB 1.46% github.com/tendermint/tendermint/consensus.(*ConsensusState).addVote
0 0% 100% 512.08kB 1.46% github.com/tendermint/tendermint/consensus.(*ConsensusState).defaultDecideProposal
0 0% 100% 512.08kB 1.46% github.com/tendermint/tendermint/consensus.(*ConsensusState).enterNewRound
```
100 subscriptions produce 32MB.
Again, no additional goroutines are running after the end of the test
(wsConnection readRoutine and writeRoutine both finishes). **It means
that some exiting goroutine or object is holding a reference to the
*Query objects, which are leaking.**
One of them is pubsub#loop. It's using state.queries to map queries to
clients and state.clients to map clients to queries.
Before this commit, we're not thoroughly cleaning state.queries, which
was the reason for memory leakage.