When using the RPC client in my test suite (with -race enabled), I do a lot of Subscribe/Unsubscribe operations, at some point (randomly) the race detector returns the following warning:
WARNING: DATA RACE
Read at 0x00c0009dbe30 by goroutine 31:
runtime.mapiterinit()
/usr/local/go/src/runtime/map.go:804 +0x0
github.com/tendermint/tendermint/rpc/client.(*WSEvents).redoSubscriptionsAfter()
/go/pkg/mod/github.com/tendermint/tendermint@v0.31.5/rpc/client/httpclient.go:364 +0xc0
github.com/tendermint/tendermint/rpc/client.(*WSEvents).eventListener()
/go/pkg/mod/github.com/tendermint/tendermint@v0.31.5/rpc/client/httpclient.go:393 +0x3c6
Turns out that the redoSubscriptionAfter is not protecting the access to subscriptions.
The following change protects the read access to the subscription map behind the mutex
* manually swagging
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* three definitions with polymorphism
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* added blockchain and block
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* low quality generation, commit, block_response and validators
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* genesis and consensus states endpoints
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* fix indentation
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* consensus parameters
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* fix indentation
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* add height to consensus parameters endpoint
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* unconfirmed_txs and num_unconfirmed_txs
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* add missing query parameter
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* add ABCI queries
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* added index document for swagger documentation
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* add missing routes
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* contract tests added on CCI
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* contract tests job should be in the test suite
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* simplify requirements to test
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* typo
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* build is a prerequisite to start localnet
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* reduce nodejs size, move goodman to get_tools, add docs, fix comments
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* Update scripts/get_tools.sh
That's cleaner, thanks!
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* xz not supported by cci image, let's keep it simple
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* REMOVE-indirect debug of CCI paths
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* dirty experiment, volume is empty but binary has been produced
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* dirty experiment, volume is empty but binary has been produced
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* dirty experiment going on
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* locally works, CCI have difficulties with second layaer containers volumes
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* restore experiment, use machine instead of docker for contract tests
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* simplify a bit
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* rollback on machine golang
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* Document the changes
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* Changelog
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
* comments
Signed-off-by: Karoly Albert Szabo <szabo.karoly.a@gmail.com>
This PR is related to #3107 and a continuation of #3351
It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.
Given two hosts A and B:
Host A is listener/client
Host B is dialer/server (contains the secret key)
When A requires a signature, it needs to wait for B to dial in before it can issue a request.
A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.
The original rationale behind this design was based on security.
Host B only allows outbound connections to a list of whitelisted hosts.
It is not possible to reach B unless B dials in. There are no listening/open ports in B.
This PR results in the following changes:
Refactors ping/heartbeat to avoid previously existing race conditions.
Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
Unifies and abstracts away the differences between unix and tcp sockets.
A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
The signer request handler (server side) is customizable to increase testability.
Updates and extends unit tests
A high level overview of the classes is as follows:
Transport (endpoints): The following classes take care of establishing a connection
SignerDialerEndpoint
SignerListeningEndpoint
SignerEndpoint groups common functionality (read/write/timeouts/etc.)
Signing (client/server): The following classes take care of exchanging request/responses
SignerClient
SignerServer
This PR also closes#3601
Commits:
* refactoring - work in progress
* reworking unit tests
* Encapsulating and fixing unit tests
* Improve tests
* Clean up
* Fix/improve unit tests
* clean up tests
* Improving service endpoint
* fixing unit test
* fix linter issues
* avoid invalid cache values (improve later?)
* complete implementation
* wip
* improved connection loop
* Improve reconnections + fixing unit tests
* addressing comments
* small formatting changes
* clean up
* Update node/node.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client_test.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* check during initialization
* dropping connecting when writing fails
* removing break
* use t.log instead
* unifying and using cmn.GetFreePort()
* review fixes
* reordering and unifying drop connection
* closing instead of signalling
* refactored service loop
* removed superfluous brackets
* GetPubKey can return errors
* Revert "GetPubKey can return errors"
This reverts commit 68c06f19b4.
* adding entry to changelog
* Update CHANGELOG_PENDING.md
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_listener_endpoint_test.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* updating node.go
* review fixes
* fixes linter
* fixing unit test
* small fixes in comments
* addressing review comments
* addressing review comments 2
* reverting suggestion
* Update privval/signer_client_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_client_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_listener_endpoint_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* do not expose brokenSignerDialerEndpoint
* clean up logging
* unifying methods
shorten test time
signer also drops
* reenabling pings
* improving testability + unit test
* fixing go fmt + unit test
* remove unused code
* Addressing review comments
* simplifying connection workflow
* fix linter/go import issue
* using base service quit
* updating comment
* Simplifying design + adjusting names
* fixing linter issues
* refactoring test harness + fixes
* Addressing review comments
* cleaning up
* adding additional error check
* node: allow replacing existing p2p.Reactor(s)
using [`CustomReactors`
option](https://godoc.org/github.com/tendermint/tendermint/node#CustomReactors).
Warning: beware of accidental name clashes. Here is the list of existing
reactors: MEMPOOL, BLOCKCHAIN, CONSENSUS, EVIDENCE, PEX.
* check the absence of "CUSTOM" prefix
* merge 2 tests
* add doc.go to node package
* Do not write 'Couldn't connect to any seeds' if there are no seeds
* changelog
* remove privValUpgrade
* Fix typo in changelog
* Update CHANGELOG_PENDING.md
Co-Authored-By: Marko <marbar3778@yahoo.com>
I'm setting up all peers dynamically by calling dial_peers, so p2p.seeds in configs is empty, and I'm seeing error log a lot in logs.
* p2p: fix false-positive error logging when stopping connections
This changeset fixes two types of false-positive errors occurring during
connection shutdown.
The first occurs when the process invokes FlushStop() or Stop() on a
connection. While the previous behavior did properly wait for the sendRoutine
to finish, it did not notify the recvRoutine that the connection was shutting
down. This would cause the recvRouting to receive and error when reading and
log this error. The changeset fixes this by notifying the recvRoutine that
the connection is shutting down.
The second occurs when the connection is terminated (gracefully) by the other side.
The recvRoutine would get an EOF error during the read, log it, and stop the connection
with an error. The changeset detects EOF and gracefully shuts down the connection.
* bring back the comment about flushing
* add changelog entry
* listen for quitRecvRoutine too
* we have to call stopForError
Otherwise peer won't be removed from the peer set and maybe readded
later.
* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design
added block requests under peer
moved the request trigger in the reactor poolRoutine, triggered now by a ticker
in general moved everything required for making block requests smarter in the poolRoutine
added a simple map of heights to keep track of what will need to be requested next
added a few more tests
* send errors to FSM in a different channel than blocks
send errors (RemovePeer) from switch on a different channel than the
one receiving blocks
renamed channels
added more pool tests
* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()
If we tried to send a request to a peer not present in the switch, a
missing continue statement caused the request to be blackholed in a peer
that was removed and never retried.
While this bug was manifesting, the reactor kept asking for other
blocks that would be stored and never consumed. Added the number of
unconsumed blocks in the math for requesting blocks ahead of current
processing height so eventually there will be no more blocks requested
until the already received ones are consumed.
* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvments
* suggestion to changelog entry
* Remove db from tendemrint in favor of tendermint/tm-cmn
- remove db from `libs`
- update dependancy, there have been no breaking changes in the updated deps
- https://github.com/grpc/grpc-go/releases
- https://github.com/golang/protobuf/releases
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* changelog add
* gofmt
* more gofmt
* node: allow replacing existing p2p.Reactor(s)
using [`CustomReactors`
option](https://godoc.org/github.com/tendermint/tendermint/node#CustomReactors).
Warning: beware of accidental name clashes. Here is the list of existing
reactors: MEMPOOL, BLOCKCHAIN, CONSENSUS, EVIDENCE, PEX.
* check the absence of "CUSTOM" prefix
* merge 2 tests
* add doc.go to node package
* Do not write 'Couldn't connect to any seeds' if there are no seeds
* changelog
* remove privValUpgrade
* Fix typo in changelog
* Update CHANGELOG_PENDING.md
Co-Authored-By: Marko <marbar3778@yahoo.com>
I'm setting up all peers dynamically by calling dial_peers, so p2p.seeds in configs is empty, and I'm seeing error log a lot in logs.
* p2p: fix false-positive error logging when stopping connections
This changeset fixes two types of false-positive errors occurring during
connection shutdown.
The first occurs when the process invokes FlushStop() or Stop() on a
connection. While the previous behavior did properly wait for the sendRoutine
to finish, it did not notify the recvRoutine that the connection was shutting
down. This would cause the recvRouting to receive and error when reading and
log this error. The changeset fixes this by notifying the recvRoutine that
the connection is shutting down.
The second occurs when the connection is terminated (gracefully) by the other side.
The recvRoutine would get an EOF error during the read, log it, and stop the connection
with an error. The changeset detects EOF and gracefully shuts down the connection.
* bring back the comment about flushing
* add changelog entry
* listen for quitRecvRoutine too
* we have to call stopForError
Otherwise peer won't be removed from the peer set and maybe readded
later.
* go routines in blockchain reactor
* Added reference to the go routine diagram
* Initial commit
* cleanup
* Undo testing_logger change, committed by mistake
* Fix the test loggers
* pulled some fsm code into pool.go
* added pool tests
* changes to the design
added block requests under peer
moved the request trigger in the reactor poolRoutine, triggered now by a ticker
in general moved everything required for making block requests smarter in the poolRoutine
added a simple map of heights to keep track of what will need to be requested next
added a few more tests
* send errors to FSM in a different channel than blocks
send errors (RemovePeer) from switch on a different channel than the
one receiving blocks
renamed channels
added more pool tests
* more pool tests
* lint errors
* more tests
* more tests
* switch fast sync to new implementation
* fixed data race in tests
* cleanup
* finished fsm tests
* address golangci comments :)
* address golangci comments :)
* Added timeout on next block needed to advance
* updating docs and cleanup
* fix issue in test from previous cleanup
* cleanup
* Added termination scenarios, tests and more cleanup
* small fixes to adr, comments and cleanup
* Fix bug in sendRequest()
If we tried to send a request to a peer not present in the switch, a
missing continue statement caused the request to be blackholed in a peer
that was removed and never retried.
While this bug was manifesting, the reactor kept asking for other
blocks that would be stored and never consumed. Added the number of
unconsumed blocks in the math for requesting blocks ahead of current
processing height so eventually there will be no more blocks requested
until the already received ones are consumed.
* remove bpPeer's didTimeout field
* Use distinct err codes for peer timeout and FSM timeouts
* Don't allow peers to update with lower height
* review comments from Ethan and Zarko
* some cleanup, renaming, comments
* Move block execution in separate goroutine
* Remove pool's numPending
* review comments
* fix lint, remove old blockchain reactor and duplicates in fsm tests
* small reorg around peer after review comments
* add the reactor spec
* verify block only once
* review comments
* change to int for max number of pending requests
* cleanup and godoc
* Add configuration flag fast sync version
* golangci fixes
* fix config template
* move both reactor versions under blockchain
* cleanup, golint, renaming stuff
* updated documentation, fixed more golint warnings
* integrate with behavior package
* sync with master
* gofmt
* add changelog_pending entry
* move to improvments
* suggestion to changelog entry
* Remove db from tendemrint in favor of tendermint/tm-cmn
- remove db from `libs`
- update dependancy, there have been no breaking changes in the updated deps
- https://github.com/grpc/grpc-go/releases
- https://github.com/golang/protobuf/releases
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* changelog add
* gofmt
* more gofmt
* Release branch for v0.32.1
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* add links to changelog
* wording change
* Apply suggestions from code review
Comment from PR
Co-Authored-By: Ethan Buchman <ethan@coinculture.info>
* add link bump abci Version
* include doc change for abci/app
* moved abci change to features as it doesnt break the abci
* pr comments (#3803)
* pr comments
* abci changelog change
* Marko/update release1 (#3806)
* pr comments
* remove empty space
* more minor cleanup of libs
Remove unused `version.go`, `assert.go` and `libs/circle.yml`
* Update types/vote_set_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* spelling change
- The removed functions are not used in Iavl, Cosmos-sdk and tendermint repos
- Code-hygenie `whoop whoop`
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* Remove file heap.go
- cmn.Heap is not being used in cosmos-sdk, iavl nor tendermint repo.
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* changelog entry
* Update CHANGELOG_PENDING.md
closes#2432
* change invocation of NewNode across
* custom reactor name are prefixed with CUSTOM_
* upgate changelog pending
* improve comments
* node: refactor NewNode to use functional options
Calling ensurePeers outside of ensurePeersRoutine can lead to nodes
disconnecting from us due to "sent next PEX request too soon" error.
Solution is to just dial addrs we got from src instead of calling
ensurePeers.
Refs #2093Fixes#3338
As per #2127, this refactors the RequestCheckTx ProtoBuf struct to allow for a flag indicating whether a query is a recheck or not (and allows for possible future, more nuanced states).
In order to pass this extended information through to the ABCI app, the proxy.AppConnMempool (and, for consistency, the proxy.AppConnConsensus) interface seems to need to be refactored along with abcicli.Client.
And, as per this comment, I've made the following modification to the protobuf definition for the RequestCheckTx structure:
enum CheckTxType {
New = 0;
Recheck = 1;
}
message RequestCheckTx {
bytes tx = 1;
CheckTxType type = 2;
}
* Refactor ABCI CheckTx to notify of recheck
As per #2127, this refactors the `RequestCheckTx` ProtoBuf struct to allow for:
1. a flag indicating whether a query is a recheck or not (and allows for
possible future, more nuanced states)
2. an `additional_data` bytes array to provide information for those more
nuanced states.
In order to pass this extended information through to the ABCI app, the
`proxy.AppConnMempool` (and, for consistency, the
`proxy.AppConnConsensus`) interface seems to need to be refactored.
Commits:
* Fix linting issue
* Add CHANGELOG_PENDING entry
* Remove extraneous explicit initialization
* Update ABCI spec doc to include new CheckTx params
* Rename method param for consistency
* Rename CheckTxType enum values and remove additional_data param