Continues from #3280 in building support for batched requests/responses in the JSON RPC (as per issue #3213).
* Add JSON RPC batching for client and server
As per #3213, this adds support for [JSON RPC batch requests and
responses](https://www.jsonrpc.org/specification#batch).
* Add additional checks to ensure client responses are the same as results
* Fix case where a notification is sent and no response is expected
* Add test to check that JSON RPC notifications in a batch are left out in responses
* Update CHANGELOG_PENDING.md
* Update PR number now that PR has been created
* Make errors start with lowercase letter
* Refactor batch functionality to be standalone
This refactors the batching functionality so that each batch acts in a
standalone way. To support concurrent goroutines sharing the same
client, each goroutine can create its own batch of requests and send
that batch without interfering with a batch from another goroutine.
* Add examples for simple and batch HTTP client usage
* Check errors from writer and remove nolinter directives
* Make error strings start with lowercase letter
* Refactor examples to make them testable
* Use safer deferred shutdown for example Tendermint test node
* Recompose rpcClient interface from pre-existing interface components
* Rename WaitGroup for brevity
* Replace empty ID string with request ID
* Remove extraneous test case
* Convert first letter of errors.Wrap() messages to lowercase
* Remove extraneous function parameter
* Make variable declaration terse
* Reorder WaitGroup.Done call to help prevent race conditions in the face of failure
* Swap mutex to value representation and remove initialization
* Restore empty JSONRPC string ID in response to prevent nil
* Make JSONRPCBufferedRequest private
* Revert PR hard link in CHANGELOG_PENDING
* Add client ID for JSONRPCClient
This adds code to automatically generate a randomized client ID for the
JSONRPCClient, and adds a check of the IDs in the responses (if one was
set in the requests).
* Extract response ID validation into separate function
* Remove extraneous comments
* Reorder fields to indicate clearly which are protected by the mutex
* Refactor for loop to remove indexing
* Restructure and combine loop
* Flatten conditional block for better readability
* Make multi-variable declaration slightly more readable
* Change for loop style
* Compress error check statements
* Make function description more generic to show that we support different protocols
* Preallocate memory for request and result objects
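The standalone batching and the notification handling described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the actual client code: the `request` and `batch` types and the `Add` and `Flush` methods are hypothetical names, and the only behaviour assumed from the JSON-RPC 2.0 specification is that a batch is a JSON array and that a request without an `id` is a notification that gets no response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// request is a hypothetical JSON-RPC 2.0 request envelope. An empty ID is
// omitted from the JSON, which turns the request into a notification.
type request struct {
	JSONRPC string `json:"jsonrpc"`
	ID      string `json:"id,omitempty"`
	Method  string `json:"method"`
}

// batch buffers requests on its own, so two goroutines holding separate
// batches never touch each other's buffer.
type batch struct {
	mtx      sync.Mutex
	requests []request // protected by mtx
}

// Add appends one request to this batch only.
func (b *batch) Add(id, method string) {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	b.requests = append(b.requests, request{JSONRPC: "2.0", ID: id, Method: method})
}

// Flush encodes the buffered requests as a single JSON array, returns the IDs
// that should each receive a response, and clears the buffer.
func (b *batch) Flush() ([]byte, []string, error) {
	b.mtx.Lock()
	defer b.mtx.Unlock()
	payload, err := json.Marshal(b.requests)
	if err != nil {
		return nil, nil, err
	}
	var ids []string
	for _, r := range b.requests {
		if r.ID != "" { // notifications are left out of the responses
			ids = append(ids, r.ID)
		}
	}
	b.requests = nil
	return payload, ids, nil
}

func main() {
	b := &batch{}
	b.Add("1", "status")
	b.Add("", "ping") // notification: no response expected
	b.Add("2", "health")
	payload, ids, err := b.Flush()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
	fmt.Println("expecting responses for:", ids)
}
```

A caller would compare the IDs in the responses against the returned list, which is the kind of response ID validation the later commits mention.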
* use dialPeer function in a seed mode
Fixes #3532
by storing the number of connection attempts in memory and removing the
address from the addrbook when the number of attempts exceeds 16
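The attempt counting described above might look roughly like this. It is a sketch under assumptions: `attemptTracker`, `RecordFailure` and `maxAttemptsToDial` are hypothetical names, and only the threshold of 16 comes from the description.

```go
package main

import (
	"fmt"
	"sync"
)

// maxAttemptsToDial mirrors the threshold of 16 from the description above.
const maxAttemptsToDial = 16

// attemptTracker is a hypothetical in-memory counter of failed dials.
// Nothing is persisted, so the counts are lost on restart.
type attemptTracker struct {
	mtx      sync.Mutex
	attempts map[string]int // protected by mtx
}

func newAttemptTracker() *attemptTracker {
	return &attemptTracker{attempts: make(map[string]int)}
}

// RecordFailure bumps the counter for addr and reports whether the address
// should now be removed from the address book.
func (t *attemptTracker) RecordFailure(addr string) bool {
	t.mtx.Lock()
	defer t.mtx.Unlock()
	t.attempts[addr]++
	if t.attempts[addr] > maxAttemptsToDial {
		delete(t.attempts, addr)
		return true
	}
	return false
}

func main() {
	tracker := newAttemptTracker()
	for i := 1; i <= 17; i++ {
		if tracker.RecordFailure("1.2.3.4:26656") {
			fmt.Println("remove address after attempt", i)
		}
	}
}
```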
Fixes #3457
The issue is this: writing a BlockRequest into the requestsCh channel also creates a timer that stops the peer 15s later if no block has been received. But popping a BlockRequest from requestsCh and sending it out may be delayed by more than 15s, so the peer is stopped with error("send nothing to us").
Extracting the requestsCh handling into its own goroutine ensures that every BlockRequest is handled in a timely manner.
Instead of the requestsCh handling, we should probably pull the didProcessCh handling into a separate goroutine, since this is the one "starving" the other channel handlers. I believe that, the way it is right now, we still have issues with high delays in errorsCh handling that might cause requests to be sent to invalid/disconnected peers.
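The idea of giving requestsCh its own goroutine can be illustrated with a small sketch. The `blockRequest` type, the `drainRequests` helper and the `send` callback are hypothetical stand-ins; the real reactor loop handles several channels and is not reproduced here.

```go
package main

import "fmt"

// blockRequest is a hypothetical stand-in for the BlockRequest mentioned above.
type blockRequest struct {
	Height int64
	PeerID string
}

// drainRequests forwards every request from requestsCh to send in a dedicated
// goroutine, so slow work on other channels cannot delay it. The returned
// channel is closed once requestsCh has been closed and drained.
func drainRequests(requestsCh <-chan blockRequest, send func(blockRequest)) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		for req := range requestsCh {
			send(req)
		}
	}()
	return done
}

func main() {
	requestsCh := make(chan blockRequest, 4)
	var sent []int64
	done := drainRequests(requestsCh, func(r blockRequest) {
		sent = append(sent, r.Height)
	})
	for h := int64(1); h <= 3; h++ {
		requestsCh <- blockRequest{Height: h, PeerID: "peer-1"}
	}
	close(requestsCh)
	<-done // all requests have been handed to send
	fmt.Println("sent heights:", sent)
}
```

The same pattern would apply if didProcessCh were the channel moved out of the main loop, as the follow-up comment suggests.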
What happened:
New code was supposed to fall back to last height changed when/if it
failed to find validators at checkpoint height (to make release
non-breaking).
But because we did not check if validator set is empty, the fall back
logic was never executed => resulting in LoadValidators returning an
empty validator set for cases where `lastStoredHeight` is checkpoint
height (i.e. almost all heights if the application does not change
validator set often).
How it was found:
one of our users - @sunboshan reported a bug here
https://github.com/tendermint/tendermint/pull/3537#issuecomment-482711833
* use last height changed if validator set is empty
* add a changelog entry
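The fallback described above can be sketched as follows. This is an illustration under assumptions: `valStore` and `loadValidators` are hypothetical, simplified stand-ins for the real state store and LoadValidators, and validators are represented as plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

// valStore is a hypothetical height -> validator addresses lookup.
type valStore map[int64][]string

// loadValidators reads the set stored at lastStoredHeight and, if that set is
// empty, falls back to the height at which the validators last changed.
func loadValidators(s valStore, lastStoredHeight, lastHeightChanged int64) ([]string, error) {
	vals := s[lastStoredHeight]
	if len(vals) == 0 { // the emptiness check that was missing
		vals = s[lastHeightChanged]
	}
	if len(vals) == 0 {
		return nil, errors.New("no validators found at either height")
	}
	return vals, nil
}

func main() {
	// Nothing is stored at the checkpoint height, only at the last change.
	s := valStore{1: {"val-a", "val-b"}}
	vals, err := loadValidators(s, 100000, 1)
	fmt.Println(vals, err)
}
```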
A prior change to address accidental DNS lookups introduced the
SocketAddr on peer, which was then used to add it to the address book.
That in turn swallowed the self-reported port of the peer, which is
important on a reconnect. This change revives the NetAddress on NodeInfo,
which the Peer carries, but it now returns an error to avoid nil
dereferencing, another issue observed in the past. Additionally, we could
potentially address #3532, yet the original problem statement of that
issue stands.
As a drive-by optimisation, `MarkAsGood` now takes only a `p2p.ID`, which
makes its interface a bit stricter and leaner.
* rpc: store validator info periodically
* increase ValidatorSetStoreInterval
also
- unexpose it
- add a comment
- refactor code
- add a benchmark, which shows that 100000 results in ~ 100ms to get 100
validators
* make the change non-breaking
* expand comment
* rename valSetStoreInterval to valSetCheckpointInterval
* change the panic msg
* add a test and changelog entry
* update changelog entry
* update changelog entry
* add a link to PR
* fix test
* Update CHANGELOG_PENDING.md
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* update comment
* use MaxInt64 func
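One way to picture the checkpointing from the commits above: the full validator set is stored every `valSetCheckpointInterval` heights, and a lookup starts from the later of the most recent checkpoint and the last height at which the set changed. The sketch below is an assumption about the shape of that computation, not a copy of the real code; `lastStoredHeightFor` and `maxInt64` are hypothetical helpers, and 100000 is the value quoted in the benchmark note.

```go
package main

import "fmt"

// valSetCheckpointInterval uses the 100000 figure from the benchmark note
// above; treat the exact value as an assumption.
const valSetCheckpointInterval = 100000

// maxInt64 returns the larger of two int64 values.
func maxInt64(a, b int64) int64 {
	if a > b {
		return a
	}
	return b
}

// lastStoredHeightFor returns the height at which a full validator set is
// assumed to be stored: the latest checkpoint at or below height, or the last
// height at which the set changed, whichever is later.
func lastStoredHeightFor(height, lastHeightChanged int64) int64 {
	checkpointHeight := height - height%valSetCheckpointInterval
	return maxInt64(checkpointHeight, lastHeightChanged)
}

func main() {
	fmt.Println(lastStoredHeightFor(250000, 1))
	fmt.Println(lastStoredHeightFor(250000, 230000))
	fmt.Println(lastStoredHeightFor(50, 1))
}
```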
* add actionable advice for ErrAddrBookNonRoutable err
Should replace https://github.com/tendermint/tendermint/pull/3463
* reorder checks in addrbook#addAddress so
ErrAddrBookPrivate is returned first
and do not log error in DialPeersAsync if the address is private
because it's not an error
ListOfKnownAddresses is removed
panic if addrbook size is less than zero
CrawlPeers does not attempt to connect to existing peers or peers we're currently dialing
various perf. fixes
improved tests (though not complete)
move IsDialingOrExistingAddress check into DialPeerWithAddress (Fixes #2716)
* addrbook: preallocate memory when saving addrbook to file
* addrbook: remove oldestFirst struct and check for ID
* oldestFirst replaced with sort.Slice
* ID is now mandatory, so no need to check
* addrbook: remove ListOfKnownAddresses
GetSelection is used instead in seed mode.
* addrbook: panic if size is less than 0
* rewrite addrbook#saveToFile to not use a counter
* test AttemptDisconnects func
* move IsDialingOrExistingAddress check into DialPeerWithAddress
* save and cleanup crawl peer data
* get rid of DefaultSeedDisconnectWaitPeriod
* make linter happy
* fix TestPEXReactorSeedMode
* fix comment
* add a changelog entry
* Apply suggestions from code review
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* rename ErrDialingOrExistingAddress to ErrCurrentlyDialingOrExistingAddress
* lowercase errors
* do not persist seed data
pros:
- no extra files
- less IO
cons:
- if the node crashes, seed might crawl a peer too soon
* fixes after Ethan's review
* add a changelog entry
* we should only consult Switch about peers
checking addrbook size does not make sense since only PEX reactor uses
it for dialing peers!
https://github.com/tendermint/tendermint/pull/3011#discussion_r270948875
* OriginalAddr -> SocketAddr
OriginalAddr records the originally dialed address for outbound peers,
rather than the peer's self reported address. For inbound peers, it was
nil. Here, we rename it to SocketAddr and for inbound peers, set it to
the RemoteAddr of the connection.
* use SocketAddr
Numerous places in the code call peer.NodeInfo().NetAddress().
However, this call to NetAddress() may perform a DNS lookup if the
reported NodeInfo.ListenAddr includes a name. Failure of this lookup
returns a nil address, which can lead to panics in the code.
Instead, call peer.SocketAddr() to return the static address of the
connection.
* remove nodeInfo.NetAddress()
Expose `transport.NetAddress()`, a static result determined
when the transport is created. Removing NetAddress() from the nodeInfo
prevents accidental DNS lookups.
* fixes from review
* linter
* fixes from review
* docs: fix broken links (#3482)
A bunch of links were broken in the documentation as they included the
`docs` prefix.
* Update CHANGELOG_PENDING
* docs: switch to relative links for github compatibility (#3482)
* mempool: add a safety check, write tests for mempoolIDs
and document 65536 limit in the mempool reactor spec
follow-up to https://github.com/tendermint/tendermint/pull/2778
* rename the test
* fixes after Ismail's review
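The 65536 limit and the safety check mentioned above can be illustrated with a small ID pool. This is a hedged sketch: `idPool`, `Reserve` and `Reclaim` are hypothetical names, and returning an error when the pool is exhausted is a choice made here for illustration, not necessarily what the mempool reactor does.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// maxActiveIDs is the 65536 limit: every value a uint16 can hold.
const maxActiveIDs = 1 << 16

// idPool is a hypothetical allocator of short per-peer IDs.
type idPool struct {
	mtx    sync.Mutex
	nextID uint16              // protected by mtx
	active map[uint16]struct{} // protected by mtx
}

func newIDPool() *idPool {
	return &idPool{active: make(map[uint16]struct{})}
}

// Reserve returns an unused ID, or an error once all 65536 are taken.
func (p *idPool) Reserve() (uint16, error) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	if len(p.active) == maxActiveIDs {
		// Safety check: without it the search below would never end.
		return 0, errors.New("all 65536 IDs are in use")
	}
	for {
		if _, taken := p.active[p.nextID]; !taken {
			break
		}
		p.nextID++ // wraps around to 0 after 65535
	}
	id := p.nextID
	p.active[id] = struct{}{}
	p.nextID++
	return id, nil
}

// Reclaim makes id available again.
func (p *idPool) Reclaim(id uint16) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	delete(p.active, id)
}

func main() {
	pool := newIDPool()
	first, _ := pool.Reserve()
	second, _ := pool.Reserve()
	fmt.Println(first, second)
	pool.Reclaim(first)
}
```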