implementation spec of Improved Trusted Peering ADR-050 by B-Harvest
- add unconditional_peer_ids and persistent_peers_max_dial_period to config
- add unconditionalPeerIDs map to Switch struct
default config value of persistent_peers_max_dial_period is 0s(disabled)
Refs #4072, #4053
* p2p/conn: simplify secret connection handshake malleability fix with merlin
Introduces new dependencies on github.com/gtank/merlin and sha3 as a cryptographic primitive
This also only uses the transcript hash as a MAC.
* p2p/conn: avoid string to byte conversion
https://github.com/uber-go/guide/blob/master/style.md#avoid-string-to-byte-conversion
* Add pagination to /validators
- closes#3472
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* add swagger params, default returns all
* address pr comments
* golint fix
* swagger default change, change to default in comment
* swagger.yaml: replace x-example with example
https://swagger.io/docs/specification/adding-examples/
* Revert "swagger.yaml: replace x-example with example"
This reverts commit 9df1b006de.
* update changelog and remove extra body close
## Issue:
This is an approach to fixing secret connection that is more noise-ish than actually noise.
but it essentially fixes the problem that #3315 is trying to solve by making the secret connection handshake non-malleable. It's easy to understand and I think will be acceptable to @jaekwon
.. the formal reasoning is basically, if the "view" of the transcript between diverges between the sender and the receiver at any point in the protocol, the handshake would terminate.
The base protocol of Station to Station mistakenly assumes that if the sender and receiver arrive at shared secret they have the same view. This is only true for a DH on prime order groups.
This robustly solves the problem by having each cryptographic operation commit to operators view of the protocol.
Another nice thing about a transcript is it provides the basis for "secure" (barring cryptographic breakages, horrible design flaws, or implementation bugs) downgrades, where a backwards compatible handshake can be used to offer newer protocol features/extensions, peers agree to the common subset of what they support, and both sides have to agree on what the other offered for the transcript MAC to verify.
With something like Protos/Amino you already get "extensions" for free (TLS uses a simple TLV format https://tools.ietf.org/html/rfc8446#section-4.2 for extensions not too far off from Protos/Amino), so as long as you cryptographically commit to what they contain in the transcript, it should be possible to extend the protocol in a backwards-compatible manner.
## Commits:
* Minimal changes to remove malleability of secret connection removes the need to check for lower order points.
Breaks compatibility. Secret connections that have no been updated will fail
* Remove the redundant blacklist
* remove remainders of blacklist in tests to make the code compile again
Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>
* Apply suggestions from code review
Apply Ismail's error handling
Co-Authored-By: Ismail Khoffi <Ismail.Khoffi@gmail.com>
* fix error check for io.ReadFull
Signed-off-by: Ismail Khoffi <Ismail.Khoffi@gmail.com>
* Update p2p/conn/secret_connection.go
Co-Authored-By: Ismail Khoffi <Ismail.Khoffi@gmail.com>
* Update p2p/conn/secret_connection.go
Co-Authored-By: Bot from GolangCI <42910462+golangcibot@users.noreply.github.com>
* update changelog and format the code
* move hkdfInit closer to where it's used
* Fix long line errors in abci, crypto, and libs packages
* Fix long lines in p2p and rpc packages
* Fix long lines in abci, state, and tools packages
* Fix long lines in behaviour and blockchain packages
* Fix long lines in cmd and config packages
* Begin fixing long lines in consensus package
* Finish fixing long lines in consensus package
* Add lll exclusion for lines containing URLs
* Fix long lines in crypto package
* Fix long lines in evidence package
* Fix long lines in mempool and node packages
* Fix long lines in libs package
* Fix long lines in lite package
* Fix new long line in node package
* Fix long lines in p2p package
* Ignore gocritic warning
* Fix long lines in privval package
* Fix long lines in rpc package
* Fix long lines in scripts package
* Fix long lines in state package
* Fix long lines in tools package
* Fix long lines in types package
* Enable lll linter
* New lint version upgrade
- linter was upgraded
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* enable-a;; is deprecated
* minor change
* another try
* some more changes
* some more changes
* reenable prealloc
* add version till bot is fixed
* Pin range scope vars
* Don't disable scopelint
This PR repairs linter errors seen when running the following commands:
golangci-lint run --no-config --disable-all=true --enable=scopelint
Contributes to #3262
* Remove unnecessary type conversions
* Consolidate repeated strings into consts
* Clothe return statements
* Update blockchain/v1/reactor_fsm_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
This PR repairs linter errors seen when running the following commands:
golangci-lint run --no-config --disable-all=true --enable=unconvert
golangci-lint run --no-config --disable-all=true --enable=goconst
golangci-lint run --no-config --disable-all=true --enable=nakedret
Contributes to #3262
* init of (2/2) common errors
* Remove instances of cmn.Error (2/2)
- Replace usage of cmnError and errorWrap
- ref #3862
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* comment wording
* simplify IsErrXXX functions
* log panic along with stopping the MConnection
* (1/2) of replace errors.go with github.com/pkg/errors
ref #3862
- step one in removing instances of errors.go in favor of github.com/pkg/errors
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* gofmt
* add in /store
Add gocritic as a linter
The linting is not complete, but should i complete in this PR or in a following.
23 files have been touched so it may be better to do in a following PR
Commits:
* Add gocritic to linting
- Added gocritic to linting
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* gocritic
* pr comments
* remove switch in cmdBatch
* node: allow replacing existing p2p.Reactor(s)
using [`CustomReactors`
option](https://godoc.org/github.com/tendermint/tendermint/node#CustomReactors).
Warning: beware of accidental name clashes. Here is the list of existing
reactors: MEMPOOL, BLOCKCHAIN, CONSENSUS, EVIDENCE, PEX.
* check the absence of "CUSTOM" prefix
* merge 2 tests
* add doc.go to node package
* Do not write 'Couldn't connect to any seeds' if there are no seeds
* changelog
* remove privValUpgrade
* Fix typo in changelog
* Update CHANGELOG_PENDING.md
Co-Authored-By: Marko <marbar3778@yahoo.com>
I'm setting up all peers dynamically by calling dial_peers, so p2p.seeds in configs is empty, and I'm seeing error log a lot in logs.
* p2p: fix false-positive error logging when stopping connections
This changeset fixes two types of false-positive errors occurring during
connection shutdown.
The first occurs when the process invokes FlushStop() or Stop() on a
connection. While the previous behavior did properly wait for the sendRoutine
to finish, it did not notify the recvRoutine that the connection was shutting
down. This would cause the recvRouting to receive and error when reading and
log this error. The changeset fixes this by notifying the recvRoutine that
the connection is shutting down.
The second occurs when the connection is terminated (gracefully) by the other side.
The recvRoutine would get an EOF error during the read, log it, and stop the connection
with an error. The changeset detects EOF and gracefully shuts down the connection.
* bring back the comment about flushing
* add changelog entry
* listen for quitRecvRoutine too
* we have to call stopForError
Otherwise peer won't be removed from the peer set and maybe readded
later.
cleanup to add linter
grpc change:
https://godoc.org/google.golang.org/grpc#WithContextDialerhttps://godoc.org/google.golang.org/grpc#WithDialer
grpc/grpc-go#2627
prometheous change:
due to UninstrumentedHandler, being deprecated in the future
empty branch = empty if or else statement
didn't delete them entirely but commented
couldn't find a reason to have them
could not replicate the issue #3406
but if want to keep it commented then we should comment out the if statement as well
* Renamed wire.go to codec.go
- Wire was the previous name of amino
- Codec describes the file better than `wire` & `amino`
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* ide error
* rename amino.go to codec.go
* Remove db from tendemrint in favor of tendermint/tm-cmn
- remove db from `libs`
- update dependancy, there have been no breaking changes in the updated deps
- https://github.com/grpc/grpc-go/releases
- https://github.com/golang/protobuf/releases
Signed-off-by: Marko Baricevic <marbar3778@yahoo.com>
* changelog add
* gofmt
* more gofmt
* p2p: extract ID validation into a separate func
- NewNetAddress panics if ID is invalid
- NetAddress#Valid returns an error
- remove ErrAddrBookInvalidAddrNoID
Fixes#2722
* p2p: remove repetitive check in ReceiveAddrs
* fix netaddress test
* change invocation of NewNode across
* custom reactor name are prefixed with CUSTOM_
* upgate changelog pending
* improve comments
* node: refactor NewNode to use functional options
Calling ensurePeers outside of ensurePeersRoutine can lead to nodes
disconnecting from us due to "sent next PEX request too soon" error.
Solution is to just dial addrs we got from src instead of calling
ensurePeers.
Refs #2093Fixes#3338
* Move peer behaviour into it's own package
* refactor wip
* Adjust API and fix tests
* remove unused test struct
* Better error message
* Restructure:
+ Now behaviour is it's own package, we don't need to include
PeerBehaviour in every type.
+ Split up behaviours and reporters into seperate files
* doc string fixes
* Fix minor typos
* Update behaviour/reporter.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update behaviour/reporter.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
Fixes#3521
The function NewNetAddressStringWithOptionalID is from a time when peer
IDs were optional. They're not anymore. So this should be renamed to
NewNetAddressString and should ensure the ID is provided.
* update changelog
* use NewNetAddress in transport tests
* use NewNetAddress in TestTransportMultiplexAcceptMultiple
* Peer behaviour test tweaks:
Address the remaining test issues mentioned in ##3552 notably:
+ switch to p2p_test package
+ Use `Error` instead of `Errorf` when not using formatting
+ Add expected/got errors to
`TestMockPeerBehaviourReporterConcurrency` test
* Peer behaviour equal behaviours test
+ slices of PeerBehaviours should be compared as histograms to
ensure they have the same set of PeerBehaviours at the same
freequncy.
* TestEqualPeerBehaviours:
+ Add tests for the equivalence between sets of PeerBehaviours
* fix peer state init to late
Peer does not have a state yet. We set it in AddPeer.
We need an new interface before mconnection is started
* pex message to soon
fix reconnection pex send too fast,
error is caused lastReceivedRequests is still
not deleted when a peer reconnected
* add test case for initpeer
* add prove case
* remove potentially infinite loop
* Update consensus/reactor.go
Co-Authored-By: guagualvcha <baifudong@lancai.cn>
* Update consensus/reactor_test.go
Co-Authored-By: guagualvcha <baifudong@lancai.cn>
* document Reactor interface better
* refactor TestReactorReceiveDoesNotPanicIfAddPeerHasntBeenCalledYet
* fix merge conflicts
* blockchain: remove peer's ID from the pool in InitPeer
Refs #3338
* pex: resetPeersRequestsInfo both upon InitPeer and RemovePeer
* ensure RemovePeer is always called before InitPeer
by removing the peer from the switch last (after we've stopped it and
removed from all reactors)
* add some comments for ConsensusReactor#InitPeer
* fix pex reactor
* format code
* fix spelling
* update changelog
* remove unused methods
* do not clear lastReceivedRequests upon error
only in RemovePeer
* call InitPeer before we start the peer!
* add a comment to InitPeer
* write a test
* use waitUntilSwitchHasAtLeastNPeers func
* bring back timeouts
* Test to ensure Receive panics if InitPeer has not been called
* p2p/pex: consult seeds in crawlPeersRoutine
This changeset alters the startup behavior for crawlPeersRoutine. Previously
the routine would crawl a random selection of peers on startup. For a
new seed node, there are no peers. As a result, new seed nodes are unable
to bootstrap themselves with a list of peers until another node with a list
of peers connects to the seed. If this node relies on the seed node for peers,
then the two will not discover more peers.
This changeset makes the startup behavior for crawlPeersRoutine connect to
any seed nodes. Upon connecting, a request for peers will be sent to the seed node
thus helping bootstrap our seed node.
* p2p/pex: Adjust error message for no peers
Co-Authored-By: Ethan Buchman <ethan@coinculture.info>
* p2p: initial implementation of peer behaviour
* [p2p] re-use newMockPeer
* p2p: add inline docs for peer_behaviour interface
* [p2p] Align PeerBehaviour interface (#3558)
* [p2p] make switchedPeerHebaviour private
* [p2p] make storePeerHebaviour private
* [p2p] Add CHANGELOG_PENDING entry for PeerBehaviour
* [p2p] Adjustment naming for PeerBehaviour
* [p2p] Add coarse lock around storedPeerBehaviour
* [p2p]: Fix non-pointer methods in storedPeerBehaviour
+ Structs with embeded locks must specify all methods with pointer
receivers to avoid creating a copy of the embeded lock.
* [p2p] Thorough refactoring based on comments in #3552
+ Decouple PeerBehaviour interface from Peer by parametrizing
methods with `p2p.ID` instead of `p2p.Peer`
+ Setter methods wrapped in a write lock
+ Getter methods wrapped in a read lock
+ Getter methods on storedPeerBehaviour now take a `p2p.ID`
+ Getter methods return a copy of underlying stored behaviours
+ Add doc strings to public types and methods
* [p2p] make structs public
* [p2p] Test empty StoredPeerBehaviour
* [p2p] typo fix
* [p2p] add TestStoredPeerBehaviourConcurrency
+ Add a test which uses StoredPeerBehaviour in multiple goroutines
to ensure thread-safety.
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour_test.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* Update p2p/peer_behaviour.go
Co-Authored-By: brapse <brapse@gmail.com>
* [p2p] field ordering convention
* p2p: peer behaviour refactor
+ Change naming of reporting behaviour to `Report`
+ Remove the responsibility of distinguishing between the categories
of good and bad behaviour and instead focus on reporting behaviour.
* p2p: rename PeerReporter -> PeerBehaviourReporter
## Description
also
handle errors from DialPeersAsync
remove nil addr from log msg
fix TestPEXReactorDoesNotDisconnectFromPersistentPeerInSeedMode
This is a follow-up from
#3593 (review)
Fixes most of the #3617, except #3593 (comment)
## Commits
* rpc: /dial_peers: only mark peers as persistent if flag is on
also
- handle errors from DialPeersAsync
- remove nil addr from log msg
- fix TestPEXReactorDoesNotDisconnectFromPersistentPeerInSeedMode
This is a follow-up from
https://github.com/tendermint/tendermint/pull/3593#pullrequestreview-233556909
* remove a call to AddPersistentPeers
TestDialFail will trigger a reconnect
## Description
Previously only outbound peers can be persistent.
Now, even if the peer is inbound, if it's marked as persistent, when/if conn is lost,
Tendermint will try to reconnect. This part is actually optional and can be reverted.
Plus, seed won't disconnect from inbound peer if it's marked as
persistent. Fixes#3362
## Commits
* make persistent prop independent of conn direction
Previously only outbound peers can be persistent. Now, even if the peer
is inbound, if it's marked as persistent, when/if conn is lost,
Tendermint will try to reconnect.
Plus, seed won't disconnect from inbound peer if it's marked as
persistent. Fixes#3362
* fix TestPEXReactorDialPeer test
* add a changelog entry
* update changelog
* add two tests
* reformat code
* test UnsafeDialPeers and UnsafeDialSeeds
* add TestSwitchDialPeersAsync
* spec: update p2p/config spec
* fixes after Ismail's review
* Apply suggestions from code review
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* fix merge conflict
* remove sleep from TestPEXReactorDoesNotDisconnectFromPersistentPeerInSeedMode
We don't need it actually.
If we are low on addresses for peering, we need to discover more peers. The
previous behavior would query existing peers; however, if an existing peer
does not participate in peer exchange, then our node will not discover more peers.
This change consults both existing peers as well as seeds when there is a deficit
in address book addresses. This allows for discovering peers though existing channels
as well as via seeds if existing peers do not share addresses.
* Remove deprecated PanicXXX functions from codebase
As per discussion over
[here](https://github.com/tendermint/tendermint/pull/3456#discussion_r278423492),
we need to remove these `PanicXXX` functions and eliminate our
dependence on them. In this PR, each and every `PanicXXX` function call
is replaced with a simple `panic` call.
* add a changelog entry
* use dialPeer function in a seed mode
Fixes#3532
by storing a number of attempts we've tried to connect in-memory and
removing the address from addrbook when number of attempts > 16
A prior change to address accidental DNS lookups introduced the
SocketAddr on peer, which was then used to add it to the addressbook.
Which in turn swallowed the self reported port of the peer, which is
important on a reconnect. This change revives the NetAddress on NodeInfo
which the Peer carries, but now returns an error to avoid nil
dereferencing another issue observed in the past. Additionally we could
potentially address #3532, yet the original problem statemenf of that
issue stands.
As a drive-by optimisation `MarkAsGood` now takes only a `p2p.ID` which
makes it interface a bit stricter and leaner.
* add actionable advice for ErrAddrBookNonRoutable err
Should replace https://github.com/tendermint/tendermint/pull/3463
* reorder checks in addrbook#addAddress so
ErrAddrBookPrivate is returned first
and do not log error in DialPeersAsync if the address is private
because it's not an error
ListOfKnownAddresses is removed
panic if addrbook size is less than zero
CrawlPeers does not attempt to connect to existing or peers we're currently dialing
various perf. fixes
improved tests (though not complete)
move IsDialingOrExistingAddress check into DialPeerWithAddress (Fixes#2716)
* addrbook: preallocate memory when saving addrbook to file
* addrbook: remove oldestFirst struct and check for ID
* oldestFirst replaced with sort.Slice
* ID is now mandatory, so no need to check
* addrbook: remove ListOfKnownAddresses
GetSelection is used instead in seed mode.
* addrbook: panic if size is less than 0
* rewrite addrbook#saveToFile to not use a counter
* test AttemptDisconnects func
* move IsDialingOrExistingAddress check into DialPeerWithAddress
* save and cleanup crawl peer data
* get rid of DefaultSeedDisconnectWaitPeriod
* make linter happy
* fix TestPEXReactorSeedMode
* fix comment
* add a changelog entry
* Apply suggestions from code review
Co-Authored-By: melekes <anton.kalyaev@gmail.com>
* rename ErrDialingOrExistingAddress to ErrCurrentlyDialingOrExistingAddress
* lowercase errors
* do not persist seed data
pros:
- no extra files
- less IO
cons:
- if the node crashes, seed might crawl a peer too soon
* fixes after Ethan's review
* add a changelog entry
* we should only consult Switch about peers
checking addrbook size does not make sense since only PEX reactor uses
it for dialing peers!
https://github.com/tendermint/tendermint/pull/3011#discussion_r270948875
* OriginalAddr -> SocketAddr
OriginalAddr records the originally dialed address for outbound peers,
rather than the peer's self reported address. For inbound peers, it was
nil. Here, we rename it to SocketAddr and for inbound peers, set it to
the RemoteAddr of the connection.
* use SocketAddr
Numerous places in the code call peer.NodeInfo().NetAddress().
However, this call to NetAddress() may perform a DNS lookup if the
reported NodeInfo.ListenAddr includes a name. Failure of this lookup
returns a nil address, which can lead to panics in the code.
Instead, call peer.SocketAddr() to return the static address of the
connection.
* remove nodeInfo.NetAddress()
Expose `transport.NetAddress()`, a static result determined
when the transport is created. Removing NetAddress() from the nodeInfo
prevents accidental DNS lookups.
* fixes from review
* linter
* fixes from review