From 1acb12edf5e8aae084c0f6e9b257d7a2f9dbfe13 Mon Sep 17 00:00:00 2001
From: Ethan Buchman
Date: Sun, 31 Dec 2017 17:07:08 -0500
Subject: [PATCH] p2p docs

---
 p2p/README.md          | 119 ++---------------------------------------
 p2p/docs/connection.md | 116 +++++++++++++++++++++++++++++++++++++++
 p2p/docs/node.md       |  53 ++++++++++++++++++
 p2p/docs/peer.md       | 105 ++++++++++++++++++++++++++++++++++++
 p2p/docs/reputation.md |  23 ++++++++
 5 files changed, 302 insertions(+), 114 deletions(-)
 create mode 100644 p2p/docs/connection.md
 create mode 100644 p2p/docs/node.md
 create mode 100644 p2p/docs/peer.md
 create mode 100644 p2p/docs/reputation.md

diff --git a/p2p/README.md b/p2p/README.md
index d653b2caf..5d1f984cb 100644
--- a/p2p/README.md
+++ b/p2p/README.md
@@ -4,119 +4,10 @@

 `tendermint/tendermint/p2p` provides an abstraction around peer-to-peer communication.
-## MConnection
+See:

-`MConnection` is a multiplex connection:
+- [docs/connection] for details on how connections and multiplexing work
+- [docs/peer] for details on peer ID, handshakes, and peer exchange
+- [docs/node] for details about different types of nodes and how they should work
+- [docs/reputation] for details on how peer reputation is managed

-__multiplex__ *noun* a system or signal involving simultaneous transmission of
-several messages along a single channel of communication.
-
-Each `MConnection` handles message transmission on multiple abstract communication
-`Channel`s. Each channel has a globally unique byte id.
-The byte id and the relative priorities of each `Channel` are configured upon
-initialization of the connection.
-
-The `MConnection` supports three packet types: Ping, Pong, and Msg.
-
-### Ping and Pong
-
-The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively
-
-When we haven't received any messages on an `MConnection` in a time `pingTimeout`, we send a ping message.
-When a ping is received on the `MConnection`, a pong is sent in response.
-
-If a pong is not received in sufficient time, the peer's score should be decremented (TODO).
-
-### Msg
-
-Messages in channels are chopped into smaller msgPackets for multiplexing.
-
-```
-type msgPacket struct {
-    ChannelID byte
-    EOF       byte // 1 means message ends here.
-    Bytes     []byte
-}
-```
-
-The msgPacket is serialized using go-wire, and prefixed with a 0x3.
-The received `Bytes` of a sequential set of packets are appended together
-until a packet with `EOF=1` is received, at which point the complete serialized message
-is returned for processing by the corresponding channels `onReceive` function.
-
-### Multiplexing
-
-Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
-of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
-Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
-Messages are chosen for a batch one a time from the channel with the lowest ratio of recently sent bytes to channel priority.
-
-## Sending Messages
-
-There are two methods for sending messages:
-```go
-func (m MConnection) Send(chID byte, msg interface{}) bool {}
-func (m MConnection) TrySend(chID byte, msg interface{}) bool {}
-```
-
-`Send(chID, msg)` is a blocking call that waits until `msg` is successfully queued
-for the channel with the given id byte `chID`. The message `msg` is serialized
-using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.
-
-`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
-queue is full.
-
-`Send()` and `TrySend()` are also exposed for each `Peer`.
-
-## Peer
-
-Each peer has one `MConnection` instance, and includes other information such as whether the connection
-was outbound, whether the connection should be recreated if it closes, various identity information about the node,
-and other higher level thread-safe data used by the reactors.
-
-## Switch/Reactor
-
-The `Switch` handles peer connections and exposes an API to receive incoming messages
-on `Reactors`. Each `Reactor` is responsible for handling incoming messages of one
-or more `Channels`. So while sending outgoing messages is typically performed on the peer,
-incoming messages are received on the reactor.
-
-```go
-// Declare a MyReactor reactor that handles messages on MyChannelID.
-type MyReactor struct{}
-
-func (reactor MyReactor) GetChannels() []*ChannelDescriptor {
-    return []*ChannelDescriptor{ChannelDescriptor{ID:MyChannelID, Priority: 1}}
-}
-
-func (reactor MyReactor) Receive(chID byte, peer *Peer, msgBytes []byte) {
-    r, n, err := bytes.NewBuffer(msgBytes), new(int64), new(error)
-    msgString := ReadString(r, n, err)
-    fmt.Println(msgString)
-}
-
-// Other Reactor methods omitted for brevity
-...
-
-switch := NewSwitch([]Reactor{MyReactor{}})
-
-...
-
-// Send a random message to all outbound connections
-for _, peer := range switch.Peers().List() {
-    if peer.IsOutbound() {
-        peer.Send(MyChannelID, "Here's a random message")
-    }
-}
-```
-
-### PexReactor/AddrBook
-
-A `PEXReactor` reactor implementation is provided to automate peer discovery.
-
-```go
-book := p2p.NewAddrBook(addrBookFilePath)
-pexReactor := p2p.NewPEXReactor(book)
-...
-switch := NewSwitch([]Reactor{pexReactor, myReactor, ...})
-```
diff --git a/p2p/docs/connection.md b/p2p/docs/connection.md
new file mode 100644
index 000000000..72847fa11
--- /dev/null
+++ b/p2p/docs/connection.md
@@ -0,0 +1,116 @@
+## MConnection
+
+`MConnection` is a multiplex connection:
+
+__multiplex__ *noun* a system or signal involving simultaneous transmission of
+several messages along a single channel of communication.
+
+Each `MConnection` handles message transmission on multiple abstract communication
+`Channel`s. Each channel has a globally unique byte id.
+The byte id and the relative priorities of each `Channel` are configured upon
+initialization of the connection.
+
+The `MConnection` supports three packet types: Ping, Pong, and Msg.
+
+### Ping and Pong
+
+The ping and pong messages consist of writing a single byte to the connection; 0x1 and 0x2, respectively.
+
+When we haven't received any messages on an `MConnection` in a time `pingTimeout`, we send a ping message.
+When a ping is received on the `MConnection`, a pong is sent in response.
+
+If a pong is not received in sufficient time, the peer's score should be decremented (TODO).
+
+### Msg
+
+Messages in channels are chopped into smaller msgPackets for multiplexing.
+
+```
+type msgPacket struct {
+    ChannelID byte
+    EOF       byte // 1 means message ends here.
+    Bytes     []byte
+}
+```
+
+The msgPacket is serialized using go-wire, and prefixed with a 0x3.
+The received `Bytes` of a sequential set of packets are appended together
+until a packet with `EOF=1` is received, at which point the complete serialized message
+is returned for processing by the corresponding channel's `onReceive` function.
+
+### Multiplexing
+
+Messages are sent from a single `sendRoutine`, which loops over a select statement that results in the sending
+of a ping, a pong, or a batch of data messages. The batch of data messages may include messages from multiple channels.
+Message bytes are queued for sending in their respective channel, with each channel holding one unsent message at a time.
+Messages are chosen for a batch one at a time from the channel with the lowest ratio of recently sent bytes to channel priority.
+
+## Sending Messages
+
+There are two methods for sending messages:
+```go
+func (m MConnection) Send(chID byte, msg interface{}) bool {}
+func (m MConnection) TrySend(chID byte, msg interface{}) bool {}
+```
+
+`Send(chID, msg)` is a blocking call that waits until `msg` is successfully queued
+for the channel with the given id byte `chID`. The message `msg` is serialized
+using the `tendermint/wire` submodule's `WriteBinary()` reflection routine.
+
+`TrySend(chID, msg)` is a nonblocking call that returns false if the channel's
+queue is full.
+
+`Send()` and `TrySend()` are also exposed for each `Peer`.
+
+## Peer
+
+Each peer has one `MConnection` instance, and includes other information such as whether the connection
+was outbound, whether the connection should be recreated if it closes, various identity information about the node,
+and other higher level thread-safe data used by the reactors.
+
+## Switch/Reactor
+
+The `Switch` handles peer connections and exposes an API to receive incoming messages
+on `Reactors`. Each `Reactor` is responsible for handling incoming messages of one
+or more `Channels`. So while sending outgoing messages is typically performed on the peer,
+incoming messages are received on the reactor.
+
+```go
+// Declare a MyReactor reactor that handles messages on MyChannelID.
+type MyReactor struct{}
+
+func (reactor MyReactor) GetChannels() []*ChannelDescriptor {
+    return []*ChannelDescriptor{&ChannelDescriptor{ID: MyChannelID, Priority: 1}}
+}
+
+func (reactor MyReactor) Receive(chID byte, peer *Peer, msgBytes []byte) {
+    r, n, err := bytes.NewBuffer(msgBytes), new(int64), new(error)
+    msgString := ReadString(r, n, err)
+    fmt.Println(msgString)
+}
+
+// Other Reactor methods omitted for brevity
+...
+
+sw := NewSwitch([]Reactor{MyReactor{}})
+
+...
+
+// Send a random message to all outbound connections
+for _, peer := range sw.Peers().List() {
+    if peer.IsOutbound() {
+        peer.Send(MyChannelID, "Here's a random message")
+    }
+}
+```
+
+### PexReactor/AddrBook
+
+A `PEXReactor` reactor implementation is provided to automate peer discovery.
+
+```go
+book := p2p.NewAddrBook(addrBookFilePath)
+pexReactor := p2p.NewPEXReactor(book)
+...
+sw := NewSwitch([]Reactor{pexReactor, myReactor, ...})
+```
diff --git a/p2p/docs/node.md b/p2p/docs/node.md
new file mode 100644
index 000000000..a8afc85ce
--- /dev/null
+++ b/p2p/docs/node.md
@@ -0,0 +1,53 @@
+# Tendermint Peer Discovery
+
+A Tendermint P2P network has different kinds of nodes with different requirements for connectivity to others.
+This document describes what kinds of nodes Tendermint should enable and how they should work.
+
+## Node startup options
+--p2p.seed_mode // If present, this node operates in seed mode. It will kick incoming peers after sharing some peers.
+--p2p.seeds "1.2.3.4:46656,2.3.4.5:4444" // Dials these seeds to get peers and disconnects.
+--p2p.persistent_peers "1.2.3.4:46656,2.3.4.5:46656" // These connections will be auto-redialed. If the seeds and persistent peers intersect, the user will be warned that seeds may auto-close connections and the node may not be able to keep the connection persistent.
+
+## Seeds
+
+Seeds are the first point of contact for a new node.
+They return a list of known active peers and disconnect.
+
+Seeds should operate full nodes with the PEX reactor in a "crawler" mode
+that continuously explores to validate the availability of peers.
+
+Seeds should only respond with some top percentile of the best peers they know about.
+
+## New Full Node
+
+A new node has seeds hardcoded into the software, but they can also be set manually (config file or flags).
+The new node must also have access to a recent block height, H, and hash, HASH.
+
+The node then queries some seeds for peers for its chain,
+dials those peers, and runs the Tendermint protocols with those it successfully connects to.
+
+When the node catches up to height H, it ensures the block hash matches HASH.
+
+## Restarted Full Node
+
+A node checks its address book on startup and attempts to connect to peers from there.
+If it can't connect to any peers after some time, it falls back to the seeds to find more.
+
+## Validator Node
+
+A validator node is a node that interfaces with a validator signing key.
+These nodes require the highest security, and should not accept incoming connections.
+They should maintain outgoing connections to a controlled set of "Sentry Nodes" that serve
+as their proxy shield to the rest of the network.
+
+Validators that know and trust each other can accept incoming connections from one another and maintain direct private connectivity via VPN.
+
+## Sentry Node
+
+Sentry nodes are guardians of a validator node and provide it access to the rest of the network.
+Sentry nodes may be dynamic, but should maintain persistent connections to some evolving random subset of each other.
+They should always expect to have direct incoming connections from the validator node and its backup(s).
+They do not report the validator node's address in the PEX.
+They may be more strict about the quality of peers they keep.
+
+Sentry nodes belonging to validators that trust each other may wish to maintain persistent connections via VPN with one another, but only report each other sparingly in the PEX.
diff --git a/p2p/docs/peer.md b/p2p/docs/peer.md
new file mode 100644
index 000000000..15870ea7d
--- /dev/null
+++ b/p2p/docs/peer.md
@@ -0,0 +1,105 @@
+# Tendermint Peers
+
+This document explains how Tendermint Peers are identified, how they connect to one another,
+and how other peers are found.
+
+## Peer Identity
+
+Tendermint peers are expected to maintain long-term persistent identities in the form of a private key.
+Each peer has an ID defined as `peer.ID == peer.PrivKey.Address()`, where `Address` uses the scheme defined in go-crypto.
+
+Peer IDs must come with some Proof-of-Work; that is,
+they must satisfy `peer.PrivKey.Address() < target` for some difficulty target.
+This ensures they are not too easy to generate.
+
+A single peer ID can have multiple IP addresses associated with it; for simplicity, we only keep track
+of the latest one.
+
+When attempting to connect to a peer, we use the PeerURL: `<ID>@<IP>:<PORT>`.
+We will attempt to connect to the peer at IP:PORT, and verify,
+via authenticated encryption, that it is in possession of the private key
+corresponding to `<ID>`. This prevents man-in-the-middle attacks on the peer layer.
+
+Peers can also be connected to without specifying an ID, i.e. `<IP>:<PORT>`.
+In this case, the peer cannot be authenticated, and other means, such as a VPN,
+must be used.
+
+## Connections
+
+All p2p connections use TCP.
+Upon establishing a successful TCP connection with a peer,
+two handshakes are performed: one for authenticated encryption, and one for Tendermint versioning.
+Both handshakes have configurable timeouts (they should complete quickly).
+
+### Authenticated Encryption Handshake
+
+Tendermint implements the Station-to-Station protocol
+using ED25519 keys for Diffie-Hellman key-exchange and NACL SecretBox for encryption.
+It goes as follows:
+- generate an ephemeral ED25519 keypair
+- send the ephemeral public key to the peer
+- wait to receive the peer's ephemeral public key
+- compute the Diffie-Hellman shared secret using the peer's ephemeral public key and our ephemeral private key
+- generate nonces to use for encryption
+    - TODO
+- all communications from now on are encrypted using the shared secret
+- generate a common challenge to sign
+- sign the common challenge with our persistent private key
+- send the signed challenge and persistent public key to the peer
+- wait to receive the signed challenge and persistent public key from the peer
+- verify the signature in the signed challenge using the peer's persistent public key
+
+
+If this is an outgoing connection (we dialed the peer) and we used a peer ID,
+then finally verify that the `peer.PubKey` corresponds to the peer ID we dialed,
+i.e. `peer.PubKey.Address() == <ID>`.
+
+The connection has now been authenticated. All traffic is encrypted.
+
+Note that only the dialer can authenticate the identity of the peer,
+but this is what we care about since when we join the network we wish to
+ensure we have reached the intended peer (and are not being MITM'd).
+
+
+### Peer Filter
+
+Before continuing, we check if the new peer has the same ID as ourselves or
+an existing peer. If so, we disconnect.
+
+We also check the peer's address and public key against
+an optional whitelist which can be managed through the ABCI app -
+if the whitelist is enabled and the peer is not on it, the connection is
+terminated.
+
+
+### Tendermint Version Handshake
+
+The Tendermint Version Handshake allows the peers to exchange their NodeInfo, which contains:
+
+```
+type NodeInfo struct {
+    PubKey     crypto.PubKey `json:"pub_key"`
+    Moniker    string        `json:"moniker"`
+    Network    string        `json:"network"`
+    RemoteAddr string        `json:"remote_addr"`
+    ListenAddr string        `json:"listen_addr"` // accepting in
+    Version    string        `json:"version"`     // major.minor.revision
+    Channels   []int8        `json:"channels"`    // active reactor channels
+    Other      []string      `json:"other"`       // other application specific data
+}
+```
+
+The connection is disconnected if:
+- `peer.NodeInfo.PubKey != peer.PubKey`
+- `peer.NodeInfo.Version` is not formatted as `X.X.X` where X are integers known as Major, Minor, and Revision
+- `peer.NodeInfo.Version` Major is not the same as ours
+- `peer.NodeInfo.Version` Minor is not the same as ours
+- `peer.NodeInfo.Network` is not the same as ours
+
+
+At this point, if we have not disconnected, the peer is valid and added to the switch,
+so it is added to all reactors.
+
+
+### Connection Activity
+
diff --git a/p2p/docs/reputation.md b/p2p/docs/reputation.md
new file mode 100644
index 000000000..a2a995e5d
--- /dev/null
+++ b/p2p/docs/reputation.md
@@ -0,0 +1,23 @@
+
+# Peer Strategy
+
+Peers are managed using an address book and a trust metric.
+The book keeps a record of vetted peers and unvetted peers.
+When we need more peers, we pick them randomly from the addrbook with some
+configurable bias for unvetted peers. When we’re asked for peers, we provide a random selection with no bias.
+
+The trust metric tracks the quality of the peers.
+When a peer exceeds a certain quality for a certain amount of time,
+it is marked as vetted in the addrbook.
+If a vetted peer's quality degrades sufficiently, it is booted, and must prove itself from scratch.
+If we need to make room for a new vetted peer, we move the lowest scoring vetted peer back to unvetted.
+If we need to make room for a new unvetted peer, we remove the lowest scoring unvetted peer -
+possibly only if it's below some absolute minimum?
+
+Peer quality is tracked in the connection and across the reactors.
+Behaviours are defined as one of:
+  - fatal - something outright malicious. We should disconnect and remember them.
+  - bad - any kind of timeout, msgs that don't unmarshal or fail other validity checks, or msgs we didn't ask for or aren't expecting
+  - neutral - normal correct behaviour. Unknown channels/msg types (version upgrades).
+  - good - some random majority of peers per reactor sending us useful messages
+