## Description

- separate docs related to running nodes into the nodes dir.
- keep old files but don't display them
- bring over debugging like a pro blog

Closes: #XXX
pull/5708/head
@ -0,0 +1,43 @@ | |||
--- | |||
order: 1 | |||
parent: | |||
title: Nodes | |||
order: 4 | |||
--- | |||
This section will focus on how to operate full nodes, validators and light clients. | |||
- [Node Types](#node-types) | |||
- [Configuration](./configuration.md) | |||
- [Configure State sync](./state_sync.md) | |||
- [Validator Guides](./validators.md) | |||
- [How to secure your keys](./validators.md#validator_keys) | |||
- [Light Client guides](./light-client.md) | |||
- [How to sync a light client](./light-client.md#) | |||
- [Metrics](./metrics.md) | |||
## Node Types | |||
This section covers the types of nodes found in a Tendermint network.
### Full Node | |||
A full node is a node that participates in the network but does not help secure it. Full nodes can be used to store the entire state of a blockchain. Tendermint has two forms of state. The first is blockchain state, which represents the blocks of a blockchain. The second is application state, which represents the state that transactions modify. Tendermint itself does not know how a transaction modifies state; that knowledge is held by the application on the other side of the ABCI boundary.
> Note: If you have not read about the separation of consensus and application, please take a few minutes to read up on it, as it will make many of the terms used throughout the documentation easier to understand. You can find more information on the ABCI [here](../app-dev/app-architecture.md).
As a full node operator you are providing services to the network that help it come to consensus and help others catch up to the current block. Even though a full node only helps the network come to consensus, it is important to secure your node from adversarial actors. We recommend using a firewall and a proxy if possible. Running a full node can be easy, but it varies from network to network. Check your application's documentation prior to running a node.
### Seed Nodes | |||
A seed node provides a node with a list of peers which it can connect to. When starting a node you must provide at least one type of node to be able to connect to the desired network. By providing a seed node you will be able to populate your address book quickly. A seed node will not be kept as a peer but will disconnect from your node after it has provided a list of peers.
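For illustration, a node joining a network through a seed node might set `seeds` in the `[p2p]` section of its `config.toml`. This is only a sketch: the ID and host below are hypothetical placeholders, and the `ID@host:port` address form is assumed here. Check your network's documentation for real seed addresses.

```toml
[p2p]
# Hypothetical seed node (placeholder ID and host, not a real peer).
seeds = "<seed-node-id>@<seed-host>:26656"
```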
### Sentry Node | |||
A sentry node is similar to a full node in almost every way. The difference is a sentry node will have one or more private peers. These peers may be validators or other full nodes in the network. A sentry node is meant to provide a layer of security for your validator, similar to how a firewall works with a computer. | |||
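As a rough sketch of what this can look like, a sentry node's `config.toml` might set the following `[p2p]` options. The node ID and address are hypothetical placeholders for your own validator, and the exact combination of options is an assumption that should be checked against your network's recommendations.

```toml
[p2p]
# Sentry nodes talk to the public network, so peer exchange stays on.
pex = true

# Hypothetical validator node ID and private address (placeholders).
persistent_peers = "<validator-node-id>@<validator-private-ip>:26656"

# Do not gossip the validator's ID to other peers.
private_peer_ids = "<validator-node-id>"

# Keep the validator connected even if the peer limits are reached.
unconditional_peer_ids = "<validator-node-id>"
```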
### Validators | |||
Validators are nodes that participate in the security of a network. Validators have an associated power in Tendermint. This power can represent stake in a [proof of stake](https://en.wikipedia.org/wiki/Proof_of_stake) system, reputation in [proof of authority](https://en.wikipedia.org/wiki/Proof_of_authority), or any other measurable unit. Running a secure and consistently online validator is crucial to a network's health. A validator must be secure and fault tolerant, so it is recommended to run your validator with 2 or more sentry nodes.
As a validator there is the potential to have your weight reduced; how this happens is defined by the application. The application notifies Tendermint if a validator should have its weight increased or reduced. Applications define different types of malicious behavior which lead to slashing of the validator's power. Please check the documentation of the application you will be running for more information.
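A minimal sketch of the validator side of such a setup, assuming two sentry nodes that you operate yourself. The IDs and addresses are hypothetical placeholders, and this is one possible arrangement rather than a required one.

```toml
[p2p]
# The validator should not gossip with unknown peers.
pex = false

# Only connect to your own sentry nodes (placeholder IDs and addresses).
persistent_peers = "<sentry-1-node-id>@<sentry-1-ip>:26656,<sentry-2-node-id>@<sentry-2-ip>:26656"
```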
@ -0,0 +1,492 @@ | |||
--- | |||
order: 3 | |||
--- | |||
# Configuration | |||
Tendermint Core can be configured via a TOML file in | |||
`$TMHOME/config/config.toml`. Some of these parameters can be overridden by | |||
command-line flags. For most users, the options in the `##### main base configuration options #####` section are intended to be modified, while config options
further below are intended for advanced power users.
## Options | |||
The default configuration file created by `tendermint init` has all
the parameters set to their default values. It will look something
like the file below; however, double check by inspecting the
`config.toml` created by your installed version of `tendermint`:
```toml | |||
# This is a TOML config file. | |||
# For more information, see https://github.com/toml-lang/toml | |||
# NOTE: Any path below can be absolute (e.g. "/var/myawesomeapp/data") or | |||
# relative to the home directory (e.g. "data"). The home directory is | |||
# "$HOME/.tendermint" by default, but could be changed via $TMHOME env variable | |||
# or --home cmd flag. | |||
####################################################################### | |||
### Main Base Config Options ### | |||
####################################################################### | |||
# TCP or UNIX socket address of the ABCI application, | |||
# or the name of an ABCI application compiled in with the Tendermint binary | |||
proxy_app = "tcp://127.0.0.1:26658" | |||
# A custom human readable name for this node | |||
moniker = "anonymous" | |||
# If this node is many blocks behind the tip of the chain, FastSync | |||
# allows them to catchup quickly by downloading blocks in parallel | |||
# and verifying their commits | |||
fast_sync = true | |||
# Database backend: goleveldb | cleveldb | boltdb | rocksdb | badgerdb | |||
# * goleveldb (github.com/syndtr/goleveldb - most popular implementation) | |||
# - pure go | |||
# - stable | |||
# * cleveldb (uses levigo wrapper) | |||
# - fast | |||
# - requires gcc | |||
# - use cleveldb build tag (go build -tags cleveldb) | |||
# * boltdb (uses etcd's fork of bolt - github.com/etcd-io/bbolt) | |||
# - EXPERIMENTAL | |||
# - may be faster in some use-cases (random reads - indexer)
# - use boltdb build tag (go build -tags boltdb) | |||
# * rocksdb (uses github.com/tecbot/gorocksdb) | |||
# - EXPERIMENTAL | |||
# - requires gcc | |||
# - use rocksdb build tag (go build -tags rocksdb) | |||
# * badgerdb (uses github.com/dgraph-io/badger) | |||
# - EXPERIMENTAL | |||
# - use badgerdb build tag (go build -tags badgerdb) | |||
db_backend = "goleveldb" | |||
# Database directory | |||
db_dir = "data" | |||
# Output level for logging, including package level options | |||
log_level = "main:info,state:info,statesync:info,*:error" | |||
# Output format: 'plain' (colored text) or 'json' | |||
log_format = "plain" | |||
##### additional base config options ##### | |||
# Path to the JSON file containing the initial validator set and other meta data | |||
genesis_file = "config/genesis.json" | |||
# Path to the JSON file containing the private key to use as a validator in the consensus protocol | |||
priv_validator_key_file = "config/priv_validator_key.json" | |||
# Path to the JSON file containing the last sign state of a validator | |||
priv_validator_state_file = "data/priv_validator_state.json" | |||
# TCP or UNIX socket address for Tendermint to listen on for | |||
# connections from an external PrivValidator process | |||
priv_validator_laddr = "" | |||
# Path to the JSON file containing the private key to use for node authentication in the p2p protocol | |||
node_key_file = "config/node_key.json" | |||
# Mechanism to connect to the ABCI application: socket | grpc | |||
abci = "socket" | |||
# If true, query the ABCI app on connecting to a new peer | |||
# so the app can decide if we should keep the connection or not | |||
filter_peers = false | |||
####################################################################### | |||
### Advanced Configuration Options ### | |||
####################################################################### | |||
####################################################### | |||
### RPC Server Configuration Options ### | |||
####################################################### | |||
[rpc] | |||
# TCP or UNIX socket address for the RPC server to listen on | |||
laddr = "tcp://127.0.0.1:26657" | |||
# A list of origins a cross-domain request can be executed from | |||
# Default value '[]' disables cors support | |||
# Use '["*"]' to allow any origin | |||
cors_allowed_origins = [] | |||
# A list of methods the client is allowed to use with cross-domain requests | |||
cors_allowed_methods = ["HEAD", "GET", "POST", ] | |||
# A list of non simple headers the client is allowed to use with cross-domain requests | |||
cors_allowed_headers = ["Origin", "Accept", "Content-Type", "X-Requested-With", "X-Server-Time", ] | |||
# TCP or UNIX socket address for the gRPC server to listen on | |||
# NOTE: This server only supports /broadcast_tx_commit | |||
grpc_laddr = "" | |||
# Maximum number of simultaneous connections. | |||
# Does not include RPC (HTTP&WebSocket) connections. See max_open_connections | |||
# If you want to accept a larger number than the default, make sure | |||
# you increase your OS limits. | |||
# 0 - unlimited. | |||
# Should be < {ulimit -Sn} - {MaxNumInboundPeers} - {MaxNumOutboundPeers} - {N of wal, db and other open files} | |||
# 1024 - 40 - 10 - 50 = 924 = ~900 | |||
grpc_max_open_connections = 900 | |||
# Activate unsafe RPC commands like /dial_seeds and /unsafe_flush_mempool | |||
unsafe = false | |||
# Maximum number of simultaneous connections (including WebSocket). | |||
# Does not include gRPC connections. See grpc_max_open_connections | |||
# If you want to accept a larger number than the default, make sure | |||
# you increase your OS limits. | |||
# 0 - unlimited. | |||
# Should be < {ulimit -Sn} - {MaxNumInboundPeers} - {MaxNumOutboundPeers} - {N of wal, db and other open files} | |||
# 1024 - 40 - 10 - 50 = 924 = ~900 | |||
max_open_connections = 900 | |||
# Maximum number of unique clientIDs that can /subscribe | |||
# If you're using /broadcast_tx_commit, set to the estimated maximum number | |||
# of broadcast_tx_commit calls per block. | |||
max_subscription_clients = 100 | |||
# Maximum number of unique queries a given client can /subscribe to | |||
# If you're using GRPC (or Local RPC client) and /broadcast_tx_commit, set to | |||
# the estimated # maximum number of broadcast_tx_commit calls per block. | |||
max_subscriptions_per_client = 5 | |||
# How long to wait for a tx to be committed during /broadcast_tx_commit. | |||
# WARNING: Using a value larger than 10s will result in increasing the | |||
# global HTTP write timeout, which applies to all connections and endpoints. | |||
# See https://github.com/tendermint/tendermint/issues/3435 | |||
timeout_broadcast_tx_commit = "10s" | |||
# Maximum size of request body, in bytes | |||
max_body_bytes = 1000000 | |||
# Maximum size of request header, in bytes | |||
max_header_bytes = 1048576 | |||
# The path to a file containing certificate that is used to create the HTTPS server. | |||
# Might be either an absolute path or a path relative to tendermint's config directory.
# If the certificate is signed by a certificate authority, | |||
# the certFile should be the concatenation of the server's certificate, any intermediates, | |||
# and the CA's certificate. | |||
# NOTE: both tls_cert_file and tls_key_file must be present for Tendermint to create HTTPS server. | |||
# Otherwise, HTTP server is run. | |||
tls_cert_file = "" | |||
# The path to a file containing matching private key that is used to create the HTTPS server. | |||
# Might be either an absolute path or a path relative to tendermint's config directory.
# NOTE: both tls_cert_file and tls_key_file must be present for Tendermint to create HTTPS server. | |||
# Otherwise, HTTP server is run. | |||
tls_key_file = "" | |||
# pprof listen address (https://golang.org/pkg/net/http/pprof) | |||
pprof_laddr = "" | |||
####################################################### | |||
### P2P Configuration Options ### | |||
####################################################### | |||
[p2p] | |||
# Address to listen for incoming connections | |||
laddr = "tcp://0.0.0.0:26656" | |||
# Address to advertise to peers for them to dial | |||
# If empty, will use the same port as the laddr, | |||
# and will introspect on the listener or use UPnP | |||
# to figure out the address. | |||
external_address = "" | |||
# Comma separated list of seed nodes to connect to | |||
seeds = "" | |||
# Comma separated list of nodes to keep persistent connections to | |||
persistent_peers = "" | |||
# UPNP port forwarding | |||
upnp = false | |||
# Path to address book | |||
addr_book_file = "config/addrbook.json" | |||
# Set true for strict address routability rules | |||
# Set false for private or local networks | |||
addr_book_strict = true | |||
# Maximum number of inbound peers | |||
max_num_inbound_peers = 40 | |||
# Maximum number of outbound peers to connect to, excluding persistent peers | |||
max_num_outbound_peers = 10 | |||
# List of node IDs, to which a connection will be (re)established ignoring any existing limits | |||
unconditional_peer_ids = "" | |||
# Maximum pause when redialing a persistent peer (if zero, exponential backoff is used) | |||
persistent_peers_max_dial_period = "0s" | |||
# Time to wait before flushing messages out on the connection | |||
flush_throttle_timeout = "100ms" | |||
# Maximum size of a message packet payload, in bytes | |||
max_packet_msg_payload_size = 1024 | |||
# Rate at which packets can be sent, in bytes/second | |||
send_rate = 5120000 | |||
# Rate at which packets can be received, in bytes/second | |||
recv_rate = 5120000 | |||
# Set true to enable the peer-exchange reactor | |||
pex = true | |||
# Seed mode, in which node constantly crawls the network and looks for | |||
# peers. If another node asks it for addresses, it responds and disconnects. | |||
# | |||
# Does not work if the peer-exchange reactor is disabled. | |||
seed_mode = false | |||
# Comma separated list of peer IDs to keep private (will not be gossiped to other peers) | |||
private_peer_ids = "" | |||
# Toggle to disable guard against peers connecting from the same ip. | |||
allow_duplicate_ip = false | |||
# Peer connection configuration. | |||
handshake_timeout = "20s" | |||
dial_timeout = "3s" | |||
####################################################### | |||
### Mempool Configuration Options ###
####################################################### | |||
[mempool] | |||
recheck = true | |||
broadcast = true | |||
wal_dir = "" | |||
# Maximum number of transactions in the mempool | |||
size = 5000 | |||
# Limit the total size of all txs in the mempool. | |||
# This only accounts for raw transactions (e.g. given 1MB transactions and | |||
# max_txs_bytes=5MB, mempool will only accept 5 transactions). | |||
max_txs_bytes = 1073741824 | |||
# Size of the cache (used to filter transactions we saw earlier) in transactions | |||
cache_size = 10000 | |||
# Maximum size of a single transaction. | |||
# NOTE: the max size of a tx transmitted over the network is {max_tx_bytes}. | |||
max_tx_bytes = 1048576 | |||
# Maximum size of a batch of transactions to send to a peer | |||
# Including space needed by encoding (one varint per transaction). | |||
max_batch_bytes = 10485760 | |||
####################################################### | |||
### State Sync Configuration Options ### | |||
####################################################### | |||
[statesync] | |||
# State sync rapidly bootstraps a new node by discovering, fetching, and restoring a state machine | |||
# snapshot from peers instead of fetching and replaying historical blocks. Requires some peers in | |||
# the network to take and serve state machine snapshots. State sync is not attempted if the node | |||
# has any local state (LastBlockHeight > 0). The node will have a truncated block history, | |||
# starting from the height of the snapshot. | |||
enable = false | |||
# RPC servers (comma-separated) for light client verification of the synced state machine and | |||
# retrieval of state data for node bootstrapping. Also needs a trusted height and corresponding | |||
# header hash obtained from a trusted source, and a period during which validators can be trusted. | |||
# | |||
# For Cosmos SDK-based chains, trust_period should usually be about 2/3 of the unbonding time (~2 | |||
# weeks) during which they can be financially punished (slashed) for misbehavior. | |||
rpc_servers = "" | |||
trust_height = 0 | |||
trust_hash = "" | |||
trust_period = "168h0m0s" | |||
# Time to spend discovering snapshots before initiating a restore. | |||
discovery_time = "15s" | |||
# Temporary directory for state sync snapshot chunks, defaults to the OS tempdir (typically /tmp). | |||
# Will create a new, randomly named directory within, and remove it when done. | |||
temp_dir = "" | |||
####################################################### | |||
### Fast Sync Configuration Options ###
####################################################### | |||
[fastsync] | |||
# Fast Sync version to use: | |||
# 1) "v0" (default) - the legacy fast sync implementation | |||
# 2) "v1" - refactor of v0 version for better testability | |||
# 3) "v2" - complete redesign of v0, optimized for testability & readability
version = "v0" | |||
####################################################### | |||
### Consensus Configuration Options ### | |||
####################################################### | |||
[consensus] | |||
wal_file = "data/cs.wal/wal" | |||
# How long we wait for a proposal block before prevoting nil | |||
timeout_propose = "3s" | |||
# How much timeout_propose increases with each round | |||
timeout_propose_delta = "500ms" | |||
# How long we wait after receiving +2/3 prevotes for “anything” (ie. not a single block or nil) | |||
timeout_prevote = "1s" | |||
# How much the timeout_prevote increases with each round | |||
timeout_prevote_delta = "500ms" | |||
# How long we wait after receiving +2/3 precommits for “anything” (ie. not a single block or nil) | |||
timeout_precommit = "1s" | |||
# How much the timeout_precommit increases with each round | |||
timeout_precommit_delta = "500ms" | |||
# How long we wait after committing a block, before starting on the new | |||
# height (this gives us a chance to receive some more precommits, even | |||
# though we already have +2/3). | |||
timeout_commit = "1s" | |||
# How many blocks to look back to check existence of the node's consensus votes before joining consensus | |||
# When non-zero, the node will panic upon restart | |||
# if the same consensus key was used to sign {double_sign_check_height} last blocks. | |||
# So, validators should stop the state machine, wait for some blocks, and then restart the state machine to avoid panic. | |||
double_sign_check_height = 0 | |||
# Make progress as soon as we have all the precommits (as if TimeoutCommit = 0) | |||
skip_timeout_commit = false | |||
# EmptyBlocks mode and possible interval between empty blocks | |||
create_empty_blocks = true | |||
create_empty_blocks_interval = "0s" | |||
# Reactor sleep duration parameters | |||
peer_gossip_sleep_duration = "100ms" | |||
peer_query_maj23_sleep_duration = "2s" | |||
####################################################### | |||
### Transaction Indexer Configuration Options ### | |||
####################################################### | |||
[tx_index] | |||
# What indexer to use for transactions | |||
# | |||
# The application will set which txs to index. In some cases a node operator will be able | |||
# to decide which txs to index based on configuration set in the application. | |||
# | |||
# Options: | |||
# 1) "null" | |||
# 2) "kv" (default) - the simplest possible indexer, backed by key-value storage (defaults to levelDB; see DBBackend). | |||
# - When "kv" is chosen "tx.height" and "tx.hash" will always be indexed. | |||
indexer = "kv" | |||
####################################################### | |||
### Instrumentation Configuration Options ### | |||
####################################################### | |||
[instrumentation] | |||
# When true, Prometheus metrics are served under /metrics on | |||
# PrometheusListenAddr. | |||
# Check out the documentation for the list of available metrics. | |||
prometheus = false | |||
# Address to listen for Prometheus collector(s) connections | |||
prometheus_listen_addr = ":26660" | |||
# Maximum number of simultaneous connections. | |||
# If you want to accept a larger number than the default, make sure | |||
# you increase your OS limits. | |||
# 0 - unlimited. | |||
max_open_connections = 3 | |||
# Instrumentation namespace | |||
namespace = "tendermint" | |||
``` | |||
## Empty blocks VS no empty blocks | |||
### create_empty_blocks = true | |||
If `create_empty_blocks` is set to `true` in your config, blocks will be | |||
created ~ every second (with default consensus parameters). You can regulate | |||
the delay between blocks by changing the `timeout_commit`. E.g. `timeout_commit = "10s"` should result in ~ 10 second blocks. | |||
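The setting described above could be written as the following fragment of the `[consensus]` section. It only restates the example value from the text; the resulting block time is approximate.

```toml
[consensus]
create_empty_blocks = true
# Wait 10s after each commit before starting the next height,
# which should give blocks roughly every 10 seconds.
timeout_commit = "10s"
```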
### create_empty_blocks = false | |||
In this setting, blocks are created when transactions are received.
Note that after block H, Tendermint creates something we call a "proof block"
H+1 (only if the application hash changed). The reason for this is to support
proofs. If you have a transaction in block H that changes the state to X, the | |||
new application hash will only be included in block H+1. If after your | |||
transaction is committed, you want to get a light-client proof for the new state | |||
(X), you need the new block to be committed in order to do that because the new | |||
block has the new application hash for the state X. That's why we make a new | |||
(empty) block if the application hash changes. Otherwise, you won't be able to | |||
make a proof for the new state. | |||
In addition, if you set `create_empty_blocks_interval` to something other than the
default (`0s`), Tendermint will create empty blocks even in the absence of
transactions every `create_empty_blocks_interval`. For instance, with
`create_empty_blocks = false` and `create_empty_blocks_interval = "30s"`, | |||
Tendermint will only create blocks if there are transactions, or after waiting | |||
30 seconds without receiving any transactions. | |||
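As a sketch, the combination from the example above would look like this in the `[consensus]` section of `config.toml`:

```toml
[consensus]
# Only create blocks when there are transactions (plus the "proof block"
# that follows a change of the application hash) ...
create_empty_blocks = false
# ... but still create an empty block after 30s without transactions.
create_empty_blocks_interval = "30s"
```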
## Consensus timeouts explained | |||
There's a variety of information about timeouts in [Running in
production](./running-in-production.md).
You can also find a more detailed technical explanation in the spec: [The latest
gossip on BFT consensus](https://arxiv.org/abs/1807.04938).
```toml | |||
[consensus] | |||
... | |||
timeout_propose = "3s" | |||
timeout_propose_delta = "500ms" | |||
timeout_prevote = "1s" | |||
timeout_prevote_delta = "500ms" | |||
timeout_precommit = "1s" | |||
timeout_precommit_delta = "500ms" | |||
timeout_commit = "1s" | |||
``` | |||
Note that in a successful round, the only timeout that we always wait for, no
matter what, is `timeout_commit`.
Here's a brief summary of the timeouts: | |||
- `timeout_propose` = how long we wait for a proposal block before prevoting | |||
nil | |||
- `timeout_propose_delta` = how much timeout_propose increases with each round | |||
- `timeout_prevote` = how long we wait after receiving +2/3 prevotes for | |||
anything (ie. not a single block or nil) | |||
- `timeout_prevote_delta` = how much the timeout_prevote increases with each | |||
round | |||
- `timeout_precommit` = how long we wait after receiving +2/3 precommits for | |||
anything (ie. not a single block or nil) | |||
- `timeout_precommit_delta` = how much the timeout_precommit increases with | |||
each round | |||
- `timeout_commit` = how long we wait after committing a block, before starting | |||
on the new height (this gives us a chance to receive some more precommits, | |||
even though we already have +2/3) | |||
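To make the `*_delta` values concrete, here is a worked example for the proposal timeout with the default values. It assumes the delta is added once per round, with rounds counted from 0; the same pattern would apply to the prevote and precommit timeouts.

```toml
[consensus]
# Effective proposal timeout per round with these values:
#   round 0: 3s
#   round 1: 3s + 1 * 500ms = 3.5s
#   round 2: 3s + 2 * 500ms = 4s
timeout_propose = "3s"
timeout_propose_delta = "500ms"
```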
## P2P settings | |||
This section will cover settings within the p2p section of the `config.toml`. | |||
- `external_address` = is the address that will be advertised for other nodes to use. We recommend setting this field to your public IP and p2p port.
- `seeds` = is a comma-separated list of seed nodes that you will connect to on startup and ask for peers. A seed node is a node that does not participate in consensus but only helps propagate peers to nodes in the network.
- `persistent_peers` = is a comma-separated list of peers that you always want to be connected to. If you're already connected to the maximum number of peers, persistent peers will not be added.
- `max_num_inbound_peers` = is the maximum number of peers you will accept inbound connections from at one time (where they dial your address and initiate the connection).
- `max_num_outbound_peers` = is the maximum number of peers you will initiate outbound connections to at one time (where you dial their address and initiate the connection).
- `unconditional_peer_ids` = is similar to `persistent_peers` except that these peers will be connected to even if you are already connected to the maximum number of peers. This can be a validator node ID on your sentry node.
- `pex` = turns the peer exchange reactor on or off. A validator node will want `pex` turned off so that it does not begin gossiping to unknown peers on the network. PeX can also be turned off for statically configured networks with fixed network connectivity. For full nodes on open, dynamic networks, it should be turned on.
- `seed_mode` = is used when node operators want to run their node as a seed node. Seed nodes run a variation of the PeX protocol that disconnects from peers after sending them a list of peers to connect to. To minimize the server's resource usage, it is recommended to set the mempool's size to 0.
- `private_peer_ids` = is a comma-separated list of node IDs that you would not like exposed to other peers (i.e. you will not tell other peers about the `private_peer_ids`). This can be filled with a validator's node ID.
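As an example of combining these settings, an operator running a dedicated seed node might use something like the fragment below. This is a sketch based on the recommendations in the list above, not a complete configuration.

```toml
[p2p]
# Seed mode does not work if the peer-exchange reactor is disabled.
pex = true
seed_mode = true

[mempool]
# A seed node only hands out peer addresses, so the mempool can be
# kept empty to reduce resource usage.
size = 0
```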
@ -0,0 +1,39 @@ | |||
--- | |||
order: 6 | |||
--- | |||
# Configure a Light Client | |||
Tendermint comes with a built-in `tendermint light` command, which can be used | |||
to run a light client proxy server that verifies Tendermint RPC. All calls that
can be traced back to a block header by a proof will be verified before
passing them back to the caller. Other than that, it will present the same | |||
interface as a full Tendermint node. | |||
You can start the light client proxy server by running `tendermint light <chainID>`, | |||
with a variety of flags to specify the primary node, the witness nodes (which cross-check | |||
the information provided by the primary), the hash and height of the trusted header, | |||
and more. | |||
For example: | |||
```bash | |||
$ tendermint light supernova -p tcp://233.123.0.140:26657 \ | |||
-w tcp://179.63.29.15:26657,tcp://144.165.223.135:26657 \ | |||
--height=10 --hash=37E9A6DD3FA25E83B22C18835401E8E56088D0D7ABC6FD99FCDC920DD76C1C57 | |||
``` | |||
For additional options, run `tendermint light --help`. | |||
## Where to obtain trusted height & hash | |||
One way to obtain a semi-trusted hash & height is to query multiple full nodes | |||
and compare their hashes: | |||
```bash | |||
$ curl -s https://233.123.0.140:26657/commit | jq "{height: .result.signed_header.header.height, hash: .result.signed_header.commit.block_id.hash}"
{ | |||
"height": "273", | |||
"hash": "188F4F36CBCD2C91B57509BBF231C777E79B52EE3E0D90D06B1A25EB16E6E23D" | |||
} | |||
``` |
@ -0,0 +1,60 @@ | |||
--- | |||
order: 4 | |||
--- | |||
# Metrics | |||
Tendermint can report and serve Prometheus metrics, which in turn can
be consumed by Prometheus collector(s).
This functionality is disabled by default.
To enable the Prometheus metrics, set `instrumentation.prometheus=true` in your
config file. Metrics will be served under `/metrics` on port 26660 by default.
The listen address can be changed in the config file (see
`instrumentation.prometheus_listen_addr`).
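In `config.toml`, enabling metrics on the default listen address would look roughly like this (only the first value differs from the defaults shown in the configuration docs):

```toml
[instrumentation]
# Serve Prometheus metrics under /metrics.
prometheus = true
# Default listen address; change it if port 26660 is already in use.
prometheus_listen_addr = ":26660"
```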
## List of available metrics | |||
The following metrics are available: | |||
| **Name**                               | **Type**  | **Tags**      | **Description**                                                        |
| -------------------------------------- | --------- | ------------- | ---------------------------------------------------------------------- |
| consensus_height                       | Gauge     |               | Height of the chain                                                    |
| consensus_validators                   | Gauge     |               | Number of validators                                                   |
| consensus_validators_power             | Gauge     |               | Total voting power of all validators                                   |
| consensus_validator_power              | Gauge     |               | Voting power of the node if in the validator set                       |
| consensus_validator_last_signed_height | Gauge     |               | Last height the node signed a block, if the node is a validator        |
| consensus_validator_missed_blocks      | Gauge     |               | Total amount of blocks missed for the node, if the node is a validator |
| consensus_missing_validators           | Gauge     |               | Number of validators who did not sign                                  |
| consensus_missing_validators_power     | Gauge     |               | Total voting power of the missing validators                           |
| consensus_byzantine_validators         | Gauge     |               | Number of validators who tried to double sign                          |
| consensus_byzantine_validators_power   | Gauge     |               | Total voting power of the byzantine validators                         |
| consensus_block_interval_seconds       | Histogram |               | Time between this and last block (Block.Header.Time) in seconds        |
| consensus_rounds                       | Gauge     |               | Number of rounds                                                       |
| consensus_num_txs                      | Gauge     |               | Number of transactions                                                 |
| consensus_total_txs                    | Gauge     |               | Total number of transactions committed                                 |
| consensus_block_parts                  | Counter   | peer_id       | Number of blockparts transmitted by peer                               |
| consensus_latest_block_height          | Gauge     |               | /status sync_info number                                               |
| consensus_fast_syncing                 | Gauge     |               | Either 0 (not fast syncing) or 1 (syncing)                             |
| consensus_state_syncing                | Gauge     |               | Either 0 (not state syncing) or 1 (syncing)                            |
| consensus_block_size_bytes             | Gauge     |               | Block size in bytes                                                    |
| p2p_peers                              | Gauge     |               | Number of peers the node is connected to                               |
| p2p_peer_receive_bytes_total           | Counter   | peer_id, chID | Number of bytes per channel received from a given peer                 |
| p2p_peer_send_bytes_total              | Counter   | peer_id, chID | Number of bytes per channel sent to a given peer                       |
| p2p_peer_pending_send_bytes            | Gauge     | peer_id       | Number of pending bytes to be sent to a given peer                     |
| p2p_num_txs                            | Gauge     | peer_id       | Number of transactions submitted by each peer_id                       |
| p2p_pending_send_bytes                 | Gauge     | peer_id       | Amount of data pending to be sent to peer                              |
| mempool_size                           | Gauge     |               | Number of uncommitted transactions                                     |
| mempool_tx_size_bytes                  | Histogram |               | Transaction sizes in bytes                                             |
| mempool_failed_txs                     | Counter   |               | Number of failed transactions                                          |
| mempool_recheck_times                  | Counter   |               | Number of transactions rechecked in the mempool                        |
| state_block_processing_time            | Histogram |               | Time between BeginBlock and EndBlock in ms                             |
## Useful queries | |||
Percentage of missing + byzantine validators: | |||
```md | |||
((consensus_byzantine_validators_power + consensus_missing_validators_power) / consensus_validators_power) * 100
``` |
@ -0,0 +1,42 @@ | |||
--- | |||
order: 5 | |||
--- | |||
# Configure State-Sync | |||
State sync will continuously work in the background to supply nodes with chunked data when bootstrapping. | |||
> NOTE: Before trying to use state sync, see if the application you are operating a node for supports it. | |||
Under the state sync section in `config.toml` you will find multiple settings that need to be configured in order for your node to use state sync. | |||
Let's break down the settings: | |||
- `enable`: Tells the node to use state sync when bootstrapping. | |||
- `rpc_servers`: RPC servers are needed because state sync uses the light client for verification. | |||
- At least 2 servers are required; more are always helpful. | |||
- `temp_dir`: Temporary directory used to store the chunks on the machine's local storage. If nothing is set, a directory is created in `/tmp`. | |||
You will need to acquire the following information from publicly exposed RPCs or a block explorer that you trust. | |||
- `trust_height`: Trusted height defines at which height your node should trust the chain. | |||
- `trust_hash`: Trusted hash is the hash in the `BlockID` corresponding to the trusted height. | |||
- `trust_period`: Trust period is the period in which headers can be verified. | |||
> :warning: This value should be significantly smaller than the unbonding period. | |||
If you are relying on publicly exposed RPCs to get the needed information, you can use `curl`. | |||
Example: | |||
```bash | |||
curl -s https://233.123.0.140:26657/commit | jq "{height: .result.signed_header.header.height, hash: .result.signed_header.commit.block_id.hash}" | |||
``` | |||
The response will be: | |||
```json | |||
{ | |||
"height": "273", | |||
"hash": "188F4F36CBCD2C91B57509BBF231C777E79B52EE3E0D90D06B1A25EB16E6E23D" | |||
} | |||
``` |
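With a height and hash obtained this way, the state sync section of `config.toml` might be filled in roughly as sketched below. The two RPC addresses are hypothetical placeholders, and the trust period shown is simply the default from a generated config; choose values that suit the chain you are joining.

```toml
[statesync]
enable = true
# Hypothetical endpoints: replace with at least two RPC servers you trust
rpc_servers = "rpc-1.example.com:26657,rpc-2.example.com:26657"
# Height and hash taken from the /commit response above
trust_height = 273
trust_hash = "188F4F36CBCD2C91B57509BBF231C777E79B52EE3E0D90D06B1A25EB16E6E23D"
# Should be significantly smaller than the unbonding period
trust_period = "168h0m0s"
```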
@ -0,0 +1,114 @@ | |||
--- | |||
order: 2 | |||
--- | |||
# Validators | |||
Validators are responsible for committing new blocks in the blockchain. | |||
These validators participate in the consensus protocol by broadcasting | |||
_votes_ which contain cryptographic signatures signed by each | |||
validator's private key. | |||
Some Proof-of-Stake consensus algorithms aim to create a "completely" | |||
decentralized system where all stakeholders (even those who are not | |||
always available online) participate in the committing of blocks. | |||
Tendermint has a different approach to block creation. Validators are | |||
expected to be online, and the set of validators is permissioned/curated | |||
by some external process. Proof-of-stake is not required, but can be | |||
implemented on top of Tendermint consensus. That is, validators may be | |||
required to post collateral on-chain, off-chain, or may not be required | |||
to post any collateral at all. | |||
Validators have a cryptographic key-pair and an associated amount of | |||
"voting power". Voting power need not be the same. | |||
## Becoming a Validator | |||
There are two ways to become a validator: | |||
1. They can be pre-established in the [genesis state](./using-tendermint.md#genesis). | |||
2. The ABCI app responds to the EndBlock message with changes to the | |||
existing validator set. | |||
## Setting up a Validator | |||
When setting up a validator there are countless ways to configure your setup. This guide shows one of them, the sentry node design, which is mainly aimed at DDoS prevention. | |||
### Network Layout | |||
![ALT Network Layout](./sentry_layout.png) | |||
The diagram is based on AWS; other cloud providers offer similar solutions. Running nodes is not limited to cloud providers; you can run nodes on bare metal systems as well. The architecture will be the same no matter which setup you decide to go with. | |||
The proposed network diagram is similar to the classical backend/frontend separation of services in a corporate environment. The “backend” in this case is the private network of the validator in the data center. The data center network might involve multiple subnets, firewalls and redundancy devices, which is not detailed on this diagram. The important point is that the data center allows direct connectivity to the chosen cloud environment. Amazon AWS has “Direct Connect”, while Google Cloud has “Partner Interconnect”. This is a dedicated connection to the cloud provider (usually directly to your virtual private cloud instance in one of the regions). | |||
All sentry nodes (the “frontend”) connect to the validator using this private connection. The validator does not have a public IP address to provide its services. | |||
Amazon has multiple availability zones within a region. One can install sentry nodes in other regions too. In this case the second, third and further regions need to have a private connection to the validator node. This can be achieved by VPC Peering (“VPC Network Peering” in Google Cloud). The sentry nodes in the second, third and further regions are then directed to the first region and through the direct connect to the data center, arriving at the validator. | |||
A more persistent solution (not detailed on the diagram) is to have multiple direct connections to different regions from the data center. This way VPC Peering is not mandatory, although still beneficial for the sentry nodes. This overcomes the risk of depending on one region. It is more costly. | |||
### Local Configuration | |||
![ALT Local Configuration](./local_config.png) | |||
The validator will only talk to the sentry nodes that are provided. The sentry nodes will communicate with the validator via a secret connection and with the rest of the network through a normal connection. The sentry nodes also have the option of communicating with each other. | |||
When initializing nodes there are six parameters in the `config.toml` that may need to be altered. | |||
- `pex:` boolean. This turns the peer exchange reactor on or off for a node. When `pex=false`, only the `persistent_peers` list is available for connection. | |||
- `persistent_peers:` a comma separated list of `nodeID@ip:port` values that define a list of peers that are expected to be online at all times. This is necessary at first startup because with `pex=false` the node cannot discover peers and would otherwise be unable to join the network. | |||
- `unconditional_peer_ids:` comma separated list of node IDs. These nodes will be connected to no matter the limits of inbound and outbound peers. This is useful when sentry nodes have full address books. | |||
- `private_peer_ids:` comma separated list of node IDs. These nodes will not be gossiped to the network. This is an important field as you do not want your validator IP gossiped to the network. | |||
- `addr_book_strict:` boolean. By default nodes with a routable address will be considered for connection. If this setting is turned off (false), non-routable IP addresses, like addresses in a private network, can be added to the address book. | |||
- `double_sign_check_height:` int64 height. How many blocks to look back to check for the existence of the node's consensus votes before joining consensus. When non-zero, the node will panic upon restart if the same consensus key was used to sign the last {double_sign_check_height} blocks. So, validators should stop the state machine, wait for some blocks, and then restart the state machine to avoid a panic. | |||
#### Validator Node Configuration | |||
| Config Option | Setting | | |||
| ------------------------ | -------------------------- | | |||
| pex | false | | |||
| persistent_peers | list of sentry nodes | | |||
| private_peer_ids | none | | |||
| unconditional_peer_ids | optionally sentry node IDs | | |||
| addr_book_strict | false | | |||
| double_sign_check_height | 10 | | |||
The validator node should have `pex=false` so it does not gossip to the entire network. The persistent peers will be your sentry nodes. Private peers can be left empty as the validator is not trying to hide who it is communicating with. Setting unconditional peers is optional for a validator because it will not have a full address book. | |||
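As a rough sketch, the validator settings from the table could be written in `config.toml` as follows. The node IDs and private IP addresses are hypothetical placeholders; the section names follow the generated config file, where the peer settings live under `[p2p]` and `double_sign_check_height` under `[consensus]`.

```toml
[p2p]
pex = false
# Hypothetical sentry node IDs and private addresses
persistent_peers = "<sentry1-node-id>@10.0.0.11:26656,<sentry2-node-id>@10.0.0.12:26656"
private_peer_ids = ""
# Optional for a validator
unconditional_peer_ids = "<sentry1-node-id>,<sentry2-node-id>"
addr_book_strict = false

[consensus]
double_sign_check_height = 10
```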
#### Sentry Node Configuration | |||
| Config Option | Setting | | |||
| ---------------------- | --------------------------------------------- | | |||
| pex | true | | |||
| persistent_peers | validator node, optionally other sentry nodes | | |||
| private_peer_ids | validator node ID | | |||
| unconditional_peer_ids | validator node ID, optionally sentry node IDs | | |||
| addr_book_strict | false | | |||
The sentry nodes should be able to talk to the entire network, which is why `pex=true`. The persistent peers of a sentry node will be the validator, and optionally other sentry nodes. The sentry nodes must not gossip the validator's IP; to ensure this, put the validator's node ID in `private_peer_ids`. The unconditional peer IDs will be the validator ID and optionally other sentry nodes. | |||
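The matching sentry node settings might look like the fragment below. Again, the validator node ID and private address are hypothetical placeholders, and adding other sentry nodes to the peer lists is optional.

```toml
[p2p]
pex = true
# Hypothetical validator node ID and private address
persistent_peers = "<validator-node-id>@10.0.0.10:26656"
# Keeps the validator's address out of peer exchange
private_peer_ids = "<validator-node-id>"
unconditional_peer_ids = "<validator-node-id>"
addr_book_strict = false
```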
> Note: Do not forget to secure your node's firewalls when setting them up. | |||
More Information can be found at these links: | |||
- <https://kb.certus.one/> | |||
- <https://forum.cosmos.network/t/sentry-node-architecture-overview/454> | |||
### Validator keys | |||
Protecting a validator's consensus key is the most important factor to consider when designing your setup. The key that a validator is given upon creation of the node is called a consensus key; it has to be online at all times in order to vote on blocks. It is **not recommended** to merely hold your private key in the default JSON file (`priv_validator_key.json`). Fortunately, the [Interchain Foundation](https://interchain.io/) has worked with a team to build a key management server for validators. You can find documentation on how to use it [here](https://github.com/iqlusioninc/tmkms); it is used extensively in production. You are not limited to using this tool: there are also [HSMs](https://safenet.gemalto.com/data-encryption/hardware-security-modules-hsms/), although no particular HSM is recommended. | |||
Currently Tendermint uses [Ed25519](https://ed25519.cr.yp.to/) keys which are widely supported across the security sector and HSMs. | |||
## Committing a Block | |||
> **+2/3 is short for "more than 2/3"** | |||
A block is committed when +2/3 of the validator set sign [precommit | |||
votes](https://github.com/tendermint/spec/blob/953523c3cb99fdb8c8f7a2d21e3a99094279e9de/spec/blockchain/blockchain.md#vote) for that block at the same `round`. | |||
The +2/3 set of precommit votes is called a | |||
[_commit_](https://github.com/tendermint/spec/blob/953523c3cb99fdb8c8f7a2d21e3a99094279e9de/spec/blockchain/blockchain.md#commit). While any +2/3 set of | |||
precommits for the same block at the same height and round can serve as | |||
validation, the canonical commit is included in the next block (see | |||
[LastCommit](https://github.com/tendermint/spec/blob/953523c3cb99fdb8c8f7a2d21e3a99094279e9de/spec/blockchain/blockchain.md#lastcommit)). |
@ -1,492 +1,7 @@ | |||
--- | |||
order: 3 | |||
order: false | |||
--- | |||
# Configuration | |||
Tendermint Core can be configured via a TOML file in | |||
`$TMHOME/config/config.toml`. Some of these parameters can be overridden by | |||
command-line flags. For most users, the options in the `##### main base configuration options #####` section are intended to be modified, while config options | |||
further below are intended for advanced power users. | |||
## Options | |||
The default configuration file created by `tendermint init` has all | |||
the parameters set with their default values. It will look something | |||
like the file below, however, double check by inspecting the | |||
`config.toml` created with your version of `tendermint` installed: | |||
```toml | |||
# This is a TOML config file. | |||
# For more information, see https://github.com/toml-lang/toml | |||
# NOTE: Any path below can be absolute (e.g. "/var/myawesomeapp/data") or | |||
# relative to the home directory (e.g. "data"). The home directory is | |||
# "$HOME/.tendermint" by default, but could be changed via $TMHOME env variable | |||
# or --home cmd flag. | |||
####################################################################### | |||
### Main Base Config Options ### | |||
####################################################################### | |||
# TCP or UNIX socket address of the ABCI application, | |||
# or the name of an ABCI application compiled in with the Tendermint binary | |||
proxy_app = "tcp://127.0.0.1:26658" | |||
# A custom human readable name for this node | |||
moniker = "anonymous" | |||
# If this node is many blocks behind the tip of the chain, FastSync | |||
# allows them to catchup quickly by downloading blocks in parallel | |||
# and verifying their commits | |||
fast_sync = true | |||
# Database backend: goleveldb | cleveldb | boltdb | rocksdb | badgerdb | |||
# * goleveldb (github.com/syndtr/goleveldb - most popular implementation) | |||
# - pure go | |||
# - stable | |||
# * cleveldb (uses levigo wrapper) | |||
# - fast | |||
# - requires gcc | |||
# - use cleveldb build tag (go build -tags cleveldb) | |||
# * boltdb (uses etcd's fork of bolt - github.com/etcd-io/bbolt) | |||
# - EXPERIMENTAL | |||
# - may be faster in some use-cases (random reads - indexer) | |||
# - use boltdb build tag (go build -tags boltdb) | |||
# * rocksdb (uses github.com/tecbot/gorocksdb) | |||
# - EXPERIMENTAL | |||
# - requires gcc | |||
# - use rocksdb build tag (go build -tags rocksdb) | |||
# * badgerdb (uses github.com/dgraph-io/badger) | |||
# - EXPERIMENTAL | |||
# - use badgerdb build tag (go build -tags badgerdb) | |||
db_backend = "goleveldb" | |||
# Database directory | |||
db_dir = "data" | |||
# Output level for logging, including package level options | |||
log_level = "main:info,state:info,statesync:info,*:error" | |||
# Output format: 'plain' (colored text) or 'json' | |||
log_format = "plain" | |||
##### additional base config options ##### | |||
# Path to the JSON file containing the initial validator set and other meta data | |||
genesis_file = "config/genesis.json" | |||
# Path to the JSON file containing the private key to use as a validator in the consensus protocol | |||
priv_validator_key_file = "config/priv_validator_key.json" | |||
# Path to the JSON file containing the last sign state of a validator | |||
priv_validator_state_file = "data/priv_validator_state.json" | |||
# TCP or UNIX socket address for Tendermint to listen on for | |||
# connections from an external PrivValidator process | |||
priv_validator_laddr = "" | |||
# Path to the JSON file containing the private key to use for node authentication in the p2p protocol | |||
node_key_file = "config/node_key.json" | |||
# Mechanism to connect to the ABCI application: socket | grpc | |||
abci = "socket" | |||
# If true, query the ABCI app on connecting to a new peer | |||
# so the app can decide if we should keep the connection or not | |||
filter_peers = false | |||
####################################################################### | |||
### Advanced Configuration Options ### | |||
####################################################################### | |||
####################################################### | |||
### RPC Server Configuration Options ### | |||
####################################################### | |||
[rpc] | |||
# TCP or UNIX socket address for the RPC server to listen on | |||
laddr = "tcp://127.0.0.1:26657" | |||
# A list of origins a cross-domain request can be executed from | |||
# Default value '[]' disables cors support | |||
# Use '["*"]' to allow any origin | |||
cors_allowed_origins = [] | |||
# A list of methods the client is allowed to use with cross-domain requests | |||
cors_allowed_methods = ["HEAD", "GET", "POST", ] | |||
# A list of non simple headers the client is allowed to use with cross-domain requests | |||
cors_allowed_headers = ["Origin", "Accept", "Content-Type", "X-Requested-With", "X-Server-Time", ] | |||
# TCP or UNIX socket address for the gRPC server to listen on | |||
# NOTE: This server only supports /broadcast_tx_commit | |||
grpc_laddr = "" | |||
# Maximum number of simultaneous connections. | |||
# Does not include RPC (HTTP&WebSocket) connections. See max_open_connections | |||
# If you want to accept a larger number than the default, make sure | |||
# you increase your OS limits. | |||
# 0 - unlimited. | |||
# Should be < {ulimit -Sn} - {MaxNumInboundPeers} - {MaxNumOutboundPeers} - {N of wal, db and other open files} | |||
# 1024 - 40 - 10 - 50 = 924 = ~900 | |||
grpc_max_open_connections = 900 | |||
# Activate unsafe RPC commands like /dial_seeds and /unsafe_flush_mempool | |||
unsafe = false | |||
# Maximum number of simultaneous connections (including WebSocket). | |||
# Does not include gRPC connections. See grpc_max_open_connections | |||
# If you want to accept a larger number than the default, make sure | |||
# you increase your OS limits. | |||
# 0 - unlimited. | |||
# Should be < {ulimit -Sn} - {MaxNumInboundPeers} - {MaxNumOutboundPeers} - {N of wal, db and other open files} | |||
# 1024 - 40 - 10 - 50 = 924 = ~900 | |||
max_open_connections = 900 | |||
# Maximum number of unique clientIDs that can /subscribe | |||
# If you're using /broadcast_tx_commit, set to the estimated maximum number | |||
# of broadcast_tx_commit calls per block. | |||
max_subscription_clients = 100 | |||
# Maximum number of unique queries a given client can /subscribe to | |||
# If you're using GRPC (or Local RPC client) and /broadcast_tx_commit, set to | |||
# the estimated # maximum number of broadcast_tx_commit calls per block. | |||
max_subscriptions_per_client = 5 | |||
# How long to wait for a tx to be committed during /broadcast_tx_commit. | |||
# WARNING: Using a value larger than 10s will result in increasing the | |||
# global HTTP write timeout, which applies to all connections and endpoints. | |||
# See https://github.com/tendermint/tendermint/issues/3435 | |||
timeout_broadcast_tx_commit = "10s" | |||
# Maximum size of request body, in bytes | |||
max_body_bytes = 1000000 | |||
# Maximum size of request header, in bytes | |||
max_header_bytes = 1048576 | |||
# The path to a file containing certificate that is used to create the HTTPS server. | |||
# Might be either absolute path or path relative to tendermint's config directory. | |||
# If the certificate is signed by a certificate authority, | |||
# the certFile should be the concatenation of the server's certificate, any intermediates, | |||
# and the CA's certificate. | |||
# NOTE: both tls_cert_file and tls_key_file must be present for Tendermint to create HTTPS server. | |||
# Otherwise, HTTP server is run. | |||
tls_cert_file = "" | |||
# The path to a file containing matching private key that is used to create the HTTPS server. | |||
# Might be either absolute path or path relative to tendermint's config directory. | |||
# NOTE: both tls_cert_file and tls_key_file must be present for Tendermint to create HTTPS server. | |||
# Otherwise, HTTP server is run. | |||
tls_key_file = "" | |||
# pprof listen address (https://golang.org/pkg/net/http/pprof) | |||
pprof_laddr = "" | |||
####################################################### | |||
### P2P Configuration Options ### | |||
####################################################### | |||
[p2p] | |||
# Address to listen for incoming connections | |||
laddr = "tcp://0.0.0.0:26656" | |||
# Address to advertise to peers for them to dial | |||
# If empty, will use the same port as the laddr, | |||
# and will introspect on the listener or use UPnP | |||
# to figure out the address. | |||
external_address = "" | |||
# Comma separated list of seed nodes to connect to | |||
seeds = "" | |||
# Comma separated list of nodes to keep persistent connections to | |||
persistent_peers = "" | |||
# UPNP port forwarding | |||
upnp = false | |||
# Path to address book | |||
addr_book_file = "config/addrbook.json" | |||
# Set true for strict address routability rules | |||
# Set false for private or local networks | |||
addr_book_strict = true | |||
# Maximum number of inbound peers | |||
max_num_inbound_peers = 40 | |||
# Maximum number of outbound peers to connect to, excluding persistent peers | |||
max_num_outbound_peers = 10 | |||
# List of node IDs, to which a connection will be (re)established ignoring any existing limits | |||
unconditional_peer_ids = "" | |||
# Maximum pause when redialing a persistent peer (if zero, exponential backoff is used) | |||
persistent_peers_max_dial_period = "0s" | |||
# Time to wait before flushing messages out on the connection | |||
flush_throttle_timeout = "100ms" | |||
# Maximum size of a message packet payload, in bytes | |||
max_packet_msg_payload_size = 1024 | |||
# Rate at which packets can be sent, in bytes/second | |||
send_rate = 5120000 | |||
# Rate at which packets can be received, in bytes/second | |||
recv_rate = 5120000 | |||
# Set true to enable the peer-exchange reactor | |||
pex = true | |||
# Seed mode, in which node constantly crawls the network and looks for | |||
# peers. If another node asks it for addresses, it responds and disconnects. | |||
# | |||
# Does not work if the peer-exchange reactor is disabled. | |||
seed_mode = false | |||
# Comma separated list of peer IDs to keep private (will not be gossiped to other peers) | |||
private_peer_ids = "" | |||
# Toggle to disable guard against peers connecting from the same ip. | |||
allow_duplicate_ip = false | |||
# Peer connection configuration. | |||
handshake_timeout = "20s" | |||
dial_timeout = "3s" | |||
####################################################### | |||
### Mempool Configuration Options ### | |||
####################################################### | |||
[mempool] | |||
recheck = true | |||
broadcast = true | |||
wal_dir = "" | |||
# Maximum number of transactions in the mempool | |||
size = 5000 | |||
# Limit the total size of all txs in the mempool. | |||
# This only accounts for raw transactions (e.g. given 1MB transactions and | |||
# max_txs_bytes=5MB, mempool will only accept 5 transactions). | |||
max_txs_bytes = 1073741824 | |||
# Size of the cache (used to filter transactions we saw earlier) in transactions | |||
cache_size = 10000 | |||
# Maximum size of a single transaction. | |||
# NOTE: the max size of a tx transmitted over the network is {max_tx_bytes}. | |||
max_tx_bytes = 1048576 | |||
# Maximum size of a batch of transactions to send to a peer | |||
# Including space needed by encoding (one varint per transaction). | |||
max_batch_bytes = 10485760 | |||
####################################################### | |||
### State Sync Configuration Options ### | |||
####################################################### | |||
[statesync] | |||
# State sync rapidly bootstraps a new node by discovering, fetching, and restoring a state machine | |||
# snapshot from peers instead of fetching and replaying historical blocks. Requires some peers in | |||
# the network to take and serve state machine snapshots. State sync is not attempted if the node | |||
# has any local state (LastBlockHeight > 0). The node will have a truncated block history, | |||
# starting from the height of the snapshot. | |||
enable = false | |||
# RPC servers (comma-separated) for light client verification of the synced state machine and | |||
# retrieval of state data for node bootstrapping. Also needs a trusted height and corresponding | |||
# header hash obtained from a trusted source, and a period during which validators can be trusted. | |||
# | |||
# For Cosmos SDK-based chains, trust_period should usually be about 2/3 of the unbonding time (~2 | |||
# weeks) during which they can be financially punished (slashed) for misbehavior. | |||
rpc_servers = "" | |||
trust_height = 0 | |||
trust_hash = "" | |||
trust_period = "168h0m0s" | |||
# Time to spend discovering snapshots before initiating a restore. | |||
discovery_time = "15s" | |||
# Temporary directory for state sync snapshot chunks, defaults to the OS tempdir (typically /tmp). | |||
# Will create a new, randomly named directory within, and remove it when done. | |||
temp_dir = "" | |||
####################################################### | |||
### Fast Sync Configuration Options ### | |||
####################################################### | |||
[fastsync] | |||
# Fast Sync version to use: | |||
# 1) "v0" (default) - the legacy fast sync implementation | |||
# 2) "v1" - refactor of v0 version for better testability | |||
# 3) "v2" - complete redesign of v0, optimized for testability & readability | |||
version = "v0" | |||
####################################################### | |||
### Consensus Configuration Options ### | |||
####################################################### | |||
[consensus] | |||
wal_file = "data/cs.wal/wal" | |||
# How long we wait for a proposal block before prevoting nil | |||
timeout_propose = "3s" | |||
# How much timeout_propose increases with each round | |||
timeout_propose_delta = "500ms" | |||
# How long we wait after receiving +2/3 prevotes for “anything” (ie. not a single block or nil) | |||
timeout_prevote = "1s" | |||
# How much the timeout_prevote increases with each round | |||
timeout_prevote_delta = "500ms" | |||
# How long we wait after receiving +2/3 precommits for “anything” (ie. not a single block or nil) | |||
timeout_precommit = "1s" | |||
# How much the timeout_precommit increases with each round | |||
timeout_precommit_delta = "500ms" | |||
# How long we wait after committing a block, before starting on the new | |||
# height (this gives us a chance to receive some more precommits, even | |||
# though we already have +2/3). | |||
timeout_commit = "1s" | |||
# How many blocks to look back to check existence of the node's consensus votes before joining consensus | |||
# When non-zero, the node will panic upon restart | |||
# if the same consensus key was used to sign {double_sign_check_height} last blocks. | |||
# So, validators should stop the state machine, wait for some blocks, and then restart the state machine to avoid panic. | |||
double_sign_check_height = 0 | |||
# Make progress as soon as we have all the precommits (as if TimeoutCommit = 0) | |||
skip_timeout_commit = false | |||
# EmptyBlocks mode and possible interval between empty blocks | |||
create_empty_blocks = true | |||
create_empty_blocks_interval = "0s" | |||
# Reactor sleep duration parameters | |||
peer_gossip_sleep_duration = "100ms" | |||
peer_query_maj23_sleep_duration = "2s" | |||
####################################################### | |||
### Transaction Indexer Configuration Options ### | |||
####################################################### | |||
[tx_index] | |||
# What indexer to use for transactions | |||
# | |||
# The application will set which txs to index. In some cases a node operator will be able | |||
# to decide which txs to index based on configuration set in the application. | |||
# | |||
# Options: | |||
# 1) "null" | |||
# 2) "kv" (default) - the simplest possible indexer, backed by key-value storage (defaults to levelDB; see DBBackend). | |||
# - When "kv" is chosen "tx.height" and "tx.hash" will always be indexed. | |||
indexer = "kv" | |||
####################################################### | |||
### Instrumentation Configuration Options ### | |||
####################################################### | |||
[instrumentation] | |||
# When true, Prometheus metrics are served under /metrics on | |||
# PrometheusListenAddr. | |||
# Check out the documentation for the list of available metrics. | |||
prometheus = false | |||
# Address to listen for Prometheus collector(s) connections | |||
prometheus_listen_addr = ":26660" | |||
# Maximum number of simultaneous connections. | |||
# If you want to accept a larger number than the default, make sure | |||
# you increase your OS limits. | |||
# 0 - unlimited. | |||
max_open_connections = 3 | |||
# Instrumentation namespace | |||
namespace = "tendermint" | |||
``` | |||
## Empty blocks VS no empty blocks | |||
### create_empty_blocks = true | |||
If `create_empty_blocks` is set to `true` in your config, blocks will be | |||
created ~ every second (with default consensus parameters). You can regulate | |||
the delay between blocks by changing the `timeout_commit`. E.g. `timeout_commit = "10s"` should result in ~ 10 second blocks. | |||
### create_empty_blocks = false | |||
In this setting, blocks are created when transactions are received. | |||
Note after the block H, Tendermint creates something we call a "proof block" | |||
(only if the application hash changed) H+1. The reason for this is to support | |||
proofs. If you have a transaction in block H that changes the state to X, the | |||
new application hash will only be included in block H+1. If after your | |||
transaction is committed, you want to get a light-client proof for the new state | |||
(X), you need the new block to be committed in order to do that because the new | |||
block has the new application hash for the state X. That's why we make a new | |||
(empty) block if the application hash changes. Otherwise, you won't be able to | |||
make a proof for the new state. | |||
Plus, if you set `create_empty_blocks_interval` to something other than the | |||
default (`0`), Tendermint will be creating empty blocks even in the absence of | |||
transactions every `create_empty_blocks_interval`. For instance, with | |||
`create_empty_blocks = false` and `create_empty_blocks_interval = "30s"`, | |||
Tendermint will only create blocks if there are transactions, or after waiting | |||
30 seconds without receiving any transactions. | |||
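The combination described in the last paragraph corresponds to a fragment like this one; the values are the ones used in the text, not a recommendation.

```toml
[consensus]
# Only create blocks when there are transactions...
create_empty_blocks = false
# ...or after 30 seconds without receiving any
create_empty_blocks_interval = "30s"
```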
## Consensus timeouts explained | |||
There's a variety of information about timeouts in [Running in | |||
production](./running-in-production.md). | |||
You can also find more detailed technical explanation in the spec: [The latest | |||
gossip on BFT consensus](https://arxiv.org/abs/1807.04938). | |||
```toml | |||
[consensus] | |||
... | |||
timeout_propose = "3s" | |||
timeout_propose_delta = "500ms" | |||
timeout_prevote = "1s" | |||
timeout_prevote_delta = "500ms" | |||
timeout_precommit = "1s" | |||
timeout_precommit_delta = "500ms" | |||
timeout_commit = "1s" | |||
``` | |||
Note that in a successful round, the only timeout that we always wait for, no | |||
matter what, is `timeout_commit`. | |||
Here's a brief summary of the timeouts: | |||
- `timeout_propose` = how long we wait for a proposal block before prevoting | |||
nil | |||
- `timeout_propose_delta` = how much timeout_propose increases with each round | |||
- `timeout_prevote` = how long we wait after receiving +2/3 prevotes for | |||
anything (ie. not a single block or nil) | |||
- `timeout_prevote_delta` = how much the timeout_prevote increases with each | |||
round | |||
- `timeout_precommit` = how long we wait after receiving +2/3 precommits for | |||
anything (ie. not a single block or nil) | |||
- `timeout_precommit_delta` = how much the timeout_precommit increases with | |||
each round | |||
- `timeout_commit` = how long we wait after committing a block, before starting | |||
on the new height (this gives us a chance to receive some more precommits, | |||
even though we already have +2/3) | |||
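To make the `_delta` settings concrete, here is a small worked example using the default propose values. It assumes the delta is added once per round and that rounds are counted from 0; check the behavior of your Tendermint version before relying on the exact numbers.

```toml
[consensus]
timeout_propose = "3s"
timeout_propose_delta = "500ms"
# Under the assumption above, the wait for a proposal would be:
#   round 0: 3s
#   round 1: 3s + 1 * 500ms = 3.5s
#   round 2: 3s + 2 * 500ms = 4s
```

The prevote and precommit timeouts would grow in the same way with their own deltas.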
## P2P settings | |||
This section will cover settings within the p2p section of the `config.toml`. | |||
- `external_address` = is the address that will be advertised for other nodes to use. We recommend setting this field with your public IP and p2p port. | |||
- `seeds` = is a list of comma separated seed nodes that you will connect to upon start and ask for peers. A seed node is a node that does not participate in consensus but only helps propagate peers to nodes in the network. | |||
- `persistent_peers` = is a list of comma separated peers that you will always want to be connected to. If you're already connected to the maximum number of peers, persistent peers will not be added. | |||
- `max_num_inbound_peers` = is the maximum number of peers you will accept inbound connections from at one time (where they dial your address and initiate the connection). | |||
- `max_num_outbound_peers` = is the maximum number of peers you will initiate outbound connects to at one time (where you dial their address and initiate the connection). | |||
- `unconditional_peer_ids` = is similar to `persistent_peers` except that these peers will be connected to even if you are already connected to the maximum number of peers. This can be a validator node ID on your sentry node. | |||
- `pex` = turns the peer exchange reactor on or off. A validator node will want `pex` turned off so that it does not begin gossiping to unknown peers on the network. PeX can also be turned off for statically configured networks with fixed network connectivity. For full nodes on open, dynamic networks, it should be turned on.
- `seed_mode` = is used when node operators want to run their node as a seed node. Seed nodes run a variation of the PeX protocol that disconnects from peers after sending them a list of peers to connect to. To minimize the server's resource usage, it is recommended to set the mempool's size to 0.
- `private_peer_ids` = is a comma-separated list of node IDs that you do not want exposed to other peers (i.e. you will not tell other peers about the `private_peer_ids`). This can be filled with a validator's node ID.
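Several of the settings above (`seeds`, `persistent_peers`, `private_peer_ids`) take comma-separated lists. As a small sketch, the hypothetical helper below joins its arguments into such a list. The node IDs, IP addresses and port in the example are placeholders, not real peers.

```sh
# Hypothetical helper: join its arguments with commas, producing a value
# suitable for a comma-separated setting such as `persistent_peers`.
# The body runs in a subshell so the IFS change does not leak out.
join_peers() (
  IFS=,
  echo "$*"
)

# Placeholder node IDs and addresses, for illustration only.
join_peers "nodeid1@10.0.0.1:26656" "nodeid2@10.0.0.2:26656"
```

The resulting string can be pasted into the corresponding field of `config.toml`.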
This file has moved to the [node_operators section](../node_operators/configuration.md). |
@ -1,60 +1,7 @@ | |||
--- | |||
order: 5 | |||
order: false | |||
--- | |||
# Metrics | |||
Tendermint can report and serve the Prometheus metrics, which in their turn can | |||
be consumed by Prometheus collector(s). | |||
This functionality is disabled by default. | |||
To enable the Prometheus metrics, set `instrumentation.prometheus=true` in your
config file. Metrics will be served under `/metrics` on port 26660 by default.
The listen address can be changed in the config file (see
`instrumentation.prometheus_listen_addr`).
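Once metrics are enabled, a quick way to confirm that the endpoint works is to fetch `/metrics` and pick out one value. The sketch below is a hypothetical helper, `metric_value`, that reads Prometheus text output on standard input and prints the value of a named metric. The `tendermint_` name prefix and the `localhost` address in the usage comment are assumptions; check the names your node actually exposes.

```sh
# Hypothetical helper: print the value(s) of one metric from Prometheus
# text-format output read on stdin. Comment lines (# HELP / # TYPE) are
# skipped, and any {label="..."} part is stripped before comparing names.
# Assumes label values contain no spaces.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }
    {
      key = $1
      sub(/[{].*/, "", key)
      if (key == name) print $NF
    }
  '
}

# Usage (address and metric name are assumptions):
#   curl -s http://localhost:26660/metrics | metric_value tendermint_consensus_height
```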
## List of available metrics | |||
The following metrics are available: | |||
| **Name**                               | **Type**  | **Tags**      | **Description**                                                        |
| -------------------------------------- | --------- | ------------- | ---------------------------------------------------------------------- |
| consensus_height                       | Gauge     |               | Height of the chain                                                    |
| consensus_validators                   | Gauge     |               | Number of validators                                                   |
| consensus_validators_power             | Gauge     |               | Total voting power of all validators                                   |
| consensus_validator_power              | Gauge     |               | Voting power of the node if in the validator set                       |
| consensus_validator_last_signed_height | Gauge     |               | Last height the node signed a block, if the node is a validator        |
| consensus_validator_missed_blocks      | Gauge     |               | Total amount of blocks missed for the node, if the node is a validator |
| consensus_missing_validators           | Gauge     |               | Number of validators who did not sign                                  |
| consensus_missing_validators_power     | Gauge     |               | Total voting power of the missing validators                           |
| consensus_byzantine_validators         | Gauge     |               | Number of validators who tried to double sign                          |
| consensus_byzantine_validators_power   | Gauge     |               | Total voting power of the byzantine validators                         |
| consensus_block_interval_seconds       | Histogram |               | Time between this and last block (Block.Header.Time) in seconds        |
| consensus_rounds                       | Gauge     |               | Number of rounds                                                       |
| consensus_num_txs                      | Gauge     |               | Number of transactions                                                 |
| consensus_total_txs                    | Gauge     |               | Total number of transactions committed                                 |
| consensus_block_parts                  | Counter   | peer_id       | Number of block parts transmitted by peer                              |
| consensus_latest_block_height          | Gauge     |               | /status sync_info number                                               |
| consensus_fast_syncing                 | Gauge     |               | Either 0 (not fast syncing) or 1 (syncing)                             |
| consensus_state_syncing                | Gauge     |               | Either 0 (not state syncing) or 1 (syncing)                            |
| consensus_block_size_bytes             | Gauge     |               | Block size in bytes                                                    |
| p2p_peers                              | Gauge     |               | Number of peers the node is connected to                               |
| p2p_peer_receive_bytes_total           | Counter   | peer_id, chID | Number of bytes per channel received from a given peer                 |
| p2p_peer_send_bytes_total              | Counter   | peer_id, chID | Number of bytes per channel sent to a given peer                       |
| p2p_peer_pending_send_bytes            | Gauge     | peer_id       | Number of pending bytes to be sent to a given peer                     |
| p2p_num_txs                            | Gauge     | peer_id       | Number of transactions submitted by each peer_id                       |
| p2p_pending_send_bytes                 | Gauge     | peer_id       | Amount of data pending to be sent to peer                              |
| mempool_size                           | Gauge     |               | Number of uncommitted transactions                                     |
| mempool_tx_size_bytes                  | Histogram |               | Transaction sizes in bytes                                             |
| mempool_failed_txs                     | Counter   |               | Number of failed transactions                                          |
| mempool_recheck_times                  | Counter   |               | Number of transactions rechecked in the mempool                        |
| state_block_processing_time            | Histogram |               | Time between BeginBlock and EndBlock in ms                             |
## Useful queries | |||
Percentage of missing + byzantine validators: | |||
```md
((consensus_byzantine_validators_power + consensus_missing_validators_power) / consensus_validators_power) * 100
```
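To make the arithmetic behind this query concrete, here is a small sketch that evaluates the same expression for three values you supply by hand. The helper name `unavailable_power_pct` and the sample numbers are hypothetical; in practice the values would come from your metrics.

```sh
# Hypothetical helper: percentage of voting power that is byzantine or
# missing, i.e. (byzantine + missing) / total * 100, to two decimals.
# Returns a non-zero status if the total is not positive.
unavailable_power_pct() {
  awk -v byz="$1" -v missing="$2" -v total="$3" 'BEGIN {
    if (total <= 0) { exit 1 }
    printf "%.2f\n", (byz + missing) / total * 100
  }'
}

# Made-up example: 1 unit byzantine, 2 units missing, 30 units in total.
unavailable_power_pct 1 2 30
```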
This file has moved to the [node_operators section](../node_operators/metrics.md). |
@ -1,114 +1,7 @@ | |||
--- | |||
order: 6 | |||
order: false | |||
--- | |||
# Validators | |||
Validators are responsible for committing new blocks in the blockchain. | |||
These validators participate in the consensus protocol by broadcasting | |||
_votes_ which contain cryptographic signatures signed by each | |||
validator's private key. | |||
Some Proof-of-Stake consensus algorithms aim to create a "completely" | |||
decentralized system where all stakeholders (even those who are not | |||
always available online) participate in the committing of blocks. | |||
Tendermint has a different approach to block creation. Validators are | |||
expected to be online, and the set of validators is permissioned/curated | |||
by some external process. Proof-of-stake is not required, but can be | |||
implemented on top of Tendermint consensus. That is, validators may be | |||
required to post collateral on-chain, off-chain, or may not be required | |||
to post any collateral at all. | |||
Validators have a cryptographic key-pair and an associated amount of | |||
"voting power". Voting power need not be the same. | |||
## Becoming a Validator | |||
There are two ways to become a validator.
1. They can be pre-established in the [genesis state](./using-tendermint.md#genesis) | |||
2. The ABCI app responds to the EndBlock message with changes to the | |||
existing validator set. | |||
## Setting up a Validator | |||
When setting up a validator there are countless ways to configure your setup. This guide is aimed at showing one of them, the sentry node design. This design is mainly for DDoS prevention.
### Network Layout | |||
![ALT Network Layout](./sentry_layout.png) | |||
The diagram is based on AWS; other cloud providers offer similar services for building an equivalent setup. Running nodes is not limited to cloud providers; you can run nodes on bare-metal systems as well. The architecture will be the same no matter which setup you decide to go with.
The proposed network diagram is similar to the classical backend/frontend separation of services in a corporate environment. The “backend” in this case is the private network of the validator in the data center. The data center network might involve multiple subnets, firewalls and redundancy devices, which is not detailed on this diagram. The important point is that the data center allows direct connectivity to the chosen cloud environment. Amazon AWS has “Direct Connect”, while Google Cloud has “Partner Interconnect”. This is a dedicated connection to the cloud provider (usually directly to your virtual private cloud instance in one of the regions). | |||
All sentry nodes (the “frontend”) connect to the validator using this private connection. The validator does not have a public IP address to provide its services. | |||
Amazon has multiple availability zones within a region. One can install sentry nodes in other regions too. In this case the second, third and further regions need to have a private connection to the validator node. This can be achieved by VPC Peering (“VPC Network Peering” in Google Cloud). Traffic from sentry nodes in the second, third and further regions is then routed to the first region and through the direct connection to the data center, where it reaches the validator.
A more persistent solution (not detailed on the diagram) is to have multiple direct connections to different regions from the data center. This way VPC Peering is not mandatory, although still beneficial for the sentry nodes. This overcomes the risk of depending on one region. It is more costly. | |||
### Local Configuration | |||
![ALT Local Configuration](./local_config.png) | |||
The validator will only talk to the sentry nodes that are provided. The sentry nodes will communicate with the validator via a secret connection and with the rest of the network through a normal connection. The sentry nodes do have the option of communicating with each other as well.
When initializing nodes there are six parameters in the `config.toml` that may need to be altered.
- `pex:` boolean. This turns the peer exchange reactor on or off for a node. When `pex=false`, only the `persistent_peers` list is available for connection.
- `persistent_peers:` a comma-separated list of `nodeID@ip:port` values that define a list of peers that are expected to be online at all times. This is necessary at first startup because, with `pex=false`, the node cannot discover peers on its own and would otherwise be unable to join the network.
- `unconditional_peer_ids:` comma-separated list of node IDs. These nodes will be connected to no matter the limits of inbound and outbound peers. This is useful when sentry nodes have full address books.
- `private_peer_ids:` comma-separated list of node IDs. These nodes will not be gossiped to the network. This is an important field as you do not want your validator IP gossiped to the network.
- `addr_book_strict:` boolean. By default, only nodes with a routable address will be considered for connection. If this setting is turned off (false), non-routable IP addresses, like addresses in a private network, can be added to the address book.
- `double_sign_check_height:` int64 height. How many blocks to look back to check for the existence of the node's consensus votes before joining consensus. When non-zero, the node will panic upon restart if the same consensus key was used to sign any of the last `double_sign_check_height` blocks. So, validators should stop the state machine, wait for some blocks, and then restart the state machine to avoid a panic.
#### Validator Node Configuration | |||
| Config Option | Setting | | |||
| ------------------------ | -------------------------- | | |||
| pex | false | | |||
| persistent_peers | list of sentry nodes | | |||
| private_peer_ids | none | | |||
| unconditional_peer_ids | optionally sentry node IDs | | |||
| addr_book_strict | false | | |||
| double_sign_check_height | 10 | | |||
The validator node should have `pex=false` so it does not gossip to the entire network. The persistent peers will be your sentry nodes. Private peers can be left empty as the validator is not trying to hide who it is communicating with. Setting unconditional peers is optional for a validator because it will not have a full address book.
#### Sentry Node Configuration | |||
| Config Option | Setting | | |||
| ---------------------- | --------------------------------------------- | | |||
| pex | true | | |||
| persistent_peers | validator node, optionally other sentry nodes | | |||
| private_peer_ids | validator node ID | | |||
| unconditional_peer_ids | validator node ID, optionally sentry node IDs | | |||
| addr_book_strict | false | | |||
The sentry nodes should be able to talk to the entire network, which is why `pex=true`. The persistent peers of a sentry node will be the validator, and optionally other sentry nodes. The sentry nodes must not gossip the validator's IP; to ensure this, put the validator's node ID in `private_peer_ids`. The unconditional peer IDs will be the validator ID and optionally other sentry nodes.
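The two tables above can be turned into config fragments mechanically. The sketch below prints the peer-related keys for each role in `key = value` form. The function names are hypothetical, the peer addresses in the usage comment are placeholders, and the optional `unconditional_peer_ids` entry for the validator and the `double_sign_check_height` setting are left out to keep the example short.

```sh
# Hypothetical helper: peer-related settings for the validator node.
# $1 is a comma-separated list of sentry peers in nodeID@ip:port form.
validator_p2p_settings() {
  sentry_peers="$1"
  cat <<EOF
pex = false
persistent_peers = "$sentry_peers"
private_peer_ids = ""
addr_book_strict = false
EOF
}

# Hypothetical helper: peer-related settings for a sentry node.
# $1 is the validator peer in nodeID@ip:port form; the node ID is the
# part before the "@".
sentry_p2p_settings() {
  validator_peer="$1"
  validator_id="${validator_peer%%@*}"
  cat <<EOF
pex = true
persistent_peers = "$validator_peer"
private_peer_ids = "$validator_id"
unconditional_peer_ids = "$validator_id"
addr_book_strict = false
EOF
}

# Usage (placeholder IDs and addresses):
#   validator_p2p_settings "sentry1id@10.0.1.10:26656,sentry2id@10.0.1.11:26656"
#   sentry_p2p_settings "validatorid@10.0.0.5:26656"
```

Review the printed values against your own `config.toml` before copying them in; the sketch does not edit any files.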
> Note: Do not forget to secure your node's firewalls when setting them up. | |||
More Information can be found at these links: | |||
- <https://kb.certus.one/> | |||
- <https://forum.cosmos.network/t/sentry-node-architecture-overview/454> | |||
### Validator keys | |||
Protecting a validator's consensus key is the most important factor to take into account when designing your setup. The key that a validator is given upon creation of the node is called a consensus key. It has to be online at all times in order to vote on blocks. It is **not recommended** to merely hold your private key in the default JSON file (`priv_validator_key.json`). Fortunately, the [Interchain Foundation](https://interchain.io/) has worked with a team to build a key management server for validators. It is used extensively in production, and you can find documentation on how to use it [here](https://github.com/iqlusioninc/tmkms). You are not limited to using this tool; there are also [HSMs](https://safenet.gemalto.com/data-encryption/hardware-security-modules-hsms/), although no particular HSM is recommended.
Currently Tendermint uses [Ed25519](https://ed25519.cr.yp.to/) keys which are widely supported across the security sector and HSMs. | |||
## Committing a Block | |||
> **+2/3 is short for "more than 2/3"** | |||
A block is committed when +2/3 of the validator set sign [precommit | |||
votes](https://github.com/tendermint/spec/blob/953523c3cb99fdb8c8f7a2d21e3a99094279e9de/spec/blockchain/blockchain.md#vote) for that block at the same `round`. | |||
The +2/3 set of precommit votes is called a | |||
[_commit_](https://github.com/tendermint/spec/blob/953523c3cb99fdb8c8f7a2d21e3a99094279e9de/spec/blockchain/blockchain.md#commit). While any +2/3 set of | |||
precommits for the same block at the same height&round can serve as | |||
validation, the canonical commit is included in the next block (see | |||
[LastCommit](https://github.com/tendermint/spec/blob/953523c3cb99fdb8c8f7a2d21e3a99094279e9de/spec/blockchain/blockchain.md#lastcommit)). | |||
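Since "+2/3" means strictly more than two thirds, the check is easy to get wrong at the boundary. The sketch below shows the comparison with integer arithmetic only. The helper name `has_two_thirds` and the voting-power numbers are hypothetical.

```sh
# Hypothetical helper: succeeds if `signed` voting power is strictly more
# than 2/3 of `total`. Written as 3 * signed > 2 * total so that no
# division (and no rounding) is involved.
has_two_thirds() {
  signed="$1"
  total="$2"
  [ $(( 3 * signed )) -gt $(( 2 * total )) ]
}

# Made-up voting power: 67 out of 100 is enough, 66 out of 100 is not.
if has_two_thirds 67 100; then echo "67/100: +2/3 reached"; fi
if ! has_two_thirds 66 100; then echo "66/100: +2/3 not reached"; fi
```

Note that exactly two thirds (for example 2 out of 3) does not qualify.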
This file has moved to the [node_operators section](../node_operators/validators.md). |
@ -0,0 +1,77 @@ | |||
--- | |||
order: 2 | |||
--- | |||
# Debug Like A Pro | |||
## Intro | |||
Tendermint Core is a fairly robust BFT replication engine. Unfortunately, as with other software, failures sometimes do happen. The question is then “what do you do” when the system deviates from the expected behavior. | |||
The first response is usually to take a look at the logs. By default, Tendermint writes logs to standard output ¹. | |||
```sh
I[2020-05-29|03:03:16.145] Committed state module=state height=2282 txs=0 appHash=0A27BC6B0477A8A50431704D2FB90DB99CBFCB67A2924B5FBF6D4E78538B67C1
I[2020-05-29|03:03:21.690] Executed block module=state height=2283 validTxs=0 invalidTxs=0
I[2020-05-29|03:03:21.698] Committed state module=state height=2283 txs=0 appHash=EB4E409D3AF4095A0757C806BF160B3DE4047AC0416F584BFF78FC0D44C44BF3
I[2020-05-29|03:03:27.994] Executed block module=state height=2284 validTxs=0 invalidTxs=0
I[2020-05-29|03:03:28.003] Committed state module=state height=2284 txs=0 appHash=3FC9237718243A2CAEE3A8B03AE05E1FC3CA28AEFE8DF0D3D3DCE00D87462866
E[2020-05-29|03:03:32.975] enterPrevote: ProposalBlock is invalid module=consensus height=2285 round=0 err="wrong signature (#35): C683341000384EA00A345F9DB9608292F65EE83B51752C0A375A9FCFC2BD895E0792A0727925845DC13BA0E208C38B7B12B2218B2FE29B6D9135C53D7F253D05"
```
If you’re running a validator in production, it might be a good idea to forward the logs for analysis using filebeat or similar tools. Also, you can set up a notification in case of any errors. | |||
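As a lightweight starting point before setting up a full log pipeline, you can filter for error lines yourself. The sketch below assumes the log format shown above, where error entries start with `E[` and informational ones with `I[`. The helper name `error_lines` and the log path in the usage comment are hypothetical.

```sh
# Hypothetical helper: print only error-level log lines, assuming each
# error entry starts with "E[" as in the sample output above.
# Reads stdin, or the files given as arguments.
# Note: like grep, it exits with status 1 when no error lines are found.
error_lines() {
  grep '^E\[' "$@"
}

# Usage (the log path is a placeholder):
#   error_lines /path/to/tendermint.log
#   tail -f /path/to/tendermint.log | error_lines
```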
The logs should give you the basic idea of what has happened. In the worst-case scenario, the node has stalled and does not produce any logs (or simply panicked). | |||
The next step is to call /status, /net_info, /consensus_state and /dump_consensus_state RPC endpoints. | |||
```sh
curl http://<server>:26657/status
curl http://<server>:26657/net_info
curl http://<server>:26657/consensus_state
curl http://<server>:26657/dump_consensus_state
```
Please note that /consensus_state and /dump_consensus_state may not return a result if the node has stalled (since they try to get a hold of the consensus mutex). | |||
The output of these endpoints contains all the information needed for developers to understand the state of the node. It will give you an idea if the node is lagging behind the network, how many peers it’s connected to, and what the latest consensus state is. | |||
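To check quickly whether the node is lagging, you can pull a couple of fields out of the `/status` response. The sketch below is a deliberately simple text-based extractor rather than a real JSON parser. It assumes the response contains `latest_block_height` and `catching_up` keys and that the values contain no spaces. The helper name `status_field` is hypothetical, and a tool such as `jq` is a better fit if you have it installed.

```sh
# Hypothetical helper: print the value of the last `"key": value` pair
# found in JSON text on stdin. Works for simple string, number and
# boolean values without spaces; it is not a general JSON parser.
status_field() {
  tr '\n' ' ' |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}[:space:]]*\).*/\1/p'
}

# Usage (replace <server> as in the commands above):
#   curl -s http://<server>:26657/status | status_field latest_block_height
#   curl -s http://<server>:26657/status | status_field catching_up
```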
At this point, if the node is stalled and you want to restart it, the best thing you can do is to kill it with the -6 signal (SIGABRT):
```sh | |||
kill -6 <PID> | |||
``` | |||
which will dump the list of the currently running goroutines. The list is super useful when debugging a deadlock. | |||
`PID` is the Tendermint process ID. You can find it by running `ps -a | grep tendermint | awk '{print $1}'`.
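If you script this step, it helps to verify that the PID is still alive before sending the signal. The following is a minimal sketch with a hypothetical helper name, `abort_with_dump`; it only wraps the `kill` command shown above and does not restart the node.

```sh
# Hypothetical helper: send SIGABRT (signal 6) to a process after
# checking that the PID exists. `kill -0` sends no signal; it only tests
# whether the process can be signalled.
abort_with_dump() {
  pid="$1"
  if ! kill -0 "$pid" 2>/dev/null; then
    echo "no such process: $pid" >&2
    return 1
  fi
  kill -s ABRT "$pid"
}

# Usage:
#   abort_with_dump <PID>
```

Make sure the node's standard error is captured somewhere first, otherwise the goroutine dump is lost when the process exits.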
## Tendermint debug kill | |||
To ease the burden of collecting different pieces of data, Tendermint Core (since v0.33) provides the `tendermint debug kill` tool, which will do all of the above steps for you, wrapping everything into a single archive file.
```sh
tendermint debug kill <pid> </path/to/out.zip> --home=</path/to/app.d>
```
Here’s the official documentation page: <https://docs.tendermint.com/master/tools/debugging.html>
If you’re using a process supervisor, like systemd, it will restart Tendermint automatically. We strongly advise you to have one in production. If not, you will need to restart the node by hand.
Another advantage of using Tendermint debug is that the same archive file can be given to Tendermint Core developers, in cases where you think there’s a software issue. | |||
## Tendermint debug dump | |||
Okay, but what if the node has not stalled, but its state is degrading over time? Tendermint debug dump to the rescue! | |||
```sh
tendermint debug dump </path/to/out> --home=</path/to/app.d>
```
It won’t kill the node, but it will gather all of the above data and package it into an archive file. Plus, it will also make a heap dump, which should help if Tendermint is leaking memory. | |||
At this point, depending on how severe the degradation is, you may want to restart the process. | |||
## Outro | |||
We’re hoping that the `tendermint debug` subcommand will become the de facto first response to any incident.
Let us know what your experience has been so far! Have you had a chance to try `tendermint debug` yet? | |||
Join our chat, where we discuss the current issues and future improvements. | |||
— | |||
[1]: Of course, you’re free to redirect the Tendermint’s output to a file or forward it to another server. |