This document attempts to capture and discuss some of the areas of Tendermint that seem to be cited as causing performance issue. I'm hoping to continue to gather feedback and input on this document to better understand what issues Tendermint performance may cause for our users.
The overall goal of this document is to allow the maintainers and community to get a better sense of these issues and to be more capably able to discuss them and weight trade-offs about any proposed performance-focused changes. This document does not aim to propose any performance improvements. It does suggest useful places for benchmarks and places where additional metrics would be useful for diagnosing and further understanding Tendermint performance.
Please comment with areas where my reasoning seems off or with additional areas that Tendermint performance may be causing user pain.
Communication in Tendermint among consensus nodes, applications, and operator
tools all use different message formats and transport mechanisms. In some
cases there are multiple options. Having all these options complicates both the
code and the developer experience, and hides bugs. To support a more robust,
trustworthy, and usable system, we should document which communication paths
are essential, which could be removed or reduced in scope, and what we can
improve for the most important use cases.
This document proposes a variety of possible improvements of varying size and
scope. Specific design proposals should get their own documentation.
The responses from node RPCs encode hash values as hexadecimal strings. This
behaviour is stipulated in our OpenAPI documentation. In some cases, however,
hashes received as JSON parameters were being decoded as byte buffers, as is
the convention for JSON.
This resulted in the confusing situation that a hash reported by one request
(e.g., broadcast_tx_commit) could not be passed as a parameter to another
(e.g., tx) via JSON, without translating the hex-encoded output hash into the
base64 encoding used by JSON for opaque bytes.
Fixes#6802.
The 0.35 release cycle renamed the 'fastsync' functionality to 'blocksync'. This change brings the configuration parameters in line with that change. Namely, it updates the configuration file `[fastsync]` field to be `[blocksync]` and changes the command line flag and config file parameters `--fast-sync` and `fast-sync` to `--enable-block-sync` and `enable-block-sync` respectively.
Error messages were added to help users encountering these changes be able to quickly make the needed update to their files/scripts.
When using the old command line argument for fast-sync, the following is printed
```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false --fast-sync=false
ERROR: invalid argument "false" for "--fast-sync" flag: --fast-sync has been deprecated, please use --enable-block-sync
```
When using one of the old config file parameters, the following is printed:
```
./build/tendermint start --proxy-app=kvstore --consensus.create-empty-blocks=false
ERROR: error in config file: a configuration parameter named 'fast-sync' was found in the configuration file. The 'fast-sync' parameter has been renamed to 'enable-block-sync', please update the 'fast-sync' field in your configuration file to 'enable-block-sync'
```
Update the schema and implementation of the Postgres event indexer to improve
certain types of queries against the index. These changes address the use cases
raised by #6843, and are partly inspired by the prototype schema in that issue.
In the old schema, events were flattened, making it difficult to find all the events
associated with a particular block or transaction. In addition, events with no key/value
attributes were entirely lost, since entries were generated only for attributes.
To address these issues, this new schema records blocks, transactions, events,
and attributes in separate tables, and provides views that join these tables to
give a more convenient query surface for block and transaction events.
- All events for a given block can be queried from the `block_events` view.
- All events for a given transaction can be queried from the `tx_events` view.
- Multiple events for the same key can be indexed for both blocks and transactions.
The tests have been reworked, but all of the existing test cases for the old schema
still pass with the new implementation. Various other minor cleanups are included,
ADR-065 is also updated to reflect the updated schema.
An explicit exit prevents the deferred cleanup code from running. In this case,
falling off the end of main will achieve the same goal as an explicit exit.
EDIT: Updated, see [comment below]( https://github.com/tendermint/tendermint/pull/6785#issuecomment-897793175)
This change adds a sketch of the `Debug` mode.
This change adds a `Debug` struct to the node package. This `Debug` struct is intended to be created and started by a command in the `cmd` directory. The `Debug` struct runs the RPC server on the data directories: both the state store and the block store.
This change required a good deal of refactoring. Namely, a new `rpc.go` file was added to the `node` package. This file encapsulates functions for starting RPC servers used by nodes. A potential additional change is to further factor this code into shared code _in_ the `rpc` package.
Minor API tweaks were also made that seemed appropriate such as the mechanism for fetching routes from the `rpc/core` package.
Additional work is required to register the `Debug` service as a command in the `cmd` directory but I am looking for feedback on if this direction seems appropriate before diving much further.
closes: #5908
This ADR restores a variation of the old Request for Comments documentation
that we previously used. The proposal differs from the original formulation,
and does not replace ADRs.
closes#2498
solves part of #3365
Note: difficult to test the event emit in SwitchToFastSync part, might need to change `stateSyncReactor` to an interface in the `nodeImpl` struct
Per conversations earlier today, we'll consider all proposed implementation changes part of the ADR process rather than the RFC process (which will remain, for now, on the spec; this may get incorporated instead into the burgeoning "CIPS" process).
This change renames RFC 1 to ADR 66, leaving space for the not-yet-merged ADR 65.
* add time warping lunatic attack test
* create too high and connecton refused errors and add to the light client provider
* add height check to provider
* introduce block lag
* add detection logic for processing forward lunatic attack
* add node-side verification logic
* clean up tests and formatting
* update adr's
* update testing
* fix fetching the latest block
* format
* update changelog
* implement suggestions
* modify ADR's
* format
* clean up node evidence verification