# ADR 065: Custom Event Indexing

- [ADR 065: Custom Event Indexing](#adr-065-custom-event-indexing)
  - [Changelog](#changelog)
  - [Status](#status)
  - [Context](#context)
  - [Alternative Approaches](#alternative-approaches)
  - [Decision](#decision)
  - [Detailed Design](#detailed-design)
    - [EventSink](#eventsink)
    - [Supported Sinks](#supported-sinks)
      - [`KVEventSink`](#kveventsink)
      - [`PSQLEventSink`](#psqleventsink)
    - [Configuration](#configuration)
  - [Future Improvements](#future-improvements)
  - [Consequences](#consequences)
    - [Positive](#positive)
    - [Negative](#negative)
    - [Neutral](#neutral)
  - [References](#references)

## Changelog

- April 1, 2021: Initial Draft (@alexanderbez)
- April 28, 2021: Specify search capabilities are only supported through the KV indexer (@marbar3778)
- May 19, 2021: Update the SQL schema and the eventsink interface (@jayt106)
- Aug 30, 2021: Update the SQL schema and the psql implementation (@creachadair)

## Status

Accepted

## Context

Currently, Tendermint Core supports block and transaction event indexing through
the `tx_index.indexer` configuration. Transaction events are indexed via a
`TxIndexer` type. Block events, specifically those from `BeginBlock` and
`EndBlock` application responses, are indexed via a `BlockIndexer` type. A single
`IndexerService` manages both types: it consumes events and sends them off to be
indexed by the respective type.

In addition to indexing, Tendermint Core also supports querying for both indexed
transaction and block events via Tendermint's RPC layer. The ability to query
these indexed events enables many upstream client and application capabilities,
e.g. block explorers, IBC relayers, and auxiliary data availability and indexing
services.

Currently, Tendermint only supports indexing via a `kv` indexer, which is backed
by an embedded key/value store database. The `kv` indexer implements its own
indexing and query mechanisms. While indexing is relatively simple, providing a
rich and flexible query layer is not, and this has caused many issues and UX
concerns for upstream clients and applications.

The fragile nature of the proprietary `kv` query engine and the potential
performance and scaling issues that arise when a large number of consumers are
introduced motivate the need for a more robust and flexible indexing and query
solution.

## Alternative Approaches

The only serious alternative considered for a more robust solution was a
transition to [SQLite](https://www.sqlite.org/index.html).

While the approach would work, it locks us into a specific query language and
storage layer, so in some ways it's only a bit better than our current approach.
In addition, the implementation would require the introduction of CGO into the
Tendermint Core stack, whereas right now CGO is introduced only for certain
database backends.

## Decision

We will adopt a similar approach to that of the Cosmos SDK's `KVStore` state
listening described in [ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md).

Namely, we will perform the following:

- Introduce a new interface, `EventSink`, that all data sinks must implement.
- Augment the existing `tx_index.indexer` configuration to now accept a series
  of one or more indexer types, i.e., sinks.
- Combine the current `TxIndexer` and `BlockIndexer` into a single `KVEventSink`
  that implements the `EventSink` interface.
- Introduce an additional `EventSink` that is backed by [PostgreSQL](https://www.postgresql.org/).
  - Implement the necessary schemas to support both block and transaction event
    indexing.
- Update `IndexerService` to use a series of `EventSinks`.
- Proxy queries to the relevant sink's native query layer.
- Update all relevant RPC methods.

## Detailed Design

### EventSink

We introduce the `EventSink` interface type that all supported sinks must implement.
The interface is defined as follows:

```go
type EventSink interface {
	IndexBlockEvents(types.EventDataNewBlockHeader) error
	IndexTxEvents([]*abci.TxResult) error

	SearchBlockEvents(context.Context, *query.Query) ([]int64, error)
	SearchTxEvents(context.Context, *query.Query) ([]*abci.TxResult, error)

	GetTxByHash([]byte) (*abci.TxResult, error)
	HasBlock(int64) (bool, error)

	Type() EventSinkType
	Stop() error
}
```

The `IndexerService` will accept a list of one or more `EventSink` types. During
the `OnStart` method it will call the appropriate APIs on each `EventSink` to
index both block and transaction events.

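
As a rough illustration of that fan-out, the sketch below sends one block and its transaction results to every configured sink. It is not the `IndexerService` implementation: `Sink`, `BlockHeader`, `TxResult`, `memSink`, and `indexBlock` are hypothetical, reduced stand-ins for the real types, and the choice to keep going when one sink fails and report all failures together is an assumption rather than something this ADR specifies. It uses `errors.Join`, which requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

// BlockHeader and TxResult are simplified stand-ins for
// types.EventDataNewBlockHeader and abci.TxResult.
type BlockHeader struct{ Height int64 }

type TxResult struct {
	Height int64
	Index  uint32
}

// Sink is a hypothetical, reduced subset of the EventSink interface.
type Sink interface {
	Type() string
	IndexBlockEvents(BlockHeader) error
	IndexTxEvents([]*TxResult) error
}

// memSink records what it was asked to index; it exists only for this demo.
type memSink struct {
	name   string
	blocks []int64
	txs    int
	fail   bool
}

func (s *memSink) Type() string { return s.name }

func (s *memSink) IndexBlockEvents(h BlockHeader) error {
	if s.fail {
		return errors.New("sink unavailable")
	}
	s.blocks = append(s.blocks, h.Height)
	return nil
}

func (s *memSink) IndexTxEvents(txrs []*TxResult) error {
	if s.fail {
		return errors.New("sink unavailable")
	}
	s.txs += len(txrs)
	return nil
}

// indexBlock sends one block and its transaction results to every sink.
// Assumption: a failing sink does not stop the others; all failures are
// joined into a single returned error.
func indexBlock(sinks []Sink, h BlockHeader, txrs []*TxResult) error {
	var errs []error
	for _, s := range sinks {
		if err := s.IndexBlockEvents(h); err != nil {
			errs = append(errs, fmt.Errorf("%s: block events at height %d: %w", s.Type(), h.Height, err))
			continue // skip tx events for a sink that could not record the block
		}
		if len(txrs) == 0 {
			continue
		}
		if err := s.IndexTxEvents(txrs); err != nil {
			errs = append(errs, fmt.Errorf("%s: tx events at height %d: %w", s.Type(), h.Height, err))
		}
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	kv := &memSink{name: "kv"}
	psql := &memSink{name: "psql", fail: true}
	err := indexBlock([]Sink{kv, psql}, BlockHeader{Height: 7}, []*TxResult{{Height: 7, Index: 0}})
	fmt.Println("kv blocks:", kv.blocks, "kv txs:", kv.txs, "error:", err)
}
```

Indexing the block before its transactions in each sink mirrors the dependency in the PostgreSQL schema, where a transaction row refers to an existing block row.
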
### Supported Sinks

We will initially support two `EventSink` types out of the box.

#### `KVEventSink`

This type of `EventSink` is a combination of the `TxIndexer` and `BlockIndexer`
indexers, both of which are backed by a single embedded key/value database.

The bulk of the existing business logic will remain the same, but the existing APIs
will be mapped to the new `EventSink` API. Both types will be removed in favor of a
single `KVEventSink` type.

The `KVEventSink` will be the only `EventSink` enabled by default, so from a UX
perspective, operators should not notice a difference apart from a configuration
change.

We omit `EventSink` implementation details as it should be fairly straightforward
to map the existing business logic to the new APIs.

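
To give a sense of what that mapping could look like, here is a minimal sketch of an adapter that delegates `EventSink`-style calls to two existing indexers. Everything in it is an assumption made for illustration: `legacyTxIndexer` and `legacyBlockIndexer` are hypothetical reductions of the current `TxIndexer` and `BlockIndexer` APIs, the map-backed indexers stand in for the key/value store, and transaction hashes are plain strings rather than `[]byte`.

```go
package main

import "fmt"

// TxResult and BlockHeader are simplified stand-ins for abci.TxResult and
// types.EventDataNewBlockHeader.
type TxResult struct {
	Height int64
	Hash   string
}

type BlockHeader struct{ Height int64 }

// legacyTxIndexer and legacyBlockIndexer are hypothetical, reduced versions
// of the existing TxIndexer and BlockIndexer APIs.
type legacyTxIndexer interface {
	Index(*TxResult) error
	Get(hash string) (*TxResult, error)
}

type legacyBlockIndexer interface {
	Index(BlockHeader) error
	Has(height int64) (bool, error)
}

// kvSink adapts the two legacy indexers to EventSink-style methods.
type kvSink struct {
	txs    legacyTxIndexer
	blocks legacyBlockIndexer
}

func (s *kvSink) Type() string { return "kv" }

func (s *kvSink) IndexBlockEvents(h BlockHeader) error { return s.blocks.Index(h) }

func (s *kvSink) IndexTxEvents(txrs []*TxResult) error {
	for _, txr := range txrs {
		if err := s.txs.Index(txr); err != nil {
			return fmt.Errorf("indexing tx %s: %w", txr.Hash, err)
		}
	}
	return nil
}

func (s *kvSink) GetTxByHash(hash string) (*TxResult, error) { return s.txs.Get(hash) }

func (s *kvSink) HasBlock(height int64) (bool, error) { return s.blocks.Has(height) }

// mapTxIndexer and mapBlockIndexer are in-memory stand-ins for the
// key/value-backed indexers, used only to exercise the adapter.
type mapTxIndexer map[string]*TxResult

func (m mapTxIndexer) Index(r *TxResult) error            { m[r.Hash] = r; return nil }
func (m mapTxIndexer) Get(hash string) (*TxResult, error) { return m[hash], nil }

type mapBlockIndexer map[int64]bool

func (m mapBlockIndexer) Index(h BlockHeader) error      { m[h.Height] = true; return nil }
func (m mapBlockIndexer) Has(height int64) (bool, error) { return m[height], nil }

func main() {
	sink := &kvSink{txs: mapTxIndexer{}, blocks: mapBlockIndexer{}}
	_ = sink.IndexBlockEvents(BlockHeader{Height: 5})
	_ = sink.IndexTxEvents([]*TxResult{{Height: 5, Hash: "AB12"}})

	ok, _ := sink.HasBlock(5)
	txr, _ := sink.GetTxByHash("AB12")
	fmt.Println(sink.Type(), ok, txr.Height)
}
```
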
#### `PSQLEventSink`

This type of `EventSink` indexes block and transaction events into a
[PostgreSQL](https://www.postgresql.org/) database. The PostgreSQL event sink
will not support `tx_search`, `block_search`, `GetTxByHash` and `HasBlock`.

We define and automatically migrate the following schema when the
`IndexerService` starts.

```sql
-- Table Definition ----------------------------------------------

-- The blocks table records metadata about each block.
-- The block record does not include its events or transactions (see tx_results).
CREATE TABLE blocks (
  rowid      BIGSERIAL PRIMARY KEY,

  height     BIGINT NOT NULL,
  chain_id   VARCHAR NOT NULL,

  -- When this block header was logged into the sink, in UTC.
  created_at TIMESTAMPTZ NOT NULL,

  UNIQUE (height, chain_id)
);

-- Index blocks by height and chain, since we need to resolve block IDs when
-- indexing transaction records and transaction events.
CREATE INDEX idx_blocks_height_chain ON blocks(height, chain_id);

-- The tx_results table records metadata about transaction results. Note that
-- the events from a transaction are stored separately.
CREATE TABLE tx_results (
  rowid      BIGSERIAL PRIMARY KEY,

  -- The block to which this transaction belongs.
  block_id   BIGINT NOT NULL REFERENCES blocks(rowid),
  -- The sequential index of the transaction within the block.
  index      INTEGER NOT NULL,
  -- When this result record was logged into the sink, in UTC.
  created_at TIMESTAMPTZ NOT NULL,
  -- The hex-encoded hash of the transaction.
  tx_hash    VARCHAR NOT NULL,
  -- The protobuf wire encoding of the TxResult message.
  tx_result  BYTEA NOT NULL,

  UNIQUE (block_id, index)
);

-- The events table records events. All events (both block and transaction) are
-- associated with a block ID; transaction events also have a transaction ID.
CREATE TABLE events (
  rowid    BIGSERIAL PRIMARY KEY,

  -- The block and transaction this event belongs to.
  -- If tx_id is NULL, this is a block event.
  block_id BIGINT NOT NULL REFERENCES blocks(rowid),
  tx_id    BIGINT NULL REFERENCES tx_results(rowid),

  -- The application-defined type label for the event.
  type     VARCHAR NOT NULL
);

-- The attributes table records event attributes.
CREATE TABLE attributes (
  event_id      BIGINT NOT NULL REFERENCES events(rowid),
  key           VARCHAR NOT NULL, -- bare key
  composite_key VARCHAR NOT NULL, -- composed type.key
  value         VARCHAR NULL,

  UNIQUE (event_id, key)
);

-- A joined view of events and their attributes. Events that do not have any
-- attributes are represented as a single row with empty key and value fields.
CREATE VIEW event_attributes AS
  SELECT block_id, tx_id, type, key, composite_key, value
  FROM events LEFT JOIN attributes ON (events.rowid = attributes.event_id);

-- A joined view of all block events (those having tx_id NULL).
CREATE VIEW block_events AS
  SELECT blocks.rowid as block_id, height, chain_id, type, key, composite_key, value
  FROM blocks JOIN event_attributes ON (blocks.rowid = event_attributes.block_id)
  WHERE event_attributes.tx_id IS NULL;

-- A joined view of all transaction events.
CREATE VIEW tx_events AS
  SELECT height, index, chain_id, type, key, composite_key, value, tx_results.created_at
  FROM blocks JOIN tx_results ON (blocks.rowid = tx_results.block_id)
  JOIN event_attributes ON (tx_results.rowid = event_attributes.tx_id)
  WHERE event_attributes.tx_id IS NOT NULL;
```
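
Because this sink does not serve `tx_search` or `block_search`, consumers would query the database directly. The following is a sketch, under assumptions, of how a client might look up transactions by event attribute through the `tx_events` view using Go's standard `database/sql` package. The function names, the `TxRef` type, and the `transfer`/`sender` event in `main` are hypothetical; only the view and its columns come from the schema above. Running `findTxs` additionally needs a PostgreSQL driver and a live database, which this block deliberately leaves out.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// txEventsQuery selects the position of every transaction on a chain that
// emitted an attribute with the given composite key and value.
const txEventsQuery = `
SELECT DISTINCT height, index
  FROM tx_events
 WHERE chain_id = $1 AND composite_key = $2 AND value = $3
 ORDER BY height, index;`

// TxRef identifies a transaction by block height and index within the block.
type TxRef struct {
	Height int64
	Index  int32
}

// txEventArgs builds the positional arguments for txEventsQuery. The
// composite key is the event type and the bare attribute key joined by a
// dot, matching the "composed type.key" column in the attributes table.
func txEventArgs(chainID, eventType, attrKey, value string) []any {
	return []any{chainID, eventType + "." + attrKey, value}
}

// findTxs runs txEventsQuery against db and collects the matching rows.
func findTxs(ctx context.Context, db *sql.DB, chainID, eventType, attrKey, value string) ([]TxRef, error) {
	rows, err := db.QueryContext(ctx, txEventsQuery, txEventArgs(chainID, eventType, attrKey, value)...)
	if err != nil {
		return nil, fmt.Errorf("querying tx_events: %w", err)
	}
	defer rows.Close()

	var refs []TxRef
	for rows.Next() {
		var r TxRef
		if err := rows.Scan(&r.Height, &r.Index); err != nil {
			return nil, fmt.Errorf("scanning tx_events row: %w", err)
		}
		refs = append(refs, r)
	}
	return refs, rows.Err()
}

func main() {
	// No database is opened here; print the query and the arguments that
	// findTxs would send for an illustrative event.
	fmt.Println(txEventsQuery)
	fmt.Println(txEventArgs("test-chain", "transfer", "sender", "addr1"))
}
```
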
The `PSQLEventSink` will implement the `EventSink` interface as follows
(some details omitted for brevity):

```go
func NewEventSink(connStr, chainID string) (*EventSink, error) {
	db, err := sql.Open(driverName, connStr)
	// ...

	return &EventSink{
		store:   db,
		chainID: chainID,
	}, nil
}

func (es *EventSink) IndexBlockEvents(h types.EventDataNewBlockHeader) error {
	ts := time.Now().UTC()

	return runInTransaction(es.store, func(tx *sql.Tx) error {
		// Add the block to the blocks table and report back its row ID for use
		// in indexing the events for the block.
		blockID, err := queryWithID(tx, `
INSERT INTO blocks (height, chain_id, created_at)
  VALUES ($1, $2, $3)
  ON CONFLICT DO NOTHING
  RETURNING rowid;
`, h.Header.Height, es.chainID, ts)
		// ...

		// Insert the special block meta-event for height.
		if err := insertEvents(tx, blockID, 0, []abci.Event{
			makeIndexedEvent(types.BlockHeightKey, fmt.Sprint(h.Header.Height)),
		}); err != nil {
			return fmt.Errorf("block meta-events: %w", err)
		}
		// Insert all the block events. Order is important here:
		// begin-block events are inserted before end-block events.
		if err := insertEvents(tx, blockID, 0, h.ResultBeginBlock.Events); err != nil {
			return fmt.Errorf("begin-block events: %w", err)
		}
		if err := insertEvents(tx, blockID, 0, h.ResultEndBlock.Events); err != nil {
			return fmt.Errorf("end-block events: %w", err)
		}
		return nil
	})
}

func (es *EventSink) IndexTxEvents(txrs []*abci.TxResult) error {
	ts := time.Now().UTC()

	for _, txr := range txrs {
		// Encode the result message in protobuf wire format for indexing.
		resultData, err := proto.Marshal(txr)
		// ...

		// Index the hash of the underlying transaction as a hex string.
		txHash := fmt.Sprintf("%X", types.Tx(txr.Tx).Hash())

		if err := runInTransaction(es.store, func(tx *sql.Tx) error {
			// Find the block associated with this transaction.
			blockID, err := queryWithID(tx, `
SELECT rowid FROM blocks WHERE height = $1 AND chain_id = $2;
`, txr.Height, es.chainID)
			// ...

			// Insert a record for this tx_result and capture its ID for indexing events.
			txID, err := queryWithID(tx, `
INSERT INTO tx_results (block_id, index, created_at, tx_hash, tx_result)
  VALUES ($1, $2, $3, $4, $5)
  ON CONFLICT DO NOTHING
  RETURNING rowid;
`, blockID, txr.Index, ts, txHash, resultData)
			// ...

			// Insert the special transaction meta-events for hash and height.
			if err := insertEvents(tx, blockID, txID, []abci.Event{
				makeIndexedEvent(types.TxHashKey, txHash),
				makeIndexedEvent(types.TxHeightKey, fmt.Sprint(txr.Height)),
			}); err != nil {
				return fmt.Errorf("indexing transaction meta-events: %w", err)
			}
			// Index any events packaged with the transaction.
			if err := insertEvents(tx, blockID, txID, txr.Result.Events); err != nil {
				return fmt.Errorf("indexing transaction events: %w", err)
			}
			return nil
		}); err != nil {
			return err
		}
	}
	return nil
}

// SearchBlockEvents is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) SearchBlockEvents(ctx context.Context, q *query.Query) ([]int64, error) {
	return nil, errors.New("block search is not supported by the psql event sink")
}

// SearchTxEvents is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) SearchTxEvents(ctx context.Context, q *query.Query) ([]*abci.TxResult, error) {
	return nil, errors.New("tx search is not supported by the psql event sink")
}

// GetTxByHash is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) GetTxByHash(hash []byte) (*abci.TxResult, error) {
	return nil, errors.New("getTxByHash is not supported by the psql event sink")
}

// HasBlock is not implemented by this sink, and reports an error for all queries.
func (es *EventSink) HasBlock(h int64) (bool, error) {
	return false, errors.New("hasBlock is not supported by the psql event sink")
}
```
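
The listing above calls a `runInTransaction` helper that the ADR does not define. One plausible shape for it, offered as an assumption rather than the actual implementation, is a wrapper that commits when the callback succeeds and rolls back otherwise. The `recorder`, `recConn`, and `recTx` types below are a throwaway `database/sql/driver` stub that only records calls, so the helper can be exercised without a PostgreSQL server; they are not part of the design.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// runInTransaction begins a transaction, runs fn inside it, and commits if
// fn succeeds. If fn fails, the transaction is rolled back and fn's error is
// returned unchanged.
func runInTransaction(db *sql.DB, fn func(*sql.Tx) error) error {
	tx, err := db.Begin()
	if err != nil {
		return fmt.Errorf("beginning transaction: %w", err)
	}
	if err := fn(tx); err != nil {
		_ = tx.Rollback() // report fn's error, not the rollback's
		return err
	}
	return tx.Commit()
}

// recorder is a stub driver and connector that logs transaction calls.
type recorder struct{ calls []string }

func (r *recorder) Connect(context.Context) (driver.Conn, error) { return recConn{r}, nil }
func (r *recorder) Driver() driver.Driver                        { return r }
func (r *recorder) Open(string) (driver.Conn, error)             { return recConn{r}, nil }

// recConn is a stub connection that supports only transactions.
type recConn struct{ r *recorder }

func (c recConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (c recConn) Close() error                        { return nil }
func (c recConn) Begin() (driver.Tx, error) {
	c.r.calls = append(c.r.calls, "begin")
	return recTx{c.r}, nil
}

// recTx is a stub transaction that logs how it ended.
type recTx struct{ r *recorder }

func (t recTx) Commit() error   { t.r.calls = append(t.r.calls, "commit"); return nil }
func (t recTx) Rollback() error { t.r.calls = append(t.r.calls, "rollback"); return nil }

// runDemo runs one successful and one failing callback and returns the
// recorded driver calls together with the error from the failing run.
func runDemo() ([]string, error) {
	rec := &recorder{}
	db := sql.OpenDB(rec)
	defer db.Close()

	if err := runInTransaction(db, func(*sql.Tx) error { return nil }); err != nil {
		return rec.calls, err
	}
	err := runInTransaction(db, func(*sql.Tx) error { return errors.New("insert failed") })
	return rec.calls, err
}

func main() {
	calls, err := runDemo()
	fmt.Println(calls, err)
}
```

Wrapping each block, and each transaction result, in its own database transaction means a failure partway through leaves no partially indexed rows for that unit.
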
### Configuration

The current `tx_index.indexer` configuration would be changed to accept a list
of supported `EventSink` types instead of a single value.

Example:

```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]
```

If the `indexer` list contains the `null` indexer, then no indexers will be used
regardless of what other values may exist.

Additional configuration parameters might be required depending on what event
sinks are supplied to `tx_index.indexer`. The `psql` sink will require an
additional connection configuration.

```toml
[tx_index]

indexer = [
  "kv",
  "psql"
]

pqsql_conn = "postgresql://<user>:<password>@<host>:<port>/<db>?<opts>"
```

Any invalid or misconfigured `tx_index` configuration should yield an error as
early as possible.

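
The rules in this section could be checked at startup along the following lines. This is a sketch under assumptions, not the node's configuration code: `resolveSinks` is a hypothetical name, treating an empty list as "no sinks" is a guess, and because the text says `null` wins regardless of other values, the sketch checks for `null` before validating anything else.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveSinks validates the tx_index.indexer list and returns the names of
// the sinks to construct. A nil result with a nil error means indexing is
// disabled.
func resolveSinks(indexers []string, psqlConn string) ([]string, error) {
	// The null indexer disables all indexing regardless of other entries.
	for _, name := range indexers {
		if name == "null" {
			return nil, nil
		}
	}

	var sinks []string
	for _, name := range indexers {
		switch name {
		case "kv":
			sinks = append(sinks, name)
		case "psql":
			// The psql sink needs a connection string to be usable.
			if psqlConn == "" {
				return nil, errors.New("the psql indexer requires a connection string")
			}
			sinks = append(sinks, name)
		default:
			return nil, fmt.Errorf("unsupported indexer type %q", name)
		}
	}
	return sinks, nil
}

func main() {
	cases := [][]string{
		{"kv", "psql"},
		{"kv", "null"},
		{"sqlite"},
	}
	for _, c := range cases {
		sinks, err := resolveSinks(c, "postgresql://user:pass@localhost:5432/db")
		fmt.Printf("%v -> sinks=%v err=%v\n", c, sinks, err)
	}
}
```

Running this check before any sink is constructed is one way to surface a bad configuration early, before the node starts consuming events.
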
## Future Improvements

Although not technically required to maintain feature parity with the existing
Tendermint indexer, it would be beneficial for operators to have a method of
performing a "re-index". Specifically, Tendermint operators could invoke an
RPC method that allows the Tendermint node to perform a re-indexing of all block
and transaction events between two given heights, H<sub>1</sub> and H<sub>2</sub>,
so long as the block store contains the blocks and transaction results for all
the heights in the given range.

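
The precondition on the range can be stated compactly in code. The sketch below is purely illustrative, since no re-index API exists in this ADR: `checkReindexRange` is a hypothetical helper, and it assumes the block store retains a contiguous span of heights from `base` to `latest`.

```go
package main

import "fmt"

// checkReindexRange reports whether every height in [h1, h2] is available in
// a block store that retains the contiguous heights [base, latest].
func checkReindexRange(base, latest, h1, h2 int64) error {
	switch {
	case h1 > h2:
		return fmt.Errorf("invalid range: start height %d is above end height %d", h1, h2)
	case h1 < base:
		return fmt.Errorf("start height %d is below the earliest stored height %d", h1, base)
	case h2 > latest:
		return fmt.Errorf("end height %d is above the latest stored height %d", h2, latest)
	}
	return nil
}

func main() {
	// A store holding heights 100 through 500.
	fmt.Println(checkReindexRange(100, 500, 150, 200))
	fmt.Println(checkReindexRange(100, 500, 50, 200))
}
```
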
## Consequences

### Positive

- A more robust and flexible indexing and query engine for indexing and searching
  block and transaction events.
- The ability to not have to support a custom indexing and query engine beyond
  the legacy `kv` type.
- The ability to offload/proxy indexing and querying to the underlying sink.
- Scalability and reliability that essentially comes "for free" from the underlying
  sink, if it supports it.

### Negative

- The need to support multiple and potentially a growing set of custom `EventSink`
  types.

### Neutral

## References

- [Cosmos SDK ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md)
- [PostgreSQL](https://www.postgresql.org/)
- [SQLite](https://www.sqlite.org/index.html)
  314. - [SQLite](https://www.sqlite.org/index.html)