You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

377 lines
13 KiB

  1. # ADR 065: Custom Event Indexing
  2. - [ADR 065: Custom Event Indexing](#adr-065-custom-event-indexing)
  3. - [Changelog](#changelog)
  4. - [Status](#status)
  5. - [Context](#context)
  6. - [Alternative Approaches](#alternative-approaches)
  7. - [Decision](#decision)
  8. - [Detailed Design](#detailed-design)
  9. - [EventSink](#eventsink)
  10. - [Supported Sinks](#supported-sinks)
  11. - [`KVEventSink`](#kveventsink)
  12. - [`PSQLEventSink`](#psqleventsink)
  13. - [Configuration](#configuration)
  14. - [Future Improvements](#future-improvements)
  15. - [Consequences](#consequences)
  16. - [Positive](#positive)
  17. - [Negative](#negative)
  18. - [Neutral](#neutral)
  19. - [References](#references)
  20. ## Changelog
  21. - April 1, 2021: Initial Draft (@alexanderbez)
  22. - April 28, 2021: Specify search capabilities are only supported through the KV indexer (@marbar3778)
  23. - May 19, 2021: Update the SQL schema and the eventsink interface (@jayt106)
  24. ## Status
  25. Accepted
  26. ## Context
  27. Currently, Tendermint Core supports block and transaction event indexing through
  28. the `tx_index.indexer` configuration. Events are captured in transactions and
  29. are indexed via a `TxIndexer` type. Events are captured in blocks, specifically
  30. from `BeginBlock` and `EndBlock` application responses, and are indexed via a
  31. `BlockIndexer` type. Both of these types are managed by a single `IndexerService`
  32. which is responsible for consuming events and sending those events off to be
  33. indexed by the respective type.
  34. In addition to indexing, Tendermint Core also supports the ability to query for
  35. both indexed transaction and block events via Tendermint's RPC layer. The ability
  36. to query for these indexed events facilitates a great multitude of upstream client
  37. and application capabilities, e.g. block explorers, IBC relayers, and auxiliary
  38. data availability and indexing services.
  39. Currently, Tendermint only supports indexing via a `kv` indexer, which is supported
  40. by an underlying embedded key/value store database. The `kv` indexer implements
  41. its own indexing and query mechanisms. While the former is somewhat trivial,
  42. providing a rich and flexible query layer is not as trivial and has caused many
  43. issues and UX concerns for upstream clients and applications.
  44. The fragile nature of the proprietary `kv` query engine and the potential
  45. performance and scaling issues that arise when a large number of consumers are
  46. introduced, motivate the need for a more robust and flexible indexing and query
  47. solution.
  48. ## Alternative Approaches
  49. With regards to alternative approaches to a more robust solution, the only serious
  50. contender that was considered was to transition to using [SQLite](https://www.sqlite.org/index.html).
  51. While the approach would work, it locks us into a specific query language and
  52. storage layer, so in some ways it's only a bit better than our current approach.
  53. In addition, the implementation would require the introduction of CGO into the
  54. Tendermint Core stack, whereas right now CGO is only introduced depending on
  55. the database used.
  56. ## Decision
  57. We will adopt a similar approach to that of the Cosmos SDK's `KVStore` state
  58. listening described in [ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md).
  59. Namely, we will perform the following:
  60. - Introduce a new interface, `EventSink`, that all data sinks must implement.
  61. - Augment the existing `tx_index.indexer` configuration to now accept a series
  62. of one or more indexer types, i.e sinks.
  63. - Combine the current `TxIndexer` and `BlockIndexer` into a single `KVEventSink`
  64. that implements the `EventSink` interface.
  65. - Introduce an additional `EventSink` that is backed by [PostgreSQL](https://www.postgresql.org/).
  66. - Implement the necessary schemas to support both block and transaction event
  67. indexing.
  68. - Update `IndexerService` to use a series of `EventSinks`.
  69. - Proxy queries to the relevant sink's native query layer.
  70. - Update all relevant RPC methods.
  71. ## Detailed Design
  72. ### EventSink
  73. We introduce the `EventSink` interface type that all supported sinks must implement.
  74. The interface is defined as follows:
  75. ```go
  76. type EventSink interface {
  77. IndexBlockEvents(types.EventDataNewBlockHeader) error
  78. IndexTxEvents([]*abci.TxResult) error
  79. SearchBlockEvents(context.Context, *query.Query) ([]int64, error)
  80. SearchTxEvents(context.Context, *query.Query) ([]*abci.TxResult, error)
  81. GetTxByHash([]byte) (*abci.TxResult, error)
  82. HasBlock(int64) (bool, error)
  83. Type() EventSinkType
  84. Stop() error
  85. }
  86. ```
  87. The `IndexerService` will accept a list of one or more `EventSink` types. During
  88. the `OnStart` method it will call the appropriate APIs on each `EventSink` to
  89. index both block and transaction events.
  90. ### Supported Sinks
  91. We will initially support two `EventSink` types out of the box.
  92. #### `KVEventSink`
  93. This type of `EventSink` is a combination of the `TxIndexer` and `BlockIndexer`
  94. indexers, both of which are backed by a single embedded key/value database.
  95. A bulk of the existing business logic will remain the same, but the existing APIs
  96. mapped to the new `EventSink` API. Both types will be removed in favor of a single
  97. `KVEventSink` type.
  98. The `KVEventSink` will be the only `EventSink` enabled by default, so from a UX
  99. perspective, operators should not notice a difference apart from a configuration
  100. change.
  101. We omit `EventSink` implementation details as it should be fairly straightforward
  102. to map the existing business logic to the new APIs.
  103. #### `PSQLEventSink`
  104. This type of `EventSink` indexes block and transaction events into a [PostgreSQL](https://www.postgresql.org/).
  105. database. We define and automatically migrate the following schema when the
  106. `IndexerService` starts.
  107. The postgres eventsink will not support `tx_search`, `block_search`, `GetTxByHash` and `HasBlock`.
  108. ```sql
  109. -- Table Definition ----------------------------------------------
  110. CREATE TYPE block_event_type AS ENUM ('begin_block', 'end_block', '');
  111. CREATE TABLE block_events (
  112. id SERIAL PRIMARY KEY,
  113. key VARCHAR NOT NULL,
  114. value VARCHAR NOT NULL,
  115. height INTEGER NOT NULL,
  116. type block_event_type,
  117. created_at TIMESTAMPTZ NOT NULL,
  118. chain_id VARCHAR NOT NULL
  119. );
  120. CREATE TABLE tx_results (
  121. id SERIAL PRIMARY KEY,
  122. tx_result BYTEA NOT NULL,
  123. created_at TIMESTAMPTZ NOT NULL
  124. );
  125. CREATE TABLE tx_events (
  126. id SERIAL PRIMARY KEY,
  127. key VARCHAR NOT NULL,
  128. value VARCHAR NOT NULL,
  129. height INTEGER NOT NULL,
  130. hash VARCHAR NOT NULL,
  131. tx_result_id SERIAL,
  132. created_at TIMESTAMPTZ NOT NULL,
  133. chain_id VARCHAR NOT NULL,
  134. FOREIGN KEY (tx_result_id)
  135. REFERENCES tx_results(id)
  136. ON DELETE CASCADE
  137. );
  138. -- Indices -------------------------------------------------------
  139. CREATE INDEX idx_block_events_key_value ON block_events(key, value);
  140. CREATE INDEX idx_tx_events_key_value ON tx_events(key, value);
  141. CREATE INDEX idx_tx_events_hash ON tx_events(hash);
  142. ```
  143. The `PSQLEventSink` will implement the `EventSink` interface as follows
  144. (some details omitted for brevity):
  145. ```go
  146. func NewPSQLEventSink(connStr string, chainID string) (*PSQLEventSink, error) {
  147. db, err := sql.Open("postgres", connStr)
  148. if err != nil {
  149. return nil, err
  150. }
  151. // ...
  152. }
  153. func (es *PSQLEventSink) IndexBlockEvents(h types.EventDataNewBlockHeader) error {
  154. sqlStmt := sq.Insert("block_events").Columns("key", "value", "height", "type", "created_at", "chain_id")
  155. // index the reserved block height index
  156. ts := time.Now()
  157. sqlStmt = sqlStmt.Values(types.BlockHeightKey, h.Header.Height, h.Header.Height, "", ts, es.chainID)
  158. for _, event := range h.ResultBeginBlock.Events {
  159. // only index events with a non-empty type
  160. if len(event.Type) == 0 {
  161. continue
  162. }
  163. for _, attr := range event.Attributes {
  164. if len(attr.Key) == 0 {
  165. continue
  166. }
  167. // index iff the event specified index:true and it's not a reserved event
  168. compositeKey := fmt.Sprintf("%s.%s", event.Type, string(attr.Key))
  169. if compositeKey == types.BlockHeightKey {
  170. return fmt.Errorf("event type and attribute key \"%s\" is reserved; please use a different key", compositeKey)
  171. }
  172. if attr.GetIndex() {
  173. sqlStmt = sqlStmt.Values(compositeKey, string(attr.Value), h.Header.Height, BlockEventTypeBeginBlock, ts, es.chainID)
  174. }
  175. }
  176. }
  177. // index end_block events...
  178. // execute sqlStmt db query...
  179. }
  180. func (es *PSQLEventSink) IndexTxEvents(txr []*abci.TxResult) error {
  181. sqlStmtEvents := sq.Insert("tx_events").Columns("key", "value", "height", "hash", "tx_result_id", "created_at", "chain_id")
  182. sqlStmtTxResult := sq.Insert("tx_results").Columns("tx_result", "created_at")
  183. ts := time.Now()
  184. for _, tx := range txr {
  185. // store the tx result
  186. txBz, err := proto.Marshal(tx)
  187. if err != nil {
  188. return err
  189. }
  190. sqlStmtTxResult = sqlStmtTxResult.Values(txBz, ts)
  191. // execute sqlStmtTxResult db query...
  192. var txID uint32
  193. err = sqlStmtTxResult.QueryRow().Scan(&txID)
  194. if err != nil {
  195. return err
  196. }
  197. // index the reserved height and hash indices
  198. hash := types.Tx(tx.Tx).Hash()
  199. sqlStmtEvents = sqlStmtEvents.Values(types.TxHashKey, hash, tx.Height, hash, txID, ts, es.chainID)
  200. sqlStmtEvents = sqlStmtEvents.Values(types.TxHeightKey, tx.Height, tx.Height, hash, txID, ts, es.chainID)
  201. for _, event := range result.Result.Events {
  202. // only index events with a non-empty type
  203. if len(event.Type) == 0 {
  204. continue
  205. }
  206. for _, attr := range event.Attributes {
  207. if len(attr.Key) == 0 {
  208. continue
  209. }
  210. // index if `index: true` is set
  211. compositeTag := fmt.Sprintf("%s.%s", event.Type, string(attr.Key))
  212. // ensure event does not conflict with a reserved prefix key
  213. if compositeTag == types.TxHashKey || compositeTag == types.TxHeightKey {
  214. return fmt.Errorf("event type and attribute key \"%s\" is reserved; please use a different key", compositeTag)
  215. }
  216. if attr.GetIndex() {
  217. sqlStmtEvents = sqlStmtEvents.Values(compositeKey, string(attr.Value), tx.Height, hash, txID, ts, es.chainID)
  218. }
  219. }
  220. }
  221. }
  222. // execute sqlStmtEvents db query...
  223. }
  224. func (es *PSQLEventSink) SearchBlockEvents(ctx context.Context, q *query.Query) ([]int64, error) {
  225. return nil, errors.New("block search is not supported via the postgres event sink")
  226. }
  227. func (es *PSQLEventSink) SearchTxEvents(ctx context.Context, q *query.Query) ([]*abci.TxResult, error) {
  228. return nil, errors.New("tx search is not supported via the postgres event sink")
  229. }
  230. func (es *PSQLEventSink) GetTxByHash(hash []byte) (*abci.TxResult, error) {
  231. return nil, errors.New("getTxByHash is not supported via the postgres event sink")
  232. }
  233. func (es *PSQLEventSink) HasBlock(h int64) (bool, error) {
  234. return false, errors.New("hasBlock is not supported via the postgres event sink")
  235. }
  236. ```
  237. ### Configuration
  238. The current `tx_index.indexer` configuration would be changed to accept a list
  239. of supported `EventSink` types instead of a single value.
  240. Example:
  241. ```toml
  242. [tx_index]
  243. indexer = [
  244. "kv",
  245. "psql"
  246. ]
  247. ```
  248. If the `indexer` list contains the `null` indexer, then no indexers will be used
  249. regardless of what other values may exist.
  250. Additional configuration parameters might be required depending on what event
  251. sinks are supplied to `tx_index.indexer`. The `psql` will require an additional
  252. connection configuration.
  253. ```toml
  254. [tx_index]
  255. indexer = [
  256. "kv",
  257. "psql"
  258. ]
  259. pqsql_conn = "postgresql://<user>:<password>@<host>:<port>/<db>?<opts>"
  260. ```
  261. Any invalid or misconfigured `tx_index` configuration should yield an error as
  262. early as possible.
  263. ## Future Improvements
  264. Although not technically required to maintain feature parity with the current
  265. existing Tendermint indexer, it would be beneficial for operators to have a method
  266. of performing a "re-index". Specifically, Tendermint operators could invoke an
  267. RPC method that allows the Tendermint node to perform a re-indexing of all block
  268. and transaction events between two given heights, H<sub>1</sub> and H<sub>2</sub>,
  269. so long as the block store contains the blocks and transaction results for all
  270. the heights specified in a given range.
  271. ## Consequences
  272. ### Positive
  273. - A more robust and flexible indexing and query engine for indexing and search
  274. block and transaction events.
  275. - The ability to not have to support a custom indexing and query engine beyond
  276. the legacy `kv` type.
  277. - The ability to offload/proxy indexing and querying to the underling sink.
  278. - Scalability and reliability that essentially comes "for free" from the underlying
  279. sink, if it supports it.
  280. ### Negative
  281. - The need to support multiple and potentially a growing set of custom `EventSink`
  282. types.
  283. ### Neutral
  284. ## References
  285. - [Cosmos SDK ADR-038](https://github.com/cosmos/cosmos-sdk/blob/master/docs/architecture/adr-038-state-listening.md)
  286. - [PostgreSQL](https://www.postgresql.org/)
  287. - [SQLite](https://www.sqlite.org/index.html)