tendermint

Commit Graph

Author	SHA1	Message	Date
M. J. Fromberger	ab1788b922	Fix incorrect tests using the PSQL sink. (#7349 ) Some of our tests were creating a psql event sink and expecting it to report (or not report) certain kinds of errors. These tests were ill-founded in a couple of ways: 1. Tests that required the Postgres driver were not loading it. This led to spurious successes on tests that wanted "some error" from the sink constructor, but didn't exercise the right path. 2. Tests that wanted a Postgres sink to succeed without a database. These tests "passed" because they weren't actually establishing a connection to the database, but if they had, they would have failed for lack of one. To fix this: - Load the postgres driver in tests that need it. - Verify connectivity before reporting successful creation of a PSQL event sink. - Remove tests that wanted a psql sink without a database, since that case is already tested elsewhere.	3 years ago
Sam Kleinman	070445bc10	pex: improve goroutine lifecycle (#7343 ) I saw a race detected in a test here that I think would be better handled by just wiring up these threads.	3 years ago
Sam Kleinman	4af2dbd03b	eventbus: plumb contexts (#7337 ) * eventbus: plumb contexts * fix lint	3 years ago
M. J. Fromberger	1dca1a8f97	Performance improvements for the event query API (#7319 ) Rework the implementation of event query parsing and execution to improve performance and reduce memory usage. Previous memory and CPU profiles of the pubsub service showed query processing as a significant hotspot. While we don't have evidence that this is visibly hurting users, fixing it is fairly easy and self-contained. Updates #6439. Typical benchmark results comparing the original implementation (PEG) with the reworked implementation (Custom): ``` TEST TIME/OP BYTES/OP ALLOCS/OP SPEEDUP MEM SAVING BenchmarkParsePEG-12 51716 ns 526832 27 BenchmarkParseCustom-12 2167 ns 4616 17 23.8x 99.1% BenchmarkMatchPEG-12 3086 ns 1097 22 BenchmarkMatchCustom-12 294.2 ns 64 3 10.5x 94.1% ``` Components: * Add a basic parsing benchmark. * Move the original query implementation to a subdirectory. * Add lexical scanner for Query expressions. * Add a parser for Query expressions. * Implement query compiler. * Add test cases based on OpenAPI examples. * Add MustCompile to replace the original MustParse, and update usage.	3 years ago
Sam Kleinman	7e58f02eb8	service: remove quit method (#7293 )	3 years ago
Sam Kleinman	6ab62fe7b6	service: remove stop method and use contexts (#7292 )	3 years ago
Sam Kleinman	d7606777cf	libs/service: pass logger explicitly (#7288 ) This is a very small change, but removes a method from the `service.Service` interface (a win!) and forces callers to explicitly pass loggers in to objects during construction rather than (later) injecting them. There's not a real need for this kind of lazy construction of loggers, and I think a decent potential for confusion for mutable loggers. The main concern I have is that this changes the constructor API for ABCI clients. I think this is fine, and I suspect that as we plumb contexts through, and make changes to the RPC services there'll be a number of similar sorts of changes to various (quasi) public interfaces, which I think we should welcome.	3 years ago
Sam Kleinman	2a455be46c	libs/os: remove arbitrary os.Exit (#7284 ) I think calling os.Exit at arbitrary points is _bad_ and is good to delete. I think panics in the case of data corruption have a chance of providing useful information.	3 years ago
Sam Kleinman	a15ae5b53a	node+consensus: handshaker initialization (#7283 ) This mostly just pushes more of initialization out of the node package.	3 years ago
Sam Kleinman	27560cf7a4	p2p: reduce peer score for dial failures (#7265 ) When a dial fails we should reduce the score of the peer, which (potentially) increases the chance of the peer being removed from the peer manager, and reduces the chance of the peer being gossiped by the PEX reactor.	3 years ago
William Banfield	4acd117b5e	evidence: remove source of non-determinism from test (#7266 ) The evidence test produces a set of mock evidence in the evidence pool of the 'Primary' node. The test then fills the evidence pools of secondaries with half of this mock evidence. Finally, the test waits until the secondary has an evidence pool as full as the primary. The assertions that are removed here were checking that the primary and secondaries' evidence channels were empty. However, nothing in the test actually ensures that the channels are empty. The test only waits for the secondaries to have received the complete set of evidence, and the secondaries already received half of the evidence at the beginning. It's more than possible that the secondaries can receive the complete set of evidence and not finish reading the duplicate evidence off the channels.	3 years ago
M. J. Fromberger	9dc3d7f9a2	Set a cap on the length of subscription queries. (#7263 ) As a safety measure, don't allow a query string to be unreasonably long. The query filter is not especially efficient, so a query that needs more than basic detail should filter coarsely in the subscriber and refine on the client side. This affects Subscribe and TxSearch queries.	3 years ago
Callum Waters	b3b90f820c	consensus: add some more checks to vote counting (#7253 )	3 years ago
M. J. Fromberger	d5865af1f4	Add basic metrics to the indexer package. (#7250 ) This follows the same model as we did in the p2p package. Rework the indexer service constructor to take a struct of arguments, that makes it easier to construct the optional settings. Deprecate but do not remove the existing constructor. Clean up node initialization a little bit.	3 years ago
M. J. Fromberger	54d7030510	pubsub: Move indexing out of the primary subscription path (#7231 ) This is part of the work described by #7156. Remove "unbuffered subscriptions" from the pubsub service. Replace them with a dedicated blocking "observer" mechanism. Use the observer mechanism for indexing. Add a SubscribeWithArgs method and deprecate the old Subscribe method. Remove SubscribeUnbuffered entirely (breaking). Rework the Subscription interface to eliminate exposed channels. Subscriptions now use a context to manage lifecycle notifications. Internalize the eventbus package.	3 years ago
Sam Kleinman	63192ac300	consensus: remove stale WAL benchmark (#7194 )	3 years ago
M. J. Fromberger	d32913c889	pubsub: Use a dynamic queue for buffered subscriptions (#7177 ) Updates #7156, and a follow-up to #7070. Event subscriptions in Tendermint currently use a fixed-length Go channel as a queue. When the channel fills up, the publisher immediately terminates the subscription. This prevents slow subscribers from creating memory pressure on the node by not servicing their queue fast enough. Replace the buffered channel used to deliver events to buffered subscribers with an explicit queue. The queue provides a soft quota and burst credit mechanism: Clients that usually keep up can survive occasional bursts, without allowing truly slow clients to hog resources indefinitely.	3 years ago
Sam Kleinman	5cc980698a	mempool: consolidate implementations (#7171 ) * mempool: consolidate implementations * update changelog * fix test * Apply suggestions from code review Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com> * cleanup locking comments * context twiddle * migrate away from deprecated ioutil APIs (#7175) Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: M. J. Fromberger <fromberger@interchain.io> Co-authored-by: M. J. Fromberger <michael.j.fromberger@gmail.com> Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: M. J. Fromberger <fromberger@interchain.io>	3 years ago
Sharad Chand	8441b3715a	migrate away from deprecated ioutil APIs (#7175 ) Co-authored-by: Callum Waters <cmwaters19@gmail.com> Co-authored-by: M. J. Fromberger <fromberger@interchain.io>	3 years ago
Sam Kleinman	e2a103a315	mempool: port reactor tests from legacy implementation (#7162 )	3 years ago
Sam Kleinman	93eb940dcd	config: WriteConfigFile should return error (#7169 )	3 years ago
Sam Kleinman	4bd8c5ab6f	p2p: transport should be captive responsibility of router (#7160 ) The main (and minor) win of this PR is that the transport is fully the responsibility of the router and the node doesn't need to be responsible for its lifecycle.	3 years ago
Sam Kleinman	b15b2c1b78	flowrate: cleanup unused files (#7158 ) I saw one of these tests fail and it looks like it was using code that wasn't being called anywhere, so I deleted it, and avoided the package name aliasing.	3 years ago
William Banfield	b4bc6bb4e8	p2p: add message type into the send/recv bytes metrics (#7155 ) This pull request adds a new "message_type" label to the send/recv bytes metrics calculated in the p2p code. Below is a snippet of the updated metrics that includes the updated label: ``` tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 652 tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="4b1068420ef739db63377250553562b9a978708a"} 631 tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_HasVote",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 631 tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="2551a13ed720101b271a5df4816d1e4b3d3bd133"} 393 tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="4b1068420ef739db63377250553562b9a978708a"} 357 tendermint_p2p_peer_receive_bytes_total{chID="32",chain_id="ci",message_type="consensus_NewRoundStep",peer_id="927c50a5e508c747830ce3ba64a3f70fdda58ef2"} 386 ```	3 years ago
Sam Kleinman	23be048294	p2p: use correct transport configuration (#7152 )	3 years ago
Callum Waters	68ca65f5d7	pex: remove legacy proto messages (#7147 ) This PR implements the proto changes made in https://github.com/tendermint/spec/pull/352, removing the legacy messages that were used in the pex reactor.	3 years ago
Callum Waters	a8ff617773	state: add height assertion to rollback function (#7143 )	3 years ago
William Banfield	b0130c88fb	mempool: remove panic when recheck-tx was not sent to ABCI application (#7134 ) This pull request fixes a panic that exists in both mempools. The panic occurs when the ABCI client misses a response from the ABCI application. This happens when the ABCI client drops the request as a result of a full client queue. The fix here was to loop through the ordered list of recheck-tx in the callback until one matches the currently observed recheck request.	3 years ago
Sam Kleinman	ca8f004112	p2p: remove final shims from p2p package (#7136 ) This is, perhaps, the trivial final piece of #7075 that I've been working on. There's more work to be done: - push more of the setup into the packages themselves - move channel-based sending/filtering out of the - simplify the buffering throughout the p2p stack.	3 years ago
Sam Kleinman	7143f14a63	p2p: simplify open channel interface (#7133 ) A fourth #7075 component patch to simplify the channel creation interface	3 years ago
Sam Kleinman	cbe6ad6cd5	p2p: flatten channel descriptor (#7132 )	3 years ago
Sam Kleinman	0900ea8396	p2p: channel shim cleanup (#7129 )	3 years ago
Sam Kleinman	f4a56f4034	p2p: refactor channel description (#7130 ) This is another small sliver of #7075, with the intention of removing the legacy shim layer related to channel registration.	3 years ago
Marko	66a11fe527	blocksync: remove v0 folder structure (#7128 ) Remove v0 blocksync folder structure.	3 years ago
Jared Zhou	b95c261981	rpc: fix typo in broadcast commit (#7124 )	3 years ago
M. J. Fromberger	86f00135dd	rpc: Remove the deprecated gRPC interface to the RPC service (#7121 ) This change removes the partial gRPC interface to the RPC service, which was deprecated in resolution of #6718. Details: - rpc: Remove the client and server interfaces and proto definitions. - Remove the gRPC settings from the config library. - Remove gRPC setup for the RPC service in the node startup. - Fix various test helpers to remove gRPC bits. - Remove the --rpc.grpc-laddr flag from the CLI. Note that to satisfy the protobuf interface check, this change also includes a temporary edit to buf.yaml, that I will revert after this is merged.	3 years ago
William Banfield	ff7b0e638e	p2p: fix priority queue bytes pending calculation (#7120 ) This metric describes itself as 'pending' but never actually decrements when the messages are removed from the queue. This change fixes that by decrementing the metric when the data is removed from the queue.	3 years ago
William Banfield	36a1acff52	internal/proxy: add initial set of abci metrics (#7115 ) This PR adds an initial set of metrics for use with ABCI. The initial metrics enable the calculation of timing histograms and call counts for each of the ABCI methods. The metrics are also labeled as either 'sync' or 'async' to determine if the method call was performed using ABCI's `*Async` methods. An example of these metrics is included here for reference: ``` tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0001"} 0 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.0004"} 5 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.002"} 12 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.009"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.02"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.1"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="0.65"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="2"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="6"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="25"} 13 tendermint_abci_connection_method_timing_bucket{chain_id="ci",method="commit",type="sync",le="+Inf"} 13 tendermint_abci_connection_method_timing_sum{chain_id="ci",method="commit",type="sync"} 0.007802058000000001 tendermint_abci_connection_method_timing_count{chain_id="ci",method="commit",type="sync"} 13 ``` These metrics can easily be graphed using Prometheus's `histogram_quantile(...)` function to pick out a particular quantile to graph or examine.
I chose buckets that roughly estimate the expected range of times for ABCI operations. They start at .0001 seconds and range to 25 seconds. The hope is that this range captures enough possible times to be useful for us and operators.	3 years ago
Sam Kleinman	4781d04d18	node: always close database engine (#7113 )	3 years ago
Sam Kleinman	34a3fcd8fc	Revert "abci: change client to use multi-reader mutexes (#6306 )" (#7106 ) This reverts commit `1c4dbe30d4`.	3 years ago
Sam Kleinman	ded310093e	lint: fix collection of stale errors (#7090 ) Few things that had been annoying.	3 years ago
Sam Kleinman	3646b635d3	p2p, types: remove legacy NetAddress type (#7084 )	3 years ago
Callum Waters	59404003ee	p2p: rename pexV2 to pex (#7088 )	3 years ago
Sam Kleinman	1b5bb5348f	p2p: cleanup unused arguments (#7079 ) This is mostly just reading through the output of unparam, after noticing that there were a few places where we were ignoring some arguments.	3 years ago
Callum Waters	4ca130d226	cli: allow node operator to rollback last state (#7033 )	3 years ago
Sam Kleinman	5bf30bb049	p2p: cleanup transport interface (#7071 ) This is another batch of things to cleanup in the legacy P2P system.	3 years ago
Sam Kleinman	851d2e3bde	mempool,rpc: add removetx rpc method (#7047 ) Addresses one of the concerns with #7041. Provides a mechanism (via the RPC interface) to delete a single transaction, described by its hash, from the mempool. The method returns an error if the transaction cannot be found. Once the transaction is removed it remains in the cache and cannot be resubmitted until the cache is cleared or it expires from the cache.	3 years ago
Sam Kleinman	3ea81bfaa7	p2p: remove wdrr queue (#7064 ) This code hasn't been battle tested, and seems to have grown increasingly flaky in tests. Given our general direction of reducing queue complexity over the next couple of releases I think it makes sense to remove it.	3 years ago
Sam Kleinman	03ad7d6f20	p2p: delete legacy stack initial pass (#7035 ) A few notes: - this is not all the deletion that we can do, but this is the most "simple" case: it leaves in shims, and there's some trivial additional cleanup to the transport that can happen but that requires writing more code, and I wanted this to be easy to review above all else. - This should land after we cut the branch for 0.35, but I'm anticipating that to happen soon, and I wanted to run this through CI.	3 years ago
William Banfield	f5b9c210ca	consensus: wait until peerUpdates channel is closed to close remaining peers (#7058 ) The race occurred as a result of a goroutine launched by `processPeerUpdate` racing with the `OnStop` method. The `processPeerUpdates` goroutine deletes from the map as `OnStop` is reading from it. This change updates the `OnStop` method to wait for the peer updates channel to be done before closing the peers. It also copies the map contents to a new map so that it will not conflict with the view of the map that the goroutine created in `processPeerUpdate` sees.	3 years ago

341 Commits (c33be0a4106fcefe77bf67eef020a3f32932341f)