package node

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net"
	"net/http"
	"strings"

	abci "github.com/tendermint/abci/types"
	crypto "github.com/tendermint/go-crypto"
	wire "github.com/tendermint/go-wire"
	cmn "github.com/tendermint/tmlibs/common"
	dbm "github.com/tendermint/tmlibs/db"
	"github.com/tendermint/tmlibs/log"

	bc "github.com/tendermint/tendermint/blockchain"
	cfg "github.com/tendermint/tendermint/config"
	cs "github.com/tendermint/tendermint/consensus"
	"github.com/tendermint/tendermint/evidence"
	mempl "github.com/tendermint/tendermint/mempool"
	"github.com/tendermint/tendermint/p2p"
	"github.com/tendermint/tendermint/p2p/pex"
	"github.com/tendermint/tendermint/p2p/trust"
	"github.com/tendermint/tendermint/proxy"
	rpccore "github.com/tendermint/tendermint/rpc/core"
	grpccore "github.com/tendermint/tendermint/rpc/grpc"
	rpc "github.com/tendermint/tendermint/rpc/lib"
	rpcserver "github.com/tendermint/tendermint/rpc/lib/server"
	sm "github.com/tendermint/tendermint/state"
	"github.com/tendermint/tendermint/state/txindex"
	"github.com/tendermint/tendermint/state/txindex/kv"
	"github.com/tendermint/tendermint/state/txindex/null"
	"github.com/tendermint/tendermint/types"
	"github.com/tendermint/tendermint/version"

	_ "net/http/pprof"
)

//------------------------------------------------------------------------------

// DBContext specifies config information for loading a new DB.
type DBContext struct {
	ID     string
	Config *cfg.Config
}

// DBProvider takes a DBContext and returns an instantiated DB.
type DBProvider func(*DBContext) (dbm.DB, error)

// DefaultDBProvider returns a database using the DBBackend and DBDir
// specified in the ctx.Config.
func DefaultDBProvider(ctx *DBContext) (dbm.DB, error) {
	dbType := dbm.DBBackendType(ctx.Config.DBBackend)
	return dbm.NewDB(ctx.ID, dbType, ctx.Config.DBDir()), nil
}

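// Illustration only: because DBProvider is a plain function type, a caller
// can swap in its own storage without touching NewNode. The sketch below is
// self-contained and does not use the real dbm package: DB, DBContext,
// memDB and NewMemDBProvider are hypothetical stand-ins, and DBContext
// omits the Config field. It hands out one in-memory DB per ID, which might
// suit tests.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// DB is a minimal stand-in for dbm.DB (assumption: only Get and Set).
type DB interface {
	Get(key []byte) []byte
	Set(key, value []byte)
}

// DBContext mirrors the struct in the listing, without the Config field.
type DBContext struct {
	ID string
}

// DBProvider has the same shape as the provider type in the listing.
type DBProvider func(*DBContext) (DB, error)

// memDB is a hypothetical in-memory DB guarded by a mutex.
type memDB struct {
	mtx  sync.Mutex
	data map[string][]byte
}

func (db *memDB) Get(key []byte) []byte {
	db.mtx.Lock()
	defer db.mtx.Unlock()
	return db.data[string(key)]
}

func (db *memDB) Set(key, value []byte) {
	db.mtx.Lock()
	defer db.mtx.Unlock()
	db.data[string(key)] = value
}

// NewMemDBProvider returns a DBProvider that creates one in-memory DB per
// ID and returns the same instance on later calls with that ID.
func NewMemDBProvider() DBProvider {
	var mtx sync.Mutex
	dbs := make(map[string]*memDB)
	return func(ctx *DBContext) (DB, error) {
		if ctx == nil || ctx.ID == "" {
			return nil, errors.New("db context with a non-empty ID is required")
		}
		mtx.Lock()
		defer mtx.Unlock()
		db, ok := dbs[ctx.ID]
		if !ok {
			db = &memDB{data: make(map[string][]byte)}
			dbs[ctx.ID] = db
		}
		return db, nil
	}
}

func main() {
	provider := NewMemDBProvider()
	stateDB, err := provider(&DBContext{ID: "state"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	stateDB.Set([]byte("height"), []byte("1"))
	fmt.Println(string(stateDB.Get([]byte("height"))))
}
```
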
// GenesisDocProvider returns a GenesisDoc.
// It allows the GenesisDoc to be pulled from sources other than the
// filesystem, for instance from a distributed key-value store cluster.
type GenesisDocProvider func() (*types.GenesisDoc, error)

// DefaultGenesisDocProviderFunc returns a GenesisDocProvider that loads
// the GenesisDoc from the config.GenesisFile() on the filesystem.
func DefaultGenesisDocProviderFunc(config *cfg.Config) GenesisDocProvider {
	return func() (*types.GenesisDoc, error) {
		return types.GenesisDocFromFile(config.GenesisFile())
	}
}

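// Illustration only: a provider that reads the genesis document from a
// key-value source instead of the filesystem, as the GenesisDocProvider
// comment suggests is possible. This is a self-contained sketch under
// stated assumptions: GenesisDoc here models only a chain ID, and kvStore
// and KVGenesisDocProvider are hypothetical names, not part of Tendermint.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// GenesisDoc is a simplified stand-in for types.GenesisDoc (assumption:
// only the chain ID is modelled).
type GenesisDoc struct {
	ChainID string `json:"chain_id"`
}

// GenesisDocProvider has the same shape as the provider type in the listing.
type GenesisDocProvider func() (*GenesisDoc, error)

// kvStore is a hypothetical stand-in for a remote key-value store.
type kvStore map[string][]byte

// KVGenesisDocProvider returns a provider that looks up key in store and
// decodes the JSON value into a GenesisDoc.
func KVGenesisDocProvider(store kvStore, key string) GenesisDocProvider {
	return func() (*GenesisDoc, error) {
		raw, ok := store[key]
		if !ok {
			return nil, errors.New("genesis doc not found under key " + key)
		}
		doc := &GenesisDoc{}
		if err := json.Unmarshal(raw, doc); err != nil {
			return nil, err
		}
		return doc, nil
	}
}

func main() {
	store := kvStore{"genesis": []byte(`{"chain_id":"test-chain"}`)}
	doc, err := KVGenesisDocProvider(store, "genesis")()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(doc.ChainID)
}
```
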
// NodeProvider takes a config and a logger and returns a ready to go Node.
type NodeProvider func(*cfg.Config, log.Logger) (*Node, error)

// DefaultNewNode returns a Tendermint node with default settings for the
// PrivValidator, ClientCreator, GenesisDoc, and DBProvider.
// It implements NodeProvider.
func DefaultNewNode(config *cfg.Config, logger log.Logger) (*Node, error) {
	return NewNode(config,
		types.LoadOrGenPrivValidatorFS(config.PrivValidatorFile()),
		proxy.DefaultClientCreator(config.ProxyApp, config.ABCI, config.DBDir()),
		DefaultGenesisDocProviderFunc(config),
		DefaultDBProvider,
		logger)
}

//------------------------------------------------------------------------------

// Node is the highest level interface to a full Tendermint node.
// It includes all configuration information and running services.
type Node struct {
	cmn.BaseService

	// config
	config        *cfg.Config
	genesisDoc    *types.GenesisDoc   // initial validator set
	privValidator types.PrivValidator // local node's validator key

	// network
	sw               *p2p.Switch             // p2p connections
	addrBook         pex.AddrBook            // known peers
	trustMetricStore *trust.TrustMetricStore // trust metrics for all peers

	// services
	eventBus         *types.EventBus // pub/sub for services
	stateDB          dbm.DB
	blockStore       *bc.BlockStore         // store the blockchain to disk
	bcReactor        *bc.BlockchainReactor  // for fast-syncing
	mempoolReactor   *mempl.MempoolReactor  // for gossipping transactions
	consensusState   *cs.ConsensusState     // latest consensus state
	consensusReactor *cs.ConsensusReactor   // for participating in the consensus
	evidencePool     *evidence.EvidencePool // tracking evidence
	proxyApp         proxy.AppConns         // connection to the application
	rpcListeners     []net.Listener         // rpc servers
	txIndexer        txindex.TxIndexer
	indexerService   *txindex.IndexerService
}


// NewNode returns a new, ready to go, Tendermint Node.
func NewNode(config *cfg.Config,
	privValidator types.PrivValidator,
	clientCreator proxy.ClientCreator,
	genesisDocProvider GenesisDocProvider,
	dbProvider DBProvider,
	logger log.Logger) (*Node, error) {

	// Get BlockStore
	blockStoreDB, err := dbProvider(&DBContext{"blockstore", config})
	if err != nil {
		return nil, err
	}
	blockStore := bc.NewBlockStore(blockStoreDB)

	// Get State
	stateDB, err := dbProvider(&DBContext{"state", config})
	if err != nil {
		return nil, err
	}

	// Get genesis doc: prefer the copy stored in stateDB and fall back to
	// the provider only if none is stored yet.
	// TODO: move to state package?
	genDoc, err := loadGenesisDoc(stateDB)
	if err != nil {
		genDoc, err = genesisDocProvider()
		if err != nil {
			return nil, err
		}
		// Save the genesis doc so that a later change to the provided one
		// (accidental or not) is not picked up on restart. This guards
		// against a certain class of user errors and also serves as an
		// audit trail.
		saveGenesisDoc(stateDB, genDoc)
	}

	state, err := sm.LoadStateFromDBOrGenesisDoc(stateDB, genDoc)
	if err != nil {
		return nil, err
	}

	// Create the proxyApp, which manages connections (consensus, mempool, query)
	// and syncs Tendermint and the app by performing a handshake
	// and replaying any necessary blocks.
	consensusLogger := logger.With("module", "consensus")
	handshaker := cs.NewHandshaker(stateDB, state, blockStore)
	handshaker.SetLogger(consensusLogger)
	proxyApp := proxy.NewAppConns(clientCreator, handshaker)
	proxyApp.SetLogger(logger.With("module", "proxy"))
	if err := proxyApp.Start(); err != nil {
		return nil, fmt.Errorf("Error starting proxy app connections: %v", err)
	}

	// reload the state (it may have been updated by the handshake)
	state = sm.LoadState(stateDB)

	// Decide whether to fast-sync or not.
	// We don't fast-sync when the only validator is us.
	fastSync := config.FastSync
	if state.Validators.Size() == 1 {
		addr, _ := state.Validators.GetByIndex(0)
		if bytes.Equal(privValidator.GetAddress(), addr) {
			fastSync = false
		}
	}

	// Log whether this node is a validator or an observer
	if state.Validators.HasAddress(privValidator.GetAddress()) {
		consensusLogger.Info("This node is a validator", "addr", privValidator.GetAddress(), "pubKey", privValidator.GetPubKey())
	} else {
		consensusLogger.Info("This node is not a validator", "addr", privValidator.GetAddress(), "pubKey", privValidator.GetPubKey())
	}

	// Make MempoolReactor
	mempoolLogger := logger.With("module", "mempool")
	mempool := mempl.NewMempool(config.Mempool, proxyApp.Mempool(), state.LastBlockHeight)
	mempool.InitWAL() // no need to have the mempool wal during tests
	mempool.SetLogger(mempoolLogger)
	mempoolReactor := mempl.NewMempoolReactor(config.Mempool, mempool)
	mempoolReactor.SetLogger(mempoolLogger)

	if config.Consensus.WaitForTxs() {
		mempool.EnableTxsAvailable()
	}

	// Make Evidence Reactor
	evidenceDB, err := dbProvider(&DBContext{"evidence", config})
	if err != nil {
		return nil, err
	}
	evidenceLogger := logger.With("module", "evidence")
	evidenceStore := evidence.NewEvidenceStore(evidenceDB)
	evidencePool := evidence.NewEvidencePool(stateDB, evidenceStore)
	evidencePool.SetLogger(evidenceLogger)
	evidenceReactor := evidence.NewEvidenceReactor(evidencePool)
	evidenceReactor.SetLogger(evidenceLogger)

	blockExecLogger := logger.With("module", "state")
	// make block executor for consensus and blockchain reactors to execute blocks
	blockExec := sm.NewBlockExecutor(stateDB, blockExecLogger, proxyApp.Consensus(), mempool, evidencePool)

	// Make BlockchainReactor
	bcReactor := bc.NewBlockchainReactor(state.Copy(), blockExec, blockStore, fastSync)
	bcReactor.SetLogger(logger.With("module", "blockchain"))

	// Make ConsensusReactor
	consensusState := cs.NewConsensusState(config.Consensus, state.Copy(),
		blockExec, blockStore, mempool, evidencePool)
	consensusState.SetLogger(consensusLogger)
	if privValidator != nil {
		consensusState.SetPrivValidator(privValidator)
	}
	consensusReactor := cs.NewConsensusReactor(consensusState, fastSync)
	consensusReactor.SetLogger(consensusLogger)

	p2pLogger := logger.With("module", "p2p")

	sw := p2p.NewSwitch(config.P2P)
	sw.SetLogger(p2pLogger)
	sw.AddReactor("MEMPOOL", mempoolReactor)
	sw.AddReactor("BLOCKCHAIN", bcReactor)
	sw.AddReactor("CONSENSUS", consensusReactor)
	sw.AddReactor("EVIDENCE", evidenceReactor)

	// Optionally, start the pex reactor
	var addrBook pex.AddrBook
	var trustMetricStore *trust.TrustMetricStore
	if config.P2P.PexReactor {
		addrBook = pex.NewAddrBook(config.P2P.AddrBookFile(), config.P2P.AddrBookStrict)
		addrBook.SetLogger(p2pLogger.With("book", config.P2P.AddrBookFile()))

		// Get the trust metric history data
		trustHistoryDB, err := dbProvider(&DBContext{"trusthistory", config})
		if err != nil {
			return nil, err
		}
		trustMetricStore = trust.NewTrustMetricStore(trustHistoryDB, trust.DefaultConfig())
		trustMetricStore.SetLogger(p2pLogger)

		var seeds []string
		if config.P2P.Seeds != "" {
			seeds = strings.Split(config.P2P.Seeds, ",")
		}
		pexReactor := pex.NewPEXReactor(addrBook,
			&pex.PEXReactorConfig{Seeds: seeds, SeedMode: config.P2P.SeedMode})
		pexReactor.SetLogger(p2pLogger)
		sw.AddReactor("PEX", pexReactor)
	}

	// Filter peers by addr or ID with an ABCI query.
	// If the query return code is OK, add the peer.
	// NOTE: the ID filter still queries under the /p2p/filter/pubkey/ path.
	// XXX: Query format subject to change
	if config.FilterPeers {
		// NOTE: addr is ip:port
		sw.SetAddrFilter(func(addr net.Addr) error {
			resQuery, err := proxyApp.Query().QuerySync(abci.RequestQuery{Path: cmn.Fmt("/p2p/filter/addr/%s", addr.String())})
			if err != nil {
				return err
			}
			if resQuery.IsErr() {
				return fmt.Errorf("Error querying abci app: %v", resQuery)
			}
			return nil
		})
		sw.SetIDFilter(func(id p2p.ID) error {
			resQuery, err := proxyApp.Query().QuerySync(abci.RequestQuery{Path: cmn.Fmt("/p2p/filter/pubkey/%s", id)})
			if err != nil {
				return err
			}
			if resQuery.IsErr() {
				return fmt.Errorf("Error querying abci app: %v", resQuery)
			}
			return nil
		})
	}

	eventBus := types.NewEventBus()
	eventBus.SetLogger(logger.With("module", "events"))

	// Services which will be publishing and/or subscribing to messages (events).
	// consensusReactor will set the event bus on consensusState and blockExecutor.
	consensusReactor.SetEventBus(eventBus)

	// Transaction indexing
	var txIndexer txindex.TxIndexer
	switch config.TxIndex.Indexer {
	case "kv":
		store, err := dbProvider(&DBContext{"tx_index", config})
		if err != nil {
			return nil, err
		}
		if config.TxIndex.IndexTags != "" {
			txIndexer = kv.NewTxIndex(store, kv.IndexTags(strings.Split(config.TxIndex.IndexTags, ",")))
		} else if config.TxIndex.IndexAllTags {
			txIndexer = kv.NewTxIndex(store, kv.IndexAllTags())
		} else {
			txIndexer = kv.NewTxIndex(store)
		}
	default:
		txIndexer = &null.TxIndex{}
	}

	indexerService := txindex.NewIndexerService(txIndexer, eventBus)

	// run the profile server
	profileHost := config.ProfListenAddress
	if profileHost != "" {
		go func() {
			logger.Error("Profile server", "err", http.ListenAndServe(profileHost, nil))
		}()
	}

	node := &Node{
		config:           config,
		genesisDoc:       genDoc,
		privValidator:    privValidator,

		sw:               sw,
		addrBook:         addrBook,
		trustMetricStore: trustMetricStore,

		stateDB:          stateDB,
		blockStore:       blockStore,
		bcReactor:        bcReactor,
		mempoolReactor:   mempoolReactor,
		consensusState:   consensusState,
		consensusReactor: consensusReactor,
		evidencePool:     evidencePool,
		proxyApp:         proxyApp,
		txIndexer:        txIndexer,
		indexerService:   indexerService,
		eventBus:         eventBus,
	}
	node.BaseService = *cmn.NewBaseService(logger, "Node", node)
	return node, nil
}
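
// The fast-sync decision made inside NewNode ("we don't fast-sync when the
// only validator is us") can be isolated as a pure function. The following
// is a minimal standalone sketch, not code from this package: the helper
// name shouldFastSync and its plain byte-slice parameters are assumptions
// made for illustration, standing in for state.Validators and
// privValidator.GetAddress().

```go
package main

import (
	"bytes"
	"fmt"
)

// shouldFastSync is a hypothetical helper. It returns the configured
// fast-sync setting, except when the validator set has exactly one member
// and that member is this node: then there is nobody to sync from, so it
// returns false.
func shouldFastSync(configured bool, validatorAddrs [][]byte, ownAddr []byte) bool {
	if len(validatorAddrs) == 1 && bytes.Equal(validatorAddrs[0], ownAddr) {
		return false
	}
	return configured
}

func main() {
	own := []byte{0x01}   // illustrative address of this node
	other := []byte{0x02} // illustrative address of another validator

	fmt.Println("sole validator is us:    ", shouldFastSync(true, [][]byte{own}, own))
	fmt.Println("sole validator is other: ", shouldFastSync(true, [][]byte{other}, own))
	fmt.Println("two validators incl. us: ", shouldFastSync(true, [][]byte{own, other}, own))
}
```

// Keeping the rule in a function without side effects would make it easy to
// unit-test separately from node construction.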

// OnStart starts the Node. It implements cmn.Service.
func (n *Node) OnStart() error {
	err := n.eventBus.Start()
	if err != nil {
		return err
	}

	// Run the RPC server first
	// so we can, e.g., receive txs for the first block.
	if n.config.RPC.ListenAddress != "" {
		listeners, err := n.startRPC()
		if err != nil {
			return err
		}
		n.rpcListeners = listeners
	}

	// Create & add listener
	protocol, address := cmn.ProtocolAndAddress(n.config.P2P.ListenAddress)
	l := p2p.NewDefaultListener(protocol, address, n.config.P2P.SkipUPNP, n.Logger.With("module", "p2p"))
	n.sw.AddListener(l)

	// Load the node's PrivKey, generating one if it does not exist yet
	// TODO: pass in like priv_val
	nodeKey, err := p2p.LoadOrGenNodeKey(n.config.NodeKeyFile())
	if err != nil {
		return err
	}
	n.Logger.Info("P2P Node ID", "ID", nodeKey.ID(), "file", n.config.NodeKeyFile())

	// Start the switch
	n.sw.SetNodeInfo(n.makeNodeInfo(nodeKey.PubKey()))
	n.sw.SetNodeKey(nodeKey)
	err = n.sw.Start()
	if err != nil {
		return err
	}

	// Always connect to persistent peers
	if n.config.P2P.PersistentPeers != "" {
		err = n.sw.DialPeersAsync(n.addrBook, strings.Split(n.config.P2P.PersistentPeers, ","), true)
		if err != nil {
			return err
		}
	}

	// start tx indexer
	return n.indexerService.Start()
}

// OnStop stops the Node. It implements cmn.Service.
func (n *Node) OnStop() {
	n.BaseService.OnStop()

	n.Logger.Info("Stopping Node")
	// TODO: gracefully disconnect from peers.
	n.sw.Stop()

	for _, l := range n.rpcListeners {
		n.Logger.Info("Closing rpc listener", "listener", l)
		if err := l.Close(); err != nil {
			n.Logger.Error("Error closing listener", "listener", l, "err", err)
		}
	}

	n.eventBus.Stop()

	n.indexerService.Stop()
}

// RunForever waits for an interrupt signal and stops the node.
func (n *Node) RunForever() {
	// Block until a signal is trapped, then stop the node.
	cmn.TrapSignal(func() {
		n.Stop()
	})
}

// AddListener adds a listener to accept inbound peer connections.
// It should be called before starting the Node.
// The first listener is the primary listener (in NodeInfo).
func (n *Node) AddListener(l p2p.Listener) {
	n.sw.AddListener(l)
}

// ConfigureRPC sets all variables in rpccore so they will serve
// rpc calls from this node.
func (n *Node) ConfigureRPC() {
	rpccore.SetStateDB(n.stateDB)
	rpccore.SetBlockStore(n.blockStore)
	rpccore.SetConsensusState(n.consensusState)
	rpccore.SetMempool(n.mempoolReactor.Mempool)
	rpccore.SetEvidencePool(n.evidencePool)
	rpccore.SetSwitch(n.sw)
	rpccore.SetPubKey(n.privValidator.GetPubKey())
	rpccore.SetGenesisDoc(n.genesisDoc)
	rpccore.SetAddrBook(n.addrBook)
	rpccore.SetProxyAppQuery(n.proxyApp.Query())
	rpccore.SetTxIndexer(n.txIndexer)
	rpccore.SetConsensusReactor(n.consensusReactor)
	rpccore.SetEventBus(n.eventBus)
	rpccore.SetLogger(n.Logger.With("module", "rpc"))
}

// startRPC configures rpccore and starts one HTTP/websocket server per
// address in the comma-separated RPC listen address, plus the gRPC server
// if a gRPC listen address is configured. It returns all opened listeners.
func (n *Node) startRPC() ([]net.Listener, error) {
	n.ConfigureRPC()
	listenAddrs := strings.Split(n.config.RPC.ListenAddress, ",")

	if n.config.RPC.Unsafe {
		rpccore.AddUnsafeRoutes()
	}

	// we may expose the rpc over both a unix and tcp socket
	listeners := make([]net.Listener, len(listenAddrs))
	for i, listenAddr := range listenAddrs {
		mux := http.NewServeMux()
		rpcLogger := n.Logger.With("module", "rpc-server")
		wm := rpcserver.NewWebsocketManager(rpccore.Routes, rpcserver.EventSubscriber(n.eventBus))
		wm.SetLogger(rpcLogger.With("protocol", "websocket"))
		mux.HandleFunc("/websocket", wm.WebsocketHandler)
		rpcserver.RegisterRPCFuncs(mux, rpccore.Routes, rpcLogger)
		listener, err := rpcserver.StartHTTPServer(listenAddr, mux, rpcLogger)
		if err != nil {
			return nil, err
		}
		listeners[i] = listener
	}

	// we expose a simplified api over grpc for convenience to app devs
	grpcListenAddr := n.config.RPC.GRPCListenAddress
	if grpcListenAddr != "" {
		listener, err := grpccore.StartGRPCServer(grpcListenAddr)
		if err != nil {
			return nil, err
		}
		listeners = append(listeners, listener)
	}

	return listeners, nil
}
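
// Several config values in this file are comma-separated lists (RPC listen
// addresses, seeds, persistent peers), and listen addresses carry a
// "protocol://" prefix. The sketch below illustrates both parsing steps
// using only the standard library. splitList and splitListenAddr are
// hypothetical helpers written for this example; in particular,
// splitListenAddr is NOT the implementation of cmn.ProtocolAndAddress, and
// its "tcp" default for addresses without a prefix is an assumption. The
// addresses in main are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// splitList splits a comma-separated config value. It returns nil for the
// empty string, because strings.Split("", ",") yields a slice holding one
// empty element rather than an empty slice.
func splitList(s string) []string {
	if s == "" {
		return nil
	}
	return strings.Split(s, ",")
}

// splitListenAddr separates "protocol://address" into its two parts.
// Assumption for this sketch: an address without a prefix is treated as tcp.
func splitListenAddr(listenAddr string) (protocol, address string) {
	protocol, address = "tcp", listenAddr
	if parts := strings.SplitN(listenAddr, "://", 2); len(parts) == 2 {
		protocol, address = parts[0], parts[1]
	}
	return protocol, address
}

func main() {
	// Illustrative value only: one tcp address and one unix socket.
	cfg := "tcp://0.0.0.0:46657,unix:///tmp/rpc.sock"
	for _, a := range splitList(cfg) {
		protocol, address := splitListenAddr(a)
		fmt.Printf("protocol=%s address=%s\n", protocol, address)
	}
	fmt.Println("entries for empty config:", len(splitList("")))
}
```

// The empty-string guard is the same reason the code in this file checks
// for a non-empty value before calling strings.Split on seeds and
// persistent peers.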

// Switch returns the Node's Switch.
func (n *Node) Switch() *p2p.Switch {
	return n.sw
}

// BlockStore returns the Node's BlockStore.
func (n *Node) BlockStore() *bc.BlockStore {
	return n.blockStore
}

// ConsensusState returns the Node's ConsensusState.
func (n *Node) ConsensusState() *cs.ConsensusState {
	return n.consensusState
}

// ConsensusReactor returns the Node's ConsensusReactor.
func (n *Node) ConsensusReactor() *cs.ConsensusReactor {
	return n.consensusReactor
}

// MempoolReactor returns the Node's MempoolReactor.
func (n *Node) MempoolReactor() *mempl.MempoolReactor {
	return n.mempoolReactor
}

// EvidencePool returns the Node's EvidencePool.
func (n *Node) EvidencePool() *evidence.EvidencePool {
	return n.evidencePool
}

// EventBus returns the Node's EventBus.
func (n *Node) EventBus() *types.EventBus {
	return n.eventBus
}

// PrivValidator returns the Node's PrivValidator.
// XXX: for convenience only!
func (n *Node) PrivValidator() types.PrivValidator {
	return n.privValidator
}

// GenesisDoc returns the Node's GenesisDoc.
func (n *Node) GenesisDoc() *types.GenesisDoc {
	return n.genesisDoc
}

// ProxyApp returns the Node's AppConns, representing its connections to the ABCI application.
func (n *Node) ProxyApp() proxy.AppConns {
	return n.proxyApp
}

// makeNodeInfo builds the NodeInfo for this node: chain ID, version, channels,
// moniker, and version/RPC details in Other. ListenAddr is set only if the
// switch is already listening, using the first (primary) listener.
func (n *Node) makeNodeInfo(pubKey crypto.PubKey) p2p.NodeInfo {
	txIndexerStatus := "on"
	if _, ok := n.txIndexer.(*null.TxIndex); ok {
		txIndexerStatus = "off"
	}
	nodeInfo := p2p.NodeInfo{
		PubKey:  pubKey,
		Network: n.genesisDoc.ChainID,
		Version: version.Version,
		Channels: []byte{
			bc.BlockchainChannel,
			cs.StateChannel, cs.DataChannel, cs.VoteChannel, cs.VoteSetBitsChannel,
			mempl.MempoolChannel,
			evidence.EvidenceChannel,
		},
		Moniker: n.config.Moniker,
		Other: []string{
			cmn.Fmt("wire_version=%v", wire.Version),
			cmn.Fmt("p2p_version=%v", p2p.Version),
			cmn.Fmt("consensus_version=%v", cs.Version),
			cmn.Fmt("rpc_version=%v/%v", rpc.Version, rpccore.Version),
			cmn.Fmt("tx_index=%v", txIndexerStatus),
		},
	}

	if n.config.P2P.PexReactor {
		nodeInfo.Channels = append(nodeInfo.Channels, pex.PexChannel)
	}

	rpcListenAddr := n.config.RPC.ListenAddress
	nodeInfo.Other = append(nodeInfo.Other, cmn.Fmt("rpc_addr=%v", rpcListenAddr))

	if !n.sw.IsListening() {
		return nodeInfo
	}

	p2pListener := n.sw.Listeners()[0]
	p2pHost := p2pListener.ExternalAddress().IP.String()
	p2pPort := p2pListener.ExternalAddress().Port
	nodeInfo.ListenAddr = cmn.Fmt("%v:%v", p2pHost, p2pPort)

	return nodeInfo
}

//------------------------------------------------------------------------------

// NodeInfo returns the Node's Info from the Switch.
func (n *Node) NodeInfo() p2p.NodeInfo {
	return n.sw.NodeInfo()
}

//------------------------------------------------------------------------------

var (
	genesisDocKey = []byte("genesisDoc")
)

// loadGenesisDoc loads the genesis doc stored in db. It returns an error if
// none is stored and panics if the stored bytes fail to unmarshal.
func loadGenesisDoc(db dbm.DB) (*types.GenesisDoc, error) {
	// named bz rather than bytes to avoid shadowing the bytes package
	bz := db.Get(genesisDocKey)
	if len(bz) == 0 {
		return nil, errors.New("Genesis doc not found")
	}
	var genDoc *types.GenesisDoc
	err := json.Unmarshal(bz, &genDoc)
	if err != nil {
		cmn.PanicCrisis(fmt.Sprintf("Failed to load genesis doc due to unmarshaling error: %v (bytes: %X)", err, bz))
	}
	return genDoc, nil
}

// saveGenesisDoc stores the given genesis doc in db with a synchronous write.
// It panics if the doc fails to marshal.
func saveGenesisDoc(db dbm.DB, genDoc *types.GenesisDoc) {
	bz, err := json.Marshal(genDoc)
	if err != nil {
		cmn.PanicCrisis(fmt.Sprintf("Failed to save genesis doc due to marshaling error: %v", err))
	}
	db.SetSync(genesisDocKey, bz)
}
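
// NewNode combines the two functions above into a "load from the DB, else
// ask the provider and persist the result" pattern, so that a genesis doc
// stays pinned once it has been stored. A minimal standalone sketch of that
// pattern follows. Everything in it is an assumption made for illustration:
// memDB is a map standing in for dbm.DB, genesisDoc is a one-field stand-in
// for types.GenesisDoc, and loadOrInit returns errors where the code above
// panics.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// genesisDoc is a hypothetical, one-field stand-in for types.GenesisDoc.
type genesisDoc struct {
	ChainID string `json:"chain_id"`
}

// memDB is a hypothetical in-memory key/value store standing in for dbm.DB.
type memDB map[string][]byte

const genesisKey = "genesisDoc"

// loadOrInit returns the stored genesis doc if one exists. Otherwise it
// calls provider, stores the result, and returns it. Unlike the functions
// above, it reports (un)marshaling failures as errors instead of panicking.
func loadOrInit(db memDB, provider func() (*genesisDoc, error)) (*genesisDoc, error) {
	if stored := db[genesisKey]; len(stored) > 0 {
		doc := new(genesisDoc)
		if err := json.Unmarshal(stored, doc); err != nil {
			return nil, fmt.Errorf("stored genesis doc is corrupt: %v", err)
		}
		return doc, nil
	}
	doc, err := provider()
	if err != nil {
		return nil, err
	}
	bz, err := json.Marshal(doc)
	if err != nil {
		return nil, err
	}
	db[genesisKey] = bz
	return doc, nil
}

func main() {
	db := memDB{}

	// First start: nothing stored, so the provider is consulted and its
	// result is persisted.
	first, err := loadOrInit(db, func() (*genesisDoc, error) {
		return &genesisDoc{ChainID: "chain-A"}, nil
	})
	fmt.Println(first, err)

	// Later start: the stored copy is used and the provider is never called.
	second, err := loadOrInit(db, func() (*genesisDoc, error) {
		return nil, errors.New("provider should not be called")
	})
	fmt.Println(second, err)
}
```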