p2p: make PeerManager.DialNext() and EvictNext() block (#5947)

See #5936 and #5938 for background. The plan was initially to have `DialNext()` and `EvictNext()` return a channel. However, implementing this became unnecessarily complicated and error-prone. As an example, the channel would be both consumed and populated (via method calls) by the same driving method (e.g. `Router.dialPeers()`), which could easily cause deadlocks where a method call blocked while sending on the channel that the caller itself was responsible for consuming (but couldn't, since it was busy making the method call). It would also require a set of goroutines in the peer manager that would interact with the goroutines in the router in non-obvious ways, and fully populating the channel on startup could cause deadlocks with other startup tasks. Several issues like these made the solution hard to reason about.

I therefore simply made `DialNext()` and `EvictNext()` block until the next peer is available, using internal triggers to wake these methods up in a non-blocking fashion when any relevant state change occurs. This proved much simpler to reason about, since there are no goroutines in the peer manager (except for trivial retry timers) and no blocking channel sends; it instead relies entirely on the existing goroutine structure of the router for concurrency. This also happens to be the same pattern used by the `Transport.Accept()` API, following Go stdlib conventions, so all router goroutines end up using a consistent pattern as well.
4 years ago
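To make the pattern concrete, here is a minimal sketch of such a blocking accessor. It is not the actual PeerManager implementation: the peerManager type, its candidates queue, and the Add() trigger below are simplified stand-ins, and the real implementation tracks much richer peer state. The essential shape is that DialNext() waits on a condition variable until a candidate exists, while mutating methods signal the wakeup, so the manager needs no goroutines or channel sends of its own.

package main

import (
	"fmt"
	"sync"
)

// peerManager is a hypothetical stand-in for the real PeerManager; it only
// demonstrates the blocking-accessor pattern described above.
type peerManager struct {
	mtx        sync.Mutex
	wakeDial   *sync.Cond
	candidates []string // hypothetical queue of peer addresses to dial
}

func newPeerManager() *peerManager {
	pm := &peerManager{}
	pm.wakeDial = sync.NewCond(&pm.mtx)
	return pm
}

// Add queues a peer address and wakes any blocked DialNext() caller. This is
// the "internal trigger" role: every state change that could yield a dialable
// peer signals the condition variable.
func (pm *peerManager) Add(address string) {
	pm.mtx.Lock()
	defer pm.mtx.Unlock()
	pm.candidates = append(pm.candidates, address)
	pm.wakeDial.Signal()
}

// DialNext blocks until a peer is available to dial, in the same way that
// Transport.Accept() or net.Listener.Accept() block. The peer manager needs
// no goroutines of its own; the caller's goroutine supplies the concurrency.
func (pm *peerManager) DialNext() string {
	pm.mtx.Lock()
	defer pm.mtx.Unlock()
	for len(pm.candidates) == 0 {
		pm.wakeDial.Wait()
	}
	next := pm.candidates[0]
	pm.candidates = pm.candidates[1:]
	return next
}

func main() {
	pm := newPeerManager()
	go func() { pm.Add("1234abcd@127.0.0.1:26657") }()
	fmt.Println("dialing:", pm.DialNext()) // blocks until Add() signals
}

The router then drives this exactly like an accept loop: a single dialing goroutine (the commit's `Router.dialPeers()`) calls DialNext() repeatedly and dials whatever it returns.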
package p2p_test

import (
	"errors"
	"testing"

	"github.com/fortytw2/leaktest"
	gogotypes "github.com/gogo/protobuf/types"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
	dbm "github.com/tendermint/tm-db"

	"github.com/tendermint/tendermint/crypto"
	"github.com/tendermint/tendermint/crypto/ed25519"
	"github.com/tendermint/tendermint/libs/log"
	"github.com/tendermint/tendermint/p2p"
)

type TestMessage = gogotypes.StringValue

// generateNode generates the node info and private key for a test node.
func generateNode() (p2p.NodeInfo, crypto.PrivKey) {
	privKey := ed25519.GenPrivKey()
	nodeID := p2p.NodeIDFromPubKey(privKey.PubKey())
	nodeInfo := p2p.NodeInfo{
		NodeID: nodeID,
		// FIXME: We have to fake a ListenAddr for now.
		ListenAddr: "127.0.0.1:1234",
		Moniker:    "foo",
	}
	return nodeInfo, privKey
}
// echoReactor echoes every message it receives back to the sender,
// until the channel is closed.
func echoReactor(channel *p2p.Channel) {
	for {
		select {
		case envelope := <-channel.In():
			channel.Out() <- p2p.Envelope{
				To:      envelope.From,
				Message: &TestMessage{Value: envelope.Message.(*TestMessage).Value},
			}
		case <-channel.Done():
			return
		}
	}
}
func TestRouter(t *testing.T) {
	defer leaktest.Check(t)()

	logger := log.TestingLogger()
	network := p2p.NewMemoryNetwork(logger)
	nodeInfo, privKey := generateNode()
	transport := network.CreateTransport(nodeInfo.NodeID)
	defer transport.Close()
	chID := p2p.ChannelID(1)

	// Start some other in-memory network nodes to communicate with, running
	// a simple echo reactor that returns received messages.
	peers := []p2p.NodeAddress{}
	for i := 0; i < 3; i++ {
		peerManager, err := p2p.NewPeerManager(dbm.NewMemDB(), p2p.PeerManagerOptions{})
		require.NoError(t, err)
		peerInfo, peerKey := generateNode()
		peerTransport := network.CreateTransport(peerInfo.NodeID)
		defer peerTransport.Close()

		peerRouter, err := p2p.NewRouter(
			logger.With("peerID", i),
			peerInfo,
			peerKey,
			peerManager,
			[]p2p.Transport{peerTransport},
			p2p.RouterOptions{},
		)
		require.NoError(t, err)
		peers = append(peers, peerTransport.Endpoints()[0].NodeAddress(peerInfo.NodeID))

		channel, err := peerRouter.OpenChannel(chID, &TestMessage{})
		require.NoError(t, err)
		defer channel.Close()
		go echoReactor(channel)

		err = peerRouter.Start()
		require.NoError(t, err)
		defer func() { require.NoError(t, peerRouter.Stop()) }()
	}
	// Start the main router and connect it to the peers above.
	peerManager, err := p2p.NewPeerManager(dbm.NewMemDB(), p2p.PeerManagerOptions{})
	require.NoError(t, err)
	defer peerManager.Close()
	for _, address := range peers {
		err := peerManager.Add(address)
		require.NoError(t, err)
	}
	peerUpdates := peerManager.Subscribe()
	defer peerUpdates.Close()

	router, err := p2p.NewRouter(logger, nodeInfo, privKey, peerManager, []p2p.Transport{transport}, p2p.RouterOptions{})
	require.NoError(t, err)
	channel, err := router.OpenChannel(chID, &TestMessage{})
	require.NoError(t, err)
	defer channel.Close()
	err = router.Start()
	require.NoError(t, err)
	defer func() {
		// Since earlier defers are closed after this one, and we have to make
		// sure we close channels and subscriptions before the router, we
		// explicitly close them here too.
		peerUpdates.Close()
		channel.Close()
		require.NoError(t, router.Stop())
	}()
	// Wait for peers to come online, and ping them as they do.
	for i := 0; i < len(peers); i++ {
		peerUpdate := <-peerUpdates.Updates()
		peerID := peerUpdate.PeerID
		require.Equal(t, p2p.PeerUpdate{
			PeerID: peerID,
			Status: p2p.PeerStatusUp,
		}, peerUpdate)

		channel.Out() <- p2p.Envelope{To: peerID, Message: &TestMessage{Value: "hi!"}}
		assert.Equal(t, p2p.Envelope{
			From:    peerID,
			Message: &TestMessage{Value: "hi!"},
		}, (<-channel.In()).Strip())
	}
	// We now send a broadcast, which we should receive back from all peers.
	channel.Out() <- p2p.Envelope{
		Broadcast: true,
		Message:   &TestMessage{Value: "broadcast"},
	}
	for i := 0; i < len(peers); i++ {
		envelope := <-channel.In()
		require.Equal(t, &TestMessage{Value: "broadcast"}, envelope.Message)
	}
	// We then submit an error for a peer, and watch it get disconnected.
	channel.Error() <- p2p.PeerError{
		PeerID:   peers[0].NodeID,
		Err:      errors.New("test error"),
		Severity: p2p.PeerErrorSeverityCritical,
	}
	peerUpdate := <-peerUpdates.Updates()
	require.Equal(t, p2p.PeerUpdate{
		PeerID: peers[0].NodeID,
		Status: p2p.PeerStatusDown,
	}, peerUpdate)

	// The peer manager will automatically reconnect the peer, so we wait
	// for that to happen.
	peerUpdate = <-peerUpdates.Updates()
	require.Equal(t, p2p.PeerUpdate{
		PeerID: peers[0].NodeID,
		Status: p2p.PeerStatusUp,
	}, peerUpdate)
}