  1. # Design goals
  2. The design goals for Tendermint (and the SDK and related libraries) are:
  3. * Simplicity and Legibility
  4. * Parallel performance, namely the ability to utilize multicore architectures
  5. * Ability to evolve the codebase bug-free
  6. * Debuggability
  7. * Complete correctness that considers all edge cases, especially in concurrency
  8. * Future-proof modular architecture, message protocol, APIs, and encapsulation
  9. ## Justification
  10. Legibility is key to maintaining bug-free software as it evolves toward more
  11. optimizations, more ease of debugging, and additional features.
  12. It is too easy to introduce bugs over time by replacing lines of code with
  13. those that may panic, which means ideally locks are unlocked by defer
  14. statements.
  15. For example, consider a method that unlocks its mutex without defer:
```go
// Fragile: Unlock is not deferred, so a panic between Lock and Unlock
// leaves the mutex locked.
func (obj *MyObj) something() {
	obj.mtx.Lock()
	obj.value = other
	obj.mtx.Unlock()
}
```
  23. It is too easy for a future refactor to replace `other` with `other.String()`,
  24. for example. If that call panics and the panic is recovered upstream, `Unlock`
  25. is skipped and any later `Lock` deadlocks. As much as reasonably possible, we
  26. should unlock with defer statements, even though they add some overhead.
  27. If it is necessary to optimize the unlocking of mutex locks, the solution is
  28. more modularity via smaller functions, so that defer'd unlocks are scoped
  29. within a smaller function.
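As a minimal sketch of defer-scoped unlocking, the example below keeps the critical section inside a small helper whose `Unlock` is deferred, so the lock is held only for the assignment and is released even on a panic. The names `Cache`, `setValue`, `panicky`, `text`, and `tryUpdate` are hypothetical and exist only for illustration; they are not taken from the Tendermint codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical struct guarded by a struct-level mutex.
type Cache struct {
	mtx   sync.Mutex
	value string
}

// setValue is the whole critical section. The deferred Unlock runs even if
// v.String() panics, so the mutex is never left locked.
func (c *Cache) setValue(v fmt.Stringer) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.value = v.String()
}

// Update delegates locking to the small helper instead of unlocking by hand.
func (c *Cache) Update(v fmt.Stringer) {
	c.setValue(v) // the lock is released as soon as setValue returns
	// ... slower work that does not need the lock would go here ...
}

// Value returns the stored value under the lock.
func (c *Cache) Value() string {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	return c.value
}

// panicky is a Stringer whose String method always panics.
type panicky struct{}

func (panicky) String() string { panic("boom") }

// text is a Stringer that never panics.
type text string

func (t text) String() string { return string(t) }

// tryUpdate calls Update and reports whether it panicked.
func tryUpdate(c *Cache, v fmt.Stringer) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	c.Update(v)
	return false
}

func main() {
	c := &Cache{}
	fmt.Println(tryUpdate(c, panicky{}))  // true: the panic was recovered
	fmt.Println(tryUpdate(c, text("ok"))) // false: the lock was released, no deadlock
	fmt.Println(c.Value())                // ok
}
```

Had `setValue` called `Unlock` explicitly after the assignment, the second `tryUpdate` call would block forever on `Lock`.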
  30. Similarly, idiomatic for-loops should always be preferred over those that use
  31. custom counters, because it is too easy to evolve the body of a for-loop to
  32. become more complicated over time, and it becomes more and more difficult to
  33. assess the correctness of such a for-loop by visual inspection.
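To illustrate the point about for-loops, here is a small sketch contrasting a `range` loop with a hand-rolled counter. The function names and the voting-power slice are made up for the example.

```go
package main

import "fmt"

// sumPower uses the idiomatic range form. There is no index to maintain, so
// adding branches to the body later cannot break the iteration itself.
func sumPower(powers []int64) int64 {
	var total int64
	for _, p := range powers {
		total += p
	}
	return total
}

// sumPowerManual computes the same result with a custom counter. It is
// correct today, but a `continue` added above the increment would turn it
// into an infinite loop.
func sumPowerManual(powers []int64) int64 {
	var total int64
	i := 0
	for {
		if i >= len(powers) {
			break
		}
		total += powers[i]
		i++
	}
	return total
}

func main() {
	powers := []int64{10, 20, 30}
	fmt.Println(sumPower(powers), sumPowerManual(powers)) // 60 60
}
```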
  34. ## On performance
  35. It doesn't matter whether alternative implementations are 2x or 3x more
  36. performant if the software doesn't work, deadlocks, or cannot be debugged.
  37. By taking advantage of multicore concurrency, the Tendermint implementation
  38. will at least be within an order of magnitude of what is theoretically
  39. possible. The design philosophy of Tendermint, and the choice of Go as the
  40. implementation language, are intended to make the Tendermint implementation
  41. the standard specification for concurrent BFT software.
  42. By focusing on the message protocols (e.g. ABCI, p2p messages) and on
  43. encapsulation (e.g. the IAVL module, (relatively) independent reactors), we are
  44. both building a standard implementation to serve as the specification for
  45. future implementations in more optimizable languages like Rust, Java, and C++,
  46. and creating sufficiently performant software. Tendermint Core will
  47. never be as fast as future implementations of the Tendermint Spec, because Go
  48. isn't designed to be as fast as possible. The advantage of using Go is that we
  49. can develop the whole stack of modular components **faster** than in other
  50. languages.
  51. Furthermore, the real bottleneck is in the application layer, and it isn't
  52. necessary to support more than a sufficiently decentralized set of validators
  53. (e.g. 100 to 300 validators is sufficient, with delegated bonded PoS).
  54. Instead of optimizing Tendermint performance down to the metal, let's focus on
  55. other matters, namely the ability to ship feature-complete software
  56. that works well enough, can be debugged and maintained, and can serve as a spec
  57. for future implementations.
  58. ## On encapsulation
  59. In order to create maintainable, forward-optimizable software, it is critical
  60. to develop well-encapsulated objects that have well-understood properties, and
  61. to re-use these easy-to-use-correctly components as building blocks for further
  62. encapsulated meta-objects.
  63. For example, mutexes are cheap enough for Tendermint's design goals when there
  64. isn't goroutine contention, so we encourage creating concurrency-safe
  65. structures with struct-level mutexes. If they are used in the context of
  66. non-concurrent logic, then the performance is good enough. If they are used in
  67. the context of concurrent logic, then they still behave correctly.
  68. Examples of this design principle can be seen in the `types.ValidatorSet` struct
  69. and the `rand.Rand` struct. Each is a single struct declaration that can be used
  70. in both concurrent and non-concurrent logic, and because it is well encapsulated,
  71. it's easy to get the usage of the mutex right.
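A struct-level mutex of the kind described above might look like the following sketch. `AddrSet` and `addConcurrently` are hypothetical names used only for illustration; this is not the actual `types.ValidatorSet` implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// AddrSet is a hypothetical concurrency-safe set. The mutex is private to
// the struct, so callers cannot forget to take it or take it twice.
type AddrSet struct {
	mtx   sync.Mutex
	addrs map[string]struct{}
}

// NewAddrSet returns an empty set ready for use.
func NewAddrSet() *AddrSet {
	return &AddrSet{addrs: make(map[string]struct{})}
}

// Add inserts addr into the set.
func (s *AddrSet) Add(addr string) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	s.addrs[addr] = struct{}{}
}

// Has reports whether addr is in the set.
func (s *AddrSet) Has(addr string) bool {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	_, ok := s.addrs[addr]
	return ok
}

// Size returns the number of addresses in the set.
func (s *AddrSet) Size() int {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	return len(s.addrs)
}

// addConcurrently adds n distinct addresses from n goroutines. It exists only
// to exercise the set under concurrency.
func addConcurrently(s *AddrSet, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.Add(fmt.Sprintf("peer-%d", i))
		}(i)
	}
	wg.Wait()
}

func main() {
	s := NewAddrSet()
	s.Add("peer-a")       // non-concurrent use: an uncontended lock is cheap
	addConcurrently(s, 8) // concurrent use: same type, still correct
	fmt.Println(s.Size()) // 9
}
```

The same declaration serves both call sites; the only cost in the non-concurrent case is an uncontended lock.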
  72. ### example: rand.Rand
  73. The Go `math/rand` documentation states: "The default Source is safe for
  74. concurrent use by multiple goroutines, but Sources created by NewSource are
  75. not." The default package-level source is safe for concurrent use because it is
  76. protected by a mutex (see `lockedSource` in <https://golang.org/src/math/rand/rand.go>).
  77. But we shouldn't rely on the global source; we should create our own
  78. Rand/Source instances and use them, especially for determinism in testing.
  79. So it is reasonable to have rand.Rand be protected by a mutex. Whether we want
  80. our own implementation of Rand is another question, but the answer there is
  81. also in the affirmative. Sometimes you want to know where Rand is being used
  82. in your code, so it becomes a simple matter of dropping in a log statement to
  83. inject inspectability into Rand usage. Also, it is nice to be able to extend
  84. the functionality of Rand with custom methods. For these reasons, and for the
  85. reasons outlined in this design philosophy document, we should
  86. continue to use the rand.Rand object, with mutex protection.
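The following is a rough sketch of what a mutex-protected, seedable Rand with a usage hook and a custom method could look like. It wraps the standard library's `math/rand`; the `Rand` type, the `onUse` hook, and the `Str` method here are assumptions made for illustration and are not Tendermint's actual implementation.

```go
package main

import (
	"fmt"
	mrand "math/rand"
	"sync"
)

// Rand is a hypothetical wrapper that makes a seeded source safe for
// concurrent use.
type Rand struct {
	mtx   sync.Mutex
	rand  *mrand.Rand
	onUse func(method string) // optional hook, e.g. a log statement
}

// NewRand returns a Rand with its own source, so tests can fix the seed.
func NewRand(seed int64) *Rand {
	return &Rand{rand: mrand.New(mrand.NewSource(seed))}
}

// trace reports usage to the hook. It runs with the mutex held, so the hook
// must not call back into the same Rand.
func (r *Rand) trace(method string) {
	if r.onUse != nil {
		r.onUse(method)
	}
}

// Intn returns a pseudo-random number in [0, n).
func (r *Rand) Intn(n int) int {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	r.trace("Intn")
	return r.rand.Intn(n)
}

// Str is a custom method: a random lowercase string of the given length.
func (r *Rand) Str(length int) string {
	const letters = "abcdefghijklmnopqrstuvwxyz"
	r.mtx.Lock()
	defer r.mtx.Unlock()
	r.trace("Str")
	b := make([]byte, length)
	for i := range b {
		b[i] = letters[r.rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	a, b := NewRand(42), NewRand(42)
	calls := 0
	a.onUse = func(method string) { calls++ }
	same := a.Intn(100) == b.Intn(100) && a.Str(8) == b.Str(8)
	fmt.Println(same, calls) // true 2
}
```

Two instances built from the same seed produce the same sequence, which is what makes the wrapper useful for deterministic tests.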
  87. Another key aspect of good encapsulation is the choice of exposed vs unexposed
  88. methods. It should be clear to the reader of the code which methods are
  89. intended to be used in what context, and what safe usage is. Part of this is
  90. solved by keeping internal methods unexported. Another part of this is
  91. naming conventions on the methods (e.g. underscores) with good documentation,
  92. and code organization. If there are too many exposed methods and it isn't
  93. clear which methods have which side effects, then there is something wrong with
  94. the design of the abstractions, and it should be revisited.
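One way to make safe usage obvious is sketched below: exported methods take the lock, and unexported helpers carry a name that says the lock must already be held. The `Ledger` type is hypothetical, and the `Locked` suffix is an illustrative stand-in for whatever naming convention the codebase adopts.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrInsufficientFunds is returned when a debit exceeds the balance.
var ErrInsufficientFunds = errors.New("insufficient funds")

// Ledger is a hypothetical struct with a small exported surface.
type Ledger struct {
	mtx      sync.Mutex
	balances map[string]int64
}

// NewLedger copies the initial balances so callers cannot mutate them later.
func NewLedger(initial map[string]int64) *Ledger {
	b := make(map[string]int64, len(initial))
	for k, v := range initial {
		b[k] = v
	}
	return &Ledger{balances: b}
}

// Transfer is safe to call from any goroutine. It takes the lock once and
// composes the unexported helpers.
func (l *Ledger) Transfer(from, to string, amount int64) error {
	if amount < 0 {
		return errors.New("negative amount")
	}
	l.mtx.Lock()
	defer l.mtx.Unlock()
	if err := l.debitLocked(from, amount); err != nil {
		return err
	}
	l.creditLocked(to, amount)
	return nil
}

// Balance is safe to call from any goroutine.
func (l *Ledger) Balance(name string) int64 {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	return l.balances[name]
}

// debitLocked must only be called with l.mtx held.
func (l *Ledger) debitLocked(name string, amount int64) error {
	if l.balances[name] < amount {
		return ErrInsufficientFunds
	}
	l.balances[name] -= amount
	return nil
}

// creditLocked must only be called with l.mtx held.
func (l *Ledger) creditLocked(name string, amount int64) {
	l.balances[name] += amount
}

func main() {
	l := NewLedger(map[string]int64{"alice": 10})
	err := l.Transfer("alice", "bob", 4)
	fmt.Println(err, l.Balance("alice"), l.Balance("bob")) // <nil> 6 4
	fmt.Println(l.Transfer("alice", "bob", 100))           // insufficient funds
}
```

Code outside the package can reach only `Transfer` and `Balance`, so it cannot call a helper without the lock.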
  95. ## On concurrency
  96. In order for Tendermint to remain relevant in the years to come, it is vital
  97. for Tendermint to take advantage of multicore architectures. Due to the nature
  98. of the problem, namely reaching consensus across a concurrent p2p gossip network
  99. and handling RPC requests for a large number of consuming subscribers,
  100. Tendermint development unavoidably requires expertise in concurrency
  101. design, especially when it comes to the reactor design and RPC
  102. request handling.
  103. # Guidelines
  104. Here are some guidelines for designing for (sufficient) performance and concurrency:
  105. * Mutex locks are cheap enough when there isn't contention.
  106. * Do not optimize code without analytical or observed proof that it is in a hot path.
  107. * Don't over-use channels when mutex locks with encapsulation are sufficient.
  108. * The need to drain channels is often a hint of unconsidered edge cases.
  109. * The creation of O(N) one-off goroutines is generally technical debt that
  110. needs to get addressed sooner rather than later. Avoid creating too many
  111. goroutines as a patch around incomplete concurrency design, or at least be
  112. aware of the debt and do not add to it. On the other hand, Tendermint
  113. is designed to have a limited number of peers (e.g. 10 or 20), so the creation
  114. of O(C) goroutines for each of O(P) peers is still O(C\*P) = constant.
  115. * Use defer statements to unlock wherever possible. If you want to unlock sooner,
  116. try to create more modular functions that do make use of defer statements.
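As one possible way to avoid O(N) one-off goroutines, the sketch below processes N items with a fixed number of workers. `processAll` and its parameters are hypothetical and not part of any Tendermint API. The job channel is closed by the producer, so nothing needs to be drained.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll applies handle to every item using at most `workers` goroutines,
// regardless of how many items there are.
func processAll(items []int, workers int, handle func(int) int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(items))
	jobs := make(chan int) // carries indices into items

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is received by exactly one worker, so writes
				// to results never overlap.
				results[i] = handle(items[i])
			}
		}()
	}

	for i := range items {
		jobs <- i
	}
	close(jobs) // workers exit their range loops once the channel is empty
	wg.Wait()
	return results
}

func main() {
	square := func(x int) int { return x * x }
	fmt.Println(processAll([]int{1, 2, 3, 4, 5}, 2, square)) // [1 4 9 16 25]
}
```

The goroutine count depends on `workers`, a constant chosen by the caller, rather than on the size of the input.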
  117. # Mantras
  118. * Premature optimization kills
  119. * Readability is paramount
  120. * Beautiful is better than fast.
  121. * In the face of ambiguity, refuse the temptation to guess.
  122. * In the face of bugs, refuse the temptation to cover the bug.
  123. * There should be one-- and preferably only one --obvious way to do it.