package privval

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"time"

	"github.com/gogo/protobuf/proto"

	"github.com/tendermint/tendermint/crypto"
	"github.com/tendermint/tendermint/crypto/ed25519"
	"github.com/tendermint/tendermint/crypto/secp256k1"
	"github.com/tendermint/tendermint/internal/libs/protoio"
	"github.com/tendermint/tendermint/internal/libs/tempfile"
	tmbytes "github.com/tendermint/tendermint/libs/bytes"
	tmjson "github.com/tendermint/tendermint/libs/json"
	tmos "github.com/tendermint/tendermint/libs/os"
	tmtime "github.com/tendermint/tendermint/libs/time"
	tmproto "github.com/tendermint/tendermint/proto/tendermint/types"
	"github.com/tendermint/tendermint/types"
)

// TODO: type ?
const (
	stepNone      int8 = 0 // Used to distinguish the initial state
	stepPropose   int8 = 1
	stepPrevote   int8 = 2
	stepPrecommit int8 = 3
)

// A vote is either stepPrevote or stepPrecommit.
func voteToStep(vote *tmproto.Vote) (int8, error) {
	switch vote.Type {
	case tmproto.PrevoteType:
		return stepPrevote, nil
	case tmproto.PrecommitType:
		return stepPrecommit, nil
	default:
		return 0, fmt.Errorf("unknown vote type: %v", vote.Type)
	}
}

//-------------------------------------------------------------------------------

// FilePVKey stores the immutable part of PrivValidator.
type FilePVKey struct {
	Address types.Address  `json:"address"`
	PubKey  crypto.PubKey  `json:"pub_key"`
	PrivKey crypto.PrivKey `json:"priv_key"`

	filePath string
}

// Save persists the FilePVKey to its filePath.
func (pvKey FilePVKey) Save() error {
	outFile := pvKey.filePath
	if outFile == "" {
		return errors.New("cannot save PrivValidator key: filePath not set")
	}

	jsonBytes, err := tmjson.MarshalIndent(pvKey, "", "  ")
	if err != nil {
		return err
	}

	return tempfile.WriteFileAtomic(outFile, jsonBytes, 0600)
}

//-------------------------------------------------------------------------------

// FilePVLastSignState stores the mutable part of PrivValidator.
type FilePVLastSignState struct {
	Height    int64            `json:"height,string"`
	Round     int32            `json:"round"`
	Step      int8             `json:"step"`
	Signature []byte           `json:"signature,omitempty"`
	SignBytes tmbytes.HexBytes `json:"signbytes,omitempty"`

	filePath string
}

// CheckHRS checks the given height, round, step (HRS) against that of the
// FilePVLastSignState. It returns an error if the arguments constitute a regression,
// or if they match but the SignBytes are empty.
// The returned boolean indicates whether the last Signature should be reused -
// it returns true if the HRS matches the arguments and the SignBytes are not empty (indicating
// we have already signed for this HRS, and can reuse the existing signature).
// It panics if the HRS matches the arguments, there's a SignBytes, but no Signature.
func (lss *FilePVLastSignState) CheckHRS(height int64, round int32, step int8) (bool, error) {
	if lss.Height > height {
		return false, fmt.Errorf("height regression. Got %v, last height %v", height, lss.Height)
	}

	if lss.Height == height {
		if lss.Round > round {
			return false, fmt.Errorf("round regression at height %v. Got %v, last round %v", height, round, lss.Round)
		}

		if lss.Round == round {
			if lss.Step > step {
				return false, fmt.Errorf(
					"step regression at height %v round %v. Got %v, last step %v",
					height,
					round,
					step,
					lss.Step,
				)
			} else if lss.Step == step {
				if lss.SignBytes != nil {
					if lss.Signature == nil {
						panic("pv: Signature is nil but SignBytes is not!")
					}
					return true, nil
				}
				return false, errors.New("no SignBytes found")
			}
		}
	}
	return false, nil
}

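/*
The ordering rules applied by CheckHRS can be illustrated with a standalone
sketch. It is not part of this package: lastSigned and checkHRS below are
hypothetical, trimmed-down stand-ins for FilePVLastSignState and its method,
written only to show which (height, round, step) triples are rejected as
regressions, which are signed fresh, and which reuse the stored signature.

```go
package main

import (
	"errors"
	"fmt"
)

// lastSigned is a hypothetical stand-in for FilePVLastSignState.
type lastSigned struct {
	Height    int64
	Round     int32
	Step      int8
	SignBytes []byte
	Signature []byte
}

// checkHRS compares (height, round, step) lexicographically against the last
// signed state. Anything strictly lower is a regression, anything strictly
// higher may be signed fresh, and an exact match reuses the stored signature.
func (l *lastSigned) checkHRS(height int64, round int32, step int8) (bool, error) {
	switch {
	case l.Height > height:
		return false, fmt.Errorf("height regression: got %d, last %d", height, l.Height)
	case l.Height < height:
		return false, nil
	case l.Round > round:
		return false, fmt.Errorf("round regression: got %d, last %d", round, l.Round)
	case l.Round < round:
		return false, nil
	case l.Step > step:
		return false, fmt.Errorf("step regression: got %d, last %d", step, l.Step)
	case l.Step < step:
		return false, nil
	}
	// Exact HRS match: a stored signature must exist to be reused.
	if l.SignBytes == nil {
		return false, errors.New("no SignBytes found")
	}
	if l.Signature == nil {
		panic("Signature is nil but SignBytes is not")
	}
	return true, nil
}

func main() {
	last := &lastSigned{
		Height:    10,
		Round:     1,
		Step:      2,
		SignBytes: []byte("sign-bytes"),
		Signature: []byte("signature"),
	}
	// Each case is a (height, round, step) request checked against last.
	cases := []struct {
		h int64
		r int32
		s int8
	}{
		{9, 5, 3},  // lower height
		{10, 0, 3}, // same height, lower round
		{10, 1, 1}, // same height and round, lower step
		{10, 1, 2}, // exact match
		{10, 1, 3}, // later step
		{11, 0, 1}, // later height
	}
	for _, c := range cases {
		reuse, err := last.checkHRS(c.h, c.r, c.s)
		fmt.Printf("H=%d R=%d S=%d reuse=%v err=%v\n", c.h, c.r, c.s, reuse, err)
	}
}
```

The flat switch is only a readability choice for the sketch; it is intended to
accept and reject the same triples as the nested conditionals in CheckHRS.
*/
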
// Save persists the FilePVLastSignState to its filePath.
func (lss *FilePVLastSignState) Save() error {
	outFile := lss.filePath
	if outFile == "" {
		return errors.New("cannot save FilePVLastSignState: filePath not set")
	}
	jsonBytes, err := json.MarshalIndent(lss, "", "  ")
	if err != nil {
		return err
	}

	return tempfile.WriteFileAtomic(outFile, jsonBytes, 0600)
}

//-------------------------------------------------------------------------------

// FilePV implements PrivValidator using data persisted to disk
// to prevent double signing.
// NOTE: the directories containing pv.Key.filePath and pv.LastSignState.filePath must already exist.
// It includes the LastSignature and LastSignBytes so we don't lose the signature
// if the process crashes after signing but before the resulting consensus message is processed.
type FilePV struct {
	Key           FilePVKey
	LastSignState FilePVLastSignState
}

var _ types.PrivValidator = (*FilePV)(nil)

// NewFilePV generates a new validator from the given key and paths.
func NewFilePV(privKey crypto.PrivKey, keyFilePath, stateFilePath string) *FilePV {
	return &FilePV{
		Key: FilePVKey{
			Address:  privKey.PubKey().Address(),
			PubKey:   privKey.PubKey(),
			PrivKey:  privKey,
			filePath: keyFilePath,
		},
		LastSignState: FilePVLastSignState{
			Step:     stepNone,
			filePath: stateFilePath,
		},
	}
}

// GenFilePV generates a new validator with a randomly generated private key
// and sets the filePaths, but does not call Save().
// An empty keyType defaults to ed25519.
func GenFilePV(keyFilePath, stateFilePath, keyType string) (*FilePV, error) {
	switch keyType {
	case types.ABCIPubKeyTypeSecp256k1:
		return NewFilePV(secp256k1.GenPrivKey(), keyFilePath, stateFilePath), nil
	case "", types.ABCIPubKeyTypeEd25519:
		return NewFilePV(ed25519.GenPrivKey(), keyFilePath, stateFilePath), nil
	default:
		return nil, fmt.Errorf("key type: %s is not supported", keyType)
	}
}

// LoadFilePV loads a FilePV from the filePaths. The FilePV handles double
// signing prevention by persisting data to the stateFilePath. If either file
// cannot be read or parsed, an error is returned.
func LoadFilePV(keyFilePath, stateFilePath string) (*FilePV, error) {
	return loadFilePV(keyFilePath, stateFilePath, true)
}

// LoadFilePVEmptyState loads a FilePV from the given keyFilePath, with an empty LastSignState.
// If the key file cannot be read or parsed, an error is returned.
func LoadFilePVEmptyState(keyFilePath, stateFilePath string) (*FilePV, error) {
	return loadFilePV(keyFilePath, stateFilePath, false)
}

// If loadState is true, we load from the stateFilePath. Otherwise, we use an empty LastSignState.
func loadFilePV(keyFilePath, stateFilePath string, loadState bool) (*FilePV, error) {
	keyJSONBytes, err := os.ReadFile(keyFilePath)
	if err != nil {
		return nil, err
	}
	pvKey := FilePVKey{}
	err = tmjson.Unmarshal(keyJSONBytes, &pvKey)
	if err != nil {
		return nil, fmt.Errorf("error reading PrivValidator key from %v: %w", keyFilePath, err)
	}

	// overwrite pubkey and address for convenience
	pvKey.PubKey = pvKey.PrivKey.PubKey()
	pvKey.Address = pvKey.PubKey.Address()
	pvKey.filePath = keyFilePath

	pvState := FilePVLastSignState{}

	if loadState {
		stateJSONBytes, err := os.ReadFile(stateFilePath)
		if err != nil {
			return nil, err
		}

		err = json.Unmarshal(stateJSONBytes, &pvState)
		if err != nil {
			return nil, fmt.Errorf("error reading PrivValidator state from %v: %w", stateFilePath, err)
		}
	}

	pvState.filePath = stateFilePath

	return &FilePV{
		Key:           pvKey,
		LastSignState: pvState,
	}, nil
}

// LoadOrGenFilePV loads a FilePV from the given filePaths
// or else generates a new one and saves it to the filePaths.
func LoadOrGenFilePV(keyFilePath, stateFilePath string) (*FilePV, error) {
	if tmos.FileExists(keyFilePath) {
		pv, err := LoadFilePV(keyFilePath, stateFilePath)
		if err != nil {
			return nil, err
		}
		return pv, nil
	}
	pv, err := GenFilePV(keyFilePath, stateFilePath, "")
	if err != nil {
		return nil, err
	}

	if err := pv.Save(); err != nil {
		return nil, err
	}

	return pv, nil
}

// GetAddress returns the address of the validator.
// Implements PrivValidator.
func (pv *FilePV) GetAddress() types.Address {
	return pv.Key.Address
}

// GetPubKey returns the public key of the validator.
// Implements PrivValidator.
func (pv *FilePV) GetPubKey(ctx context.Context) (crypto.PubKey, error) {
	return pv.Key.PubKey, nil
}

// SignVote signs a canonical representation of the vote, along with the
// chainID. Implements PrivValidator.
func (pv *FilePV) SignVote(ctx context.Context, chainID string, vote *tmproto.Vote) error {
	if err := pv.signVote(chainID, vote); err != nil {
		return fmt.Errorf("error signing vote: %w", err)
	}
	return nil
}

// SignProposal signs a canonical representation of the proposal, along with
// the chainID. Implements PrivValidator.
func (pv *FilePV) SignProposal(ctx context.Context, chainID string, proposal *tmproto.Proposal) error {
	if err := pv.signProposal(chainID, proposal); err != nil {
		return fmt.Errorf("error signing proposal: %w", err)
	}
	return nil
}

// Save persists the FilePV to disk.
func (pv *FilePV) Save() error {
	if err := pv.Key.Save(); err != nil {
		return err
	}
	return pv.LastSignState.Save()
}

// Reset clears the last sign state (height, round, step, signature and
// sign bytes) and saves the FilePV to disk. The key is left unchanged.
// NOTE: Unsafe! Clearing the sign state removes the double-signing protection.
func (pv *FilePV) Reset() error {
	var sig []byte
	pv.LastSignState.Height = 0
	pv.LastSignState.Round = 0
	pv.LastSignState.Step = 0
	pv.LastSignState.Signature = sig
	pv.LastSignState.SignBytes = nil
	return pv.Save()
}

// String returns a string representation of the FilePV.
func (pv *FilePV) String() string {
	return fmt.Sprintf(
		"PrivValidator{%v LH:%v, LR:%v, LS:%v}",
		pv.GetAddress(),
		pv.LastSignState.Height,
		pv.LastSignState.Round,
		pv.LastSignState.Step,
	)
}

//------------------------------------------------------------------------------------

// signVote checks if the vote is good to sign and sets the vote signature.
// It may need to set the timestamp as well if the vote is otherwise the same as
// a previously signed vote (i.e. we crashed after signing but before the vote hit the WAL).
func (pv *FilePV) signVote(chainID string, vote *tmproto.Vote) error {
	step, err := voteToStep(vote)
	if err != nil {
		return err
	}

	height := vote.Height
	round := vote.Round
	lss := pv.LastSignState

	sameHRS, err := lss.CheckHRS(height, round, step)
	if err != nil {
		return err
	}

	signBytes := types.VoteSignBytes(chainID, vote)

	// We might crash before writing to the WAL,
	// causing us to try to re-sign for the same HRS.
	// If the sign bytes are the same, use the last signature.
	// If they only differ by timestamp, use the last timestamp and signature.
	// Otherwise, return an error.
	if sameHRS {
		if bytes.Equal(signBytes, lss.SignBytes) {
			vote.Signature = lss.Signature
		} else {
			timestamp, ok, err := checkVotesOnlyDifferByTimestamp(lss.SignBytes, signBytes)
			if err != nil {
				return err
			}
			if !ok {
				return errors.New("conflicting data")
			}

			vote.Timestamp = timestamp
			vote.Signature = lss.Signature
			return nil
		}
	}

	// It passed the checks. Sign the vote.
	sig, err := pv.Key.PrivKey.Sign(signBytes)
	if err != nil {
		return err
	}
	if err := pv.saveSigned(height, round, step, signBytes, sig); err != nil {
		return err
	}
	vote.Signature = sig
	return nil
}

// signProposal checks if the proposal is good to sign and sets the proposal signature.
// It may need to set the timestamp as well if the proposal is otherwise the same as
// a previously signed proposal (i.e. we crashed after signing but before the proposal hit the WAL).
func (pv *FilePV) signProposal(chainID string, proposal *tmproto.Proposal) error {
	height, round, step := proposal.Height, proposal.Round, stepPropose

	lss := pv.LastSignState

	sameHRS, err := lss.CheckHRS(height, round, step)
	if err != nil {
		return err
	}

	signBytes := types.ProposalSignBytes(chainID, proposal)

	// We might crash before writing to the WAL,
	// causing us to try to re-sign for the same HRS.
	// If the sign bytes are the same, use the last signature.
	// If they only differ by timestamp, use the last timestamp and signature.
	// Otherwise, return an error.
	if sameHRS {
		if bytes.Equal(signBytes, lss.SignBytes) {
			proposal.Signature = lss.Signature
		} else {
			timestamp, ok, err := checkProposalsOnlyDifferByTimestamp(lss.SignBytes, signBytes)
			if err != nil {
				return err
			}
			if !ok {
				return errors.New("conflicting data")
			}
			proposal.Timestamp = timestamp
			proposal.Signature = lss.Signature
			return nil
		}
	}

	// It passed the checks. Sign the proposal.
	sig, err := pv.Key.PrivKey.Sign(signBytes)
	if err != nil {
		return err
	}
	if err := pv.saveSigned(height, round, step, signBytes, sig); err != nil {
		return err
	}
	proposal.Signature = sig
	return nil
}

// saveSigned records the height/round/step, sign bytes and signature
// in the last sign state and persists it to disk.
func (pv *FilePV) saveSigned(height int64, round int32, step int8, signBytes []byte, sig []byte) error {
	pv.LastSignState.Height = height
	pv.LastSignState.Round = round
	pv.LastSignState.Step = step
	pv.LastSignState.Signature = sig
	pv.LastSignState.SignBytes = signBytes
	return pv.LastSignState.Save()
}

//-----------------------------------------------------------------------------------------

// checkVotesOnlyDifferByTimestamp returns the timestamp from lastSignBytes
// and reports whether the only difference between the two votes is their timestamp.
func checkVotesOnlyDifferByTimestamp(lastSignBytes, newSignBytes []byte) (time.Time, bool, error) {
	var lastVote, newVote tmproto.CanonicalVote
	if err := protoio.UnmarshalDelimited(lastSignBytes, &lastVote); err != nil {
		return time.Time{}, false, fmt.Errorf("LastSignBytes cannot be unmarshalled into vote: %w", err)
	}
	if err := protoio.UnmarshalDelimited(newSignBytes, &newVote); err != nil {
		return time.Time{}, false, fmt.Errorf("signBytes cannot be unmarshalled into vote: %w", err)
	}

	lastTime := lastVote.Timestamp
	// set the times to the same value and check equality
	now := tmtime.Now()
	lastVote.Timestamp = now
	newVote.Timestamp = now

	return lastTime, proto.Equal(&newVote, &lastVote), nil
}

// checkProposalsOnlyDifferByTimestamp returns the timestamp from lastSignBytes
// and reports whether the only difference between the two proposals is their timestamp.
func checkProposalsOnlyDifferByTimestamp(lastSignBytes, newSignBytes []byte) (time.Time, bool, error) {
	var lastProposal, newProposal tmproto.CanonicalProposal
	if err := protoio.UnmarshalDelimited(lastSignBytes, &lastProposal); err != nil {
		return time.Time{}, false, fmt.Errorf("LastSignBytes cannot be unmarshalled into proposal: %w", err)
	}
	if err := protoio.UnmarshalDelimited(newSignBytes, &newProposal); err != nil {
		return time.Time{}, false, fmt.Errorf("signBytes cannot be unmarshalled into proposal: %w", err)
	}

	lastTime := lastProposal.Timestamp
	// set the times to the same value and check equality
	now := tmtime.Now()
	lastProposal.Timestamp = now
	newProposal.Timestamp = now

	return lastTime, proto.Equal(&newProposal, &lastProposal), nil
}