Close and retry a RemoteSigner on err (#2923) * Close and recreate a RemoteSigner on err * Update changelog * Address Anton's comments / suggestions: - update changelog - restart TCPVal - shut down on `ErrUnexpectedResponse` * re-init remote signer client with fresh connection if Ping fails - add/update TODOs in secret connection - rename tcp.go -> tcp_client.go, same with ipc to clarify their purpose * account for `conn returned by waitConnection can be `nil` - also add TODO about RemoteSigner conn field * Tests for retrying: IPC / TCP - shorter info log on success - set conn and use it in tests to close conn * Tests for retrying: IPC / TCP - shorter info log on success - set conn and use it in tests to close conn - add rwmutex for conn field in IPC * comments and doc.go * fix ipc tests. fixes #2677 * use constants for tests * cleanup some error statements * fixes #2784, race in tests * remove print statement * minor fixes from review * update comment on sts spec * cosmetics * p2p/conn: add failing tests * p2p/conn: make SecretConnection thread safe * changelog * IPCVal signer refactor - use a .reset() method - don't use embedded RemoteSignerClient - guard RemoteSignerClient with mutex - drop the .conn - expose Close() on RemoteSignerClient * apply IPCVal refactor to TCPVal * remove mtx from RemoteSignerClient * consolidate IPCVal and TCPVal, fixes #3104 - done in tcp_client.go - now called SocketVal - takes a listener in the constructor - make tcpListener and unixListener contain all the differences * delete ipc files * introduce unix and tcp dialer for RemoteSigner * rename files - drop tcp_ prefix - rename priv_validator.go to file.go * bring back listener options * fix node * fix priv_val_server * fix node test * minor cleanup and comments
5 years ago
privval: improve Remote Signer implementation (#3351) This issue is related to #3107 This is a first renaming/refactoring step before reworking and removing heartbeats. As discussed with @Liamsi , we preferred to go for a couple of independent and separate PRs to simplify review work. The changes: Help to clarify the relation between the validator and remote signer endpoints Differentiate between timeouts and deadlines Prepare to encapsulate networking related code behind RemoteSigner in the next PR My intention is to separate and encapsulate the "network related" code from the actual signer. SignerRemote ---(uses/contains)--> SignerValidatorEndpoint <--(connects to)--> SignerServiceEndpoint ---> SignerService (future.. not here yet but would like to decouple too) All reconnection/heartbeat/whatever code goes in the endpoints. Signer[Remote/Service] do not need to know about that. I agree Endpoint may not be the perfect name. I tried to find something "Go-ish" enough. It is a common name in go-kit, kubernetes, etc. Right now: SignerValidatorEndpoint: handles the listener contains SignerRemote Implements the PrivValidator interface connects and sets a connection object in a contained SignerRemote delegates PrivValidator some calls to SignerRemote which in turn uses the conn object that was set externally SignerRemote: Implements the PrivValidator interface read/writes from a connection object directly handles heartbeats SignerServiceEndpoint: Does most things in a single place delegates to a PrivValidator IIRC. * cleanup * Refactoring step 1 * Refactoring step 2 * move messages to another file * mark for future work / next steps * mark deprecated classes in docs * Fix linter problems * additional linter fixes
5 years ago
package privval

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"time"

	"github.com/gogo/protobuf/proto"

	"github.com/tendermint/tendermint/crypto"
	"github.com/tendermint/tendermint/crypto/ed25519"
	"github.com/tendermint/tendermint/crypto/secp256k1"
	"github.com/tendermint/tendermint/internal/jsontypes"
	"github.com/tendermint/tendermint/internal/libs/protoio"
	"github.com/tendermint/tendermint/internal/libs/tempfile"
	tmbytes "github.com/tendermint/tendermint/libs/bytes"
	tmjson "github.com/tendermint/tendermint/libs/json"
	tmos "github.com/tendermint/tendermint/libs/os"
	tmtime "github.com/tendermint/tendermint/libs/time"
	tmproto "github.com/tendermint/tendermint/proto/tendermint/types"
	"github.com/tendermint/tendermint/types"
)

// TODO: type ?
const (
	stepNone      int8 = 0 // Used to distinguish the initial state
	stepPropose   int8 = 1
	stepPrevote   int8 = 2
	stepPrecommit int8 = 3
)

// A vote is either stepPrevote or stepPrecommit.
func voteToStep(vote *tmproto.Vote) (int8, error) {
	switch vote.Type {
	case tmproto.PrevoteType:
		return stepPrevote, nil
	case tmproto.PrecommitType:
		return stepPrecommit, nil
	default:
		return 0, fmt.Errorf("unknown vote type: %v", vote.Type)
	}
}

//-------------------------------------------------------------------------------

// FilePVKey stores the immutable part of PrivValidator.
type FilePVKey struct {
	Address types.Address  `json:"address"`
	PubKey  crypto.PubKey  `json:"pub_key"`
	PrivKey crypto.PrivKey `json:"priv_key"`

	filePath string
}

func (pvKey FilePVKey) MarshalJSON() ([]byte, error) {
	pubk, err := jsontypes.Marshal(pvKey.PubKey)
	if err != nil {
		return nil, err
	}
	privk, err := jsontypes.Marshal(pvKey.PrivKey)
	if err != nil {
		return nil, err
	}
	return json.Marshal(struct {
		Address types.Address   `json:"address"`
		PubKey  json.RawMessage `json:"pub_key"`
		PrivKey json.RawMessage `json:"priv_key"`
	}{Address: pvKey.Address, PubKey: pubk, PrivKey: privk})
}

// Save persists the FilePVKey to its filePath.
func (pvKey FilePVKey) Save() error {
	outFile := pvKey.filePath
	if outFile == "" {
		return errors.New("cannot save PrivValidator key: filePath not set")
	}

	jsonBytes, err := tmjson.MarshalIndent(pvKey, "", " ")
	if err != nil {
		return err
	}
	return tempfile.WriteFileAtomic(outFile, jsonBytes, 0600)
}

//-------------------------------------------------------------------------------

// FilePVLastSignState stores the mutable part of PrivValidator.
type FilePVLastSignState struct {
	Height    int64            `json:"height,string"`
	Round     int32            `json:"round"`
	Step      int8             `json:"step"`
	Signature []byte           `json:"signature,omitempty"`
	SignBytes tmbytes.HexBytes `json:"signbytes,omitempty"`

	filePath string
}

// CheckHRS checks the given height, round, step (HRS) against that of the
// FilePVLastSignState. It returns an error if the arguments constitute a regression,
// or if they match but the SignBytes are empty.
// The returned boolean indicates whether the last Signature should be reused:
// it is true if the HRS matches the arguments and the SignBytes are not empty (indicating
// we have already signed for this HRS, and can reuse the existing signature).
// It panics if the HRS matches the arguments, there's a SignBytes, but no Signature.
func (lss *FilePVLastSignState) CheckHRS(height int64, round int32, step int8) (bool, error) {
	if lss.Height > height {
		return false, fmt.Errorf("height regression. Got %v, last height %v", height, lss.Height)
	}

	if lss.Height == height {
		if lss.Round > round {
			return false, fmt.Errorf("round regression at height %v. Got %v, last round %v", height, round, lss.Round)
		}

		if lss.Round == round {
			if lss.Step > step {
				return false, fmt.Errorf(
					"step regression at height %v round %v. Got %v, last step %v",
					height,
					round,
					step,
					lss.Step,
				)
			} else if lss.Step == step {
				if lss.SignBytes != nil {
					if lss.Signature == nil {
						panic("pv: Signature is nil but SignBytes is not!")
					}
					return true, nil
				}
				return false, errors.New("no SignBytes found")
			}
		}
	}
	return false, nil
}

// Save persists the FilePVLastSignState to its filePath.
func (lss *FilePVLastSignState) Save() error {
	outFile := lss.filePath
	if outFile == "" {
		return errors.New("cannot save FilePVLastSignState: filePath not set")
	}
	jsonBytes, err := json.MarshalIndent(lss, "", " ")
	if err != nil {
		return err
	}
	return tempfile.WriteFileAtomic(outFile, jsonBytes, 0600)
}

//-------------------------------------------------------------------------------

// FilePV implements PrivValidator using data persisted to disk
// to prevent double signing.
// NOTE: the directories containing pv.Key.filePath and pv.LastSignState.filePath must already exist.
// It includes the LastSignature and LastSignBytes so we don't lose the signature
// if the process crashes after signing but before the resulting consensus message is processed.
type FilePV struct {
	Key           FilePVKey
	LastSignState FilePVLastSignState
}

var _ types.PrivValidator = (*FilePV)(nil)

// NewFilePV generates a new validator from the given key and paths.
func NewFilePV(privKey crypto.PrivKey, keyFilePath, stateFilePath string) *FilePV {
	return &FilePV{
		Key: FilePVKey{
			Address:  privKey.PubKey().Address(),
			PubKey:   privKey.PubKey(),
			PrivKey:  privKey,
			filePath: keyFilePath,
		},
		LastSignState: FilePVLastSignState{
			Step:     stepNone,
			filePath: stateFilePath,
		},
	}
}

// GenFilePV generates a new validator with randomly generated private key
// and sets the filePaths, but does not call Save().
func GenFilePV(keyFilePath, stateFilePath, keyType string) (*FilePV, error) {
	switch keyType {
	case types.ABCIPubKeyTypeSecp256k1:
		return NewFilePV(secp256k1.GenPrivKey(), keyFilePath, stateFilePath), nil
	case "", types.ABCIPubKeyTypeEd25519:
		return NewFilePV(ed25519.GenPrivKey(), keyFilePath, stateFilePath), nil
	default:
		return nil, fmt.Errorf("key type: %s is not supported", keyType)
	}
}

// LoadFilePV loads a FilePV from the filePaths. The FilePV handles double
// signing prevention by persisting data to the stateFilePath. If either file
// cannot be read or parsed, an error is returned.
func LoadFilePV(keyFilePath, stateFilePath string) (*FilePV, error) {
	return loadFilePV(keyFilePath, stateFilePath, true)
}

// LoadFilePVEmptyState loads a FilePV from the given keyFilePath, with an empty LastSignState.
// If the key file cannot be read or parsed, an error is returned.
func LoadFilePVEmptyState(keyFilePath, stateFilePath string) (*FilePV, error) {
	return loadFilePV(keyFilePath, stateFilePath, false)
}

// If loadState is true, we load from the stateFilePath. Otherwise, we use an empty LastSignState.
func loadFilePV(keyFilePath, stateFilePath string, loadState bool) (*FilePV, error) {
	keyJSONBytes, err := os.ReadFile(keyFilePath)
	if err != nil {
		return nil, err
	}
	pvKey := FilePVKey{}
	err = tmjson.Unmarshal(keyJSONBytes, &pvKey)
	if err != nil {
		return nil, fmt.Errorf("error reading PrivValidator key from %v: %w", keyFilePath, err)
	}

	// overwrite pubkey and address for convenience
	pvKey.PubKey = pvKey.PrivKey.PubKey()
	pvKey.Address = pvKey.PubKey.Address()
	pvKey.filePath = keyFilePath

	pvState := FilePVLastSignState{}

	if loadState {
		stateJSONBytes, err := os.ReadFile(stateFilePath)
		if err != nil {
			return nil, err
		}

		err = json.Unmarshal(stateJSONBytes, &pvState)
		if err != nil {
			return nil, fmt.Errorf("error reading PrivValidator state from %v: %w", stateFilePath, err)
		}
	}

	pvState.filePath = stateFilePath

	return &FilePV{
		Key:           pvKey,
		LastSignState: pvState,
	}, nil
}

// LoadOrGenFilePV loads a FilePV from the given filePaths
// or else generates a new one and saves it to the filePaths.
func LoadOrGenFilePV(keyFilePath, stateFilePath string) (*FilePV, error) {
	if tmos.FileExists(keyFilePath) {
		pv, err := LoadFilePV(keyFilePath, stateFilePath)
		if err != nil {
			return nil, err
		}
		return pv, nil
	}
	pv, err := GenFilePV(keyFilePath, stateFilePath, "")
	if err != nil {
		return nil, err
	}

	if err := pv.Save(); err != nil {
		return nil, err
	}

	return pv, nil
}

// GetAddress returns the address of the validator.
// Implements PrivValidator.
func (pv *FilePV) GetAddress() types.Address {
	return pv.Key.Address
}

// GetPubKey returns the public key of the validator.
// Implements PrivValidator.
func (pv *FilePV) GetPubKey(ctx context.Context) (crypto.PubKey, error) {
	return pv.Key.PubKey, nil
}

// SignVote signs a canonical representation of the vote, along with the
// chainID. Implements PrivValidator.
func (pv *FilePV) SignVote(ctx context.Context, chainID string, vote *tmproto.Vote) error {
	if err := pv.signVote(chainID, vote); err != nil {
		return fmt.Errorf("error signing vote: %w", err)
	}
	return nil
}

// SignProposal signs a canonical representation of the proposal, along with
// the chainID. Implements PrivValidator.
func (pv *FilePV) SignProposal(ctx context.Context, chainID string, proposal *tmproto.Proposal) error {
	if err := pv.signProposal(chainID, proposal); err != nil {
		return fmt.Errorf("error signing proposal: %w", err)
	}
	return nil
}

// Save persists the FilePV to disk.
func (pv *FilePV) Save() error {
	if err := pv.Key.Save(); err != nil {
		return err
	}
	return pv.LastSignState.Save()
}

// Reset resets all fields in the FilePV.
// NOTE: Unsafe!
func (pv *FilePV) Reset() error {
	var sig []byte
	pv.LastSignState.Height = 0
	pv.LastSignState.Round = 0
	pv.LastSignState.Step = 0
	pv.LastSignState.Signature = sig
	pv.LastSignState.SignBytes = nil
	return pv.Save()
}

// String returns a string representation of the FilePV.
func (pv *FilePV) String() string {
	return fmt.Sprintf(
		"PrivValidator{%v LH:%v, LR:%v, LS:%v}",
		pv.GetAddress(),
		pv.LastSignState.Height,
		pv.LastSignState.Round,
		pv.LastSignState.Step,
	)
}

//------------------------------------------------------------------------------------

// signVote checks if the vote is good to sign and sets the vote signature.
// It may need to set the timestamp as well if the vote is otherwise the same as
// a previously signed vote (ie. we crashed after signing but before the vote hit the WAL).
func (pv *FilePV) signVote(chainID string, vote *tmproto.Vote) error {
	step, err := voteToStep(vote)
	if err != nil {
		return err
	}
	height := vote.Height
	round := vote.Round
	lss := pv.LastSignState

	sameHRS, err := lss.CheckHRS(height, round, step)
	if err != nil {
		return err
	}

	signBytes := types.VoteSignBytes(chainID, vote)

	// We might crash before writing to the wal,
	// causing us to try to re-sign for the same HRS.
	// If signbytes are the same, use the last signature.
	// If they only differ by timestamp, use last timestamp and signature
	// Otherwise, return error
	if sameHRS {
		if bytes.Equal(signBytes, lss.SignBytes) {
			vote.Signature = lss.Signature
		} else {
			timestamp, ok, err := checkVotesOnlyDifferByTimestamp(lss.SignBytes, signBytes)
			if err != nil {
				return err
			}
			if !ok {
				return errors.New("conflicting data")
			}

			vote.Timestamp = timestamp
			vote.Signature = lss.Signature
			return nil
		}
	}

	// It passed the checks. Sign the vote
	sig, err := pv.Key.PrivKey.Sign(signBytes)
	if err != nil {
		return err
	}
	if err := pv.saveSigned(height, round, step, signBytes, sig); err != nil {
		return err
	}
	vote.Signature = sig
	return nil
}

// signProposal checks if the proposal is good to sign and sets the proposal signature.
// It may need to set the timestamp as well if the proposal is otherwise the same as
// a previously signed proposal (ie. we crashed after signing but before the proposal hit the WAL).
func (pv *FilePV) signProposal(chainID string, proposal *tmproto.Proposal) error {
	height, round, step := proposal.Height, proposal.Round, stepPropose

	lss := pv.LastSignState

	sameHRS, err := lss.CheckHRS(height, round, step)
	if err != nil {
		return err
	}

	signBytes := types.ProposalSignBytes(chainID, proposal)

	// We might crash before writing to the wal,
	// causing us to try to re-sign for the same HRS.
	// If signbytes are the same, use the last signature.
	// If they only differ by timestamp, use last timestamp and signature
	// Otherwise, return error
	if sameHRS {
		if bytes.Equal(signBytes, lss.SignBytes) {
			proposal.Signature = lss.Signature
		} else {
			timestamp, ok, err := checkProposalsOnlyDifferByTimestamp(lss.SignBytes, signBytes)
			if err != nil {
				return err
			}
			if !ok {
				return errors.New("conflicting data")
			}
			proposal.Timestamp = timestamp
			proposal.Signature = lss.Signature
			return nil
		}
	}

	// It passed the checks. Sign the proposal
	sig, err := pv.Key.PrivKey.Sign(signBytes)
	if err != nil {
		return err
	}
	if err := pv.saveSigned(height, round, step, signBytes, sig); err != nil {
		return err
	}
	proposal.Signature = sig
	return nil
}

// saveSigned persists the height/round/step, sign bytes, and signature.
func (pv *FilePV) saveSigned(height int64, round int32, step int8, signBytes []byte, sig []byte) error {
	pv.LastSignState.Height = height
	pv.LastSignState.Round = round
	pv.LastSignState.Step = step
	pv.LastSignState.Signature = sig
	pv.LastSignState.SignBytes = signBytes
	return pv.LastSignState.Save()
}

//-----------------------------------------------------------------------------------------

// returns the timestamp from the lastSignBytes.
// returns true if the only difference in the votes is their timestamp.
func checkVotesOnlyDifferByTimestamp(lastSignBytes, newSignBytes []byte) (time.Time, bool, error) {
	var lastVote, newVote tmproto.CanonicalVote
	if err := protoio.UnmarshalDelimited(lastSignBytes, &lastVote); err != nil {
		return time.Time{}, false, fmt.Errorf("LastSignBytes cannot be unmarshalled into vote: %w", err)
	}
	if err := protoio.UnmarshalDelimited(newSignBytes, &newVote); err != nil {
		return time.Time{}, false, fmt.Errorf("signBytes cannot be unmarshalled into vote: %w", err)
	}

	lastTime := lastVote.Timestamp
	// set the times to the same value and check equality
	now := tmtime.Now()
	lastVote.Timestamp = now
	newVote.Timestamp = now

	return lastTime, proto.Equal(&newVote, &lastVote), nil
}

// returns the timestamp from the lastSignBytes.
// returns true if the only difference in the proposals is their timestamp
func checkProposalsOnlyDifferByTimestamp(lastSignBytes, newSignBytes []byte) (time.Time, bool, error) {
	var lastProposal, newProposal tmproto.CanonicalProposal
	if err := protoio.UnmarshalDelimited(lastSignBytes, &lastProposal); err != nil {
		return time.Time{}, false, fmt.Errorf("LastSignBytes cannot be unmarshalled into proposal: %w", err)
	}
	if err := protoio.UnmarshalDelimited(newSignBytes, &newProposal); err != nil {
		return time.Time{}, false, fmt.Errorf("signBytes cannot be unmarshalled into proposal: %w", err)
	}

	lastTime := lastProposal.Timestamp
	// set the times to the same value and check equality
	now := tmtime.Now()
	lastProposal.Timestamp = now
	newProposal.Timestamp = now

	return lastTime, proto.Equal(&newProposal, &lastProposal), nil
}
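
The height/round/step (HRS) ordering described in the CheckHRS comment can be tried in isolation. The following is a minimal standalone sketch, not the package's implementation: `lastSigned` and `checkHRS` are hypothetical stand-ins for FilePVLastSignState and CheckHRS, and the sketch returns an error where the original panics on SignBytes without a Signature.

```go
package main

import (
	"errors"
	"fmt"
)

// lastSigned is a hypothetical, trimmed-down stand-in for FilePVLastSignState.
type lastSigned struct {
	Height    int64
	Round     int32
	Step      int8
	SignBytes []byte
	Signature []byte
}

// checkHRS compares height first, then round, then step. It returns an error
// on any regression, (false, nil) when the request is strictly ahead of the
// last signed state, and (true, nil) when the HRS matches and a previous
// signature is available for reuse.
func (l lastSigned) checkHRS(height int64, round int32, step int8) (bool, error) {
	switch {
	case l.Height > height:
		return false, fmt.Errorf("height regression: got %d, last %d", height, l.Height)
	case l.Height < height:
		return false, nil
	case l.Round > round:
		return false, fmt.Errorf("round regression: got %d, last %d", round, l.Round)
	case l.Round < round:
		return false, nil
	case l.Step > step:
		return false, fmt.Errorf("step regression: got %d, last %d", step, l.Step)
	case l.Step < step:
		return false, nil
	}
	// Same height, round, and step: only safe if we kept what we signed.
	if l.SignBytes == nil {
		return false, errors.New("no SignBytes found")
	}
	if l.Signature == nil {
		// Assumption of this sketch: report an error instead of panicking.
		return false, errors.New("SignBytes present but Signature missing")
	}
	return true, nil
}

func main() {
	last := lastSigned{Height: 10, Round: 1, Step: 2, SignBytes: []byte("sb"), Signature: []byte("sig")}
	cases := []struct {
		h int64
		r int32
		s int8
	}{{10, 1, 2}, {10, 1, 3}, {11, 0, 1}, {10, 0, 3}, {9, 5, 3}}
	for _, c := range cases {
		reuse, err := last.checkHRS(c.h, c.r, c.s)
		fmt.Printf("H=%d R=%d S=%d reuse=%v err=%v\n", c.h, c.r, c.s, reuse, err)
	}
}
```

A request at a greater height is accepted even if its round and step are lower, because the comparison is lexicographic over (height, round, step).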