This PR is related to #3107 and a continuation of #3351
It is important to emphasise that in the privval original design, client/server and listening/dialing roles are inverted and do not follow a conventional interaction.
Given two hosts A and B:
Host A is listener/client
Host B is dialer/server (contains the secret key)
When A requires a signature, it needs to wait for B to dial in before it can issue a request.
A only accepts a single connection and any failure leads to dropping the connection and waiting for B to reconnect.
The original rationale behind this design was based on security.
Host B only allows outbound connections to a list of whitelisted hosts.
It is not possible to reach B unless B dials in. There are no listening/open ports in B.
This PR results in the following changes:
Refactors ping/heartbeat to avoid previously existing race conditions.
Separates transport (dialer/listener) from signing (client/server) concerns to simplify workflow.
Unifies and abstracts away the differences between unix and tcp sockets.
A single signer endpoint implementation unifies connection handling code (read/write/close/connection obj)
The signer request handler (server side) is customizable to increase testability.
Updates and extends unit tests
A high level overview of the classes is as follows:
Transport (endpoints): The following classes take care of establishing a connection
SignerDialerEndpoint
SignerListeningEndpoint
SignerEndpoint groups common functionality (read/write/timeouts/etc.)
Signing (client/server): The following classes take care of exchanging request/responses
SignerClient
SignerServer
This PR also closes#3601
Commits:
* refactoring - work in progress
* reworking unit tests
* Encapsulating and fixing unit tests
* Improve tests
* Clean up
* Fix/improve unit tests
* clean up tests
* Improving service endpoint
* fixing unit test
* fix linter issues
* avoid invalid cache values (improve later?)
* complete implementation
* wip
* improved connection loop
* Improve reconnections + fixing unit tests
* addressing comments
* small formatting changes
* clean up
* Update node/node.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client_test.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* check during initialization
* dropping connecting when writing fails
* removing break
* use t.log instead
* unifying and using cmn.GetFreePort()
* review fixes
* reordering and unifying drop connection
* closing instead of signalling
* refactored service loop
* removed superfluous brackets
* GetPubKey can return errors
* Revert "GetPubKey can return errors"
This reverts commit 68c06f19b4.
* adding entry to changelog
* Update CHANGELOG_PENDING.md
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_client.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_dialer_endpoint.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* Update privval/signer_listener_endpoint_test.go
Co-Authored-By: jleni <juan.leni@zondax.ch>
* updating node.go
* review fixes
* fixes linter
* fixing unit test
* small fixes in comments
* addressing review comments
* addressing review comments 2
* reverting suggestion
* Update privval/signer_client_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_client_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* Update privval/signer_listener_endpoint_test.go
Co-Authored-By: Anton Kaliaev <anton.kalyaev@gmail.com>
* do not expose brokenSignerDialerEndpoint
* clean up logging
* unifying methods
shorten test time
signer also drops
* reenabling pings
* improving testability + unit test
* fixing go fmt + unit test
* remove unused code
* Addressing review comments
* simplifying connection workflow
* fix linter/go import issue
* using base service quit
* updating comment
* Simplifying design + adjusting names
* fixing linter issues
* refactoring test harness + fixes
* Addressing review comments
* cleaning up
* adding additional error check
This issue is related to #3107
This is a first renaming/refactoring step before reworking and removing heartbeats.
As discussed with @Liamsi , we preferred to go for a couple of independent and separate PRs to simplify review work.
The changes:
Help to clarify the relation between the validator and remote signer endpoints
Differentiate between timeouts and deadlines
Prepare to encapsulate networking related code behind RemoteSigner in the next PR
My intention is to separate and encapsulate the "network related" code from the actual signer.
SignerRemote ---(uses/contains)--> SignerValidatorEndpoint <--(connects to)--> SignerServiceEndpoint ---> SignerService (future.. not here yet but would like to decouple too)
All reconnection/heartbeat/whatever code goes in the endpoints. Signer[Remote/Service] do not need to know about that.
I agree Endpoint may not be the perfect name. I tried to find something "Go-ish" enough. It is a common name in go-kit, kubernetes, etc.
Right now:
SignerValidatorEndpoint:
handles the listener
contains SignerRemote
Implements the PrivValidator interface
connects and sets a connection object in a contained SignerRemote
delegates PrivValidator some calls to SignerRemote which in turn uses the conn object that was set externally
SignerRemote:
Implements the PrivValidator interface
read/writes from a connection object directly
handles heartbeats
SignerServiceEndpoint:
Does most things in a single place
delegates to a PrivValidator IIRC.
* cleanup
* Refactoring step 1
* Refactoring step 2
* move messages to another file
* mark for future work / next steps
* mark deprecated classes in docs
* Fix linter problems
* additional linter fixes
* Close and recreate a RemoteSigner on err
* Update changelog
* Address Anton's comments / suggestions:
- update changelog
- restart TCPVal
- shut down on `ErrUnexpectedResponse`
* re-init remote signer client with fresh connection if Ping fails
- add/update TODOs in secret connection
- rename tcp.go -> tcp_client.go, same with ipc to clarify their purpose
* account for `conn returned by waitConnection can be `nil`
- also add TODO about RemoteSigner conn field
* Tests for retrying: IPC / TCP
- shorter info log on success
- set conn and use it in tests to close conn
* Tests for retrying: IPC / TCP
- shorter info log on success
- set conn and use it in tests to close conn
- add rwmutex for conn field in IPC
* comments and doc.go
* fix ipc tests. fixes#2677
* use constants for tests
* cleanup some error statements
* fixes#2784, race in tests
* remove print statement
* minor fixes from review
* update comment on sts spec
* cosmetics
* p2p/conn: add failing tests
* p2p/conn: make SecretConnection thread safe
* changelog
* IPCVal signer refactor
- use a .reset() method
- don't use embedded RemoteSignerClient
- guard RemoteSignerClient with mutex
- drop the .conn
- expose Close() on RemoteSignerClient
* apply IPCVal refactor to TCPVal
* remove mtx from RemoteSignerClient
* consolidate IPCVal and TCPVal, fixes#3104
- done in tcp_client.go
- now called SocketVal
- takes a listener in the constructor
- make tcpListener and unixListener contain all the differences
* delete ipc files
* introduce unix and tcp dialer for RemoteSigner
* rename files
- drop tcp_ prefix
- rename priv_validator.go to file.go
* bring back listener options
* fix node
* fix priv_val_server
* fix node test
* minor cleanup and comments