
added initial trust metric design doc and code

pull/787/head
caffix 7 years ago
parent
commit
e160a6198c
4 changed files with 511 additions and 0 deletions
  1. docs/architecture/adr-006-trust-metric.md (+117, -0)
  2. docs/architecture/img/formula1.png (BIN)
  3. docs/architecture/img/formula2.png (BIN)
  4. p2p/trust/trustmetric.go (+394, -0)

+ 117
- 0
docs/architecture/adr-006-trust-metric.md

@ -0,0 +1,117 @@
# Trust Metric Design
## Overview
The proposed trust metric will allow Tendermint to maintain local trust rankings for peers it has directly interacted with, which can then be used to implement soft security controls. The calculations were obtained from the [TrustGuard](https://dl.acm.org/citation.cfm?id=1060808) project.
## Background
The Tendermint Core project developers would like to improve Tendermint security and reliability by keeping track of the level of trustworthiness peers have demonstrated within the peer-to-peer network. This way, undesirable outcomes from peers will not immediately result in them being dropped from the network (potentially causing drastic changes to take place). Instead, peers' behavior can be monitored with appropriate metrics, and a peer can be removed from the network once Tendermint Core is certain it is a threat. For example, when the PEXReactor requests peers' network addresses from an already known peer, and the returned network addresses are unreachable, this untrustworthy behavior should be tracked. Returning a few bad network addresses probably shouldn’t cause a peer to be dropped, while an excessive amount of this behavior does qualify the peer for being dropped.
Trust metrics can be circumvented by malicious nodes through the use of strategic oscillation techniques, in which a malicious node adapts its behavior pattern in order to maximize its goals. For instance, if the malicious node learns that the time interval of the Tendermint trust metric is *X* hours, then it could wait *X* hours in-between malicious activities. We could try to combat this issue by increasing the interval length, yet this would make the system less adaptive to recent events.
Instead, having shorter intervals, but keeping a history of interval values, will give our metric the flexibility needed to keep the network stable, while also making it resilient against a strategic malicious node in the Tendermint peer-to-peer network. The metric can also access trust data over a rather long period of time without greatly increasing its history size: older history values are aggregated over a larger number of intervals, while recent intervals keep their full precision. This approach is referred to as fading memories, and closely resembles the way human beings remember their experiences. The trade-off of using history data is that the interval values should be preserved in-between executions of the node.
## Scope
The proposed trust metric will be implemented as a Go programming language object that will allow a developer to inform the object of all good and bad events relevant to the trust object instantiation, and at any time, the metric can be queried for the current trust ranking. Methods will be provided for storing trust metric history data that is required across instantiations.
## Detailed Design
This section will cover the process being considered for calculating the trust ranking and the interface for the trust metric.
### Proposed Process
The proposed trust metric will count good and bad events relevant to the object, and calculate the percentage of events that are good over an interval with a predefined duration. This procedure will continue for the life of the trust metric. When the trust metric is queried for the current **trust value**, a resilient equation will be utilized to perform the calculation.
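The per-interval calculation can be sketched in Go as follows. This is only an illustration: the function name `rawTrustValue` is hypothetical, and squaring the bad count (so that bad events weigh more) mirrors `proportionalValue` in `p2p/trust/trustmetric.go` from this commit rather than anything stated in this document.

```go
package main

import "fmt"

// rawTrustValue returns the share of good events observed in one interval.
// The bad count is squared so that bad events weigh more than good ones
// (as in proportionalValue in trustmetric.go); an interval with no events
// at all is treated as fully trusted.
func rawTrustValue(good, bad float64) float64 {
	if good == 0 && bad == 0 {
		return 1.0
	}
	return good / (good + bad*bad)
}

func main() {
	fmt.Println(rawTrustValue(0, 0)) // no events recorded
	fmt.Println(rawTrustValue(3, 1)) // mostly good events
	fmt.Println(rawTrustValue(4, 2)) // two bad events count as four
}
```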
The equation being proposed resembles a Proportional-Integral-Derivative (PID) controller used in control systems. The proportional component allows us to be sensitive to the value of the most recent interval, while the integral component allows us to incorporate trust values stored in the history data, and the derivative component allows us to give weight to sudden changes in the behavior of a peer. We compute the trust value of a peer in interval *i* based on its current trust ranking, its trust ranking history prior to interval *i* (over the past *maxH* number of intervals) and its trust ranking fluctuation. We will break up the equation into the three components.
```math
(1) Proportional Value = a * R[i]
```
where *R*[*i*] denotes the raw trust value at time interval *i* (where *i* == 0 is the current time) and *a* is the weight applied to the contribution of the current reports. The next component of our equation uses a weighted sum over the last *maxH* intervals to calculate the history value for time *i*:
`H[i] = ` ![formula1](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/img/formula1.png "Weighted Sum Formula")
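As a rough sketch of such a weighted sum, the Go function below normalizes the weights so that they add up to one, which is how `calcHistoryValue` in `p2p/trust/trustmetric.go` from this commit computes it. The names `historyValue` and `decay` are hypothetical; a decay of 0.8 corresponds to the weights that the code labels optimistic.

```go
package main

import (
	"fmt"
	"math"
)

// historyValue computes a normalized weighted sum over past interval values.
// values[0] is the most recent interval; the weight of the k-th value is
// decay^(k+1) divided by the sum of all weights.
func historyValue(values []float64, decay float64) float64 {
	if len(values) == 0 {
		return 0
	}
	weights := make([]float64, len(values))
	var sum float64
	for k := range values {
		weights[k] = math.Pow(decay, float64(k+1))
		sum += weights[k]
	}
	var hv float64
	for k, v := range values {
		hv += v * weights[k] / sum
	}
	return hv
}

func main() {
	// A peer that behaved well recently but poorly further in the past.
	fmt.Println(historyValue([]float64{1.0, 0.9, 0.2}, 0.8))
}
```

With a decay below one, recent intervals dominate the result, so an old bad interval lowers the history value less than a recent one would.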
The weights can be chosen either optimistically or pessimistically. With the history value available, we can now finish calculating the integral value:
```math
(2) Integral Value = b * H[i]
```
Where *H*[*i*] denotes the history value at time interval *i* and *b* is the weight applied to the contribution of past performance for the object being measured. The derivative component will be calculated as follows:
```math
D[i] = R[i] - H[i]
(3) Derivative Value = (c * D[i]) * D[i]
```
Where the value of *c* is selected based on the *D*[*i*] value relative to zero. With the three components brought together, our trust value equation is calculated as follows:
```math
TrustValue[i] = a * R[i] + b * H[i] + (c * D[i]) * D[i]
```
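The combination of the three components can be sketched in Go as below. Note the assumptions: the function name `trustValue` is hypothetical, and the derivative term follows `weightedDerivative` and `calcTrustValue` in `p2p/trust/trustmetric.go` from this commit, which add `c * D[i]` with *c* = 1 when *D*[*i*] is negative and *c* = 0 otherwise, and clamp the result at zero. The equation above writes the term as `(c * D[i]) * D[i]`, so treat this as a sketch of the code's behavior rather than a definitive reading of the formula.

```go
package main

import "fmt"

// trustValue combines the proportional, integral and derivative components.
// a and b are the proportional and integral weights, r is the raw value of
// the current interval and h is the history value.
func trustValue(a, b, r, h float64) float64 {
	d := r - h // derivative: how far the current interval deviates from history

	// Only a drop in behavior is penalized; an improvement adds nothing.
	c := 0.0
	if d < 0 {
		c = 1.0
	}

	tv := a*r + b*h + c*d
	if tv < 0 {
		tv = 0 // never report a negative trust value
	}
	return tv
}

func main() {
	// Default weights from trustmetric.go: 0.4 proportional, 0.6 integral.
	fmt.Println(trustValue(0.4, 0.6, 1.0, 1.0)) // consistently good peer
	fmt.Println(trustValue(0.4, 0.6, 0.5, 1.0)) // sudden drop in behavior
	fmt.Println(trustValue(0.4, 0.6, 0.0, 1.0)) // only bad events this interval
}
```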
As a performance optimization that will keep the amount of raw interval data being saved to a reasonable size of *m*, while allowing us to represent 2^*m* - 1 history intervals, we can employ the fading memories technique that will trade space and time complexity for the precision of the history data values by summarizing larger quantities of less recent values. While our equation above attempts to access up to *maxH* (which can be 2^*m* - 1), we will map those requests down to *m* values using equation 4 below:
```math
(4) j = index, where index > 0
```
Where *j* is one of *(0, 1, 2, … , m - 1)* indices used to access history interval data. Now we can access the raw intervals using the following calculations:
```math
R[0] = raw data for current time interval
```
`R[j] = ` ![formula2](https://github.com/tendermint/tendermint/blob/develop/docs/architecture/img/formula2.png "Fading Memories Formula")
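The fading memories bookkeeping can be sketched in Go as follows. Both helpers are modeled on `fadedMemoryValue` and `updateFadedMemory` in `p2p/trust/trustmetric.go` from this commit, not on the formula image; the names `historyIndex` and `fade` are hypothetical. Two assumptions are worth flagging: the index is computed with `bits.Len` as an integer stand-in for the floating-point `floor(log2(i))` used in that file, and each slot is blended with its newer neighbour after that neighbour has already been updated in the same pass, which is what the in-place `append` in that file appears to do.

```go
package main

import (
	"fmt"
	"math/bits"
)

// historyIndex maps interval i (0 = most recent) onto one of the m stored
// slots, so that slot j stands in for intervals 2^j .. 2^(j+1)-1.
func historyIndex(i int) int {
	if i <= 0 {
		return 0
	}
	return bits.Len(uint(i)) - 1 // floor(log2(i)) for i >= 1
}

// fade returns an updated copy of history. Slot 0 (the most recent value)
// is kept unchanged; slot j keeps (2^j - 1)/2^j of its old value and takes
// 1/2^j from slot j-1, so older slots change more slowly.
func fade(history []float64) []float64 {
	out := append([]float64(nil), history...)
	for j := 1; j < len(out); j++ {
		x := float64(uint64(1) << uint(j)) // 2^j
		out[j] = (out[j]*(x-1) + out[j-1]) / x
	}
	return out
}

func main() {
	for _, i := range []int{0, 1, 2, 3, 4, 7, 8} {
		fmt.Printf("interval %d -> slot %d\n", i, historyIndex(i))
	}
	fmt.Println(fade([]float64{1.0, 0.5, 0.0}))
}
```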
### Interface Detailed Design
This section will cover the Go programming language API designed for the previously proposed process. Below is the interface for a TrustMetric:
```go
package trust

type TrustMetric struct {
}

type TrustMetricConfig struct {
	ProportionalWeight float64
	IntegralWeight     float64
	HistoryMaxSize     int
	IntervalLen        time.Duration
}

func (tm *TrustMetric) Stop()
func (tm *TrustMetric) IncBad()
func (tm *TrustMetric) AddBad(num int)
func (tm *TrustMetric) IncGood()
func (tm *TrustMetric) AddGood(num int)

// get the dependable trust value
func (tm *TrustMetric) TrustValue() float64

func NewMetric() *TrustMetric
func NewMetricWithConfig(tmc *TrustMetricConfig) *TrustMetric

func GetPeerTrustMetric(key string) *TrustMetric
func PeerDisconnected(key string)
```
## References
M. Srivatsa, L. Xiong, and L. Liu, “TrustGuard: Countering Vulnerabilities in Reputation Management for Decentralized Overlay Networks,” in *Proceedings of the 14th International Conference on World Wide Web*, pp. 422-431, May 2005.

BIN
docs/architecture/img/formula1.png

Width: 294  |  Height: 131  |  Size: 9.6 KiB

BIN
docs/architecture/img/formula2.png

Width: 331  |  Height: 84  |  Size: 5.8 KiB

+ 394
- 0
p2p/trust/trustmetric.go

@ -0,0 +1,394 @@
package trust
import (
"encoding/json"
"io/ioutil"
"math"
"os"
"path/filepath"
"time"
)
var (
store *trustMetricStore
)
type peerMetricRequest struct {
Key string
Resp chan *TrustMetric
}
type trustMetricStore struct {
PeerMetrics map[string]*TrustMetric
Requests chan *peerMetricRequest
Disconn chan string
}
func init() {
store = &trustMetricStore{
PeerMetrics: make(map[string]*TrustMetric),
Requests: make(chan *peerMetricRequest, 10),
Disconn: make(chan string, 10),
}
go store.processRequests()
}
type peerHistory struct {
NumIntervals int `json:"intervals"`
History []float64 `json:"history"`
}
func loadSaveFromFile(key string, isLoad bool, data *peerHistory) *peerHistory {
tmhome, ok := os.LookupEnv("TMHOME")
if !ok {
return nil
}
filename := filepath.Join(tmhome, "trust_history.json")
peers := make(map[string]peerHistory, 0)
// read in previously written history data
content, err := ioutil.ReadFile(filename)
if err == nil {
err = json.Unmarshal(content, &peers)
}
var result *peerHistory
if isLoad {
if p, ok := peers[key]; ok {
result = &p
}
} else {
peers[key] = *data
b, err := json.Marshal(peers)
if err == nil {
err = ioutil.WriteFile(filename, b, 0644)
}
}
return result
}
func createLoadPeerMetric(key string) *TrustMetric {
tm := NewMetric()
if tm == nil {
return tm
}
// attempt to load the peer's trust history data
if ph := loadSaveFromFile(key, true, nil); ph != nil {
tm.historySize = len(ph.History)
if tm.historySize > 0 {
tm.numIntervals = ph.NumIntervals
tm.history = ph.History
tm.historyValue = tm.calcHistoryValue()
}
}
return tm
}
func (tms *trustMetricStore) processRequests() {
for {
select {
case req := <-tms.Requests:
tm, ok := tms.PeerMetrics[req.Key]
if !ok {
tm = createLoadPeerMetric(req.Key)
if tm != nil {
tms.PeerMetrics[req.Key] = tm
}
}
req.Resp <- tm
case key := <-tms.Disconn:
if tm, ok := tms.PeerMetrics[key]; ok {
ph := peerHistory{
NumIntervals: tm.numIntervals,
History: tm.history,
}
tm.Stop()
delete(tms.PeerMetrics, key)
loadSaveFromFile(key, false, &ph)
}
}
}
}
// request a TrustMetric by Peer Key
func GetPeerTrustMetric(key string) *TrustMetric {
resp := make(chan *TrustMetric, 1)
store.Requests <- &peerMetricRequest{Key: key, Resp: resp}
return <-resp
}
// the trust metric store should know when a Peer disconnects
func PeerDisconnected(key string) {
store.Disconn <- key
}
// keep track of Peer reliability
type TrustMetric struct {
proportionalWeight float64
integralWeight float64
numIntervals int
maxIntervals int
intervalLen time.Duration
history []float64
historySize int
historyMaxSize int
historyValue float64
bad, good float64
stop chan int
update chan *updateBadGood
trustValue chan *reqTrustValue
}
type TrustMetricConfig struct {
// be careful changing these weights
ProportionalWeight float64
IntegralWeight float64
// don't allow 2^HistoryMaxSize to be greater than int max value
HistoryMaxSize int
// each interval should be short for adaptability
// less than 30 seconds is too sensitive,
// and greater than 5 minutes will make the metric numb
IntervalLen time.Duration
}
func defaultConfig() *TrustMetricConfig {
return &TrustMetricConfig{
ProportionalWeight: 0.4,
IntegralWeight: 0.6,
HistoryMaxSize: 16,
IntervalLen: 1 * time.Minute,
}
}
type updateBadGood struct {
IsBad bool
Add int
}
type reqTrustValue struct {
Resp chan float64
}
// calculates the derivative component
func (tm *TrustMetric) derivativeValue() float64 {
return tm.proportionalValue() - tm.historyValue
}
// strengthens the derivative component
func (tm *TrustMetric) weightedDerivative() float64 {
var weight float64
d := tm.derivativeValue()
if d < 0 {
weight = 1.0
}
return weight * d
}
func (tm *TrustMetric) fadedMemoryValue(interval int) float64 {
if interval == 0 {
// base case
return tm.history[0]
}
index := int(math.Floor(math.Log(float64(interval)) / math.Log(2)))
// map the interval value down to an actual history index
return tm.history[index]
}
func (tm *TrustMetric) updateFadedMemory() {
if tm.historySize < 2 {
return
}
// keep the most recent history element (index 0) unchanged
faded := tm.history[:1]
for i := 1; i < tm.historySize; i++ {
x := math.Pow(2, float64(i))
ftv := ((tm.history[i] * (x - 1)) + tm.history[i-1]) / x
faded = append(faded, ftv)
}
tm.history = faded
}
// calculates the integral (history) component of the trust value
func (tm *TrustMetric) calcHistoryValue() float64 {
var wk []float64
// create the weights
hlen := tm.numIntervals
for i := 0; i < hlen; i++ {
x := math.Pow(.8, float64(i+1)) // optimistic wk
wk = append(wk, x)
}
var wsum float64
// calculate the sum of the weights
for _, v := range wk {
wsum += v
}
var hv float64
// calculate the history value
for i := 0; i < hlen; i++ {
weight := wk[i] / wsum
hv += tm.fadedMemoryValue(i) * weight
}
return hv
}
// calculates the current score for good experiences
func (tm *TrustMetric) proportionalValue() float64 {
value := 1.0
// bad events are worth more
total := tm.good + math.Pow(tm.bad, 2)
if tm.bad > 0 || tm.good > 0 {
value = tm.good / total
}
return value
}
func (tm *TrustMetric) calcTrustValue() float64 {
weightedP := tm.proportionalWeight * tm.proportionalValue()
weightedI := tm.integralWeight * tm.historyValue
weightedD := tm.weightedDerivative()
tv := weightedP + weightedI + weightedD
if tv < 0 {
tv = 0
}
return tv
}
func (tm *TrustMetric) processRequests() {
t := time.NewTicker(tm.intervalLen)
defer t.Stop()
loop:
for {
select {
case bg := <-tm.update:
if bg.IsBad {
tm.bad += float64(bg.Add)
} else {
tm.good += float64(bg.Add)
}
case rtv := <-tm.trustValue:
// send the calculated trust value back
rtv.Resp <- tm.calcTrustValue()
case <-t.C:
newHist := tm.calcTrustValue()
tm.history = append([]float64{newHist}, tm.history...)
if tm.historySize < tm.historyMaxSize {
tm.historySize++
} else {
tm.history = tm.history[:tm.historyMaxSize]
}
if tm.numIntervals < tm.maxIntervals {
tm.numIntervals++
}
tm.updateFadedMemory()
tm.historyValue = tm.calcHistoryValue()
tm.good = 0
tm.bad = 0
case <-tm.stop:
break loop
}
}
}
func (tm *TrustMetric) Stop() {
tm.stop <- 1
}
// indicate that an undesirable event took place
func (tm *TrustMetric) IncBad() {
tm.update <- &updateBadGood{IsBad: true, Add: 1}
}
// multiple undesirable events need to be acknowledged
func (tm *TrustMetric) AddBad(num int) {
tm.update <- &updateBadGood{IsBad: true, Add: num}
}
// positive events need to be recorded as well
func (tm *TrustMetric) IncGood() {
tm.update <- &updateBadGood{IsBad: false, Add: 1}
}
// multiple positive can be indicated in a single call
func (tm *TrustMetric) AddGood(num int) {
tm.update <- &updateBadGood{IsBad: false, Add: num}
}
// get the dependable trust value; a score that takes a long history into account
func (tm *TrustMetric) TrustValue() float64 {
resp := make(chan float64, 1)
tm.trustValue <- &reqTrustValue{Resp: resp}
return <-resp
}
func NewMetric() *TrustMetric {
return NewMetricWithConfig(defaultConfig())
}
func NewMetricWithConfig(tmc *TrustMetricConfig) *TrustMetric {
tm := new(TrustMetric)
dc := defaultConfig()
if tmc.ProportionalWeight != 0 {
tm.proportionalWeight = tmc.ProportionalWeight
} else {
tm.proportionalWeight = dc.ProportionalWeight
}
if tmc.IntegralWeight != 0 {
tm.integralWeight = tmc.IntegralWeight
} else {
tm.integralWeight = dc.IntegralWeight
}
if tmc.HistoryMaxSize != 0 {
tm.historyMaxSize = tmc.HistoryMaxSize
} else {
tm.historyMaxSize = dc.HistoryMaxSize
}
if tmc.IntervalLen != time.Duration(0) {
tm.intervalLen = tmc.IntervalLen
} else {
tm.intervalLen = dc.IntervalLen
}
// this gives our metric a tracking window of days
tm.maxIntervals = int(math.Pow(2, float64(tm.historyMaxSize)))
tm.historyValue = 1.0
tm.update = make(chan *updateBadGood, 10)
tm.trustValue = make(chan *reqTrustValue, 10)
tm.stop = make(chan int, 1)
go tm.processRequests()
return tm
}
