Median

Definition

The median algorithm returns the middle value from a sorted list of validator reports. It is most naturally used for numerical data but can be applied to any ordered set of data.

A possible variant is the weighted median - Let wiw^i be the stake of data validator ii, and assume that i=1Nwi=1\sum_{i=1}^Nw^i=1. Also, for simplicity, assume that rt1,rt2,...,rtNr_t^1,r_t^2,...,r_t^N are sorted. The weighted median is an element rtkr_t^k such that

i=1k1wi12 and i=k+1Nwi12\sum_{i=1}^{k-1} w^i \leq \frac{1}{2} \text{ and } \sum_{i=k+1}^{N} w^i \leq \frac{1}{2}

Rationale and Properties

  • Resistant to extreme values and outliers, as they only affect the tails of the sorted list—ignores the impact of anomalous data points due to its reliance on central values.

  • The aggregation value will always be between the honest reports, assuming a majority of the stake belongs to honest validators (weighted median)

Limitations

  • May not reflect the influence of large/small prices if they are outliers

  • Non-incremental - the aggregated value cannot be updated upon each report in an online manner (as opposed to mean, for example, which supports mn=n1nmn1+rnnm_n=\frac{n-1}{n}\cdot m_{n-1}+\frac{r_n}{n} where mim_i

    is the mean after ii reports and rir_i is the ii-th report).

Last updated