📊 Monitoring

Monitoring the health of your operator setup.

Monitoring, Metrics, Grafana dashboards

Quickstart

Here you'll find a quick start guide to run the Prometheus, Grafana, and Node exporter stack. Check out the README here for more details. If you want to manually set this up, follow the steps below.

Metrics

1. Check the data validator metrics

Replace <EO_PROMETHEUS_PORT> with the value of EO_PROMETHEUS_PORT from data-validator/.env.

curl http://localhost:<EO_PROMETHEUS_PORT>/metrics

You should see something like:

# HELP eigen_performance_score The performance metric is a score between 0 and 100 and each developer can define their own way of calculating the score. The score is calculated based on the performance of the Node and the performance of the backing services.
# TYPE eigen_performance_score gauge
eigen_performance_score{avs_name="EoracleDataValidator"} 100
...
# HELP eoracle_health_check EOracle Health Check
# TYPE eoracle_health_check gauge
eoracle_health_check{avs_name="EoracleDataValidator", name="service"} 1
eoracle_health_check{avs_name="EoracleDataValidator", name="polygon.io"} 1
...
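The health gauges in the output above can also be checked automatically. The following is a minimal shell sketch, not part of the eOracle tooling: the function name unhealthy_checks is hypothetical, and it assumes that a value of 1 means healthy, as in the sample output.

```shell
# Print every eoracle_health_check series whose value is not 1.
# Reads Prometheus text-format metrics on stdin and returns non-zero
# if at least one check is failing.
# Assumption: a value of 1 means "healthy", as in the sample output.
unhealthy_checks() {
  awk '
    index($0, "eoracle_health_check{") == 1 {
      # The sample value is the last whitespace-separated field.
      if ($NF + 0 != 1) { print; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

To use it, pipe the metrics endpoint into the function: curl -s "http://localhost:${EO_PROMETHEUS_PORT}/metrics" | unhealthy_checks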
2. Set up the monitoring stack

We use Prometheus to scrape the metrics from the eOracle data validator container. Edit the prometheus.yml file, located at Eoracle-operator-setup/data-validator/monitoring, and replace the placeholders PROMETHEUS_PORT, OPERATOR_ADDRESS, and mainnet|testnet with your specific values.

cd Eoracle-operator-setup/data-validator/monitoring

The relevant lines are:

  - job_name: 'eoracle-data-validator'
    static_configs:
      - targets: ['eoracle-data-validator:<PROMETHEUS_PORT>']
3. Set up sending data validator metrics to eOracle monitoring

Operators can push the data validator metrics to the eOracle monitoring system for extra monitoring. To do so, edit the vmagent.yml file, located at Eoracle-operator-setup/data-validator/monitoring, and replace the placeholders PROMETHEUS_PORT, OPERATOR_ADDRESS, and mainnet|testnet with your specific values.

cd Eoracle-operator-setup/data-validator/monitoring

The relevant lines are:

  - job_name: 'eoracle-data-validator'
    static_configs:
      - targets: ['eoracle-data-validator:<PROMETHEUS_PORT>']
    metric_relabel_configs:
    - target_label: operator_address
      replacement: <OPERATOR_ADDRESS>
    - target_label: eochain
      replacement: <mainnet|testnet>
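If you prefer not to edit the placeholders by hand, the substitution can be scripted. This is only a sketch: fill_placeholders is a hypothetical helper, and it assumes the placeholders appear in the file exactly as in the lines above, angle brackets included. It reads from stdin and writes to stdout, so the original file is left untouched.

```shell
# Fill the three placeholders in a config template read from stdin.
# Arguments: <port> <operator address> <mainnet or testnet>
# Assumption: the placeholders are spelled exactly as in the snippet above.
fill_placeholders() {
  local port="$1" address="$2" chain="$3"
  # '#' is the sed delimiter because one placeholder contains '|'.
  sed -e "s#<PROMETHEUS_PORT>#${port}#g" \
      -e "s#<OPERATOR_ADDRESS>#${address}#g" \
      -e "s#<mainnet|testnet>#${chain}#g"
}
```

For example: fill_placeholders "$EO_PROMETHEUS_PORT" "$OPERATOR_ADDRESS" testnet < vmagent.yml > vmagent.filled.yml, where OPERATOR_ADDRESS is a hypothetical variable holding your operator address. Review the result before replacing the original file.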
4. Start the monitoring stack

You can start the whole monitoring stack (Prometheus, Grafana, and Node Exporter) at once, or only specific components.

docker compose up -d
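To start only specific components, pass their service names to docker compose. The sketch below wraps the two commands involved; the function names are hypothetical, and the actual service names depend on the docker-compose.yml in the monitoring folder, so list them first rather than guessing.

```shell
# List the service names defined in the docker-compose.yml of the
# current directory.
list_services() {
  docker compose config --services
}

# Start only the named services in the background.
# With no arguments this starts the whole stack.
start_monitoring() {
  docker compose up -d "$@"
}
```

For example, if list_services prints prometheus and grafana among the service names, start_monitoring prometheus grafana starts just those two.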
5. Connect Docker networks

Since the eOracle data validator runs in a different Docker network, the Prometheus container must be connected to the same network as eoracle-data-validator. To do that, run the following command.

docker network connect eoracle-data-validator prometheus
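To confirm that the connection took effect, you can inspect the network and look for the Prometheus container. This is a sketch with hypothetical function names; it uses the network and container names from the command above.

```shell
# Print the names of the containers attached to a Docker network,
# one per line.
network_members() {
  docker network inspect --format '{{range .Containers}}{{println .Name}}{{end}}' "$1"
}

# Succeed only if container $2 is attached to network $1.
is_connected() {
  network_members "$1" | grep -qxF "$2"
}
```

For example: is_connected eoracle-data-validator prometheus && echo "prometheus is attached"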

Troubleshooting

  • If you see the following error:

    permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json": dial unix /var/run/docker.sock: connect: permission denied

    Rerun the same command with sudo prepended.

Grafana

We use Grafana to visualize the metrics from the eOracle AVS.

You can use OSS Grafana or any other dashboard provider.

You should be able to navigate to http://<ip>:3000 and log in with admin/admin. This Grafana container has a Prometheus data source set up on port 9090. If you change the Prometheus port, you need to add a new data source or update the existing one. You can do this by navigating to http://<ip>:3000/datasources
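Before logging in, you can check from the command line that Grafana is up by calling its /api/health endpoint. The sketch below uses a hypothetical function name, grafana_ok, and assumes Grafana listens on port 3000 as described above.

```shell
# Succeed only if Grafana at host $1 reports a healthy database.
# Assumption: Grafana listens on port 3000, as described above.
grafana_ok() {
  curl -fsS "http://$1:3000/api/health" | grep -q '"database": *"ok"'
}
```

For example: grafana_ok localhost && echo "Grafana is up"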

Useful Dashboards

The eOracle Data Validator dashboard can be used to monitor performance, issues, and data source statuses. Each panel is explained below.

Score

The Score panel shows the gauge metric eigen_performance_score, a value between 0 and 100 that is calculated from the performance of the AVS operator and the performance of the backing services.
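The same metric can be read without Grafana by querying the Prometheus HTTP API directly. This is a sketch: prom_query is a hypothetical helper, and it assumes Prometheus is reachable from the host at http://localhost:9090, the port mentioned in the Grafana section.

```shell
# Run an instant query against the Prometheus HTTP API and print the
# raw JSON response.
# $1: PromQL expression
# $2: Prometheus base URL (assumed default: http://localhost:9090)
prom_query() {
  local base_url="${2:-http://localhost:9090}"
  curl -fsS -G "${base_url}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query eigen_performance_score returns the current score, and prom_query eoracle_health_check returns the health gauges behind the Data Providers panel.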

RPC Req

The RPC Req panel shows the counter and histogram of the total number of JSON-RPC <method> requests from the execution client.

eOracle Errors & eOracle Errors Avg 5 min

The eOracle Errors panel shows the counter for the number of errors encountered by the execution client.

Update Rate Duration (s)

The Update Rate duration panel helps visualize the frequency of updates in seconds. For example, in the above chart 91% of the submitted transactions were processed within 0.1 seconds.

eOracle Chain Performance (s)

The eOracle chain panel helps visualize the time between blocks on the eOracle chain. As per the above dashboard, the majority of the blocks are produced within 0.005 seconds.

Data Providers All

The Data providers panel shows the connection status of each data source that the validator pings for price feeds. If one of the sources shows 'FAIL' instead of 'OK', it means the connection to that source is broken.

You can find the JSON file to import the above dashboard here. Once you have Grafana set up, feel free to import the dashboards.

Node exporter

The eOracle data validator emits eOracle-specific metrics. However, it's also important to keep track of the node's health. For this, we use Node Exporter, a Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors. By default, it is installed and started when you start the entire monitoring stack. If you want to modify the stack, you can install the binary or use Docker to run it.

In the Grafana dashboards screen, import the node-exporter dashboard to see host metrics.

VMAgent

vmagent is the Docker container that submits data validator metrics to eOracle central monitoring. This allows us to help you troubleshoot your operator.

If you don't want to share the metrics with us, remove vmagent from the docker-compose.yml in the data-validator/monitoring folder.
