Telemetry
Pessimism uses Prometheus for telemetry. The application spins up a metrics server on a specified port (default 7300) and exposes the /metrics endpoint.
Local Testing
To verify that metrics are being collected locally, query the metrics endpoint with curl localhost:7300/metrics. The response should list all custom and system metrics.
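For a scripted version of the same check, the response can be parsed instead of read by eye. The following is a minimal sketch, not part of Pessimism: the helper names parse_metrics and fetch_metrics are hypothetical, the URL assumes the default port 7300, and the parser only handles the common "name{labels} value" lines of the Prometheus text format.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text-format lines into {metric_name: [(labels, value), ...]}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            # Labelled sample: name{label="value",...} value [timestamp]
            name, rest = line.split("{", 1)
            labels, _, tail = rest.rpartition("}")
        else:
            # Unlabelled sample: name value [timestamp]
            name, _, tail = line.partition(" ")
            labels = ""
        value = float(tail.split()[0])
        samples.setdefault(name, []).append((labels, value))
    return samples


def fetch_metrics(url="http://localhost:7300/metrics"):
    """Fetch and parse the metrics endpoint (URL assumes the default port)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

With the application running locally, fetch_metrics().get("pessimism_up") should return a single unlabelled sample if the service reports itself as up.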
Server Configuration
The default configuration in config.env.template should be suitable in most cases. If you do not want to run the metrics server, set METRICS_ENABLED=0 and the metrics server will not be started. This is useful mainly for testing purposes.
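A local test script may want to skip the scrape when the server has been disabled this way. The sketch below is a hypothetical helper for such a script, not Pessimism's own configuration parsing. It assumes that only the value 0 disables the server and that an unset variable leaves it enabled.

```python
import os


def metrics_enabled(env=None):
    """Return False only when METRICS_ENABLED is explicitly set to "0"."""
    env = os.environ if env is None else env
    # Assumption: any value other than "0" (including unset) means enabled.
    return env.get("METRICS_ENABLED", "1").strip() != "0"
```

A test can then guard its scrape with a call to metrics_enabled() and skip the check when it returns False.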
Generating Documentation
To generate documentation for metrics, run make docs from the root of the repository. This generates markdown that can be pasted directly below to keep the current metric documentation up to date.
Current Metrics
METRIC | DESCRIPTION | LABELS | TYPE |
---|---|---|---|
pessimism_up | 1 if the service is up | | gauge |
pessimism_heuristics_active_heuristics | Number of active heuristics | heuristic,network,path | gauge |
pessimism_etl_active_paths | Number of active paths | path,network | gauge |
pessimism_heuristics_heuristic_runs_total | Number of times a specific heuristic has been run | network,heuristic | counter |
pessimism_alerts_generated_total | Number of total alerts generated for a given heuristic | network,heuristic,path,destination | counter |
pessimism_node_errors_total | Number of node errors caught | node | counter |
pessimism_block_latency | Millisecond latency of block processing | network | gauge |
pessimism_path_latency | Millisecond latency of path processing | PathID | gauge |
pessimism_heuristic_execution_time | Nanosecond time of heuristic execution | heuristic | gauge |
pessimism_heuristic_errors_total | Number of errors generated by heuristic executions | heuristic | counter |
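The table can also be used as a checklist against a live scrape. The following is a rough sketch under stated assumptions: the list copies the metric names from the table above, the helper names exposed_names and missing_metrics are hypothetical, and the input is the raw text returned by the /metrics endpoint. Labelled metrics typically appear in Prometheus output only after at least one label combination has been recorded, so a missing name on a freshly started instance is not necessarily an error.

```python
# Metric names copied from the table above.
DOCUMENTED_METRICS = [
    "pessimism_up",
    "pessimism_heuristics_active_heuristics",
    "pessimism_etl_active_paths",
    "pessimism_heuristics_heuristic_runs_total",
    "pessimism_alerts_generated_total",
    "pessimism_node_errors_total",
    "pessimism_block_latency",
    "pessimism_path_latency",
    "pessimism_heuristic_execution_time",
    "pessimism_heuristic_errors_total",
]


def exposed_names(text):
    """Collect the metric names present in Prometheus text-format output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines and "# HELP" / "# TYPE" comments.
        if line and not line.startswith("#"):
            # The name ends at the first "{" (labels) or space (value).
            names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def missing_metrics(text, expected=DOCUMENTED_METRICS):
    """Return documented metric names that do not appear in the scrape."""
    present = exposed_names(text)
    return [name for name in expected if name not in present]
```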