Version: 3.0.0

Deviser Telemetry

This document describes the OpenTelemetry metrics and assets for Deviser.

For general setup instructions, see the OpenTelemetry Configuration guide.

Telemetry Assets

Metrics Reference

Naming note

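The metrics below are listed under their OpenTelemetry names, which use dots as separators. In Prometheus, the dots are normalized to underscores (e.g., deviser.startup.success becomes deviser_startup_success), and counters typically include the _total suffix (e.g., deviser_startup_success_total).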
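The naming rule can be sketched in a few lines of Python. This is only an illustration of the convention described in the note, not the exporter's actual implementation; the function name `prometheus_name` is hypothetical, and real exporters may apply further rules (for example, unit suffixes) that are not modeled here.

```python
def prometheus_name(otel_name: str, metric_type: str) -> str:
    """Approximate the Prometheus name for an OpenTelemetry metric name.

    Hypothetical helper for illustration only: it replaces dots with
    underscores and appends _total to counters, as described in the note.
    """
    name = otel_name.replace(".", "_")
    # Counters typically gain a _total suffix when exported to Prometheus.
    if metric_type.lower() == "counter" and not name.endswith("_total"):
        name += "_total"
    return name


if __name__ == "__main__":
    print(prometheus_name("deviser.startup.success", "Counter"))
    print(prometheus_name("deviser.partitions.ensure.days", "Gauge"))
```

Histograms keep the normalized base name; the queries in the table read the associated `_bucket` series of that base name.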

Metrics Table

| Metric | Type | Unit | Description | PromQL example |
| --- | --- | --- | --- | --- |
| deviser.startup.success | Counter | count | Service startup completed successfully. | sum(increase(deviser_startup_success_total[1h])) |
| deviser.startup.failure | Counter | count | Service startup failed. | sum(increase(deviser_startup_failure_total[1h])) |
| deviser.db.connect.duration_seconds | Histogram | seconds | Time spent waiting for PostgreSQL readiness. | histogram_quantile(0.95, sum by (le) (rate(deviser_db_connect_duration_seconds_bucket[5m]))) |
| deviser.db.schema_apply.duration_seconds | Histogram | seconds | Time spent applying startup SQL schemas. | histogram_quantile(0.95, sum by (le) (rate(deviser_db_schema_apply_duration_seconds_bucket[5m]))) |
| deviser.db.schema_apply.failure | Counter | count | Failures during schema application. | sum(increase(deviser_db_schema_apply_failure_total[1h])) |
| deviser.partitions.ensure.duration_seconds | Histogram | seconds | Duration of daily partition creation routine. | histogram_quantile(0.95, sum by (le) (rate(deviser_partitions_ensure_duration_seconds_bucket[5m]))) |
| deviser.partitions.ensure.failure | Counter | count | Failures in partition ensure routine. | sum(increase(deviser_partitions_ensure_failure_total[1h])) |
| deviser.partitions.ensure.days | Gauge | days | Configured horizon for partition creation. | max(deviser_partitions_ensure_days) |
| deviser.partitions.maintain.duration_seconds | Histogram | seconds | Duration of partition maintenance routine. | histogram_quantile(0.95, sum by (le) (rate(deviser_partitions_maintain_duration_seconds_bucket[5m]))) |
| deviser.partitions.maintain.failure | Counter | count | Failures in partition maintenance. | sum(increase(deviser_partitions_maintain_failure_total[1h])) |
| deviser.partitions.dropped.expired | Counter | count | Partitions dropped due to retention policy. | sum(increase(deviser_partitions_dropped_expired_total[1h])) |
| deviser.partitions.dropped.quota | Counter | count | Partitions dropped due to quota enforcement. | sum(increase(deviser_partitions_dropped_quota_total[1h])) |
| deviser.partitions.bytes_freed | Counter | bytes | Bytes reclaimed by dropping partitions. | sum(increase(deviser_partitions_bytes_freed_total[1h])) |
| deviser.partitions.total_discovered | Gauge | count | Number of partitions discovered during maintenance. | max(deviser_partitions_total_discovered) |
| deviser.partitions.total_bytes | Gauge | bytes | Total size in bytes of discovered partitions. | max(deviser_partitions_total_bytes) |
| deviser.resources.purge.duration_seconds | Histogram | seconds | Duration of soft-delete purge routine. | histogram_quantile(0.95, sum by (le) (rate(deviser_resources_purge_duration_seconds_bucket[5m]))) |
| deviser.resources.purge.rows | Counter | rows | Number of rows hard-deleted from krateo_resources. | sum(increase(deviser_resources_purge_rows_total[1h])) |
| deviser.resources.purge.failure | Counter | count | Failures in soft-delete purge routine. | sum(increase(deviser_resources_purge_failure_total[1h])) |
| deviser.loop.iteration.success | Counter | count | Main scheduled loop iteration with no errors. | sum(rate(deviser_loop_iteration_success_total[5m])) |
| deviser.loop.iteration.failure | Counter | count | Main scheduled loop iteration with at least one error. | sum(rate(deviser_loop_iteration_failure_total[5m])) |
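
The histogram examples in the table all use histogram_quantile(0.95, ...) over the `_bucket` series. As a rough aid to reading those results, the sketch below approximates how a quantile is estimated from cumulative buckets: find the bucket containing the target rank and interpolate linearly inside it. It is a simplified illustration, not Prometheus's implementation, and the bucket boundaries and counts in the usage example are made up rather than taken from Deviser.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from cumulative histogram buckets.

    buckets: iterable of (upper_bound, cumulative_count) pairs, including
    a final (math.inf, total_count) bucket. Simplified sketch only.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls in the +Inf bucket: report the
                # highest finite upper bound instead.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return prev_bound
            # Linear interpolation within the matching bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


if __name__ == "__main__":
    # Made-up cumulative counts for upper bounds in seconds.
    sample = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
    print(estimate_quantile(0.5, sample))
    print(estimate_quantile(0.95, sample))
```

Because the estimate is interpolated, its precision depends on how finely the bucket boundaries cover the observed durations; a p95 that lands in the +Inf bucket is reported as the largest finite boundary.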