HealthService
HealthService provides webhook health metrics derived from delivery outcomes. Health is computed per webhook from its success rate and its current streak of consecutive failures: HEALTHY (>90% success rate and <3 consecutive failures), DEGRADED (50-90% success rate or 3-9 consecutive failures), UNHEALTHY (<50% success rate or 10+ consecutive failures).
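The thresholds above can be read as a small decision function. The following is a minimal sketch, not the service's actual implementation. It assumes that the worse status wins when the two signals disagree (for example, a 95% success rate with 10 consecutive failures is treated as UNHEALTHY), that a success rate of exactly 90% counts as DEGRADED, and that a webhook with no deliveries maps to HEALTH_UNSPECIFIED. The names `WebhookHealth` (as a Python enum) and `classify_health` are hypothetical.

```python
from enum import Enum


class WebhookHealth(Enum):
    # Status names as used in the prose; the wire format may differ.
    HEALTH_UNSPECIFIED = "HEALTH_UNSPECIFIED"
    HEALTHY = "HEALTHY"
    DEGRADED = "DEGRADED"
    UNHEALTHY = "UNHEALTHY"


def classify_health(successful: int, total: int, consecutive_failures: int) -> WebhookHealth:
    """Apply the documented thresholds (hypothetical helper).

    Assumption: checks run from worst to best, so either an UNHEALTHY
    signal or a DEGRADED signal overrides a healthy reading of the other.
    """
    if total == 0:
        # No delivery history, so no health can be computed.
        return WebhookHealth.HEALTH_UNSPECIFIED

    success_rate = successful / total

    if success_rate < 0.5 or consecutive_failures >= 10:
        return WebhookHealth.UNHEALTHY
    # Assumption: exactly 90% falls in the 50-90% DEGRADED band.
    if success_rate <= 0.9 or consecutive_failures >= 3:
        return WebhookHealth.DEGRADED
    return WebhookHealth.HEALTHY
```

For instance, `classify_health(1480, 1520, 0)` falls in the HEALTHY band, while the same counts with a streak of 3 failures would be classified as DEGRADED under these assumptions.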
GetWebhookHealth
/webhook.HealthService/GetWebhookHealth
GetWebhookHealth returns health status and detailed metrics for a single webhook: total/successful/failed deliveries, consecutive failures, success rate, avg response time, and error category breakdown (client/server/timeout/network errors). Returns HEALTH_UNSPECIFIED with no metrics if the webhook has no delivery history.
Request
webhook_id (string): UUID of the webhook. Required.
namespace (string): Namespace the webhook belongs to. Required.
Response
webhook_id (string): UUID of the webhook.
health (WebhookHealth): Computed health status based on success_rate and consecutive_failures. Returns HEALTH_UNSPECIFIED if the webhook has no delivery history.
metrics.webhook_id (string): UUID of the webhook these metrics belong to.
metrics.total_deliveries (int32): Total number of deliveries ever made to this webhook (all time).
metrics.successful_deliveries (int32): Number of deliveries that succeeded (terminal SUCCESS status).
metrics.failed_deliveries (int32): Number of deliveries that failed permanently (terminal FAILED status).
metrics.consecutive_failures (int32): Current streak of consecutive failed deliveries. Resets to 0 on any success. Used for health status computation: 3-9 = DEGRADED, 10+ = UNHEALTHY.
metrics.last_success_at (Timestamp): Timestamp of the most recent successful delivery. Null if the webhook has never succeeded.
metrics.last_failure_at (Timestamp): Timestamp of the most recent failed delivery. Null if the webhook has never failed.
metrics.success_rate (double): Success rate as a decimal between 0.0 and 1.0. Computed as successful_deliveries / total_deliveries. Used for health thresholds: >0.9 = HEALTHY, 0.5-0.9 = DEGRADED, <0.5 = UNHEALTHY.
metrics.avg_response_time (int32): Average response time in milliseconds across all delivery attempts.
metrics.created_at (Timestamp): When the health metrics record was first created.
metrics.updated_at (Timestamp): When the health metrics were last updated.
metrics.client_errors (int32): Count of client errors (HTTP 4xx) in the last 24 hours. Client errors are never retried and typically indicate a misconfigured endpoint (wrong URL, missing auth, payload format mismatch).
metrics.server_errors (int32): Count of server errors (HTTP 5xx) in the last 24 hours. Server errors are retried according to the webhook's retry configuration.
metrics.timeout_errors (int32): Count of timeout errors in the last 24 hours. Timeouts are retried. May indicate the endpoint is slow or overloaded.
metrics.network_errors (int32): Count of network-level errors in the last 24 hours. Includes DNS failures, TLS errors, connection refused, and other transport errors.
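When triaging a single webhook, the error category counts are often the quickest signal. The sketch below shows one way a caller might pick the dominant category from a decoded `metrics` object and re-derive the success rate from the delivery counts. The helper names `dominant_error` and `recomputed_success_rate` are hypothetical, the hint strings only paraphrase the field descriptions above, and missing keys are assumed to mean zero.

```python
from typing import Optional, Tuple

# Hints paraphrased from the field descriptions; not an official mapping.
ERROR_HINTS = {
    "client_errors": "HTTP 4xx, never retried; check URL, auth, and payload format",
    "server_errors": "HTTP 5xx, retried per the webhook's retry configuration",
    "timeout_errors": "retried; the endpoint may be slow or overloaded",
    "network_errors": "DNS, TLS, connection refused, or other transport errors",
}


def dominant_error(metrics: dict) -> Optional[Tuple[str, int, str]]:
    """Return (category, count, hint) for the largest 24-hour error count.

    Returns None when every category is zero. Ties resolve to the first
    category in ERROR_HINTS order. Assumption: absent keys count as 0.
    """
    counts = {name: int(metrics.get(name, 0)) for name in ERROR_HINTS}
    name = max(counts, key=counts.get)
    if counts[name] == 0:
        return None
    return name, counts[name], ERROR_HINTS[name]


def recomputed_success_rate(metrics: dict) -> Optional[float]:
    """Recompute successful_deliveries / total_deliveries from the counts.

    Returns None when there are no deliveries, since the ratio is undefined.
    """
    total = int(metrics.get("total_deliveries", 0))
    if total == 0:
        return None
    return int(metrics.get("successful_deliveries", 0)) / total
```

Note that the error category counts cover the last 24 hours while the delivery totals are all-time, so the two sets of numbers are not expected to add up to each other.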
curl -X POST http://localhost:8080/webhook.HealthService/GetWebhookHealth \
  -H "Content-Type: application/json" \
  -d '{
    "webhook_id": "550e8400-e29b-41d4-a716-446655440000",
    "namespace": "production"
  }'

{
  "health": "HEALTHY",
  "metrics": {
    "total_deliveries": 1520,
    "successful_deliveries": 1480,
    "failed_deliveries": 40,
    "consecutive_failures": 0,
    "success_rate": 0.974,
    "avg_response_time": 245,
    "client_errors": 2,
    "server_errors": 5,
    "timeout_errors": 1,
    "network_errors": 0
  }
}

ListWebhooksByHealth
/webhook.HealthService/ListWebhooksByHealth
ListWebhooksByHealth returns all webhooks matching a given health status. Useful for finding degraded or unhealthy endpoints. Paginated.
Request
health (WebhookHealth): Health status to filter by. Required. Use HEALTH_UNHEALTHY to find problematic endpoints, HEALTH_DEGRADED for early warnings.
pagination (PaginationRequest): Pagination parameters. Default: limit=50, offset=0.
Response
webhooks (RegisteredWebhook[]): Webhooks with the requested health status.
pagination (PaginationResponse): Pagination metadata.
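Because results are paginated, collecting every matching webhook takes a loop over offsets. The following is a rough sketch of such a loop, under several assumptions: that `limit` and `offset` are sent as keys of a `pagination` object in the JSON body, and that a page shorter than `limit` marks the end of the results. The fields of PaginationResponse are not described here, so the sketch does not rely on them. `post_json` and `iter_webhooks_by_health` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Callable, Iterator


def post_json(url: str, body: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_webhooks_by_health(
    call: Callable[[dict], dict], health: str, limit: int = 50
) -> Iterator[dict]:
    """Yield webhooks page by page until a short page is returned.

    `call` takes the request body and returns the decoded response.
    Assumption: the request nests limit/offset under "pagination".
    """
    offset = 0
    while True:
        response = call(
            {"health": health, "pagination": {"limit": limit, "offset": offset}}
        )
        page = response.get("webhooks", [])
        yield from page
        if len(page) < limit:
            return  # assumed end-of-results signal
        offset += limit
```

A caller could wire the two together with something like `iter_webhooks_by_health(lambda body: post_json("http://localhost:8080/webhook.HealthService/ListWebhooksByHealth", body), "HEALTH_UNHEALTHY")`. Passing the transport in as `call` keeps the paging logic testable without a running server.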
curl -X POST http://localhost:8080/webhook.HealthService/ListWebhooksByHealth \
  -H "Content-Type: application/json" \
  -d '{
    "health": "HEALTH_UNHEALTHY"
  }'

GetHealthSummary
/webhook.HealthService/GetHealthSummary
GetHealthSummary returns aggregate counts of webhooks by health status (healthy, degraded, unhealthy, unknown) across all namespaces.
Response
summary.healthy_count (int32): Number of webhooks in HEALTHY state (>90% success rate, <3 consecutive failures).
summary.degraded_count (int32): Number of webhooks in DEGRADED state (50-90% success rate or 3-9 consecutive failures).
summary.unhealthy_count (int32): Number of webhooks in UNHEALTHY state (<50% success rate or 10+ consecutive failures).
summary.unknown_count (int32): Number of webhooks with no delivery history (HEALTH_UNSPECIFIED).
summary.total_count (int32): Total number of webhooks (sum of all categories above).
curl -X POST http://localhost:8080/webhook.HealthService/GetHealthSummary \
  -H "Content-Type: application/json" \
  -d '{}'

{
  "summary": {
    "healthy_count": 45,
    "degraded_count": 3,
    "unhealthy_count": 1,
    "unknown_count": 2,
    "total_count": 51
  }
}
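
A dashboard or alerting job might reduce the summary to a single ratio. The sketch below is one possible approach, not part of the API: it checks that `total_count` equals the sum of the four category counts, as the field description states, and then computes the share of webhooks that are degraded or unhealthy. The function name `attention_share` is hypothetical.

```python
def attention_share(summary: dict) -> float:
    """Fraction of all webhooks that are DEGRADED or UNHEALTHY.

    Raises ValueError if total_count does not equal the sum of the four
    category counts. Returns 0.0 when there are no webhooks at all.
    """
    categories = ("healthy_count", "degraded_count", "unhealthy_count", "unknown_count")
    total = sum(int(summary.get(name, 0)) for name in categories)
    reported = int(summary.get("total_count", 0))
    if total != reported:
        raise ValueError(f"total_count is {reported}, categories sum to {total}")
    if total == 0:
        return 0.0
    needs_attention = int(summary.get("degraded_count", 0)) + int(
        summary.get("unhealthy_count", 0)
    )
    return needs_attention / total
```

With the sample response above, the categories sum to 45 + 3 + 1 + 2 = 51, which matches `total_count`, and the share is 4 / 51.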