
HealthService

HealthService provides webhook health metrics derived from delivery outcomes. Health is computed per webhook from its success rate and its count of consecutive failures: HEALTHY (success rate above 90% and fewer than 3 consecutive failures), DEGRADED (success rate of 50-90% or 3-9 consecutive failures), UNHEALTHY (success rate below 50% or 10 or more consecutive failures).
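The thresholds above can be sketched as a small function. This is an illustration rather than the service's actual implementation: the function name `classify_health` is hypothetical, and it assumes that when the two signals disagree the worse status wins, and that a success rate of exactly 90% counts as DEGRADED (the "50-90%" band).

```python
def classify_health(total: int, successful: int, consecutive_failures: int) -> str:
    """Classify a webhook from its delivery counters (hypothetical helper)."""
    if total == 0:
        # No delivery history: the API reports HEALTH_UNSPECIFIED.
        return "HEALTH_UNSPECIFIED"

    success_rate = successful / total

    # Assumption: the worse signal wins, so UNHEALTHY is checked first.
    if success_rate < 0.5 or consecutive_failures >= 10:
        return "UNHEALTHY"
    # Assumption: exactly 0.9 falls in the DEGRADED band.
    if success_rate <= 0.9 or consecutive_failures >= 3:
        return "DEGRADED"
    return "HEALTHY"


if __name__ == "__main__":
    print(classify_health(total=1520, successful=1480, consecutive_failures=0))
    print(classify_health(total=100, successful=95, consecutive_failures=4))
```

Under these assumptions a webhook with a 95% success rate but a streak of 4 failures is reported as DEGRADED, because either condition alone is enough to leave the HEALTHY state.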

GetWebhookHealth

POST /webhook.HealthService/GetWebhookHealth

GetWebhookHealth returns the health status and detailed metrics for a single webhook: total, successful, and failed deliveries; consecutive failures; success rate; average response time; and a breakdown of errors by category (client, server, timeout, network). Returns HEALTH_UNSPECIFIED with no metrics if the webhook has no delivery history.

Request

webhook_id string

UUID of the webhook. Required.

namespace string

Namespace the webhook belongs to. Required.

Response

webhook_id string

UUID of the webhook.

health WebhookHealth

Computed health status based on success_rate and consecutive_failures. Returns HEALTH_UNSPECIFIED if the webhook has no delivery history.

metrics.webhook_id string

UUID of the webhook these metrics belong to.

metrics.total_deliveries int32

Total number of deliveries ever made to this webhook (all time).

metrics.successful_deliveries int32

Number of deliveries that succeeded (terminal SUCCESS status).

metrics.failed_deliveries int32

Number of deliveries that failed permanently (terminal FAILED status).

metrics.consecutive_failures int32

Current streak of consecutive failed deliveries. Resets to 0 on any success. Used for health status computation: 3-9 = DEGRADED, 10+ = UNHEALTHY.

metrics.last_success_at Timestamp

Timestamp of the most recent successful delivery. Null if the webhook has never succeeded.

metrics.last_failure_at Timestamp

Timestamp of the most recent failed delivery. Null if the webhook has never failed.

metrics.success_rate double

Success rate as a decimal between 0.0 and 1.0. Computed as successful_deliveries / total_deliveries. Used for health thresholds: >0.9 = HEALTHY, 0.5-0.9 = DEGRADED, <0.5 = UNHEALTHY.

metrics.avg_response_time int32

Average response time in milliseconds across all delivery attempts.

metrics.created_at Timestamp

When the health metrics record was first created.

metrics.updated_at Timestamp

When the health metrics were last updated.

metrics.client_errors int32

Count of client errors (HTTP 4xx) in the last 24 hours. Client errors are never retried; they typically indicate a misconfigured endpoint (wrong URL, missing auth, payload format mismatch).

metrics.server_errors int32

Count of server errors (HTTP 5xx) in the last 24 hours. Server errors are retried according to the webhook's retry configuration.

metrics.timeout_errors int32

Count of timeout errors in the last 24 hours. Timeouts are retried and may indicate that the endpoint is slow or overloaded.

metrics.network_errors int32

Count of network-level errors in the last 24 hours. Includes DNS failures, TLS errors, connection refused, and other transport errors.
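The four error counters partition failed attempts by cause. As a rough sketch of that partition, the hypothetical helper below maps one delivery attempt to a counter name. The inputs (`status`, `timed_out`) and the rule that a missing HTTP status means a transport failure are assumptions made for illustration; the retry table records only what the field descriptions state, so network errors are deliberately left out of it.

```python
from typing import Optional

# Retry behaviour as stated in the field descriptions.
# Network errors are omitted because the reference does not say whether they are retried.
RETRIED = {
    "client_errors": False,   # HTTP 4xx: never retried
    "server_errors": True,    # HTTP 5xx: retried per the webhook's retry configuration
    "timeout_errors": True,   # timeouts: retried
}


def categorize_attempt(status: Optional[int], timed_out: bool = False) -> Optional[str]:
    """Return the metrics counter an attempt would fall under, or None if it is not an error.

    Hypothetical helper: `status` is the HTTP status code, or None when no
    HTTP response was received at all.
    """
    if timed_out:
        return "timeout_errors"
    if status is None:
        # Assumption: no response means DNS, TLS, connection refused, or similar.
        return "network_errors"
    if 400 <= status <= 499:
        return "client_errors"
    if 500 <= status <= 599:
        return "server_errors"
    return None


if __name__ == "__main__":
    for outcome in (404, 503, None, 200):
        print(outcome, categorize_attempt(outcome))
```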

Request
curl -X POST http://localhost:8080/webhook.HealthService/GetWebhookHealth \
  -H "Content-Type: application/json" \
  -d '{
  "webhook_id": "550e8400-e29b-41d4-a716-446655440000",
  "namespace": "production"
}'
Response
{
  "health": "HEALTHY",
  "metrics": {
    "total_deliveries": 1520,
    "successful_deliveries": 1480,
    "failed_deliveries": 40,
    "consecutive_failures": 0,
    "success_rate": 0.974,
    "avg_response_time": 245,
    "client_errors": 2,
    "server_errors": 5,
    "timeout_errors": 1,
    "network_errors": 0
  }
}
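The same call can be made from Python using only the standard library. This is a minimal sketch: the base URL, path, header, and request fields come from the curl example above, while the function names and the `has_history` check (which assumes a missing or HEALTH_UNSPECIFIED `health` value means no delivery history) are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # taken from the curl example


def build_health_request(webhook_id: str, namespace: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for GetWebhookHealth without sending it."""
    body = json.dumps({"webhook_id": webhook_id, "namespace": namespace}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/webhook.HealthService/GetWebhookHealth",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_webhook_health(webhook_id: str, namespace: str,
                       base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_health_request(webhook_id, namespace, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def has_history(response: dict) -> bool:
    """Assumption: an absent or HEALTH_UNSPECIFIED status means no deliveries yet."""
    return response.get("health", "HEALTH_UNSPECIFIED") != "HEALTH_UNSPECIFIED"
```

Calling `get_webhook_health("550e8400-e29b-41d4-a716-446655440000", "production")` against a running server should return a dictionary shaped like the response above; check `has_history` before reading `metrics`, since no metrics are returned for a webhook without deliveries.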

ListWebhooksByHealth

POST /webhook.HealthService/ListWebhooksByHealth

ListWebhooksByHealth returns all webhooks matching a given health status. Useful for finding degraded or unhealthy endpoints. Results are paginated.

Request

health WebhookHealth

Health status to filter by. Required. Use HEALTH_UNHEALTHY to find problematic endpoints, HEALTH_DEGRADED for early warnings.

pagination PaginationRequest

Pagination parameters. Default: limit=50, offset=0.

Response

webhooks RegisteredWebhook[]

Webhooks with the requested health status.

pagination PaginationResponse

Pagination metadata.

Request
curl -X POST http://localhost:8080/webhook.HealthService/ListWebhooksByHealth \
  -H "Content-Type: application/json" \
  -d '{
  "health": "HEALTH_UNHEALTHY"
}'
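Because results are paginated, collecting every matching webhook means walking the pages. The sketch below shows one way to do that with limit/offset paging. It assumes the request body nests `limit` and `offset` under a `pagination` object and that a page shorter than `limit` is the last one; the fields of PaginationResponse are not used because they are not listed here. The `fetch_page` callable is a hypothetical stand-in for whatever function POSTs the body to ListWebhooksByHealth and returns the decoded JSON.

```python
from typing import Callable, Iterator


def iter_webhooks_by_health(fetch_page: Callable[[dict], dict],
                            health: str, limit: int = 50) -> Iterator[dict]:
    """Yield every webhook with the given health status, page by page.

    `fetch_page` takes the JSON request body as a dict and returns the
    decoded JSON response as a dict (hypothetical transport function).
    """
    offset = 0
    while True:
        # Assumption: pagination parameters are sent as {"limit": ..., "offset": ...}.
        body = {"health": health, "pagination": {"limit": limit, "offset": offset}}
        page = fetch_page(body)
        webhooks = page.get("webhooks", [])
        yield from webhooks
        # Assumption: a short (or empty) page means there is nothing left.
        if len(webhooks) < limit:
            return
        offset += limit
```

Passing the transport in as a callable keeps the paging logic independent of the HTTP client, so it can be exercised with a stub before pointing it at a real server.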

GetHealthSummary

POST /webhook.HealthService/GetHealthSummary

GetHealthSummary returns aggregate counts of webhooks by health status (healthy, degraded, unhealthy, unknown) across all namespaces.

Response

summary.healthy_count int32

Number of webhooks in HEALTHY state (>90% success rate, <3 consecutive failures).

summary.degraded_count int32

Number of webhooks in DEGRADED state (50-90% success rate or 3-9 consecutive failures).

summary.unhealthy_count int32

Number of webhooks in UNHEALTHY state (<50% success rate or 10+ consecutive failures).

summary.unknown_count int32

Number of webhooks with no delivery history (HEALTH_UNSPECIFIED).

summary.total_count int32

Total number of webhooks (sum of all categories above).

Request
curl -X POST http://localhost:8080/webhook.HealthService/GetHealthSummary \
  -H "Content-Type: application/json" \
  -d '{}'
Response
{
  "summary": {
    "healthy_count": 45,
    "degraded_count": 3,
    "unhealthy_count": 1,
    "unknown_count": 2,
    "total_count": 51
  }
}
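Since total_count is documented as the sum of the four category counts, a consumer can sanity-check a summary and derive simple ratios from it. The helpers below are a hypothetical sketch of that arithmetic; the names and the notion of an "attention ratio" (degraded plus unhealthy, over the total) are illustrative and not part of the API.

```python
CATEGORY_FIELDS = ("healthy_count", "degraded_count", "unhealthy_count", "unknown_count")


def summary_is_consistent(summary: dict) -> bool:
    """Check that total_count equals the sum of the four category counts."""
    return summary["total_count"] == sum(summary[field] for field in CATEGORY_FIELDS)


def attention_ratio(summary: dict) -> float:
    """Fraction of webhooks that are DEGRADED or UNHEALTHY (0.0 if there are none at all)."""
    total = summary["total_count"]
    if total == 0:
        return 0.0
    return (summary["degraded_count"] + summary["unhealthy_count"]) / total


if __name__ == "__main__":
    example = {"healthy_count": 45, "degraded_count": 3, "unhealthy_count": 1,
               "unknown_count": 2, "total_count": 51}
    print(summary_is_consistent(example), attention_ratio(example))
```

For the example response, the categories add up as 45 + 3 + 1 + 2 = 51, and 4 of the 51 webhooks are degraded or unhealthy.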