SLA / SLO Targets

MIT v0.1.0 published


Service Level Objectives per function and component. Targets are based on the current dev/staging environment; production SLAs will be set at the first tenant onboarding (BSW).

Function / Component	P95 Latency	P99 Latency	Availability	Throughput
`memory-retrieve`	`< 500ms`	`< 2s`	99.5%	—
`federation-gateway`	`< 1s`	`< 5s`	99%	—
`memory-persist` (SQS)	N/A (async)	N/A	99.9% (SQS managed)	≥ 10 IO/s
`vanna-query`	`< 3s`	`< 10s`	99%	—
`agent-memory`	`< 200ms`	`< 500ms`	99.5%	—
Cognificatie (end-to-end)	`< 30s per IO`	N/A (batch)	≥ 2 IO/min
IV PostgreSQL	`< 50ms`	`< 200ms`	99.9% (managed)	—
Redis Cache	`< 5ms`	`< 20ms`	99.9% (managed)	—
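
As a rough illustration of how the latency targets in this table could be checked against measurements, the sketch below computes nearest-rank P95/P99 values from a list of latency samples and compares them with two rows of the table. The `SLO_TARGETS` dictionary, the function names, and the nearest-rank method are assumptions made for this example; the actual dashboards may compute percentiles differently.

```python
import math

# Latency targets in seconds, copied from two rows of the SLO table.
# The dictionary layout is an assumption for this sketch.
SLO_TARGETS = {
    "memory-retrieve": {"p95": 0.5, "p99": 2.0},
    "agent-memory": {"p95": 0.2, "p99": 0.5},
}


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def check_latency_slo(component, samples):
    """Return a dict with True where the observed percentile is below target."""
    targets = SLO_TARGETS[component]
    return {
        "p95": percentile(samples, 95) < targets["p95"],
        "p99": percentile(samples, 99) < targets["p99"],
    }
```

A call such as `check_latency_slo("agent-memory", samples)` returns one boolean per percentile, so a slow tail can fail the P99 target even when the P95 target is met.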

Error Budget

99.5% Availability

= max 3.6 hours of downtime per month (30-day month)

= max **43.8 hours per year**

Acceptable for the dev/staging phase. Functions are stateless and restart automatically (cold start <5s).
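
The downtime figures above follow directly from the availability target. A minimal sketch of the arithmetic, assuming a 30-day month (720 hours) and a 365-day year (8760 hours); the function and constant names are illustrative only:

```python
HOURS_PER_MONTH = 30 * 24   # 720, assumes a 30-day month
HOURS_PER_YEAR = 365 * 24   # 8760, assumes a non-leap year


def downtime_budget_hours(availability_pct, period_hours):
    """Maximum downtime in hours allowed within a period
    at the given availability target (in percent)."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (100 - availability_pct) / 100 * period_hours


monthly = round(downtime_budget_hours(99.5, HOURS_PER_MONTH), 2)
yearly = round(downtime_budget_hours(99.5, HOURS_PER_YEAR), 2)
```

The same function can be applied to the other availability targets in the table, for example 99% or 99.9%, to see how much tighter or looser their budgets are.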

Monitoring

Cockpit dashboards: 9 Grafana dashboards (s19)

Health probes: federation-health job every 30 min

Alerting: 11 alert rules (4 critical, 7 warning)

Tracing: OTel @traced decorator on all functions
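
To give an idea of what a `@traced` decorator does, the sketch below is a standard-library stand-in that records the name, duration, and outcome of each call. It is not the project's implementation and does not use the OpenTelemetry API; the `RECORDED_SPANS` list and the `memory_retrieve` handler are hypothetical and exist only for this example. A real decorator would open an OTel span instead of appending to a list.

```python
import functools
import time

# Hypothetical in-memory sink; a real setup would export spans
# through OpenTelemetry rather than collect them in a list.
RECORDED_SPANS = []


def traced(func):
    """Record name, duration, and outcome of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise  # re-raise so callers still see the failure
        finally:
            RECORDED_SPANS.append({
                "name": func.__name__,
                "duration_s": time.perf_counter() - start,
                "status": status,
            })
    return wrapper


@traced
def memory_retrieve(query):
    # Hypothetical handler body, for illustration only.
    return {"query": query, "results": []}
```

Because the record is written in the `finally` block, failed calls are captured as well, which is what makes per-function latency and error-rate dashboards possible.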

Info

Production SLA: Formal SLAs (with penalty clauses) will be set at the first production tenant onboarding (BSW M3 MinFin). The current targets are internal SLOs for the development and staging environments.

Changelog

Version	Date	Change
0.1.0	2026-02-24	Initial version: SLO table, error budget, monitoring overview