Run & Maintain
Ongoing operation of the systems we ship — on-call, eval drift, retraining cadence, regulatory posture.
AI systems decay differently from conventional software. Data shifts, providers change, evals quietly stop reflecting reality. Run-and-Maintain is the discipline of treating that decay as a known maintenance schedule rather than a surprise.
Why ongoing operation is its own contract
A launch team can ship a great system. Keeping it great takes a different skill set, a different cadence, and a different incentive structure. We take the on-call pager, run the weekly eval review, and own the retraining calendar, so the team that built it gets to build the next thing.
What we own when we run a system
On-call
24×7 pager rotation with SLOs you set, and a runbook we maintain.
Eval drift
Weekly automated eval runs. Drift alerts. Quarterly benchmark refresh.
Retraining cadence
Triggered by drift, by new data volume, or by calendar — whichever fires first.
Cost governance
Per-customer cost attribution, monthly review with finance, alerting on anomalies.
Incident response
Postmortems within 72 hours. Action items tracked to closure. No theatre.
Regulatory posture
SOC 2, GDPR, HIPAA, EU AI Act: whichever frameworks apply, we keep your compliance current.
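To make the "whichever fires first" retraining rule concrete, here is a minimal Python sketch. It assumes a scheduled check that looks at three signals (eval score against a baseline, new data volume, and time since the last retrain). The names `RetrainPolicy` and `retrain_reason` and all threshold values are hypothetical, chosen for illustration only; real thresholds are set per system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class RetrainPolicy:
    # All thresholds below are illustrative assumptions, not recommendations.
    max_eval_drop: float = 0.03               # tolerated drop in eval score vs. baseline
    max_new_examples: int = 50_000            # new examples collected since last retrain
    max_age: timedelta = timedelta(days=90)   # calendar backstop


def retrain_reason(
    policy: RetrainPolicy,
    baseline_score: float,
    current_score: float,
    new_examples: int,
    last_retrain: date,
    today: date,
) -> Optional[str]:
    """Return the name of a trigger that has fired, or None if none has."""
    # Drift: the eval score has fallen further below baseline than tolerated.
    if baseline_score - current_score > policy.max_eval_drop:
        return "drift"
    # Data volume: enough new data has accumulated to justify a retrain.
    if new_examples >= policy.max_new_examples:
        return "data_volume"
    # Calendar: the model is simply too old, regardless of other signals.
    if today - last_retrain >= policy.max_age:
        return "calendar"
    return None
```

Run on every scheduled check, the first call that returns a non-`None` value is the trigger that fired first. If several conditions hold in the same check, this sketch reports them in the order drift, data volume, calendar; that ordering is a design choice, not a requirement.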
How a Run engagement works
- 01
Onboarding
Read every runbook, sit through one full incident, shadow the team for two weeks.
- 02
Stabilize
Codify undocumented practice. Fix the obvious sharp edges. Set SLOs explicitly.
- 03
Operate
Daily ops. Weekly eval review. Monthly business review with stakeholders.
- 04
Improve
Quarterly retraining. Annual architecture review. Continuous cost trimming.