5 weeks · 0 milestones
Produce a deployment plan and operations runbook for a real software system. It must cover:
(1) a step-by-step deployment procedure with preconditions, execution steps, and post-deployment verification for at least one environment (staging or production);
(2) a rollback procedure with decision criteria for when to roll back;
(3) an incident response procedure for the three most likely failure modes, each with documented detection, diagnosis, and recovery steps;
(4) an on-call guide documenting the operational state of the system (what healthy looks like, and which alerts fire under which conditions).

Preferred proof: the runbook for a live system you operate.

Accessible alternative: a runbook for a real open-source system you have deployed yourself (Railway free tier, Render free tier, or similar). Documented evidence of an actual deployment is required; a hypothetical deployment does not count.

Proof artifacts: deployment procedure document (design artifact) and runbook with rollback procedure (documentation artifact).

Verification: someone who has operated production systems reviews the runbook against two questions: 'Would I be able to bring this system back up from a cold start using this runbook alone?' and 'What happens if the rollback itself fails?'
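The rollback decision criteria asked for above are easier to review when they are written as an explicit rule instead of a judgment call. The sketch below is one possible way to express such a rule in Python. Everything in it is an assumption made for illustration: the `CheckResult` type, the `should_roll_back` function, the check names, and the threshold of one tolerated non-critical failure are all hypothetical and should be replaced by whatever your system actually measures after a deployment.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one post-deployment verification check (hypothetical type)."""
    name: str
    passed: bool
    critical: bool  # assumption: the runbook marks each check as critical or not


def should_roll_back(
    results: Iterable[CheckResult],
    max_noncritical_failures: int = 1,  # assumed threshold, tune per system
) -> Tuple[bool, str]:
    """Return (decision, reason) for a set of post-deployment check results.

    Rule sketched here:
      - any failed critical check triggers a rollback;
      - otherwise, roll back only if more non-critical checks failed
        than the tolerated threshold.
    """
    results = list(results)
    critical_failed = [r.name for r in results if r.critical and not r.passed]
    noncritical_failed = [r.name for r in results if not r.critical and not r.passed]

    if critical_failed:
        return True, "critical check failed: " + ", ".join(critical_failed)
    if len(noncritical_failed) > max_noncritical_failures:
        return True, "too many non-critical failures: " + ", ".join(noncritical_failed)
    return False, "within thresholds"


if __name__ == "__main__":
    # Hypothetical check names; a real runbook would list its own.
    observed = [
        CheckResult("http_health", passed=True, critical=True),
        CheckResult("db_migration_version", passed=False, critical=True),
        CheckResult("cache_warm", passed=False, critical=False),
    ]
    decision, reason = should_roll_back(observed)
    print(decision, reason)
```

Returning the reason alongside the decision keeps the rule auditable: the on-call engineer can paste it into the incident log, and a reviewer can check each branch against the written criteria in the runbook. The sketch does not cover the case where the rollback itself fails, which the runbook still has to address separately.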