Exercise 6.8 — Resilience Probe: Web + API + Redis
Exercises are guided practice for the lecture material. They are non-graded, but later graded assignments assume this work has been completed.
System Diagram

Starter Code Download
Why you are doing this (and why it matters later)
Lecture 08 introduces the core reliability distinction between process startup and service readiness. In real systems, dependencies restart and transient failures happen. This exercise has you practice designing `/healthz` and `/readyz`, observing Compose health states, and injecting failures so you can demonstrate that the system degrades gracefully and recovers without manual intervention.
Specifically, you will learn to:
- use health and readiness endpoints for machine-readable service state
- interpret Docker Compose health transitions during boot and failures
- simulate dependency disruption and validate retry/backoff behavior
- collect evidence from logs and probes to locate the failing service
Starting Instructions
- Download the Exercise 6.8 starter code archive from the course site and extract it.
- Open a terminal in the extracted folder that contains `compose.yml`.
- Run `docker compose up -d --build` to build and start `web`, `api`, and `redis`.
- Run `docker compose ps` and confirm all services are up before starting probes.
- Keep a second terminal ready for logs with `docker compose logs -f web api redis`.
Exercise Instructions
Part A (10-15 min): Baseline Health and Readiness
- Probe `api` with `curl http://localhost:3000/healthz` and `curl http://localhost:3000/readyz`.
- Probe `web` with `curl http://localhost:3001` and record the response when the stack is healthy.
- Capture one `docker compose ps` snapshot and identify health states (`starting`, `healthy`, or `unhealthy`).
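When you read the `docker compose ps` snapshot, the health state normally appears in parentheses in the STATUS column. The sketch below shows one way to extract it from pasted rows in your notes. The helper name `healthFromStatus` is hypothetical, and the status strings in the comments are typical Docker wording assumed here, not output captured from this exercise's stack.

```javascript
// Hypothetical helper: pull the health state out of a STATUS string
// as printed by `docker compose ps`.
// Assumed typical forms:
//   "Up 4 seconds (health: starting)"
//   "Up 2 minutes (healthy)"
//   "Up 2 minutes (unhealthy)"
function healthFromStatus(status) {
  const match = /\((?:health:\s*)?(starting|healthy|unhealthy)\)/i.exec(status);
  // A container without a healthcheck shows no parenthesised health state.
  return match ? match[1].toLowerCase() : "none";
}
```

A result of `none` means that no health state was reported for that row, which usually indicates the service has no healthcheck defined rather than that it is broken.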
Part B (15-20 min): Inject and Observe Failure
- Inject disruption using `docker compose restart redis`.
- While Redis restarts, repeatedly run `curl http://localhost:3001` and `curl http://localhost:3000/readyz`.
- Follow logs with `docker compose logs -f web api redis` and note exactly when symptoms begin and recovery occurs.
- Describe whether failures are brief degradation (`503`) or persistent crash behavior.
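Running curl by hand makes it easy to miss the exact moment symptoms begin. As an optional aid, the sketch below polls one endpoint on a timer and groups consecutive failures into outage windows, so your timeline has timestamps. The names `probe`, `findOutages`, and `watch` are hypothetical helpers, not part of the starter code, and the sketch assumes Node.js 18 or later for the global `fetch`.

```javascript
// Hypothetical helper: probe one URL and record when it answered and how.
async function probe(url) {
  const t = Date.now();
  try {
    const res = await fetch(url);
    return { t, status: res.status };
  } catch (err) {
    // Connection refused or reset: record status 0 so the gap stays visible.
    return { t, status: 0 };
  }
}

// Group consecutive non-2xx samples into outage windows.
// `samples` must be in time order: [{ t: <ms>, status: <number> }, ...].
function findOutages(samples) {
  const outages = [];
  let current = null;
  for (const s of samples) {
    const ok = s.status >= 200 && s.status < 300;
    if (!ok) {
      if (current === null) current = { start: s.t, codes: new Set() };
      current.codes.add(s.status);
    } else if (current !== null) {
      outages.push({ start: current.start, end: s.t, codes: [...current.codes] });
      current = null;
    }
  }
  // An outage still open at the end has no observed recovery.
  if (current !== null) {
    outages.push({ start: current.start, end: null, codes: [...current.codes] });
  }
  return outages;
}

// Poll `url` every `intervalMs` for `durationMs`, then summarise the outages.
async function watch(url, intervalMs = 500, durationMs = 30000) {
  const samples = [];
  const stopAt = Date.now() + durationMs;
  while (Date.now() < stopAt) {
    samples.push(await probe(url));
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return findOutages(samples);
}

// Usage (start it, then restart Redis in another terminal):
//   watch("http://localhost:3000/readyz").then(console.log);
```

An outage whose `end` is `null` never recovered within the polling window, which points toward persistent failure rather than brief degradation. The polling interval limits the precision of the timestamps, so cross-check them against the log output.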
Part C (15-20 min): Tune Recovery Behavior
- In `web/server.js`, change retry parameters in `withRetry` usage to `attempts = 6` and `baseMs = 200`.
- Rebuild only web using `docker compose up -d --build web`.
- Repeat the Redis restart experiment and compare user-visible failures before vs after tuning.
- Write a short evidence-based conclusion about whether increased retry/backoff improved resilience.
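To reason about what the new parameters buy you, it helps to estimate how long a request can keep retrying before it gives up. The sketch below is not the starter's `withRetry`; it is a minimal stand-in that assumes the delay doubles after each failed attempt and that there is no wait after the final attempt. The names `backoffDelays` and `withRetrySketch` and the default values are hypothetical. Check `web/server.js` for the actual schedule before relying on the number.

```javascript
// Assumption: delays are baseMs, 2*baseMs, 4*baseMs, ... with one delay
// between each pair of attempts (so attempts - 1 delays in total).
function backoffDelays(attempts, baseMs) {
  const delays = [];
  for (let i = 0; i < attempts - 1; i++) {
    delays.push(baseMs * 2 ** i);
  }
  return delays;
}

// Stand-in retry wrapper: call fn until it succeeds or attempts run out.
// The defaults here are illustrative, not the starter's values.
async function withRetrySketch(fn, { attempts = 3, baseMs = 100 } = {}) {
  const delays = backoffDelays(attempts, baseMs);
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt; skip the wait after the last one.
      if (i < delays.length) {
        await new Promise((resolve) => setTimeout(resolve, delays[i]));
      }
    }
  }
  throw lastError;
}

// Worst-case time spent waiting between attempts for the Part C settings,
// under the doubling assumption: 200 + 400 + 800 + 1600 + 3200.
const worstCaseMs = backoffDelays(6, 200).reduce((sum, d) => sum + d, 0);
console.log(worstCaseMs); // 6200
```

If the Redis restart you observe takes less than this retry budget, requests made during the outage can succeed late instead of failing, at the cost of higher latency. That trade-off is what your before/after comparison should show.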
AI Assist Ideas (Optional)
AI use is allowed for this exercise.
Prompt log requirement: AI assistance is allowed, but you must log every prompt/response in prompt-log.md with tool name, exact prompt, and a short note describing how you used the result.
Suggested prompts (adapt to your exact code context):
- Explain the withRetry function and predict total worst-case wait time for 6 attempts with 200ms base delay.
- Given the readyz handler in the API service, what false-positive readiness risks still exist?
- I observed intermittent 503 responses after restarting Redis with Docker Compose. Suggest a debugging checklist focused on logs and probe sequencing.
Target behavior:
- `/healthz` remains 200 while service processes are alive
- `/readyz` reflects dependency state and can return 503 during Redis outages
- `web` may briefly degrade during Redis disruption but should recover without manual restarts
- students can identify the offending dependency from probe outputs plus logs
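The first two target behaviors can be expressed as a small decision function. This is a sketch of the liveness/readiness split, not the handler in `api/server.js`: the function name `probeResponse`, the `redisReachable` flag, and the response bodies are all hypothetical.

```javascript
// `redisReachable` is an assumed boolean supplied by a dependency check,
// for example whether a recent Redis PING succeeded.
function probeResponse(path, redisReachable) {
  if (path === "/healthz") {
    // Liveness: the process is running and can answer, nothing more.
    return { status: 200, body: { status: "ok" } };
  }
  if (path === "/readyz") {
    // Readiness: report ready only when the dependency is usable.
    return redisReachable
      ? { status: 200, body: { status: "ready" } }
      : { status: 503, body: { status: "not ready", dependency: "redis" } };
  }
  return { status: 404, body: { error: "not found" } };
}
```

Keeping `/healthz` independent of Redis matters: if liveness failed whenever a dependency was down, an orchestrator could restart a perfectly good process during a short Redis outage instead of waiting for it to recover.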
Saving Your Work
- Save only the files you changed for this exercise in your extracted exercise folder.
- Keep notes and timeline evidence in `exercise-notes.md` (or an equivalent text file).
- Keep at least one screenshot of `docker compose ps` and your before/after retry comparison for class discussion.
Verifying Your Work
- Run `docker compose ps` and confirm healthy state after experiments complete.
- Re-run curl checks for `http://localhost:3000/healthz`, `http://localhost:3000/readyz`, and `http://localhost:3001`.
- Demonstrate that the stack recovers after `docker compose restart redis`.
- Ensure your before/after comparison of retry settings is based on observed outputs, not assumptions.
Solution Walkthrough
- Compare your probe order against the lecture's observability sequence: `compose ps` -> `logs` -> endpoint probes -> in-container checks.
- Review `web/server.js` and `api/server.js` from your downloaded exercise code to explain why behavior changed.
- Use your timeline to justify which service was the initial source of failure and why.