Chapter 38 Exercises: Deploying Python to the Cloud
How to Use These Exercises
Five tiers of difficulty. Tiers 1–2 can be completed without a cloud account. Tiers 3–5 involve actual deployment and will incur minor cloud costs (typically under $1 for the exercises, most within free tiers).
If you prefer not to create cloud accounts, Tiers 1–3 can be completed using local Docker runs as a production simulation.
Tier 1 — Recall (Exercises 1–4)
Exercise 1: Docker Concepts
Answer the following questions in your own words (2–3 sentences each):
a) What is the difference between a Docker image and a Docker container? Use the class/instance analogy or another analogy of your choice.
b) Why does Dockerfile instruction order matter for build performance? Give a specific example of what would happen if COPY . . appeared before COPY requirements.txt . in the Dockerfile.
c) What does ENV PYTHONUNBUFFERED=1 do in a Dockerfile, and why does it matter for reading container logs?
d) What does EXPOSE 8000 do in a Dockerfile? What does it NOT do?
e) The chapter's Dockerfile creates a non-root user (adduser -D appuser). Explain the security reason for this practice.
Exercise 2: Environment Variables Mapping
Given the following table of application behavior, identify whether each configuration belongs in:
- The .env file (development only)
- Platform environment variables (production, e.g., Render)
- Hardcoded in app.py (acceptable to be in source code)
| Setting | Value Example | Where? |
|---|---|---|
| Application name | "Acme Corp Dashboard" | |
| Flask SECRET_KEY | 8f3e2a1c7b9d... | |
| Dashboard password | AcmeFY2024! | |
| FLASK_DEBUG | true (development) | |
| FLASK_DEBUG | false (production) | |
| Default port | 5000 | |
| Database filename | data/acme.db (no credentials) | |
| PostgreSQL connection URL with password | postgresql://user:pass@host/db | |
| Maximum upload file size | 16 * 1024 * 1024 (16MB) | |
| Admin email address | admin@acme.com | |
For each item in the "Hardcoded in app.py" category, explain what condition makes it acceptable to hardcode.
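The decision hinges on the settings-loading pattern the chapter uses. A minimal sketch of that pattern, with placeholder names; only values with no security or per-environment impact are hardcoded:

```python
import os

# Secrets and per-environment values come from the environment;
# harmless constants may stay in source. All names are placeholders.
SECRET_KEY = os.environ.get("SECRET_KEY")  # required in production
FLASK_DEBUG = os.environ.get("FLASK_DEBUG", "false").lower() == "true"

APP_NAME = "Acme Corp Dashboard"      # display text, safe to hardcode
MAX_UPLOAD_BYTES = 16 * 1024 * 1024   # behavior constant, not a secret
```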
Exercise 3: Gunicorn vs. Flask Dev Server
A colleague says: "I tested my Flask app with python app.py and it works fine. Why do I need Gunicorn for production? Isn't that extra complexity?"
Write a response that:
a) Explains the specific technical limitations of Flask's development server
b) Explains what Gunicorn does differently
c) Describes one concrete scenario where the development server would fail under real-world usage
d) Acknowledges one situation where the simpler approach might actually be appropriate (think: what types of internal tools might be acceptable with the dev server?)
Exercise 4: AWS Lambda Decision Framework
For each of the following business Python tasks, decide whether AWS Lambda with a scheduled trigger is an appropriate deployment target. Explain your reasoning (2–3 sentences each).
a) A script that runs every Monday at 8 a.m., reads acme_sales.csv from S3, generates a PDF summary, and emails it to Sandra Chen.
b) A Flask web application that displays the Acme Corp dashboard to Sandra.
c) A script that analyzes customer review text with NLP and updates a "sentiment score" column in a database, running every time a new review is added.
d) A machine learning model that scores lead quality in real time — a sales rep types a company name and immediately sees a lead score.
e) A script that runs on the 1st of every month, generates all client invoices, and emails them.
Tier 2 — Apply (Exercises 5–8)
Exercise 5: Write a Dockerfile
Write a complete, production-ready Dockerfile for the following application:
- A Python 3.11 Flask application
- The main file is report_server.py with a Flask instance named server
- Dependencies are in requirements.txt
- The application reads PDFs from a reports/ directory and serves them
- Should run on port 9000 in production using Gunicorn
- Must run as a non-root user
Requirements for your Dockerfile:
- Use python:3.11-alpine as the base image
- Set all recommended ENV variables from the chapter
- Install system dependencies needed for a Python-only Alpine image
- Copy requirements.txt separately before the application code
- Create and use a non-root user
- Use correct Gunicorn CMD syntax for report_server.py with instance server
- Include a comment explaining the purpose of each significant instruction
Also write the corresponding .dockerignore file.
Exercise 6: Docker Build and Run Locally
Using the Dockerfile from the chapter (in chapter-38-deployment/code/Dockerfile) and the Flask application from Chapter 37 (in chapter-37-flask/code/):
a) Build the Docker image locally. What command do you use? What output do you see?
b) Run the container locally with the environment variables from chapter-37-flask/code/.env. What command do you use?
c) Verify the application responds at http://localhost:8000. What do you check?
d) The chapter's Dockerfile copies requirements.txt before the rest of the code. Make a small, arbitrary change to app.py (add a comment) and rebuild. Observe that the pip install layer is cached (skipped) while only the COPY . . layer is rebuilt. Describe what you see in the build output and why it matters.
e) Run docker run --rm acme-dashboard whoami to verify the container is running as appuser rather than root. What does the --rm flag do?
Exercise 7: Docker Compose for a Two-Service Setup
Extend the docker-compose.yml from the chapter to include a second service: a Redis cache.
Context: You want to add caching to the dashboard so that load_sales_metrics() only re-reads the CSV once per minute, not on every request. Redis is a fast in-memory cache well-suited for this.
Write a docker-compose.yml that:
- Defines the web service as in the chapter
- Adds a redis service using the redis:7-alpine image
- Configures the web service to wait for redis to be healthy before starting (use depends_on with condition: service_healthy)
- Sets a REDIS_URL environment variable in the web service that points to the Redis service
- Defines a health check for the Redis service (redis-cli ping)
You do not need to modify the Flask application to actually use Redis for this exercise — only the docker-compose.yml.
Exercise 8: Environment Variable Audit
Run an environment variable audit on the Acme Corp application (chapter-37-flask/code/app.py).
a) List every value in app.py that is currently loaded from an environment variable.
b) List any values that are currently hardcoded but should ideally be environment variables in a production deployment. For each, explain why.
c) The application uses os.environ.get("SECRET_KEY", "dev-secret-change-before-deploying"). Explain what the second argument (the default) does. Is this default value safe? What would happen if someone deployed this to production without setting SECRET_KEY?
d) Write the complete .env.example file that a developer should fill in when setting up this project locally. Include realistic placeholder values and a comment for each variable explaining what it is and where to get it.
Tier 3 — Extend (Exercises 9–12)
Exercise 9: Health Check Endpoint
Add a /health endpoint to the Acme Corp application that returns JSON suitable for use as a container health check and an uptime monitoring target.
Requirements:
- Returns a 200 status with JSON on success
- Returns a 500 status with JSON on failure
- Checks: (1) the application is responding, (2) the acme_sales.csv file exists and is readable, (3) a sample pandas operation succeeds
- The response should include: status ("ok" or "error"), timestamp, checks (a dict with each check's status), and version (a string you define)
- Does NOT require authentication
Example successful response:
{
  "status": "ok",
  "timestamp": "2024-11-15T09:23:41",
  "version": "1.2.0",
  "checks": {
    "application": "ok",
    "data_file": "ok",
    "data_readable": "ok"
  }
}
Configure UptimeRobot (free tier) to monitor this endpoint and email you if it returns anything other than 200. Document the UptimeRobot configuration steps.
Exercise 10: GitHub Actions CI Workflow
Set up a complete GitHub Actions workflow for the Acme Corp application.
The workflow must:
- Trigger on push to main and on pull requests to main
- Run on ubuntu-latest
- Install Python 3.11
- Install all dependencies from requirements.txt
- Run the test suite with pytest tests/
- Fail the workflow (and block the Render deployment) if any test fails
- Also run pip check to verify there are no dependency conflicts
Write the complete .github/workflows/test.yml file.
Additionally, write three meaningful test functions in tests/test_app.py that are not covered by the examples in the chapter:
1. Test that submitting the expense form with a future date returns a 200 with an error message
2. Test that the /api/metrics endpoint returns JSON with all expected keys when authenticated
3. Test that accessing /expenses/history without authentication redirects to the login page
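A hedged skeleton of the workflow file, showing only the triggers and Python setup (the action versions assume currently common releases; completing the remaining steps is the exercise):

```yaml
# Skeleton only -- add the dependency install, pip check, and pytest steps.
name: tests
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
```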
Exercise 11: Render Deployment (Hands-On)
Deploy the Acme Corp dashboard to Render. This exercise requires creating a free Render account and a GitHub account.
Document each step:
a) Create a private GitHub repository. Push the application code. Verify the .env file is NOT in the repository.
b) Create a Render account. Create a new Web Service connected to your GitHub repository.
c) Configure all required environment variables in Render's "Environment" section. Note: use the Render Starter tier ($7/month) or test on the Free tier (accepting the cold-start limitation).
d) Watch the initial build log and record the full build time.
e) Verify the deployed application:
- Visit the production URL
- Log in with the production dashboard password
- Confirm the dashboard renders (may show a "no data" warning if no CSV is present)
- Generate sample data by calling the data generation endpoint
f) Make a small change to the application (add a sentence to the home page description). Push to GitHub. Observe the automatic redeploy on Render.
g) Calculate the annual cost of this deployment at Render Starter pricing. Is it cost-justified for internal use?
Exercise 12: AWS Lambda Scheduled Script
Deploy Maya's monthly invoice generation script as an AWS Lambda function with a monthly EventBridge trigger.
This exercise requires an AWS account (Lambda is within the free tier for this usage level).
a) Write a simplified version of the invoice generation script as a Lambda handler:
- The function reads project data from an S3 bucket (not a local database — adapt accordingly)
- Generates a simple text summary (not a full PDF — just to demonstrate the pattern)
- Writes the summary back to S3
b) Package the function with its dependencies as a ZIP file. Document the exact commands.
c) Create the Lambda function in the AWS Console. Set the runtime to Python 3.11.
d) Create an EventBridge (CloudWatch Events) rule that triggers the function on the 1st of each month at 9:00 AM UTC. Write the cron expression.
e) Test the function manually using the "Test" button in the Lambda console. Show the execution result.
f) Verify the function appears in your Lambda billing — it should be $0.00 within the free tier.
Tier 4 — Design (Exercises 13–15)
Exercise 13: Persistent Data Strategy
You are advising Maya on her database persistence strategy. She currently uses SQLite with a Render Persistent Disk. She is evaluating three alternatives:
Option A: Keep SQLite + Persistent Disk, add daily S3 backup via Lambda.
Option B: Migrate to PostgreSQL using Render's managed PostgreSQL offering ($7/month). Render provides automated daily backups.
Option C: Migrate to PlanetScale (serverless, MySQL-compatible cloud database) — free tier availability and pricing have changed over time, so verify current terms.
For each option, analyze:
1. Data safety (what happens if the server dies?)
2. Operational complexity (how much does Maya need to manage?)
3. Cost (monthly)
4. Migration effort from the current SQLite setup
5. Performance for Maya's current scale (8 clients, ~1,000 rows total)
Make a specific recommendation with clear reasoning. What would change your recommendation if Maya's client base grew to 100 clients?
Exercise 14: Multi-Environment Deployment Strategy
Maya wants to add a "staging" environment to her deployment pipeline:
- Development: her laptop (local Flask dev server)
- Staging: a Render free tier instance where she tests before pushing to production
- Production: the Render Starter instance her clients use
Design the full multi-environment strategy:
a) What should be different between staging and production? List at least five things (environment variables, data, logging, error display, etc.)
b) How should the GitHub branching strategy work? What branches trigger deployment to staging vs. production?
c) The staging database needs to be populated with realistic but not real client data. Describe a database seeding strategy that creates plausible test data for all eight project types Maya typically handles.
d) How would you prevent the staging URL from being accidentally shared with real clients? (Hint: think about HTTP authentication at the server level, separate domains, or Render's preview URL feature.)
e) What is the minimum viable staging environment for Maya's current situation? Is the full strategy you designed in (a)-(d) justified at her scale?
Exercise 15: Incident Response Plan
Write a basic incident response plan for Maya's client portal. Assume the portal becomes unavailable and clients begin receiving errors.
The plan should cover:
a) Detection: How does Maya know the portal is down? (Include at least two independent monitoring mechanisms.)
b) First response (0–5 minutes): What does Maya do immediately when she receives an alert? What is the first URL she visits?
c) Diagnosis (5–15 minutes): How does she determine whether the issue is: (1) the application crashing, (2) the database being inaccessible, (3) the Render platform having an outage, or (4) the custom domain having DNS issues?
d) Communication (during the incident): What does Maya tell clients? Write a template status message she can send within 10 minutes of detecting an outage.
e) Recovery: For each of the four failure scenarios in (c), describe the specific recovery action.
f) Post-mortem: What should Maya document after an incident? Write a template post-mortem document structure.
Tier 5 — Challenge (Exercises 16–19)
Exercise 16: Zero-Downtime Deployment
Render's Starter plan does not guarantee zero-downtime deployments — there is typically a brief period (5–30 seconds) where the old container is stopped and the new one starts. For an internal dashboard, this is acceptable. For Maya's client portal, a client who happens to refresh during a deployment would see an error.
Research and implement a zero-downtime deployment strategy:
a) Explain what a blue-green deployment is and how it achieves zero downtime.
b) Render's Standard plan ($25/month) includes zero-downtime deployments via health checks. Implement this by:
1. Adding a /health endpoint (from Exercise 9) that Render can poll
2. Configuring the health check URL in Render's service settings
3. Documenting how Render uses the health check to determine when to switch traffic
c) As an alternative to upgrading Render's plan, describe how you could implement a simple zero-downtime strategy using two Render free-tier instances and a load balancer (Render does not provide load balancing on free tier, so you would need to use Cloudflare or another DNS-based approach). Describe the architecture, cost, and operational complexity.
d) For Maya's portal at its current scale (8 clients, checking occasionally), is zero-downtime deployment worth the additional cost and complexity? Make the case for both "yes" and "no."
Exercise 17: Observability Stack
Maya wants professional-grade observability for her client portal — not just uptime monitoring, but insights into how clients use the portal and early warning of performance degradation.
Design and implement a minimal observability stack:
a) Structured logging: Modify the Flask application to emit structured JSON logs instead of plain text. Each log line should be a JSON object with: timestamp, level, event, project_code (when available in session), path, method, status_code, and duration_ms (request duration).
b) Log shipping: Configure the application to ship logs to Logtail (logtail.com — free tier available). Document the setup steps.
c) Metrics: Add a /metrics endpoint that returns application metrics in Prometheus text format:
- portal_requests_total (counter, labeled by endpoint and status)
- portal_active_sessions (gauge — number of valid sessions in the last hour)
- portal_data_load_seconds (histogram — time to load project data)
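For reference, Prometheus's plain-text exposition format looks like this (the label values and numbers below are illustrative):

```
# HELP portal_requests_total Total HTTP requests served.
# TYPE portal_requests_total counter
portal_requests_total{endpoint="/",status="200"} 1027
portal_requests_total{endpoint="/api/metrics",status="401"} 3
# TYPE portal_active_sessions gauge
portal_active_sessions 4
```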
d) Alerting: Configure an alert in Logtail that sends Maya an email if more than 5 errors occur within a 10-minute window.
e) Dashboard: Create a simple observability summary page at /admin/stats (protected by admin password) that shows: requests in the last 24 hours, error rate, most active project codes, and average page load time.
Exercise 18: Disaster Recovery Testing
"Untested backups are not backups." Implement and test a complete disaster recovery procedure for Maya's client portal.
a) Backup implementation: Write a Python script that:
- Downloads maya_projects.db from the Render persistent disk via SSH
- Compresses it with gzip
- Uploads it to an S3 bucket with a timestamped filename
- Sends Maya an email with the backup size and S3 location
- Verifies the backup is readable by running a SQLite integrity check on the downloaded copy
b) Backup scheduling: Package this script as a Lambda function. Schedule it to run daily at 3 AM UTC via EventBridge.
c) Recovery procedure: Write a step-by-step recovery procedure document that someone other than Maya could follow to:
1. Find the most recent backup in S3
2. Download and decompress it
3. Verify its integrity
4. Upload it to a fresh Render persistent disk
5. Verify the restored application works with the recovered data
d) Drill: Actually perform the recovery procedure with a test database (create a copy of maya_projects.db with fake data). Document what worked as expected and what was unclear in your procedure.
e) Recovery time objective: Based on performing the drill, estimate how long a complete recovery from catastrophic disk failure would take. Is this acceptable for Maya's clients? What would you change to reduce recovery time?
Exercise 19: Bring It All Together — Full CI/CD Pipeline
Build a complete CI/CD pipeline for either the Acme Corp dashboard or Maya's client portal (your choice).
The pipeline must include:
a) GitHub repository structure:
- main branch → deploys to production automatically if CI passes
- staging branch → deploys to staging automatically if CI passes
- Feature branches → CI runs tests on push; no deployment
b) GitHub Actions CI workflow:
- Runs tests with pytest
- Checks code style with flake8 (PEP 8 compliance)
- Builds the Docker image and verifies it starts successfully
- Runs the health check endpoint against the locally-started container
- Fails fast: if any step fails, subsequent steps are skipped
c) Render configuration:
- Production service: connected to main branch
- Staging service: connected to staging branch
- Both services have appropriate environment variables set
- Both services have the health check endpoint configured
d) Deployment documentation:
- Write a one-page "How to Deploy" guide for a hypothetical second team member
- Include: how to create a branch, how to test locally, how to push and verify CI, how to merge, how to watch the deployment, and how to roll back
e) Demonstrate the pipeline: Walk through a complete cycle:
- Create a feature branch
- Make a visible change to the UI
- Push the branch (CI runs)
- Merge to staging (deploys to staging)
- Verify on staging
- Merge to main (deploys to production)
- Verify on production
Document each step with screenshots or log output.
Exercise Notes and Hints
Exercise 1d: EXPOSE is documentation only. Run-time port publication requires -p host:container in docker run or the ports: setting in docker-compose.yml.
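A sketch of the distinction (image contents and names are placeholders):

```dockerfile
FROM python:3.11-alpine
# EXPOSE is metadata for readers and tooling; it publishes nothing.
EXPOSE 8000
# Actual publication happens at run time, e.g.:
#   docker run -p 8000:8000 my-image
# or via ports: ["8000:8000"] in docker-compose.yml
```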
Exercise 5: On Alpine, the Gunicorn CMD for a Flask app in report_server.py with instance server would be: CMD ["gunicorn", "--workers", "4", "--bind", "0.0.0.0:9000", "report_server:server"]
Exercise 7: depends_on with health condition syntax:
depends_on:
  redis:
    condition: service_healthy
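The matching health check on the redis service could look like this sketch (the interval, timeout, and retry values are arbitrary choices):

```yaml
services:
  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5
```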
Exercise 9: Flask health check endpoints should avoid authentication so monitoring systems can access them without credentials. Never expose sensitive data through the health endpoint.
Exercise 12: EventBridge cron for the 1st of each month at 9 AM UTC: cron(0 9 1 * ? *)
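A minimal sketch of the handler pattern for part (a). The bucket name, object keys, and field names are hypothetical, and boto3 is imported inside the handler so the summary helper runs without AWS installed:

```python
import json


def summarize_projects(projects):
    """Pure helper: turn a list of project dicts into a text summary.

    The field names ("client", "amount") are hypothetical placeholders.
    """
    total = sum(p["amount"] for p in projects)
    lines = [f"{p['client']}: ${p['amount']:,.2f}" for p in projects]
    lines.append(f"TOTAL: ${total:,.2f}")
    return "\n".join(lines)


def lambda_handler(event, context):
    """Sketch of the Lambda entry point; bucket and keys are placeholders."""
    import boto3  # imported here so the helper above is testable without AWS

    s3 = boto3.client("s3")
    body = s3.get_object(Bucket="maya-invoices", Key="projects.json")["Body"]
    summary = summarize_projects(json.loads(body.read()))
    s3.put_object(Bucket="maya-invoices", Key="summary.txt",
                  Body=summary.encode())
    return {"statusCode": 200}
```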
Exercise 13: PostgreSQL connection strings with credentials take the form: postgresql://username:password@hostname:5432/dbname. These are always environment variables, never hardcoded.
Exercise 17a: Python log records can be emitted as JSON using the python-json-logger package, which provides a JSON formatter for the standard logging module. This is a common approach to structured logging in Python.
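The shape of the output can also be sketched with the standard library alone (python-json-logger does the same job more robustly). The field names follow the exercise; everything else is illustrative:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line.

    A stdlib-only sketch; python-json-logger is a fuller implementation.
    """

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Request fields are attached via logging's `extra=` mechanism
        for field in ("path", "method", "status_code", "duration_ms"):
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


logger = logging.getLogger("portal")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("request_served", extra={"path": "/", "status_code": 200})
```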
Exercise 18a: Render's persistent disks can be accessed via SSH using the service's shell. The file path for the database would be /app/data/maya_projects.db. The paramiko Python library provides SSH/SFTP from within Python scripts.