CI/CD Pipeline and CloudWatch Monitoring

AWS ECS Fargate Deployment with Terraform

What You'll Build

  • Manual AWS Console clicks → Infrastructure as Code with Terraform
  • No networking knowledge → VPC, Subnets, CIDR mastery
  • Docker on localhost only → Containers running in AWS ECS
  • No CI/CD pipeline → Automated deployments with CodePipeline
  • No monitoring → CloudWatch alerts and dashboards
  • Fixed capacity → Auto-scaling with Fargate Spot

View Source Code

Explore the complete Terraform configurations, Flask application, and CI/CD setup on GitHub.


Key Concepts Flashcards

Click any card to flip and reveal the definition.

What is Terraform?

A tool that lets you create, change, and manage infrastructure using code instead of manual clicks. It allows you to define your cloud infrastructure in code and automatically provision it across platforms like AWS, Azure, and GCP.

What is a VPC?

A Virtual Private Cloud - your own private network inside AWS where you control IP addresses, subnets, routing, and security. Think of it as your own isolated data-center network in the cloud.

What is the difference between Public and Private Subnets?

A public subnet has a route to the Internet Gateway (0.0.0.0/0 -> IGW) and can reach the internet directly. A private subnet has no route to IGW and cannot be reached directly from the internet.

What is ECS Fargate?

AWS's serverless container service. You tell ECS what to run (container image, CPU, memory) and AWS handles all the underlying infrastructure. You never see or manage EC2 instances.

What is an Application Load Balancer (ALB)?

A service that receives HTTP/HTTPS traffic from users and intelligently distributes it to backend services (EC2/ECS tasks) based on rules. It acts as the 'front door' to your application.

What is a CIDR block?

A CIDR block defines a range of IP addresses. For example, 10.0.0.0/16 means 65,536 IP addresses starting from 10.0.0.0. The number after / indicates how many bits identify the network.

How many IPs does AWS reserve per subnet?

AWS reserves 5 IPs in every subnet: Network address, VPC router, DNS server, Future use, and Broadcast address. So a /24 subnet (256 IPs) only has 251 usable IPs.

What is the difference between ECR and ECS?

ECR (Elastic Container Registry) stores Docker images - it's like Docker Hub but private and inside AWS. ECS (Elastic Container Service) runs containers - it pulls images from ECR and runs them.

Introduction

  • Time: ~5 min read
  • Level: Just getting started
  • You'll need: Curiosity, that's it

In this project, we're going to deploy a Flask app to AWS using ECS Fargate and Terraform. I'll walk you through the whole thing - setting up a VPC, understanding CIDR blocks, configuring security groups, and building a CI/CD pipeline with CodePipeline. If you're trying to get into DevOps or just want to actually understand how AWS infrastructure works instead of just clicking around the console, this is what you need.

💬
Real Talk

I want students like me to go through this project so they learn the basics, see how things work at the enterprise level, and build the skills to land DevOps jobs working with Terraform and AWS. My main motto is not to showcase my projects - I want others to grasp the fundamentals of the topics.

There are also IP networking fundamentals in here - IP addresses, CIDR, and so on - that can help you in cybersecurity too.

💡

Key Takeaway

This project is divided into 8 phases (A through H). We'll cover fundamentals first, then build the entire infrastructure step by step.

AWS Architecture Diagram

The architecture at a glance:

Internet (WWW) → Internet Gateway → VPC 10.0.0.0/16

  • Public subnets: 10.0.1.0/24 (us-east-1a) and 10.0.2.0/24 (us-east-1b), hosting the Application Load Balancer on port 80 (HTTP)
  • Private subnets: 10.0.101.0/24 (us-east-1a) and 10.0.102.0/24 (us-east-1b), each running an ECS Fargate task with the Flask app on port 8080
  • ECR repository: proj2-app:latest (the image the tasks pull)
  • CloudWatch: logs and metrics for the whole stack

Fundamentals

Pre-flight Check

  • Time: ~20 min
  • Vibe: Theory first, code later
  • Bring: Basic programming know-how
  • Coffee: Recommended

Imagine you are building the backend for a food delivery app (like Uber Eats).

REST API = The Rulebook (First)

REST says:

  • View all restaurants → GET /restaurants
  • View one restaurant → GET /restaurants/12
  • Add a restaurant → POST /restaurants
  • Update a restaurant → PUT /restaurants/12
  • Delete a restaurant → DELETE /restaurants/12

These rules exist before Flask or FastAPI. They are framework-independent.

Flask's Role (Executor)

When Flask receives this request:

  1. Flask listens for incoming HTTP requests
  2. It checks: Path: /restaurants/12, Method: GET
  3. Flask finds: "Ah, this matches the rule for fetching a restaurant"
  4. Flask runs Python logic that: Reads 12, Fetches restaurant data from database
  5. Flask wraps the result into: JSON response
  6. Flask sends it back to the client

Flask does not invent rules. Flask executes REST rules.
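The dispatch flow above can be sketched as a minimal Flask route. This is illustrative only: the file name app.py and the in-memory RESTAURANTS dictionary are hypothetical stand-ins for a real database.

```shell
# Write a minimal Flask app demonstrating the REST dispatch described above.
# (RESTAURANTS is a hypothetical in-memory stand-in for a database.)
cat > app.py <<'EOF'
from flask import Flask, jsonify

app = Flask(__name__)
RESTAURANTS = {12: {"id": 12, "name": "Example Diner"}}

@app.route("/restaurants/<int:rid>", methods=["GET"])
def get_restaurant(rid):
    # Flask has already matched the path and method; now run the Python logic
    data = RESTAURANTS.get(rid)
    return jsonify(data) if data else (jsonify({"error": "not found"}), 404)

if __name__ == "__main__":
    app.run(port=8080)
EOF
# To try it: python app.py   then: curl http://localhost:8080/restaurants/12
```

Note how the route declaration mirrors the REST rule exactly: Flask executes the rule, it doesn't invent it.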

💡

Key Takeaway

REST does not care which framework you use. REST = Rules & structure. Flask = Executes rules manually. FastAPI = Executes rules with automation.

CIDR Calculator

Calculate IP ranges and AWS-usable addresses

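If you want to check a calculator result by hand, the math is simple shell arithmetic: a /N block holds 2^(32−N) addresses, and AWS keeps 5 of them per subnet.

```shell
# Total addresses in a CIDR block = 2^(32 - prefix); AWS-usable = total - 5.
prefix=24
total=$(( 2 ** (32 - prefix) ))
usable=$(( total - 5 ))
echo "/$prefix -> $total total, $usable usable"   # /24 -> 256 total, 251 usable

prefix=16
echo "/$prefix -> $(( 2 ** (32 - prefix) )) total addresses"   # /16 -> 65536 total addresses
```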

Traffic Flow Animation

Step through how requests flow through your infrastructure

1. Browser → 2. IGW → 3. ALB → 4. Target Group → 5. ECS Fargate → 6. Response

Step 1 of 6: User Request

User opens a browser and navigates to your app URL.

Port: HTTP/80
  • HTTP GET request sent
  • DNS resolves the domain to the ALB's IP
  • Request goes to port 80

Fundamentals Quiz

Question 1 of 3: What is Terraform?

Create Flask Application and Docker Image

Pre-flight Check

  • Time: ~15 min
  • Mode: Hands-on coding
  • Need: Docker running on your machine
  • Outcome: Your first container

Goal

Create a simple web application and package it into a Docker container that can run anywhere - on your laptop, in AWS, or any cloud provider.

What We're Building

A minimal Flask web application with:

  • Homepage endpoint (/) that returns a JSON greeting
  • Health check endpoint (/health) for AWS to monitor the app
  • Request logging with timing data (observability)
Create Project Structure
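The layout and a minimal version of the app can be created from the terminal. This is a sketch: the file contents are abbreviated, and names like proj2 follow this walkthrough's conventions.

```shell
# Professional layout: app code and infrastructure code kept separate.
mkdir -p proj2/app proj2/terraform
cd proj2

# Minimal Flask app with / and /health endpoints (abbreviated sketch).
cat > app/app.py <<'EOF'
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/")
def home():
    return jsonify(message="Hello from ECS Fargate!", service="proj2-app")

@app.route("/health")
def health():
    # The ALB target group's health checks will hit this endpoint
    return jsonify(status="healthy"), 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
EOF

cat > app/requirements.txt <<'EOF'
flask
EOF

# Dockerfile to package the app.
cat > app/Dockerfile <<'EOF'
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
EXPOSE 8080
CMD ["python", "app.py"]
EOF
```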

Docker Commands

Build and Run Docker Container

Test Endpoints
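With the container running, both endpoints can be exercised with curl:

```shell
# Homepage: should return the JSON greeting.
curl -s http://localhost:8080/

# Health check: the endpoint AWS will later use to monitor the app.
curl -s http://localhost:8080/health
```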

View Logs and Cleanup
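Finally, inspect the logs and clean up (the request logging with timing data shows up here):

```shell
# Follow the container logs; Ctrl+C to stop following.
docker logs -f proj2-app

# Cleanup: stop and remove the container.
docker stop proj2-app
docker rm proj2-app
```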

💡

Key Takeaway

This is a professional standard - keeps code organized and maintainable. The app/ folder contains application code. The terraform/ folder contains infrastructure code.

💬
Real Talk

Docker can feel intimidating at first, but once you get your first container running, it clicks. You just packaged your app into something that runs identically everywhere - that's powerful.

AWS CLI Setup + IAM + Budget Alert

Pre-flight Check

  • Time: ~10 min
  • Mode: AWS Console + Terminal
  • Need: An AWS account (Free Tier works)
  • Watch out: Don't skip the budget alert!

In Phase B, we verified our AWS setup before creating any resources. We confirmed that the AWS CLI was properly configured and authenticated as the test_user IAM user by running aws sts get-caller-identity, and we verified the default region was set to us-east-1 (typically the cheapest and most feature-complete AWS region). We created a budget alert called proj2-cost-guardrail with a $10/month limit that sends an email notification when spending exceeds 80% ($8) — a safety net against surprise bills.

Finally, we tested that we had the necessary permissions to access ECR, ECS, and VPC services by running describe/list commands and confirming no "Access Denied" errors. This phase was all about preparation and safety checks — no infrastructure was built yet, just making sure we had the keys, knew where we were going, and had cost protection in place.

Verify AWS CLI Setup
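The verification steps boil down to a few read-only commands (all should succeed without "Access Denied"):

```shell
# Who am I? Confirms credentials and the IAM identity in use.
aws sts get-caller-identity

# Confirm the default region (this walkthrough expects us-east-1).
aws configure get region

# Spot-check permissions: these should return (possibly empty) lists, not errors.
aws ecr describe-repositories
aws ecs list-clusters
aws ec2 describe-vpcs --query 'Vpcs[].VpcId'
```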

💡

Key Takeaway

This phase was all about preparation and safety checks — no infrastructure was built yet, just making sure we had the keys, knew where we were going, and had cost protection in place.

ECR Repository + Push Image

Pre-flight Check

  • Time: ~12 min
  • Mode: Terminal commands
  • Need: Phases A & B done
  • Result: Image in the cloud

Goal

Upload your Docker image from your Mac to AWS ECR so that ECS can later download it and run it in the cloud.

What We're Doing

  1. Create a "repository" in ECR (like creating an album in iCloud)
  2. Log Docker into ECR (authenticate)
  3. Tag image with ECR address (rename for upload)
  4. Push image to ECR (upload)
  5. Verify it uploaded correctly

After This Phase

Your Docker image will be:

  • Stored in AWS
  • Accessible to ECS
  • Scanned for vulnerabilities
  • Encrypted at rest
  • Ready for deployment

What it is (simple): ECR is AWS's Docker image storage. It stores your built container images securely.

Think: ECR = Docker Hub, but private and inside AWS.

What goes into ECR?

  • Docker images (myapp:latest, myapp:v1)
  • Image layers + metadata

What ECR does NOT do

  • It does not run containers
  • It only stores images
Push Image to ECR
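The five steps above map to these commands (a sketch; the repository name proj2-app and region follow this walkthrough's conventions, and your account ID fills in automatically):

```shell
REGION=us-east-1
REPO=proj2-app
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REGISTRY="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"

# 1. Create the repository (scan-on-push enables vulnerability scanning).
aws ecr create-repository --repository-name "$REPO" \
  --image-scanning-configuration scanOnPush=true --region "$REGION"

# 2. Log Docker into ECR.
aws ecr get-login-password --region "$REGION" \
  | docker login --username AWS --password-stdin "$REGISTRY"

# 3. Tag the local image with the ECR address.
docker tag proj2-app:latest "$REGISTRY/$REPO:latest"

# 4. Push (upload) the image.
docker push "$REGISTRY/$REPO:latest"

# 5. Verify it uploaded correctly.
aws ecr describe-images --repository-name "$REPO" --region "$REGION"
```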

💡

Key Takeaway

Now the image lives in ECR. ECS will pull this image when it needs to start a container.

Terraform VPC + ALB Infrastructure

Pre-flight Check

  • Time: ~20 min
  • Mode: Terraform all the way
  • Need: Phase C done + Terraform installed
  • This is: Where it gets real

In Phase D, we used Terraform to provision the core AWS networking infrastructure as code. Instead of manually clicking through the AWS Console, we wrote Terraform configuration files (.tf files) that define our desired infrastructure declaratively. We created a VPC with CIDR block 10.0.0.0/16, two public subnets across different availability zones (us-east-1a and us-east-1b) for high availability, an Internet Gateway to allow internet access, route tables to direct traffic, and two security groups — one for the ALB (allowing HTTP port 80 from the internet) and one for ECS tasks (allowing traffic only from the ALB). We also set up an Application Load Balancer (ALB) with a target group and HTTP listener.

Running terraform init, terraform plan, and terraform apply created all 12 resources automatically. At this point, the ALB returns a 503 error because there are no targets yet — the ECS tasks haven't been deployed. This phase established the network foundation that our containers will run on.

An Application Load Balancer (ALB) is a service that:

Receives HTTP/HTTPS traffic from users and intelligently distributes it to backend services (EC2/ECS tasks) based on rules.

Terraform Resource Tree

Infrastructure dependency graph

  • aws_vpc.main (VPC)
  • aws_internet_gateway.main (Internet Gateway)
  • aws_subnet.public_1 (Public Subnet)
  • aws_subnet.public_2 (Public Subnet)
  • aws_subnet.private_1 (Private Subnet)
  • aws_subnet.private_2 (Private Subnet)
  • aws_security_group.alb (Security Group)
  • aws_security_group.ecs (Security Group)

Terraform Commands
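The Terraform workflow is the same three commands every time (this sketch assumes the .tf files live in terraform/ and that an output named alb_dns_name is defined):

```shell
cd terraform

# Download the AWS provider and initialize local state.
terraform init

# Preview the 12 resources that will be created (read-only, nothing changes yet).
terraform plan

# Create them for real (review the plan, then type 'yes' to confirm).
terraform apply

# Grab the ALB's DNS name (assumes an output named alb_dns_name exists).
terraform output alb_dns_name
```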

💡

Key Takeaway

At this point, the ALB returns a 503 error because there are no targets yet — the ECS tasks haven't been deployed. This phase established the network foundation that our containers will run on.

Infrastructure as Code means you can destroy everything and rebuild it identically with a single command. That's power.

Phase D Quiz

Question 1 of 2: What does terraform plan do?

ECS Fargate Deployment

Pre-flight Check

  • Time: ~15 min
  • Mode: More Terraform magic
  • Need: Phase D infrastructure up
  • Moment: Your app goes live!

In Phase E, we deployed our containerized Flask application to AWS ECS Fargate. We created an IAM role (ecsTaskExecutionRole) that gives ECS permission to pull Docker images from ECR and write logs to CloudWatch. We set up a CloudWatch Log Group (/ecs/proj2-app) to capture container output. We then created an ECS Cluster (proj2-cluster) as a logical grouping for our containers, and a Task Definition that specifies exactly how to run our container — the Docker image from ECR, CPU (256 units = 0.25 vCPU), memory (512MB), port mapping (8080), and logging configuration.

Finally, we created an ECS Service that maintains our desired task count, connects to the ALB target group, and automatically replaces unhealthy tasks. After terraform apply, the ALB health checks pass, and hitting the ALB URL now returns {"message": "Hello from ECS Fargate!", "service": "proj2-app"} — our app is live. The complete request flow is: User → ALB (port 80) → ECS Task (port 8080) → Flask app → response back through the same path.

What it is (simple): ECS is AWS's container runner and manager. It pulls images from ECR and runs them as containers.

Think: ECS = the service that starts, stops, restarts, and scales containers.

Key ECS concepts (quick)

  • Task Definition: blueprint (which image, CPU, memory, port)
  • Task: a running container
  • Service: keeps N tasks running (self-healing)
  • Cluster: logical place where services/tasks live
  • Compute: where tasks run → EC2 or Fargate
💡

Key Takeaway

ECS itself does NOT run containers. ECS only decides what should run and how many. The actual execution happens on: EC2 (you manage) OR Fargate (AWS manages).

Test Your Live App
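Testing the live app is just curl against the ALB's DNS name (ALB_DNS is a placeholder for your own value; the cluster and service names proj2-cluster/proj2-service are assumed to match this walkthrough):

```shell
# Placeholder: substitute the DNS name from `terraform output`.
ALB_DNS=your-alb-dns-name.us-east-1.elb.amazonaws.com

curl -s "http://$ALB_DNS/"        # expect the JSON greeting
curl -s "http://$ALB_DNS/health"  # expect the health-check response

# Confirm the service is actually keeping a task running.
aws ecs describe-services --cluster proj2-cluster --services proj2-service \
  --query 'services[0].runningCount'
```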

💬
Real Talk

When you see your app running on a real public URL for the first time - that feeling never gets old. You built this. It's live. Real users could access it right now.

Monitoring & Alerting

Pre-flight Check

  • Time: ~12 min
  • Mode: AWS Console + Terraform
  • Need: Phase E (app running)
  • Why: Know when things break

In Phase F, we set up monitoring and alerting to know when something goes wrong with our application. We created an SNS (Simple Notification Service) topic and subscribed your email to it, so you receive notifications when alarms trigger. We then created 7 CloudWatch Alarms that monitor critical metrics: high CPU utilization (>80%), high memory utilization (>80%), HTTP 5xx errors (server errors), HTTP 4xx errors (client errors), unhealthy targets in the load balancer, high ALB response time, and low running task count. When any of these thresholds are breached, you get an email alert.

We also built a CloudWatch Dashboard that displays real-time graphs of request count, response times, error rates, CPU/memory usage, healthy host count, and running tasks — giving you a single pane of glass to monitor your application's health. This transforms the setup from "hope it works" to "know it works" — you'll be notified of problems before users even notice them.

Amazon CloudWatch is AWS's service for monitoring, logs, and alerts.

CloudWatch answers 3 questions:

  1. "What is happening right now?" (metrics)
  2. "What happened?" (logs)
  3. "Should I be alerted?" (alarms)

CloudWatch Metrics (numbers over time)

What they are: Automatic measurements AWS collects every few seconds/minutes.

Examples you'll actually see:

  • EC2: CPUUtilization, NetworkIn/Out
  • ALB: RequestCount, TargetResponseTime, HTTP 5xx
  • ECS: CPUUtilization, MemoryUtilization
  • RDS: CPU, FreeStorageSpace
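To make the alarm idea concrete, here is the "high CPU" alarm from this phase expressed as a single CLI call. This is a sketch: the alarm name, cluster/service names, and the SNS topic ARN (with a placeholder account ID) follow this walkthrough's conventions.

```shell
# Alarm: average ECS service CPU above 80% for 3 consecutive 60-second periods
# notifies the SNS topic (ARN below uses a placeholder account ID).
aws cloudwatch put-metric-alarm \
  --alarm-name proj2-high-cpu \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=proj2-cluster Name=ServiceName,Value=proj2-service \
  --statistic Average \
  --period 60 \
  --evaluation-periods 3 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:proj2-alerts
```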
💡

Key Takeaway

This transforms the setup from "hope it works" to "know it works" — you'll be notified of problems before users even notice them.

CI/CD Pipeline with AWS CodePipeline

Pre-flight Check

  • Time: ~15 min
  • Mode: AWS Console setup
  • Need: Phase F + GitHub account
  • Level up: Push to deploy

In Phase G, we built an automated CI/CD pipeline using AWS CodePipeline and CodeBuild so that code changes automatically deploy to production. We created a CodeStar Connection to link AWS with your GitHub repository (requires one-time manual authorization in the AWS Console). We set up an S3 bucket to store pipeline artifacts, and created IAM roles with the necessary permissions for CodeBuild and CodePipeline to access ECR, ECS, and other services. We wrote a buildspec.yml file that tells CodeBuild exactly what to do: log into ECR, build the Docker image from the app/ directory, tag it with the commit hash, push it to ECR, and generate an imagedefinitions.json file that tells ECS which image to deploy.

The pipeline has three stages: Source (pulls code from GitHub), Build (runs CodeBuild), and Deploy (updates the ECS service with the new image using a rolling deployment). Now, whenever you git push to the main branch, the entire build-and-deploy process runs automatically in about 5-7 minutes — no manual steps required.

1) CodeCommit — "Where your code lives" (Git repo)

AWS's managed Git repository (like GitHub, GitLab). So CodeCommit = source code storage + trigger

2) CodeBuild — "Build + test + create artifacts"

A managed build server that: installs dependencies, runs tests, builds Docker images, produces "artifacts" (outputs). So CodeBuild = compiler + tester + docker builder

3) CodeDeploy — "Deployment automation"

A deployment service that can deploy to: EC2 (classic), Lambda, ECS (blue/green deployments). So CodeDeploy = deployment engine, especially for EC2 and advanced ECS strategies

4) CodePipeline — "The conductor / orchestrator"

CodePipeline connects the whole CI/CD flow into stages. It doesn't "build" or "deploy" by itself. It calls other services in order. So CodePipeline = the workflow manager

Pipeline Flow
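The buildspec.yml described above boils down to the following build-stage commands. This is a sketch: ACCOUNT_ID is assumed to come from a CodeBuild environment variable, and the container name "proj2-app" must match whatever name the task definition uses.

```shell
# Build-stage commands, as a buildspec.yml would run them inside CodeBuild.
# ACCOUNT_ID is assumed to be set as a CodeBuild environment variable;
# CODEBUILD_RESOLVED_SOURCE_VERSION is provided by CodeBuild itself.
REGISTRY="$ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com"
ECR_URI="$REGISTRY/proj2-app"
TAG="${CODEBUILD_RESOLVED_SOURCE_VERSION:0:7}"   # short commit hash

# Log into ECR, build from app/, tag with the commit hash, push.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin "$REGISTRY"
docker build -t "$ECR_URI:$TAG" ./app
docker push "$ECR_URI:$TAG"

# Tell ECS which image to deploy; "proj2-app" must match the container
# name in the task definition.
printf '[{"name":"proj2-app","imageUri":"%s"}]' "$ECR_URI:$TAG" > imagedefinitions.json
```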

💡

Key Takeaway

Now, whenever you git push to the main branch, the entire build-and-deploy process runs automatically — no manual steps required.

💬
Real Talk

CI/CD is the turning point from "I'm learning" to "I'm building like a professional." Automated deployments are how real teams ship code.

Auto-Scaling with Fargate Spot

Pre-flight Check

  • Time: ~12 min
  • Mode: Terraform + cost optimization
  • Need: Everything up to Phase G
  • Boss level: Production-ready infra

In Phase H, we implemented auto-scaling and cost optimization for our ECS service. We configured Fargate Spot as a capacity provider, which uses AWS's spare compute capacity at up to a 70% discount compared to regular Fargate — with the tradeoff that tasks can be interrupted with two minutes' notice (we keep at least 1 task on regular Fargate for reliability). We set up Application Auto Scaling with two target tracking policies: one that scales based on CPU utilization (target 70%) and another based on ALB request count (target 100 requests per target per minute). When load increases, tasks scale out quickly (60-second cooldown); when load decreases, tasks scale in slowly (300-second cooldown) to avoid flapping.

We also added scheduled scaling to save costs: at 10 PM UTC the service scales to 0 tasks (complete shutdown), and at 8 AM UTC it scales back up to 1-4 tasks. The capacity provider strategy uses an 80/20 split — 80% Fargate Spot, 20% regular Fargate. Combined, these optimizations reduce compute costs by roughly 83% compared to running regular Fargate 24/7.
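The project configures this in Terraform, but the same target-tracking setup can be sketched with the AWS CLI (cluster/service names and policy name follow this walkthrough's conventions):

```shell
# Register the ECS service as a scalable target (1 to 4 tasks).
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/proj2-cluster/proj2-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 1 --max-capacity 4

# Target-tracking policy: keep average CPU around 70%, with fast
# scale-out (60s) and slow scale-in (300s) to avoid flapping.
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/proj2-cluster/proj2-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-target-70 \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"},
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }'
```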

Cost Optimization

  • Regular Fargate 24/7 → Fargate Spot (70% discount)
  • Fixed task count → Auto-scaling based on CPU/requests
  • Running at night → Scheduled scaling to 0 at night
  • 100% Fargate → 80/20 Spot/Regular split

💡

Key Takeaway

Combined, these optimizations reduce compute costs by roughly 83% compared to running regular Fargate 24/7.

Final Quiz

Question 1 of 3: What is Fargate Spot?

Congratulations!

You've successfully built a production-grade AWS infrastructure with:

  • Flask API containerized with Docker
  • VPC with public/private subnets across 2 AZs
  • Application Load Balancer with health checks
  • ECS Fargate running your containers serverlessly
  • CloudWatch monitoring with 7 alarms
  • Automated CI/CD pipeline with CodePipeline
  • Auto-scaling with Fargate Spot for cost optimization

This is the same kind of infrastructure real companies run in production. You now have the skills employers look for!