CI/CD Pipeline and CloudWatch Monitoring

AWS ECS Fargate Deployment with Terraform

What You'll Build

  • Manual AWS Console clicks → Infrastructure as Code with Terraform
  • No networking knowledge → VPC, Subnets, CIDR mastery
  • Docker on localhost only → Containers running in AWS ECS
  • No CI/CD pipeline → Automated deployments with CodePipeline
  • No monitoring → CloudWatch alerts and dashboards
  • Fixed capacity → Auto-scaling with Fargate Spot

View Source Code

Explore the complete Terraform configurations, Flask application, and CI/CD setup on GitHub.


Key Concepts Flashcards

Click any card to flip and reveal the definition.

What is Terraform?

A tool that lets you create, change, and manage infrastructure using code instead of manual clicks. It allows you to define your cloud infrastructure in code and automatically provision it across platforms like AWS, Azure, and GCP.

What is a VPC?

A Virtual Private Cloud - your own private network inside AWS where you control IP addresses, subnets, routing, and security. Think of it as your own isolated data-center network in the cloud.

What is the difference between Public and Private Subnets?

A public subnet has a route to the Internet Gateway (0.0.0.0/0 -> IGW) and can reach the internet directly. A private subnet has no route to IGW and cannot be reached directly from the internet.

What is ECS Fargate?

AWS's serverless container service. You tell ECS what to run (container image, CPU, memory) and AWS handles all the underlying infrastructure. You never see or manage EC2 instances.

What is an Application Load Balancer (ALB)?

A service that receives HTTP/HTTPS traffic from users and intelligently distributes it to backend services (EC2/ECS tasks) based on rules. It acts as the 'front door' to your application.

What is a CIDR block?

A CIDR block defines a range of IP addresses. For example, 10.0.0.0/16 means 65,536 IP addresses starting from 10.0.0.0. The number after / indicates how many bits identify the network.

How many IPs does AWS reserve per subnet?

AWS reserves 5 IPs in every subnet: Network address, VPC router, DNS server, Future use, and Broadcast address. So a /24 subnet (256 IPs) only has 251 usable IPs.

What is the difference between ECR and ECS?

ECR (Elastic Container Registry) stores Docker images - it's like Docker Hub but private and inside AWS. ECS (Elastic Container Service) runs containers - it pulls images from ECR and runs them.

Introduction

  • Time: ~5 min read
  • Level: Just getting started
  • You'll need: Curiosity, that's it

In this project, we're going to deploy a Flask app to AWS using ECS Fargate and Terraform. I'll walk you through the whole thing - setting up a VPC, understanding CIDR blocks, configuring security groups, and building a CI/CD pipeline with CodePipeline. If you're trying to get into DevOps or just want to actually understand how AWS infrastructure works instead of just clicking around the console, this is what you need.

💬
Real Talk

I want students like me to go through this project so they learn the basics, see how things work at the enterprise level, and build the skills to land DevOps jobs working with Terraform and AWS. My main motto is not to showcase my projects - I want others to grasp the fundamentals of the topics.

There are also IP networking fundamentals in here - IP addresses, CIDR, and so on - that can help you in cybersecurity too.

💡

Key Takeaway

This project is divided into 8 phases (A through H). We'll cover fundamentals first, then build the entire infrastructure step by step.

AWS Architecture Diagram

The architecture at a glance:

Internet (WWW) → Internet Gateway → VPC 10.0.0.0/16

  • Public subnets: 10.0.1.0/24 (us-east-1a) and 10.0.2.0/24 (us-east-1b), hosting the Application Load Balancer on port 80 (HTTP)
  • Private subnets: 10.0.101.0/24 (us-east-1a) and 10.0.102.0/24 (us-east-1b), each running an ECS Fargate task with the Flask app on port 8080
  • ECR repository: proj2-app:latest (the image the tasks pull)
  • CloudWatch: logs and metrics for the whole stack

Fundamentals

Pre-flight Check

  • Time: ~20 min
  • Vibe: Theory first, code later
  • Bring: Basic programming know-how
  • Coffee: Recommended

Imagine you are building the backend for a food delivery app (like Uber Eats).

REST API = The Rulebook (First)

REST says:

  • View all restaurants → GET /restaurants
  • View one restaurant → GET /restaurants/12
  • Add a restaurant → POST /restaurants
  • Update a restaurant → PUT /restaurants/12
  • Delete a restaurant → DELETE /restaurants/12

These rules exist before Flask or FastAPI. They are framework-independent.

Flask's Role (Executor)

When Flask receives this request:

  1. Flask listens for incoming HTTP requests
  2. It checks: Path: /restaurants/12, Method: GET
  3. Flask finds: "Ah, this matches the rule for fetching a restaurant"
  4. Flask runs Python logic that: Reads 12, Fetches restaurant data from database
  5. Flask wraps the result into: JSON response
  6. Flask sends it back to the client

Flask does not invent rules. Flask executes REST rules.
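The dispatch flow above can be sketched as a minimal Flask route. This is illustrative only: the file name app.py and the in-memory RESTAURANTS dictionary are hypothetical stand-ins for a real database.

```shell
# Write a minimal Flask app demonstrating the REST dispatch described above.
# (RESTAURANTS is a hypothetical in-memory stand-in for a database.)
cat > app.py <<'EOF'
from flask import Flask, jsonify

app = Flask(__name__)
RESTAURANTS = {12: {"id": 12, "name": "Example Diner"}}

@app.route("/restaurants/<int:rid>", methods=["GET"])
def get_restaurant(rid):
    # Flask has already matched the path and method; now run the Python logic
    data = RESTAURANTS.get(rid)
    return jsonify(data) if data else (jsonify({"error": "not found"}), 404)

if __name__ == "__main__":
    app.run(port=8080)
EOF
# To try it: python app.py   then: curl http://localhost:8080/restaurants/12
```

Note how the route declaration mirrors the REST rule exactly: Flask executes the rule, it doesn't invent it.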

💡

Key Takeaway

REST does not care which framework you use. REST = Rules & structure. Flask = Executes rules manually. FastAPI = Executes rules with automation.

CIDR Calculator

Calculate IP ranges and AWS-usable addresses

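If you want to check a calculator result by hand, the math is simple shell arithmetic: a /N block holds 2^(32−N) addresses, and AWS keeps 5 of them per subnet.

```shell
# Total addresses in a CIDR block = 2^(32 - prefix); AWS-usable = total - 5.
prefix=24
total=$(( 2 ** (32 - prefix) ))
usable=$(( total - 5 ))
echo "/$prefix -> $total total, $usable usable"   # /24 -> 256 total, 251 usable

prefix=16
echo "/$prefix -> $(( 2 ** (32 - prefix) )) total addresses"   # /16 -> 65536 total addresses
```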

Traffic Flow Animation

Step through how requests flow through your infrastructure

1. Browser → 2. IGW → 3. ALB → 4. Target Group → 5. ECS Fargate → 6. Response

Step 1 of 6: User Request

User opens a browser and navigates to your app URL.

Port: HTTP/80
  • HTTP GET request sent
  • DNS resolves the domain to the ALB's IP
  • Request goes to port 80

Fundamentals Quiz

Question 1 of 3: What is Terraform?

Create Flask Application and Docker Image

Pre-flight Check

  • Time: ~15 min
  • Mode: Hands-on coding
  • Need: Docker running on your machine
  • Outcome: Your first container

Goal

Create a simple web application and package it into a Docker container that can run anywhere - on your laptop, in AWS, or any cloud provider.

What We're Building

A minimal Flask web application with:

  • Homepage endpoint (/) that returns a JSON greeting
  • Health check endpoint (/health) for AWS to monitor the app
  • Request logging with timing data (observability)
Create Project Structure
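The layout and a minimal version of the app can be created from the terminal. This is a sketch: the file contents are abbreviated, and names like proj2 follow this walkthrough's conventions.

```shell
# Professional layout: app code and infrastructure code kept separate.
mkdir -p proj2/app proj2/terraform
cd proj2

# Minimal Flask app with / and /health endpoints (abbreviated sketch).
cat > app/app.py <<'EOF'
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/")
def home():
    return jsonify(message="Hello from ECS Fargate!", service="proj2-app")

@app.route("/health")
def health():
    # The ALB target group's health checks will hit this endpoint
    return jsonify(status="healthy"), 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
EOF

cat > app/requirements.txt <<'EOF'
flask
EOF

# Dockerfile to package the app.
cat > app/Dockerfile <<'EOF'
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
EXPOSE 8080
CMD ["python", "app.py"]
EOF
```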

Docker Commands

Build and Run Docker Container

Test Endpoints
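With the container running, both endpoints can be exercised with curl:

```shell
# Homepage: should return the JSON greeting.
curl -s http://localhost:8080/

# Health check: the endpoint AWS will later use to monitor the app.
curl -s http://localhost:8080/health
```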

View Logs and Cleanup
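Finally, inspect the logs and clean up (the request logging with timing data shows up here):

```shell
# Follow the container logs; Ctrl+C to stop following.
docker logs -f proj2-app

# Cleanup: stop and remove the container.
docker stop proj2-app
docker rm proj2-app
```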

💡

Key Takeaway

This is a professional standard - keeps code organized and maintainable. The app/ folder contains application code. The terraform/ folder contains infrastructure code.

💬
Real Talk

Docker can feel intimidating at first, but once you get your first container running, it clicks. You just packaged your app into something that runs identically everywhere - that's powerful.

AWS CLI Setup + IAM + Budget Alert

Pre-flight Check

  • Time: ~10 min
  • Mode: AWS Console + Terminal
  • Need: An AWS account (Free Tier works)
  • Watch out: Don't skip the budget alert!

In Phase B, we verified our AWS setup before creating any resources. We confirmed that the AWS CLI was properly configured and authenticated as the test_user IAM user by running aws sts get-caller-identity, and we verified the default region was set to us-east-1 (typically the cheapest and most feature-complete AWS region). We created a budget alert called proj2-cost-guardrail with a $10/month limit that sends an email notification when spending exceeds 80% ($8) — a safety net against surprise bills.

Finally, we tested that we had the necessary permissions to access ECR, ECS, and VPC services by running describe/list commands and confirming no "Access Denied" errors. This phase was all about preparation and safety checks — no infrastructure was built yet, just making sure we had the keys, knew where we were going, and had cost protection in place.

Verify AWS CLI Setup
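The verification steps boil down to a few read-only commands (all should succeed without "Access Denied"):

```shell
# Who am I? Confirms credentials and the IAM identity in use.
aws sts get-caller-identity

# Confirm the default region (this walkthrough expects us-east-1).
aws configure get region

# Spot-check permissions: these should return (possibly empty) lists, not errors.
aws ecr describe-repositories
aws ecs list-clusters
aws ec2 describe-vpcs --query 'Vpcs[].VpcId'
```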

💡

Key Takeaway

This phase was all about preparation and safety checks — no infrastructure was built yet, just making sure we had the keys, knew where we were going, and had cost protection in place.

ECR Repository + Push Image

Pre-flight Check

  • Time: ~12 min
  • Mode: Terminal commands
  • Need: Phases A & B done
  • Result: Image in the cloud

Goal

Upload your Docker image from your Mac to AWS ECR so that ECS can later download it and run it in the cloud.

What We're Doing

  1. Create a "repository" in ECR (like creating an album in iCloud)
  2. Log Docker into ECR (authenticate)
  3. Tag image with ECR address (rename for upload)
  4. Push image to ECR (upload)
  5. Verify it uploaded correctly

After This Phase

Your Docker image will be:

  • Stored in AWS
  • Accessible to ECS
  • Scanned for vulnerabilities
  • Encrypted at rest
  • Ready for deployment

What it is (simple): ECR is AWS's Docker image storage. It stores your built container images securely.

Think: ECR = Docker Hub, but private and inside AWS.

What goes into ECR?

  • Docker images (myapp:latest, myapp:v1)
  • Image layers + metadata

What ECR does NOT do

  • It does not run containers
  • It only stores images
Push Image to ECR
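The five steps above map to these commands (a sketch; the repository name proj2-app and region follow this walkthrough's conventions, and your account ID fills in automatically):

```shell
REGION=us-east-1
REPO=proj2-app
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
REGISTRY="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"

# 1. Create the repository (scan-on-push enables vulnerability scanning).
aws ecr create-repository --repository-name "$REPO" \
  --image-scanning-configuration scanOnPush=true --region "$REGION"

# 2. Log Docker into ECR.
aws ecr get-login-password --region "$REGION" \
  | docker login --username AWS --password-stdin "$REGISTRY"

# 3. Tag the local image with the ECR address.
docker tag proj2-app:latest "$REGISTRY/$REPO:latest"

# 4. Push (upload) the image.
docker push "$REGISTRY/$REPO:latest"

# 5. Verify it uploaded correctly.
aws ecr describe-images --repository-name "$REPO" --region "$REGION"
```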

💡

Key Takeaway

Now the image lives in ECR. ECS will pull this image when it needs to start a container.

Terraform VPC + ALB Infrastructure

Pre-flight Check

  • Time: ~20 min
  • Mode: Terraform all the way
  • Need: Phase C done + Terraform installed
  • This is: Where it gets real

In Phase D, we used Terraform to provision the core AWS networking infrastructure as code. Instead of manually clicking through the AWS Console, we wrote Terraform configuration files (.tf files) that define our desired infrastructure declaratively. We created a VPC with CIDR block 10.0.0.0/16, two public subnets across different availability zones (us-east-1a and us-east-1b) for high availability, an Internet Gateway to allow internet access, route tables to direct traffic, and two security groups — one for the ALB (allowing HTTP port 80 from the internet) and one for ECS tasks (allowing traffic only from the ALB). We also set up an Application Load Balancer (ALB) with a target group and HTTP listener.

Running terraform init, terraform plan, and terraform apply created all 12 resources automatically. At this point, the ALB returns a 503 error because there are no targets yet — the ECS tasks haven't been deployed. This phase established the network foundation that our containers will run on.

An Application Load Balancer (ALB) is a service that:

Receives HTTP/HTTPS traffic from users and intelligently distributes it to backend services (EC2/ECS tasks) based on rules.

Terraform Resource Tree

Infrastructure dependency graph

  • aws_vpc.main (VPC)
  • aws_internet_gateway.main (Internet Gateway)
  • aws_subnet.public_1 (Public Subnet)
  • aws_subnet.public_2 (Public Subnet)
  • aws_subnet.private_1 (Private Subnet)
  • aws_subnet.private_2 (Private Subnet)
  • aws_security_group.alb (Security Group)
  • aws_security_group.ecs (Security Group)

Terraform Commands
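The Terraform workflow is the same three commands every time (this sketch assumes the .tf files live in terraform/ and that an output named alb_dns_name is defined):

```shell
cd terraform

# Download the AWS provider and initialize local state.
terraform init

# Preview the 12 resources that will be created (read-only, nothing changes yet).
terraform plan

# Create them for real (review the plan, then type 'yes' to confirm).
terraform apply

# Grab the ALB's DNS name (assumes an output named alb_dns_name exists).
terraform output alb_dns_name
```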

💡

Key Takeaway

At this point, the ALB returns a 503 error because there are no targets yet — the ECS tasks haven't been deployed. This phase established the network foundation that our containers will run on.

Infrastructure as Code means you can destroy everything and rebuild it identically with a single command. That's power.

Phase D Quiz

Question 1 of 2: What does terraform plan do?

ECS Fargate Deployment

Pre-flight Check

  • Time: ~15 min
  • Mode: More Terraform magic
  • Need: Phase D infrastructure up
  • Moment: Your app goes live!

In Phase E, we deployed our containerized Flask application to AWS ECS Fargate. We created an IAM role (ecsTaskExecutionRole) that gives ECS permission to pull Docker images from ECR and write logs to CloudWatch. We set up a CloudWatch Log Group (/ecs/proj2-app) to capture container output. We then created an ECS Cluster (proj2-cluster) as a logical grouping for our containers, and a Task Definition that specifies exactly how to run our container — the Docker image from ECR, CPU (256 units = 0.25 vCPU), memory (512MB), port mapping (8080), and logging configuration.

Finally, we created an ECS Service that maintains our desired task count, connects to the ALB target group, and automatically replaces unhealthy tasks. After terraform apply, the ALB health checks pass, and hitting the ALB URL now returns {"message": "Hello from ECS Fargate!", "service": "proj2-app"} — our app is live. The complete request flow is: User → ALB (port 80) → ECS Task (port 8080) → Flask app → response back through the same path.

What it is (simple): ECS is AWS's container runner and manager. It pulls images from ECR and runs them as containers.

Think: ECS = the service that starts, stops, restarts, and scales containers.

Key ECS concepts (quick)

  • Task Definition: blueprint (which image, CPU, memory, port)
  • Task: a running container
  • Service: keeps N tasks running (self-healing)
  • Cluster: logical place where services/tasks live
  • Compute: where tasks run → EC2 or Fargate
💡

Key Takeaway

ECS itself does NOT run containers. ECS only decides what should run and how many. The actual execution happens on: EC2 (you manage) OR Fargate (AWS manages).

Test Your Live App
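Testing the live app is just curl against the ALB's DNS name (ALB_DNS is a placeholder for your own value; the cluster and service names proj2-cluster/proj2-service are assumed to match this walkthrough):

```shell
# Placeholder: substitute the DNS name from `terraform output`.
ALB_DNS=your-alb-dns-name.us-east-1.elb.amazonaws.com

curl -s "http://$ALB_DNS/"        # expect the JSON greeting
curl -s "http://$ALB_DNS/health"  # expect the health-check response

# Confirm the service is actually keeping a task running.
aws ecs describe-services --cluster proj2-cluster --services proj2-service \
  --query 'services[0].runningCount'
```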

💬
Real Talk

When you see your app running on a real public URL for the first time - that feeling never gets old. You built this. It's live. Real users could access it right now.

Monitoring & Alerting

Pre-flight Check

  • Time: ~12 min
  • Mode: AWS Console + Terraform
  • Need: Phase E (app running)
  • Why: Know when things break

In Phase F, we set up monitoring and alerting to know when something goes wrong with our application. We created an SNS (Simple Notification Service) topic and subscribed your email to it, so you receive notifications when alarms trigger. We then created 7 CloudWatch Alarms that monitor critical metrics: high CPU utilization (>80%), high memory utilization (>80%), HTTP 5xx errors (server errors), HTTP 4xx errors (client errors), unhealthy targets in the load balancer, high ALB response time, and low running task count. When any of these thresholds are breached, you get an email alert.

We also built a CloudWatch Dashboard that displays real-time graphs of request count, response times, error rates, CPU/memory usage, healthy host count, and running tasks — giving you a single pane of glass to monitor your application's health. This transforms the setup from "hope it works" to "know it works" — you'll be notified of problems before users even notice them.

Amazon CloudWatch is AWS's service for monitoring, logs, and alerts.

CloudWatch answers 3 questions:

  1. "What is happening right now?" (metrics)
  2. "What happened?" (logs)
  3. "Should I be alerted?" (alarms)

CloudWatch Metrics (numbers over time)

What they are: Automatic measurements AWS collects every few seconds/minutes.

Examples you'll actually see:

  • EC2: CPUUtilization, NetworkIn/Out
  • ALB: RequestCount, TargetResponseTime, HTTP 5xx
  • ECS: CPUUtilization, MemoryUtilization
  • RDS: CPU, FreeStorageSpace
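To make the alarm idea concrete, here is the "high CPU" alarm from this phase expressed as a single CLI call. This is a sketch: the alarm name, cluster/service names, and the SNS topic ARN (with a placeholder account ID) follow this walkthrough's conventions.

```shell
# Alarm: average ECS service CPU above 80% for 3 consecutive 60-second periods
# notifies the SNS topic (ARN below uses a placeholder account ID).
aws cloudwatch put-metric-alarm \
  --alarm-name proj2-high-cpu \
  --namespace AWS/ECS \
  --metric-name CPUUtilization \
  --dimensions Name=ClusterName,Value=proj2-cluster Name=ServiceName,Value=proj2-service \
  --statistic Average \
  --period 60 \
  --evaluation-periods 3 \
  --threshold 80 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:proj2-alerts
```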
💡

Key Takeaway

This transforms the setup from "hope it works" to "know it works" — you'll be notified of problems before users even notice them.

CI/CD Pipeline with AWS CodePipeline

Pre-flight Check

  • Time: ~15 min
  • Mode: AWS Console setup
  • Need: Phase F + GitHub account
  • Level up: Push to deploy

In Phase G, we built an automated CI/CD pipeline using AWS CodePipeline and CodeBuild so that code changes automatically deploy to production. We created a CodeStar Connection to link AWS with your GitHub repository (requires one-time manual authorization in the AWS Console). We set up an S3 bucket to store pipeline artifacts, and created IAM roles with the necessary permissions for CodeBuild and CodePipeline to access ECR, ECS, and other services. We wrote a buildspec.yml file that tells CodeBuild exactly what to do: log into ECR, build the Docker image from the app/ directory, tag it with the commit hash, push it to ECR, and generate an imagedefinitions.json file that tells ECS which image to deploy.

The pipeline has three stages: Source (pulls code from GitHub), Build (runs CodeBuild), and Deploy (updates the ECS service with the new image using a rolling deployment). Now, whenever you git push to the main branch, the entire build-and-deploy process runs automatically in about 5-7 minutes — no manual steps required.

1) CodeCommit — "Where your code lives" (Git repo)

AWS's managed Git repository (like GitHub, GitLab). So CodeCommit = source code storage + trigger

2) CodeBuild — "Build + test + create artifacts"

A managed build server that: installs dependencies, runs tests, builds Docker images, produces "artifacts" (outputs). So CodeBuild = compiler + tester + docker builder

3) CodeDeploy — "Deployment automation"

A deployment service that can deploy to: EC2 (classic), Lambda, ECS (blue/green deployments). So CodeDeploy = deployment engine, especially for EC2 and advanced ECS strategies

4) CodePipeline — "The conductor / orchestrator"

CodePipeline connects the whole CI/CD flow into stages. It doesn't "build" or "deploy" by itself. It calls other services in order. So CodePipeline = the workflow manager

Pipeline Flow
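The buildspec.yml described above boils down to the following build-stage commands. This is a sketch: ACCOUNT_ID is assumed to come from a CodeBuild environment variable, and the container name "proj2-app" must match whatever name the task definition uses.

```shell
# Build-stage commands, as a buildspec.yml would run them inside CodeBuild.
# ACCOUNT_ID is assumed to be set as a CodeBuild environment variable;
# CODEBUILD_RESOLVED_SOURCE_VERSION is provided by CodeBuild itself.
REGISTRY="$ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com"
ECR_URI="$REGISTRY/proj2-app"
TAG="${CODEBUILD_RESOLVED_SOURCE_VERSION:0:7}"   # short commit hash

# Log into ECR, build from app/, tag with the commit hash, push.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin "$REGISTRY"
docker build -t "$ECR_URI:$TAG" ./app
docker push "$ECR_URI:$TAG"

# Tell ECS which image to deploy; "proj2-app" must match the container
# name in the task definition.
printf '[{"name":"proj2-app","imageUri":"%s"}]' "$ECR_URI:$TAG" > imagedefinitions.json
```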

💡

Key Takeaway

Now, whenever you git push to the main branch, the entire build-and-deploy process runs automatically — no manual steps required.

💬
Real Talk

CI/CD is the turning point from "I'm learning" to "I'm building like a professional." Automated deployments are how real teams ship code.

Auto-Scaling with Fargate Spot

Pre-flight Check

  • Time: ~12 min
  • Mode: Terraform + cost optimization
  • Need: Everything up to Phase G
  • Boss level: Production-ready infra

In Phase H, we implemented auto-scaling and cost optimization for our ECS service. We configured Fargate Spot as a capacity provider, which uses AWS's spare compute capacity at up to a 70% discount compared to regular Fargate — with the tradeoff that tasks can be interrupted with two minutes' notice (we keep at least 1 task on regular Fargate for reliability). We set up Application Auto Scaling with two target tracking policies: one that scales based on CPU utilization (target 70%) and another based on ALB request count (target 100 requests per target per minute). When load increases, tasks scale out quickly (60-second cooldown); when load decreases, tasks scale in slowly (300-second cooldown) to avoid flapping.

We also added scheduled scaling to save costs: at 10 PM UTC the service scales to 0 tasks (complete shutdown), and at 8 AM UTC it scales back up to 1-4 tasks. The capacity provider strategy uses an 80/20 split — 80% Fargate Spot, 20% regular Fargate. Combined, these optimizations reduce compute costs by roughly 83% compared to running regular Fargate 24/7.
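The project configures this in Terraform, but the same target-tracking setup can be sketched with the AWS CLI (cluster/service names and policy name follow this walkthrough's conventions):

```shell
# Register the ECS service as a scalable target (1 to 4 tasks).
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --resource-id service/proj2-cluster/proj2-service \
  --scalable-dimension ecs:service:DesiredCount \
  --min-capacity 1 --max-capacity 4

# Target-tracking policy: keep average CPU around 70%, with fast
# scale-out (60s) and slow scale-in (300s) to avoid flapping.
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/proj2-cluster/proj2-service \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-target-70 \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 70.0,
    "PredefinedMetricSpecification": {"PredefinedMetricType": "ECSServiceAverageCPUUtilization"},
    "ScaleOutCooldown": 60,
    "ScaleInCooldown": 300
  }'
```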

Cost Optimization

  • Regular Fargate 24/7 → Fargate Spot (70% discount)
  • Fixed task count → Auto-scaling based on CPU/requests
  • Running at night → Scheduled scaling to 0 at night
  • 100% Fargate → 80/20 Spot/Regular split

💡

Key Takeaway

Combined, these optimizations reduce compute costs by roughly 83% compared to running regular Fargate 24/7.

Final Quiz

Question 1 of 3: What is Fargate Spot?

Congratulations!

You've successfully built a production-grade AWS infrastructure with:

  • Flask API containerized with Docker
  • VPC with public/private subnets across 2 AZs
  • Application Load Balancer with health checks
  • ECS Fargate running your containers serverlessly
  • CloudWatch monitoring with 7 alarms
  • Automated CI/CD pipeline with CodePipeline
  • Auto-scaling with Fargate Spot for cost optimization

This is the same kind of infrastructure real companies run in production. You now have the skills employers look for!