Amazon ECS Essentials
Quick reference guide for Amazon Elastic Container Service (Amazon ECS).
What is Amazon ECS?
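Amazon ECS is a fully managed container orchestration service that runs, stops, and scales containers on a cluster, using either AWS Fargate or your own EC2 instances.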
Core Concepts
Architecture Components
ECS Cluster
├── Services
│ └── Tasks (running containers)
├── Task Definitions (container blueprint)
├── Container Instances (EC2 launch type)
└── Fargate (serverless launch type)
Key Components
Launch Types
Fargate vs EC2
| Feature | Fargate | EC2 |
|---|---|---|
| Infrastructure | Serverless (AWS managed) | Self-managed EC2 instances |
| Pricing | Pay per task (vCPU + memory) | Pay for EC2 instances |
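| Scaling | No instances to scale; service auto scaling adjusts task count | ASG scales instances; service auto scaling adjusts task count |
| Use Case | Microservices, batch jobs | Cost optimization, custom needs |
| Maintenance | No host patching (AWS manages the hosts) | Patch OS, manage instances |
| Startup Time | Typically slower (~1 min) | Typically faster on already-running instances (~10 sec) |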
Task Definitions
Basic Task Definition
{
"family": "my-app",
"executionRoleArn": "arn:aws:iam::account:role/execution-role",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "256",
"memory": "512",
"containerDefinitions": [
{
"name": "app",
"image": "nginx:latest",
"portMappings": [
{
"containerPort": 80,
"protocol": "tcp"
}
],
"essential": true,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/my-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
}
}
]
}
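On Fargate, the task-level `cpu` and `memory` values must form one of a fixed set of supported pairs. The sketch below checks a task definition against a table of the commonly documented combinations. The table is an assumption that may lag behind the current ECS documentation, and `is_valid_fargate_size` is a hypothetical helper, not part of any AWS SDK.

```python
# Hypothetical helper: validates the task-level cpu/memory pair for Fargate.
# Keys are CPU units, values are allowed memory sizes in MiB. The table is an
# assumption based on commonly documented combinations; verify it against the
# current ECS documentation before relying on it.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
    8192: list(range(16384, 61440 + 1, 4096)),
    16384: list(range(32768, 122880 + 1, 8192)),
}


def is_valid_fargate_size(task_definition: dict) -> bool:
    """Return True if the cpu/memory pair appears in FARGATE_SIZES."""
    cpu = int(task_definition["cpu"])
    memory = int(task_definition["memory"])
    return memory in FARGATE_SIZES.get(cpu, [])


# The basic task definition above uses cpu "256" and memory "512".
basic = {"family": "my-app", "cpu": "256", "memory": "512"}
print(is_valid_fargate_size(basic))
```

Running a check like this before `register-task-definition` can catch a mismatched pair locally instead of at registration time.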
Multi-Container Task
{
"family": "web-app-with-sidecar",
"executionRoleArn": "arn:aws:iam::account:role/execution-role",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"containerDefinitions": [
{
"name": "web-app",
"image": "myapp:latest",
"portMappings": [
{
"containerPort": 8080,
"protocol": "tcp"
}
],
"essential": true,
"environment": [
{
"name": "DATABASE_URL",
"value": "postgres://db:5432/myapp"
}
],
"secrets": [
{
"name": "DB_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:region:account:secret:db-password"
}
],
"dependsOn": [
{
"containerName": "log-router",
"condition": "START"
}
]
},
{
"name": "log-router",
"image": "fluent/fluentd:latest",
"essential": false,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/fluentd",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "logs"
}
}
}
]
}
Task Definition with Health Check
{
"containerDefinitions": [
{
"name": "app",
"image": "myapp:latest",
"healthCheck": {
"command": [
"CMD-SHELL",
"curl -f http://localhost/health || exit 1"
],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
},
"portMappings": [
{
"containerPort": 80
}
]
}
]
}
AWS CLI Commands
Cluster Management
# ========== Create Cluster ==========
aws ecs create-cluster --cluster-name my-cluster
# ========== List Clusters ==========
aws ecs list-clusters
# ========== Describe Cluster ==========
aws ecs describe-clusters --clusters my-cluster
# ========== Delete Cluster ==========
aws ecs delete-cluster --cluster my-cluster
Task Definitions
# ========== Register Task Definition ==========
aws ecs register-task-definition \
--cli-input-json file://task-definition.json
# ========== List Task Definitions ==========
aws ecs list-task-definitions
# ========== Describe Task Definition ==========
aws ecs describe-task-definition \
--task-definition my-app:1
# ========== Deregister Task Definition ==========
aws ecs deregister-task-definition \
--task-definition my-app:1
Running Tasks
# ========== Run Task (One-Time) ==========
aws ecs run-task \
--cluster my-cluster \
--task-definition my-app:1 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345],securityGroups=[sg-12345],assignPublicIp=ENABLED}" \
--count 1
# ========== List Tasks ==========
aws ecs list-tasks --cluster my-cluster
# ========== Describe Tasks ==========
aws ecs describe-tasks \
--cluster my-cluster \
--tasks task-id-12345
# ========== Stop Task ==========
aws ecs stop-task \
--cluster my-cluster \
--task task-id-12345
Services
# ========== Create Service ==========
aws ecs create-service \
--cluster my-cluster \
--service-name my-service \
--task-definition my-app:1 \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345],securityGroups=[sg-12345],assignPublicIp=ENABLED}"
# ========== Update Service ==========
aws ecs update-service \
--cluster my-cluster \
--service my-service \
--desired-count 5
# ========== Update Service with New Task Definition ==========
aws ecs update-service \
--cluster my-cluster \
--service my-service \
--task-definition my-app:2
# ========== Delete Service ==========
aws ecs delete-service \
--cluster my-cluster \
--service my-service \
--force
Service with Load Balancer
Application Load Balancer Integration
{
"cluster": "my-cluster",
"serviceName": "web-service",
"taskDefinition": "web-app:1",
"desiredCount": 3,
"launchType": "FARGATE",
"networkConfiguration": {
"awsvpcConfiguration": {
"subnets": ["subnet-12345", "subnet-67890"],
"securityGroups": ["sg-12345"],
"assignPublicIp": "ENABLED"
}
},
"loadBalancers": [
{
"targetGroupArn": "arn:aws:elasticloadbalancing:region:account:targetgroup/my-targets",
"containerName": "web-app",
"containerPort": 80
}
],
"healthCheckGracePeriodSeconds": 60
}
Create Service with ALB (CLI)
aws ecs create-service \
--cluster my-cluster \
--service-name web-service \
--task-definition web-app:1 \
--desired-count 3 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-1,subnet-2],securityGroups=[sg-12345]}" \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:region:account:targetgroup/my-tg,containerName=web-app,containerPort=80" \
--health-check-grace-period-seconds 60
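The `containerName` and `containerPort` in a service's `loadBalancers` entry have to match a container and port mapping in the task definition. As a minimal sketch, the check below compares the two documents locally. `check_load_balancer_mapping` is a hypothetical helper, and the sample inputs are illustrative rather than taken from a real deployment.

```python
def check_load_balancer_mapping(task_definition: dict, service: dict) -> list:
    """Return a list of problems; an empty list means the mapping is consistent."""
    # Map each container name to the set of container ports it declares.
    ports_by_container = {
        c["name"]: {p["containerPort"] for p in c.get("portMappings", [])}
        for c in task_definition.get("containerDefinitions", [])
    }
    problems = []
    for lb in service.get("loadBalancers", []):
        name, port = lb["containerName"], lb["containerPort"]
        if name not in ports_by_container:
            problems.append(f"container '{name}' is not in the task definition")
        elif port not in ports_by_container[name]:
            problems.append(f"container '{name}' does not expose port {port}")
    return problems


# Illustrative inputs: the container listens on 8080 but the service targets 80.
task_definition = {
    "containerDefinitions": [
        {"name": "web-app", "portMappings": [{"containerPort": 8080}]}
    ]
}
service = {"loadBalancers": [{"containerName": "web-app", "containerPort": 80}]}

for problem in check_load_balancer_mapping(task_definition, service):
    print(problem)
```

A mismatch of this kind is a common reason for `create-service` being rejected or for targets never becoming healthy.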
Auto Scaling
Target Tracking Scaling
# ========== Register Scalable Target ==========
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/my-cluster/my-service \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 2 \
--max-capacity 10
# ========== Create Scaling Policy (CPU) ==========
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/my-cluster/my-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name cpu-scaling-policy \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 70.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageCPUUtilization"
},
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}'
# ========== Create Scaling Policy (Memory) ==========
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/my-cluster/my-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name memory-scaling-policy \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 80.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageMemoryUtilization"
}
}'
# ========== Create Scaling Policy (ALB Request Count) ==========
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/my-cluster/my-service \
--scalable-dimension ecs:service:DesiredCount \
--policy-name request-count-policy \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 1000.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ALBRequestCountPerTarget",
"ResourceLabel": "app/my-alb/xxx/targetgroup/my-tg/yyy"
}
}'
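Target tracking tries to keep the chosen metric near `TargetValue` by changing the desired task count. The sketch below is a simplified proportional model of that idea, clamped to the registered minimum and maximum capacity. It is an assumption for building intuition only: the real service works through CloudWatch alarms and cooldowns, and `estimate_desired_count` is a hypothetical name.

```python
import math


def estimate_desired_count(current_count: int, metric_value: float,
                           target_value: float, min_capacity: int = 2,
                           max_capacity: int = 10) -> int:
    """Rough estimate of the task count needed to bring the metric to target."""
    if current_count <= 0 or target_value <= 0:
        raise ValueError("current_count and target_value must be positive")
    # Scale the task count by how far the metric is from its target.
    raw = math.ceil(current_count * metric_value / target_value)
    # Respect the bounds set with register-scalable-target.
    return max(min_capacity, min(max_capacity, raw))


# 4 tasks averaging 90% CPU against a 70% target.
print(estimate_desired_count(4, 90.0, 70.0))
```

The defaults mirror the `--min-capacity 2` and `--max-capacity 10` values registered above.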
Task Placement Strategies (EC2 Launch Type)
Placement Strategies
{
"placementStrategy": [
{
"type": "spread",
"field": "attribute:ecs.availability-zone"
},
{
"type": "binpack",
"field": "memory"
}
],
"placementConstraints": [
{
"type": "memberOf",
"expression": "attribute:ecs.instance-type =~ t3.*"
}
]
}
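Strategy Types: spread (distribute tasks evenly across the given field, such as Availability Zone or instance ID), binpack (place tasks on the instance with the least available CPU or memory, to use fewer instances), random (place tasks randomly).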
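To make the combined strategy above concrete, here is a small simulation of "spread across Availability Zones, then binpack on memory". It is a simplified model under stated assumptions, not the ECS scheduler: the instance records, field names, and `place_task` function are all hypothetical.

```python
from collections import Counter


def place_task(instances: list, task_memory: int):
    """Pick an instance id for one task, or None if nothing fits."""
    candidates = [i for i in instances if i["free_memory"] >= task_memory]
    if not candidates:
        return None
    # Spread: prefer the Availability Zone currently running the fewest tasks.
    tasks_per_az = Counter()
    for i in instances:
        tasks_per_az[i["az"]] += i["tasks"]
    fewest = min(tasks_per_az[i["az"]] for i in candidates)
    in_az = [i for i in candidates if tasks_per_az[i["az"]] == fewest]
    # Binpack: within that zone, pick the instance with the least free memory.
    chosen = min(in_az, key=lambda i: (i["free_memory"], i["id"]))
    chosen["free_memory"] -= task_memory
    chosen["tasks"] += 1
    return chosen["id"]


# Hypothetical cluster: two instances in zone a, one in zone b (memory in MiB).
instances = [
    {"id": "a1", "az": "us-east-1a", "free_memory": 2048, "tasks": 0},
    {"id": "a2", "az": "us-east-1a", "free_memory": 1024, "tasks": 0},
    {"id": "b1", "az": "us-east-1b", "free_memory": 4096, "tasks": 0},
]
placements = [place_task(instances, 512) for _ in range(3)]
print(placements)
```

Listing spread first keeps tasks balanced across zones, while binpack fills the tightest instance inside the chosen zone so other instances stay free for larger tasks.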
Service Discovery
Cloud Map Integration
# ========== Create Private DNS Namespace ==========
aws servicediscovery create-private-dns-namespace \
--name local \
--vpc vpc-12345
# ========== Create Service Discovery Service ==========
aws servicediscovery create-service \
--name my-app \
--namespace-id ns-12345 \
--dns-config '{
"DnsRecords": [
{
"Type": "A",
"TTL": 60
}
]
}'
# ========== Create ECS Service with Service Discovery ==========
aws ecs create-service \
--cluster my-cluster \
--service-name my-service \
--task-definition my-app:1 \
--desired-count 2 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345],securityGroups=[sg-12345]}" \
--service-registries "registryArn=arn:aws:servicediscovery:region:account:service/srv-12345"
Secrets Management
Using AWS Secrets Manager
{
"containerDefinitions": [
{
"name": "app",
"image": "myapp:latest",
"secrets": [
{
"name": "DB_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:db-password-AbCdEf"
},
{
"name": "API_KEY",
"valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:api-key-XyZ123:key::"
}
]
}
]
}
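A Secrets Manager `valueFrom` can be a plain secret ARN or an ARN followed by three optional, colon-separated fields: JSON key, version stage, and version ID. The second entry above uses that form to select the `key` field. The sketch below splits such a reference into its parts; `parse_secret_reference` is a hypothetical helper for illustration.

```python
def parse_secret_reference(value_from: str) -> dict:
    """Split a Secrets Manager valueFrom string into its components."""
    parts = value_from.split(":")
    if len(parts) < 7 or parts[2] != "secretsmanager" or parts[5] != "secret":
        raise ValueError(f"not a Secrets Manager secret ARN: {value_from}")
    # Optional trailing fields; pad so a plain ARN yields three empty values.
    json_key, version_stage, version_id = (parts[7:10] + ["", "", ""])[:3]
    return {
        "region": parts[3],
        "account": parts[4],
        "secret": parts[6],
        "json_key": json_key or None,
        "version_stage": version_stage or None,
        "version_id": version_id or None,
    }


ref = parse_secret_reference(
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:api-key-XyZ123:key::"
)
print(ref["secret"], ref["json_key"])
```

Empty trailing fields (the `::` at the end) mean "use the default version", which is why they come back as `None` here.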
Using Systems Manager Parameter Store
{
"containerDefinitions": [
{
"name": "app",
"image": "myapp:latest",
"secrets": [
{
"name": "DATABASE_URL",
"valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/prod/database-url"
}
]
}
]
}
Logging and Monitoring
CloudWatch Logs
{
"containerDefinitions": [
{
"name": "app",
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/my-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs",
"awslogs-create-group": "true"
}
}
}
]
}
View Logs with CLI
# ========== Tail Logs (requires AWS CLI v2) ==========
aws logs tail /ecs/my-app --follow
# ========== Filter Logs ==========
# Note: date -d is GNU date syntax (Linux); on macOS/BSD use: date -v-1H +%s
aws logs filter-log-events \
--log-group-name /ecs/my-app \
--filter-pattern "ERROR" \
--start-time $(date -d '1 hour ago' +%s)000
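`--start-time` expects milliseconds since the Unix epoch, and the `date -d` form above depends on GNU date. As a portable alternative, the value can be computed in Python. This is a minimal sketch, and `start_time_millis` is a hypothetical helper name.

```python
import time
from typing import Optional


def start_time_millis(hours_ago: float, now: Optional[float] = None) -> int:
    """Return the epoch timestamp, in milliseconds, for `hours_ago` hours back."""
    if now is None:
        now = time.time()  # current time in seconds since the epoch
    return int((now - hours_ago * 3600) * 1000)


# Value to pass to: aws logs filter-log-events --start-time <value>
print(start_time_millis(1))
```

The optional `now` argument exists only to make the function deterministic in tests.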
Container Insights
# ========== Enable Container Insights ==========
aws ecs update-cluster-settings \
--cluster my-cluster \
--settings name=containerInsights,value=enabled
Deployment Strategies
Rolling Update
{
"deploymentConfiguration": {
"maximumPercent": 200,
"minimumHealthyPercent": 100,
"deploymentCircuitBreaker": {
"enable": true,
"rollback": true
}
}
}
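`minimumHealthyPercent` and `maximumPercent` are percentages of the service's desired count: the lower limit is rounded up and the upper limit is rounded down. The sketch below turns them into concrete task counts; `rolling_update_bounds` is a hypothetical helper.

```python
def rolling_update_bounds(desired_count: int,
                          minimum_healthy_percent: int = 100,
                          maximum_percent: int = 200) -> tuple:
    """Return (min_running, max_running) task counts during a rolling update."""
    # Lower limit: percentage of desired count, rounded up (integer ceiling).
    lower = -(-desired_count * minimum_healthy_percent // 100)
    # Upper limit: percentage of desired count, rounded down.
    upper = desired_count * maximum_percent // 100
    return lower, upper


# With the configuration above and a desired count of 3:
print(rolling_update_bounds(3, 100, 200))
# A tighter configuration for comparison:
print(rolling_update_bounds(3, 50, 150))
```

With 100/200, ECS can start a full replacement set before stopping old tasks, at the cost of temporarily running up to twice the desired count.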
Blue/Green Deployment
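# Using CodeDeploy for Blue/Green
# The AppSpec is passed as a JSON string, so it must be on a single line with
# escaped quotes (raw newlines are not valid inside a JSON string).
# TaskDefinition is given as the task definition ARN.
aws deploy create-deployment \
--application-name my-app \
--deployment-group-name my-deployment-group \
--revision '{
"revisionType": "AppSpecContent",
"appSpecContent": {
"content": "{\"version\": 0.0, \"Resources\": [{\"TargetService\": {\"Type\": \"AWS::ECS::Service\", \"Properties\": {\"TaskDefinition\": \"arn:aws:ecs:region:account:task-definition/my-app:2\", \"LoadBalancerInfo\": {\"ContainerName\": \"web\", \"ContainerPort\": 80}}}}]}"
}
}'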
IAM Roles
Task Execution Role
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:BatchGetImage",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue",
"ssm:GetParameters"
],
"Resource": [
"arn:aws:secretsmanager:region:account:secret:my-secret-*",
"arn:aws:ssm:region:account:parameter/prod/*"
]
}
]
}
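The actions an execution role needs depend on the task definition's features. For example, the CloudWatch Logs configuration in this guide sets `awslogs-create-group`, which additionally requires `logs:CreateLogGroup`. The sketch below lists required actions that a policy document does not allow. It is deliberately simplified (it ignores `Deny` statements, `Resource`, and `Condition`), and `missing_actions` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def missing_actions(policy: dict, required: list) -> list:
    """Return the required actions not covered by any Allow statement."""
    allowed = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # Action may be a single string
            actions = [actions]
        allowed.extend(a.lower() for a in actions)
    # IAM action names are case-insensitive and may contain * wildcards.
    return [
        r for r in required
        if not any(fnmatchcase(r.lower(), pattern) for pattern in allowed)
    ]


# Actions copied from the first statement of the execution role policy above.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "ecr:GetAuthorizationToken", "ecr:BatchCheckLayerAvailability",
            "ecr:GetDownloadUrlForLayer", "ecr:BatchGetImage",
            "logs:CreateLogStream", "logs:PutLogEvents",
        ],
        "Resource": "*",
    }],
}
needed = ["ecr:BatchGetImage", "logs:PutLogEvents", "logs:CreateLogGroup"]
print(missing_actions(policy, needed))
```

For an authoritative answer, the IAM policy simulator evaluates the full policy, including resources and conditions.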
Task Role
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": "arn:aws:s3:::my-bucket/*"
},
{
"Effect": "Allow",
"Action": [
"dynamodb:GetItem",
"dynamodb:PutItem",
"dynamodb:Query"
],
"Resource": "arn:aws:dynamodb:region:account:table/my-table"
}
]
}
Capacity Providers
Fargate Capacity Provider
# ========== Create Capacity Provider Strategy ==========
aws ecs put-cluster-capacity-providers \
--cluster my-cluster \
--capacity-providers FARGATE FARGATE_SPOT \
--default-capacity-provider-strategy \
capacityProvider=FARGATE,weight=1,base=2 \
capacityProvider=FARGATE_SPOT,weight=4
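In a capacity provider strategy, `base` is the number of tasks that run on a provider before weights apply, and `weight` sets the relative share of the remaining tasks. The sketch below models that split for the strategy above. The rounding rule (largest remainder) is an assumption, since ECS may distribute odd counts differently, and `split_tasks` is a hypothetical helper.

```python
def split_tasks(total: int, strategy: list) -> dict:
    """Estimate how many of `total` tasks land on each capacity provider."""
    counts = {s["capacityProvider"]: 0 for s in strategy}
    remaining = total
    # Step 1: satisfy each provider's base before weights are considered.
    for s in strategy:
        take = min(s.get("base", 0), remaining)
        counts[s["capacityProvider"]] += take
        remaining -= take
    # Step 2: split what is left in proportion to the weights.
    total_weight = sum(s.get("weight", 0) for s in strategy)
    if remaining == 0 or total_weight == 0:
        return counts
    remainders = []
    for s in strategy:
        whole, rest = divmod(remaining * s.get("weight", 0), total_weight)
        counts[s["capacityProvider"]] += whole
        remainders.append((rest, s["capacityProvider"]))
    # Assumed rounding: hand leftover tasks to the largest remainders first.
    leftover = total - sum(counts.values())
    for _, name in sorted(remainders, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


strategy = [
    {"capacityProvider": "FARGATE", "weight": 1, "base": 2},
    {"capacityProvider": "FARGATE_SPOT", "weight": 4},
]
print(split_tasks(12, strategy))
```

Keeping a small `base` on on-demand Fargate means a minimum number of tasks survives a Spot interruption, while the 1:4 weights send most additional tasks to the cheaper Spot capacity.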
EC2 Auto Scaling Group Capacity Provider
# ========== Create Capacity Provider ==========
aws ecs create-capacity-provider \
--name my-capacity-provider \
--auto-scaling-group-provider '{
"autoScalingGroupArn": "arn:aws:autoscaling:region:account:autoScalingGroup:id:autoScalingGroupName/my-asg",
"managedScaling": {
"status": "ENABLED",
"targetCapacity": 80,
"minimumScalingStepSize": 1,
"maximumScalingStepSize": 10
},
"managedTerminationProtection": "ENABLED"
}'
Best Practices
Task Definition Best Practices
{
"family": "production-app",
"taskRoleArn": "arn:aws:iam::account:role/task-role",
"executionRoleArn": "arn:aws:iam::account:role/execution-role",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "512",
"memory": "1024",
"containerDefinitions": [
{
"name": "app",
"image": "account.dkr.ecr.region.amazonaws.com/my-app:v1.0.0",
"essential": true,
"readonlyRootFilesystem": true,
"user": "1000:1000",
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/production/my-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "ecs"
}
},
"environment": [
{
"name": "ENVIRONMENT",
"value": "production"
}
],
"secrets": [
{
"name": "DB_PASSWORD",
"valueFrom": "arn:aws:secretsmanager:region:account:secret:prod/db-password"
}
]
}
]
}
Service Configuration Best Practices
Security Best Practices
Troubleshooting
Common Issues
# ========== Task Fails to Start ==========
# Check task stopped reason
aws ecs describe-tasks \
--cluster my-cluster \
--tasks task-id \
--query 'tasks[0].stoppedReason'
# Check container exit code
aws ecs describe-tasks \
--cluster my-cluster \
--tasks task-id \
--query 'tasks[0].containers[0].exitCode'
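The exit code returned by the query above follows the usual Unix convention: values above 128 generally mean the process was terminated by signal `code - 128`. The sketch below decodes that; `describe_exit_code` is a hypothetical helper, and the signal names assume a Linux or macOS host.

```python
import signal


def describe_exit_code(code) -> str:
    """Translate a container exit code into a short description."""
    if code is None:
        return "no exit code recorded (the container may never have started)"
    if code == 0:
        return "exited cleanly"
    if code > 128:
        try:
            # Codes above 128 usually encode 128 + the terminating signal.
            name = signal.Signals(code - 128).name
        except ValueError:
            return f"exit code {code}"
        return f"terminated by {name}"
    return f"application error (exit code {code})"


for code in (0, 1, 137, 143):
    print(code, "->", describe_exit_code(code))
```

Exit code 137 (SIGKILL) often points to the container exceeding its memory limit, so it is worth checking `stoppedReason` and the task's memory settings together.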
# ========== Service Deployment Stuck ==========
# Check service events
aws ecs describe-services \
--cluster my-cluster \
--services my-service \
--query 'services[0].events[:10]'
# ========== Cannot Pull Image ==========
# Verify ECR permissions
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin account.dkr.ecr.us-east-1.amazonaws.com
# Check task execution role has ECR permissions
# ========== Connection Issues ==========
# Verify security groups allow traffic
# Check VPC route tables
# Verify target group health checks