📄 EC2 Inference

📅 Timeline & Action Plan (2–3 Weeks)

🔹 Week 1 — Assessment & Containerization

1.1 Analyze Current SageMaker Setup (Day 1–2)

Deliverables

Model type (PyTorch / sklearn / custom)
Inference entrypoint (inference.py)
Input / Output schema
Dependencies (requirements.txt)
Model artifacts location (S3)

Checklist

model.tar.gz structure
preprocessing / postprocessing logic
threshold / config JSON
latency target (ms)
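
The first checklist item can be done with a short script. Below is a minimal sketch, assuming the artifact has already been downloaded from S3 to a local path; the function name list_artifact_members is hypothetical.

```python
import tarfile


def list_artifact_members(archive_path):
    """Return (name, size_in_bytes) for every regular file in a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as tar:
        # Directories and links are skipped; only real files matter for the audit.
        return sorted((m.name, m.size) for m in tar.getmembers() if m.isfile())
```

Running it against the downloaded model.tar.gz and printing the result gives a quick inventory of weights, code, and config files to carry into the Docker image.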

1.2 Build Docker Image for Inference (Day 3–4)

Tasks

Create a Dockerfile
Decouple the inference logic from the SageMaker SDK
Mount / download model from S3

Deliverables

Dockerfile
app.py or main.py
requirements.txt
Docker image that runs locally
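
One way to decouple the inference logic is sketched below. It assumes the existing inference.py follows the handler convention used by SageMaker framework containers (model_fn, input_fn, predict_fn, output_fn); the four function bodies here are hypothetical stand-ins, and the InferencePipeline class is an illustrative name, not part of any SDK.

```python
import json


# Hypothetical stand-ins for the handlers already defined in inference.py.
def model_fn(model_dir):
    # Placeholder model; the real handler would load artifacts from model_dir.
    return {"threshold": 0.5}


def input_fn(body, content_type):
    if content_type != "application/json":
        raise ValueError(f"unsupported content type: {content_type}")
    return json.loads(body)


def predict_fn(data, model):
    score = float(data["score"])
    return {"score": score, "label": int(score >= model["threshold"])}


def output_fn(prediction, accept):
    return json.dumps(prediction)


class InferencePipeline:
    """Chains the four handlers without importing the SageMaker SDK."""

    def __init__(self, model_dir):
        # Load the model once at container start, not per request.
        self.model = model_fn(model_dir)

    def handle(self, body, content_type="application/json", accept="application/json"):
        data = input_fn(body, content_type)
        prediction = predict_fn(data, self.model)
        return output_fn(prediction, accept)
```

With this shape, app.py only needs to create one InferencePipeline at startup and call handle() for each request, so the existing preprocessing and postprocessing code can be reused unchanged.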

1.3 Define API Contract (Day 5)

Tasks

Freeze request/response JSON
Match the output format to the existing SageMaker endpoint

Deliverables

API Spec (JSON Example)
Versioned endpoint /v1/predict
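
A frozen contract can be captured in code so that any drift fails loudly. The sketch below is illustrative only: the field names (features, label, score, model_version) are assumptions, since the real schema comes from the Week 1 analysis of the current endpoint.

```python
import json
from dataclasses import asdict, dataclass

API_VERSION = "v1"
PREDICT_PATH = f"/{API_VERSION}/predict"


@dataclass(frozen=True)
class PredictRequest:
    features: list  # hypothetical field: a flat list of numbers


@dataclass(frozen=True)
class PredictResponse:
    label: int
    score: float
    model_version: str


def parse_request(raw):
    """Parse a JSON body and reject anything outside the frozen contract."""
    payload = json.loads(raw)
    if not isinstance(payload, dict) or set(payload) != {"features"}:
        raise ValueError("request must be an object with exactly one key: features")
    features = payload["features"]
    is_number = lambda x: isinstance(x, (int, float)) and not isinstance(x, bool)
    if not isinstance(features, list) or not all(is_number(x) for x in features):
        raise ValueError("features must be a list of numbers")
    return PredictRequest(features=features)


def render_response(response):
    """Serialize a response with stable key order so outputs are easy to diff."""
    return json.dumps(asdict(response), sort_keys=True)
```

Rejecting unknown keys is a deliberate choice here: it makes accidental contract changes visible during the SageMaker vs EC2 comparison instead of silently ignoring them.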

🔹 Week 2 — EC2 Deployment & Integration

2.1 Provision EC2 + Docker Runtime (Day 6–7)

Recommended

EC2: c6i.large (x86_64) / c6g.large (Graviton, arm64; requires an arm64 image) for CPU inference
OS: Amazon Linux 2023
Install: Docker + docker-compose

Deliverables

EC2 instance
Security Group (Allow 80 / 443 / 8080)

2.2 Deploy Container on EC2 (Day 8–9)

Tasks

Pull image / build on EC2
Use docker-compose
Set restart policy

Deliverables

Production container running
Health check /health
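
For reference, a health endpoint can be as small as the following standard-library sketch. It is an assumption that the service is written in Python; a production app would more likely use a web framework, and the names InferenceHandler and make_server are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class InferenceHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/health":
            # Liveness only; a readiness check could also verify the model is loaded.
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(host="127.0.0.1", port=8080):
    """Build the server; the caller decides when to call serve_forever()."""
    return ThreadingHTTPServer((host, port), InferenceHandler)
```

Inside the container the server would be started with make_server(host="0.0.0.0").serve_forever(), and the same /health path can then be used by the docker-compose health check and by any load balancer in front of the instance.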

2.3 Connect API / System (Day 10)

Tasks

Update the Node.js / backend service to point to EC2
Feature flag (SageMaker vs EC2)

Deliverables

Toggle switch (env-based)
Fallback plan
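
The toggle and fallback logic is sketched below in Python for consistency with the other examples; the actual backend is Node.js, so this only illustrates the control flow. The environment variable name INFERENCE_BACKEND and both function names are assumptions.

```python
import os

BACKENDS = ("sagemaker", "ec2")


def select_backend(env=None):
    """Read the toggle from the environment; unknown values fall back to SageMaker."""
    env = os.environ if env is None else env
    value = env.get("INFERENCE_BACKEND", "sagemaker").strip().lower()
    return value if value in BACKENDS else "sagemaker"


def predict_with_fallback(payload, call_ec2, call_sagemaker, env=None):
    """Route to EC2 when enabled, and retry on SageMaker if the EC2 call fails."""
    if select_backend(env) == "ec2":
        try:
            return call_ec2(payload)
        except Exception:
            # Fallback path: keep serving from the old endpoint during migration.
            return call_sagemaker(payload)
    return call_sagemaker(payload)
```

Defaulting to SageMaker when the variable is missing or misspelled keeps a bad deploy from sending traffic to an endpoint that is not ready.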

🔹 Week 3 — Testing, Optimization & Cutover

3.1 Load & Performance Testing (Day 11–13)

Metrics

p95 latency
CPU / RAM usage
Throughput (req/sec)

Deliverables

Benchmark report
Instance sizing decision
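
The latency and throughput figures for the benchmark report can be computed from raw per-request timings. This is a minimal sketch using the nearest-rank percentile definition; load-testing tools may interpolate differently, so small differences from their reported p95 are expected. The function names are hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of data at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms, wall_clock_s):
    """Summarize one test run: latency percentiles plus completed requests per second."""
    return {
        "p50_ms": percentile_nearest_rank(latencies_ms, 50),
        "p95_ms": percentile_nearest_rank(latencies_ms, 95),
        "throughput_rps": len(latencies_ms) / wall_clock_s,
    }
```

Comparing the p95_ms value against the latency target recorded in Week 1 gives a concrete pass/fail input for the instance sizing decision.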

3.2 Hardening & Ops (Day 14–15)

Tasks

Auto-restart
Log rotation
Basic monitoring (CloudWatch Agent / Prometheus)
IAM role (S3 read-only)

Deliverables

Runbook
Alarm thresholds
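
If the application writes its own log file, rotation can be handled inside the process. The sketch below uses Python's RotatingFileHandler; the size limit, backup count, and logger name are assumptions to tune. When logs go to stdout instead, the Docker json-file logging driver's max-size and max-file options serve the same purpose.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(log_path, max_bytes=10 * 1024 * 1024, backup_count=5):
    """Create a logger whose file rolls over at max_bytes, keeping backup_count old files."""
    logger = logging.getLogger("inference")
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(log_path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

With the defaults above, disk usage for logs is bounded at roughly 60 MB (one active file plus five backups).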

3.3 Production Cutover (Day 16–17)

Steps

Switch traffic 10% → 50% → 100%
Monitor errors & latency
Keep SageMaker as rollback (1–2 weeks)
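
One way to implement the staged traffic switch is deterministic hash bucketing, sketched below. It assumes each request carries a stable identifier (request_id is a hypothetical name); the same split could also be done at a load balancer or DNS layer instead of in application code.

```python
import hashlib

ROLLOUT_STAGES = (10, 50, 100)  # percent of traffic sent to EC2 at each stage


def bucket_for(request_id):
    """Map an identifier to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def route(request_id, ec2_percent):
    """Send the request to EC2 if its bucket falls below the current rollout percentage."""
    if not 0 <= ec2_percent <= 100:
        raise ValueError("ec2_percent must be between 0 and 100")
    return "ec2" if bucket_for(request_id) < ec2_percent else "sagemaker"
```

Because the bucket depends only on the identifier, a caller routed to EC2 at 10% stays on EC2 at 50% and 100%, which keeps error and latency comparisons between stages consistent. Setting ec2_percent back to 0 is the rollback.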

Deliverables

Production sign-off