📅 Timeline & Action Plan (2–3 Weeks)
🔹 Week 1 — Assessment & Containerization
1.1 Analyze Current SageMaker Setup (Day 1–2)
Deliverables
Model type (PyTorch / sklearn / custom)
Inference entrypoint (inference.py)
Input / Output schema
Dependencies (requirements.txt)
Model artifacts location (S3)
Checklist
model.tar.gz structure
preprocessing / postprocessing logic
threshold / config JSON
latency target (ms)
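The model.tar.gz check above can be partly automated. The following is a minimal sketch using only the Python standard library; the file names passed as `expected` are assumptions for illustration, not the confirmed contents of the real artifact.

```python
import tarfile
from pathlib import Path


def list_artifact_members(archive_path: str) -> list[str]:
    """Return the sorted names of regular files inside a model.tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def missing_files(members: list[str], expected: list[str]) -> list[str]:
    """Return the expected file names that do not appear in the archive."""
    # Compare on the base name so "code/inference.py" matches "inference.py".
    present = {Path(name).name for name in members}
    return [name for name in expected if name not in present]
```

Calling `missing_files(list_artifact_members("model.tar.gz"), ["inference.py", "requirements.txt"])` gives a quick list of what still has to be located before containerization.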
1.2 Build Docker Image for Inference (Day 3–4)
Tasks
Create a Dockerfile
Decouple the inference logic from the SageMaker SDK
Mount / download model from S3
Deliverables
Dockerfile
app.py or main.py
requirements.txt
Docker image that runs locally
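One way to decouple the inference logic from the SageMaker SDK is to keep the handler functions and call them from plain Python. The sketch below assumes the existing inference.py follows the SageMaker framework-container convention of `model_fn`, `input_fn`, `predict_fn`, and `output_fn`; the `InferenceService` class and the stand-in handler bodies are hypothetical and only illustrate the wiring.

```python
import json
from typing import Any, Callable


class InferenceService:
    """Runs load -> parse -> predict -> serialize with no SageMaker import."""

    def __init__(
        self,
        model_fn: Callable[[str], Any],
        input_fn: Callable[[str, str], Any],
        predict_fn: Callable[[Any, Any], Any],
        output_fn: Callable[[Any, str], str],
        model_dir: str,
    ) -> None:
        self._input_fn = input_fn
        self._predict_fn = predict_fn
        self._output_fn = output_fn
        # Load the model once at startup, not on every request.
        self._model = model_fn(model_dir)

    def handle(self, body: str, content_type: str = "application/json") -> str:
        data = self._input_fn(body, content_type)
        prediction = self._predict_fn(data, self._model)
        return self._output_fn(prediction, content_type)


# Hypothetical stand-ins for the functions in the existing inference.py.
def model_fn(model_dir: str) -> dict:
    return {"weight": 2.0}


def input_fn(body: str, content_type: str) -> list:
    return json.loads(body)["features"]


def predict_fn(data: list, model: dict) -> list:
    return [x * model["weight"] for x in data]


def output_fn(prediction: list, accept: str) -> str:
    return json.dumps({"predictions": prediction})
```

In app.py or main.py, the real functions would be imported from inference.py in place of the stand-ins, and `handle` would be called from whichever web framework is chosen.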
1.3 Define API Contract (Day 5)
Tasks
Freeze request/response JSON
Match the output format to the existing SageMaker response
Deliverables
API Spec (JSON Example)
Versioned endpoint /v1/predict
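Freezing the contract is easier when the schema lives in code. This is a rough sketch with dataclasses; the field names `features`, `label`, and `score` are assumptions, since the real schema has to be copied from the current SageMaker endpoint.

```python
import json
from dataclasses import asdict, dataclass

API_VERSION = "v1"
PREDICT_PATH = f"/{API_VERSION}/predict"


@dataclass(frozen=True)
class PredictRequest:
    features: list  # hypothetical field: a flat list of numbers


@dataclass(frozen=True)
class PredictResponse:
    label: str    # hypothetical field
    score: float  # hypothetical field


def parse_request(raw: str) -> PredictRequest:
    """Parse a JSON body and reject anything outside the frozen schema."""
    payload = json.loads(raw)
    if not isinstance(payload, dict) or set(payload) != {"features"}:
        raise ValueError("request must be an object with exactly one key: 'features'")
    features = payload["features"]
    is_number = lambda x: isinstance(x, (int, float)) and not isinstance(x, bool)
    if not isinstance(features, list) or not all(is_number(x) for x in features):
        raise ValueError("'features' must be a list of numbers")
    return PredictRequest(features=[float(x) for x in features])


def render_response(response: PredictResponse) -> str:
    """Serialize with sorted keys so the output is stable across releases."""
    return json.dumps(asdict(response), sort_keys=True)
```

Rejecting unknown keys is a deliberate choice here: it surfaces client drift early, at the cost of making additive changes require a new version path.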
🔹 Week 2 — EC2 Deployment & Integration
2.1 Provision EC2 + Docker Runtime (Day 6–7)
Recommended
EC2: c6i.large (x86_64) or c6g.large (Graviton, needs an arm64 image) for CPU inference
OS: Amazon Linux 2023
Install: Docker + docker-compose
Deliverables
EC2 instance
Security Group (Allow 80 / 443 / 8080)
2.2 Deploy Container on EC2 (Day 8–9)
Tasks
Pull image / build on EC2
Use docker-compose
Set restart policy
Deliverables
Production container running
Health check /health
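The /health endpoint can be very small. Below is a minimal sketch using only the standard library's http.server; port 8080 and the `{"status": "ok"}` body are assumptions, and a production service would more likely expose this route from its web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 and everything else with 404."""

    def do_GET(self) -> None:
        if self.path == "/health":
            self._send(200, {"status": "ok"})
        else:
            self._send(404, {"error": "not found"})

    def _send(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Silence per-request logging so health probes do not flood the logs.
        pass


def make_server(host: str = "0.0.0.0", port: int = 8080) -> ThreadingHTTPServer:
    """Build the server; call serve_forever() on the result to start it."""
    return ThreadingHTTPServer((host, port), HealthHandler)
```

Running `make_server().serve_forever()` inside the container gives the restart policy and any load balancer a cheap liveness probe.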
2.3 Connect API / System (Day 10)
Tasks
Update the Node.js / backend service to point to EC2
Feature flag (SageMaker vs EC2)
Deliverables
Toggle switch (env-based)
Fallback plan
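The env-based toggle and the fallback can be expressed in a few lines. The sketch is in Python for consistency with the inference code, although the same logic would live in the Node.js backend; the variable name `INFERENCE_BACKEND` and both function names are hypothetical.

```python
import os
from typing import Any, Callable, Mapping, Optional

VALID_BACKENDS = {"sagemaker", "ec2"}


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the toggle from the environment, defaulting to SageMaker."""
    source = os.environ if env is None else env
    value = source.get("INFERENCE_BACKEND", "sagemaker").strip().lower()
    # An unknown value falls back to the known-good backend.
    return value if value in VALID_BACKENDS else "sagemaker"


def predict_with_fallback(
    payload: Any,
    primary: Callable[[Any], Any],
    fallback: Callable[[Any], Any],
) -> Any:
    """Try the primary backend and use the fallback if it raises."""
    try:
        return primary(payload)
    except Exception:
        return fallback(payload)
```

Defaulting to SageMaker means a missing or mistyped variable never sends traffic to the new path by accident.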
🔹 Week 3 — Testing, Optimization & Cutover
3.1 Load & Performance Testing (Day 11–13)
Metrics
p95 latency
CPU / RAM usage
Throughput (req/sec)
Deliverables
Benchmark report
Instance sizing decision
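For the benchmark report, p95 latency and throughput can be computed from raw samples as sketched below. This uses the nearest-rank percentile definition, which is one of several common conventions; load-testing tools may interpolate and report slightly different numbers.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def throughput(request_count: int, duration_s: float) -> float:
    """Completed requests per second over the measurement window."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return request_count / duration_s
```

With five latency samples of 12, 15, 11, 40, and 13 ms, `percentile(samples, 95)` returns 40, which shows how a single slow request dominates p95 at small sample sizes; a longer test run gives a more meaningful figure.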
3.2 Hardening & Ops (Day 14–15)
Tasks
Auto-restart
Log rotation
Basic monitoring (CloudWatch Agent / Prometheus)
IAM role (S3 read-only)
Deliverables
Runbook
Alarm thresholds
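If the application writes its own log file, rotation can be handled in-process. This is a minimal sketch with the standard library's RotatingFileHandler; the logger name, the 10 MB size limit, and the five backups are assumptions to be tuned against the instance's disk size.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(
    log_path: str,
    max_bytes: int = 10 * 1024 * 1024,
    backup_count: int = 5,
) -> logging.Logger:
    """Return a logger that rolls over to log_path.1, log_path.2, ... when full."""
    logger = logging.getLogger("inference")
    logger.setLevel(logging.INFO)
    # Attach the handler only once so repeated calls do not duplicate lines.
    if not logger.handlers:
        handler = RotatingFileHandler(
            log_path, maxBytes=max_bytes, backupCount=backup_count
        )
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Disk usage is then bounded by roughly `max_bytes * (backup_count + 1)`, which is a useful number to record in the runbook.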
3.3 Production Cutover (Day 16–17)
Steps
Switch traffic 10% → 50% → 100%
Monitor errors & latency
Keep SageMaker as rollback (1–2 weeks)
Deliverables
Production sign-off
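The 10% → 50% → 100% traffic switch can be done with a deterministic hash so that a given request ID always lands on the same backend at a given percentage. The sketch below is one possible approach, not a prescribed one; the function name and the use of a request ID as the routing key are assumptions.

```python
import hashlib


def routes_to_ec2(request_id: str, ec2_percent: int) -> bool:
    """Return True if this request should go to EC2 at the given rollout percentage."""
    if not 0 <= ec2_percent <= 100:
        raise ValueError("ec2_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the ID to a stable bucket in [0, 100).
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < ec2_percent
```

Because the bucket is fixed per ID, raising the percentage only adds traffic to EC2 and never moves an already-migrated ID back to SageMaker, which keeps error comparisons between stages cleaner.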