Successfully implemented comprehensive monitoring and alerting infrastructure for the Meteor platform across all three stories of Epic 3:

**Story 3.5: Core Business Metrics Monitoring**
- Instrumented NestJS web backend with CloudWatch metrics integration using prom-client
- Instrumented Go compute service with structured CloudWatch metrics reporting
- Created comprehensive Terraform infrastructure from scratch with modular design
- Built 5-row CloudWatch dashboard with application, error rate, business, and infrastructure metrics
- Added proper error categorization and provider performance tracking

**Story 3.6: Critical System Alerts**
- Implemented SNS-based alerting infrastructure via Terraform
- Created critical alarm for NestJS 5xx error rate (>1% threshold)
- Created Go service processing failure rate alarm (>5% threshold)
- Created SQS queue depth alarm (>1000 messages threshold)
- Added actionable alarm descriptions with investigation guidance
- Configured email notifications with manual confirmation workflow

**Cross-cutting Infrastructure:**
- Complete AWS infrastructure as code with Terraform (S3, SQS, CloudWatch, SNS, IAM, optional RDS/Fargate)
- Structured logging implementation across all services (NestJS, Go, Rust)
- Metrics collection following the "Four Golden Signals" observability approach
- Configurable thresholds and deployment-ready monitoring solution

The platform now has production-grade observability with comprehensive metrics collection, centralized monitoring dashboards, and automated critical system alerting.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Meteor Compute Service
A Go-based compute service for the distributed meteor monitoring network.
Project Structure
meteor-compute-service/
├── cmd/
│   └── meteor-compute-service/     # Application entry point
│       └── main.go
├── internal/
│   ├── config/                     # Configuration management
│   │   └── config.go
│   └── health/                     # Health check functionality
│       ├── handler.go
│       └── server.go
├── go.mod
└── README.md
Running the Service
go run cmd/meteor-compute-service/main.go
Health Check
The service exposes a health check endpoint at /health that returns:
{
  "status": "ok"
}
Environment Variables
PORT: Server port (default: 8080)
Building
go build -o bin/meteor-compute-service cmd/meteor-compute-service/main.go