Building TB-Scale Data Infrastructure at UN-Habitat
When the United Nations Human Settlements Programme (UN-Habitat) needed to process terabytes of urban data from 12 global cities, we built a scalable AWS infrastructure that handles more than 10 TB per month with real-time processing, complex analytics, and reliable monitoring.
The Challenge
UN-Habitat monitors sustainable urban development globally, requiring massive-scale data processing capabilities.
Scale Requirements
The scope of this project was unprecedented:
- 12 cities: Nairobi, São Paulo, Cairo, Manila, Johannesburg, Mexico City, Mumbai, Lagos, Dhaka, Jakarta, Kinshasa, Lima
- 10+ TB monthly: Satellite imagery, census data, infrastructure metrics
- Real-time processing: Sub-5-minute latency for emergency response
- Global access: 24/7 availability across time zones
Technical Challenges
Building at this scale required solving complex problems:
- Data volume: Processing terabytes efficiently
- Geographic distribution: Serving global users with low latency
- Data variety: Handling satellite imagery, structured data, and real-time streams
- Reliability: Mission-critical uptime requirements
Architecture Solution
Core AWS Services
Data Ingestion:
- AWS Glue: ETL pipelines for diverse data sources
- Lambda: Real-time event processing
- EventBridge: Event routing and orchestration
Storage & Processing:
- S3: Data lake with intelligent tiering
- Redshift: Analytics warehouse
- EMR: Large-scale data processing
Monitoring:
- CloudWatch: System metrics and alerts
- Step Functions: Workflow orchestration (see the hand-off sketch below)
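To make the hand-off between these services concrete, the snippet below sketches how a pipeline step could kick off a Step Functions workflow with boto3. The state machine name and input fields are illustrative assumptions, not the production configuration, and the ARN keeps this document's "account" placeholder.

```python
import json
import boto3

sfn = boto3.client('stepfunctions')

# Start the per-city processing workflow (ARN and input are placeholders).
sfn.start_execution(
    stateMachineArn='arn:aws:states:us-east-1:account:stateMachine:CityPipeline',
    input=json.dumps({'city': 'nairobi', 'run_date': '2023-06-01'})
)
```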
Data Flow
```python
# Simplified pipeline architecture
import json
import boto3
from datetime import datetime, timezone

s3_client = boto3.client('s3')
events_client = boto3.client('events')

def process_city_data(city_id, data_source):
    # 1. Ingest data via Glue (glue_crawler stands in for the crawl/extract step)
    raw_data = glue_crawler.extract(data_source)

    # 2. Transform with Lambda
    processed = transform_urban_data(raw_data, city_id)

    # 3. Store in S3 data lake, keyed by city and ingestion time
    timestamp = datetime.now(timezone.utc).isoformat()
    s3_client.put_object(
        Bucket='un-habitat-data-lake',
        Key=f'cities/{city_id}/processed/{timestamp}',
        Body=processed
    )

    # 4. Trigger downstream analytics via EventBridge
    events_client.put_events(Entries=[{
        'Source': 'data-pipeline',
        'DetailType': 'CityDataReady',
        'Detail': json.dumps({'city': city_id, 'status': 'ready'})
    }])
```
Key Technical Decisions
1. Serverless-First Approach
- AWS Lambda for event-driven processing (see the handler sketch below)
- Auto-scaling based on data volume
- Cost optimization with pay-per-use model
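As a concrete illustration of the event-driven pattern, here is a minimal Lambda handler that reacts to the pipeline-ready event shown earlier. The handler body and the analytics trigger are illustrative assumptions, not the production code.

```python
import json

def lambda_handler(event, context):
    # EventBridge delivers an envelope whose 'detail' field carries the payload
    # (the 'city' and 'status' fields match the pipeline event above).
    detail = event.get('detail', {})
    city_id = detail.get('city')

    if detail.get('status') == 'ready':
        # Hypothetical downstream step: start analytics for this city.
        print(f"Triggering analytics for {city_id}")

    return {'statusCode': 200, 'body': json.dumps({'city': city_id})}
```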
2. Data Lake Architecture
- S3 with intelligent tiering for cost efficiency
- Partitioned by city and date for fast queries (layout sketched below)
- Multiple formats supported (Parquet, JSON, CSV)
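For example, a Hive-style key layout like the one below (a sketch; the exact prefix scheme is an assumption) lets engines such as Athena and Glue prune scans to a single city and day:

```python
def build_object_key(city_id: str, date: str, dataset: str) -> str:
    # Hive-style partitions (city=.../date=...) let query engines skip
    # everything outside the requested city and day.
    return f"cities/city={city_id}/date={date}/{dataset}.parquet"

# e.g. cities/city=nairobi/date=2023-06-01/infrastructure_metrics.parquet
print(build_object_key("nairobi", "2023-06-01", "infrastructure_metrics"))
```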
3. Real-Time Processing
EventBridge rule for urgent data:

```json
{
  "Rules": [{
    "Name": "UrgentDataRule",
    "EventPattern": {
      "source": ["city-sensors"],
      "detail": {
        "priority": ["urgent", "emergency"]
      }
    },
    "Targets": [{
      "Arn": "arn:aws:lambda:us-east-1:account:function:ProcessUrgentData"
    }]
  }]
}
```
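A rule like this can be registered programmatically; the sketch below uses boto3's put_rule and put_targets (the target Id is an assumption, and the Lambda ARN keeps the "account" placeholder from above). Note that the Lambda function would also need a resource-based permission allowing EventBridge to invoke it.

```python
import boto3

events = boto3.client('events')

# Create (or update) the rule matching urgent sensor events.
events.put_rule(
    Name='UrgentDataRule',
    EventPattern='{"source": ["city-sensors"], "detail": {"priority": ["urgent", "emergency"]}}'
)

# Point the rule at the urgent-processing Lambda.
events.put_targets(
    Rule='UrgentDataRule',
    Targets=[{
        'Id': 'process-urgent-data',
        'Arn': 'arn:aws:lambda:us-east-1:account:function:ProcessUrgentData'
    }]
)
```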
Results & Impact
Performance Achieved
- Processing time: 8 hours → 15 minutes
- Query response: < 2 seconds average
- Uptime: 99.95% (exceeded target)
- Cost reduction: 40% vs traditional infrastructure
Business Value
- Real-time insights for emergency response
- Standardized data across all 12 cities
- Automated reporting for UN stakeholders
- Scalable foundation for future expansion
Lessons Learned
What Worked Well
- Serverless architecture provided excellent scalability
- Event-driven design enabled real-time responsiveness
- Multi-region deployment ensured global accessibility
- Comprehensive monitoring caught issues before users noticed them
Challenges Overcome
- Data quality variations across cities
- Network connectivity in some regions
- Time zone coordination for maintenance windows
- Compliance requirements for sensitive data
Technical Implementation
Glue ETL Pipeline
```python
from awsglue.transforms import Filter

def transform_city_metrics(glueContext, dynamic_frame):
    # Standardize the city data format (input is a Glue DynamicFrame)
    standardized = dynamic_frame.select_fields([
        'city_id', 'timestamp', 'population',
        'gdp_per_capita', 'infrastructure_score'
    ])

    # Apply data quality rules: drop rows with non-positive values
    cleaned = Filter.apply(
        frame=standardized,
        f=lambda x: x["population"] > 0 and x["gdp_per_capita"] > 0
    )
    return cleaned
```
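For context, a minimal Glue job skeleton that could invoke this transform might look like the following; the catalog database, table name, and output path are assumptions for illustration.

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
glueContext = GlueContext(SparkContext.getOrCreate())
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Read the crawled source table (catalog names are placeholders).
dyf = glueContext.create_dynamic_frame.from_catalog(
    database='un_habitat', table_name='city_metrics_raw'
)

cleaned = transform_city_metrics(glueContext, dyf)

# Write the cleaned frame back to the lake as Parquet.
glueContext.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type='s3',
    connection_options={'path': 's3://un-habitat-data-lake/cities/cleaned/'},
    format='parquet'
)
job.commit()
```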
Monitoring & Alerts
CloudWatch Alarms:
- DataProcessingLatency: >5 minutes (see the boto3 sketch below)
- ErrorRate: >1%
- StorageUtilization: >80%
SNS Notifications:
- Operations team for system alerts
- Data team for quality issues
- Stakeholders for completion notifications
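To show how one of these alarms could be wired to the notification lists above, here is a hedged boto3 sketch of the DataProcessingLatency alarm publishing to an SNS topic; the metric namespace, metric name, and topic ARN are illustrative assumptions.

```python
import boto3

cloudwatch = boto3.client('cloudwatch')

# Alarm when average pipeline latency exceeds the 5-minute target.
cloudwatch.put_metric_alarm(
    AlarmName='DataProcessingLatency',
    Namespace='UNHabitat/Pipeline',       # assumed custom namespace
    MetricName='ProcessingLatencySeconds',
    Statistic='Average',
    Period=60,
    EvaluationPeriods=5,
    Threshold=300,  # seconds: > 5 minutes trips the alarm
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:account:ops-alerts']  # operations team topic (placeholder)
)
```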
Future Enhancements
Planned Improvements
- Machine learning integration for predictive analytics
- Real-time visualization dashboards
- API gateway for external data sharing
- Blockchain for data provenance tracking
Scaling Considerations
- Adding more cities to the platform
- Implementing edge computing for faster processing
- Enhanced security with AWS Security Hub
- Cost optimization with Reserved Instances
Conclusion
This TB-scale infrastructure transformed UN-Habitat's ability to monitor global urban development. The serverless, event-driven architecture provided the scalability, reliability, and cost-effectiveness needed for their mission-critical work.
The key success factors were choosing the right AWS services for each use case, implementing comprehensive monitoring from day one, and designing with global scale in mind from the beginning.
This project demonstrates how modern cloud architecture can handle enterprise-scale data challenges while maintaining cost efficiency and operational excellence.
Anmol Manchanda
AI-Assisted Developer & Technical Architect