Why Database Continuity Matters
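Applications often recover quickly after an outage, but if the databases behind them lose data or stay offline, business continuity still fails. In Azure, you must design data replication, geo-redundancy, and failover to match each workload's recovery point objective (RPO, how much data loss is tolerable) and recovery time objective (RTO, how much downtime is tolerable).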
1. SQL Database & SQL Managed Instance
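Geo-Replication Options:
- Active Geo-Replication (Azure SQL Database only; not available for SQL Managed Instance)
  - Creates up to four readable secondary replicas, in the same or different regions.
  - Failover is manual (initiated by the application or an administrator) but fast.
  - Best for mission-critical apps needing global availability.
- Auto-Failover Groups (SQL Database and SQL Managed Instance)
  - Automates failover between primary and secondary, based on the configured failover policy.
  - The read-write and read-only listener endpoints stay the same after failover, so connection strings do not need to change.
  - Supports group-level replication (multiple DBs fail over together).
Best Use Cases:
- Banking apps requiring minimal data loss.
- SaaS platforms with global users.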
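To see why connection strings survive a failover, it helps to look at the listener host names that a failover group exposes for Azure SQL Database. The sketch below builds both endpoints and an ODBC-style connection string for each. The group name `travel-fog`, the database name `bookings`, and the helper functions are hypothetical, authentication settings are left out, and SQL Managed Instance uses a different host name pattern that is not covered here.

```python
def failover_group_endpoints(group_name: str) -> dict[str, str]:
    """Return the listener host names for an Azure SQL Database failover group."""
    return {
        # Always points at whichever server is currently primary.
        "read_write": f"{group_name}.database.windows.net",
        # Points at the readable secondary.
        "read_only": f"{group_name}.secondary.database.windows.net",
    }


def odbc_connection_string(host: str, database: str, read_only: bool = False) -> str:
    """Build an ODBC-style connection string that targets a listener, not a server."""
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server=tcp:{host},1433",
        f"Database={database}",
        "Encrypt=yes",
    ]
    if read_only:
        # Declares read intent for connections sent to the read-only listener.
        parts.append("ApplicationIntent=ReadOnly")
    return ";".join(parts) + ";"


endpoints = failover_group_endpoints("travel-fog")  # hypothetical group name
primary_conn = odbc_connection_string(endpoints["read_write"], "bookings")
reporting_conn = odbc_connection_string(endpoints["read_only"], "bookings", read_only=True)
```

Because the application connects to the listener rather than to a named server, a failover changes where the listener points without touching application configuration.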
2. Cosmos DB
Geo-Redundancy Features:
- Multi-region replication, with a 99.999% availability SLA for accounts that span multiple regions.
- Multi-master (multi-region write) support → active-active across regions.
- Configurable consistency levels (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual). Strong is not available on accounts with multiple write regions.
- Failover can be automatic (service-managed) or manual.
Best Use Cases:
- Global e-commerce platforms.
- IoT platforms needing local region writes.
- Apps requiring low latency worldwide.
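The five consistency levels form a spectrum from strongest to weakest, and one constraint is worth remembering: Strong consistency cannot be combined with multiple write regions. The sketch below models that rule in plain Python. The `Consistency` enum and `validate_account` function are hypothetical names used for illustration and are not part of any Azure SDK.

```python
from enum import IntEnum


class Consistency(IntEnum):
    # Ordered from strongest (0) to weakest (4).
    STRONG = 0
    BOUNDED_STALENESS = 1
    SESSION = 2
    CONSISTENT_PREFIX = 3
    EVENTUAL = 4


def validate_account(write_regions: list[str], level: Consistency) -> None:
    """Raise ValueError if the region layout and consistency level conflict."""
    if not write_regions:
        raise ValueError("at least one write region is required")
    if len(write_regions) > 1 and level is Consistency.STRONG:
        raise ValueError(
            "Strong consistency is not supported with multiple write regions"
        )


# An active-active layout with Session consistency passes the check.
validate_account(["East US", "West Europe", "Southeast Asia"], Consistency.SESSION)
```

Choosing a weaker level generally trades read freshness for lower latency and higher availability, which is why the level is a design decision and not a default to accept blindly.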
3. Azure Storage
Replication Options:
- LRS (Locally Redundant Storage): 3 copies in one datacenter.
- ZRS (Zone-Redundant Storage): 3 copies across availability zones in one region.
- GRS (Geo-Redundant Storage): LRS in the primary region plus an asynchronous copy to the paired region.
- RA-GRS (Read-Access GRS): GRS plus read access to the secondary region.
Best Use Cases:
- Archival, backup, compliance storage.
- Business-critical apps requiring regional DR.
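The practical difference between GRS and RA-GRS shows up in the endpoints an application can reach. As a rough sketch, the snippet below maps each option to its storage account SKU name and returns the Blob endpoints available under it. The account name `travelmedia` and the `blob_endpoints` helper are hypothetical.

```python
# Storage account SKU names for the four options discussed here.
REDUNDANCY_SKUS = {
    "LRS": "Standard_LRS",
    "ZRS": "Standard_ZRS",
    "GRS": "Standard_GRS",
    "RA-GRS": "Standard_RAGRS",
}


def blob_endpoints(account: str, option: str) -> dict[str, str]:
    """Return the Blob service endpoints that are reachable for a redundancy option."""
    if option not in REDUNDANCY_SKUS:
        raise ValueError(f"unknown redundancy option: {option}")
    endpoints = {"primary": f"https://{account}.blob.core.windows.net"}
    if option == "RA-GRS":
        # Only read-access geo-redundancy exposes a readable secondary endpoint.
        endpoints["secondary"] = f"https://{account}-secondary.blob.core.windows.net"
    return endpoints


grs = blob_endpoints("travelmedia", "GRS")        # hypothetical account name
ra_grs = blob_endpoints("travelmedia", "RA-GRS")
```

With GRS the secondary copy exists but cannot be read until a failover happens; with RA-GRS the `-secondary` endpoint is readable at any time.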
Design Considerations
- RPO/RTO Alignment
  - SQL Geo-replication → asynchronous; RPO is typically about 5 seconds.
  - Cosmos DB multi-master → near-zero RTO and low RPO (replication between regions is asynchronous).
  - GRS → asynchronous replication; RPO is typically under 15 minutes, with no SLA on replication lag.
- Failover Strategy
  - SQL Auto-Failover Groups for automation.
  - Cosmos DB automatic regional failover.
  - RA-GRS for read workloads during outages.
- Cost vs Criticality
  - Global replication = higher cost.
  - Use for workloads where downtime/data loss is unacceptable.
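One way to make RPO/RTO alignment concrete is to treat each replication option as a record and filter by the workload's targets. The sketch below does that with illustrative planning figures only: the RPO values and the `automatic_failover` flags are assumptions for the example, not guarantees, and should be replaced with figures from the current service documentation and SLAs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationOption:
    name: str
    typical_rpo_seconds: float   # assumed planning figure, not an SLA
    automatic_failover: bool     # assumed default behaviour for this sketch


OPTIONS = [
    ReplicationOption("SQL active geo-replication", 5, False),
    ReplicationOption("SQL auto-failover group", 5, True),
    ReplicationOption("Storage GRS / RA-GRS", 15 * 60, False),
]


def candidates(max_rpo_seconds: float, need_automatic_failover: bool) -> list[ReplicationOption]:
    """Return the options whose assumed RPO and failover mode meet the targets."""
    return [
        option
        for option in OPTIONS
        if option.typical_rpo_seconds <= max_rpo_seconds
        and (option.automatic_failover or not need_automatic_failover)
    ]


# A workload that tolerates 10 seconds of data loss and needs hands-off failover.
matches = candidates(max_rpo_seconds=10, need_automatic_failover=True)
```

Writing the targets down this way also exposes the cost question: the stricter the RPO and failover requirements, the fewer (and usually more expensive) options remain.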
Example Enterprise Scenario
A travel booking company requires:
- The SQL database must fail over automatically if a region outage occurs.
- Cosmos DB must support global active-active writes for users in the US, EU, and Asia.
- Blob storage must provide read access in the secondary region during outages.
Correct design:
- Use SQL Auto-Failover Groups for automated DR.
- Deploy Cosmos DB multi-region, multi-master.
- Enable RA-GRS for Blob Storage.
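For the Blob Storage requirement, read access in the secondary region only helps if the client actually falls back to it. The sketch below shows the idea with a generic helper that tries endpoints in order. The `fetch` callable, the account name, and the endpoint list are hypothetical stand-ins; a real application would more likely rely on the retry options of its storage client library than on hand-written fallback.

```python
from typing import Callable, Sequence


def read_with_fallback(
    fetch: Callable[[str], bytes], endpoints: Sequence[str]
) -> tuple[str, bytes]:
    """Try each endpoint in order and return (endpoint, data) from the first success."""
    last_error = None
    for endpoint in endpoints:
        try:
            return endpoint, fetch(endpoint)
        except OSError as exc:  # covers ConnectionError and timeouts
            last_error = exc
    raise ConnectionError("all endpoints failed") from last_error


def fake_fetch(endpoint: str) -> bytes:
    # Simulates a regional outage: only the secondary endpoint answers.
    if "-secondary" not in endpoint:
        raise ConnectionError("primary region unavailable")
    return b"itinerary.pdf contents"


served_by, data = read_with_fallback(
    fake_fetch,
    [
        "https://travelmedia.blob.core.windows.net",            # hypothetical account
        "https://travelmedia-secondary.blob.core.windows.net",
    ],
)
```

Because replication to the secondary is asynchronous, data read this way can be slightly behind the primary, so the fallback suits reads that tolerate some staleness.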
Confusion Buster
- SQL Geo-Replication vs Failover Groups
  - Geo-replication = manual failover.
  - Failover groups = automatic.
- Cosmos DB vs SQL
  - Cosmos = multi-master global scale.
  - SQL = structured relational workloads.
- GRS vs RA-GRS
  - GRS = copy only.
  - RA-GRS = copy + read access.
Exam Tips
- “Which SQL option automates failover between primary and secondary?” → Auto-Failover Groups.
- “Which DB supports multi-master global writes?” → Cosmos DB.
- “Which storage option provides read access in secondary region?” → RA-GRS.
- “Which DB option provides configurable consistency levels?” → Cosmos DB.
What to Expect in the Exam
- Direct Q: “Which service provides automatic failover for multiple SQL DBs?” → Auto-Failover Groups.
- Scenario Q: “Company needs multi-region active-active writes.” → Cosmos DB multi-master.
- Scenario Q: “Storage must be readable in secondary region.” → RA-GRS.
- Trick Q: “Geo-replication in SQL Database automatically fails over without configuration.” → False.