AWS Unpacked #5: Managed Database Services

A Tour of AWS Database Services: What to Use, When, and Why

TL;DR
AWS offers a range of fully managed database services — from traditional relational databases like RDS and Aurora, to scalable NoSQL options like DynamoDB and DocumentDB. These services handle provisioning, scaling, backups, and high availability so you can focus on your application, not database admin.

Amazon RDS

Amazon RDS (Relational Database Service) is a managed database service that makes it easier to set up, operate, and scale a relational database in the cloud — without managing the underlying hardware, OS, or backup scripts. It supports multiple engines:

  • PostgreSQL
  • MySQL
  • MariaDB
  • Oracle
  • SQL Server
  • Amazon Aurora (MySQL- and PostgreSQL-compatible)

Although RDS is popular because of its “managed service” benefits (such as automated provisioning, OS patching, automated backups), some limitations are:

  • You can’t SSH into RDS instances
  • Limited control over file system or OS-level config
  • Higher cost compared to self-hosted DBs (but you’re paying for time and peace of mind)

Now, let’s focus on some of the features of RDS.

RDS Auto-Scaling

Storage Auto-Scaling automatically increases storage when:

  • Free space drops below 10%
  • For at least 5 minutes
  • And no changes in size for 6 hours

You must define a maximum storage limit.

Supported for: All RDS database engines. (Aurora does not use this feature; its storage grows automatically on its own.)

Use case: When workloads grow unpredictably (e.g., growing app data or analytics workloads), you avoid outages due to full disks.
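
To make the trigger rule concrete, here is a minimal sketch of the three conditions as a single check. The `StorageState` class and its field names are made up for illustration; RDS evaluates this internally and does not expose such an object.

```python
from dataclasses import dataclass


@dataclass
class StorageState:
    """Hypothetical snapshot of an instance's storage (illustrative names only)."""
    allocated_gib: float
    free_gib: float
    minutes_below_threshold: float  # how long free space has been under 10%
    hours_since_last_resize: float
    max_allocated_gib: float        # the maximum storage limit you must define


def should_autoscale(s: StorageState) -> bool:
    """Return True only when every condition listed above holds."""
    free_ratio = s.free_gib / s.allocated_gib
    return (
        free_ratio < 0.10
        and s.minutes_below_threshold >= 5
        and s.hours_since_last_resize >= 6
        and s.allocated_gib < s.max_allocated_gib  # never grow past the limit
    )
```

Note that all conditions must hold together: a disk that is 95% full but was resized two hours ago will not grow again yet.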

Read Replicas

  • Up to 15 replicas
  • Within same AZ, cross-AZ, or cross-region
  • Replication is asynchronous
  • Replicas can be promoted to standalone DBs
  • Useful for:
    • Read-heavy apps
    • Analytics queries
    • Disaster recovery (if cross-region)
  • Note: Replicas are read-only (SELECT only; no INSERT/UPDATE/DELETE)
  • Note: If a replica is promoted, apps must update their connection string to point to it
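
Because replicas are read-only, the application (or a library it uses) has to decide where each statement goes. The following is a rough sketch of that split, assuming a hypothetical `ReadWriteRouter` class and placeholder hostnames; it only looks at the first keyword of the statement, which is much cruder than what a real driver or proxy would do.

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs to replicas in round-robin order, everything else to the primary."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql: str) -> str:
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        # Only plain SELECTs are safe to send to a read-only replica.
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter("primary.example.internal",
                         ["replica-1.example.internal", "replica-2.example.internal"])
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET paid = true"))
```

Since replication is asynchronous, a read sent to a replica may briefly lag behind a write that just went to the primary.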

Network costs (replication traffic)

  • Same region (same AZ or cross-AZ): No data transfer charge for replicating to the read replica
  • Cross-region: Data transfer is charged

Multi-AZ deployments

  • Enables high availability / disaster recovery, not scaling
  • AWS creates a standby replica in another AZ (synchronously replicated)
  • You get a single DNS endpoint
  • In case of failure, failover is automatic
  • No app change needed (Multi-AZ keeps the same connection string regardless of which database is up)
  • Not for performance, just DR/HA
  • Can you have Multi-AZ Read Replicas? Yes — especially for DR (e.g., Multi-AZ replica in another region)
  • Converting single-AZ to Multi-AZ: RDS automatically takes a snapshot in the background and restores it in the new AZ (you don’t do this manually). The database does not need to be stopped, but performance may be affected while the conversion runs

Configuration options when first creating an instance:

  • Engine: Choose DB engine (e.g. Postgres)
  • Availability: Single-AZ or Multi-AZ
  • Instance Class: Choose compute (e.g. db.t3.medium)
  • Storage Type:
    • gp2/gp3 (SSD, general purpose)
    • io1 (Provisioned IOPS)
    • magnetic (deprecated)
  • Storage auto-scaling: Optional
  • Security Groups + Port Config: Control access like EC2

Authentication

  • IAM-based authentication: works with MySQL, PostgreSQL, and MariaDB (not supported for the other engines). Use cases:
    • Developer or analyst access without needing to manage passwords
    • Short-lived EC2-based apps that need secure DB access using IAM roles
    • Environments where secrets management needs to be avoided or simplified
  • Username/password
  • Kerberos (for SQL Server/Oracle)
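
With IAM authentication, the "password" is a short-lived token (valid for 15 minutes) that the app requests just before connecting. Below is a minimal sketch, assuming `rds_client` is a boto3 RDS client (`boto3.client("rds")`) passed in by the caller; the helper name `iam_connect_kwargs` and the returned key names are illustrative and follow PostgreSQL driver conventions.

```python
def iam_connect_kwargs(rds_client, host: str, port: int, user: str, region: str) -> dict:
    """Build connection settings that use an IAM auth token instead of a stored password.

    rds_client is expected to be boto3.client("rds"); it is injected so the
    function stays easy to test without AWS credentials.
    """
    token = rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": token,     # the token takes the place of the password
        "sslmode": "require",  # IAM database authentication needs an encrypted connection
    }
```

The resulting dictionary could then be handed to a PostgreSQL driver's connect call; the database user must also be set up for IAM authentication on the database side.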

Monitoring

  • CloudWatch
  • Enhanced Monitoring (OS-level)
  • Performance Insights (query performance)

Snapshots

  • Manual snapshots: user-initiated, retained until deleted
  • Automated backups: enabled by default, retained for up to 35 days. Transaction logs are backed up every 5 minutes - enables point-in-time recovery.
  • Used for point-in-time restore or cloning environments
  • Cost-saving tip: if you won’t use a database for a long time, take a manual snapshot and delete the instance, so you pay only for snapshot storage. Restore the snapshot when you need the database again. Great for dev/test environments and occasional workloads.

Security

  • Encryption:

    At rest:

    • Master and replicas use AWS KMS, must be defined at launch time
    • If master is not encrypted, read replicas cannot be encrypted
    • To encrypt an unencrypted db, take a snapshot and restore as encrypted

    In-flight:

    • TLS-ready by default, use AWS TLS root certificates client-side
  • Authentication
    • Use IAM roles to connect to db (instead of username and password)
  • Network access
    • Security Groups: control network access to your RDS instance
  • Monitoring
    • Audit logs can be sent to CloudWatch
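
The "snapshot and restore as encrypted" path for an unencrypted instance can be sketched as an ordered plan of API calls. The operation and parameter names below are the boto3 RDS client ones as I understand them; the function `encryption_plan`, the snapshot naming scheme, and the intermediate encrypted snapshot copy (where the KMS key is supplied) are assumptions of this sketch. It only builds the plan and does not call AWS.

```python
def encryption_plan(instance_id: str, kms_key_id: str) -> list[tuple[str, dict]]:
    """Return the ordered (operation, parameters) steps; nothing is executed here."""
    plain_snap = f"{instance_id}-plain-snap"          # hypothetical naming scheme
    encrypted_snap = f"{instance_id}-encrypted-snap"
    return [
        # 1. Snapshot the unencrypted instance.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": instance_id,
            "DBSnapshotIdentifier": plain_snap,
        }),
        # 2. Copy the snapshot, supplying a KMS key so the copy is encrypted.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snap,
            "TargetDBSnapshotIdentifier": encrypted_snap,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new, encrypted instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{instance_id}-encrypted",
            "DBSnapshotIdentifier": encrypted_snap,
        }),
    ]


for operation, params in encryption_plan("orders-db", "alias/my-key"):
    print(operation, params)
```

In practice each step has to finish before the next one starts, and the application must be repointed to the new instance afterwards.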

RDS Custom

Amazon RDS Custom is a special deployment option for Oracle and SQL Server databases where you still get many benefits of a managed RDS service (like backups, monitoring, Multi-AZ), but also get full access to the underlying OS and DB instance — like you would on EC2. It’s ideal when your application requires:

  • Custom configurations
  • Third-party agents or drivers
  • Manual patching
  • Legacy dependencies

With RDS Custom, you can:

  • Connect via SSH or SSM Session Manager
  • Get full admin access (including sysdba for Oracle, sysadmin for SQL Server)
  • Manually install patches, drivers, or monitoring tools
  • Customize OS-level or database-level configurations

So it’s a blend of RDS convenience + EC2-like control.

RDS Custom includes something called “Automation Mode”, which controls AWS automation like backups, monitoring, failover, etc.

To safely customize your instance (especially for manual patching, installing software, or making deep config changes), switch Automation Mode to OFF and take a manual DB snapshot before making changes, so you can recover if something goes wrong. Note: When Automation Mode is OFF, RDS will not perform backups or maintenance, so remember to re-enable it afterwards.

Use cases for RDS Custom:

  • Legacy enterprise apps that require OS-level tuning
  • Compliance-driven workloads that need custom security software
  • Databases with complex replication or backup tools
  • Migrating on-prem SQL Server/Oracle where exact match of config is required

RDS Proxy

RDS Proxy is a managed database proxy that sits between your application and your RDS database. You can think of it like a smart middleman that helps your app connect to the database more efficiently and securely — especially under high load or during failovers. It’s useful because:

  • Apps don’t always manage database connections well (especially with lots of users).
  • During a database failover (like when a Multi-AZ setup switches to a standby), connections can be dropped or delayed.
  • RDS Proxy helps keep things fast, stable, and secure without major changes to your app. In most cases, you only need to update your database connection string to point to the proxy endpoint instead of the DB directly.

RDS Proxy supports:

  • RDS for MySQL, PostgreSQL, MariaDB, and SQL Server
  • Amazon Aurora (MySQL-compatible and PostgreSQL-compatible)

Key Features of RDS Proxy

Feature | Description
High availability | Works across multiple AZs for automatic failover
🔁 Connection pooling | Reuses database connections to reduce load and startup time
Faster failovers | Reduces failover time from ~30 seconds to a few seconds
📈 Scales automatically | Manages thousands of connections efficiently
🔐 Improved security | No need to store DB passwords in your app; uses AWS Secrets Manager and supports IAM authentication
🔒 Never publicly accessible | Always inside your VPC; can’t be accessed from the public internet
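
Connection pooling is the core idea here, so a toy version may help. This is a simplified sketch of the general technique, not how RDS Proxy is implemented; `ConnectionPool` and the `connect` callable are illustrative names.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Open a fixed number of connections once and lend them out on demand."""

    def __init__(self, connect, size: int):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # pay the connection setup cost up front

    @contextmanager
    def connection(self, timeout: float = 5.0):
        conn = self._idle.get(timeout=timeout)  # waits if every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


opened = []

def fake_connect():
    """Stand-in for a real database connect call."""
    opened.append(object())
    return opened[-1]

pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # run a query with conn here

print(len(opened))  # number of real connections opened for 10 requests
```

Ten requests are served by two underlying connections, which is the effect the database sees when a proxy sits in front of a burst of clients.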

Authentication

  • You can enforce IAM authentication to control which apps or users can connect.
  • Instead of hardcoding DB credentials, you store them securely in AWS Secrets Manager — and RDS Proxy uses them automatically.

Amazon Aurora

Amazon Aurora is a fully managed relational database built by AWS specifically for the cloud. It’s compatible with MySQL and PostgreSQL, and AWS cites up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL, thanks to Aurora’s cloud-native architecture.

Aurora comes in two main deployment models:

Aurora (Provisioned)

This is the traditional model where you choose and manage instance sizes, much like with RDS. You pick the compute capacity you want, and AWS handles the rest — patching, backups, replication, and failover.

Aurora Serverless

Aurora Serverless (v1 and v2) is a more dynamic option where compute capacity automatically scales up or down based on usage. You don’t need to manage instances — it’s ideal for unpredictable workloads, development environments, or infrequent usage.

  • v1: Suitable for bursty, low-usage apps. It starts from zero and scales in “ACUs” (Aurora Capacity Units), but has slower scaling and cold start delays.
  • v2: Much more responsive, supports fine-grained scaling and instant failover, and can handle production workloads more efficiently.

Deployment Options

Deployment Type | Best For | Key Characteristics
Provisioned Aurora | Steady, predictable workloads | Fixed instance size; predictable cost and performance
Aurora Serverless v1 | Dev/test or spiky low-traffic apps | Scales in chunks; might experience cold starts
Aurora Serverless v2 | Variable production workloads | Smooth auto-scaling with fast failover; more mature for production use

Instance Classes (Provisioned Only)

  • Memory Optimized: db.r5, db.r6g, db.r6i, db.x2g (the db.r and db.x families are memory-optimized classes, suited to most production and large in-memory workloads)
  • Burstable: db.t3, db.t4g — cost-effective for light workloads
  • (Aurora Serverless doesn’t require you to choose an instance class — it handles this automatically.)

Key Features

Feature | Description
Cloud-Optimized | Designed from scratch for high scalability, durability, and availability
💾 Smart Storage | Automatically scales in 10 GB increments up to 128 TB, with no downtime
🧠 Striped I/O | Data is striped across hundreds of volumes and 6-way replicated across 3 AZs
📚 Read Replicas | Up to 15 Aurora Replicas with low-latency read scaling. Aurora can auto-scale replicas based on CPU or connections
🚀 Fast Failover | Failover to a replica takes under 30 seconds (instant with Serverless v2)
🛡️ High Availability | 6 copies of your data across 3 AZs: 4 needed for writes, 3 for reads
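
The "4 needed for writes, 3 for reads" rule is quorum arithmetic, and it is easy to check by hand. A small sketch, assuming the 6 copies are spread evenly as 2 per AZ:

```python
COPIES = 6        # assumed layout: 2 copies in each of 3 AZs
WRITE_QUORUM = 4  # copies that must acknowledge a write
READ_QUORUM = 3   # copies that must be reachable for a read


def can_write(healthy_copies: int) -> bool:
    return healthy_copies >= WRITE_QUORUM


def can_read(healthy_copies: int) -> bool:
    return healthy_copies >= READ_QUORUM


def quorums_overlap() -> bool:
    # If R + W > N, every read set shares at least one copy with every write set,
    # so a read always reaches a copy that saw the latest write.
    return READ_QUORUM + WRITE_QUORUM > COPIES


for lost in range(5):
    healthy = COPIES - lost
    print(f"lost={lost} can_write={can_write(healthy)} can_read={can_read(healthy)}")
```

So writes survive the loss of 2 copies (for example a whole AZ under the even-split assumption), and reads survive the loss of 3.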

When to Use Aurora

  • You want the performance of commercial databases like Oracle or SQL Server but at open-source cost
  • You need high availability and durability with minimal ops overhead
  • You’re building SaaS apps, analytics platforms, or e-commerce systems that require both scalability and resilience
  • You want to scale to thousands of connections or unpredictable workloads (Aurora Serverless v2 fits here)

Aurora vs RDS

Aurora costs more than standard RDS (typically 20-30% higher). But you get higher performance, faster failover, and better scalability.

Endpoints

Aurora uses different types of endpoints to manage traffic efficiently:

  • Writer Endpoint: Points to the primary (master) instance and handles all write operations. Only one instance can receive writes at a time.
  • Reader Endpoint: Automatically distributes read traffic across all available replicas. This enables connection load balancing for read-heavy workloads without managing multiple endpoints.
  • Custom Endpoints: You can define custom endpoints to target a specific group of replicas. Useful when you want certain queries (e.g., analytics) to run only on large instances or isolated nodes.
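
On the application side, using these endpoints mostly comes down to picking the right hostname per type of work. A minimal sketch, assuming a hypothetical `Workload` enum and placeholder hostnames (real endpoint names come from your cluster):

```python
from enum import Enum


class Workload(Enum):
    WRITE = "write"
    READ = "read"
    ANALYTICS = "analytics"


def pick_endpoint(workload: Workload, endpoints: dict) -> str:
    """Map a kind of work to the writer, reader, or a custom endpoint."""
    if workload is Workload.WRITE:
        return endpoints["writer"]
    # Heavy analytics go to a custom endpoint if one has been defined.
    if workload is Workload.ANALYTICS and "analytics" in endpoints:
        return endpoints["analytics"]
    return endpoints["reader"]


endpoints = {
    "writer": "writer.example.internal",        # placeholder hostnames
    "reader": "reader.example.internal",
    "analytics": "analytics.example.internal",  # a custom endpoint for large replicas
}
print(pick_endpoint(Workload.ANALYTICS, endpoints))
```

If no custom endpoint exists, analytics queries simply fall back to the shared reader endpoint.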

Global Aurora

Global Aurora is a feature that lets you span a single Aurora database across multiple AWS regions:

  • What it is: A multi-region deployment where one region acts as the primary (writer) and other regions have read-only replicas.
  • Use cases:
    • Low-latency reads for global applications
    • Disaster recovery
    • Business continuity
  • Replication lag is typically less than 1 second.
  • Recovery Time Objective (RTO) is under 1 minute.
  • You can replicate to up to 5 secondary regions, with up to 16 replicas per region.
  • You can promote a secondary region to become the primary manually.

Aurora Serverless

Aurora Serverless is an on-demand, auto-scaling configuration of Aurora: you don’t provision or manage database instances, and Aurora adjusts compute capacity automatically in response to application load. You pay only for the compute and storage actually consumed, rather than paying for idle capacity as you would with provisioned Aurora.

Use cases

  • Infrequently used applications
  • Development, testing, or QA environments
  • Unpredictable or spiky workloads

Benefits

  • Pay-per-use (billed per second)
  • Auto-pause and resume to save cost
  • No instance management
  • Available for both MySQL and PostgreSQL versions of Aurora.
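
Per-second billing means the compute bill follows the capacity actually used over time. Here is a rough sketch of the arithmetic; the function name, the usage trace, and the price of 1.0 per ACU-hour are placeholders, not real AWS prices.

```python
def compute_cost(usage: list[tuple[float, int]], price_per_acu_hour: float) -> float:
    """Sum ACU-seconds over (acus, seconds) segments and convert to a cost."""
    acu_seconds = sum(acus * seconds for acus, seconds in usage)
    return acu_seconds / 3600 * price_per_acu_hour


# Hypothetical hour: 30 minutes at 2 ACUs, then 30 minutes paused (0 ACUs).
usage = [(2.0, 1800), (0.0, 1800)]
placeholder_price = 1.0  # NOT a real price; look up current pricing for your region

serverless = compute_cost(usage, placeholder_price)
always_on = compute_cost([(2.0, 3600)], placeholder_price)
print(serverless, always_on)
```

With this made-up trace, the paused half hour costs nothing in compute, which is where the savings for infrequently used databases come from.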

Aurora Backups

  • Automated Backups:
    • Retention from 1 to 35 days (cannot be disabled)
    • Supports Point-in-Time Recovery (PITR)
  • Manual Snapshots: Can be created on demand
  • S3 Integration: You can restore a MySQL Aurora cluster from a backup stored in Amazon S3 using Percona XtraBackup.

Aurora Database cloning

You can clone an existing Aurora database to create a separate copy almost instantly:

  • Uses a copy-on-write protocol, so only changes consume additional storage
  • Perfect for testing, dev, and analytics without disrupting production
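
Copy-on-write is easier to see in a toy model. The sketch below is an illustration of the general idea only, not Aurora's storage engine; the `Volume` class and its methods are invented for this example.

```python
class Volume:
    """Toy storage volume: a shared read-only base plus this volume's own changes."""

    def __init__(self, base=None):
        self._base = base if base is not None else {}  # shared pages, never mutated once shared
        self._delta = {}                               # pages written by this volume only

    def read(self, page_no):
        if page_no in self._delta:
            return self._delta[page_no]
        return self._base[page_no]

    def write(self, page_no, data):
        self._delta[page_no] = data  # the change is stored privately

    def clone(self) -> "Volume":
        # Freeze the current view into a base shared by source and clone.
        # (A toy shortcut: a real system shares pages without rebuilding a map.)
        shared = {**self._base, **self._delta}
        self._base, self._delta = shared, {}
        return Volume(base=shared)

    def extra_pages(self) -> int:
        """Pages that consume additional storage beyond the shared base."""
        return len(self._delta)


prod = Volume()
prod.write(1, "customers")
prod.write(2, "orders")

dev = prod.clone()
dev.write(1, "customers-test")
print(prod.read(1), dev.read(1), dev.extra_pages())
```

The clone starts with zero extra pages and only the page it modifies takes up new space, while production keeps seeing its own data.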

Aurora Security

Feature | Description
Encryption at Rest | Uses AWS KMS; must be enabled at DB creation
Encryption in Transit | TLS-enabled by default; client must use AWS TLS root certs
IAM Authentication | Apps can connect using IAM roles instead of username/password
Security Groups | Used to control inbound/outbound traffic to Aurora DB cluster
Audit Logs | Can be sent to CloudWatch Logs for visibility and compliance

To encrypt an unencrypted Aurora DB:

  • Take a snapshot
  • Restore it as an encrypted DB (encryption can’t be added to existing DBs directly)

Aurora Machine Learning

Allows running ML inference directly from SQL.

  • Supports Amazon SageMaker and Amazon Comprehend
  • Example use cases: Fraud detection, Sentiment analysis, Recommendation systems

Babelfish for Aurora PostgreSQL

  • Lets Aurora PostgreSQL understand SQL Server’s T-SQL dialect
  • Apps written for Microsoft SQL Server can run with few or no code changes
  • Useful for migrating away from SQL Server while preserving application logic

NoSQL databases in AWS

NoSQL databases are designed for flexible data models, horizontal scaling, and low-latency access. AWS offers a range of managed NoSQL services, each optimized for a specific data model and access pattern. Here’s a breakdown of what you need to know.


Amazon DynamoDB

A fully managed, serverless NoSQL key-value and document database that offers single-digit millisecond performance at any scale. Supports automatic scaling, on-demand backup, and multi-Region replication.

Use cases: Shopping carts, session stores, real-time analytics, serverless apps.

We’ll cover full DynamoDB details in the blog about Serverless, so for now, just note that it’s AWS’s flagship NoSQL option for general-purpose key-value/document workloads.


Amazon DocumentDB (with MongoDB compatibility)

A fully managed document database designed to store, query, and index JSON-like documents. It’s compatible with MongoDB APIs, meaning you can migrate MongoDB apps with minimal changes — but it is not a MongoDB engine under the hood.

Use cases

Content management systems (CMS), user profiles, catalogs, mobile app data stores

Important notes

  • DocumentDB is not open-source MongoDB, but supports the MongoDB 3.6, 4.0, and 5.0 APIs.
  • It’s highly available, with replica sets, automated backups, and multi-AZ support.
  • Does not support MongoDB features like joins, but handles many core MongoDB use cases.
  • Storage automatically grows in increments of 10 GB.
  • Automatically scales to workloads with millions of requests per second.

Amazon Neptune

A fully managed graph database for highly connected data. Supports two major graph models:

  • Property Graph (Gremlin)
  • RDF (SPARQL)

Use cases: Social networks (friend-of-a-friend), fraud detection, recommendation engines, knowledge graphs

Important things to remember:

  • Neptune supports ACID transactions, multi-AZ deployments, and read replicas.
  • You choose query language based on model: Gremlin for property graphs, SPARQL for semantic graphs.
  • It’s a specialized tool for connected datasets, not general-purpose querying.
  • Neptune Streams is a real-time, ordered sequence of every change to your graph data. It has no duplicates, keeps strict order, and is available through an HTTP REST API.

Amazon Keyspaces (for Apache Cassandra)

A serverless, scalable, and fully managed database that supports Apache Cassandra Query Language (CQL). You can use existing Cassandra tools and drivers, but it’s managed by AWS.

Use cases: Time-series data, messaging apps, IoT data storage, applications already built with Cassandra

Notes:

  • No need to manage clusters, capacity planning, or patching.
  • On-demand or provisioned throughput.
  • High availability and durability built in, across multiple AZs.

Amazon QLDB (Quantum Ledger Database)

A fully managed ledger database that provides a cryptographically verifiable transaction log — meaning you can trust that the data hasn’t been tampered with.

Use cases: Financial systems, audit trails, supply chain systems, registries (e.g. vehicle registration)

Notes:

  • It’s not blockchain, but a centralized ledger with cryptographic integrity.
  • Immutable: you cannot update or delete history.
  • Supports a SQL-like query language called PartiQL.
  • It’s serverless and scales automatically.
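
To get a feel for what "cryptographically verifiable" means, here is a deliberately simplified sketch: a hash chain in which each entry's hash covers the previous entry's hash. This is a stand-in technique chosen for brevity and is not QLDB's actual journal format; the function names and record fields are invented.

```python
import hashlib
import json

GENESIS = "0" * 64  # starting value for the chain


def _digest(prev_hash: str, payload: dict) -> str:
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


def append(journal: list, payload: dict) -> None:
    """Add an entry whose hash depends on everything written before it."""
    prev = journal[-1]["hash"] if journal else GENESIS
    journal.append({"payload": payload, "hash": _digest(prev, payload)})


def verify(journal: list) -> bool:
    """Recompute every hash; any edit to history breaks the chain."""
    prev = GENESIS
    for entry in journal:
        if entry["hash"] != _digest(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


journal = []
append(journal, {"vin": "ABC123", "owner": "alice"})
append(journal, {"vin": "ABC123", "owner": "bob"})
print(verify(journal))

journal[0]["payload"]["owner"] = "mallory"  # tamper with history
print(verify(journal))
```

Changing an old record invalidates its stored hash and, through the chain, everything after it, which is what makes tampering detectable.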

Amazon Timestream

A serverless time series database, optimized for storing and querying time-stamped data — like metrics, logs, and sensor data.

Use cases: IoT device telemetry, app performance metrics, DevOps monitoring, Industrial and manufacturing analytics

Remember:

  • Timestream automatically moves old data to a cheaper storage tier.
  • Serverless, scales based on usage.
  • Supports SQL-based queries.
  • Integrates with AWS IoT Core, CloudWatch, Kinesis, and more.
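
The tiering behaviour can be pictured as a simple age check against two retention periods. A minimal sketch, assuming a memory tier for recent data and a cheaper magnetic tier for older data; the function name and the retention values are illustrative, and Timestream applies this policy for you.

```python
from datetime import datetime, timedelta, timezone


def storage_tier(record_time: datetime, now: datetime,
                 memory_retention: timedelta,
                 magnetic_retention: timedelta) -> str:
    """Classify a record by age: recent data stays hot, older data moves to cheaper storage."""
    age = now - record_time
    if age <= memory_retention:
        return "memory"
    if age <= magnetic_retention:
        return "magnetic"
    return "expired"


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
hot_for = timedelta(hours=24)   # illustrative retention settings
keep_for = timedelta(days=365)

print(storage_tier(now - timedelta(hours=2), now, hot_for, keep_for))
print(storage_tier(now - timedelta(days=30), now, hot_for, keep_for))
```

Queries look the same regardless of tier; the retention settings only change where the data lives and what it costs to keep.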

NoSQL - comparisons of various services

Service | Best for | Notes
DynamoDB | General-purpose NoSQL | Key-value & document; serverless
DocumentDB | JSON document stores | MongoDB API-compatible
Neptune | Graph databases | Gremlin/SPARQL
Keyspaces | Cassandra-compatible workloads | CQL, serverless
QLDB | Immutable audit trail | Cryptographic verification
Timestream | Time-series data | Optimized for high-ingest/temporal queries

About the Author

Dawie Loots is a data scientist with a keen interest in using technology to solve real-world problems. He combines data science expertise with his background as a Chartered Accountant and executive leader to help organizations build scalable, value-driven solutions.
