Introducing S3
Amazon S3 (which stands for Simple Storage Service) is AWS's object storage solution, and one of the oldest and most fundamental services in the AWS ecosystem. It's designed to store and retrieve any amount of data from anywhere on the internet, at virtually any scale, and it's often the first storage service people encounter in AWS.
Main Use Cases
S3 is incredibly versatile. Some of the most common use cases include:
- Hosting static websites (HTML, CSS, JS files)
- Backups and disaster recovery
- Big data and analytics storage
- Media hosting (images, videos, documents)
- Data lakes
- Software distribution and installer storage
Object Storage, Not Block Storage
S3 is an object storage service — this is different from block storage like EBS or file storage like EFS:
- In S3, data is stored as objects in buckets.
- Each object contains the data itself, a unique key, and optional metadata.
- Unlike block storage, there’s no concept of a “drive” or “mount point.”
Keys, Prefixes, and the “Folder Illusion”
Every file in S3 is stored with a key, which is simply its full path:
photos/vacation/beach.jpg
Here, photos/vacation/ is the prefix, and beach.jpg is the object name.
There are no true directories in S3 - it's a flat structure. The AWS Management Console pretends there are folders for user convenience by interpreting slashes (/) in the key names. So when you "create a folder" in the console, AWS is just prepending a string like myfolder/ to the object keys behind the scenes.
Metadata, Tags, and Versioning
- Each object can have metadata — both system-defined (e.g., content type) and user-defined.
- You can also assign tags (key-value pairs) to objects, which are useful for lifecycle policies, cost allocation, and more.
- Versioning can be enabled at the bucket level to store multiple versions of the same object — great for rollback or undelete scenarios.
Some rules to bear in mind
- File sizes:
- Maximum object size: 5 TB
- For uploads larger than 100 MB, AWS recommends using multipart upload — it splits the file into smaller parts and uploads them in parallel for speed and fault tolerance.
- Minimum size for a part: 5 MB (except the last one).
- S3 buckets are created in a specific AWS region, and all the data physically lives in that region (unless you enable replication). However, once created, bucket names are globally unique — no two buckets across all AWS accounts in the world can have the same name.
- When naming your S3 buckets, make sure you follow these rules:
- Names must be globally unique
- Use only lowercase letters, numbers, and hyphens
- No uppercase letters or underscores
- Length must be between 3 and 63 characters
- Name can't be formatted like an IP address (e.g., 192.168.1.1)
- Must start with a lowercase letter or number
- Must not start with the prefix xn--
- Must not end with the suffix -s3alias
These rules ensure compatibility with DNS and avoid conflicts with future AWS features.
Static Website Hosting
Amazon S3 isn’t just for storing backups or files — you can also use it to host a static website, perfect for simple HTML/CSS/JS sites (no server-side code like PHP or Node.js).
What you need to do:
- Create a bucket
  - S3 bucket names must be globally unique.
  - If you intend to use your own custom domain name, the bucket name must match your domain name (e.g., example.com or www.example.com).
- Enable static website hosting
  - In the S3 console, go to Properties → scroll to Static website hosting.
  - Choose Enable, then enter your index document (e.g., index.html) and optionally an error document (e.g., error.html).
- Upload your site files
  - Upload index.html, CSS, JS, images, etc.
- Set permissions
  - Your files must be publicly accessible.
  - You can add a bucket policy to allow public read access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example.com/*"
    }
  ]
}

- Access your website
  - AWS gives you a URL in this format: http://<bucket-name>.s3-website-<region>.amazonaws.com; for example: `http://example.com.s3-website-us-east-1.amazonaws.com`
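If you prefer to script these steps, here's a minimal sketch using boto3. The bucket name, region, and file names are placeholders, and the public-read policy assumes Block Public Access has been disabled for this bucket.

```python
import json
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
bucket = "example.com"  # hypothetical bucket name matching your domain

# Enable static website hosting with index and error documents
s3.put_bucket_website(
    Bucket=bucket,
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)

# Attach the public-read bucket policy shown above
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/*",
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))

# Upload the site entry point with the right content type
s3.upload_file("index.html", bucket, "index.html",
               ExtraArgs={"ContentType": "text/html"})
```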
Got a 403 Forbidden Error?
If you see a 403 error, here’s what to check:
- Your files (especially index.html) aren't public.
- You didn't enable static website hosting under Properties.
- Your bucket policy is missing or incorrect.
- Your index document name doesn't match the actual file name (it's case-sensitive).
S3 Storage Classes
Storage Class | Availability | Min Storage Duration | Retrieval Options | Retrieval Time (Latency) | Durability | Best Use Case |
---|---|---|---|---|---|---|
S3 Standard | 99.99% | None | N/A | Milliseconds | 11 9s | Frequently accessed data, dynamic web/app content |
S3 Intelligent-Tiering | 99.9–99.99% | Varies by tier | N/A (auto-managed) | Milliseconds to hours | 11 9s | Unpredictable or changing access patterns |
└── Frequent Access | – | None | – | Milliseconds | – | Default tier for new/active data |
└── Infrequent Access | – | 30 days | – | Milliseconds | – | Automatically applied for idle data |
└── Archive Instant Access | – | 90 days | – | Milliseconds | – | Auto-tiered archival access with instant retrieval |
└── Archive Access | – | 90 days | Standard / Bulk | 3–5 hours / 5–12 hours | – | Archive data with occasional retrieval |
└── Deep Archive Access | – | 180 days | Standard / Bulk | 12 hours / 48 hours | – | Deep archival with rare access needs |
S3 Standard-IA | 99.9% | 30 days | N/A | Milliseconds | 11 9s | Infrequently accessed but critical data |
S3 One Zone-IA | 99.5% | 30 days | N/A | Milliseconds | 11 9s (single AZ) | Non-critical, infrequent-access data in 1 AZ |
S3 Glacier Instant Retrieval | 99.9% | 90 days | N/A | Milliseconds | 11 9s | Archive that needs immediate access sometimes |
S3 Glacier Flexible Retrieval | 99.99% | 90 days | Expedited / Standard / Bulk | 1–5 min / 3–5 hrs / 5–12 hrs | 11 9s | Archive with less frequent, flexible access |
S3 Glacier Deep Archive | 99.99% | 180 days | Standard / Bulk | 12 hours / 48 hours | 11 9s | Long-term backup (compliance, legal, etc.) |
S3 Lifecycle Rules
S3 Lifecycle rules let you automate transitions between storage classes and expire (delete) objects after a set period. This helps optimize storage costs without sacrificing availability where it’s needed.
How Lifecycle Rules Work
Lifecycle rules are set at the bucket level, but you can scope them to:
- A prefix (like a "folder", e.g. "logs/", "2023/")
- Object tags (e.g. {"env":"test"})
There are two main types of actions:
- Transition Actions - Automatically move objects to a cheaper storage class after a certain number of days from creation. Example: move to Standard-IA after 30 days.
- Expiration Actions - Delete objects after they reach a certain age or when previous versions are no longer needed.
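For reference, here's a hedged sketch of what such a rule looks like when applied with boto3. The bucket name and prefix are hypothetical: transition logs/ objects to Standard-IA after 30 days, to Glacier after 90, and expire them after 365.

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical rule: tier down "logs/" objects over time, then delete them
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-down-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```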
Choosing the Right Storage Class
- Scenario 1: Infrequently Accessed Files
  - Move to: Standard-IA (Infrequent Access)
  - Lower storage cost than Standard
  - Slightly higher access cost
  - Good for files accessed occasionally (e.g., user uploads not often retrieved)
- Scenario 2: Archived Files with No Need for Fast Access
  - Move to: Glacier Instant Retrieval if you might want instant access occasionally
  - Move to: Glacier Flexible Retrieval or Deep Archive if access is rare to never, and you can wait hours or days
How S3 Storage Class Analytics Can Help
S3 Analytics helps you figure out when to transition files by showing access patterns for objects stored in the Standard or Standard-IA classes. It doesn't work with other classes (like Glacier or Intelligent-Tiering).
Once enabled for a prefix or tag-filtered group, it collects 30+ days of data, and then suggests whether a transition would make cost sense.
S3 Requester Pays
By default, the bucket owner pays for all data transfer and request costs, even when the files are downloaded by someone else. So if you host public datasets, you’ll be footing the bill for every download — which can become costly, especially with large or frequently accessed files. With Requester Pays, the responsibility for download-related costs shifts to the person accessing the data, not the bucket owner. Uploads are still paid by the bucket owner, but Downloads (GET requests + data transfer) are paid by the requester.
For example, you’re hosting a large, publicly accessible research dataset — let’s say climate model data or genomic data — and you want researchers to access it without you bearing the cost. Enable Requester Pays, and each researcher using their AWS credentials will be billed on their account when they download the data.
Notes:
- The requester must be an authenticated AWS user (they need to include RequestPayer=requester in their API requests)
- The bucket must have Requester Pays enabled
- The application or SDK accessing the bucket must be able to send the correct header/parameter indicating the requester is willing to pay
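To make that concrete, here's a minimal boto3 sketch covering both sides; the bucket and key names are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Bucket owner: enable Requester Pays on the bucket
s3.put_bucket_request_payment(
    Bucket="public-climate-data",            # hypothetical bucket
    RequestPaymentConfiguration={"Payer": "Requester"},
)

# Requester: explicitly agree to pay for the request and data transfer
response = s3.get_object(
    Bucket="public-climate-data",
    Key="models/2025/run-01.nc",             # hypothetical key
    RequestPayer="requester",
)
data = response["Body"].read()
```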
S3 Versioning
S3 Versioning is a feature that allows you to keep multiple versions of the same object in a bucket. This means:
- You can recover files that were accidentally deleted or overwritten.
- You can maintain a full history of changes to files over time.
- It’s enabled at the bucket level and applies to all new uploads after it’s turned on. To enable versioning:
- Go to the S3 console, choose your bucket.
- Under Properties, look for the Bucket Versioning section.
- Click Edit, then choose Enable, and save changes.
- That's it: from then on, every PUT creates a new version of the object, and a DELETE adds a delete marker instead of removing data.
- Is It Best Practice? Yes — especially for production environments or any critical data. It adds safety and auditability, and it helps with compliance too. The trade-off is increased storage cost, since every version counts toward total usage.
- What Happens to Files Already in the Bucket when versioning is switched on for the first time?
- Objects uploaded before enabling versioning are still accessible, but they don’t have a version ID.
- They’re treated as “null version” objects.
- If you delete one of these, it’s permanently deleted unless you’ve copied or versioned it afterward.
- You can suspend versioning, but:
- It does not delete any existing versions.
- It simply means new uploads won’t get a version ID — they behave like pre-versioning objects.
- Deletion – The Confusing Part Explained. There are two ways to delete objects when versioning is enabled:
- Delete Without Specifying a Version ID (Soft Delete)
  - S3 places a delete marker on the object.
  - The object appears deleted when you list the bucket, but all versions still exist.
  - You can restore the object by removing the delete marker (more on that below).
- Delete a Specific Version (Hard Delete)
  - You permanently delete that version from the bucket.
  - This is irreversible: that version is gone.
- Restoring a File with a Delete Marker. If a file shows up as deleted, but versioning is enabled, here’s how you restore it:
- Go to the object’s versions tab in the S3 console.
- Look for the entry marked as a delete marker.
- Delete the delete marker.
This action “undeletes” the file and restores it to being the current version.
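The same restore can be scripted: list the object's versions, find the delete marker, and delete it by version ID. A minimal sketch, with a hypothetical bucket and key:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-versioned-bucket", "reports/q1.pdf"  # hypothetical names

# Find the delete marker that is currently "hiding" the object
versions = s3.list_object_versions(Bucket=bucket, Prefix=key)
for marker in versions.get("DeleteMarkers", []):
    if marker["Key"] == key and marker["IsLatest"]:
        # Deleting the delete marker restores the previous version
        s3.delete_object(Bucket=bucket, Key=key, VersionId=marker["VersionId"])
```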
S3 Replication
S3 Replication lets you automatically copy objects across S3 buckets, either within the same region or across different regions.
- Cross-Region Replication (CRR): Replicates objects to a different AWS region
- Same-Region Replication (SRR): Replicates objects to a different bucket in the same region
Notes:
- Versioning must be enabled on both the source and destination buckets
- Buckets can be in the same or different AWS accounts
- Replication is asynchronous
- You need to grant replication permissions to the S3 service using a bucket policy and specify an IAM role that Amazon S3 assumes to perform the replication on your behalf
- By default, only new objects (and new versions) added after replication is enabled are replicated. To replicate existing (old) objects, use Batch Replication:
- A one-time job that replicates objects that existed before replication rules were in place.
- You define a job using Amazon S3 Batch Operations, referencing a manifest of objects.
- Uses the same IAM role and destination as your replication configuration.
- S3 handles deletions in two different ways:
- Delete Markers (from non-versioned delete operations):
- Optional: You can choose whether to replicate them in your replication rules.
- If enabled, delete markers are copied to the destination bucket.
- Versioned Deletes (e.g., DELETE /object?versionId=xyz):
- Not replicated.
- If you delete a specific version by ID in the source bucket, it is not deleted in the destination.
- This is to prevent accidental or malicious deletes from propagating.
- Replication is one-way and non-transitive. For example, if you set up Bucket A → Bucket B and Bucket B → Bucket C, then objects from A replicate to B, but they won't replicate from B to C. Even though B is configured to replicate to C, it won't forward A's objects. You must explicitly set up replication from A to C if needed.
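Configuration-wise, a replication rule is attached to the source bucket along with the IAM role S3 should assume. A hedged sketch (bucket names, role ARN, and rule settings are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical source bucket, destination bucket ARN, and replication role
s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {"Prefix": ""},          # empty prefix = all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": "arn:aws:s3:::destination-bucket",
                    "StorageClass": "STANDARD",
                },
            }
        ],
    },
)
```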
Use Cases for Replication:
- CRR: Compliance: Store copies in different geographic locations; Lower latency for users in other regions; Backup and disaster recovery; Replicating between production and test accounts
- SRR: Log aggregation (e.g., for centralized analysis in a security or analytics account); Maintaining synced buckets for different applications
S3 Event Notifications
S3 Event Notifications let you trigger actions when specific events happen in an S3 bucket, for example when a new file is uploaded, deleted, or overwritten. When these events occur, S3 can send notifications to:
- Amazon SNS (Simple Notification Service)
- Amazon SQS (Simple Queue Service)
- AWS Lambda functions
- Amazon EventBridge (for more complex routing)
Common Use Cases:
- Automatically resize images after upload (trigger a Lambda)
- Notify users of new content (via SNS)
- Queue up jobs for batch processing (send to SQS)
- Trigger workflows across AWS services (via EventBridge)
What’s Required for It to Work?
To secure these event triggers, you need resource-based policies on the target service that allow S3 to invoke it. If you’re sending to:
- SNS topic → You need an SNS topic access policy that allows s3.amazonaws.com to publish.
- SQS queue → You need an SQS queue access policy that allows s3.amazonaws.com to send messages.
- Lambda function → You need a Lambda resource policy that allows s3.amazonaws.com to invoke the function.
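Once the target's resource policy is in place, the notification itself is configured on the bucket. A hedged boto3 sketch that invokes a Lambda whenever a .jpg is uploaded (the bucket name and function ARN are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and Lambda function ARN
s3.put_bucket_notification_configuration(
    Bucket="my-upload-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "Id": "resize-on-upload",
                "LambdaFunctionArn": "arn:aws:lambda:eu-west-1:123456789012:function:resize-image",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}
                },
            }
        ]
    },
)
```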
S3 Performance
Amazon S3 is designed for high durability and massive scalability, but understanding its performance characteristics can help you make the most of it — especially when you’re working with large files or high volumes of traffic. By default, latency for most S3 operations (PUT, GET, DELETE, etc.) is in the range of 100–200 milliseconds. That’s really fast considering you’re hitting a highly durable, globally distributed storage system. But S3 also scales based on how you organize your data.
Prefix-Based Request Rates
- In S3, a prefix is basically the part of the object key before the object name, like a folder path in a file system. For example, if your object key is logs/2025/05/26/event123.json, then the prefix is logs/2025/05/26/
- S3 performance scales per prefix, which means:
- PUT, COPY, POST, DELETE requests: up to 3,500 requests/second per prefix
- GET and HEAD requests: up to 5,500 requests/second per prefix
- These are soft limits — S3 can automatically scale beyond them if needed, but this is the general performance guideline.
- Why does this matter?
  - If you put all your files into the same flat structure (e.g., images/file1.jpg, images/file2.jpg, …), you're using the same prefix: images/. That can become a bottleneck at high scale.
  - But if you distribute files across prefixes, S3 spreads the load across its backend infrastructure. Take this example: You're uploading 10,000 images per second to S3. If all objects are in the prefix uploads/, you'll hit the 3,500 write limit. But if you store them like uploads/2025/05/26/…, uploads/2025/05/27/…, and uploads/user123/…, then you're using multiple prefixes, and S3 can handle thousands of requests per prefix, giving you effectively unlimited scale.
  - No limits on number of prefixes: S3 can scale horizontally across as many as you need.
Multi-Part Uploads
Multipart upload lets you upload a large file in chunks (parts). Each part is uploaded independently and in parallel, then reassembled by S3. It is recommended when a file is larger than 100 MB, but it is required when the file is larger than 5 GB. So, why use it?
- It speeds up uploads by doing them in parallel
- Allows for resuming uploads if a part fails
- Reduces timeouts and retries for large files
- Good for uploading backups, media files, data exports — anything chunky.
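The AWS SDKs handle the part splitting for you. With boto3, for example, upload_file switches to multipart automatically once a file crosses a configurable threshold; the file and bucket names below are placeholders and the tuning values are illustrative.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Use multipart for anything over 100 MB, 16 MB parts, 10 parallel uploads
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,
    multipart_chunksize=16 * 1024 * 1024,
    max_concurrency=10,
)

s3.upload_file("backup-2025-05.tar.gz",        # hypothetical local file
               "my-backups-bucket",            # hypothetical bucket
               "backups/backup-2025-05.tar.gz",
               Config=config)
```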
S3 Transfer Acceleration
S3 Transfer Acceleration uses Amazon CloudFront’s edge network to speed up uploads and downloads from far away regions. Instead of sending data directly to your S3 bucket endpoint, you upload to a nearby CloudFront edge location, and AWS routes it over their fast internal network. Use cases include uploading from remote locations (e.g., mobile apps in Africa uploading to an S3 bucket in Europe), performance-sensitive apps (like video or medical imaging uploads) and large file transfers from international users.
How it works:
- User uploads to my-bucket.s3-accelerate.amazonaws.com
- Data is sent to a nearby CloudFront POP (edge location)
- AWS routes the data over its backbone network to the bucket
- You enable it at the bucket level, then just switch to using the Accelerate endpoint.
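In code, that's two small steps with boto3: enable acceleration on the bucket, then point the client at the accelerate endpoint. The bucket and file names are placeholders.

```python
import boto3
from botocore.config import Config

# Step 1: enable Transfer Acceleration on the bucket
boto3.client("s3").put_bucket_accelerate_configuration(
    Bucket="my-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Step 2: create a client that uses the s3-accelerate endpoint for transfers
s3_accel = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3_accel.upload_file("video.mp4", "my-bucket", "uploads/video.mp4")
```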
S3 Byte-Range Fetches
Byte-range fetches let you download only part of an object, by specifying a byte range in the request.
Why use it?
- Resumable downloads: Only re-download the missing part after a failure
- Parallel downloads: Split a large file into parts and download in parallel
- Partial processing: Get just the header of a file, or a specific segment (e.g., video seeking)
Example: Request bytes 0–999 of a file:
GET /my-object HTTP/1.1
Range: bytes=0-999
This is handy for:
- Media players (streaming just part of a video)
- Data processors that only need file headers
- Mobile apps that want fast, progressive access
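With an SDK, the same range goes into the request parameters. A small boto3 sketch that fetches only the first 1,000 bytes (bucket and key names are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# Fetch only bytes 0-999 of the object (e.g. just a file header)
response = s3.get_object(
    Bucket="my-media-bucket",          # hypothetical bucket
    Key="videos/intro.mp4",            # hypothetical key
    Range="bytes=0-999",
)
first_kb = response["Body"].read()
print(len(first_kb))  # 1000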
S3 Batch Operations
Managing millions (or even billions) of objects in an S3 bucket is no fun if you have to do it one-by-one. That’s where Amazon S3 Batch Operations come in. Instead of writing your own script to loop over objects and make changes, Batch Operations let you run bulk actions at scale — managed by AWS. S3 Batch Operations support the following actions across thousands or millions of objects:
- Modify object metadata (e.g. Content-Type headers)
- Copy objects to another bucket or storage class
- Add or replace tags
- Set or change encryption (e.g., enable SSE-S3 or SSE-KMS)
- Restore objects from Glacier or Deep Archive
- Invoke a Lambda function on each object (super flexible)
That last one is especially powerful — it means you can run custom logic (like virus scanning, resizing images, parsing logs, etc.) over a massive dataset without building your own looping infrastructure. Every S3 Batch Operations job has:
- A list of objects to operate on (CSV manifest)
- An operation to perform (like tagging or Lambda)
- Optional: A Lambda function
- IAM roles to grant permissions for the job
- Optional: Completion reports (success/failure per object)
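Putting those pieces together, here's a hedged sketch of creating a tagging job with the s3control API; the account ID, ARNs, and manifest ETag are all placeholders.

```python
import boto3

s3control = boto3.client("s3control")

# Hypothetical job: add a tag to every object listed in a CSV manifest
s3control.create_job(
    AccountId="123456789012",
    ConfirmationRequired=False,
    Priority=10,
    RoleArn="arn:aws:iam::123456789012:role/batch-ops-role",
    Operation={
        "S3PutObjectTagging": {"TagSet": [{"Key": "project", "Value": "alpha"}]}
    },
    Manifest={
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::my-manifests/manifest.csv",
            "ETag": "example-etag",
        },
    },
    Report={
        "Bucket": "arn:aws:s3:::my-reports",
        "Format": "Report_CSV_20180820",
        "Enabled": True,
        "Prefix": "batch-reports",
        "ReportScope": "AllTasks",
    },
)
```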
Here’s where S3 Batch Operations shine:
- Mass tagging files for cost allocation or data governance
- Encrypting all unencrypted objects in a bucket
- Cleaning up stale files by calling a Lambda that deletes or archives them
- Changing storage class of cold data to Glacier
- Restoring archived data before a bulk data migration or audit
- Running analytics or pre-processing with a Lambda function on each object
You need a list of objects to run a batch job — and generating that list manually can be painful. That’s where Amazon S3 Inventory comes in. S3 Inventory is a feature that generates a CSV or ORC file listing all objects in your bucket (or filtered prefix), with metadata like size, storage class, encryption status, tags, etc. You can generate this daily or weekly, and it drops into a specified bucket. You can use this inventory file directly as the manifest for a Batch Operations job, super useful when you want to:
- Operate only on unencrypted objects
- Modify files older than a certain date
- Target objects of a specific storage class (e.g., Standard_IA)
Before running a batch job, you might want to filter your inventory. You can do that with Amazon Athena, which lets you run SQL queries directly on S3. From there, you export the result as a CSV and feed it to your Batch Operations job. This gives you full control over which files get modified — and lets you combine structured querying with mass-scale S3 operations.
S3 Storage Lens
Managing S3 is often more than just uploading and downloading objects — especially as your data estate grows. Enter Amazon S3 Storage Lens, a powerful analytics tool that gives you visibility into your S3 usage and activity, across your entire organization if needed. It gives you metrics and recommendations to:
- Understand storage usage
- Identify cost-optimization opportunities
- Monitor data protection
- Detect security risks
- Track performance and request patterns
S3 Storage Lens provides:
- Aggregated metrics (usage, activity, performance, security, etc.)
- Dashboards to visualize trends and spot anomalies
- Daily exports to S3 in CSV or Parquet format
- Recommendations for cost savings and data protection
It helps answer questions like:
- Which buckets are growing the fastest?
- Do I have unencrypted objects or publicly accessible data?
- Where are my 403 errors coming from?
- Am I storing cold data in expensive storage classes?
Storage Lens works across multiple dimensions:
- 🔗 Organization-wide (via AWS Organizations)
- 🧾 AWS account
- 🌍 Region
- 🪣 Bucket
- 🧑💼 Prefix or Storage Class (in exports)
This makes it super powerful for enterprises with hundreds of accounts — or even a solo developer with a few buckets.
Dashboards - you get two options:
- Default Dashboard (Free)
- Automatically enabled
- Basic view of usage and activity metrics
- Visual breakdowns of data by account, region, and storage class
- Helps spot trends and issues quickly
- Custom Dashboards (Advanced)
- Define scope (specific accounts, buckets, regions)
- Choose metrics to include
- Enable daily metrics export to S3
- Set up filters and groups to tailor visibility
Storage Lens can export daily metrics to an S3 bucket you choose.
Metrics categories
- Summary Metrics
- Total bytes, object count
- Track growth over time
- Forecast cost or capacity needs
- Cost-Optimization Metrics
- Objects in non-optimized storage classes
- Use case: Move old data from Standard to IA or Glacier
- Data Protection Metrics
- Unencrypted objects, versioning disabled
- Use case: Detect buckets not following data protection policies
- Access Management Metrics
- Public access, ACLs, bucket policies
- Use case: Audit security posture
- Event Metrics
- Replication events, delete markers
- Use case: Monitor cross-region replication and cleanups
- Performance Metrics
- 4xx/5xx error rates, request latencies
- Use case: Diagnose app issues or misconfigured clients
- Activity Metrics
- PUT, GET, DELETE, HEAD requests
- Use case: Understand workload behavior and spikes
- Detailed Status Code Metrics
- Breaks down response codes (403, 404, 503, etc.)
- Use case: Pinpoint errors by type and frequency
All these can be queried with Athena, visualized in QuickSight, or processed with custom tools for alerts and dashboards.
Free vs Advanced Dashboards
Feature | Free Tier | Advanced Tier |
---|---|---|
Dashboards | Default dashboard only | Custom dashboards |
Metrics | Summary and basic usage/activity | 35+ advanced metrics |
Recommendations | Not included | Cost & protection recommendations |
Data retention | 14 days | 15 months |
Metrics export | Not available | Daily exports to S3 |
S3 Security
Security is a major part of managing data in the cloud, and Amazon S3 offers several layers of protection to make sure only the right people (or systems) can access your data. There are two main ways to control access to S3 resources:
1. User-Based Access Control (IAM Policies)
- This is when you attach IAM policies to users, groups, or roles.
- These policies define who can access S3 and what actions they can take (e.g. s3:GetObject, s3:PutObject, etc.).
- Example: Give a developer full access to a specific bucket.
2. Resource-Based Access Control
This is when permissions are defined directly on the S3 resources.
There are 3 types:
- Bucket Policies (most common):
- JSON-based policies attached to buckets.
- Can allow or deny access to users from other accounts or the public.
- Great for cross-account access or public read access.
- Bucket ACLs (Access Control Lists):
- Legacy method.
- Grant basic permissions (read/write) to specific AWS accounts or predefined groups (like “AllUsers” for public access).
- Limited in flexibility — mostly deprecated in favor of bucket policies.
- Object ACLs:
- Permissions set at the individual object level.
- Useful if different objects in the same bucket need different access.
Block Public Access (Very Important!)
By default, S3 buckets and objects are private — but it’s easy to accidentally open them to the public.
To help prevent that, AWS offers Block Public Access settings, which:
- Disable public ACLs and bucket policies
- Prevent new public access policies from being added
- Restrict access even if someone tries to override it
- You can set Block Public Access at the Bucket or Account level.
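For example, locking a single bucket down with boto3 looks roughly like this (the bucket name is a placeholder):

```python
import boto3

s3 = boto3.client("s3")

# Turn on all four Block Public Access protections for one bucket
s3.put_public_access_block(
    Bucket="my-private-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```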
Encryption (Protecting Data)
When storing sensitive data in S3, encryption is one of the easiest ways to meet security and compliance requirements. S3 supports encryption at rest and encryption in transit.
Encryption at Rest
SSE-S3 (Server-Side Encryption with S3-Managed Keys)
- AWS handles all encryption and key management for you.
- Uses AES-256 under the hood.
- Enabled by default for all new S3 buckets since 2023 — no config needed! However, if you want to specify it manually, you must set this header: x-amz-server-side-encryption: AES256
- Key points:
- Object-specific key generated, then encrypted with a master key.
- You don’t manage any keys.
- No audit trail.
SSE-KMS (Server-Side Encryption with AWS KMS Keys)
- AWS still encrypts your data, but you get more control over the keys via AWS Key Management Service (KMS).
- You can use:
  - AWS-managed key (aws/s3): the easy default.
  - Customer-managed CMK: gives more control (like access policies, key rotation).
- To use SSE-KMS explicitly, set this header: x-amz-server-side-encryption: aws:kms
- Supports CloudTrail logging, so you can audit every time a key is used.
- Limitation: Each KMS key has a usage quota (e.g. 5,500 requests per second by default).
- If you’re uploading large volumes in parallel, this can become a bottleneck.
SSE-C (Server-Side Encryption with Customer-Provided Keys)
- You provide the encryption key with every request, using HTTPS.
- AWS uses your key to encrypt/decrypt but never stores it.
- If you lose the key, your data is gone forever.
- Only works over HTTPS — AWS will reject HTTP requests with SSE-C.
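To make the headers concrete, here's a hedged boto3 sketch of uploading objects with SSE-KMS and with SSE-C; the bucket, keys, and KMS key ARN are placeholders, and the SDK translates these parameters into the encryption headers for you.

```python
import os
import boto3

s3 = boto3.client("s3")
bucket = "my-secure-bucket"  # hypothetical bucket

# SSE-KMS: S3 encrypts with a customer-managed KMS key you control
s3.put_object(
    Bucket=bucket,
    Key="reports/q1.csv",
    Body=b"sensitive,data\n",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="arn:aws:kms:eu-west-1:123456789012:key/placeholder-key-id",
)

# SSE-C: you supply the 256-bit key on every request; AWS never stores it
customer_key = os.urandom(32)
s3.put_object(
    Bucket=bucket,
    Key="reports/q2.csv",
    Body=b"sensitive,data\n",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,   # must be provided again on every GET
)
```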
Client-Side Encryption
- You encrypt the data yourself before uploading to S3.
- You are responsible for key management and encryption logic (typically using AWS SDKs).
- S3 just stores the encrypted blob.
Encryption in Transit
- S3 supports both HTTP and HTTPS endpoints.
- HTTPS is strongly recommended, and mandatory if you’re using SSE-C.
- Most AWS SDKs use HTTPS by default — so you’re usually covered.
- You can enforce encryption in transit using a bucket policy:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EnforceTLS",
"Effect": "Deny",
"Principal": "*",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::your-bucket-name",
"arn:aws:s3:::your-bucket-name/*"
],
"Condition": {
"Bool": {
"aws:SecureTransport": "false"
}
}
}
]
}
Bucket-Level Default Encryption
- You can configure a bucket to apply default encryption to all new objects.
- This avoids relying on each user/client to remember to set encryption headers.
- Works with SSE-S3 and SSE-KMS.
CORS
CORS stands for Cross-Origin Resource Sharing, and it’s a security feature built into your browser. Let’s break that down with a real-life example. Imagine your website lives at https://www.myshop.com and it wants to load images, fonts, or data from somewhere else — for example, from https://cdn.partner.com. Browsers, by default, block this kind of “cross-origin” request — to prevent websites from stealing data or behaving badly.
A “cross-origin” request simply means:
- The scheme (http vs https),
- The host (domain),
- Or the port (e.g., :3000 or :8080)
is different from the page that made the request. If any one of these differs, it's considered cross-origin.
When you use S3 to host static files, like images or videos on your website, fonts, JSON data for AJAX requests or files uploaded via JavaScript, you may run into CORS issues if your frontend (website) and S3 bucket are on different origins (which they usually are). To allow those cross-origin requests, you need to explicitly tell S3 it’s okay, by configuring a CORS policy on the bucket. You need to enable CORS on your bucket only if:
- A browser-based app (like a React frontend or plain JavaScript page)
- Is trying to read from or upload to your S3 bucket
- And they are on different origins (which includes localhost:3000 for dev)
Let's say your website is hosted at https://www.mywebsite.com and you want to allow it to access objects from your bucket. This is the CORS policy you would apply on the bucket:
<CORSConfiguration>
<CORSRule>
<AllowedOrigin>https://www.mywebsite.com</AllowedOrigin>
<AllowedMethod>GET</AllowedMethod>
<AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>
This says:
- "Let https://www.mywebsite.com GET objects from this bucket."
- "Allow all headers in the request."
If you want to allow access from anywhere (not always recommended for sensitive data), you would change the AllowedOrigin to *.
MFA Delete
MFA Delete is a security feature in Amazon S3 that requires multi-factor authentication to perform sensitive operations on versioned buckets, specifically deleting object versions and permanently disabling versioning. In other words, even if someone has full permissions, they still need an MFA code to delete stuff for real. So it acts like a “double lock” to prevent accidents or malicious actions.
MFA Delete only kicks in for versioned buckets, and only for:
- Permanently deleting object versions (a simple delete, which only adds a delete marker, doesn't require MFA)
- Disabling versioning (you can’t just turn versioning off without MFA)
It does not affect regular uploads, downloads, or deletions of current versions.
Use cases:
- Prevent accidental permanent deletion of critical data, even by admins.
- Protect from compromised credentials: If someone gets access to your AWS credentials, they can’t wipe versioned data without your physical MFA device.
- Add a final layer of protection in compliance-sensitive environments (e.g. financial records, legal data, backups).
Requirements for MFA Delete:
- Bucket versioning must be enabled
- The bucket owner must be the root user (yes — root, not just an IAM user)
- An MFA device must be configured for the root user (virtual or hardware)
- You must use the root credentials and MFA code when enabling or disabling MFA Delete. Important: Only the root user can enable or disable MFA Delete. IAM users (even with full admin permissions) cannot.
How Do You Enable MFA Delete? It’s a bit old-school — you can’t do it from the AWS Console. You have to use the AWS CLI and sign the request with an MFA code from the root user’s device.
Example CLI command:
aws s3api put-bucket-versioning \
--bucket your-bucket-name \
--versioning-configuration Status=Enabled,MFADelete=Enabled \
--mfa "arn-of-mfa-device mfa-code"
Once enabled, attempts to delete object versions will now require an MFA code if done through the root user.
Limitations:
- MFA Delete does not apply to deleting the current version of an object. You can still “hide” it by uploading a new version or deleting the latest version — but the old ones are still there (and protected by MFA).
- It only protects versioned buckets.
- As mentioned, only root can enable or disable it, so managing this feature securely requires tight control of root credentials.
S3 access logs
S3 access logs provide detailed records of requests made to your S3 bucket. These logs are delivered in the form of text files to another S3 bucket that you specify. Each log entry captures information such as:
- Requester (who made the request)
- Bucket and object accessed
- Request time and action (GET, PUT, DELETE, etc.)
- Response status (200, 403, 404…)
- Error codes (if any)
- Requestor IP, user agent, and more
This is not real-time logging, but a low-cost way to analyze historical access patterns. S3 access logs are handy for:
- Security audits – Identify unusual access or unauthorized usage.
- Compliance – Demonstrate how data was accessed over time.
- Billing breakdowns – See which clients or tools are accessing what.
- Debugging issues – Track down 403 or 404 errors.
- Analytics – Understand how users interact with your bucket.
Important note: You should not use the bucket you're monitoring as its own log destination. For example, if you're logging access to my-data-bucket, don't set my-data-bucket as the log destination; specify a different bucket (e.g. my-logs-bucket). If you logged into the same bucket, every log file written to the bucket would itself generate a new log entry, which would generate another log file, resulting in an infinite loop of logging the logs.
How to Enable S3 Access Logs
- Choose the bucket you want to monitor.
- Enable access logging.
- Specify a different target bucket (log destination).
- Optionally, add a prefix like logs/ for organization.
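A minimal boto3 sketch of the same setup (bucket names are placeholders, and the target bucket must already allow the S3 logging service to write to it):

```python
import boto3

s3 = boto3.client("s3")

# Deliver access logs for my-data-bucket into my-logs-bucket under logs/
s3.put_bucket_logging(
    Bucket="my-data-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-logs-bucket",
            "TargetPrefix": "logs/my-data-bucket/",
        }
    },
)
```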
You can then use tools like Athena, CloudWatch Logs, or custom scripts to parse and analyze these logs.
S3 Pre-Signed URLs
A pre-signed URL is a temporary, secure link that allows someone to access a private object in S3 without needing AWS credentials. You (the bucket owner or an IAM user) generate this special URL, and anyone with the link can use it only for a limited time. The pre-signed URL includes:
- The object key (file path),
- A signature based on your credentials,
- An expiry time.
Once it expires, the URL no longer works — simple and secure.
How Do You Generate One?
You can generate a pre-signed URL using:
- AWS SDKs (e.g. boto3 in Python, the AWS SDK for JavaScript, etc.)
- AWS CLI: aws s3 presign s3://my-bucket/myfile.txt --expires-in 3600
Expiration
- The default expiration is usually 3600 seconds (1 hour), but you can specify it.
- CLI: Use --expires-in with the number of seconds (max varies by AWS signature version).
- SDKs: Pass expiration time as an argument (e.g. ExpiresIn=900 with boto3 in Python).
- Pre-signed URLs cannot be revoked; they expire automatically.
Common Use Cases
- Download a file securely without making the whole bucket public.
- Upload a file (e.g. user uploads profile photo from frontend directly to S3).
- Share time-limited access to media, invoices, reports, etc.
- Mobile and web apps that need temporary access without exposing AWS keys.
What Permissions Are Used?
Whoever generates the pre-signed URL determines what the URL can do. That means the person using the URL inherits the permissions of the IAM principal that created it. So if an IAM user only has GetObject permissions, the pre-signed URL will only allow downloads, not uploads or deletes. And if they try to generate a pre-signed URL for an action they don't have permission for, it will fail.
S3 Object Lock vs Glacier Vault Lock
When you need to store data in a WORM (Write Once, Read Many) fashion — meaning that once written, data cannot be modified or deleted — AWS offers two mechanisms:
- S3 Object Lock (for objects in S3 Standard, IA, etc.)
- S3 Glacier Vault Lock (for archives in Glacier storage)
Amazon S3 Object Lock
S3 Object Lock lets you store objects using a WORM model — meaning the object cannot be overwritten or deleted for a set amount of time or until explicitly removed (depending on the mode).
This is ideal for compliance requirements where tamper-proof storage is needed. Once an object is locked with Object Lock in compliance mode, it cannot be altered or deleted by any user, including the root user. This enforces a strict WORM policy.
Use Cases:
- Financial data retention compliance (e.g. SEC Rule 17a-4)
- Legal evidence retention
- Backups that must be preserved for X years
- Protection against ransomware or accidental deletion
Retention Modes
Governance Mode
- Object version can’t be deleted by most users.
- BUT users with special IAM permissions (e.g., s3:BypassGovernanceRetention) can remove the lock or delete the object.
- Good for internal compliance with flexibility.
Compliance Mode
- Object version cannot be overwritten or deleted by anyone until the retention period ends.
- Even the root user can’t delete it early. This is for legal/regulatory scenarios.
Legal Hold
- Independent of retention period.
- Can be applied or removed by authorized users.
- Freezes an object from deletion even if no retention period is set, or if it has expired.
- Great for eDiscovery or litigation situations.
Retention Period
- Can be set in days or years.
- You can apply retention settings at the object level, either when uploading or afterward (if the object hasn’t been locked yet).
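As an illustration, applying a governance-mode retention period and a legal hold to an object version might look like this with boto3; the bucket, key, and dates are placeholders, and the bucket must have been created with Object Lock enabled.

```python
from datetime import datetime, timezone
import boto3

s3 = boto3.client("s3")

# Lock this object version in GOVERNANCE mode until the given date
s3.put_object_retention(
    Bucket="my-worm-bucket",               # hypothetical bucket with Object Lock
    Key="evidence/case-123.pdf",           # hypothetical key
    Retention={
        "Mode": "GOVERNANCE",
        "RetainUntilDate": datetime(2030, 1, 1, tzinfo=timezone.utc),
    },
)

# Optionally add a legal hold, independent of the retention period
s3.put_object_legal_hold(
    Bucket="my-worm-bucket",
    Key="evidence/case-123.pdf",
    LegalHold={"Status": "ON"},
)
```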
Amazon S3 Glacier Vault Lock
Glacier Vault Lock allows you to enforce a compliance policy on an entire Glacier vault, ensuring that archives stored inside cannot be deleted or modified before a defined retention policy is satisfied. Once the Vault Lock policy is finalized, it cannot be changed — this provides a strong WORM guarantee for everything in that vault. Unlike S3 Object Lock (which is per-object), Vault Lock is set per vault.
Use Cases:
- Long-term archival of sensitive or regulated data
- Retention requirements at a vault-wide level (e.g. a 7-year archive policy)
- Industries like healthcare, finance, or legal where immutable archival is required
How It Works:
- You initiate a Vault Lock policy (JSON format).
- AWS gives you a 24-hour window to test and finalize the policy.
- Once finalized, the policy is immutable.
Glacier Vault Lock policies define things like:
- Minimum retention period
- Prevention of deletion for that time
- User access conditions
- Important: You can still upload new archives to the vault, but deleting archives before the lock expires is not allowed.
S3 Access Points
Managing access to an S3 bucket used to be all about bucket policies, IAM policies, and maybe even some complex VPC endpoint setups. But once you have multiple teams, applications, or use cases accessing the same bucket — things get messy. That’s where S3 Access Points come in. An S3 Access Point is a named network endpoint that you can create to manage access to shared data in a bucket, with its own:
- Name (e.g. my-app-access-point)
- Access policy
- Network origin (public or VPC)
- Optional prefix filter
Think of it like creating a mini-portal into your bucket, custom-built for a specific application or team.
Why use it?
Normally, you might have a single S3 bucket serving multiple applications. To grant access to each application, you’d have to keep updating your bucket policy — and it quickly turns into spaghetti.
With S3 Access Points, you:
- Avoid a messy bucket policy
- Assign fine-grained access for different apps, services, or departments
- Restrict access per prefix in the bucket (like folders)
- Route traffic only via a specific VPC, if needed
Typical use cases are:
- Multi-tenant application where each tenant has a logical folder (tenant1/*, tenant2/*): create one access point per tenant.
- Data lake where data engineers, analysts, and machine learning models all access the same bucket but need different levels of access.
- Hybrid network environments — you want to allow access from a VPC for one team, but over the internet (read-only) for another.
How does access control work?
Each access point has its own IAM-style resource policy, separate from the bucket policy. This lets you define:
- Who can access the access point
- What actions are allowed (s3:GetObject, s3:PutObject, etc.)
- Which prefixes can be accessed
- Which VPCs (if any) the traffic must come from
This simplifies permissions dramatically, especially in large environments.
Each access point gets its own DNS-compliant endpoint, like: https://<access-point-name>-<account-id>.s3-accesspoint.<region>.amazonaws.com
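Clients can also address the access point directly by ARN instead of the bucket name; a small boto3 sketch with a hypothetical ARN and key:

```python
import boto3

s3 = boto3.client("s3")

# Use the access point ARN wherever you would normally pass a bucket name
access_point_arn = "arn:aws:s3:eu-west-1:123456789012:accesspoint/my-app-access-point"
obj = s3.get_object(Bucket=access_point_arn, Key="tenant1/report.json")
print(obj["Body"].read())
```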
Integration with VPCs
You can create VPC-only access points, which only allow access from resources inside a specific VPC. This works hand-in-hand with VPC endpoints for S3, and it’s a best practice in secure or regulated environments.
S3 Object Lambda
Amazon also offers S3 Object Lambda Access Points, which let you intercept and transform objects on the fly (e.g., redact PII, resize images, or translate content) before returning them to the client, all without changing the underlying data. Imagine you have objects stored in S3 that you want to modify dynamically when retrieved, but you don't want to change the source object. Maybe you want to:
- Redact sensitive information
- Resize an image
- Convert data formats (e.g. CSV → JSON)
- Personalize content per user
- Mask PII for certain users
S3 Object Lambda lets you do all that — without modifying the object stored in S3.
How It Works
- You create a standard S3 access point to your data.
- You create an Object Lambda access point that:
- References the standard access point as the source.
- Associates a Lambda function to process the data.
- A user or application calls the Object Lambda access point.
- S3:
- Fetches the object using the underlying access point.
- Sends it to the Lambda function.
- Returns the modified result to the requester.
You can apply transformation to:
- Object body (actual content)
- Metadata
- HTTP headers
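A rough sketch of the Lambda side, assuming a function that simply uppercases text content before returning it; the event fields shown (getObjectContext, inputS3Url, outputRoute, outputToken) are what S3 Object Lambda passes to the function.

```python
import urllib.request
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # S3 Object Lambda gives us a presigned URL for the original object
    ctx = event["getObjectContext"]
    original = urllib.request.urlopen(ctx["inputS3Url"]).read()

    # Transform the content however you like (here: uppercase the text)
    transformed = original.decode("utf-8").upper().encode("utf-8")

    # Return the transformed object to the requester
    s3.write_get_object_response(
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
        Body=transformed,
    )
    return {"statusCode": 200}
```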