5 AWS Services That Lock You In (And What to Use Instead)
Lambda, DynamoDB, SQS — convenient, affordable, and about as easy to leave as a timeshare presentation. Here are the escape routes we actually use.
The Hotel California Problem
You know the song. You can check out any time you like, but you can never leave.
Don Henley was singing about a mysterious desert hotel, but he might as well have been singing about AWS. Every service in the AWS console is a tiny, well-designed trap. Each one solves a real problem. Each one is easy to set up. Each one has a free tier that whispers "try me, what's the worst that could happen?" And each one adds another strand to the web that eventually makes migration feel impossible.
We've helped dozens of companies untangle themselves from AWS services they adopted in a weekend and spent years trying to leave. Not because AWS is bad — it's genuinely excellent infrastructure. But "excellent infrastructure you can never leave" is a different value proposition than "excellent infrastructure," and nobody reads the fine print until the renewal email arrives.
Here are the five services we see cause the most pain, the portable alternatives we actually deploy in production, and the war stories from the migrations that taught us everything we know. Each one gets a Lock-In Score from 1-10, where 1 is "change an environment variable" and 10 is "rewrite your application from scratch while questioning your career choices."
1. Amazon RDS -> CloudNativePG
Lock-In Score: 6/10
The Trap
RDS is seductive. Click a button, get a managed PostgreSQL database. Automated backups. Read replicas. Multi-AZ failover. Monitoring built in. It's like having a DBA on staff who never complains and works 24/7.
So you adopt it. And then you adopt RDS Proxy for connection pooling. And then you use IAM database authentication because it's "more secure." And then you enable Performance Insights because your queries are slow. And then you turn on RDS Data API because your Lambda functions need database access. And suddenly your "standard PostgreSQL" database is actually six AWS-specific services wearing a PostgreSQL trench coat.
The Pain
We had a client — a B2B SaaS platform doing about $30M ARR — who wanted to offer their product on Azure for a major enterprise deal. "How hard could it be?" their VP of Engineering asked. "It's just PostgreSQL."
It was not just PostgreSQL.
Their application used RDS-specific connection strings with IAM auth tokens. Their backup strategy depended on RDS automated snapshots. Their monitoring was built on Performance Insights metrics piped to CloudWatch. Their connection pooling went through RDS Proxy. Their disaster recovery relied on cross-region read replicas — an RDS feature that doesn't exist outside of AWS.
Estimated migration time: "a couple weeks." Actual migration time: four months.
The Escape: CloudNativePG
CloudNativePG is a Kubernetes operator that manages PostgreSQL clusters. It handles everything RDS handles — automated backups, high availability, failover, connection pooling — but it runs on any Kubernetes cluster, anywhere.
Here's what the before and after looked like:
Before (RDS-specific application config):
```python
# AWS RDS with IAM authentication — works only on AWS
import boto3
import psycopg2

def get_db_connection():
    rds_client = boto3.client('rds')
    # Generate IAM auth token — AWS-specific
    token = rds_client.generate_db_auth_token(
        DBHostname='mydb.cluster-abc123.us-east-1.rds.amazonaws.com',
        Port=5432,
        DBUsername='app_user',
        Region='us-east-1'
    )
    return psycopg2.connect(
        host='mydb.cluster-abc123.us-east-1.rds.amazonaws.com',
        port=5432,
        user='app_user',
        password=token,  # Temporary IAM token
        database='myapp',
        sslmode='require',
        sslrootcert='rds-combined-ca-bundle.pem'  # AWS-specific CA
    )
```
After (portable, works anywhere):
```python
# Standard PostgreSQL connection — works on any cloud or on-prem
import os
import psycopg2

def get_db_connection():
    return psycopg2.connect(
        host=os.environ['DB_HOST'],  # CloudNativePG service DNS
        port=os.environ.get('DB_PORT', '5432'),
        user=os.environ['DB_USER'],
        password=os.environ['DB_PASSWORD'],  # From Vault via External Secrets
        database=os.environ['DB_NAME'],
        sslmode='require'
    )
```
And here's the CloudNativePG cluster definition:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: myapp-db
spec:
  instances: 3
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "256MB"
      effective_cache_size: "768MB"
  # Automated backups — works with any S3-compatible storage
  backup:
    barmanObjectStore:
      destinationPath: "s3://myapp-backups/pgdata"
      endpointURL: "https://minio.internal:9000"  # MinIO, not AWS S3
      s3Credentials:
        accessKeyId:
          name: backup-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: backup-creds
          key: SECRET_ACCESS_KEY
    retentionPolicy: "30d"
  # Storage — uses whatever StorageClass your cluster provides
  storage:
    size: 100Gi
    storageClass: standard  # gp3, managed-premium, pd-ssd — whatever
  # Monitoring — standard Prometheus metrics, not CloudWatch
  monitoring:
    enablePodMonitor: true
```
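CloudNativePG also covers the RDS Proxy role. Here's a minimal sketch of its Pooler resource (PgBouncer under the hood) pointed at the cluster above; the instance count and pool parameters are illustrative defaults, not values from the client's deployment:

```yaml
# pooler.yaml: connection pooling, the RDS Proxy replacement
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: myapp-db-pooler
spec:
  cluster:
    name: myapp-db          # the Cluster defined above
  instances: 2              # PgBouncer replicas
  type: rw                  # route to the current primary
  pgbouncer:
    poolMode: transaction   # release server connections between transactions
    parameters:
      max_client_conn: "1000"
      default_pool_size: "20"
```

Point DB_HOST at the myapp-db-pooler service instead of the cluster's rw service, and the application code above needs no other change.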
The Aftermath
The client's DBA — a grizzled veteran who'd been managing PostgreSQL since version 8 — literally teared up when he saw CloudNativePG handle an automatic failover during testing. "It just... promoted the replica and updated the service endpoint," he said. "Do you know how many 3 AM pages I've taken for manual failover? Do you know how many runbooks I've written?"
He printed out the CloudNativePG docs and hung them on his wall. We're not making this up.
Migration time: 3 weeks (including testing). Application code changes: 14 lines.
2. Amazon SQS -> NATS
Lock-In Score: 7/10
The Trap
SQS is the duct tape of AWS architectures. Need to decouple two services? SQS. Need a dead letter queue? SQS. Need to handle bursty workloads? SQS. It's reliable, it's cheap (the first million messages are free!), and it's available in every region.
The problem isn't SQS itself — it's that SQS has an API that exists nowhere else in the universe. Every message you publish, every queue you create, every visibility timeout you configure is done through the AWS SDK. Your application code doesn't just use SQS; it becomes SQS. The queue semantics, the message format, the error handling, the polling model — all of it is AWS-specific.
The Pain
A fintech client of ours had 34 SQS queues connecting 12 microservices. The queues handled everything from payment processing to notification delivery to audit logging. When they needed to deploy to a private cloud for a banking customer who (understandably) wouldn't let payment data touch a public cloud, they discovered that SQS was the connective tissue of their entire architecture. Replacing it meant touching every service.
Their initial estimate: "We'll abstract the queue layer." Their eventual realization: the "queue layer" was everywhere — in handler functions, error retry logic, message serialization, dead letter processing, and CloudWatch alarm definitions. There was no layer. It was more like a queue atmosphere.
The Escape: NATS with JetStream
NATS is a lightweight, high-performance messaging system that runs anywhere. With JetStream (its persistence layer), it provides the same guarantees as SQS — at-least-once delivery, message replay, dead letter queues — but with zero cloud dependency.
Here's a side-by-side comparison of producer and consumer code:
Before (SQS):
```python
import boto3
import json

sqs = boto3.client('sqs', region_name='us-east-1')
QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789/payment-events.fifo'

# Producer
def publish_payment_event(event: dict):
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps(event),
        MessageAttributes={
            'EventType': {
                'DataType': 'String',
                'StringValue': event['type']
            }
        },
        MessageGroupId=event.get('customer_id', 'default'),  # FIFO queue
        MessageDeduplicationId=event['idempotency_key']
    )

# Consumer
def process_payment_events():
    while True:
        response = sqs.receive_message(
            QueueUrl=QUEUE_URL,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,  # Long polling
            MessageAttributeNames=['All']
        )
        for message in response.get('Messages', []):
            try:
                event = json.loads(message['Body'])
                handle_payment(event)
                # Must explicitly delete after processing
                sqs.delete_message(
                    QueueUrl=QUEUE_URL,
                    ReceiptHandle=message['ReceiptHandle']
                )
            except Exception as e:
                # Message returns to queue after visibility timeout
                print(f"Failed to process: {e}")
```
After (NATS JetStream):
```python
import os
import json
import nats

# Producer
async def publish_payment_event(event: dict):
    nc = await nats.connect(os.environ.get('NATS_URL', 'nats://nats:4222'))
    js = nc.jetstream()
    await js.publish(
        f"payments.{event['type']}",
        json.dumps(event).encode(),
        headers={
            'Nats-Msg-Id': event['idempotency_key']  # Deduplication
        }
    )
    await nc.close()

# Consumer
async def process_payment_events():
    nc = await nats.connect(os.environ.get('NATS_URL', 'nats://nats:4222'))
    js = nc.jetstream()
    # Create a durable consumer (survives restarts, like SQS)
    sub = await js.pull_subscribe(
        "payments.*",
        durable="payment-processor",
        config=nats.js.api.ConsumerConfig(
            ack_wait=30,        # Like SQS visibility timeout (seconds)
            max_deliver=5,      # Retry up to 5 times
            filter_subject="payments.*"
        )
    )
    while True:
        try:
            messages = await sub.fetch(batch=10, timeout=20)
            for msg in messages:
                try:
                    event = json.loads(msg.data.decode())
                    await handle_payment(event)
                    await msg.ack()  # Acknowledge successful processing
                except Exception as e:
                    await msg.nak()  # Negative ack — will be redelivered
                    print(f"Failed to process: {e}")
        except nats.errors.TimeoutError:
            continue  # No messages available, keep polling
```
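One thing both snippets quietly assume: a JetStream stream covering the payments.* subjects exists before anyone publishes. A minimal setup sketch; the stream name, retention limits, and two-minute dedup window are illustrative choices, not the client's production values:

```python
import os
import nats
from nats.js.api import StreamConfig, RetentionPolicy

async def ensure_payment_stream():
    nc = await nats.connect(os.environ.get('NATS_URL', 'nats://nats:4222'))
    js = nc.jetstream()
    # Idempotent when the config matches an existing stream
    await js.add_stream(StreamConfig(
        name="PAYMENTS",
        subjects=["payments.*"],
        retention=RetentionPolicy.LIMITS,  # keep messages until limits are hit
        max_age=7 * 24 * 60 * 60,          # retain for 7 days (seconds)
        duplicate_window=120,              # lookback for Nats-Msg-Id dedup (seconds)
    ))
    await nc.close()
```

The duplicate_window is what makes the Nats-Msg-Id header in the producer behave like SQS's MessageDeduplicationId.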
And the NATS deployment on Kubernetes:
```yaml
# nats-jetstream.yaml — deploys identically on any K8s cluster
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: nats
spec:
  interval: 10m
  chart:
    spec:
      chart: nats
      version: "1.2.x"
      sourceRef:
        kind: HelmRepository
        name: nats
  values:
    nats:
      jetstream:
        enabled: true
        memStorage:
          enabled: true
          size: "2Gi"
        fileStorage:
          enabled: true
          size: "20Gi"
          storageClassName: "standard"
    cluster:
      enabled: true
      replicas: 3
    # Monitoring — Prometheus, not CloudWatch
    exporter:
      enabled: true
      serviceMonitor:
        enabled: true
```
The Aftermath
The fintech client's migration took about 6 weeks for all 34 queues. The NATS codebase was actually shorter than the SQS code — no more managing receipt handles, no more polling configuration, no more fighting with SQS's 256KB message size limit (NATS defaults to 1MB and can be configured up to 64MB).
Their favorite part? NATS runs embedded for testing. No more LocalStack, no more mocking the SQS API, no more "it works in my tests but fails in staging because SQS behaves differently." Unit tests that used to take 45 seconds (waiting for LocalStack to spin up) now run in 200ms.
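"Embedded" is literal in Go, where the NATS server is a library you can import. From Python, the practical equivalent is a test fixture that spawns a throwaway server. A sketch, assuming the nats-server binary is on the PATH:

```python
# conftest.py: one disposable JetStream server per test session
import subprocess
import tempfile
import time

import pytest

@pytest.fixture(scope="session")
def nats_url():
    store = tempfile.mkdtemp()
    # -js enables JetStream, -p picks a test port, -sd sets the storage dir
    proc = subprocess.Popen(["nats-server", "-js", "-p", "14222", "-sd", store])
    time.sleep(0.5)  # crude readiness wait; poll the port in real code
    yield "nats://127.0.0.1:14222"
    proc.terminate()
    proc.wait()
```

Startup is fast enough that even a per-test server beats waiting on a LocalStack cold start.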
3. AWS Lambda -> Knative
Lock-In Score: 8/10
The Trap
Lambda is, genuinely, a brilliant piece of engineering. Write a function. Deploy it. It scales to zero when nobody's using it and scales to thousands of concurrent executions when traffic spikes. You pay only for the milliseconds your code actually runs. It's the closest thing we have to "just run my code" infrastructure.
It's also the stickiest service AWS has ever built.
Lambda functions aren't just code — they're code plus AWS event sources, IAM roles, VPC configurations, layer dependencies, environment variables managed by CloudFormation, triggers from API Gateway or S3 or SQS or DynamoDB Streams or EventBridge or forty other AWS services. A Lambda function is not a function. It's a function-shaped hole in the AWS ecosystem, and everything around it is AWS.
The Pain
We worked with a client who had gone all-in on Lambda. And we mean all in. 247 Lambda functions. Not just event handlers — their entire backend was Lambda functions behind API Gateway. Authentication, business logic, CRUD operations, background jobs, scheduled tasks, webhooks — all Lambda.
When a major government customer required on-premise deployment, their architect did the math: migrating 247 Lambda functions meant rewriting their entire backend. Not refactoring. Not porting. Rewriting. The functions themselves were 20% of the problem. The other 80% was the event wiring — API Gateway routes, SQS triggers, S3 event notifications, CloudWatch scheduled events — all of which only exist inside AWS.
The deal was worth $8M annually. The estimated rewrite: 12-18 months and $2M in engineering costs. They almost walked away from it.
The Escape: Knative
Here's the thing about serverless that AWS doesn't want you to realize: serverless isn't a cloud feature. It's an architecture pattern. Scale-to-zero, event-driven, pay-per-use — these are design principles, not products. And Knative implements them on any Kubernetes cluster.
Before (Lambda + API Gateway):
```python
# handler.py — Lambda function for processing uploaded documents
import json
from datetime import datetime, timezone
import boto3

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('documents')

def handler(event, context):
    """Triggered by S3 upload event — only works on AWS"""
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = record['s3']['object']['key']
        # Download from S3
        response = s3.get_object(Bucket=bucket, Key=key)
        content = response['Body'].read()
        # Process the document
        result = process_document(content)
        # Store result in DynamoDB
        table.put_item(Item={
            'document_id': key,
            'status': 'processed',
            'result': result,
            'processed_at': datetime.now(timezone.utc).isoformat()
        })
    return {
        'statusCode': 200,
        'body': json.dumps({'message': 'Processed'})
    }
```
```yaml
# serverless.yml — Serverless Framework, AWS-specific
service: document-processor
provider:
  name: aws
  runtime: python3.11
  region: us-east-1
  iam:
    role:
      statements:
        - Effect: Allow
          Action: ['s3:GetObject']
          Resource: 'arn:aws:s3:::uploads/*'
        - Effect: Allow
          Action: ['dynamodb:PutItem']
          Resource: 'arn:aws:dynamodb:us-east-1:*:table/documents'
functions:
  processDocument:
    handler: handler.handler
    events:
      - s3:
          bucket: uploads
          event: s3:ObjectCreated:*
```
After (Knative + portable services):
```python
# app.py — Standard Flask app that scales to zero with Knative
import os
import json
from flask import Flask, request, jsonify
from cloudevents.http import from_http
from minio import Minio
import psycopg2

app = Flask(__name__)

# Portable — works with MinIO, S3, GCS, or any S3-compatible storage
storage = Minio(
    os.environ['STORAGE_ENDPOINT'],
    access_key=os.environ['STORAGE_ACCESS_KEY'],
    secret_key=os.environ['STORAGE_SECRET_KEY'],
    secure=os.environ.get('STORAGE_SECURE', 'true').lower() == 'true'
)

def get_db():
    return psycopg2.connect(os.environ['DATABASE_URL'])

@app.route('/', methods=['POST'])
def process_document():
    """Triggered by CloudEvents — works with any event source"""
    event = from_http(request.headers, request.get_data())
    bucket = event.data['bucket']
    key = event.data['key']
    # Download from object storage (MinIO/S3/GCS — same API)
    response = storage.get_object(bucket, key)
    content = response.read()
    # Process the document
    result = process_document_content(content)
    # Store result in PostgreSQL (CloudNativePG — runs anywhere)
    with get_db() as conn:
        with conn.cursor() as cur:
            cur.execute(
                """INSERT INTO documents (document_id, status, result, processed_at)
                   VALUES (%s, %s, %s, NOW())""",
                (key, 'processed', json.dumps(result))
            )
    return jsonify({'message': 'Processed'}), 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
```
```yaml
# knative-service.yaml — Scales to zero, runs anywhere K8s runs
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: document-processor
spec:
  template:
    metadata:
      annotations:
        # Scale to zero after 5 minutes of inactivity
        autoscaling.knative.dev/window: "300s"
        autoscaling.knative.dev/target: "100"  # 100 concurrent requests per pod
    spec:
      containers:
        - image: registry.internal/document-processor:v2.1.0
          ports:
            - containerPort: 8080
          env:
            - name: STORAGE_ENDPOINT
              value: "minio.storage:9000"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"
---
# Event trigger — listen for object storage events
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: document-upload-trigger
spec:
  broker: default
  filter:
    attributes:
      type: "io.minio.s3.ObjectCreated"
      source: "minio.storage"
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: document-processor
```
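A pleasant side effect of the CloudEvents contract: you can exercise the service with a plain HTTP POST, no broker and no MinIO required. A hedged sketch using the same cloudevents library the handler imports; it assumes a kubectl port-forward to the service on localhost:8080, and the attributes mirror the Trigger filter above:

```python
# send_test_event.py: poke the service directly, no event infrastructure needed
import requests
from cloudevents.http import CloudEvent, to_structured

event = CloudEvent(
    {"type": "io.minio.s3.ObjectCreated", "source": "minio.storage"},
    {"bucket": "uploads", "key": "documents/test.pdf"},
)
headers, body = to_structured(event)  # serialize to structured-mode HTTP
resp = requests.post("http://localhost:8080/", headers=headers, data=body)
print(resp.status_code, resp.text)
```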
The Aftermath
The client didn't migrate all 247 functions. They migrated the 40 most critical ones — the ones that touched payment processing, document management, and the core API. The remaining 207 functions still run on Lambda, and that's fine. The goal was deployment flexibility, not cloud purity.
Those 40 Knative services now deploy identically to AWS (via EKS), their government customer's on-premise Kubernetes cluster, and a test environment running on a developer's workstation. Same container, same config, same behavior.
The government deal closed. $8M ARR. The engineering cost? About $600K over 4 months — a 13x return in year one.
Their architect summed it up perfectly: "Serverless isn't a vendor. It's a pattern. Once we understood that, everything clicked."
4. AWS Secrets Manager -> HashiCorp Vault + External Secrets Operator
Lock-In Score: 5/10
The Trap
AWS Secrets Manager is one of those services that sneaks into your architecture. Nobody wakes up and says "let's deeply integrate with AWS Secrets Manager today." It just happens. You need to store a database password. You put it in Secrets Manager. Then you reference it from a Lambda function. Then from an ECS task definition. Then from a CloudFormation template. Then from another service. Before you know it, you have 200 secrets in Secrets Manager and every service in your architecture has secretsmanager:GetSecretValue in its IAM policy.
The lock-in isn't in the secret storage — it's in the secret retrieval. Every place your code calls boto3.client('secretsmanager').get_secret_value() is a place that only works on AWS.
The Pain
A client migrating to a hybrid deployment discovered they had 847 references to AWS Secrets Manager across their codebase. Not 847 secrets — 847 places in code that called the Secrets Manager API. Lambdas, ECS tasks, EC2 instances, CI/CD pipelines, configuration scripts, one-off admin tools, even a Slack bot. The secrets had metastasized.
The Escape: Vault + External Secrets Operator
The magic of this approach is that your application code never knows where secrets come from. Vault stores the secrets. The External Secrets Operator syncs them into Kubernetes Secrets. Your application reads standard environment variables or mounted files. The entire secrets backend can change without touching a single line of application code.
Before (AWS Secrets Manager in application code):
```python
import boto3
import json

def get_database_credentials():
    client = boto3.client('secretsmanager', region_name='us-east-1')
    response = client.get_secret_value(SecretId='prod/myapp/database')
    secret = json.loads(response['SecretString'])
    return {
        'host': secret['host'],
        'port': secret['port'],
        'username': secret['username'],
        'password': secret['password'],
        'database': secret['dbname']
    }

# Every service that needs DB creds imports this function
# and has an IAM policy granting secretsmanager:GetSecretValue
# on this specific secret ARN. 847 times.
```
After (standard environment variables, source-agnostic):
```python
import os

def get_database_credentials():
    return {
        'host': os.environ['DB_HOST'],
        'port': os.environ.get('DB_PORT', '5432'),
        'username': os.environ['DB_USER'],
        'password': os.environ['DB_PASSWORD'],
        'database': os.environ['DB_NAME']
    }

# That's it. The application doesn't know or care where these values
# come from. Vault, AWS Secrets Manager, Azure Key Vault, a sticky
# note on someone's monitor — the app is blissfully ignorant.
```
The External Secrets Operator bridges the gap between Vault and Kubernetes:
```yaml
# external-secret.yaml — Syncs secrets from Vault to K8s Secrets
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-db-credentials
spec:
  refreshInterval: "1h"
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: myapp-db-credentials
    creationPolicy: Owner
  data:
    - secretKey: DB_HOST
      remoteRef:
        key: secret/data/myapp/database
        property: host
    - secretKey: DB_PORT
      remoteRef:
        key: secret/data/myapp/database
        property: port
    - secretKey: DB_USER
      remoteRef:
        key: secret/data/myapp/database
        property: username
    - secretKey: DB_PASSWORD
      remoteRef:
        key: secret/data/myapp/database
        property: password
    - secretKey: DB_NAME
      remoteRef:
        key: secret/data/myapp/database
        property: dbname
---
# The ClusterSecretStore — configure once, use everywhere
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: vault-backend
spec:
  provider:
    vault:
      server: "https://vault.internal:8200"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "external-secrets"
```
The Aftermath
The migration took 3 weeks. But here's the beautiful part: the client's application code got simpler. Instead of 847 Secrets Manager API calls scattered across the codebase, they had environment variables. Standard, boring, works-everywhere environment variables.
The Vault instance runs on their Kubernetes cluster — same cluster, any cloud. When they deployed to their second cloud (Azure), the only change was updating the Vault address in the ClusterSecretStore. Zero application code changes. Zero.
Their security team actually preferred Vault — it has better audit logging, dynamic secrets (database credentials that rotate automatically), and a policy engine that makes IAM policies look like cave paintings.
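"Dynamic secrets" deserves one concrete picture. With Vault's database secrets engine, there is no long-lived password to leak: each credential is minted on demand and expires on its own TTL. A sketch using the hvac client; the role name is hypothetical and assumes the database engine is already mounted and configured:

```python
import hvac

client = hvac.Client(url="https://vault.internal:8200")
# Mints a fresh PostgreSQL user with a lease; Vault revokes it when the TTL expires
creds = client.secrets.database.generate_credentials(name="myapp-readwrite")
username = creds["data"]["username"]
password = creds["data"]["password"]
```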
5. Amazon S3 -> MinIO
Lock-In Score: 3/10
The Trap
S3 is probably the most ubiquitous cloud service ever built. The S3 API has become a de facto standard — even other cloud providers and storage systems implement it. So you'd think S3 would be easy to leave.
And you'd be... mostly right. Which is why S3's lock-in score is only 3/10. The lock-in isn't in the API — it's in the ecosystem. S3 event notifications trigger Lambda functions. S3 lifecycle policies manage data tiering. CloudFront distributions serve S3 content. IAM policies control S3 access. S3 bucket policies, CORS configurations, versioning, replication rules — all configured through AWS-specific APIs.
Your application code that reads and writes objects? That's portable. Everything around it? Less so.
The Pain
This was actually one of our easier migrations, so instead of a horror story, here's a comedy. A client asked us to migrate their object storage from S3 to something portable. We looked at their application code. It used the standard S3 SDK. We pointed it at MinIO. We ran the tests.
They passed.
All of them.
The entire migration was changing one environment variable:
```bash
# Before
AWS_S3_ENDPOINT=https://s3.us-east-1.amazonaws.com

# After
AWS_S3_ENDPOINT=https://minio.storage:9000
```
The senior engineer who'd been allocated two weeks for the migration finished in an afternoon and spent the rest of the sprint refactoring technical debt. He still talks about it as "the best migration I've ever done" — a bar that, in fairness, was set extremely low by every other migration he'd done.
The Escape: MinIO
MinIO is an S3-compatible object storage system that runs anywhere. And when we say "S3-compatible," we mean it — MinIO implements the S3 API so faithfully that most applications can't tell the difference.
```yaml
# minio-tenant.yaml — S3-compatible storage on any K8s cluster
apiVersion: minio.min.io/v2
kind: Tenant
metadata:
  name: storage
spec:
  pools:
    - servers: 4
      volumesPerServer: 4
      volumeClaimTemplate:
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 500Gi
          storageClassName: standard
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
  # Enable bucket notifications (replaces S3 Event Notifications)
  # Works with NATS, Kafka, Webhooks, etc.
  env:
    - name: MINIO_NOTIFY_NATS_ENABLE
      value: "on"
    - name: MINIO_NOTIFY_NATS_ADDRESS
      value: "nats://nats:4222"
    - name: MINIO_NOTIFY_NATS_SUBJECT
      value: "storage.events"
  # TLS, monitoring, the works
  requestAutoCert: true
  prometheusOperator: true
```
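Those MINIO_NOTIFY_NATS_* variables are what replace S3 Event Notifications. A hedged sketch of the consuming side; the subject matches the tenant config above, and the parsing assumes MinIO's S3-style event records:

```python
# storage_events.py: listen for MinIO bucket events on NATS
import asyncio
import json

import nats

async def main():
    nc = await nats.connect("nats://nats:4222")

    async def on_event(msg):
        event = json.loads(msg.data.decode())
        # MinIO publishes S3-compatible notification records
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print(f"object created: {bucket}/{key}")

    await nc.subscribe("storage.events", cb=on_event)
    await asyncio.Event().wait()  # run until cancelled

asyncio.run(main())
```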
Application code — literally unchanged:
```python
import boto3
import os

# The same boto3 code works with both S3 and MinIO
s3 = boto3.client(
    's3',
    endpoint_url=os.environ.get('S3_ENDPOINT'),  # Only env var that changes
    aws_access_key_id=os.environ['S3_ACCESS_KEY'],
    aws_secret_access_key=os.environ['S3_SECRET_KEY'],
)

def upload_document(file_data: bytes, filename: str) -> str:
    """Upload a document — works identically with S3 and MinIO"""
    key = f"documents/{filename}"
    s3.put_object(
        Bucket=os.environ['S3_BUCKET'],
        Key=key,
        Body=file_data,
        ContentType='application/pdf'
    )
    return key

def download_document(key: str) -> bytes:
    """Download a document — works identically with S3 and MinIO"""
    response = s3.get_object(
        Bucket=os.environ['S3_BUCKET'],
        Key=key
    )
    return response['Body'].read()

# Presigned URLs work too
def get_download_url(key: str, expires: int = 3600) -> str:
    return s3.generate_presigned_url(
        'get_object',
        Params={'Bucket': os.environ['S3_BUCKET'], 'Key': key},
        ExpiresIn=expires
    )
```
The Aftermath
The only "gotcha" we've encountered with MinIO is performance tuning. S3 is backed by some of the most sophisticated distributed storage infrastructure ever built. MinIO is backed by whatever disks you give it. For most workloads, this doesn't matter. For high-throughput analytics workloads processing petabytes of data, you'll want to invest time in MinIO cluster sizing and storage configuration.
But for the 95% of applications that use S3 as "a place to put files and get them back later"? MinIO is a drop-in replacement. Literally drop-in. We have clients who ran their existing test suites against MinIO without changing a single line of test code, and everything passed.
The Escape Plan
If you've read this far, you might be feeling a mix of motivation and dread. The motivation: "We need to fix this." The dread: "That's a lot of services to migrate."
Here's the good news: you don't migrate all at once. That's the fastest way to create a different kind of disaster. Instead, we recommend the Escape Plan — a phased approach that builds portability incrementally.
Phase 1: Stop Digging (Weeks 1-2)
Establish a policy: all new services use portable alternatives. New database? CloudNativePG, not RDS. New queue? NATS, not SQS. New object storage bucket? Configure it with the S3 API but through an abstraction that works with MinIO.
You're not migrating anything. You're just stopping the bleeding.
Phase 2: Build the Platform (Months 1-2)
Deploy the portable alternatives alongside your AWS services. Get CloudNativePG, NATS, Vault, and MinIO running on your Kubernetes cluster. Let your team get comfortable with them. Run them in staging first.
Phase 3: Migrate by Opportunity (Months 2-6)
Every time you touch a service — bug fix, feature add, refactor — migrate its cloud dependencies to portable alternatives. Don't go looking for things to migrate. Let the natural development cycle bring migrations to you.
Phase 4: Prove Portability (Month 6)
Deploy your application to a second environment. It doesn't have to handle production traffic. It just has to work. This is your proof — to yourself, to your cloud vendor's sales team, and to your enterprise customers — that you're not locked in.
The Score Card
Here's a summary of our lock-in scores and migration estimates for a typical mid-size SaaS application:
| AWS Service | Portable Alternative | Lock-In Score | Migration Time | Code Changes |
|---|---|---|---|---|
| RDS | CloudNativePG | 6/10 | 2-4 weeks | Minimal |
| SQS/SNS | NATS | 7/10 | 4-8 weeks | Moderate |
| Lambda | Knative | 8/10 | 2-6 months | Significant |
| Secrets Manager | Vault + ESO | 5/10 | 2-3 weeks | Minimal to none |
| S3 | MinIO | 3/10 | 1-3 days | One env variable |
The total effort for most companies is 3-6 months of incremental work — not a rewrite, not a Big Bang migration, just steady progress toward portability.
The Bottom Line
AWS built brilliant services. We use them ourselves. This isn't an anti-AWS post — it's a pro-optionality post.
The Hotel California problem isn't that the hotel is bad. The hotel is great. The problem is that you can never leave. And in business, the inability to leave a vendor isn't just uncomfortable — it's expensive, it limits your market, and it puts your roadmap at the mercy of someone else's pricing team.
You don't need to check out. You just need to know that you could.
Want help assessing your lock-in exposure? We built a free architecture review that maps your AWS dependencies and estimates migration effort. Takes about an hour and doesn't require you to change anything — just understand where you stand.