Your CTO is fleecing you (and AWS thanks you for it)
A founder I was advising had just been through a round of layoffs. Then I looked at his AWS bill.
I told him, with no particular drama in my voice, that I could save him the equivalent of a professional salary by turning things off. Not rewriting anything. Not migrating anything. Just turning off things that were sitting there, idle, billing him every hour.
He almost fell out of his chair. Not in the good way.
This isn’t a hit piece on CTOs
I want to be careful here, because the headline sounds like one and it isn’t. I have spent twenty years around technical leaders. The vast majority of them are not bad people. They are not fleecing anyone on purpose.
What they are is nerds with toys. And nerds with toys, given no resistance, will play with the toys.
Full disclosure: I am one of these nerds. I love AWS. I love wiring up infrastructure. The neckbeard impulse to orchestrate petabytes in the sky lives in me too. The difference, I hope, is that I learned early not to take liberties with other people’s pocketbooks. When it is my own money on the line, the boring choice gets a lot more interesting.
How this actually happens
The pattern at almost every small SaaS I get pulled into looks the same. It isn’t a conspiracy. It isn’t an alliance. It’s two well-intentioned people speaking different languages.
The CEO, who is usually not technical, says things they picked up from Harvard Business School, from investors, or from a competitor’s marketing page. They want the product to be “best in class,” “highly available,” “enterprise-ready,” “ready for the next round.” These phrases sound responsible. They sound like the things a serious operator should be asking for. They are also, almost always, completely disconnected from what the business actually needs at its current size.
The CTO hears those phrases and translates them. “Best in class” becomes “I get to use the best-in-class services.” “Highly available” becomes “we need multi-region.” “Ready to scale” becomes “Kubernetes.” The translation isn’t dishonest. From inside the CTO’s head, those are the obvious right answers. They are also, conveniently, the toys the CTO has been wanting to play with for months. AWS ships hundreds of services. Elasticsearch indexes things. Kafka streams events. Redis caches things. Microservices let you have many small repos instead of one large one. Each is genuinely cool. Each is a real service that real engineers have built real careers on.
The CEO doesn’t have the technical literacy to push back. The CTO doesn’t have the business literacy to push back on themselves. Nobody in the room is doing math against the runway.
That’s the fleecing. Not malice. Two earnest people, no friction between them, and a lot of pretty toys on the shelf.
What it looked like in real life
The company I told you about at the top of this post had, when I came in, six AWS accounts. A root account, plus separate ones for production, staging, development, logging, and one whose purpose I never figured out. Each account had its own permissions, its own networking, its own resources, and its own confusing relationships to the others. Three of the accounts were paying for high-tier AWS support, separately. Same company, billed three times.
The servers were enormous. The database was oversized for how rarely it was being queried. Plenty of unused resources were just sitting in regions nobody remembered creating. Elasticsearch was running even though the search needs of the app could be served by a Postgres index. A Memcached server and a Redis server sat side by side, doing essentially the same job. The whole thing had been “built to scale.” Built to scale to what, nobody could quite say. The product had real customers, but the customer count was a small four-digit number and the AWS bill was somewhere around ten thousand dollars a month. About half of that was waste.
Performance, by the way, was bad. Slow pages, slow queries. But the performance problem wasn’t in the infrastructure. The performance problem was in the application. There were N+1 queries everywhere, several obvious database indexes were missing, and a few queries that ran on every page load could have been cached at the application layer in twenty lines of code.
The CTO didn’t need a bigger server. The CTO needed to spend two days with the slow log and a SQL EXPLAIN.
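To make the N+1 pattern and the missing index concrete, here is a minimal sketch. It uses Python’s built-in sqlite3 as a stand-in for Postgres so it runs anywhere, and the `customers` and `orders` tables are hypothetical, not taken from the company described above. In Postgres the equivalent of the plan check would be `EXPLAIN` or `EXPLAIN ANALYZE`.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            total REAL
        );
    """)
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 101)],
    )
    # Three orders per customer.
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i, 10.0 * i) for i in range(1, 101) for _ in range(3)],
    )
    return conn


def totals_n_plus_one(conn):
    """The N+1 shape: one query for the list, then one more per row."""
    queries = 1
    totals = {}
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        queries += 1
        totals[name] = row[0]
    return totals, queries


def totals_single_query(conn):
    """The same answer from a single JOIN with GROUP BY."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """).fetchall()
    return dict(rows), 1


def add_missing_index(conn):
    """Index the column that every per-customer lookup filters on."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_orders_customer_id "
        "ON orders (customer_id)"
    )


def query_plan(conn):
    """Return SQLite's plan text for a lookup by customer_id."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (1,)
    ).fetchall()
    return " ".join(row[3] for row in rows)
```

With a hundred customers the first function issues 101 queries and the second issues one, for identical results. Calling `query_plan` before and after `add_missing_index` shows the lookup switching from a full scan to an index search.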
I cut the bill roughly in half without rewriting a single line of application code. I turned things off. I right-sized the things that needed to stay on. I consolidated accounts. I let one of the duplicate caching layers die. The product got faster too, because turning things off tends to make the things still on faster.
The shade list
Specific services I see misused over and over:
- Redis or Memcached. Almost always a red flag. Very often used, rarely actually needed. You’re already paying for a Postgres server, and Postgres has the caching features most apps actually need built right in. Use those. When you blow Postgres out as a cache, fine, then we talk.
- Elasticsearch. Your search is probably not that hard. Postgres has full-text search. It is fine. If Postgres full-text legitimately fails for your use case, run a small self-hosted Elasticsearch on a single node, not an Amazon OpenSearch cluster that costs more per month than a junior engineer.
- Microservices. If you have fewer than fifty engineers, you should be in a monolith. Twenty-five repos is twenty-five places to break, twenty-five things to deploy, and twenty-five chances to misconfigure the orchestration between them. You are not Netflix. You are not even one of Netflix’s interns.
- Multi-region anything, for an app with fewer than a hundred thousand active users. You don’t have a region problem. You have an uptime narrative problem, and the cure is single-region with backups, not multi-region with a Kafka cluster between them.
- Multi-account AWS federation, for any company not specifically dealing with regulatory compartmentalization. It’s a lot of yak-shaving for very little benefit.
If you took every dollar early-stage SaaS pays to AWS for services they don’t need and gave it back to those companies, you could fund a small country’s worth of runway.
The stack you actually need
For a SaaS with under a million dollars in ARR and no more than a few thousand users:
- A Postgres database.
- A place to store your assets (S3, R2, whatever).
- A compute node, or two if you must, behind a load balancer.
That’s it.
You don’t need a service mesh. You don’t need event streaming. You don’t need a separate caching layer; Postgres will hold what it needs in memory, and your application can cache hot reads itself in twenty lines of code. You don’t need an Elasticsearch cluster. You don’t need a NoSQL database next to the SQL one because someone on a blog told you “Postgres doesn’t scale.”
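For a sense of what “twenty lines of code” can look like, here is a minimal sketch of an in-process cache with a time-to-live. The names `cache_for` and `load_plan_limits` are hypothetical, and the function body stands in for whatever query your app runs on every page load.

```python
import functools
import time


def cache_for(seconds, clock=time.monotonic):
    """Cache a function's results in process memory for `seconds`."""
    def decorator(fn):
        store = {}  # positional args -> (expires_at, value)

        @functools.wraps(fn)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh, skip the real call
            value = fn(*args)
            store[args] = (now + seconds, value)
            return value

        wrapper.cache_clear = store.clear
        return wrapper
    return decorator


calls = {"count": 0}


@cache_for(60)
def load_plan_limits(plan_name):
    # Hypothetical hot read; the counter stands in for a database round trip.
    calls["count"] += 1
    return {"plan": plan_name, "seats": 5}
```

The trade-offs are the usual ones: each process keeps its own copy, arguments must be hashable, and readers can see data up to sixty seconds stale. For settings, plan limits, and other rarely changing lookups that is often acceptable; for anything that must be fresh, it is not.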
Postgres scales. Use Postgres until it visibly fails. When it does, you’ll know, and you’ll have a real business problem that has earned the new infrastructure. Buying the infrastructure first and waiting for the business problem to arrive is the most expensive form of optimism in modern software.
The rule of thumb
For a small piece of business software with ten thousand or fewer users, a reasonable monthly cloud bill to aim for is around five hundred dollars. You could do a lot worse than targeting that number.
For perspective: StriveDB, our SaaS for victim-service organizations, runs on AWS for about two hundred dollars a month. Real customers, real data, real security and uptime expectations. Two hundred a month.
If your bill is in the thousands of dollars a month, and the dollar figure is higher than the number of paying users you have (that is, more than a dollar per paying user per month), you are probably doing something wrong. It isn’t a perfect heuristic. It is a starting point. The conversation it opens with your CTO is the entire reason to write it down.
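Written down as a check, the heuristic might look like the sketch below. The function name is hypothetical, and reading “in the thousands” as one thousand dollars or more is my assumption, not a precise threshold.

```python
def bill_looks_high(monthly_bill_usd, paying_users):
    """Rule of thumb: the bill is in the thousands of dollars a month AND
    the dollar amount exceeds the number of paying users."""
    in_the_thousands = monthly_bill_usd >= 1000  # assumed cutoff
    more_dollars_than_users = monthly_bill_usd > paying_users
    return in_the_thousands and more_dollars_than_users
```

A ten-thousand-dollar bill against a hypothetical two thousand paying users trips the check. A two-hundred-dollar bill never does, whatever the user count, because it fails the first condition.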
How to cut your cloud bill, in order
The order matters. Most teams skip the cheap steps and go straight to the engineering work, then wonder why the bill barely moved.
1. Turn off everything that isn’t being used. Idle instances. Forgotten test clusters in regions nobody remembers creating. Duplicate logging pipelines. The staging environment nobody actually uses but that still runs production-grade resources. Walk through every account, every region, every service, and kill anything that doesn’t have a live owner. This is where the biggest wins live, and it costs nothing but an afternoon.
2. Right-size everything that’s left. Look at actual utilization of every instance, database, and cluster. If your average CPU is seven percent, you bought too much machine. If your database is using twelve percent of its provisioned IOPS, you bought too much database. Cloud providers are biased toward letting you over-provision, because that is their revenue model. Drop everything one or two tiers. You can scale back up in a single API call if you got it wrong.
3. Commit and reserve. Now that you know what you actually use, purchase reserved instances or savings plans against that baseline. AWS, GCP, and Azure all offer discounts of roughly 30 to 60 percent in exchange for a one- or three-year commitment. The math is almost always obvious once you’re not over-provisioned. Free money.
4. Then, and only then, do the expensive programming work. If after the first three steps the bill is still too high, now it’s time to spend engineering hours making architecture components unnecessary. Fix the N+1 queries. Add the missing indexes. Replace the redundant caching layer with Postgres. Collapse the microservices. This is real work and worth doing, but it is the most expensive way to save money, so make sure the cheap moves came first.
Most CTOs want to start at step four. Resist. Steps one through three usually cut the bill in half without anyone writing a single line of code.
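The first three steps can be sketched as a simple triage over an inventory. Everything here is illustrative: the `Resource` fields, the twenty percent utilization cutoff, and the sample inventory are assumptions of mine, not numbers from a real account. In practice the utilization figures would come from your provider’s monitoring.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    monthly_cost_usd: float
    avg_utilization_pct: float  # e.g. average CPU or share of provisioned IOPS
    has_owner: bool = True


def triage(resources, low_utilization_pct=20.0):
    """Sort resources into turn off, right-size, or keep.

    The cutoff is an assumed value; pick your own.
    """
    plan = {"turn_off": [], "right_size": [], "keep": []}
    for r in resources:
        if not r.has_owner:
            plan["turn_off"].append(r.name)    # step one: nobody claims it
        elif r.avg_utilization_pct < low_utilization_pct:
            plan["right_size"].append(r.name)  # step two: too much machine
        else:
            plan["keep"].append(r.name)
    return plan


def committed_cost(baseline_monthly_usd, discount_pct):
    """Step three: monthly cost after a commitment discount on the baseline."""
    return baseline_monthly_usd * (1 - discount_pct / 100)


# Hypothetical inventory for illustration only.
inventory = [
    Resource("forgotten-test-cluster", 900, 0.0, has_owner=False),
    Resource("app-server", 1200, 7.0),
    Resource("primary-db", 1500, 12.0),
    Resource("worker", 300, 65.0),
]
```

Running `triage(inventory)` puts the ownerless cluster in the turn-off pile and the two under-used machines in the right-size pile. Only after that is it worth calling `committed_cost` on whatever baseline remains, since committing to an over-provisioned baseline just locks in the waste at a discount.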
The CTO’s actual job
This is the part the title undersells.
A CTO at an early-stage company has a duty that goes beyond writing good code or picking interesting tools. They are the chief steward of the technical budget of a small, fragile, capital-constrained business. They are an advisor to a CEO who, by definition, cannot fully evaluate their technical recommendations.
That role is a fiduciary role, even if no investor calls it that. It requires courage. It requires the willingness to tell your own engineers that no, we’re not going to use the cool thing this quarter, because we can’t afford it and we don’t need it. It requires the willingness to tell the CEO that “we need to scale” is not a real requirement, and please tell me what business outcome you are actually trying to enable.
The CTOs I trust most are the ones who fight for the boring stack. Not because they don’t know the interesting one, but because they know exactly what the interesting one would cost the company, and they would rather give that money back as runway.
Your founder red-flag checklist
If you are a non-technical founder reading this and wondering whether you are being fleeced, here is the cheat sheet.
| Signal | What it usually means |
|---|---|
| Your engineering team spends a lot of meeting time orchestrating services | You have too many services. Healthy ones don’t need that much coordination. |
| Your monthly cloud bill grew faster than your user count | Tooling problem, not a scaling problem. |
| Your CTO uses “we need to be ready for scale” without naming a specific business event | Permission-asking, not requirements-gathering. |
| Your repo count is in the double digits and you have fewer than fifty engineers | Premature microservices. |
| You’re running Kubernetes and you have fewer than fifty engineers | K8s is a full-time job, not a free feature. Most apps your size run fine on a managed service or a couple of VMs. |
| You’re paying for AWS Enterprise Support on more than one account | Avoidable duplication. Consolidate. |
| You’re running Redis or Memcached in production | Rarely necessary. Postgres caches what most apps need. Ask whether you actually outgrew it before adding a second server. |
| You’re running multi-region production for an app with under a hundred thousand users | A story you’re telling investors, not a thing your users need. |
| Your performance complaints get answered with “we need to upgrade the database server” | Almost always actually an N+1 query, a missing index, or an uncached hot path. |
Take this list to your next one-on-one with your CTO. The good ones will tell you which of these you actually do need and why. The bad ones will get defensive.
Bookend
We turned off the unused stuff. We right-sized the rest. We left the cool toys in the box.
The savings would have kept someone on payroll.
Don’t be that founder. Don’t be that CTO. The boring stack is the kind one.