ferkakta.dev

Every tool I’ve ever used is a CloudFormation frontend

I was reading a job description that wanted CloudFormation experience, and I had the thought that derails the actual task: I’ve spent my entire career using tools that compile down to CloudFormation and don’t mention it until something breaks. I’ve just never framed it that way.

My career has been a parade of progressively nicer frontends for the same underlying control plane, one at a time.

The first one was the AWS console. Click, wait, refresh, click. Then CloudFormation itself, which was an improvement in the way that a paper map is an improvement over asking for directions — technically correct, nearly unusable in practice. Then Serverless Framework, which promised to abstract the whole stack into a YAML file and a deploy command. Then Terraform, which promised cloud-agnostic infrastructure as code with a state model that actually worked.

Every one of these tools made the same pitch: you don’t need to understand CloudFormation. You don’t need to think about the control plane. We handle that. And every one of them eventually put me in a room with CloudFormation anyway — debugging a stack that wouldn’t delete, tracing a resource that drifted, figuring out why an update rolled back when the API call succeeded.

The deepest nesting I ever hit was Terraform deploying Elastic Serverless Forwarder — Elastic’s log shipping product, distributed as an AWS Serverless Application, which was itself a CloudFormation stack of nested stacks. Terraform was automating CloudFormation that was automating CloudFormation. Three layers of indirection, and when it broke I was reading CloudFormation events from the innermost stack trying to understand why a Lambda function wouldn’t create, while Terraform’s plan said everything was fine and the Serverless Application’s own status page said “CREATE_COMPLETE.” Three tools, three opinions, one control plane that disagreed with all of them.
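For illustration, here is a minimal sketch of that kind of digging: walk a stack's events and follow every failed nested stack down to the innermost resource that actually failed. The event fields (`ResourceStatus`, `ResourceType`, `PhysicalResourceId`, `StackId`) are the ones CloudFormation's DescribeStackEvents returns. The function name and the injected `fetch_events` callable are assumptions made for this example, not anything from the tools above.

```python
from typing import Callable, Iterable, Optional

Event = dict  # one item from the "StackEvents" list of DescribeStackEvents


def failed_leaf_events(
    stack_id: str,
    fetch_events: Callable[[str], Iterable[Event]],
    seen: Optional[set] = None,
) -> list:
    """Collect failed resource events, descending into failed nested stacks.

    fetch_events is a hypothetical callable that returns the events for a
    stack name or stack ID; injecting it keeps the sketch runnable offline.
    """
    seen = set() if seen is None else seen
    if stack_id in seen:  # guard against visiting the same stack twice
        return []
    seen.add(stack_id)

    found = []
    for event in fetch_events(stack_id):
        if not event.get("ResourceStatus", "").endswith("_FAILED"):
            continue
        physical_id = event.get("PhysicalResourceId", "")
        is_stack = event.get("ResourceType") == "AWS::CloudFormation::Stack"
        if is_stack and physical_id == event.get("StackId"):
            continue  # the stack reporting on itself, not on a resource
        if is_stack and physical_id:
            # A nested stack failed: its own events hold the real reason.
            inner = failed_leaf_events(physical_id, fetch_events, seen)
            found.extend(inner or [event])
        else:
            found.append(event)
    return found
```

With boto3 installed, `fetch_events` could be something like `lambda s: cfn.describe_stack_events(StackName=s)["StackEvents"]` for a `cfn = boto3.client("cloudformation")`. That returns only the first page of events, so a real script would want the paginator.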

The best part was the cross-account access. I needed ESF to assume a role in a centralized logging account — the kind of thing you do in any serious AWS organization. I’d spent weeks navigating the bureaucracy of a CCOE that had accreted years of guardrails around a utility company’s AWS footprint, getting the right IAM role crafted with the right trust policy to ingest Kinesis streams from the org’s logging account. That access was a feat in itself.

So I configured role_arn in the ESF config. Because of course role_arn is there. Every other Elastic product — Logstash, Filebeat, the Elasticsearch output plugin — honors role_arn for cross-account access. This was a deeply AWS-centric product distributed through AWS’s own Serverless Application Repository. The field had to exist.

It didn’t. ESF never implemented cross-account role assumption. It wasn’t in the docs. It wasn’t a broken feature — it was a feature I hallucinated from pattern matching. I knew AWS, I knew Elastic products, I knew cross-account IAM, and I inferred something that was never there. The Sales Engineer that Elastic assigned to me couldn’t help because he’d never worked in an org complex enough to need it. My effort was doomed from the start and nobody knew it. In a Ruth Rendell novel she tells you whodunit on page 1 and the thrill is watching the machinery converge. This was the opposite — no one knew the ending, not me, not the Sales Engineer, not the CCOE. We all assumed the system had the capability because everything in our collective experience said it should.

I wrote a Lambda to replace ESF. Three hundred lines of Python. The Serverless Application it replaced was a CloudFormation stack with 47 resources.
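The replacement Lambda itself isn't reproduced here. As a hedged sketch of the one piece ESF lacked, this is roughly what cross-account role assumption looks like with boto3: call STS AssumeRole, then hand the temporary credentials to a client for the other account. The function names, the default session name, and the role ARN in the usage note are hypothetical.

```python
def assumed_role_client_kwargs(sts_client, role_arn: str,
                               session_name: str = "log-forwarder") -> dict:
    """Call STS AssumeRole and return keyword arguments for boto3.client().

    sts_client is anything with an assume_role(RoleArn=..., RoleSessionName=...)
    method, normally boto3.client("sts"). The session name is an arbitrary
    label chosen for this example.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
    )
    creds = response["Credentials"]  # temporary credentials for the target role
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def kinesis_client_for(role_arn: str, region: str):
    """Build a Kinesis client that acts as the assumed role (hypothetical helper)."""
    import boto3  # imported lazily so the helper above is testable without it

    sts = boto3.client("sts")
    kwargs = assumed_role_client_kwargs(sts, role_arn)
    return boto3.client("kinesis", region_name=region, **kwargs)
```

A call such as `kinesis_client_for("arn:aws:iam::111122223333:role/example-log-reader", "us-east-1")` (a made-up ARN) only works if that role's trust policy allows the caller to assume it. The credentials expire, so a long-running process would need to refresh them.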

The parade isn’t limited to cloud provisioning. I went through the same cycle with configuration management — Puppet, then Ansible, each one a frontend for “make this machine look like this.” I was huge on Puppet. I still remember Hiera fondly, the hierarchical data lookup that let you separate config from code before anyone called it “GitOps.” I once had the instinct to bring Puppet back for a problem that felt awkward in Terraform. Then I remembered I already know Ansible, I’ve used it plenty in the last two years, and Puppet didn’t lose to some superior universal successor. It became less relevant to the world I was actually working in.

The world I was actually working in was AWS.

The portability fantasy

What I wanted, especially early on, was to avoid marrying a cloud-specific automation language. As a Rubyist, then a Pythonista, then a Terraform person, I assumed the higher-order move was to stay abstract enough that I could theoretically redeploy the stack to Google Cloud, Azure, Oracle, or whatever came next. That theory sounded prudent for years.

It did not describe reality.

AWS outlived employers. AWS outlived projects. AWS outlived most of the abstraction fantasies built on top of it. I’ve been on the same cloud for over a decade. The companies changed, the tools changed, the job titles changed. The substrate didn’t. The practical work was never escaping the control plane — it was choosing how many layers of manners and indirection I wanted between me and it.

Terraform won that contest for me, not because it’s cloud-agnostic (I’ve never once redeployed a Terraform stack to a different cloud) but because its state model is honest. It tells you what it thinks exists, you tell it what should exist, and the diff is the plan. CloudFormation’s state is a black box. Terraform’s state is a JSON file you can read, move, import into, and argue with. The abstraction layer is thinner and the machinery is visible, which turns out to be what I actually wanted. When a resource won’t update I taint it and let Terraform rebuild it. I’m still teaching a tool to coerce the same control plane, just with a better vocabulary.
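To make "a JSON file you can read" concrete, here is a small sketch that lists the resource addresses recorded in a state file. It assumes the version 4 state layout, with a top-level `resources` list whose entries carry `mode`, `type`, `name`, and an optional `module`. That layout is an internal format that can change between releases, and `terraform show -json` is the documented machine-readable view. The function names are invented for this example.

```python
import json


def resource_addresses(state: dict) -> list:
    """Return Terraform-style addresses for every resource in a parsed state.

    Assumes the version 4 state layout. Per-instance index keys
    (count / for_each) are ignored to keep the sketch short.
    """
    addresses = []
    for resource in state.get("resources", []):
        address = f'{resource["type"]}.{resource["name"]}'
        if resource.get("mode") == "data":
            address = "data." + address  # data sources are prefixed
        if resource.get("module"):
            address = f'{resource["module"]}.{address}'  # e.g. module.logging
        addresses.append(address)
    return addresses


def load_state(path: str) -> dict:
    """Parse a local state file such as terraform.tfstate."""
    with open(path) as handle:
        return json.load(handle)
```

Running `resource_addresses(load_state("terraform.tfstate"))` against a local state file prints roughly what `terraform state list` would, minus the instance indexes. With a remote backend there is no local file to open, so the state would have to be pulled first.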

The machinery grins through

The pattern I keep rediscovering is that every abstraction layer eventually becomes thin enough to see the original machinery grinning through it. Serverless Framework generates CloudFormation templates, and when the deploy fails, you debug CloudFormation. Terraform calls the AWS API directly, and when the resource it manages turns out to be a CloudFormation stack, you read the CloudFormation event log to find out why it won’t update. CDK compiles TypeScript to CloudFormation; the whole product is a transpiler for the thing everyone claims to hate.

I’ve gotten more effective in AWS not by finding better abstractions but by learning which abstraction is lying to me at any given moment.

I don’t think this is a failure of tooling. I think it’s a property of substrates. The substrate is the thing that outlasts the tools built on top of it. SQL outlasted every ORM. HTTP outlasted every framework. CloudFormation will outlast Terraform, CDK, Pulumi, and whatever comes after them, not because it’s good, but because it’s the provisioning mechanism AWS itself ships and builds its own products on. Everything else is a frontend.

God is dead, but AWS is still here

The lesson I keep learning is not “CloudFormation bad” or “Terraform good” or “bring back config management.” It’s that my career has been a series of choices about which frontend I wanted over the same cloud substrate, and the only constant has been the substrate itself.

The tools get nicer. The abstractions get thinner. And every few years, when something breaks in a way the frontend can’t explain, I end up reading CloudFormation events at 2am and thinking: you were always here, weren’t you.

#aws #terraform #platformengineering #career