ferkakta.dev

What building infrastructure for a startup actually looks like

I spent a day doing the unglamorous infrastructure work that keeps a startup alive. Here’s everything that happened.

Morning: security audit

Audited two EKS clusters for a K8s privilege escalation vulnerability. Found 9 service accounts bound to cluster-admin that didn’t need it. Deleted two dead deployments, ArgoCD and Velero, both mine, both abandoned months ago. The rest are Kubeflow components we can’t touch until Kubernetes 1.36 ships the fix in April.
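For anyone who wants to run the same kind of check, here is a minimal sketch of the cluster-admin part. It assumes you have saved the output of `kubectl get clusterrolebindings -o json` and only walks ClusterRoleBindings (a namespaced RoleBinding that points at cluster-admin would need a second pass). The sample data and all names in it are invented for illustration.

```python
import json

# Invented sample in the shape of `kubectl get clusterrolebindings -o json`,
# trimmed to the fields this check reads.
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "deployer-admin"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
     "subjects": [{"kind": "ServiceAccount", "name": "deployer", "namespace": "tools"}]},
    {"metadata": {"name": "ops-admins"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
     "subjects": [{"kind": "Group", "name": "ops"}]},
    {"metadata": {"name": "viewer"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [{"kind": "ServiceAccount", "name": "dashboard", "namespace": "kube-system"}]},
    {"metadata": {"name": "no-subjects"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"}}
  ]
}
""")


def cluster_admin_service_accounts(bindings: dict) -> list[tuple[str, str, str]]:
    """Return (namespace, service account, binding name) for every
    ServiceAccount subject bound to the cluster-admin ClusterRole."""
    found = []
    for item in bindings.get("items", []):
        role = item.get("roleRef", {})
        if role.get("kind") != "ClusterRole" or role.get("name") != "cluster-admin":
            continue
        # `subjects` can be absent or null on a binding, so default to empty.
        for subject in item.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                found.append(
                    (subject.get("namespace", ""), subject["name"], item["metadata"]["name"])
                )
    return sorted(found)


if __name__ == "__main__":
    for namespace, name, binding in cluster_admin_service_accounts(SAMPLE):
        print(f"{namespace}/{name} via {binding}")
```

The list it produces is only the starting point. Deciding which of those accounts actually need cluster-admin is still a manual call.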

Then traced VPC peering between our Jenkins box and both clusters, proved making the control planes private has zero blockers, and posted the receipts. Sometimes the most valuable thing you can do is show a teammate that the thing they’ve been afraid to do is already safe.

EKS park/unpark

Built one-click EKS park/unpark — scale node groups to zero or bring them back. A GitHub Actions workflow with a status badge in the README so anyone can check cluster state without a terminal:

# One workflow, two inputs
on:
  workflow_dispatch:
    inputs:
      action:
        type: choice
        options: [park, unpark]

Saves $87/month, which matters pre-revenue.
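The step behind that workflow can be quite small. Below is a rough sketch, assuming EKS managed node groups and the boto3 `update_nodegroup_config` call. The function names and the idea of passing in a `saved` scaling config are my illustration here, not a description of the real workflow, and where the normal sizes get stored between runs is left open.

```python
def scaling_for(action: str, saved: dict) -> dict:
    """Return the scalingConfig to apply for 'park' or 'unpark'.

    `saved` is the node group's normal scaling config, for example
    {"minSize": 2, "maxSize": 4, "desiredSize": 2}. How it is persisted
    between runs is an assumption left to the caller.
    """
    if action == "park":
        # EKS requires maxSize >= 1, so only minSize and desiredSize go to zero.
        return {"minSize": 0, "maxSize": saved["maxSize"], "desiredSize": 0}
    if action == "unpark":
        return dict(saved)
    raise ValueError(f"unknown action: {action!r}")


def apply(cluster: str, nodegroup: str, action: str, saved: dict) -> None:
    """Apply the computed scaling config to one managed node group."""
    import boto3  # imported lazily so scaling_for() can be used without boto3

    eks = boto3.client("eks")
    eks.update_nodegroup_config(
        clusterName=cluster,
        nodegroupName=nodegroup,
        scalingConfig=scaling_for(action, saved),
    )
```

Keeping the sizing logic in a pure function means it can be checked without touching AWS, and the `action` value maps directly onto the `park`/`unpark` choice input.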

The finops rabbit hole

Provisioning a WorkSpace for our compliance advisor led me to discover that a sister company had an orphaned Simple AD running for four years: $1,752 wasted. Unattached EIPs added $2,400. A zombie transit gateway nobody remembered. No tag contract, no expiry enforcement.

So I designed a system:

No three yeses, no survival without written justification.
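To make "tag contract" and "expiry enforcement" concrete, here is one way such a check could look. This is an illustration only, not the actual rules of the system: the tag keys (`owner`, `purpose`, `expires-on`) and the ISO date format are assumptions I picked for the sketch.

```python
from datetime import date

# Hypothetical required tag keys; a real contract would define its own.
REQUIRED_TAGS = ("owner", "purpose", "expires-on")


def violations(tags: dict, today: date) -> list[str]:
    """List the ways a resource's tags break the (hypothetical) contract."""
    problems = [f"missing tag: {key}" for key in REQUIRED_TAGS if not tags.get(key)]
    expires = tags.get("expires-on")
    if expires:
        try:
            if date.fromisoformat(expires) < today:
                problems.append(f"expired on {expires}")
        except ValueError:
            # A date that cannot be parsed is treated as a violation, not ignored.
            problems.append(f"unparseable expires-on: {expires}")
    return problems
```

Run against every resource's tags on a schedule, a check like this would have flagged the orphaned directory and the unattached EIPs long before they ran for years.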

SES blocker

Applied for SES production access so our IdP can send MFA codes. AWS denied it: they detected a related account that already has SES production access. Separate legal entities, separate AWS Organizations, but shared personnel email addresses.

Appealing. Meanwhile a teammate bypassed it with a cross-account IAM role to the other org’s dev SES. Not a permanent solution, but it unblocked us.
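A cross-account setup like that usually comes down to two policy documents on a role in the account that owns SES. Here is a sketch of both, with placeholder account IDs and a placeholder identity ARN. I am assuming the role only needs to send mail, and the function names are mine.

```python
import json


def trust_policy(caller_account_id: str) -> dict:
    """Trust policy for the role: lets principals in the calling account
    assume it (they still need sts:AssumeRole permission on their side)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{caller_account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    }


def ses_send_policy(identity_arn: str) -> dict:
    """Permissions policy for the role: send-only access to one SES identity."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["ses:SendEmail", "ses:SendRawEmail"],
            "Resource": identity_arn,
        }],
    }


if __name__ == "__main__":
    # Placeholder values, not real accounts or domains.
    print(json.dumps(trust_policy("111111111111"), indent=2))
    print(json.dumps(
        ses_send_policy("arn:aws:ses:us-east-1:222222222222:identity/example.com"),
        indent=2,
    ))
```

Trusting the whole account root is the loosest option. Narrowing the principal to the one role the IdP runs as would be the tighter choice for anything longer-lived than a stopgap.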

The badge took five iterations

The security audit uncovered credentials that shouldn’t exist. An expired Bitbucket API token took 45 minutes to debug. None of this is content-worthy on its own.

But this is what building infrastructure for a startup actually looks like. Not architecture diagrams. Not conference talks. Just methodically reducing risk and cost, one kubectl delete at a time.

#aws #eks #finops #security