ferkakta.dev

GovCloud Bedrock: The Model Graveyard

I run a multi-tenant compliance platform where tenants can search their indexed review data using a Bedrock-backed agent. The search feature worked fine on commercial AWS — same code, same container image, same release. When I deployed to GovCloud this morning, the search endpoint returned nothing. No error in the UI. Just emptiness where results should have been.

The silent failure

The apiserver handles both indexing and search, so I tailed the pod logs and triggered a search from the browser. The request came back 200 OK in two milliseconds: the endpoint acknowledged success immediately, then called Bedrock in the background. The Bedrock call failed, but the HTTP response was already gone. The user saw a loading spinner that resolved to nothing.
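The mechanics look roughly like this — a minimal asyncio sketch of the fire-and-forget pattern, not the app's actual code. The handler and prompt are illustrative:

```python
import asyncio
import logging

async def call_bedrock(query: str) -> str:
    # Stand-in for the real ConverseStream call that GovCloud rejected.
    raise RuntimeError("ValidationException: on-demand throughput isn't supported")

async def search_handler(query: str) -> dict:
    # Fire-and-forget: schedule the call, respond before it runs.
    task = asyncio.create_task(call_bedrock(query))
    # The only trace of failure is a stderr log line; the HTTP
    # response is long gone by the time the callback fires.
    task.add_done_callback(
        lambda t: logging.error("Bedrock call failed: %s", t.exception())
    )
    return {"status": "ok"}  # 200 OK in milliseconds

async def main() -> dict:
    response = await search_handler("indexed reviews")
    await asyncio.sleep(0.01)  # background task fails here, out of band
    return response

result = asyncio.run(main())
print(result)  # {'status': 'ok'} -- the failure never reached the caller
```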

The actual error was buried in stderr:

ERROR    Unexpected error calling Bedrock API: An error occurred
         (ValidationException) when calling the ConverseStream operation:
         Invocation of model ID anthropic.claude-3-haiku-20240307-v1:0 with
         on-demand throughput isn't supported. Retry your request with the ID
         or ARN of an inference profile that contains this model.

The app was calling Bedrock with the raw Haiku model ID, and GovCloud rejected it. I spent the next four hours peeling back what GovCloud actually wants — and each answer revealed another problem underneath.

The EULA circle

I tried enabling Anthropic models in the GovCloud Bedrock console. GovCloud has a model access page with a proper form — company name, website, industry, intended users, and a text field where you describe your use case to Anthropic. I filled it out and hit submit.

All four models failed:

The GovCloud account XXXXXXXXXXXX cannot perform this operation.
You must use your associated standard AWS account to establish a
model access agreement or view the EULA in either us-east-1 or us-west-2 regions

So I went to the linked commercial account. The model access page there has been retired — AWS replaced it with auto-enablement, where models activate on first invocation with no form needed. The page just says “this feature has been retired” and points you at the model catalog.

The GovCloud console sends you to commercial. Commercial tells you the page doesn’t exist anymore. Two consoles, each pointing at the other, neither willing to do the thing.

If you’re on Control Tower, there’s a prerequisite before any of this matters — Bedrock inference profiles route cross-region, and the standard Allow SCP workaround doesn’t help. You have to add Bedrock to the guardrail’s NotAction list. I covered the full SCP mechanics in The Allow SCP that worked until it didn’t.

Invocation as workaround

Since commercial auto-enables on first invocation, I invoked all four Anthropic models from the linked commercial account via CLI. Haiku responded. Sonnet 4.5 responded. Sonnet 3.7 was legacy and refused. Sonnet 3.5 returned ResourceNotFoundException: This model version has reached the end of its life — fully dead on commercial, and yet the GovCloud form still presented it as a model I could request access to.

I went back to the GovCloud console and tried the model access form again. This time Haiku went through — the EULA had propagated from commercial in a strange eventual consistency pattern. Reloading the page a few times turned up more approved models. Sonnet 4.5 and 3.7 came through too. Three of four. Sonnet 3.5 stayed stuck — the GovCloud form still returned the same “use your associated standard AWS account” error, pointing me back to a commercial page that doesn’t exist, for a model that’s already dead on commercial anyway. A dead model, a retired page, and a GovCloud form that doesn’t know about either.

But “approved” and “usable” turned out to be different things.

The graveyard

I ran list-foundation-models in GovCloud and discovered what I’d actually been approving:

Model                GovCloud Status
Claude Sonnet 4.5    ACTIVE
Claude 3.7 Sonnet    LEGACY
Claude 3.5 Sonnet    LEGACY
Claude 3 Haiku       LEGACY

Every Anthropic model except Sonnet 4.5 is LEGACY in GovCloud. Legacy doesn’t mean deprecated-but-available — it means if you haven’t used the model in the last 30 days, AWS won’t let you start. Our GovCloud account was new, so every model except 4.5 was a tombstone.

The approval form let me approve models I can’t invoke. The entitlement exists. The access exists. The model refuses to run. Both GovCloud regions show the same table — identical graveyard.
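Filtering the graveyard down to what will actually run is a one-liner over the list-foundation-models output. A sketch — the data shape mirrors the `modelLifecycle.status` field in the real response, but the sample below is hand-built from the table above, and the 3.7/3.5 model IDs are illustrative:

```python
# Hand-built sample mirroring `aws bedrock list-foundation-models` output;
# only the two Anthropic IDs quoted in the post are confirmed exact.
models = [
    {"modelId": "anthropic.claude-sonnet-4-5-20250929-v1:0",
     "modelLifecycle": {"status": "ACTIVE"}},
    {"modelId": "anthropic.claude-3-7-sonnet-20250219-v1:0",
     "modelLifecycle": {"status": "LEGACY"}},
    {"modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",
     "modelLifecycle": {"status": "LEGACY"}},
    {"modelId": "anthropic.claude-3-haiku-20240307-v1:0",
     "modelLifecycle": {"status": "LEGACY"}},
]

def invocable(models: list[dict]) -> list[str]:
    # LEGACY means "no new use after 30 idle days" -- on a fresh
    # GovCloud account, only ACTIVE models will actually run.
    return [m["modelId"] for m in models
            if m["modelLifecycle"]["status"] == "ACTIVE"]

print(invocable(models))  # ['anthropic.claude-sonnet-4-5-20250929-v1:0']
```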

The inference profile wall

With Sonnet 4.5 approved and active, I configured the app to use it. The app defaults to anthropic.claude-3-haiku-20240307-v1:0, so I overrode it with anthropic.claude-sonnet-4-5-20250929-v1:0. Same error shape, different words:

ValidationException: Invocation of model ID anthropic.claude-sonnet-4-5-20250929-v1:0
with on-demand throughput isn't supported. Retry your request with the ID or ARN
of an inference profile that contains this model.

GovCloud doesn’t support direct on-demand model invocation. You have to use an inference profile — a cross-region routing wrapper that AWS manages. The ID format changes from anthropic.claude-sonnet-4-5-20250929-v1:0 to us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0. The us-gov. prefix isn’t just a namespace; it’s a fundamentally different invocation mechanism. Any app that hardcodes a Bedrock model ID needs a GovCloud-specific override, and there’s no automatic translation.
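Since there's no automatic translation, the mapping ends up in app code or config. A hypothetical helper (not part of the app) that captures the rule as I observed it — prefix Anthropic model IDs in the GovCloud partition, leave everything else alone:

```python
# Hypothetical helper: derive the GovCloud inference-profile ID from a
# raw model ID. Based on observed behavior: Anthropic models require the
# us-gov. cross-region profile; Titan Embed worked with raw IDs.
GOV_PROFILE_PREFIX = "us-gov."

def govcloud_model_id(model_id: str, partition: str = "aws-us-gov") -> str:
    """Return the ID to pass to Converse in the given partition."""
    if partition == "aws-us-gov" and model_id.startswith("anthropic."):
        return GOV_PROFILE_PREFIX + model_id
    return model_id

print(govcloud_model_id("anthropic.claude-sonnet-4-5-20250929-v1:0"))
# us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0
```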

The IAM surprise

With the inference profile ID in place, the next error was AccessDeniedException. Our IAM policy allowed arn:aws-us-gov:bedrock:us-gov-east-1::foundation-model/*, which seemed reasonable — our infrastructure is in us-gov-east-1.

The us-gov. inference profile routes across both GovCloud regions, and AWS picks the region for each request. My second test request landed in us-gov-west-1 and got denied because the IAM policy only covered east. On top of that, inference profiles have a different ARN namespace from foundation models — arn:aws-us-gov:bedrock:REGION:ACCOUNT:inference-profile/* instead of arn:aws-us-gov:bedrock:REGION::foundation-model/*. The original policy only matched foundation models, so even requests that stayed in east were denied when they hit the inference profile ARN.

I ended up with four resource entries:

"Resource": [
  "arn:aws-us-gov:bedrock:us-gov-east-1::foundation-model/*",
  "arn:aws-us-gov:bedrock:us-gov-west-1::foundation-model/*",
  "arn:aws-us-gov:bedrock:us-gov-east-1:XXXXXXXXXXXX:inference-profile/*",
  "arn:aws-us-gov:bedrock:us-gov-west-1:XXXXXXXXXXXX:inference-profile/*"
]

Foundation models in both regions, inference profiles in both regions. The inference profile is a load balancer you can’t configure, and it uses a resource ARN your existing policy doesn’t match.
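Both failure modes fall out of plain wildcard matching. A simplified check — real IAM evaluation has more moving parts, and the account ID here is a placeholder:

```python
from fnmatch import fnmatch

# The original, too-narrow policy: east-region foundation models only.
original_policy = ["arn:aws-us-gov:bedrock:us-gov-east-1::foundation-model/*"]

def allowed(resource_arn: str, patterns: list[str]) -> bool:
    # Simplified IAM-style wildcard match, enough to show the mismatch.
    return any(fnmatch(resource_arn, p) for p in patterns)

model = "anthropic.claude-sonnet-4-5-20250929-v1:0"
profile_east = (f"arn:aws-us-gov:bedrock:us-gov-east-1:123456789012:"
                f"inference-profile/us-gov.{model}")  # placeholder account
model_west = f"arn:aws-us-gov:bedrock:us-gov-west-1::foundation-model/{model}"

print(allowed(profile_east, original_policy))  # False: wrong ARN namespace
print(allowed(model_west, original_policy))    # False: wrong region
```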

What I changed

Three infrastructure changes, zero code changes. The app already supported the AGENT_LLM_MODEL_ID environment variable — it just defaulted to Haiku, which was fine everywhere except GovCloud.
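The override point, sketched — the variable name is from the deployment, and the default mirrors the app's hardcoded Haiku fallback; the surrounding code is illustrative:

```python
import os

# Per-environment model selection: commercial deployments never set the
# variable and get Haiku; GovCloud sets it to the inference-profile ID.
model_id = os.environ.get(
    "AGENT_LLM_MODEL_ID",
    "anthropic.claude-3-haiku-20240307-v1:0",  # fine everywhere but GovCloud
)
print(model_id)
```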

I invoked models from the linked commercial account to propagate EULAs, then submitted the GovCloud use case form and waited for eventual consistency. I set the GovCloud apiserver’s environment variable to us-gov.anthropic.claude-sonnet-4-5-20250929-v1:0 — the only non-legacy model, using the inference profile ID format. And I expanded the IAM policy to cover both GovCloud regions for foundation models and inference profiles.

One bright spot — Titan Embed v2 works fine with raw model IDs in GovCloud. The inference profile requirement appears to be specific to Anthropic models, or at least to models with cross-region profiles.

Some of this is documented — the cross-region inference page covers inference profiles and the us-gov. prefix. But the interaction between model enablement, legacy status, and the broken commercial-to-GovCloud EULA flow isn’t in any single document. I found the full picture by reading error messages and testing each layer independently.