<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Litellm on ferkakta.dev</title><link>https://ferkakta.dev/tags/litellm/</link><description>Recent content in Litellm on ferkakta.dev</description><generator>Hugo</generator><language>en-US</language><copyright>Copyright fizz.</copyright><lastBuildDate>Fri, 20 Mar 2026 17:00:00 -0500</lastBuildDate><atom:link href="https://ferkakta.dev/tags/litellm/index.xml" rel="self" type="application/rss+xml"/><item><title>The $233 Day, Part 2: The Inference Iceberg</title><link>https://ferkakta.dev/233-dollar-day-part-2/</link><pubDate>Fri, 20 Mar 2026 17:00:00 -0500</pubDate><guid>https://ferkakta.dev/233-dollar-day-part-2/</guid><description>&lt;p&gt;I posted the part 1 findings to the team thread — model switch, cache invalidation, 20× call volume, $173 training run. Case closed. The numbers were clean, the explanation was satisfying, and the model got reverted within the hour.&lt;/p&gt;
&lt;p&gt;Except $173 was wrong. Not wrong in the analysis — the training run did cost that much. Wrong in scope. I&amp;rsquo;d found the visible part of the spend and stopped looking.&lt;/p&gt;</description></item><item><title>The $173 Training Run</title><link>https://ferkakta.dev/173-dollar-training-run/</link><pubDate>Fri, 20 Mar 2026 15:00:00 -0500</pubDate><guid>https://ferkakta.dev/173-dollar-training-run/</guid><description>&lt;p&gt;The Slack message landed at 3pm on a Wednesday: &amp;ldquo;model training successful, previously 20min, now 1h30m.&amp;rdquo; I had finished an EKS 1.32-to-1.33 upgrade on the ramparts cluster that morning. My upgrade, my timeline, my problem.&lt;/p&gt;
&lt;p&gt;The first theory wrote itself. New cluster version, fresh nodes, cold image caches. I&amp;rsquo;d fixed a broken cluster autoscaler earlier that day — the old autoscaler deployment was pinned to a node selector that no longer matched after the upgrade, so pods were stacking up in Pending until I caught it. First-run penalties after a version bump are real. Everyone on the call nodded. I almost typed up that explanation and moved on.&lt;/p&gt;</description></item></channel></rss>