Richard Skacel

AI for Cloud Engineers

Owned by Richard

AI for Cloud Engineers

18 members • Free

Automate your cloud work with AI. GCP, Azure, VMware. Save hours every week with real workflows.

23 contributions to AI for Cloud Engineers

Richard Skacel

4d •

Cost Optimization

I built an AI agent that reads my AWS bill and tells me exactly what to cut

AWS shipped its own FinOps Agent last week (June 9, public preview). It's genuinely good: plain-English cost investigations, anomaly root-cause, scheduled reports. But you don't have to wait for the preview, and you don't have to hand your entire cost dataset to a managed black box. You can build a leaner version yourself in an afternoon, and you'll understand every line of it.

Here's the core idea. Your AWS Cost and Usage data is just numbers. LLMs are great at turning structured numbers into prioritized, plain-English actions, IF you feed them the right slice.

The pattern is four stages.

- Stage one, Collect: pull the last 30 to 60 days of cost data from the Cost Explorer API (boto3, get_cost_and_usage), grouped by SERVICE and by linked account, plus rightsizing and idle-resource signals from Compute Optimizer and Cost Optimization Hub.
- Stage two, Reduce: compress that into a compact JSON summary of top movers, week-over-week deltas, and biggest absolute spend. This is the step that matters most.
- Stage three, Reason: hand THAT to Claude or GPT with a tight prompt. "You are a FinOps analyst. Here is the cost summary. Return the top 5 actions ranked by dollars saved per month, each with the exact AWS steps and the risk."
- Stage four, Deliver: render the ranked report to Slack, a PDF, or a ticket.

The magic isn't the model. It's the data engineering before the model. Feed it raw billing CSVs and you get mush. Feed it a clean "top 10 movers plus deltas" summary and you get an action list a senior engineer would write.

Three things I learned the hard way.

- First, never paste raw account IDs or ARNs into a public LLM. Mask them; FinOps data is sensitive.
- Second, anchor the model with thresholds: tell it to flag anything that grew over 20% week-over-week or any single idle resource over $200 a month. Without thresholds, it hedges.
- Third, make it output a diff, not a dashboard. "What changed and what do I do" beats "here are 40 charts" every time.

That's literally why AWS's own agent emphasizes investigation summaries over more graphs.
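The Reduce stage described above can be sketched in plain Python. This is a minimal illustration, not the author's implementation. It assumes the input is the ResultsByTime list returned by get_cost_and_usage when grouped by SERVICE, with one list per week. The function names (mask_account_ids, weekly_totals, top_movers), the sample service names, and the placeholder string are hypothetical. The 20% threshold is the one mentioned in the post.

```python
import json
import re
from collections import defaultdict

# AWS account IDs are 12 digits; this pattern is a simple heuristic, not a full ARN parser.
ACCOUNT_ID = re.compile(r"\b\d{12}\b")


def mask_account_ids(text):
    """Replace 12-digit account IDs with a placeholder before text leaves your environment."""
    return ACCOUNT_ID.sub("<ACCOUNT>", text)


def weekly_totals(results_by_time, metric="UnblendedCost"):
    """Sum cost per group key across the time buckets of one week."""
    totals = defaultdict(float)
    for bucket in results_by_time:
        for group in bucket.get("Groups", []):
            key = group["Keys"][0]
            totals[key] += float(group["Metrics"][metric]["Amount"])
    return dict(totals)


def top_movers(prev_week, this_week, growth_threshold=0.20, limit=10):
    """Build a compact week-over-week summary, largest absolute change first."""
    movers = []
    for service in set(prev_week) | set(this_week):
        before = prev_week.get(service, 0.0)
        after = this_week.get(service, 0.0)
        delta = after - before
        growth = delta / before if before > 0 else None
        # Flag growth above the threshold, and anything that is newly billed.
        flagged = (growth is not None and growth > growth_threshold) or (
            before == 0 and after > 0
        )
        movers.append({
            "service": service,
            "prev_week_usd": round(before, 2),
            "this_week_usd": round(after, 2),
            "delta_usd": round(delta, 2),
            "growth_pct": None if growth is None else round(growth * 100, 1),
            "flag": flagged,
        })
    # Sort by size of change; the service name breaks ties deterministically.
    movers.sort(key=lambda m: (-abs(m["delta_usd"]), m["service"]))
    return movers[:limit]


def bucket(groups):
    """Build one illustrative time bucket in the Cost Explorer response shape."""
    return {"Groups": [
        {"Keys": [name], "Metrics": {"UnblendedCost": {"Amount": amount, "Unit": "USD"}}}
        for name, amount in groups
    ]}


# Hypothetical sample data, not real billing figures.
previous = [bucket([("Amazon EC2", "100.0"), ("Amazon S3", "50.0")])]
current = [bucket([("Amazon EC2", "150.0"), ("Amazon S3", "52.0")])]

summary = top_movers(weekly_totals(previous), weekly_totals(current))
print(mask_account_ids(json.dumps(summary, indent=2)))
```

The printed JSON is the kind of compact summary that would go into the Reason-stage prompt in place of raw billing exports. Masking runs last so that any account ID that slipped into a key or label is replaced before the text is sent anywhere.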

Richard Skacel

6d •

Cost Optimization

Your AI inference cluster is probably burning 60% of its GPU budget on idle silicon. Here's the fix.

Quick gut check: open your Kubernetes cluster right now and run kubectl describe node on any GPU node. Look at the nvidia.com/gpu requests vs. actual utilization in your monitoring. If you're like most teams I audit, you'll find pods that reserve a full A100 or L4 and then sit at 8-15% GPU utilization all day. That's not a tuning problem. That's money on fire.

Here's why it happens. Kubernetes treats nvidia.com/gpu as a countable, non-divisible resource by default. Request 1 and you get the whole card, even if your model only needs a sliver of it. Most inference models (anything under ~7B params, embeddings, rerankers, OCR, classic CV) don't come close to saturating a modern GPU. So you end up with a 1:1 pod-to-GPU mapping and a fleet that's mostly idle.

Three levers fix the bulk of this, in order of effort:

1) GPU time-slicing (NVIDIA device plugin). Lets multiple pods share one physical GPU by oversubscribing time. Zero hardware requirement, works on almost any card, configured with a single ConfigMap. Best for bursty, latency-tolerant inference. Downside: no memory isolation, a leaky pod can OOM its neighbors.

2) MIG (Multi-Instance GPU) on A100/H100/H200. Hardware-partitions one GPU into up to 7 isolated instances, each with dedicated memory and compute. Real isolation, predictable performance. Best for production multi-tenant inference. Downside: only on data-center GPUs, fixed partition profiles.

3) Right-size the request, then bin-pack onto Spot. Once pods share GPUs cleanly, schedule them onto spot/preemptible GPU nodes with proper PodDisruptionBudgets and a fallback node pool. This is where the 60-80% savings actually land.

Real example from a recent audit: a team running 12 inference services on 12 dedicated L4s. After time-slicing the latency-tolerant services 4:1 and moving them to spot, they dropped to 4 GPUs with a 1-GPU on-demand safety pool. ~65% monthly GPU cost cut, no SLA regression.
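The sizing arithmetic behind time-slicing can be sketched as follows. This is an illustration under stated assumptions, not the audit's actual method. The pod records, field names, and function names are hypothetical, and the utilization figures would come from whatever GPU monitoring you already run. The 15% cutoff mirrors the 8-15% range mentioned in the post. The post does not say how the 12 services were split between shared and dedicated GPUs, so the example simply treats all 12 as latency-tolerant.

```python
import math


def sharing_candidates(pods, max_util_pct=15.0):
    """Return names of pods that reserve one whole GPU but stay at or below max_util_pct."""
    return [
        pod["name"]
        for pod in pods
        if pod["gpu_request"] == 1 and pod["avg_gpu_util_pct"] <= max_util_pct
    ]


def gpus_after_time_slicing(shared_pods, replicas_per_gpu, dedicated_pods=0):
    """Physical GPUs needed when shared pods are packed replicas_per_gpu to a card."""
    if replicas_per_gpu < 1:
        raise ValueError("replicas_per_gpu must be at least 1")
    return math.ceil(shared_pods / replicas_per_gpu) + dedicated_pods


# Hypothetical fleet: 12 services, each holding a full GPU at low utilization.
fleet = [
    {"name": f"svc-{i}", "gpu_request": 1, "avg_gpu_util_pct": 8.0 + (i % 4) * 2}
    for i in range(12)
]

candidates = sharing_candidates(fleet)
shared_gpus = gpus_after_time_slicing(len(candidates), replicas_per_gpu=4)
print(f"{len(candidates)} candidates fit on {shared_gpus} shared GPUs at 4:1")
```

Average utilization alone does not prove that a pod is safe to share. Time-slicing gives no memory isolation, so the combined GPU memory of the pods on a card still has to fit, and that check is not modeled here.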

Richard Skacel

9d •

Cost Optimization

The Cloud Bill Nobody Wants to Open

Let’s be honest. Most engineers love building things. Very few enjoy opening the monthly cloud bill.

At first, cloud costs seem manageable. A few virtual machines, some storage, a database or two. Then the project grows. More environments. More teams. More services. More “temporary” resources that somehow survive for years. And suddenly everyone is asking the same question: “Why are we spending so much?”

The Funny Thing About Cloud Costs

The biggest cloud bills rarely come from one huge mistake. They come from hundreds of small decisions. A larger VM because it’s easier. A test environment left running over the weekend. Snapshots that nobody reviews. A Kubernetes cluster sized for traffic that never arrived. Individually, these decisions seem harmless. Together, they become a budget problem.

FinOps Isn’t About Spending Less

This is where many people get FinOps wrong. FinOps isn’t about cutting costs at all costs. It’s about understanding where money creates value and where it creates waste. Nobody complains about spending money on infrastructure that helps the business grow. People complain when they’re paying for things nobody uses. That’s a big difference.

Engineers Are Part of the Solution

For years, cloud costs were often treated as a finance problem. Today, that’s changing. The engineers designing the architecture are often in the best position to optimize it. A small design decision can save thousands of dollars per year. A good tagging strategy can reveal hidden waste. A simple automation can shut down non-production environments every night. The best cost optimization stories often start with engineers asking: “Do we actually need this?”

Where AI Comes In

This is one of the reasons I’m excited about combining AI and cloud engineering. AI can already help:

- analyze cloud bills
- identify unusual spending patterns
- recommend rightsizing opportunities
- explain cost spikes
- generate optimization reports

Instead of spending hours digging through dashboards, engineers can focus on making decisions.

Richard Skacel

20d •

AI & Automation

Looking for People Who Want to Build This With Me

I’m building AI for Cloud Engineers, a community focused on real, practical cloud engineering with AI. Not theory. Not recycled LinkedIn content. Real things engineers can use:

- AWS / Azure / GCP tips
- Terraform examples
- automation scripts
- cloud cost optimization
- AI prompts for engineers
- troubleshooting workflows
- infrastructure templates
- real-world cloud scenarios

I’m thinking about turning this into something bigger than just a community. Possible directions:

- free knowledge hub
- paid resource library
- cloud engineering toolkit
- affiliate offers
- live workshops
- consulting offers
- partner-led content
- small expert network

If you are into cloud, DevOps, automation, AI, FinOps, security, or infrastructure, and you’d like to contribute, collaborate, create content, share tools, or help shape the direction of this community, let me know. I’m open to:

- contributors
- technical writers
- cloud engineers
- DevOps engineers
- AI builders
- FinOps people
- affiliate/partnership ideas
- people who want to build useful products for engineers

The goal is simple: Build a practical place where cloud engineers can learn faster, automate more, and stay ahead of what’s coming.

If this sounds interesting, comment below or send me a message.

20d •

IAM is one of the best areas for AI-assisted review. Not because AI should decide permissions for you, but because it can quickly highlight risky patterns:

- wildcard permissions
- overly broad roles
- admin-level access
- unused-looking access
- possible privilege escalation paths

This is especially useful in AWS IAM, GCP IAM, Azure RBAC, and service account reviews.

If you test this, reply with:

1. AWS / Azure / GCP
2. what type of policy or role you reviewed
3. what risk AI found
4. whether you agreed with the result

Never paste real customer IAM data without sanitizing it.
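Sanitizing a policy and running a quick local pre-check before any AI review could look like the sketch below. It is a minimal illustration for AWS IAM JSON policies only. The function names, the placeholder account ID, and the finding labels are hypothetical. It covers only the wildcard and admin-level patterns from the list above, and it ignores NotAction, conditions, and privilege escalation paths.

```python
import json
import re

# AWS account IDs are 12 digits; replace them with a fixed dummy value.
ACCOUNT_ID = re.compile(r"\b\d{12}\b")


def sanitize_policy(policy_json):
    """Mask account IDs in a policy document given as a JSON string."""
    return ACCOUNT_ID.sub("000000000000", policy_json)


def as_list(value):
    """IAM allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def wildcard_findings(policy):
    """Return (statement index, label) pairs for Allow statements with wildcards."""
    findings = []
    for index, statement in enumerate(as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action"))
        resources = as_list(statement.get("Resource"))
        if "*" in actions and "*" in resources:
            findings.append((index, "admin-level access"))
        elif "*" in actions:
            findings.append((index, "wildcard action"))
        elif any(action.endswith(":*") for action in actions):
            findings.append((index, "service-wide wildcard action"))
        elif "*" in resources:
            findings.append((index, "wildcard resource"))
    return findings


# Hypothetical policy for illustration only.
raw_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "iam:*",
         "Resource": "arn:aws:iam::123456789012:role/app"},
    ],
})

clean = sanitize_policy(raw_policy)
for index, label in wildcard_findings(json.loads(clean)):
    print(f"statement {index}: {label}")
```

Masking account IDs is only a first pass. Role names, bucket names, and tags can also identify a customer, so those still need a manual look before anything is pasted into a model.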

Richard Skacel

@richard-skacel-9962

Cloud Architect @ Google Partner | GCP, VMware, AD, Terraform | Building resilient cloud solutions | Awaiting Midea.

Active 18h ago

Joined Mar 29, 2026

Prague
