Three-Pillar LLM Security: Isolate, Monitor, Review

aisec-levchenkod.com.min.png

Intro

The rise of AI tooling has created new opportunities for us and, undoubtedly, will continue to create even more challenges.

Here are some facts. Credentials are leaking at a pace the industry hasn't seen before, the supply chain attack surface is actively expanding, and we're all obligated to use these tools to stay competitive. 46% of SMBs experienced a cyberattack in 2025 - and only 14% said they were adequately prepared. That gap existed before agents. But now...

Most of the recently exposed vulnerabilities are not new. The development industry offers great solutions from end to end to cover our backs. Yet, naturally, unfortunately, some security advice has been put at the bottom of the backlog, because the number of engineers and businesses that have faced a dedicated cyberattack is not that large. 46% of SMBs experienced a cyberattack in 2025, and only 14% said they were adequately prepared. TechTarget

Now, unintentional exposures happen all over the place. So it's time to step back and see how we can protect ourselves from our own tooling.

I like to think about basic AI-aware precautions as a three-pillars framework: Isolate, Monitor, Review.

Isolate

Agents can't expose what they don't have access to.

Coding agents are surprisingly good at finding environment variables. Not through any clever exploit - they just read what's available. I watched Gemini export active env vars straight from a running Docker container. No explicit instruction, and no approval dialog. docker exec is all that's needed.

What can we do:

Keep prod credentials out of the agent's reach entirely. Separate env profiles, agents work against local or dev configs only. An agent that can't reach prod can't leak prod - and that includes uncontrolled access to prod instances, not just env files.

Store secrets in a proper manager. 1Password and Doppler both have solid secrets management with fine-grained access control. Worth noting: Bitwarden's own npm CLI was compromised via a hijacked GitHub Action in their CI pipeline in April 2026 - end-user vaults were untouched, but it's a clean illustration of why the tool you trust and the channel it ships through are separate threat surfaces.

Run agents in isolated containers. Claude Code ships with sandboxing support - use it. Researchers found 30+ vulnerabilities across AI coding tools in 2025 - Copilot, Cursor, Gemini, Codex CLI - many exploiting agents that simply trusted their environment.

Least privilege applies to MCP servers too. The server your agent connects to should have the minimum permissions for the task.

Automate credential rotation. When a leak happens, rotation limits the exposure window. OWASP has a good cheatsheet

Use Pre-commit hooks e.g. via Husky, to catch sensitive tokens and credentials before they are committed to the repo. Shift-Left Security at its finest

Monitor

Exposing your Claude API Token is bad. Not knowing about it is the actual worst.

The attack surface is growing fast. One tracker logged 35 AI-related security incidents in March 2026 alone - more than the previous seven months combined. CrowdStrike found that up to 90% of developers were already using AI coding tools in 2025, most with access to high-value source code.

Watch usage, set alerts on spikes. A sudden jump in API calls or token consumption is a signal worth investigating.

Backups matter more now - and so do guardrails. I don't curse on LLM, but when I do, it's because it changed something it shouldn't have. Ban destructive commands explicitly: rm -rf, drops, and truncates. Protect sensitive files: .env, *.pem, *.key. When something goes sideways - and it will be subtle - you want a restore point and a short list of things that couldn't have been touched. Especially critical when wiring up services via MCP. Commit and stash every meaningful change.

Harness-level logging and traces. If you're running agents through an orchestration layer - LangChain, LangGraph, CrewAI, or similar - ship traces to an observability tool. Langfuse is a solid open-source option for LLM tracing: every tool call, every input/output, timestamped. That's your audit trail. You really appreciate when the investigation "what did the agent do and when?" takes less than a minute

PII filters on outbound data. Know what's leaving your system. Agents working with user data should never be in a position to exfiltrate it without tripping a wire. Some tools like Datadog have scanners for sensitive data. Frameworks like Presidio take PII masking and redaction a step further.

Review

AI influencers will advertise a 4000-star claude-skills-repo or MCP to unlock some magic agentic workflow. The README is polished, and the install is one line. Add it to your harness, and you've handed an unreviewed package shell-level access to your dev environment.

That's not hypothetical. CVE-2025-6514 - the first documented full-system compromise via MCP infrastructure - came through mcp-remote, a package with 437,000 downloads featured in integration guides from Cloudflare, Hugging Face, and Auth0. The tj-actions supply chain attack hit 23,000+ repositories via a compromised GitHub Action disguised as a legitimate bot commit, auto-merged. CISA issued an advisory. OWASP's MCP Top 10 covers this pattern directly: compromised dependencies altering agent behaviour without triggering detection because they look legitimate. Ouch

Review every artifact your agent harness touches. Skills, MCP servers, plugins - anything it can reach is your responsibility. Be especially skeptical of anything heavily promoted with a thin commit history behind it.

The review load is only going up. Agents produce more code, faster. LLM judges can help triage - a second model checking outputs before they land is a reasonable first pass. But human-in-the-loop before merge stays a must. IBM has a good guide on the topic

Summary

Intentionally or not, AI amplifies existing vulnerabilities and ignoring that is just a delayed recipe for disaster.
Stick to industry's best practices, and review the artifacts. The tools are great. Just don't let them reach further than they need to.

The Quiet Breach: Security in the Age of Coding Agents

Intro

Isolate

Monitor

Review

Summary

Preparing RAG for production

Ask your AI about my services

The Quiet Breach: Security in the Age of Coding Agents

Intro

Isolate

Monitor

Review

Summary

Related posts

Preparing RAG for production

Ask your AI about my services