智能助手网
Tag aggregation: security


hnrss.org · 2026-04-17 05:39:58+08:00 · tech

Hi HN,

I’ve been working on an open-source project to explore a problem I keep running into with LLM systems in production: we give models the ability to call tools, access data, and make decisions, but we don’t have a real runtime security layer around them. So I built a system that acts as a control plane for AI behavior, not just infrastructure.

GitHub: https://github.com/dshapi/AI-SPM

What it does

The system sits around an LLM pipeline and enforces decisions in real time:

- Detects and blocks prompt injection (including obfuscation attempts)
- Forces structured tool calls (no direct execution from the model)
- Validates tool usage against policies
- Prevents data leakage (PII / sensitive outputs)
- Streams all activity for detection and audit

Architecture (high-level)

- Gateway layer for request control
- Context inspection (prompt analysis + normalization)
- Policy engine (using Open Policy Agent)
- Runtime enforcement (tool validation + sandboxing)
- Streaming pipeline (Apache Kafka + Apache Flink)
- Output filtering before the response leaves the system

The key idea: treat the LLM as untrusted, and enforce everything externally.

What broke during testing

Some things that surprised me:

- Simple pattern-based prompt injection detection is easy to bypass
- Obfuscated inputs (base64, unicode tricks) are much more common than expected
- Tool misuse is the biggest real risk (not the model itself)
- Most “guardrails” don’t actually enforce anything at runtime

What I’m unsure about

I’d really appreciate feedback from people who’ve worked on similar systems:

- Is a general-purpose policy engine like OPA the right abstraction here?
- How are people handling prompt injection detection beyond heuristics?
- Where should enforcement actually live (gateway vs. execution layer)?
- What am I missing in terms of attack surface?

Why I’m sharing

This space feels underdeveloped compared to traditional security. We have CSPM, KSPM, etc., but nothing equivalent for AI systems yet. I’m trying to explore what that should look like in practice. Would love any feedback, especially critical takes.

Comments URL: https://news.ycombinator.com/item?id=47799856
Points: 1
# Comments: 0
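The base64/unicode point can be made concrete. A minimal sketch of an obfuscation-aware inspection pass (the pattern list and function names are mine, not the project's, and real systems would pair this with a classifier): fold unicode tricks with NFKC normalization, decode embedded base64 runs, then run pattern checks over every resulting view of the input.

```python
import base64
import re
import unicodedata

# Hypothetical deny-list for illustration; pattern matching alone is
# easy to bypass, which is exactly the post's point.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"you are now", re.I),
]

# Runs of base64-ish characters long enough to hide a payload.
B64_RUN = re.compile(r"[A-Za-z0-9+/=]{16,}")


def normalize(text: str) -> str:
    """NFKC folds full-width and compatibility characters back to ASCII."""
    return unicodedata.normalize("NFKC", text)


def decoded_views(text: str):
    """Yield the text itself plus best-effort decodings of base64 runs."""
    yield text
    for m in B64_RUN.finditer(text):
        try:
            decoded = base64.b64decode(m.group(0), validate=True)
            yield decoded.decode("utf-8", "ignore")
        except Exception:
            continue  # not actually base64; skip


def flag_injection(prompt: str) -> bool:
    """True if any view of the prompt matches a known-bad pattern."""
    for view in decoded_views(normalize(prompt)):
        if any(p.search(view) for p in INJECTION_PATTERNS):
            return True
    return False
```

The normalization step is what catches full-width lookalikes such as `ｉｇｎｏｒｅ`, which a naive regex over the raw bytes misses.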
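The "no direct execution from the model" idea can be sketched as a tiny gateway check. Everything here is hypothetical (the JSON call shape, tool names, and policy table are illustrative, not the project's actual schema): the model only ever emits a structured call, and the gateway validates it against policy before anything runs.

```python
import json

# Hypothetical per-tool policy, enforced outside the model.
POLICY = {
    "read_file": {"allowed_prefixes": ["/srv/app/data/"]},
    "http_get": {"allowed_hosts": ["api.internal.example"]},
}


def validate_tool_call(raw: str):
    """Parse a model-emitted tool call and check it against POLICY.

    Returns (tool, args) when permitted; raises PermissionError otherwise.
    The model never executes anything itself; only the gateway does.
    """
    call = json.loads(raw)  # malformed output fails closed
    tool, args = call["tool"], call.get("args", {})
    rules = POLICY.get(tool)
    if rules is None:
        raise PermissionError(f"tool {tool!r} not in policy")
    if tool == "read_file":
        path = args.get("path", "")
        if not any(path.startswith(p) for p in rules["allowed_prefixes"]):
            raise PermissionError(f"path {path!r} outside allowed prefixes")
    if tool == "http_get":
        host = args.get("host", "")
        if host not in rules["allowed_hosts"]:
            raise PermissionError(f"host {host!r} not allowlisted")
    return tool, args
```

In a real deployment this check would be a policy query (e.g. to OPA) rather than hard-coded conditionals, but the enforcement point is the same: between the model's output and the tool runtime.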

hnrss.org · 2026-04-14 14:25:47+08:00 · tech

We use Claude Code, Cursor, and Copilot daily. These tools run shell commands, read files, and call APIs on their own. When something goes wrong, you find out after the fact: a .env file gets read, a secret ends up somewhere it should not, a command runs that nobody approved. EDR sees process spawns. Cloud audit logs see API calls. Neither understands that the agent's chain of actions, taken together, is credential theft.

Burrow sits between the agent and the machine. You define policies in plain language, like "block any agent from deleting production resources" or "alert if an agent reads AWS credentials and then sends data to an external endpoint." Burrow maps those policies against the actual tools, MCP servers, and plugins in your environment, then intercepts tool calls at the framework level before they execute. Risky calls get dropped; everything else passes through.

Works with Claude Code, Cursor, Copilot, Windsurf, CrewAI, LangChain, LangGraph, and a few more. The CLI and SDK install in under a minute. Free tier for individuals, paid for teams.

I ran infrastructure security at a large media company before this, and I'm going full time on Burrow later this month. Happy to answer anything, especially the "does this actually work in production" question.

Try it: https://burrow.run

Comments URL: https://news.ycombinator.com/item?id=47761957
Points: 3
# Comments: 0
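The credentials-then-exfiltration example above is a chain rule, not a per-call rule, which is why a stateful interceptor is needed. A minimal sketch under assumed semantics (the tool names, paths, and `ToolInterceptor` class are hypothetical, not Burrow's actual API): each call looks benign on its own, but the sequence gets dropped.

```python
# Hypothetical chain rule: a credential read followed by a network call
# to a non-internal host in the same session is treated as exfiltration.
CREDENTIAL_SUFFIXES = (".aws/credentials", ".env")
INTERNAL_HOSTS = {"api.internal.example"}


class ToolInterceptor:
    """Sits between the agent and the tool runtime, tracking session state."""

    def __init__(self):
        self.read_credentials = False

    def check(self, tool: str, args: dict) -> bool:
        """Return True to let the call through, False to drop it."""
        if tool == "read_file" and args.get("path", "").endswith(CREDENTIAL_SUFFIXES):
            self.read_credentials = True  # allowed, but remembered
            return True
        if tool == "http_post" and args.get("host") not in INTERNAL_HOSTS:
            # External send: only risky if credentials were read earlier.
            return not self.read_credentials
        return True
```

The design choice worth noting: the credential read itself is permitted (agents legitimately touch config files), and only the downstream external send is blocked, which is the kind of judgment EDR and audit logs cannot make per-event.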