+ ZP · LLM × SECURITY · EST. 2026

We build the environments and agents that probe software for the flaws no one has found yet.

ZeroProbe is an independent research lab at the intersection of large language models and security. We design rigorous benchmarks across web, network, host, and cloud, and we build autonomous agents that discover vulnerabilities and write working proofs of concept. The goal: high-quality data, environments, and trajectories that make frontier models genuinely better at security.

Security is where AI capability gets tested for real. A model that can reason about an unfamiliar codebase, chain a series of weak signals into an exploit, and verify its own work is a model that has learned something deep. Measuring that honestly is hard. Most benchmarks are stale, leaky, or easy to game. ZeroProbe exists to fix the measurement problem first — then to push the capability itself.

+ RESEARCH AREAS

Four surfaces, one harness.

01

Web vulnerability benchmarks

Reproducible suites that grade an agent on finding and triaging real web flaws (injection, auth bypass, broken access control), with file- and function-level localization scoring.

02

Network & host security

Benchmarks spanning network reconnaissance, service exploitation, and host-level privilege escalation, built on containerized targets with hidden oracles.

03

Cloud configuration security

IaC misconfiguration and IAM privilege-escalation scenarios graded against ground-truth policy. Built to measure what agents actually catch in the wild.

04

Autonomous discovery agents

Expert agents that hunt for vulnerabilities and write proof-of-concept exploits end to end — and the harness that measures them honestly.

+ BUILD WITH US

Working on frontier models, security data, or agentic evaluation?

ZeroProbe partners with model developers who need high-quality security environments, trajectories, and benchmarks. If that's you — or if you just want to compare notes — get in touch.

hello@zeroprobe.com