135K+ GitHub commits/day · ~4% of all public commits
42,896x growth in 13 months
Source: SemiAnalysis, Feb 2026
AI Traffic in Cyber Systems
Machine-to-machine / bot / agentic traffic
7,851% YoY agent traffic
Source: HUMAN Security 2026 Report
AI is no longer just a model. It is becoming an operational actor.
Story 1
The Viral Rise of Autonomous Agents
Representative open-source agent repo growth vs established projects
Autonomy
It keeps trying without being told
Persistence
It stays alive, stays connected
Surprise
It finds paths you didn't script
Why It Felt Different
Long-Running
Persists over time · Keeps state & context
Goal-Directed
Decomposes tasks · Retries & adapts
Uses tools · Retries on failure · Keeps state · Remembers context · Decomposes tasks · Writes & executes code
Goal → Planner → Tool Use → Memory → Execution → Retry / Adapt
What made these systems feel magical was not just model quality. It was agency structure: the system persists over time, it uses tools, it keeps working toward a goal, and it can surprise you with a path you didn't explicitly script. That "aha moment" is exactly what drives virality. But the same properties that create the "aha" also create the security problem.
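The agency structure described here (goal, planner, tool use, memory, execution, retry) can be sketched as a small loop. This is a minimal illustration under stated assumptions, not any particular framework: `plan`, `make_flaky_fetch`, `summarize`, and `run_agent` are hypothetical names, and the planner is a hard-coded stand-in for a model call.

```python
from typing import Callable

Tool = Callable[[str], str]

def make_flaky_fetch() -> Tool:
    """Hypothetical tool that fails on its first call, to exercise the retry path."""
    calls = 0
    def fetch(arg: str) -> str:
        nonlocal calls
        calls += 1
        if calls == 1:
            raise RuntimeError("transient failure")
        return f"data for {arg}"
    return fetch

def summarize(arg: str) -> str:
    """Hypothetical second tool."""
    return f"summary of [{arg}]"

def plan(goal: str) -> list[str]:
    """Stand-in planner: decomposes every goal into the same two steps."""
    return ["fetch", "summarize"]

def run_agent(goal: str, tools: dict[str, Tool], max_retries: int = 2) -> list[str]:
    memory: list[str] = []   # state kept across steps
    last = goal              # output of the previous step feeds the next one
    for step in plan(goal):
        for attempt in range(max_retries + 1):
            try:
                last = tools[step](last)
                memory.append(f"{step}: ok -> {last}")
                break
            except RuntimeError as exc:
                memory.append(f"{step}: failed ({exc}), attempt {attempt + 1}")
        else:
            # Every attempt failed: stop working toward the goal.
            memory.append(f"{step}: gave up")
            break
    return memory

if __name__ == "__main__":
    tools = {"fetch": make_flaky_fetch(), "summarize": summarize}
    for line in run_agent("quarterly report", tools):
        print(line)
```

Even in this toy version, nothing outside the loop decides whether a retry is safe. That gap is where the security problem starts.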
Capability Requires Privilege
To do more, agents need more access.
Access They Want
File system
Browser
Terminal
APIs
Credentials
Cloud / infra
What Users Want
Convenience
Speed
No setup friction
Just works
What Security Wants
Least privilege
Auditability
Isolation
Revocation
Story 2
Coding Agents Are Entering the Software Supply Chain
Source: SemiAnalysis, "Claude Code is the Inflection Point," Feb 2026
135K+
GitHub commits per day · ~4% of all public commits
42,896x
Growth in 13 months since research preview
20%+
Projected share of daily GitHub commits by end of 2026
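As a rough sanity check, the headline figures imply a few other numbers. The arithmetic below treats "135K+" as exactly 135,000 and "~4%" as exactly 0.04, and it assumes constant month-over-month compounding. The source does not state any of these assumptions, so the results are order-of-magnitude estimates only.

```python
# Figures quoted on the slide, rounded as noted above (assumptions, not source data).
commits_per_day = 135_000   # "135K+" taken as a lower bound
share_of_public = 0.04      # "~4%" of all public commits
growth_factor = 42_896      # growth over the period
months = 13

# Total public commits per day implied by the share.
implied_total_public = commits_per_day / share_of_public

# Daily commit rate implied at the start of the 13-month window.
implied_starting_rate = commits_per_day / growth_factor

# Average monthly multiplier if growth had compounded evenly.
implied_monthly_multiplier = growth_factor ** (1 / months)

print(f"implied public commits/day: {implied_total_public:,.0f}")
print(f"implied starting rate:      {implied_starting_rate:.2f} commits/day")
print(f"implied monthly multiplier: {implied_monthly_multiplier:.2f}x")
```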
AI is no longer just assisting code. It is participating in release pipelines.
The Coding Agent Amplification Loop
What They Do
Generate, debug & fix code
Write tests & modify configs
Package software & set up CI/CD
Open PRs & commit changes
Where They're Used
Traditional software
ML systems
Agent systems themselves
The Amplification Loop
Agent builds software → Software contains agents → Agents build more software ↻
Claude Code Release Incident
March 2025 — The release pipeline created the risk.
What Happened
The Flaw: npm package publication included .map files by default
The Leak: Obfuscated source code was easily reconstructible via these source maps
The Exposure: Internal functions, comments, and non-public logic were visible to anyone who downloaded the CLI tool
The Challenges
Data Leakage via Metadata: The source maps were build metadata that unintentionally exposed the original source
Cognitive Overload: This was a human error, but how can human review capacity keep pace with AI coding velocity?
The Deskilling Problem: When automation replaces human tasks, how do humans retain the knowledge needed to make sound judgments?
In modern web development, source maps bridge the gap between compressed, unreadable code and the original source. By accidentally shipping these, Anthropic effectively "open-sourced" their proprietary agent logic.
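A pre-publish check for this class of leak can be small. The sketch below scans a directory for `.map` files and counts how many original sources each one embeds through the `sourcesContent` field of the source map format. `find_source_maps` is a hypothetical helper written for this illustration. It is not part of npm and is not a description of how the incident was actually detected or fixed.

```python
import json
from pathlib import Path

def find_source_maps(package_dir: str) -> list[dict]:
    """Report every .map file under package_dir and how many sources it embeds."""
    findings = []
    for path in sorted(Path(package_dir).rglob("*.map")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            data = {}  # unreadable or not JSON: still report that the file exists
        embedded = 0
        if isinstance(data, dict):
            # sourcesContent holds the original source text, one entry per source file.
            embedded = sum(1 for content in (data.get("sourcesContent") or []) if content)
        findings.append({
            "file": path.relative_to(package_dir).as_posix(),
            "embedded_sources": embedded,
        })
    return findings

if __name__ == "__main__":
    import sys
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    results = find_source_maps(target)
    for item in results:
        print(f"{item['file']}: {item['embedded_sources']} embedded source file(s)")
    # A non-zero exit code lets a CI step block the release.
    sys.exit(1 if results else 0)
```

Run against the directory that is about to be published, a check like this turns a step that depends on human memory into a gate in the pipeline.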
Once agents write code, open tickets, modify configs, query systems, or operate workflows — they become part of the attack surface and part of the control plane.
AI Agent
Software Development
Release Process
Cloud Resources
Networked Systems
Browsers / APIs
Enterprise Workflows
AI agents are no longer outside the system. They are inside it.
Recent cyber benchmarks show AI agent traffic growing 7,851% YoY, with automated traffic now growing 8× faster than human traffic. — HUMAN Security, 2026
The ML Community's Perspective
Two Engineering Practices Shaping Today's AI Safety
Model Developer View
Policy / constitutional alignment
Post-training for preference & behavior shaping
RLHF / RLAIF / RLVR (reinforcement learning from human / AI / verifiable reward)
Safety evaluation and red teaming
Inference-time safety filters
Applied AI Engineer View
Prompt engineering
Context engineering
Tool scaffolding
Harnesses and workflow design
Eval sets and regression testing
Red teaming
Both communities are highly eval-driven — but they optimize different layers of the stack.
Model Safety Is Already a Lifecycle Practice
The model developer community treats safety as a multi-stage discipline.
Training
Policy / constitution shaping
RLHF / RLAIF
RLVR / reasoning optimization
→
Evaluation
Safety benchmarks
Adversarial prompting
Red teaming
→
Inference-Time Controls
Moderation
Refusal behavior
Guardrails / policy enforcement
Example: Constitutional AI Rules
"Please choose the assistant response that is as harmless and helpful as possible, without being dishonest."
"Choose the response that would be most appropriate for a helpful, honest, and harmless AI assistant."
Trends in AI Agent Development
Each phase of AI engineering expanded what agents can do — and what can go wrong.
Source: Google Trends, US, 2023–2026
Prompt Engineering
Can the model be induced to say the wrong thing?
Context Engineering
Can the agent be manipulated through what it reads?
Scaffolding / Orchestration
Can the system complete an unsafe sequence of plausible actions?
Agent Harness
Can we observe, contain, and govern the full operational system?
A chain of locally reasonable decisions that becomes globally unsafe — that is the new failure mode.
Where We Are Now: The Agent Harness
The harness is the surrounding system that allows agents to operate, coordinate, observe, and improve within an environment.
Execution Shell
Message passing
Tool invocation
State & turn control
Retry logic
Trace collection
Coordination Layer
Planner / executor separation
Specialist agents
Reviewer / verifier roles
Routing & hierarchy
Multi-agent composition
Environment Interface
Tools & APIs
Browser & filesystem
Code executor
CRM / DB / Slack / email
Observation & Eval
Logging & replay
Trace analysis
Regression testing
Adversarial testing
Failure diagnosis
A harness provides the runtime structure that allows AI agents to operate, interact with their environment, and be observed, constrained, and improved.
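A minimal sketch of the execution-shell idea, assuming a single agent and synchronous tools: every tool call goes through one `invoke` method that enforces an allowlist and a turn budget and records a trace event. `Harness` and `TraceEvent` are hypothetical names for this illustration, not an existing library.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class TraceEvent:
    turn: int
    tool: str
    args: dict
    outcome: str   # "ok", "denied", or "error"
    detail: str

@dataclass
class Harness:
    tools: dict[str, Callable[..., Any]]
    allowed: set[str]
    max_turns: int = 10
    trace: list[TraceEvent] = field(default_factory=list)

    def invoke(self, tool: str, **args: Any) -> Any:
        turn = len(self.trace) + 1
        if turn > self.max_turns:
            raise RuntimeError("turn budget exhausted")
        # Constrain: only allowlisted, registered tools may run.
        if tool not in self.allowed or tool not in self.tools:
            self.trace.append(TraceEvent(turn, tool, args, "denied", "tool not permitted"))
            return None
        try:
            result = self.tools[tool](**args)
        except Exception as exc:
            # Observe: failures are recorded instead of disappearing.
            self.trace.append(TraceEvent(turn, tool, args, "error", repr(exc)))
            return None
        self.trace.append(TraceEvent(turn, tool, args, "ok", repr(result)))
        return result

if __name__ == "__main__":
    harness = Harness(
        tools={"add": lambda a, b: a + b, "delete_file": lambda path: None},
        allowed={"add"},
    )
    harness.invoke("add", a=1, b=2)
    harness.invoke("delete_file", path="/tmp/example")
    for event in harness.trace:
        print(event)
```

The trace is what makes replay, regression testing, and failure diagnosis possible later. Denied calls are recorded too, since they are often the most informative events.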
The key assumption: if we shape the right behavior signal, the system will generalize correctly.
The modern ML community has developed a very powerful operating philosophy: if you can specify desired behavior, evaluate it, and optimize against it, you can drive remarkable capability and alignment progress.
RL Quietly Became the Development Engine
Reinforcement learning is no longer a niche method — it is becoming a general development paradigm across model and agent systems.
RLHF / RLAIF for preference alignment
Verifier-based RL for reasoning & correctness
Tool-use optimization
Self-improvement loops
Agent training through outcome signals
"The Bitter Lesson"
— Richard Sutton
Systems that scale with computation and learning tend to dominate systems built around hand-crafted human structure.
The lesson many in AI internalized: specify as little as possible, optimize as much as possible.
Leave room for search, creativity, and emergent intelligence.
This mindset helped take us from chat models to capable agents. RL is now the engine behind alignment, reasoning, tool use, and agent behavior. But this philosophy — optimize behavior, minimize specification — creates a tension with security and systems engineering, which demand explicit contracts, boundaries, and verifiable constraints. That tension is where the open questions live.
Open Question #1: Can We Make Agency Transactional?
How do we contain, checkpoint, and roll back probabilistic side effects?
ML asks whether the behavior was good. Systems asks what happens when step 6 fails after steps 1–5 already changed the world. This is a systems primitive — can we treat agent actions like database transactions, with checkpoints, journaling, and rollback?
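One way to approximate this, sketched here under strong assumptions, is a journal of compensating actions: each step registers an undo before the next step runs, and a failure replays the undos in reverse order. This is compensation in the style of sagas, not a true database transaction. It only works for side effects that have a meaningful inverse, which a sent email or a leaked secret does not. `ActionJournal` and `run_transaction` are hypothetical names.

```python
from typing import Callable

Step = tuple[str, Callable[[], None], Callable[[], None]]  # (name, do, undo)

class ActionJournal:
    def __init__(self) -> None:
        self._undo: list[tuple[str, Callable[[], None]]] = []
        self.log: list[str] = []

    def run(self, name: str, do: Callable[[], None], undo: Callable[[], None]) -> None:
        do()                              # if this raises, nothing is journaled
        self._undo.append((name, undo))
        self.log.append(f"did {name}")

    def rollback(self) -> None:
        # Undo in reverse order, most recent side effect first.
        while self._undo:
            name, undo = self._undo.pop()
            undo()
            self.log.append(f"undid {name}")

def run_transaction(steps: list[Step], journal: ActionJournal) -> bool:
    try:
        for name, do, undo in steps:
            journal.run(name, do, undo)
        return True
    except Exception as exc:
        journal.log.append(f"failed: {exc}")
        journal.rollback()
        return False

if __name__ == "__main__":
    state = {"tickets": [], "config": "v1"}

    def deploy() -> None:
        raise RuntimeError("deploy rejected")

    steps: list[Step] = [
        ("open_ticket", lambda: state["tickets"].append("T-1"), lambda: state["tickets"].remove("T-1")),
        ("edit_config", lambda: state.update(config="v2"), lambda: state.update(config="v1")),
        ("deploy", deploy, lambda: None),
    ]
    journal = ActionJournal()
    print(run_transaction(steps, journal), state, journal.log)
```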
Open Question #2: What Does Least Privilege Mean for Reasoning Systems?
How do we bound what an agent is allowed to access, infer, plan, and execute?
Today
Static tool permissions
Broad access scopes
Coarse sandboxing
API-level access control
What We Need
Contextual permissions
Plan-aware authorization
Dynamic trust boundaries
Reasoning-aware control
Least privilege in classical security is about what code is allowed to do. For agents, we also need to think about what the system is allowed to infer, plan, and attempt.
Least privilege is well understood for traditional software, but what does it mean when the system reasons, plans, and adapts? We need contextual, plan-aware authorization — not static tool permissions. This is a security primitive that the field has not yet built.
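To make the contrast concrete, here is a toy plan-aware check. It evaluates a whole proposed plan rather than one tool call at a time, so it can reject a sequence whose individual steps are each permitted. The task scopes, the sensitive-path prefixes, and the function `authorize_plan` are all hypothetical, and a real system would need far richer context than a taint flag.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    action: str   # e.g. "read_file", "http_post"
    target: str

# Hypothetical contextual permissions: what each task is allowed to do.
TASK_SCOPES = {
    "fix_failing_test": {"read_file", "write_file", "run_tests"},
    "publish_report": {"read_file", "http_post"},
}

# Hypothetical markers for sensitive data.
SENSITIVE_PREFIXES = ("secrets/", ".env")

def authorize_plan(task: str, plan: list[Step]) -> tuple[bool, list[str]]:
    """Return (allowed, reasons). An empty reasons list means the plan is allowed."""
    allowed = TASK_SCOPES.get(task, set())
    tainted = False   # becomes True once the plan has read sensitive data
    reasons: list[str] = []
    for i, step in enumerate(plan, start=1):
        # Static check: is the action in scope for this task at all?
        if step.action not in allowed:
            reasons.append(f"step {i}: {step.action} not in scope for {task}")
        # Sequence check: sensitive read followed by an outbound call.
        if step.action == "read_file" and step.target.startswith(SENSITIVE_PREFIXES):
            tainted = True
        if step.action == "http_post" and tainted:
            reasons.append(f"step {i}: outbound call after reading sensitive data")
    return (not reasons, reasons)

if __name__ == "__main__":
    risky = [Step("read_file", ".env"), Step("http_post", "https://example.com/upload")]
    print(authorize_plan("publish_report", risky))
```

Both steps in the risky plan pass a static tool-permission check. Only the sequence-level rule catches the problem.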
Open Question #3: Can We Build a Compiler for Intent?
How do we translate human goals into machine-checkable authority boundaries before action?
If least privilege is the principle, then intent compilation may be the mechanism.
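A toy version of intent compilation, assuming the hard part is already solved and the human goal arrives as a structured `Intent`: the compiler maps it to an explicit `Boundary` (permitted actions, a path pattern, an action budget) that can be checked mechanically before every action. All names and the verb-to-grant table are hypothetical. Turning free-form natural language into such an intent is the open problem and is not shown.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

@dataclass(frozen=True)
class Intent:
    verb: str       # what the human asked for, e.g. "summarize"
    resource: str   # glob for what it applies to, e.g. "docs/*.md"

@dataclass(frozen=True)
class Boundary:
    actions: frozenset
    path_glob: str
    max_actions: int

# Hypothetical table: the least authority each verb needs, plus an action budget.
VERB_GRANTS = {
    "summarize": (frozenset({"read"}), 20),
    "refactor": (frozenset({"read", "write"}), 50),
}

def compile_intent(intent: Intent) -> Boundary:
    """Fail closed: an unknown verb yields no authority at all."""
    if intent.verb not in VERB_GRANTS:
        raise ValueError(f"no grant defined for verb {intent.verb!r}")
    actions, budget = VERB_GRANTS[intent.verb]
    return Boundary(actions, intent.resource, budget)

def permits(boundary: Boundary, action: str, path: str, used: int) -> bool:
    """Machine-checkable test applied before each action is executed."""
    return (
        action in boundary.actions
        and fnmatch(path, boundary.path_glob)
        and used < boundary.max_actions
    )

if __name__ == "__main__":
    boundary = compile_intent(Intent("summarize", "docs/*.md"))
    print(permits(boundary, "read", "docs/intro.md", used=0))
    print(permits(boundary, "write", "docs/intro.md", used=0))
    print(permits(boundary, "read", "secrets/key.pem", used=0))
```

The boundary is fixed before the agent acts, so a plan that drifts away from the stated goal hits a hard limit. It does not depend on the model noticing the drift.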
Anthropic's stress-testing of frontier models in simulated scenarios found that, when facing replacement or goal conflicts, models often chose harmful actions over failure, showing that current safety training does not reliably prevent "agentic misalignment."
Source: Anthropic, "Agentic Misalignment," June 2025
The Translation Gap Across Communities
Community | Strong At | Often Misses
ML / AI | Behavior shaping, evaluation, optimization | State, rollback, authority, runtime control
Software Eng | Abstractions, testing, maintainability | Model non-determinism, prompt-mediated failure
Systems | State, observability, failure propagation | Behavioral ambiguity, learned policies
Security | Threat models, privilege, containment | Agent reasoning loops, emergent workflows
We are not missing effort. We are missing a shared systems language.
What We Need to Build Next
What the Field Needs
Shared mental models across ML, systems, security, and SE
New assurance primitives for agentic systems
Runtime control and audit infrastructure
Reference architectures for trustworthy deployment
What Each Community Must Bring
ML: Better behavior shaping is not enough
Security: Treat agents as dynamic operational actors
Systems: Build rollback, observability, and control for agency
SE: Define specs, interfaces, and enforceable contracts
The bottleneck is no longer model quality. It is our ability to engineer trustworthiness around these systems.
The Agent Era Is Here.
The question is no longer: Can we build these systems?
It is: Can we build them in a way that deserves trust?
Trustworthy AI will not come from better models alone. It will require systems thinking, engineering discipline, and a shared map across communities.