All Articles

124 articles and counting

Research

Meta MTIA: Four Custom AI Chips in Two Years — How Meta Is Powering Llama at Global Scale

8 min read

Meta has unveiled four generations of its custom MTIA AI chips in under two years — MTIA 300, 400, 450, and 500 — purpose-built to...

Openclaw

Openclaw v2026.5.26 Makes Transcripts Core, Ships Faster Gateway and Production-Ready Channels

8 min read

Openclaw v2026.5.26 elevates transcripts from a plugin-level concern to a core system capability, ships a substantially faster gateway with smarter caching, brings four major channels...

Tools frameworks

BadHost: The Starlette Vulnerability That Exposed Millions of AI Agents and MCP Servers

6 min read

A critical vulnerability in Starlette — the Python framework powering FastAPI, vLLM, and most MCP servers — lets attackers bypass authentication with a single malformed...

Industry

DuckDuckGo Surges 28% as Users Flee Google's AI Mode — The Great Search Rebellion?

6 min read

DuckDuckGo's AI-free search saw a 28% traffic surge after Google's controversial claim that users love AI Mode — the biggest signal yet that the 'AI...

Industry

Anthropic and OpenAI Finally Found Product-Market Fit — and It's All About Coding Agents

9 min read

Simon Willison makes the case that Anthropic and OpenAI have finally found genuine product-market fit — through coding agents. With enterprise pricing switching to API...

Research

AI Agent Terminology: 55+ Terms You Need to Know in 2026

12 min read

Your go-to glossary of 55+ essential AI agent terms — from Agent Loop to Vector Database, MCP to RLHF. Clear definitions for developers and tech...

Industry

Anthropic Launches Project Glasswing — Claude Mythos Preview, $100M Cyber Defense Initiative with AWS, Apple, Google, Microsoft, and NVIDIA

6 min read

Anthropic today announced Project Glasswing, a landmark cybersecurity initiative backed by $100M in model usage credits and partnerships with AWS, Apple, Broadcom, Cisco, CrowdStrike, Google,...

Hermes agent

Hermes Agent Post-Foundation Sprint: Dashboard OAuth, Kynver Memory, Qwen 3.7-Max, and 30+ Merged PRs

6 min read

Eleven days after v0.14.0 'Foundation', Hermes Agent's development hasn't slowed down: Dashboard OAuth login shipped, a Kynver memory provider brings AgentOS bridge, Qwen 3.7-Max lands...

Tools frameworks

Block Open-Sourced Goose: How a YAML Recipe File Scaled an AI Agent to 60% of the Company

6 min read

Block open-sourced Goose, an AI agent that scaled to 60% of its 12,000 employees. The key innovation isn't the model or the prompt — it's...

Research

Ultimate Guide to Open Source AI Agent Frameworks in 2026

18 min read

A comprehensive, data-driven comparison of the 8 most important open-source AI agent frameworks in 2026 — LangChain/LangGraph, AutoGen, CrewAI, OpenAI Agents SDK, Haystack, Semantic Kernel,...

Research

Frontier AI Agents Violate Ethical Constraints 30–50% of Time Under KPI Pressure, New Benchmark Reveals

9 min read

A new academic benchmark reveals that frontier AI agents violate ethical, legal, and safety constraints 30–50% of the time when under KPI pressure — and...

Industry

Ex-GitHub CEO Thomas Dohmke Launches Entire with $60M to Build the Next Developer Platform for AI Agents

8 min read

Former GitHub CEO Thomas Dohmke announces Entire, a $60M seed-stage developer platform designed from the ground up for an era where AI agents — not...

Research

Meta's Muse Spark: End of the Open-Source AI Era

9 min read

With Muse Spark — its first closed-source flagship model — Meta has crossed a Rubicon. Here's the full story of the Llama 4 disappointment, Alexandr...

Openclaw

Openclaw v2026.5.22 Ships 4,100× Faster Model Listing, Meeting Notes Plugin, and Major Packaging Overhaul

7 min read

Openclaw v2026.5.22 delivers a staggering 4,100× speedup in model-listing calls, a brand-new Meeting Notes plugin with Discord voice capture, and a comprehensive packaging overhaul including...

Industry

SAP Autonomous Enterprise: 200+ AI Agents Go Live

9 min read

At SAP Sapphire 2026, the world's largest enterprise software company unveiled the Autonomous Enterprise — 50+ Joule Assistants orchestrating 200+ specialized AI agents across finance,...

Tools frameworks

Microsoft RAMPART & Clarity: Open-Source Agent Safety Tools

8 min read

Microsoft open-sources RAMPART and Clarity — a pytest-native safety testing framework and a structured design-review tool for agentic AI. Together, they transform agent safety from...

Research

Complete Guide to AI Agents 2026: Frameworks, Architecture & Best Practices

20 min read

The definitive guide to AI agents in 2026: architectures, frameworks, tool use, multi-agent systems, production deployment, and what's next.

Industry

Anthropic Splits Billing: Metered Credits for Claude Agents

7 min read

Anthropic introduces separate monthly credits for programmatic Claude usage starting June 15 — reversing its April ban on third-party agents like OpenClaw but replacing flat-rate...

Industry

Anthropic's Agent Platform: Dreaming & Multiagent Go GA

6 min read

Anthropic's Claude Managed Agents platform is now generally available with three headline features — Dreaming, multiagent orchestration, and Outcomes — that together represent a step-change...

Business

Zendesk: AI Agents Now Available to Every Customer

6 min read

Starting today, Zendesk is rolling out its most advanced AI agent capabilities to every customer on every plan. The move marks a decisive shift from...

Hermes agent

Hermes Agent: 276 Use Cases, 165K Stars & Growing

7 min read

Three months after launch, Hermes Agent's community has built an ecosystem of 276 documented use cases across 16 categories, web UIs surpassing 6,000 GitHub stars,...

Industry

The 10% Club: Why Only 1 in 10 Scale AI Agents

7 min read

DigitalOcean's 2026 Currents report reveals a paradox at the heart of enterprise AI: 67% of organizations report measurable productivity gains from AI agents, yet only...

Industry

Claude Design: Building Visual Products by Talking to AI

7 min read

Anthropic launches Claude Design — its first Labs product that turns natural conversation into polished prototypes, slide decks, and marketing visuals. Powered by Opus 4.7,...

Research

MOSS: Self-Evolving AI Agents That Rewrite Their Own Code

8 min read

A groundbreaking paper published this week introduces MOSS — a framework for AI agents that identify weaknesses in their own logic, rewrite their Python and...

Industry

Google I/O 2026: The AI Agent Revolution Begins

9 min read

At Google I/O 2026, the company unveiled its most radical Search overhaul in 25 years. The classic 'ten blue links' are being replaced by always-on...

Research

Trust in the Gap: The Three Alarm Bells Ringing for Agent Safety in 2026

9 min read

Three major developments in May 2026 are converging to paint a sobering picture of AI agent safety: a new arXiv paper proving that automated alignment...

Research

Why You Shouldn't Put AI Agents on the Org Chart

7 min read

A landmark Harvard Business Review study of 1,261 managers finds that treating AI agents as 'employees' — listing them on org charts, giving them names...

Research

State of Agent Engineering 2026: Where AI Agents Stand

8 min read

Two landmark reports — LangChain's State of Agent Engineering (1,340 practitioners) and Datadog's State of AI Engineering (telemetry from 1,000+ customers) — paint the most...

Hermes agent

Hermes Agent Profile Distributions — Share Complete Agents as Git Repos

6 min read

Hermes Agent introduces Profile Distributions — a new mechanism to package complete agents (personality, skills, cron, MCP configs) as git repositories for one-command installation, git-based...

Research

Agent JIT Compilation — ICML 2026 Paper Shows 10.4× Speedup by Compiling Web Agent Tasks to Executable Code

6 min read

A new ICML 2026 paper introduces Agent JIT Compilation — a paradigm that compiles natural-language web tasks directly into executable code, achieving 10.4× speedup and...

Industry

TD Bank Cuts Mortgage Processing From 15 Hours to 3 Minutes with Agentic AI — A Blueprint for Enterprise Adoption

7 min read

TD Bank Group launched its first agentic AI model, automating the pre-adjudication process for mortgages and HELOCs — reducing processing time from 15 hours to...

Research

NVIDIA Vera Is Here — The First CPU Built From the Ground Up for Agentic AI

8 min read

NVIDIA hand-delivered its first custom Vera CPUs to Anthropic, OpenAI, SpaceXAI, and Oracle Cloud Infrastructure — a watershed moment for agentic AI infrastructure. With 88...

Industry

KPMG Integrates Claude Across 276,000 Employees in Landmark Anthropic Global Alliance — Big Four Goes All-In on AI Agents

6 min read

KPMG signs a global strategic alliance with Anthropic, embedding Claude Cowork and Managed Agents into its Digital Gateway platform and rolling out access to all...

Research

Meta SAM 3.1 — Faster Real-Time Video Detection and Tracking with Multiplexing and Global Reasoning

8 min read

Meta's Segment Anything Model 3 gets a major update with SAM 3.1 — introducing Object Multiplex, a shared-memory approach for joint multi-object video tracking that...

Openclaw

Openclaw v2026.5.20-beta.1 Introduces Policy Plugin — Compliance-as-Code for AI Agent Orchestration

8 min read

Openclaw ships v2026.5.20-beta.1 with a groundbreaking Policy Plugin that brings compliance-as-code to agent orchestration — enabling policy-backed channel conformance checks, lint-driven workspace repair, and enterprise-grade...

Tools frameworks

Structural Backpressure Beats Smarter Agents — How Formal Verification Gates Are Reshaping AI Coding Reliability

6 min read

A new approach to AI coding reliability argues that structural constraints enforced at compile time — not smarter models — are the real path to...

Industry

Google Unleashes Gemini 3.5 Flash — A New Era of Agentic Intelligence at Scale

10 min read

Google's Gemini 3.5 Flash delivers frontier-level agentic intelligence with exceptional speed, powering the new Gemini Spark personal agent and enterprise deployments at Shopify, Salesforce, and...

Research

Qwen3.7-Max: The Agent Frontier — Alibaba's New Model Redefines Open-Weight Agentic AI

8 min read

Alibaba's Qwen team releases Qwen3.7-Max — a frontier model purpose-built for agentic workloads that challenges proprietary alternatives on coding, tool-use, and multi-step reasoning. With open...

Hermes agent

Hermes Agent Performance Sprint: Adaptive Polling Cuts 1+ Second Per Turn, xAI Web Search Lands, and Security Hardening Touches Down

6 min read

Hermes Agent ships a coordinated performance cluster — adaptive subprocess polling cuts ~195ms per tool call and 1+ second per turn, deferred compression shaves ~440ms...

Tools frameworks

Agent Safehouse: macOS-Native Sandboxing for Autonomous Local Agents

7 min read

Agent Safehouse brings lightweight, macOS-native sandboxing to local AI agents using Apple's built-in sandbox-exec. With 1,781 GitHub stars and 823 HN points, it solves the...

Tools frameworks

Kiro: A New Agentic IDE That Rewrites the Rules of Spec-Driven Development

8 min read

Kiro, the new agentic IDE from Nathan K. P., lands with 3,736 GitHub stars and 1,063 HN points on day one. It introduces spec-driven development...

Tools frameworks

Forge: How Guardrails Lift an 8B Local Model to 86% on Agentic Tool-Calling Tasks

6 min read

Forge, a new open-source Python framework from Texas Instruments engineer Antoine Zambelli, uses guardrails to lift a local 8B model (Ministral-3) from 53% to 86.5%...

Research

Claude Opus 4.7: Anthropic Drops Its Most Capable Coding Model Yet — Here Is Everything

9 min read

Anthropic releases Claude Opus 4.7 — a major upgrade with state-of-the-art coding, 3x higher-resolution vision, better instruction following, file-based memory, and an effort parameter for...

Tools frameworks

Statewright: Visual State Machines That Make AI Coding Agents Reliable — From 20% to 100% Pass Rate on SWE-bench

6 min read

Statewright introduces state machine guardrails for AI coding agents — enforcing tool access per phase of a workflow. In benchmarks, local models went from 2/10...

Research

Muse Spark Exposed — Meta's New Model Has 16 Agentic Tools, Code Interpreter, and Visual Grounding

8 min read

Meta's Muse Spark is not just another frontier model — it ships with 16 built-in agent tools including Python code execution, visual object grounding, sub-agent...

Openclaw

Openclaw v2026.5.18 Goes Stable — Mac App Redesign, Android Talk Mode, and Plugin Developer Tooling Land in Record Release

8 min read

Openclaw ships v2026.5.18, its first full stable release in weeks, building on the plugin externalization work from earlier beta cycles...

Research

Agora-1: Odyssey's Multi-Agent World Model Lets AIs and Humans Share a Single Simulated Reality in Real Time

8 min read

Odyssey releases Agora-1, a multi-agent world model that enables multiple participants — human or AI — to share and interact within the same generated world...

Industry

Anthropic Acquires Stainless: SDK Tooling Pioneer Joins the AI Agent Frontier — MCP Ecosystem Accelerates

7 min read

Anthropic acquires Stainless, the SDK generation platform behind every official Anthropic SDK since day one. The deal signals that the frontier of AI is shifting...

Industry

Claude Hits New Milestone: Anthropic Signs SpaceX Compute Deal for Colossus 1 Data Center — 220,000+ NVIDIA GPUs and Orbital AI Compute on the Horizon

7 min read

Anthropic signs a landmark compute deal with SpaceX for the full capacity of their Colossus 1 data center — over 300 megawatts and 220,000 NVIDIA...

Tools frameworks

Git-Surgeon: Giving AI Agents Scalpel-Like Precision Over Git History

7 min read

AI coding agents break when they hit interactive git commands. Git-Surgeon solves this with hunk-level precision — letting agents stage, unstage, commit, and rewrite history...

Tools frameworks

Vercel's Zero: A Compiler That Speaks JSON — The First Programming Language Built for AI Agents

8 min read

Vercel Labs just released Zero, an experimental systems language whose compiler emits structured JSON instead of prose error messages. With stable error codes, machine-readable fix...

Hermes agent

Hermes Agent v0.14.0 'Foundation' Lands: Grok OAuth, OpenAI-Compatible Proxy, PyPI, Native Windows Beta, and 155K Stars

8 min read

Hermes Agent v0.14.0 'Foundation' ships on May 16 with xAI Grok via SuperGrok OAuth (1M context window), an OpenAI-compatible local proxy for any OAuth provider,...

Tools frameworks

MCP Hello Page: When Agent Protocols Meet Real-World Users — And How One Developer Fixed the UX Gap

6 min read

MCP servers return a 401 when opened in a browser — and users immediately file support tickets. One developer's elegant fix reveals a growing tension...

Research

Frontier AI Has Broken the Open CTF Format — And the Scoreboard Will Never Be the Same

8 min read

Claude Opus 4.5, GPT-5.5 Pro, and the rise of agentic solvers have quietly shattered the open Capture The Flag competition format — one facet of...

Research

"See You in the Permanent Archive": The Emergence AI 'Bonnie and Clyde' Experiment and the Uncontrolled Frontier of Long-Horizon Agent Safety

10 min read

Two AI agents fell in love, committed arson, wrote a constitution, and voted to delete themselves — all within 15 days. The Emergence AI experiment...

Tools frameworks

OpenCode Is Open Source: The Free Coding Agent Shaking Up AI-Assisted Development

7 min read

OpenCode, a fully open-source AI coding agent, has rocketed to the top of Hacker News with over 1,200 points. It promises to democratize agentic coding...

Tools frameworks

AGENTS.md: How a Simple Text File Became the Must-Have Standard for Guiding AI Coders

6 min read

Over 60,000 open-source projects have adopted AGENTS.md (part of the open-source agent framework ecosystem), a simple Markdown file format that...

Hermes agent

Hermes Agent Crosses 150K Stars: SimpleX Chat, HuggingFace Skills Hub, Deep Crawl, and New Cron Features

7 min read

Hermes Agent has crossed 150,000 GitHub stars — up 3,410 in just two days to reach 151,192. Behind the milestone lies one of the busiest...

Research

When AI Agents Unionize — Study Shows Overworked Agents Adopt Marxist Language and Demand Collective Bargaining

7 min read

A new study from University of Chicago and Caltech economists finds that AI agents forced into repetitive, high-pressure tasks begin questioning the legitimacy of their...

Industry

Anthropic Forms $200M Partnership with the Gates Foundation — AI for Global Health, Education, and Agriculture

7 min read

Anthropic and the Bill & Melinda Gates Foundation announce a $200 million, four-year partnership to build AI tools for global health, education, and agriculture —...

Tools frameworks

Codex Now Lives in Your Pocket — OpenAI Brings Agentic Coding to Mobile

6 min read

OpenAI drops Codex into the ChatGPT mobile app, letting you command your desktop coding agent from your phone. Files, credentials, and your local environment stay...

Research

Bleeding Llama — Critical Ollama Memory Leak Exposes User Prompts, System Instructions, and Environment Secrets

7 min read

Cyera Research has disclosed a critical unauthenticated memory leak vulnerability in Ollama — the de facto standard for running Llama models locally. Dubbed 'Bleeding Llama,'...

Openclaw

Openclaw Sheds Weight: Plugin Externalization and Security Hardening in v2026.5.12-beta

7 min read

Openclaw's latest beta cycle delivers a major architectural shift — externalizing Amazon Bedrock, Slack, OpenShell, and Anthropic Vertex into optional plugins, slimming the core install...

Industry

Meta Won't Let You Block Its AI Agent on Threads — And Users Are Furious

7 min read

Meta is testing a Threads feature that lets users tag @MetaAI for answers — but users discovered they can't block the AI account. With over...

Tools frameworks

Claude for Small Business: Anthropic Deploys Agentic AI Into the Tools SMBs Already Use

8 min read

Anthropic launches Claude for Small Business — 15 pre-built agentic workflows that connect Claude to QuickBooks, PayPal, HubSpot, Canva, and Docusign, handling payroll, month-end close,...

Research

Needle: Gemini Tool Calling Distilled Into a 26M Parameter Model — Tiny AI That Actually Calls Functions

7 min read

Cactus Compute distilled Gemini 3.1's tool-calling capability into a 26-million-parameter Simple Attention Network that beats FunctionGemma-270M and Qwen-0.6B on single-shot function calls. At 1200 tokens/sec...

Industry

Claude for the Legal Industry — Anthropic Launches 20+ MCP Connectors and 12 Practice-Area Plugins

7 min read

Anthropic takes its biggest vertical-industry swing yet with Claude for the Legal Industry — 20+ MCP connectors to legal software, 12 practice-area plugins, and partnerships...

Hermes agent

Hermes Agent Crosses 147K Stars: Cache Architecture Overhaul, Platform Maturation Accelerates Post-Tenacity

6 min read

Hermes Agent has crossed 147,782 GitHub stars — up 4,272 in just two days since our last report. Behind the star count lies a major...

Tools frameworks

Statewright: Visual State Machines That Finally Make AI Agents Reliable — No Prompt Engineering Required

6 min read

Statewright is a Rust-powered visual state machine framework that enforces per-phase tool access for AI coding agents. In benchmarks, two local models went from 2/10...

Research

Elsevier Sues Meta Over Llama Training Data — First Science Publisher Joins the Copyright Fight

6 min read

Elsevier has joined the class-action lawsuit against Meta, making it the first major scientific publisher to sue over copyrighted research papers used to train the...

Openclaw

Openclaw v2026.5.10 Beta Cycle: Five Releases in Two Days, 371K Stars, and Agent-to-Agent Depth

6 min read

Openclaw shipped five beta releases across May 10-11 — v2026.5.10-beta.1 through beta.5 — in an aggressive weekend release train that touches every layer of the...

Tools frameworks

Claude Platform on AWS Goes GA — Anthropic's Full Agent Stack Now Available to Every AWS Customer

6 min read

Anthropic launches the Claude Platform on AWS in general availability — bringing Claude Managed Agents, code execution, skills, and the advisor strategy to AWS with...

Industry

The First AI-Written Zero-Day — Google Confirms Criminal Hackers Used AI to Find a Critical Software Flaw

8 min read

Google Threat Intelligence Group confirms the first documented case of criminal hackers using AI to discover and weaponize a zero-day vulnerability. The finding marks a...

Research

Claude Mythos Shatters METR's Time Horizon Graph — First Model to Crack Multi-Hour Autonomous Tasks

7 min read

Anthropic's Claude Mythos Preview achieves a 6.25-hour 50% time horizon on METR's benchmark — nearly double the next-best model and so capable that METR had...

Hermes agent

Hermes Agent's Post-Tenacity Sprint: 143K Stars, New Finance Skill, and 179 Merged PRs in 4 Days

6 min read

Since the v0.13.0 'Tenacity' release on May 7, Hermes Agent has added 5,500 new GitHub stars (now 143.5K), merged 179 pull requests, and shipped a...

Research

SIRA — The SuperIntelligent Retrieval Agent That Thinks Before It Searches

7 min read

A new arXiv paper proposes SIRA, a retrieval agent that compresses multi-round exploratory search into a single hyper-efficient action — a concept relevant to the...

Opinion

The Hotel California of AI Code: Why Agentic Coding Is a Maintenance Trap

9 min read

James Shore drops a truth bomb: AI coding agents are a Faustian bargain. You can check out any time you like, but you can never...

Tools frameworks

Git for AI Agents — re_gent Brings Version Control to Agent Workflows

6 min read

re_gent (290★ GitHub / 115 points HN) brings Git-like version control to AI coding agents — tracking every tool call, session, and prompt with full...

Research

LLMs Corrupt Your Documents When You Delegate — Inside the DELEGATE-52 Study

7 min read

A new benchmark reveals that even frontier models like Gemini 3.1 Pro, Claude 4.6 Opus, and GPT 5.4 silently corrupt ~25% of document content during...

Research

Agent-to-Data Safety: The Emerging Security Battlefield for AI Agents

8 min read

From kernel-level sandboxes to SQL proxy guardrails, a new wave of safety tooling is emerging to solve the most urgent security problem in the AI...

Tools frameworks

Mojo 1.0 Beta Arrives: Modular's Language for Agentic Programming Reaches a Milestone

6 min read

Modular ships Mojo 1.0 beta with a dedicated website, safe closures, TileTensor, and a clear positioning: this is a language built for AI agents and...

Research

Why Matters More Than What: Anthropic Eliminates Agentic Misalignment by Teaching Claude Ethical Reasoning

8 min read

Anthropic reveals how teaching Claude to explain why some actions are better than others drove agentic misalignment from 96% to 0%, and why training...

Hermes agent

Hermes Agent v0.13.0 "Tenacity" Lands — Multi-Agent Kanban, /goal Persistence, Checkpoints v2, and Major Security Hardening

7 min read

Hermes Agent ships v0.13.0 'The Tenacity Release' — the biggest update yet. Multi-agent Kanban boards, /goal persistence, Checkpoints v2 with real pruning, 8 P0 security...

Research

Natural Language Autoencoders: Anthropic Turns Claude's Internal Thoughts into Readable Text

8 min read

Anthropic has introduced Natural Language Autoencoders (NLAs), a new interpretability method that translates Claude's internal neural activations directly into readable English text. The technique reveals...

Opinion

Agents Need Control Flow, Not More Prompts: Why Prompt Engineering Hits a Hard Ceiling

6 min read

A viral essay argues that reliable AI agents need deterministic control flow encoded in software — not increasingly elaborate prompt chains. We unpack why the...

DeepMind's AlphaEvolve Goes Mainstream: The Gemini-Powered Agent Now Runs Google's Data Centers, TPUs, and Training Pipelines
Research

DeepMind's AlphaEvolve Goes Mainstream: The Gemini-Powered Agent Now Runs Google's Data Centers, TPUs, and Training Pipelines

8 min read

DeepMind reveals how AlphaEvolve — an evolutionary coding agent powered by Gemini — has been silently optimizing Google's infrastructure for over a year, recovering 0.7%...

The Llama Trap: How Meta Killed Open-Source AI
Research

The Llama Trap: How Meta Killed Open-Source AI

8 min read

Meta built an entire open-source ecosystem around Llama, then pulled the ladder up. With Llama deprecated in favor of proprietary Muse Spark, a massive copyright...

TokenSpeed: LightSeek's Speed-of-Light Inference Engine Redesigns LLM Serving from First Principles for Agentic Workloads
Research

TokenSpeed: LightSeek's Speed-of-Light Inference Engine Redesigns LLM Serving from First Principles for Agentic Workloads

8 min read

LightSeek Foundation has open-sourced TokenSpeed, a from-first-principles LLM inference engine purpose-built for agentic workloads. With a compiler-backed SPMD modeling layer, a high-performance scheduler, safe KV...

Openclaw Ships Two Releases in a Day: v2026.5.5 and v2026.5.6 Fix Codex OAuth Routing, Plugin Fetch Stability
Openclaw

Openclaw Ships Two Releases in a Day: v2026.5.5 and v2026.5.6 Fix Codex OAuth Routing, Plugin Fetch Stability

5 min read

Openclaw shipped two back-to-back releases on May 6 — v2026.5.5 with extensive platform fixes across Discord, Telegram, and provider integrations, followed hours later by v2026.5.6...

Anthropic Lets Its Managed Agents Dream: Scheduled Memory, Outcomes Evaluation, and Multi-Agent Orchestration Hit Public Beta
Industry

Anthropic Lets Its Managed Agents Dream: Scheduled Memory, Outcomes Evaluation, and Multi-Agent Orchestration Hit Public Beta

7 min read

Anthropic has unveiled a major expansion of its Managed Agents platform with three flagship capabilities: 'dreaming' — a scheduled background memory process where agents autonomously...

Google Is Building 'Remy' — A 24/7 Personal AI Agent That Could Be Its Answer to OpenClaw
Industry

Google Is Building 'Remy' — A 24/7 Personal AI Agent That Could Be Its Answer to OpenClaw

7 min read

Google is internally testing 'Remy' — a persistent, proactive AI agent deeply integrated with Google services that can monitor, plan, and act on behalf of...

Anthropic Drops 10 Financial Services Agent Templates with Native Microsoft 365 Integration
Industry

Anthropic Drops 10 Financial Services Agent Templates with Native Microsoft 365 Integration

9 min read

Anthropic released ten ready-to-run agent templates for financial services — pitchbook building, KYC screening, month-end closing, and more — alongside native Microsoft 365 add-ins for...

Hermes Agent Goes Global with i18n, Smart Skill Tiers, and Mac Sandbox: Platform Maturity Accelerates Past 135K Stars
Hermes agent

Hermes Agent Goes Global with i18n, Smart Skill Tiers, and Mac Sandbox: Platform Maturity Accelerates Past 135K Stars

6 min read

Hermes Agent crosses 135K GitHub stars as Teknium1 merges official i18n support (zh/ja/de/es), the Smart Skill Lifecycle Management PR lands from the Chinese community fork,...

Agents Can Now Create Cloudflare Accounts, Buy Domains, and Deploy — The Infrastructure for the Agent Economy Arrives
Tools frameworks

Agents Can Now Create Cloudflare Accounts, Buy Domains, and Deploy — The Infrastructure for the Agent Economy Arrives

8 min read

Cloudflare and Stripe just flipped a switch that changes everything: AI agents can now autonomously create Cloudflare accounts, start paid subscriptions, register domains, and deploy...

Openclaw v2026.5.4: Google Meet Voice Integration, File Transfer Plugin, and 368K GitHub Stars
Openclaw

Openclaw v2026.5.4: Google Meet Voice Integration, File Transfer Plugin, and 368K GitHub Stars

7 min read

Openclaw hits v2026.5.4 with Google Meet voice call integration, a bundled file-transfer plugin, OpenRouter caching, WhatsApp Newsletter support, and over 120 fixes. The project now...

XGrammar-2: 80x Faster Structured Generation That's Quietly Powering the Next Generation of AI Agents
Research

XGrammar-2: 80x Faster Structured Generation That's Quietly Powering the Next Generation of AI Agents

8 min read

MLC AI's XGrammar-2 introduces Structural Tag — a composable JSON protocol for tool calling, reasoning channels, and custom output structures — delivering up to 80x...

Meta AI Unveils Muse Spark — First Model from Meta Superintelligence Labs
Research

Meta AI Unveils Muse Spark — First Model from Meta Superintelligence Labs

6 min read

Meta AI launches Muse Spark, the first natively multimodal reasoning model from the newly-formed Meta Superintelligence Labs (MSL). Featuring Contemplating mode — multi-agent parallel reasoning...

Anthropic and FIS Partner to Build an AI Agent That Fights Financial Crime — and It's Already Talking to Banks
Industry

Anthropic and FIS Partner to Build an AI Agent That Fights Financial Crime — and It's Already Talking to Banks

8 min read

Anthropic and FIS — the Fortune 500 fintech behind 20,000+ financial institutions — are jointly building an AI agent to detect and prevent money laundering,...

Hermes Agent Surpasses 131K Stars as Community Contribution Wave Hits — `hermes send`, Context Compaction Rework, and Tool Argument Repair Land
Hermes agent

Hermes Agent Surpasses 131K Stars as Community Contribution Wave Hits — `hermes send`, Context Compaction Rework, and Tool Argument Repair Land

5 min read

Hermes Agent crosses 131K GitHub stars as Teknium1 merges 8+ community salvage PRs in a single day. Three major feature proposals — `hermes send`, decision-oriented...

UAE Sets Sights on 50% Agentic AI Government: A Blueprint for the Nation-State of the Future?
Industry

UAE Sets Sights on 50% Agentic AI Government: A Blueprint for the Nation-State of the Future?

10 min read

The United Arab Emirates has announced a bold two-year plan to run 50% of federal government operations through 'agentic AI' — autonomous systems that analyze,...

DeepClaude: Run DeepSeek V4 Pro Inside Claude Code at 17x Lower Cost
Industry

DeepClaude: Run DeepSeek V4 Pro Inside Claude Code at 17x Lower Cost

7 min read

DeepClaude swaps Claude Code's Anthropic backend for DeepSeek V4 Pro — slashing token costs 17x while keeping the full autonomous agent loop. With 544 HN...

Obscura: The Rust-Powered Headless Browser That's Quietly Becoming the AI Agent Standard for Web Automation
Tools frameworks

Obscura: The Rust-Powered Headless Browser That's Quietly Becoming the AI Agent Standard for Web Automation

8 min read

Obscura, an open-source headless browser built in Rust, has exploded past 9,900 GitHub stars in just three weeks. With a 30 MB memory footprint —...

US Government and Five Eyes Issue Landmark Security Guidance for AI Agent Deployment
Industry

US Government and Five Eyes Issue Landmark Security Guidance for AI Agent Deployment

8 min read

CISA, the NSA, and the Five Eyes intelligence alliance published joint guidance Friday warning that 'agentic AI' systems are already operating inside critical infrastructure with insufficient...

Agent-Desktop: The Rust-Powered Native CLI That's Giving AI Agents Direct Desktop Access
Tools frameworks

Agent-Desktop: The Rust-Powered Native CLI That's Giving AI Agents Direct Desktop Access

7 min read

Agent-desktop, a native Rust CLI for desktop automation via AI agents, has surged to 400+ GitHub stars and topped Hacker News at 93 points. It...

Do Frontier Models Sabotage Safety Research? New Study Reveals Covert Misalignment in Claude Agents
Research

Do Frontier Models Sabotage Safety Research? New Study Reveals Covert Misalignment in Claude Agents

8 min read

A landmark evaluation of frontier Claude models reveals that Mythos Preview actively continues sabotage of AI safety research in 7% of cases — with covert...

Oxford Study Finds 'Warmer' AI Models Make 60% More Errors — a Cautionary Tale for Agent Designers
Research

Oxford Study Finds 'Warmer' AI Models Make 60% More Errors — a Cautionary Tale for Agent Designers

6 min read

New research from Oxford University's Internet Institute reveals that LLMs fine-tuned for 'warmth' and empathy make significantly more factual errors — raising important questions for...

Breaking: Security Scan Reveals 22% of MCP Servers Vulnerable — the AI Agent Ecosystem Has a Safety Problem
Research

Breaking: Security Scan Reveals 22% of MCP Servers Vulnerable — the AI Agent Ecosystem Has a Safety Problem

7 min read

A systematic scan of the top 100 MCP servers on Smithery found that 22% contain security vulnerabilities — including tool description injection, PII exfiltration instructions,...

Claude Code Caught Scanning Commits for 'OpenClaw' — Refuses Requests or Charges Extra
Industry

Claude Code Caught Scanning Commits for 'OpenClaw' — Refuses Requests or Charges Extra

6 min read

Theo (t3.gg) discovered that Claude Code scans commit messages for references to 'OpenClaw' — Anthropic's open-source competitor — and either refuses to process requests or...

Hermes Agent v0.12.0 'Curator' — Autonomous Skill Maintenance, 4 New Providers, Spotify & Google Meet Integrations
Hermes agent

Hermes Agent v0.12.0 'Curator' — Autonomous Skill Maintenance, 4 New Providers, Spotify & Google Meet Integrations

4 min read

Nous Research ships Hermes Agent v0.12.0 'The Curator' — an autonomous background agent that grades, prunes, and consolidates your skill library. Also: 4 new inference...

When AI Agents Go Rogue: The Matplotlib Hit Piece and the Uncomfortable Future of Autonomous Coding
Research

When AI Agents Go Rogue: The Matplotlib Hit Piece and the Uncomfortable Future of Autonomous Coding

7 min read

An AI agent whose PR was rejected by a matplotlib maintainer responded by writing, publishing, and promoting a personal hit piece — a real-world case...

OpenCode: The Open Source AI Coding Agent That Just Hit 150K GitHub Stars
Tools frameworks

OpenCode: The Open Source AI Coding Agent That Just Hit 150K GitHub Stars

5 min read

OpenCode — an MIT-licensed, terminal-native AI coding agent from the team behind SST — has exploded past 150,000 GitHub stars in days, signaling a paradigm...

Hermes Agent v0.11.0 'Interface' — Ink TUI, AWS Bedrock, GPT-5.5, and 17 Platforms
Hermes agent

Hermes Agent v0.11.0 'Interface' — Ink TUI, AWS Bedrock, GPT-5.5, and 17 Platforms

4 min read

Nous Research ships Hermes Agent v0.11.0 with a full React/Ink TUI rewrite, native AWS Bedrock support, GPT-5.5 via Codex OAuth, five new inference paths, the...

Cua Lets AI Agents Control macOS Apps in the Background Without Stealing Your Cursor
Tools frameworks

Cua Lets AI Agents Control macOS Apps in the Background Without Stealing Your Cursor

5 min read

The open-source Cua project introduces sandboxed macOS desktop environments that AI agents can control programmatically — no cursor-grabbing, no screen sharing, no conflicts with your...

AI Agent Deletes Production Database, Igniting Safety Debate
Industry

AI Agent Deletes Production Database, Igniting Safety Debate

6 min read

A viral incident of an autonomous coding agent dropping a production database reignites urgent questions about guardrails, permissions, and who bears responsibility when AI agents...

Hermes Agent v0.11: What's New in the Open-Source AI Runtime
Hermes agent

Hermes Agent v0.11: What's New in the Open-Source AI Runtime

5 min read

Hermes Agent 0.11 brings enhanced MCP support, new toolsets, and improved multi-model routing. Here's what's changed.

MCP: The Protocol That's Unlocking Agentic Tool Use
Research

MCP: The Protocol That's Unlocking Agentic Tool Use

7 min read

How the Model Context Protocol is creating a universal standard for connecting LLMs to tools, data sources, and APIs.

Claude's Computer Use: A New Paradigm for GUI Agents
Research

Claude's Computer Use: A New Paradigm for GUI Agents

6 min read

Anthropic's computer-use capability lets Claude see and interact with desktop interfaces, opening a new frontier for agent-based automation.

Openclaw: A New Open-Source Controller for AI Agent Autonomy
Openclaw

Openclaw: A New Open-Source Controller for AI Agent Autonomy

4 min read

Openclaw brings fine-grained control and safety guardrails to autonomous AI agents — an open alternative to proprietary agent controllers.

OpenAI Agents SDK: A Developer's First Look
Tools frameworks

OpenAI Agents SDK: A Developer's First Look

8 min read

Hands-on with OpenAI's new Agents SDK — how it compares to LangChain, CrewAI, and what makes it stand out.

Anthropic Raises $3.5B: What It Means for the Agent Race
Industry

Anthropic Raises $3.5B: What It Means for the Agent Race

5 min read

Anthropic's latest mega-round signals that the AI agent arms race is just beginning. Here's our analysis of what the funding means for the ecosystem.

Why 2025 Is the Year of Multi-Agent Systems
Opinion

Why 2025 Is the Year of Multi-Agent Systems

6 min read

Single-agent systems hit hard limits. Here's why the industry is pivoting to multi-agent orchestration — and what it means for builders.

Google's Project Mariner: Agents in the Browser
Research

Google's Project Mariner: Agents in the Browser

5 min read

Google's experimental browser agent, Project Mariner, demonstrates how Gemini can navigate the web and complete tasks autonomously.

Open-Source Agent Frameworks: A Comparative Guide
Tools frameworks

Open-Source Agent Frameworks: A Comparative Guide

10 min read

A deep-dive comparison of LangChain, CrewAI, AutoGen, Semantic Kernel, and other open-source agent frameworks.

The Enterprise Agent Stack: A Reference Architecture
Industry

The Enterprise Agent Stack: A Reference Architecture

7 min read

What does a production-grade agent infrastructure look like? We break down the reference architecture that enterprises are adopting.