Google and AWS split the AI agent stack between control and execution

The era of enterprises stitching together prompt chains and shadow agents is nearing its end as more options for orchestrating complex multi-agent systems emerge. As organizations move AI agents into production, the question remains: “How will we manage them?” Google and Amazon Web Services offer fundamentally different answers, illustrating a split in the AI stack…


Are you paying an AI ‘swarm tax’? Why single agents often beat complex systems

Enterprise teams building multi-agent AI systems may be paying a compute premium for gains that don’t hold up under equal-budget conditions. New Stanford University research finds that single-agent systems match or outperform multi-agent architectures on complex reasoning tasks when both are given the same thinking token budget. However, multi-agent systems come with the added baggage…


OpenAI launches Privacy Filter, an open source, on-device data sanitization model that removes personal information from enterprise datasets

In a significant shift toward local-first privacy infrastructure, OpenAI has released Privacy Filter, a specialized open-source model designed to detect and redact personally identifiable information (PII) before it ever reaches a cloud-based server. Launched today on AI code sharing community Hugging Face under a permissive Apache 2.0 license, the tool addresses a growing industry bottleneck:…


Google doesn’t pay the Nvidia tax. Its new TPUs explain why.

Every frontier AI lab right now is rationing two things: electricity and compute. Most of them buy their compute for model training from the same supplier, at the steep gross margins that have turned Nvidia into one of the most valuable companies in the world. Google does not. On Tuesday night, inside a private gathering…


Google’s new Deep Research and Deep Research Max agents can search the web and your private data

Google on Monday unveiled the most significant upgrade to its autonomous research agent capabilities since the product’s debut, launching two new agents — Deep Research and Deep Research Max — that for the first time allow developers to fuse open web data with proprietary enterprise information through a single API call, produce native charts and…


Vercel breach exposes the OAuth gap most security teams cannot detect, scope or contain

One employee at Vercel adopted an AI tool. One employee at that AI vendor got hit with an infostealer. That combination created a walk-in path to Vercel’s production environments through an OAuth grant that nobody had reviewed. Vercel, the cloud platform behind Next.js and its millions of weekly npm downloads, confirmed on Sunday that attackers…


The AI governance mirage: Why 72% of enterprises don’t have the control and security they think they do

Decision makers at 72% of organizations say they run two or more AI platforms that each gets identified as their “primary” layer, according to a survey of 40 enterprise companies conducted by VentureBeat last month, revealing real gaps in security and control. For enterprise management and technical leaders, and especially security leaders, these multiple AI platforms…


OpenAI’s ChatGPT Images 2.0 is here and it does multilingual text, full infographics, slides, maps, even manga — seemingly flawlessly

It’s been only a few months since OpenAI released its last big improvement to AI image generation in ChatGPT and through its application programming interface (API) — namely, a new image generation model known as GPT-Image-1.5, released in December 2025, which brought improved instruction following, colors, and lighting. Now, after weeks of testing, the…


Kimi K2.6 runs agents for days — and exposes the limits of enterprise orchestration

Most orchestration frameworks were built for agents that run for seconds or minutes. Now that agents are running for hours — and in some cases days — those frameworks are starting to crack. Several model providers, such as Anthropic with Claude Code and OpenAI with Codex, introduced early support for long-horizon agents through multi-session tasks,…


Train-to-Test scaling explained: How to optimize your end-to-end AI compute budget for inference

The standard guidelines for building large language models (LLMs) optimize only for training costs and ignore inference costs. This poses a challenge for real-world applications that use inference-time scaling techniques to increase the accuracy of model responses, such as drawing multiple reasoning samples from a model at deployment. To bridge this gap, researchers at University…
