Microsoft’s open-source SkillOpt automatically upgrades AI agent skills without touching model weights

Agent skills have become an important part of real-world AI applications, providing a mechanism, usually a set of instructions saved in a folder of text-based markdown (.md) files, for models to adapt to specific enterprise use cases and complex workflows. However, optimizing these skills is a slow and faulty process, as they…

Xiaomi’s new open source, agentic AI coding harness MiMo Code beats Claude Code at ultra-long, 200+ step tasks

Xiaomi’s MiMo AI team has open-sourced MiMo Code V0.1.0, a terminal-native AI coding assistant that the Chinese electronics giant says outperforms Anthropic’s Claude Code on key agentic coding benchmarks, especially on long-horizon, multi-step tasks (200+ steps) — at least, according to its own internal beta release and survey of 576 developers. It’s also bundling limited-time…

Context compression finally works in production: new research cuts LLM input 16x without the accuracy hit

Context windows are becoming a computational bottleneck. The longer an agent runs, the more tokens accumulate from retrieved documents, reasoning traces and conversation history, and the more memory and compute that growing context demands. Most existing solutions either degrade model accuracy, require the full context to load before compression begins, or produce memory savings that…

What AI benchmarks miss about real-world performance

Presented by F5

Enterprise AI teams have spent years solving for compute, securing GPU allocations, negotiating cloud capacity, and benchmarking training throughput. The assumption embedded in that work is that the path between storage and compute will keep up. In production, that assumption increasingly does not hold. Real traffic introduces latency spikes, network jitter, and…

Google’s DiffusionGemma generates 256 tokens in parallel and self-corrects as it goes

GenAI image generators like Stable Diffusion do not draw a picture pixel by pixel from left to right. They start with noise and iteratively refine the entire image in parallel until it converges, in a process known as diffusion. For years, applying that same principle to text generation had remained out of reach at scale….

Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents’ Last Exam benchmark

Researchers from the University of California, Berkeley’s Center for Responsible, Decentralized Intelligence (RDI), alongside an advisory committee of over 300 domain experts, have launched Agents’ Last Exam (ALE)—a grueling new benchmark built to measure whether artificial intelligence can actually execute economically valuable, long-horizon professional workflows. In a shocking upset, OpenAI’s GPT-5.5 from April, operating through…

Researchers say they trained a foundation model from scratch for about $1,500

Training a foundation LLM from scratch costs millions and requires internet-scale data — which is why most enterprises don’t bother. Sapient thinks it has a cheaper path. To overcome this brute-force scaling dogma, researchers at Sapient developed HRM-Text, which replaces standard Transformers with a highly sample-efficient Hierarchical Recurrent Model (HRM), an architecture they first introduced…

Anthropic CEO calls for FAA-style regulation of powerful AI models: what enterprises should know

In a sweeping new essay titled “Policy on the AI Exponential,” Anthropic co-founder and CEO Dario Amodei publicly calls for new government regulations governing the release of powerful AI models, specifically comparing the AI industry to commercial aviation, which follows regulations enforced by the U.S. Federal Aviation Administration (FAA), and arguing that this is necessary…

MassMutual’s AI strategy: 12-month contracts, 30% productivity gains, zero lock-in

Enterprise AI teams face a dilemma: The best models today might not be the best models a year from now. MassMutual’s answer is to stop making long-term bets — and build infrastructure that can swap models as the market shifts. “The world of AI today is extremely dynamic,” Sears Merritt, MassMutual CIO, explained in a…

Apple’s new Siri AI is more than just a smarter assistant — it’s a new enterprise app layer

Apple’s new Siri AI, unveiled yesterday at Apple’s annual Worldwide Developers Conference (WWDC 2026), may look like a consumer product story on the surface. But for enterprise developers and IT leaders, the bigger news from WWDC26 is that Apple is turning Siri into a systemwide AI interface for apps, data and workplace actions across iPhone,…