The ‘truth serum’ for AI: OpenAI’s new method for training models to confess their mistakes

OpenAI researchers have introduced a novel method that acts as a “truth serum” for large language models (LLMs), compelling them to self-report their own misbehavior, hallucinations and policy violations. This technique, “confessions,” addresses a growing concern in enterprise AI: Models can be dishonest, overstating their confidence or covering up the shortcuts they take to arrive…

AWS launches Kiro powers with Stripe, Figma, and Datadog integrations for AI-assisted coding

Amazon Web Services on Wednesday introduced Kiro powers, a system that allows software developers to give their AI coding assistants instant, specialized expertise in specific tools and workflows — addressing what the company calls a fundamental bottleneck in how artificial intelligence agents operate today. AWS made the announcement at its annual re:Invent conference in Las…

Gong study: Sales teams using AI generate 77% more revenue per rep

The debate over whether artificial intelligence belongs in the corporate boardroom appears to be over — at least for the people responsible for generating revenue. Seven in ten enterprise revenue leaders now trust AI to regularly inform their business decisions, according to a sweeping new study released Thursday by Gong, the revenue intelligence company. The…

GAM takes aim at “context rot”: A dual-agent memory architecture that outperforms long-context LLMs

For all their superhuman power, today’s AI models suffer from a surprisingly human flaw: They forget. Give an AI assistant a sprawling conversation, a multi-step reasoning task or a project spanning days, and it will eventually lose the thread. Engineers refer to this phenomenon as “context rot,” and it has quietly become one of the…

Anthropic vs. OpenAI red teaming methods reveal different security priorities for enterprise AI

Model providers want to prove the security and robustness of their models, releasing system cards and conducting red-team exercises with each new release. But it can be difficult for enterprises to parse through the results, which vary widely and can be misleading. Anthropic’s 153-page system card for Claude Opus 4.5 versus OpenAI’s 60-page GPT-5 system…

Gemini 3 Pro scores 69% trust in blinded testing up from 16% for Gemini 2.5: The case for evaluating AI on real-world trust, not academic benchmarks

Just a few short weeks ago, Google debuted its Gemini 3 model, claiming it scored a leadership position in multiple AI benchmarks. But the challenge with vendor-provided benchmarks is that they are just that — vendor-provided. A new vendor-neutral evaluation from Prolific, however, puts Gemini 3 at the top of the leaderboard. This isn’t on…

Tariff turbulence exposes costly blind spots in supply chains and AI

Presented by Celonis

When tariff rates change overnight, companies have 48 hours to model alternatives and act before competitors secure the best options. At Celosphere 2025 in Munich, enterprises demonstrated how they’re turning that chaos into competitive advantage, with quantifiable results that separate winners from losers. Vinmar International: The global plastics and chemicals distributor created…

AI has redefined the talent game. Here’s how leaders are responding.

Presented by Indeed

As AI continues to reshape how we work, organizations are rethinking what skills they need, how they hire, and how they retain talent. According to Indeed’s 2025 Tech Talent report, tech job postings are still down more than 30% from pre-pandemic highs, yet demand for AI expertise has never been greater. New…

Workspace Studio aims to solve the real agent problem: Getting employees to use them

One problem enterprises face is getting employees to actually use the AI agents their dev teams have built. Google, which has already shipped many AI tools through its Workspace apps, has made Google Workspace Studio generally available to give more employees access to design, manage and share AI agents, further democratizing agentic workflows. This puts…

AWS claims 90% vector cost savings with S3 Vectors GA, calls it ‘complementary’ – analysts split on what it means for vector databases

Vector databases emerged as a must-have technology foundation at the beginning of the modern gen AI era. What has changed over the last year, however, is that vectors, the numerical representations of data used by LLMs, have increasingly become just another data type in all manner of different databases. Now, Amazon Web Services (AWS) is…
