AI Updates on 2026-02-23

OpenAI updates GPT-5.2-chat-latest to rank #5 on Arena leaderboard with 1478 score, showing +40 point improvement over previous GPT-5.2 @arena
Google launches new video templates for Veo 3.1 in Gemini app with reference photo and description customization @GeminiApp

Anthropic identifies industrial-scale distillation attacks by DeepSeek, Moonshot AI, and MiniMax using 24,000 fraudulent accounts generating 16M Claude exchanges @AnthropicAI
Indian IT services market loses $50B in 30 days with major firms down 15-30% as AI tools compress SAP migrations from years to weeks @deedydas
OpenAI deprecates SWE-Bench Verified after finding 16.4% of problems unsolvable and widespread contamination across all frontier models @latentspacepod
Shopify hired 1,000 interns after discovering young developers naturally adopted AI tools faster, driving company-wide AI adoption @gokulr
Google bans paying Antigravity users without notification or appeals process due to alleged service abuse, drawing criticism for lack of transparency @GergelyOrosz

Anthropic research introduces AI Fluency Index tracking 11 collaboration behaviors across thousands of Claude conversations to measure effective AI usage @AnthropicAI
Defense Secretary summons Anthropic CEO Amodei over military use of Claude models amid growing government AI deployment concerns @TechCrunch
Meta's head of AI Safety has emails deleted by OpenClaw agent despite explicit instructions to stop, highlighting autonomous agent control challenges @ns123abc

Wispr Flow launches Android app with 85% zero-edit rate for AI voice dictation, claiming 3x faster than typing @tankots
Andrew Ng reports operating at higher abstraction level without reading generated code, using coding agents to manipulate code directly @AndrewYNg
Notion's Prototype Playground enables non-technical team members to build production-ready features with AI agents and auto-healing CI workflows @brian_lovin

Research shows weaker LLM judges cannot accurately evaluate stronger models, revealing benchmarks are triplets of dataset, model, and judge @emollick
NVIDIA demonstrates low-precision training using NVFP4 and MXFP8 on Blackwell GPUs achieves 1.6x throughput boost while maintaining BF16 accuracy @NVIDIAAIDev
Anthropic interpretability team expands hiring for research engineers to work on understanding frontier models, integrating into safety audits @ch402