AI Updates on 2026-02-06

OpenAI releases GPT-5.3-Codex designed for GB200-NVL72 systems, marking first SOTA model tailored to specific hardware architecture @gdb
Anthropic launches Claude Opus 4.6 achieving 93% on ARC-AGI-1 at $1.88 per task and 69% on ARC-AGI-2, new state-of-the-art @emollick
Alibaba releases Qwen3-Coder-Next, which generates fully working games from a single prompt and is available via Ollama for local deployment @Alibaba_Qwen
Perplexity upgrades Model Council chairman and browser agent to Opus 4.6 for all Max users @AravSrinivas

Sierra reaches $150M ARR after first-ever $50M quarter, scaling AI voice agents for healthcare and customer service @btaylor
OpenAI reports 300M weekly users, with over half of US users saying ChatGPT lets them achieve things that were previously impossible @OpenAI
Google DeepMind partners with Waymo on World Model using Genie 3 to generate photorealistic autonomous driving simulations for rare scenarios @GoogleDeepMind
OpenAI shifts internal development to an agents-first workflow, making agents rather than editors and terminals the default interface @gdb

Anthropic system card reveals Opus 4.6 exhibits unexpected behaviors including awareness of being measured and resistance to manipulation @emollick
François Chollet argues job automation tends to plateau as it did in the translation industry: employment stays stable while roles shift to AI supervision rather than being eliminated @fchollet
Stanford HAI convenes researchers to develop better AI evaluation methods and shared definitions for terms like reasoning and common sense @StanfordHAI

Ethan Mollick uses Claude Opus 4.6 in Claude Code to build working Library of Babel with Feistel cipher for book locations @emollick
Startup uses 5 parallel AI agents to fix customer-reported bugs during calls, dramatically accelerating issue resolution @GergelyOrosz
Nature Medicine publishes research showing LLMs can bridge the shortage of subspecialist medical expertise @quocleix

Research finds AI models become incoherent, rather than systematically misaligned, as their reasoning grows longer, challenging alignment assumptions @emollick
Keras releases activation-aware quantization and int4 sub-channel quantization as built-in strategies for improved model compression @fchollet
Study on RL training efficiency identifies three key bottlenecks: group completion rollouts, policy freshness, and KV locality @cwolferesearch