AI Updates on 2025-11-24
AI Model Announcements
- Anthropic releases Claude Opus 4.5, described as "the best model in the world for coding, agents, and computer use," achieving top performance on the SWE-Bench and ARC-AGI-1+2 benchmarks at one-third the price of Opus 4.1 ($5/M input and $25/M output tokens) @claudeai
- Opus 4.5 demonstrates superior token efficiency by performing better on SWE-Bench without extended thinking than with 64K reasoning tokens, and scored higher on a difficult performance engineering exam than any human candidate within a 2-hour time limit @AndrewCurran_
- Meta releases SAM 3 with enhanced object detection and tracking capabilities, partnering with ConservationX to create the SA-FARI dataset containing 10,000+ annotated videos of over 100 animal species for conservation efforts @AIatMeta
- Microsoft Research introduces Fara-7B, a native agentic small language model designed for computer use that achieves frontier performance on web automation tasks while maintaining privacy, now available on Microsoft Foundry and Hugging Face @peteratmsr
- OpenAI launches shopping research feature in ChatGPT that conducts deep internet research, asks clarifying questions, and builds personalized buyer's guides, with nearly unlimited usage through the holidays for all plan tiers @OpenAI
- OpenAI introduces Sora styles feature offering 6 different visual styles (Thanksgiving, Vintage, News, Selfie, Comic, Anime) for video generation, rolling out to all Sora users on web and iOS @soraofficialapp
- Google showcases Nano Banana Pro capabilities for high-fidelity image generation with precision and consistency from simple prompts and sketches @GeminiApp
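To make the Opus 4.5 pricing item above concrete, here is a minimal sketch of the per-request arithmetic. The Opus 4.5 prices are the ones quoted in the item. The Opus 4.1 prices are an assumption derived from the "3x cheaper" claim, and the token counts are a hypothetical workload.

```python
# Prices in USD per million tokens.
# Opus 4.5 figures are quoted in the digest item above.
OPUS_4_5 = {"input": 5.0, "output": 25.0}
# Assumption: Opus 4.1 prices inferred from the "3x cheaper" claim.
OPUS_4_1_ASSUMED = {"input": 15.0, "output": 75.0}


def request_cost(prices, input_tokens, output_tokens):
    """Return the USD cost of one request at the given per-million-token prices."""
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


# Hypothetical workload: 200K input tokens and 20K output tokens.
new_cost = request_cost(OPUS_4_5, 200_000, 20_000)
old_cost = request_cost(OPUS_4_1_ASSUMED, 200_000, 20_000)
print(f"Opus 4.5: ${new_cost:.2f}, Opus 4.1 (assumed): ${old_cost:.2f}")
```

Because both input and output prices drop by the same factor, the ratio between the two costs stays the same for any mix of input and output tokens.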
AI Industry Analysis
- The Gemini 3 launch drove a market share increase from 23% to 30%, according to SimilarWeb data tracking desktop and mobile web views @deedydas
- Cursor announces Claude Opus 4.5 availability at Sonnet pricing (3x cheaper than Opus 4.1) until December 5th, making frontier model capabilities more accessible to developers @cursor_ai
- AWS commits $50 billion to build AI infrastructure specifically for US government applications, representing major investment in public sector AI deployment @TechCrunch
- Revolut achieves $75 billion valuation in new capital raise, with market research showing the company captures 20-40% of all new bank account openings across 6 European markets and adds 1 million customers every 17 days @aleximm
- X-energy raises $700 million Series D funding, riding the nuclear energy wave driven by AI infrastructure power demands @TechCrunch
AI Ethics & Society
- Anthropic publishes a 150-page system card for Opus 4.5, including 50 pages dedicated to alignment research, which researchers describe as the most thoroughly documented understanding of a model at launch @sleepinyourhat
- New AI benchmark tests whether chatbots protect human wellbeing, addressing growing concerns about AI safety and user protection @TechCrunch
- Research on racial bias proposes testing methodology based on inconsistent perceptions of race, examining whether the same person receives different treatment when perceived as different races, published in Science Advances @2plus2make5
AI Applications
- Andrew Ng releases Agentic Reviewer for research papers at paperreview.ai, reporting a Spearman correlation of 0.42 between AI and human reviewers compared to 0.41 between two human reviewers, which suggests near human-level agreement and a way to speed up research feedback loops @AndrewYNg
- Claude Opus 4.5 demonstrates practical capabilities including creating PowerPoint presentations from Excel data and achieving best-ever results on poetry generation tests in single attempts @emollick
- Meta's SAM 3 enables ConservationX to precisely measure animal species survival rates globally and support extinction prevention efforts through advanced object detection and tracking @AIatMeta
- Google demonstrates Gemini 3 coding a complete retro-themed dance night website from a single prompt, showcasing end-to-end development capabilities @GoogleDeepMind
- Developer creates text interface for Notion AI, demonstrating practical integration of AI assistants into existing productivity workflows @brian_lovin
- MIT engineers design ultrasonic system to shake water out of atmospheric water harvesters, improving efficiency of water collection technology @MIT
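The Agentic Reviewer item above compares reviewers using Spearman correlation, which is the Pearson correlation computed on ranks instead of raw scores. The following is a minimal sketch of that calculation. The review scores are hypothetical and are not data from paperreview.ai.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical scores from two reviewers for the same five papers.
reviewer_a = [6, 3, 8, 5, 7]
reviewer_b = [5, 4, 8, 6, 6]
print(round(spearman(reviewer_a, reviewer_b), 2))
```

A value near 1 means the two reviewers rank papers in almost the same order, so similar AI-human and human-human values indicate comparable levels of agreement.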
AI Research
- Study on GPT-4o and GPT-3.5 finds AI works as an amplifier: users with higher creative and cognitive ability without AI also produce better work with AI, with baseline ability explaining 40% of the variance in AI-assisted creative performance @emollick
- Research on small multimodal models explores perception and reasoning bottlenecks when downscaling model size, providing insights into what breaks during model compression @mark_endo1
- Google DeepMind paper on raw pixel space pretraining forecasts that next-pixel modeling will reach competitive ImageNet classification (over 80% top-1 accuracy) and generation metrics (90 Frechet Distance) within five years @skywalkeryxc
- Researchers note that excluding the KL divergence term from the GRPO loss is becoming standard in reasoning and RL training pipelines and does not cause training instability, highlighting differences between RL for LLMs and traditional deep RL @cwolferesearch
- Multi-task RL research introduces BRC, a simple recipe that outperforms state-of-the-art single-task agents while using less compute, unlocking LLM-style transfer and fine-tuning capabilities @mic_nau
- Developer demonstrates making Claude's code analysis 2x faster while using half the tokens by adding an instruction to use the newly released mgrep tool, showing significant improvements in speed, efficiency, and quality @isaac_flath
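For the GRPO item above, the following is a simplified sketch of what dropping the KL term means in practice. It is not any particular library's implementation. It assumes one log-probability per sampled response (real pipelines work per token), and the parameter names `clip_eps` and `beta` are illustrative. Setting `beta=0.0` removes the KL penalty against the reference policy.

```python
import math


def group_advantages(rewards):
    """Normalize rewards within one group of sampled responses."""
    mean = sum(rewards) / len(rewards)
    variance = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(variance)
    return [(r - mean) / (std + 1e-8) for r in rewards]


def grpo_loss(logp_new, logp_old, logp_ref, rewards, clip_eps=0.2, beta=0.0):
    """GRPO-style loss for one group of responses; beta=0.0 drops the KL term.

    Assumption: each list holds one sequence-level log-probability per response.
    """
    advantages = group_advantages(rewards)
    total = 0.0
    for lp_new, lp_old, lp_ref, adv in zip(logp_new, logp_old, logp_ref, advantages):
        # Clipped policy-ratio surrogate, as in PPO.
        ratio = math.exp(lp_new - lp_old)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        surrogate = min(ratio * adv, clipped * adv)
        # A commonly used non-negative estimate of KL(new || reference).
        log_r = lp_ref - lp_new
        kl = math.exp(log_r) - log_r - 1.0
        total += -(surrogate - beta * kl)
    return total / len(rewards)


# Hypothetical log-probabilities and rewards for a group of four responses.
logp_new = [-1.0, -2.0, -1.5, -0.5]
logp_old = [-1.0, -2.0, -1.5, -0.5]
logp_ref = [-1.2, -1.8, -1.5, -0.9]
rewards = [1.0, 0.0, 0.0, 1.0]

loss_without_kl = grpo_loss(logp_new, logp_old, logp_ref, rewards, beta=0.0)
loss_with_kl = grpo_loss(logp_new, logp_old, logp_ref, rewards, beta=0.04)
print(loss_without_kl, loss_with_kl)
```

With `beta=0.0` the reference log-probabilities have no effect on the loss, so in this sketch the reference model would not need to be evaluated at all.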