AI Updates on 2025-05-09

AI Model Announcements

  • Google announced Gemini 2.5 Pro (05-06) which achieves state-of-the-art performance on video understanding tasks by a large margin @JeffDean @sundarpichai @OfficialLoganK

AI Research

  • WebGPT paper from 2021 now looks ahead of its time with the capabilities demonstrated by o3 and AI search @natolambert
  • Stanford researchers developed NNetNav, an open-source AI agent that learns by interacting directly with websites while preserving privacy @StanfordHAI
  • Research shows LLMs can be valuable tools for middle school math teachers to enhance learning experiences for students across diverse skill levels @StanfordHAI

AI Applications

  • Reinforcement fine-tuning is now available for o4-mini, allowing developers to customize model behavior @gdb @OpenAIDevs
  • Deep research capabilities for codebases now available, enabling developers to better understand their code @gdb @OpenAIDevs
  • Qwen Chat introduced a Web Dev feature that builds frontend webpages and apps from a single one-line prompt @Alibaba_Qwen
  • Copilot Assistant is now available on Android, letting users long-press the power button or swipe to launch voice sessions in the context of their current activity @Copilot
  • Gemini 2.5 now automatically applies 75% cached token discount, potentially offering significant cost savings for applications running prompts against the same long context @simonw
  • Perplexity on WhatsApp is now more conversational and skips searching when it isn't needed @AravSrinivas
  • Windsurf Reviews streamlines code review process by taking a first pass at reviewing pull requests @windsurf_ai
  • Zero is an open-source AI-native email client that manages your inbox automatically @garrytan @ycombinator
  • Scout now offers seamless website deployment - users can simply ask it to "deploy my website" @ycombinator
  • YouLearn turns learning materials into concise notes, provides an AI tutor to talk to, and creates personalized tests @ycombinator
  • Klavis AI is building open source MCP integrations for AI applications with an API that provides hosted, secure MCP servers @ycombinator
  • MorphoAI offers AI-powered software for robotics and machine engineering to develop hardware at software speeds @ycombinator
  • Sai is an AI lab test analysis and health optimization assistant that lives in the SiPhox dashboard, supporting uploads from any lab @ycombinator
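
To make the 75% cached-token discount mentioned above concrete, here is a minimal sketch of the arithmetic. The price per million tokens and the token counts are hypothetical placeholders, not Gemini's actual rates, and the sketch does not model real billing rules such as minimum token thresholds or which tokens qualify as cached.

```python
def prompt_cost(input_tokens, cached_tokens, price_per_million, cache_discount=0.75):
    """Return the input cost in dollars when some tokens are billed at a cached rate.

    All parameters are illustrative assumptions; check the provider's
    pricing page for real rates and caching rules.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    cached_rate = price_per_million * (1 - cache_discount)
    total = uncached_tokens * price_per_million + cached_tokens * cached_rate
    return total / 1_000_000


# Hypothetical scenario: a 100,000-token prompt where 90,000 tokens
# repeat the same long context and hit the cache.
full_price = prompt_cost(100_000, 0, price_per_million=1.00)
with_cache = prompt_cost(100_000, 90_000, price_per_million=1.00)
savings = 1 - with_cache / full_price

print(f"without cache: ${full_price:.4f}")
print(f"with cache:    ${with_cache:.4f}")
print(f"savings:       {savings:.1%}")
```

The overall saving stays below 75% whenever part of the prompt is new, so the benefit grows with the share of each request that repeats the same context.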

AI Industry Analysis

  • YC partners discuss how AI coding tools are transforming software development, enabling small teams to accomplish what once required armies of engineers @garrytan @ycombinator
  • Rippling raised $450M at a $16.8B valuation, highlighting continued strong investment in AI-powered HR and finance platforms @TechCrunch @ycombinator
  • PyTorch Foundation has expanded into an umbrella foundation with vLLM and DeepSpeed accepted as hosted projects, advancing community-driven AI across the full lifecycle @PyTorch
  • Google signed a deal to develop 1.8 GW of advanced nuclear power, likely to support growing AI infrastructure needs @TechCrunch
  • SoundCloud changed policies to allow AI training on user content, joining the trend of platforms opening content for AI development @TechCrunch

AI Ethics & Society

  • Concerns raised about AI detectors like Pangram Labs being used adversarially without independent assessment of false positive rates @emollick
  • Microsoft Research discusses ethical considerations in healthcare AI, including governance frameworks and bias mitigation @MSFTResearch
  • Yann LeCun counters common misconceptions about LLMs, noting they don't make users lazy but instead encourage learning more and faster @ylecun
  • Concerns expressed about NSF budget cuts potentially harming US technological leadership in AI compared to countries like China that are making massive investments in science @jeffclune @ylecun

AI Updates on 2025-05-08

AI Model Announcements

  • Alibaba introduces Qwen3 family of eight open large language models, including two mixture-of-experts (MoE) models and six dense models ranging from 32B to 0.6B parameters, supporting reasoning mode and 119 languages @DeepLearningAI
  • Meta introduces Meta Locate 3D, a model for accurate object localization in 3D environments to help robots understand surroundings and interact with humans @AIatMeta

AI Research

  • Research shows GPT-4o makes up citations to papers, with bias towards shorter titles and famous papers, though error rates appear lower for Deep Research models @emollick
  • Base language models outperform aligned models at randomness and creativity, suggesting alignment doesn't only extract abilities hidden in pretraining but also hides other abilities @stanfordnlp
  • Google DeepMind's AI co-scientist validated for liver fibrosis research, successfully identifying HDAC inhibitor Vorinostat as having significant anti-fibrotic effects in human liver organoid models @demishassabis
  • Tsinghua University researchers reportedly developed a method for AI to generate its own training data, surpassing performance of models trained on expert human-curated data @garrytan

AI Applications

  • Gemini 2.5 can comprehend video content, allowing users to record app explanations, upload to YouTube, and prompt "Build me this" for AI to understand and recreate the application @deedydas @sundarpichai
  • ChatGPT's deep research tool now connects to GitHub repositories, allowing users to ask questions about code while the agent reads and searches source code and PRs @OpenAI
  • Meta and NVIDIA integrate NVIDIA cuVS into Faiss v1.10 for vector search on GPUs, speeding up index builds by up to 4.7x and reducing search latency by up to 8.1x @AIatMeta
  • Replit launches Notion integration allowing users to connect Notion databases to create customer support pages with AI chatbots trained on support docs @amasad
  • Google launches implicit caching in the Gemini API, enabling 75% cost savings when requests hit cache, with lowered minimum token requirements @LoganKilpatrick @sundarpichai
  • Microsoft integrates Copilot into GroupMe chat app, bringing GPT-4o image generation capabilities directly into group conversations @mustafasuleyman
  • Wells Fargo implemented Microsoft Teams Agent for 35,000 bankers across 4,000 branches, cutting response times for internal questions from 10 minutes to 30 seconds @Microsoft

AI Industry Analysis

  • OpenAI expands leadership team with Fidji Simo as CEO of Applications, allowing Sam Altman to increase focus on research, compute, and safety as the company approaches superintelligence @sama
  • ChatGPT was the only website among the top 10 most visited to grow in April compared to March @aidan_mclau
  • Bill Gates announces plan to give away virtually all his wealth through the Gates Foundation over the next 20 years @BillGates
  • Soumith Chintala takes on role of leading Fundamental AI Research (FAIR) at Meta @soumithchintala
  • AI Fund closes $190M for new fund to co-found AI companies, focusing on speed as the critical factor for startup success @AndrewYNg
  • OpenAI reportedly considering offering a lifetime subscription @AndrewCurran_
  • The true cost of running the benchmark on Gemini 2.5 Pro Preview was higher than the initially reported $6.32, and the new 05-06 version costs $37 to run the same benchmark @aidan_mclau

AI Ethics & Society

  • Sam Altman testifies that people are increasingly relying on AI for life advice and emotional support, noting "it's not all bad, but we have to understand it and watch it very carefully" @AndrewCurran_
  • UN releases 200-page report examining AI through the lens of global human development, taking an opinionated approach to the technology's impacts @random_walker
  • Position paper accepted to ICML 2025 outlines steps needed to enable user-centric AI agents that safeguard user autonomy and privacy rather than being controlled by big tech companies @random_walker
  • Stanford students reflect on why the anticipated "EdTech Revolution" hasn't happened two years after ChatGPT's release, questioning who AI education tools are being designed for and who bears the risks @StanfordHAI
  • Sam Altman expresses concerns about EU regulations potentially preventing deployment of "great models and services that are quite safe and robust" due to lengthy approval processes @AndrewCurran_

AI Updates on 2025-05-07

AI Model Announcements

  • Meta introduces Perception Language Model (PLM), an open and reproducible vision-language model for challenging visual tasks, with research paper, code, and dataset available @AIatMeta
  • Google releases updated Gemini 2.0 Image Generation model with better visual quality, more accurate text rendering, lower block rates, higher rate limits, and pricing of $0.039 per generated image @demishassabis @OfficialLoganK
  • NVIDIA open sources Open Code Reasoning models (32B, 14B, 7B versions) under Apache 2.0 license, beating o3 mini & o1 (low) on LiveCodeBench @huggingface

AI Research

  • Google and Institute of Science and Technology Austria report first-ever method using light microscopy to comprehensively map all neurons and their connections in mouse brain tissue @GoogleAI @fchollet
  • Stanford researchers release SWE-smith, a toolkit for generating software engineering training data; a model trained on that data achieved 40.2% Pass@1 on SWE-bench Verified, the top result among open-source models for software engineering @stanfordnlp
  • MIT researchers develop new AI method modeled after neural oscillations in the brain to analyze long data sequences like climate trends and financial metrics @MIT_CSAIL
  • Researchers release SIFT-50M, a large-scale multilingual dataset for speech instruction fine-tuning covering 5 languages, with the resulting SIFT-LLM outperforming SALMONN & Qwen2-Audio on speech-following benchmarks @huggingface
  • MegaMath, the largest open-source collection of math pre-training corpora, reaches 70k+ downloads @huggingface
  • SwallowCode dataset released with 16.1B tokens of LLM-rewritten Python code, filtered by syntax and pylint score, showing +17.0 pass@1 improvement on HumanEval @huggingface
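
For readers unfamiliar with the pass@1 figures cited above, the following sketch shows the unbiased pass@k estimator introduced alongside HumanEval (Chen et al., 2021). It is included only to illustrate the metric; the sample counts below are made up, and the evaluation setups behind the reported numbers may differ.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed the tests
    k: budget of attempts being evaluated
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset contains at least one passing sample.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical per-problem results: (samples generated, samples passing).
results = [(10, 3), (10, 0), (10, 10)]
scores = [pass_at_k(n, c, k=1) for n, c in results]
benchmark_pass_at_1 = sum(scores) / len(scores)
print(f"pass@1 = {benchmark_pass_at_1:.3f}")
```

For k = 1 the estimator reduces to c / n per problem, so a benchmark-level pass@1 is the average fraction of passing samples across problems.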

AI Applications

  • Anthropic adds web search to their API, allowing developers to augment Claude's knowledge with up-to-date data, including citations and domain control features @AnthropicAI
  • Figma announces Figma Make, an AI-powered tool that turns designs into interactive prototypes, along with Figma Sites for web publishing, with code and AI capabilities coming soon @figma
  • Stripe unveils payments foundation model that creates embeddings for transactions, improving the detection rate for card-testing attacks on large users from 59% to 97% @paulg
  • Coinbase launches x402, described as "HTTP for Money," built on stablecoins for Agentic Commerce, enabling AI agents to make payments without human intervention @garrytan
  • DeepLearning.AI releases new course on Building AI Voice Agents for Production in collaboration with LiveKit and RealAvatar, teaching how to build voice agents with low latency @AndrewYNg @DeepLearningAI
  • MIT researchers develop fiber computer that can be woven into clothing, allowing apparel to run apps and understand the wearer @MIT
  • Neuralink brain implant gets a boost from generative AI to improve functionality @techreview

AI Industry Analysis

  • PyTorch Foundation expands into an umbrella foundation with vLLM and DeepSpeed joining as the first hosted projects @PyTorch @soumithchintala
  • OpenAI reportedly discussing with FDA about using AI for drug evaluations @TechCrunch
  • OpenAI seeking to team up with governments to grow AI infrastructure @TechCrunch
  • CB Insights releases 2024 AI 100 list of promising early-stage startups, showing growing market for agents and infrastructure with over 20% of companies building or supporting agents @DeepLearningAI
  • Y Combinator publishes Requests for Startups focused on AI, seeking founders who treat AI agents as core operating systems for new companies and industries @ycombinator
  • Stanford HAI analysis of DeepSeek's rise challenges assumption that US leads in AI talent attraction and retention, as most DeepSeek researchers were educated in China @StanfordHAI
  • Google and Elementl Power sign agreement to develop three new project sites for advanced nuclear reactors, each generating at least 600 megawatts @AndrewCurran_

AI Ethics & Society

  • Research by MIT Media Lab and OpenAI finds that extensive use of AI chatbots correlates with increased feelings of loneliness @medialab
  • Anthropic Interpretability Team is planning a virtual Q&A on how it intends to make models safer, the role of the team, and future directions @ch402
  • Microsoft Research hosts Fusion Summit bringing together global experts to explore how AI can help unlock fusion energy potential @MSFTResearch

AI Updates on 2025-05-06

AI Model Announcements

  • Google releases updated Gemini 2.5 Pro (I/O edition) with significantly improved coding capabilities, ranking #1 on WebDev Arena leaderboard with a +147 Elo score gain @JeffDean @GoogleDeepMind
  • The new Gemini 2.5 Pro model (gemini-2.5-pro-preview-05-06) is now #1 across all LMArena leaderboards including text, vision, and WebDev @OfficialLoganK
  • Meta introduces Meta Perception Encoder, a vision encoder setting new standards in image & video tasks, excelling in zero-shot classification & retrieval @AIatMeta
  • ServiceNow and NVIDIA announce Apriel Nemotron 15B, a compact AI model built with NVIDIA NeMo and trained on NVIDIA DGX Cloud @NVIDIAAI

AI Research

  • Gemini 2.5 Pro achieves 84.8% on the VideoMME benchmark, demonstrating state-of-the-art performance on image and video understanding @JeffDean
  • Google Research introduces a system using Gemini models designed for high fidelity text simplification that enhances clarity while preserving meaning, detail, and nuance @GoogleAI
  • Second-order optimization shows promise for more efficient LLM pretraining according to a study on the advantages of Muon @eladgil
  • BayesFlow 2.0, a Python package for amortized Bayesian inference powered by Keras 3, released with support for JAX, PyTorch, and TF @fchollet

AI Applications

  • Gemini 2.5 Pro can build interactive web apps, games, and simulations from a single prompt, with significantly improved capabilities for front-end web development, editing, and transformation @demishassabis
  • Hugging Face releases Open Computer Agent, allowing LLMs to complete tasks using a virtual machine, testing how well current models use a computer to solve everyday tasks @huggingface
  • Microsoft introduces Claimify, a new method for extracting simple, verifiable claims from LLM outputs that preserves critical context and outperforms past approaches @MSFTResearch
  • Google launches Simplify feature for iOS that uses AI to make dense text easier to understand @GoogleAI
  • Hugging Face launches Computer Use in smolagents, allowing vision models to power complex agentic workflows, especially Qwen-VL models, which support built-in grounding @huggingface
  • Cursor announces free access for students to their AI-powered coding assistant @cursor_ai
  • Windsurf introduces Knowledge Base in Wave 8, allowing users to import documents from Google Drive for Cascade to use as context @windsurf_ai

AI Industry Analysis

  • OpenAI's acquisition of Windsurf for $3 billion is moving forward according to Bloomberg reports @AndrewCurran_
  • Windsurf was acquired by OpenAI for $3B at ~$40M ARR (75x), while Cursor raised at a $9B valuation at ~$300M ARR (30x) @deedydas
  • Google's improved position in AI is evident as Gemini dethrones previous Gemini versions, signaling that "the dragon woke up" @AndrewCurran_
  • Chinese AI startups are going to great lengths to not be seen as Chinese, with companies like Genspark presenting themselves as "Palo Alto based" despite connections to China @deedydas
  • Reinforcement Learning (RL) is very expensive compared to Supervised Fine-Tuning (SFT), but perfect for businesses as it can optimize metrics that matter like sales or customers @alexgraveley
  • AI lowers the barrier to getting started on anything, but doing great work still requires execution, judgment, creativity, and domain knowledge @paulg
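
The revenue multiples quoted above for Windsurf and Cursor follow directly from dividing valuation by annual recurring revenue (ARR). A minimal sketch of that arithmetic, using the approximate figures from the item itself (the function name is illustrative):

```python
def revenue_multiple(valuation_usd, arr_usd):
    """Return the valuation expressed as a multiple of annual recurring revenue."""
    if arr_usd <= 0:
        raise ValueError("ARR must be positive")
    return valuation_usd / arr_usd


# Approximate figures as quoted in the digest item above.
deals = {
    "Windsurf": (3_000_000_000, 40_000_000),
    "Cursor": (9_000_000_000, 300_000_000),
}

for company, (valuation, arr) in deals.items():
    print(f"{company}: {revenue_multiple(valuation, arr):.0f}x ARR")
# prints "Windsurf: 75x ARR" then "Cursor: 30x ARR"
```

Because both ARR figures are rough estimates, the resulting multiples should be read as approximate.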

AI Ethics & Society

  • Paul Tudor Jones reported on CNBC that a leading AI developer at a tech conference stated "I think it's going to take an accident where 50 to 100 million people die to make the world take the threat of this really seriously" @AndrewCurran_
  • MIT Media Lab researchers present TeleAbsence, exploring design principles for how AI could help people cope with loss and plan for how they might be remembered @medialab
  • Numerous members of the MIT and Media Lab communities will participate in the Venice Biennale with the theme "Intelligens. Natural, artificial, collective," focusing on applying adaptive intelligence to a demanding world @medialab

AI Updates on 2025-05-05

AI Model Announcements

  • Hugging Face announced that Nvidia has released Llama-Nemotron, an efficient reasoning model @huggingface
  • Nvidia open-sourced Parakeet TDT 0.6B, described as the best speech recognition model on Open ASR Leaderboard, capable of transcribing 60 minutes of audio in 1 second @huggingface

AI Research

  • MIT researchers developed a new method to make AI models more trustworthy for high-stakes settings by conveying uncertainty more precisely @MIT
  • Chris Olah investigated whether superposition is a major cause of adversarial examples by training SAEs on adversarially trained models @ch402
  • D-FINE, a real-time object detector reported to be faster and more accurate than YOLO and released under an Apache 2.0 license, has been added to Hugging Face transformers @huggingface

AI Applications

  • Simon Willison released a new llm-video-frames plugin that turns video files into sequences of JPEGs to feed into long-context vision models like GPT-4.1-mini @simonw
  • Perplexity on WhatsApp provides a convenient way to use AI during flights, since in-flight WiFi supports messaging apps well @AravSrinivas
  • Claude 3.7 Sonnet can now crawl entire websites, extract specific data, and complete research tasks without leaving the desktop app @ycombinator
  • Google's Veo 2 on the Gemini app allows users to input prompts directly to generate videos, with the model only able to respond in video format @AndrewCurran_
  • Pulse AI launched Ultra, described as their new hybrid reasoning model and "the most accurate document extraction model in the industry" @ycombinator
  • Alex 3.0 released with features to automatically compile and fix errors, auto-apply code, add packages, search the web, run terminal commands, and review code with local LLM support @ycombinator

AI Industry Analysis

  • OpenAI announced structural changes: the nonprofit will continue to control the for-profit entity, which will become a Public Benefit Corporation with the same mission @OpenAI @sama
  • Many companies cannot use Qwen and DeepSeek open models because they come from China, slowing adoption of open models across enterprises @natolambert
  • Google refreshed its music-generation tools with Lyria 2 for Music AI Sandbox and Lyria RealTime for DJ, producing high-quality 48kHz audio with extensive control over musical attributes @DeepLearningAI
  • The Keras team released KerasRS, a new library for building recommender systems with easy-to-use building blocks compatible with JAX, PyTorch, TF, and optimized for TPUs @fchollet
  • Hugging Face introduced the Common Crawl Creative Commons Corpus (C5), a heavily filtered web-crawl dataset containing only Creative Commons licensed documents with 150 billion tokens collected so far @huggingface

AI Ethics & Society

  • Arvind Narayanan discusses two ever-present risks when using generative AI for work: hallucinations/confabulations and deskilling, emphasizing the importance of having a plan to address these risks @random_walker
  • Stanford HAI reports that Visa's integration of AI into its payment system could lead to consumers facing higher prices and deceptive practices without realizing it @StanfordHAI
  • A study found that people struggle to get useful health advice from chatbots @TechCrunch
  • Ethan Mollick shares research suggesting people may be massively underreporting their AI usage in surveys @emollick