AI Updates on 2025-06-30

AI Model Announcements

  • Baidu releases ERNIE 4.5 series with 23 models ranging from 0.3B to 424B parameters, achieving state-of-the-art performance across text and multimodal benchmarks, competitive with DeepSeek V3 and Qwen 235B @PaddlePaddle
  • Alibaba releases Ovis-U1-3B multimodal model for understanding, generation, and editing, powered by MMDiT and bidirectional token refinement @AdinaYakup
  • Qwen launches Qwen-TTS via API, trained on millions of hours of speech with support for 3 Chinese dialects and 7 bilingual voices @Alibaba_Qwen
  • Arcee AI releases five language models including three enterprise-grade production models and two research models as part of their transition to the Arcee Foundation Model family @arcee_ai
  • OpenAI's rumored open-source model is generating significant buildup from reliable sources, along with speculation about its actual name and potential for substantial impact @AndrewCurran_

AI Industry Analysis

  • Companies are rewriting their core products around reasoning models, removing LLM 1.0 scaffolding and building entirely new user experiences as the reasoning-model era accelerates @OfficialLoganK
  • AI infrastructure companies like Lovable, Vercel, Cursor, and Replit are positioned as predictable winners in the AI gold rush, selling the tools that let even non-developers build their ideas @GergelyOrosz
  • Apple is testing both Anthropic and OpenAI models on their cloud infrastructure, with the winner potentially powering the new Siri, creating significant competition between the two AI companies @AndrewCurran_
  • Meta restructures its AI unit under "Superintelligence Labs", hiring top talent for the new team with $10M+ annual compensation packages @deedydas
  • Many firms built around GPT-3.5's limitations are now stuck with complex, expensive solutions that perform worse than newer reasoning models used without any scaffolding @emollick
  • Microsoft now ships VS Code and GitHub Copilot as open source, while its main competitor Cursor remains a closed-source fork of VS Code, an unexpected industry dynamic @GergelyOrosz

AI Ethics & Society

  • AI agents demonstrate brand preferences and are attracted to different types of advertisements, with significant money likely to be spent influencing these preferences in the near future @emollick
  • Representative national surveys show real AI productivity gains: teachers report 6-hour weekly time savings and workers report 3x productivity gains on one-fifth of tasks, contradicting claims that AI isn't useful to real people @emollick
  • Stanford research reveals an "ideation-execution gap" where LLM-generated research ideas sound novel but result in worse projects than human-generated ideas when executed by PhD students over 100+ hours @ChengleiSi
  • Jason Wei argues that AI self-improvement will be gradual over many years rather than a fast takeoff, citing bottlenecks in real-world experiments and domain-specific improvement difficulties @_jasonwei

AI Applications

  • Cursor launches web and mobile versions allowing users to spin off dozens of agents and review them later in their editor, expanding beyond desktop development @cursor_ai
  • Perplexity's Comet can play Pokemon and will be available simultaneously on Windows, Mac, iOS, and Android platforms @AravSrinivas
  • Microsoft's MAI-DxO achieves 85.5% diagnostic accuracy on complex medical cases from the New England Journal of Medicine, four times better than experienced physicians while reducing costs @satyanadella
  • Google Gemini's Veo 3 creates highly realistic animal skateboarding videos, demonstrating advanced video generation capabilities for creative applications @GeminiApp
  • Perplexity works effectively across multiple languages, gathering data from both English and non-English sources while presenting results in the requested language, creating a new multilingual search superpower @GergelyOrosz
  • MIT researchers use generative AI to refine robot blueprints and test 3D designs in simulation, creating machines that out-jump human-designed robots and land more consistently @MIT_CSAIL

AI Research

  • SparseLoRA achieves 1.6-1.9x faster LLM fine-tuning with 2.2x fewer FLOPs via contextual sparsity while maintaining performance on math, coding, chat, and ARC-AGI tasks (see the sparsity sketch after this list) @xiuyu_l
  • Google's text-to-text regression approach successfully optimizes massive compute clusters, demonstrating that models can be rewarded with essentially any real-world feedback by training encoder-decoders to read complex system states as text @XingyouSong
  • Chai-2 enables zero-shot antibody discovery in a 24-well plate, exceeding previous state-of-the-art by over 100x in molecular design capabilities @chaidiscovery
  • Stanford research on RL via Implicit Imitation Guidance shows how to use expert data to guide more efficient exploration rather than constraining policies through imitation losses @_anniechen_
  • Physicists reproduce the creativity of AI image generators using just two predictable factors, providing a theoretical understanding of diffusion model behavior @QuantaMagazine
  • Research demonstrates that hierarchical Bayesian methods can predict the rise and fall of in-context learning in LLMs without knowing architecture or learning algorithms @EkdeepL
  • A new study shows that online DPO and GRPO give similar performance, that semi-online iterative DPO works nearly as well with better efficiency, and that combining verifiable with non-verifiable tasks provides cross-transfer gains @jaseweston
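
For the DPO/GRPO item above, here is a minimal sketch of the standard DPO loss that such comparisons build on. It is my own illustration under common conventions (per-sequence summed log-probabilities, a frozen reference model), not code from the study; dpo_loss and its argument names are hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective: prefer the chosen response over the rejected
    one, with log-probabilities measured relative to a frozen reference model."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Negative log-sigmoid of the reward margin; minimized when chosen >> rejected.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up per-sequence log-probabilities.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.5]))
```

As I read the summary, "semi-online iterative" appears to mean the preference pairs are periodically regenerated from the current policy rather than being fully offline or fully on-policy; treat that reading as an assumption rather than a statement of the paper's exact setup.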
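
For the SparseLoRA item, the toy module below illustrates the general idea of contextual sparsity: a cheap predictor scores output channels from the current input, and only the top-scoring channels are kept. This is a generic sketch, not SparseLoRA's actual predictor, sparsity pattern, or LoRA integration, and the names here (ContextuallySparseLinear, keep_ratio) are hypothetical.

```python
import torch
import torch.nn as nn

class ContextuallySparseLinear(nn.Module):
    """Toy contextual sparsity: keep only the top-k output channels
    selected by a lightweight per-input predictor."""
    def __init__(self, d_in, d_out, keep_ratio=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) / d_in ** 0.5)
        self.predictor = nn.Linear(d_in, d_out, bias=False)  # cheap channel scorer
        self.k = max(1, int(keep_ratio * d_out))

    def forward(self, x):                       # x: (batch, d_in)
        scores = self.predictor(x)              # (batch, d_out)
        top = scores.topk(self.k, dim=-1).indices
        mask = torch.zeros_like(scores).scatter_(-1, top, 1.0)
        # Dense matmul masked afterwards keeps the example short; real FLOP
        # savings require actually skipping the masked channels in the kernel.
        return (x @ self.weight.T) * mask
```

The masked dense matmul is only for readability; the speedups reported in the item come from skipping the unselected computation entirely.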