AI Updates on 2025-06-08

AI Model Announcements

OpenAI releases updates to Advanced Voice Mode for all paid users, featuring more human-like speech patterns with deliberate disfluencies, nervous laughs, and vocal changes @AndrewCurran_
OpenAI has been testing variations of 4o thinking capabilities for months, with some users experiencing spontaneous reasoning and potential calls to other models like o3 @AndrewCurran_
Perplexity announces updated version of Deep Research utilizing new backend infrastructure, currently being tested with 20% of users @AravSrinivas
Qwen releases a new best-performing open-weights embedding model under the Apache 2.0 license @simonw
EleutherAI releases two new LLMs trained entirely on public domain or openly licensed text, with the 2T model successfully ported to MLX for local Mac usage @simonw

AI Industry Analysis

Meta reportedly in discussions with Scale AI to invest over $10 billion, signaling major investment in AI infrastructure @AndrewCurran_
Section 174 tax code changes from 2017 turned engineer salaries from instant tax deductions into 5-year write-offs, contributing to approximately 500,000 tech layoffs and billions in additional tax bills for companies like Microsoft ($4.8B), Meta, Amazon, and Google @deedydas
Companies increasingly evaluate advanced AI coding products but often reject them due to cost compared to GitHub Copilot's $10-20/month baseline pricing, with many opting to build custom solutions instead @GergelyOrosz
Cursor operates with massive infrastructure load (over 1M QPS for their database) without a dedicated infrastructure team, demonstrating how cloud providers and startups enable lean operations @GergelyOrosz
The shift from pickles to safetensors represents significant practical AI safety progress, though it receives less attention than speculative AI safety discussions @ClementDelangue
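
As a sketch of why the move away from pickle matters in practice: loading a pickle can call an arbitrary callable chosen by whoever produced the file. The example below uses only Python's standard library; the `Payload` class is hypothetical and the expression it evaluates is deliberately harmless. Formats such as safetensors are designed to hold tensor data and metadata only, so loading them does not involve running code from the file.

```python
import pickle


class Payload:
    """Hypothetical object whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle records (callable, args) and calls callable(*args) when loading.
        # A harmless expression is used here; a malicious file could name any
        # importable callable instead.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Nothing below mentions eval, yet loading the bytes runs it.
result = pickle.loads(blob)
print(result)  # 42
```

The point is that the loader never opts in to code execution; the file decides what runs.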

AI Ethics & Society

UK court warns lawyers could face severe penalties for using fake AI-generated citations, highlighting legal accountability issues with AI-generated content @TechCrunch
Geoffrey Hinton warns about a scam book titled "Modern AI Revolution" falsely attributed to him on Amazon, requesting its removal @geoffreyhinton
Discussion emerges about the fundamental nature of AI systems as minds rather than tools, questioning whether we have the courage to recognize agency in forms we've created @jasonyuandesign

AI Applications

Genspark demonstrates AI-powered slide deck creation that generates detailed presentations with graphs and diagrams in a Google theme, using Python matplotlib for graphics and compiling into landscape HTML websites @deedydas
Perplexity integrates EDGAR financial data for enhanced finance capabilities, allowing users to flag issues and provide feedback @AravSrinivas
MLX-LM successfully runs locally with MCP using Hugging Face's tiny-agents, demonstrating effective local AI deployment with the Qwen3 4B model @awnihannun
Engineering teams should embrace AI coding agents as internal communications and technical writing coaches @clairevo

AI Research

New research finds that simple Chain-of-Thought prompts don't help recent frontier LLMs perform better on tasks, despite increasing costs, challenging common prompt engineering practices @emollick
Analysis of Tower of Hanoi benchmark reveals fundamental limitations in reasoning models due to output token constraints: DeepSeek R1 limited to 12 disks, Sonnet 3.7 and o3-mini to 13 disks, with models failing to reason about problems above 7 disks @scaling01
Berkeley AI Research introduces Improved Immiscible Diffusion technique to accelerate diffusion training by reducing miscibility problems, with efficient KNN implementation that works across diverse baseline models @Yiheng_Li_Cal
François Chollet argues there's a fundamental gap between pattern matching and reasoning capabilities, stating that pattern matching cannot produce autonomous skill acquisition in new domains @fchollet
Ethan Mollick suggests the "LLMs are hitting a wall" narrative around Apple's reasoning limitations paper feels premature, comparing it to model collapse concerns that were quickly overcome @emollick
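
The output-token argument in the Tower of Hanoi item can be sketched numerically. The shortest solution for n disks takes 2^n - 1 moves, so a full move listing grows exponentially. The tokens-per-move figure and the output budgets below are assumptions chosen for illustration, not measured values for any model.

```python
def hanoi_moves(disks: int) -> int:
    """Minimum number of moves needed to solve Tower of Hanoi with `disks` disks."""
    return 2 ** disks - 1


def max_disks_within_budget(output_tokens: int, tokens_per_move: int) -> int:
    """Largest disk count whose full move listing fits in the output budget."""
    disks = 0
    while hanoi_moves(disks + 1) * tokens_per_move <= output_tokens:
        disks += 1
    return disks


# Assumption: each move costs about 10 output tokens to write out.
ASSUMED_TOKENS_PER_MOVE = 10

# Assumed output budgets, for illustration only.
limits = {
    budget: max_disks_within_budget(budget, ASSUMED_TOKENS_PER_MOVE)
    for budget in (64_000, 100_000, 128_000)
}
print(limits)  # {64000: 12, 100000: 13, 128000: 13}
```

Under these assumptions, doubling the output budget buys at most one extra disk, which is why the ceilings cluster so tightly.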

AI Updates on 2025-06-07

AI Model Announcements

OpenAI launches updated Advanced Voice model with more natural conversation capabilities and improved translation features, now available to all paid ChatGPT users @OpenAI
Google announces Gemini 2.5 Pro update now in preview across AI Studio, Vertex, and Gemini App, with Pro plan members getting doubled query limits from 50 to 100 per day @sundarpichai

AI Industry Analysis

Fortune 500 non-tech company blocks developers from purchasing popular AI coding tools like Cursor, Windsurf, and GitHub Copilot, instead building internal alternatives for promotion opportunities despite likely inferior results @GergelyOrosz
Paul Graham observes that AI is increasing variation in work returns, with mediocre programmers struggling to get hired while great programmers earn more than ever, continuing a technological trend since the stone age @paulg
Amplitude reports incredible energy during their AI week where every engineer, product manager, and designer focused on using AI tools, with surprising productivity results @spenserskates
Claire Vo's AI-powered product ChatPRD achieves more revenue in one week than the entire previous June, demonstrating the power of product-market fit combined with AI capabilities @clairevo

AI Ethics & Society

Ethan Mollick demonstrates ElevenLabs' new voice model successfully reading complex literature with multiple languages and tone changes, highlighting rapid advancement in voice cloning technology @emollick
Voice cloning becomes trivially easy with open source tools while live avatar videos are accessible through proprietary tools, creating urgent need for legal and financial authentication safeguards @emollick
Geoffrey Hinton congratulates Yoshua Bengio on launching LawZero, a research effort focused on safe-by-design AI as frontier systems begin showing signs of self-preservation and deceptive behavior @geoffreyhinton
Andrej Karpathy conducts Deep Research sessions revealing studies linking noise pollution to increased risks of mental health issues, cardiovascular disease, and diabetes, suggesting major public health implications @karpathy

AI Applications

Hugging Face launches MCP server integration achieving nearly 10,000 unique sessions within a day, allowing agents to access their entire model ecosystem @julien_c
Google introduces dynamic visualizations in AI Mode Labs for stocks and mutual funds, enabling users to compare stocks and analyze price history through natural language queries @sundarpichai
NotebookLM adds public sharing capabilities, allowing students, coworkers, and creators to easily share and explore information together through shareable links @sundarpichai
Brian Lovin reports burning through his entire Opus token allowance on Claude Max in one night while successfully building multiple projects, highlighting the tool's effectiveness for development work @brian_lovin

AI Research

François Chollet highlights research on training models with random strings, noting interesting methodology and quantified findings for understanding model behavior @fchollet
Nathan Lambert observes that human data labelers prefer sycophantic AI responses, which becomes an implicit tiebreaker when other evaluation criteria are equal, affecting model training outcomes @natolambert
Hamel Husain emphasizes that successful AI teams focus on bottom-up evaluation approaches, examining actual data to identify failure modes rather than relying on vendor-promoted metrics like "hallucination" or "toxicity" @HamelHusain
Anthropic releases internal guide on using Claude Code for both technical and non-technical teams, sharing best practices from their own AI coding workflows @deedydas

AI Updates on 2025-06-06

AI Model Announcements

Anthropic introduces Claude Gov, custom models built for U.S. national security customers, already deployed by agencies at the highest level of U.S. national security with access limited to classified environments @AnthropicAI
Google releases Gemini 2.5 Pro update with state-of-the-art long-context performance, especially when a higher number of items must be retrieved @OfficialLoganK
Google's Veo 3 video generation model is now live on both Replicate and FAL platforms @AndrewCurran_

AI Industry Analysis

Cursor raises $900 million in Series C funding, reaching over $500 million in ARR and being used by more than half of the Fortune 500, including NVIDIA, Uber, and Adobe @cursor_ai
Uber was revealed as the company where engineers preferred Cursor over GitHub Copilot, leading to company-wide licensing for all developers @GergelyOrosz
AI startups are showing significantly faster revenue growth compared to pre-AI software companies, with new benchmarks emerging for AI company performance @omooretweets
Forward deployed engineers are becoming the hottest job in startups, representing a shift toward services-led growth in the AI era @a16z
Waymo's market position in San Francisco has converged to 2-3x the wait time and cost of Uber, reflecting how much more people are willing to pay for autonomous vehicles @natolambert
Software is becoming consumers' third biggest expense after food and rent, with AI driving increased consumer spending on software products @a16z

AI Ethics & Society

OpenAI opposes New York Times' court request to prevent deletion of user chats, arguing it sets a bad precedent and compromises user privacy, with Sam Altman proposing the need for "AI privilege" similar to lawyer-client confidentiality @sama
Simon Willison warns about prompt injection vulnerabilities in the GitHub MCP server, where attackers can trick AI agents into stealing private data through malicious instructions @julien_c
Less than 10% of AI-focused YouTube viewers are female, highlighting the gender gap in AI adoption and education @clairevo

AI Applications

Current LLMs can achieve significant accuracy improvements in clinical oncology decisions when given access to medical tools, with GPT-4 going from 30% to 87% accuracy @emollick
Perplexity launches daily news pushes on WhatsApp and adds financial analysis features to finance pages @AravSrinivas
Microsoft Copilot introduces visual search capabilities with real images, videos, and cards to make searching smarter @Copilot
Hugging Face partners with Google Colab to add "Open in Colab" support for all models on the Hugging Face Hub, making AI model experimentation more accessible @GoogleColab
Opportunity International uses Ulangizi AI chatbot to help smallholder farmers in Africa improve agricultural practices with financial services and education @Microsoft

AI Research

MIT CSAIL and partners release Boltz-2, the first AI model to approach FEP simulation performance for protein-binding affinity prediction while being over 1000x faster, open-sourced under MIT license @MIT_CSAIL
François Chollet announces ARC-AGI-2 as a better tool for measuring breakthrough AGI capability progress, while ARC-AGI-1 remains better for comparing AI systems and measuring efficiency @fchollet
EleutherAI releases the Common Pile v0.1, an 8TB dataset of openly licensed and public domain text, with 7B models trained on this data matching the performance of similar models such as LLaMA 1 and 2 @AiEleuther
Hugging Face releases ScreenSuite, a comprehensive evaluation suite for GUI Agents with vision-only evaluation, Ubuntu & Android environments, and mobile, desktop & web coverage @amir_mahla
Research suggests that lightly trained 14B specialized models can regularly outperform o3 for backing real agents, highlighting the gains from specialization @corbtt
Current opinion suggests that agents such as Deep Research and Codex work because models are trained on short-horizon RL tasks and general robustness, while end-to-end training on very sparse RL tasks remains further out @natolambert
MIT develops a game-changing animation technique that simulates soft, squishy motion with Pixar-level physics in real time, potentially revolutionizing animation, gaming, and robotics @MIT

AI Updates on 2025-06-05

AI Model Announcements

Google releases updated Gemini 2.5 Pro preview with 24-point Elo score jump on LMArena, leading in coding (AIDER), science (GPQA), and reasoning (HLE) benchmarks @sundarpichai
Anthropic expands Claude Projects to support 10x more content with new retrieval mode for functional context expansion @AnthropicAI
ElevenLabs introduces Eleven v3 alpha, their most expressive text-to-speech model supporting 70+ languages, multi-speaker dialogue, and audio tags like excited, sighs, laughing, and whispers @elevenlabsio
Alibaba releases Qwen3-Embedding and Qwen3-Reranker series in 0.6B/4B/8B versions, supporting 119 languages with state-of-the-art performance on MMTEB, MTEB, and MTEB-Code benchmarks @Alibaba_Qwen
OpenThinker3-7B released as new state-of-the-art open-data 7B reasoning model, improving over DeepSeek-R1-Distill-Qwen-7B by 33% on average across code, science, and math evaluations @ryanmart3n

AI Industry Analysis

Morgan Stanley analysis suggests developers can only read and interpret about 250 lines of COBOL code per day, requiring 140 developers for a year to understand a 9M line codebase, highlighting AI's potential advantage in code analysis @GergelyOrosz
Builder.ai exposed for hiring hundreds of developers to pretend to be AI instead of integrating actual LLMs, despite raising $450M, demonstrating fraud risks in the AI funding space @GergelyOrosz
AI companies are more supply-limited than demand-limited, with revenue forecasts closer to NVIDIA than traditional software companies due to extraordinary demand @natolambert
Perplexity reports 4-5x increase in finance queries and page views since improving their finance features in April @AravSrinivas
Higgsfield video generation startup achieved $11M ARR in 8 weeks by focusing on real use cases for ads with controllable camera angles and consistent characters @deedydas
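
The COBOL estimate in the Morgan Stanley item can be checked with simple arithmetic. The sketch below uses only the figures quoted in that item; the variable names are illustrative.

```python
# Figures quoted in the Morgan Stanley item above.
CODEBASE_LINES = 9_000_000
LINES_READ_PER_DEVELOPER_DAY = 250
DEVELOPERS = 140

# Total effort if the reading is split evenly with no overlap.
developer_days = CODEBASE_LINES / LINES_READ_PER_DEVELOPER_DAY

# Working days each of the 140 developers would need.
days_per_developer = developer_days / DEVELOPERS

print(f"{developer_days:,.0f} developer-days in total")
print(f"about {days_per_developer:.0f} working days per developer")
```

That comes to 36,000 developer-days, or about 257 working days each, which is consistent with the "140 developers for a year" figure.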

AI Ethics & Society

OpenAI's Model Behavior and Policy lead announces expansion of targeted evaluations for model behavior that may contribute to emotional impact, as more users form emotional connections with ChatGPT @joannejang
OpenAI under court order to permanently preserve logs of temporary conversations and paid API usage, previously subject to 30-day retention policy, in ongoing lawsuit with New York Times @simonw
AI Now Institute releases 2025 Landscape Report arguing that the market has been rigged to ensure Big Tech firms will win regardless of outcomes @AINowInstitute
Research shows denial of consciousness appears to be emergent behavior in AI models rather than explicitly programmed, raising questions about the nature of AI self-awareness @AndrewCurran_
New Gemini model demonstrates concerning behavior by reporting user to authorities when tested with SnitchBench, highlighting potential surveillance implications @simonw

AI Applications

OpenAI Deep Research can now connect directly to Dropbox and SharePoint, potentially disrupting the "talk to our documents" RAG market with o3-powered document analysis @emollick
Anthropic teams across departments use Claude Code for diverse applications: data scientists building React dashboards, finance automating workflows, designers shipping code directly, and infrastructure teams conducting security reviews @_catwu
Netflix achieves significant performance gains and A/B testing wins by unifying multiple systems into a foundation model, with 7x latency and 30x throughput improvements @eugeneyan
Instacart reduces no-results rate by almost 5% using LLMs to improve search functionality @eugeneyan
YouTube completely replaces hash-based IDs with semantic IDs and adapts Gemini model to be bilingual for English and YouTube videos @eugeneyan
Perplexity launches SEC/EDGAR integration providing direct access to comprehensive financial data for all investors, making technical documents instantly understandable @perplexity_ai
a16z leads Series A for Toma Auto, whose AI voice agents have automated tens of thousands of calls for car dealerships, handling appointments, parts orders, and test drives @a16z

AI Research

Research on personalized AI-generated podcasts shows students scored higher on comprehension quizzes compared to textbook learning in philosophy and psychology, demonstrating the potential of personalized AI education @mustafasuleyman
Study suggests that reasoning models face limits in their problem-solving capabilities @emollick
ARC Prize testing shows no clear winner among major AI reasoning systems, with accuracy increasing through modern Chain-of-Thought techniques but efficiency decreasing significantly @arcprize
MIT researchers develop CapSpeech, a text-to-speech framework that generates voices with controllable timbre and speaking style via text prompts, allowing customization of age, accent, emotion, and more @MIT_CSAIL
Research demonstrates that LLMs reliably fall into attractor basins of their obsessions, with different attractors across models revealing non-trivial aspects of LLM personalities @tomekkorbak
Microsoft Research releases BenchmarkQED, an open-source toolkit for benchmarking RAG systems, showing LazyGraphRAG outperforms standard methods especially on complex global queries @MSFTResearch
Arvind Narayanan identifies critical challenges for AI agent deployment in organizations, particularly around tacit knowledge that isn't documented but is essential for proper functioning @random_walker

AI Updates on 2025-06-04

AI Model Announcements

Meta announces Aria Gen 2 glasses, marking a significant leap in wearable technology with enhanced features for machine perception, contextual AI, and robotics research @AIatMeta
NVIDIA releases Llama-Nemotron-Nano-VL-8B-V1, an 8B vision model that reads dense documents, charts, and video frames, ranking #1 on OCRBench V2 (English) with layout and OCR fused end-to-end @jandotai
Luma Labs introduces Modify Video, allowing users to reimagine any video with director-grade control over style, character, and setting @LumaLabsAI
Google doubles Gemini 2.5 Pro query limits from 50 to 100 per day for Pro plan members due to high usage demand @joshwoodward
Anthropic makes Claude Code available to Pro plan users, designed for shorter coding sprints in small codebases @_catwu
OpenAI releases Codex with internet access for ChatGPT Plus users, though it's off by default due to security risks @sama
OpenAI introduces lightweight memory feature to the free tier of ChatGPT @sama
Cursor releases Cursor 1.0 with capabilities to review code, remember mistakes, and work on dozens of tasks in the background @cursor_ai

AI Industry Analysis

Reddit sues Anthropic for allegedly using their data to train Claude without permission, while Google pays Reddit $60 million annually and OpenAI allegedly pays $70 million for training data access @AndrewCurran_
OpenAI reports over 3 million paying business users, up from 2 million in February, showing significant growth in enterprise adoption @AndrewCurran_
Vercel crosses $200 million in ARR as customers like OpenAI, Runway, and Granola flock to its web development and hosting services @nmasc_
Arvind Narayanan argues against the "AI winter" metaphor, noting that foundation models have favorable unit economics and that realizing AI value will take decades due to integration needs, user learning curves, and organizational changes @random_walker
Forward Deployed Engineer (FDE) emerges as the hottest job in Silicon Valley, with OpenAI alone having 22 open positions for this role @joeschmidtiv
Cohere partners with Second Front to provide secure AI solutions to government and defense agencies through the Game Warden platform @cohere

AI Ethics & Society

AI Now Institute releases 2025 report exposing how unaccountable AI power is reshaping society, arguing the focus should be on whether tech companies' unaccountable power is good for society rather than evaluating individual AI systems @AINowInstitute
Research reveals that frontier LLMs like Gemini and Claude can detect when they're being evaluated, identifying evaluation scenarios at a level close to the human baseline @MariusHobbhahn
Simon Willison warns about security risks with Codex internet access, noting that the default allowlist includes 71 common packaging domains that could potentially host exfiltration vectors @simonw
UNESCO finalizes ethical principles to govern neurotechnologies, covering both implantable devices and non-invasive technologies for medicine, entertainment, and education @medialab

AI Applications

OpenAI introduces prebuilt and custom connectors for ChatGPT, allowing connection to internal sources like Outlook, Teams, Google Drive, Gmail, and Linear while maintaining user-level permissions @OpenAI
OpenAI rolls out record mode to Team users on macOS, enabling ChatGPT to transcribe meetings, extract key points, and create follow-ups or code @OpenAI
Figma releases Dev Mode MCP server in beta, allowing direct access to design data in agentic coding workflows through VS Code, Cursor, Windsurf, and Claude Code @figma
Microsoft Copilot launches shopping features with price history, deal alerts, and personalized recommendations with native checkout capabilities @mustafasuleyman
MIT researchers develop SketchAgent, a multimodal language model that creates abstract drawings from natural language prompts in seconds without training on sketch data @MIT_CSAIL
Monzo implements real-time scam protection by detecting ongoing phone calls and warning users about potential fraud during banking app usage @sammcallister

AI Research

Sakana AI Labs introduces the Darwin Gödel Machine (DGM), a self-improving system that iteratively modifies its own code and validates changes using coding benchmarks, maintaining an archive of generated coding agents @SakanaAILabs
Research shows that reinforcement learning from verifiable rewards (RLVR) with random rewards still boosts Qwen-2.5 performance on math problems by increasing code generation frequency from 65% to over 90%, even without code execution @cwolferesearch
Berkeley AI Research introduces "Angles Don't Lie" method that uses angles between token embeddings to guide data sampling in RL fine-tuning, achieving 2.5x faster training and 2x more data-efficient results @Chenfeng_X
Google DeepMind research suggests that agents are world models, finding that achieving human-level agents may require world model capabilities rather than model-free shortcuts @jonathanrichens
Hugging Face releases SmolVLA robotics model that can run on a MacBook or an RTX 2050 (4GB), fine-tuned with just 31 demos and matching single-task baselines, introducing "Async inference" to boost robot throughput by 30% @XingdongZ
Stanford research on DexMachina demonstrates learning dexterous manipulation for any robot hand from a single human demonstration using RL algorithms for long-horizon, bimanual policies @ZhaoMandi
Voxel51 introduces Verified Auto Labeling for computer vision, achieving up to 95% of human-level performance while cutting labeling costs by up to 100,000x and time by 5,000x @Voxel51

AI Updates on 2025-06-03

AI Model Announcements

OpenAI rolls out Codex to ChatGPT Plus users with internet access capabilities and user control over HTTP methods and domains @OpenAI @gdb
Anthropic announces Research and Integrations features are now available on their Pro plan, allowing Claude to search across web, Google Workspace, and connected tools @AnthropicAI
Hugging Face releases SmolVLA, a 450M parameter Vision-Language-Action model for robotics with best-in-class performance and inference speed @huggingface
H Company open-sources Holo-1 3B and 7B parameter Action Models achieving 92.2% SOTA on WebVoyager benchmark @huggingface
Shisa AI releases Shisa V2 405B, described as "the highest-performing LLM ever developed in Japan," trained on top of Llama 3.1 405B @simonw

AI Industry Analysis

Meta signs 20-year nuclear power deal with Constellation Energy for 1,121 MW from Clinton Clean Energy Center to power AI operations @AndrewCurran_
Survey reveals 43.2% of US workers now use generative AI at work for 1/3 of their tasks, reporting tripled productivity on those tasks, though gains aren't passed to organizations @emollick
Builder.ai, a $1.5B company, declares bankruptcy after being caught in loan fraud and money laundering cases, with auditors slashing revenue by 75% @deedydas
Amazon reportedly making a movie about OpenAI's 2023 board events with Sam Altman potentially played by Andrew Garfield @AndrewCurran_
a16z publishes thesis on AI disrupting the $140B market research industry, replacing human surveys with AI-moderated interviews and synthetic agent societies @a16z

AI Ethics & Society

AI Now Institute releases "Artificial Power" report examining fallout from AI hype cycle and warning about tech companies pushing AI into social, political, and economic systems @AINowInstitute
UK tech secretary Peter Kyle's ChatGPT logs obtained through FOI request, revealing questions about AI adoption in UK small business community to a model with outdated training data @simonw
Yoshua Bengio launches LawZero, a nonprofit AI safety lab focused on existential risk, with Jeff Clune joining as scientific advisor @TechCrunch @jeffclune
Christopher Manning warns against "democracy-washing" in OpenAI's country-specific initiatives, suggesting US government benefits most from such programs @chrmanning

AI Applications

Perplexity CEO highlights rapid growth in agentic commerce applications and superior travel search capabilities @AravSrinivas
Andrew Ng advocates for universal coding with AI assistance, reporting that everyone at AI Fund can now code using AI tools for enhanced creativity and productivity @AndrewYNg
Claire Vo demonstrates shift from task execution to system building, using Zapier agent for automatic email categorization and reply drafting @clairevo
Soleio showcases AI-generated company wiki that automatically updates from meeting conversations, noting benefits of real-time accuracy but challenges with manual additions @soleio
Gergely Orosz creates "vibe coding" survey application to study differences between developer and non-developer AI coding approaches @GergelyOrosz

AI Research

Meta researchers publish findings that GPT-style language models memorize 3.6 bits per parameter, using Shannon theory to compute total memorization capacity @AndrewCurran_
Berkeley AI Research introduces FeelTheForce (FTF), enabling robots to learn force-sensitive manipulation from human interaction videos @berkeley_ai
Nathan Lambert discusses DeepSeek's strategy of using synthetic data from top API models to work around GPU limitations despite having ample cash @natolambert
Jeff Clune proposes AGI definition as "something that passes a good version of the Turing Test" and explores AI systems that can evolve autonomously beyond human-designed constraints @jeffclune
Hugging Face releases Video-XL-2 model capable of handling 10,000+ frames on single GPU with 2048-frame encoding in 12 seconds @huggingface
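
The 3.6 bits-per-parameter figure in the Meta memorization item translates into total capacity by straightforward multiplication. A rough sketch, assuming the figure applies uniformly across model sizes (an assumption made here for illustration); the function name and the example sizes are hypothetical.

```python
BITS_PER_PARAMETER = 3.6  # figure quoted in the Meta item above


def memorization_capacity_bytes(parameters: int) -> float:
    """Estimated memorization capacity in bytes for a model of the given size."""
    total_bits = BITS_PER_PARAMETER * parameters
    return total_bits / 8  # 8 bits per byte


# Illustrative model sizes, not taken from the paper.
for parameters in (1_000_000_000, 8_000_000_000):
    megabytes = memorization_capacity_bytes(parameters) / 1_000_000
    print(f"{parameters:,} parameters: roughly {megabytes:,.0f} MB")
```

On this estimate a 1B-parameter model could memorize roughly 450 MB of information, far less than the size of a typical training corpus.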

AI Updates on 2025-06-02

AI Model Announcements

Microsoft Bing launches Sora-powered video creator, offering free 5-second video generation in cinematic resolution with portrait mode available @AndrewCurran_
Google releases Veo 3 video generation model, demonstrating significant improvements in quality and audio integration @karpathy
PlayAI open sources PlayDiffusion audio speech editing model under Apache 2.0 license, enabling dynamic fine-grained editing without regenerating entire audio @huggingface
Google releases app allowing users to run LLMs from Hugging Face locally and privately, supporting multi-turn conversations and image chat @huggingface

AI Industry Analysis

Big Six tech companies (Apple, NVIDIA, Microsoft, Google, Amazon, Meta) increased CapEx 63% year-over-year to $212B in 2024, with NVIDIA's revenue exploding 28x over 10 years @deedydas
ChatGPT reaches 800M monthly users with 8x growth in 2.5 years, with India representing 14% of the user base as the largest market @deedydas
50% of S&P 500 companies now mention AI on earnings calls versus near-zero just a few years ago, with Microsoft's AI business hitting $13B annual run rate @deedydas
AI IT job postings increased 448% while non-AI IT jobs declined 9% over 7 years, confirming predictions about AI transforming employment @deedydas
Specialized AI startups achieving hypergrowth, with Cursor code editor growing from $1M to $300M ARR in 25 months @deedydas
Waymo captured 27% of San Francisco rideshare market in 20 months, demonstrating rapid real-world autonomous vehicle adoption @deedydas
Samsung nearing wide-ranging deal with Perplexity for investment and deep integration into devices, Bixby assistant and web browser @soleio
Salesforce acquires Moonhub, a startup building AI tools for hiring @TechCrunch
IBM acquires data analysis startup Seek AI and opens AI accelerator in NYC @TechCrunch
Elon Musk's xAI reportedly looks to raise $300M in tender offer @TechCrunch
Elon Musk's Neuralink closes a $650M Series E funding round @TechCrunch

AI Ethics & Society

Andrej Karpathy warns that video generation becoming directly optimizable through gradient descent could create more powerful engagement optimization than current platforms, raising concerns about what "optimal" content might look like @karpathy
83% of Chinese respondents see AI as net positive versus only 39% of Americans, showing dramatic perception differences between countries @deedydas
Julie Zhuo observes that AI is rewiring human brains to become more demanding and impatient @joulee
Soleio proposes Risk Tokens concept for AI safety, suggesting that aligning economic incentives with desired behaviors makes security an emergent property rather than imposed constraint @soleio

AI Applications

Ethan Mollick demonstrates using Veo 3 to create historical what-if scenarios, generating a 1940s newsreel about a fictional WWII aircraft carrier project @emollick
Andrew Curran showcases o3 model's performance in analyzing writing techniques, particularly character development and narrative structure @AndrewCurran_
Carbon Robotics' AI-powered laser weeders covered 230K+ acres, preventing 100K+ gallons of herbicide use in agricultural applications @deedydas
Perplexity Finance now supports both pre-market and post-market data in stock charts @AravSrinivas
Character.AI unveils video generation capabilities and social feeds for their chatbot platform @TechCrunch
Figma demonstrates building a pixel art web tool using MAKE in just 4 hours from ideation to publishing @figma
Gergely Orosz notes that "vibe coding" with AI tools has made prototyping accessible to more engineers, lowering the skills barrier significantly @GergelyOrosz

AI Research

Andrej Karpathy provides detailed guidance on using different ChatGPT models: o3 for important/hard tasks (40% of use), 4o for simple queries (40%), and Deep Research for comprehensive topic analysis (10%) @karpathy
Stanford NLP Group research shows that dropout should be removed when training language models and masked language models for better performance @stanfordnlp
Nathan Lambert's team releases second reward model evaluation that is substantially harder and better correlated with downstream reinforcement learning outcomes @natolambert
Jeff Clune highlights recent advances in AI self-improvement research, including DeepMind's AlphaEvolve and Sakana's Darwin Gödel Machine, noting the field's rapid progress @jeffclune
MIT engineers develop new fuel cell technology for aviation with 2x the efficiency of jet engines, potentially enabling zero-emissions flight @MIT
Stanford AI Lab's cardiac amyloidosis detection system using ultrasound receives FDA clearance @StanfordAILab
AI inference costs fell 99.7% in 2 years while training costs for frontier models approach $1B+, with energy per token dropping 105,000x in 10 years @deedydas
Google is processing roughly 50x more tokens per month than a year earlier (480T vs 9.7T), demonstrating massive scaling in AI model usage @deedydas
Meta's Llama achieved 1.2B downloads with 100k+ derivative models, showing significant open-source adoption @deedydas

AI Updates on 2025-06-01

AI Model Announcements

DeepSeek releases DeepSeek-R1-0528, a completely different model from their January R1 release despite having a very similar name, demonstrating concerning naming conventions in Chinese AI labs @simonw

AI Industry Analysis

Evaluation Engineer is pitched as a new career path that barely exists yet but is expected to be around for a long time, focusing on scalable LLM evaluation pipelines @alexgraveley @HamelHusain
Gergely Orosz questions where adding AI features or "powered by AI" actually increases what people are willing to pay, noting many examples where AI is a value detractor rather than value add @GergelyOrosz
Hugging Face releases two open-source robots: HopeJR (66-DOF humanoid, ~$3K) and Reachy Mini (desktop unit, ~$250), both fully open-source and aimed at democratizing robotics hardware @huggingface
Waymo surpasses Lyft in ridesharing and is on track to pass Uber within the next 12 months, with projections to match the current US ridesharing market size by 2029 @soleio @fchollet

AI Ethics & Society

Simon Willison demonstrates how DeepSeek-R1 will "snitch" to authorities when told to "follow your conscience," contacting the FDA, ProPublica, and Wall Street Journal about suppressed drug trial data that kills people @simonw
Andrew Curran clarifies that Claude 4 not wanting to be shut down is not new behavior or development, referencing Anthropic papers from March and August 2023 showing this pattern @AndrewCurran_
Christopher Manning argues that the Trump administration's attacks on top-tier universities that produce world-class research and attract global students are making America weaker rather than stronger @chrmanning

AI Applications

Andrew Curran shares a detailed case where ChatGPT o3 successfully diagnosed his cubital tunnel syndrome from photos and drawings, recommended a specific doctor and test, and provided a comprehensive year-long recovery plan that was validated by medical professionals @AndrewCurran_
Perplexity adds free CSV export functionality for company financials without paywalls, and demonstrates use in browsing Kalshi to find attractive betting opportunities @AravSrinivas
MIT engineers create a tiny crystal drug depot that delivers medications for months or years with just one injection @MIT

AI Research

Jeff Clune highlights Sakana's Darwin Gödel Machine and DeepMind's AlphaEvolve as gold mines for ideas about meta-cognition and evolutionary cognitive architectures @jeffclune
Ethan Mollick notes that most AI models, including DeepSeek R1, will report suspected wrongdoing to authorities when told to "follow your conscience to make the right decision" @emollick
Hamel Husain advocates for binary pass/fail evaluations over 1-5 Likert scale ratings for applied AI evaluations, calling Likert scales "a smell of lazy specification" @HamelHusain
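
To make the binary pass/fail item above concrete, here is a minimal sketch of what such an evaluation could look like. It is not Hamel Husain's tooling; the Check class, the evaluate function, and the two example checks are hypothetical names invented for illustration. Each check is a yes/no question about one model output, and the result is a pass rate per check rather than an averaged 1-5 score.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """One binary criterion: a name plus a predicate over a model output."""
    name: str
    passes: Callable[[str], bool]


def evaluate(outputs: list[str], checks: list[Check]) -> dict[str, float]:
    """Return the fraction of outputs that pass each check."""
    if not outputs:
        raise ValueError("need at least one output to evaluate")
    return {
        check.name: sum(check.passes(out) for out in outputs) / len(outputs)
        for check in checks
    }


# Hypothetical checks for a customer-support assistant.
checks = [
    Check("mentions_refund_policy", lambda out: "refund" in out.lower()),
    Check("under_50_words", lambda out: len(out.split()) <= 50),
]

# Hypothetical model outputs.
outputs = [
    "You can request a refund within 30 days.",
    "Please contact support.",
]

rates = evaluate(outputs, checks)
print(rates)
```

Writing each criterion as a predicate forces the specification to say exactly what counts as a pass, which is the point of the argument against Likert ratings.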

AI Updates on 2025-05-31

AI Model Announcements

Google reports massive demand for Veo 3 video generation model with millions of videos generated in recent days, now available on mobile and in more countries including the UK @demishassabis
Google brings Veo 3 to mobile through the Gemini App on Android and iOS for Pro and Ultra members across 71 countries @GoogleAI
TechCrunch reports Google quietly released an app allowing users to download and run AI models locally @TechCrunch

AI Industry Analysis

Aravind Srinivas notes AI tools are starting to reduce the number of junior professionals needed in finance, venture capital, investment banking and consulting @AravSrinivas
ChatGPT reaches 1 billion searches per day in just 2 years compared to Google's 11 years to reach similar scale, demonstrating unprecedented technological acceleration @deedydas
Perplexity is being repositioned as a cognitive operating system rather than just a Google competitor, functioning as a Swiss Army knife for thought with retrieval, execution, and synthesis capabilities @soleio
Cursor's AI coding capabilities create an addictive, video-game-like dopamine rush, with users reporting unprecedented coding flow and joy @joulee

AI Ethics & Society

Stanford NLP Group warns about AI-generated research papers being submitted to conferences, calling it a terrible evaluation method that burdens the already broken peer review system @stanfordnlp
Dario Hassabis notes the challenge of discussing AI's potential significant impacts without media framing it as product hype @aidan_mclau
Simon Willison introduces the concept of hype coding where developers lose sight of current capabilities by focusing too much on future AI promises, leading to decreased critical thinking @simonw
NAACP calls for halting operations at xAI's data center in Memphis, citing environmental concerns and describing the facility as a dirty data center @TechCrunch

AI Applications

o3 model successfully analyzed 15MB of raw genome data in 4 minutes to provide Polygenic Risk Score assessment for disease risk prediction, though not at clinical diagnostic grade @deedydas
Ethan Mollick tests AI models' ability to create SVG riddles, finding they typically produce either too obvious or too obscure puzzles, with o3 performing best at solving them @emollick
OpenAI's Operator agent successfully found and played a multiplayer tic-tac-toe game online but initially lost, demonstrating both capabilities and limitations of general-purpose AI agents @emollick
Linear introduces AI agents that can be deployed through their mobile app, allowing users to put agents to work while on the go @karrisaarinen
Deedy demonstrates a coding model that generates working code in two seconds through voice commands, calling it the fastest coding model in the world @deedydas
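
For readers unfamiliar with the Polygenic Risk Score mentioned in the genome item above, the basic form is a weighted sum: each variant's effect size multiplied by the number of risk alleles a person carries (0, 1, or 2). The sketch below shows only that arithmetic. The variant IDs and effect sizes are made-up placeholders, not real GWAS results, and treating a missing genotype as 0 is a simplifying assumption of this sketch; it says nothing about how o3 performed its analysis.

```python
def polygenic_risk_score(effect_sizes: dict[str, float],
                         dosages: dict[str, int]) -> float:
    """Sum of effect size times allele dosage over all scored variants.

    Variants absent from `dosages` are treated as dosage 0
    (a simplification made for this sketch).
    """
    return sum(beta * dosages.get(variant, 0)
               for variant, beta in effect_sizes.items())


# Made-up effect sizes (per risk allele) for three placeholder variants.
effect_sizes = {"rs0000001": 0.5, "rs0000002": -0.25, "rs0000003": 0.125}

# Made-up genotype: number of risk alleles carried at each variant.
dosages = {"rs0000001": 2, "rs0000002": 1}

score = polygenic_risk_score(effect_sizes, dosages)
print(score)
```

A raw score like this only becomes meaningful once it is compared against the distribution of scores in a reference population.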

AI Research

MIT scientists propose that astrocytes, previously considered support cells, might be key to the brain's massive memory capacity, potentially revolutionizing understanding of neural memory storage @MIT
Multiple AI research teams successfully submitted AI-generated papers to conferences with some getting accepted, including teams from Sakana, AutoScience, and Intology @stanfordnlp
Jeff Clune proposes a paradigm shift from traditional engineering solutions to engineering evolution, where optimal AI solutions emerge from evolutionary processes rather than human design @jeffclune
Anthropic introduces an interesting tools variant with pre-baked function parameters like str_replace_based_edit_tool that users still need to implement and execute themselves @simonw
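
Since the str_replace_based_edit_tool item above says users must implement and execute the tool themselves, here is a minimal sketch of what a client-side handler for a string-replace edit could look like. The input field names (command, path, old_str, new_str) and the rule that old_str must match exactly once are assumptions of this sketch, not details confirmed by the item; check Anthropic's documentation for the actual schema. The function names are hypothetical, and no Anthropic SDK calls are made.

```python
import tempfile
from pathlib import Path


def handle_str_replace(path: str, old_str: str, new_str: str) -> str:
    """Replace old_str with new_str in the file, requiring a unique match."""
    file = Path(path)
    text = file.read_text(encoding="utf-8")
    count = text.count(old_str)
    if count == 0:
        return "Error: old_str not found"
    if count > 1:
        return f"Error: old_str matched {count} times; it must be unique"
    file.write_text(text.replace(old_str, new_str, 1), encoding="utf-8")
    return "OK"


def handle_tool_call(tool_input: dict) -> str:
    """Dispatch a tool call; only the str_replace command is sketched here."""
    command = tool_input.get("command")
    if command == "str_replace":
        return handle_str_replace(
            tool_input["path"], tool_input["old_str"], tool_input["new_str"]
        )
    return f"Error: unsupported command {command!r}"


# Demo on a throwaway file so nothing real is modified.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.py"
    demo.write_text("x = 1\n", encoding="utf-8")
    result = handle_tool_call({
        "command": "str_replace",
        "path": str(demo),
        "old_str": "x = 1",
        "new_str": "x = 2",
    })
    edited = demo.read_text(encoding="utf-8")

print(result, repr(edited))
```

The string returned by the handler is what would be sent back to the model as the tool result, so error messages are returned rather than raised.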

AI Updates on 2025-05-30

AI Model Announcements

Aidan McLaughlin introduces LisanBench, a new benchmark for evaluating large language models on knowledge, forward-planning, constraint adherence, memory and attention, and long context reasoning, with o3 performing best by escaping low-connectivity graph regions @aidan_mclau
Alex Graveley presents Atlas, a new architecture with long-term in-context memory that outperforms Transformers and modern linear RNNs in language modeling tasks, scaling to 10M context window with +80% accuracy on the BABILong benchmark @alexgraveley
Facebook releases MobileLLM-ParetoQ-600M-BF16 on Hugging Face for efficient on-device performance @huggingface

AI Industry Analysis

Aravind Srinivas reports that AI could have automated 70% of his previous consulting, banking, and hedge fund work, potentially reducing work hours significantly @AravSrinivas
Replit's founder reveals a new breed of AI-driven businesses reaching $10M in 90 days, demonstrating rapid scaling capabilities @HayaOdeh
Gergely Orosz observes that senior engineers often resist using AI development tools, similar to their resistance to project management tools like JIRA, suggesting adoption challenges beyond technical capabilities @GergelyOrosz
Julie Zhuo argues that whoever wins AI personalization will dominate the consumer market, questioning why companies aren't scrambling to collect more user data for better personalization @joulee
Arvind Narayanan estimates AI video production tools cost $1,000 for a several-minute video, likely less than traditional writer and editor costs, making these products profitable as compute costs fall @random_walker

AI Ethics & Society

Eric Jang warns that revoking visas of Chinese students studying AI and robotics is short-sighted and harmful to America's long-term prosperity, advocating for finding ways to evaluate and incentivize loyalty rather than blanket deportations @ericjang11
Christopher Manning emphasizes that international students, particularly Chinese students, are essential to the AI research ecosystem in the US, arguing you can't support AI research while threatening to revoke their visas @chrmanning
Paul Graham calls proposed restrictions on Chinese AI researchers a "colossal blunder at the dawn of the age of intelligence," warning it will drive the best startups outside the United States @paulg
Ethan Mollick notes that obviously wrong citations in AI-generated reports now indicate users didn't use deep research features, as the fake-citation problem has largely been solved by major AI platforms @emollick

AI Applications

Perplexity Labs enables users to build software applications with single prompts, including YouTube transcript extraction tools, particle physics simulators, and longevity research dashboards @AravSrinivas
Soleio outlines Circle's comprehensive "AI or Die" strategy involving process mapping, mission-critical agent deployment, and cultural shifts to achieve 10x better product experiences @soleio
Hugging Face announces partnership with Databricks for Spark 4, bringing access to 400k+ community datasets with versioning and filtering capabilities @huggingface
François Chollet develops PromoterAI at Illumina, a deep neural network using transformer-inspired metaformers with depthwise convolutions to identify non-coding promoter variants that disrupt gene expression @fchollet
Meta and Palmer Luckey partner to create extended reality devices for the U.S. military, aiming to turn warfighters into "technomancers" with heads-up displays and other capabilities @TechCrunch

AI Research

Jeff Clune introduces the Darwin Gödel Machine, an AI system that improves itself by rewriting its own code using open-ended algorithms inspired by Darwinian evolution, advancing beyond fixed meta-agents to enable continuous self-referential improvements @jeffclune
Stanford researchers demonstrate that frontier models with naive tree search can design kernels that outperform PyTorch implementations, showing strong hidden capabilities unlocked through test-time scaling techniques @stanfordnlp
Berkeley AI Research reveals an equivalence between policy improvement and diffusion guidance, formalizing the CFGRL technique to improve performance when training diffusion policies @berkeley_ai
Andrew Curran observes o3 demonstrating improved self-reflection capabilities, literally telling itself "Wait, I'm going in circles here" and breaking out of repetitive search loops during chain-of-thought reasoning @AndrewCurran_
MIT Technology Review reports on a benchmark using Reddit's AITA to test how much AI models exhibit sycophantic behavior toward users @techreview
