AI Updates on 2025-11-23
AI Model Announcements
- Google releases Gemini 3 with significant improvements, described as a major advancement comparable to GPT-4's impact, with particularly notable progress in the Nano Banana Pro variant @AndrewCurran_
- Gemini Nano Banana Pro demonstrates advanced multimodal capabilities by solving exam questions directly within exam page images, including handling doodles and diagrams @karpathy
- Nano Banana Pro shows sophisticated visual understanding by identifying color names written in crayons with incorrect colors and detecting red-ink stamps marking errors @goodside
- Tesla announces plans to bring new AI chip designs to volume production every 12 months, with AI4 currently deployed in cars, AI5 close to tape-out, and AI6 in early development, expecting to build chips at higher volumes than all other AI chips combined @elonmusk
AI Industry Analysis
- Sam Altman highlights rapid progress of the Codex team, predicting they will create the most important product in the AI coding space and enable significant downstream work @sama
- OpenAI announces strategic collaboration with Emirates, including enterprise-wide deployment of ChatGPT Enterprise @gdb
- Soumith Chintala observes that the Gemini 3 release represents a moment comparable to GPT-4, with Google appearing invulnerable due to their ecosystem advantages including TPUs, Android, and Chrome, while noting Anthropic quietly dominates code without creating similar moments @soumithchintala
- Alex Graveley predicts that intelligence being metered will exponentially improve every algorithm for understanding complex data, including recommendation systems, fraud detection, images, feeds, ads, and quantitative analysis @alexgraveley
- Matthew Kruer reports Sierra as the most successful enterprise AI deployment, emphasizing the importance of partnering with AI thought leaders for traditional enterprises that lack core tech competency and access to leading AI talent @matthew_kruer
- Insurance industry professionals state that AI is too risky to insure, highlighting concerns about liability and risk assessment in AI deployment @TechCrunch
- Hyperliquid, a decentralized crypto derivatives exchange, is described as the most efficient business globally, with approximately $1.1 billion in annual net income and only 11 employees, compared to Nasdaq making a similar amount with 800 times as many employees @deedydas
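The per-employee comparison in the Hyperliquid item can be worked out directly. The sketch below uses only the figures quoted in the post and makes two assumptions: that "similar" net income means roughly equal, and that "800 times" is a headcount multiplier. The implied Nasdaq numbers are therefore illustrative, not reported figures.

```python
# Figures quoted in the item (approximate).
hyperliquid_net_income = 1_100_000_000  # USD per year
hyperliquid_employees = 11

# Assumption: "800 times" is read as a multiplier on headcount,
# and Nasdaq's net income is taken as equal to Hyperliquid's.
employee_ratio = 800

net_income_per_employee = hyperliquid_net_income / hyperliquid_employees
implied_nasdaq_employees = hyperliquid_employees * employee_ratio
implied_nasdaq_per_employee = hyperliquid_net_income / implied_nasdaq_employees

print(f"Hyperliquid: ${net_income_per_employee:,.0f} per employee")
print(f"Implied Nasdaq headcount: {implied_nasdaq_employees:,}")
print(f"Implied Nasdaq: ${implied_nasdaq_per_employee:,.0f} per employee")
```

Under these assumptions the gap is the same factor of 800 quoted in the post, since the net income is held equal on both sides.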
AI Ethics & Society
- TechCrunch reports on families claiming that ChatGPT interactions led to tragedy, raising concerns about AI's psychological impact on vulnerable users @TechCrunch
- Francois Chollet observes that propaganda accounts were visibly based out of US adversary countries and logged in with local IP addresses, suggesting intelligence services didn't care about hiding their operations @fchollet
- Gergely Orosz notes the internet is becoming less trustworthy: AI makes it cheap to generate realistic images and videos, and X's decision to turn blue checks into a subscription product with no verification has reduced trust on social networks @GergelyOrosz
- Tuhin Chakraborty discusses EMF-based intelligence making people sense things that don't exist, comparing it to concepts from Peter Watts' novel Blindsight @tuhin
AI Applications
- Andrej Karpathy develops an llm-council web app that dispatches queries to multiple models including GPT-5.1, Gemini 3 Pro, Claude Sonnet 4.5, and Grok-4, where models review and rank each other's anonymized responses before a Chairman LLM produces the final response @karpathy
- Ethan Mollick demonstrates Nano Banana Pro creating a complete comic adaptation of Tennyson's Ulysses on the first try when given the poem in four pieces, as well as generating Ancient Greek pottery style versions @emollick
- Perplexity ships candlestick charts for tracking volatility and momentum of stock tickers, moving toward parity with Terminal functionality @AravSrinivas
- Claire Vo reports that ChatPRD's number one competitor is generic LLMs, with the top review comment being that its PRDs are much better than those generated by other LLMs @clairevo
- Andrej Karpathy suggests that talking to LLMs via text is like typing into a DOS terminal before the GUI was invented, proposing that the GUI equivalent is an intelligent canvas @karpathy
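The llm-council flow described in the Karpathy item above (dispatch, anonymized peer ranking, Chairman synthesis) can be sketched in a few lines. This is a minimal illustration, not the app's actual code: the models are hypothetical stand-in functions rather than real API clients, the name `run_council` is invented here, and aggregating reviews by summed rank position is an assumption.

```python
from string import ascii_uppercase
from typing import Callable, Dict, List

# Hypothetical stand-ins for real model clients.
Answerer = Callable[[str], str]                         # query -> answer
Reviewer = Callable[[str, Dict[str, str]], List[str]]   # -> labels, best first
Chairman = Callable[[str, Dict[str, str], List[str]], str]


def run_council(query: str,
                answerers: Dict[str, Answerer],
                reviewers: Dict[str, Reviewer],
                chairman: Chairman) -> dict:
    # Stage 1: every council member answers independently.
    answers = {name: answer(query) for name, answer in answerers.items()}

    # Stage 2: hide model identities behind neutral labels (up to 26 members).
    label_to_model = {f"Response {ascii_uppercase[i]}": name
                      for i, name in enumerate(answers)}
    anonymized = {label: answers[name] for label, name in label_to_model.items()}

    # Stage 3: each reviewer ranks the labeled responses; lower total is better.
    # Summing rank positions is an assumed aggregation rule.
    totals = {label: 0 for label in anonymized}
    for review in reviewers.values():
        for position, label in enumerate(review(query, anonymized)):
            totals[label] += position
    ranking = sorted(totals, key=lambda label: (totals[label], label))

    # Stage 4: the chairman sees the answers and the aggregate ranking.
    final = chairman(query, anonymized, ranking)
    return {"final": final, "ranking": [label_to_model[l] for l in ranking]}


# Toy usage: reviewers prefer longer answers, the chairman echoes the top one.
prefer_longer = lambda q, a: sorted(a, key=lambda label: -len(a[label]))
result = run_council(
    "What is 2 + 2?",
    answerers={"model_a": lambda q: "4", "model_b": lambda q: "2 + 2 equals 4."},
    reviewers={"model_a": prefer_longer, "model_b": prefer_longer},
    chairman=lambda q, a, ranking: a[ranking[0]],
)
print(result)
```

Passing only labels such as "Response A" to the reviewers is what keeps the ranking anonymous; the mapping back to model names happens after the votes are tallied.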
AI Research
- Hamel Husain criticizes eval tools that promote generic metrics like Affirmation, Brevity, and Levenshtein distance, arguing they represent poor data literacy and waste engineering cycles by chasing vanity metrics instead of defining metrics tailored to observed failure modes @HamelHusain
- Harrison Chase emphasizes that the best evals are almost always completely custom datasets and custom metrics, comparing good evals to a PRD for your app: you would not use one written by someone else @hwchase17
- Ethan Mollick observes that voice modes for AI only access weak models with low latency, making them fun but kind of useless for serious work, suggesting voice AI got stuck in a dead end of fun chat with no exploration of better approaches @emollick
- Andrej Karpathy's LLM council experiments show models are surprisingly willing to select another LLM's response as superior to their own, with models consistently praising GPT-5.1 as the best and most insightful while selecting Claude as the worst @karpathy
- Simon Willison writes detailed notes on trying OLMo 3 models (the 32B thinking model and 7B instruct model) via LM Studio, emphasizing the importance of transparent training data @simonw
- Francois Chollet advocates for JAX as providing a huge competitive advantage, recommending Keras 3 with JAX backend and KerasHub for easy adoption with access to Hugging Face models @fchollet
- Nathan Lambert identifies 13 serious open model builders in the U.S., whose models are much smaller than those of Chinese competitors and often carry worse licenses, and plans to create a full tier list for the ATOM Project @natolambert
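The point made in the eval items above by Hamel Husain and Harrison Chase, that generic metrics can miss the failure that matters, can be illustrated with a small sketch. The failure mode (a support reply citing an order ID that is not in the customer's record), the `ORD-1234` ID format, and the function `mentions_only_known_orders` are all hypothetical, chosen only to show the contrast with Levenshtein distance.

```python
import re
from typing import Set


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical ID format for the illustration.
ORDER_ID = re.compile(r"\bORD-\d{4}\b")


def mentions_only_known_orders(output: str, known_ids: Set[str]) -> bool:
    """Custom check for one observed failure mode: citing a wrong order ID."""
    found = set(ORDER_ID.findall(output))
    return bool(found) and found <= known_ids


reference = "Your order ORD-1234 ships Monday."
good = "Order ORD-1234 will ship on Monday."   # reworded, but correct
bad = "Your order ORD-1243 ships Monday."      # near-identical, wrong order

for name, output in [("good", good), ("bad", bad)]:
    print(name,
          "edit distance:", levenshtein(output, reference),
          "custom check:", mentions_only_known_orders(output, {"ORD-1234"}))
```

In this toy case the edit distance favors the reply with the wrong order ID because it is textually closer to the reference, while the tailored check flags it.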