---
### **OpenAI o3-mini: Cost-Efficient Reasoning Model**
OpenAI launched **o3-mini**, its most cost-efficient reasoning model optimized for STEM tasks (science, math, coding). Key features:
- **24% faster** than o1-mini (7.7s avg response vs 10.16s) with **39% fewer errors** [1][12][15]
- Supports **structured outputs**, function calling, and three reasoning effort modes (low/medium/high); see the API sketch after this list [22][119]
- Available to **free ChatGPT users** via "Reason" mode and API ($1.10/million input tokens) [116][120]
- Preferred over o1-mini in 56% of head-to-head tests, but trails DeepSeek-R1 on price ($0.55/million input tokens) [15][118]
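A minimal API sketch, assuming the official OpenAI Python SDK and an `OPENAI_API_KEY` in the environment; the prompt is illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# reasoning_effort trades latency/cost for deeper reasoning: "low", "medium", or "high"
response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="medium",
    messages=[{"role": "user", "content": "Find all integer solutions of x^2 - y^2 = 17."}],
)
print(response.choices[0].message.content)
```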
---
### **Alibaba's Qwen2.5-Plus Update**
Alibaba upgraded its **Qwen Chat** with:
- **Qwen2.5-Plus-0125-Exp** model using advanced post-training techniques [2][24]
- **10,000-character text input** and PDF/DOCX file support [2][128]
- Flexible mode switching (web search, normal, etc.) in a single session [2][126]
- Outperforms GPT-4o and Gemini in document analysis but lags behind in multilingual tasks (API sketch below) [24][129]
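For programmatic access, Qwen models are also served through Alibaba Cloud's OpenAI-compatible DashScope endpoint. A minimal sketch; the experimental model ID from the announcement is an assumption, so check the DashScope console for the exact name:

```python
import os
from openai import OpenAI

# DashScope exposes an OpenAI-compatible API surface.
client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

response = client.chat.completions.create(
    model="qwen-plus",  # the "qwen2.5-plus-0125-exp" ID is assumed, not verified
    messages=[{"role": "user", "content": "Summarize this contract clause in three bullet points: ..."}],
)
print(response.choices[0].message.content)
```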
---
### **OpenAI's Deep Research Agent**
New **AI research agent** synthesizes web data into reports:
- Powered by a version of the **o3 model**; analyzes text, images, and PDFs in minutes [3][131][137]
- Generates citations and summaries, available for **ChatGPT Pro** users [44][135]
- Takes 5–30 minutes per query, with lower hallucination rates than standard ChatGPT responses [137]
---
## **⚡️ Trending Signals**
### **Google DeepMind: RL > Supervised Fine-Tuning**
- **SCoRe** (Self-Correction via RL) improves self-correction accuracy on math and coding benchmarks by 15.6% and 9.1% over supervised methods [4][48]
- Training with RL on the model's own **self-correction traces** avoids the bias and distribution shift of supervised fine-tuning on curated corrections (reward-shaping sketch below) [53][140]
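As a rough illustration of the reward-shaping idea behind SCoRe (not the paper's actual training loop), the second attempt is rewarded for final correctness plus a shaped term that favors wrong-to-right corrections; all names here are illustrative:

```python
def shaped_reward(first_correct: bool, second_correct: bool, alpha: float = 0.5) -> float:
    """Reward for a (first attempt, self-correction) pair.

    The base reward comes from the final answer; the shaping term pays a
    bonus for wrong->right corrections and penalizes right->wrong
    regressions, discouraging a policy that never changes its answer.
    """
    reward = 1.0 if second_correct else 0.0
    if second_correct and not first_correct:
        reward += alpha   # progress bonus
    elif first_correct and not second_correct:
        reward -= alpha   # regression penalty
    return reward
```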
### **NVIDIA Eagle2-9B Vision-Language Model**
- **92.6% accuracy on DocVQA**, surpassing GPT-4V (88.4%) [5][59]
- Trained on 180+ sources with transparent data strategy [64][68]
- An 80% smaller dataset (4.6M samples) maintains SOTA performance; a loading sketch follows [5]
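A hedged loading sketch via Hugging Face Transformers; the remote-code pattern below is assumed from common custom-VLM releases on the Hub, so defer to the model card for exact inference usage:

```python
import torch
from transformers import AutoModel, AutoProcessor

# Custom VLM architectures on the Hub ship their own modeling code,
# hence trust_remote_code=True.
model = AutoModel.from_pretrained(
    "nvidia/Eagle2-9B",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("nvidia/Eagle2-9B", trust_remote_code=True)
```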
### **Meta's EvalPlanner for LLM Judges**
- **Three-stage evaluation** (plan → execute → judge) improves fairness in AI assessments [6][75]
- Optimized on synthetic preference pairs, it outperforms human evaluators on coding/math tasks (minimal sketch below) [71][78]
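A minimal sketch of the plan → execute → judge pattern, with a generic `llm(prompt) -> str` callable standing in for any chat model; EvalPlanner's actual prompts and training procedure are more involved:

```python
def three_stage_judge(llm, instruction: str, response_a: str, response_b: str) -> str:
    """Pairwise preference judgment in three stages: plan, execute, judge."""
    plan = llm(
        "Write a short evaluation plan (criteria and steps) for judging "
        f"responses to this instruction:\n{instruction}"
    )
    execution = llm(
        f"Follow the plan step by step on both responses.\n\nPlan:\n{plan}\n\n"
        f"Response A:\n{response_a}\n\nResponse B:\n{response_b}"
    )
    verdict = llm(f"Based on this analysis, reply with exactly 'A' or 'B'.\n\n{execution}")
    return verdict.strip()
```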
---
## **💻 Top Tutorials**
1. **LangGraph Multi-Agent Workflows**
   - Build **supervisor-routed agent systems** with web research, RAG, and NL2SQL nodes (sketch after this list) [7][87]
2. **Reduce DeepSeek-R1 Size by 80%**
   - **1.58-bit dynamic quantization** shrinks the 671B model from 720GB to 131GB (loading sketch after this list) [93][97]
   - Maintains usable quality at 7-8 tokens/sec on an RTX 4090 [95]
3. **Gradio/Hugging Face Apps**
   - Deploy ML demos with text summarization, image captioning, and LLM chat (summarization demo below) [106][115]
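For tutorial 1, a minimal LangGraph supervisor sketch. The keyword-based router and the node bodies are placeholders where real web-search, RAG, and NL2SQL calls would go:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    route: str
    answer: str

def supervisor(state: State) -> dict:
    # A real supervisor would call an LLM to route; this stub uses keywords.
    q = state["question"].lower()
    route = "sql" if "table" in q else ("web" if "latest" in q else "rag")
    return {"route": route}

def web_research(state: State) -> dict:
    return {"answer": f"[web] findings for: {state['question']}"}

def rag(state: State) -> dict:
    return {"answer": f"[rag] retrieved context for: {state['question']}"}

def nl2sql(state: State) -> dict:
    return {"answer": f"[sql] query generated for: {state['question']}"}

graph = StateGraph(State)
graph.add_node("supervisor", supervisor)
graph.add_node("web", web_research)
graph.add_node("rag", rag)
graph.add_node("sql", nl2sql)
graph.add_edge(START, "supervisor")
graph.add_conditional_edges("supervisor", lambda s: s["route"],
                            {"web": "web", "rag": "rag", "sql": "sql"})
for node in ("web", "rag", "sql"):
    graph.add_edge(node, END)

app = graph.compile()
print(app.invoke({"question": "What are the latest o3-mini benchmarks?"})["answer"])
```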
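For tutorial 2, a sketch of loading a dynamically quantized GGUF with the `llama-cpp-python` bindings; the repo layout and shard filename follow Unsloth's published pattern but should be treated as assumptions:

```python
from huggingface_hub import snapshot_download
from llama_cpp import Llama

# Download only the 1.58-bit shards (directory name assumed from Unsloth's repo).
path = snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",
    allow_patterns=["DeepSeek-R1-UD-IQ1_S/*"],
)

# Point at the first shard; llama.cpp picks up the remaining shards automatically.
llm = Llama(
    model_path=f"{path}/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf",
    n_gpu_layers=7,   # offload what fits in 24GB VRAM; tune for your GPU
    n_ctx=2048,
)
print(llm("Explain quantization in one sentence.", max_tokens=64)["choices"][0]["text"])
```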
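And for tutorial 3, a self-contained Gradio demo wrapping a Transformers summarization pipeline; the checkpoint choice is illustrative:

```python
import gradio as gr
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

def summarize(text: str) -> str:
    # The pipeline returns a list of dicts with a "summary_text" key.
    return summarizer(text, max_length=120, min_length=30, do_sample=False)[0]["summary_text"]

demo = gr.Interface(
    fn=summarize,
    inputs=gr.Textbox(lines=10, label="Article text"),
    outputs=gr.Textbox(label="Summary"),
    title="Text Summarizer",
)
demo.launch()  # add share=True for a public link
```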
---
**Key Takeaway**: OpenAI and Chinese rivals (DeepSeek, Alibaba) are racing to optimize cost, speed, and reasoning—with reinforcement learning and quantization emerging as critical tools.