ML Systems Textbook
Free eBook: Machine Learning Systems
Principles and Practices of Engineering Artificially Intelligent Systems
This textbook bridges the gap between theoretical foundations and practical engineering, emphasizing the systems perspective required to build effective AI solutions.
www.mlsysbook.ai
01.11.2025 02:04
Paper: arxiv.org/abs/2510.25781
Repo: github.com/AmirNoori68/...
31.10.2025 23:04
A Practitioner's Guide to Kolmogorov-Arnold Networks
A systematic and comprehensive overview of the rapidly expanding KAN landscape, moving beyond simple performance comparisons to offer a structured synthesis of theoretical foundations, architectural variants, and practical implementation strategi
31.10.2025 23:04
Google: TPU + JAX and killing it.
Meta: ??? + PyTorch ...and you had a plan for years, maybe even a decade; where's the execution? ...perhaps partner with Microsoft?
31.10.2025 20:57
On the left, the refurbished Lincoln bathroom. On the right, picture I took in Saddam Hussein's palace in Basra in 2005.
31.10.2025 18:00
Google's Supervised Reinforcement Learning (SRL), a method designed to teach LLMs complex reasoning skills from expert demonstrations, for problems that are too difficult for conventional RL or SFT approaches.
"Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning"
31.10.2025 19:47
Making Decisions under Model Misspecification
We use decision theory to confront uncertainty that is sufficiently broad to incorporate "models as approximations." We presume the existence of a featured collection of what we call "structured model...
To address this "misspecification fear", they propose a decision criterion that evaluates outcomes by minimizing across alternative "unstructured models" while imposing a penalty based on the Hausdorff statistical set distance (C(p,Q)) from the original set Q.
arxiv.org/abs/2008.01071
31.10.2025 19:39
In economics and science, "model misspecification" means that the models we use to predict the future or guide decisions are always simplifications and approximations, so they are inherently "wrong": they never perfectly represent the complex real world.
31.10.2025 19:39
Project: emu.world
Model: huggingface.co/collections/...
Github: github.com/baaivision/E...
Paper: arxiv.org/abs/2510.26583
31.10.2025 19:16
BAAI's Emu3.5
A large-scale multimodal world model that natively predicts the next vision-language state. Emu3.5 outperforms Nano Banana across image generation, editing, interleaved tasks and more.
31.10.2025 19:16
Paper: arxiv.org/abs/2510.26788
Repo: github.com/sail-sg/Prec...
31.10.2025 19:06
I like their simple conclusion.
Problem: BF16 precision causes a large training-inference mismatch, which makes RL training unstable.
Solution: Just switch to FP16.
"Defeating the Training-Inference Mismatch via FP16"
31.10.2025 19:06
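The mismatch described in the post above is easier to picture with numbers. A minimal sketch, assuming the relevant difference is mantissa width (FP16 keeps 10 explicit mantissa bits, BF16 keeps 7): it rounds one value to each format using only the Python standard library. The helper names `to_fp16` and `to_bf16` are hypothetical, the BF16 rounding is emulated from float32 bits (NaN and infinity are not handled), and nothing here reproduces the paper's experiments.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to IEEE 754 half precision (10 explicit mantissa bits)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Round x to bfloat16 (7 explicit mantissa bits).

    Emulated: take the float32 bit pattern, round to nearest-even on the
    16 low bits, then zero them. Assumes x is finite.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even bias
    bits &= 0xFFFF0000                   # keep only the top 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


x = 0.1  # illustrative value, e.g. a token probability
fp16_err = abs(to_fp16(x) - x)
bf16_err = abs(to_bf16(x) - x)
print(f"fp16: {to_fp16(x)!r}  abs error {fp16_err:.3e}")
print(f"bf16: {to_bf16(x)!r}  abs error {bf16_err:.3e}")
```

For 0.1 the BF16 rounding error is about four times the FP16 error. FP16 pays for that precision with a much narrower exponent range, which this sketch does not explore.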
I can't help but wonder about this photo - I thought these two companies (Hyundai and Samsung) and families (Chung and Lee) HATE each other.
31.10.2025 04:09
Moreover, Intel's track record with past AI processor acquisitions has been disastrous.
31.10.2025 04:04
Intel may purchase SambaNova.
Good: It's a solid ASIC company - the only provider capable of running inference on DeepSeek, unlike Cerebras or Groq.
Bad: Lip-Bu Tan serves as the company's Chairperson, and SambaNova is facing funding difficulties.
31.10.2025 04:04
Reasoning training encourages LLMs to produce long chains of thought (CoT), improving accuracy via self-checking but increasing context length, compute cost, and latency. This work studies whether frontier models can achieve better trade-offs: higher accuracy at lower cost.
31.10.2025 02:08
The Smol Training Playbook: The Secrets to Building World-Class LLMs
A practical journey through the challenges, decisions, and messy reality behind training state-of-the-art language models.
huggingface.co/spaces/Huggi...
31.10.2025 02:06
Marin
Project: marin.community
Model: huggingface.co/marin-commun...
Repo: github.com/marin-commun...
Doc: marin.readthedocs.io/en/latest/
Discord: discord.com/invite/J9CTk...
30.10.2025 22:27
Marin 32B Base - mantis (Open-source: Model, Code, and Data)
A key feature of Marin is reproducibility: every step, from raw data to the final model, is recorded, not just the end result. This includes failed experiments, so the entire research process is transparent.
30.10.2025 22:27
Moonshot AI's Kimi Linear (Open-weight)
A novel architecture that outperforms full attention with faster speeds and better performance, ready to serve as a drop-in replacement for full attention, featuring our open-sourced KDA kernels! Kimi Linear offers up to a 75% reduction in KV cache
30.10.2025 22:20
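To put the "up to 75%" KV-cache figure in concrete terms, here is a back-of-the-envelope sketch. Every model dimension below (32 layers, 8 KV heads, head size 128, a 131,072-token context, 2 bytes per value) is a hypothetical placeholder, not Kimi Linear's actual configuration. Only the 75% upper bound comes from the post.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Memory for a standard full-attention KV cache, for one sequence.

    The leading factor 2 accounts for storing one key tensor and one
    value tensor per layer.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical full-attention baseline (placeholder dimensions).
full = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=131072)

# Apply the quoted best-case reduction of 75%.
reduction_percent = 75
reduced = full * (100 - reduction_percent) // 100

gib = 1024 ** 3
print(f"full attention: {full / gib:.1f} GiB")
print(f"after 75% cut:  {reduced / gib:.1f} GiB")
```

Under these assumed numbers the cache shrinks from 16 GiB to 4 GiB per sequence, which is the kind of headroom that allows longer contexts or larger batches on the same hardware.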
LiquidAI's (MIT CSAIL offshoot) LFM2-ColBERT-350M
A 350M-parameter embedding model that lets you store documents in one language and retrieve them in many languages, with high accuracy and the inference speed of models a fraction of its size.
29.10.2025 14:48
Tongyi DeepResearch
DeepResearch
Homepage: tongyi-agent.github.io
Technical Report: arxiv.org/pdf/2510.24701
Blog: tongyi-agent.github.io/blog/introdu...
Model HuggingFace: huggingface.co/Alibaba-NLP/...
Model ModelScope: modelscope.cn/models/iic/T...
GitHub Repo: github.com/Alibaba-NLP/...
29.10.2025 14:38
Alibaba's Tongyi DeepResearch Technical Report
Dive deep into the technology and insights behind our 30B (A3B) open-source web agent that achieves SOTA performance: 32.9 on Humanity's Last Exam, 43.4 on BrowseComp, and 46.7 on BrowseComp-ZH.
29.10.2025 14:38
A Geometric Analysis of PCA
What property of the data distribution determines the excess risk of principal component analysis? In this paper, they provide a precise answer to this question.
arxiv.org/abs/2510.20978
29.10.2025 12:29
The Principles of Diffusion Models
It traces the core ideas that shaped diffusion modeling and explains how today's models work, why they work, and where they're heading.
www.arxiv.org/abs/2510.21890
29.10.2025 03:19
Everything About Transformers: A visual story of how transformers came to life
www.krupadave.com/articles/eve...
29.10.2025 02:58