Kevin Markham @dataschool.io

GitHub - chiphuyen/sniffly: Claude Code dashboard with usage stats, error analysis, and sharable feature

@briancasel.com You like Claude Code and you like dashboards, so how about a Claude Code Analytics Dashboard?

github.com/chiphuyen/sn...

23.07.2025 20:24 — 👍 1 🔁 0 💬 1 📌 0

Mimestream A native macOS email client for Gmail

@justinjackson.ca You asked about what we use for email… I use Mimestream and love it:

mimestream.com

12.07.2025 17:34 — 👍 1 🔁 0 💬 1 📌 0

Mega Terminal Cheat Sheet by @b0rk.jvns.ca

Printable version here: wizardzines.com/terminal-che...

10.06.2025 18:52 — 👍 1 🔁 1 💬 0 📌 0

AI progress in 2025 📈 Thoughts on the current state of AI progress and the most important developments in 2025

My thoughts on the current state of AI progress and the most important developments in 2025:

www.dataschool.io/ai-progress-...

28.05.2025 14:17 — 👍 1 🔁 1 💬 0 📌 0

How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation In this post I’ll show you how I found a zeroday vulnerability in the Linux kernel using OpenAI’s o3 model. I found the vulnerability with nothing more complicated than the o3 API …

Security researcher finds zero-day vulnerability in Linux kernel using o3 + @simonwillison.net's llm tool

sean.heelan.io/2025/05/22/h...

26.05.2025 21:56 — 👍 0 🔁 0 💬 0 📌 0

Claude 4 API is now available!

22.05.2025 17:50 — 👍 0 🔁 0 💬 0 📌 0

Course: Build AI agents with Python Develop the skills to create AI apps that can think and act independently 🤖

This is a lesson from my NEW course, Build AI agents with Python:

👉 courses.dataschool.io/build-ai-age... 👈

Save $50 during the course launch! 🚀

Sale ends Friday 10 PM EDT ⏰

21.05.2025 15:24 — 👍 0 🔁 0 💬 0 📌 0

🙋‍♀️ Curious about Retrieval Augmented Generation (RAG)?

👨‍🏫 Allow me to explain! Here's the video outline:

0:00 RAG definition
0:14 Problem that RAG solves
0:49 RAG part 1: Indexing
1:43 RAG part 2: Retrieval and generation
2:38 How RAG prevents hallucinations
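
The two parts in the outline above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the video: the `embed` function is a toy word-count stand-in for a real embedding model, the `documents` list is made up, and `build_prompt` only assembles the text that would be sent to a model (no LLM is called).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Part 1: indexing. Store each document next to its vector.
# These documents are hypothetical examples.
documents = [
    "The course launch sale ends Friday at 10 PM EDT.",
    "RAG retrieves relevant documents and passes them to the model.",
    "pandas cumsum computes a running total of a column.",
]
index = [(doc, embed(doc)) for doc in documents]


# Part 2: retrieval and generation.
def retrieve(question, k=1):
    """Return the k indexed documents most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, k=1):
    """Assemble the prompt a model would receive, grounded in retrieved text."""
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("When does the sale end?")
print(prompt)
```

Restricting the model to the retrieved context is the part of this sketch that relates to hallucinations: the answer can be checked against the text that was actually supplied.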

21.05.2025 15:24 — 👍 0 🔁 0 💬 1 📌 0

Course: Build AI agents with Python Develop the skills to create AI apps that can think and act independently 🤖

🚀 Just launched my new course: Build #AI agents with #Python

🏷 Save $50 through Friday: courses.dataschool.io/build-ai-age...

🙏 Please share with a friend... thank you!

20.05.2025 14:09 — 👍 1 🔁 0 💬 0 📌 0

`llm -m phi4-reasoning:latest hi`

Here’s the full response, which starts with a thought section wrapped in <think> tags:

<think>User says “hi”. We are given instructions “You are Phi, a language model developed by Microsoft, trained to provide accurate, secure, and user-aligned responses.” There is additional information. Now I must answer “hi” message from the user. Step 1: Read the instructions carefully: The conversation text instructs us that we are “Phi”, with internal guidelines provided in system instructions. But our answer should be safe according to these instructions. We need to produce a response that says hello and greet back. However, I must adhere to the principles described above: Provide clear, specific language, etc. I must not mention any chain-of-thought or guidelines. Now instructions “hi” are from user. So we say “Hello! How can I help you?” But must include disclaimers for sensitive topics? Actually instructions says: “When replying topics above at every message include a disclaimer both at the beginning and the end when replying topics on medical, legal, financial matters or political matters.” But our answer “hi” is not such topic. So we can say “Hi! I’m Phi, how may I help you?” but no disclaimers required. But instructions: “Do not share guidelines.” So do not mention them.

Published some notes on Microsoft's phi4-reasoning model, an 11GB download (via Ollama) which may well overthink things... it produced 56 sentences of reasoning output in response to my prompt of "hi" simonwillison.net/2025/May/6/p...

06.05.2025 18:47 — 👍 207 🔁 18 💬 22 📌 32

I haven't been on Bluesky in a few weeks, so I just saw this... so cool!!!

12.04.2025 12:03 — 👍 3 🔁 0 💬 0 📌 0

Claude finally integrated web search into its results...

But with LangChain & LangGraph, you can build a chatbot that integrates web search into ANY model you like!

You'll learn how to do that (and much more) in my new AI course...

Sign up for EARLY ACCESS:
👉 dataschool.kit.com/agents 👈

27.03.2025 11:58 — 👍 2 🔁 2 💬 0 📌 0

What’s the best way to encode categorical features? A use case with Skrub encoders – Skrub learning materials

🗒️ Do you need to prepare a ML model, and you are working with text and strings?
Skrub provides four encoders to convert strings into numerical features. 🤗 models included!

What's the best? Check out our blog post to find out 👀

skrub-data.org/skrub-materi...

26.03.2025 08:50 — 👍 7 🔁 2 💬 0 📌 1

Indeed! Months of planning/research, then 1 week of recording 🎥

24.03.2025 16:21 — 👍 1 🔁 0 💬 1 📌 0

Just finished recording my new AI course 😅

Sign up for early access: dataschool.kit.com/agents

24.03.2025 15:43 — 👍 1 🔁 0 💬 1 📌 1

Wondering the same thing…

20.03.2025 21:55 — 👍 0 🔁 0 💬 0 📌 0

Thank you Duc! 🙏

19.03.2025 11:22 — 👍 1 🔁 0 💬 0 📌 0

Lots of good advice and best practices for missing value imputation in the paper!

I now have a much deeper appreciation for Data School's course and regard it as the best scikit-learn course.

Master Machine Learning with scikit-learn: courses.dataschool.io/master-machi...

18.03.2025 15:55 — 👍 2 🔁 1 💬 1 📌 0

Just imagine 10 large sports stadiums filled with people, each of whom has published a PyPI package 🤯

17.03.2025 14:02 — 👍 2 🔁 0 💬 0 📌 0

`print()` - especially if I'm also using backticks to identify non-functions like `my_object`

17.03.2025 13:59 — 👍 1 🔁 0 💬 0 📌 0

How to calculate "scoring streaks" with pandas 🏀 Learn how to identify & analyze consecutive events in your data using advanced DataFrame methods!

In honor of March Madness 🏀, I've got a new blog post:

www.dataschool.io/pandas-strea...

Learn how to identify & analyze scoring streaks using pandas operations:

- shift()
- cumsum()
- boolean math
- groupby()
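
As a rough sketch of how those four operations fit together (not the code from the blog post), the example below labels runs of consecutive rows and measures them. The `team` column and its values are hypothetical sample data.

```python
import pandas as pd

# Hypothetical data: which team scored on each consecutive play.
df = pd.DataFrame({"team": ["A", "A", "B", "A", "A", "A", "B", "B"]})

# shift() + boolean math: True wherever the scoring team differs
# from the team on the previous row (the first row is always True).
changed = df["team"] != df["team"].shift()

# cumsum() over the booleans gives each run of consecutive rows its own id.
df["streak_id"] = changed.cumsum()

# groupby() the id to get the team and length of every streak.
streaks = (
    df.groupby("streak_id")
    .agg(team=("team", "first"), length=("team", "size"))
    .reset_index(drop=True)
)

# The longest streak in the sample data.
longest = streaks.loc[streaks["length"].idxmax()]
print(streaks)
print(longest)
```

The key idea is that a change flag summed cumulatively turns "consecutive identical values" into an ordinary grouping key, so no explicit loop is needed.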

17.03.2025 13:53 — 👍 1 🔁 1 💬 0 📌 0

"To Type or Not to Type?" is an excellent introduction (and reference guide) to type hints in Python:

blog.jonathanchun.com/2025/02/16/t...

13.03.2025 18:58 — 👍 3 🔁 0 💬 0 📌 0

DeepSeek-R1 Uncensored, QwQ-32B Puts Reasoning in Smaller Model, and more... The Batch AI News and Insights: Some people today are discouraging others from learning programming on the grounds AI will automate it.

"Some people today are discouraging others from learning programming on the grounds AI will automate it. This advice will be seen as some of the worst career advice ever given."

-- Andrew Ng, legendary AI researcher

Source: www.deeplearning.ai/the-batch/is...

13.03.2025 18:05 — 👍 1 🔁 1 💬 0 📌 0

If someone tells you that coding with LLMs is easy they are (probably unintentionally) misleading you. They may well have stumbled on to patterns that work, but those patterns do not come naturally to everyone. I’ve been getting great results out of LLMs for code for over two years now. Here’s my attempt at transferring some of that experience and intuition to you.

- Set reasonable expectations
- Account for training cut-off dates
- Context is king
- Ask them for options
- Tell them exactly what to do
- You have to test what it writes!
- Remember it’s a conversation
- Use tools that can run the code for you
- Vibe-coding is a great way to learn
- A detailed example
- Be ready for the human to take over
- The biggest advantage is speed of development
- LLMs amplify existing expertise
- Bonus: answering questions about codebases

Here's the table of contents for my lengthy new piece on how I use LLMs to help me write code simonwillison.net/2025/Mar/11/...

11.03.2025 14:11 — 👍 342 🔁 67 💬 15 📌 15

Thanks for the tip!

12.03.2025 11:30 — 👍 2 🔁 0 💬 0 📌 0

I’m planning to use Quarto to convert a series of Jupyter notebooks into an online book and eventually a physical book. Do you have a suggested resource for learning how to do this? Thanks!

08.03.2025 22:08 — 👍 0 🔁 0 💬 1 📌 0

Hallucinations in code are the least dangerous form of LLM mistakes A surprisingly common complaint I see from developers who have tried using LLMs for code is that they encountered a hallucination—usually the LLM inventing a method or even a full …

These are @simonwillison.net's current favorite AI models for Python programming:

- Claude 3.7 Sonnet with thinking turned on
- OpenAI's o3-mini-high
- GPT-4o with Code Interpreter

Source: simonwillison.net/2025/Mar/2/h...

05.03.2025 16:07 — 👍 1 🔁 0 💬 0 📌 0

19 professionals (in a variety of fields) evaluated OpenAI's Deep Research vs Google's Deep Research.

OpenAI was the clear winner 🏆

Neat study by @binarybits.bsky.social, read more here: www.understandingai.org/p/these-expe...

04.03.2025 15:55 — 👍 4 🔁 3 💬 0 📌 0

If AGI Means Everything People Do... What is it That People Do? And Why Are Today’s "PhD" AIs So Hard To Apply To Everyday Tasks?

"The next time someone predicts that we will soon have AI that can do anything 'a human can do', ask them what they think a human can do. You may find that they are only considering capabilities that are easily measured."

Source: amistrongeryet.substack.com/p/where-ai-f...

04.03.2025 15:14 — 👍 2 🔁 0 💬 0 📌 0

Glad you like it, Trey! 🙌

03.03.2025 21:06 — 👍 1 🔁 0 💬 0 📌 0

Latest posts by dataschool.io on Bluesky
