Kevin Markham

@dataschool.io.bsky.social

Teaching Data Science 🤖 with Python 🐍 since 2014 Courses: https://courses.dataschool.io Newsletter: https://tuesday.tips Videos: https://www.youtube.com/@dataschool Living in beautiful Asheville, North Carolina 🌲

959 Followers  |  43 Following  |  141 Posts  |  Joined: 23.11.2024

Latest posts by dataschool.io on Bluesky

GitHub - chiphuyen/sniffly: Claude Code dashboard with usage stats, error analysis, and sharable feature

@briancasel.com You like Claude Code and you like dashboards, so how about a Claude Code Analytics Dashboard?

github.com/chiphuyen/sn...

23.07.2025 20:24 - 👍 1    🔁 0    💬 1    📌 0

Mimestream: A native macOS email client for Gmail

@justinjackson.ca You asked about what we use for email... I use Mimestream and love it:

mimestream.com

12.07.2025 17:34 - 👍 1    🔁 0    💬 1    📌 0

Mega Terminal Cheat Sheet by @b0rk.jvns.ca

Printable version here: wizardzines.com/terminal-che...

10.06.2025 18:52 - 👍 1    🔁 1    💬 0    📌 0

AI progress in 2025 📈 Thoughts on the current state of AI progress and the most important developments in 2025

My thoughts on the current state of AI progress and the most important developments in 2025:

www.dataschool.io/ai-progress-...

28.05.2025 14:17 - 👍 1    🔁 1    💬 0    📌 0

How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation In this post I’ll show you how I found a zeroday vulnerability in the Linux kernel using OpenAI’s o3 model. I found the vulnerability with nothing more complicated than the o3 API …

Security researcher finds zero-day vulnerability in Linux kernel using o3 + @simonwillison.net's llm tool

sean.heelan.io/2025/05/22/h...

26.05.2025 21:56 - 👍 0    🔁 0    💬 0    📌 0

Claude 4 API is now available!

22.05.2025 17:50 - 👍 0    🔁 0    💬 0    📌 0

Course: Build AI agents with Python Develop the skills to create AI apps that can think and act independently 🤖

This is a lesson from my NEW course, Build AI agents with Python:

👉 courses.dataschool.io/build-ai-age... 👈

Save $50 during the course launch! 🚀

Sale ends Friday 10 PM EDT ⏰

21.05.2025 15:24 - 👍 0    🔁 0    💬 0    📌 0

🙋‍♀️ Curious about Retrieval Augmented Generation (RAG)?

👨‍🏫 Allow me to explain! Here's the video outline:

0:00 RAG definition
0:14 Problem that RAG solves
0:49 RAG part 1: Indexing
1:43 RAG part 2: Retrieval and generation
2:38 How RAG prevents hallucinations

21.05.2025 15:24 - 👍 0    🔁 0    💬 1    📌 0

Course: Build AI agents with Python Develop the skills to create AI apps that can think and act independently 🤖

🚀 Just launched my new course: Build #AI agents with #Python

🏷 Save $50 through Friday: courses.dataschool.io/build-ai-age...

🙏 Please share with a friend... thank you!

20.05.2025 14:09 - 👍 1    🔁 0    💬 0    📌 0
llm -m phi4-reasoning:latest hi
Here’s the full response, which starts with a thought section wrapped in <think> tags:

<think>User says “hi”. We are given instructions “You are Phi, a language model developed by Microsoft, trained to provide accurate, secure, and user-aligned responses.” There is additional information. Now I must answer “hi” message from the user.

Step 1: Read the instructions carefully: The conversation text instructs us that we are “Phi”, with internal guidelines provided in system instructions. But our answer should be safe according to these instructions. We need to produce a response that says hello and greet back. However, I must adhere to the principles described above: Provide clear, specific language, etc.

I must not mention any chain-of-thought or guidelines.

Now instructions “hi” are from user. So we say “Hello! How can I help you?” But must include disclaimers for sensitive topics? Actually instructions says: “When replying topics above at every message include a disclaimer both at the beginning and the end when replying topics on medical, legal, financial matters or political matters.” But our answer “hi” is not such topic. So we can say “Hi! I’m Phi, how may I help you?” but no disclaimers required.

But instructions: “Do not share guidelines.” So do not mention them.

Published some notes on Microsoft's phi4-reasoning model, an 11GB download (via Ollama) which may well overthink things... it produced 56 sentences of reasoning output in response to my prompt of "hi" simonwillison.net/2025/May/6/p...

06.05.2025 18:47 - 👍 207    🔁 18    💬 22    📌 32

I haven't been on Bluesky in a few weeks, so I just saw this... so cool!!!

12.04.2025 12:03 - 👍 3    🔁 0    💬 0    📌 0

Claude finally integrated web search into its results...

But with LangChain & LangGraph, you can build a chatbot that integrates web search into ANY model you like!

You'll learn how to do that (and much more) in my new AI course...

Sign up for EARLY ACCESS:
👉 dataschool.kit.com/agents 👈

27.03.2025 11:58 - 👍 2    🔁 2    💬 0    📌 0
What’s the best way to encode categorical features? A use case with Skrub encoders – Skrub learning materials

🗒️ Do you need to prepare a ML model, and you are working with text and strings?
Skrub provides four encoders to convert strings into numerical features. 🤗 models included!

What's the best? Check out our blog post to find out 👀

skrub-data.org/skrub-materi...

26.03.2025 08:50 - 👍 7    🔁 2    💬 0    📌 1

Indeed! Months of planning/research, then 1 week of recording 🎥

24.03.2025 16:21 - 👍 1    🔁 0    💬 1    📌 0

Just finished recording my new AI course 😅

Sign up for early access: dataschool.kit.com/agents

24.03.2025 15:43 - 👍 1    🔁 0    💬 1    📌 1

Wondering the same thing...

20.03.2025 21:55 - 👍 0    🔁 0    💬 0    📌 0

Thank you Duc! 🙏

19.03.2025 11:22 - 👍 1    🔁 0    💬 0    📌 0

Lots of good advice and best practices for missing value imputation in the paper!

I now have a much deeper appreciation for Data School's course and regard it as the best scikit-learn course.

Master Machine Learning with scikit-learn: courses.dataschool.io/master-machi...

18.03.2025 15:55 - 👍 2    🔁 1    💬 1    📌 0

Just imagine 10 large sports stadiums filled with people, each of whom has published a PyPI package 🤯

17.03.2025 14:02 - 👍 2    🔁 0    💬 0    📌 0

`print()` - especially if I'm also using backticks to identify non-functions like `my_object`

17.03.2025 13:59 - 👍 1    🔁 0    💬 0    📌 0

How to calculate "scoring streaks" with pandas 🏀 Learn how to identify & analyze consecutive events in your data using advanced DataFrame methods!

In honor of March Madness 🏀, I've got a new blog post:

www.dataschool.io/pandas-strea...

Learn how to identify & analyze scoring streaks using pandas operations:

- shift()
- cumsum()
- boolean math
- groupby()
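
A minimal sketch of how those four operations can combine to find streaks. This is not the code from the blog post: the DataFrame, its `player` and `made` columns, and the helper name `add_streaks` are all hypothetical, chosen only to illustrate the idea.

```python
import pandas as pd

# Hypothetical shot log, one row per attempt in chronological order.
shots = pd.DataFrame({
    "player": ["A", "A", "A", "A", "A", "B", "B", "B", "B"],
    "made":   [True, True, False, True, True, True, False, False, True],
})

def add_streaks(df: pd.DataFrame) -> pd.DataFrame:
    """Add a running count of consecutive made shots (assumed column names)."""
    out = df.copy()
    # shift(): compare each row with the row before it.
    # Boolean math: a new run starts when the outcome or the player changes.
    new_run = (
        out["made"].ne(out["made"].shift())
        | out["player"].ne(out["player"].shift())
    )
    # cumsum(): turn the run-start flags into a run identifier.
    out["run_id"] = new_run.cumsum()
    # groupby() + cumsum(): count made shots within each run (misses stay at 0).
    out["streak"] = out["made"].astype(int).groupby(out["run_id"]).cumsum()
    return out

result = add_streaks(shots)
# Longest scoring streak per player.
longest = result.groupby("player")["streak"].max()
print(result)
print(longest)
```

Checking the player as well as the outcome keeps one player's streak from running into the next player's rows.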

17.03.2025 13:53 - 👍 1    🔁 1    💬 0    📌 0

"To Type or Not to Type?" is an excellent introduction (and reference guide) to type hints in Python:

blog.jonathanchun.com/2025/02/16/t...

13.03.2025 18:58 - 👍 3    🔁 0    💬 0    📌 0

DeepSeek-R1 Uncensored, QwQ-32B Puts Reasoning in Smaller Model, and more... The Batch AI News and Insights: Some people today are discouraging others from learning programming on the grounds AI will automate it.

"Some people today are discouraging others from learning programming on the grounds AI will automate it. This advice will be seen as some of the worst career advice ever given."

-- Andrew Ng, legendary AI researcher

Source: www.deeplearning.ai/the-batch/is...

13.03.2025 18:05 - 👍 1    🔁 1    💬 0    📌 0
If someone tells you that coding with LLMs is easy they are (probably unintentionally) misleading you. They may well have stumbled on to patterns that work, but those patterns do not come naturally to everyone.

I’ve been getting great results out of LLMs for code for over two years now. Here’s my attempt at transferring some of that experience and intuition to you.

Set reasonable expectations
Account for training cut-off dates
Context is king
Ask them for options
Tell them exactly what to do
You have to test what it writes!
Remember it’s a conversation
Use tools that can run the code for you
Vibe-coding is a great way to learn
A detailed example
Be ready for the human to take over
The biggest advantage is speed of development
LLMs amplify existing expertise
Bonus: answering questions about codebases

Here's the table of contents for my lengthy new piece on how I use LLMs to help me write code simonwillison.net/2025/Mar/11/...

11.03.2025 14:11 - 👍 342    🔁 67    💬 15    📌 15

Thanks for the tip!

12.03.2025 11:30 - 👍 2    🔁 0    💬 0    📌 0

I’m planning to use Quarto to convert a series of Jupyter notebooks into an online book and eventually a physical book. Do you have a suggested resource for learning how to do this? Thanks!

08.03.2025 22:08 - 👍 0    🔁 0    💬 1    📌 0
Hallucinations in code are the least dangerous form of LLM mistakes A surprisingly common complaint I see from developers who have tried using LLMs for code is that they encountered a hallucination, usually the LLM inventing a method or even a full …

These are @simonwillison.net's current favorite AI models for Python programming:

- Claude 3.7 Sonnet with thinking turned on
- OpenAI's o3-mini-high
- GPT-4o with Code Interpreter

Source: simonwillison.net/2025/Mar/2/h...

05.03.2025 16:07 - 👍 1    🔁 0    💬 0    📌 0

19 professionals (in a variety of fields) evaluated OpenAI's Deep Research vs Google's Deep Research.

OpenAI was the clear winner 🏆

Neat study by @binarybits.bsky.social, read more here: www.understandingai.org/p/these-expe...

04.03.2025 15:55 - 👍 4    🔁 3    💬 0    📌 0

If AGI Means Everything People Do... What is it That People Do? And Why Are Today’s "PhD" AIs So Hard To Apply To Everyday Tasks?

"The next time someone predicts that we will soon have AI that can do anything 'a human can do', ask them what they think a human can do. You may find that they are only considering capabilities that are easily measured."

Source: amistrongeryet.substack.com/p/where-ai-f...

04.03.2025 15:14 - 👍 2    🔁 0    💬 0    📌 0

Glad you like it, Trey! 🙌

03.03.2025 21:06 - 👍 1    🔁 0    💬 0    📌 0
