Jason Brownlee

@jason2brownlee.bsky.social

Research scientist & software engineer. PhD in #AI #MachineLearning #DataScience Authored 40+ tech books and 1500+ tutorials. Home: JasonBrownlee.me

36 Followers  |  17 Following  |  11 Posts  |  Joined: 15.11.2024  |  1.5755

Latest posts by jason2brownlee.bsky.social on Bluesky

Stacking Ensemble With Dropout Regularization I was thinking about stacking ensembles (stacked generalization) in the sauna. Stacked ensembles overfit, so we need to regularize. Generally, we use cross-validation to ensure that the meta model is ...

Stacking Ensemble With Dropout Regularization
jasonbrownlee.me/blog/posts/s...

23.01.2025 22:29  |  👍 1  |  🔁 0  |  💬 0  |  📌 0
GitHub - Jason2Brownlee/Awesome-AutoML-Books: Curated list of books on Automated Machine Learning

Awesome AutoML Books
A curated list of books for engineers on development with Automated Machine Learning (#AutoML).
github.com/Jason2Brownl...

30.12.2024 22:06  |  👍 1  |  🔁 0  |  💬 0  |  📌 0
GitHub - Jason2Brownlee/awesome-llm-books: Curated list of books on Large Language Models

Awesome LLM Books
This is a curated list of books for engineers on development with Large Language Models (LLMs)
github.com/Jason2Brownl...

26.12.2024 22:10  |  👍 3  |  🔁 0  |  💬 0  |  📌 0
Is there evidence that model performance on the train and test sets has the same distribution?

Use statistical tests to confirm that the model performance distributions are equivalent.

Check Model Performance Distributions:
datasciencediagnostics.com/diagnostics/...

11.12.2024 17:32  |  👍 0  |  🔁 0  |  💬 0  |  📌 0
Is there evidence that your train and test sets have the same distributions?

Use statistical tests to confirm that numerical and categorical distributions are equivalent.

Train/Test Data Distributions:
datasciencediagnostics.com/diagnostics/...

10.12.2024 20:44  |  👍 0  |  🔁 0  |  💬 0  |  📌 0
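
The train/test distribution check in the post above can be sketched for a single numerical feature. This is a minimal illustration, not the method from the linked page: it assumes a two-sample Kolmogorov-Smirnov statistic with a permutation p-value, uses only the Python standard library, and does not cover categorical features. The function names `ks_statistic` and `permutation_p_value` are hypothetical.

```python
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in sa + sb:
        cdf_a = bisect_right(sa, x) / len(sa)
        cdf_b = bisect_right(sb, x) / len(sb)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def permutation_p_value(a, b, n_rounds=500, seed=0):
    """Share of random relabelings whose KS statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_rounds + 1)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-ins for one feature in the train and test sets.
    train_feature = [rng.gauss(0.0, 1.0) for _ in range(200)]
    test_feature = [rng.gauss(0.0, 1.0) for _ in range(80)]
    print("KS statistic:", ks_statistic(train_feature, test_feature))
    print("p-value:", permutation_p_value(train_feature, test_feature))
```

A small p-value suggests the feature is distributed differently in the two sets. A large p-value only means this test found no evidence of a difference, which is weaker than proof that the distributions match.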
Is there evidence that the Performance Gap is real or just statistical noise?

Carefully quantify the difference between train and test set performance.

Quantify the Performance Gap:
datasciencediagnostics.com/diagnostics/...

09.12.2024 18:45  |  👍 0  |  🔁 0  |  💬 0  |  📌 0
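
One way to quantify the performance gap from the post above is a bootstrap interval on the difference between train and test accuracy. The sketch below is an assumption about how such a check might look, not the procedure from the linked page. It takes per-example correctness flags (1 for a correct prediction, 0 otherwise), and the name `bootstrap_gap_interval` is hypothetical.

```python
import random


def accuracy(flags):
    """Fraction of correct predictions, given 0/1 correctness flags."""
    return sum(flags) / len(flags)


def bootstrap_gap_interval(train_correct, test_correct,
                           n_rounds=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for train accuracy minus test accuracy."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_rounds):
        # Resample each set with replacement, keeping its original size.
        train_sample = rng.choices(train_correct, k=len(train_correct))
        test_sample = rng.choices(test_correct, k=len(test_correct))
        gaps.append(accuracy(train_sample) - accuracy(test_sample))
    gaps.sort()
    lower = gaps[round((alpha / 2) * n_rounds)]
    upper = gaps[round((1 - alpha / 2) * n_rounds) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up correctness flags, for illustration only.
    train_correct = [1] * 95 + [0] * 5
    test_correct = [1] * 80 + [0] * 20
    print(bootstrap_gap_interval(train_correct, test_correct))
```

If the interval excludes zero, the gap is unlikely to be resampling noise alone. If it straddles zero, the observed gap may not be meaningful.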
Data Science Diagnostics
Helpful checks for data scientists with urgent problems
DataScienceDiagnostics.com

#DataScience #MachineLearning

08.12.2024 21:57  |  👍 1  |  🔁 0  |  💬 0  |  📌 0
Are you sure your train/test split percentage is well chosen?

Common split percentages are just heuristics; it is better to know how your data and model behave under different split scenarios.

Perform a split-size sensitivity analysis:
datasciencediagnostics.com/diagnostics/...

08.12.2024 21:46  |  👍 0  |  🔁 0  |  💬 0  |  📌 0
GitHub - Jason2Brownlee/DataScienceDiagnosticChecklist: Data Science Diagnostic Checklist

Data Science Diagnostic Checklist
(from 10+ years of consulting)
github.com/Jason2Brownl...

03.12.2024 18:34  |  👍 2  |  🔁 0  |  💬 0  |  📌 0
Use code BLACKFRIDAY for 30% off the "Python Concurrency Boxed Set": superfastpython.com/python-jump-...

29.11.2024 20:46  |  👍 0  |  🔁 0  |  💬 0  |  📌 0
XGBoost is all you need!

XGBoost is all you need: XGBoosting.com

25.11.2024 02:48  |  👍 1  |  🔁 0  |  💬 0  |  📌 0

@jason2brownlee is following 14 prominent accounts