Mathias Nielsen @mathiasesn1

we heard you hate writing boilerplate code

so we built something...

> open gradio sketch
> select and add components
> configure visually
> get perfect python code 🤯

Building AI apps will never be the same 🔥
Coming very soon 👀

19.02.2025 10:12 — 👍 4 🔁 2 💬 1 📌 0

OK, now this is starting to get silly..

02.02.2025 16:58 — 👍 26 🔁 2 💬 2 📌 0

We’re building a new static type checker for Python, from scratch, in Rust.

From a technical perspective, it’s probably our most ambitious project yet. We’re about 800 PRs deep!

29.01.2025 17:18 — 👍 726 🔁 104 💬 35 📌 34

Vectorsearch Hub Datasets - a Hugging Face Space by davidberenstein1957 Add vectors to Hub datasets and do in memory vector search.

🤯 Vector search on top of millions of docs in seconds. no pre-indexing!

Model2Vec is an embedding powerhouse that distils good models and makes them up to 500x faster and 15x smaller.

Vector Search on Hub Datasets demo: https://buff.ly/4gYhVlY
Library: https://buff.ly/42miwte

24.01.2025 13:00 — 👍 5 🔁 2 💬 1 📌 0
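
One way to picture the "no pre-indexing" in-memory search mentioned in the post above is a plain brute-force scan: score every document vector against the query and keep the best matches. The snippet below is only a minimal sketch under that assumption; the post does not describe how the demo is actually implemented, and the tiny 3-dimensional vectors and the `cosine` and `search` helpers are hypothetical stand-ins for real embeddings and library calls.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, doc_vecs, top_k=2):
    """Score every document against the query and return the indices of the best top_k."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(doc_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:top_k]]

# Hypothetical toy embeddings; a real setup would get these from an embedding model.
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
query = [1.0, 0.0, 0.0]
print(search(query, docs))  # → [0, 2]
```

Because nothing is indexed ahead of time, the cost of each query grows linearly with the number of documents, so small and fast embeddings matter a lot for this approach.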

The image shows an illustration titled "Hygge Web Data" featuring three cartoon animals - a fox, an owl, and what appears to be a bear or similar animal - sitting at a table or surface reviewing various documents and papers. The style is cute and whimsical, with the animals drawn in a simple, friendly manner. Each animal is looking at different papers with sketched symbols, text, and designs on them. The illustration has a gentle, cozy feel to it, fitting with the "hygge" (Danish concept of coziness and comfort) mentioned in the title.

Introducing Scandi-fine-web-cleaner, a decoder model trained to remove low-quality web content from FineWeb 2 for Danish and Swedish

- Uses FineWeb-c community annotations
- 90%+ precision + minimal compute required
- Enables efficient filtering of 43M+ documents

huggingface.co/davanstrien/...

13.01.2025 15:48 — 👍 17 🔁 4 💬 1 📌 1

This is a particularly bad case study in how badly AI summarization can go when it's exposed to the wilds of the internet - posted some notes on my blog: simonwillison.net/2024/Dec/29/...

29.12.2024 01:32 — 👍 117 🔁 23 💬 11 📌 7

Material for MkDocs Write your documentation in Markdown and create a professional static site in minutes – searchable, customizable, in 60+ languages, for all devices

I'm a big fan of squidfunk.github.io/mkdocs-mater... myself 👨‍💻

23.12.2024 19:50 — 👍 1 🔁 0 💬 1 📌 0

💥 Ending 2024: A full data annotation journey on the Hugging Face Hub—from raw data to training-ready datasets!

With Argilla 2.6.0, push your data to the Hub from the UI

Let’s make 2025 the year anyone can build more transparent and accountable AI—no coding or model skills needed.

20.12.2024 11:14 — 👍 20 🔁 3 💬 1 📌 0

Paper page - Phi-4 Technical Report Join the discussion on this paper page

The Phi-4 Technical Report briefly mentions the importance of the sequence in which the training data is fed to the model. Actually I think that determining the ideal sequence should be the next big research topic.

huggingface.co/papers/2412....

14.12.2024 18:32 — 👍 1 🔁 1 💬 1 📌 0

Supply-chain attack analysis: Ultralytics - The Python Package Index Blog Analysis of a package targeted by a supply-chain attack to the build and release process

Attack on Ultralytics via GitHub Actions and PyPI: blog.pypi.org/posts/2024-1... #dkdev

14.12.2024 16:25 — 👍 1 🔁 1 💬 0 📌 0

09.12.2024 04:28 — 👍 3 🔁 2 💬 0 📌 0

Impressive! 💪

09.12.2024 18:41 — 👍 1 🔁 0 💬 0 📌 0

Overview of PixMo and its relation to Molmo's ability. PixMo's captions data enables Molmo's fine-grained understanding; PixMo's AskModelAnything enables Molmo's user interaction; PixMo's pointing data enables Molmo's pointing and counting; PixMo's synthetic data enables Molmo's visual skills.

Remember Molmo? The full recipe is finally out!

Training code, data, and everything you need to reproduce our models. Oh, and we have updated our tech report too!

Links in thread 👇

09.12.2024 18:33 — 👍 78 🔁 14 💬 1 📌 1

YouTube video by dottxt Creating a Structured AI Log Analysis System with Python & LLMs

🏮 New YouTube video! 🏮

We experimented with a log monitoring system. Spin up the agent and it'll monitor your logs for any potential issues -- it works with webserver logs like nginx or Apache, Linux system logs, etc.

youtu.be/csw6TVfzBcw

05.12.2024 17:08 — 👍 7 🔁 3 💬 2 📌 1

Look at this! 🤩

@pydantic.bsky.social for AI Agents 🤖🚀

02.12.2024 11:33 — 👍 17 🔁 3 💬 0 📌 0

@moltke.bsky.social, do you think it's an accident?

28.11.2024 18:30 — 👍 0 🔁 0 💬 0 📌 0

FYI, here's the entire code to create a dataset of every single bsky message in real time:

```
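from atproto import FirehoseSubscribeReposClient, parse_subscribe_repos_message

# Print the header and parsed body of every firehose message as it arrives.
def f(m): print(m.header, parse_subscribe_repos_message(m))
FirehoseSubscribeReposClient().start(f)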
```

28.11.2024 09:56 — 👍 442 🔁 62 💬 19 📌 10

The thing is, there's already a dataset of 235 MILLION posts from 4 MILLION users available for months. Not sure why @hf.co is a target of abuse

zenodo.org/records/1108...

28.11.2024 01:32 — 👍 116 🔁 13 💬 7 📌 0

What's on the way? A new evaluation for ScandEval? 😀

26.11.2024 13:37 — 👍 1 🔁 0 💬 1 📌 0

I wonder what AI-Sweden-Models/Llama-3-8B-instruct (few-shot) (with a rank of 1.35 and a top-4 placement) has done right? 🤔

26.11.2024 13:13 — 👍 1 🔁 0 💬 1 📌 0

NLPnorth/snakmodel-7b-instruct · Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

*** New Model on ScandEval ***

New Danish LLM from the NLP North Lab, SnakModel, based on Llama-2-7b.

Danish results (lower is better):
- NLPnorth/snakmodel-7b-base: 3.60
- NLPnorth/snakmodel-7b-instruct: 2.59

For reference, Llama-2-7b achieves 3.08.

Leaderboards: scandeval.com

#dkai #nlp

26.11.2024 12:21 — 👍 9 🔁 1 💬 2 📌 1

We've gone all in on uv at MediaCatch, and it has given us some serious speed-ups in GitHub workflows and Docker builds. 🔥
You do need to keep an eye on the cache, though, since it can take up quite a bit of space. So run uv cache prune once in a while. 😅

24.11.2024 19:09 — 👍 1 🔁 0 💬 1 📌 0

📱 I've made a feed that collects Danish tech content via the hashtags #dkai, #dkdev and #dktech! Follow along to stay up to date with the Danish tech community 🇩🇰

Try it here: bsky.app/profile/did:...

23.11.2024 14:53 — 👍 10 🔁 3 💬 0 📌 0

Latest posts by mathiasesn1.bsky.social on Bluesky
