Albert Thomas

@albertcthomas.bsky.social

Research engineer at Huawei. https://albertcthomas.github.io/blog/

60 Followers  |  72 Following  |  16 Posts  |  Joined: 21.11.2024  |  2.091

Latest posts by albertcthomas.bsky.social on Bluesky

It was incredibly relieving to read “motivation was problem-solving, machine learning as a means to an end” in @lawrennd.bsky.social's “Atomic human”.
I always felt like an impostor among colleagues who want to solve intelligence, while I just enjoy working on cool stuff.

14.06.2025 08:45 — 👍 26    🔁 4    💬 1    📌 0
Qwen2.5-VL/cookbooks/ocr.ipynb at main · QwenLM/Qwen2.5-VL
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Also github.com/QwenLM/Qwen2... gives prompts you can try such as "Read all the text in the image" (in case you were not aware of this notebook). Not sure this will lead to a drastic change.

19.05.2025 13:55 — 👍 2    🔁 0    💬 0    📌 0

What's the difference between your model and the MLX version?

19.05.2025 13:50 — 👍 0    🔁 0    💬 2    📌 0

And reviewing will be the bottleneck. Although we can use LLMs to help us write the reviews (not saying that the review should be made by the LLM alone).

15.05.2025 19:44 — 👍 1    🔁 0    💬 0    📌 1
What people get wrong about the leading Chinese open models: Adoption and censorship
Narrative violations on licenses, adoption, and censorship.

Many companies won't use Chinese *open* models for long-tail information and generated code security concerns - a major adjustment in how I see the open model ecosystem. Adoption is on the table for entrants amid DeepSeek and Qwen releasing some of the best models on paper.
buff.ly/y6MMoBt

06.05.2025 13:46 — 👍 8    🔁 1    💬 1    📌 1
TIL about huggingface-cli delete-cache to clean the models or other resources (datasets, ...) you downloaded from huggingface albertcthomas.github.io/blog/removin...

06.05.2025 11:32 — 👍 1    🔁 0    💬 0    📌 0

« When software is open-source, it means it is open-source – that the source is open – nothing more. […]
It does not mean open to contributions;
It does not mean support is offered;
It does not mean you’re entitled to feature requests;
It does not mean the developer owes you their time;
[…] »

12.04.2025 07:58 — 👍 3    🔁 1    💬 0    📌 0

Really enjoying the series of posts by @beenwrekt.bsky.social on overfitting.

18.02.2025 09:39 — 👍 0    🔁 0    💬 0    📌 0
Open source software: how to live long and go far
An opinionated guide to building open-source software tools with a focus on Python and science. A talk that I gave when I was stepping down as a lead…

Just put online a talk I gave summarizing what I have learned across the years as a maintainer of open source.

It's _opinions_ (been there, done that), but I'm willing to defend them, having stewarded my share of successful open source projects.
speakerdeck.com/gaelvaroquau...

06.02.2025 20:31 — 👍 53    🔁 12    💬 3    📌 0

Yes I was going to ask for more details about LangChain given how popular it seems to be.

23.12.2024 01:51 — 👍 2    🔁 0    💬 0    📌 0

Thanks for the pointer!

06.12.2024 10:00 — 👍 1    🔁 0    💬 0    📌 0

Really? mamba still seems to be faster than conda, I might just need to update my conda :)

04.12.2024 23:51 — 👍 0    🔁 0    💬 1    📌 0

Never mind I saw someone asked the same question below :)

04.12.2024 23:49 — 👍 0    🔁 0    💬 0    📌 0

Why is it considered an anti-pattern to have global environments?

04.12.2024 22:10 — 👍 0    🔁 0    💬 1    📌 0
Good, published benchmarks of machine learning / data science are crucial.

But so hard.
Well-cited "SOTA" methods typically crash often. They tend to be very computationally expensive. Both make a systematic study impossible.

Finally, reviewers always ask for more methods, and more "SOTA".

01.12.2024 16:54 — 👍 52    🔁 6    💬 2    📌 0

Yes! I often do the same when I am in the debugger

29.11.2024 19:40 — 👍 1    🔁 0    💬 0    📌 0

Game on! 👾 for @scikit-learn.bsky.social
experts only: the ✨boss level✨ has arrived 🚀
For seasoned pros ready to master ML:
🔹 Custom algorithms
🔹 MLOps & deployment
🔹 Align ML with business projects
Be among the first to get certified! 👉https://eu1.hubs.ly/H0dZ18x0
#machinelearning #datascience

26.11.2024 07:56 — 👍 7    🔁 2    💬 0    📌 1

Ok release updates cannot be automatic :)

26.11.2024 17:36 — 👍 0    🔁 0    💬 0    📌 0

Wow this is nice, thanks a lot for sharing! This configures automatic updates as well? What about release updates? I find myself stuck with old Ubuntu releases...

26.11.2024 10:07 — 👍 0    🔁 0    💬 1    📌 0

Here’s a little script I made which I use to get a server up and running automatically (after you answer a few questions, including “what’s your name”) in just a few minutes.

You can even fully automate it with a few environment variables.
github.com/AnswerDotAI/...

26.11.2024 09:35 — 👍 41    🔁 3    💬 2    📌 0

👋

24.11.2024 13:00 — 👍 0    🔁 0    💬 0    📌 0

Totally agree. This is a great paper that everyone doing RL should read.

23.11.2024 19:45 — 👍 1    🔁 0    💬 0    📌 0

@albertcthomas is following 20 prominent accounts