David Hall @dlwh - Bluesky Profile

I think a lot of federal money is tied to accreditation like Pell grants and research funds and stuff. So while Harvard has lots of money in the endowment, it would still be a pretty big hit to the budget.

09.07.2025 23:15 — 👍 6 🔁 0 💬 1 📌 0

Many thanks to the Google TPU Research Cloud program for providing the much needed compute for this project, and to all the other great open efforts: @ai2.bsky.social @eleutherai.bsky.social and more!

19.05.2025 19:51 — 👍 2 🔁 0 💬 0 📌 0

Introducing Marin: An Open Lab for Building Foundation Models
Open-source software is a success story: It powers the world’s digital infrastructure. It allows anyone in the world to contribute based on merit. It leads to greater innovation, collaboration, and se...

You can read more in our:

- Website: marin.community
- GitHub: github.com/marin-commun...
- Discord: discord.gg/J9CTk7pqcM
- Documentation: marin.readthedocs.io
- Announcement: marin.community/blog/2025/05/1

19.05.2025 19:51 — 👍 1 🔁 0 💬 1 📌 0

Diagram of the Datashop: a prompt or sample data comes in, an LLM finds more data, a cheap model is trained to find even more, and training on the result produces an LLM.

Have a specific use case? Come to our Datashop to curate data and train models.
Here’s how we curated more math data:
github.com/marin-commun...
Check out the data:
marin.community/data-browser/

19.05.2025 19:51 — 👍 1 🔁 0 💬 1 📌 0
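
The Datashop loop described in the post above (seed data in, an LLM finds more, a cheap model scales the search) can be illustrated with a toy sketch. This is not Marin's code: `llm_label` is a hypothetical keyword stub standing in for a real LLM call, the word-level log-odds scorer is just one possible "cheap model", and every function name here is made up for illustration. The final step, training a model on the curated data, is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; good enough for a toy example."""
    return text.lower().split()


def llm_label(doc):
    """Hypothetical stand-in for an LLM judgment: 1 if the doc looks like math."""
    return int(any(word in doc.lower() for word in ("theorem", "equation", "integral")))


def train_cheap_classifier(docs, labels):
    """Fit per-word log-odds with add-one smoothing (Naive Bayes style)."""
    pos, neg = Counter(), Counter()
    for doc, label in zip(docs, labels):
        (pos if label else neg).update(tokenize(doc))
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + len(vocab)
    neg_total = sum(neg.values()) + len(vocab)
    return {
        word: math.log((pos[word] + 1) / pos_total) - math.log((neg[word] + 1) / neg_total)
        for word in vocab
    }


def score(weights, doc):
    """Sum of word log-odds; unseen words contribute nothing."""
    return sum(weights.get(word, 0.0) for word in tokenize(doc))


def curate(pool, sample_size, threshold=0.0):
    """Label a small sample with the 'LLM', then filter the whole pool cheaply."""
    sample = pool[:sample_size]
    labels = [llm_label(doc) for doc in sample]
    weights = train_cheap_classifier(sample, labels)
    return [doc for doc in pool if score(weights, doc) > threshold]
```

The point of the design is cost: the expensive judge only sees `sample_size` documents, while the cheap scorer runs over the full pool.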

Pareto frontier of FLOPs vs. bits-per-byte

Have a new algorithm for training? Choose your compute budget and get on the speedrun leaderboard: how fast can you drive down validation loss?
marin.community/speedrun/

19.05.2025 19:51 — 👍 0 🔁 0 💬 1 📌 0
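
For readers unfamiliar with the chart in the post above, here is a minimal sketch of how a FLOPs vs. bits-per-byte Pareto frontier could be computed. It is not the leaderboard's actual code: the `Run` type, the function names, and the run values used with it are assumptions for illustration. The bits-per-byte helper uses the standard conversion from a per-token loss in nats.

```python
import math
from typing import NamedTuple


class Run(NamedTuple):
    name: str
    flops: float          # total training compute
    bits_per_byte: float  # validation loss; lower is better


def bits_per_byte(nats_per_token, num_tokens, num_bytes):
    """Convert a mean per-token loss in nats to bits per byte of text."""
    return nats_per_token / math.log(2) * num_tokens / num_bytes


def pareto_frontier(runs):
    """Keep runs that no cheaper-or-equal run matches or beats on loss."""
    frontier = []
    best = float("inf")
    # Sort by compute, breaking ties by loss so the best run at a budget comes first.
    for run in sorted(runs, key=lambda r: (r.flops, r.bits_per_byte)):
        if run.bits_per_byte < best:
            frontier.append(run)
            best = run.bits_per_byte
    return frontier
```

A run stays on the frontier only if it reaches a strictly lower bits-per-byte than every run that used the same or less compute, which is what "how fast can you drive down validation loss" rewards.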

Flowchart showing GitHub issue (preregistration) -> pull request (experiment.py) -> execution (watch it live) -> WandB report (analysis)

Marin (marin.community) repurposes GitHub, which has been successful for open-source *software*, for AI:
1. Preregister an experiment as a GitHub issue
2. Submit a PR, which implements the experiment in code
3. PR is reviewed by experts in the community
4. Watch the execution of the experiment live!

19.05.2025 19:51 — 👍 0 🔁 0 💬 1 📌 0

open weights vs open source (weights + code + recipe) vs open development (+ process, anyone can contribute)

Marin is a new "open lab" for developing foundation models. More than open weights, and even open source, with Marin we're committing to "open development": everything is documented and traceable, and anyone can contribute.

19.05.2025 19:51 — 👍 1 🔁 0 💬 1 📌 0

Learn more about the project in Percy's blog post: marin.community/blog/2025/05...

And about the models we are releasing in @dlwh.bsky.social's training retro: marin.readthedocs.io/en/latest/re...

19.05.2025 19:11 — 👍 0 🔁 1 💬 0 📌 0

Percy Liang on X: "What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision: https://t.co/racsvmhyA3"

Super excited Marin is finally out! Come see what we've been building! Code/platform for training fully reproducible models end-to-end, from data to evals. Plus a new high-quality 8B base model. Percy did a good job explaining it on the other place. marin.community

x.com/percyliang/s...

19.05.2025 19:35 — 👍 19 🔁 6 💬 1 📌 0
