@flaviucipcigan.bsky.social
Building AIs for scientific discovery. Discovered antibiotics and materials for carbon capture. Tango dancer. See more at flaviucipcigan.com. Opinions my own.
there's a bunch of scripts that can migrate old posts through the api, such as this one: github.com/marcomaroni-...
i don't know if timestamps would migrate, but I've seen folks with posts whose timestamps are older than the start of Bluesky
Super interesting application of program search
Goals are mapped to programs which are embedded in a latent space.
A fitness metric is assigned to the programs and program search is done to synthesise new human-like goals.
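Roughly the flavour of it, as a toy sketch with made-up primitives and a hand-rolled fitness function (the real work embeds programs in a latent space and scores human-likeness there):

```python
import random

# Toy sketch, not the paper's method: goals are small "programs" built from
# hypothetical primitives, scored by a stand-in fitness function, and improved
# by a simple mutation-based search.
PRIMITIVES = ["reach(object)", "stack(a, b)", "avoid(zone)", "collect(item)"]

def random_program(max_len=4):
    """Sample a toy goal program as a short list of primitive actions."""
    return [random.choice(PRIMITIVES) for _ in range(random.randint(1, max_len))]

def mutate(program):
    """Propose a neighbouring program by swapping one primitive."""
    child = list(program)
    child[random.randrange(len(child))] = random.choice(PRIMITIVES)
    return child

def fitness(program):
    """Stand-in fitness: reward diverse, mid-length programs.
    A real system would score human-likeness, e.g. with a learned model
    over program embeddings."""
    return len(set(program)) - 0.1 * abs(len(program) - 3)

def program_search(generations=50, population=20):
    """Keep the fittest half of the pool and refill it with mutants."""
    pool = [random_program() for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=fitness, reverse=True)
        survivors = pool[: population // 2]
        pool = survivors + [mutate(random.choice(survivors)) for _ in survivors]
    return max(pool, key=fitness)

print(program_search())
```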
Thanks! Not sure, I'll try it.
21.02.2025 08:30

Using GFlowNets to discover new materials for carbon capture
20.02.2025 21:09

One of my big motivations is accelerating science with AI.
Every discovery project had a beautiful aha moment, such as the structure of antibiotics emerging in the latent space of a model or a GFlowNet proposing new carbon capture materials.
Here are some of the threads I've written on this topic.
{'lol': ['5.0E6', '5.0e6', '5.E6', '5.e6', '5E6', '5e6', 5e-06, 5e-06, 5e-06, 5e-06, '5E-6', '5e-6', 5000000.0, 5000000.0, 5000000.0, 5000000.0, '5E+6', '5e+6']}
Wanna try to guess which of those gets parsed as a string and which as a number? Answer in alt text.
YAML parsing in Python is weird.
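A small repro with PyYAML; the input list below is my guess at the original YAML, but safe_load on it reproduces the mix above:

```python
import yaml  # PyYAML, which implements the YAML 1.1 spec

# A guess at the original input: every spelling of 5e6, 5e-6 and 5e+6.
doc = """
lol: [5.0E6, 5.0e6, 5.E6, 5.e6, 5E6, 5e6,
      5.0E-6, 5.0e-6, 5.E-6, 5.e-6, 5E-6, 5e-6,
      5.0E+6, 5.0e+6, 5.E+6, 5.e+6, 5E+6, 5e+6]
"""

# YAML 1.1 floats need a dot in the mantissa AND a sign on the exponent,
# so only spellings with both (e.g. 5.0e+6, 5.e-6) come back as numbers;
# everything else (5e6, 5E-6, ...) stays a string.
print(yaml.safe_load(doc))
```

Under the YAML 1.2 core schema all of these would parse as numbers; PyYAML sticks to 1.1.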
Interesting idea: generating responses using diffusion rather than left-to-right auto-regressive models.
17.02.2025 12:31

From here: https://www.youtube.com/watch?v=dZQ7x0-MZcI
Supercomputers - large computer clusters - allow you to work a number of years ahead.
Creating the GUI at PARC seemed like a "waste of FLOPs" but revolutionized computing.
Where do large compute clusters come into play in this case?
Alan Kay talked about playing the "Wayne Gretzky game", named after the hockey player famous for his quote about skating to where the puck will be.
Similarly, the benchmark scores of a model with a given number of parameters increase each generation due to better data and training algorithms, caveated by dataset leakage.
15.02.2025 12:56

For each generation, at a fixed parameter count, the time to train and run inference decreases due to hardware and software advances, like flash attention and multi-head latent attention.
At each generation, models with a larger and larger number of parameters can be run locally.
My first computer used a processor in the Intel 8086 generation, which had about 29k transistors.
Today, an Apple M4 has 28B transistors, meaning I experienced a scale-up of 1,000,000x in my lifetime.
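A quick back-of-the-envelope check of that ratio:

```python
# Approximate transistor counts, sanity-checking the ~1,000,000x claim.
intel_8086 = 29_000            # ~29k transistors
apple_m4 = 28_000_000_000      # ~28B transistors
print(f"{apple_m4 / intel_8086:,.0f}x")  # ~965,517x, i.e. roughly a million-fold
```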
I expect a similar scale-up for language models.
What is large for a language model? Is it 400B, 70B or maybe 1T?
I think focus on raw number of parameters is a less useful frame than thinking about inference speed, cost and location of inference (on-device vs cloud).
Follow where curiosity leads. It's the most durable source of motivation in research.
13.02.2025 18:20

More open reasoning datasets and distilled models.
It's great to see the energy unleashed in the community by open models that generate chains of thought!
ColabFit Exchange is another great dataset curation effort that I'd like to boost.
Great work by @stemartiniani.bsky.social and team on curating the most diverse materials database in the world!
Neat idea! Fine-tuning on majority-voted, length-filtered outputs extends a model's capabilities.
Models generalise to slightly harder versions of a problem, and the correct answers are used to bootstrap the next model and the next one and so on.
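Something like this loop, sketched with a made-up data format (a list of (chain-of-thought, final answer) samples per problem); not the paper's actual recipe:

```python
from collections import Counter

def majority_vote_filter(problem, samples, max_words=512):
    """Toy sketch: keep chains of thought whose final answer matches the
    majority answer and which stay under a length budget (words as a crude
    proxy for tokens). `samples` is a list of (chain_of_thought, answer)
    pairs from the current model; the kept pairs become fine-tuning data
    for the next model generation."""
    majority_answer, _ = Counter(answer for _, answer in samples).most_common(1)[0]
    return [
        {"prompt": problem, "completion": cot}
        for cot, answer in samples
        if answer == majority_answer and len(cot.split()) <= max_words
    ]

# Made-up model outputs for illustration
samples = [
    ("... therefore the result is 42", "42"),
    ("... so it must be 41", "41"),
    ("a shorter derivation also giving 42", "42"),
]
print(majority_vote_filter("What is 6 * 7?", samples))
```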
Link to the initial data, more to come.
13.02.2025 10:18