Hi Jan-Willem, thanks, could you add me?
20.11.2024 15:24
Similarly, I can dump the results of an intermediate step to a file, with a helper function like this:
def dump(filename):
    def wrapper(df):
        df.to_parquet(filename)
        return df
    return wrapper
And so on! If you haven't used .pipe before, give it a try, it's nice!
20.11.2024 11:28
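A minimal sketch of how the dump helper above might be used in the middle of a chain. To keep it runnable without a parquet engine installed, this variant writes CSV via to_csv instead of to_parquet; that swap, the name dump_csv, the toy data and the column names are all my assumptions for illustration.

```python
import os
import tempfile

import pandas as pd


def dump_csv(filename):
    """CSV variant of dump(): write the frame to a file, then pass it through unchanged."""
    def wrapper(df):
        df.to_csv(filename, index=False)
        return df
    return wrapper


# Hypothetical toy data, made up for this example.
df = pd.DataFrame({"group": ["a", "a", "b"], "value": [1, 2, 3]})
path = os.path.join(tempfile.mkdtemp(), "intermediate.csv")

total = (
    df.query("value > 1")
    .pipe(dump_csv(path))  # snapshot of the filtered frame, chain continues
    .groupby("group")["value"]
    .sum()
)

# The file holds the intermediate (filtered, not yet aggregated) frame.
snapshot = pd.read_csv(path)
```

The chain result and the file on disk are independent: total is the aggregated series, while snapshot is whatever the frame looked like at the point where .pipe was called.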
Now I wrote a small helper function
def assrt(condition):
    def wrapper(df):
        assert condition(df)
        return df
    return wrapper
and use this function with a .pipe method:
df.assign(…).query(…).pipe(assrt(lambda _: not _['column'].isna().any())).groupby(…)…
20.11.2024 11:28
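Here is a small runnable sketch of the assrt helper inside a full chain. The toy data, the column names and the extra doubled column are assumptions made up for this example; only assrt itself comes from the post above.

```python
import pandas as pd


def assrt(condition):
    """Return a function for .pipe that asserts condition(df) and passes df through."""
    def wrapper(df):
        assert condition(df)
        return df
    return wrapper


# Hypothetical toy data, made up for this example.
df = pd.DataFrame({"group": ["a", "a", "b"], "column": [1.0, 2.0, 3.0]})

result = (
    df.assign(doubled=lambda d: d["column"] * 2)
    .query("doubled > 0")
    .pipe(assrt(lambda d: not d["column"].isna().any()))  # fails here if NaNs exist
    .groupby("group")["doubled"]
    .sum()
)

# The same check on a frame that does contain a NaN raises AssertionError.
bad = pd.DataFrame({"group": ["a"], "column": [float("nan")]})
try:
    bad.pipe(assrt(lambda d: not d["column"].isna().any()))
    raised = False
except AssertionError:
    raised = True
```

One thing to keep in mind: Python drops assert statements when run with the -O flag, so if the check has to survive optimized runs, raising an explicit exception inside wrapper may be the safer choice.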
Assume I want to make sure that at some intermediate step I do not have NaNs in a column, and give an error otherwise. Previously, I would break the method chain, assign that intermediate result to a variable, add an assert on that variable, and then continue the chain. Not nice.
20.11.2024 11:28
I like pandas method chaining and my code usually looks like this:
df.assign(…).query(…).groupby(…)[['some', 'var']].sum().sort_values(…).iloc[:10].mean()
The problem with this approach is that it is not easy to get access to the results of intermediate steps. Recently I stumbled upon a solution!
20.11.2024 11:28
Hi there! I am a mathematician, ML researcher and educator, currently working at Radboud University, Nijmegen, The Netherlands. I apply ML and some mathematics to condensed matter physics (Neural Quantum States and friends). I also teach Scientific Computing at Constructor University, Bremen.
20.11.2024 11:07
Senior Software Engineer at Bloomberg using C++
Content: https://youtube.com/c/chshersh
All opinions are my own.
https://unireps.org
Discover why, when and how distinct learning processes yield similar representations, and the degree to which these can be unified.
the internet can still be fun!
https://onemillioncheckboxes.com • http://eieio.games • https://everyuuid.com • https://onemillionchessboards.com
Austin Powered. OpenStack co-founder, OpenInfra Foundation COO, ex Rackspace & Yahoo! open source for fun & profit.
Open Source AI early and often
@sparkycollier on twitter and elsewhere
Links: markcollier.me
GPU Poor @ Hugging Face | F1 fan
Deployed by thousands.
Backed by one of the most active open source communities in the world.
#OpenStack is a set of open source software components that provide common services for cloud infra.
Supported by @openinfra.org
The steward of the Open Source Definition, the foundation of the modern software ecosystem. We build a world where the freedoms and opportunities of Open Source software can be enjoyed by all. #OpenSource
I make sure that OpenAI et al. aren't the only people who are able to study large scale AI systems.
Non-profit, Open Source Computer Vision since 2000. OpenCV.org
proud Mediterranean open-sourceress at hugging face: multimodality, zero-shot vision, vision language models, transformers
Previously CTO, Greywing (YC W21). Building something new at the moment.
Writes at https://olickel.com
Open source, open science, AI in science for earth/ice and healthcare. IPython creator, @projectjupyter.bsky.social and 2i2c.org co-founder.
Prof @ UC Berkeley Stats, director of @ucbids.bsky.social, co-director @schmidtdse.bsky.social; LBL scientist.
Also an architect, GIS enthusiast, sailor.
Working on fully open-source LLMs and training data. We believe in community-owned AI.
https://www.llm360.ai
Keep it simple, make it scale. AI should be about empowering users and building understanding. AI Developer Experience @ Google DeepMind, ex-Github, ex-Google
I like tokens! Lead for OLMo data at @ai2.bsky.social (Dolma) w @kylelo.bsky.social. Open source is fun. Opinions are sampled from my own stochastic parrot
more at https://soldaini.net