Delip Rao

@deliprao.bsky.social

Building. Affiliations: @JHU, @Penn, @UCSC, @Amazon, @Twitter || Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈

2,248 Followers  |  27 Following  |  11 Posts  |  Joined: 23.04.2023

Posts by Delip Rao (@deliprao.bsky.social)

Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social

07.11.2025 21:11 | 👍 35    🔁 18    💬 1    📌 4

Yeah, posting something this big for us 2 min before the weekend in the US and late in the evening in France is so not ideal right before a 4-day weekend here, lol. So we'll do it again and tell you guys much more... #TrainingTragedy
Tbh the only visual allegory possible is this...

07.11.2025 22:51 | 👍 7    🔁 6    💬 1    📌 0

Thank you for the interest in our work. Look forward to any feedback.

16.12.2024 19:15 | 👍 1    🔁 0    💬 0    📌 0
WithdrarXiv: A Large-Scale Dataset for Retraction Study Retractions play a vital role in maintaining scientific integrity, yet systematic studies of retractions in computer science and other STEM fields remain scarce. We present WithdrarXiv, the first larg...

😳 WithdrarXiv πŸ™

- Dataset of 14K+ withdrawn arXiv papers
- associated retraction comments
- entire history through 09/24
- taxonomy of retraction reasons, from critical errors to policy violations
- WithdrarXiv-SciFy, enriched version w/ scripts for parsed full-text PDFs

arxiv.org/abs/2412.03775

15.12.2024 18:34 | 👍 158    🔁 46    💬 5    📌 4
Juicy Research Ideas and How to Find them? How do people come up with research ideas in AI? Will the "AI Scientist" finally make me work full-time on my chicken farm?

Stumbled across this post on Substack by
@deliprao.bsky.social today that I really appreciated as someone trying to break into the field. Simple categorizations can seem trite at times, but they can be deceptively profound in breaking down complex problems.

substack.com/home/post/p-...

09.12.2024 01:04 | 👍 1    🔁 1    💬 2    📌 0

anyone on my TL can endorse me for cs.DL (digital libraries) on arXiv? 🙏

04.12.2024 22:56 | 👍 1    🔁 0    💬 0    📌 0

Releasing: a dataset of two million Bluesky posts.

This dataset has been collected using Bluesky's API, and I hope it will be useful for all the researchers out there!

27.11.2024 19:13 | 👍 475    🔁 54    💬 249    📌 136

Slack knows you have given up on the rest 😏

27.11.2024 18:47 | 👍 2    🔁 0    💬 1    📌 0

Nice crown molding

25.11.2024 15:20 | 👍 0    🔁 0    💬 0    📌 0

Are you rich enough to use compute as a noun?

23.11.2024 02:37 | 👍 0    🔁 0    💬 0    📌 0

May I propose beets

23.11.2024 02:35 | 👍 11    🔁 0    💬 0    📌 0

but you can run oogabooga

19.11.2024 16:17 | 👍 2    🔁 0    💬 0    📌 0

Did you just get your BlueSky invite? great! Now, help me complete my threads graph. 😘

https://www.threads.net/@delip.rao

06.07.2023 03:09 | 👍 0    🔁 0    💬 0    📌 0

Posts here are called beets. I don't make the rules.

28.04.2023 04:31 | 👍 4    🔁 0    💬 1    📌 0

get in loser

we're re-territorializing the hilbert space

28.04.2023 01:17 | 👍 14    🔁 4    💬 1    📌 0

New stage, new tune

28.04.2023 02:09 | 👍 0    🔁 0    💬 0    📌 0

Testing

25.04.2023 19:58 | 👍 1    🔁 0    💬 1    📌 0