Mozhdeh Gheini

@mgheini.bsky.social

USC Graduate Student | USC ISI NLP Researcher | 3x Apple Intern | Self-proclaimed Michelin 3-star Foodie | she/her

348 Followers  |  410 Following  |  6 Posts  |  Joined: 12.11.2024

Posts by Mozhdeh Gheini (@mgheini.bsky.social)

I must also add that I’m assuming there’s no breakthrough architecture/pre-training/post-training method that pushes us to start everything from scratch. I’m simply asking about the decision factors in greenlighting such a full restart in the current status quo.

07.01.2025 02:47 | 👍 1  🔁 0  💬 0  📌 0

Are there any good pointers on when/why one would decide to run pre-training from scratch (and follow it with post-training ofc) to create a fresh LLM? Is it simply about shifting the knowledge cutoff or more than that? Do we know how/if that happens nowadays? What are the deciding factors?

07.01.2025 02:40 | 👍 0  🔁 0  💬 1  📌 0

i was annoyed at having many chrome tabs with PDF papers having uninformative titles, so i created a small chrome extension to fix it.

i've been using it for a while now, and it works well.

today i put it on github. enjoy.

github.com/yoavg/pdf-ta...

05.01.2025 22:22 | 👍 98  🔁 22  💬 5  📌 1

Given how bad I am at it, it’s out of my league too; still fun though 😅

06.12.2024 05:54 | 👍 1  🔁 0  💬 0  📌 0

Were you doing the NYT’s crossword? That’s how it happened for me. Also, if you want a bonus one, “doe” :)

05.12.2024 01:33 | 👍 1  🔁 0  💬 1  📌 0

f’ as in fine-tuned from f, not the derivative of f 😅

03.12.2024 04:53 | 👍 1  🔁 0  💬 0  📌 0

I got confused there too. Maybe something like “further condition the model’s output” (instead of “update the model”)?
So if the model is f(x), before the dashed line it’s f’(x), and after that it’s f(x|prompt/context).

03.12.2024 04:49 | 👍 2  🔁 0  💬 1  📌 0

USC NLP folks are on Bluesky!
Follow my amazing colleagues here

go.bsky.app/KUwSZ6W

12.11.2024 17:44 | 👍 17  🔁 5  💬 3  📌 2