@schumann.bsky.social
Natural Language Processing PhD Student @ Heidelberg University. https://schumann.pub #NLP #NLProc #ML #AI
Same boat as your AC
02.03.2025 11:13

Could you add me please?
14.01.2025 18:31

CBOW vs. Skip-gram
20.12.2024 11:59

Great work! Are you going to release the models?
14.12.2024 11:16

A starter pack for #NLP #NLProc researchers!
go.bsky.app/SngwGeS

#EMNLP has a nice set of tokenization/subword modeling papers this year.
It's a good mix of tokenization algorithms, tokenization evaluation, tokenization-free methods, and subword embedding probing. Lmk if I missed some!
Here is a list with links + presentation time (in chronological order).

First time ML/NLP Bluesky feels alive.
07.11.2024 21:39

This helped a lot!
07.11.2024 21:27

I make sure to even delete paths with my username from code in supplementary material.
05.01.2024 15:49
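
A minimal sketch of that kind of pre-submission check, assuming a hypothetical supplementary/ folder and a placeholder username (neither comes from the post):

# Scan supplementary material for a username before uploading.
# "supplementary" and USERNAME are placeholders, not from the post.
import pathlib

USERNAME = "jdoe"

for path in pathlib.Path("supplementary").rglob("*"):
    if not path.is_file():
        continue
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        continue
    for lineno, line in enumerate(text.splitlines(), 1):
        if USERNAME in line:
            # Report every occurrence so the offending path can be scrubbed.
            print(f"{path}:{lineno}: {line.strip()}")
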
TIL that the ACL Wiki has/had a state-of-the-art overview:
aclweb.org/aclwiki/Stat...

It also works with Flash Attention 2, although I don't see additional speedups. I don't think FA is optimized for generation.
13.10.2023 11:35
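
A minimal sketch (not from the thread) of enabling Flash Attention 2 when loading a model in recent transformers releases; the checkpoint name is only an example, and older releases used use_flash_attention_2=True instead of attn_implementation:

# Sketch: load a causal LM with Flash Attention 2 enabled.
# Requires the flash-attn package and a supported GPU.
# "meta-llama/Llama-2-7b-hf" is just an example checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,                # FA2 requires fp16 or bf16
    attn_implementation="flash_attention_2",  # needs flash-attn installed
).to("cuda")
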
Conceptually it is clear that this works, but I wasn't aware that huggingface passes this through correctly.
GitHub Gist to reproduce:
gist.github.com/raphael-sch/...
You have to place the padding tokens in between the prefill and input tokens (example with 3 prefilled tokens):
input_ids: [0, 0, X, X, X, X]
position_ids: [0, 0, 3, 4, 5, 6]
attn_mask: [1, 1, 1, 0, 0, 1, 1, 1, 1]
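
The gist link above is truncated, so here is an independent minimal sketch of the same idea (not the author's gist; GPT-2 and the prompts are stand-ins): prefill the shared prefix once, left-pad the per-instance input, build attention_mask and position_ids as in the arrays above, and compare against the plain unpadded forward pass:

# Independent sketch of the setup above (not the linked gist).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prefix_ids = tok("The shared system prompt", return_tensors="pt").input_ids
input_ids = tok(" and the per-instance input", return_tensors="pt").input_ids
n_prefix, n_pad, n_in = prefix_ids.shape[1], 2, input_ids.shape[1]

with torch.no_grad():
    # 1) Prefill the shared prefix once and keep its KV cache.
    past = model(prefix_ids, use_cache=True).past_key_values

    # 2) Left-pad the input; from the attention mask's point of view the
    #    padding sits between the cached prefix and the real input tokens.
    padded_ids = torch.cat(
        [torch.full((1, n_pad), tok.pad_token_id), input_ids], dim=1
    )
    attn_mask = torch.cat(
        [
            torch.ones(1, n_prefix, dtype=torch.long),   # cached prefix
            torch.zeros(1, n_pad, dtype=torch.long),     # padding (masked out)
            torch.ones(1, n_in, dtype=torch.long),       # real input
        ],
        dim=1,
    )
    # Positions skip the padding and continue right after the prefix.
    position_ids = torch.cat(
        [
            torch.zeros(1, n_pad, dtype=torch.long),
            torch.arange(n_prefix, n_prefix + n_in).unsqueeze(0),
        ],
        dim=1,
    )

    out = model(
        padded_ids,
        attention_mask=attn_mask,
        position_ids=position_ids,
        past_key_values=past,
    )

    # 3) Sanity check against the unpadded, non-prefilled forward pass:
    #    the last-token logits should agree up to float noise.
    ref = model(torch.cat([prefix_ids, input_ids], dim=1))
    print((out.logits[:, -1] - ref.logits[:, -1]).abs().max())
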

Turns out that with the right attention_mask and position_ids you can prefill tokens AND pad batches in huggingface transformers. This speeds up inference, especially if each instance has the same system prompt prepended. Code below.
13.10.2023 11:34