Lisa Bylinina

@bylinina.bsky.social

linguist bylinina.github.io

237 Followers 189 Following 25 Posts Joined Nov 2024
4 months ago

there is this guy on my flight to shanghai sitting next to me checking out the emnlp program and a bunch of papers and his bsky feed is 90% emnlp stuff and on the one hand it would be nice to chat but on the other hand it’s a 12-hour flight so maybe it’s better if i focus on my netflix downloads..

4 0 0 0
4 months ago

I will be attending EMNLP in China to present our paper with @bylinina.bsky.social (who will be in China, too) and Jakub Dotlacil in the BabyLM workshop! Looking forward to meeting people there! ✨ 😊 #EMNLP2025 @emnlpmeeting.bsky.social

lnkd.in/e-Bzz6De

12 3 1 0
5 months ago

oh super-interesting

1 0 0 0
5 months ago

who'll be at emnlp?

0 0 0 0
10 months ago

got a tiny (approx 50k) grant from NWO to do something about whether (instruction-tuned) lms are an 'agent', a superposition of agents, what's going on there epistemically and also how people interact with these 'personae' -- we'll seeeeee www.nwo.nl/en/researchp...

7 1 0 0
10 months ago
NSF Grant Termination Information Collection Form

Please use this form to submit information identifying specific NSF grants that have been cancelled for any reason after January 20, 2025.


We are tracking these grants to increase transparency, organize affected PIs, and facilitate responses, including via litigation. Please share the form as widely as possible with your networks. 


We are actively building a pipeline to organize these terminations and will soon have a tracker akin to our NIH grant tracker at https://airtable.com/appjhyo9NTvJLocRy/shrNto1NNp9eJlgpA


WE WILL NOT DISCLOSE THE IDENTITY OF ANYONE WHO USES THIS FORM TO PROVIDE INFORMATION. We will keep your identity confidential.


These resources are maintained by Noam Ross of rOpenSci and Scott Delaney of the Harvard T.H. Chan School of Public Health, with input and support from additional volunteers. For any questions, please contact Scott Delaney on Signal (sdelaney.84).


THANK YOU FOR YOUR ASSISTANCE!

🚨Report your NSF grant terminations! 🚨

We are starting to collect information on NSF grant terminations to create a shared resource as we have for NIH. The more information we collect, the more we can organize, advocate, and fight back! Please share widely!

airtable.com/appGKlSVeXni...

640 664 7 50
10 months ago
Cutting international bachelor programs threatens psychological science » Eiko Fried Two days ago, four Dutch universities announced discontinuing their English-speaking psychology bachelor programs (1, 2). I will briefly explain (1) how this decision came to be, (2) why this is such ...

Four large Dutch universities, including Leiden University where I work, have decided to throw international psychology bachelor programs under the bus in an effort to appease the rightwing government.

Here's my blog why this is a terrible idea.

eiko-fried.com/cutting-inte...

229 102 10 11
10 months ago

i just need students to see the difference between base and instruction-tuned models trying out different types of prefixes, without them needing to write any code or send their info anywhere

0 0 0 0
10 months ago

do we know a pair of base vs. instruct models that are both deployed by an inference provider on hf (or maybe a hf space but less preferable..) AND that don't require students sending their info for the license agreement?

0 0 2 0
11 months ago
Going beyond open data – increasing transparency and trust in language models with OLMoTrace | Ai2 OLMoTrace lets you trace the outputs of language models back to their full, multi-trillion-token training data in real time.

oh wow ok allenai.org/blog/olmotrace

7 1 1 0
11 months ago

i mean i'd be really surprised if what lms generate as 'reasoning' text faithfully reflected the ways they come up with the answer. like, what would guarantee that

6 0 1 0
11 months ago

nice!!

1 0 0 0
11 months ago
from minicons import scorer
from nltk.tokenize import TweetTokenizer

lm = scorer.IncrementalLMScorer("gpt2")

# your own tokenizer function that returns a list of words
# given some sentence input
word_tokenizer = TweetTokenizer().tokenize

# word scoring
lm.word_score_tokenized(
    ["I was a matron in France", "I was a mat in France"], 
    bos_token=True, # needed for GPT-2/Pythia and NOT needed for others
    tokenize_function=word_tokenizer,
    bow_correction=True, # Oh and Schuler correction
    surprisal=True,
    base_two=True
)

'''
First word = -log_2 P(word | <beginning of text>)

[[('I', 6.1522440910339355),
  ('was', 4.033324718475342),
  ('a', 4.879510402679443),
  ('matron', 17.611848831176758),
  ('in', 2.5804288387298584),
  ('France', 9.036953926086426)],
 [('I', 6.1522440910339355),
  ('was', 4.033324718475342),
  ('a', 4.879510402679443),
  ('mat', 19.385351181030273),
  ('in', 6.76780366897583),
  ('France', 10.574726104736328)]]
'''

another day another minicons update (potentially a significant one for psycholinguists?)

"Word" scoring is now a thing! You just have to supply your own splitting function!

pip install -U minicons for merriment
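
A minimal sketch of the idea behind word-level scoring, assuming a word's surprisal is simply the sum of the surprisals of the subword tokens that spell it. The function name `aggregate_word_surprisal` and the toy probabilities are hypothetical and are not taken from minicons; the sketch also leaves out the correction that `bow_correction=True` applies in the snippet above.

```python
import math
from typing import List, Tuple


def aggregate_word_surprisal(
    words: List[str],
    tokens: List[Tuple[str, float]],
    base_two: bool = True,
) -> List[Tuple[str, float]]:
    """Sum token-level surprisals into word-level surprisals.

    `tokens` holds (piece, natural-log probability) pairs in order.
    Whitespace around a piece is ignored when matching it to a word.
    """
    # Dividing a natural-log value by ln(2) converts nats to bits.
    scale = math.log(2) if base_two else 1.0
    scored = []
    i = 0
    for word in words:
        built, total = "", 0.0
        while built != word:
            if i >= len(tokens):
                raise ValueError(f"ran out of tokens while building {word!r}")
            piece, logprob = tokens[i]
            built += piece.strip()
            total += -logprob / scale
            i += 1
            if not word.startswith(built):
                raise ValueError(f"tokens do not spell out {word!r}")
        scored.append((word, total))
    return scored


# Toy, made-up probabilities: "matron" is split into two pieces.
toy_tokens = [
    ("a", math.log(0.5)),
    (" mat", math.log(0.25)),
    ("ron", math.log(0.125)),
]
print(aggregate_word_surprisal(["a", "matron"], toy_tokens))
```

With these toy numbers "a" comes out at about 1 bit and "matron" at about 5 bits (2 + 3), which is why a word split into several pieces can look much more surprising than its neighbours.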

21 7 3 0
11 months ago

ah that's great, makes a lot of things much faster to try out!

1 0 0 0
11 months ago

or what happened!

0 0 0 0
11 months ago

you have to tell me which starter pack i apparently suddenly ended up in

1 0 1 0
11 months ago

Sounds familiar

1 0 0 0
1 year ago

the tiny books have arrived

27 0 1 0
1 year ago

Forthcoming titles in Elements in Semantics: 1. Abzianidze, @bylinina.bsky.social, Paperno: Deep Learning and Semantics. 2. K. Davidson: Semantics of Depiction. 3. Chatzikyriakidis, Cooper, Gregoromichelaki, Sutton: Types and the structure of meaning: Issues in compositional and lexical semantics
(2/3)

3 2 0 0
1 year ago

waluigi!!

1 0 0 0
1 year ago

all invitations i find in my inbox are actually invitations to work a bit more

4 0 0 0
1 year ago

yeah it's super-interesting to me somehow suddenly which i didn't expect and i don't know what to do with it but i'll just be curious about it i guess

1 0 1 0
1 year ago

this is so cool - it's the 2nd time i see this thread and again i think how cool it is. you know why? well for obv reasons but also bc i've been thinking recently about how the linguistic will of one person or group of people (prescriptive organizations but not necessarily) can do things to language

1 0 1 0
1 year ago

... buying out research time with grant budgets -- most likely gone. maybe that's just the reality of an assistant prof position (and up), maybe also amplified by budget cuts -- but is that it? am i just going to be talking most of the time rather than doing anything? depressing really

2 0 0 0
1 year ago

in order to actually do smth in research directions i'm interested in i need some bandwidth: research time, phd students to work with, experiment budgets. in nl it's getting more and more complicated (for obv reasons): some ways to get phd students are frozen, some grants not announced anymore..

3 0 1 0
1 year ago
Semantics and Deep Learning Cambridge Core - Philosophy of Mind and Language - Semantics and Deep Learning

this thing is coming out soon btw! www.cambridge.org/core/element...

3 0 0 0
1 year ago

meeee!

2 0 0 0
1 year ago

nah i wasn’t serious

0 0 0 0
1 year ago

thx!! now i’m annoyed i’m not in it

0 0 1 0
1 year ago

linguists? computational linguists? nlp people? semantics people? anybody?

8 1 3 0