@siddheshp.bsky.social

Grad Student; Into Multilingual NLP

48 Followers  |  634 Following  |  3 Posts  |  Joined: 11.11.2024

Latest posts by siddheshp.bsky.social on Bluesky

Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge The measurement tasks involved in evaluating generative AI (GenAI) systems lack sufficient scientific rigor, leading to what has been described as "a tangle of sloppy tests [and] apples-to-oranges com...

Check out the camera-ready version of our ICML position paper ("Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge") to learn more!!! arxiv.org/abs/2502.00561

(6/6)

15.06.2025 00:20 | 👍 42    🔁 10    💬 3    📌 0

i mean, people have different goals, and if you cared about some niche aspect of query-focused multi-doc summarization before, it is legit to continue. or you can switch focus and start thinking of HCI. the second became much more possible now, the first maybe hasn't.

17.12.2024 17:16 | 👍 3    🔁 1    💬 1    📌 0

I wonder if people have suggestions about which parts of writing could be complemented by AI without compromising thinking, or where AI could help organize thoughts better: making arguments stronger, reviewing, generating ideas about structure?

12.12.2024 17:00 | 👍 1    🔁 0    💬 0    📌 0

🌶️(?) take: Agents are somehow hot right now because people realized that LLM output can be interpreted as a DSL which directs side effects in the world (e.g. tool calls) rather than just returning text in a chat/autocomplete sense. What are the open challenges? A 🧵... [1/11]

19.11.2024 09:32 | 👍 168    🔁 30    💬 9    📌 7

#EMNLP has a nice set of tokenization/subword modeling papers this year.

It's a good mix of tokenization algorithms, tokenization evaluation, tokenization-free methods, and subword embedding probing. Lmk if I missed some!

Here is a list with links + presentation time (in chronological order).

11.11.2024 22:38 | 👍 48    🔁 16    💬 5    📌 2

Tagging my co-authors as I find them:
@iaugenstein.bsky.social @rnv.bsky.social

11.11.2024 09:58 | 👍 0    🔁 0    💬 0    📌 0

Survey of Cultural Awareness in Language Models: Text and Beyond Large-scale deployment of large language models (LLMs) in various applications, such as chatbots and virtual assistants, requires LLMs to be culturally sensitive to the user to ensure inclusivity. Cul...

We are excited to share our comprehensive survey on cultural awareness in #LLMs! 🗺️ [Was posted on X a few days before]
We reviewed 300+ papers across diverse modalities (language, vision-language, etc.)
arxiv.org/abs/2411.00860

11.11.2024 09:57 | 👍 2    🔁 0    💬 1    📌 1

@siddheshp is following 19 prominent accounts