🚨 Introducing our @tmlrorg.bsky.social paper "Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation"
We present UnLOK-VQA, a benchmark to evaluate unlearning in vision-and-language models, where both images and text may encode sensitive or private information.
07.05.2025 18:54
Flying to SG 🇸🇬 to attend #ICLR2025.
Check out our 3 papers:
- CREMA: Video-language + any-modality reasoning
- SAFREE: A training-free concept guard for any visual diffusion model
- SRDF: Human-level VL navigation via a self-refined data loop
Feel free to DM me to grab a coffee & city walk together!
22.04.2025 00:09
🚨 Real-world retrieval is messy: queries are ambiguous, and documents conflict or contain incorrect/irrelevant information. How can we jointly address these problems?
➡️ RAMDocs: a challenging dataset with ambiguity, misinformation, and noise
➡️ MADAM-RAG: a multi-agent framework that debates and aggregates evidence across sources
🧵⬇️
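As a loose illustration of the debate-and-aggregate idea behind MADAM-RAG (all names here are hypothetical, and plain majority voting stands in for the LLM agents' actual discussion):

```python
from collections import Counter

def debate_and_aggregate(query, documents, rounds=2):
    # Toy sketch: one "agent" per retrieved source, a debate phase in
    # which agents may adopt a clear majority view, and a final
    # aggregation step. The real framework uses LLM agents and richer
    # multi-round discussion; this only caricatures the control flow.
    answers = [doc["answer"] for doc in documents]
    for _ in range(rounds):
        consensus, count = Counter(answers).most_common(1)[0]
        if count > len(answers) / 2:          # a clear majority emerges
            answers = [consensus] * len(answers)
    return Counter(answers).most_common(1)[0][0]

# One source carries misinformation; aggregation suppresses it.
docs = [{"answer": "Paris"}, {"answer": "Paris"}, {"answer": "Lyon"}]
print(debate_and_aggregate("What is the capital of France?", docs))  # Paris
```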
18.04.2025 17:05
🚨 Announcing TaCQ 🚨 a new mixed-precision quantization method that identifies critical weights to preserve. We integrate key ideas from circuit discovery, model editing, and input attribution to improve low-bit quantization, retaining 96% of 16-bit accuracy at 3.1 average bits (~6x compression).
arxiv.org/abs/2504.07389
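As a rough sketch of the general idea, mixed-precision quantization with preserved critical weights can be written in a few lines of NumPy. Magnitude is used as the saliency criterion here purely as a stand-in for TaCQ's circuit/attribution-based scoring, and all names are hypothetical:

```python
import numpy as np

def mixed_precision_quantize(W, keep_frac=0.02, bits=3):
    """Keep the most 'critical' weights in full precision and round the
    rest to a low-bit uniform grid. Criticality here is plain magnitude,
    NOT the saliency measure used by TaCQ."""
    flat = np.abs(W).ravel()
    k = max(1, int(keep_frac * flat.size))
    thresh = np.partition(flat, -k)[-k]      # magnitude cutoff for top-k
    critical = np.abs(W) >= thresh

    # Uniform symmetric quantization for the non-critical weights.
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(W[~critical]).max() / levels
    Wq = np.round(W / scale) * scale
    return np.where(critical, W, Wq)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
Wq = mixed_precision_quantize(W)
# Quantized matrix has far fewer distinct values than the original.
print(np.unique(np.round(Wq, 6)).size < W.size)  # True
```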
12.04.2025 14:19
VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation
Thanks to all of my amazing co-authors from Adobe Research, UMich, and UNC:
Difan Liu (co-lead), @marstin.bsky.social (co-lead)
Yicong Hong, Yang Zhou, Hao Tan, Joyce Chai, @mohitbansal.bsky.social
Check out more details on our homepage/paper.
Website: veggie-gen.github.io
19.03.2025 18:56
We further find that VEGGIE shows emergent zero-shot multimodal instruction following and in-context video editing ability, which may facilitate a broader range of future applications.
19.03.2025 18:56
We project grounded queries into 2D space with PCA and t-SNE. We find that Reasoning and Grounding cluster together, while Color, Env, and Change are closely grouped. Addition aligns with Reasoning and Grounding, suggesting that addition involves semantic processing, while Removal is a more independent task.
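The PCA half of such a projection is a standard recipe; a minimal NumPy sketch (random vectors stand in for the actual grounded-query embeddings, and the t-SNE step is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for grounded task-query embeddings (n_queries x dim);
# in the paper these come from the model, not random noise.
X = rng.normal(size=(200, 64))

# PCA via SVD: center the data, decompose, keep the top-2 directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
coords_2d = Xc @ Vt[:2].T    # each query becomes a 2D point to plot

print(coords_2d.shape)  # (200, 2)
```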
19.03.2025 18:56
We evaluate 7 different models on VEG-Bench across 8 distinct editing skills. Overall, VEGGIE demonstrates the best performance among instructional video editing models.
19.03.2025 18:56
To further support our training, we also introduce a novel automatic instructional video data generation pipeline that lifts high-quality instructional image editing data into the video domain using image-to-video and video evaluation tools.
19.03.2025 18:56
VEGGIE first leverages an MLLM to interpret complex instructions and generate frame-wise conditions; a video diffusion model then renders these conditions in pixel space. The continuous, learnable task query embeddings enable end-to-end training and capture task representations.
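The two-stage flow described above (MLLM produces per-frame conditions, diffusion model renders them) can be sketched with placeholder classes; every name and shape below is hypothetical, not VEGGIE's actual API:

```python
class MLLMPlanner:
    """Stand-in for the MLLM that interprets the instruction."""
    def frame_conditions(self, instruction, frames):
        # In VEGGIE these are continuous, learnable task query
        # embeddings; here: one placeholder 8-dim vector per frame.
        return [[0.0] * 8 for _ in frames]

class VideoDiffusionModel:
    """Stand-in for the diffusion model that edits in pixel space."""
    def edit(self, frames, conditions):
        return [f"edited({f})" for f in frames]

frames = ["frame0", "frame1"]
cond = MLLMPlanner().frame_conditions("add a hat to the dog", frames)
out = VideoDiffusionModel().edit(frames, cond)
print(out)  # ['edited(frame0)', 'edited(frame1)']
```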
19.03.2025 18:56
Existing video editing methods fall short of the goal of a simple, versatile video editor, requiring multiple models, complex pipelines, or extra caption/layout/human guidance. We introduce VEGGIE, which formulates diverse editing tasks as end-to-end grounded generation in pixel space.
19.03.2025 18:56
Introducing VEGGIE 🥦: a unified, end-to-end, and versatile instructional video generative model.
VEGGIE supports 8 skills, from object addition/removal/changing and stylization to concept grounding/reasoning. It exceeds SoTA and shows zero-shot multimodal instruction following and in-context video editing.
19.03.2025 18:56
Congrats to the awesome students, postdocs, & collaborators on this exciting batch of #ICLR2025 and #NAACL2025 accepted papers (FYI, some are on the academic/industry job market and a great catch!), covering diverse, important topics such as:
-- adaptive data generation environments/policies
...
🧵
27.01.2025 21:38
🚨 We have postdoc openings at UNC!
Exciting and diverse NLP/CV/ML topics, freedom to create your own research agenda, competitive funding, very strong students, mentorship for grant writing, collaborations with many faculty, universities, and companies, and a superb quality of life/weather.
Please apply and help spread the word!
23.12.2024 19:32
I was so lucky to work with Jaemin in my first year and learned a lot from him. I can confidently say he's not only a top mind in multimodal AI but also an incredible mentor and collaborator. He is insightful, hands-on, and genuinely knows how to guide and inspire junior students.
09.12.2024 09:59
🚨 I am on the faculty job market this year 🚨
I will be presenting at #NeurIPS2024 and am happy to chat in-person or digitally!
I work on developing AI agents that can collaborate and communicate robustly with us and each other.
More at: esteng.github.io and in thread below
🧵
05.12.2024 19:00
Looking forward to giving this Distinguished Lecture at Stony Brook next week & meeting the several awesome NLP + CV folks there. Thanks, Niranjan + all, for the kind invitation!
PS. Excited to give a new talk on "Planning Agents for Collaborative Reasoning and Multimodal Generation" ➡️➡️
🧵
03.12.2024 16:07
🚨 Reverse Thinking Makes LLMs Stronger Reasoners
We can often reason from a problem to a solution, and also in reverse, to enhance our overall reasoning. RevThink shows that LLMs can also benefit from reverse thinking: 13.53% gains + sample efficiency + strong generalization (on 4 OOD datasets)!
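The forward/backward idea can be caricatured with a stubbed model call. Note that RevThink actually fine-tunes a student model on augmented backward-reasoning data; the `llm` function below is a hypothetical stand-in that only illustrates checking forward/backward consistency:

```python
def llm(prompt):
    """Stub standing in for a real model call; canned outputs only."""
    canned = {
        "Q: 3 + 4 = ?": "7",
        "Reverse: which question has answer 7, given the form 'a + 4 = ?'":
            "Q: 3 + 4 = ?",
    }
    return canned[prompt]

forward_q = "Q: 3 + 4 = ?"
answer = llm(forward_q)                       # forward pass: solve
# Backward pass: reconstruct the question from the answer, then check
# that it matches the original (forward-backward consistency).
recovered = llm(f"Reverse: which question has answer {answer}, "
                f"given the form 'a + 4 = ?'")
consistent = recovered == forward_q
print(answer, consistent)  # 7 True
```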
02.12.2024 19:29
Research Engineer at Bloomberg | ex PhD student at UNC-Chapel Hill | ex Bloomberg PhD Fellow | ex Intern at MetaAI, MSFTResearch | #NLProc
https://zhangshiyue.github.io/#/
Visiting Scientist at Schmidt Sciences. Visiting Researcher at Stanford NLP Group
Interested in AI safety and interpretability
Previously: Anthropic, AI2, Google, Meta, UNC Chapel Hill
phd<<<1,1>>>(UMich);
ex<<<3,1>>>({MIT_IBM_Watson, Adobe, Amazon});
Make the community better @ACLMentorship @GrowAILikeChild
Herborium Lover, Fortune Teller, Pokémon Trainer, Szechuan Cuisine Chef.
https://mars-tin.github.io
Incoming assistant professor at JHU CS & Young Investigator at AI2
PhD at UNC
https://j-min.io
#multimodal #nlp
Assistant Professor in Media Law and Ethics, UMass-Amherst
Proud UNC J-school PhD alumna
heesoojang.com
Ph.D. Student @unccs, @uncnlp, MURGe-Lab. Student Researcher @Google
We are the Leuven AI Group of Multilingual NLP (LAGoM NLP), a research lab at the department of Computer Science at KU Leuven, led by @mdlhx
The Ubiquitous Knowledge Processing Lab researches Natural Language Processing (#NLProc) with a strong emphasis on Large Language Models, Conversational AI & Question Answering | @cs-tudarmstadt.bsky.social · @TUDa.bsky.social
https://www.ukp.tu-darmstadt
Information is nothing without retrieval
The Webis Group contributes to information retrieval, natural language processing, machine learning, and symbolic AI.
The Low-resource + Endangered language and Computational Semantics (LECS) group at @bouldernlp.bsky.social, led by @alexispalmer.bsky.social
Research lab for Computational Cultural Studies in the Department of Humanities at @uni.lu
Current topics: critical media studies, cultural language technology, digital sociality studies
Led by @questoph.bsky.social
Get in touch via cucolab.uni.lu
Posting about research by, and events and news relevant for, the Amsterdam NLP community. Account maintained by @wzuidema@bsky.social
Natural Language Processing research community at the University of Colorado Boulder.
www.colorado.edu/research/bouldernlp
Gemma Boleda, Marco Baroni, Thomas Brochhagen, Iria de Dios Flores | Computational Linguistics and Linguistic Theory Universitat Pompeu Fabra.
upf.edu/web/colt
Barcelona
CompLing group (CLAUSE) at Bielefeld U (PI: Sina Zarrieß). We work on: NLG, Language & Vision, Pragmatics & Dialogue, HateSpeech, BabyLMs, DH, and more!
clause-bielefeld.github.io
This is the account for the NLP community at Imperial College London! Looking forward to sharing our NLP research with you!
Conversational AI | NLP | Headed by Dr. Dilek Hakkani-Tur and Dr. Gokhan Tur | UIUC | IllinoisCDS
The NLP group at the University of Washington.
Computational linguistics • Natural language processing • Formal linguistics • Machine translation | at the Faculty of Mathematics and Physics, Charles University
Yulia Tsvetkov's Group at the University of Washington @uwnlp.bsky.social