Work done during an internship at @amazon. Huge thanks to my mentor, @zlwang_cs, and advisor, @Meng_CS, for their support in making this work possible, and to collaborators @ShiyangLi5, Xin Liu, Changlong Yu, @YinQingyu, Zhan Shi, and @zhangzxUIUC for their valuable feedback!
               
            
            
16.09.2025 18:15
8/8 [Convergence rate]
The gradient-based method consistently converges faster, reducing the required training steps by 6.1 on average across RL algorithms.
               
            
            
16.09.2025 18:15
            7/8 [Generalizability] 
We further extend experiments to different math datasets and model families. Our two methods yield superior Pareto fronts compared to the baseline, with the gradient-based weighting showing the best overall performance.
               
            
            
16.09.2025 18:15
            6/8 [Gradient-based weight optimization] 
Our method generates superior Pareto fronts that dominate all baseline approaches under both GRPO and REINFORCE training.
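For intuition, here is a minimal sketch of one way gradient-based reward weighting could work: give each objective a weight proportional to the magnitude of its policy gradient, so objectives with more remaining learning signal receive more effort. The function name, the softmax rule, and the temperature are illustrative assumptions, not the paper's actual algorithm.

```python
import math

def gradient_based_weights(grad_norms, temperature=1.0):
    # Hypothetical rule: objectives whose policy-gradient norms are still
    # large (i.e., still improvable) receive larger reward weights.
    scaled = [g / temperature for g in grad_norms]
    m = max(scaled)                          # stabilize the softmax
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy check: a still-learning objective vs. a nearly converged one.
w = gradient_based_weights([4.0, 0.2])      # e.g., accuracy, conciseness
```

Under this toy rule, the objective with the larger gradient norm (here the first) gets the larger weight, and the weights always sum to one.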
               
            
            
16.09.2025 18:15
5/8 [Hypervolume-guided weight adaptation]
Across all three online RL algorithms, there is consistently at least one weight configuration for which our method outperforms the baselines on all objectives.
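For readers unfamiliar with the metric: hypervolume measures how much of the objective space a solution set dominates relative to a reference point, so it can score how good a weight configuration's results are across all objectives at once. A minimal two-objective sketch (maximization, sweep algorithm); this is illustrative only, and the paper's actual adaptation rule and hypervolume computation may differ:

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    # Area dominated by a set of 2-objective points (maximization),
    # measured against the reference point `ref`.
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: p[0], reverse=True)
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y > prev_y:                 # sweep: only add the new strip
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

# A front closer to the top-right corner dominates more volume.
hv = hypervolume_2d([(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)])
```

For the three points above the dominated area is 3·1 + 2·1 + 1·1 = 6, so a weight configuration whose trained policy pushes the front outward would score a larger hypervolume.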
               
            
            
16.09.2025 18:15
Dynamic reward weights reveal that objectives learn differently. For example, accuracy is a more challenging objective that requires continual learning, while the weight for conciseness quickly converges to 0.2.
4/8
               
            
            
16.09.2025 18:15
            3/8 [Preliminary finding] 
Different objectives vary in learning difficulty. Each objective reaches saturation at different training stages.
               
            
            
16.09.2025 18:15
Question: How can we redirect learning effort toward the objectives with the greatest potential for improvement?
Answer:
- If the user's preference over objectives is given, use our hypervolume-based method.
- If the user's preference is unknown, use our gradient-based method.
2/8
               
            
            
16.09.2025 18:15
Pleased to introduce our new paper yining610.github.io/dynamic-rew...
- Rebalances multiple objectives during training through dynamic reward weighting
- Builds a Pareto-dominant front over static baselines across online RL algorithms, datasets, and model families
- Converges faster
1/8
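The quantity being re-balanced can be pictured as a weighted scalarization of per-objective rewards; a minimal illustration below, where the update rules for the weights w_i(t) are the paper's contribution and are not shown here:

```python
def scalarized_reward(rewards, weights):
    # r_t = sum_i w_i(t) * r_i: per-objective rewards combined with
    # weights that dynamic reward weighting adjusts during training.
    assert abs(sum(weights) - 1.0) < 1e-8, "weights assumed normalized"
    return sum(w * r for w, r in zip(weights, rewards))

# Early in training the harder objective (say, accuracy) might
# dominate the mix; the weights then shift as objectives saturate.
r = scalarized_reward([0.9, 0.4], [0.8, 0.2])  # ~0.8
```

A static baseline keeps the weights fixed for the whole run; the two methods in this thread instead move them while training.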
               
            
            
16.09.2025 18:15

[Link preview] ACL2025: Optimizing Decomposition for Optimal Claim Verification
    
This is our teaser video:
youtu.be/TgloG4Oefeg
               
            
            
25.07.2025 22:11
Can't make it to #ACL2025 this year, but for people interested in RL for factuality and textual decomposition, please check out our paper!
TL;DR: We found a mismatch between the decomposition policy and the LLM verifier, and propose a dynamic training paradigm to bridge the gap.
               
            
            
25.07.2025 22:11
Quick reminder that our paper, Benchmarking Language Model Creativity: A Case Study on Code Generation, will be presented today!
11AM-12:30PM, Fri, May 2
Hall 3
arxiv.org/abs/2407.09007
www.youtube.com/watch?v=v1c...
               
            
            
02.05.2025 13:11
Highlighting our #NAACL2025 papers!
               
            
            
28.04.2025 12:30
I will be at #NAACL2025 to present our LLM creativity benchmark. Drop by if interested (Poster Session 8, Fri, May 2)!
I'd love to chat about RL and its interpretability, data influence for post-training, and CogSci for LLMs. Feel free to reach out and let's have some coffee together!
               
            
            
28.04.2025 19:53

[Link preview] Benchmarking Language Model Creativity: A Case Study on Code Generation --- NAACL 2025 (Yining Lu)
    
A video teaser of @Yining__Lu's paper:
www.youtube.com/watch?v=v1c...
               
            
            
28.04.2025 12:30

[Link preview] Midwest Speech and Language Days 2025
    
Midwest Speech and Language Days will be held Apr 15-16 at @NotreDame! Abstract submissions are due Mar 20, and the registration deadline is Mar 27. Financial assistance for students (lodging, poster printing) is available. nlp.nd.edu/msld25
               
            
            
08.03.2025 18:35
A starter pack for #NLP #NLProc researchers!
go.bsky.app/SngwGeS
               
            
            
04.11.2024 10:01