 
                        
                CoBRA: Quantifying Strategic Language Use and LLM Pragmatics
                Language is often used strategically, particularly in high-stakes, adversarial settings, yet most work on pragmatics and LLMs centers on cooperativity. This leaves a gap in systematic understanding of...
            
        
    
    
            Current LLMs are not able to jailbreak cooperative principles and still show limited understanding of strategic language. We believe this work lays foundations for sophisticated strategic reasoning and safety monitoring in downstream tasks.
๐: arxiv.org/abs/2506.01195
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 0    ๐ 0                      
            
         
            
        
            
            
            
            
                                                 
                                                         
                                                
    
    
    
    
            By analyzing model reasoning, we find extra reasoning introduces overcomplication (img left), misunderstanding, and internal inconsistency (img right). This shows the current LLMs still lack sophisticated pragmatic understanding in many ways.
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
                                                 
                                                
    
    
    
    
            We evaluate a range of LLMs in terms of how good they are at perceiving strategic language. We show models struggle with our metrics while showing an overall good understanding of Gricean principles. Model size tends to have a positive effect, while reasoning does not help.
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
                                                 
                                                
    
    
    
    
            (2) BaT and PaT are valid terms that reflect strategic gains/losses, which can to some extent predict conversational outcomes. In addition, our metrics are more objective. When conditioned on cases where the outcome is made based on logical arguments, the predictive power rises.
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
                                                 
                                                         
                                                
    
    
    
    
            We also introduce CHARM, an annotated dataset of real legal cross-examination dialogues. By applying our framework, we show (1) (non-)cooperative discourse are distinct over the identified properties (img left), and BaT and PaT show such a distributional distinction (img right).
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
                                                 
                                                         
                                                
    
    
    
    
            Based on the components above, we introduce three metricsโBenefit at Turn (BaT), Penalty at Turn (PaT), and Normalized Relative Benefit at Turn (NRBaT)โto measure the strategic gains, losses, and cumulative benefits at a turn.
               
            
            
                03.06.2025 11:56 โ ๐ 2    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
            
    
    
    
    
            For example, one witness can make a commitment that leads to a win for her but violates the maxim of manner to make her less liable to the commitment. The commitment itself is beneficial, but the gains are reduced due to vagueness.
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
            
    
    
    
    
            We derive non-cooperativity from both Gricean and game-theoretic pragmatics. In our framework, a strategic move is evaluated based on two components: the commitment it expresses (base value) and the violation of maxims to maintain consistency (penalties/compensations).
               
            
            
                03.06.2025 11:56 โ ๐ 0    ๐ 0    ๐ฌ 1    ๐ 0                      
            
         
            
        
            
            
            
            
                                                 
                                                         
                                                
    
    
    
    
            Language is often strategic, but LLMs tend to play nice. How strategic are they really? Probing into that is key for future safety alignment.
๐Introducing CoBRA๐, a framework that assesses strategic language.
Work with my amazing advisors @jessyjli.bsky.social and @David I. Beaver!
               
            
            
                03.06.2025 11:56 โ ๐ 11    ๐ 3    ๐ฌ 1    ๐ 1                      
            
         
            
        
            
            
            
            
                                                 
                                                         
                                                
    
    
    
    
            Have that eerie feeling of dรฉjร  vu when reading model-generated text ๐, but canโt pinpoint the specific words or phrases ๐? 
โจWe introduce QUDsim, to quantify discourse similarities beyond lexical, syntactic, and content overlap.
               
            
            
                21.04.2025 21:29 โ ๐ 21    ๐ 9    ๐ฌ 2    ๐ 3                      
            
         
            
        
            
            
            
            
            
    
    
    
    
            ๐โโ๏ธ
               
            
            
                25.11.2024 14:39 โ ๐ 0    ๐ 0    ๐ฌ 0    ๐ 0                      
            
         
    
         
        
            
        
                            
                    
                    
                                            I do research in social computing and LLMs at Northwestern with @robvoigt.bsky.social and Kaize Ding.
                                     
                            
                    
                    
                                            Postdoctoral researcher at the Institute for Logic, Language and Computation at the University of Amsterdam.
Previously PhD Student at NLPNorth at the IT University of Copenhagen, with internships at AWS, Parameter Lab, Pacmed.
dennisulmer.eu
                                     
                            
                    
                    
                                            The account of the School of Philosophy, Psychology and Language Sciences at The University of Edinburgh.  
                                     
                            
                    
                    
                                            ๐ซ asst. prof. of compling at university of pittsburgh
 past: 
๐๏ธ postdoc @mainlp.bsky.social, LMU Munich  
๐ค  PhD in CompLing from Georgetown
๐บ๐ป x2 intern @Spotify @SpotifyResearch
https://janetlauyeung.github.io/
                                     
                            
                    
                    
                                            assistant professor in computer science / data science at NYU. studying natural language processing and machine learning. 
                                     
                            
                    
                    
                                            Academic books, journals and news from the Linguistics department at De Gruyter Brill @degruyterbrill.bsky.social, including its imprint Mouton. Posts by our editors.
                                     
                            
                    
                    
                                            Undergrad at UT Austin in CS and Linguistics
                                     
                            
                    
                    
                                            ELLIS PhD Fellow @belongielab.org | @aicentre.dk | University of Copenhagen | @amsterdamnlp.bsky.social | @ellis.eu 
Multi-modal ML | Alignment | Culture | Evaluations & Safety| AI & Society
Web: https://www.srishti.dev/
                                     
                            
                            
                    
                    
                                            Linguistics PhD student at UT Austin
                                     
                            
                    
                    
                                            Prof @sydneyuni.bsky.social Engineer and scientist building bridges between research & industry at USYD & @ansto.bsky.social, Australia. 
She/Her. Mum & wife. Leader & Mentor. Loving โท๐โโ๏ธ & life in general! Comments are my own
                                     
                            
                    
                    
                                            PhD #NLProc from CLTL, Vrije Universiteit Amsterdam || Interests: Computational Argumentation, Responsible AI, interdisciplinarity, cats || I express my own views
                                     
                            
                    
                    
                                            AI, Ethics, SmartCity
๐ฆ๐ฎ๐ฝ๐ถ๐ฒ๐ป๐๐ฎ ๐จ๐ป๐ถ๐๐ฒ๐ฟ๐๐ถ๐๐: lecturer Planning and Strategic Management 
๐๐ก๐๐: member of National Authority for AI association
๐ง๐๐ : Account Manager
๐ฐ https://doi.org/10.1108/TG-04-2024-0096
๐ https://www.linkedin.com/in/vriccardi
                                     
                            
                    
                    
                                            Applied Scientist @ Amazon
(Posts are my own opinion)
#NLP #AI #ML
I am specially interested in making our NLP advances accessible to the Indigenous languages of the Americas
                                     
                            
                    
                    
                                            Como todos los hombres de Babilonia, he sido procรณnsul; como todos, esclavo; tambiรฉn he conocido la omnipotencia, el oprobio, las cรกrceles.
very sane ai newsletter: verysane.ai
                                     
                            
                    
                    
                                            Graduate student at Penn with an interest in machine learning 
                                     
                            
                    
                    
                                            Center for Language and Speech Processing at Johns Hopkins University
 #NLProc #MachineLearning #AI http://tinyurl.com/clspy2ube
                                     
                            
                    
                    
                                            Natural Language Processing PhD Student @ Heidelberg University.
https://schumann.pub
#NLP #NLProc #ML #AI
                                     
                            
                    
                    
                                            Computational Linguistics MS @ UW | Staff Data Scientist (NLP) @ Cision 
ibrahimsharaf.github.io
                                     
                            
                    
                    
                                            Industrial PhD student at the IT University of Copenhagen ๐ and Corti.ai ๐ผ. I research in Clinical NLP ๐๐. On a mission to assist healthcare professionals ๐ฉบ with their documentation burden โ๏ธ.