Back after a successful #EMNLP2025 conference in Suzhou, China -- some impressions ⤵️
Our papers: www.copenlu.com/news/8-paper...
@apepa.bsky.social @rnv.bsky.social @siddesh.bsky.social @kirekara.bsky.social @shoejoe.bsky.social @zainmujahid.me @lucasresck.bsky.social @copenlu.bsky.social
#NLProc
Attending EMNLP 2025 this week? So is CopeNLU -- come find us there! ⤵️
www.copenlu.com/news/8-paper...
@apepa.bsky.social @rnv.bsky.social @kirekara.bsky.social @shoejoe.bsky.social @dustinbwright.com @zainmujahid.me @lucasresck.bsky.social @iaugenstein.bsky.social
#NLProc #AI #EMNLP2025
The last round of applause goes to the @copenlu.bsky.social lab, @ucph.bsky.social and my amazing colleagues and friends there for the heartwarming, inspiring and fun times we had ♥️ to everyone involved in this journey goes my deepest gratitude ♥️♥️
I also want to thank the fantastic PhD committee,
@barbaraplank.bsky.social , Ivan Titov and
@delliott.bsky.social, for their deep, thought-provoking and insightful questions and analysis.
I defended my PhD at the University of Copenhagen ☺️ What a journey! I want to give massive thanks to my amazing supervisors, @iaugenstein.bsky.social and @neuralnoise.com who were there with me throughout the whole process.
Thesis on: osoblanco.github.io/thesis/
The Arxiv version is coming soon!
@dfdazac.bsky.social was an honor to work with someone as amazing as you.
The line made me teary 🥹🥹♥️♥️
Hello bluesky!
I'm using this first post to share that my PhD thesis is now available online at research.vu.nl/en/publicati...
Thanks to all my collaborators who joined me in this journey!
I think, given the current weird/awful state of how reviewing is handled in major ML venues, we would explicitly need to rank reviewers, even if they remain anonymous. This could help (S)ACs at least internally filter out malicious and unqualified ones.
Will work on smth like this closer to ~ICML.
What I secretly desire is something even stricter than grounding with RAG. Maybe have a big knowledge graph for grounding and use a good neural link predictor to confirm whether the facts are correct. That covers factuality; we would also want deductive and analytic reasoning, similar to a theorem prover.
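A toy sketch of what that grounding step could look like: score a generated triple against KG embeddings with a DistMult-style link predictor and only keep facts above a threshold. Everything here (entity names, hand-set embeddings, the threshold) is illustrative; real embeddings would be learned from the knowledge graph.

```python
# Hedged, toy sketch: ground a generated fact against a knowledge graph
# with a DistMult-style neural link predictor. Embeddings are hand-set
# purely for illustration; in practice they are learned from the KG.
DIM = 16
entities = {
    "copenhagen": [1.0] * DIM,
    "denmark": [1.0] * DIM,
    "france": [-1.0] * DIM,
}
relations = {"capital_of": [1.0] * DIM}

def distmult_score(head, rel, tail):
    """DistMult scores a triple via the tri-linear product <e_h, w_r, e_t>."""
    return sum(h * r * t for h, r, t in zip(entities[head], relations[rel], entities[tail]))

def fact_is_plausible(head, rel, tail, threshold=0.0):
    """Keep a generated fact only if the link predictor scores it above a threshold."""
    return distmult_score(head, rel, tail) > threshold

print(fact_is_plausible("copenhagen", "capital_of", "denmark"))  # True
print(fact_is_plausible("copenhagen", "capital_of", "france"))   # False
```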
The main question in current LLM “reasoning” research is what to do next. Most work goes into synthetic data generation and training on it, maybe with self-refinement, in the hope that the model gets better. I think we are missing controlled task formalization, step-by-step reasoning, and strict per-step verification.
My amazing collaborators will be presenting three papers next week at EMNLP 2024! I wrote a blog post about our EMNLP papers and some of the other projects we're brewing 🚀🙂 neuralnoise.com/2024/nov-res...
The results consistently show that, across all models, traces leading to correct answers had a higher percentage of unique emergent facts and more overlap between the relations used in the code and in the search, while the portion of underutilized relations was lower.🤔🤔
By comparing relations in code with those in search traces, we measure emergent hallucinations and unused relations, highlighting areas of sub-optimal reasoning. We also assess the uniqueness of emergent facts per inference hop, indicating the extent of problem-space exploration.
We found a strong correlation between the faithfulness of the search towards the code and model performance across all of the models.
Using FLARE also allows evaluating the faithfulness of the completed search w.r.t. the defined facts, relations, and search logic (taken from Prolog). We simply compare (via ROUGE-Lsum) the simulated search with the actual code execution, when available.
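The comparison idea can be sketched with a minimal ROUGE-L-style score based on longest common subsequence over tokens. The paper's pipeline uses ROUGE-Lsum (e.g. from the `rouge-score` package); this pure-Python version and the example traces are only illustrative.

```python
# Hedged sketch: compare a simulated search trace with the actual
# execution trace via an LCS-based ROUGE-L F1. Traces are invented.
def lcs_len(a, b):
    """Classic dynamic-programming longest-common-subsequence length."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 over whitespace tokens: harmonic mean of LCS precision/recall."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(cand), lcs / len(ref)
    return 2 * p * r / (p + r)

simulated = "path ( copenhagen , denmark ) via capital_of"   # LLM's simulated search
executed = "path ( copenhagen , denmark ) via located_in"    # actual Prolog execution
print(round(rouge_l_f1(executed, simulated), 2))  # 0.88 -- traces differ in one token
```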
The method boosts the performance of various LLMs at different scales (8B -> 100B+) compared to CoT and Faithful CoT on various Mathematical, Multi-Hop, and Relation Inference tasks.
The LLM formalizes the task in Prolog as facts, relations, and search logic, and simulates exhaustive search by iteratively exploring the problem space with backtracking.
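The facts/relations/backtracking idea can be sketched like this. The facts and the grandparent query are invented for illustration; in FLARE itself the LLM emits and "executes" this kind of search in text rather than running Python.

```python
# Hedged sketch: exhaustive search with backtracking over Prolog-style facts.
FACTS = {("alice", "parent", "bob"), ("bob", "parent", "carol")}

def grandparent(x, z):
    """x is a grandparent of z if x is a parent of some y who is a parent of z."""
    for (h, r, t) in sorted(FACTS):        # try each fact in turn
        if r == "parent" and h == x:       # candidate binding: x parent y
            y = t
            if (y, "parent", z) in FACTS:  # y parent z -> proof found
                return [("parent", x, y), ("parent", y, z)]
            # otherwise backtrack and try the next candidate y
    return None  # search space exhausted, no proof

print(grandparent("alice", "carol"))  # proof trace with both parent facts
```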
👋Psst! Want more faithful, verifiable and robust #LLM reasoning than with CoT, but using external solvers is meh? Our FLARE💫uses logic programming with exhaustive simulated search to achieve this.🧵
@pminervini.bsky.social, Patrick Lewis, Pat Verga and @iaugenstein.bsky.social
arxiv.org/abs/2410.11900
At #EMNLP2024 we will present our paper on LLM values and opinions!
We introduce tropes: repeated and consistent phrases which LLMs generate to argue for political stances.
Read the paper to learn more! arxiv.org/abs/2406.19238
Work done at Uni Copenhagen + the Pioneer Center for AI
Hey! 🙂 we analysed what happens during pre-training, and for causal LMs, intra-document causal masking helps quite a bit both in terms of pre-training dynamics and downstream task performance: arxiv.org/abs/2402.13991
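Intra-document causal masking, in a nutshell: when multiple documents are packed into one training sequence, each token may attend only to earlier tokens from its own document, not to everything earlier in the pack. A minimal sketch with illustrative doc ids:

```python
# Hedged sketch: build an intra-document causal attention mask for a
# packed sequence. doc_ids marks which document each token came from.
def intra_doc_causal_mask(doc_ids):
    """mask[i][j] == True iff token i may attend to token j
    (j is not in the future AND both tokens share a document)."""
    n = len(doc_ids)
    return [[j <= i and doc_ids[j] == doc_ids[i] for j in range(n)] for i in range(n)]

# Two documents packed into one 5-token sequence: [A, A, A, B, B]
mask = intra_doc_causal_mask([0, 0, 0, 1, 1])
print(mask[3])  # token 3 (doc B) ignores doc A: [False, False, False, True, False]
print(mask[2])  # token 2 (doc A) is plain causal within doc A: [True, True, True, False, False]
```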