
@adamlsteinl.bsky.social

23 Followers  |  92 Following  |  16 Posts  |  Joined: 16.11.2024

Latest posts by adamlsteinl.bsky.social on Bluesky

Instruction Following by Boosting Attention of Large Language Models
We improve instruction-following in large language models by boosting attention, a simple technique that outperforms existing steering methods.

Check out our blog and paper for more details!
πŸ”—Blog: debugml.github.io/instaboost
πŸ”—Paper: arxiv.org/abs/2506.13734
πŸ€–Code: github.com/BrachioLab/I...

Thank you to my awesome co-authors @viguardieiro.bsky.social, @avishree.bsky.social, and advisor @profericwong.bsky.social.
(7/7)

10.07.2025 18:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Crucially, InstABoost achieves this control without degrading text quality. While other latent steering methods can cause generation fluency to drop sharply as you increase their strength, InstABoost maintains coherence while steering towards the instruction.
(6/7)

10.07.2025 18:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Across 15 tasks, InstABoost either outperforms or matches the best steering method (prompt or latent-based). For tasks where prompt and latent-based steering perform equivalently, InstABoost can even combine the strengths of both and outperform both categories of methods.
(5/7)

10.07.2025 18:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Anton Xue on X: "Excited to present our paper on a logic-based perspective of LLM jailbreaks with @Avishreekh at @ICLR_conf this Saturday, April 26! Poster #268 in Hall 3+2B at 15:00 Singapore time πŸ“„ arXiv: https://t.co/2wBtqvIIwD πŸ”— Blog: https://t.co/f6OHxORDgb \begin{thread}"

InstABoost is theoretically grounded in prior work showing that rule following can be manipulated by controlling attention to instructions.
(4/7)
x.com/AntonXue/sta...

10.07.2025 18:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

InstABoost steers an LLM in attention space, bridging the performance gap between latent- and prompt-based steering. It can be implemented in ~3 lines of code that simply increase the attention weight on an in-context instruction.
(3/7)

10.07.2025 18:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
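The boost described above can be sketched as follows. This is a minimal illustration under assumptions (a single attention head, a multiplicative boost factor `alpha`, and renormalization after boosting); the paper's exact formulation may differ, and the names `instaboost_attention`, `instr_mask`, and `alpha` are chosen here for illustration, not taken from the released code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def instaboost_attention(q, k, v, instr_mask, alpha=2.0):
    """Scaled dot-product attention with boosted weight on instruction tokens.

    instr_mask: boolean array over key positions (True = instruction token).
    alpha: multiplicative boost applied to instruction weights before
    renormalizing, so the result is still a probability distribution.
    """
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (queries, keys)
    w = softmax(scores, axis=-1)
    w = np.where(instr_mask, w * alpha, w)    # boost attention to instruction
    w = w / w.sum(axis=-1, keepdims=True)     # renormalize rows to sum to 1
    return w @ v
```

For `alpha >= 1`, each query's total attention mass on instruction tokens can only grow, while the output remains a convex combination of the value vectors.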

Existing steering methods are either prompt-based or latent-based (modifying the hidden state), but which is better? We show the answer depends on the task: the steering task landscape includes tasks that are latent-optimal, instruction-optimal, and equivalent.
(2/7)

10.07.2025 18:21 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Excited to share our new paper: "Instruction Following by Boosting Attention of Large Language Models"!

We introduce Instruction Attention Boosting (InstABoost), a simple yet powerful method to steer LLM behavior by making them pay more attention to instructions.
(🧡1/7)

10.07.2025 18:21 β€” πŸ‘ 2    πŸ” 1    πŸ’¬ 1    πŸ“Œ 1
The Road to Generalizable Neuro-Symbolic Learning Should be Paved with Foundation Models
Neuro-symbolic learning was proposed to address challenges with training neural networks for complex reasoning tasks with the added benefits of interpretability, reliability, and efficiency. Neuro-sym...

Read our full position paper for in-depth experiments and insights:
πŸ”— Paper: arxiv.org/abs/2505.24874
πŸ’» Code: github.com/adaminsky/ne...
Thanks to my collaborators Aaditya Naik, Neelay Velingker, Mayur Naik, and @profericwong.bsky.social.
(9/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

As foundation models continue to scale, we argue it’s time to move beyond enforcing rigid symbolic structure in NeSy during training and tackle the exciting problem of inferring which symbols and which program are needed for each task.
(8/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

On the other hand, NeSy prompting provides two key benefits atop foundation models:

Reliability: A symbolic program enables accurate, stable, and trustworthy results.

Interpretability: Explicit symbols provide a clear, debuggable window into the model's "understanding."
(7/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

3️⃣ The Program Pitfall: Training neural nets in conjunction with a fixed program leads to "hallucinated" symbols, reaching the correct answer for the wrong reasons, similar to reasoning shortcuts.
(6/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

2️⃣ The Data Pitfall: Training on small, specialized datasets encourages overfitting.
(5/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

1️⃣ The Compute Pitfall: Training specialized NeSy models has diminishing returns. As foundation models scale, the gap between NeSy training and NeSy prompting disappears, making dedicated training a costly detour.
(4/9)

13.06.2025 20:30 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

We compare traditional NeSy systems (trained end-to-end) with what we call neuro-symbolic prompting, where foundation models perform perception via prompting and their outputs feed a symbolic program. We find that the NeSy training process itself introduces three key pitfalls.
(3/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
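The split described above can be made concrete with a hypothetical sketch (not taken from the paper's code) for the classic digit-addition task: a prompted foundation model handles perception, and an ordinary program computes the answer symbolically. The callable `perceive` stands in for a real model call and is an assumption of this sketch.

```python
def nesy_sum(images, perceive):
    """Neuro-symbolic prompting sketch for digit addition.

    perceive: callable mapping a raw input to a digit string; in practice
    this would be a vision-language model prompted with something like
    "What digit is shown in this image? Answer with a single digit."
    The symbolic program below (exact integer addition) stays fixed,
    interpretable, and reliable regardless of the perception model.
    """
    digits = [int(perceive(img)) for img in images]  # neural perception
    return sum(digits)                               # symbolic program
```

Because the symbols are explicit, a wrong answer can be debugged by inspecting the perceived digits directly rather than probing hidden activations.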

Neuro-symbolic learning combines neural nets + programs for efficient, interpretable AI. But NeSy training is challenging and brittle due to the symbolic component.
With foundation models succeeding via prompting alone, we argue it’s time to rethink NeSy system design.
(2/9)

13.06.2025 20:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

🧠 Foundation models are reshaping reasoning. Do we still need specialized neuro-symbolic (NeSy) training, or can clever prompting now suffice?
Our new position paper argues the road to generalizable NeSy should be paved with foundation models.
πŸ”— arxiv.org/abs/2505.24874
(🧡1/9)

13.06.2025 20:30 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0
