Samuele Bortolotti @samubortolotti

In collaboration with @ema-ridopoco.bsky.social, Tommaso Carraro, @paolomorettin.bsky.social, @emilevankrieken.com, @nolovedeeplearning.bsky.social, @looselycorrect.bsky.social, and @andreapasserini.bsky.social

10.12.2024 19:10 — 👍 7 🔁 0 💬 0 📌 0

Not All Neuro-Symbolic Concepts Are Created Equal: Analysis and Mitigation of Reasoning Shortcuts
Neuro-Symbolic (NeSy) predictive models hold the promise of improved compliance with given constraints, systematic generalization, and interpretability, as they allow to infer labels that are consiste...

Want to know more?

1️⃣ Learn more about RSs: Why they appear, their root causes, and mitigation: arxiv.org/abs/2305.19951

2️⃣ Make NeSy models aware of their shortcuts: arxiv.org/abs/2402.12240

10.12.2024 19:10 — 👍 8 🔁 0 💬 1 📌 0

rsbench: A Neuro-Symbolic Benchmark Suite for Concept Quality and Reasoning Shortcuts (benchmark paper)

For other details regarding rsbench, datasets, and experiments, check the links below:

Website: unitn-sml.github.io/rsbench/
Paper: openreview.net/forum?id=5Vt...
GitHub: github.com/unitn-sml/rs...

10.12.2024 19:10 — 👍 4 🔁 0 💬 1 📌 0

Easy to set up and use!

1️⃣ Configurable: can be easily configured with YAML/JSON files.
2️⃣ Intuitive: straightforward to use.

10.12.2024 19:10 — 👍 2 🔁 0 💬 1 📌 0

📊 8 challenging tasks, all with predefined settings.

3 new benchmarks:
🔢 MNMath for arithmetic reasoning
🛃 MNLogic for SAT-like problems
🚖 SDD-OIA, a synthetic self-driving task!

They can all be made easier or harder with our data generator!

10.12.2024 19:10 — 👍 2 🔁 0 💬 1 📌 0

🧪 Test your models!

- 🌍 Evaluate concepts in in- and out-of-distribution scenarios.
- 🎯 Ground-truth concept annotations are available for all tasks.
- 📊 Visualize how your models handle different learning & reasoning tasks!

10.12.2024 19:10 — 👍 2 🔁 0 💬 1 📌 0

🔍 rsbench allows you to:

- 🧮 Run algorithmic, logical, and high-stakes tasks w/ known reasoning shortcuts (RSs).
- 📊 Eval concept quality via F1, accuracy & concept collapse.
- 🛠️ Easily customize the tasks and count RSs a priori using our countrss tool!
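
To make "counting RSs a priori" concrete, here is a minimal brute-force sketch. It is not the countrss tool and does not use its interface; the task (XOR of two binary concepts) and the names `xor_label` and `count_label_preserving_maps` are hypothetical, chosen only for illustration. It assumes every concept combination appears in the data and counts the deterministic concept remappings that still yield the correct label for each of them.

```python
from itertools import product

# Hypothetical toy task (an assumption, not taken from rsbench):
# the label is the XOR of two binary concepts.
def xor_label(concepts):
    a, b = concepts
    return a ^ b

def count_label_preserving_maps(knowledge, n_concepts):
    """Count deterministic maps from true concept vectors to predicted
    concept vectors that leave the label of every vector unchanged."""
    vectors = list(product([0, 1], repeat=n_concepts))
    count = 0
    # Each candidate map assigns one predicted vector to every true vector.
    for images in product(vectors, repeat=len(vectors)):
        if all(knowledge(img) == knowledge(vec)
               for vec, img in zip(vectors, images)):
            count += 1
    return count

total = count_label_preserving_maps(xor_label, 2)
shortcuts = total - 1  # every map except the identity is a reasoning shortcut
print(total, shortcuts)
```

Enumeration grows exponentially with the number of concepts, so this only works for toy tasks; it is meant to show what is being counted, not how to count it at scale.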

10.12.2024 19:10 — 👍 2 🔁 0 💬 1 📌 0

🤔 What are reasoning shortcuts?

NeSy models might learn wrong concepts but still make perfect predictions!

Example: A self-driving car 🚗 stops in front of a 🚦🔴 or a 🚶. Even if it confuses the two, it outputs the right prediction!
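
The example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `stop_rule` and `swapped_extractor` are hypothetical names, and the "concept extractor" is a hand-written function that deliberately confuses the two concepts rather than a trained model.

```python
from itertools import product

# Symbolic knowledge: stop if there is a red light OR a pedestrian.
def stop_rule(red_light, pedestrian):
    return red_light or pedestrian

# Hypothetical concept extractor that confuses the two concepts:
# it reports a pedestrian as a red light and vice versa.
def swapped_extractor(red_light, pedestrian):
    return pedestrian, red_light

# All four ground-truth scenes: (red_light, pedestrian).
scenes = list(product([False, True], repeat=2))

label_hits = sum(
    stop_rule(*swapped_extractor(r, p)) == stop_rule(r, p) for r, p in scenes
)
concept_hits = sum(swapped_extractor(r, p) == (r, p) for r, p in scenes)

label_accuracy = label_hits / len(scenes)
concept_accuracy = concept_hits / len(scenes)
print(label_accuracy, concept_accuracy)
```

The stop/go prediction is right in every scene, while the concepts match the ground truth only when both are present or both are absent. Label accuracy alone cannot reveal the confusion.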

10.12.2024 19:10 — 👍 2 🔁 0 💬 1 📌 0

🌐 rsbench allows you to evaluate the concepts learned by:

1️⃣ Neuro-Symbolic models (#NeSy)
2️⃣ Concept Bottleneck Models (#CBMs)
3️⃣ Black-box Neural Networks (NNs*)
4️⃣ Vision-Language Models (#VLMs*)

* through post-hoc concept-based explanations (e.g., TCAV)

10.12.2024 19:10 — 👍 2 🔁 0 💬 1 📌 0

📣 Does your model learn high-quality #concepts, or does it learn a #shortcut?

Test it with our #NeurIPS2024 Datasets & Benchmarks track paper!

rsbench: A Neuro-Symbolic Benchmark Suite for Concept Quality and Reasoning Shortcuts

What's the deal with rsbench? 🧵

10.12.2024 19:10 — 👍 35 🔁 8 💬 1 📌 4

Not All Neuro-Symbolic Concepts Are Created Equal: Analysis and Mitigation of Reasoning Shortcuts

by @ema-ridopoco.bsky.social @looselycorrect.bsky.social @andreapasserini.bsky.social @samubortolotti.bsky.social

e.g.

👉 proceedings.neurips.cc/paper_files/...

👉 openreview.net/forum?id=pDc...

👉 unitn-sml.github.io/rsbench/

10.12.2024 15:46 — 👍 5 🔁 2 💬 0 📌 0
