Vahid Behzadan

Vahid Behzadan

@behzadan.bsky.social

Professionally curious about the science of making bad decisions; AI safety and security researcher; Assistant Professor of CS and Data Science & Director of the Secure and Assured Intelligent Learning (SAIL) lab @ University of New Haven.

1,744 Followers 171 Following 1 Posts Joined Nov 2024
1 year ago
Transformers: Origins An unofficial origin story of the transformer neural network architecture.

I have converted a portion of my NLP Online Masters course to blog form. This is the progression I present that takes one from recurrent neural network to seq2seq with attention to transformer. mark-riedl.medium.com/transformers...
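That progression hinges on attention: seq2seq models bolt it onto an RNN, and the transformer keeps only the attention. As a minimal toy sketch (my own, not code from the blog post), scaled dot-product attention is just a softmax-weighted mix of values:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V  # weighted mix of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 queries, dim 4
K = rng.normal(size=(5, 4))  # 5 keys
V = rng.normal(size=(5, 4))  # 5 values
out = attention(Q, K, V)
print(out.shape)  # (3, 4): one mixed value vector per query
```

With a single key, the softmax weight is 1 and every query just returns that key's value, which is a quick sanity check on the mixing behavior.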

116 15 6 2
1 year ago
Open RL Benchmark: Comprehensive Tracked Experiments for... In many Reinforcement Learning (RL) papers, learning curves are useful indicators to measure the effectiveness of RL algorithms. However, the complete raw data of the learning curves are rarely...

NeurIPS reviews are now publicly available.
Don't forget to check out Open RL Benchmark, very useful when implementing algorithms or checking the performance impact of hyperparameters.

openreview.net/forum?id=ZDv...

21 2 0 1
1 year ago
ALTA: Compiler-Based Analysis of Transformers We propose a new programming language called ALTA and a compiler that can map ALTA programs to Transformer weights. ALTA is inspired by RASP, a language proposed by Weiss et al. (2021), and Tracr (Lin...

I'm pretty excited about this one!

ALTA is A Language for Transformer Analysis.

Because ALTA programs can be compiled to transformer weights, it provides constructive proofs of transformer expressivity. It also offers new analytic tools for *learnability*.

arxiv.org/abs/2410.18077
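To make the "compile a program to weights" idea concrete, here is a toy illustration of my own (not ALTA syntax, and much simpler than what the paper does): hand-setting one attention head so that it implements the discrete program "broadcast the first token to every position", by making keys large only at the [BOS] position so the softmax saturates there.

```python
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Input: one-hot token embeddings; dimension 0 flags the first ([BOS]) token.
X = np.array([
    [1., 0., 0.],   # [BOS]
    [0., 1., 0.],   # token "a"
    [0., 0., 1.],   # token "b"
])

W_Q = np.zeros((3, 1)); W_Q[:, 0] = 1.0   # every position emits the same query
W_K = np.zeros((3, 1)); W_K[0, 0] = 20.0  # keys are large only at [BOS]
W_V = np.eye(3)                           # values pass embeddings through

scores = (X @ W_Q) @ (X @ W_K).T  # each row: high score toward [BOS] only
out = softmax(scores) @ (X @ W_V) # ~= the [BOS] embedding at every position
print(np.round(out, 3))
```

Because the weights are constructed rather than trained, the head provably computes the intended program (up to softmax saturation), which is the flavor of constructive expressivity argument the post describes.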

53 16 2 0
1 year ago
AI Safety Events and Training: 2024 Week 46 update This is a weekly newsletter listing newly announced AI safety events and training programs.

aisafetyeventsandtraining.substack.com/p/ai-safety-...

3 1 0 0
1 year ago
A tweet from Tim van der Zee, from August 10, 2017, that reads: "Academia is a bunch of people emailing "sorry for the late response" back and forth until one of them gets tenure."

This was seven years ago. I think about this often.

256 20 5 4
1 year ago
On Evaluating Explanation Utility for Human-AI Decision Making in NLP Is explainability a false promise? This debate has emerged from the insufficient evidence that explanations help people in situations they are introduced for. More human-centered, application-grounded...

I will be at #EMNLP2024! My student Fateme Hashemi Chaleshtori will present "On Evaluating Explanation Utility for Human-AI Decision Making in NLP" in the poster session on Wednesday 10:30am: arxiv.org/abs/2407.03545 1/

29 4 2 2
1 year ago

The AI Interdisciplinary Institute at the University of Maryland (AIM) is hiring

40 new faculty members

in all areas of AI, particularly:
- accessibility,
- sustainability,
- social justice, and
- learning;

building on computational, humanistic, or social scientific approaches to AI.


64 19 1 5
1 year ago
Humanities and AI Virtual Institute - Schmidt Sciences

Schmidt Sciences is outlining the timeline for a new program to support research at the intersection of artificial intelligence and the humanities. Open call for proposals to come Dec 15. www.schmidtsciences.org/humanities-a...

76 31 0 0
1 year ago

This one is a study applying voting-based evaluation to model comparisons on the LMSYS Chatbot Arena leaderboard, by independent researcher Nick Ryan. Simulations show that two Condorcet-consistent methods (Copeland and Ranked Pairs) can be robust to uncertain/noisy evals.

nickcdryan.com/2024/09/06/u...
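For readers unfamiliar with Copeland's method: a model scores a point for each opponent it beats head-to-head (half a point for a tie), and models are ranked by score, which makes the ranking Condorcet-consistent. A minimal sketch (my own toy with made-up vote counts, not Nick Ryan's code):

```python
import numpy as np

def copeland_scores(wins: np.ndarray) -> np.ndarray:
    """Copeland scoring over a pairwise win matrix.
    wins[i, j] = number of head-to-head votes model i won against model j.
    +1 per opponent beaten head-to-head, +0.5 per tie, 0 per loss."""
    n = wins.shape[0]
    scores = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if wins[i, j] > wins[j, i]:
                scores[i] += 1.0
            elif wins[i, j] == wins[j, i]:
                scores[i] += 0.5
    return scores

# Hypothetical arena-style vote counts among three models A, B, C:
wins = np.array([
    [0, 7, 6],   # A beats B (7-3) and C (6-4)
    [3, 0, 8],   # B beats C (8-2)
    [4, 2, 0],
])
print(copeland_scores(wins))  # [2. 1. 0.] -> ranking A > B > C
```

Because only the head-to-head winners matter, not the margins, a few flipped noisy votes rarely change the ranking, which is the robustness property the post highlights.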

18 3 2 1
1 year ago

Honestly very disappointed since joining BlueSky, this is not the weather app I was hoping for

325 21 17 0
1 year ago
Text Shot: Further experiments reveal two key insights about the generalization mechanisms of these models: (1) the models fail to abstract general physical rules and instead exhibit "case-based" generalization behavior, i.e., mimicking the closest training example; (2) when generalizing to new cases, models are observed to prioritize different factors when referencing training data: color > size > velocity > shape. Our study suggests that scaling alone is insufficient for video generation models to uncover fundamental physical laws, despite its role in Sora's broader success.

How Far is Video Generation from World Model: A Physical Law Perspective https://arxiv.org/abs/2411.02385v1 #AI #video

2 2 0 0
1 year ago
NSF COA | Jordan Matelsky

NSF makes you say who you got conflicts (coauthored) with. We (really just Jordan Matelsky) just built you a tool for that. Literally one click: bib.experiments.kordinglab.com/nsf-coa

686 322 84 75
1 year ago

New York Theory Day finally returns on December 6, 2024, after being put on hiatus during COVID.

Will be held at @nyutandon.bsky.social in Brooklyn. Registration is free!

Ft stellar speakers Amir Abboud, Sanjeev Khanna, Rotem Oshman, and Ron Rothblum!

sites.google.com/view/nyctheo...

19 4 2 0
1 year ago

Hello… world?

4 0 0 0