Honestly, it feels like it might actually be worth it, as an AI researcher, to throw your dignity aside and pay Elon for Twitter Blue to advertise your papers. Getting a paper famous is literally just a social media clout game now.
28.01.2025 15:33 · 4 likes · 0 reposts · 1 reply · 0 quotes
See the second part of my post: yes, they are likely using explicit search to improve performance at test time. But the focus should be on the search through reasoning chains itself, which the model has been trained to do with RL. Even the explicit search requires the RL-trained value functions.
23.12.2024 01:02 · 0 likes · 0 reposts · 0 replies · 0 quotes
Few fields reward quick pivoting as much as AI, or conversely punish the very thing a PhD is usually meant to be: sticking with one research direction for 5 years no matter what, going really deep, and becoming a niche expert.
For your research to stay relevant in AI, you might wanna pivot every 1-2 years.
22.12.2024 06:10 · 15 likes · 2 reposts · 3 replies · 0 quotes
I think the overlap between builders and researchers is larger in machine learning than in other disciplines.
22.12.2024 05:13 · 1 like · 0 reposts · 0 replies · 0 quotes
You could still wrap this with explicit search techniques like MCTS if you have value functions for partial sequences (which would also be a product of the RL training). This could further improve performance, similar to fast vs slow policy in AlphaZero.
22.12.2024 04:09 · 3 likes · 0 reposts · 0 replies · 0 quotes
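To make the wrapping idea above concrete, here is a minimal sketch of value-guided best-first search over partial reasoning chains; `propose_continuations` and `value_fn` are hypothetical stand-ins for the RL-trained policy and its value function over partial sequences, not anything from an actual o-series system.

```python
# Sketch only: best-first search over partial CoT chains guided by a learned
# value function. `propose_continuations(chain, n)` and `value_fn(chain)` are
# hypothetical placeholders for the trained policy and value model.
import heapq

def value_guided_search(prompt, propose_continuations, value_fn,
                        beam_width=4, max_depth=8):
    # Frontier entries are (-value, chain) so heapq keeps the highest-value chains.
    frontier = [(-value_fn(prompt), prompt)]
    best_chain = prompt
    for _ in range(max_depth):
        candidates = []
        for _, chain in frontier:
            for cont in propose_continuations(chain, n=beam_width):
                extended = chain + cont
                candidates.append((-value_fn(extended), extended))
        if not candidates:
            break
        # Keep only the top beam_width partial chains by estimated value.
        frontier = heapq.nsmallest(beam_width, candidates)
        best_chain = frontier[0][1]
    return best_chain
```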
Saying o3 is just a "more principled search technique" is quite reductive. The o series of models don't require "explicit search" strategies in the form of tree search, wrapped in loops, etc. Instead, RL is used to train the model to "learn to search" using long CoT chains.
22.12.2024 04:07 · 2 likes · 0 reposts · 3 replies · 0 quotes
You're correct, there are plenty of simulated environments we can't solve yet. But do you consider it a desirable solution to run 1 million parallel instances of an environment sped up 100x and solve it with PPO in low wall-clock time?
22.12.2024 02:31 · 0 likes · 0 reposts · 0 replies · 0 quotes
This isn't a general solution to RL. The point is to make learning algorithms sample-efficient. If the environment you are doing RL on is the real world, you can't make the "environment go fast".
With "infinite samples", you can randomly sample policies until you stumble on one with high reward.
21.12.2024 15:51 · 5 likes · 0 reposts · 1 reply · 0 quotes
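To make the "infinite samples" point above concrete, here is a toy sketch of brute-force random policy search; it assumes a Gymnasium-style environment with a discrete action space and a flat observation vector, and the linear policy is purely illustrative.

```python
# Toy illustration: with enough environment throughput you can just sample
# random policies and keep the best one. Assumes a Gymnasium-style env with
# a Discrete action space and a flat Box observation space.
import numpy as np

def random_policy_search(env, n_policies=10_000, episode_len=200, seed=0):
    rng = np.random.default_rng(seed)
    obs_dim = env.observation_space.shape[0]
    n_actions = env.action_space.n
    best_params, best_return = None, -np.inf
    for _ in range(n_policies):
        W = rng.normal(size=(n_actions, obs_dim))  # random linear policy
        obs, _ = env.reset()
        episode_return = 0.0
        for _ in range(episode_len):
            action = int(np.argmax(W @ obs))
            obs, reward, terminated, truncated, _ = env.step(action)
            episode_return += reward
            if terminated or truncated:
                break
        if episode_return > best_return:
            best_params, best_return = W, episode_return
    return best_params, best_return
```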
GitHub - GFNOrg/diffusion-samplers
Come check out our NeurIPS poster today! We will be at West Ballroom #7101 from 4:30pm to 7:30pm.
Website: github.com/gfnorg/diffu...
12.12.2024 20:51 · 1 like · 1 repost · 0 replies · 0 quotes
If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!
10.12.2024 21:55 · 63 likes · 18 reposts · 2 replies · 4 quotes
Even his current claim that o1 is "better than most humans in most tasks" is pretty wild imo. What are "most tasks" here, even? Obviously not any physical tasks, because there is no embodiment. Can o1 actually completely replace a human in any job? Can it manage a project from start to finish?
07.12.2024 23:07 · 0 likes · 0 reposts · 0 replies · 0 quotes
x.com/vahidk/statu...
07.12.2024 22:54 · 0 likes · 0 reposts · 1 reply · 0 quotes
It also doesn't help when OpenAI staff post about how o1 is already AGI (yes, this happened today).
Unfortunately the dialogue is directed by those on either end of the spectrum (AI is useless vs AGI is already here) without much room for nuance.
07.12.2024 22:14 · 4 likes · 0 reposts · 1 reply · 0 quotes
A year before CEO shooting, lawsuit alleged UHC used AI to deny coverage
The lawsuit accuses UnitedHealthcare of using artificial intelligence to deny coverage to elderly patients.
www.newsweek.com/united-healt...
I have anecdotal evidence from a friend who works at a client company of a popular insurance firm. They are using shitty "AI models", which are basically just CatBoost, to mass-process claims. They know the models are shit, but that's also the point. Truly sickening.
06.12.2024 09:01 · 2 likes · 0 reposts · 0 replies · 0 quotes
It is reductive to blame it all on a single CEO, but I find it hard to believe that you are "shocked" by this public reaction. UHC has the highest claim denial rate among insurance providers, resulting in untold medical bankruptcies and preventable deaths. I'm shocked this doesn't happen more often.
06.12.2024 08:45 · 5 likes · 0 reposts · 1 reply · 0 quotes
Subtlety and nuance go out the window when strong political feelings are thrown into the mix. I understand why AI researchers can get defensive or angry in response to toxic comments, but we should still try to understand the origin of people's anger. Imo, right-wing Silicon Valley AI billionaires are the root.
01.12.2024 20:40 · 0 likes · 0 reposts · 0 replies · 0 quotes
I think the recent conflict between AI researchers and the anti-AI clique hints at the latter. This broad left leaning user base could fracture again as differences in opinions between the farther left and moderate factions get amplified.
01.12.2024 04:28 · 1 like · 0 reposts · 0 replies · 0 quotes
This app is an interesting social experiment. Assuming Bluesky doesn't just fizzle out, will the hostile social dynamics of Twitter resurface here too? If hostilities do return, will it be because conservatives come to this app, or because of new political tensions within left-leaning communities?
01.12.2024 04:23 · 0 likes · 0 reposts · 1 reply · 0 quotes
Another thing: let's reflect on whether they actually have a point. When I reflect deeply on it, I am not even personally convinced that, in the grand scheme of things, AI is going to be a net good for humanity. So maybe the distaste is warranted and we're the ones in the bubble?
30.11.2024 14:05 · 2 likes · 0 reposts · 1 reply · 0 quotes
As AI researchers, we shouldn't demonize people outside our space who have a passionate distaste for AI. You have to understand that most of the pro-AI sentiment people see online comes from absolutely vile "AI bros", especially on Twitter. We just need to distinguish ourselves as academics.
30.11.2024 14:03 · 3 likes · 1 repost · 1 reply · 0 quotes
Yeah, it will definitely not be "true OT" in the end, but it works to get surprisingly smooth ODE paths which can be easily numerically integrated. You can train a CIFAR-10 flow model that generates high-quality images with 5-10 Euler steps.
30.11.2024 13:51 · 0 likes · 0 reposts · 0 replies · 0 quotes
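As a rough illustration of those few-step samples, integrating a trained flow model with a handful of Euler steps could look like the sketch below; `velocity_model(x, t)` is a placeholder for the learned vector field, and the shapes assume CIFAR-10-sized images with noise at t=0 and data at t=1.

```python
# Sketch: few-step Euler integration of a learned flow ODE dx/dt = v(x, t).
# `velocity_model` is a hypothetical placeholder for the trained vector field.
import torch

@torch.no_grad()
def sample_with_euler(velocity_model, n_samples=16, n_steps=8, device="cpu"):
    x = torch.randn(n_samples, 3, 32, 32, device=device)  # start from noise (t=0)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((n_samples,), i * dt, device=device)
        x = x + dt * velocity_model(x, t)  # one Euler step toward data (t=1)
    return x
```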
Sure, that argument works from a utilitarian perspective.
From a monkey-brain casual-user point of view, it looks ugly and outdated. And I think that is what the focus should be on.
29.11.2024 04:03 · 0 likes · 0 reposts · 1 reply · 0 quotes
Does anyone have thoughts on which generative models also learn the best representations for downstream tasks?
My guess is GANs are a dark horse and their latents carry important abstract features. But we haven't explored this much since they are hard to train.
29.11.2024 04:02 · 1 like · 0 reposts · 0 replies · 0 quotes
You could just have a verification system like the one on pre-Elon Twitter, where blue check marks denote verified accounts.
29.11.2024 03:52 · 1 like · 0 reposts · 2 replies · 0 quotes
Ideally it should default to your username, like on Twitter. These small inconveniences add up over time and could drive people back to Twitter, so they need to be changed. Twitter perfected the design of this kind of social media, and these minor design choices matter.
29.11.2024 01:38 · 2 likes · 0 reposts · 2 replies · 0 quotes
IQL and BCQ are still the most consistent, reliable offline RL algorithms. Interestingly, IQL also optimizes for the optimal batch-constrained policy (just without the behavior policy model that BCQ needs).
Many other algorithms seem to work "better" because they overfit hyperparameters to D4RL.
27.11.2024 14:30 · 4 likes · 0 reposts · 1 reply · 0 quotes
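For reference, the expectile value loss at the core of IQL (following Kostrikov et al., 2021) is short enough to sketch directly; the tensors below are placeholders, not code from any particular implementation.

```python
# Sketch of IQL's expectile value loss: an asymmetric L2 regression of V(s)
# toward Q(s, a) for in-dataset actions. With tau > 0.5, V approximates an
# upper expectile of Q over actions in the batch, which is what yields the
# implicit batch-constrained policy improvement mentioned above.
import torch

def expectile_value_loss(q_values, v_values, tau=0.7):
    diff = q_values - v_values
    weight = torch.where(diff > 0,
                         torch.full_like(diff, tau),
                         torch.full_like(diff, 1.0 - tau))
    return (weight * diff.pow(2)).mean()
```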
xLSTM: Extended Long Short-Term Memory
In the 1990s, the constant error carousel and gating were introduced as the central ideas of the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and contributed to numerou...
xLSTM helps with the parallelizability issue: arxiv.org/abs/2405.04517
I suspect the memory issues and compute scaling with sequence length will motivate a large-scale model built on these soon. Probably for high-dimensional data like video rather than language.
27.11.2024 14:20 · 3 likes · 0 reposts · 0 replies · 0 quotes
Pretty cool, I didn't know of this work. Recurrent nets are still quite slow to train on long sequences like in LLMs because they're not parallelizable (though chunking as in your paper would definitely help). Would be curious to see how well it works at very large scale.
27.11.2024 14:17 · 1 like · 0 reposts · 1 reply · 0 quotes
Would like to be added :)
27.11.2024 14:09 · 0 likes · 0 reposts · 0 replies · 0 quotes
PhD candidate at UCSD. Prev: NVIDIA, Meta AI, UC Berkeley, DTU. I like robots 🤖, plants 🪴, and they/them pronouns 🏳️‍🌈
https://www.nicklashansen.com
AI prof at Mila (HEC) trying to make the future more cooperative and cool
Deep learning, real-world generalization, responsible AI, safety, risk, climate, ecology, artscience, opensource, anticolonial AI
they/she
teganmaharaj.neocities.org
Assistant Professor at Mila and UdeM
https://necludov.github.io/
MSc at Mila, Reinforcement learning, representation learning and probabilistic inference.
AI Research @ Mila | Harvard | Cambridge | Edinburgh
Ph.D. Student at Carnegie Mellon,
Student Researcher at Google
Formerly Applied Science Intern Amazon, Undergrad at Delhi Technological University
Foundation Models for Structured Data (Time Series, Tabular), applications in healthcare.
Research Scientist at ServiceNow
Gradient-descent enthusiast building LLM agents.
Formerly Mila, DeepMind, Amazon, Element AI, Spotify
Master's @Mila Quebec | Generative Models, AI4Science, ML for Chemistry
FAIR Chemistry. Simulation-based Inference.
Research Scientist@Google DeepMind
Assoc Prof@York University, Toronto
mbrubake.github.io
Research: Computer Vision and Machine Learning, esp generative models.
Applications: CryoEM (cryoSPARC), Statistics (Stan), Forensics, and more
PhD @ucberkeleyofficial.bsky.social | Past: AI4Code Research Fellow @msftresearch.bsky.social | Summer @EPFL Scholar, CS and Applied Maths @IIITDelhi | Hobbyist Saxophonist
https://lakshyaaagrawal.github.io
Maintainer of https://aka.ms/multilspy
Postdoctoral researcher at McGill in #AI #ML, core developer of SpeechBrain.
Studies speech pattern analysis for biomarkers, speech enhancement, robust ASR, continual learning, etc.
Ph.D. Student at Mila
Visiting Researcher at Meta FAIR
Causality, Trustworthy ML
Former: Microsoft Research, IIT Kanpur
divyat09.github.io
PhD student at @mcgillu / @MILAMontreal. Bandits and Reinforcement Learning. BS, MS @Stanford 🇧🇩🇺🇸🇨🇦 I am on the job market!
https://hmishfaq.github.io/
Postdoc at CBS, Harvard University
(New around here)
Ex-SWE @ Google Life Science // Radiology Resident @ UPenn // researching self supervision for radiology AI
CTO @ Peppermint Robotics. Prev EEE @ BITS Pilani.
Making robots do the hard work
Developing the next generation of audio production tools for Apple Silicon @ www.almostrealism.com
ML for remote sensing @Mila_Quebec * UdeM x McGill CS alum
Interests: Responsible ML for climate & societal impacts, STS, FATE, AI Ethics & Safety
prev: SSofCS lab
🇨🇦 Montreal (allegedly)
TW: @XMichellelinX
https://mchll-ln.github.io/
Research Intern at Mila- Quebec AI Institute
AI Engineering at Vector Institute