Check the cheese in your cheese drawer! My parents in Illinois had two packages of shredded cheese in their cheese drawer that had been recalled.
www.npr.org/2025/12/03/n...
@jmac-ai.bsky.social
Ask me about Reinforcement Learning
Research @ Sony AI
AI should learn from its experiences, not copy your data.
My website for answering RL questions: https://www.decisionsanddragons.com/
Views and posts are my own.
Didn't say I didn't like it. I said I wasn't sure how to best continue with it for the point I was making.
03.12.2025 14:20
Bluesky isn't a great medium to go into all the details about how I think it should be changed, but I expand a bit on it in a separate thread here:
bsky.app/profile/jmac...
At some point I may write something more complete up.
Victim means people's careers depend on a system that is bad, even counterproductive, at quality scientific dissemination, and it abuses people's time.
Like I said in the replies, deanonymizing alone isn't a fix. We need substantive changes to the whole system, which includes it.
Because they -- we -- are all victims of the publication system established 70-some years ago. On both sides of it. The system does not work for us and has only gotten worse with time.
It's time for a change.
Good post that shows connections between policy gradient and "other" methods and what that implies about the effectiveness of PG.
(I also don't think it has that many equations, but I may have a higher than typical threshold :p )
Fast simulators are very useful experimental tools, but I think they're the wrong "scale" for RL in general. The dream of adaptive embodied agents means sample efficiency is king.
However, it is paramount that our RL algorithms _computationally_ scale. Currently, they don't as much as they should.
I'm not sure there is an appropriate signal metaphor to continue with, but my preference is that we should make it easier for the community to find the good work relevant to them, rather than gatekeep bad work.
Gatekeeping is a losing battle requiring enormous effort, and it hurts good work.
I agree that increased pressure would be bad. Our goal should be to make it easier on them, provide more rewards for it, and make reviews more useful to the community.
Furthermore, if all we did was deanonymize reviews with the current system, that would likely be a detriment.
I think there is! The context of changes I think we should make is larger than I included in this thread though, so I understand if that's not obvious here :p
At some point I need to write up all my thoughts in a more digestible form.
And FWIW, I've been calling for these changes for many years now. It didn't require a leak for me to reach these conclusions.
01.12.2025 16:15
There may be special cases where we still want anonymous reviews, but they won't be the norm like they are now.
01.12.2025 16:13
If we change how publishing works substantively, we can ease the burden & move the role of reviews from filtering bad work to promoting good work with constructive feedback.
With that change, it's desirable for both reviewers and the community that reviews be deanonymized in most cases.
It's not punitive at all to me. Quite the contrary. I think reviewers are victims of the system as is and get no benefit from it despite the incredible amount of work writing a good review requires.
01.12.2025 16:13
(And while I say so in my thread, just to be clear, I'm not opposed to review. I think that's incredibly important. And I'm open to some blind review in some capacity. But I'm very against the way it works in our publishing system.)
01.12.2025 14:50
I'm not surprised people support it. But I think they still support it because you probably cannot merely remove it and do nothing else about our systems of publication. That and because it's the status quo that we were taught was really really important.
01.12.2025 14:48
I think we can do both! There are a lot of things I would restructure if I could will it into existence, but more focused communities and making them more viable for career advancement would absolutely be among them.
01.12.2025 14:45
Too many people suck, sorry you had to deal with that.
If it's any consolation, they sound to me like they're trying to cope with their own failures.
With the right publishing system (see later), we can get another benefit: changing the primary role of review from filtering bad work to promoting good work with constructive feedback for further analysis.
The current adversarial "filter bad work" role of reviews does not serve science well.
Will it be perfect in getting new researchers' work noticed (assuming it should be)? No. But we are *far* from perfect along this dimension as it is. Our blind review and publication process hasn't equalized the playing field. We can do better, even in this dimension, under a different system.
30.11.2025 16:13
As it is, I already find most papers of interest these days via social media, and social media wasn't designed for this problem. We can do better if we choose to target it.
30.11.2025 16:13
The objection will be that new researchers will have a hard time getting their work noticed.
I don't believe that. First, new researchers are usually attached to established researchers. Second, we can address that problem by building community tooling.
The only way I can see to win against the flood of papers is to separate publication/archiving from review. I'd go further and separate review from curation.
Garbage papers won't get reviewed by good people unless the review is scathing and they're pissed. Garbage auto-fails and we don't waste our time.
Some may be concerned that some people will be disinclined to write reviews if they're not blind, and our system buckles even more.
Good. It's already broken. It's time to change it so it's not built on this demand for k poorly assigned reviewers who can't handle the workload anyway.
Re: powerful actors suppressing others. The thing about openness is it exposes them too. If we design our system to allow any number of reviews instead of "k random assigned reviewers who probably are not good candidates to review your work anyway" other reviewers can push back on revenge reviews.
30.11.2025 16:13
Some upsides of non-blind review, especially if we design publishing around it:
- Better curation: find the work w/ positive reviews from people you respect.
- Reviews can be part of how you advance. Taking the time to write good reviews has value to both reviewer and community.
- Accountability
I'm open to there being a role for blind review, but introducing non-blind review has a lot of upsides that may reduce how much we actually care about blind review.
I think we care about blind review only because our publishing system is poorly designed and needs change in the modern era anyway.
It was always cool. That the AI groupies didn't know it until an AI pop star said so proves it.
26.11.2025 18:06
So in summary, the things I think are going on are:
* The task isn't truly sparse.
* The pre-training highly correlates ways of answering (like policies) so that you get very good generalization.
* Inference search means you only need to slightly increase the probabilities to see major changes.
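To put rough numbers on that last point, here is a minimal sketch. It assumes "inference search" means best-of-k sampling with a perfect verifier and independent samples, which is my simplification for illustration and not something the thread specifies. The function name `best_of_k_success` and the values of p and k are hypothetical.

```python
def best_of_k_success(p: float, k: int) -> float:
    """Chance that at least one of k independent samples is correct,
    given a per-sample success probability p (assumed: perfect verifier)."""
    return 1.0 - (1.0 - p) ** k


if __name__ == "__main__":
    # Illustrative values only: a small bump in per-sample probability.
    for p in (0.01, 0.02):
        rate = best_of_k_success(p, 64)
        print(f"per-sample p={p:.2f} -> best-of-64 success={rate:.3f}")
```

Under these assumptions, moving the per-sample probability from 1% to 2% takes best-of-64 success from roughly 47% to roughly 73%, so a small shift in the policy shows up as a large shift in the searched outcome.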