James MacGlashan

@jmac-ai.bsky.social

Ask me about Reinforcement Learning
Research @ Sony AI
AI should learn from its experiences, not copy your data.
My website for answering RL questions: https://www.decisionsanddragons.com/
Views and posts are my own.

2,209 Followers  |  1,067 Following  |  701 Posts  |  Joined: 28.09.2024

Latest posts by jmac-ai.bsky.social on Bluesky

Check your cheese: Shredded and grated varieties are recalled nationwide The FDA is urging customers to toss certain brands of grated Pecorino Romano; at the same time, it escalated an existing recall of numerous shredded cheeses.

Check the cheese in your cheese drawer! My parents in Illinois had two packages of shredded cheese in their cheese drawer that had been recalled.

www.npr.org/2025/12/03/n...

04.12.2025 04:32 | 👍 5    🔁 4    💬 1    📌 1

Didn't say I didn't like it. I said I wasn't sure how to best continue with it for the point I was making.

03.12.2025 14:20 | 👍 1    🔁 0    💬 0    📌 0

Bluesky isn't a great medium to go into all the details about how I think it should be changed, but I expand a bit on it in a separate thread here:
bsky.app/profile/jmac...

At some point I may write something more complete up.

03.12.2025 01:09 | 👍 0    🔁 0    💬 0    📌 0

Victim means people's careers are dependent on a system that is bad, even counterproductive, at quality scientific dissemination, and it abuses people's time.

Like I said in the replies, deanonymizing alone isn't a fix. We need substantive changes to the whole system, which includes it.

03.12.2025 01:09 | 👍 0    🔁 0    💬 1    📌 0

Because they -- we -- are all victims of the publication system established 70-some years ago. On both sides of it. The system does not work for us and has only gotten worse with time.

It's time for a change.

03.12.2025 00:44 | 👍 1    🔁 0    💬 1    📌 0

Good post that shows connections between policy gradient and "other" methods and what that implies about the effectiveness of PG.

(I also don't think it has that many equations, but I may have a higher than typical threshold :p )

02.12.2025 19:01 | 👍 6    🔁 0    💬 0    📌 0

Fast simulators are very useful experimental tools, but I think they're the wrong "scale" for RL in general. The dream of adaptive embodied agents means sample efficiency is king.

However, it is paramount that our RL algorithms _computationally_ scale. Currently, they don't as much as they should.

02.12.2025 15:15 | 👍 1    🔁 0    💬 1    📌 0

I'm not sure there is an appropriate signal metaphor to continue with, but my preference is that we should make it easier for the community to find the good work relevant to them, rather than gatekeep bad work.

Gatekeeping is a losing battle requiring enormous effort, and hurts good work.

02.12.2025 15:10 | 👍 1    🔁 0    💬 1    📌 0

I agree that increased pressure would be bad. Our goal should be to make it easier on them, provide more rewards for it, and make reviews more useful to the community.

Furthermore, if all we did was deanonymize reviews with the current system, that would likely be a detriment.

01.12.2025 18:28 | 👍 3    🔁 0    💬 0    📌 0

I think there is! The context of changes I think we should make is larger than I included in this thread though, so I understand if that's not obvious here :p

At some point I need to write up all my thoughts in a more digestible form.

01.12.2025 17:32 | 👍 3    🔁 0    💬 1    📌 0

And FWIW, I've been calling for these changes for many years now. It didn't require a leak for me to reach these conclusions.

01.12.2025 16:15 | 👍 3    🔁 0    💬 1    📌 0

There may be special cases where we still want anonymous reviews, but they won't be the norm like they are now.

01.12.2025 16:13 | 👍 2    🔁 0    💬 1    📌 0

If we change how publishing works substantively, we can ease the burden & move the role of reviews from filtering bad work to promoting good work with constructive feedback.

With that change, it's desirable for both reviewers and the community to have deanonymized reviews in most cases.

01.12.2025 16:13 | 👍 1    🔁 0    💬 1    📌 0

It's not punitive at all to me. Quite the contrary. I think reviewers are victims of the system as is and get no benefit from it despite the incredible amount of work writing a good review requires.

01.12.2025 16:13 | 👍 1    🔁 0    💬 2    📌 0

(And while I say so in my thread, just to be clear, I'm not opposed to review. I think that's incredibly important. And I'm open to some blind review in some capacity. But I'm very against the way it works in our publishing system.)

01.12.2025 14:50 | 👍 1    🔁 0    💬 0    📌 0

I'm not surprised people support it. But I think they still support it because you probably cannot merely remove it and do nothing else about our systems of publication. That and because it's the status quo that we were taught was really really important.

01.12.2025 14:48 | 👍 1    🔁 0    💬 1    📌 0

I think we can do both! There are a lot of things I would restructure if I could will it into existence, but more focused communities and making them more viable for career advancement would absolutely be among them.

01.12.2025 14:45 | 👍 1    🔁 0    💬 0    📌 0

Too many people suck, sorry you had to deal with that.

If it's any consolation, they sound to me like they're trying to cope with their own failures.

30.11.2025 18:24 | 👍 2    🔁 0    💬 0    📌 0

With the right publishing system (see later), we can get another benefit: changing the primary role of review from filtering bad work to promoting good work with constructive feedback for further analysis.

The current adversarial "filter bad work" role of reviews does not serve science well.

30.11.2025 16:13 | 👍 2    🔁 1    💬 1    📌 0

Will it be perfect in getting new researchers' work noticed (assuming it should be)? No. But we are *far* from perfect along this dimension as it is. Our blind review and publication process hasn't equalized the playing field. We can do better, even in this dimension, under a different system.

30.11.2025 16:13 | 👍 1    🔁 0    💬 0    📌 0

As it is, I already find most papers of interest these days via social media, and social media wasn't designed for this problem. We can do better if we choose to target it.

30.11.2025 16:13 | 👍 1    🔁 0    💬 1    📌 0

The objection will be that new researchers will have a hard time getting their work noticed.

I don't believe that. First, new researchers are usually attached to established researchers. Second, we can address that problem by building community tooling.

30.11.2025 16:13 | 👍 3    🔁 0    💬 1    📌 0

The only way I can see to win against the flood of papers is to separate publication/archiving from review. I'd go further and separate review from curation.

Garbage papers won't get reviewed by good people unless the review is scathing and they're pissed. Garbage auto-fails and we don't waste our time.

30.11.2025 16:13 | 👍 1    🔁 0    💬 1    📌 1

Some may be concerned that some people will be disinclined to write reviews if not blind and our system buckles even more.

Good. It's already broken. It's time to change it so it's not built on this demand for k poorly assigned reviewers who can't do the workload anyway.

30.11.2025 16:13 | 👍 0    🔁 0    💬 1    📌 0

Re: powerful actors suppressing others. The thing about openness is it exposes them too. If we design our system to allow any number of reviews instead of "k randomly assigned reviewers who probably are not good candidates to review your work anyway", other reviewers can push back on revenge reviews.

30.11.2025 16:13 | 👍 1    🔁 0    💬 1    📌 0

Some upsides of non-blind review, especially if we design publishing around it:
- Better curation: find the work w/ positive reviews of people you respect.
- Reviews can be part of how you advance. Taking the time to write good reviews has value to both reviewer and community.
- Accountability

30.11.2025 16:13 | 👍 2    🔁 0    💬 1    📌 0

I'm open to there being a role for blind review, but introducing non-blind review has a lot of upsides that may reduce how much we actually care about blind review.

I think we care about blind review only because our publishing system is poorly designed and needs change in the modern era anyway.

30.11.2025 16:13 | 👍 10    🔁 3    💬 3    📌 2

It was always cool. That the AI groupies didn't know it until an AI pop star said so proves it.

26.11.2025 18:06 | 👍 3    🔁 0    💬 1    📌 0

So in summary, the things I think are going on are
* The task isn't truly sparse.
* The pre-training highly correlates ways of answering (like policies) so that you get very good generalization.
* Inference search means you only need to slightly increase the probabilities to see major changes.

25.11.2025 15:16 | 👍 3    🔁 0    💬 0    📌 0
