
Finbarr

@finbarr.bsky.social

building the future research at midjourney, deepmind. slinging ai hot takes πŸ₯žat artfintel.com

2,073 Followers  |  58 Following  |  89 Posts  |  Joined: 16.04.2023

Latest posts by finbarr.bsky.social on Bluesky

would love your take when you do!

07.12.2024 15:56 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

πŸ₯Έ

07.12.2024 15:53 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
On the State of the Art of Evaluation in Neural Language Models — shows that LSTMs are as good as or better than recent innovations for LM, and that model evaluation is often unreliable.

This is one of my all time favorite papers:

openreview.net/forum?id=ByJ...

It shows that, under fair experimental evaluation, LSTMs do just as well as a bunch of β€œimprovements”

07.12.2024 15:51 β€” πŸ‘ 24    πŸ” 4    πŸ’¬ 3    πŸ“Œ 0
On the State of the Art of Evaluation in Neural Language Models — shows that LSTMs are as good as or better than recent innovations for LM, and that model evaluation is often unreliable.

openreview.net/forum?id=ByJ...

07.12.2024 15:50 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I’ll find it one sec

07.12.2024 15:48 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Very likely.

06.12.2024 00:57 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

πŸ™

06.12.2024 00:56 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Fun fact: I recently encountered (well, saw on the news) the only other person named finbarr in Canada I’ve ever seen.

The only issue is, he was an arsonist who set a ton of fires in Edmonton.

06.12.2024 00:56 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Really fun conversation with @natolambert.bsky.social!

05.12.2024 20:43 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

This is mckernan! What I thought was a nice neighborhood πŸ˜‚

03.12.2024 18:19 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Apparently there *is* another finbar(r) in Alberta.

03.12.2024 06:19 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

New homeowner fear unlocked; someone hit and ran my neighbor’s garage

03.12.2024 06:18 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I thought that was ai gen at first!

02.12.2024 12:32 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

there’s a type of β€œnot trying” which means not executing at the level of competence of a $XX billion corporation

this is the complaint about eg Google products. They’re good! better than most startups! But not β€œtrillion dollar corporation famed for engineering expertise” good.

01.12.2024 16:50 β€” πŸ‘ 8    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

would also accept Austria

30.11.2024 04:57 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0

I watched too many ski movies and now am trying to convince my wife we should move to Alaska

30.11.2024 04:56 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0

building my own mlp implementation from scratch in numpy, including backprop, remains one of the most educational exercises I’ve done

30.11.2024 04:18 β€” πŸ‘ 19    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0
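The from-scratch exercise mentioned above can be sketched in a few dozen lines. This is a minimal illustrative version (a 2-layer MLP with hand-derived backprop on a toy sin-regression task); the architecture, hyperparameters, and data here are my own choices, not from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: learn y = sin(x).
X = rng.uniform(-3, 3, size=(256, 1))
y = np.sin(X)

# Parameters: input -> hidden (tanh) -> output (linear).
W1 = rng.normal(0, 0.5, size=(1, 32))
b1 = np.zeros(32)
W2 = rng.normal(0, 0.5, size=(32, 1))
b2 = np.zeros(1)

lr = 0.05
for step in range(3000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)          # (256, 32)
    pred = h @ W2 + b2                # (256, 1)
    loss = np.mean((pred - y) ** 2)

    # Backward pass: chain rule applied layer by layer.
    d_pred = 2 * (pred - y) / len(X)  # dL/dpred
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0)
    d_h = d_pred @ W2.T
    d_hpre = d_h * (1 - h ** 2)       # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_hpre
    db1 = d_hpre.sum(axis=0)

    # Plain gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final MSE: {loss:.4f}")
```

Deriving `d_hpre` and the weight gradients by hand (rather than calling `loss.backward()`) is where most of the learning happens.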

Welcome!

30.11.2024 04:18 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

πŸ™

26.11.2024 23:38 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Ha, it’s been on my list of todos for a while! I’m glad someone got to it.

26.11.2024 23:38 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Love this. Very clean implementations of various inference optimizations.

26.11.2024 23:20 β€” πŸ‘ 6    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Agreed! Folk knowledge is worth publishing!

26.11.2024 22:53 β€” πŸ‘ 10    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I mailed this out like a month ago and just never did the promo πŸ™ˆ

26.11.2024 16:52 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Force of habit!

26.11.2024 16:30 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Ahh you’re right!

26.11.2024 16:30 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Papers I've read this week: vision language models They kept releasing VLMs, so I kept writing...

again, link is: www.artfintel.com/p/papers-ive...

26.11.2024 16:19 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
The Bitter Lesson

seems like we're seeing convergence in VLM design. most recent models (Pixtral, PaliGemma, etc) are moving away from complex fusion techniques toward simpler approaches
as usual, the bitter lesson holds: better to learn structure than impose it
incompleteideas.net/IncIdeas/Bit...

26.11.2024 16:19 β€” πŸ‘ 7    πŸ” 0    πŸ’¬ 1    πŸ“Œ 1

open source VLMs use relatively little compute compared to what you might expect:
LLaVa: 768 A100 hours
DeepSeek-VL: 61,440 A100 hours
PaliGemma: ~12k A100 hours
(for reference, Stable Diffusion used 150k A100 hours)

26.11.2024 16:19 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

what i found interesting: VLMs are way simpler than they first appear. current SOTA is basically:
1. ViT encoder (init from SigLIP/CLIP)
2. pretrained LLM base
3. concat image features with text
4. finetune

26.11.2024 16:19 β€” πŸ‘ 4    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0
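The four-step recipe above is mostly tensor plumbing. A schematic sketch (the β€œencoder” and embedding table here are random stand-ins, and all dimensions are illustrative, not taken from any of the models named):

```python
import numpy as np

rng = np.random.default_rng(0)

D_VIT, D_LLM, N_PATCH, N_TEXT, VOCAB = 384, 768, 64, 16, 1000

def vit_encode(image):
    """Stand-in for a SigLIP/CLIP-style ViT: image -> patch embeddings."""
    return rng.normal(size=(N_PATCH, D_VIT))

# Learned projection mapping vision features into the LLM embedding space.
W_proj = rng.normal(0, 0.02, size=(D_VIT, D_LLM))

# Stand-in for the pretrained LLM's token embedding table.
embed_table = rng.normal(0, 0.02, size=(VOCAB, D_LLM))

image = np.zeros((224, 224, 3))
text_ids = rng.integers(0, VOCAB, size=N_TEXT)

# Steps 1-3: encode image, project, concat with text embeddings.
img_tokens = vit_encode(image) @ W_proj                       # (64, 768)
txt_tokens = embed_table[text_ids]                            # (16, 768)
llm_input = np.concatenate([img_tokens, txt_tokens], axis=0)  # (80, 768)

# Step 4 (finetuning) would train W_proj (and usually the LLM) on
# image-text pairs; the LLM itself just sees one long token sequence.
print(llm_input.shape)
```

The point is that after the projection, the image patches are just extra tokens in the LLM's input sequence — no bespoke fusion machinery required.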
Papers I've read this week: vision language models They kept releasing VLMs, so I kept writing...

link: www.artfintel.com/p/papers-ive...

26.11.2024 16:19 β€” πŸ‘ 3    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0
