Nanne van Noord

@nanne.bsky.social

Assistant Professor of Visual Culture and Multimedia at University of Amsterdam. http://nanne.github.io

300 Followers 164 Following 48 Posts Joined Sep 2023
3 weeks ago

Interested in finding out why there are fewer women artists in Dutch museums? A few days left to apply!

1 month ago
Vacancy — Postdoc Quantifying Gender Inequality in Visual Art. Are you passionate about art, gender equality, and data-driven research? Join the HERAtlas project to uncover the "invisible" women of art history. We are looking for a Postdoc to combine data science, psychology, and history to reveal the structural barriers behind gender inequality in the creative industries and translate these insights into public storytelling.

For the HERAtlas project at the University of Amsterdam (Netherlands) we are looking for a Postdoc to combine data science, psychology, and digital history to uncover the "invisible" women of art history!

More info at: werkenbij.uva.nl/en/vacancies...

1 month ago

Congrats @mila-oiva.bsky.social ! Excited to see all the amazing things you'll be doing at FAU 🤩

3 months ago
YouTube
JUTTERS YouTube video by Meike

Two students of our lab are presenting an artwork at NeurIPS, how amazing is that? Really impressed with the project openreview.net/pdf?id=BZjSU..., and the video they made for it!

www.youtube.com/watch?v=L631...

3 months ago

Wait, you get summaries of your own papers? That seems like a step up from the "I see you work on <insert topic I've not touched in my life>" emails, at least

5 months ago

And lastly, if @neuripsconf.bsky.social were to reverse the decisions on the papers affected by space constraints, we would be happy and able to accommodate their presentation.

7 months ago

You're arguing in bad faith, so this will be my last reply.

But yes, if you actually want to learn about multimodality then you shouldn't read about MLLMs.

7 months ago

I'm not sure what the point here is, but if you're going to believe Gemini over actual research done by AI researchers there isn't much more to discuss.

If you're willing to actually learn about this then you can start here: arxiv.org/abs/2505.19614, or even here: academic.oup.com/dsh/article/...

7 months ago

That's a bit sealion-y, but I'll bite - *artificial* neural networks are a poor analogy.

Those different details also matter a lot; especially because the brain isn't just floating in a jar, it's part of an embodied system.

7 months ago

This is where your misunderstanding is happening, as they are not elementary pieces. For the visual tokens a lot of the semantics have already been determined, and hence the interpretations it can arrive at are limited.

The brain analogy really doesn't hold here. NN != brains.

7 months ago

It's clearly not; neural nets are a poor analogy for the brain, and clearly don't work the same way.

7 months ago

This, plus the (initial) interpretation of the modalities should not be independent - even at the pixel/word-level we may want to interpret differently depending on the other modalities (e.g., sense disambiguation)

Partial Information Decomposition has been used to formalise some of this
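For reference, the standard two-source formulation of Partial Information Decomposition (Williams & Beer's framework) splits the joint mutual information between two modalities and a target into four non-negative atoms:

```latex
I(T; X_1, X_2) =
    \underbrace{\mathrm{Red}(T; X_1, X_2)}_{\text{redundant}}
  + \underbrace{\mathrm{Unq}(T; X_1 \setminus X_2)}_{\text{unique to } X_1}
  + \underbrace{\mathrm{Unq}(T; X_2 \setminus X_1)}_{\text{unique to } X_2}
  + \underbrace{\mathrm{Syn}(T; X_1, X_2)}_{\text{synergistic}}
```

The synergy atom is the sense-disambiguation case: information about the target that neither modality carries on its own, only the two jointly.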

7 months ago

No.. that's not how any of that works 😵‍💫

7 months ago

It means I said 'mix' to explain the process, but I obviously know this involves attention - so the Gemini explanation is not meaningfully different.

Potential is limited: if key visual info is missing, attention won't recover it. So a lot of 'decisions' about the visual input are made before fusion

7 months ago

Ah, I see how you and Gemini misunderstood. I was talking about extracting visual tokens, and mix referred to attention.

That doesn't make it meaningfully multimodal; the potential of the visual tokens is still limited by the visual encoder.

Anyway, if I wanted to talk to an LLM I would do that directly

7 months ago

Please do explain then how whatever you're referring to is different and actually meaningfully multimodal.

7 months ago

*all semantic information* is quite the claim; in our experiments they miss a lot of semantics from visual

'text space' in that after the image encoder the visual information is fixed, and then mixed with text tokens for seq2text - which is not how multimodality works.
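The fixed-visual-tokens point can be made concrete with a toy numpy sketch (made-up dimensions, random matrices standing in for a trained encoder and projection layer; none of this is a real model): the image is encoded once, projected into the text embedding space, and only then concatenated with the text tokens that the decoder sees.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions, purely illustrative.
D_VIS, D_TXT, N_PATCH, N_TOK = 8, 16, 4, 5

def visual_encoder(image):
    """Stand-in for a frozen image encoder (e.g. a ViT): the image is
    reduced to a fixed set of patch features *before* any text is seen."""
    return image.reshape(N_PATCH, D_VIS) @ rng.standard_normal((D_VIS, D_VIS))

def project_to_text_space(patch_feats):
    """Linear projection mapping visual features into the text embedding
    space -- the 'mapped to text space' step discussed in the thread."""
    W = rng.standard_normal((D_VIS, D_TXT))
    return patch_feats @ W

image = rng.standard_normal(N_PATCH * D_VIS)
text_emb = rng.standard_normal((N_TOK, D_TXT))

visual_tokens = project_to_text_space(visual_encoder(image))
# The decoder only ever attends over this concatenated sequence; whatever
# the visual encoder discarded cannot be recovered by attention downstream.
sequence = np.concatenate([visual_tokens, text_emb], axis=0)
print(sequence.shape)  # (N_PATCH + N_TOK, D_TXT)
```

The sketch is only meant to show where the information bottleneck sits: everything the model will ever know about the image is decided at the `visual_encoder` step.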

7 months ago

Natively is a bit of an exaggeration, as it's mostly just other modalities mapped to text space as input - but this makes their 'understanding' rather shallow

7 months ago
Identifying Prompted Artist Names from Generated Images A common and controversial use of text-to-image models is to generate pictures by explicitly naming artists, such as "in the style of Greg Rutkowski". We introduce a benchmark for prompted-artist reco...

This paper on identifying prompted artist names from generated images is such a fun and creative take on data attribution arxiv.org/abs/2507.18633

Wonder if it would do something meaningful for analysing artistic influence for human-made art 🤔

7 months ago

This paper is 💯

Generally, I have the impression NLP does better at this than CV - but clearly both fields should push studying culture beyond just looking at national identities

7 months ago

If the priority is to dunk on people who know less about AI, instead of being accurate, that could be a conclusion I guess.

7 months ago
Visual Geometry Group - University of Oxford Computer Vision group from the University of Oxford

It would be weird to describe this 2012 system, which does search, as 'an SVM classifier doing search': www.robots.ox.ac.uk/%7Evgg/publi...

Similarly, I wouldn't describe an LLM that translates a query to a destination for a Waymo as an 'LLM driving a car'

7 months ago

I'm not questioning your definition of searching, I'm questioning your use of "LLMs".

I don't think defining an LLM as a transformer-based NN is inaccurate, in which case it isn't doing search by itself, and then it would be fine to argue that it can only hallucinate.

7 months ago

That statement mostly seems to apply to hosted commercial systems. It takes more than just downloading an LLM from huggingface to have a system that does this.

Sure an LLM can be trained to formulate queries and process results, but the system doing the searching is more than 'just' an LLM.

7 months ago

Fair, but still meaningful to make the distinction between LLMs and reasoning models, as not all LLMs are reasoning models. Especially if the point is to communicate across silos.

7 months ago

Do LLMs do search? Afaik there have been systems built around LLMs that do search, and then send these results back to them (i.e., RAG-like) - but that isn't the same as an LLM doing search.
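The division of labour can be sketched in a few lines (every component here is a made-up stub: `llm` is a canned stand-in for a model call, `retriever` is just a dictionary lookup over toy documents): the LLM formulates the query and consumes the results, but the search itself happens outside it.

```python
# Toy sketch of a search-augmented (RAG-like) loop. The point is the
# division of labour: the LLM only produces and consumes text, while a
# separate retriever component actually performs the search.

DOCS = {
    "eurips registration": "EurIPS registration details were not yet public.",
    "neurips locations": "NeurIPS announced additional presentation venues.",
}

def llm(prompt: str) -> str:
    """Stub standing in for a language model call."""
    if prompt.startswith("Formulate a search query for:"):
        return prompt.removeprefix("Formulate a search query for:").strip().lower()
    return f"Answer based on retrieved context: {prompt}"

def retriever(query: str) -> str:
    """The component that actually searches -- not the LLM itself."""
    return DOCS.get(query, "no results")

def answer(question: str) -> str:
    query = llm(f"Formulate a search query for: {question}")   # LLM step
    context = retriever(query)                                 # search step
    return llm(f"{question}\nContext: {context}")              # LLM step

print(answer("EurIPS registration"))
```

Strip out the `retriever` and the remaining LLM calls can only generate from their weights, which is the hallucination-only argument in a nutshell.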

7 months ago

I couldn't find EurIPS registration costs; hopefully they can address this by lowering costs for authors

But yes - this has been absurd; especially for those with visa issues - and I do think for that group this is a (minor) improvement

7 months ago

Not my intention to defend the requirement for a full registration, but this has been common practice for a while across multiple conferences.

The main change with the new locations seems to be that those with US visa issues will be able to present somewhere. But it doesn't really change costs

7 months ago

This considers registration only, no? One could register for in-person attendance but not go - folks with visa issues have had to do this

8 months ago

This distinction is also useful because it makes it harder to avoid responsibility: it's easy to avoid directly working on surveillance, yet harder to avoid doing CV work that is surveillance-enabling.

Unless your position is that these are the same?
