
Will Smith

@willsmithvision.bsky.social

Professor in Computer Vision at the University of York, vision/graphics/ML research, Boro @mfc.co.uk fan and climber 📍 York, UK 🔗 https://www-users.york.ac.uk/~waps101/

1,095 Followers  |  558 Following  |  67 Posts  |  Joined: 12.11.2024

Latest posts by willsmithvision.bsky.social on Bluesky

Or maybe in the style of Dr Seuss. Or Shakespeare.

If so, did it work? Asking for a friend...

11.05.2025 09:56 — 👍 1    🔁 0    💬 2    📌 0

Has anyone ever tried a very non-standard tone for a rebuttal? I'm thinking something like "Hey reviewers! Sit back, relax and let me convince you that you actually want to accept this paper..." or "You wouldn't let a little thing like that stop you accepting the paper would you? WOULD YOU?!!!"

11.05.2025 09:56 — 👍 2    🔁 0    💬 2    📌 0

The date is just an unfortunate coincidence. This is a genuine SIGBOVIK submission and the full working source code is in our arXiv repository.

01.04.2025 19:24 — 👍 2    🔁 0    💬 0    📌 0
NeuRaLaTeX: A machine learning library written in pure LaTeX

Web: neuralatex.com
Paper: arxiv.org/abs/2503.24187
Source: arxiv.org/abs/2503.24187 (click "TeX source")

01.04.2025 12:29 — 👍 3    🔁 1    💬 0    📌 0

This has been a massively entertaining and challenging side project with @jadgardner.bsky.social and @willrowan.bsky.social over the past year and (subject to rigorous peer review) will be appearing at SIGBOVIK 2025.

01.04.2025 12:29 — 👍 4    🔁 0    💬 1    📌 0

@overleaf.com becomes your cloud compute provider. This should level the playing field within the community as both industry and academia will have to work within the same compute limits.

01.04.2025 12:29 — 👍 5    🔁 0    💬 1    📌 0

NeuRaLaTeX brings many side benefits. Your paper source code is also your method source code. No more "Code coming soon" on GitHub. If the paper is on arXiv, the paper source link *is* the method source code! No need to remember those silly git commands anymore!

01.04.2025 12:29 — 👍 3    🔁 0    💬 1    📌 0

In case your arXiv submission is timing out, we've also implemented checkpointing so you can include your trained model weights as a text file with your paper source.

01.04.2025 12:29 — 👍 3    🔁 0    💬 1    📌 0

In a NeuRaLaTeX paper, when you compile your PDF, a neural network is constructed, trained and evaluated with all results and figures generated dynamically. We've worked hard on efficiency and the NeuRaLaTeX paper only took 48 hours to compile.

01.04.2025 12:29 — 👍 3    🔁 0    💬 1    📌 0
Post image

Our neural network library extends the autograd engine to create neurons, linear layers and MLPs. Constructing an MLP and making a forward pass is as easy as this.

01.04.2025 12:29 — 👍 2    🔁 0    💬 1    📌 0
Post image

Just like Karpathy's micrograd, NeuRaLaTeX implements backpropagation (reverse-mode autodiff) over a dynamically built DAG. However, while micrograd uses 150 lines of Python, NeuRaLaTeX uses around 1,100 lines of pure LaTeX, making it about 700% better.

01.04.2025 12:29 — 👍 7    🔁 0    💬 1    📌 0
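The mechanics the post describes can be sketched in a few lines of Python in the spirit of micrograd (an illustrative toy, not the micrograd or NeuRaLaTeX source): each Value records its parent nodes and a local backward rule, and backward() walks the resulting DAG in reverse topological order, accumulating gradients by the chain rule.

```python
class Value:
    """Toy reverse-mode autodiff node, micrograd-style."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # local chain-rule step

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the dynamically built DAG, then apply
        # each node's local backward rule from the output backwards.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# z = x*y + x, so dz/dx = y + 1 and dz/dy = x
x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
```

Scaling this idea to neurons, linear layers and MLPs is then just bookkeeping on top of Value, which is exactly the layering the posts above describe.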

Not only that, but LaTeX has elegant and intuitive programming syntax. For example, creating a new Value object only uses the word "expand" four times:
\expanded{
  \noexpand\pgfoonew\expandafter\noexpand
  \csname #2\endcsname=
  new Value(\newdata,{\selfid,\otherid},*,0)
}

01.04.2025 12:29 — 👍 3    🔁 0    💬 1    📌 0

Wait, what?

Well, LaTeX is itself a Turing-complete programming language. When you "compile" a LaTeX document, you are really executing a program written in LaTeX. That program doesn't have to just typeset your paper contents into a PDF; it can also perform arbitrary computation.

01.04.2025 12:29 — 👍 3    🔁 0    💬 1    📌 0

So, we wrote a neural net library entirely in LaTeX...

01.04.2025 12:29 — 👍 81    🔁 15    💬 3    📌 4

On a similar note: if you have two figures one above the other at the top of one column, move one figure to the top of the other column and you gain the white space between them.

07.03.2025 19:12 — 👍 3    🔁 0    💬 0    📌 0
Video thumbnail

I just pushed a new paper to arXiv. I realized that a lot of my previous work on robust losses and nerf-y things was dancing around something simpler: a slight tweak to the classic Box-Cox power transform that makes it much more useful and stable. It's this f(x, Ξ») here:

18.02.2025 18:42 — 👍 111    🔁 23    💬 2    📌 1

I must confess I just immediately copy/pasted the turkey eggs question into ChatGPT to get the answer myself. Great question!

17.01.2025 15:20 — 👍 2    🔁 0    💬 0    📌 0

#CVPR2025 Area Chair update: depending on which time zone the review deadline is specified in, we are past or close to the review deadline. Of the 60 reviews needed for my batch, I currently have 52 and they have been coming in quite fast this morning. In general, review standard looks good.

14.01.2025 08:09 — 👍 3    🔁 0    💬 0    📌 0
ChatGPT and Image Matching – Wide baseline stereo meets deep learning: Are we done yet?

Image matching and ChatGPT - new post in the wide baseline stereo blog.

tl;dr: it is good, even feels human-like, but not perfect.
ducha-aiki.github.io/wide-baselin...

02.01.2025 21:01 — 👍 34    🔁 8    💬 2    📌 1
Post image

This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of adding the losses and then calling backward once, it's better to call backward on each loss individually (which frees its computational graph). The results will be exactly identical.

19.12.2024 04:59 — 👍 54    🔁 7    💬 4    📌 0
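A minimal PyTorch sketch of the pattern (the model, loss and chunking here are illustrative, not from the original post): because gradients accumulate additively in each parameter's .grad, calling backward once per loss gives the same gradients as summing the losses first, while only one computational graph is alive at a time.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
loss_fn = torch.nn.functional.mse_loss
x = torch.randn(8, 4)
target = torch.randn(8, 1)

# Pattern A: keep every graph alive, sum the losses, one backward call.
model.zero_grad()
total = sum(loss_fn(model(xc), tc)
            for xc, tc in zip(x.chunk(2), target.chunk(2)))
total.backward()
grads_summed = [p.grad.clone() for p in model.parameters()]

# Pattern B: backward per loss; each call frees that loss's graph
# immediately, and gradients accumulate additively into .grad.
model.zero_grad()
for xc, tc in zip(x.chunk(2), target.chunk(2)):
    loss_fn(model(xc), tc).backward()
grads_split = [p.grad.clone() for p in model.parameters()]

same = all(torch.allclose(a, b)
           for a, b in zip(grads_summed, grads_split))
```

Note the memory saving applies when each loss comes from its own forward computation (separate chunks or branches, as here); losses sharing one forward pass share one graph, so there is nothing extra to free.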
Turning scholarship into an "engaging" podcast using AI: interdisciplinary perspectives. AI-generated podcast to accompany the blog post

Enjoy! on.soundcloud.com/2ALY7LnJrmiR... 5/5

18.12.2024 12:20 — 👍 0    🔁 0    💬 0    📌 0

Of course, we then put our blog post back through the podcast generator! So now it's AI podcast hosts talking about our blog post about themselves talking about our research - how meta is that?! They have some minor existential crises as they discuss that they are themselves LLMs. 4/5

18.12.2024 12:20 — 👍 0    🔁 0    💬 1    📌 0

On another level, we noticed that it made fundamental mistakes. Often these were wrapped up within clever metaphors that gave a falsely confident impression that it deeply understood the material. We were left wondering what effect a deluge of these accessible but incorrect podcasts might have. 3/5

18.12.2024 12:20 — 👍 2    🔁 1    💬 1    📌 0

When NotebookLM launched the podcast feature, we both put our own papers through it and chatted about what we thought of it. On one level, we were blown away by the convincing podcast style and the way it seemed to distill complex research into an accessible form. 2/5

18.12.2024 12:20 — 👍 0    🔁 0    💬 1    📌 0
Preview
Turning scholarship into an "engaging" podcast using AI: interdisciplinary perspectives by Dr Edward Kirton-Darling, Senior Lecturer at the University of Bristol Law School, and Professor Will Smith, Department of Computer Science, University of York. Ed & Will in 1987 (or is it?) ...

Me and my friend-since-before-school @ekd.bsky.social (a law academic) have written a blog post about the NotebookLM podcast generator in the style of, well, a corny podcast dialogue: slsablog.co.uk/blog/blog-po... 1/5

18.12.2024 12:20 — 👍 1    🔁 0    💬 1    📌 2
Video thumbnail

Entropy is one of those formulas that many of us learn, swallow whole, and even use regularly without really understanding.

(E.g., where does that "log" come from? Are there other possible formulas?)

Yet there's an intuitive & almost inevitable way to arrive at this expression.

09.12.2024 22:44 — 👍 546    🔁 129    💬 22    📌 12
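The "almost inevitable" route alluded to above is usually sketched like this (a standard textbook derivation, not taken from the linked video): require that the surprisal of independent events adds, which forces a logarithm, and then define entropy as expected surprisal.

```latex
% Surprisal s(p) of an outcome with probability p should add over
% independent events:
%   s(p\,q) = s(p) + s(q)
% The only decreasing solutions are logarithms (up to choice of base):
%   s(p) = -\log p
% Entropy is then simply the expected surprisal:
\[
  H(p) \;=\; \mathbb{E}\,[\,s\,] \;=\; -\sum_i p_i \log p_i
\]
```

This also answers the post's question: the log is exactly what makes information from independent sources additive.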

Ah - that makes sense then. It's cool that it seems to disentangle camera motion from scene dynamics.

10.12.2024 15:38 — 👍 1    🔁 0    💬 0    📌 0

If all videos are upsampled like this then yes - it makes sense that the model fits this. But if the training data is a mix of full frame rate and upsampled by frame replication then you might expect it to learn smooth frame motion as a tradeoff to fit both training data types.

10.12.2024 15:36 — 👍 1    🔁 0    💬 1    📌 0

I agree this is very likely overfitting to temporal upsampling. I guess this makes learning video even harder as the prior that motion should be smooth is lost. It's slightly surprising that it didn't instead learn to interpolate between frames.

10.12.2024 15:24 — 👍 3    🔁 1    💬 2    📌 0

Any short block of frames appears physically plausible, but beyond some temporal scale physical properties are no longer preserved. The big question is whether this is fixable by a bigger context length (perhaps new attention mechanisms) or whether some more explicit world-model reasoning is needed.

10.12.2024 15:21 — 👍 1    🔁 0    💬 1    📌 0
