
Andrew Glennerster

@ag3dvr.bsky.social

3D vision http://www.personal.reading.ac.uk/~sxs05ag/

79 Followers  |  215 Following  |  24 Posts  |  Joined: 25.11.2024

Latest posts by ag3dvr.bsky.social on Bluesky


A corrigendum for this paper is out today:
doi.org/10.1016/j.ne...
(some quite weird things happened to the text between submission and proofs). Anyway, corrected now. I hope the weirdities in the uncorrected version do not put people off.

09.10.2025 15:18 — 👍 0    🔁 0    💬 0    📌 0
In the centre is a yellow plane. This represents the set of all egocentric representations, where neighbouring points on the surface correspond to the egocentric representations for neighbouring locations in space. The large beige region shows that the egocentric representations barely change over a large region of the yellow plane when the scene consists only of mountains. This is because the angles between mountains barely change as the observer moves. The light brown region lies within the beige region. This shows that adding in a forest in the middle distance constrains the set of egocentric representations more precisely. Finally, adding in a picnic table (and all the angles between the picnic table and other objects in the scene) constrains the possible location of the observer even more tightly within the overall set of egocentric locations.

The set of saccades that take the fovea between points in the scene can be described either as an egocentric representation or a 'policy', a set of context-dependent actions. When the observer moves, the most enduring part of the representation is the set of angles between distant points.
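To make the coarse-to-fine point concrete, here is a minimal sketch (my own illustration with invented distances, not a calculation from the paper) of how much the visual angle between two scene points changes when the observer takes a 5 m step: angles between distant mountains are nearly invariant, while angles involving a nearby picnic table change a lot.

```python
# Minimal sketch: stability of inter-point visual angles under observer translation.
# All distances and positions below are invented for illustration.
import numpy as np

def visual_angle(obs, p1, p2):
    """Angle (degrees) subtended at the observer between scene points p1 and p2."""
    v1, v2 = p1 - obs, p2 - obs
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

start, moved = np.array([0.0, 0.0]), np.array([5.0, 0.0])  # observer takes a 5 m step

for label, d in [("mountains (~10 km away)", 10_000.0),
                 ("forest (~100 m away)", 100.0),
                 ("picnic table (~5 m away)", 5.0)]:
    p1 = np.array([-0.3 * d, d])   # two scene points at roughly that distance
    p2 = np.array([ 0.3 * d, d])
    change = abs(visual_angle(moved, p1, p2) - visual_angle(start, p1, p2))
    print(f"{label}: inter-point angle changes by {change:.3f} deg")
```

The nested regions in the figure follow the same logic: the slowly changing angles (mountains) pin down location only coarsely, and each nearer object adds a progressively tighter constraint.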

02.09.2025 07:07 — 👍 0    🔁 0    💬 0    📌 0
On the left is a grid-like coordinate frame and four locations (O1 to O4). On the right are the same four locations (O1 to O4) but now, instead of a grid coordinate frame, there are three fixation points, A, B and C. The observer fixates on one of these as they move and then makes a sudden eye rotation (saccade) to a new fixation point.

One element of the argument concerns the way that we (and most animals) move. We fixate on a point and move relative to that. The dorsal stream is well set up to control movements in this way.

02.09.2025 07:07 — 👍 0    🔁 0    💬 1    📌 0

New paper out today:

doi.org/10.1016/j.ne...

I argue that the brain navigates through image space rather than using a 3D coordinate-based reference frame.

02.09.2025 07:07 — 👍 1    🔁 0    💬 2    📌 0
Fig 2 from the paper. It shows an observer/eye moving while fixating a point and then making an eye movement to fixate a new object. Sub-panels show how the image changes during this movement and how responses in the ventral and dorsal streams relate to this movement across image space.

Here is the preprint corresponding to the talk I gave at #iNav2024: 'Navigating image space' doi.org/10.31234/osf.... No data, but some thoughts about the difference between Cartesian (grid-like) reference frames and an image-based frame for navigation.

17.01.2025 14:05 — 👍 2    🔁 0    💬 0    📌 0

A comprehensive review of VR and real-world memory for object location (e.g. the task of replacing an object in its correct position). Some sex differences, and better performance in real environments. There is no theoretical model here - that was not the goal. doi.org/10.1016/j.bb...

20.12.2024 10:08 — 👍 1    🔁 0    💬 0    📌 0
Practice talk, iNav 2024 (YouTube video by Andrew Glennerster)

A bit more detail here: www.youtube.com/watch?v=Q5XN... and in a related paper doi.org/10.1098/rstb... .

18.12.2024 09:50 — 👍 0    🔁 0    💬 0    📌 0

The solution is likely to involve abandoning 3D coordinate frames and transformations. Instead, egocentric and allocentric tasks can be solved in a space of images or sensory states. This is what is done in reinforcement learning, which is probably a better model for biology than SLAM.
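As a toy illustration of what 'solving the task in a space of images or sensory states' can mean, here is a sketch of tabular Q-learning in which each state is just an image identifier and no 3D coordinates appear anywhere; the state graph, actions, rewards and parameters are all invented for this example.

```python
# Toy sketch: learning a navigation policy over sensory states (image identifiers)
# rather than 3D coordinates. The environment below is entirely made up.
import random

# Hypothetical transition graph: which image you see after taking each action.
transitions = {
    "img_hall":   {"forward": "img_door", "left": "img_window"},
    "img_window": {"forward": "img_hall", "left": "img_hall"},
    "img_door":   {"forward": "img_goal", "left": "img_hall"},
    "img_goal":   {},
}
goal = "img_goal"

Q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate

for _ in range(500):
    s = "img_hall"
    while s != goal:
        acts = list(Q[s])
        a = random.choice(acts) if random.random() < eps else max(acts, key=Q[s].get)
        s_next = transitions[s][a]
        r = 1.0 if s_next == goal else 0.0
        best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
        s = s_next

# Greedy policy: for each image, which action to take next.
print({s: max(actions, key=actions.get) for s, actions in Q.items() if actions})
```

The learned 'map' here is nothing more than a context-dependent action for each sensory state, which is the sense in which a policy can stand in for a 3D reconstruction.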

18.12.2024 09:50 — 👍 1    🔁 1    💬 1    📌 0

This review asks good questions and summarises well our lack of progress in answering them. The fundamental problem is the failure to conceptualise what a biologically plausible alternative to SLAM might look like.

18.12.2024 09:50 — 👍 0    🔁 0    💬 1    📌 0

Not as daft as it might seem. At least, when discussing where the interesting complexity lies.

17.12.2024 00:56 — 👍 1    🔁 0    💬 0    📌 0

Overall, my take is that signals relating to the current and upcoming retinal image, and the task, are highly relevant for both HPC and PPC, rather than 'place' _per se_. Also, differences between processing in retinotopic areas (PPC) versus HPC are less dramatic than has been supposed in the past.

06.12.2024 17:06 — 👍 1    🔁 0    💬 0    📌 0
Text from Discussion. Paraphrasing: "We extend the interpretation to hippocampal cells. In both PPC and HPC, cues processed during fixation impacted the cells strongly"

Text from Discussion. Paraphrasing: "We extend the interpretation to hippocampal cells. In both PPC and HPC, cues processed during fixation impacted the cells strongly"

These points are relevant for hippocampal as well as PPC cells.

06.12.2024 17:06 — 👍 1    🔁 0    💬 1    📌 0
Text from Discussion. Paraphrasing: "Visual stimulation of PPC resulted in spatial selectivity that could be mistaken for 'place' codes akin to place cells found in rodent HPC cells."

Text from Discussion. Paraphrasing: "Visual stimulation of PPC resulted in spatial selectivity that could be mistaken for 'place' codes akin to place cells found in rodent HPC cells."

Visual responses can be 'erroneously interpreted as place codes':

06.12.2024 17:06 — 👍 0    🔁 0    💬 1    📌 0
Part of Fig 7, showing responses of PPC and HPC cells aligned according to the time a landmark appears.

Conclusion: Overall, PPC and HPC responses were remarkably similar. Certainly, the original adage that PPC encodes egocentric and HPC allocentric coordinate frames seems inconsistent with these results.

06.12.2024 17:06 — 👍 0    🔁 0    💬 1    📌 0
Organizing space through saccades and fixations between primate posterior parietal cortex and hippocampus - Nature Communications

Excellent study of posterior parietal cortex (PPC) and hippocampal (HPC) cell responses as two macaques explore a virtual maze. www.nature.com/articles/s41...

06.12.2024 17:06 — 👍 2    🔁 0    💬 1    📌 0

Not really, though (maps)

01.12.2024 21:23 — 👍 0    🔁 0    💬 0    📌 0

Always thought-provoking. Not a good model for biological 3D vision, in my view, but that is not Davison's goal. Biologists need to know about working versions of (or feasible hypotheses for) the processes they are studying, and these systems really work.

29.11.2024 10:43 — 👍 3    🔁 0    💬 0    📌 0

Muryy et al (2020) doi.org/10.1016/j.vi...

27.11.2024 13:02 — 👍 0    🔁 0    💬 0    📌 0

Other information changes rapidly and so should give a fine scale refinement of the location estimate. This would tally with other types of biological representation. Is a coarse-to-fine hierarchy noticeable in policies built up with RL?

27.11.2024 11:27 — 👍 0    🔁 0    💬 0    📌 0

In particular, I am interested in the extent to which there is a coarse-to-fine structure in embedding space. Some sensory information changes slowly as the agent moves (eg angles between distant objects). This should give a coarse scale location estimate.

27.11.2024 11:27 — 👍 0    🔁 0    💬 1    📌 0

I have been out of the loop since then but I know RL is used a lot in navigation, including for drones. Is there a similar analysis of the structure of the embedding space (eg tSNE) underlying the policy in these more modern RL systems?

27.11.2024 11:27 — 👍 0    🔁 0    💬 1    📌 0
Figure from Muryy et al (Vision Research 174, 79-93) showing a tSNE projection of the DRL representation generated by an agent navigating towards a rewarded goal. It shows clustering according to the target image but very little structure in relation to the camera location ('distance').

By contrast, there was very little information about the camera location.

27.11.2024 11:27 — 👍 0    🔁 0    💬 2    📌 0
Target-driven visual navigation in indoor scenes using deep reinforcement learning

Our lab looked in detail at one of the early papers using RL to learn to navigate to a target image (Zhu et al 2017, doi.org/10.1109/ICRA...). We showed that the feature vectors it had learned were clustered in the embedding space according to the target image.
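For anyone wanting to repeat this kind of check on a more modern system, the recipe is roughly the one sketched below (the feature vectors, target ids and camera positions are random placeholders, not the Zhu et al. network or our data): embed the learned feature vectors with t-SNE and colour the projection by target image and by camera location to see which variable organises the space.

```python
# Sketch of a t-SNE analysis of learned navigation features.
# All arrays are random placeholders standing in for real network activations.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
features = rng.normal(size=(400, 512))       # stand-in for per-step feature vectors
target_id = rng.integers(0, 4, size=400)     # which target image the agent was seeking
camera_x = rng.uniform(0.0, 10.0, size=400)  # camera location along one axis

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].scatter(emb[:, 0], emb[:, 1], c=target_id, cmap="tab10", s=8)
axes[0].set_title("coloured by target image")
axes[1].scatter(emb[:, 0], emb[:, 1], c=camera_x, cmap="viridis", s=8)
axes[1].set_title("coloured by camera location")
plt.show()
```

With real activations, clustering in the left panel but not the right would reproduce the pattern we found: the embedding is organised by target image, with little structure reflecting camera location.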

27.11.2024 11:27 — 👍 0    🔁 0    💬 1    📌 0

What are the important questions about biological 3D vision and navigation that can be informed by recent advances in reinforcement learning?

27.11.2024 11:10 — 👍 1    🔁 0    💬 1    📌 0
