@ag3dvr.bsky.social
3D vision http://www.personal.reading.ac.uk/~sxs05ag/
A corrigendum for this paper is out today:
doi.org/10.1016/j.ne...
(Some quite weird things happened to the text between submission and proofs.) Anyway, corrected now. I hope the oddities in the uncorrected version do not put people off.
In the centre is a yellow plane. This represents the set of all egocentric representations, where neighbouring points on the surface correspond to the egocentric representations for neighbouring locations in space. The large beige region shows that the egocentric representations barely change over a large region of the yellow plane when the scene consists only of mountains. This is because the angles between mountains barely change as the observer moves. The light brown region lies within the beige region. This shows that adding in a forest in the middle distance constrains the set of egocentric representations more precisely. Finally, adding in a picnic table (and all the angles between the picnic table and other objects in the scene) constrains the possible location of the observer even more tightly within the overall set of egocentric locations.
The set of saccades that take the fovea between points in the scene can be described either as an egocentric representation or a 'policy', a set of context-dependent actions. When the observer moves, the most enduring part of the representation is the set of angles between distant points.
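A back-of-the-envelope sketch of that last point (my own made-up distances, not numbers from the paper): the visual angle between two distant 'mountains' changes far less under a modest translation of the observer than the angle between two nearby points, such as the corners of a picnic table.

```python
import numpy as np

def visual_angle(p, q, observer):
    """Angle (degrees) subtended at the observer between scene points p and q."""
    u, v = p - observer, q - observer
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

# Two distant mountains and two nearby picnic-table corners (metres, illustrative only).
mountains = (np.array([5000.0, 0.0]), np.array([0.0, 5000.0]))
table     = (np.array([5.0, 0.0]),    np.array([0.0, 5.0]))

start, moved = np.array([0.0, 0.0]), np.array([2.0, 2.0])  # observer steps ~3 m

for name, (p, q) in [("mountains", mountains), ("picnic table", table)]:
    change = abs(visual_angle(p, q, moved) - visual_angle(p, q, start))
    print(f"{name}: angle change = {change:.3f} deg")
# Distant points: a small fraction of a degree; nearby points: tens of degrees.
```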
02.09.2025 07:07
On the left is a grid-like coordinate frame and four locations (O1 to O4). On the right are the same four locations (O1 to O4) but now, instead of a grid coordinate frame, there are three fixation points, A, B and C. The observer fixates on one of these as they move and then makes a sudden eye rotation (saccade) to a new fixation point.
One element of the argument concerns the way that we (and most animals) move. We fixate on a point and move relative to that. The dorsal stream is well set up to control movements in this way.
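A toy sketch of that style of control (my own illustration, not from the paper): the agent keeps gaze locked on a fixated point while it walks, and the movement command is expressed as an angle relative to the gaze direction rather than as a world coordinate.

```python
import numpy as np

def angle(v):
    """Direction of a 2D vector, in radians."""
    return np.arctan2(v[1], v[0])

fixation = np.array([10.0, 5.0])   # the fixated scene point (made-up coordinates)
goal     = np.array([8.0, 0.0])    # where the walk should end up
position = np.array([0.0, 0.0])

for step in range(5):
    gaze = fixation - position                        # gaze stays locked on the fixated point
    # The movement command is an angle *relative to gaze*, not a world coordinate.
    heading_re_gaze = angle(goal - position) - angle(gaze)
    world_dir = angle(gaze) + heading_re_gaze         # converted back only to update the toy world
    position = position + 0.5 * np.array([np.cos(world_dir), np.sin(world_dir)])
    print(f"step {step}: pos={position.round(2)}, heading re gaze={np.degrees(heading_re_gaze):+.1f} deg")
```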
02.09.2025 07:07
New paper out today:
doi.org/10.1016/j.ne...
I argue that the brain navigates through image space rather than using a 3D coordinate-based reference frame.
Fig 2 from the paper. It shows an observer/eye moving while fixating a point and then making an eye movement to fixate a new object. Sub-panels show how the image changes during this movement and how responses in the ventral and dorsal streams relate to this movement across image space.
Here is the preprint corresponding to the talk I gave at #iNav2024: 'Navigating image space' doi.org/10.31234/osf.... No data, but some thoughts about the difference between Cartesian (grid-like) reference frames and an image-based frame for navigation.
17.01.2025 14:05
A comprehensive review of VR and real-world memory for object location (e.g. the task of replacing an object in its correct position). Some sex differences, and better performance in real environments. There is no theoretical model here - that was not the goal. doi.org/10.1016/j.bb...
20.12.2024 10:08
A bit more detail here: www.youtube.com/watch?v=Q5XN... and in a related paper doi.org/10.1098/rstb... .
18.12.2024 09:50
The solution is likely to involve abandoning 3D coordinate frames and transformations. Instead, egocentric and allocentric tasks can be solved in a space of images or sensory states. This is what is done in reinforcement learning, which is probably a better model for biology than SLAM.
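As a hedged sketch of what 'solving the task in a space of sensory states' can mean in the simplest RL setting (tabular Q-learning over a toy world; nothing here is from the post or the paper): the state is just an index over discrete views, and the learned policy maps views to actions without any coordinate frame appearing anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

n_views, n_actions = 12, 4        # 12 discrete views (images) of a toy world, 4 movements
goal_view = 7                      # the rewarded view, e.g. the image seen at the goal

# Made-up deterministic transitions: the view obtained after taking an action from a view.
transitions = rng.integers(0, n_views, size=(n_views, n_actions))

Q = np.zeros((n_views, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(2000):
    view = int(rng.integers(0, n_views))
    for t in range(50):
        a = int(rng.integers(0, n_actions)) if rng.random() < epsilon else int(np.argmax(Q[view]))
        next_view = int(transitions[view, a])
        reward = 1.0 if next_view == goal_view else 0.0
        Q[view, a] += alpha * (reward + gamma * Q[next_view].max() - Q[view, a])
        view = next_view
        if reward:
            break

# The policy is a context-dependent action for every view; no 3D coordinates are used.
print(np.argmax(Q, axis=1))
```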
18.12.2024 09:50
This review asks good questions and summarises well our lack of progress in answering them. The fundamental problem is the failure to conceptualise what a biologically plausible alternative to SLAM might look like.
18.12.2024 09:50
Not as daft as it might seem. At least, when discussing where the interesting complexity lies.
17.12.2024 00:56
Overall, my take is that signals relating to the current and upcoming retinal image, and the task, are highly relevant for both HPC and PPC, rather than 'place' _per se_. Also, differences between processing in retinotopic areas (PPC) versus HPC are less dramatic than has been supposed in the past.
06.12.2024 17:06
Text from Discussion. Paraphrasing: "We extend the interpretation to hippocampal cells. In both PPC and HPC, cues processed during fixation impacted the cells strongly"
These points are relevant for hippocampal as well as PPC cells.
06.12.2024 17:06
Text from Discussion. Paraphrasing: "Visual stimulation of PPC resulted in spatial selectivity that could be mistaken for 'place' codes akin to place cells found in rodent HPC cells."
Visual responses can be 'erroneously interpreted as place codes':
06.12.2024 17:06
Part of Fig 7, showing responses of PPC and HPC cells aligned according to the time a landmark appears.
Conclusion: Overall, PPC and HPC responses were remarkably similar. Certainly, the original adage that PPC encodes egocentric and HPC allocentric coordinate frames seems inconsistent with these results.
06.12.2024 17:06
Excellent study of posterior parietal cortex (PPC) and hippocampal (HPC) cell responses as two macaques explore a virtual maze. www.nature.com/articles/s41...
06.12.2024 17:06
Not really, though (maps)
01.12.2024 21:23
Always thought-provoking. Not a good model for biological 3D vision, in my view, but that is not Davison's goal. Biologists need to know about working versions of (or feasible hypotheses for) the processes they are studying, and these systems really work.
29.11.2024 10:43
Muryy et al (2020) doi.org/10.1016/j.vi...
27.11.2024 13:02
Other information changes rapidly and so should give a fine-scale refinement of the location estimate. This would tally with other types of biological representation. Is a coarse-to-fine hierarchy noticeable in policies built up with RL?
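One crude way to look for such a hierarchy (a sketch under my own assumptions, not an analysis from the thread): rank each embedding dimension by how fast it changes per metre of agent movement. Slowly varying dimensions would support a coarse location estimate, rapidly varying ones a fine refinement.

```python
import numpy as np

def rate_of_change(embeddings, positions):
    """Mean absolute change of each embedding dimension per metre of agent movement.

    embeddings: (T, D) feature vectors along a trajectory
    positions:  (T, 2) agent locations at the same time steps
    """
    d_emb = np.abs(np.diff(embeddings, axis=0))                  # (T-1, D)
    d_pos = np.linalg.norm(np.diff(positions, axis=0), axis=1)   # (T-1,)
    return (d_emb / d_pos[:, None]).mean(axis=0)                 # (D,)

# Toy trajectory with made-up embeddings: dimension 0 varies slowly, dimension 1 quickly.
T = 200
positions = np.cumsum(np.random.default_rng(1).normal(0, 0.5, size=(T, 2)), axis=0)
embeddings = np.stack([0.01 * positions[:, 0],                  # slow: coarse location signal
                       np.sin(5.0 * positions[:, 0])], axis=1)  # fast: fine-scale signal

rates = rate_of_change(embeddings, positions)
print("dimensions ordered slow to fast:", np.argsort(rates))
```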
27.11.2024 11:27
In particular, I am interested in the extent to which there is a coarse-to-fine structure in embedding space. Some sensory information changes slowly as the agent moves (eg angles between distant objects). This should give a coarse scale location estimate.
27.11.2024 11:27
I have been out of the loop since then but I know RL is used a lot in navigation, including for drones. Is there a similar analysis of the structure of the embedding space (eg tSNE) underlying the policy in these more modern RL systems?
27.11.2024 11:27
Figure from Muryy et al (Vision Research 174, 79-93) showing a tSNE projection of the DRL representation generated by an agent navigating towards a rewarded goal. It shows clustering according to the target image but very little structure in relation to the camera location ('distance').
By contrast, there was very little information about the camera location.
27.11.2024 11:27
Our lab looked in detail at one of the early papers using RL to learn to navigate to a target image (Zhu et al 2017, doi.org/10.1109/ICRA...). We showed that the feature vectors it had learned were clustered in the embedding space according to the target image.
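For anyone wanting to repeat that kind of embedding analysis, a minimal sketch (the file names and data shapes are my assumptions, not the original code): project the learned feature vectors with tSNE, then colour the projection once by target image and once by camera position, and see which labels the clusters follow.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Assumed inputs: one feature vector per navigation step, with the target image id
# and camera position recorded alongside (placeholder files, not real data).
features  = np.load("features.npy")     # (N, D) embedding vectors from the trained agent
target_id = np.load("target_id.npy")    # (N,)   integer label: which target image
camera_xy = np.load("camera_xy.npy")    # (N, 2) camera position when the vector was produced

xy = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(xy[:, 0], xy[:, 1], c=target_id, cmap="tab10", s=5)
ax1.set_title("coloured by target image")       # clustering expected here
ax2.scatter(xy[:, 0], xy[:, 1], c=np.linalg.norm(camera_xy, axis=1), cmap="viridis", s=5)
ax2.set_title("coloured by camera distance")    # little structure expected here
plt.show()
```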
27.11.2024 11:27
What are the important questions about biological 3D vision and navigation that can be informed by recent advances in reinforcement learning?
27.11.2024 11:10