Wei Wu, Wenjie Wang, Yang Tan, Ying Liu, Liang Diao, Lin Huang, Kaihe Xu, Wenfeng Xie, Ziling Lin
Team PA-VCG's Solution for Competition on Understanding Chinese College Entrance Exam Papers in ICDAR'25
https://arxiv.org/abs/2508.00834
@arxiv-cs-cv.bsky.social
Computer Science -- Computer Vision and Pattern Recognition (cs.CV) source: export.arxiv.org/rss/cs.CV maintainer: @tmaehara.bsky.social
Ali Haitham Abdul Amir, Zainab N. Nemer
Inclusive Review on Advances in Masked Human Face Recognition Technologies
https://arxiv.org/abs/2508.00841
Zhihao Zhu, Jiale Han, Yi Yang
HoneyImage: Verifiable, Harmless, and Stealthy Dataset Ownership Verification for Image Models
https://arxiv.org/abs/2508.00892
Hoang Hai Nam Nguyen, Minh Tien Tran, Hoheok Kim, Ho Won Lee
Phase-fraction guided denoising diffusion model for augmenting multiphase steel microstructure segmentation via micrograph image-mask pair synthesis
https://arxiv.org/abs/2508.00896
Jose M. Sánchez Velázquez, Mingbo Cai, Andrew Coney, Álvaro J. García-Tejedor, Alberto Nogales
Benefits of Feature Extraction and Temporal Sequence Analysis for Video Frame Prediction: An Evaluation of Hybrid Deep Learning Models
https://arxiv.org/abs/2508.00898
Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski
TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras
https://arxiv.org/abs/2508.00913
Hassan Ugail, Hamad Mansour Alawar, AbdulNasser Abbas Zehi, Ahmed Mohammad Alkendi, Ismail Lujain Jaleel
Latent Diffusion Based Face Enhancement under Degraded Conditions for Forensic Face Recognition
https://arxiv.org/abs/2508.00941
Yifan Wang, Hongfeng Ai, Quangao Liu, Maowei Jiang, Ruiyuan Kang, Ruiqi Li, Jiahua Dong, Mengting Xiao, Cheng Jiang, Chenzhong Li
Optimizing Vision-Language Consistency via Cross-Layer Regional Attention Alignment
https://arxiv.org/abs/2508.00945
Daniel Andrés López, Vincent Weber, Severin Zentgraf, Barlo Hillen, Perikles Simon, Elmar Schömer
ThermoCycleNet: Stereo-based Thermogram Labeling for Model Transition to Cycling
https://arxiv.org/abs/2508.00974
Cihang Peng, Qiming Hou, Zhong Ren, Kun Zhou
ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation
https://arxiv.org/abs/2508.01008
Byron Dowling, Jozef Probcin, Adam Czajka
AutoSIGHT: Automatic Eye Tracking-based System for Immediate Grading of Human experTise
https://arxiv.org/abs/2508.01015
Muhammad Zeeshan, Umer Zaki, Syed Ahmed Pasha, Zaar Khizar
3D Reconstruction via Incremental Structure From Motion
https://arxiv.org/abs/2508.01019
Theo Di Piazza, Carole Lazarus, Olivier Nempont, Loic Boussel
Structured Spectral Graph Learning for Anomaly Classification in 3D Chest CT Scans
https://arxiv.org/abs/2508.01045
Hongyu Zhu, Sichu Liang, Wenwen Wang, Zhuomeng Zhang, Fangqi Li, Shi-Lin Wang
Evading Data Provenance in Deep Neural Networks
https://arxiv.org/abs/2508.01074
Santiago Diaz, Xinghui Hu, Josiane Uwumukiza, Giovanni Lavezzi, Victor Rodriguez-Fernandez, Richard Linares
DreamSat-2.0: Towards a General Single-View Asteroid 3D Reconstruction
https://arxiv.org/abs/2508.01079
Ryan Rabinowitz, Steve Cruz, Walter Scheirer, Terrance E. Boult
COSTARR: Consolidated Open Set Technique with Attenuation for Robust Recognition
https://arxiv.org/abs/2508.01087
Mikhail Bychkov, Matey Yordanov, Andrei Kuchma
AURA: A Hybrid Spatiotemporal-Chromatic Framework for Robust, Real-Time Detection of Industrial Smoke Emissions
https://arxiv.org/abs/2508.01095
Yuekun Dai, Haitian Li, Shangchen Zhou, Chen Change Loy
Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting
https://arxiv.org/abs/2508.01098
Yizhou Zhao, Haoyu Chen, Chunjiang Liu, Zhenyang Li, Charles Herrmann, Junhwa Hur, Yinxiao Li, Ming-Hsuan Yang, Bhiksha Raj, Min Xu
MASIV: Toward Material-Agnostic System Identification from Videos
https://arxiv.org/abs/2508.01112
Saba Ahmadi, Rabiul Awal, Ankur Sikarwar, Amirhossein Kazemnejad, Ge Ya Luo, Juan A. Rodriguez, Sai Rajeswar, Siva Reddy, Christopher Pal, Benno Krojer, Aishwarya Agrawal
The Promise of RL for Autoregressive Image Editing
https://arxiv.org/abs/2508.01119
Chaitanya Patel, Hiroki Nakamura, Yuta Kyuragi, Kazuki Kozuka, Juan Carlos Niebles, Ehsan Adeli
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation
https://arxiv.org/abs/2508.01126
Zeduo Zhang, Yalda Mohsenzadeh
Semi-Supervised Anomaly Detection in Brain MRI Using a Domain-Agnostic Deep Reinforcement Learning Approach
https://arxiv.org/abs/2508.01137
Huyu Wu, Duo Su, Junjie Hou, Guang Li
Dataset Condensation with Color Compensation
https://arxiv.org/abs/2508.01139
Dianyi Yang, Xihan Wang, Yu Gao, Shiyang Liu, Bohan Ren, Yufeng Yue, Yi Yang
OpenGS-Fusion: Open-Vocabulary Dense Mapping with Hybrid 3D Gaussian Splatting for Refined Object-Level Understanding
https://arxiv.org/abs/2508.01150
Yu Lei, Jinbin Bai, Qingyu Shi, Aosong Feng, Kaidong Yu
Personalized Safety Alignment for Text-to-Image Diffusion Models
https://arxiv.org/abs/2508.01151
Xinyu Yan, Meijun Sun, Ge-Peng Ji, Fahad Shahbaz Khan, Salman Khan, Deng-Ping Fan
LawDIS: Language-Window-based Controllable Dichotomous Image Segmentation
https://arxiv.org/abs/2508.01152
Xiahan Yang, Hui Zheng
TEACH: Text Encoding as Curriculum Hints for Scene Text Recognition
https://arxiv.org/abs/2508.01153
Tuan Duc Ngo, Ashkan Mirzaei, Guocheng Qian, Hanwen Liang, Chuang Gan, Evangelos Kalogerakis, Peter Wonka, Chaoyang Wang
DELTAv2: Accelerating Dense 3D Tracking
https://arxiv.org/abs/2508.01170
Ranran Huang, Krystian Mikolajczyk
No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views
https://arxiv.org/abs/2508.01171
Xinhang Wan, Dongqiang Gou, Xinwang Liu, En Zhu, Xuming He
Object Affordance Recognition and Grounding via Multi-scale Cross-modal Representation Learning
https://arxiv.org/abs/2508.01184