VGGT-Ω
GTA: Advancing Image-to-3D World Generation via Geometry Then Appearance Video Diffusion
RAEv2: Improved Baselines with Representation Autoencoders
Depth2Pose: A Pose-Based Benchmark for Monocular Depth Estimation without Ground-Truth Depth
PhyWorld: Physics-Faithful World Model for Video Generation
Tango3D: Towards Alignment for Global and Local 2D-3D Correspondence
Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model
DINOv2: Learning Robust Visual Features without Supervision
DINOv3
Emerging Properties in Self-Supervised Vision Transformers
Zhu Jiajun
Northwestern Polytechnical University
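Announcement
This blog records the papers, books, and research topics I have read and studied during my graduate program. Some of the earlier paper-reading notes are inconsistently formatted; please bear with me. I will try to unify the format of later notes, and I will also keep adding more interesting content.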