Digital Human and Video-Based Rendering

The commoditization of virtual and augmented reality devices and the availability of inexpensive consumer depth cameras have catalyzed a resurgence of interest in spatiotemporal performance capture. Recent systems such as Fusion4D and Holoportation address several crucial problems in fusing multiview depth maps into volumetric and deformable representations in real time. Nonetheless, stitching multiview video textures onto dynamic meshes remains challenging because of imprecise geometry, occlusion seams, and strict real-time constraints. We present a practical solution for real-time, seamless texture montage on dynamic multiview reconstructions. We build on the ideas of dilated depth discontinuities and majority voting from Holoportation to reduce ghosting when blending textures. In contrast to that approach, we determine the blend of textures per vertex with view-dependent rendering, avoiding the blurring introduced by the commonly used normal-weighted blending. By leveraging geodesics-guided diffusion and temporal texture fields, our algorithm mitigates spatial occlusion seams while preserving temporal consistency. Experiments show significant improvements in rendering quality, especially in detailed regions such as faces. We envision a wide range of applications for Montage4D, including immersive telepresence for business, training, and live entertainment.
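
The per-vertex, view-dependent blending and temporal texture fields described above can be illustrated with a small sketch. The Python code below is not the Montage4D implementation; it is a minimal illustration, under assumed array shapes and with hypothetical names (view_dependent_weights, temporal_smooth, visibility), of how each texture camera's weight at a vertex might favor cameras aligned with the current viewpoint, and how those weights might be eased toward a new target over successive frames to avoid popping. Occlusion handling via dilated depth discontinuities and the geodesics-guided diffusion of seams are omitted.

    import numpy as np

    def view_dependent_weights(normals, view_dir, camera_dirs, visibility, sharpness=4.0):
        """Per-vertex blend weights favoring texture cameras aligned with the viewpoint.

        normals:     (V, 3) unit vertex normals
        view_dir:    (3,)   unit direction from the scene toward the rendering viewpoint
        camera_dirs: (C, 3) unit directions from the scene toward each texture camera
        visibility:  (V, C) 1.0 if the vertex is unoccluded in that camera, else 0.0
        """
        # View-dependent term: agreement between each texture camera and the current
        # viewpoint, sharpened so nearby cameras dominate the blend.
        view_term = np.clip(camera_dirs @ view_dir, 0.0, None) ** sharpness      # (C,)
        # Normal term to downweight grazing-angle projections onto the surface.
        normal_term = np.clip(normals @ camera_dirs.T, 0.0, None)                # (V, C)
        w = visibility * view_term[None, :] * normal_term
        # Normalize per vertex; fall back to uniform weights where nothing is visible.
        total = w.sum(axis=1, keepdims=True)
        return np.where(total > 0, w / np.maximum(total, 1e-8), 1.0 / w.shape[1])

    def temporal_smooth(prev_weights, target_weights, rate=0.25):
        """Move the texture field a fraction of the way toward its target each frame,
        so the dominant camera changes gradually rather than popping."""
        return prev_weights + rate * (target_weights - prev_weights)

In such a setup, the smoothed weights would be recomputed every frame and used to modulate the projectively mapped camera textures at each vertex.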

Publications

Montage4D: Real-Time Seamless Fusion and Stylization of Multiview Video Textures

Journal of Computer Graphics Techniques (JCGT), 2019.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields

Fusing Multimedia Data Into Dynamic Virtual Environments

Ruofei Du
Ph.D. Dissertation, Department of Computer Science, University of Maryland, College Park, 2018.
Keywords: social street view, geollery, spherical harmonics, 360 video, multiview video, montage4d, haptics, cryptography, metaverse, mirrored world

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Keywords: correspondences, geodesic distance, embeddings, neural networks

Montage4D: Interactive Seamless Fusion of Multiview Video Textures

Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields

Video Fields: Fusing Multiple Surveillance Videos Into a Dynamic Virtual Environment

Ruofei Du, Sujal Bista, and Amitabh Varshney
Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: virtual reality, mixed reality, video-based rendering, projection mapping, surveillance video, WebGL, WebVR

Videos

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence


Montage4D: Real-Time Seamless Fusion and Stylization of Multiview Video Textures


Talks

Cited By

  • IBRNet: Learning Multi-View Image-Based Rendering. CVPR 2021. Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul P Srinivasan, Howard Zhou, Jonathan T Barron, Ricardo Martin-Brualla, Noah Snavely, and Thomas Funkhouser. [doi]
  • Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans. CVPR 2021. Sida Peng, Yuanqing Zhang, Yinghao Xu, Qianqian Wang, Qing Shuai, Hujun Bao, and Xiaowei Zhou. [doi]
  • Instant Panoramic Texture Mapping With Semantic Object Matching for Large-Scale Urban Scene Reproduction. IEEE Transactions on Visualization and Computer Graphics. Jinwoo Park, Ik-Beom Jeon, Sung-Eui Yoon, and Woontack Woo. [doi]
  • Video Content Representation to Support the Hyper-Reality Experience in Virtual Reality. 2021 IEEE Virtual Reality and 3D User Interfaces (VR). Hyerim Park and Woontack Woo. [doi]
  • Spatiotemporal Retrieval of Dynamic Video Object Trajectories in Geographical Scenes. Transactions in GIS. Yujia Xie, Meizhen Wang, Xuejun Liu, Ziran Wang, Bo Mao, Feiyue Wang, and Xiaozhi Wang. [doi]
  • Video-Geographic Scene Fusion Expression Based on Eye Movement Data. 2021 IEEE 7th International Conference on Virtual Reality (ICVR). Xiaozhi Wang, Yujia Xie, and Xing Wang. [doi]
  • A Review of Video Surveillance Systems. Journal of Visual Communication and Image Representation. Omar Elharrouss, Noor Almaadeed, and Somaya Al-Maadeed. [doi]
  • An Inexpensive Upgradation of Legacy Cameras Using Software and Hardware Architecture for Monitoring and Tracking of Live Threats. IEEE Access. Ume Habiba, Muhammad Awais, Milhan Khan, and Abdul Jaleel. [doi]
  • Multi-Camera Light Field Capture: Synchronization, Calibration, Depth Uncertainty, and System Design. Elijs Dima. [doi]
  • MonoMR: Synthesizing Pseudo-2.5D Mixed Reality Content From Monocular Videos. Applied Sciences. Dong-Hyun Hwang and Hideki Koike. [doi]
  • Feature Based Object Tracking: A Probabilistic Approach. Florida Institute of Technology. Kaleb Smith. [doi]
  • Reconstruction and Detection of Occluded Portions of 3D Human Body Model Using Depth Data From Single Viewpoint. U.S. Patent 10,818,078. Jie Ni and Mohammad Gharavi-Alkhansari. [doi]
  • Heterogeneous Data Fusion. U.S. Patent 11,068,756. James Browning. [doi]
  • Video Display Method and Device. CN110996087B. Feihu Luo. [doi]
  • Image Processing Module, Image Processing Method, Camera Assembly and Mobile Terminal. CN112291479A. Jingyang Chang. [doi]
  • Multi-View Neural Human Rendering. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Minye Wu, Yuehao Wang, Qiang Hu, and Jingyi Yu. [doi]
  • LookinGood. ACM Transactions on Graphics. Ricardo Martin-Brualla, Rohit Pandey, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Julien Valentin, Sameh Khamis, Philip Davidson, Anastasia Tkach, Peter Lincoln, Adarsh Kowdle, Christoph Rhemann, Dan B Goldman, Cem Keskin, Steve Seitz, Shahram Izadi, and Sean Fanello. [doi]
  • Spatiotemporal Texture Reconstruction for Dynamic Objects Using a Single RGB-D Camera. Computer Graphics Forum. Hyomin Kim, Jungeon Kim, Hyeonseo Nam, Jaesik Park, and Seungyong Lee. [doi]
  • RealityCheck. Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. Jeremy Hartmann, Christian Holz, Eyal Ofek, and Andrew Wilson. [doi]
  • The Relightables. ACM Transactions on Graphics. Kaiwen Guo, Peter Lincoln, Philip Davidson, Jay Busch, Xueming Yu, Matt Whalen, Geoff Harvey, Sergio Orts-Escolano, Rohit Pandey, Jason Dourgarian, Danhang Tang, Anastasia Tkach, Adarsh Kowdle, Emily Cooper, Mingsong Dou, Sean Fanello, Graham Fyffe, Christoph Rhemann, Jonathan Taylor, Paul Debevec, and Shahram Izadi. [doi]
  • Image-Guided Neural Object Rendering. 8th International Conference on Learning Representations. Justus Thies, Michael Zollhöfer, Christian Theobalt, Marc Stamminger, and Matthias Nießner. [doi]
  • High-Precision 5DoF Tracking and Visualization of Catheter Placement in EVD of the Brain Using AR. ACM Transactions on Computing for Healthcare. Xuetong Sun, Sarah B. Murthi, Gary Schwartzbauer, and Amitabh Varshney. [doi]
  • Volumetric Capture of Humans With a Single RGBD Camera Via Semi-Parametric Learning. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Rohit Pandey, Cem Keskin, Shahram Izadi, Sean Fanello, Anastasia Tkach, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Ricardo Martin-Brualla, Andrea Tagliasacchi, George Papandreou, and Philip Davidson. [doi]
  • Pri3D: Can 3D Priors Help 2D Representation Learning? arXiv preprint arXiv:2104.11225. Ji Hou, Saining Xie, Benjamin Graham, Angela Dai, and Matthias Nießner. [doi]
  • SIGNET: Efficient Neural Representation for Light Fields. ICCV 2021. Brandon Feng and Amitabh Varshney. [doi]