Publications

“Don't worry about what anybody else is going to do. The best way to predict the future is to invent it.”

As a Researcher at Google, I devote myself to inventing technologies in AI + XR, fusing users' intent with data from the physical and virtual worlds and making them interactive, accessible, and useful in VR, AR, and MR. I have published over 35 peer-reviewed papers in top venues in HCI, Computer Graphics, and Computer Vision, including CHI, SIGGRAPH Asia, UIST, TVCG, CVPR, ICCV, ECCV, ISMAR, VR, I3D, and Web3D. Please feel free to search by keyword, author, journal, or conference below, or visit my Google Scholar for more details.

augmented communication, XR interaction, digital world, digital human, interactive perception, interactive graphics

Peer-reviewed Publications [bibTeX]

XR Blocks: Accelerating Human-centered AI + XR Innovation Open-source SDK

arxiv, 2025.
Keywords: Extended Reality, Software Development Kit, WebXR, WebGL, Programming Language, Depth-based Interaction, Mixed Reality, Augmented Reality, Virtual Reality, Toolkit, AI, Gemini, Android XR, TensorFlow Lite, LiteRT


Sensible Agent: A Framework for Unobtrusive Interaction with Proactive AR Agent Future of AI + AR

Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (UIST), 2025.
Keywords: Proactive Agents, Augmented Reality, Unobtrusive Interaction, Context-Awareness, Multimodal Interaction, Human-Agent Interaction, Large Multimodal Models, Adaptive Interfaces



InstructPipe: Building Visual Programming Pipelines in Visual Blocks with Human Instructions Using LLMs 🎖️ Honorable Mentions Award

Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems (CHI), 2025.
Keywords: Visual Programming; Large Language Models; Visual Prototyping; Nodegraph Editor; Graph Compiler; Low-code Development; Deep Neural Networks; Deep Learning; Visual Analytics; Interactive Perception




Geollery: A Mixed Reality Social Media Platform 🌎 Live Demo of a Metaverse of Mirrored World

Ruofei Du, David Li, and Amitabh Varshney
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019.
Keywords: metaverse, virtual reality, augmented reality, social media, GIS, street view, visualization, 3D user interface, 3D reconstruction, digital twins, mirrored world, digital world, augmented communication

ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients" Open Source & Data

Findings of the Association for Computational Linguistics: ACL 2026 (ACL), 2026.
Keywords: tool-use, LLM

AgentHands: Generating Interactive Hands Gestures for Spatially Grounded Agent Conversations in XR

Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI), 2026.
Keywords: co-speech gestures, conversational agents, extended reality

How Well Can 3D Accessibility Guidelines Support XR Development? An Interview Study with XR Practitioners in Industry

Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems (CHI), 2026.
Keywords: accessibility, A11y, AR, VR, augmented and virtual reality, extended reality, XR, developer interviews, guidelines


LegacyAvatars: Volumetric Face Avatars for Traditional Graphics Pipelines

2026 International Conference on 3D Vision (3DV), 2026.
Keywords: Volumetric Rendering, Face Modeling, View Synthesis, Performance Capture, digital human, avatar


Enhance Foveated Rendering with Weighted Reservoir Sampling

Proceedings of the 2025 18th ACM SIGGRAPH Conference on Motion, Interaction, and Games (MIG), 2025.
Keywords: foveated rendering, weighted reservoir sampling

DialogLab: Authoring, Simulating, and Testing Dynamic Group Conversations in Hybrid Human-AI Conversations Live Demo Available!

Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (UIST), 2025.
Keywords: Proactive Agents, Augmented Reality, Unobtrusive Interaction, Context-Awareness, Multimodal Interaction, Human-Agent Interaction, Large Multimodal Models, Adaptive Interfaces


Thing2Reality: Enabling Spontaneous Creation of 3D Objects From 2D Images Using Generative AI in Distributed XR Meetings

Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (UIST), 2025.
Keywords: Extended Reality, Augmented Communication, Image-to-3D, Information Artifacts, Multi-modal, Remote Collaboration


Beyond the Phone: Exploring Context-Aware Interaction Between Mobile and Mixed Reality Devices

2025 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2025.
Keywords: Cross-Device Interaction, Phone-XR Integration

SVG: 3D Stereoscopic Video Generation Via Denoising Frame Matrix

The International Conference on Learning Representations (ICLR), 2025.
Keywords: Stereoscopic Video, Generative AI

Augmented Object Intelligence with XR-Objects 🎖️ 2025 SXSW Innovation Awards Finalist

Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST), 2024.
Keywords: mixed reality; extended reality; augmented reality; augmented objects; spatial computing; user interfaces; context menus

Human I/O: Towards a Unified Approach to Detecting Situational Impairments 🎖️ Best Paper Honourable Mentions Award

Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: situational impairments, augmented reality, large language models, multimodal sensing, context awareness, XR interaction, interactive perception


UI Mobility Control in XR: Switching UI Positionings Between Static, Dynamic, and Self Entities

Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: Extended Reality, User Interface, UI Mobility, UI Positioning, XR interaction, interactive graphics

Experiencing Thing2Reality: Transforming 2D Content Into Conditioned Multiviews and 3D Gaussian Objects for XR Communication

Adjunct Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST), 2024.
Keywords: extended reality, augmented communication, image-to-3D, remote collaboration, spatial referencing, co-presence


ChatDirector: Enhancing Video Conferencing with Space-Aware Scene Rendering and Speech-Driven Layout Transition 500K+ Media Coverage

Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: augmented communication, video conferencing, 3D portrait avatar, co-presence, attention transition, depth estimation, video-mediated communication


FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces 🏆 Best Student Paper Award

Proceedings of the ACM on Computer Graphics and Interactive Techniques (I3D), 2024.
Keywords: Volumetric Rendering, Face Modeling, View Synthesis, Performance Capture, digital human

Experiencing InstructPipe: Building Multi-modal AI Pipelines Via Prompting LLMs and Visual Programming

Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: Visual Programming; Large Language Models; Visual Prototyping; Node-graph Editor; Graph Compiler; Low-code Development; Deep Neural Networks; Deep Learning; Visual Analytics




Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers

arxiv, 2024.
Keywords: deep learning, image compression, nonlinear transform coding, high dynamic range, super-resolution, interactive perception

Rapsai: Accelerating Machine Learning Prototyping of Multimedia Applications Through Visual Programming 🎖️ Honorable Mentions Award, 170K+ views

Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: visual programming, node-graph editor, deep neural networks, data augmentation, deep learning, model comparison, visual analytics, interactive perception




Visual Captions: Augmenting Verbal Communication with On-the-fly Visuals 📹 Open Source, Real-time, Live!

Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: augmented communication, large language models, video-mediated communication, online meeting, collaborative work, augmented reality, XR interaction


ThingShare: Ad-Hoc Digital Copies of Physical Objects for Sharing Things in Video Meetings

Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: video-mediated communication, object-centered meetings, online meeting, collaborative work, augmented communication, XR interaction


Modeling and Improving Text Stability in Live Captions 🚀 Landed in Live Transcribe App

Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA), 2023.
Keywords: live captions; real-time transcription; visual instability; flickering metric; speech-to-text; text stability; tokenized alignment; augmented communication

Learning Personalized High Quality Volumetric Head Avatars From Monocular RGB Videos

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Keywords: implicit 3D avatar, monocular RGB video, facial expressions, head poses, neural radiance field, photorealism, digital human


Experiencing Visual Blocks for ML: Visual Prototyping of AI Pipelines

Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST), 2023.
Keywords: visual programming, large language models, visual prototyping, multi-modal models, node-graph editor, deep neural networks, data augmentation, deep learning, visual analytics


Experiencing Visual Captions: Augmented Communication with Real-time Visuals Using Large Language Models

Adjunct Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology (UIST), 2023.
Keywords: augmented communication, large language models, video-mediated communication, online meeting, collaborative work, dataset, text-to-visual, AI agent, augmented reality


Portrait Expression Editing with Mobile Photo Sequence

SIGGRAPH Asia 2023 Technical Communications (SA), 2023.
Keywords: Neural rendering, Portrait expression editing, Mobile system

RetroSphere: Self-Contained Passive 3D Controller Tracking for Augmented Reality 🏆 IMWUT Vol. 6 Distinguished Paper Award

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 2022.
Keywords: Retroreflectors, Augmented reality, Virtual reality, Infrared marker tracking, Augmented reality glasses, XR interaction


Sandwiched Image Compression: Increasing the Resolution and Dynamic Range of Standard Codecs 🎖️ Best Paper Finalist

2022 Picture Coding Symposium (PCS), 2022.
Keywords: deep learning, image compression, nonlinear transform coding, high dynamic range, super-resolution, interactive perception

PRIF: Primary Ray-based Implicit Function

European Conference on Computer Vision (ECCV), 2022.
Keywords: deep implicit functions, neural representation, signed distance function, interactive perception, interactive graphics

“Slurp” Revisited: Using Software Reconstruction to Reflect on Spatial Interactivity and Locative Media

Proceedings of the Designing Interactive Systems Conference (DIS), 2022.
Keywords: system re-presencing, affordances, metaphor, software reconstruction, historical precedents, gestural interface, augmented reality, spatial interaction, XR interaction

ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard of Hearing Users

Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), 2022.
Keywords: accessibility, deaf, Deaf, hard of hearing, sound awareness



Opportunistic Interfaces for Augmented Reality: Transforming Everyday Objects Into Tangible 6DoF Interfaces Using Ad Hoc UI

Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), 2022.
Keywords: augmented reality, everyday objects, tangible user interface, 3D user interface, 6 DoF, spatial interaction, markerless tracking, tangible interaction, hand gestures, XR interaction


OmniSyn: Intermediate View Synthesis Between Wide-baseline Panoramas

2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2022.
Keywords: 360 image, virtual reality, view synthesis, panorama, neural rendering, depth map, mesh rendering, inpainting, digital world

GazeChat: Enhancing Virtual Conferences with Gaze-aware 3D Photos

Proceedings of the 34th Annual ACM Symposium on User Interface Software and Technology (UIST), 2021.
Keywords: eye contact, gaze awareness, video conferencing, video-mediated communication, gaze interaction, augmented communication, augmented conversation, eye tracking, XR interaction

Multiresolution Deep Implicit Functions for 3D Shape Representation

Zhang Chen, Yinda Zhang, Kyle Genova, Thomas Funkhouser, Sean Fanello, Sofien Bouaziz, Christian Häne, Ruofei Du, Cem Keskin, and Danhang Tang
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
Keywords: deep implicit functions, neural representation, compression, levels of detail, MDIF, interactive perception

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Keywords: correspondences, geodesic distance, embeddings, neural networks, digital human, interactive perception

A Log-Rectilinear Transformation for Foveated 360-degree Video Streaming 🎖️ TVCG Honorable Mentions

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021.
Keywords: 360° video, foveation, virtual reality, live video streaming, log-rectilinear, summed-area table, eye tracking, digital world

Sandwiched Image Compression: Wrapping Neural Networks Around a Standard Codec

2021 IEEE International Conference on Image Processing (ICIP), 2021.
Keywords: deep learning, image compression, interactive perception

Saliency Computation for Virtual Cinematography in 360° Videos

Ruofei Du and Amitabh Varshney
IEEE Computer Graphics and Applications (CGA), 2021.
Keywords: spherical harmonics, virtual reality, visual saliency, 360° videos, omnidirectional videos, perception, Itti model, spectral residual, GPGPU, CUDA, eye tracking, interactive graphics

DepthLab: Real-time 3D Interaction with Depth Maps for Mobile Augmented Reality 🚀 100K downloads

Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2020.
Keywords: depth map; interactive 3D graphics; real time; interaction; augmented reality; mobile AR; rendering; GPU; ARCore; XR interaction; digital world, interactive graphics



CollaboVR: A Reconfigurable Framework for Multi-user to Communicate in Virtual Reality

Zhenyi He, Ruofei Du, and Ken Perlin
2020 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2020.
Keywords: chalktalk, virtual reality, collaborative work, layout, telepresence, communication, XR interaction, augmented communication

3D-Kernel Foveated Rendering for Light Fields

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2020.
Keywords: light field, foveated rendering, microscopic light field, eye tracking, visualization, interactive graphics

Experiencing Real-time 3D Interaction with Depth Maps for Mobile Augmented Reality in DepthLab

Adjunct Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2020.
Keywords: depth map; interactive 3D graphics; real time; interaction; augmented reality; mobile AR; rendering; GPU; ARCore; interactive graphics, XR interaction



MeteoVis: Visualizing Meteorological Events in Virtual Reality

Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (CHI EA), 2020.
Keywords: scientific visualization, virtual reality, meteorological data, immersion, interactive visualization, vector field, XR interaction

Eye-Dominance-guided Foveated Rendering

IEEE Transactions on Visualization and Computer Graphics (TVCG, Special Issue of IEEE Conference on Virtual Reality and 3D User Interfaces), 2020.
Keywords: virtual reality, foveated rendering, perception, gaze-contingent rendering, ocular dominance, eye tracking, interactive graphics

Montage4D: Real-time Seamless Fusion and Stylization of Multiview Video Textures Microsoft TechFest 2018

Journal of Computer Graphics Techniques (JCGT), 2019.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields; digital human

Language-based Colorization of Scene Sketches

ACM Transactions on Graphics (SIGGRAPH Asia), 2019.
Keywords: deep neural networks; image segmentation; language-based editing; scene sketch; sketch colorization, interactive graphics, interactive perception, augmented communication

ORC Layout: Adaptive GUI Layout with OR-Constraints

Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019.
Keywords: GUI builder, layout manager, constraint-based layout, visual interface design, visual programming, interactive graphics

Kernel Foveated Rendering 📖 Most read in PACMCGIT

Proceedings of the ACM on Computer Graphics and Interactive Techniques (I3D), 2018.
Keywords: foveated rendering, perception, log-polar mapping, virtual reality, head-mounted displays, eye tracking, interactive graphics

Project Geollery.com: Reconstructing a Live Mirrored World with Geotagged Social Media

Ruofei Du, David Li, and Amitabh Varshney
Proceedings of the 24th International Conference on Web3D Technology (Web3D), 2019.
Keywords: virtual reality, mixed reality, 360° image, GIS, 3D reconstruction, projection mapping, mirrored world, social media, WebGL, metaverse, interactive graphics, digital world

Experiencing a Mirrored World with Geotagged Social Media in Geollery

Ruofei Du, David Li, and Amitabh Varshney
Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA), 2019.
Keywords: virtual reality, augmented reality, social media, GIS, street view, visualization, 3D user interface, 3D reconstruction, metaverse, mirrored world

Interactive Fusion of 360° Images for a Mirrored World

Ruofei Du, David Li, and Amitabh Varshney
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019.
Keywords: virtual reality, 360° image, 3D reconstruction, mixed reality, projection mapping, mirrored world, metaverse

Tracking-Tolerant Visual Cryptography

Ruofei Du, Eric Lee, and Amitabh Varshney
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019.
Keywords: visual cryptography, augmented reality (AR), tracking, XR interaction

Montage4D: Interactive Seamless Fusion of Multiview Video Textures

Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields, digital human

SketchyScene: Richly-Annotated Scene Sketches

European Conference on Computer Vision (ECCV), 2018.
Keywords: sketch dataset, scene sketch, sketch segmentation, interactive graphics

Evaluating Haptic and Auditory Directional Guidance to Assist Blind People in Reading Printed Text Using Finger-Mounted Cameras

ACM Transactions on Accessible Computing (TACCESS), 2016.
Keywords: accessibility, real-time OCR, visual impairments, wearables, XR interaction

Video Fields: Fusing Multiple Surveillance Videos Into a Dynamic Virtual Environment

Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: virtual reality; mixed-reality; video-based rendering; projection mapping; surveillance video; WebGL; WebVR; interactive graphics

Fusing Multimedia Data Into Dynamic Virtual Environments Ph.D. Dissertation

Ruofei Du
Ph.D. Dissertation, University of Maryland, College Park, 2018.
Keywords: social street view, geollery, spherical harmonics, 360 video, multiview video, montage4d, haptics, cryptography, metaverse, mirrored world

Social Street View: Blending Immersive Street Views with Geo-Tagged Social Media 🏆 Best Paper Award

Ruofei Du and Amitabh Varshney
Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: metaverse, spatial-temporal virtual reality; social media; street view; geographical information systems; mixed reality; WebGL; digital twins; digital world

VRSurus: Enhancing Interactivity and Tangibility of Puppets in Virtual Reality 🎥 Demoed at UIST 2015

Ruofei Du and Liang He
Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA), 2016.
Keywords: Virtual Reality; Tangible User Interface; Haptics; Gesture Recognition; Head-Mounted Display; XR interaction

AtmoSPHERE: Representing Space and Movement Using Sand Traces in an Interactive Zen Garden

Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA), 2015.
Keywords: Visualization; Tangible Interactive Art; Machine Aesthetics; Calm Technology; XY Servo Table; Kinect; XR interaction

The Design and Preliminary Evaluation of a Finger-Mounted Camera and Feedback System to Enable Reading of Printed Text for the Blind

Computer Vision - ECCV 2014 Workshops (ECCVW), 2014.
Keywords: Accessibility, Wearables, Real-time OCR, Text Reading for Blind

Supporting Everyday Activities for Persons with Visual Impairments Through Computer Vision

Proceedings of the 17th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2015.
Keywords: Blind; visually impaired; wearable computing; computer vision; vision-augmented touch


Online Vigilance Analysis Combining Video and Electrooculography Features

Neural Information Processing - 19th International Conference (ICONIP), 2012.
Keywords: Vigilance Analysis, Fatigue Detection, Active Shape Model, Electrooculography, Support Vector Machine, eye tracking

A Pilot Study of Spherical Harmonics for Saliency Computation and Navigation in 360° Videos

Ruofei Du and Amitabh Varshney
ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: spherical harmonics, virtual reality, visual saliency, 360° videos, omnidirectional videos, perception, Itti model, spectral residual, GPGPU, CUDA

Research on Fatigue Driving Detection System Based on Video Signals

Ruofei Du
Shanghai Jiao Tong University, 2013.

Technical Reports

Experiencing Rapid Prototyping of Machine Learning Based Multimedia Applications in Rapsai

Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA), 2023.
Keywords: visual programming, node-graph editor, deep neural networks, data augmentation, deep learning, model comparison, visual analytics, interactive perception


C-Flow: Visualizing Foot Traffic and Profit Data to Make Informative Decisions

University of Maryland, College Park. Department of Computer Science, 2012.
Keywords: Information Visualization; Data Mapping; Indoor Visualization; Business; Usability Testing

Statistics for K-mer Based Splicing Analysis

Ruofei Du, Hao Li, Hui Miao, and Shangfu Peng
University of Maryland, College Park. Department of Computer Science, 2014.

UistViz: 26 Years of UIST Coauthor Network Visualization

Ruofei Du
University of Maryland, College Park. Department of Computer Science, 2013.
Keywords: Information Visualization; UIST; Coauthor Network; DBLP; NodeXL

Learning Depression Patterns From MyPersonality and Reddit

Weiwei Yang, Xuetong Sun, and Ruofei Du
University of Maryland, College Park. Department of Computer Science, 2015.

Zero-shot Learning Based Pedestrian Parsing

Xiyang Dai, Ruofei Du, and Hao Zhou
University of Maryland, College Park. Department of Computer Science, 2015.
Keywords: pedestrian parsing, zero-shot learning, segmentation

3DVAR: From 3D Reconstruction to Virtual and Augmented Reality

Microsoft Asia Student TechFest 2012, 2012.

Deliberately Planning and Acting for Angry Birds with Refinement Methods

Ruofei Du, Zebao Gao, and Zheng Xu
University of Maryland, College Park. Department of Computer Science, 2015.

Systems, Devices, and Methods for Generating a Social Street View

Ruofei Du and Amitabh Varshney
US Patent 10,380,726, 2016.
Keywords: virtual reality; social media; street view; geographical information systems; mixed reality; WebGL
