Publications

“Don't worry about what anybody else is going to do. The best way to predict the future is to invent it.”

As a researcher at Google, I devote myself to inventing technologies in interactive perception and graphics that fuse information from the physical and virtual worlds and make it interactive, accessible, and useful in VR, AR, and MR. I have published over 35 peer-reviewed papers in top venues in HCI, computer graphics, and computer vision, including CHI, SIGGRAPH Asia, UIST, TVCG, CVPR, ICCV, ECCV, ISMAR, VR, I3D, and Web3D. Feel free to search by keyword, author, journal, or conference below, or visit my Google Scholar profile for more details.

augmented communication, XR interaction, digital world, digital human, interactive perception, interactive graphics

Peer-reviewed Publications [bibTeX]

Rapsai: Accelerating Machine Learning Prototyping of Multimedia Applications Through Visual Programming [Honorable Mention Award, 170K+ views]

Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: visual programming, node-graph editor, deep neural networks, data augmentation, deep learning, model comparison, visual analytics, interactive perception




Visual Captions: Augmenting Verbal Communication With On-the-fly Visuals [Open Source, Real-time, Live!]

Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: augmented communication, large language models, video-mediated communication, online meeting, collaborative work, augmented reality, XR interaction


DepthLab: Real-time 3D Interaction With Depth Maps for Mobile Augmented Reality [50K downloads]

Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2020.
Keywords: depth map; interactive 3D graphics; real time; interaction; augmented reality; mobile AR; rendering; GPU; ARCore; XR interaction; digital world, interactive graphics



Geollery: A Mixed Reality Social Media Platform [Live Demo of a Metaverse of Mirrored World]

Ruofei Du, David Li, and Amitabh Varshney
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019.
Keywords: metaverse, virtual reality, augmented reality, social media, GIS, street view, visualization, 3D user interface, 3D reconstruction, digital twins, mirrored world, digital world, augmented communication

ChatDirector: Enhancing Video Conferencing With Space-Aware Scene Rendering and Speech-Driven Layout Transition

Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: augmented communication, video conferencing, 3D portrait avatar, co-presence, attention transition, depth estimation, video-mediated communication


Human I/O: Towards a Unified Approach to Detecting Situational Impairments

Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: situational impairments, augmented reality, large language models, multimodal sensing, context awareness, XR interaction, interactive perception


UI Mobility Control in XR: Switching UI Positionings Between Static, Dynamic, and Self Entities

Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: Extended Reality, User Interface, UI Mobility, UI Positioning, XR interaction, interactive graphics

Experiencing InstructPipe: Building Multi-modal AI Pipelines Via Prompting LLMs and Visual Programming

Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI), 2024.
Keywords: Visual Programming; Large Language Models; Visual Prototyping; Node-graph Editor; Graph Compiler; Low-code Development; Deep Neural Networks; Deep Learning; Visual Analytics




Montage4D: Real-time Seamless Fusion and Stylization of Multiview Video Textures [Microsoft TechFest 2018]

Journal of Computer Graphics Techniques (JCGT), 2019.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields; digital human

Social Street View: Blending Immersive Street Views With Geo-Tagged Social Media [Best Paper Award]

Ruofei Du and Amitabh Varshney
Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: metaverse, spatial-temporal virtual reality; social media; street view; geographical information systems; mixed reality; WebGL; digital twins; digital world

Fusing Multimedia Data Into Dynamic Virtual Environments

Ruofei Du
Ph.D. Dissertation, Department of Computer Science, University of Maryland, College Park, 2018.
Keywords: social street view, geollery, spherical harmonics, 360 video, multiview video, montage4d, haptics, cryptography, metaverse, mirrored world

InstructPipe: Building Visual Programming Pipelines With Human Instructions

https://arxiv.org/abs/2312.09672, 2023.
Keywords: Visual Programming; Large Language Models; Visual Prototyping; Node-graph Editor; Graph Compiler; Low-code Development; Deep Neural Networks; Deep Learning; Visual Analytics; Interactive Perception



Experiencing Visual Blocks for ML: Visual Prototyping of AI Pipelines

Adjunct Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2023.
Keywords: visual programming, large language models, visual prototyping, multi-modal models, node-graph editor, deep neural networks, data augmentation, deep learning, visual analytics


Experiencing Visual Captions: Augmented Communication With Real-time Visuals Using Large Language Models

Adjunct Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2023.
Keywords: augmented communication, large language models, video-mediated communication, online meeting, collaborative work, dataset, text-to-visual, AI agent, augmented reality


Portrait Expression Editing With Mobile Photo Sequence

SIGGRAPH Asia 2023 Technical Communications (SA), 2023.
Keywords: Neural rendering, Portrait expression editing, Mobile system

Learning Personalized High Quality Volumetric Head Avatars From Monocular RGB Videos

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Keywords: implicit 3D avatar, monocular RGB video, facial expressions, head poses, neural radiance field, photorealism, digital human


ThingShare: Ad-Hoc Digital Copies of Physical Objects for Sharing Things in Video Meetings

Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI), 2023.
Keywords: video-mediated communication, object-centered meetings, online meeting, collaborative work, augmented communication, XR interaction


Modeling and Improving Text Stability in Live Captions

Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA), 2023.
Keywords: live captions; real-time transcription; visual instability; flickering metric; speech-to-text; text stability; tokenized alignment; augmented communication

RetroSphere: Self-Contained Passive 3D Controller Tracking for Augmented Reality [IMWUT Vol. 6 Distinguished Paper Award]

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 2022.
Keywords: Retroreflectors, Augmented reality, Virtual reality, Infrared marker tracking, Augmented reality glasses, XR interaction


Sandwiched Image Compression: Increasing the Resolution and Dynamic Range of Standard Codecs [Best Paper Finalist]

2022 Picture Coding Symposium (PCS), 2022.
Keywords: deep learning, image compression, nonlinear transform coding, high dynamic range, super-resolution, interactive perception

PRIF: Primary Ray-based Implicit Function

European Conference on Computer Vision (ECCV), 2022.
Keywords: deep implicit functions, neural representation, signed distance function, interactive perception, interactive graphics

“Slurp” Revisited: Using Software Reconstruction to Reflect on Spatial Interactivity and Locative Media

Proceedings of the Designing Interactive Systems Conference (DIS), 2022.
Keywords: system re-presencing, affordances, metaphor, software reconstruction, historical precedents, gestural interface, augmented reality, spatial interaction, XR interaction

ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard of Hearing Users

Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), 2022.
Keywords: accessibility, deaf, Deaf, hard of hearing, sound awareness



Opportunistic Interfaces for Augmented Reality: Transforming Everyday Objects Into Tangible 6DoF Interfaces Using Ad Hoc UI

Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI), 2022.
Keywords: augmented reality, everyday objects, tangible user interface, 3D user interface, 6 DoF, spatial interaction, markerless tracking, tangible interaction, hand gestures, XR interaction


OmniSyn: Intermediate View Synthesis Between Wide-baseline Panoramas

2022 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2022.
Keywords: 360 image, virtual reality, view synthesis, panorama, neural rendering, depth map, mesh rendering, inpainting, digital world

GazeChat: Enhancing Virtual Conferences With Gaze-aware 3D Photos

Proceedings of the 34th Annual ACM Symposium on User Interface Software and Technology (UIST), 2021.
Keywords: eye contact, gaze awareness, video conferencing, video-mediated communication, gaze interaction, augmented communication, augmented conversation, eye tracking, XR interaction

Multiresolution Deep Implicit Functions for 3D Shape Representation

2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
Keywords: deep implicit functions, neural representation, compression, levels of detail, MDIF, interactive perception

HumanGPS: Geodesic PreServing Feature for Dense Human Correspondence

2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
Keywords: correspondences, geodesic distance, embeddings, neural networks, digital human, interactive perception

A Log-Rectilinear Transformation for Foveated 360-degree Video Streaming [TVCG Honorable Mention]

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021.
Keywords: 360° video, foveation, virtual reality, live video streaming, log-rectilinear, summed-area table, eye tracking, digital world

Sandwiched Image Compression: Wrapping Neural Networks Around a Standard Codec

2021 IEEE International Conference on Image Processing (ICIP), 2021.
Keywords: deep learning, image compression, interactive perception

Saliency Computation for Virtual Cinematography in 360° Videos

Ruofei Du and Amitabh Varshney
IEEE Computer Graphics and Applications (CGA), 2021.
Keywords: spherical harmonics, virtual reality, visual saliency, 360° videos, omnidirectional videos, perception, Itti model, spectral residual, GPGPU, CUDA, eye tracking, interactive graphics

CollaboVR: A Reconfigurable Framework for Multi-user to Communicate in Virtual Reality

Zhenyi He, Ruofei Du, and Ken Perlin
2020 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2020.
Keywords: chalktalk, virtual reality, collaborative work, layout, telepresence, communication, XR interaction, augmented communication

3D-Kernel Foveated Rendering for Light Fields

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2020.
Keywords: light field, foveated rendering, microscopic light field, eye tracking, visualization, interactive graphics

Experiencing Real-time 3D Interaction With Depth Maps for Mobile Augmented Reality in DepthLab

Adjunct Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology (UIST), 2020.
Keywords: depth map; interactive 3D graphics; real time; interaction; augmented reality; mobile AR; rendering; GPU; ARCore; interactive graphics, XR interaction



MeteoVis: Visualizing Meteorological Events in Virtual Reality

Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (CHI EA), 2020.
Keywords: scientific visualization, virtual reality, meteorological data, immersion, interactive visualization, vector field, XR interaction

Eye-Dominance-guided Foveated Rendering

IEEE Transactions on Visualization and Computer Graphics (TVCG, Special Issue of IEEE Conference on Virtual Reality and 3D User Interfaces), 2020.
Keywords: virtual reality, foveated rendering, perception, gaze-contingent rendering, ocular dominance, eye tracking, interactive graphics

Language-based Colorization of Scene Sketches

ACM Transactions on Graphics (SIGGRAPH Asia), 2019.
Keywords: deep neural networks; image segmentation; language-based editing; scene sketch; sketch colorization, interactive graphics, interactive perception, augmented communication

ORC Layout: Adaptive GUI Layout With OR-Constraints

Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (CHI), 2019.
Keywords: GUI builder, layout manager, constraint-based layout, visual interface design, visual programming, interactive graphics

Kernel Foveated Rendering [Most read in PACMCGIT]

Proceedings of the ACM on Computer Graphics and Interactive Techniques (I3D), 2018.
Keywords: foveated rendering, perception, log-polar mapping, eye tracking, virtual reality, head-mounted displays, interactive graphics

Project Geollery.com: Reconstructing a Live Mirrored World With Geotagged Social Media

Ruofei Du, David Li, and Amitabh Varshney
Proceedings of the 24th International Conference on Web3D Technology (Web3D), 2019.
Keywords: virtual reality, mixed reality, 360° image, GIS, 3D reconstruction, projection mapping, mirrored world, social media, WebGL, metaverse, interactive graphics, digital world

Montage4D: Interactive Seamless Fusion of Multiview Video Textures

Proceedings of ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: texture montage, 3d reconstruction, texture stitching, view-dependent rendering, discrete geodesics, projective texture mapping, differential geometry, temporal texture fields, digital human

SketchyScene: Richly-Annotated Scene Sketches

European Conference on Computer Vision (ECCV), 2018.
Keywords: sketch dataset, scene sketch, sketch segmentation, interactive graphics

Evaluating Haptic and Auditory Directional Guidance to Assist Blind People in Reading Printed Text Using Finger-Mounted Cameras

ACM Transactions on Accessible Computing (TACCESS), 2016.
Keywords: accessibility, real-time OCR, visual impairments, wearables, XR interaction

Video Fields: Fusing Multiple Surveillance Videos Into a Dynamic Virtual Environment

Proceedings of the 21st International Conference on Web3D Technology (Web3D), 2016.
Keywords: virtual reality; mixed-reality; video-based rendering; projection mapping; surveillance video; WebGL; WebVR; interactive graphics

Experiencing a Mirrored World With Geotagged Social Media in Geollery

Ruofei Du, David Li, and Amitabh Varshney
Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (CHI EA), 2019.
Keywords: virtual reality, augmented reality, social media, GIS, street view, visualization, 3D user interface, 3D reconstruction, metaverse, mirrored world

Interactive Fusion of 360° Images for a Mirrored World

Ruofei Du, David Li, and Amitabh Varshney
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019.
Keywords: virtual reality, 360° image, 3D reconstruction, mixed reality, projection mapping, mirrored world, metaverse

Tracking-Tolerant Visual Cryptography

Ruofei Du, Eric Lee, and Amitabh Varshney
2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), 2019.
Keywords: visual cryptography, augmented reality (AR), tracking, XR interaction

VRSurus: Enhancing Interactivity and Tangibility of Puppets in Virtual Reality [Demoed at UIST 2015]

Ruofei Du and Liang He
Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA), 2016.
Keywords: Virtual Reality; Tangible User Interface; Haptics; Gesture Recognition; Head-Mounted Display; XR interaction

AtmoSPHERE: Representing Space and Movement Using Sand Traces in an Interactive Zen Garden

Proceedings of the 33rd Annual ACM Conference Extended Abstracts on Human Factors in Computing Systems (CHI EA), 2015.
Keywords: Visualization; Tangible Interactive Art; Machine Aesthetics; Calm Technology; XY Servo Table; Kinect; XR interaction

Supporting Everyday Activities for Persons With Visual Impairments Through Computer Vision

Proceedings of the 17th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS), 2015.
Keywords: Blind; visually impaired; wearable computing; computer vision; vision-augmented touch


Online Vigilance Analysis Combining Video and Electrooculography Features

Neural Information Processing - 19th International Conference (ICONIP), 2012.
Keywords: Vigilance Analysis, Fatigue Detection, Active Shape Model, Electrooculography, Support Vector Machine, eye tracking

A Pilot Study of Spherical Harmonics for Saliency Computation and Navigation in 360° Videos

Ruofei Du and Amitabh Varshney
ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2018.
Keywords: spherical harmonics, virtual reality, visual saliency, 360° videos, omnidirectional videos, perception, Itti model, spectral residual, GPGPU, CUDA

Technical Reports

Experiencing Rapid Prototyping of Machine Learning Based Multimedia Applications in Rapsai

Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA), 2023.
Keywords: visual programming, node-graph editor, deep neural networks, data augmentation, deep learning, model comparison, visual analytics, interactive perception


C-Flow: Visualizing Foot Traffic and Profit Data to Make Informative Decisions

University of Maryland, College Park. Department of Computer Science, 2012.
Keywords: Information Visualization; Data Mapping; Indoor Visualization; Business; Usability Testing

Statistics for K-mer Based Splicing Analysis

Ruofei Du, Hao Li, Hui Miao, and Shangfu Peng
University of Maryland, College Park. Department of Computer Science, 2014.
Keywords:

UistViz: 26 Years of UIST Coauthor Network Visualization

Ruofei Du
University of Maryland, College Park. Department of Computer Science, 2013.
Keywords: Information Visualization; UIST; Coauthor Network; DBLP; NodeXL

Learning Depression Patterns From MyPersonality and Reddit

Weiwei Yang, Xuetong Sun, and Ruofei Du
University of Maryland, College Park. Department of Computer Science, 2015.
Keywords:

Zero-shot Learning Based Pedestrian Parsing

Xiyang Dai, Ruofei Du, and Hao Zhou
University of Maryland, College Park. Department of Computer Science, 2015.
Keywords: pedestrian parsing, zero-shot learning, segmentation

Deliberately Planning and Acting for Angry Birds With Refinement Methods

Ruofei Du, Zebao Gao, and Zheng Xu
University of Maryland, College Park. Department of Computer Science, 2015.
Keywords:

Systems, Devices, and Methods for Generating a Social Street View

Ruofei Du and Amitabh Varshney
US Patent 10,380,726, 2016.
Keywords: virtual reality; social media; street view; geographical information systems; mixed reality; WebGL
