I am a fourth-year Engineering Science student at the University of Toronto, specializing in the Machine Intelligence option. My research interests are 3D Computer Vision and Generative Models, with a focus on 3D Digital Humans and Image/Video Generation.

I am currently an undergraduate researcher working with Felix Taubner and Professor David Lindell on Human-Centric Compositional Video Generation. Previously, I was a research intern with the Creative Vision team at Snap Research, working on Image Generation and Personalization with Gordon Qian, Jackson Wang, and Sergey Tulyakov, and a UTEA-funded undergraduate researcher in the Modelics Lab with Professor Piero Triverio, working on Dynamic CT analysis.

Outside of academics, I lead the Research Department at UTMIST. I enjoy playing basketball and going to the gym, and I have been playing the piano and practicing Chinese calligraphy since elementary school.

Email  /  Google Scholar  /  GitHub  /  LinkedIn


Updates

[May 2025 | Palo Alto] Joined Snap Research as a Research Intern
[Sept 2025 | Toronto] Joined DGP/TCIG for my undergraduate thesis, working with Felix Taubner & Professor David Lindell

Publications

LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas
Gordon Guocheng Qian*, Ruihang Zhang*, Tsai-Shien Chen, Yusuf Dalva, Anujraaj Goyal, Willi Menapace, Ivan Skorokhodov, Daniil Ostashev, Meng Dong, Arpit Sahni, Ju Hu, Sergey Tulyakov, Kuan-Chieh Jackson Wang
*Equal Contribution
Preprint, 2025
Project Page / arXiv

LayerComposer enables Photoshop-like control for multi-subject text-to-image generation, allowing users to compose scenes with high fidelity by placing, resizing, and locking elements in a layered canvas.

MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars
Felix Taubner, Ruihang Zhang, Mathieu Tuli, Sherwin Bahmani, David B. Lindell
SIGGRAPH Asia, 2025
Project Page / arXiv

MVP4D generates 360° videos of human heads from a reference image and an input animation using a Morphable Multi-View Video Diffusion Model, then distills them into a 4D representation for real-time rendering.

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
Felix Taubner, Ruihang Zhang, Mathieu Tuli, David B. Lindell
CVPR, 2025 (Oral Presentation, top 0.73% of submissions)
Project Page / arXiv

CAP4D generates controllable 4D human head avatars from any number of reference images using Morphable Multi-View Diffusion Models and Deformable 3D Gaussian Splatting.

Website template from Jonathan T. Barron.