We are an interdisciplinary team of researchers working in visual computing, in particular, computer graphics and computer vision. Current areas of focus include 3D and robotic vision, 3D printing and content creation, animation, AR/VR, generative AI, geometric and image-based modelling, language and 3D, machine learning, natural phenomena, and shape analysis. Our research frequently appears in top venues such as SIGGRAPH, CVPR, and ICCV (we rank #14 in the world in terms of top publications in visual computing, as of 6/2023), and we collaborate widely with industry and academia (e.g., Adobe Research, Amazon, Autodesk, Google, MSRA, Princeton, Stanford, Tel Aviv, and Washington). Our faculty and students have won numerous honours and awards, including FRSC, the SIGGRAPH Outstanding Doctoral Dissertation Award, the Alain Fournier Best Thesis Award, the CS|InfoGAN Researcher Award, a Google Faculty Award, a Google PhD Fellowship, a Borealis AI Fellowship, TR35@Singapore, the CHCCS Achievement and Early Career Researcher Awards, NSERC Discovery Accelerator Awards, and several best paper awards from CVPR, ECCV, SCA, SGP, etc. GrUVi alumni have gone on to take up faculty positions in Canada, the US, and Asia, while others now work at companies including Amazon, Apple, EA, Facebook (Meta), Google, IBM, and Microsoft.
February 23, 2024
The VCR seminar will host a workshop for AAAI visitors from other universities. Below is the schedule for the talks:

11:00 am-11:40 am: Levi Lelis (U of Alberta), Learning Options by Extracting Programs from Neural Networks
11:40 am-12:20 pm: Vahid Babaei (MPI), Inverse Design with Neural Surrogate Models
12:20 pm-1:00 pm: Lunch break
1:00 pm-1:40 pm: Sven Koenig (USC), Multi-Agent Path Finding and Its Applications
1:40 pm-2:20 pm: Jiaoyang Li (CMU), Layout Design for Large-Scale Multi-Robot Coordination

Talk 1 (11:00 am-11:40 am): Levi Lelis, Department of Computing Science, University of Alberta
Title: Learning Options by Extracting Programs from Neural Networks
Abstract: In this talk, I argue for a programmatic mindset in reinforcement learning, proposing that agents should generate libraries of programs encoding reusable behaviors. When faced with a new task, the agent learns how to combine existing programs and generate new ones. This approach can be helpful even when policies are encoded in seemingly non-decomposable representations such as neural networks. I will show that neural networks with piecewise linear activation functions can be mapped to programs with if-then-else structures. Such a program can then be easily decomposed into sub-programs with the same input type as the original network. For networks encoding policies, each sub-program can be seen as an option, i.e., a temporally extended action. Together, these sub-programs form a library of agent behaviors that can be reused later, in downstream tasks. Since even small networks can encode a large number of sub-programs, we select the sub-programs that are likely to generalize to unseen tasks. This is achieved through a subset selection procedure that minimizes the Levin loss. Empirical evidence from challenging exploration scenarios in two grid-world domains demonstrates that our methodology can extract helpful programs, thus speeding up the learning process in tasks that are similar to, and yet distinct from, the one used to train the original model.
Bio: Dr. Levi Lelis is an Assistant Professor at the University of Alberta, an Amii Fellow, and a CIFAR AI Chair. Levi's research is dedicated to the development of principled algorithms for solving combinatorial search problems, which are integral to optimization tasks in various sectors. His research group focuses on combinatorial search problems arising from the search for programmatic solutions: computer programs, written in a domain-specific language, that encode problem solutions. Levi believes that the most promising path to creating agents that learn continually, efficiently, and safely is to represent the agents' knowledge programmatically. While programmatic representations offer many advantages, including modularity and reusability, they present a significant challenge: the need to search over large, non-differentiable spaces not suited to gradient-descent methods. Addressing this challenge is the current focus of Levi's work.

Talk 2 (11:40 am-12:20 pm): Vahid Babaei, Max Planck Institute for Informatics
Title: Inverse Design with Neural Surrogate Models
Abstract: The digitalization of manufacturing is turning fabrication hardware into computers. As traditional tools, such as computer-aided design, manufacturing, and engineering (CAD/CAM/CAE), lag behind this new paradigm, the field of computational fabrication has recently emerged from computer graphics to address this knowledge gap with a computer-science mindset. Computer graphics is extremely powerful at creating content for the virtual world, so the connection is a natural one: digital fabrication hardware is starving for innovative content. In this talk, I will focus on inverse design, a powerful content-synthesis paradigm for digital fabrication that creates fabricable designs given the desired performance. Specifically, I will discuss a class of inverse design problems that deal with data-driven neural surrogate models.
These surrogates learn and replace a forward process, such as a computationally heavy simulation.
Bio: Vahid Babaei leads the AI aided Design and Manufacturing group at the Computer Graphics Department of the Max Planck Institute for Informatics in Saarbrücken, Germany. He was a postdoctoral researcher in the Computational Design and Fabrication Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT. He obtained his PhD in Computer Science from EPFL. Vahid Babaei is the recipient of the 2023 Germany-wide Curious Mind Award in the area of 'AI, Digitalization, and Robotics', the Hermann Neuhaus Prize of the Max Planck Society, and two postdoctoral fellowships awarded by the Swiss National Science Foundation. He is interested in developing original computer science methods for both engineering design and advanced manufacturing.

Talk 3 (1:00 pm-1:40 pm): Sven Koenig, Department of Computer Science, University of Southern California
Title: Multi-Agent Path Finding and Its Applications
Abstract: The coordination of robots and other agents is becoming more and more important for industry. For example, on the order of one thousand robots already navigate autonomously in Amazon fulfillment centers, moving inventory pods all the way from their storage locations to the picking stations that need the products they store (and vice versa). Optimal, and even some approximately optimal, path planning for these robots is NP-hard, yet one must find high-quality collision-free paths for them in real time. Algorithms for such multi-agent path-finding problems have long been studied in robotics and theoretical computer science, but they are insufficient: they are either fast but of poor solution quality, or of good solution quality but too slow. In this talk, I will discuss different variants of multi-agent path-finding problems, cool ideas for both solving them and executing the resulting plans robustly, and several of their applications. Our research on this topic has been funded by both NSF and Amazon Robotics.
Bio: Sven Koenig is a professor of computer science at the University of Southern California. Most of his current research focuses on planning for single agents (such as robots) and multi-agent systems. Additional information about him can be found on his webpage: idm-lab.org.

Talk 4 (1:40 pm-2:20 pm): Jiaoyang Li, Robotics Institute, Carnegie Mellon University
Title: Layout Design for Large-Scale Multi-Robot Coordination
Abstract: Today, thousands of robots navigate autonomously in warehouses, transporting goods from one location to another. While numerous planning algorithms have been developed to coordinate robots more efficiently and robustly, warehouse layouts remain largely unchanged: they still adhere to the traditional pattern designed for human workers rather than robots. In this talk, I will share our recent progress in exploring layout design and optimization to enhance large-scale multi-robot coordination. I will first introduce a direct layout design method, followed by a method that optimizes layout generators instead of layouts. I will then extend these ideas to virtual layout design, which does not require changes to the physical world that robots navigate and thus has potential applications beyond automated warehouses.
Bio: Jiaoyang Li is an assistant professor at the Robotics Institute of the CMU School of Computer Science. She received her Ph.D. in computer science from the University of Southern California (USC) in 2022. Her research interests lie in the coordination of large robot teams.
Her research has been recognized through prestigious paper awards (e.g., best student paper, best demo, and a best student paper nomination at ICAPS in 2020, 2021, and 2023, along with a best paper finalist at MRS in 2023) and competition championships (e.g., winning the NeurIPS Flatland Challenge in 2020, the Flatland 3 Challenge in 2021, and the League of Robot Runners, sponsored by Amazon Robotics, in 2023). Her Ph.D. dissertation also received best dissertation awards from ICAPS, AAMAS, and USC in 2023.
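A core idea in Lelis's talk, that a network with piecewise linear activations is equivalent to a program of if-then-else statements, can be illustrated with a toy sketch. The two-unit ReLU network below is a hypothetical example for illustration, not a model from the talk:

```python
# A tiny ReLU network: f(x) = 0.5*relu(2x + 1) - relu(-x + 3) + 4.
def net(x):
    h1 = max(0.0, 2 * x + 1)   # hidden unit 1 (ReLU)
    h2 = max(0.0, -x + 3)      # hidden unit 2 (ReLU)
    return 0.5 * h1 - 1.0 * h2 + 4

# The same function written as an if-then-else program: each ReLU
# becomes a branch, and each branch body is a linear sub-program
# with the same input type as the original network.
def program(x):
    out = 4.0
    if 2 * x + 1 > 0:          # sub-program for hidden unit 1
        out += 0.5 * (2 * x + 1)
    if -x + 3 > 0:             # sub-program for hidden unit 2
        out -= 1.0 * (-x + 3)
    return out

# The two forms agree everywhere.
assert all(abs(net(x) - program(x)) < 1e-9 for x in [-5, -0.5, 0, 1, 2.5, 10])
```

In the talk's setting, branches of such a program extracted from a policy network, rather than the whole network, become the reusable options.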
February 2, 2024
We are thrilled to announce a special VCR event: the SFU-UBC Visual Computing Meeting, which will be held on the SFU Burnaby campus on Feb 2nd from 10:00 am to 3:00 pm. This special session will feature professors and students from the UBC and SFU visual computing communities for a day of engaging discussions and networking.

Schedule
– 10:00 am → 12:00 pm: talks (Sessions 1 and 2)
– 12:00 pm → 2:00 pm: lunch and posters
– 2:00 pm → 3:00 pm: talks (Session 3)
January 26, 2024
Title: Frontiers in Embodied AI for Autonomous Driving Abstract: Over the last decade, fundamental advances in AI have driven unprecedented progress across many disciplines and applications. And yet, despite significant progress, autonomous vehicles are still far from mainstream, even after billions of dollars of investment. In this talk, we'll explore what has been holding progress back, and how, by adopting a modern embodied AI approach to the problem, Wayve is finally unlocking the potential of autonomous driving in complex and unstructured urban environments such as central London. We'll also explore some of our latest research in multimodal learning to combine the power of large language models with the driving problem ("LINGO-1"), and in generative world models as learned simulators trained to predict the future conditioned on ego action ("GAIA-1"). Bio: Jamie Shotton is a leader in AI research and development, with a track record of incubating transformative new technologies and experiences from early-stage research to shipping products. He is Chief Scientist at Wayve, building foundation models for embodied intelligence, such as GAIA and LINGO, to enable safe and adaptable autonomous vehicles. Prior to this, he was Partner Director of Science at Microsoft and head of the Mixed Reality & AI Labs, where he shipped foundational features including body tracking for Kinect and the hand- and eye-tracking that enable HoloLens 2's instinctual interaction model. He has explored applications of AI in autonomous driving, mixed reality, virtual presence, human-computer interaction, gaming, robotics, and healthcare. He has received multiple Best Paper and Best Demo awards at top-tier academic conferences, as well as the Longuet-Higgins Prize test-of-time award at CVPR 2021. His work on Kinect was awarded the Royal Academy of Engineering's gold medal MacRobert Award in 2011, and he shares Microsoft's Outstanding Technical Achievement Award for 2012 with the Kinect engineering team.
In 2014 he received the PAMI Young Researcher Award, and in 2015 the MIT Technology Review Innovator Under 35 Award. He was awarded the Royal Academy of Engineering’s Silver Medal in 2020. He was elected a Fellow of the Royal Academy of Engineering in 2021.
January 15, 2024
Congratulations to Professor Richard Zhang on being recognized as a 2024 IEEE Fellow for his "contributions to shape analysis and synthesis in visual computing." IEEE is the world's largest technical professional organization dedicated to advancing technology for the benefit of humanity. It has more than 450,000 members in more than 190 countries and is a leading authority in many areas, including engineering, computing, and technology information. Less than one-tenth of one percent of IEEE members worldwide are selected as Fellows in any given year. Fellow status is awarded to individuals with "an outstanding record of accomplishments in any of the IEEE fields of interest." Please check out the SFU news coverage of his IEEE Fellow elevation.
November 11, 2023
SIGGRAPH Asia, the premier conference on computer graphics, will be held in Sydney, Australia this year (Dec 12-15). The GrUVi lab will once again have a good showing at SIGGRAPH Asia, with 5 technical papers. Congratulations to all the authors! Here are the 5 accepted papers:

Intrinsic Harmonization for Illumination-Aware Compositing
ShaDDR: Real-Time Example-Based Geometry and Texture Generation via 3D Shape Detailization and Differentiable Rendering
CLIPXPlore: Coupled CLIP and Shape Spaces for 3D Shape Exploration
Interaction-Driven Active 3D Reconstruction with Object Interiors
Neural Packing for Real: from Visual Sensing to Reinforcement Learning
November 6, 2023
NeurIPS, the premier conference on machine learning, will be held in New Orleans this year (Dec 10-16). The GrUVi lab will once again have a good showing at NeurIPS, with 7 technical papers and 1 paper in the Datasets and Benchmarks track! Please refer to our publication page for more details.
November 3, 2023
Click this link to see the talk replay. Title: Co-speech Gesture Generation Abstract: Gestures accompanying speech are essential to natural and efficient embodied human communication. The automatic generation of such co-speech gestures is a long-standing problem in computer animation. It is considered an enabling technology in film, games, virtual social spaces, and interaction with social robots. The problem is made challenging by the idiosyncratic and non-periodic nature of human co-speech gesture motion, and by the great diversity of communicative functions that gestures encompass. Gesture generation has seen surging interest recently, owing to the emergence of more and larger datasets of human gesture motion, combined with strides in deep-learning-based generative models that benefit from the growing availability of data. This talk will review the development of co-speech gesture generation research, focusing on deep generative models. Bio: Dr. Taras Kucherenko is currently a Research Scientist at Electronic Arts. He completed his Ph.D. at the KTH Royal Institute of Technology in Stockholm in 2021. His research is on machine learning models for non-verbal behavior generation, such as hand gestures and facial expressions. For his research papers, he received the ICMI 2020 Best Paper Award and the IVA 2020 Best Paper Award. Taras was also the main organizer of the GENEA (Generation and Evaluation of Non-verbal Behavior for Embodied Agents) Workshop and Challenge in 2020, 2021, 2022, and 2023.
October 27, 2023
Click this link to see the talk replay. Title: Quantum Computing for Robust Fitting Abstract: Many computer vision applications need to recover structure from imperfect measurements of the real world. The task is often solved by robustly fitting a geometric model onto noisy and outlier-contaminated data. However, relatively recent theoretical analyses indicate that many commonly used formulations of robust fitting in computer vision are not amenable to tractable solution and approximation. In this work, we explore the use of quantum computers for robust fitting. To do so, we examine the feasibility of two types of quantum computer technology, universal gate quantum computers and quantum annealers, for solving robust fitting. Novel algorithms that are amenable to the quantum machines have been developed, and experimental results on current noisy intermediate-scale quantum (NISQ) computers will be reported. Our work thus offers one of the first quantum treatments of robust fitting for computer vision. Bio: Tat-Jun (TJ) Chin is the SmartSat CRC Professorial Chair of Sentient Satellites at The University of Adelaide. He received his PhD in Computer Systems Engineering from Monash University in 2007, partly supported by the Endeavour Australia-Asia Award, and a Bachelor's in Mechatronics Engineering from Universiti Teknologi Malaysia in 2004, where he won the Vice Chancellor's Award. TJ's research interests lie in computer vision and machine learning for space applications. He has published close to 200 research articles and has won several awards for his research, including a CVPR award (2015), a BMVC award (2018), Best of ECCV (2018), three DST Awards (2015, 2017, 2021), an IAPR Award (2019), and an RAL Best Paper Award (2021). TJ pioneered the AI4Space Workshop series and is an Associate Editor of the International Journal of Robotics Research (IJRR) and the Journal of Mathematical Imaging and Vision (JMIV).
He was a finalist in the Academic of the Year category at the Australian Space Awards 2021.
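For readers unfamiliar with the problem the talk tackles, classical robust fitting by consensus maximization can be sketched with a minimal RANSAC-style loop for 2D line fitting. This is an illustrative toy on made-up data, not the quantum formulation from the talk:

```python
import random

def ransac_line(points, n_iters=200, inlier_tol=0.1, seed=0):
    """Fit y = a*x + b to (x, y) points by maximizing the inlier count."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_iters):
        # Hypothesize a model from a minimal sample of two points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line; skip this hypothesis
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Score by consensus: how many points the model explains.
        inliers = sum(1 for x, y in points if abs(a * x + b - y) <= inlier_tol)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# 8 points exactly on y = 2x + 1, plus 2 gross outliers.
pts = [(x, 2 * x + 1) for x in range(8)] + [(1, 30), (5, -20)]
(a, b), n = ransac_line(pts)
# → recovers a == 2.0, b == 1.0 with 8 of the 10 points as inliers
```

The theoretical hardness the abstract mentions is that, unlike this heuristic loop, *guaranteed* consensus maximization is intractable in general, which is what motivates exploring quantum hardware.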
October 20, 2023
Click this link to see the talk replay. Title: Efficient, Less-Biased, and Creative Visual Learning Abstract: In this talk, I will discuss recent methods from my group that address some of the core challenges of current visual and multi-modal cognition, including efficient learning, bias, and user-controlled generation. Centering on these larger themes, I will talk about a number of strategies (and corresponding papers) that we developed to address these challenges. I will start by discussing transfer learning techniques in the context of semi-supervised object detection and segmentation, highlighting a model that is applicable to a range of supervision: from zero to a few instance-level samples per novel class. I will then talk about our recent work on building a foundational image representation model by combining two successful strategies: masking and sequential token prediction. I will also discuss some of our work on scene graph generation which, in addition to improving overall performance, allows for scalable inference and the ability to control data bias (by trading minor declines on the most common classes for major improvements on rare classes). The talk will end with some of our recent work on generative modeling, which focuses on novel-view synthesis and language-conditioned diffusion-based story generation. At the core of the latter approach is a visual memory that implicitly captures the actor and background context across the generated frames. Sentence-conditioned soft attention over the memories enables effective reference resolution and learns to maintain scene and actor consistency when needed. Bio: Prof. Leonid Sigal is a Professor at the University of British Columbia (UBC). He was appointed a CIFAR AI Chair at the Vector Institute in 2019 and an NSERC Tier 2 Canada Research Chair in Computer Vision and Machine Learning in 2018. Prior to this, he was a Senior Research Scientist, and a group lead, at Disney Research.
He completed his Ph.D. at Brown University in 2008; he received his B.Sc. degrees in Computer Science and Mathematics from Boston University in 1999, his M.A. from Boston University in 1999, and his M.S. from Brown University in 2003. He was a Postdoctoral Researcher at the University of Toronto from 2007 to 2009. Leonid's research interests lie in the areas of computer vision, machine learning, and computer graphics, with an emphasis on approaches for visual and multi-modal representation learning, recognition, understanding, and generative modeling. He has won a number of prestigious research awards, including a Killam Accelerator Fellowship in 2021, and has published over 100 papers in venues such as CVPR, ICCV, ECCV, NeurIPS, ICLR, and SIGGRAPH.
October 13, 2023
Click this link to see the talk replay. Title: Visual Human Motion Analysis Abstract: Recent advances in imaging sensors and deep learning techniques have opened the door to many interesting applications for the visual analysis of human motion. In this talk, I will discuss our research efforts toward addressing the related tasks of 3D human motion synthesis, pose and shape estimation from images and videos, and visual action quality assessment. Looking forward, our results could be applied to everyday-life scenarios such as natural user interfaces, AR/VR, robotics, and gaming, among others. Bio: Li Cheng is a professor in the Department of Electrical and Computer Engineering, University of Alberta. He is an associate editor of IEEE Transactions on Multimedia and the Pattern Recognition journal. Prior to joining the University of Alberta, he worked at A*STAR, Singapore; TTI-Chicago, USA; and NICTA, Australia. His current research interests are mainly in human motion analysis, mobile and robot vision, and machine learning. More details can be found at http://www.ece.ualberta.ca/~lcheng5/.
September 18, 2023
ICCV, the premier conference on computer vision, will be held in Paris this year (Oct 2-6). The GrUVi lab will once again have a good showing at ICCV, with 6 technical papers and 3 co-organized workshops! Also, Prof. Yasutaka Furukawa serves as a program chair for this year's ICCV. Among the workshops, Prof. Richard Zhang co-organizes the 3D Vision and Modeling Challenges in eCommerce workshop, and Prof. Angel Chang co-organizes the 3rd Workshop on Language for 3D Scenes and CLVL: 5th Workshop on Closing the Loop between Vision and Language. In addition, Prof. Manolis Savva will give a talk at the 1st Workshop on Open-Vocabulary 3D Scene Understanding. Here are the 6 accepted papers:

Multi3DRefer: Grounding Text Description to Multiple 3D Objects
DS-Fusion: Artistic Typography via Discriminated and Stylized Diffusion
SKED: Sketch-guided Text-based 3D Editing
PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects
HAL3D: Hierarchical Active Learning for Fine-Grained 3D Part Labeling
UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding

Congratulations to the authors!
June 26, 2023
The recording is available at this link. Title: Towards Controllable 3D Content Creation by Leveraging Geometric Priors Abstract: The growing popularity of extended reality is pushing the demand for the automatic creation and synthesis of new 3D content, which would otherwise be a tedious and laborious process. A key property needed to make 3D content creation useful is user controllability, as it allows one to realize specific ideas. User control can take various forms, e.g., target scans, input images, or programmatic edits. In this talk, I will touch on works that enable user control through i) object parts and ii) sparse scene images, by leveraging geometric priors. The former utilizes semantic object priors by proposing a novel shape-space factorization through a cross-diffusion network, enabling multiple applications in both shape generation and editing. The latter leverages models pretrained on large 2D datasets for sparse-view 3D NeRF reconstruction of scenes, by learning a distribution over geometry represented as ambiguity-aware depth estimates. As an add-on, we will also briefly revisit the volume rendering equation in NeRFs and reformulate it with a piecewise linear density, which alleviates underlying issues caused by quadrature instability. Bio: Mika is a fourth-year PhD student at Stanford advised by Leo Guibas. Her research focuses on the representation and generation of objects and scenes for user-controllable 3D content creation. She was a research intern at Adobe, Autodesk, and now Google, and is generously supported by the Apple AI/ML PhD Fellowship and the Snap Research Fellowship.
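For context, the volume rendering quadrature that the last part of the talk revisits is, in its standard NeRF form (which assumes a piecewise constant density within each ray segment):

```latex
\hat{C}(\mathbf{r}) \;=\; \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i \;=\; \exp\!\Big(-\sum_{j=1}^{i-1} \sigma_j \delta_j\Big)
```

where, along a ray r, σ_i and c_i are the density and color of sample i and δ_i is the distance between adjacent samples. The reformulation mentioned in the talk replaces the piecewise constant density assumption with a piecewise linear one.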
August 4, 2023
We are proud to highlight that three of Prof. Jason Peng's research papers will be presented at the upcoming SIGGRAPH 2023. These papers mark advances in physics-based character animation. Below are the titles and links to the related project pages:

Learning Physically Simulated Tennis Skills from Broadcast Videos: https://xbpeng.github.io/projects/Vid2Player3D/index.html
Synthesizing Physical Character-Scene Interactions: https://xbpeng.github.io/projects/InterPhys/index.html
CALM: Conditional Adversarial Latent Models for Directable Virtual Characters: https://xbpeng.github.io/projects/CALM/index.html

Note: The SIGGRAPH conference, short for Special Interest Group on Computer GRAPHics and Interactive Techniques, is the world's premier annual event for showcasing the latest innovations in computer graphics and interactive techniques. It brings together researchers, artists, developers, filmmakers, scientists, and business professionals from around the globe, offering a unique blend of educational sessions, hands-on workshops, and exhibitions of cutting-edge technology and applications.
June 14, 2023
CVPR, the premier conference on computer vision, will be held in Vancouver this year (June 18-22). The GrUVi lab will once again have an incredible showing at CVPR, with 12 technical papers, 6 invited talks, and 4 co-organized workshops!

Conference and workshop co-organization
Former GrUVi Professor Greg Mori serves as one of the four general chairs of the main CVPR conference! Prof. Angel Chang, as one of the social activity chairs, is helping to organize the speed mentoring sessions. In addition, we have exciting workshops and challenges organized by GrUVi members:

Computer Vision in the Built Environment workshop, co-organized by Prof. Yasutaka Furukawa
Second Workshop on Structural and Compositional Learning on 3D Data (Struco3D), co-organized by Prof. Richard Zhang
ScanNet Indoor Scene Understanding Challenge, co-organized by Prof. Angel X. Chang and Prof. Manolis Savva
Embodied AI Workshop, featuring a variety of challenges including the Multi-Object Navigation (MultiON) challenge, co-organized by Sonia Raychaudhuri, Angel Chang, and Manolis Savva

Workshop talks
Prof. Andrea Tagliasacchi is invited to give keynote talks at both Struco3D and Generative Models for Computer Vision (both on June 18th). He will also give a spotlight talk at the Area Chair workshop on Saturday. Prof. Angel Chang will give a talk at the Women in Computer Vision workshop on June 19th. She is also invited to give talks at the workshops on 3D Vision and Robotics (June 18th), Compositional 3D Vision (June 18th), and Open-Domain Reasoning Under Multi-Modal Settings (June 19th).

Technical papers
Congratulations to all authors of the accepted papers! The full list of papers featured at CVPR 2023 can be accessed here.