(* denotes equal contribution)
Human grasping synthesis has numerous applications including AR/VR, video games and robotics. While some methods have been proposed to generate realistic hand-object interaction for object grasping and manipulation, they typically only consider the hand interacting with objects. In this work, our goal is to synthesize whole-body grasping motion. Given a 3D object, we aim to generate diverse and natural whole-body human motions that approach and grasp the object. This task is challenging as it requires modeling both whole-body dynamics and dexterous finger movements. To this end, we propose SAGA (StochAstic whole-body Grasping with contAct) which consists of two key components: (a) Static whole-body grasping pose generation. Specifically, we propose a multi-task generative model, to jointly learn static whole-body grasping ending poses and human-object contacts. (b) Grasping motion infilling. Given an initial pose and the generated whole-body grasping pose as the starting and ending poses of the motion respectively, we design a novel contact-aware generative motion infilling module to generate a diverse set of grasp-oriented motions. We demonstrate the effectiveness of our method being the first generative framework to synthesize realistic and expressive whole-body motions that approach and grasp randomly placed unseen objects.