Animating a virtual character from an actor’s real performance is a challenging task that currently requires expensive motion capture setups and additional effort by expert animators, making it accessible only to large production houses. The goal of our work is to democratize this task by developing a frugal alternative, termed “Transfer4D”, that uses only commodity depth sensors and further reduces animators’ effort by automating the rigging and animation transfer process. To handle the sparse, incomplete data from depth video inputs and the large variations between source and target objects, we propose to use skeletons as an intermediary representation between motion capture and transfer. We propose a novel skeleton extraction pipeline from a single-view depth sequence that incorporates additional geometric information, resulting in superior performance in motion reconstruction and transfer compared with contemporary methods. We use non-rigid reconstruction to track motion from the depth sequence and then rig the source object using skinning decomposition. Finally, the rig is embedded into the target object for motion retargeting.
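
For readers unfamiliar with skinning decomposition, the rigging step can be illustrated with a commonly used linear blend skinning objective. This is a sketch of the general technique under assumed notation, not necessarily the exact formulation used in Transfer4D: $p_i$ denotes the rest position of vertex $i$, $v_i^t$ its tracked position in frame $t$ (here assumed to come from the non-rigid reconstruction), $w_{ij}$ the weight tying vertex $i$ to bone $j$, and $(R_j^t, T_j^t)$ the rigid transform of bone $j$ in frame $t$. The counts $N$ (vertices), $B$ (bones), and $F$ (frames) are likewise illustrative symbols.

```latex
% Assumed notation (not from the abstract): p_i, v_i^t, w_{ij}, R_j^t, T_j^t, N, B, F.
\begin{aligned}
\min_{w,\,R,\,T}\quad
  & \sum_{t=1}^{F} \sum_{i=1}^{N}
    \left\| \, v_i^{t} - \sum_{j=1}^{B} w_{ij} \left( R_j^{t}\, p_i + T_j^{t} \right) \right\|^{2} \\
\text{subject to}\quad
  & w_{ij} \ge 0, \qquad \sum_{j=1}^{B} w_{ij} = 1 \quad \forall i, \\
  & R_j^{t} \in SO(3) \quad \forall j, t .
\end{aligned}
```

Under this reading, the recovered bone transforms supply the per-frame skeleton motion and the weights bind the surface to the bones, which is what makes a skeleton a convenient intermediary for embedding into a different target shape.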