DeepLabCut: markerless pose estimation of user-defined body parts with deep learning

Mathis et al., Nature Neuroscience (2018): rdcu.be/4Rep

Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist with computer-based tracking, but markers are intrusive, and the number and location of the markers must be determined a priori. Here we present an efficient method for markerless pose estimation based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in multiple species across a broad collection of behaviors. Remarkably, even when only a small number of frames are labeled (~200), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.


Preprint: https://arxiv.org/abs/1804.03142v1
Open Source Code: https://alexemg.github.io/DeepLabCut/ 
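
Below is a minimal sketch of the workflow described above (label a modest number of frames, train via transfer learning, then analyze new videos). It uses the Python API of the released deeplabcut package rather than the original step-by-step scripts in the repository linked above; the project name and video paths are placeholders.

    # Minimal DeepLabCut workflow sketch (assumes the `deeplabcut` package
    # is installed; project name and video paths are placeholders).
    import deeplabcut

    # 1. Create a project and extract frames to label (on the order of
    #    ~200 labeled frames was sufficient for the behaviors in the paper).
    config = deeplabcut.create_new_project(
        "reaching", "experimenter", ["videos/session1.avi"], copy_videos=True
    )
    deeplabcut.extract_frames(config, mode="automatic", algo="kmeans")

    # 2. Manually label the user-defined body parts in the labeling GUI.
    deeplabcut.label_frames(config)

    # 3. Build the training set and train (transfer learning from an
    #    ImageNet-pretrained ResNet backbone).
    deeplabcut.create_training_dataset(config)
    deeplabcut.train_network(config)

    # 4. Evaluate on held-out test frames, then analyze new videos.
    deeplabcut.evaluate_network(config)
    deeplabcut.analyze_videos(config, ["videos/session2.avi"])
    deeplabcut.create_labeled_video(config, ["videos/session2.avi"])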

[Example images and GIFs: exampleresult.png, file0000.png, file354.png, MousereachGIF.gif, MATHIS_2018_fly.gif, MATHIS_2018_odortrail.gif]

For more information:
Alexander Mathis - alexander.mathis@bethgelab.org
Mackenzie Mathis - mackenzie@post.harvard.edu


Case Study 1: 95 images were used to train DeepLabCut to predict 22 labels on the chestnut horse (video 1). Automatic labeling was then performed on the full video of the chestnut horse and on a previously unseen brown horse (video 2).

For video 3, the network first trained on the chestnut horse (video 1) was extended with only 11 labeled frames of Justify on a race track, briefly re-trained, and then used to automatically label the full video. Note the differences in background and viewpoint, as well as the different relative size of the horse, in video 3 vs. video 1.
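
A hedged sketch of this kind of extension, again using the released deeplabcut Python API (the project path, video name, and iteration count are placeholders): the new video is registered with the existing horse project, a handful of frames are labeled, the training set is rebuilt, and the network is briefly re-trained before analyzing the full video.

    # Sketch: extending an existing project with a few frames from a new video.
    import deeplabcut

    config = "horse-project/config.yaml"   # project already trained on video 1

    # Register the new video and label a small number of frames (~11 here).
    deeplabcut.add_new_videos(config, ["videos/justify_track.mp4"], copy_videos=True)
    deeplabcut.extract_frames(config, mode="automatic", algo="kmeans")
    deeplabcut.label_frames(config)

    # Rebuild the training set (now covering both horses) and re-train briefly;
    # pointing init_weights in the generated pose_cfg.yaml at the previous
    # snapshot is one way to warm-start from the earlier network.
    deeplabcut.create_training_dataset(config)
    deeplabcut.train_network(config, maxiters=50000)

    # Apply the updated network to the full new video.
    deeplabcut.analyze_videos(config, ["videos/justify_track.mp4"])
    deeplabcut.create_labeled_video(config, ["videos/justify_track.mp4"])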

Walking horses: data and human annotation by Byron Rogers of Performance Genetics

video 1: Chestnut horse | video 2: Brown horse | video 3: Justify track practice


Case Study 2: Rat skilled reaching assay from Dr. Daniel Leventhal's group at the University of Michigan. The data were collected during an automated pellet-reaching task and labeled by Dr. Leventhal. We used 180 labeled frames for training.


Case Study 3: Left: Mouse locomotion. Data, labeling, DeepLabCut training & video generation by Rick Warren in Dr. Nate Sawtell's lab at Columbia University. Shown here are the 3D movements of a head-fixed mouse running on a treadmill, captured with a single camera plus a mirror (providing two views). One network was trained to detect the body parts in both views simultaneously. He used 825 frames for training (fewer labels would give similar performance).

Right: Electric fish freely swimming with a tether. Data, labeling, DeepLabCut training & video generation by Avner Wallach, a postdoc in the Sawtell lab. He used 250 frames for training.

 
[GIFs: MouseLocomotion_warren.gif, fish_wallach.gif]
 

Case Study 4: James Bonaiuto, PhD (a postdoctoral fellow in the group of Dr. Pier F. Ferrari at the Institut des Sciences Cognitives, CNRS) trained three networks, one per camera view, with ~120 training frames per view. The 3D trajectories were extracted by using the camera calibration functionality in OpenCV to compute a projection matrix for each camera and then using these matrices to reconstruct the 3D coordinates from the labeled 2D points in each view.
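
A sketch of that reconstruction step with OpenCV, assuming calibration has already produced a 3x4 projection matrix per camera and that the 2D DeepLabCut coordinates of a body part are available from two of the views (all array and function names here are placeholders):

    # Reconstruct 3D points from two calibrated views with OpenCV.
    # P1, P2: 3x4 projection matrices from camera calibration.
    # pts1, pts2: (N, 2) arrays of 2D coordinates of one body part over time.
    import cv2
    import numpy as np

    def reconstruct_3d(P1, P2, pts1, pts2):
        pts1 = np.asarray(pts1, dtype=np.float64).T      # -> (2, N), as OpenCV expects
        pts2 = np.asarray(pts2, dtype=np.float64).T
        X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)  # (4, N) homogeneous coords
        X = X_h[:3] / X_h[3]                             # normalize to Euclidean
        return X.T                                       # (N, 3) 3D trajectory

With three cameras, one can triangulate from any pair of views (or solve a least-squares triangulation over all three) and, for example, keep the pair with the highest detection confidence in each frame.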

 
[GIF: JamesB_humantracking.gif]
 

Case Study 5: Open field with objects and a patch cable. Korleki Akiti, a PhD student in the laboratory of Prof. Nao Uchida at Harvard University, labeled data for mice in an open-field setting, and we then tested DeepLabCut's ability to track four body parts on a mouse (snout, ears, and tail base) under different lighting conditions. A test image under normal lighting is shown on the left, and two challenging examples are shown in the middle and right panels:

[Test image (normal lighting); GIFs: Uchida_normal.gif, Uchida_dark.gif]