Ross B. Girshick
Postdoctoral fellow
University of California, Berkeley, EECS
r......@eecs.berkeley.edu
cv / google scholar / Ph.D. thesis

About me

I finished my Ph.D. in computer vision at The University of Chicago under the supervision of Pedro Felzenszwalb in April 2012. Now, I'm a postdoctoral fellow working with Jitendra Malik at UC Berkeley.

My main research interests are in computer vision, AI, and machine learning. I'm particularly focused on building models for object detection and recognition. These models aim to incorporate the "right" biases so that machine learning algorithms can understand image content from moderate to large-scale datasets. I always have an eye towards fast systems that work well in practice.

During my Ph.D., I spent time as a research intern at Microsoft Research Cambridge, UK working on human pose estimation from (Kinect) depth images. I also participated in several first-place entries in the PASCAL VOC object detection challenge, and was awarded a "lifetime achievement" prize for my work on deformable part models. I think this refers to the lifetime of the PASCAL challenge, and not mine!

I'm on the faculty job market

cv / research statement / teaching statement / google scholar

News, recent and upcoming talks

Project pages

Refereed journal papers

Efficient Human Pose Estimation from Single Depth Images
J. Shotton, R. Girshick, A. Fitzgibbon, T. Sharp, M. Cook, M. Finocchio, R. Moore, P. Kohli, A. Criminisi, A. Kipman, A. Blake
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, No. 12, Dec. 2013
bibtex
@article{shotton2013kinect,
  Author    = {J. Shotton and R. Girshick and A. Fitzgibbon and T. Sharp and
               M. Cook and M. Finocchio and R. Moore and P. Kohli and 
               A. Criminisi and A. Kipman and A. Blake},
  Title     = {Efficient Human Pose Estimation from Single Depth Images},
  Volume    = {35},
  Number    = {12},
  Journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  Year      = {2013}}
    
An integrated description of the original Kinect pose estimation algorithm and our ICCV 2011 algorithm.
Object Detection with Discriminatively Trained Part Based Models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, Sep. 2010
code: PAMI code / latest code (voc-release5) / bibtex
@article{felzenszwalb2010dpm,
  Author    = {P. Felzenszwalb and R. Girshick and D. McAllester and D. Ramanan},
  Title     = {Object Detection with Discriminatively Trained Part Based Models},
  Volume    = {32},
  Number    = {9},
  Journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  Year      = {2010}}
    
Deformable part models (DPM). Also, CACM Research Highlight: Visual Object Detection with Deformable Part Models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
Communications of the ACM, Vol. 56, No. 9, Sep. 2013, pp. 97-105

Refereed conference papers

Rich feature hierarchies for accurate object detection and semantic segmentation
R. Girshick, J. Donahue, T. Darrell, J. Malik
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
oral presentation
supplement / code / arXiv tech report / bibtex
@inproceedings{girshick2014rcnn,
  Author    = {R. Girshick and J. Donahue and T. Darrell and J. Malik},
  Title     = {Rich feature hierarchies for accurate 
               object detection and semantic segmentation},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
This paper proposes R-CNN, a state-of-the-art visual object detection system that combines bottom-up region proposals with rich features computed by a convolutional neural network. At the time of its release, R-CNN improved the previous best detection performance on PASCAL VOC 2012 by 30% relative, going from 40.9% to 53.3% mean average precision. Unlike the previous best results, R-CNN achieves this performance without using contextual rescoring or an ensemble of feature types.
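As a quick check of the "30% relative" figure above, the short sketch below recomputes the absolute and relative gains from the two mAP numbers quoted in the text. The function name `relative_improvement` is only an illustrative helper and is not part of the R-CNN code release.

```python
def relative_improvement(old: float, new: float) -> float:
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


# mAP values (in percent) quoted in the paragraph above.
previous_best_map = 40.9
rcnn_map = 53.3

absolute_gain = rcnn_map - previous_best_map
gain = relative_improvement(previous_best_map, rcnn_map)

print(f"absolute gain: {absolute_gain:.1f} points")  # absolute gain: 12.4 points
print(f"relative gain: {gain:.1%}")                  # relative gain: 30.3%
```

The absolute gain is 12.4 mAP points, which is about 30% of the previous best of 40.9%.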
Using k-poselets for detecting people and localizing their keypoints [coming soon]
G. Gkioxari*, B. Hariharan*, R. Girshick, J. Malik
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
* equal contribution
code / bibtex
@inproceedings{gkioxari2014kposelets,
  Author    = {G. Gkioxari and B. Hariharan and R. Girshick and J. Malik},
  Title     = {Using {k-poselets} for detecting people 
               and localizing their keypoints},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
Understanding Objects in Detail with Fine-grained Attributes [coming soon]
A. Vedaldi, S. Mahendran, S. Tsogkas, S. Maji, R. Girshick, J. Kannala, E. Rahtu, I. Kokkinos, M. B. Blaschko, D. Weiss, B. Taskar, K. Simonyan, N. Saphra, S. Mohamed
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
bibtex
@inproceedings{mahendran14understanding,
  Author    = {A. Vedaldi and
               S. Mahendran and
               S. Tsogkas and
               S. Maji and
               R. Girshick and
               J. Kannala and
               E. Rahtu and
               I. Kokkinos and
               M. B. Blaschko and
               D. Weiss and
               B. Taskar and
               K. Simonyan and
               N. Saphra and
               S. Mohamed},
  Title     = {Understanding Objects in Detail with Fine-grained Attributes},
  Booktitle = {Proceedings of the {IEEE} Conference
               on Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
Training Deformable Part Models with Decorrelated Features
R. Girshick, J. Malik
IEEE International Conference on Computer Vision (ICCV), 2013
supplement / LM-LLDA DPM training code / bibtex
@inproceedings{girshick2013training,
  Author    = {R. Girshick and J. Malik},
  Title     = {Training Deformable Part Models with Decorrelated Features},
  Booktitle = {Proceedings of the International
               Conference on Computer Vision ({ICCV})},
  Year      = {2013}}
    
Ever wonder what makes DPM tick? We dissect DPM training to figure out what bits are important. Also, we present Latent LDA, a fast method to train DPMs without using hard negative examples.
Discriminatively Activated Sparselets
R. Girshick*, H. O. Song*, T. Darrell
International Conference on Machine Learning (ICML), 2013
oral presentation
supplement / Caltech-101 demo code / bibtex
@inproceedings{girshick13das,
  Author    = {R. Girshick and H. O. Song and T. Darrell},
  Title     = {Discriminatively Activated Sparselets},
  Booktitle = {Proceedings of the International
               Conference on Machine Learning ({ICML})},
  Year      = {2013}}
    
Speed up a wide array of structured predictors (including DPMs and multiclass SVMs) by discriminatively learning activations over a dictionary of model atoms. Same speedup as the ECCV 2012 sparselets paper, but with much higher accuracy.
Sparselet Models for Efficient Multiclass Object Detection
H.O. Song, S. Zickler, T. Althoff, R. Girshick, M. Fritz, C. Geyer, P. Felzenszwalb, T. Darrell
European Conference on Computer Vision (ECCV), 2012
bibtex
@inproceedings{song2012sparselet,
  Author    = {H. O. Song and S. Zickler and T. Althoff and R. Girshick and 
               M. Fritz and C. Geyer and P. Felzenszwalb and T. Darrell},
  Title     = {Sparselet Models for Efficient Multiclass Object Detection},
  Booktitle = {Proceedings of the European 
               Conference on Computer Vision ({ECCV})},
  Year      = {2012}}
    
Fast DPM detection by sparse coding model parameters.
Object Detection with Grammar Models
R. Girshick, P. Felzenszwalb, D. McAllester
Neural Information Processing Systems (NIPS), 2011
spotlight video / code (voc-release5) / bibtex
@inproceedings{girshick2011grammar,
  Author    = {R. Girshick and P. Felzenszwalb and D. McAllester},
  Title     = {Object Detection with Grammar Models},
  Booktitle = {Proceedings of Advances in Neural
               Information Processing Systems ({NIPS})},
  Year      = {2011}}
    
State-of-the-art person detection on PASCAL VOC using DPM-like models described in a grammar framework.
Efficient Regression of General-Activity Human Poses from Depth Images
R. Girshick, J. Shotton, P. Kohli, A. Criminisi, A. Fitzgibbon
IEEE International Conference on Computer Vision (ICCV), 2011
supplement / video / bibtex
@inproceedings{girshick2011efficient,
  Author    = {R. Girshick and J. Shotton and P. Kohli and 
               A. Criminisi and A. Fitzgibbon},
  Title     = {Efficient Regression of General-Activity Human Poses from Depth Images},
  Booktitle = {Proceedings of the International
               Conference on Computer Vision ({ICCV})},
  Year      = {2011}}
    
Pose estimation using the Kinect depth sensor. Faster (4x) and more accurate than the original Kinect pose estimation algorithm (Shotton et al. CVPR 2011).
Cascade Object Detection with Deformable Part Models
P. Felzenszwalb, R. Girshick, D. McAllester
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010
oral presentation
slides (pdf) / slides (keynote) / talk / code (voc-release5) / bibtex
@inproceedings{felzenszwalb2010cascade,
  Author    = {P. Felzenszwalb and R. Girshick and D. McAllester},
  Title     = {Cascade Object Detection with Deformable Part Models},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2010}}
    
Fast cascade algorithm for DPM detection (about 14x faster than the baseline).
Visibility Constraints on Features of 3D Objects
R. Basri, P. Felzenszwalb, R. Girshick, D. Jacobs, C. Klivans
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
bibtex
@inproceedings{basri2009visibility,
  Author    = {R. Basri and P. Felzenszwalb and R. Girshick 
               and D. Jacobs and C. Klivans},
  Title     = {Visibility Constraints on Features of 3D Objects},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2009}}
    
Simulating Chinese Brush Painting: the Parametric Hairy Brush
R. Girshick
ACM SIGGRAPH Posters, 2004
Session: Nonphotorealistic Animation and Rendering
bibtex
@inproceedings{girshick2004simulating,
  Author = {R. Girshick},
  Title = {Simulating Chinese Brush Painting: The Parametric Hairy Brush},
  Booktitle = {{ACM SIGGRAPH 2004 Posters}},
  Series = {{SIGGRAPH} '04},
  Year = {2004}}
    
Undergrad senior thesis, Brandeis University, May 2004.
Authors listed alphabetically

Ph.D. dissertation

From Rigid Templates to Grammars: Object Detection with Structured Models
R. Girshick
Ph.D. dissertation, The University of Chicago, Apr. 2012
slides / bibtex
@phdthesis{girshick2012phd,
  Author = {R. Girshick},
  School = {University of Chicago},
  Title  = {From Rigid Templates to Grammars: 
            Object Detection with Structured Models},
  Year   = {2012}}
    
Models and algorithms that improve on the original DPM by more than 50% mAP.
Object Detection with Heuristic Coarse-to-Fine Search
R. Girshick
M.S. thesis, The University of Chicago, Dec. 2009
bibtex
@mastersthesis{girshick2009ms,
  Author = {R. Girshick},
  School = {University of Chicago},
  Title  = {Object Detection with Heuristic Coarse-to-Fine Search},
  Year   = {2009}}
    
The DPM cascade (CVPR 2010) was developed in my master's thesis.

Erdős number = 3 (via two paths)


I like this website