Ross B. Girshick
Postdoctoral fellow
University of California, Berkeley, EECS
r......@eecs.berkeley.edu
cv / google scholar / Ph.D. thesis
papers: journal / conference

About me

I finished my Ph.D. in computer vision at The University of Chicago under the supervision of Pedro Felzenszwalb in April 2012. Now, I'm a postdoctoral fellow working with Jitendra Malik at UC Berkeley.

My main research interests are in computer vision, AI, and machine learning. I'm particularly focused on building models for object detection and recognition. These models aim to incorporate the "right" biases so that machine learning algorithms can understand image content from moderate to large-scale datasets. I always have an eye towards fast systems that work well in practice.

During my Ph.D., I spent time as a research intern at Microsoft Research Cambridge, UK working on human pose estimation from (Kinect) depth images. I also participated in several first-place entries into the PASCAL VOC object detection challenge, and was awarded a "lifetime achievement" prize for my work on deformable part models. I think this refers to the lifetime of the PASCAL challenge—and not mine!

I will be joining Rick Szeliski's group at Microsoft Research as a Researcher in Sept. 2014!

Journal papers

Efficient Human Pose Estimation from Single Depth Images
J. Shotton, R. Girshick, A. Fitzgibbon, T. Sharp, M. Cook, M. Finocchio, R. Moore, P. Kohli, A. Criminisi, A. Kipman, A. Blake
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35, No. 12, Dec. 2013
bibtex
@article{shotton2013kinect,
  Author    = {J. Shotton and
               R. Girshick and
               A. Fitzgibbon and
               T. Sharp and
               M. Cook and
               M. Finocchio and
               R. Moore and
               P. Kohli and
               A. Criminisi and
               A. Kipman and
               A. Blake},
  Title     = {Efficient Human Pose Estimation
               from Single Depth Images},
  Volume    = {35},
  Number    = {12},
  Journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  Year      = {2013}}
    
An integrated description of the original Kinect pose estimation algorithm and our ICCV 2011 algorithm.
Object Detection with Discriminatively Trained Part Based Models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 9, Sep. 2010
code: PAMI code / latest code (voc-release5) / bibtex
@article{felzenszwalb2010dpm,
  Author    = {P. Felzenszwalb and
               R. Girshick and
               D. McAllester and
               D. Ramanan},
  Title     = {Object Detection with Discriminatively
               Trained Part Based Models},
  Volume    = {32},
  Number    = {9},
  Journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  Year      = {2010}}
    
Deformable part models (DPM). Also appeared as a CACM Research Highlight: Visual Object Detection with Deformable Part Models
P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan
Communications of the ACM, no. 9 (2013): 97-105

Conference papers

Simultaneous Detection and Segmentation
Bharath Hariharan, Pablo Arbeláez, Ross Girshick, Jitendra Malik
European Conference on Computer Vision (ECCV), 2014
project page (with code) / bibtex
@inproceedings{hariharan14sds,
  Author    = {Bharath Hariharan and
               Pablo Arbel\'{a}ez and
               Ross Girshick and
               Jitendra Malik},
  Title     = {Simultaneous Detection and Segmentation},
  Booktitle = {Proceedings of the European
               Conference on Computer Vision ({ECCV})},
  Year      = {2014}}
    
An integrated approach to simultaneously detecting and segmenting objects, based on R-CNN. Achieves state-of-the-art performance on the traditional PASCAL VOC detection and semantic segmentation tasks as well as on our newly proposed SDS metrics.
Learning Rich Features from RGB-D Images for Object Detection and Segmentation
Saurabh Gupta, Ross Girshick, Pablo Arbeláez, Jitendra Malik
European Conference on Computer Vision (ECCV), 2014
code [coming soon] / bibtex
@inproceedings{gupta14rcnndepth,
  Author    = {Saurabh Gupta and
               Ross Girshick and
               Pablo Arbel\'{a}ez and
               Jitendra Malik},
  Title     = {Learning Rich Features from {RGB-D} Images
               for Object Detection and Segmentation},
  Booktitle = {Proceedings of the European
               Conference on Computer Vision ({ECCV})},
  Year      = {2014}}
    
How do you learn features for detection in depth images? We present a method that significantly outperforms naively passing a depth map to a convolutional neural network. Our system also creates 2.5D region proposals and outputs instance segmentations.
Analyzing the Performance of Multilayer Neural Networks for Object Recognition
Pulkit Agrawal, Ross Girshick, Jitendra Malik
European Conference on Computer Vision (ECCV), 2014
bibtex
@inproceedings{agrawal14analyzing,
  Author    = {Pulkit Agrawal and
               Ross Girshick and
               Jitendra Malik},
  Title     = {Analyzing the Performance of Multilayer
               Neural Networks for Object Recognition},
  Booktitle = {Proceedings of the European
               Conference on Computer Vision ({ECCV})},
  Year      = {2014}}
    
Part-based R-CNNs for Fine-grained Category Detection
Ning Zhang, Jeff Donahue, Ross Girshick, Trevor Darrell
European Conference on Computer Vision (ECCV), 2014
oral presentation
bibtex
@inproceedings{zhang14finegrained,
  Author    = {Ning Zhang and
               Jeff Donahue and
               Ross Girshick and
               Trevor Darrell},
  Title     = {Part-based {R-CNNs} for Fine-grained
               Category Detection},
  Booktitle = {Proceedings of the European
               Conference on Computer Vision ({ECCV})},
  Year      = {2014}}
    
On Learning to Localize Objects with Minimal Supervision
Hyun Oh Song, Ross Girshick, Stefanie Jegelka, Julien Mairal, Zaid Harchaoui, Trevor Darrell
International Conference on Machine Learning (ICML), 2014
code / bibtex
@inproceedings{song14slsvm,
  Author    = {Hyun Oh Song and
               Ross Girshick and
               Stefanie Jegelka and
               Julien Mairal and
               Zaid Harchaoui and
               Trevor Darrell},
  Title     = {On learning to localize objects with
               minimal supervision},
  Booktitle = {Proceedings of the International
               Conference on Machine Learning ({ICML})},
  Year      = {2014}}
    
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
R. Girshick, J. Donahue, T. Darrell, J. Malik
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
oral presentation
arXiv tech report (includes ImageNet results) / supplement / code / poster / slides / bibtex
@inproceedings{girshick2014rcnn,
  Author    = {Ross Girshick and
               Jeff Donahue and
               Trevor Darrell and
               Jitendra Malik},
  Title     = {Rich feature hierarchies for accurate
               object detection and semantic segmentation},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
This paper proposes R-CNN, a state-of-the-art visual object detection system that combines bottom-up region proposals with rich features computed by a convolutional neural network. At the time of its release, R-CNN improved the previous best detection performance on PASCAL VOC 2012 by 30% relative, going from 40.9% to 53.3% mean average precision.
Using k-poselets for Detecting People and Localizing their Keypoints
G. Gkioxari*, B. Hariharan*, R. Girshick, J. Malik
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
* equal contribution
project page / code / github / bibtex
@inproceedings{gkioxari2014kposelets,
  Author    = {Georgia Gkioxari and
               Bharath Hariharan and
               Ross Girshick and
               Jitendra Malik},
  Title     = {Using {k-poselets} for detecting people
               and localizing their keypoints},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
Understanding Objects in Detail with Fine-grained Attributes
A. Vedaldi, S. Mahendran, S. Tsogkas, S. Maji, R. Girshick, J. Kannala, E. Rahtu, I. Kokkinos, M. B. Blaschko, D. Weiss, B. Taskar, K. Simonyan, N. Saphra, S. Mohamed
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014
dataset [coming soon] / bibtex
@inproceedings{vedaldi14understanding,
  Author    = {A. Vedaldi and
               S. Mahendran and
               S. Tsogkas and
               S. Maji and
               R. Girshick and
               J. Kannala and
               E. Rahtu and
               I. Kokkinos and
               M. B. Blaschko and
               D. Weiss and
               B. Taskar and
               K. Simonyan and
               N. Saphra and
               S. Mohamed},
  Title     = {Understanding Objects in Detail with
               Fine-grained Attributes},
  Booktitle = {Proceedings of the {IEEE} Conference
               on Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2014}}
    
Training Deformable Part Models with Decorrelated Features
R. Girshick, J. Malik
IEEE International Conference on Computer Vision (ICCV), 2013
supplement / LM-LLDA DPM training code / bibtex
@inproceedings{girshick2013training,
  Author    = {Ross Girshick and
               Jitendra Malik},
  Title     = {Training Deformable Part Models with
               Decorrelated Features},
  Booktitle = {Proceedings of the International
               Conference on Computer Vision ({ICCV})},
  Year      = {2013}}
    
Ever wonder what makes DPM tick? We dissect DPM training to figure out what bits are important. Also, we present Latent LDA, a fast method to train DPMs without using hard negative examples.
Discriminatively Activated Sparselets
R. Girshick*, H. O. Song*, T. Darrell
International Conference on Machine Learning (ICML), 2013
oral presentation
supplement / Caltech-101 demo code / bibtex
@inproceedings{girshick13das,
  Author    = {Ross Girshick and
               Hyun Oh Song and
               Trevor Darrell},
  Title     = {Discriminatively Activated Sparselets},
  Booktitle = {Proceedings of the International
               Conference on Machine Learning ({ICML})},
  Year      = {2013}}
    
Speed up a wide array of structured predictors (including DPMs and multiclass SVMs) by discriminatively learning activations over a dictionary of model atoms. Same speedup as the ECCV 2012 sparselets paper, but with much higher accuracy.
Sparselet Models for Efficient Multiclass Object Detection
H. O. Song, S. Zickler, T. Althoff, R. Girshick, M. Fritz, C. Geyer, P. Felzenszwalb, T. Darrell
European Conference on Computer Vision (ECCV), 2012
code / bibtex
@inproceedings{song2012sparselet,
  Author    = {H. O. Song and
               S. Zickler and
               T. Althoff and
               R. Girshick and
               M. Fritz and
               C. Geyer and
               P. Felzenszwalb and
               T. Darrell},
  Title     = {Sparselet Models for Efficient
               Multiclass Object Detection},
  Booktitle = {Proceedings of the European
               Conference on Computer Vision ({ECCV})},
  Year      = {2012}}
    
Fast DPM detection by sparse coding model parameters.
Object Detection with Grammar Models
R. Girshick, P. Felzenszwalb, D. McAllester
Neural Information Processing Systems (NIPS), 2011
spotlight video / code (voc-release5) / bibtex
@inproceedings{girshick2011grammar,
  Author    = {R. Girshick and
               P. Felzenszwalb and
               D. McAllester},
  Title     = {Object Detection with Grammar Models},
  Booktitle = {Proceedings of Advances in Neural
               Information Processing Systems ({NIPS})},
  Year      = {2011}}
    
State-of-the-art person detection on PASCAL VOC using DPM-like models described in a grammar framework.
Efficient Regression of General-Activity Human Poses from Depth Images
R. Girshick, J. Shotton, P. Kohli, A. Criminisi, A. Fitzgibbon
IEEE International Conference on Computer Vision (ICCV), 2011
supplement / video / bibtex
@inproceedings{girshick2011efficient,
  Author    = {R. Girshick and
               J. Shotton and
               P. Kohli and
               A. Criminisi and
               A. Fitzgibbon},
  Title     = {Efficient Regression of General-Activity
               Human Poses from Depth Images},
  Booktitle = {Proceedings of the International
               Conference on Computer Vision ({ICCV})},
  Year      = {2011}}
    
Pose estimation using the Kinect depth sensor. Faster (4x) and more accurate than the original Kinect pose estimation algorithm (Shotton et al. CVPR 2011).
Cascade Object Detection with Deformable Part Models
P. Felzenszwalb, R. Girshick, D. McAllester
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010
oral presentation
slides (pdf) / slides (keynote) / talk / code (voc-release5) / bibtex
@inproceedings{felzenszwalb2010cascade,
  Author    = {P. Felzenszwalb and
               R. Girshick and
               D. McAllester},
  Title     = {Cascade Object Detection with
               Deformable Part Models},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2010}}
    
Fast cascade algorithm for DPM detection (about 14x faster than the baseline).
Visibility Constraints on Features of 3D Objects
R. Basri, P. Felzenszwalb, R. Girshick, D. Jacobs, C. Klivans
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009
bibtex
@inproceedings{basri2009visibility,
  Author    = {R. Basri and
               P. Felzenszwalb and
               R. Girshick and
               D. Jacobs and
               C. Klivans},
  Title     = {Visibility Constraints on
               Features of 3D Objects},
  Booktitle = {Proceedings of the IEEE Conference on
               Computer Vision and Pattern Recognition ({CVPR})},
  Year      = {2009}}
    
Simulating Chinese Brush Painting: the Parametric Hairy Brush
R. Girshick
ACM SIGGRAPH Posters, 2004
Session: Nonphotorealistic Animation and Rendering
bibtex
@inproceedings{girshick2004simulating,
  Author = {R. Girshick},
  Title = {Simulating Chinese Brush Painting:
           The Parametric Hairy Brush},
  Booktitle = {{ACM SIGGRAPH 2004 Posters}},
  Series = {{SIGGRAPH} '04},
  Year = {2004}}
    
Undergrad senior thesis, Brandeis University, May 2004.
Authors listed alphabetically

Ph.D. dissertation

From Rigid Templates to Grammars: Object Detection with Structured Models
R. Girshick
Ph.D. dissertation, The University of Chicago, Apr. 2012
slides / bibtex
@phdthesis{girshick2012phd,
  Author = {R. Girshick},
  School = {University of Chicago},
  Title  = {From Rigid Templates to Grammars: 
            Object Detection with Structured Models},
  Year   = {2012}}
    
Models and algorithms that improve on the original DPM by more than 50% mAP.
Object Detection with Heuristic Coarse-to-Fine Search
R. Girshick
M.S. thesis, The University of Chicago, Dec. 2009
bibtex
@mastersthesis{girshick2009ms,
  Author = {R. Girshick},
  School = {University of Chicago},
  Title  = {Object Detection with Heuristic Coarse-to-Fine Search},
  Year   = {2009}}
    
The DPM cascade (CVPR 2010) was developed in my master's thesis.

Erdős = 3 (via two paths)

