Matei Zaharia
I'm a Ph.D. student in the UC Berkeley AMP Lab, interested in computer systems, networking, and cloud computing. My advisors are Scott Shenker and Ion Stoica. I'm supported by a Google Ph.D. fellowship.
Before joining Berkeley, I got my Bachelor's degree from the University of Waterloo, in Canada, where I worked with Srinivasan Keshav.
You can contact me at matei@berkeley.edu or find me in Soda 493B.
Projects
I focus on systems and algorithms for large-scale data-intensive computing. My projects include:
Spark: As big data analytics evolves beyond simple batch jobs, there is a need for both more complex multi-stage applications (e.g. machine learning algorithms) and more interactive ad-hoc queries. Spark provides efficient and fault-tolerant primitives for in-memory cluster computing, and can run 30x faster than Hadoop MapReduce for these applications. (homepage) (short paper) (tech report)
Mesos: Clusters are running increasingly diverse applications, from batch jobs to interactive services. Mesos is a cluster manager that efficiently supports diverse applications by letting them control their own scheduling. The project is open source in the Apache Incubator. (homepage) (NSDI'11 paper)
Multi-Resource Fairness: Life is not fair, but with a little help, your computer system can be, which ensures predictable time-sharing between users. However, past work on fair sharing considered a single resource (e.g. CPU), while datacenter applications have demands across multiple resources (memory, IO, CPU, etc.). Dominant resource fairness generalizes max-min fairness to this case. (NSDI'11 paper)
MapReduce Scheduling: I've worked on several scheduling algorithms for MapReduce, including the LATE algorithm for straggler mitigation (OSDI'08) and delay scheduling for data locality (EuroSys'10). Both algorithms are now included in Hadoop. I also developed the Hadoop Fair Scheduler.
SNAP Sequence Aligner: I'm working with colleagues from Microsoft and UCSF on SNAP, a sequence alignment algorithm that is 10-100x faster than current tools and simultaneously more accurate, to handle the growing volume of data from high-throughput DNA sequencers. (arXiv paper)
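As a rough illustration of the dominant resource fairness idea mentioned under Multi-Resource Fairness above, here is a minimal Python sketch. It is not the implementation from the paper or from any scheduler: the function name `drf_allocate`, the dictionary-based inputs, the integer resource units, and the rule of dropping a user once their next task no longer fits are all assumptions made for this example. A user's dominant share is the largest fraction of any single resource they hold, and the sketch repeatedly gives one more task to the user whose dominant share is currently lowest.

```python
from fractions import Fraction


def drf_allocate(capacity, demands):
    """Hand out tasks one at a time to the user with the lowest dominant share.

    capacity: {resource: total units}, integers.
    demands:  {user: {resource: units needed per task}}, integers; every
              user must list every resource named in capacity.
    Returns {user: number of tasks launched}.
    """
    for user, need in demands.items():
        # A task that needs nothing would fit forever, so reject it up front.
        if not any(need[r] > 0 for r in capacity):
            raise ValueError(f"user {user!r} has an all-zero demand")

    used = {r: 0 for r in capacity}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dominant_share(user):
        # Largest fraction of any one resource currently held by this user.
        return max(
            Fraction(tasks[user] * demands[user][r], capacity[r])
            for r in capacity
        )

    while active:
        # sorted() makes tie-breaking deterministic (by user name).
        user = min(sorted(active), key=dominant_share)
        need = demands[user]
        if all(used[r] + need[r] <= capacity[r] for r in capacity):
            for r in capacity:
                used[r] += need[r]
            tasks[user] += 1
        else:
            # Assumption of this sketch: a user whose next task no longer
            # fits is simply dropped from consideration.
            active.discard(user)
    return tasks


# Illustrative numbers: user A's tasks are memory-heavy, user B's are CPU-heavy.
cluster = {"cpu": 9, "mem_gb": 18}
per_task = {"A": {"cpu": 1, "mem_gb": 4}, "B": {"cpu": 3, "mem_gb": 1}}
print(drf_allocate(cluster, per_task))  # {'A': 3, 'B': 2}
```

In this example both users end with a dominant share of 2/3 (A holds 12 of 18 GB, B holds 6 of 9 CPUs). With only one resource in `capacity`, the same loop reduces to an equal split, which is the sense in which the approach generalizes max-min fairness.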
Publications
2012
- M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M.J. Franklin, S. Shenker, and I. Stoica, Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing, to appear at NSDI 2012.
2011
- T. Hunter, T. Moldovan, M. Zaharia, S. Merzgui, J. Ma, M.J. Franklin, P. Abbeel, and A.M. Bayen, Scaling the Mobile Millennium System in the Cloud, SOCC 2011, October 2011.
- M. Chowdhury, M. Zaharia, J. Ma, M.I. Jordan and I. Stoica, Managing Data Transfers in Computer Clusters with Orchestra, SIGCOMM 2011, August 2011.
- B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A.D. Joseph, R. Katz, S. Shenker and I. Stoica, Mesos: Flexible Resource Sharing for the Cloud, USENIX ;login:, August 2011.
- M. Zaharia, B. Hindman, A. Konwinski, A. Ghodsi, A.D. Joseph, R. Katz, S. Shenker and I. Stoica, The Datacenter Needs an Operating System, HotCloud 2011, June 2011.
- B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A.D. Joseph, R. Katz, S. Shenker and I. Stoica, Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, NSDI 2011, March 2011.
- A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, NSDI 2011, March 2011.
2010
- M. Zaharia, M. Chowdhury, M.J. Franklin, S. Shenker and I. Stoica, Spark: Cluster Computing with Working Sets, HotCloud 2010, June 2010.
- M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker and I. Stoica, Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling, EuroSys 2010, April 2010.
- M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R.H. Katz, A. Konwinski, G. Lee, D.A. Patterson, A. Rabkin, I. Stoica and M. Zaharia, Above the Clouds: A View of Cloud Computing, Communications of the ACM, April 2010.
Earlier
- B. Hindman, A. Konwinski, M. Zaharia and I. Stoica, A Common Substrate for Cluster Computing, HotCloud 2009, June 2009.
- R. Luk, M. Zaharia, M. Ho, B. Levine and P. Aoki, ICTD for Healthcare in Ghana: Two Parallel Case Studies, ICTD 2009, April 2009.
- M. Zaharia, A. Konwinski, A.D. Joseph, R. Katz and I. Stoica, Improving MapReduce Performance in Heterogeneous Environments, OSDI 2008, December 2008.
- S. Guo, M.H. Falaki, E.A. Oliver, S. Ur Rahman, A. Seth, M. Zaharia, U. Ismail, and S. Keshav, Design and Implementation of the KioskNet System, ICTD 2007, December 2007.
- S. Guo, M.H. Falaki, E.A. Oliver, S. Ur Rahman, A. Seth, M. Zaharia, and S. Keshav, Very Low-Cost Internet Access Using KioskNet, ACM Computer Communication Review, October 2007.
- M. Zaharia and S. Keshav, Gossip-Based Search Selection in Hybrid Peer-to-Peer Networks, J. Concurrency and Computation: Practice and Experience, 2007.
- M. Zaharia, A. Chandel, S. Saroiu, and S. Keshav, Finding Content in File-Sharing Networks When You Can't Even Spell, Proc. IPTPS, February 2007.
- A. Seth, D. Kroeker, M. Zaharia, S. Guo, S. Keshav, Low-cost Communication for Rural Internet Kiosks Using Mechanical Backhaul, Proc. MOBICOM 2006, September 2006.
- M. Zaharia and S. Keshav, Gossip-Based Search Selection in Hybrid Peer-to-Peer Networks, Proc. IPTPS, February 2006.
Talks
- Spark: In-Memory Cluster Computing for Iterative and Interactive Applications (machine learning focused version) (pptx, pdf), NIPS Big Learning Workshop, Sierra Nevada, Spain, December 2011. Runner-up for best talk.
- Spark: In-Memory Cluster Computing for Iterative and Interactive Applications (pptx, pdf), Google Inc, Mountain View, CA, October 2011.
- The Datacenter Needs an Operating System (ppt, pdf), HotCloud 2011, Portland, OR, June 2011.
- Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center (ppt, pdf), NSDI 2011, Boston, MA, March 2011.
- Spark: In-Memory Cluster Computing for Iterative and Interactive Applications (ppt, pdf), Stanford University, Stanford, CA, February 2011.
- Spark: Cluster Computing with Working Sets (ppt, pdf), HotCloud 2010, Boston, MA, June 2010.
- Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling (ppt, pdf), EuroSys 2010, Paris, France, April 2010.
- Job Scheduling with the Fair and Capacity Schedulers (ppt, pdf), Hadoop Summit 2009, Santa Clara, CA, June 2009.
- Job Scheduling for MapReduce (ppt, pdf), Microsoft Research Silicon Valley, Mountain View, CA, January 2009.
- Improving MapReduce Performance in Heterogeneous Environments (ppt, pdf), OSDI 2008, San Diego, CA, December 2008.
Open Source
Almost all of my work is open source. The LATE algorithm for straggler mitigation and the Hadoop Fair Scheduler are part of Apache Hadoop, and I continue to contribute to Hadoop as a committer. Mesos and Spark are both available on GitHub.
Other Activities
Starting in high school, I've participated in a number of programming contests, including the International Olympiad in Informatics and the ACM International Collegiate Programming Contest. I've now stopped doing contests, but I still love algorithmic and mathematical problems.
In undergrad, I contributed to the open source real-time strategy game 0 A.D., where I worked on gameplay logic, random map generation, water rendering, and multiplayer networking.
I enjoy reading, nature, and food that is either good or free.