Haoyuan Li

Haoyuan (H.Y.) Li is the Founder, Chairman, and CEO of Alluxio. He received his Ph.D. in Computer Science from the AMPLab at UC Berkeley, advised by Prof. Scott Shenker and Prof. Ion Stoica. At the AMPLab, he co-created and led Alluxio (formerly Tachyon), an open source virtual distributed file system. Before UC Berkeley, he earned an M.S. from Cornell University and a B.S. from Peking University, both in Computer Science.

Ph.D. Dissertation: Alluxio: A Virtual Distributed File System

Contacts: haoyuan@alluxio.com, [Github] [LinkedIn] [Twitter] [Weibo]

Projects

Alluxio (formerly Tachyon): A memory-speed virtual distributed file system. The project is open source and is deployed at hundreds of companies. It has more than 1000 contributors from over 200 institutions, including Alibaba, Yahoo, Intel, Baidu, IBM, Tencent, and Red Hat. [SOCC 13] [Github] [San Francisco Bay Area Meetup]

Spark Streaming: Spark Streaming offers a high-level functional programming API, strong consistency, and efficient fault recovery. It is now part of Apache Spark, which lets users seamlessly intermix streaming, batch, and interactive queries. [HotCloud'12] [SOSP'13] [Github]

Apache Spark: A cluster computing engine that makes data analytics fast. It provides an efficient abstraction for distributed in-memory computation. I am a founding committer of Apache Spark. [Github]

Parallel Frequent Pattern Mining: Various algorithms have been developed to speed up frequent itemset mining. We designed a parallel FP-Growth algorithm and ran it on a cluster of several thousand machines. It became part of Apache Mahout. [RecSys'08]

Alluxio (formerly Tachyon), Spark Streaming, Apache Spark, Shark, and Apache Mesos are all part of the Berkeley Data Analytics Stack (BDAS).

Publications

Google Scholar

FairRide: Near-Optimal, Fair Cache Sharing, Qifan Pu, Haoyuan Li, Matei Zaharia, Ali Ghodsi and Ion Stoica. NSDI 2016, March 2016.
The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox, Dan Crankshaw, Peter Bailis, Joey Gonzalez, Haoyuan Li, Zhao Zhang, Michael J. Franklin, Ali Ghodsi and Michael I. Jordan. CIDR 2015, January 2015.
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks, Haoyuan Li, Ali Ghodsi, Matei Zaharia, Scott Shenker and Ion Stoica. SOCC 2014, November 2014.
Reliable, Memory Speed Storage for Cluster Computing Frameworks, Haoyuan Li, Ali Ghodsi, Matei Zaharia, Scott Shenker and Ion Stoica. UCB EECS Tech Report 2014, June 2014.
Discretized Streams: Fault-Tolerant Streaming Computation at Scale, Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. SOSP 2013, November 2013.
Tachyon: Memory Throughput I/O for Cluster Computing Frameworks, Haoyuan Li, Ali Ghodsi, Matei Zaharia, Eric Baldeschwieler, Scott Shenker and Ion Stoica. LADIS 2013, November 2013.
Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing, Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. UCB EECS Tech Report 2012, December 2012.
Tradeoffs in CDN designs for throughput oriented traffic, Minlan Yu, Wenjie Jiang, Haoyuan Li, and Ion Stoica. CoNEXT 2012, December 2012.
Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters, Matei Zaharia, Tathagata Das, Haoyuan Li, Scott Shenker, and Ion Stoica. HotCloud 2012, June 2012.
Quilt: A Patchwork of Multicast Regions, Qi Huang, Ken Birman, Ymir Vigfusson, and Haoyuan Li. DEBS 2010, July 2010.
Dr. Multicast: Rx for Data Center Communication Scalability, Ymir Vigfusson, Hussam Abu-Libdeh, Mahesh Balakrishnan, Ken Birman, Robert Burgess, Haoyuan Li, Gregory Chockler, and Yoav Tock. EuroSys 2010, April 2010.
Declarative Languages to Declarative Processing in Computer Games, Ben Sowell, Alan Demers, Johannes Gehrke, Nitin Gupta, Haoyuan Li, and Walker White. CIDR 2009, January 2009.
PFP: Parallel FP-Growth for Query Recommendation, Haoyuan Li, Yi Wang, Dong Zhang, Ming Zhang, and Edward Chang. RecSys 2008, October 2008.

Selected Awards

Olin Fellowship, IBM Fellowship (twice), Morgan Stanley Fellowship, Beijing Outstanding Graduates, Chinese National Fellowship, Innovation Award at Peking University, Pacemaker to Outstanding Students at Peking University (three times), General Electric Fellowship, No. 11 and No. 13 in the ACM-ICPC World Finals 2005 and 2006, No. 8 in the Google Code Jam China Final.
