Haoyuan Li

I'm a Computer Science PhD student in the AMP Lab at UC Berkeley, interested in computer systems, big data, and cloud computing. My advisors are Scott Shenker and Ion Stoica. Before Berkeley, I studied at Cornell University and Peking University, and worked at Conviva and Google.

You can contact me at haoyuan@cs.berkeley.edu, or find me on [Github] [Twitter] [LinkedIn] [Weibo].

Projects

I focus on systems and algorithms for large-scale data-intensive computing. Below is a list of open-source projects that I contribute to:

Tachyon: A fault-tolerant distributed file system enabling reliable file sharing at memory speed across cluster frameworks, such as Spark and MapReduce. [LADIS'13] [Github]

Spark Streaming: Spark Streaming offers a high-level functional programming API, strong consistency, and efficient fault recovery. It is now part of Spark, which lets users seamlessly intermix streaming, batch, and interactive queries. [HotCloud'12] [SOSP'13] [Github]

Apache Spark: A cluster computing engine that makes data analytics fast. It provides an efficient abstraction for distributed in-memory computation. Besides the streaming part, I worked on the initial version of the Storage Manager. I am a founding committer of Apache Spark. [Github]

Shark: A high-speed query engine that runs Hive SQL queries on top of Spark and supports fault recovery and complex analytics (e.g. machine learning). I contributed to the integration with Tachyon. [Github]

Parallel Frequent Pattern Mining: Various algorithms have been developed to speed up frequent itemset mining. We designed a parallel FP-Growth algorithm and ran it on a cluster of several thousand machines. It became a part of Apache Mahout. [RecSys'08]

Apache Mesos and Apache Yarn: Both Mesos and Yarn are cluster resource managers. I ported Yarn to run on top of Mesos.

Tachyon, Spark Streaming, Apache Spark, Shark, and Apache Mesos are parts of the Berkeley Data Analytics Stack (BDAS).

Publications

Talks

Selected Awards

Olin Fellowship, IBM Fellowship (twice), Morgan Stanley Fellowship, Beijing Outstanding Graduates, Chinese National Fellowship, Innovation Award at Peking University, Pacemaker to Outstanding Students at Peking University (three times), General Electric Fellowship, No. 11 and No. 13 in the ACM-ICPC World Finals 2005 and 2006, No. 8 in the Google Code Jam China Final.
