Hometown: Ganzhou, Jiangxi Province, China
Degree: Ph.D. in Computer Science
Title: Associate Professor
Favorite Proverb: Yesterday is history, tomorrow is a mystery, but today is a gift; that is why it is called the present.
Email: firstname.lastname@example.org (Recommended)
email@example.com (NOT recommended; messages are sometimes lost)
Mail Address: College of Computer Science and Technology, Hangzhou Dianzi University, Xiasha Higher Education District, Hangzhou, Zhejiang, China 310018
Zujie Ren is an associate professor with the College of Computer Science and Technology, Hangzhou Dianzi University. Zujie is currently working at the Cloud Computing Research Institute of Hangzhou Dianzi University. Zujie studied computer science in the Database Lab at Zhejiang University and received his Ph.D. degree in Sep. 2010. Prior to joining the faculty in Oct. 2010, he worked as an intern at NetEase Research Hangzhou for more than three years. From Sep. 2011 to Jun. 2012, he worked with the Data Platform Team at Alibaba Inc., focusing on workload characterization and performance optimization for the Taobao Hadoop cluster (called Yunti 云梯). Since May 2013, he has been working with the Apsara (飞天) Team at Aliyun, Inc., focusing on log analysis of Apsara, a distributed computing platform developed by Aliyun.
Zujie's research interests include workload analysis and performance optimization of cloud computing and other distributed systems.
Massive data processing
With the rapid growth of data volume in many enterprises' IT infrastructure, large-scale data processing has become an urgent challenge and has received plenty of attention from both academia and industry. The MapReduce framework, proposed by Google, provides a highly scalable solution for data processing. The fundamental idea of MapReduce is to distribute data among a large number of nodes and process the data in parallel. Hadoop, an open-source implementation of the MapReduce framework, can easily scale out to thousands of nodes and work with petabytes of data. Owing to its high scalability and performance, Hadoop has gained much popularity and wide usage. Many companies, such as Yahoo and Facebook, and research groups use Hadoop to run their data-intensive jobs, such as click-log analysis, web crawling and data mining.
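The map, shuffle and reduce steps behind this idea can be illustrated with a small single-process sketch. It is not Hadoop code: the record layout ("user url"), the function names and the sample splits are all assumptions made for illustration, and in a real cluster each split would be handled by a different node.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit (url, 1) for one click-log record. The 'user url' layout is hypothetical."""
    _user, url = record.split()
    yield url, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def run_job(splits):
    """Run map over every record of every split, then shuffle and reduce."""
    mapped = chain.from_iterable(
        map_phase(record) for split in splits for record in split
    )
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Two hypothetical input splits, standing in for data blocks on two nodes.
splits = [["u1 /home", "u2 /item/42"], ["u3 /home"]]
click_counts = run_job(splits)
```

Here click_counts maps each URL to its number of clicks. Because the map calls are independent of one another, they can be spread across as many nodes as there are splits.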
Our work was originally motivated by the Hadoop cluster at Taobao. Taobao is a leading online e-commerce company in China with more than 70% market share. To provide large-scale data processing services to other companies in the group, Taobao built a giant data warehouse on Hadoop. System logs, crawled pages and replicas of online relational databases (Oracle or MySQL) are gathered in this cluster continuously, where they are used for numerous applications, including traffic statistics, product sales trends and recommender systems. This data warehouse runs on more than 2,000 nodes and stores more than 20 PB of compressed data, which is growing at a rate of 20 TB per day. Besides production jobs that must run periodically, there are many temporary jobs, ranging from multi-hour collaborative filtering computations to ad-hoc queries lasting several seconds. Most of the production jobs run automatically around midnight and must finish before users start work in the morning. Most of the temporary jobs, by contrast, are submitted by Taobao engineers during working hours and run immediately. More than forty thousand jobs run in the Hadoop cluster per day, and the number of users exceeds five hundred.
A Realistic Hadoop Simulator for Parameters Tuning and Scalability Analysis
Ankus: E-Commerce Workload Synthesization
Resource Scheduling in Data-Centric Systems
With the explosive growth of data volumes, more and more organizations build large-scale data-centric systems (DCS), which serve as infrastructure for various applications involving "big data". As data-centric systems continue to grow, so does the need for effective resource scheduling. Most data-centric systems, equipped with limited resources, need to serve multiple users and execute various workloads simultaneously. Resource scheduling is essential for improving system throughput and user response time. However, the resource scheduling problem in data-centric systems is challenging because of system complexity and workload diversity. To date, various scheduling techniques have been proposed and applied in different kinds of data-centric systems, such as cloud computing platforms, HPC clusters and MapReduce-style systems. An overall picture of current research in this field will therefore benefit researchers greatly. We aim to give a comprehensive survey of existing research advances, which is helpful for optimizing resource scheduling techniques. We categorize resource scheduling approaches into three groups according to the scheduling model: resource provision, job scheduling and data scheduling. We give a systematic review of the most significant techniques in each group. Then, we discuss four case studies, each carefully chosen from practical or production systems. Finally, we outline some open problems and challenges in the area of resource scheduling. We believe this systematic and comprehensive analysis will provide much greater insight into existing scheduling techniques and inspire new developments in this field.
Log Analysis and Problem Diagnosis on Large-scale Systems
System logs provide a rich information source for failure detection, failure prediction and root cause diagnosis. However, with the rapid growth of system scale and the popularity of various applications in production environments, the volume of logs generated per day has become huge, posing serious challenges for collection, storage, analysis and management. Log filtering has been widely used in system log analysis and handling. Existing research can be roughly divided into instance-based methods and feature-based methods. The instance-based approach is generally used to identify instances containing abnormal information and to delete instances with redundant information. To address these challenges, we focus on developing an online log filtering mechanism that eliminates redundant and noisy log records through event filtering and instance filtering, aiming to minimize the log size without losing the information required for fault diagnosis.
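A two-stage online filter of this general kind might look like the following sketch. It is only one possible reading, not the mechanism developed in this project: the record fields, the noisy event types, the 60-second window and the rule "drop a repeat of the same event on the same node within the window" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical log record layout used only for this sketch."""
    timestamp: float  # seconds
    node: str
    event: str        # event type, e.g. "DISK_ERROR"
    message: str


# Hypothetical event types treated as noise by the event filter.
NOISY_EVENTS = frozenset({"HEARTBEAT", "DEBUG"})


def filter_logs(records, window=60.0, noisy_events=NOISY_EVENTS):
    """Yield records that survive both filters; input is assumed sorted by time.

    Event filtering drops whole event types considered noise.
    Instance filtering drops a record when the same (node, event) pair
    was already kept less than `window` seconds earlier.
    """
    last_kept = {}
    for rec in records:
        if rec.event in noisy_events:
            continue  # event filtering
        key = (rec.node, rec.event)
        previous = last_kept.get(key)
        if previous is not None and rec.timestamp - previous < window:
            continue  # instance filtering: redundant repeat
        last_kept[key] = rec.timestamp
        yield rec


records = [
    LogRecord(0.0, "n1", "HEARTBEAT", "ok"),
    LogRecord(1.0, "n1", "DISK_ERROR", "a"),
    LogRecord(5.0, "n1", "DISK_ERROR", "b"),
    LogRecord(6.0, "n2", "DISK_ERROR", "c"),
    LogRecord(70.0, "n1", "DISK_ERROR", "d"),
]
kept = list(filter_logs(records))
```

Because filter_logs is a generator that keeps only one timestamp per (node, event) pair, it can process a log stream record by record without buffering the whole log.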
Benchmarking for Cloud OS
Over the past few years, cloud file systems such as Google File System (GFS) and Hadoop Distributed File System (HDFS) have received significant research effort aimed at optimizing their mechanisms and implementations. A common issue for these optimization efforts is performance benchmarking. However, many system researchers and engineers find it challenging to build a benchmark that reflects real-life workloads, because of system complexity and the vagueness of I/O workload characteristics. They can easily make incorrect assumptions about their systems and workloads, leading to benchmark results that do not reflect reality.
As a preliminary step toward a realistic benchmark, we explore the characteristics of data and I/O workloads in a production environment. We collected a two-week I/O workload trace from a 2,500-node production cluster, which is one of the largest cloud platforms in Asia. This cloud platform provides two public cloud services: a data storage service and a data processing service. We analyze the commonalities and differences between the two cloud services from multiple perspectives, including request arrival pattern, request size and data population. Key observations include the following: the request arrival rate follows a log-normal distribution rather than a Poisson distribution, request arrivals exhibit multiple periodicities, and cloud file systems fit a partly-open model rather than a completely open model. Based on the comparative analysis, we derive implications to guide system researchers and engineers in building realistic benchmarks on their own systems. We also discuss several open issues and challenges in benchmarking cloud file systems.
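A quick way to check whether per-interval request counts look Poisson-like, and to fit a log-normal to them, is sketched below. This is a rough heuristic rather than the analysis performed on the production trace: the function names and the sample counts are made up for illustration. It relies on two standard facts: a Poisson distribution has variance equal to its mean, and the maximum-likelihood log-normal parameters are the mean and standard deviation of the logarithms of the samples.

```python
import math
import statistics


def dispersion_index(counts):
    """Variance-to-mean ratio of request counts; close to 1 for Poisson arrivals."""
    return statistics.pvariance(counts) / statistics.mean(counts)


def fit_lognormal(rates):
    """Return (mu, sigma): mean and population std of log(rate), all rates > 0."""
    logs = [math.log(rate) for rate in rates]
    return statistics.mean(logs), statistics.pstdev(logs)


# Hypothetical requests-per-minute samples, invented for this sketch.
per_minute = [120, 95, 400, 80, 1500, 110, 60, 300]

index = dispersion_index(per_minute)
mu, sigma = fit_lognormal(per_minute)
```

A dispersion index far above 1, as with these bursty sample counts, suggests that a Poisson arrival model would understate the variability a benchmark needs to reproduce.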
Program (Co-)Chair: MDSP 2012, MDSP 2013
Program Committee: FM-S&C'11, FM-S&C'12
Reviewer: Computer Journal, Oxford University Press
- NSF of China
Title: Data Scheduling for Task Acceleration on Massive Data Processing
- NSF of Zhejiang Province
Title: Research on KV Storage Engine-based Metadata Server Cluster
- NSF of Zhejiang Province
Title: Research on LSM-Tree Model Based Metadata Management Techniques
- Zhejiang Provincial Key Research and Development Plan
Title: Distributed Platform for Integrated Batch and Streaming Computing
Zujie's DBLP entry is here.
- Zujie Ren, Weisong Shi, Jian Wan, Feng Cao, Jiangbin Lin: Realistic and Scalable Benchmarking Cloud File Systems: Practices and Lessons from AliCloud. IEEE Transactions on Parallel and Distributed Systems (TPDS) 28(11): 3272-3285 (2017) http://doi.ieeecomputersociety.org/10.1109/TPDS.2017.2715327 (SCI IF=4.181, CCF Class A journal, JCR Q1)
- Zujie Ren, Jian Wan, Weisong Shi, Xianghua Xu, Min Zhou: Workload Analysis, Implications, and Optimization on a Production Hadoop Cluster: A Case Study on Taobao. IEEE Transactions on Services Computing 7(2): 307-321 (2014) (SCI IF=3.520, CCF Class B journal, JCR Q1)
- Zujie Ren, Xianghua Xu, Jian Wan: A Comparative Investigation on Different P2P Search Systems in Dynamic Environments. ChinaGrid 2011: 116-123 (EI Journal)
- Gang Chen, Zujie Ren, Lidan Shou, Ke Chen, Yijun Bei: PISA: A framework for integrating uncooperative peers into P2P-based federated search. Computer Communications 34(6): 715-729 (2011) (SCI IF=1.325)
- Zujie Ren, Ke Chen, Lidan Shou, Gang Chen, Yijun Bei, Xiaoyan Li: HAPS: Supporting Effective and Efficient Full-Text P2P Search with Peer Dynamics. J. Comput. Sci. Technol. 25(3): 482-498 (2010) (SCI IF=0.642)
- Li Zhou, Tianming Zhang, Zujie Ren, Weisong Shi, Jian Wan, Measuring Performance Isolation of Cloud File Systems, HPC China 2016 (Accepted)
- Zujie Ren, Biao Xu, Weisong Shi, Yong-Jian Ren, Feng Cao, Jiangbin Lin, Zheng Ye: iGen: A Realistic Request Generator for Cloud File Systems Benchmarking. CLOUD 2016: 343-350 (CCF Class B conference).
- Zujie Ren, Weisong Shi, Jian Wan, Towards Realistic Benchmarking for Cloud File Systems: Early Experiences, IISWC 2014, 88-98. (CCF Class C conference)
- Zujie Ren, Zhijun Liu, Xianghua Xu, Jian Wan, Weisong Shi, Min Zhou, WaxElephant: a realistic Hadoop simulator for parameters tuning and scalability analysis, ChinaGrid, September 20-23, 2012:pp.9-16, Beijing, China, 2012
- Jian Wan, Minggang Liu, Xixiang Hu, Zujie Ren, Jilin Zhang, Weisong Shi and Wei Wu, Dual-JT: toward the high availability of JobTracker in Hadoop, CloudCom, December 3-6, pp.263-268, Taiwan, China, 2012
- Zujie Ren, Xianghua Xu, Jian Wan, Weisong Shi, Min Zhou: Workload characterization on a production Hadoop cluster: A case study on Taobao. IEEE International Symposium on Workload Characterization (IISWC), November 4-6, pp.3-13, San Diego, USA, 2012 (Best Paper Award!) (CCF Class C conference)
- Jian Wan, Jiawei Yan, Congfeng Jiang, Li Zhou, Zujie Ren, Yong-Jian Ren: Effective and Efficient Web Reviews Extraction Based on Hadoop. ICSOC Workshops 2013: 107-118
- Zujie Ren, Lidan Shou, Gang Chen, Chun Chen, Yijun Bei: PISA: Federated Search in P2P Networks with Uncooperative Peers. DEXA 2009: 735-744
- Yusen Wu, Zujie Ren, Weisong Shi, Xing Wang, Xiaolong Zhang, E Chen, Yuan Wang: An Early Experience on Container Technologies from NetEase. ACM SIGOPS China 2017, Shanghai, May 12-14.
Invited Book Chapter:
- Zujie Ren, Xiaohong Zhang, Weisong Shi, Resource Scheduling in Data-Centric Systems, in Handbook on Data Centers, Springer. 1307-1330, 2015. http://dx.doi.org/10.1007/978-1-4939-2092-1_46
- [Spring 2011] JavaEE Development
- [Spring 2011] Computer Security
- [Autumn 2012] Massive Data Storage and Processing
- [Autumn 2012] Data Structure
- [Spring 2013] Object-oriented Programming using C++
- [Autumn 2013] Data Structure
- [Autumn 2013] Object-oriented Programming using C++
- [Autumn 2013] Object-oriented Analysis and Design [for graduates]
- [Spring 2014] Object-oriented Analysis and Design [for graduates]
- [Spring 2014] Advanced Database Systems [for graduates]
- [Winter 2014] Big Data Storage and Processing [for graduates in ZJU SC]