Welcome to Chen Zhang's Home Page

I have recently completed my PhD and graduated from
Before coming to UW, I did my master's studies
at Vrije Universiteit Amsterdam on parallel and distributed computer systems.
Prior to that, I was an undergraduate student in |
PhD | 2006 - 2011 |
MSc | 2004 - 2006 |
BEng | 2000 - 2004 |
Current Projects |
Sep 2008 - Spring 2011 |
|
|
· Cloud Database Solution Using HBase: Developed tools and techniques to support multi-row distributed (ACID) transactions with global strong snapshot isolation using HBase. The resulting HBaseSI system is the first SI system on bare-bones HBase. It is non-intrusive to existing HBase system configuration and user data, and it handles distributed transactions efficiently in novel ways, without using consensus-based protocols, explicit atomic broadcast, or data locks.
|
||
|
· CloudWF: Designed a scalable and lightweight scientific workflow system for managing workflows composed of both MapReduce and legacy applications on Hadoop clouds. It is the first workflow management system designed to take advantage of the Hadoop/HBase architecture for scalability, fault tolerance, and ease of use.
|
||
|
· CloudBATCH: Designed a system that uses Hadoop/HBase as a cluster management system in lieu of traditional batch job queuing systems, accepting both MapReduce and legacy batch job submissions and removing the complexity and overhead of making the two kinds of systems compatible.
|
||
|
· Space Weather Simulation: Developed a prototype of an automated space weather forecast simulation tool on clouds powered by Eucalyptus. Python scripts automate the process of reading inputs from external data sources, running simulations, performing visualization, and staging output files back to a web portal for users to view. |
||
|
· Live Cell Image Processing: Designed a system to execute legacy applications (MATLAB) with non-standard Hadoop input formats (i.e., image files instead of textual inputs) through the Hadoop MapReduce framework. It is one of the earliest efforts to apply Hadoop to large-scale scientific data processing rather than server-side computations such as web indexing. The project goal is to study the complex molecular interactions that regulate biological systems. |
||
|
Past Projects |
Sep 2004 – Aug 2008 |
|
|
· GridBASE: Refined a previously developed database-driven, lightweight distributed job execution system for running task-farmable applications on clusters and grids. The core concept behind GridBASE is similar to cloud computing: computing power is used on demand, like electricity, and every node is treated as homogeneous. |
||
|
· Parallel MPEG Encoder: Designed and implemented a parallel MPEG encoder to make video compression fast and easy. The application employed a self-developed library based on JavaGAT@GridLab. It split the original video into chunks to be processed independently on grid compute nodes and merged the intermediate result files into a final video output.
· Bioinformatics Workflow Framework: Designed and implemented a grid workflow system as a web portal backend, tailored specifically to bioinformatics programs offered by IBIVU (http://ibivu.cs.vu.nl). The system used the OpenSymphony workflow engine to submit jobs to the DAS-2 cluster.
· Standalone Globule Redirector: Designed and implemented a redirection configuration fetcher which helped improve the performance of Globule (http://www.globule.org) on top of Apache Portable Runtime.
· FFPF and SCAMPI project: Ported pcap to FFPF (Fairly Fast Packet Filter) for network monitoring on a terabyte backbone.
· RFID Guardian: A project focused on providing security and privacy in Radio Frequency Identification (RFID) systems. As one of the earliest participants in the project, contributed to the design and analysis of the software system architecture and to the development of various usage scenarios and project design documents. |
||
|
· Book Chapter
[1] Hans De Sterck, Alex Papo, Chen Zhang, Micah Hamady, Rob Knight. 'Database-driven Grid Computing and Distributed Web Applications: A Comparison', in “Grids for Bioinformatics and Computational Biology”. Wiley, December 2007. ISBN: 978-0-471-78409-8.
· Paper
[1] Chen Zhang, Hans De Sterck. 'HBaseSI: A Solution for Multi-row Distributed Transactions with Global Strong Snapshot Isolation on Clouds', Scalable Computing: Practice and Experience, 12, July 2011.
[2] Chen Zhang, Hans De Sterck. 'Supporting Multi-row Distributed Transactions with Global Snapshot Isolation Using Bare-bones HBase', The 11th ACM/IEEE International Conference on Grid Computing (Grid 2010), Oct 25-29, 2010, Brussels, Belgium.
[3] Chen Zhang, Hans De Sterck. 'CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase', The Second IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2010), Nov 30 - Dec 3, 2010, Indianapolis, Indiana, USA.
[4] Ryan Kennedy, Manuel E. Lladser, Zhiyuan Wu, Chen Zhang, Michael Yarus, Hans De Sterck, and Rob Knight. 'Natural and Artificial RNAs Occupy the Same Restricted Region of Sequence Space', RNA, 16:280-289, 2010.
[5] Chen Zhang, Hans De Sterck. 'CloudWF: A Computational Workflow System for Clouds Based on Hadoop', The First International Conference on Cloud Computing, December 1-4, 2009,
[6] Chen Zhang, Ashraf Aboulnaga, Hans De Sterck, Haig Djambazian, Rob Sladek. 'Case Study of Scientific Data Processing on a Cloud Using Hadoop', High Performance Computing Symposium 2009, June 14-17, 2009, Kingston,
[7] Hans De Sterck, Chen Zhang, Aleks Papo. 'Database-driven grid computing with GridBASE', The 2007 IEEE International Symposium on Bioinformatics and Life Science Computing, May 21-23, 2007, Niagara Falls,
[8] Stevens le Blond, Ana-Maria Oprescu, Chen Zhang. 'Early Application Experience with the Grid Application Toolkit (GAT)', The Fourteenth Global Grid Forum (GGF14), June 26-30, 2005,
· Thesis
[1] A Workflow Engine for Running Bioinformatics Codes On DAS-2 (Dutch Grid). Aug 2006. Vrije Universiteit
[2] Distributed and Parallel Oriented Layered Architecture Model for NLP. Aug 2004.
· Book
[1] Chen Zhang, Fu Bing, Zhao Jun. 'Java2 programming 150 examples and explanations', China Publishing House of Electronics Industry (PHEI), 2003-09. Second printing, 2004-02. |
|
|
Fall 2006 |
|
CS848 Advanced Topics in Databases: Management of Information Systems; CS882 Protein Folding; CS666 Advanced Algorithms |
|
|
Winter 2007 |
|
CS787 Computational Vision; CS846 Analysis of Reactive Programs |
|
|
Fall 2008 |
|
CS848 Advanced Topics in Databases: Databases in Cloud Computing Environments |
· TA
CS134 Principles of Computer Science
CS135 Designing Functional Programs
CS338 Computer Applications in Business: Databases
CS348 Introduction to Database Management
CS442/642 Principles of Programming Languages
CS454 Distributed Systems
Chen Zhang
(Currently off campus. New mailing address available upon request.)
Email: chen@chenzhang.info
|
Last updated: Dec. 2011 |
Copyright © Chen Zhang |