CSE 511 Data Processing at Scale

Database systems are used to provide convenient access to disk-resident data through efficient query processing, indexing structures, concurrency control, and recovery. This course delves into new frameworks for processing and generating large-scale datasets with parallel and distributed algorithms, covering the design, deployment, and use of state-of-the-art data processing systems that provide scalable access to data.

Specific topics covered include:

  • Efficient query processing
  • Indexing structures
  • Distributed database design
  • Parallel query execution
  • Concurrency control in distributed parallel database systems
  • Data management in cloud computing environments
  • Data management in Map/Reduce-based and NoSQL database systems

Jia Yu

Jia Yu is a co-founder of Wherobots Inc. and leads its engineering team. He is currently on a leave of absence from his role as a tenure-track Assistant Professor of Computer Science at Washington State University. He obtained his PhD from Arizona State University in summer 2020. Jia’s research interests include database systems, distributed data systems, and geospatial data management.