CSE 511 Data Processing at Scale

Database systems provide convenient access to disk-resident data through efficient query processing, indexing structures, concurrency control, and recovery. This course delves into new frameworks for processing and generating large-scale datasets with parallel and distributed algorithms. It covers the design, deployment, and use of state-of-the-art data processing systems that provide scalable access to data.

Specific topics covered include:

  • Efficient query processing
  • Indexing structures
  • Distributed database design
  • Parallel query execution
  • Concurrency control in distributed parallel database systems
  • Data management in cloud computing environments
  • Data management in Map/Reduce-based systems
  • NoSQL database systems

Jia Yu
Assistant Professor

Jia Yu is an Assistant Professor in the School of EECS at Washington State University. He obtained his PhD from Arizona State University in summer 2020. Jia’s research interests include database systems, distributed data systems, and geospatial data management.