CSE 511 Data Processing at Scale

Database systems provide convenient access to disk-resident data through efficient query processing, indexing structures, concurrency control, and recovery. This course delves into new frameworks for processing and generating large-scale datasets with parallel and distributed algorithms, covering the design, deployment, and use of state-of-the-art data processing systems that provide scalable access to data.

Specific topics covered include:

  • Efficient query processing
  • Indexing structures
  • Distributed database design
  • Parallel query execution
  • Concurrency control in distributed and parallel database systems
  • Data management in cloud computing environments
  • Data management in Map/Reduce-based systems
  • NoSQL database systems
Jia Yu

Jia Yu is a co-founder of Wherobots Inc. and leads its engineering team. Jia is the creator of Apache Sedona and was a Tenure-Track Assistant Professor of Computer Science at Washington State University from 2020 to 2023. Jia’s research interests include database systems, distributed data systems, and geospatial data management.