Research

Background

The volume of geospatial data has increased tremendously. Such data includes, but is not limited to, weather maps, Internet-of-Things sensor readings, and geo-tagged social media. Many data-intensive geospatial analytics applications, such as machine learning algorithms, rely heavily on underlying data infrastructure such as database management systems (DBMS) to efficiently manipulate, retrieve, and manage data. Unfortunately, classic data management systems, such as MySQL, PostgreSQL, PostGIS, and ArcGIS, suffer a significant performance drop when handling large-scale geospatial data.

Agenda

My research focuses on crafting database systems to accelerate large-scale geospatial data analytics. In particular, I am interested in

Philosophy

The “ecosystem” of my research

During my PhD, I worked on several projects in two research streams: large-scale geospatial data management and lightweight database indexes (for regular data and spatial data). Over time, these research projects grew into two fun “ecosystems”. I am proud of my contribution to the community!

Summary

Item Introduction
Research focus
  Theme: Large-scale geospatial databases
  Topics: cluster computing, interactive visualization, lightweight database index

Application
  Help data scientists analyze big geospatial data in a scalable and interactive way, at a lower cost in time and money
  Urban planning, animal migration, transportation engineering, climate change analytics

Papers
  19 publications (SIGMOD, VLDB, ICDE, SSTD, Geoinformatica)
  13 first-author, 5 second-author, 1 third-author
  1 under revision, 1 under review
  All my publications with my advisor have at most 3 authors

Citations
  Google Scholar: 350+
  The 2015 GeoSpark paper is the most cited of all 633 papers published in ACM SIGSPATIAL from 2014 to 2019

Internship
  Microsoft Research (database group, the birthplace of SQL Server) - SIGMOD'20
  IBM Almaden Research Center (database group, the birthplace of the relational model, SQL, and DB2) - SIGMOD'19, VLDB'19
  Apple (Maps team, the birthplace of Apple Maps)

Open-source
  All my ASU projects are on GitHub
  Over 1,000 stars and forks
  10,000 monthly downloads

Industry impact
  Users and code contributors of my ASU projects come from major IT companies, such as Uber, MoBike (摩拜单车), and Facebook
  Many companies use my ASU projects in production, for example the Databricks GeoSpark notebook and the Gyana BI dashboard powered by GeoSpark

System hack
  All my ASU projects are implemented in the kernel of widely used data systems
  The GeoSpark cluster computing system is implemented in Apache Spark
  The Hippo index is a built-in index in PostgreSQL 9.6

Talks
  Conference (9 times): VLDB, ICDE, SIGSPATIAL, SSTD, MDM, ApacheCon (Apache Software Foundation annual conference)
  Company (7 times): Microsoft Research, IBM Almaden Research Center, Apple, NVIDIA, State Farm, Vocareum

Referees
  Mohamed Sarwat (ASU)
  Yingjun Wu (Amazon Web Services, former researcher at IBM Almaden Research Center)
  Umar Farooq Minhas (Microsoft Research)
  David Lomet (Microsoft Research, Member of the National Academy of Engineering)