GLIN: A (G)eneric (L)earned (In)dexing Mechanism for Complex Geometries

Abstract

Although spatial indexes shorten query response time, they rely on complex tree structures to narrow down the search space. Such structures in turn add storage overhead and take a toll on index maintenance. Recently, there has been a flurry of efforts to leverage machine learning (ML) models to simplify index structures. However, existing learned geospatial indexes can only index point data, not complex geometries such as polygons and trajectories, which are widely available in geospatial data. As a result, they cannot efficiently and correctly answer geometry relationship queries. This paper introduces GLIN, an indexing mechanism for spatial relationship queries on complex geometries. To achieve that, GLIN transforms geometries into Z-address intervals, and then harnesses an existing order-preserving learned index to model the cumulative distribution function between these intervals and the record positions. The lightweight learned index greatly reduces indexing overhead and provides faster or comparable query latency. Most importantly, GLIN augments spatial query windows so that queries on common spatial relationships are answered exactly. Our experiments on real-world and synthetic datasets show that GLIN has 80% - 90% lower storage overhead than Quad-Tree and 60% - 80% lower than R-tree, and answers queries 30% - 70% faster at medium selectivity. Moreover, GLIN’s maintenance throughput is 1.5 times higher on insertion and 3 - 5 times higher on deletion.
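The geometry-to-interval step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that each geometry is summarized by its minimum bounding rectangle (MBR) on a non-negative integer grid, that the Z-address interval runs from the Morton code of the MBR's lower-left corner to that of its upper-right corner, and that a sorted list with binary search stands in for the learned index. The names `interleave_bits`, `z_interval`, and `candidates` are hypothetical.

```python
from bisect import bisect_right

def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Morton (Z-order) code: x bits go to even positions, y bits to odd ones."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def z_interval(mbr):
    """Map an MBR (xmin, ymin, xmax, ymax) to a (zmin, zmax) interval.

    Assumption: the interval is bounded by the Morton codes of the
    lower-left and upper-right corners. Because the Morton code is
    monotonic in each coordinate, every cell inside the MBR has a
    code within this interval.
    """
    xmin, ymin, xmax, ymax = mbr
    return interleave_bits(xmin, ymin), interleave_bits(xmax, ymax)

def candidates(records, window):
    """Return ids whose Z-interval overlaps the query window's Z-interval.

    `records` is a list of (zmin, zmax, id) tuples sorted by zmin.
    Binary search stands in for the learned model's position lookup.
    The result is a superset of the true answer: each candidate still
    needs an exact geometry test against the window.
    """
    qmin, qmax = z_interval(window)
    keys = [r[0] for r in records]
    hi = bisect_right(keys, qmax)  # records past hi start after the window ends
    return [rid for zmin, zmax, rid in records[:hi] if zmax >= qmin]

# Two hypothetical geometries, given only by their MBRs.
records = sorted(
    [(*z_interval((1, 1, 2, 2)), "A"), (*z_interval((6, 6, 7, 7)), "B")]
)
result = candidates(records, (0, 0, 3, 3))
```

In this sketch the window (0, 0, 3, 3) keeps "A" as a candidate and prunes "B". The scan over `records[:hi]` is linear for clarity only; how GLIN augments the query window to avoid such a scan is described in the paper and is not reproduced here.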

Publication
In ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data
