Hippo

Oct 1, 2016

Introduction

Hippo is a fast, yet scalable, database indexing approach. It significantly shrinks index storage and reduces maintenance overhead without compromising much on query execution performance. Instead of storing a pointer to every tuple in the indexed table, Hippo stores disk page ranges, which reduces the storage space occupied by the index. It maintains simplified histograms that represent the data distribution and adopts a page grouping technique that groups contiguous pages into page ranges based on the similarity of their index key attribute distributions. When a query is issued, Hippo uses the page ranges and histogram-based page summaries to identify the pages whose tuples are guaranteed not to satisfy the query predicates, skips them, and inspects only the remaining pages.
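The idea can be illustrated with a small, self-contained Python sketch. It is not the PostgreSQL implementation: the names (Entry, build_index, candidate_pages, range_query), the fixed bucket boundaries, and the max_density threshold used as a stand-in for the similarity criterion are all assumptions made for illustration. Each index entry keeps a page range plus a bitmap of the histogram buckets that its tuples fall into; a range query is turned into a bucket bitmap, and any page range whose bitmap does not overlap it is skipped.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Entry:
    first_page: int
    last_page: int
    buckets: int  # bitmap: bit i is set if some key in the range falls in bucket i


def bucket_of(value, bounds):
    """Return the histogram bucket index for value; bounds are sorted split points."""
    return bisect_right(bounds, value)


def build_index(pages, bounds, max_density):
    """Group contiguous pages into ranges (hypothetical grouping rule).

    A page joins the current range unless the merged bitmap would cover
    more than max_density of all buckets; in that case a new range starts.
    """
    n_buckets = len(bounds) + 1
    entries, current = [], None
    for page_no, keys in enumerate(pages):
        page_bits = 0
        for key in keys:
            page_bits |= 1 << bucket_of(key, bounds)
        if current is None:
            current = Entry(page_no, page_no, page_bits)
            continue
        merged = current.buckets | page_bits
        if bin(merged).count("1") / n_buckets > max_density:
            entries.append(current)
            current = Entry(page_no, page_no, page_bits)
        else:
            current.last_page = page_no
            current.buckets = merged
    if current is not None:
        entries.append(current)
    return entries


def candidate_pages(entries, bounds, low, high):
    """Return pages that may contain keys in [low, high]; all others are skipped."""
    query_bits = 0
    for b in range(bucket_of(low, bounds), bucket_of(high, bounds) + 1):
        query_bits |= 1 << b
    result = []
    for entry in entries:
        if entry.buckets & query_bits:  # overlap: the range cannot be ruled out
            result.extend(range(entry.first_page, entry.last_page + 1))
    return result


def range_query(pages, entries, bounds, low, high):
    """Inspect only the candidate pages and filter their tuples exactly."""
    hits = []
    for page_no in candidate_pages(entries, bounds, low, high):
        hits.extend(k for k in pages[page_no] if low <= k <= high)
    return hits


# Toy table: five pages of integer keys, ten buckets of width 10.
pages = [[1, 2, 3], [2, 4, 5], [40, 41, 45], [42, 48, 49], [90, 95, 99]]
bounds = [10, 20, 30, 40, 50, 60, 70, 80, 90]
entries = build_index(pages, bounds, max_density=0.1)
print([(e.first_page, e.last_page) for e in entries])  # [(0, 1), (2, 3), (4, 4)]
print(candidate_pages(entries, bounds, 40, 49))        # [2, 3]
print(range_query(pages, entries, bounds, 40, 49))     # [40, 41, 45, 42, 48, 49]
```

In this toy table the index holds three entries for five pages rather than one pointer per tuple, and the query for keys between 40 and 49 reads two pages instead of five. Because the bitmaps are summaries, a candidate page may still contain no matching tuple, which is why the final filter re-checks every key.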

Highlight

Publications: PVLDB 2016 (research), ICDE 2017 (demo), SSTD 2017 (research)

Collaborators: Mohamed Sarwat (Arizona State University)

Highlight: Hippo is a PostgreSQL 9.6 built-in index

Source code

I implemented the Hippo index inside the PostgreSQL kernel. The source code is hosted on GitHub: https://github.com/DataSystemsLab/hippo-postgresql

Demo video

I implemented a demo system that uses Hippo-spatial as the backend. The demo video is hosted on YouTube: https://www.youtube.com/watch?v=wWaOK2-9k9A

Presentation

I presented the Hippo index at VLDB 2017. Here is the pre-presentation video: http://www.public.asu.edu/~jiayu2/video/vldb2017-presentation.mp4

Jia Yu

Co-founder

Jia Yu is a co-founder of Wherobots Inc. Jia is the creator of Apache Sedona and was a Tenure-Track Assistant Professor of Computer Science at Washington State University from 2020 to 2023. Jia’s research interests include database systems, distributed data systems, and geospatial data management.

Publications

Indexing the Pickup and Drop-Off Locations of NYC Taxi Trips in PostgreSQL - Lessons from the Road

In this paper, we present our experience in indexing the dropoff and pick-up locations of taxi trips in New York City. The paper …

Jia Yu, Mohamed Sarwat

Hippo in Action: Scalable Indexing of a Billion New York City Taxi Trips and Beyond

The paper demonstrates Hippo, a lightweight database indexing scheme that significantly reduces the storage and maintenance overhead …

Jia Yu, Raha Moraffah, Mohamed Sarwat

Two Birds, One Stone: A Fast, yet Lightweight, Indexing Scheme for Modern Database Systems

Classic database indexes (e.g., B+-Tree), though they speed up queries, suffer from two main drawbacks: (1) An index usually yields 5% to …

Jia Yu, Mohamed Sarwat

Talks

Indexing the Pickup and Drop-off Locations of NYC Taxi Trips in PostgreSQL – Lessons from the Road

Aug 1, 2017, SSTD @ Washington D.C., USA

Two Birds, One Stone - A Fast, yet Lightweight, Indexing Scheme for Modern Database Systems

Aug 1, 2017, VLDB @ Munich, Germany
