Aggregated 2D Range Queries on Clustered Points

Abstract

Efficient processing of aggregated range queries on two-dimensional grids is a common requirement in information retrieval and data mining systems, for example in Geographic Information Systems and OLAP cubes. We introduce a technique to represent grids supporting aggregated range queries that requires little space when the data points in the grid are clustered, which is common in practice. We show how this general technique can be used to support two important types of aggregated queries, which are ranked range queries and counting range queries. Our experimental evaluation shows that this technique can speed up aggregated queries up to more than an order of magnitude, with a small space overhead.

Source Code

  • Range counting K2-tree: k2tree.tar.gz
  • K2-treap: k2treap.tar.gz
  • Wavelet Tree RMQ (wtrmq): it has been included in libcds, which is maintained by one of the researchers associated to this project (Roberto Konow).

Third-Party Libraries

  • Compact Data Structures Library: libcds.

Publications

A preliminary partial version of this work, in which we study top-k queries, appeared in Proc. SPIRE 2014 (pdf), pp. 215-226, LNCS 8799. The complete journal version, in which we present a general technique for aggregated 2D range queries (pdf), has been accepted for publication in Information Systems (Elsevier).

About us

We are a group of researchers from the University of A Coruña (Spain), University of Chile and University of Concepción (Chile).

Acknowledgements

This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 690941, from Millennium Nucleus Information and Coordination in Networks ICM/FIC P10-024F (Chile), from MINECO (PGE and FEDER) Projects TIN2013-46238-C4-3-R and TIN2013-46801-C4-3-R (Spain), and also from Xunta de Galicia (GRC2013/053) (Spain).