Overview

Querying and analyzing massive networks is important for many applications, such as bioinformatics and social network analysis. Over the last decade, many efforts have focused on building distributed frameworks to query and analyze such networks. However, this approach has a high capital cost. In fact, a framework with more than 100 servers with 128 GB of memory can easily cost more than USD 300,000 and consume over 25 kW of electricity. Furthermore, there is an additional performance cost due to the interconnection of the machines. Consequently, in this research, we explore the feasibility of graph querying and analytics over massive networks on a single machine. That is, we seek to answer the following question: How far can we push a single machine to process massive graphs by judicious management of disk and memory space? Our initial research shows that certain graph query and analytics problems can indeed be processed on billion-node networks on a single commodity machine.

We can't just consume our way to a more sustainable world.

- Jennifer Nini

Our research has appeared in premier venues such as SIGMOD, VLDB, VLDB Journal, and TKDE.


Key Achievements

  • PANDA (VLDB J 2017, VLDB 2018) is the world's first algorithm that supports efficient evaluation of partial topology-based search. Unlike existing approaches, it enables users to search networks on a single machine based on partial topological knowledge of their queries.
  • DUALSIM (ACM SIGMOD 2016) is the world's first technique that can efficiently process subgraph enumeration on billion-node networks on a single machine. It even outperforms several existing distributed frameworks designed for subgraph enumeration.
  • TEA+ (ACM SIGMOD 2019) is a novel heat kernel-based local clustering technique that efficiently produces high-quality clusters on billion-edge networks.
  • G-CARE (ACM SIGMOD 2020) is the world's first benchmarking platform for cardinality estimation of subgraph matching queries.
  • PANE (VLDB 2021) and NRP (VLDB 2020) are among the world's most scalable network embedding techniques.



Awards

  • 2022 ACM SIGMOD Research Highlight Award for PANE.
  • Best Research Paper Award (VLDB 2021) for PANE.



Publications

The list of publications related to this project can be found here.