Overview

The digital world is experiencing a staggering volume of complex-structured data in the form of networks, images, and text. This has led to the information overload problem: the deluge of data from various sources comes at a cost. End users may find such data overwhelming in its raw form, as analyzing and interpreting it is extremely difficult and time-consuming. For example, a major challenge for biologists is to make sense of the intertwined hairball of information contained in large biological networks. In this research, we explore efficient and novel techniques to summarize a variety of complex-structured data, either in its raw form or as the results of queries posed over it. According to the Oxford Dictionary, a summary is "A brief statement or account of the main points of something". Hence, in our research we search for techniques that can automatically describe the main points of complex-structured data such as graphs, social tweets, and social images.

Our research results have appeared in premier venues such as ACM BCB, ACM SIGIR, ACM WWW, IEEE TKDE, and VLDB. Our research on biological network summarization was published as a book entitled "Summarizing Biological Networks" by Springer-Verlag in its Computational Biology Series (May 2017). Some of our results were also presented in a tutorial at VLDB 2017.

This research is partially funded by an SMA-CSB grant.



Key Achievements

  • FUSE (BMC Bioinformatics 2012, ACM BCB 2011) is the world's first algorithm that automatically discovers main points (functional summaries) from protein-protein interaction networks.
  • DiffNet (Methods 2014) is the world's first technique to automatically construct differential functional summaries from a pair of genetic interaction networks.
  • PRISM (ACM SIGIR 2014, VLDB 2015, WWW 2014) is the world's first framework for generating high-quality summaries of top-k social image search results.
  • TOTEM (ACM SIGIR 2017, JASIST 2019) is the world's first system for summarizing a user's recent personal tweets on mobile devices.





Publications

The list of publications related to this project can be found on ResearchGate.

Aspirations

DANTE


DANTE is the data management research group at NTU. It was formed in 2009 and comprises three faculty members from SCSE. The group works on a variety of data management challenges, including summarization of complex data.

Contact

Sourav S Bhowmick

assourav@ntu.edu.sg

+65 67904320