Sourav S Bhowmick is an Associate Professor in the College of Computing & Data Science (CCDS), Nanyang Technological University, Singapore. He is the Program Director of the newly launched Master of Science in Data Science (MSDS) program at NTU. He is the founder of the Data Management Research Group at NTU (DANTE). He is also the Research Group Lead of the Data Management & Analytics Group in SCSE. Sourav is currently a "Huashan Talent" Visiting Professor at Xidian University, China. In the past, he was a Visiting Associate Professor (2007-2013) at the Biological Engineering Division, Massachusetts Institute of Technology (MIT), an Adjunct Associate Professor (2021-2023) at the Department of Computer Science, Hong Kong Baptist University (HKBU), and a Senior Visiting Professor (2013) at Fudan University. He is an affiliate member of the Centre for Research and Development in Learning (CRADLE).
Sourav’s core research expertise is in data management, human-data interaction, and data analytics. He is most excited by problems outside the mainstream of these fields that require new perspectives rather than yet another solution to a traditional problem. Consequently, his research interests are user-centric, multi-area (e.g., data management and HCI, data management and IR), and multi-disciplinary (e.g., the marriage of data-centric computing with psychology, abstract art, systems biology, or education) in flavor, focusing primarily on developing novel paradigms, algorithms, and technology to improve the efficiency, scalability, fairness, or usability of data-centric software. He has received over S$4M in research grants (over S$3M as PI) from the Singapore-MIT Alliance (SMA) and the Ministry of Education (MOE) (4 Tier 2s and 8 Tier 1s). He has published more than 100 papers in top-tier data management, data mining, multimedia, bioinformatics, and systems biology conferences and journals such as ACM SIGMOD, VLDB, ACM MM, ACM SIGIR, VLDB Journal, and Bioinformatics. The common thread running through his research is a focus on going beyond papers to build usable novel systems and prototypes.
Sourav was inducted as a Distinguished Member of the ACM in 2020 for "outstanding scientific contributions to computing". His key research contributions are summarized as follows.
Human-Data Interaction: He and his group pioneered the vision of bridging HCI and data management to build usable and high-performance visual querying systems. Specifically, they were the first to propose a novel query processing paradigm that blends visual query formulation and query processing to turbo-charge system response time by exploiting the latency offered by visual interfaces. This paradigm was realized in the context of XML and graph data. Subsequently, they proposed various novel techniques for visual query feedback and query processing. They also pioneered data-driven construction of visual query interfaces. The results of this research were published in SIGMOD, CIDR, VLDB, ICDE, CIKM, CACM, VLDB Journal, and IEEE TKDE. Their research has influenced downstream research in visual querying. His research on human-graph interaction has been published as two books entitled "Human Interaction with Graphs: A Visual Querying Perspective" and "Plug-and-Play Visual Subgraph Query Interfaces". Sourav's research in bridging HCI and graph data management was selected for the special issue of CACM highlighting cutting-edge research and innovation in East Asia and Oceania. His group's work on plug-and-play SQL received the Best Student Paper Award at ER 2023.
Graph Data Management: He and his collaborators have contributed several novel techniques for efficient and scalable graph query processing and analytics on a single machine. Several of these works were published in SIGMOD, VLDB, ICDE, VLDB Journal, and TKDE. Specifically, they were the first to propose a single-machine solution to the subgraph enumeration problem, called DualSim (SIGMOD 2016), that outperformed distributed variants at the time of publication. They also addressed a long-standing usability challenge of subgraph search formulation by proposing a novel query paradigm called partial topology query (VLDB J 2017, VLDB 2018). They undertook a comprehensive study on the cardinality estimation problem for subgraph matching queries and developed the world's first framework, called G-CARE (SIGMOD 2020), to benchmark cardinality estimation techniques for these queries. The study unveiled intriguing and unexpected findings that they believe will shape future research on graph cardinality estimation. Recently, they also invented a suite of techniques for scaling network embedding (VLDB 2020, VLDB 2021). In particular, the work on scaling attributed network embedding received the Best Research Paper Award at VLDB 2021 and the 2022 ACM SIGMOD Research Highlights Award.
Technology-Enabled Learning: Sourav and his collaborators have recently been exploring how data-driven techniques can supplement the learning of database systems. To this end, they have proposed several novel frameworks at the intersection of data management, NLP, and psychology to support learners taking database systems courses. These techniques have been published in SIGMOD, VLDB, and SIGCSE TS. A vision paper on this topic has appeared in the IEEE Data Engineering Bulletin.
Social Search and Analytics: Sourav and his collaborators proposed several novel analytics and retrieval techniques for social images and social networks. Some of these works were published in SIGMOD, VLDB, SIGIR, WWW, ACM MM, VLDB Journal, DMKD, and EDBT. In particular, his group transferred ideas from the data management and analytics domain to social image retrieval, resulting in the creation of a why-not question answering framework for social images (ACM MM 2013, WWW 2014) and a result summarization system for social image search (ACM SIGIR 2014, VLDB 2015). His group pioneered the bridging of social psychology with the online social influence problem by bringing conformity of users into influence models and algorithms (ACM CIKM 2011, EDBT 2013, VLDB J 2015, ACM SIGMOD 2020). His work on influence maximization in competitive networks received a Best Paper Award nomination at ACM SIGMOD 2015. This body of research has influenced several downstream works in the social space.
Semistructured Data Management: He and his collaborators have contributed several novel techniques related to XML storage and query processing, XML schema integration, XML change management, and, more recently, XML keyword search and XML usability. Some of these works were published in ICDE, SIGMOD, VLDB, CIKM, DASFAA, VLDB Journal, and DKE.
Evolution/Change Analytics: His research team was the first to undertake a systematic study on mining the structural evolution of tree-structured data. This work received the Best Interdisciplinary Paper Award at ACM CIKM 2004. Subsequently, they proposed solutions to a series of novel problems related to mining the evolution of tree- and graph-structured data. Some of these works were published in ACM WWW, ACM SIGKDD, ACM CIKM, VLDB, and ICDE.
Network Analytics Meets Systems Biology: His team has proposed several novel techniques for biological network analytics, such as generating high-quality functional summaries of protein interaction networks as well as multi-faceted views of these networks to improve their understanding. Each of these techniques is a world first in the field of bioinformatics and systems biology. They also proposed a novel network-driven, in silico framework for identifying synergistic target combinations from signaling networks by analyzing their topology and dynamics. The results of this work were published in IEEE TKDE, IEEE ICDM, ACM BCB, BMC Bioinformatics, Methods, and the Bioinformatics journal. Specifically, the work related to functional summaries received the Best Paper Award at ACM BCB 2011. His research on biological network summarization was published as a book entitled "Summarizing Biological Networks" by Springer-Verlag, Computational Biology Series (May 2017).
Sourav's views on human-data interaction and review fairness can be found in his 2021 SIGMOD Record interview (Distinguished Profiles column).
Sourav's research has been deployed in the real world. His XML data management framework for biological data (query engine and bioinformatics workflow) was part of the product of a local startup, HeliXense Pte Ltd, in 2002 (the technology framework was published in VLDB 2002, CIKM 2003, and ICDE 2003). More recently, his data-driven Conflicts of Interest (COI) detection system, called CLOSET, has been deployed in more than 25 venues, including several premium conferences such as SIGMOD, VLDB, and LICS. It is also interfaced with Microsoft's CMT. It has influenced changes to longstanding COI policies and management in premium data management venues. It was selected for the special issue of CACM highlighting cutting-edge research and innovation in East Asia and Oceania. His technology-enabled database education systems, NEURON, LANTERN, and ARENA, have been used by more than a thousand learners (end users) taking database system courses in more than 75 institutions (2019-present). Surveys of learners as well as analysis of academic outcomes show that learners benefit from these systems in learning relational query processing.
Sourav regularly serves as a reviewer for data management, data mining, and bioinformatics conferences (e.g., SIGMOD, VLDB) and journals (e.g., ACM TODS, VLDB Journal, Bioinformatics). He has served or is serving as a program chair or co-chair of several international conferences such as EDBT 2023, CODS-COMAD 2022, ACM CIKM 2020, IEEE BigComp 2018, DASFAA 2014, and DEXA 2009 and 2008. He has also served or is serving as Group Leader, Associate Editor, or Area Chair for VLDB, SIGMOD, ICDE, and DASFAA. He is a member of the steering committee of DASFAA. He is serving as a member of the SIGMOD Executive Committee (2021-2025), SIGMOD Awards Committee (2024-2028), and PVLDB Advisory Board (2021-present). He is an elected trustee of the VLDB Endowment (2024-2029). He has also served as a co-lead of the committee for Diversity and Inclusion in Database Conference Venues (2021-2023). Sourav has been a panelist, tutorial speaker, and keynote speaker at several international conferences (including SIGMOD and VLDB). He has also been a reviewer of external Ph.D. dissertations, national and international grant proposals, and external tenure applications. He is or was a member of the editorial boards of several international journals (e.g., ACM PACMMOD, PVLDB, IEEE TKDE, JASIST, SIGMOD Record). Sourav is a member of ACM, ACM SIGMOD, and ACM SIGBio.
Sourav is a co-recipient of the VLDB Service Award in 2018 from the VLDB Endowment for his contribution in designing an efficient PVLDB proceedings management framework. He was conferred the Distinguished Reviewer Award for "outstanding services to VLDB 2020 and data management community" and again at VLDB 2023. He also received the Distinguished Associate Editor Award at ACM SIGMOD 2021, ACM SIGMOD 2023, and VLDB 2022. In 2022, he received the SCSE Faculty Award for leadership and contributions to SCSE.
Sourav has a strong interest in and passion for teaching undergraduate and graduate students. He has been involved in teaching both small (fewer than 10 students) and large (500+ students) classes. He has taught and developed the course contents of various undergraduate and graduate level courses. He was nominated for the Excellence in Teaching Award in 2003, 2004, 2005, and 2020. He was the recipient of the Lecturer of the Year Award (2002-2003) for a Year 1 undergraduate course. Sourav has been invited twice (2011 and 2013) by the Database Society of Japan to mentor graduate students in Japan.
Last but not least, Sourav is an avid visual artist. His interest lies in cognitive art - paintings and drawings that provoke us to think about the lives we lead, the things we do, and how they impact our society and planet. He has given public exhibitions of his artworks in Singapore such as Voice of Art 1 and Blowin' in the Wind (on the theme of the refugee crisis). Some of his artworks have been sold for social causes such as contributions to elderly care and children suffering from rare diseases. More details of his artworks are available here.
Sourav received his Ph.D. in computer engineering in 2001. His Ph.D. thesis was published as a book entitled "Web Data Management: A Warehouse Approach" (Springer-Verlag, October 2003).