Background

Our handling of conflicts of interest (COI) during the review of conference and journal submissions has changed very little, even though the number of submissions and the number of PC members have increased exponentially at several venues in recent times. Each paper’s authors must manually declare all conflicts of interest. Because top-tier conference program committees can easily have hundreds of members, it is prohibitively expensive for PC chairs, who may have thousands of reviews to manage, to double-check the accuracy and completeness of these manual declarations. Nor can reviewers reliably catch unreported conflicts. Hence, a data-driven solution is needed to address the scale, completeness, and accuracy of COI declarations. Such a solution paves the way for a fairer review process, which is essential to any scientific endeavor. The recent CACM article by Snodgrass and Winslett articulates the need to rethink our COI declaration and detection mechanisms. Furthermore, an ACM SIGMOD Blog post on COI management issues is available here.

Overview of CLOSET

CLOSET is a data-driven, scalable solution to the aforementioned challenges. Specifically, it takes as input a list of submissions to a venue (paper id, paper title, and authors), the reviewers and metareviewers assigned to these submissions (reviewer emails), a list of PC/SPC members of the venue (name, institution, and email), and an optional file containing author-specified COI. As output, it generates details of COI violations (if any) between the authors of submissions and their assigned reviewers/metareviewers. CLOSET focuses on COI arising from two scenarios: (a) past coauthorship between authors and reviewers and (b) the same institution of affiliation. When author-specified COI details are available, it also outputs, for each paper, the set of authors (possibly empty) who have unreported COI with any reviewer in the PC (not limited to the assigned reviewers).
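The inputs and outputs described above can be illustrated with a minimal Python sketch. It is not CLOSET's implementation, which is not public. The `Person` record, the three function names, and all sample data are hypothetical. The sketch also assumes that author affiliations and past coauthor pairs have already been looked up and are keyed by email, whereas CLOSET derives such information from bibliographic data sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    # Hypothetical record; fields mirror the PC/SPC input (name, institution, email).
    name: str
    institution: str
    email: str


def detect_coi(author, reviewer, coauthor_pairs):
    """Return the reasons (possibly none) why this author-reviewer pair conflicts."""
    reasons = []
    # Scenario (a): past coauthorship, assumed precomputed as a set of email pairs.
    if frozenset((author.email, reviewer.email)) in coauthor_pairs:
        reasons.append("past coauthorship")
    # Scenario (b): same institution, compared case-insensitively.
    if author.institution.casefold() == reviewer.institution.casefold():
        reasons.append("same institution")
    return reasons


def assignment_violations(papers, assignments, pc, coauthor_pairs):
    """Return (paper_id, author_email, reviewer_email, reasons) for assigned reviewers."""
    violations = []
    for paper_id, authors in papers.items():
        for reviewer_email in assignments.get(paper_id, []):
            reviewer = pc[reviewer_email]
            for author in authors:
                reasons = detect_coi(author, reviewer, coauthor_pairs)
                if reasons:
                    violations.append((paper_id, author.email, reviewer_email, reasons))
    return violations


def unreported_coi_authors(papers, pc, coauthor_pairs, declared):
    """Map each paper id to the authors with an undeclared COI against any PC member."""
    result = {}
    for paper_id, authors in papers.items():
        declared_pairs = declared.get(paper_id, set())
        result[paper_id] = {
            author.email
            for author in authors
            for reviewer in pc.values()
            if detect_coi(author, reviewer, coauthor_pairs)
            and (author.email, reviewer.email) not in declared_pairs
        }
    return result


# Made-up sample data for illustration only.
alice = Person("Alice Tan", "University A", "alice@a.example")
bob = Person("Bob Lee", "University B", "bob@b.example")
carol = Person("Carol Ng", "University A", "carol@a.example")
dave = Person("Dave Roy", "University C", "dave@c.example")

papers = {"p1": [alice, bob]}
pc = {carol.email: carol, dave.email: dave}
assignments = {"p1": [dave.email]}
coauthor_pairs = {frozenset((bob.email, dave.email))}
declared = {"p1": {(bob.email, dave.email)}}

print(assignment_violations(papers, assignments, pc, coauthor_pairs))
print(unreported_coi_authors(papers, pc, coauthor_pairs, declared))
```

In this sample, Bob's assignment to Dave is flagged as a violation because they are past coauthors. Alice is reported as having an unreported COI because she shares an institution with PC member Carol and did not declare it, even though Carol is not assigned to her paper.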

Note that CLOSET does not ingest any information about the content of a paper beyond its title. It requires neither the review comments and ratings given by reviewers nor any additional details from authors (e.g., Google Scholar page, names of collaborators). CLOSET is designed on the principle that we should not demand inputs from authors and reviewers beyond those they typically provide for any conference. Consequently, it does not require any changes to how authors and reviewers interact with a review management system. It is also orthogonal to the reviewer assignment process (manual, automatic, or TPMS-based). That is, it does not depend on how reviewers are assigned to a paper.

CLOSET is implemented using Python and a relational database system. Under the hood, it exploits multiple bibliographic data sources and implements novel homonymy resolution (i.e., name disambiguation), indexing, and pruning strategies to make COI detection accurate, efficient, and scalable. It can identify, with high accuracy, the correct author among several who share an identical name. It also includes a data cleaning framework and a name matching module that is cognizant of the ways authors typically specify their names, which helps generate highly accurate results. For instance, CLOSET can detect unreported COI that are missed by an exact name-based COI search technique. At the same time, it avoids the false positives that may result from very generic regular expression-based name matching. The output of CLOSET is presented in visual and structured text formats.
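To illustrate the middle ground between exact name search and very generic pattern matching, here is a small Python sketch of a name matcher. It is an assumption-laden illustration rather than CLOSET's actual name matching module. The functions `normalize`, `token_match`, and `names_match` are hypothetical, and the rules (accent stripping, "Family, Given" reordering, and initial expansion) are just a few common ways authors write their names.

```python
import unicodedata


def normalize(name):
    """Return lowercase, accent-free tokens in 'given ... family' order."""
    # Reorder "Family, Given" into "Given Family".
    if "," in name:
        family, given = name.split(",", 1)
        name = f"{given} {family}"
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    tokens = (tok.strip(".") for tok in plain.lower().replace("-", " ").split())
    return [tok for tok in tokens if tok]


def token_match(a, b):
    """Tokens match if equal, or if one is a single-letter initial of the other."""
    if a == b:
        return True
    short, long_ = sorted((a, b), key=len)
    return len(short) == 1 and long_.startswith(short)


def names_match(name_a, name_b):
    """Require identical family names and compatible first given names.

    Middle names are ignored in this simplified sketch.
    """
    a, b = normalize(name_a), normalize(name_b)
    if len(a) < 2 or len(b) < 2:
        return False
    if a[-1] != b[-1]:
        return False
    return token_match(a[0], b[0])


print(names_match("J. Smith", "John Smith"))
print(names_match("Smith, John", "John Smith"))
print(names_match("José García", "Jose Garcia"))
print(names_match("Jane Smith", "John Smith"))
```

An exact string comparison would miss the first three pairs, while a loose pattern that accepts any name ending in "Smith" would wrongly accept the fourth. A matcher of this kind still cannot tell apart two different people who share the same name, which is why a separate disambiguation step is needed.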

Currently, CLOSET supports any venue hosted on Microsoft's CMT or EasyChair. The framework is easily extensible to support other review management systems.

CLOSET is available as a service where a venue chair gives us access to the aforementioned information and we generate the COI details using it. In order to protect the security and privacy of COI detection code, CLOSET is not publicly downloadable.

CLOSET can also be used to audit reviewer assignment data from past editions of conferences (i.e., post-facto). It also has various other features, such as bid analytics, publication profiles of the PC, network analysis of the PC, and diversity analysis of the PC.

Variants of CLOSET

Mini CLOSET: This variant is designed for venues that cannot release the paper assignment details to external parties (e.g., double-blind venues). Mini CLOSET is a "smaller" version of CLOSET that takes as input a list of authors in a venue (name, affiliation, email) and a list of PC/SPC members (name, institution, and email). It generates reports on co-authorship relationships (if any) between all reviewer-author pairs. Note that it does not flag unreported COIs as paper assignment information is not available. PC chairs can use the output of Mini CLOSET as a guide to assign reviewers to submissions.

J-CLOSET: This variant is designed for journal submissions, to help Associate Editors (AEs) assign COI-free reviewers. It is a lightweight version of CLOSET that takes as input a list of the authors of a paper submitted to a journal (name, email) and a list of candidate reviewers (name, institution, and email). It generates reports on any co-authorship relationships between the authors and the candidate reviewers. AEs can use the output of J-CLOSET to decide whether a candidate reviewer should be invited to review a submission.

User Base

CLOSET and its variants have been deployed in the following non-exhaustive list of venues to detect COIs:

  • ACM SIGMOD: 2021, 2022
  • VLDB: 2021, 2022
  • ACM CIKM: 2020
  • IEEE ICDE: 2020 (post-facto), 2021 (post-facto)
  • ACM India CODS-COMAD: 2021, 2022

Contact Us

If you are interested in using CLOSET to identify COIs between authors and reviewers, please contact me at assourav@ntu.edu.sg.