index.html

Data-driven sciences are widely regarded as the fourth paradigm of sciences that will fundamentally change the society and our everyday lives. Indeed, artificial intelligence (AI) models have already revolutionized and transformed various data-intensive industries. Machine learning and deep learning models have achieved unprecedented extraordinary performance for image, text, audio, video, and network data analysis. The great successes are mainly due to three reasons, i.e., accumulation of the gigantic amount of data, ever-increasing computational power, and design of the highly efficient algorithms. Further, the remarkable achievement of AlphaFold2 for protein folding problem has ushered in a new era for AI-based molecular data analysis for materials, chemistry, and biology.

With the excitement and opportunities come challenges. Currently, one of the central challenges for AI-based molecular data analysis is molecular representation, which is to identify or design appropriate molecular descriptors or fingerprints. Proper descriptors should preserve the most important and intrinsic molecular properties and information that directly determinate molecular functions. In this way, they can be better “understood” by machine learning models. In fact, the performance of many learning methods is heavily dependent on the choice of data representation and featurization, which is a long-standing issue for cheminformatics and bioinformatics. Traditional molecular descriptors are properties obtained from structural geometry/topology, chemical conformation, chemical graph, as well as molecular formula, hydrophobicity, steric properties, and electronic properties. These descriptors are widely used in the quantitative structure-activity relationship (QSAR) and learning models.

Mathematical AI for Molecular Sciences (IA matemática para ciencias moleculares, Matematička Ai Za Molekularne Znanosti) is proposed for molecular representation, featurization, and learning. As illustrated above, various types of data, in particular, molecular data from materials, chemistry, and biology, can be represented using topological models, including graphs, simplicial complexes, hypergraphs, etc. From these representations, various mathematical invariants are obtained by using advance mathematical models from algebraic topology, discrete geometry, combinatorics, etc. These mathematical invariants are used as input features for learning models. Dramatically different from previous models, molecular data are modelled using higher-dimensional topologies, such as simplicial complexes and hypergraphs, and filtration-induced multiscale representations. Further, mathematical invariant-based features characterize the most intrinsic and fundamental properties and have a better transferability for learning models.

A brief introduction of the area can be found in 2021 winter school lectures at Dalian, AATRN talk, report in Chinese for the series of talks on "Math for AI & AI for math", and Prof.Guowei Wei's works (SIAM news, Harvard talk, D3R news). We sincerely welcome highly motivated students and postdocs to join our group!

Publication list

98. Xinyu You, Xiang Liu, Chuan-Shen Hu, Kelin Xia, and Tze Chien Sum. "Quotient-complex transformer for perovskite data analysis." Cell Reports Physical Science 7, no. 4 (2026).
97. Aida Abiad, Alex Arenas, Agnes Backhausz, Jozsef Balogh, Christopher RS Banerji, Sergio Barbarossa, Ginestra Bianconi et al. "Hypergraphs and simplicial complexes in focus: A roadmap for future research in higher-order interactions." Journal of Physics: Complexity (2026).
96. Cong Shen, Yipeng Zhang, Tze Kwang Gerald Er, Fei Han, Atsushi Goto, and Kelin Xia, "Molecular Topological Deep Learning for Polymer Property Prediction", ACS Nano, 20(1), 288-299 (2026)
95. Kelin Xia. "The Hodge Laplacian advances inference of single-cell trajectories", Nature Methods (2025): 1-2. (News & Views)
94. Zetian Mao, Chuan-Shen Hu, Jiawen Li, Chen Liang, Diptesh Das, Masato Sumita, Kelin Xia, and Koji Tsuda. "Molecule graph networks with many-body equivariant interactions." Journal of Chemical Theory and Computation, 21 (16), 7954-7966 (2025)
93. Longlong Li, Yipeng Zhang, Guanghui Wang, and Kelin Xia. "Kolmogorov–Arnold graph neural networks for molecular property prediction." Nature Machine Intelligence 7, 1346–1354 (2025) (ScienceAI, DrugAI, PHAIMUS)
92. Liang Huang, Benedict Lee, Daniel Hui Loong Ng, and Kelin Xia. "Simplicial Convolutional Networks for Inductive Short Text Classification." Procedia Computer Science, 264: 137-146 (2025)
91. Joshua Zhi En Tan, JunJie Wee, Xue Gong, and Kelin Xia. "Topology-enhanced machine learning model (Top-ML) for anticancer peptide prediction." Journal of Chemical Information and Modeling, 65(8): 4232-4242 (2025)
90. JunJie Wee, Xue Gong, Wilderich Tuschmann, and Kelin Xia. "A cohomology-based Gromov–Hausdorff metric approach for quantifying molecular similarity." Scientific Reports, 15(1):10458 (2025)
89. Yaxing Wang, Xiang Liu, Yipeng Zhang, Xiangjun Wang, and Kelin Xia. "Join Persistent Homology (JPH)-Based Machine Learning for Metalloprotein–Ligand Binding Affinity Prediction." Journal of Chemical Information and Modeling, 65, 2785−2793 (2025).
88. Bingqing Han, Yipeng Zhang, Longlong Li, Xinqi Gong, Kelin Xia. "TopoQA: a topological deep learning-based approach for protein complex structure interface quality assessment." Briefings in Bioinformatics, 26(2): bbaf083 (2025)
87. Cong Shen, Xiang Liu, Jiawei Luo, and Kelin Xia. "Torsion Graph Neural Networks." Transactions on Pattern Analysis and Machine Intelligence, 47(4): 2946-2956 (2025)
86. Chuan-Shen Hu, Rishikanta Mayengbam, Kelin Xia, and Tze Chien Sum. "Quotient Complex (QC)-Based Machine Learning for 2D Hybrid Perovskite Design." Journal of Chemical Information and Modeling, 65 (2), 660-671(2025).
85. Yipeng Zhang, Cong Shen, and Kelin Xia. "Multi-Cover Persistence (MCP)-based machine learning for polymer property prediction." Briefings in Bioinformatics, 25(6): bbae465 (2024)
84. Chuan-Shen Hu, Rishikanta Mayengbam, Min-Chun Wu, Kelin Xia, and Tze Chien Sum. "Geometric data analysis-based machine learning for two-dimensional perovskite design." Communications Materials 5(1): 106 (2024)
83. Xiang Liu, Huitao Feng, Jie Wu, and Kelin Xia. "Computing hypergraph homology." Foundations of Data Science 6(2): 172-194 (2024)
82. Cong Shen, Pingjian Ding, Junjie Wee, Jialin Bi, Jiawei Luo, and Kelin Xia. "Curvature-enhanced graph convolutional network for biomolecular interaction prediction." Computational and Structural Biotechnology Journal, 23, 1016-1025 (2024)
81. JunJie Wee, Jiahui Chen, Kelin Xia, and Guo-Wei Wei. "Integration of persistent Laplacian and pre-trained transformer for protein solubility changes upon mutation." Computers in Biology and Medicine, 169: 107918 (2024)
80. Siyan Deng, Chao Chen, Ke Li, Xi Chen, Kelin Xia, and Shuzhou Li. "Structure-based multilevel descriptors for high-throughput screening of elastomers." The Journal of Physical Chemistry B, 127(46): 10077-10087 (2023)
79. Cong Shen, Jiawei Luo, and Kelin Xia. "Molecular geometric deep learning." Cell Reports Methods, 3, 100621 (2023) DrugAI Report
78. Bi, Jialin, JunJie Wee, Xiang Liu, Cunquan Qu, Guanghui Wang, and Kelin Xia. "Multiscale topological indices for the quantitative prediction of SARS CoV-2 binding affinity change upon mutations." Journal of Chemical Information and Modeling, 63(13): 4216-4227 (2023)
77. Wee, JunJie, Ginestra Bianconi, and Kelin Xia. "Persistent Dirac for molecular representation." Scientific Report, 13, 11183 (2023)
76. D. Vijay Anand, Ronald Koh Joon Wei, and Kelin Xia. "Coarse-grained models for vault normal model analysis." In Protein Cages: Design, Structure, and Applications, pp. 307-318. New York, NY: Springer US (2023)
75. Hou Yee Choo, JunJie Wee, Cong Shen, and Kelin Xia. "Fingerprint-enhanced graph attention network (FinGAT) model for antibiotic discovery." Journal of Chemical Information and Modeling, 63(10), 2928–2935 (2023) DrugAI Report
74. Peter Tsung-Wen Yen, Kelin Xia, and Siew Ann Cheong. "Laplacian spectra of persistent structures in Taiwan, Singapore, and US stock markets." Entropy, 25 (6): 846 (2023)
73. Kein Xia, Xiang Liu, and JunJie Wee. "Persistent homology for RNA data analysis." In Homology Modeling: Methods and Protocols, 211-229. New York, NY: Springer US (2023)
72. Xiang Liu, Huitao Feng, Zhi Lü, and Kelin Xia. "Persistent Tor-algebra for protein–protein interaction analysis." Briefings in Bioinformatics, 24(2), bbad046 (2023)
71. Jian Liu, Kein Xia, Jie Wu, Stephen Shing-Toung Yau, and Guo-Wei Wei. "Biomolecular topology: modelling and analysis." Acta Mathematica Sinica, English Series 38(10), 1901-1938 (2022)
70. D. Vijay Anand, Qiang Xu, Junjie Wee, Kelin Xia, and Tze Chien Sum, "Topological feature engineering for machine learning based halide perovskite materials design", npj Computational Materials, 8 (203) (2022) Science@NTU
69. Leong, Yong Xiang, Emily Xi Tan, Shi Xuan Leong, Charlynn Sher Lin Koh, Lam Bang Thanh Nguyen, Jaslyn Ru Ting Chen, Kelin Xia, and Xing Yi Ling, "Where nanosensors meet machine learning: prospects and challenges in detecting disease X", ACS nano, 16 (9), 13279-13293 (2022)
68. Xiang Liu, Huitao Feng, Jie Wu, and Kelin Xia, "Hom-complex-based machine learning (HCML) for the prediction of protein–protein binding affinity changes upon mutation", Journal of Chemical Information and Modeling, 62 (17), 3961-3969 (2022) ComputArt Report
67. Xiao-Shuang Li, Xiang Liu, Le Lu, Xian-Sheng Hua, Ying Chi, and Kelin Xia. "Multiphysical graph neural network (MP-GNN) for COVID-19 drug design." Briefings in Bioinformatics, bbac231 (2022) DrugAI Report Supplementary_information-version
66. Ronald Koh Joon Wei, Junjie Wee, Valerie Evangelin Laurent, and Kelin Xia, "Hodge theory-based biomolecular data analysis", Scientific Report, 12(1), 1-16 (2022)
65. Weikang Gong, JunJie Wee, Min-Chun Wu, Xiaohan Sun, Chunhua Li, and Kelin Xia, "Persistent spectral simplicial complex-based machine learning for chromosomal structural analysis in cellular differentiation", Briefings in Bioinformatics, bbac168 (2022)
64. Jelena Grbic, Jie Wu, Kelin Xia and Guowei Wei, "Aspects of topological approaches for data science." Foundations of Data Science, 4(2), 165 (2022)
63. Xiang Liu, Huitao Feng, Jie Wu, and Kelin Xia, "Dowker complex based machine learning (DCML) models for protein-ligand binding affinity prediction." PLOS Computational Biology, 18(4), e1009943 (2022)
62. Chi Seng Pun, Si Xian Lee, and Kelin Xia, "Persistent-homology-based machine learning: a survey and a comparative study." Artificial Intelligence Review, 55, 5169–5213, (2022) Data & Codes
61. JunJie Wee and Kelin Xia, "Persistent spectral based ensemble learning (PerSpect-EL) for protein-protein binding affinity prediction." Briefings In Bioinformatics, 23(2), bbac024 (2022)
60. Jiajie Peng, Jinjin Yang, D. Vijay Anand, Xuequn Shang, and Kelin Xia, "Flexibility and rigidity index for chromosome packing, flexibility and dynamics analysis." Frontiers of Computer Science, 16 (4), 1-11 (2022)
59. Peiran Jiang, Ying Chi, Xiao-Shuang Li, Xiang Liu, Xian-Sheng Hua, and Kelin Xia, "Molecular persistent spectral image (Mol-PSI) representation for machine learning models in drug design." Briefings in Bioinformatics, 23 (1), bbab527 (2022)
58. Jinghao Peng, Jiajie Peng, Haiyin Piao, Zhang Luo, Kelin Xia, and Xuequn Shang, "Predicting chromosome flexibility from the genomic sequence based on deep learning neural networks." Current Bioinformatics 16 (10), 1311-1319 (2021)
57. Jayanth Kumar Narayana, Micheál Mac Aogáin, Wilson Wen Bin Goh, Kelin Xia, Krasimira Tsaneva-Atanasova, and Sanjay H. Chotirmall, "Mathematical-based microbiome analytics for clinical translation." Computational and Structural Biotechnology Journal, 19, 6272-6281 (2021)
56. Peter Tsung-Wen Yen, Kelin Xia, and Siew Ann Cheong, "Understanding changes in the topology and geometry of financial market correlations during a market crash." Entropy, 23 (9), 1211 (2021)
55. Zhenyu Meng and Kelin Xia, "Persistent spectral–based machine learning (PerSpect ML) for protein-ligand binding affinity prediction." Science Advances, 7 (19), eabc5329 (2021)
54. JunJie Wee and Kelin Xia, "Forman persistent Ricci curvature (FPRC) based machine learning models for protein–ligand binding affinity prediction." Briefings In Bioinformatics, 22 (6), bbab136 (2021)
53. Xiang Liu, Huitao Feng, Jie Wu, and Kelin Xia, "Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction." Briefings In Bioinformatics, 22 (5), bbab127 (2021)
52. JunJie Wee and Kelin Xia, "Ollivier persistent Ricci curvature-based machine learning for protein-ligand binding affinity prediction." Journal of Chemical Information and Modeling, 61 (4), 1617-1626 (2021)
51. Duan Chen, Shaoyu Li, Xue Wang, and Kelin Xia, "Fast random algorithms for manifold based optimization in reconstructing 3D chromosomal structures." Communications in Information and Systems, 21 (1), 1-29 (2021)
50. Xiang Liu, Xiangjun Wang, Jie Wu, and Kelin Xia, "Hypergraph based persistent cohomology (HPC) for molecular representations in drug design." Briefings In Bioinformatics, 22 (5), bbaa411 (2021)
49. Jinyin Zha, Yuwei Zhang, Kelin Xia, and Fei Xia, "Coarse-grained simulation of mechanical properties of single microtubules with micrometer length." Frontiers in Molecular Biosciences, 7, 517 (2020)
48. Chi Seng Pun, Brandon Yung Sin Yong, and Kelin Xia, "Weighted-persistent-homology-based machine learning for RNA flexibility analysis." PLOS ONE, 15 (8), e0237747 (2020)
47. Chengyuan Wu, Shiquan Ren, Jie Wu, and Kelin Xia, "Weighted fundamental group." Bulletin of the Malaysian Mathematical Sciences Society, 43 (6), 4065-4088 (2020)
46. D Vijay Anand, Zhenyu Meng, Kelin Xia, and Yuguang Mu, "Weighted persistent homology for osmolyte molecular aggregation and hydrogen-bonding network analysis." Scientific Report, 10 (1), 1-17 (2020) Data & Codes
45. Chengyuan Wu, Shiquan Ren, Jie Wu, and Kelin Xia, "Discrete Morse theory for weighted simplicial complexes." Topology and its Applications, 270, 107038 (2020)
44. Zhenyu Meng, D Vijay Anand, Yunpeng Lu, Jie Wu, and Kelin Xia, "Weighted persistent homology for biomolecular data analysis." Scientific Report, 10 (1), 1-15 (2020) Data & Codes
43. Zhenliang Wu, Yuwei Zhang, John Zenghui Zhang, Kelin Xia, and Fei Xia, "Determining optimal coarse-grained representation for biomolecules using internal cluster validation indexes." Journal of Computational Chemistry, 41 (1), 14-20 (2020)
42. Kelin Xia, D Vijay Anand, Shikhar Saxena, and Yuguang Mu, "Persistent homology analysis of osmolyte molecular aggregation and their hydrogen-bonding networks." Physical Chemistry Chemical Physics, 21, 21038-21048 (2019)
41. Chengyuan Wu, Shiquan Ren, Jie Wu, and Kelin Xia, "Magnus representation of genome sequences." Journal of Theoretical Biology, 480 (7), 104-111 (2019)
40. Liangzhen Zheng, Kelin Xia, and Yuguang Mu, "Ligand binding induces agonistic-like conformational adaptations in helix 12 of progesterone receptor ligand binding domain." Frontiers in Chemistry, 7 (315) (2019)
39. Yuwei Zhang, Kelin Xia, Zexing Cao, Frauke Grater, and Fei Xia, "A new method for the construction of coarse-grained models of large biomolecules from low-resolution cryo-electron microscopy data." Physical Chemistry Chemical Physics, 21, 9720-9727 (2019) (PCCP HOT Articles)
38. Manchugondanahalli S. Krishna, Desiree-Faye Kaixin Toh, Zhenyu Meng, Alan Ann Lerk Ong, Zhenzhang Wang, Yunpeng Lu, Kelin Xia, Mookkan Prabakaran, and Gang Chen, "Sequence- and structure-specific probing of RNAs by short nucleobase-modified dsRNA-binding PNAs incorporating a fluorescent light-up uracil analog." Analytical Chemistry, 91 (8), 5331-5338 (2019)
37. Alan Ann Lerk Ong, Desiree-Faye Kaixin Toh, Kiran M. Patil , Zhenyu Meng, Zhen Yuan, Manchugondanahalli S. Krishna, Gitali Devi, Phensinee Haruehanroengra, Yunpeng Lu, Kelin Xia, Katsutomo Okamura, Jia Sheng, and Gang Chen, "General recognition of U-G, U-A, and C-G pairs by double-stranded RNA-binding PNAs incorporated with an artificial nucleobase." Biochemistry, 58 (10), 1319-1331 (2019)
36. D Vijay Anand, Zhengyu Meng, and Kelin Xia, "A complex multiscale virtual particle model based elastic network model (CMVP-ENM) for the normal mode analysis of biomolecular complexes." Physical Chemistry Chemical Physics, 21, 4359-4366 (2019)
35. Kelin Xia,"Persistent similarity for biomolecular structure comparison." Comunications in information and systems, 18 (4), 251-280 (2018)
34. Kelin Xia, "Persistent homology analysis of ion aggregation and hydrogen-bonding network." Physical Chemistry Chemical Physics, 20, 13448-13460 (2018)
33. Kelin Xia, "Sequence-based multiscale modeling for high-throughput chromosome conformation capture (Hi-C) data analysis." PLOS ONE, 13 (2), 0191899 (2018)
32. Kelin Xia, "Multiscale virtual particle based elastic network model (MVP-ENM) for normal mode analysis of large-sized biomolecules." Physical Chemistry Chemical Physics, 20 (1), 658-669 (2018)
31. Kelin Xia, Zhiming Li, and Lin Mu, "Multiscale persistent functions for biomolecular structure characterization." Bulletin of Mathematical Biology, 80 (1),1-31 (2018)
30. Yin Cao, Bao Wang, Kelin Xia and Guo-Wei Wei, "Finite volume formulation of the MIB method for elliptic interface problems." Journal of Computational and Applied Mathematics, 321, 60-77 (2017)
29. Lin Mu,Kelin Xia, and Guowei Wei, "Geometric and electrostatic modeling using molecular rigidity functions." Journal of Computational and Applied Mathematics, 313, 18-37 (2017)
28. Duc D Nguyen, Kelin Xia and Guo-Wei Wei, "Generalized flexibility-rigidity index." Journal of Chemical Physics, 144, 234106 (2016).
27. Kristopher Opron, Kelin Xia, Zachary F. Burton and Guo-Wei Wei, "Flexibility-rigidity index for protein-nucleic acid flexibility and fluctuation analysis." Journal of Computational Chemistry, 37, 1283-1295 (2016).
26. Zixuan Cang, Lin Mu, Kedi Wu, Kristopher Opron, Kelin Xia and Guo-Wei Wei, "A topological approach to protein classification." Molecular Based Mathematical Biology, 3, 140-162 (2015). PDF
25. Kelin Xia, Kristopher Opron and Guo-Wei Wei, "Multiscale Gaussian network model (mGNM) and multiscale anisotropic network model (mANM)." Journal of Chemical Physics, 143, 204106(2015). PDF
24. Kelin Xia, Zhixiong Zhao and Guo-Wei Wei, "Multiresolution persistent homology for excessively large biomolecular datasets." Journal of Chemical Physics, 143, 134103(2015).PDF
23. Kelin Xia and Guo-Wei Wei, "Multiresolution topological simplification." Journal of Computational Biology, 22(9), 1-5 (2015).PDF
22. Kelin Xia and Guo-Wei Wei, "Multidimensional persistence in biomolecular data." Journal of Computational Chemistry, 36, 1502-1520(2015).PDF
21. Kelin Xia and Guo-Wei Wei, "Persistent homology for cryo-EM data analysis." International Journal for Numerical Methods in Biomedical Engineering, 31(8), e02719(2015).PDF
20. Jinkyoung Park, Kelin Xia and Guo-Wei Wei, "Atomic scale design and three-dimensional simulations of nanofluidic systems." Microfluidics and Nanofluidics, 19(3), 665-692(2015).PDF
19. Kristopher Opron, Kelin Xia and Guo-Wei Wei, "Communication: Capturing protein multiscale thermal fluctuations." Journal of Chemical Physics, 142, 211101(2015).PDF
18. Bao Wang, Kelin Xia and Guo-Wei Wei, "Second order method for solving 3D elasticity equations with complex interfaces." J. Comput. Phys., 294, 405-438(2015).PDF
17. Kelin Xia, Xin Feng, Yiying Tong and Guo-Wei Wei, "Persistent homology for the quantitative prediction of fullerene stability." Journal of Computational Chemistry, 36, 408-422(2015).PDF
16. Bao Wang, Kelin Xia and Guo-Wei Wei, "Matched interface and boundary method for elasticity interface problems." Journal of Computational and Applied Mathematics, 285, 203-225(2015).PDF
15. Kelin Xia and Guo-Wei Wei, "A Galerkin formulation of the MIB method for three dimensional elliptic interface problems." Computers and Mathematics with Applications, 68, 719-745(2014).PDF
14. Kelin Xia and Guo-Wei Wei, "Persistent homology analysis of protein structure, flexibility and folding." International Journal for Numerical Methods in Biomedical Engineering, 30(8), 814-844(2014).PDF
13. Kristopher Opron, Kelin Xia and Guo-Wei Wei, "Fast and anisotropic flexibility-rigidity index for protein flexibility and fluctuation analysis." J. Chem. Phys., 140, 234105(2014) .PDF
12. Kelin Xia, Meng Zhan and Guo-Wei Wei, "MIB Galerkin method for elliptic interface problem." Journal of Computational and Applied Mathematics, 272, 195-220(2014).PDF
11. Kelin Xia and Guo-Wei Wei, "Molecular nonlinear dynamics and protein thermal uncertainty quantification." Chaos, 24, 013103(2014).PDF
10. Kelin Xia and Guo-Wei Wei, "Stochastic model for protein flexibility analysis." Physical Review E, 88, 062709(2013).PDF
9. Kelin Xia, Kristopher Opron and Guo-Wei Wei, "Multiscale multiphysics and multidomain models-Flexibility and rigidity." Journal of Chemical Physics, 139, 194109(2013).PDF
8. Kelin Xia, Xin Feng, Zhan Chen, Yiying Tong and Guo-Wei Wei, "Multiscale geometric modeling of macromolecules I: Cartesian representation." J. Comput. Phys., 257, 912-936(2014).PDF
7. Xin Feng, Kelin Xia, Zhan Chen, Yiying Tong and Guo-Wei Wei, "Multiscale geometric modeling of macromolecules II: Lagrangian representation." Journal of Computational Chemistry, 34, 2100-2120(2013).PDF
6. Xin Feng, Kelin Xia, Yiying Tong and Guo-Wei Wei, "Geometric modeling of organelles, subcellular structures and multiprotein complexes." International Journal for Numerical Methods in Biomedical Engineering, 28(12), 1198-1223(2012). PDF
5. Guo-Wei Wei, Qiong Zheng, Zhan Chen and Kelin Xia, "Variational multiscale models for charge transport." SIAM Review, 54(4), 699-754(2012).PDF
4. Kelin Xia, Meng Zhan, Decheng Wan and Guo-Wei Wei, "Adaptively deformed mesh based matched interface and boundary (MIB) method for elliptic interface problems." J. Comput. Phys., 231(4), 1440-1461(2012).PDF
3. Kelin Xia, Meng Zhan and Guo-Wei Wei, "MIB method for elliptic equations with multimaterial interfaces." J. Comput. Phys., 230(12), 4588-4615(2011).PDF
2. Ming Yi, Kelin Xia and Meng Zhan, "Theoretical study for regulatory property of scaffold protein on MAPK cascade: a qualitative modeling." Biophysical Chemistry, 147(3), 130-139(2010).PDF
1. Qi Zhao, Ming Yi, Kelin Xia and Meng Zhan, "Information propagation from IP3 to target protein: A combined model for encoding and decoding of Ca2+ signal." Physica A, 388, 4105-4114(2009).PDF

(Conference papers)
11. Yuhan Peng, Junwen Dong, Yuzhi Zeng, Hao Li, Ce Ju, Huitao Feng, Diaaeldin Taha, Anna Wienhard, and Kelin Xia. "Sheaf Neural Networks on SPD Manifolds: Second-Order Geometric Representation Learning." ICML (2026).
10. Xiaohan Wang, Deyu Bo, Longlong Li, and Kelin Xia. "Full-Spectrum Graph Neural Network: Expressive and Scalable." ICML (2026).
9. Ming Li, Yujie Fang, Dongrui Shen, Han Feng, Xiaosheng Zhuang, Kelin Xia, Pietro Lio, "High-pass matters: Theoretical insights and sheaflet-based design for hypergraph neural networks", AAAI (2026) (Outstanding Paper Award)
8. Peng Yuhan, Benquan Wang, Kelin Xia, and Zexiang Shen. "Topology-preserving deep learning for structural integrity in optical semiconductor characterization at deeply subwavelength resolution." In AI4X 2025 International Conference (2025)
7. Yipeng Zhang, Longlong Li, Kelin Xia, "Rhomboid Tiling for Geometric Graph Deep Learning", ICML (2025)
6. Huang Liang, Benedict Lee, Daniel Hui Loong Ng, Kelin Xia, "Contrastive Learning with Simplicial Convolutional Networks for Short-Text Classification", ICML (2025)
5. Huang, Liang, Kelin Xia, and Chuan-Shen Hu, "Path Complex Neural Networks for Sequential Process Activities Classification." Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2025)
4. See Hian Lee, Feng Ji, Kelin Xia, Wee Peng Tay, "Graph Neural Networks with a Distribution of Parametrized Graphs", Proceedings of the 41st International Conference on Machine Learning (ICML), PMLR 235:26640-26660 (2024)
3. Longlong Li, Xiang Liu, Guanghui Wang, Yu Guang Wang, and Kelin Xia. "Path Complex Neural Network for Molecular Property Prediction." In ICML 2024 Workshop on Geometry-grounded Representation Learning and Generative Modeling (2024)
2. Xiang Liu, and Kelin Xia, "Persistent Tor-algebra based stacking ensemble learning (PTA-SEL) for protein-protein binding affinity prediction", ICLR 2022 Workshop on Geometrical and Topological Representation Learning (2022)
1. Xiang Liu, and Kelin Xia, "Neighborhood complex based machine learning (NCML) models for drug design." In Interpretability of Machine Intelligence in Medical Image Computing, and Topological Data Analysis and Its Applications for Medical Data, pp. 87-97. Springer, Cham (2021).

(Book editor)
Takafumi Ueno, Sierin Lim, and Kelin Xia, "Protein Cages: Design, Structure, and Applications", Book series Methods in Molecular Biology, New York, NY: Springer US (2023)

Research Topics

Persistent Spectral based machine learning (PerSpect ML) for drug design

The structure-function relationship is of essential importance to the analysis of biomolecular flexibility, dynamics, interactions, and functions. Topology studies the network and connection information within the data and provides an effective way of structure characterization. As illustrated in the figures, there are three basic topological representations, including graph, simplicial complex, and hypergraph, for molecular structures. Features for learning models can be obtained from these representations. The essential idea is to use eigen-spectrum-based properties as molecular descriptors.

Our persistent spectral (PerSpect) theory covers three basic models, i.e., PerSpect graph, PerSpect simplicial complex, and PerSpect hypergraph. These models are filtration-based multidimensional spectral methods. Mathematically, spectral graph theory, spectral simplicial complex, and spectral hypergraph have been developed based on graph, simplicial complex, and hypergraph. These models use different types of connection matrixes, in particular, Hodge (combinatorial) Laplacian matrixes, to represent structure connection. The multidimensional representation is achieved through a filtration process. The persistence and variation of eigen spectrum information during the filtration process are characterized by persistent functions or attributes, which are further used as molecular features or fingerprints.

Reference: Zhenyu Meng and Kelin Xia, "Persistent spectral based machine learning (PerSpect ML) for protein-ligand binding affinity prediction", Science advances (2021)

Persistent Ricci curvature based machine learning

Ricci curvature is one of the fundamental concepts in differential geometry and theoretical physic. Two discrete Ricci curvature forms, i.e., Ollivier Ricci curvature (ORC) and Forman Ricci curvature (FRC), have been developed to characterize different aspects of the classical Ricci curvature. ORC is defined as the Wasserstein distance between two associated probability measurements on metric spaces. It captures clustering and coherence properties of global and local structures in networks. In contrast, FRC is defined as a combinatorial property of upper adjacent, lower-adjacent and parallel simplexes on CW complexes. This combinatorial curvature can be directly derived from the combinatorial Bochner-Weitzenbock decomposition. It characterizes geodesics dispersal property and algebraic topological information within networks. Even though the two discrete forms can have totally different values, sometimes even signs, for network substructures, they are found to be highly correlated in various complex networks. Generally speaking, positive ORCs or FRCs are commonly found in densely-packed clusters or “communities”, while negative ORCs or FRCs usually represent bridges or links between clusters.
The persistent Ricci curvature is proposed to combine filtration-based multiscale representations with Ricci curvatures for molecular featurization. Ricci curvatures are systematically evaluated on all the graphs/simplicial complexes/hypergraphs in the filtration process. The statistical and combinatorial properties of Ricci curvatures during the filtration are used as molecular descriptors.

Referece: JunJie Wee and Kelin Xia, "Forman persistent Ricci curvature (FPRC) based machine learning models for protein–ligand binding affinity prediction", Briefings In Bioinformatics (2021)
JunJie Wee and Kelin Xia, "Ollivier persistent Ricci curvature (OPRC) based machine learning for protein-ligand binding affinity prediction", Journal of Chemical Information and Modeling, https://doi.org/10.1021/acs.jcim.0c01415 (2021)

Peristent hypergraph based machine learning

Hypergraphs are powerful topological representations that can characterize more general structure information than graphs and simplicial complexes. A hypergraph is composed of hyperedges, which are sets of vertices. Essentially, a hyperedge can be viewed as a generalization of simplexes without the closeness under boundary conditions. The interactions between molecules at atomic level can be well represented as hypergraphs. Mathematically, a hyperedge can be defined a set of vertices (atoms) that have at least one from each molecules. For instance, in protein-ligand interactions, a hyperedge is defined among protein and ligand atoms, but it has at least one atom from protein and the other from ligand. In this way, hyperedges represent (many-boday) interactions between protein and ligand atoms.
Element-specific models are widely used to decompose molecular complexes into a series of atom specific combinations. More specifically, proteins can be decomposed into at least 5 types of atom sets, i.e., C, O, N, S, and H, while ligand usually have at least 10 types of atom, including C, N, O, S, P, F, Cl, Br, I and H. In this way, upto 50 atom combinations can be obtained and the corresponding hypergraphs can be constructed. Topological and geometric invariants can be systematically obtained from these hyperedges and further used as features for machine learning models.

Reference: Xiang Liu, Huitao Feng, Jie Wu, and Kelin Xia, "Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction", Briefings In Bioinformatics (2021)
Xiang Liu, Xiangjun Wang, Jie Wu, and Kelin Xia, "Hypergraph based persistent cohomology (HPC) for molecular representations in drug design", Briefings In Bioinformatics (2021)

Geometric and Variational modeling

Variational multi-scale models

We develop geometric modeling and computational algorithm for biomolecular structures from two data sources: Protein Data Bank (PDB) and Electron Microscopy Data Bank (EMDB) in the Eulerian (or Cartesian) representation. Molecular surface (MS) contains non-smooth geometric singularities, such as cusps, tips and selfintersecting facets, which often lead to computational instabilities in molecular simulations, and violate the physical principle of surface free energy minimization. Variational multiscale surface definitions are proposed based on geometric flows and solvation analysis of biomolecular systems. The resulting surfaces are free of geometric singularities and minimize the total free energy of the biomolecular system. High order partial differential equation (PDE)-based nonlinear filters are employed for EMDB data processing. After the construction of protein multiresolution surfaces, we explore the analysis and characterization of surface morphology by the consideration of Gaussian curvature, mean curvature, maximum curvature, minimum curvature, shape index, and curvedness. Based on the curvature and electrostatic analysis from our multiresolution surfaces, we introduce a new concept, the polarized curvature, for the prediction of protein binding sites.

Protein flexibility and rigidity analysis

Protein structural fluctuation, typically measured by Debye-Waller factors, or B-factors, is a manifestation of protein flexibility, which strongly correlates to protein function. The flexibility-rigidity index (FRI) is a newly proposed method for the construction of atomic rigidity functions required in the theory of continuum elasticity with atomic rigidity, which is a new multiscale formalism for describing excessively large biomolecular systems. The FRI method analyzes protein rigidity and flexibility and is capable of predicting protein B-factors without resorting to matrix diagonalization. A fundamental assumption used in the FRI is that protein structures are uniquely determined by various internal and external interactions, while the protein functions, such as stability and flexibility, are solely determined by the structure. As such, one can predict protein flexibility without resorting to the protein interaction Hamiltonian. Additionally, we propose anisotropic FRI (aFRI) algorithms for the analysis of protein collective dynamics. Eigenvectors obtained from the proposed aFRI algorithms are able to demonstrate collective motions.

Scientific Computing

MIB method for multi-material interface problem

Multi-material interface problems are omnipresent in science, engineering and daily life. The solution to this class of problems becomes exceptionally challenging when more than two heterogeneous materials join at one point of the space and form a geometric singularityprimary. Based on the MIB method, several schemes have been constructed to solve 2D elliptic equations with discontinuous coefficients associated with three-material interfaces. The essential idea is to smoothly extend functions across the interface and employ the fictitious values at irregular points. For the geometric singularities, two sets of interface conditions are considered simultaneously. Intensive numerical experiments are carried out to validate the proposed schemes. A second order of accuracy is obtained for complex geometric and geometric singularities.

Adaptive mesh based MIB method

Mesh deformation methods break down for elliptic PDEs interface problems, as additional interface jump conditions are required to maintain the well-posedness of the governing equation. An interface technique based adaptively deformed mesh strategy is introduced for resolving elliptic interface problems. We take the advantages of the high accuracy, flexibility and robustness of MIB method to construct an adaptively deformed mesh based interface method. The proposed method generates deformed meshes in the physical domain and solves the transformed governed equations in the computational domain, which maintains regular Cartesian meshes. The mesh deformation is realized by a mesh transformation PDE, which controls the mesh redistribution by a source term. The source term consists of a monitor function, which builds in mesh contraction rules. Both interface geometry based deformed meshes and solution gradient based deformed meshes are constructed to reduce errors in solving elliptic interface problems. The proposed adaptively deformed mesh based interface method is extensively validated by many numerical experiments. Numerical results indicate that the adaptively deformed mesh based interface method outperforms the original MIB method for dealing with elliptic interface problems.

MIB Galerkin method

A MIB Galerkin formulation is developped for solving the elliptic interface problem. In this approach, we build up two sets of elements respectively on two extended subdomains which both include the interface. As a result, two sets of elements overlap each other near the interface. Fictitious solutions are defined on the overlapping part of the elements, so that the differentiation operations of the original PDEs can be discretized as if there was no interface. The extra coeffients of polynomial basis functions, which furnish the overlapping elements and solve the fictitious solutions, are determined by interface jump conditions. Consequently, the interface jump conditions are rigorously enforced on the interface. The present method utilizes Cartesian meshes to avoid the mesh generation in conventional finite element methods (FEMs). The accuracy, stability and robustness of the proposed 3D MIB Galerkin are extensively validated. Near second order accuracy has been confirmed. To our knowledge, it is the first time for an FEM to show a near second order convergence in solvingthe Poisson equation with realistic protein surfaces. Additionally, the present work offers the first known near second order accurate method for C_1 continuous or H_2 continuous solutions associated with a Lipschitz continuous interface.