TDA2022


Organizer:

Kelin Xia (NTU, Singapore)

Zoom link:
Zoom ID: 305 412 0115
Passcode: MH1100
Link: https://ntu-sg.zoom.us/j/3054120115




Speaker: Henry Adams (Department of Mathematics, Colorado State University)
Time: 4:00pm, 03/05/2022
Title: Topology in Machine Learning
Abstract: How do you "vectorize" geometry, i.e., extract it as a feature for use in machine learning? One way is persistent homology, a popular technique for incorporating geometry and topology in data analysis tasks. I will survey applications arising from materials science, computer vision, and agent-based modeling (modeling a flock of birds or a school of fish). Furthermore, I will explain how these techniques are related to the local geometry of a dataset and to explainable machine learning.


Speaker: Qing Nie (Department of Mathematics, Department of Developmental and Cell Biology, NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine)
Time: 10:00am, 04/05/2022
Title: Multiscale spatiotemporal reconstruction of single-cell genomics data
Abstract: Cells make fate decisions in response to dynamic environments, and multicellular structures emerge from multiscale interplays among cells and genes in space and time. Recent single-cell genomics technology provides an unprecedented opportunity to profile cells. However, those measurements are taken as static snapshots of many individual cells that often lose spatial information. How can temporal relationships among cells be obtained from such measurements? How can spatial interactions among cells, such as cell-cell communication, be recovered? In this talk I will present our newly developed computational tools that dissect transition properties of cells and infer cell-cell communication based on nonspatial single-cell genomics data. In addition, I will present methods to derive multicellular spatiotemporal patterns from spatial transcriptomics datasets. Through applications of those methods to systems in development and regeneration, we show the discovery power of such methods and identify areas for further development in the spatiotemporal reconstruction of single-cell genomics data.


Speaker: Tamar Schlick (Department of Chemistry, Courant Institute of Mathematical Sciences, New York University)
Time: 8:00am, 06/05/2022
Title: The complex conformational landscape of the SARS-CoV-2 Frameshifting RNA element
Abstract: A combination of graph-based modeling for RNAs with pseudoknots, chemical reactivity experiments, and microsecond molecular dynamics simulations will be described to untangle the complex conformational landscape of the frameshifting RNA element of SARS-CoV-2 and suggest new avenues for anti-viral therapy.


Speaker: Pedro J. Ballester (Cancer Research Center of Marseille, INSERM)
Time: 2:30pm, 06/05/2022
Title: Machine-learning scoring functions for structure-based virtual screening: where are we?
Abstract: Molecular docking usually predicts whether and how small molecules bind to a macromolecular target from one of its X-ray crystal structures. Scoring functions for structure-based virtual screening primarily aim at discovering which molecules bind to the considered target when these form part of a library with a much higher proportion of non-binders. Classical scoring functions are essentially models building a linear mapping between the features describing a protein–ligand complex and its binding/activity label. Alternatively, techniques from machine learning, a major subfield of artificial intelligence, can be used to build fast supervised learning models for this task. In this talk, we will provide an overview of such machine-learning scoring functions for structure-based virtual screening and explain how they differ from those intended for optimising a drug lead. We will discuss what the shortcomings of current benchmarks really mean and what valid alternatives have been employed. The latter retrospective studies observed that machine-learning scoring functions were substantially more accurate, in terms of higher hit rates and potencies, than the classical scoring functions they were compared to. Several of these machine-learning scoring functions were also employed in prospective studies, in which low- to mid-nanomolar binders with novel chemical structures were directly discovered without requiring any potency optimization. A discussion of open questions for future work completes this talk.


Speaker: Aurora Clark (Department of Chemistry, Director of the Center for Institutional Research Computing, Washington State University; Laboratory Fellow, Pacific Northwest National Laboratory)
Time: 10:00am, 11/05/2022
Title: Studying Multiscale and Many-body Correlations in Chemical Systems Using Persistent Homology
Abstract: Experimental and computational chemists traditionally employ spatiotemporal correlation functions to examine structural organization and dynamic phenomena of physical systems. Although the exact formulation of such functions may be motivated by experimental design, as advanced computational chemistry methods begin to predict data for increasingly realistic and non-ideal conditions, a priori knowledge of which correlation functions are relevant becomes a challenge. It is a case of “unknown unknowns”. New tools are needed for chemists to analyze complex data and identify correlations and structure across length and timescales. The challenges presented in this discussion are well-suited to recent developments and ongoing research in computational topology. Here, I will discuss several case studies from our laboratory that use persistent homology to analyze chemical point cloud data, surfaces, and manifolds. Further study of the topological features, such as patterns within their birth and death times or the application of distance metrics on persistence distributions, is providing new fundamental insight that may then be employed within new theories of chemical behavior that are expanding the predictive capabilities of computational chemistry.


Speaker: Louxin Zhang (Department of Mathematics, National University of Singapore)
Time: 10:00am, 17/05/2022
Title: Phylogenetic trees or phylogenetic networks
Abstract: Current genomic and genetic studies suggest that reticulate processes play more important roles in genome evolution than we expected a decade ago. As such, phylogenetic networks are believed to be more suitable than trees for modelling reticulate processes in genome evolution. However, phylogenetic networks are much more complex than phylogenetic trees, as the network class is much larger than the tree class. In this talk, the speaker will discuss different combinatorial aspects of phylogenetic networks and how hard it is to infer phylogenetic networks from phylogenetic trees and from different types of biological data.


Speaker: Ginestra Bianconi (School of Mathematical Sciences, Queen Mary University of London, Alan Turing Fellow, Alan Turing Institute)
Time: 5:00pm, 18/05/2022
Title: The dynamics of higher-order networks: the effect of topology and triadic interactions
Abstract: Networks have been very successful in investigating complex systems. However, they have the strong limitation that they capture only pairwise interactions. Recently, growing attention has been devoted to higher-order networks that include interactions among two or more nodes and allow us to go beyond the description provided by graphs and networks. Here we show that higher-order interactions are responsible for new dynamical processes that cannot be observed in pairwise networks. We will cover how topology, described by higher-order Laplacians and the Dirac operator, is key to defining synchronization of topological signals, i.e. dynamical signals defined not only on nodes but also on links, triangles and higher-dimensional simplices in simplicial complexes. We will also reveal how triadic interactions can turn percolation into a fully-fledged dynamical process in which nodes can turn on and off intermittently in a periodic fashion or even chaotically, leading to period doubling and a route to chaos of the percolation order parameter.


Speaker: Yasuaki Hiraoka (Department of Mathematics, Kyoto University)
Time: 10:00am, 19/05/2022
Title: Persistent homology in materials science and its related mathematical problems
Abstract: Topological data analysis (TDA) is an emerging concept in applied mathematics, by which we characterize the “shape of data” using topological methods. In particular, persistent homology and its persistence diagrams are nowadays applied to a wide variety of scientific and engineering problems. In this talk, I will survey our recent work on TDA in materials science (glass, granular systems, iron ore sinters, etc.). By developing several new mathematical tools based on quiver representations, inverse analysis, and machine learning, we can explicitly characterize significant geometric and topological (hierarchical) features embedded in those materials, which are practically important for controlling materials functions. I will also present several mathematical challenges in multi-parameter persistence and random topology motivated by those applications.


Speaker: Moo K. Chung (Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison)
Time: 9:00am, 24/05/2022
Title: Topological inference and learning for graphs
Abstract: Many previous studies on networks have mainly focused on analyzing graph theory features that are often parameter dependent. Persistent homology provides a more coherent mathematical framework that is invariant to the choice of parameters. Instead of looking at networks at a fixed scale, persistent homology charts the topological changes of networks over every possible parameter. In doing so, it reveals the most persistent topological features that are robust to parameter changes. In this talk, we present novel topological inference and learning frameworks that can integrate networks of different sizes, topology or modalities through the Wasserstein distances. The use of Wasserstein distances bypasses the intrinsic computational bottleneck associated with persistent homology. It is now possible to perform various graph computations, including matching, in O(n log n). We demonstrate the versatility of the proposed method through a twin brain imaging study where we determine the extent to which brain networks are genetically heritable. The talk is based on preprints: Songdechakraiwut et al. 2021 (arXiv:2012.0067), Anand et al. 2021 (arXiv:2110.14599) and Chung et al. 2022 (arXiv:2201.00087).


Speaker: Jelena Grbic (Department of Mathematics, University of Southampton)
Time: 3:00pm, 24/05/2022
Title: Mathematical disguises of simplicial complexes
Abstract: In this talk I will present a few pure mathematical objects from various research areas that have simplicial complexes and their combinatorial structures in common. My aim is to highlight the importance of the combinatorics of simplicial complexes in solving seemingly unrelated problems as well as how, for example, topological problems can indicate new combinatorial invariants of simplicial complexes.


Speaker: Javier Arsuaga (Department of Molecular and Cellular Biology, Department of Mathematics, UC Davis)
Time: 10:00am, 26/05/2022
Title: Using random knot theory and statistical topology to measure chromosome entanglement
Abstract: Uncovering the basic principles that govern the three-dimensional (3D) organization of genomes is one of the main challenges of mathematical biology in the post-genomic era. Theoretical results in random knotting theory predict that, due to confinement, genomes should be highly entangled and form knots and links. On the other hand, in vitro studies show that knots and links are detrimental to the cell. It is therefore natural to ask whether knots or links are naturally occurring in genomes and, if found, how they are regulated. Double-stranded DNA in certain viruses and in the mitochondrion of trypanosomes (organisms responsible for African Trypanosomiasis) is highly confined; their genomes have been found to contain knots and to form large networks of linked circles, respectively. To test whether these knots and links are due to confinement, we turn to the theory of random knotting. We show, analytically or computationally, that (1) the knotting probability of a random curve in a confined volume increases exponentially fast with the length of the curve, and that (2) the probability of forming a large random network of linked circles grows exponentially fast with the density of circles. We further characterize the mechanisms that regulate the topology of these systems by combining these results in random knotting with other mathematical results obtained using Brownian dynamics simulations.
It is also natural to ask whether knots and/or links occur in the chromosomes of higher organisms. This question poses new challenges because the experimental methods used in the previous examples are not valid, and because chromosomes are linear and the mathematical concept of knot or link is only defined for circular curves. To address the first concern, we analyze chromosome conformation capture (CCC) data to build three-dimensional reconstructions of genomes. For the second, we introduce the concept of linking proportion, a statistical feature that allows us to quantify the entanglement of non-circular genomes. Our analysis shows that the Rabl configuration, an evolutionarily conserved structure common in fungi and plants, reduces the entanglement of genomes.
We suggest that topological complexity is a problem that evolution needs to solve when the size of genomes increases. We propose that statistical topology and random knotting are key areas of mathematics in the analysis of the three-dimensional structure of chromosomes.



Speaker: Fei Han (Department of Mathematics, NUS)
Time: 9:30am, 30/05/2022 (rescheduled to 9:30am, 14/06/2022)
Title: Gromov-Hausdorff distance and its application Dynamics
Abstract: In this talk, I will discuss our newly developed topology-aware Gromov-Hausdorff distance and its application to molecular data.


Speaker: Chao Zhou (Department of Mathematics and Risk Management Institute, NUS)
Time: 10:30am, 30/05/2022
Title: Optimal Execution with Hidden Orders under Self-Exciting Dynamics
Abstract: Hidden liquidity is attracting significant volume share in modern order-driven markets, providing exposure risk reduction and mitigating adverse selection risk. In a continuous-time framework, we show there is a switching in the optimal liquidation strategy for a risk-neutral agent who uses both hidden and displayed limit orders while controlling the order sizes. When market order arrivals are modeled as a Poisson process, we derive a closed-form solution that contains a switching time, at which the agent changes from a pure-hidden-order phase to a mixed-orders phase until termination. Under a Hawkes process with self-exciting dynamics, a numerical solution is provided. We show that the optimal strategy exhibits a similar two-phase pattern, except that the switching time becomes a function of the market order intensity. Simulation experiments show that the use of hidden orders reduces liquidation cost, accompanied by an increase in liquidity. Given event-level limit order book data of 100 NASDAQ stocks, we test the liquidation strategies, where our strategy (with mixed type under the self-exciting dynamics) leads to cost reductions of up to 57% relative to the pure limit order strategy and 15% relative to the strategy with both order types under the Poisson process.