Li, Yi (
)
I am an associate professor in the Division of Mathematical Sciences at Nanyang Technological University.
Before joining NTU, I was a postdoc at Harvard University (hosted by Jelani Nelson), a postdoc at the Max-Planck Institute for Informatics in Saarbruecken, Germany and a research fellow at the Simons Institute for the Theory of Computing.
Email:
Position Vacancies
- Two one-year postdoc vacancies are available to work on topics related to data stream algorithms, low distortion metric embeddings and randomized numerical linear algebra. The duration of the posts will end before December 2025 with a flexible starting date. Renewals are possible and will be based on performance and the funding situation. Travel support is available. The salary is expected to be 78000 SGD per year. The tax rate in Singapore is low, which is about 4% for the aforesaid salary level. Please contact me if you are interested.
- One funded PhD student position is available. The research area is expected to be related to data stream algorithms, low distortion metric embeddings and randomized numerical linear algebra. If you are interested, please contact me so we can know more about each other. The application deadline is 31 January 2024. Please visit the official website of the Division of Mathematical Sciences for more information of the PhD programme (including how to apply).
Research Interests
Algorithms for massive data sets, data streaming algorithms
Low-distortion metric embeddings
Compressive sensing and signal processing
Theoretical computer science
Education
2010 - 2013 |
Ph. D. in Computer Science and Engineering,
University of Michigan.
|
2008 - 2010 |
M. Sc. in Computer Science and Engineering, University of Michigan. |
2004 - 2008 |
B. Eng. in Computer Science (ACM Honoured Class),
Shanghai Jiao Tong University. |
Manuscripts
- Cheng Chen, Yi Li,and Yiming Sun. Online Active Regression.
arXiv:2207.05945
(This version supersedes, and significantly improves, [C31])
- Yi Li and Mingmou Liu. Sparsity-Dimension Trade-Offs for Oblivious Subspace Embeddings.
arXiv:2212.02913
Journal Publications
- Zhengyang Guo, Yi Li, and Shaoyu Pei. Expected Size of Random Tukey Layers and Convex Layers.
Computational Geometry: Theory and Applications 103, Article No. 101856, Apr 2022. pdf
- Yi Li, Ruosong Wang, and David Woodruff. Tight Bounds for the Subspace Sketch Problem with Applications.
SIAM J. Comp. 50(4), pp 1287--1335, 2021. pdf
(This version supersedes [C19])
- Yi Li and Vasileios Nakos. Sublinear-Time Algorithms for Compressive Phase Retrieval.
IEEE Trans. Info. Theory, 66(11), pp 7302--7310, 2020. pdf
(This paper supersedes, and is radically different from, [C14])
- Yi Li, Huy Le Nguyen, and David Woodruff. On Approximating Matrix Norms in a Stream.
SIAM J. Comput., 48(6), pp 1643--1697, 2019. pdf
(This version supersedes [C4], [C9] and part of [C11])
- Sudipto Guha, Yi Li, and Qin Zhang. Distributed Partial Clustering.
ACM Transactions on Parallel Computing 6(3), pp 11:1--11:20, October 2019. (Special issue on SPAA 2017) pdf
(This version supersedes [C12])
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss.
For-all Sparse Recovery in Near-Optimal Time.
ACM Transactions on Algorithms 13(3), pp 32:1--32:26, August 2017. pdf
(This version supersedes [C6])
- Petros Boufounos, Volkan Cevher, Anna Gilbert, Yi Li, and Martin Strauss. What's the Frequency, Kenneth?: Sublinear Fourier Sampling Off the Grid.
Algorithmica 73(2), pp 261-288, 2015. pdf
(This version supersedes [C2])
Update: A small tweak in the hashing lemma shows that diluting S^1 by k/η, instead of 1/η, would be enough. The sampling duration can be brought down to 1/η from k/η.
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss. Approximate Sparse Recovery: Optimizing Time and Measurements.
SIAM J. Comput. 41(2), pp 436-453, 2012. pdf
(This version supersedes [C1])
Conference Publications
- Yi Li, Honghao Lin, and David P. Woodruff. ℓp-Regression in the Arbitrary Partition Model of Communication.
Proceedings of Machine Learning Research 195:4902-4928, 2023. (Proceedings of COLT 2023) arXiv:2307.05117
- Yi Li, Honghao Lin, Simin Liu, Ali Vakilian, and David Woodruff. Learning the Positions in CountSketch.
Proceedings of ICLR 2023. (No physical publication) Open Review link
- Yi Li, Honghao Lin, and David P. Woodruff. The ℓp-Subspace Sketch Problem in Small Dimensions with Applications to Support Vector Machines.
Proceedings of SODA 2023, pp 850--877. arXiv:2211.07132
- Yi Li, Honghao Lin, David Woodruff, and Yuheng Zhang. Streaming Algorithms with Large Approximation Factors.
Proceedings of APPROX/RANDOM 2022, LIPIcs Vol. 245, pp 13:1--13:23. arXiv:2207.08075
- Cheng Chen, Yi Li, and Yiming Sun. Online Active Regression.
Proceedings of Machine Learning Research 162:3320--3335, 2022. (Proceedings of ICML 2022) Long presentation at ICML 2022.
- Yi Li and Mingmou Liu. Lower Bounds for Sparse Oblivious Subspace Embeddings.
Proceedings of PODS 2022, pp 251--260. arXiv:2112.10987
- Yi Li, Yan Song, and Qin Zhang. Learning to Cluster via Same-Cluster Queries.
Proceedings of CIKM 2021, pp 978--987. arXiv:2108.07383
- Yi Li and David P. Woodruff. The Product of Gaussian Matrices is Close to Gaussian.
Proceedings of APPROX/RANDOM 2021, LIPIcs Vol. 207, pp 35:1--35:22. arXiv:2108.09887
- Yi Li, David P. Woodruff, and Taisuke Yasuda. Exponentially Improved Dimensionality Reduction for ℓ1: Subspace Embeddings and Independence Testing.
Proceedings of Machine Learning Research 134:3111--3195, 2021. (Proceedings of COLT 2021) arXiv:2104.12946
- Yifei Jiang, Yi Li, Yiming Sun, Jiaxin Wang, and David Woodruff. Single Pass Entrywise-Transformed Low Rank Approximation.
Proceedings of Machine Learning Research 139:4982--4991, 2021. (Proceedings of ICML 2021) arxiv:2107.07889
- Zhengyang Guo and Yi Li. Geometric Cover with Outliers Removal.
Proceedings of STACS 2021, LIPIcs Vol. 187, pp 39:1--39:15. pdf
- Alexandr Andoni, Collin Burns, Yi Li, Sepideh Mahabadi, and David Woodruff. Streaming Complexity of SVMs.
Proceedings of APPROX/RANDOM 2020, LIPIcs Vol. 176, pp 50:1--50:22. arxiv:2007.03633
- Yi Li, Ruosong Wang, Lin Yang, and Hanrui Zhang. Nearly Linear Row Sampling Algorithm for Quantile Regression.
Proceedings of Machine Learning Research 119:1888--1898, 2020. (Proceedings of ICML 2020). arxiv:2006.08397
- Yi Li and David Woodruff. Input-Sparsity Low Rank Approximation in Schatten Norm.
Proceedings of Machine Learning Research 119:11124--11132, 2020. (Proceedings of ICML 2020). arxiv:2004.12646
- Yi Li and Vasileios Nakos. Deterministic Sparse Fourier Transform with an ℓ∞ Guarantee.
Proceedings of ICALP 2020, LIPIcs Vol. 168, pp 77:1--77:14. arxiv:1903.00995
- Tanqiu Jiang, Yi Li, Honghao Lin, Yisong Ruan, and David Woodruff. Learning-Augmented Data Stream Algorithms.
Proceedings of ICLR 2020. (No physical publication) Open Review link
- Yi Li, Ruosong Wang, and David Woodruff. Tight Bounds for the Subspace Sketch Problem with Applications.
Proceedings of SODA 2020, pp 1655--1674. arxiv:1904.05543
- Maria-Florina Balcan, Yi Li, David Woodruff, and Hongyang Zhang. Testing Matrix Rank, Optimally.
Proceedings of SODA 2019, pp 727--746. arxiv:1810.08171
- Yi Li and Vasileios Nakos. Deterministic Heavy Hitters with Sublinear Query Time.
Proceedings of APPROX/RANDOM 2018, LIPIcs Vol. 116, pp 18:1--18:18. arxiv:1712.01971
- Yi Li, Vasileios Nakos, and David Woodruff. On Low-Risk Heavy Hitters and Sparse Recovery Schemes.
Proceedings of APPROX/RANDOM 2018, LIPIcs Vol. 116, pp 19:1--19:13. arxiv:1709.02919
- Vladimir Braverman, Stephen Chestnut, Robert Krauthgamer, Yi Li, David Woodruff, and Lin Yang. Matrix Norms in Data Streams: Faster, Multi-Pass and Row-Order.
Proceedings of Machine Learning Research 80:648-657, 2018. (Proceedings of ICML 2018) arxiv:1609.05885
- Yi Li and Vasileios Nakos. Sublinear-Time Algorithms for Compressive Phase Retrieval.
Proceedings of ISIT 2018, pp 2301--2305.
- Yi Li and David Woodruff. Embeddings of Schatten Norms with Applications to Data Streams.
Proceedings of ICALP 2017, pp 60:1--60:14. pdf
- Sudipto Guha, Yi Li, and Qin Zhang. Distributed Partial Clustering.
Proceedings of SPAA 2017, pp 143--152. Co-winner of the Best Paper award.
- Yi Li and David Woodruff. Tight Bounds for Sketching the Operator Norm, Schatten Norms, and Subspace Embeddings.
Proceedings of APPROX/RANDOM 2016, LIPIcs Vol. 60, 39:1--39:11. arXiv:2202.09797
- Yuqing Ai, Wei Hu, Yi Li, and David Woodruff. New Characterizations in Turnstile Streams with Applications.
Proceedings of CCC 2016, pp 20:1--20:22. pdf
- Yi Li and David Woodruff. On Approximating Functions of the Singular Values in a Stream.
Proceedings of STOC 2016, pp 767--780.
- Yi Li, Xiaoming Sun, Chengu Wang, and David Woodruff.
On The Communication Complexity of Linear Algebraic Problems in the Message Passing Model.
Proceedings of DISC 2014, pp 499--513. Full version: arxiv:1407.4755
- Yi Li, Zhengyu Wang, and David Woodruff. Improved Testing of Low Rank Matrices.
Proceedings of SIGKDD 2014, pp 691--700. One of nine best papers.
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss.
For-all Sparse Recovery in Near-Optimal Time.
Proceedings of ICALP 2014, LNCS 8572, pp 538--550
- Yi Li, Huy Le Nguyen, and David Woodruff. Turnstile
Streaming Algorithms Might as Well Be Linear Sketches.
Proceedings of STOC 2014, pp 174--183. pdf
- Yi Li, Huy Le Nguyen, and David Woodruff. On Sketching
Matrix Norms and the Top Singular Vector.
Proceedings of SODA 2014, pp 1562--1581.
- Yi Li and David Woodruff. A Tight Lower
Bound for High Frequency Moment Estimation for Small
Error.
Proceedings of APPROX/RANDOM 2013, LNCS 8906, pp
623--638. pdf of full version
- Petros Boufounos, Volkan Cevher, Anna Gilbert, Yi Li, and
Martin Strauss. What's the Frequency, Kenneth?: Sublinear
Fourier Sampling Off the Grid.
Proceedings of APPROX/RANDOM 2012, LNCS 7408, pp 61-72.
- Anna Gilbert, Yi Li, Ely Porat, and Martin Strauss.
Approximate Sparse Recovery: Optimizing Time and
Measurements.
Proceedings of STOC 2010, pp 475-484.
Grants
- MOE AcRF Tier 2. Feb 2023 -- Jan 2026. Amount awarded: 489,600 SGD.
- MOE AcRF Tier 1. Mar 2022 -- Aug 2024. Amount awarded: 189,000 SGD.
- MOE AcRF Tier 2. Jan 2019 -- Mar 2022. Amount awarded: 453,060 SGD + 140,000 SGD for 1 scholarship.
Talks (excluding conference presentations)
The lp-Subspace Sketch Problem. Basic Algorithms Research Copenhagen (BARC), University of Copenhagen, Mar 2023.
Lower Bounds for Sparse Oblivious Subspace Embeddings. Information Theory and Data Science Workshop. Institute of Mathematical Sciences, NUS, Jan 2023.
The lp-Subspace Sketch Problem. National University of Singapore, Dec 2022.
Lower Bounds for Sparse Oblivious Subspace Embeddings. Workshop on Algorithms and Foundations for Data Science. Institute of Mathematical Sciences, NUS, Jun 2022.
Some Linear Algebraic Problems from TCS Perspective. Shanghai Jiaotong University, Nov 2020.
The lp-Subspace Sketch Problem with Applications to Non-embeddability. Shanghai Jiaotong University, May 2019 and Xiamen University, Jun 2019.
The lp-Subspace Sketch Problem with Applications. Nanjing University, May 2019.
Matrix-related Problems in Data Streams. Workshop on Random Matrices, Stochastic Geometry and Related Topics. Institute of Mathematical Sciences, NUS, 2019.
Estimating the lp and Schatten Norms in Data Stream Model. NUS Department of Statistics Seminar Talk, 2019.
Estimating the Schatten Norms in Streaming Model. Shanghai Jiaotong University, 2018.
Estimating the Schatten Norms in Streaming Model. NII Shonan Meeting "Processing Big Data Streams", 2017.
Improved Sparse Recovery Algorithms. Dagstuhl Seminar 17181 "Theory and Applications of Hashing", 2017.
Sublinear-time Algorithms for the Sparse Recovery Problem. Workshop on Sparse Representations & Compressive Sensing, 2017.
Estimating the Schatten Norms in Streaming Model. DIMACS Workshop on E+M=C2, 2017.
Introduction to the Data Stream Algorithms. Xiamen University, 2016
Introduction to the Sparse Recovery Problem. Fuzhou University, 2016
Introduction to the Data Stream Algorithms. Math Colloquium, Nanyang Technological University, 2016.
Data Streaming Algorithms. MAS Seminar, Nanyang Technological University, 2015.
For-all Sparse Recovery in Near-Optimal Time. Theory of Computation Seminar, Harvard University, 2015.
A Brief Introduction to the Sublinear-time Sparse Recovery Problem. MPI for Informatics, 2014.
Sublinear Fourier Sampling Off the Grid. Workshop
on Sparse Fourier Transform, MIT, 2013.
Approximate Sparse Recovery: Optimizing Time and
Measurements. Minisymposium
on Combinatorics and Data Science, Shanghai Jiaotong
University, 2011.
Approximate Sparse Recovery: Optimizing Time and
Measurements. DIMACS
Workshop on Network Data Streaming and Compressive
Sensing, 2010.
Teaching
- At Nanyang Technological University
- MH1812. Discrete Mathematics.
- Autumn 2020*, 2021*, 2022*, 2023*. (* denotes co-teaching with Jian Guo)
- MH2500. Introduction to Probability and Statistics.
- Autumn 2017*, 2018*, 2019. (* denotes co-teaching with Chan Song Heng)
- MAS723. Topics in Probability and Statistics I: Probability in High-dimensional Spaces
- Spring 2017, 2018, 2019, 2020, 2021, 2022, 2023.
- MH2401. Algorithms and Computing III.
- Autumn 2016. (co-teaching with Fedor Duzhin)
- At Shanghai Jiaotong University
- Summer 2018, 2021, 2022. Probability and Computing.
Services
Miscellaneous