Major Research

“The most exciting phrase to hear in science,
the one that heralds new discoveries,
is not 'Eureka!', but 'That's funny…'”

--Isaac Asimov

Research Theme 1: Perceptual Signal Modeling & Processing

“To Make Machines Perceive Signals as Humans Do”

Signal quality evaluation plays a central role in shaping almost all signal processing algorithms and systems, as well as their implementation, optimization and testing. Since humans (with the senses of vision, hearing, touch, smell and taste) are the ultimate receivers of most signals, whether naturally captured or computer generated, after acquisition, processing and transmission, incorporating proper human perception characteristics not only makes the built systems user-oriented but also enables resource savings (i.e., turning the imperfections of human perception into design advantages).

The resultant metrics are to replace the existing mathematical measures (e.g., MSE, SNR, PSNR, or their relatives) for defining and gauging the distortion of processed signals, since MSE, SNR and PSNR do not reflect human perception well. Perceptual metrics are expected to fill a gap in most existing signal-processing-related products and services: a non-perception-based criterion is used in engineering design, even though the resulting devices/services are consumed by humans.
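As a minimal sketch of why such mathematical measures fall short (the gradient test pattern and the helper names mse/psnr are assumptions for illustration, not any published procedure): two distortions with identical MSE, and hence identical PSNR, can look very different to a viewer, e.g., a uniform brightness shift versus random noise of the same energy.

import numpy as np

def mse(a, b):
    return float(np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2))

def psnr(a, b, peak=255.0):
    m = mse(a, b)
    return float("inf") if m == 0 else 10.0 * np.log10(peak ** 2 / m)

# A smooth gradient as a stand-in "image".
ref = np.tile(np.linspace(0.0, 255.0, 256), (256, 1))

# Distortion A: a uniform brightness shift (hardly noticeable to a viewer).
shifted = ref + 10.0

# Distortion B: zero-mean noise rescaled to exactly the same MSE (clearly visible).
rng = np.random.default_rng(0)
noise = rng.standard_normal(ref.shape)
noise *= np.sqrt(mse(ref, shifted) / np.mean(noise ** 2))
noisy = ref + noise

# The two PSNR values coincide despite very different perceived quality.
print(psnr(ref, shifted), psnr(ref, noisy))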

This is an exciting, interdisciplinary research area, since it enables user-oriented designs and further improvement of system performance. We need to incorporate the latest relevant findings in neuroscience, brain theory, psychophysics, aesthetics, statistics, and user and cultural studies into computational models, and to verify such models against subjective viewing/hearing/sensing results; i.e., to make perception science truly quantitative.
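One concrete way such verification is commonly carried out in quality-assessment studies is to correlate a model's objective scores with mean opinion scores (MOS) collected from subjects, with PLCC and SROCC as the standard agreement measures. A minimal sketch, using made-up placeholder numbers:

from scipy.stats import pearsonr, spearmanr

mos = [4.5, 3.8, 2.1, 1.4, 3.0, 4.0]              # subjective mean opinion scores
predicted = [0.92, 0.80, 0.45, 0.30, 0.66, 0.85]  # objective model's quality scores

plcc, _ = pearsonr(predicted, mos)    # linear agreement
srocc, _ = spearmanr(predicted, mos)  # monotonic (rank-order) agreement
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")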

Some examples of research work under this theme:

As can be seen in the Publications part:

- JND (Just-Noticeable Distortion): the comprehensive theoretical formulation of JND has been introduced in the spatial [2003], transform [2005], spatiotemporal [2006], boundary-texture separation [2010] and pattern-masking (with brain theory) [2013] aspects, with extension to screen content [2016] and top-down mechanisms [2022]; we have also researched JND-based protection of privacy/copyright and the fight against adversarial attacks/deepfakes [2021-2023], and explored beyond the visual into audio, haptics, olfaction and taste [2022] (a toy illustration of a spatial JND profile follows this list).

- Visual attention modeling: the first quantitative solutions have been proposed for the modulatory effect [2005], modeling directly on compressed bitstreams [2011] (a leap, since virtually all visual signals are compressed), spatiotemporal uncertainty [2013], stereoscopic views [2014] and the use of deep learning [2019, 2020, 2022, 2023].

- IQA (Image Quality Assessment): besides several technological breakthroughs in signal-driven IQA for natural and partially artificial images/videos [2003-2022], our effort has led to the emergence of machine learning as a new IQA category [2009-2013], IQA with big foundation models [2022, 2023] and for generative AI [2023], the first attempt at fine-grained IQA [2019, 2022, 2023] (an overlooked but important field), and the demonstration that foundation models possess IQA capability [2024].

- Perceptual video coding: the built perceptual models enable effective resource allocation and signal optimization/reconstruction [2003, 2005, 2013-2017], with extension to 3D visual signals [2022, 2023].
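Purely as a conceptual illustration of what a spatial JND profile provides, the sketch below maps local background luminance to a per-pixel visibility threshold (larger in very dark and very bright regions); the function name, curve shape and constants are made up for exposition and are not any of the published formulations cited above.

import numpy as np

def toy_luminance_jnd(background_luminance):
    """Per-pixel visibility threshold (grey levels) from local background luminance in 0..255."""
    bg = np.asarray(background_luminance, dtype=np.float64)
    t = np.empty_like(bg)
    dark = bg < 127
    # Higher thresholds in dark regions, gently rising thresholds in bright regions.
    t[dark] = 17.0 * (1.0 - np.sqrt(bg[dark] / 127.0)) + 3.0
    t[~dark] = 3.0 / 128.0 * (bg[~dark] - 127.0) + 3.0
    return t

background = np.tile(np.linspace(0.0, 255.0, 64), (64, 1))  # smooth test pattern
jnd_map = toy_luminance_jnd(background)
# Any per-pixel change with magnitude below jnd_map is treated as imperceptible,
# which is what makes JND profiles useful for resource allocation in perceptual coding.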