Yonggang Wen
Research Philosophy

Learning-based System Prototyping and Performance Optimization for Large-Scale Networked Computer Systems

Our research interests are in the area of large-scale networked computer
systems, ranging from social media networks and cloud computing platforms to
green data centres and big data systems. Leveraging a unique background of
rigorous analytical training at MIT and system engineering experience at
Cisco, I have been leading our team to bridge the gap between theory and
practice, by extending theoretical insights into system prototyping and
performance optimization. At NTU, we have worked on applications of machine
learning, optimization theory, queuing theory and information theory, to
tackle practical challenges in a variety of Internet-scale system projects,
including a social TV platform, a modular data centre testbed and a big-data
solution over GPGPU virtualization, to name a few. In this process, we have
refined a research framework, called learning-based system prototyping and
performance optimization for large-scale networked computing systems, through
which some of these systems have leaped over the barrier between academia and
industry to become commercial products and services.

Learning-based System Prototyping and Performance Optimization

At NTU, we have been refining a learning-based approach
to system prototyping and performance optimization for large-scale networked
computing systems. Its workflow, as illustrated within the box, revolves around
an innovation cycle of agile system prototyping and data-driven performance
optimization. It starts with an
architectural concept, and is further substantiated with a reference system
design, which in turn serves as a blueprint for a system prototype built upon
open-source libraries and proprietary implementations. The resulting
prototype is then released for public trial to collect live operational data,
which is supplemented with related datasets from its ecosystem. This combined
set of public and private data is fed into a learning engine to generate
models for performance optimizations. Actionable guidelines from this joint
learning-optimization theory are then applied to the reference design, via
the emerging paradigm of software-defined systems. We expect the final
deliverables of this research practice to be refined products or services
that can be commercialized with little additional effort. As we go through
each cycle, high-quality publications can be obtained as by-products.

Research Projects

Toward Green Data Centre as an Interruptible Load for Grid Stabilization

Data centres have emerged as critical infrastructure fuelling Internet
innovations (e.g., cloud computing, big data). However, a data centre
typically consumes a huge amount of electrical power, wasting energy because
of its low utilization and aggravating the instability of a power grid with
volatile yields. In this research, we propose a ground-breaking concept of
flipping the data centre's power burden into an opportunity to stabilize a
power grid with fluctuating supply. Specifically, we aim to develop technical
and scientific solutions, with an arbitrage-free economic model, that enable
a data centre to act as an interruptible load (i.e., a power load that can be
scaled down temporally and spatially) and thereby stabilize a power grid
whose renewable supply is volatile due to varying weather conditions.

The technical solution leverages a transformative power analytics
framework, i.e., embedded software as sensors, in which software hooks are
embedded into a range of data centre subsystems, from chip to system to
application level, to log ICT activities and power usage in a fine-grained,
real-time manner. The collected data are then analyzed, via whitebox (e.g.,
kernel methods) and blackbox (e.g., deep learning networks) approaches, to
construct system power models, which are used to develop (near-)optimal
algorithms for energy-efficient data centre operations across computing,
power distribution and cooling subsystems. This holistic system monitoring
and optimization framework strives to reduce the overall power consumption of
a data centre, and to enable spatial and temporal shifting of workloads in a
network of geo-distributed data centres to mitigate the grid instability
resulting from stochastic renewable yields.

[Figure: Data Centre Energy Map]

Towards Outside Air Cooling and Energy-Efficient ICT Operations for Modular Data Center in Tropical Environment

The primary objective of this research program is to demonstrate the
practicality and cost-effectiveness of a modular data center, equipped with
outside air cooling and energy-efficient ICT operations, in a tropical
environment like Singapore. It takes an integrated approach towards a data
center design for the future in terms of its major sub-systems: IT equipment
(servers, storage and network), power supply infrastructure (including UPS
and back-up power) and cooling systems (mode of cooling, type of equipment
and systems design). Such an integrated approach enables matching of the IT
needs with the individual sub-system requirements with minimal overprovision
of resources. It also allows the key concerns of data center operations
(response to the IT demands with minimal latency, adequacy of resources to
ensure continuous equipment uptime, and operational conditions which minimize
failure rates) to be achieved concurrently without compromising on energy
efficiency. As the industry is marked by high variability in its operational
environment, there is a tendency for a data center to operate at a low
part-load for a substantial proportion of time; hence a highly modularized
design, coupled with the ability to ramp up resources within a short time, is
another desirable feature. The effectiveness of this integrated approach is
being demonstrated via two leading applications: HTTP video streaming and
big-data analytics.

Multi-Screen Cloud Social TV for Value-Added Content Services

This project aims to develop our
patent-pending cloud-centric media technologies into a multi-screen
cloud-based social TV platform, for which a system prototype will be
implemented for feasibility and usability studies. Research on big-data analytics on metadata and
social data will be pursued to further improve user experience. Two value-added
applications (e.g., video streaming over multiple screens and real-time TV
advertisement tracking) will be studied to establish the business value of
this technology. Our multi-screen Social TV technology
has been touted, by global media (1600+ news articles from 29+ countries), as
an innovative technology to transform the traditional “laid-back” TV viewing behavior with the
proactive “lean-forward” social
networking experience, marrying TV to the social networking lifestyle of
today. This platform, when fully developed and commercialized, would
transform the value of TV and potentially save it from a downfall similar to
that of newspapers. In our system, examples of salient and sticky features
include, but are not limited to, a virtual living room experience that allows
remote viewers to watch TV programs together with text, audio and video
communication modalities, and a video teleportation experience that allows
viewers to migrate programs seamlessly across different screens (e.g., TV,
smartphone and tablet) with minimal learning. Moreover, to meet the
requirements of
various customers, the platform will provide a set of Application Programming
Interfaces (APIs) for other developers to design, implement and deploy novel
value-added content services for specific customer needs (e.g., elderly home
care, TV ad workflow redesign, real-time TV shopping, collaborative
e-learning, autism diagnosis and assistive treatment, to name a few). In this
research, we plan to customize our solution for two high-value TV
applications: an immersive TV watching service across multiple screens and a
real-time TV advertisement tracking service, for which a campus-wide trial
will be conducted at NTU. Our novel technology outperforms other
commercially available solutions of similar usage, by providing the most
comprehensive features to meet end users' needs on every occasion, from
social networking to potentially home care monitoring, while offering the
required scalability to support a large number of concurrent users. Adoption
will be extremely easy through highly intuitive human-computer interfaces.
Initial discussions have generated high commercial interest in our
technology, in the form of collaboration enquiries with specific customer
needs from TV vendors, service providers and OTT content providers. These
needs dictate new features to be introduced into our system prototype and
additional R&D efforts on big-data analytics over social data and metadata
to provide higher value to our customers.

Toward Learning-based Thermal Comfort Models to Instill Behavioral Changes for Greener, Smarter and Healthier Building in the Tropics via Pervasive Sensing

This research proposes to develop online thermal comfort models, via a
deep-learning approach, and apply them for behavioral
studies to drive “greener, smarter and healthier buildings” in the tropics
(e.g., Singapore). Leveraging privacy-preserving
data analytics over information acquired from smartphone crowdsourcing and
in-situ wearables measurements, we plan to develop and validate an
integrative, economical and scalable thermal comfort management system, with
the following technical aims:
· To validate the canonical PMV model in the tropics via privacy-preserving
  data mining;
· To develop an online personalized thermal comfort model via a deep-learning
  approach; and
· To derive a unified utility mechanism for thermal comfort to instill
  behavioral changes in building occupants for greener, smarter and healthier
  buildings in Singapore.
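As a minimal sketch of what the first aim (validating the PMV model against crowdsourced feedback) could involve, the snippet below uses the standard PMV-to-PPD relation from ISO 7730 and compares its prediction with the share of dissatisfied occupants in a batch of thermal sensation votes. The function names, the sample PMV value, the sample votes, and the rule that a vote of magnitude 2 or more on the 7-point scale counts as dissatisfied are illustrative assumptions, not details of the system described here.

```python
import math


def ppd_from_pmv(pmv: float) -> float:
    """Predicted Percentage of Dissatisfied (%) for a PMV value (ISO 7730 relation)."""
    return 100.0 - 95.0 * math.exp(-0.03353 * pmv**4 - 0.2179 * pmv**2)


def observed_dissatisfied(votes: list[int]) -> float:
    """Share (%) of votes counted as dissatisfied.

    Assumption: votes use the 7-point scale (-3 cold .. +3 hot) and a vote
    with magnitude >= 2 is treated as dissatisfied.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    return 100.0 * sum(1 for v in votes if abs(v) >= 2) / len(votes)


# Hypothetical inputs: a PMV value produced by the canonical model for one
# zone, and thermal sensation votes crowdsourced from occupants of that zone.
pmv_from_model = 0.5
crowdsourced_votes = [0, 1, 1, 2, 0, -1, 1, 3, 1, 0]

predicted = ppd_from_pmv(pmv_from_model)
observed = observed_dissatisfied(crowdsourced_votes)
print(f"predicted PPD: {predicted:.1f}%  observed: {observed:.1f}%  "
      f"gap: {observed - predicted:+.1f} points")
```

A persistent gap between the predicted and observed shares across many zones would be one simple signal that the canonical model needs local recalibration.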
Our solution builds upon our expertise in pervasive sensing and data analytics,
and focuses on applied R&D with commercialization interest in smart
buildings. First, we will develop a human-centric solution to leverage
wearable devices (i.e., wristband) and mobile devices (e.g., smartphone) for
crowdsourcing user preference and in-situ measurements. Second, we will
perform privacy-preserving data analytics to transform the canonical thermal
comfort model (i.e., PMV) into an online paradigm for behavioral
studies. Finally, analytical insights will be validated in
the SinBerBest testbed for energy efficiency.
Working with local and international partners, we will showcase our R&D
outcomes locally and globally. Our expected deliverables include an
integrated thermal comfort management system, as well as a light-weight
mobile application, which will have been well tested in Singapore and can
scale up via a cloud data service for mass adoption with our
commercialization partners globally.

Toward Joint IT-Thermal Optimization to Improve Energy Efficiency for High-Ambient Temperature Data Centre in the Tropics via Learning-based Algorithms

In this research, we propose to develop learning-based algorithms for joint
IT-thermal optimization to improve energy efficiency for high-ambient
temperature (enterprise) data centers in the tropics. To tackle the paramount
challenge of the siloed approach to IT and facility systems in data centers,
we plan to adopt an interdisciplinary approach to develop advanced
energy-efficient technologies, with the following specific aims:
· To develop a data-driven mathematical framework, based on the highly-touted
  Deep Q-Networks (DQN), for controlling and optimizing large-scale systems
  with unknown dynamics and objectives. It first optimizes a low-cost hybrid
  sensing technique (i.e., UbiSense), combining our patented
  “software-as-sensors” technology (for ICT system performance counters,
  e.g., CPU usage, memory, I/O, etc.) with strategically-deployed physical
  sensors (for ambient temperature, humidity, noise and airflow), for data
  centre monitoring. The hybrid dataset is then fed into a deep-learning
  engine to train a set of sophisticated and domain-specific models to
  capture the profound relationship between IT and non-IT systems.
· To apply and validate robust learning-based algorithms for joint IT-thermal
  optimization, in harmony with the complex interplay between IT systems and
  non-IT systems. The joint optimization aims to increase the energy
  efficiency of enterprise data center operations while providing the desired
  system reliability and performance to ensure business continuity, under the
  extreme system dynamics of Singapore’s tropical weather.
· To conduct system trials for technology validation and commercialization
  with our private data center testbed at NTU and a public data center
  testbed provided by our government partner (i.e., National Super Computing
  Center). These trials will prepare us well for potential technology
  licenses and spin-off opportunities.

We believe that our holistic
approach, based on emerging machine-learning approaches, stands out as a
novel and practical solution to address the technical and operational
challenges in running data centers in a higher ambient
temperature environment. We are pioneering learning-based algorithms for
joint IT-thermal optimization, adopting a data-centric approach in contrast
to existing model-based approaches. The practicality of our proposed solution
was endorsed by the data center industry with the 2015 Data Centre Dynamics
Awards - APAC. We expect with good confidence that our solution, at its full
potential, will reduce the energy consumption of data centers in Singapore by
20%, leading to significant economic savings and environmental benefits as
Singapore realizes its envisioned digital economy transformation.
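To make the idea of learning a control policy without a known system model more concrete, here is a minimal sketch that uses tabular Q-learning as a deliberately simplified stand-in for the Deep Q-Network named in the first aim. Everything in it is a toy assumption made for illustration: the five temperature buckets, the three cooling levels, the constant heat load, the reward weights and all function names are hypothetical, and it controls cooling only rather than IT and cooling jointly.

```python
import random

N_TEMPS = 5              # temperature buckets: 0 (cool) .. 4 (overheated)
ACTIONS = (0, 1, 2)      # cooling effort per step; also its energy cost
HEAT_PER_STEP = 1        # toy assumption: IT load adds one bucket of heat per step
OVERHEAT_PENALTY = 10.0  # toy assumption: cost of ending a step overheated


def step(temp, cooling):
    """Toy environment: return (next temperature bucket, reward)."""
    next_temp = min(max(temp + HEAT_PER_STEP - cooling, 0), N_TEMPS - 1)
    reward = -float(cooling)              # pay for the cooling energy used
    if next_temp == N_TEMPS - 1:
        reward -= OVERHEAT_PENALTY        # pay for the reliability risk
    return next_temp, reward


def train(episodes=2000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_TEMPS)]
    for _ in range(episodes):
        temp = rng.randrange(N_TEMPS)     # random start so every state is visited
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[temp][a])
            next_temp, reward = step(temp, action)
            # One-step temporal-difference update toward the bootstrapped target.
            target = reward + gamma * max(q[next_temp])
            q[temp][action] += alpha * (target - q[temp][action])
            temp = next_temp
    return q


def greedy_policy(q):
    """Best cooling level for each temperature bucket under the learned values."""
    return [max(ACTIONS, key=lambda a: row[a]) for row in q]


q_table = train()
policy = greedy_policy(q_table)
print("cooling level per temperature bucket:", policy)
```

The agent never sees the rule inside `step`; it only observes temperatures and rewards, which is the sense in which the dynamics are unknown to it. A DQN replaces the table with a neural network so that the same update can cope with the much larger state space of real sensor and performance-counter data.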
Cost-Optimal Mobile Computing in the Cloud
This research aims to resolve a prominent tension between the growing usage
of smartphones and their resource-constrained nature, by offloading
computation tasks from handsets to a cloud infrastructure dynamically. In
particular, we introduce a new concept, the VMlet, which represents a
service container on a virtual machine, executing a set of computing tasks
dynamically offloaded from smartphones. The research challenge is how to
optimally decompose a mobile application, represented by a directed graph for
its workflow, into a set of virtual machines for cost-optimal execution. We
formulate this challenge as a constrained optimization problem, with an objective
to minimize a chosen cost metric (e.g., monetary cost or energy usage) for
either the mobile user or the mobile service provider, under the constraint
of quality of service (QoS) requirements (e.g., delay deadline). Our
analytical framework builds on our previous work in task offloading for
mobile cloud. Moreover, by solving a series of progressively more challenging
sub-problems, we will develop distributed algorithms to solve the
optimization problem and implement a prototype of the platform service for
feature verification and performance optimization. Our expected deliverables
include a suite of middleware, in which the front-end software will be
developed for popular mobile platforms and the back-end software will run
over public cloud platforms, and two pilot applications to showcase the
platform capabilities. The software package will be offered via a
Platform-as-a-Service (PaaS) model for application development, under an
open-source license. The flexibility it offers to application developers
would help to improve the brand awareness of emerging mobile devices and
infrastructure cloud platforms.

Acknowledgements
Maintained by Yonggang Wen