Workload Adaptive Power Management with Live Phase Monitoring and Prediction
Canturk Isci
ABSTRACT
In current computer systems, power dissipation is widely
recognized as one of the primary critical constraints.
Improving the power efficiency of current and emerging
systems has therefore become a pressing challenge and an
active research area over recent years.
Dynamic, on-the-fly management techniques aim to address
this challenge by adaptively responding to the
changes in application execution.
These application patterns, commonly referred to as
"phases", expose distinct, dynamically-varying and
often repetitive characteristics of workloads.
Dynamic management techniques, guided by workload phase
information, can effectively tune system resources
to varying workload demands for improved power-efficiency.
This thesis researches new methods to characterize and predict
application behavior with the end goal of dynamic power management.
Specifically, this work has two major thrusts.
First, it explores different approaches to characterize and
predict dynamically varying workload power behavior. Second,
it discusses runtime management techniques for real systems
that can proactively adapt processor execution to
varying application execution characteristics.
This work develops a runtime, real-system power model
that provides processor power consumption details in terms of the
component powers of different architectural units.
We show that similarity analysis methods applied to
these component powers help expose power phase
behavior of applications. A small set of "power signatures"
can represent overall workload power characteristics within
5% of the actual behavior. We develop a "transition-guided"
phase detection framework that can identify repetitive application
phase patterns despite system-induced variability effects.
This detection strategy can identify recurrent phase
signatures with less than 5% false alarms on running systems.
Last, we propose a workload-adaptive dynamic power management
framework guided by runtime phase predictions.
This predictive power management approach is shown to
improve the energy-delay product of a deployed platform
by 7% when compared to existing reactive techniques and
by 27% over the baseline unmanaged system.
Overall, this thesis shows a roadmap to effective
on-the-fly phase detection and prediction
on real systems for application to workload-adaptive
dynamic power management. With the increasing focus
on adaptive and autonomous system management,
this research offers practical techniques that can
serve as integral components for
current and emerging power-aware systems.
1 Background and Motivation
Computing systems have experienced a tremendous sustained growth
in performance and complexity for more than two decades.
Exponentially increasing transistor integration enables
more devices to
be packed within single chips, which in turn provides more
functionality and state with each generation of processors.
Figure 1.1 illustrates this for a range of processor
families [12,36,53,56,141,144].
Moreover, reduced process dimensions enable faster switching transistors,
driving higher operating frequencies with each generation.
Coupled with technology advances, new architectural and compiler
techniques have pushed the performance bar even higher with deeper
pipelines, high speculation, out-of-order and superscalar microarchitectures, and
increasing instruction-level parallelism. In addition,
new simultaneously multithreaded and multicore systems enable thread-level
parallelism [66,134,156,157,170].
All of these advances translate into more computations per unit time
with each new computer generation.
Figure 1.1: Number of transistors within a die over time.
From a historical perspective, this has been tremendous forward progress
in computing performance. By leveraging both technological and architectural
advances, microprocessor designers have been able to actually surpass the
performance trends indicated by Moore's Law [129,133].
For example, when we look at the reported performance results with the
SPEC CPU2000 benchmarks between 2000 and 2006, we see a more than 10-fold
increase in integer performance and a 14-fold increase in floating point performance
for Intel family processors [165]. This unabated push towards
higher performance and reduced form factors has provided currently emerging
mobile devices with computing capability that was previously confined to mainframe
systems.
Figure 1.2: Processor power density over time.
Nonetheless, this forward progress in performance has not come for free. Together with
increasing clock rates and performance capabilities, the power dissipation of computing systems
has also accelerated rapidly. Figure 1.2 illustrates this for Figure 1.1's
processor families over the same time period
[12,31,53,57,141].
As this figure demonstrates,
processor generations also experienced an exponential increase in power
density. This increase in power density has recently become one of the primary
constraints in microprocessor design. First, stemming from both increased power dissipation
and widespread adoption of personal computers, the overall energy impact of computing
systems has become an important issue. Once again looking from a historical perspective,
the total worldwide processor power dissipation
of personal computers increased by more than 50 times over the last decade [173].
Second, increasing power density has also directly influenced thermal limitations of
processors, requiring advanced cooling and thermal management strategies
[58,155]. Third, increasing power demand, as well as
the temporal and spatial power variations within microprocessors have produced
significant strain on effective and reliable power delivery [92,141].
Last and more recently, the financial and environmental impacts of computing system power dissipation
have also been widely acknowledged. Especially in large-scale data centers, the current
annual cost of power delivery and cooling has reached the order of millions of dollars.
If the current trend continues in next-generation systems, with advances in computing performance
accompanied by rising power demand, the ongoing costs of power and cooling
can soon surpass the initial cost of the underlying computing hardware by a growing
margin [12]. To address the impacts of computer power dissipation,
the Environmental Protection Agency has recently announced new specifications for
computer power-efficiency [171]. Based on the projections of these specifications,
improving the energy-efficiency of computing systems can potentially achieve $1.8 billion
of total energy cost savings over the next five years. Moreover, such emphasis on
computing power can eliminate greenhouse gas emissions equivalent to the annual
emissions of 2.7 million cars.
Interestingly, this is not the first time the computing industry has faced the
power challenge. Early mainframe systems that relied on bipolar devices had experienced
a similar exponential growth in power until the early 1990s, at which point the mainframe industry
had to move towards CMOS devices that enabled order-of-magnitude improvements in
power densities [146]. Less than two decades later, we have
once again approached the limits of power density. As CMOS technology continues to be the
viable design option for microprocessors, there is a growing necessity to devise and employ
effective power management techniques in all levels of computing systems, from circuits
and architectures to systems and software. Indeed, recent years have unveiled numerous
research efforts that aim to address power-efficiency at all levels of abstraction.
These different power-management strategies can be categorized as either static or dynamic
management approaches. Static, or offline, techniques involve design-time decisions, profile-based
optimizations and compiler-driven management responses.
These approaches are employed at various design stages and abstraction layers.
These include circuit-level techniques such as transistor reordering and dual-threshold
circuits [104,118,161], architectural mechanisms such as
profiling-based adaptations at subroutine granularities or execution checkpoints
[7,75], systems- and application-level approaches
such as task partitioning and stretching, deadline-based scheduling, software
transformations and remote execution
[43,102,114,164], and compiler-driven
management techniques that involve profiling and instrumentation
of applications with power management hints or state keeping instructions
[1,65,71,122,154,180].
Dynamic, or online, power management techniques involve runtime control mechanisms
in hardware or software; they tune the configurable computing resources during execution.
There is a large variety of dynamic management techniques
across the whole spectrum of computing systems hierarchy,
spanning from circuit level techniques to application and compiler level
power management. Circuit-level adaptations include techniques such as adaptive
body biasing and multi-threshold CMOS circuits (power gating)
[4,97,98]. Architectural power management
techniques involve
pipeline reconfigurations [3,8,26,90,139,153],
adaptive cache scaling and decay [41,48,96,140],
pipeline-delay-based supply voltage tuning [47],
speculation control [23,123],
multiple clock domain architectures [147,178]
and management techniques for chip multiprocessors [94,103,115].
At the system level, many power-aware adaptations exist that target
dynamic management of the system operation and the underlying platform components.
One of the most widely used dynamic power management techniques at the system level is
workload-dependent dynamic frequency and voltage scaling
[33,176].
Some other employed dynamic power management techniques are
adaptive disk control [60],
energy-efficient I/O and memory management [110,162,136,143,177,186],
task-level energy budgeting [5,20,119]
and power-aware scheduling [67,127].
In addition to system-level management approaches, there are also some power-aware dynamic compilation techniques
[73,172,179].
Static approaches generally have the broad view of the entire
application, and lead to simpler control. However, they lack the actual dynamic execution
information of applications. Many software-level static management approaches also require
prior profiling of applications or recompilations to incorporate compiler directives. In contrast,
dynamic techniques are directly exposed to the dynamic execution behavior and can guide
management responses on-the-fly. However, the major drawback of these online techniques
lies in their limited view of application execution as they cannot know a priori the whole
application structure. In general, dynamic management also
necessitates more elaborate monitoring and control schemes to track execution characteristics
and to apply management responses. Nonetheless, as the need for aggressive power
management continues to increase, such control mechanisms become more attractive
in emerging systems despite the design effort they require. In particular, as current workloads
exhibit highly variable and nondeterministic characteristics, and as the pool of legacy
applications grows, static techniques bring limited benefits.
Dynamic management techniques offer significant additional
improvements in overall system power efficiency.
My research particularly aims to leverage the broad view of application execution
at runtime by monitoring architectural characteristics of applications and inferring
dynamically-varying workload behavior. I use observed runtime workload
characteristics to detect and predict repetitive application execution and
this repetitive behavior information guides dynamic management techniques.
One of the primary drivers of dynamic power management is the inherent variability
in both the running workload demands and the underlying computing systems.
Efficiently matching the underlying resources to the dynamically varying application
demands by adaptively configuring these computing structures is a powerful enabler
for power-efficient computation. My dissertation research focuses on two important
research challenges for such workload-adaptive and dynamically-controlled execution:
(i)
Developing accurate and practical characterizations of dynamically
varying workload demands and correctly projecting future behavior.
(ii)
Efficiently managing the dynamic configurations of the
underlying computing resources based on projected workload demand.
One primary focus of my dissertation research is to bring real-system experimentation and
validation with real measurements into architecture research.
In the following chapters of this dissertation, I provide an overview of the
different research aspects and the accomplishments of my research along these
two thrusts.
2 Research Overview
My dissertation research explores architectural and real-system techniques to
characterize and predict wide-scale power behavior of programs and develops
autonomous methods that track and predict dynamically-varying workload characteristics to
guide runtime, workload-adaptive power management techniques.
Many of the presented studies aim to explore and leverage the phase behavior of
workloads. This phase behavior represents the temporal variations in workload behavior
that are commonly observed during execution. These workload phases are known to
exhibit repetitive patterns due to the iterative nature of dynamic execution and
can be observed in various forms such as performance characteristics,
power consumption and traversed execution address space. Moreover, different phase
patterns can be observed at different phase granularities from a few hundred instructions
to billions of instructions. Figure 1.3
shows an example of this phase behavior with an execution snapshot from the
SPEC CPU2000 vortex benchmark when its execution characteristics are classified into
two major phases. In this example, the three charts show the
phase behavior for vortex for two performance metrics as well as with the
actual measured power behavior.
Figure 1.3: Phase behavior as observed from the measured performance
metrics and power for the vortex benchmark. This execution snapshot can be roughly
separated into two phases that repeat throughout benchmark execution.
From a high-level perspective, my thesis research contributes to existing
literature in four related research areas:
- First, it presents one of the first real-system frameworks for modeling
microarchitecture-level power consumption of modern processors at runtime.
- Second, I describe workload phase analysis methodologies that target
characterizing the dynamically-varying power behavior of applications.
- Third, my research is one of the first real-system phase analysis studies
that tackles the problem of identifying repetitive execution characteristics
despite the system-induced variability effects. In this direction, I
propose novel phase characterizations and present effective techniques
that mitigate the sampling and variability effects.
- Fourth, my work demonstrates a complete predictive dynamic management system
that predicts application characteristics at runtime and performs autonomous
system adaptations to improve power efficiency based on projected workload demand.
Moreover, in this dissertation I describe three different real-system infrastructures
that I developed for experimentation and evaluations. These infrastructures
are deployed in running systems for remote power monitoring and estimation,
phase analysis with dynamic instrumentation and real-measurement feedback,
and phase-prediction-driven dynamic power management.
Below, I provide an overview of each of these four major aspects of my dissertation
research, which are detailed in the subsequent chapters of this thesis.
2.1 Live, Runtime Power Estimation
The ability to measure or model processor power dissipation lies at the heart of
power-oriented computing research. At the architecture level, much of this is
performed via simulator infrastructures. These either perform analytical power
derivations for architecture components based on technology parameters [24]
or use empirical power model macros derived from lower-level production
simulators [21]. Regardless of the approach taken, the architectural power
modeling principle remains similar, where the derived maximum component powers are
scaled with component utilization rates and architectural parameters to form component-level
power estimates. Together with holding or idle power at zero utilization, these power
estimates can then approximate the processor power dissipation. While such simulation-oriented
techniques provide extensive detail, they are generally prone to limited absolute
accuracy, they are impractical for long-timescale simulations and they often consider applications
in an isolated environment, thus lacking the effects of underlying system events.
Real system measurements can remedy these shortcomings
[51,142,168].
However, they generally lack
the architectural detail provided by simulations and focus only on total power dissipation.
This line of my research explores an alternative approach to modeling
processor power consumption that aims to leverage the advantages of both domains.
I propose a real-system power measurement and estimation approach that can also
provide microarchitecture-level detail. Fundamentally, this power modeling approach
is similar to the simulation approach, where we consider maximum component powers
scaled with activity factors. However, instead of cycle-level accounting, my
technique relies on hardware performance monitoring events to track component activity.
Moreover, I develop this as a runtime power estimation strategy that operates at
native application execution speed. I use real power measurement feedback to
calibrate power estimators, to incorporate nonlinear power behavior of processor
components due to baseline power management techniques and to provide a validated
absolute estimation accuracy. While there are prior studies that also investigate
event-counter-based power estimations [13,93,95],
these studies do not focus on the distribution of power to the architectural components.
Furthermore, they only consider processors with small power variation. My work provides both validated
total power estimates and their decomposition into architectural components.
These estimates are evaluated on a high-end system with aggressive speculation and
baseline power saving techniques, where the observed power at different execution regions
can vary by as much as 600%. This runtime power estimation framework can approximate
processor power behavior within 5% of actual power consumption, as validated with
simultaneous real measurements.
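The modeling principle described above (maximum component powers scaled by activity, plus an idle term) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the component names, the maximum power values, the `access_rate` definition and the single lumped idle term are all assumptions chosen for illustration, and the sketch omits the measurement-based calibration and nonlinear corrections mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    max_power_w: float  # hypothetical calibrated maximum power, in watts


def access_rate(event_count: int, cycles: int) -> float:
    """Activity proxy: counter events per cycle, clamped to [0, 1]."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return min(1.0, max(0.0, event_count / cycles))


def estimate_power(components, event_counts, cycles, idle_power_w):
    """Return (per-component watts, total watts) for one sampling interval.

    Each component contributes max_power_w scaled by its access rate;
    idle_power_w is an assumed lumped power at zero utilization.
    """
    per_component = {
        c.name: c.max_power_w * access_rate(event_counts.get(c.name, 0), cycles)
        for c in components
    }
    total = idle_power_w + sum(per_component.values())
    return per_component, total


# Illustrative numbers only; these are not measurements from the thesis.
components = [Component("l1_cache", 4.0), Component("int_alu", 2.0)]
breakdown, total = estimate_power(
    components, {"l1_cache": 500, "int_alu": 250}, cycles=1000, idle_power_w=10.0
)
print(breakdown, total)
```

In a real deployment the event counts would come from hardware performance counters sampled at runtime, and the per-component maxima would be tuned against measured power.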
2.2 Phase Analysis for Power
In recent years, there has been a growing interest in application
phase behavior. Part of this interest focuses on identifying workload
phases for characterization purposes and summarizing execution, while
others explore methods to detect phases at runtime to guide dynamic
adaptations
[6,41,72,90,152,153].
With such phase-based adaptations, computing hardware
and software can be tuned at runtime to the demands of
different program phases. Prior research has considered
a range of possible phase analysis techniques, but has
focused almost exclusively on performance-oriented phases.
Moreover, the bulk of phase-analysis studies have focused on
simulation-based evaluations. However, effective and practical
analysis of application phase behavior on real systems is essential
to employing these phase-based adaptations on running systems.
In addition, there is generally a missing link between
phase characterizations and their ability to represent
power behavior.
Such power characterization is very important especially for dynamic
power and thermal management, providing a direct relation between
dynamic workload execution and its impact on processor power consumption.
In this thesis I describe a phase analysis methodology that is targeted
directly towards characterizing workload power behavior.
This approach uses the temporal similarity among estimated
component power dissipations to discern the phase patterns in
workload power behavior. The power phase characterizations acquired
with this method capture the power variations during workload execution
within 5% of actual measurements using a small set of representative
phases. These phases generally summarize overall execution with less
than 1% of the complete execution information.
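As a rough illustration of grouping execution samples by the similarity of their component power vectors, the following Python sketch assigns each sample to a representative "power signature". The distance metric (Manhattan distance), the first-fit grouping rule and the threshold value are assumptions made for this example; the text above does not specify which similarity measure or grouping algorithm is used.

```python
def manhattan(a, b):
    """Sum of absolute per-component power differences, in watts."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign_power_phases(power_vectors, threshold_w):
    """First-fit grouping of samples into power signatures.

    Each sample joins the first existing signature within threshold_w;
    otherwise it becomes a new signature. Returns (signatures, phase_ids).
    """
    signatures = []
    phase_ids = []
    for vec in power_vectors:
        for phase_id, sig in enumerate(signatures):
            if manhattan(vec, sig) <= threshold_w:
                phase_ids.append(phase_id)
                break
        else:
            # No existing signature is close enough: start a new phase.
            signatures.append(tuple(vec))
            phase_ids.append(len(signatures) - 1)
    return signatures, phase_ids


# Hypothetical two-component power samples (watts), one tuple per interval.
samples = [(2.0, 0.5), (2.1, 0.5), (0.4, 3.0), (2.0, 0.6), (0.5, 3.1)]
signatures, phase_ids = assign_power_phases(samples, threshold_w=0.5)
print(phase_ids)
```

Here five samples collapse into two signatures, which is the sense in which a small set of representative phases can summarize a much longer execution.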
I develop a novel real-system framework for power-oriented phase
analysis that coordinates performance monitoring, power estimations,
dynamic instrumentation and real power measurements. With this
evaluation infrastructure I demonstrate the comparative benefits of
different phase characterization techniques that utilize control-flow
or event-counter features of applications. This part of my work
shows that while both features reveal significant insights into
power phase behavior, event counter features further provide
a 33% improvement in the characterization of workload power variations.
2.3 Mitigating System Induced Variability Effects on Real-System Phase Detection
One primary requirement for the application of phase-based dynamic adaptations
is the ability to discern repetitive execution. Detecting repetitive phases
in application execution helps apply dynamic management responses proactively,
thus improving their overall effectiveness. Real system experiments bring additional
challenges to the detection of such repetitive behavior due to system
induced variations. Therefore, it is essential to understand how these
nondeterministic system events alter workload phases from phase to phase
and from run to run. Consequently, for a phase detection technique to be
effective on real systems, it should be resilient to these variability
effects.
This part of my work examines the phase behavior of applications
running on real systems to reliably discern and recover phase
behavior in the face of application variability stemming from
real-system and time sampling effects.
I discuss and classify the extent and type of the alterations
application phases experience with real-system experiments.
I propose a set of new, "transition-based"
phase detection techniques. These techniques can
detect repetitive workload phase information from time-varying,
real-system measurements with less than 5% false
alarm probabilities. In comparison to previous detection methods,
my transition-based techniques achieve
6-fold improvements on average in phase detection efficiency
by mitigating the system-induced variability effects.
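To give an intuition for why comparing phase transitions can be more robust than comparing raw sample values, the Python sketch below reduces two traces of a monitored metric to their transition points and matches them with a small time tolerance. It is only a hedged illustration of the general idea: the functions `transitions` and `match_score`, the jump threshold `delta` and the `tolerance` window are hypothetical and are not taken from the thesis.

```python
def transitions(trace, delta):
    """Indices t where the metric jumps by more than delta between t-1 and t."""
    return [t for t in range(1, len(trace)) if abs(trace[t] - trace[t - 1]) > delta]


def match_score(trans_a, trans_b, tolerance):
    """Fraction of transitions that pair up within `tolerance` samples.

    Greedy one-to-one matching; 1.0 means every transition found a partner.
    """
    if not trans_a and not trans_b:
        return 1.0
    unused = list(trans_b)
    matched = 0
    for t in trans_a:
        candidates = [u for u in unused if abs(u - t) <= tolerance]
        if candidates:
            best = min(candidates, key=lambda u: abs(u - t))
            unused.remove(best)
            matched += 1
    return matched / max(len(trans_a), len(trans_b))


# Two hypothetical runs of the same workload: run_b is shifted by one sample
# and carries small level offsets, as system-induced variability might cause.
run_a = [1.0, 1.0, 5.0, 5.0, 5.0, 1.0, 1.0, 5.0, 5.0]
run_b = [1.1, 1.0, 0.9, 5.2, 5.0, 4.9, 1.2, 1.0, 5.1]

trans_a = transitions(run_a, delta=2.0)
trans_b = transitions(run_b, delta=2.0)
print(match_score(trans_a, trans_b, tolerance=1))
print(match_score(trans_a, trans_b, tolerance=0))
```

With a one-sample tolerance the two runs are recognized as the same pattern, while an exact-alignment comparison finds no match, which is the kind of sensitivity to time shifts that a variability-resilient detector has to avoid.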
2.4 Runtime Phase Tracking and Phase-Driven Dynamic Power Management
One of the primary motivations for doing power management dynamically is the
highly variable phase behavior within applications at different execution regions.
Dynamic management techniques benefit greatly from this application
phase behavior, which can help identify workload execution
regions with different characteristics, and thus can dictate different
dynamic management responses. Most existing dynamic management techniques
respond to these phase changes reactively. When they observe a noticeable
deviation from previous application characteristics, these techniques
adjust the underlying system configurations dynamically, assuming
this recent behavior will persist in future execution
[33,41,90,162,176,186].
However, these approaches have difficulty when applications
change characteristics at a high rate. In such cases, recognizing and predicting
phases on-the-fly provides better adaptation of the applied dynamic configurations.
Therefore, it is important to develop methods to identify and predict repetitive
phases, to proactively apply dynamic management responses.
My work develops online phase prediction methods that can be applied in running
systems and demonstrates how these runtime phase predictors can effectively guide
dynamic, on-the-fly processor power management. I describe a general-purpose phase
prediction framework that can be configured for different power-performance
trade-offs and can be utilized to track various application characteristics
for the desired management actions. This phase predictor operates at runtime
with negligible overheads and autonomously tracks and predicts application phases.
These phase predictions can be employed to guide various management techniques.
In my real-system experiments I demonstrate their benefits with dynamic voltage and
frequency scaling (DVFS) as an example technique.
I implement this complete runtime phase prediction and phase-driven dynamic adaptation
infrastructure on a mobile laptop platform. Compared to existing reactive and statistical
approaches, my phase predictor significantly improves the accuracy of the
predicted workload behavior, reducing the misprediction rates by 2.4X for
applications with variable behavior. My experiments demonstrate that DVFS-based
dynamic management improves the energy-delay product of the experimental system by
27% on average, when guided by my runtime phase predictor. Compared to prior reactive
approaches, these dynamic adaptations improve the energy-delay product of applications
by 7%, while incurring less performance degradation.
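The difference between reactive and predictive management can be sketched in Python by comparing a last-value predictor, which assumes the most recent phase persists, with a simple table that learns which phase followed each recent phase pattern. The `HistoryTablePredictor` below is an assumed, simplified stand-in for a history-based predictor and is not the predictor developed in this thesis; the phase sequence is synthetic.

```python
from collections import deque


class LastValuePredictor:
    """Reactive baseline: predict that the last observed phase repeats."""

    def __init__(self):
        self.last = None

    def predict(self):
        return self.last

    def update(self, phase):
        self.last = phase


class HistoryTablePredictor:
    """Predict the phase that previously followed the same recent history."""

    def __init__(self, depth=2):
        self.history = deque(maxlen=depth)
        self.table = {}

    def predict(self):
        if not self.history:
            return None
        # Fall back to last-value behavior for patterns not seen before.
        return self.table.get(tuple(self.history), self.history[-1])

    def update(self, phase):
        if len(self.history) == self.history.maxlen:
            self.table[tuple(self.history)] = phase
        self.history.append(phase)


def mispredictions(predictor, phases):
    """Count intervals where the prediction differs from the observed phase."""
    wrong = 0
    for actual in phases:
        if predictor.predict() != actual:
            wrong += 1
        predictor.update(actual)
    return wrong


# Synthetic, rapidly alternating phase sequence (phase IDs are arbitrary).
phases = [1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1, 3]
last_value_errors = mispredictions(LastValuePredictor(), phases)
history_errors = mispredictions(HistoryTablePredictor(depth=2), phases)
print(last_value_errors, history_errors)
```

On this repetitive sequence the last-value predictor is wrong at every phase change, while the history table stops mispredicting once it has seen each pattern. In a phase-driven DVFS scheme, the predicted phase would then select the voltage and frequency setting for the next interval.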
3 Literature Review
This section gives a general overview of existing work related to my
thesis research. Each of the following chapters provides more detailed
discussions of prior work specific to each of the presented studies.
Here, I discuss related literature along the main areas of contribution
discussed above. These are categorized under three areas: processor power
modeling, workload characterization and phase analysis, and workload-adaptive
power management.
3.1 Processor Power Modeling
Earlier work on processor power modeling involves
power measurement feedback for software and
instruction-level power models.
These include instruction energy tables
and inter-instruction effects for processor and
memory [113,126,168].
Software power models aim to map energy consumption
to program structure [51,142].
In general, these techniques are employed in simpler or embedded
processors with minimal clock gating and power management that
exhibit low temporal variations. In these cases, the power
behavior largely depends on the operating
frequency and voltage [28] and simple
table-based approaches provide good approximations to
processor power behavior.
Architectural and functional module-level power modeling
has also been prevalent in power-aware computing studies.
These have focused mostly on high-level abstractions of
processor components. These abstractions encompass
energy consumption models driven by functional unit complexity,
profiled averages or switching activities particular to different
units [105].
Starting from simple average-case estimates
[145], these power estimators evolved into
activity and lookup based power models
[106,107] that can
also incorporate inter-module interactions [125].
As more capable and detailed execution- or trace-driven
architectural simulation tools became available, accompanying
cycle-accurate power modeling tools have also been developed.
Among different power estimation frameworks, here I mention
several of the most commonly used models. Wattch is a
processor power modeling infrastructure that relies on
parameterized power models for different processor building
blocks such as array and associative memory structures, logic,
interconnect and clock tree [24]. SimplePower is
another cycle-accurate energy estimation tool that uses energy
models together with switch capacitance tables for each
microarchitectural unit [175]. These approaches
use analytical energy models that rely on circuit capacitance
parameters. In contrast, PowerTimer uses an empirical
energy estimation model based on circuit-level energy
models derived from low-level simulations [21]. Last,
SoftWatt provides a full-system power model, including
the processor and the complete memory hierarchy [59].
More recently, there has been growing interest in runtime
architectural power modeling on real systems. These approaches enable
power estimations for the long timescales that are required for
system-level and thermal adaptations. Since these approaches lack
extensive simulation-style detail, they rely on supporting hardware
or software functionality such as performance counters to drive
power estimations. Prior work demonstrates that several performance
monitoring events correlate highly with processor power dissipation
[13]. These events can be configured to track and estimate
processor power behavior and can be used to infer the distribution
of power to microarchitectural components
[93,95,176].
This runtime information is used in conjunction with analytical
models for detailed component-level power estimates
[18,19,34,111].
Simple runtime models are also employed to track the operating
system's contribution to power consumption [116].
While the above approaches consider fixed, static power models,
adaptive, feedback-driven power estimation models have also recently
been explored [61]. As power dissipation
and thermal limitations become pressing issues in large-scale
systems, such runtime models are also emerging in the server and
cluster domains to enable efficient monitoring and dynamic
management of large-scale systems [45,63].
In runtime power modeling, my work is one of the first studies that provides
microarchitecture-level
power estimations on real systems for a high-end, highly speculative processor.
I develop power estimation models that track the power consumption of
microarchitectural units in all execution regions with high or low processor
utilization. Moreover, my work presents a complete power modeling and validation
framework including remote runtime monitoring and real-time power measurement
feedback.
3.2 Workload Characterization and Phase Analysis
There is a large body of existing work related to workload
characterization and the analysis of application phase behavior.
These studies can be classified under various themes such as
online and offline approaches, simulation-based and real-system
characterization, characterizations with different workload
features and for different endgoals.
One set of existing research employs different characterization
techniques to summarize execution with representative regions
or phases. Some of these techniques use simulations to classify
workload execution based on programmatic information
(such as executed instruction addresses and visited basic blocks)
[32,40,72,151,152]
or performance characteristics
[35,46,101].
Another line of phase characterization research focuses on
real-system studies that track hardware events or dynamic
program flow
[6,29,108,128,131,132,169].
Several of
these studies employ a wide range of similarity measures and
clustering methods such as k-means, regression trees, principal
or independent component analysis for online or offline
classification of execution into self-similar regions.
A major area of research focuses on monitoring and detecting
workload phase behavior for dynamic adaptations [68].
These studies
use various workload features and evaluation techniques in their analyses.
Some of these studies focus on different indicators of dynamic
program flow to monitor varying workload characteristics
such as branch counts [90], working set signatures
[41],
traversed basic blocks [109,153] and
visited subroutines [75]. These approaches
track patterns in execution flow to trigger suitable dynamic
management responses that employ various architectural
reconfigurations. In addition to the above simulation-oriented
studies, some real-system studies consider detecting specific
application behavior for dynamic responses. These works
track application phases to control management schemes readily available
in current systems, such as voltage and frequency scaling
[176,179],
or to detect changes in the execution space and drive dynamic optimization
strategies in runtime systems [38,100,120].
Application phase monitoring and detection guides dynamic adaptations
to react to the changes in observed characteristics. Once the new
behavior is detected, corresponding responses in tune with the demands
of the new phase can be activated. However, predicting this change in
application characteristics can provide additional benefits by initiating
management proactively. This is especially
important in the case of quickly varying application behavior, where
the fundamental frequency at which the application phases change is
close to the sampling rate of the tracked characteristics. Existing
research has employed different strategies to predict varying workload
characteristics. Compiler- and application-level techniques develop
static, analytical models based on program structure to predict changes
in workload characteristics such as memory access patterns
[52,54]. Several prediction schemes that
dynamically update their decisions during workload runtime have been
proposed at the systems and architecture levels. At the system level,
both statistical and table-based approaches that predict specific
workload characteristics based on previous history have been proposed
[44]. In addition, memory related runtime phase
predictors based on memory reuse distance patterns [150],
as well as dynamic code region based phase predictions [99]
have been studied in prior related work. In architectural studies, the
ability to propose hardware support has led to more elaborate phase
prediction mechanisms. Run-length and control-flow based phase predictors
have been developed with hardware support to predict
phases in the dynamic execution space of applications [153].
In addition to predictors of future workload phases, alternative
schemes that predict phase changes and durations have also been employed
in architectural implementations [109]. Overall, these works
demonstrate effective prediction techniques across a wide range of
granularities, with a variety of workload features spanning both hardware and
software mechanisms.
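The table-based, history-driven style of prediction discussed above can be sketched as follows. This is an illustrative toy under stated assumptions, not the predictor of any cited work: the class name `HistoryTablePredictor`, the history depth of two, and the last-value fallback for unseen patterns are all choices made for this example.

```python
from collections import Counter, defaultdict


class HistoryTablePredictor:
    """Predict the next phase from the last `depth` observed phases."""

    def __init__(self, depth=2):
        self.depth = depth
        self.history = ()                  # most recent phases, oldest first
        self.table = defaultdict(Counter)  # history pattern -> next-phase counts

    def predict(self):
        counts = self.table.get(self.history)
        if counts:
            # Seen this pattern before: predict its most frequent successor.
            return counts.most_common(1)[0][0]
        # Unseen pattern: fall back to last-value prediction.
        return self.history[-1] if self.history else None

    def update(self, phase):
        # Record the observed successor once a full history pattern exists.
        if len(self.history) == self.depth:
            self.table[self.history][phase] += 1
        self.history = (self.history + (phase,))[-self.depth:]


def run_predictor(trace, depth=2):
    """Return the prediction made before each observed phase in `trace`."""
    predictor = HistoryTablePredictor(depth)
    predictions = []
    for phase in trace:
        predictions.append(predictor.predict())
        predictor.update(phase)
    return predictions


# Hypothetical phase trace with a repeating 1,1,2,2 pattern.
predictions = run_predictor([1, 1, 2, 2] * 5)
```

On a repetitive trace such as this one, a purely reactive last-value scheme mispredicts at every phase change, whereas the history table stops mispredicting once each pattern has been observed.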
My research contributes to the existing body of phase analysis work in
characterization, detection and prediction of application phases with
a primary focus on real-system phase analysis methods. While most of
the existing phase characterization work focuses on performance behavior
of workloads, my thesis presents new techniques to identify power phase behavior
of applications using hardware performance monitoring features. It develops
novel strategies to detect repetitive application phases on real systems
in spite of the system-induced perturbations on workload characteristics.
Last, my work demonstrates a fully-autonomous, real-system phase prediction
infrastructure that predicts future phase behavior of applications at runtime
by leveraging the pattern behavior in execution phases.
3.3 Workload-Adaptive Power Management
Earlier in this chapter, I discussed the extensive range of research
on dynamic management, spanning from circuits to systems
and applications. Here I review the subset of these approaches that specifically
aim to tune system execution to the dynamic changes in the workload
characteristics. I discuss related work in workload-adaptive power management
under three abstractions: compiler- and application-level techniques,
system-level management and architectural adaptations.
High-level workload adaptations involving compilers and applications
give high-level software more responsibility for power management.
Typically, these approaches can operate in two opposite directions. First,
some existing work has developed strategies to adapt the workloads themselves
for varying power constraints by providing different degrees of quality of
service. These adaptations include application features with different qualities or
optional application steps that are activated only at high energy settings.
Some techniques also involve choosing between local and remote program or data
components based on their power-performance trade-offs
[50,102,143].
This first direction deliberately induces changes in workload characteristics
to respond to energy constraints, and can be referred to as power-driven
workload adaptations.
In the second direction, several techniques have considered
employing special directives within applications to guide lower-level
power management. Such directives are introduced via compiler support or
specialized application programming interfaces to perform bookkeeping
operations about application characteristics [1,7], to insert offline profiling
information for code regions at different power management states [71,122,154]
and to inform the underlying system layers about different application operations
such as I/O intensive regions [65,177].
System-level power management techniques are applied in two different
manners. First, some studies have considered performing operating system tasks
such as scheduling and memory management in a power-aware manner.
Second, additional studies make use of the operating system to assist
lower-level management functionalities in their management decisions.
In these applications, the operating system is extended with monitoring
and control interfaces that track workload characteristics and provide
control directives to the underlying management schemes such as frequency
scaling and disk power management. In the first direction, prior studies
have considered energy-aware scheduling of workloads with different characteristics
to balance power consumption, to reduce power density and to control energy dissipation
rate in both single and multiprocessor systems
[14,67,127,184].
Other workload-adaptive system research has discussed
power-aware memory management [135,186] and page
allocation [110].
Some recent studies have also presented methods for power-efficient distribution
of parallel, multithreaded applications into multiple
homogeneous or heterogeneous processing components
[5,37]. In the second direction, previous studies
have discussed system-level adaptations for disk power management [60],
controlling network interfaces and managing other input/output devices
[174]. In addition, there has been a growing body
of work in system-level management for dynamic voltage and frequency
scaling [33,49,176].
More recently, there has also been interest in machine learning techniques
for power management across multiple platform components [167],
as well as dynamic compilation support for workload-adaptive power management
[73,172,179].
At the architecture level, existing work has proposed several strategies that
track varying workload characteristics to perform architectural adaptations.
Tracking methods differ significantly in their approaches. These can be
simple occupancy or usage based models [3,139],
metrics that characterize varying workload performance
[8,26],
access frequency monitoring [48,96], inconsistency
checks [47] or more detailed
hardware structures that aim to discern varying application phases
[41,90,153].
In general, architectural management approaches focus on modulating the
effective size or speed of different hardware units. Among different
architectural components, the memory hierarchy is one of the most investigated
structures. Different studies have proposed adaptively disabling or reducing
supply voltages for different cache ways and unused blocks
[48,96,140]. Some work has
proposed dynamically configurable caches based on varying working set size
information and changes in control flow
[9,41,153].
Architectural management schemes for higher levels of the memory hierarchy,
including main memory and disks, have also been explored
[117,186].
Besides the memory hierarchy, several studies have focused on
other architectural adaptations, such as adaptive issue queues
[8,26,139]. These approaches
have considered monitoring changes in application performance
(i.e., the rate of executed instructions) and changes in the occupancy of
queue structures to tune their configurations to the changes in workload
characteristics. Other management schemes have also been proposed for adaptive
pipeline scaling and dynamic configurations of other architectural components
such as reorder buffers and register files [3,90].
These techniques have also employed some amount of architectural support
(for example, the branch behavior buffer and power profiling units) to
track dynamically-varying workload demands and to effectively match the
dynamic configurations to different application phases.
My thesis in particular discusses workload-adaptive power management techniques
that operate at the architecture and system boundary. It leverages architectural
execution information to guide system-level adaptations. Most of the existing
system adaptations either function reactively by responding to recent execution
behavior or rely on prior profiling information. My work, however, describes
a predictive and completely on-the-fly adaptation strategy that utilizes runtime
phase predictions to manage dynamic adaptations, without affecting the
execution or the structure of workloads.
4 Thesis Contributions
My thesis makes four main contributions to the existing literature.
First, I describe a generic approach to microarchitecture-level
power modeling using processor hardware performance monitoring features.
I demonstrate a detailed, yet practical runtime power monitoring and estimation
approach with simultaneous measurement
support for runtime validation feedback.
Overall, this framework paves the way for
subsequent runtime power and thermal management studies that can benefit
from insight into live processor power dissipation.
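To make the idea of counter-based power estimation concrete, the sketch below assumes a simple linear model in which each architectural unit contributes its maximum power scaled by a counter-derived access rate, on top of a fixed idle power. The function name `estimate_power`, the unit names, and all numeric values are hypothetical placeholders, not the model or measurements developed in this thesis.

```python
def estimate_power(access_rates, max_power, idle_power):
    """Estimate total and per-unit power from counter-derived access rates.

    access_rates: unit -> fraction of cycles the unit was accessed (0.0 to 1.0)
    max_power:    unit -> assumed maximum power of the unit, in watts
    idle_power:   assumed workload-independent power, in watts
    """
    per_unit = {
        unit: access_rates.get(unit, 0.0) * max_power[unit]
        for unit in max_power
    }
    return idle_power + sum(per_unit.values()), per_unit


# Hypothetical values for one sampling interval.
total, per_unit = estimate_power(
    access_rates={"l1_cache": 0.5, "alu": 0.25},
    max_power={"l1_cache": 8.0, "alu": 4.0, "fpu": 6.0},
    idle_power=10.0,
)
```

The per-unit breakdown is what makes such an estimate useful for phase analysis: two intervals with similar total power can still differ in which components dissipate it.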
Second, I provide two important contributions to the general body of
workload characterization and phase analysis research.
I demonstrate practical real-system methods for identifying
application phases at runtime. These techniques can be readily employed
in system-level dynamic power and thermal management studies.
Moreover, my work defines phases that directly target
the varying power characteristics of workloads, using event-counter-based
power estimations as the basis of its similarity analysis.
Third, this thesis presents a complete flow of methods that mitigate the
negative impacts of system-induced variability and sampling effects on
the detection of repetitive application behavior.
My work describes a taxonomy of phase transformations due to variability and
sampling effects. I introduce a new,
transition-based phase characterization, which is shown to detect
repetitive phases more reliably under the influence of these transformations.
This work provides a quantitative evaluation of phase detection techniques
and quantifies their effectiveness in recognizing recurrent execution.
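The intuition behind a transition-based characterization can be illustrated with a small sketch. It is a simplified stand-in rather than the detection framework developed in this thesis: the function names, the fixed threshold, and the exhaustive shift search are assumptions for illustration only. Each run is reduced to a binary signal that marks large changes between consecutive samples, and two runs are then compared by aligning these signals over a small range of time shifts.

```python
def transitions(samples, threshold):
    """Mark intervals where the metric jumps by more than `threshold`."""
    return [int(abs(b - a) > threshold) for a, b in zip(samples, samples[1:])]


def best_match(sig_a, sig_b, max_shift):
    """Return (shift, score) that best aligns transitions of sig_b with sig_a."""
    best = (0, -1.0)
    # Normalize by the larger transition count so extra transitions are penalized.
    total = max(sum(sig_a), sum(sig_b), 1)
    for shift in range(-max_shift, max_shift + 1):
        hits = sum(
            1
            for i, t in enumerate(sig_a)
            if t and 0 <= i + shift < len(sig_b) and sig_b[i + shift]
        )
        score = hits / total
        if score > best[1]:
            best = (shift, score)
    return best


# Hypothetical runs of the same workload: run_b is offset in level
# and delayed by one sample relative to run_a.
run_a = [1.0, 1.0, 3.0, 3.0, 1.0, 1.0, 3.0, 3.0]
run_b = [1.4, 1.4, 1.4, 3.4, 3.4, 1.4, 1.4, 3.4, 3.4]
shift, score = best_match(transitions(run_a, 1.0), transitions(run_b, 1.0), 2)
```

Because only the changes are compared, a constant offset in the measured values does not affect the match, and the shift search absorbs small time dilations between runs.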
Last, in this thesis I demonstrate a complete real-system framework for runtime phase
prediction and its application to workload-adaptive power management.
I describe a configurable runtime phase prediction methodology that seamlessly operates
on a real mobile system with negligible overheads. I show the immediate benefits
of runtime phase prediction for on-the-fly, phase-driven dynamic power management.
Although the examples shown in this thesis use certain phase definitions for specific
power management techniques, the developed approaches represent a general-purpose phase
monitoring and prediction framework. My infrastructure can be employed for monitoring
and predicting different workload characteristics that can guide a range of dynamic
management techniques.
5 Thesis Outline
The following chapters of
this dissertation present the main accomplishments of my research
in more detail. I present these in a progressive manner, starting
with the experimentation basics and the power analysis framework,
followed by phase analysis basics, phase detection and prediction
methods and finally their application to dynamic power management.
In particular,
Chapter 2 presents the fundamentals of my real-system experimentation
framework and develops runtime processor power monitoring and estimation
techniques.
Chapter 3 discusses different phase analysis strategies and demonstrates
their effective application for power-oriented workload phase characterization.
Chapter 4 focuses on the interesting challenges of phase detection in real-system
experiments and develops an effective phase detection framework, which is
resilient to system-induced variations in observed workload characteristics.
Chapter 5 introduces an efficient real-system phase prediction method and
outlines a complete infrastructure that is driven by runtime phase predictions
for workload-adaptive power management. This chapter meshes the different aspects
of my research together and demonstrates the concrete benefits of phase-based dynamic
power management for power-aware computing systems.
Last, Chapter 6 presents the final remarks and discusses avenues
of future research.
References
- [1]
-
N. AbouGhazaleh, B. Childers, D. Mosse, R. Melhem, and M. Craven.
Energy Management for Real-time Embedded Applications with Compiler
Support.
In Proceedings of the 2003 ACM SIGPLAN
Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES),
2003.
- [2]
-
A. R. Alameldeen and D. A. Wood.
Variability in Architectural Simulations of Multi-threaded
Workloads.
In Proceedings of 9th International Symposium on High
Performance Computer Architecture (HPCA-9), Feb. 2003.
- [3]
-
D. Albonesi, R. Balasubramonian, S. Dropsho, S. Dwarkadas, E. Friedman,
M. Huang, V. Kursun, G. Magklis, M. Scott, G. Semeraro, P. Bose,
A. Buyuktosunoglu, P. Cook, and S. Schuster.
Dynamically Tuning Processor Resources with Adaptive Processing.
IEEE Computer, 36(12):43-51, 2003.
- [4]
-
M. Anis, S. Areibi, and M. Elmasry.
Design and Optimization of Multi-Threshold CMOS (MTCMOS) Circuits.
IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, 22(10):1324-1342, Oct. 2003.
- [5]
-
M. Annavaram, E. Grochowski, and J. Shen.
Mitigating Amdahl's Law Through EPI Throttling.
In Proceedings of the 32nd International Symposium on Computer
Architecture (ISCA-32), 2005.
- [6]
-
M. Annavaram, R. Rakvic, M. Polito, J.-Y. Bouguet, R. Hankins, and B. Davies.
The Fuzzy Correlation between Code and Performance Predictability.
In Proceedings of the 37th International Symp. on
Microarchitecture, 2004.
- [7]
-
A. Azevedo, I. Issenin, R. Cornea, R. Gupta, N. Dutt, A. Veidenbaum, and
A. Nicolau.
Profile-based Dynamic Voltage Scheduling using Program Checkpoints.
In Proceedings of the conference on Design, automation and test
in Europe (DATE'02), Mar. 2002.
- [8]
-
R. I. Bahar and S. Manne.
Power and Energy Reduction Via Pipeline Balancing.
In Proceedings of the 28th International Symposium on Computer
Architecture (ISCA-28), June 2001.
- [9]
-
R. Balasubramonian, D. H. Albonesi, A. Buyuktosunoglu, and S. Dwarkadas.
Memory hierarchy reconfiguration for energy and performance in
general-purpose processor architectures.
In International Symposium on Microarchitecture, pages
245-257, 2000.
- [10]
-
H. Bao, J. Bielak, O. Ghattas, L. F. Kallivokas, D. R. O'Hallaron, J. R.
Shewchuk, and J. Xu.
Large-scale Simulation of Elastic Wave Propagation in
Heterogeneous Media on Parallel Computers.
Computer Methods in Applied Mechanics and Engineering,
152(1-2):85-102, Jan. 1998.
- [11]
-
R. D. Barnes, E. M. Nystrom, M. C. Merten, and W.-m. W. Hwu.
Vacuum packing: extracting hardware-detected program phases for
post-link optimization.
In Proceedings of the 35th International Symp. on
Microarchitecture, Nov. 2002.
- [12]
-
L. A. Barroso.
The Price of Performance.
ACM Queue, 3(7):48-53, Sept. 2005.
- [13]
-
F. Bellosa.
The benefits of event-driven energy accounting in power-sensitive
systems.
In Proceedings of 9th ACM SIGOPS European Workshop, September
2000.
- [14]
-
F. Bellosa, A. Weissel, M. Waitz, and S. Kellner.
Event-Driven Energy Accounting for Dynamic Thermal Management.
In Proceedings of the Workshop on Compilers and Operating
Systems for Low Power (COLP'03), New Orleans, Sept. 2003.
- [15]
-
B. Bentley.
Validating the Intel Pentium 4 microprocessor.
In Design Automation Conference, pages 244-248, 2001.
- [16]
-
R. Berrendorf and B. Mohr.
PCL - The Performance Counter Library: A Common Interface to
Access Hardware Performance Counters on Microprocessors (Version 2.0).
http://www.kfa-juelich.de/zam/PCL/.
- [17]
-
R. Bianchini and R. Rajamony.
Power and energy management for server systems.
IEEE Computer, 37(11), November 2004.
- [18]
-
W. Bircher, J. Law, M. Valluri, and L. K. John.
Effective Use of Performance Monitoring Counters for Run-Time
Prediction of Power.
Technical Report TR-041104-01, University of Texas at Austin, Nov.
2004.
- [19]
-
W. L. Bircher, M. Valluri, J. Law, and L. K. John.
Runtime identification of microprocessor energy saving
opportunities.
In Proceedings of the 2005 International Symposium on Low Power
Electronics and Design (ISLPED), 2005.
- [20]
-
B. Brock and K. Rajamani.
Dynamic Power Management for Embedded Systems.
In Proceedings of the IEEE International SOC Conference, Sept.
2003.
- [21]
-
D. Brooks, P. Bose, V. Srinivasan, M. K. Gschwind, P. G. Emma, and M. G.
Rosenfield.
New Methodology for Early-Stage, Microarchitecture-Level
Power-Performance Analysis of Microprocessors.
IBM J. of Research and Development, 46(5/6):653-670, 2003.
- [22]
-
D. Brooks and M. Martonosi.
Dynamically exploiting narrow width operands to improve processor
power and performance.
In Proceedings of the 5th International Symposium on High
Performance Computer Architecture, Jan. 1999.
- [23]
-
D. Brooks and M. Martonosi.
Dynamic thermal management for high-performance microprocessors.
In Proceedings of the Seventh International Symposium on
High-Performance Computer Architecture (HPCA-7), January 2001.
- [24]
-
D. Brooks, V. Tiwari, and M. Martonosi.
Wattch: A framework for architectural-level power analysis and
optimizations.
In Proceedings of the 27th International Symposium on Computer
Architecture, June 2000.
- [25]
-
S. Browne, J. Dongarra, N. Garner, G. Ho, and P. Mucci.
A Portable Programming Interface for Performance Evaluation on
Modern Processors.
The International Journal of High Performance Computing
Applications, 14(3):189-204, 2000.
- [26]
-
A. Buyuktosunoglu, S. Schuster, D. Brooks, P. Bose, P. W. Cook, and D. H.
Albonesi.
An Adaptive Issue Queue for Reduced Power at High Performance.
In Proceedings of the First International Workshop on
Power-Aware Computer Systems (PACS'00), 2001.
- [27]
-
B. Calder, T. Sherwood, E. Perelman, and G. Hamerly.
SimPoint web page.
http://www.cs.ucsd.edu/simpoint/.
- [28]
-
A. P. Chandrakasan and A. Sinha.
JouleTrack: A Web Based Tool for Software Energy Profiling.
In Proceedings of the 38th Design Automation Conference
(DAC'01), June 2001.
- [29]
-
F. Chang, K. Farkas, and P. Ranganathan.
Energy driven statistical profiling: Detecting software hotspots.
In Proceedings of the Workshop on Computer
Systems, 2002.
- [30]
-
J. Chase, D. Anderson, P. Thakar, A. Vahdat, and R. Doyle.
Managing energy and server resources in hosting centers.
In Proceedings of the 18th Symposium on Operating Systems
Principles (SOSP), October 2001.
- [31]
-
M. Chin.
Desktop CPU Power Survey.
In SPCR Forum, 2006.
- [32]
-
C.-B. Cho and T. Li.
Complexity-based Program Phase Analysis and Classification.
In Proceedings of the International Conference on Parallel
Architectures and Compilation Techniques (PACT), Sept. 2006.
- [33]
-
K. Choi, R. Soma, and M. Pedram.
Dynamic Voltage and Frequency Scaling based on Workload
Decomposition.
In Proceedings of International Symposium on Low Power
Electronics and Design (ISLPED), Aug. 2004.
- [34]
-
G. Contreras and M. Martonosi.
Power Prediction for Intel XScale Processors Using Performance
Monitoring Unit Events.
In Proceedings of the 2005 International Symposium on Low Power
Electronics and Design (ISLPED), 2005.
- [35]
-
J. Cook, R. L. Oliver, and E. E. Johnson.
Examining performance differences in workload execution phases.
In Proceedings of the IEEE International Workshop on Workload
Characterization (WWC-4), 2001.
- [36]
-
NVIDIA Corporation.
NVIDIA GeForce 8800 GPU Architecture Overview.
Technical Brief TB-02787-001_v01, NVIDIA Corporation, Nov. 2006.
- [37]
-
M. Curtis-Maury, J. Dzierwa, C. D. Antonopoulos, and D. S. Nikolopoulos.
Online Power-Performance Adaptation of Multithreaded Programs using
Event-Based Prediction.
In Proceedings of the 20th ACM International Conference on
Supercomputing (ICS), June 2006.
- [38]
-
A. Das, J. Lu, and W.-C. Hsu.
Region Monitoring for Local Phase Detection in Dynamic Optimization
Systems.
In Proceedings of the International Symposium on Code Generation
and Optimization (CGO), Mar. 2006.
- [39]
-
P. J. Denning.
The working set model for program behavior.
Communications of the ACM, pages 323-333, May 1968.
- [40]
-
A. Dhodapkar and J. Smith.
Comparing Program Phase Detection Techniques.
In 36th International Symp. on Microarchitecture, 2003.
- [41]
-
A. Dhodapkar and J. Smith.
Managing multi-configurable hardware via dynamic working set
analysis.
In 29th Annual International Symposium on Computer Architecture,
2002.
- [42]
-
R. O. Duda, P. E. Hart, and D. G. Stork.
Pattern Classification. Second Edition.
Wiley Interscience, New York, 2001.
- [43]
-
A. Dudani, F. Mueller, and Y. Zhu.
Energy Conserving Feedback EDF Scheduling for Embedded Systems with
Real-time Constraints.
In LCTES/SCOPES '02: Proceedings of the joint conference on
Languages, compilers and tools for embedded systems, 2002.
- [44]
-
E. Duesterwald, C. Cascaval, and S. Dwarkadas.
Characterizing and Predicting Program Behavior and its Variability.
In IEEE PACT, pages 220-231, 2003.
- [45]
-
D. Economou, S. Rivoire, C. Kozyrakis, and P. Ranganathan.
Full-system Power Analysis and Modeling for Server Environments.
In Proceedings of the Workshop on Modeling Benchmarking and
Simulation (MOBS), June 2006.
- [46]
-
L. Eeckhout, R. Sundareswara, J. Yi, D. Lilja, and P. Schrater.
Accurate Statistical Approaches for Generating Representative
Workload Compositions.
In Proceedings of the IEEE International Symposium on Workload
Characterization, Oct. 2005.
- [47]
-
D. Ernst, N. S. Kim, S. Das, S. Pant, T. Pham, R. Rao, C. Ziesler, D. Blaauw,
T. Austin, and T. Mudge.
Razor: A Low-Power Pipeline Based on Circuit-Level Timing
Speculation.
In Proceedings of the 36th International Symp. on
Microarchitecture, Dec. 2003.
- [48]
-
K. Flautner, N. S. Kim, S. Martin, D. Blaauw, and T. Mudge.
Drowsy Caches: Simple Techniques for Reducing Leakage Power.
In Proceedings of the 29th International Symposium on Computer
Architecture (ISCA-29), May 2002.
- [49]
-
K. Flautner and T. Mudge.
Vertigo: Automatic Performance-Setting for Linux.
In Proceedings of the Fifth Symposium on Operating System Design
and Implementation OSDI'02, 2002.
- [50]
-
J. Flinn.
Extending Mobile Computer Battery Life through Energy-Aware
Adaptation.
PhD thesis, Computer Science Department, Carnegie Mellon University,
Dec. 2001.
- [51]
-
J. Flinn and M. Satyanarayanan.
Powerscope: a tool for profiling the energy usage of mobile
applications.
In Second IEEE Workshop on Mobile Computing Systems and
Applications, pages 2-10, Feb. 1999.
- [52]
-
B. B. Fraguela, R. Doallo, J. Tourino, and E. L. Zapata.
A Compiler Tool to Predict Memory Hierarchy Performance of
Scientific Codes.
Parallel Computing, 30(2):225-228, 2004.
- [53]
-
J. Friedrich, B. McCredie, N. James, B. Huott, B. Curran, E. Fluhr, G. Mittal,
E. Chan, Y. Chan, D. Plass, S. Chu, H. Le, L. Clark, J. Ripley, S. Taylor,
J. Dilullo, and M. Lanzerotti.
Design of the POWER6 Microprocessor.
In IEEE International Solid-State Circuits Conference (ISSCC
2007), Feb. 2007.
- [54]
-
S. Ghosh, M. Martonosi, and S. Malik.
Cache Miss Equations: A Compiler Framework for Analyzing and Tuning
Memory Behavior.
ACM Transactions on Programming Languages and Systems (TOPLAS),
21(4):703-746, 1999.
- [55]
-
S. Gochman, R. Ronen, I. Anati, A. Berkovits, T. Kurts, A. Naveh, A. Saeed,
Z. Sperber, and R. C. Valentine.
The Intel Pentium M Processor: Microarchitecture and Performance.
Intel Technology Journal, 7(2), Q2 2003.
- [56]
-
M. Golden, S. Arekapudi, G. Dabney, M. Haertel, S. Hale, L. Herlinger, Y. Kim,
K. McGrath, V. Palisetti, and M. Singh.
A 2.6GHz Dual-Core 64b x86 Microprocessor with DDR2 Memory Support.
In IEEE International Solid-State Circuits Conference (ISSCC
2006), Feb. 2006.
- [57]
-
M. Gschwind.
Chip Multiprocessing and the Cell Broadband Engine.
IBM Research Report RC-23921, IBM T. J. Watson Research Center, Feb.
2006.
- [58]
-
S. P. Gurrum, S. K. Suman, Y. K. Joshi, and A. G. Fedorov.
Thermal Issues in Next-Generation Integrated Circuits.
IEEE Transactions on Device and Materials Reliability,
4(4):709-714, Dec. 2004.
- [59]
-
S. Gurumurthi, A. Sivasubramaniam, M. J. Irwin, N. Vijaykrishnan, M. Kandemir,
T. Li, and L. K. John.
Using Complete Machine Simulation for Software Power Estimation: The
SoftWatt Approach.
In Proceedings of the 8th International Symposium on
High-Performance Computer Architecture (HPCA), Feb. 2002.
- [60]
-
S. Gurumurthi, A. Sivasubramaniam, M. Kandemir, and H. Franke.
DRPM: Dynamic Speed Control for Power Management in Server Class
Disks.
Computer Architecture News, 31(2):169-181, May 2003.
- [61]
-
S. Gurun and C. Krintz.
A Run-Time, Feedback-Based Energy Estimation Model For Embedded
Devices.
In Proceedings of the International Conference on
Hardware-Software Codesign and System Synthesis (CODES+ISSS), Oct. 2006.
- [62]
-
J. Haid, G. Kafer, C. Steger, R. Weiss, W. Schogler, and M. Manninger.
Run-time energy estimation in system-on-a-chip designs.
In Asia and South Pacific Design Automation Conference
(ASP-DAC), Jan. 2003.
- [63]
-
T. Heath, A. P. Centeno, P. George, L. Ramos, Y. Jaluria, and R. Bianchini.
Mercury and freon: Temperature emulation and management in server
systems.
In Proceedings of the International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS), October
2006.
- [64]
-
T. Heath, B. Diniz, E. V. Carrera, W. Meira Jr., and R. Bianchini.
Energy conservation in heterogeneous server clusters.
In Proceedings of the 10th Symposium on Principles and Practice
of Parallel Programming (PPoPP), 2005.
- [65]
-
T. Heath, E. Pinheiro, J. Hom, U. Kremer, and R. Bianchini.
Code Transformations for Energy-Efficient Device Management.
IEEE Transactions on Computers, 53(8):974-987, Aug. 2004.
- [66]
-
J. L. Hennessy and D. A. Patterson.
Computer Architecture: A Quantitative Approach.
Morgan Kaufmann Publishers, 2003.
Third Edition.
- [67]
-
S. Heo, K. Barr, and K. Asanovic.
Reducing Power Density through Activity Migration.
In Proceedings of International Symposium on Low Power
Electronics and Design (ISLPED), Seoul, Korea, Aug. 2003.
- [68]
-
M. J. Hind, V. T. Rajan, and P. F. Sweeney.
Phase Shift Detection: A Problem Classification.
IBM Research Report RC-22887, IBM T. J. Watson, Aug. 2003.
- [69]
-
G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel.
The microarchitecture of the Pentium 4 processor.
Intel Technology Journal, First Quarter 2001, 2001.
http://developer.intel.com/technology/itj/.
- [70]
-
J. Hom and U. Kremer.
Inter-program Compilation for Disk Energy Reduction.
In Workshop on Power-Aware Computer Systems (PACS'03), 2003.
- [71]
-
C.-H. Hsu and U. Kremer.
The design, implementation, and evaluation of a compiler algorithm
for CPU energy reduction.
In Proceedings of the ACM SIGPLAN 2003 conference on Programming
language design and implementation, pages 38-48, 2003.
- [72]
-
C. Hu, D. Jimenez, and U. Kremer.
Toward an Evaluation Infrastructure for Power and Energy
Optimizations.
In Workshop on High-Performance, Power-Aware Computing, 2005.
- [73]
-
S. Hu, M. Valluri, and L. K. John.
Effective Adaptive Computing Environment Management via Dynamic
Optimization.
In Proceedings of the International Symposium on Code Generation
and Optimization (CGO), Mar. 2005.
- [74]
-
M. Huang, J. Renau, and J. Torrellas.
Profile-Based Energy Reduction in High-Performance Processors.
In 4th ACM Workshop on Feedback-Directed and Dynamic
Optimization, December 2001.
- [75]
-
M. Huang, J. Renau, and J. Torrellas.
Positional Adaptation of Processors: Application to Energy
Reduction.
In Proceedings of the International Symp. on Computer
Architecture, 2003.
- [76]
-
C. Hughes, J. Srinivasan, and S. Adve.
Saving energy with architectural and frequency adaptations for
multimedia applications.
In Proceedings of the 34th Annual International Symposium on
Microarchitecture (MICRO-34), Dec. 2001.
- [77]
-
IBM.
PMAPI structure and function Reference.
http://www16.boulder.ibm.com/pseries/en_US/files/aixfiles/pmapi.h.htm.
- [78]
-
Intel Corporation.
VTune Performance Analyzer 1.1.
http://developer.intel.com/software/products/vtune/vlin/.
- [79]
-
Intel Corporation.
Intel Pentium 4 and Intel Xeon Processor Optimization
Reference Manual, 2002.
http://developer.intel.com/design/Pentium4/manuals/248966.htm.
- [80]
-
Intel Corporation.
Intel Pentium 4 Processor in the 423 pin package / Intel 850
chipset platform, 2002.
http://developer.intel.com/design/chipsets/designex/298245.htm.
- [81]
-
Intel Corporation.
Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume
3B: System Programming Guide, 2006.
- [82]
-
C. Isci, G. Contreras, and M. Martonosi.
Hardware Performance Counters for Detailed Runtime Power and Thermal
Estimations: Experiences and Proposals.
In Proceedings of the Hardware Performance Monitor Design and
Functionality Workshop in the 11th International Symposium on
High-Performance Computer Architecture (HPCA-11), Feb. 2005.
- [83]
-
C. Isci, G. Contreras, and M. Martonosi.
Live, Runtime Phase Monitoring and Prediction on Real Systems with
Application to Dynamic Power Management.
In Proceedings of the 39th ACM/IEEE International Symposium on
Microarchitecture (MICRO-39), 2006.
- [84]
-
C. Isci and M. Martonosi.
Identifying Program Power Phase Behavior using Power Vectors.
In Proceedings of the IEEE International Workshop on Workload
Characterization (WWC-6), 2003.
- [85]
-
C. Isci and M. Martonosi.
Runtime Power Monitoring in High-End Processors: Methodology and
Empirical Data.
In Proceedings of the 36th International Symp. on
Microarchitecture, Dec. 2003.
- [86]
-
C. Isci and M. Martonosi.
Runtime Power Monitoring in High-End Processors: Methodology and
Empirical Data.
Technical report, Princeton University Electrical Eng. Dept., Sep
2003.
- [87]
-
C. Isci and M. Martonosi.
Detecting Recurrent Phase Behavior under Real-System Variability.
In Proceedings of the IEEE International Symposium on Workload
Characterization, Oct. 2005.
- [88]
-
C. Isci and M. Martonosi.
Phase Characterization for Power: Evaluating Control-Flow-Based and
Event-Counter-Based Techniques.
In Proceedings of the 12th International Symposium on
High-Performance Computer Architecture (HPCA-12), 2006.
- [89]
-
C. Isci, M. Martonosi, and A. Buyuktosunoglu.
Long-term Workload Phases: Duration Predictions and Applications to
DVFS.
IEEE Micro: Special Issue on Energy Efficient Design,
25(5):39-51, Sep/Oct 2005.
- [90]
-
A. Iyer and D. Marculescu.
Power aware microarchitecture resource scaling.
In Proceedings of Design Automation and Test in Europe, DATE,
Mar. 2001.
- [91]
-
R. Jenkins.
Hash functions.
Dr. Dobb's Journal, 9709, Sept. 1997.
- [92]
-
R. Joseph, D. Brooks, and M. Martonosi.
Control techniques to eliminate voltage emergencies in high
performance processors.
In Proc. of the 9th International Symposium on High Performance
Computer Architecture (HPCA-9), February 2003.
- [93]
-
R. Joseph and M. Martonosi.
Run-time power estimation in high performance microprocessors.
In International Symposium on Low Power Electronics and Design,
pages 135-140, 2001.
- [94]
-
P. Juang, Q. Wu, L.-S. Peh, M. Martonosi, and D. Clark.
Coordinated, Distributed, Formal Energy Management of Chip
Multiprocessors.
In Proceedings of International Symposium on Low Power
Electronics and Design (ISLPED'05), Aug. 2005.
- [95]
-
I. Kadayif, T. Chinoda, M. T. Kandemir, N. Vijaykrishnan, M. J. Irwin, and
A. Sivasubramaniam.
vEC: virtual energy counters.
In Workshop on Program Analysis for Software Tools and
Engineering, pages 28-31, 2001.
- [96]
-
S. Kaxiras, Z. Hu, and M. Martonosi.
Cache decay: Exploiting generational behavior to reduce cache leakage
power.
In Proceedings of the 28th International Symposium on Computer
Architecture (ISCA-28), June 2001.
- [97]
-
A. Keshavarzi, S. Ma, S. Narendra, B. Bloechel, K. Mistry, T. Ghani, S. Borkar,
and V. De.
Effectiveness of Reverse Body Bias for Leakage Control in Scaled
Dual Vt CMOS ICs.
In Proceedings of the 2001 International Symposium on Low Power
Electronics and Design (ISLPED), Aug. 2001.
- [98]
-
C. H. Kim and K. Roy.
Dynamic Vth Scaling Scheme for Active Leakage Power Reduction.
In Proceedings of the conference on Design, automation and test
in Europe (DATE'02), Mar. 2002.
- [99]
-
J. Kim, S. V. Kodakara, W.-C. Hsu, D. J. Lilja, and P.-C. Yew.
Dynamic Code Region (DCR) Based Program Phase Tracking and
Prediction for Dynamic Optimizations.
Lecture Notes in Computer Science, 3793:203-217, 2005.
- [100]
-
T. Kistler and M. Franz.
Continuous Program Optimization: A Case Study.
ACM Transactions on Programming Languages and Systems (TOPLAS),
25(4):500-548, 2003.
- [101]
-
A. KleinOsowski, J. Flynn, N. Meares, and D. J. Lilja.
Adapting the SPEC2000 benchmark suite for simulation-based computer
architecture research.
In Workshop on Workload Characterization, International
Conference on Computer Design, Sept. 2000.
- [102]
-
U. Kremer, J. Hicks, and J. Rehg.
Compiler-Directed Remote Task Execution for Power Management.
In Proceedings of the Workshop on Compilers and Operating
Systems for Low Power (COLP'00), 2000.
- [103]
-
R. Kumar, K. Farkas, N. P. Jouppi, P. Ranganathan, and D. M. Tullsen.
Single-ISA Heterogeneous Multi-Core Architectures: The Potential for
Processor Power Reduction.
In Proceedings of the 36th International Symp. on
Microarchitecture, Dec. 2003.
- [104]
-
E. Kursun, S. Ghiasi, and M. Sarrafzadeh.
Transistor Level Budgeting for Power Optimization.
In Proceedings of the 5th International Symposium on Quality
Electronic Design (ISQED'04), 2004.
- [105]
-
P. E. Landman.
High-level power estimation.
In Proceedings of the 1996 International Symposium on Low Power
Electronics and Design (ISLPED), Oct. 1996.
- [106]
-
P. E. Landman and J. M. Rabaey.
Black-box Capacitance Models for Architectural Power Analysis.
In Proceedings of the International Workshop on Low Power
Design, Apr. 1994.
- [107]
-
P. E. Landman and J. M. Rabaey.
Activity-sensitive Architectural Power Analysis for the Control
Path.
In Proceedings of the International Workshop on Low Power
Design, Apr. 1995.
- [108]
-
J. Lau, J. Sampson, E. Perelman, G. Hamerly, and B. Calder.
The Strong Correlation between Code Signatures and Performance.
In IEEE International Symposium on Performance Analysis of
Systems and Software, Mar. 2005.
- [109]
-
J. Lau, S. Schoenmackers, and B. Calder.
Transition Phase Classification and Prediction.
In 11th International Symposium on High Performance Computer
Architecture, 2005.
- [110]
-
A. R. Lebeck, X. Fan, H. Zeng, and C. Ellis.
Power Aware Page Allocation.
ACM SIGOPS Operating Systems Review, 34(5):105-116, Dec.
2000.
- [111]
-
B. Lee and D. Brooks.
Accurate and Efficient Regression Modeling for Microarchitectural
Performance and Power Prediction.
In Proceedings of the International Conference on Architectural
Support for Programming Languages and Operating Systems (ASPLOS-XII),
October 2006.
- [112]
-
K. Lee and K. Skadron.
Using Performance Counters for Runtime Temperature Sensing in
High-Performance Processors.
In Workshop on High-Performance, Power-Aware Computing, 2005.
- [113]
-
S. Lee, A. Ermedahl, S. L. Min, and N. Chang.
An accurate instruction-level energy consumption model for embedded
RISC processors.
In LCTES/OM, pages 1-10, 2001.
- [114]
-
S. Lee and T. Sakurai.
Run-time Voltage Hopping for Low-power Real-time Systems.
In Proceedings of the 37th Design Automation Conference
(DAC'00), 2000.
- [115]
-
J. Li and J. Martinez.
Dynamic Power-Performance Adaptation of Parallel Computation on Chip
Multiprocessors.
In Proceedings of the 12th International Symposium on
High-Performance Computer Architecture (HPCA-12), 2006.
- [116]
-
T. Li and L. K. John.
Run-time Modeling and Estimation of Operating System Power
Consumption.
In Proceedings of the International Conference on Measurement
and Modeling of Computer Systems (SIGMETRICS), 2003.
- [117]
-
X. Li, Z. Li, F. David, P. Zhou, Y. Zhou, S. Adve, and S. Kumar.
Performance Directed Energy Management for Main Memory and Disks.
In Proceedings of the 11th International Conference on
Architectural Support for Programming Languages and Operating Systems
(ASPLOS-XI), 2004.
- [118]
-
M. Liu, W.-S. Wang, and M. Orshansky.
Leakage Power Reduction by Dual-Vth Designs Under Probabilistic
Analysis of Vth Variation.
In Proceedings of the 2004 International Symposium on Low Power
Electronics and Design (ISLPED), Aug. 2004.
- [119]
-
J. R. Lorch and A. J. Smith.
Improving Dynamic Voltage Scaling Algorithms with PACE.
In Proceedings of the 2001 ACM SIGMETRICS international
conference on Measurement and modeling of computer systems, 2001.
- [120]
-
J. Lu, H. Chen, P. Yew, and W. Hsu.
Design and Implementation of a Lightweight Dynamic Optimization
System.
The Journal of Instruction-Level Parallelism, 6:1-24, 2004.
- [121]
-
C. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser, G. Lowney, S. Wallace,
V. Reddi, and K. Hazelwood.
Pin: Building Customized Program Analysis Tools with Dynamic
Instrumentation.
In Programming Language Design and Implementation (PLDI), June
2005.
- [122]
-
G. Magklis, M. Scott, G. Semeraro, D. Albonesi, and S. Dropsho.
Profile-based Dynamic Voltage and Frequency Scaling for a Multiple
Clock Domain Microprocessor.
In Proceedings of the 30th International Symposium on Computer
Architecture (ISCA-30), 2003.
- [123]
-
S. Manne, A. Klauser, and D. Grunwald.
Pipeline gating: Speculation control for energy reduction.
In Proceedings of the 25th International Symposium on Computer
Architecture, pages 132-141, June 1998.
- [124]
-
C. McNairy and R. Bhatia.
Montecito: A Dual-Core, Dual-Thread Itanium Processor.
IEEE Micro, 25(2):10-20, Mar/Apr 2005.
- [125]
-
H. Mehta, R. M. Owens, and M. J. Irwin.
Energy characterization based on clustering.
In Proceedings of the 33rd Design Automation Conference
(DAC'96), 1996.
- [126]
-
H. Mehta, R. M. Owens, and M. J. Irwin.
Instruction Level Power Profiling.
In Proceedings of the IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP'96), May 1996.
- [127]
-
A. Merkel.
Balancing Power Consumption in Multiprocessor Systems.
Diploma thesis, System Architecture Group, University of Karlsruhe,
Sept. 2005.
- [128]
-
M. C. Merten, A. R. Trick, R. D. Barnes, E. M. Nystrom, C. N. George, J. C.
Gyllenhaal, and W.-m. W. Hwu.
An architectural framework for runtime optimization.
IEEE Transactions on Computers, 50(6):567-589, 2001.
- [129]
-
G. E. Moore.
Cramming more components onto integrated circuits.
In Electronics, pages 114-117, Apr. 1965.
- [130]
-
J. Moore, J. Chase, P. Ranganathan, and R. Sharma.
Making scheduling cool: Temperature-aware workload placement in data
centers.
In Proceedings of USENIX '05, June 2005.
- [131]
-
P. Nagpurkar, C. Krintz, M. Hind, P. Sweeney, and V. Rajan.
Online Phase Detection Algorithms.
In Proceedings of the International Symposium on Code Generation
and Optimization (CGO), Mar. 2006.
- [132]
-
P. Nagpurkar, C. Krintz, and T. Sherwood.
Phase-Aware Remote Profiling.
In Proceedings of the International Symposium on Code Generation
and Optimization (CGO), Mar. 2005.
- [133]
-
K. Olukotun and L. Hammond.
The Future of Microprocessors.
ACM Queue, 3(7):27-34, Sept. 2005.
- [134]
-
K. Olukotun, B. A. Nayfeh, L. Hammond, K. Wilson, and K.-Y. Chang.
The Case for a Single-Chip Multiprocessor.
In Seventh International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS VII), Oct. 1996.
- [135]
-
H. H. Padmanabhan.
Design and Implementation of Power-aware Virtual Memory.
In Proceedings of USENIX, 2003.
- [136]
-
V. Pandey, W. Jiang, Y. Zhou, and R. Bianchini.
DMA-Aware Memory Energy Management.
In Proceedings of the 12th International Symposium on
High-Performance Computer Architecture (HPCA-12), Feb. 2006.
- [137]
-
H. Patil, R. Cohn, M. Charney, R. Kapoor, A. Sun, and A. Karunanidhi.
Pinpointing Representative Portions of Large Intel Itanium Programs
with Dynamic Instrumentation.
In Proceedings of the 37th International Symp. on
Microarchitecture, 2004.
- [138]
-
C. Poirier, R. McGowen, C. Bostak, and S. Naffziger.
Power and Temperature Control on a 90nm Itanium-Family Processor.
In IEEE International Solid-State Circuits Conference (ISSCC
2005), Feb. 2005.
- [139]
-
D. Ponomarev, G. Kucuk, and K. Ghose.
Reducing Power Requirements of Instruction Scheduling Through
Dynamic Allocation of Multiple Datapath Resources.
In Proceedings of the 34th Annual International Symposium on
Microarchitecture (MICRO-34), Dec. 2001.
- [140]
-
M. Powell, S.-H. Yang, B. Falsafi, K. Roy, and T. N. Vijaykumar.
Reducing Leakage in a High-Performance Deep-Submicron Instruction
Cache.
IEEE Transactions on Very Large Scale Integration (VLSI)
Systems, 9(1):77-90, 2001.
- [141]
-
R. Ronen, A. Mendelson, K. Lai, S.-L. Lu, F. Pollack, and J. P. Shen.
Coming Challenges in Microarchitecture and Architecture.
Proceedings of the IEEE, 89(3):325-340, Mar. 2001.
- [142]
-
J. Russell and M. Jacome.
Software power estimation and optimization for high performance,
32-bit embedded processors.
In Proceedings of the International Conference on Computer
Design, October 1998.
- [143]
-
D. G. Sachs, W. Yuan, C. J. Hughes, A. Harris, S. V. Adve, D. L. Jones, R. H.
Kravets, and K. Nahrstedt.
Grace: A hierarchical adaptation framework for saving energy.
Technical report, Computer Science, University of Illinois Technical
Report UIUCDCS-R-2004-2409, 2004.
- [144]
-
N. Sakran, M. Yuffe, M. Mehalel, J. Doweck, E. Knoll, and A. Kovacs.
Implementation of the 65nm Dual-Core 64b Merom Processor.
In IEEE International Solid-State Circuits Conference (ISSCC
2007), Feb. 2007.
- [145]
-
T. Sato, M. Nagamatsu, and H. Tago.
Power and Performance Simulator: ESP and Its Applications for 100
MIPS/W Class RISC Design.
In Proceedings of the 1994 International Symposium on Low Power
Electronics and Design (ISLPED), Oct. 1994.
- [146]
-
R. Schmidt.
Liquid Cooling is Back.
Electronics Cooling, 11(3), Aug. 2005.
- [147]
-
G. Semeraro, G. Magklis, R. Balasubramonian, D. Albonesi, S. Dwarkadas, and
M. Scott.
Energy-Efficient Processor Design Using Multiple Clock Domains with
Dynamic Voltage and Frequency Scaling.
In Proceedings of the 8th International Symposium on
High-Performance Computer Architecture (HPCA-8), 2002.
- [148]
-
J. S. Seng and D. M. Tullsen.
The effect of compiler optimizations on Pentium 4 power
consumption.
In 7th Annual Workshop on Interaction between Compilers and
Computer Architectures, Feb. 2003.
- [149]
-
Server System Infrastructure (SSI) consortium.
Power Supply Management Interface Design Guide, Rev. 2.12, Sept.
2005.
- [150]
-
X. Shen, Y. Zhong, and C. Ding.
Locality Phase Prediction.
In Eleventh International Conference on Architectural Support
for Programming Languages and Operating Systems (ASPLOS XI), Oct. 2004.
- [151]
-
T. Sherwood, E. Perelman, and B. Calder.
Basic block distribution analysis to find periodic behavior and
simulation points in applications.
In International Conference on Parallel Architectures and
Compilation Techniques, Sept. 2001.
- [152]
-
T. Sherwood, E. Perelman, G. Hamerly, and B. Calder.
Automatically Characterizing Large Scale Program Behavior.
In Tenth International Conference on Architectural Support for
Programming Languages and Operating Systems, Oct 2002.
- [153]
-
T. Sherwood, S. Sair, and B. Calder.
Phase tracking and prediction.
In Proceedings of the 30th International Symposium on Computer
Architecture (ISCA-30), June 2003.
- [154]
-
D. Shin, J. Kim, and S. Lee.
Low-Energy Intra-Task Voltage Scheduling Using Static Timing
Analysis.
In Proceedings of the 38th Design Automation Conference
(DAC'01), June 2001.
- [155]
-
K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and
D. Tarjan.
Temperature-aware microarchitecture.
In Proceedings of the 30th International Symposium on Computer
Architecture, June 2003.
- [156]
-
L. Spracklen and S. G. Abraham.
Chip Multithreading: Opportunities and Challenges.
In 11th International Symposium on High Performance Computer
Architecture (HPCA-11), 2005.
- [157]
-
E. Sprangle and D. Carmean.
Increasing Processor Performance by Implementing Deeper Pipelines.
In Proceedings of the 29th International Symposium on Computer
Architecture (ISCA-29), May 2002.
- [158]
-
B. Sprunt.
Brink and Abyss Pentium 4 Performance Counter Tools For Linux,
Feb. 2002.
http://www.eg.bucknell.edu/bsprunt/emon/brink_abyss/brink_abyss.shtm.
- [159]
-
B. Sprunt.
Pentium 4 Performance-Monitoring Features.
IEEE Micro, 22(4):72-82, Jul/Aug 2002.
- [160]
-
B. Sprunt.
Managing The Complexity Of Performance Monitoring Hardware: The
Brink and Abyss Approach.
International Journal of High Performance Computing
Applications, 20(4):533-540, 2006.
- [161]
-
A. Srivastava and D. Sylvester.
Minimizing Total Power by Simultaneous Vdd/Vth Assignment.
In ASPDAC: Proceedings of the 2003 conference on Asia South
Pacific design automation, Jan. 2003.
- [162]
-
P. Stanley-Marbell, M. S. Hsiao, and U. Kremer.
A Hardware Architecture for Dynamic Performance and Energy
Adaptation.
In Proceedings of the Workshop on Power-Aware Computer Systems,
2002.
- [163]
-
D. Talkin.
A robust algorithm for pitch tracking (RAPT).
In Speech Coding and Synthesis.
Elsevier Science B. V., New York, 1995.
- [164]
-
T. K. Tan, A. Raghunathan, and N. K. Jha.
Software Architectural Transformations: A New Approach to Low Energy
Embedded Software.
In Proceedings of the conference on Design, Automation and Test
in Europe (DATE'03), Mar. 2003.
- [165]
-
The Standard Performance Evaluation Corporation.
SPEC CPU2000 Results.
http://www.spec.org/cpu2000/results/.
- [166]
-
The Standard Performance Evaluation Corporation.
SPEC CPU2000 Suite.
http://www.specbench.org/osg/cpu2000/.
- [167]
-
G. Theocharous, S. Mannor, N. Shah, P. Gandhi, B. Kveton, S. Siddiqi, and C.-H.
Yu.
Machine Learning for Adaptive Power Management.
Intel Technology Journal, 10(4):299-311, 2006.
- [168]
-
V. Tiwari, S. Malik, and A. Wolfe.
Power analysis of embedded software: A first step towards software
power minimization.
IEEE Transactions on VLSI Systems, 2(4):437-445, December
1994.
- [169]
-
R. Todi.
SPEClite: Using Representative Samples to Reduce SPEC CPU2000
Workload.
In Proceedings of the IEEE International Workshop on Workload
Characterization (WWC-4), 2001.
- [170]
-
D. M. Tullsen, S. J. Eggers, and H. M. Levy.
Simultaneous multithreading: Maximizing on-chip parallelism.
In Proceedings of the 22nd International Symp. on Computer
Architecture, pages 392-403, June 1995.
- [171]
-
United States Environmental Protection Agency.
ENERGY STAR Program Requirements for Computers, Version 4.0.
Oct. 2006.
- [172]
-
P. Unnikrishnan, G. Chen, M. Kandemir, and D. R. Mudgett.
Dynamic Compilation for Energy Adaptation.
In Proceedings of the 2002 IEEE/ACM international conference on
Computer-aided design (ICCAD), 2002.
- [173]
-
O. Unsal and I. Koren.
System-Level Power-Aware Design Techniques in Real-Time Systems.
Proceedings of the IEEE, 91(7), July 2003.
- [174]
-
V. Venkatachalam and M. Franz.
Power Reduction Techniques for Microprocessor Systems.
ACM Computing Surveys (CSUR), 37(3):195-237, 2005.
- [175]
-
N. Vijaykrishnan, M. Kandemir, M. J. Irwin, H. S. Kim, and W. Ye.
Energy-Driven Integrated Hardware-Software Optimizations Using
SimplePower.
In Proceedings of the 27th International Symposium on Computer
Architecture, June 2000.
- [176]
-
A. Weissel and F. Bellosa.
Process cruise control: Event-driven clock scaling for dynamic power
management.
In Proceedings of the International Conference on Compilers,
Architecture and Synthesis for Embedded Systems (CASES 2002), Grenoble,
France, Aug. 2002.
- [177]
-
A. Weissel, B. Beutel, and F. Bellosa.
Cooperative I/O: A Novel I/O Semantics for Energy-Aware
Applications.
In Proceedings of the Fifth Symposium on Operating System Design
and Implementation OSDI'02, 2002.
- [178]
-
Q. Wu, P. Juang, M. Martonosi, and D. W. Clark.
Voltage and Frequency Control with Adaptive Reaction Time in
Multiple-Clock-Domain Processors.
In Proceedings of the 11th International Symposium on
High-Performance Computer Architecture (HPCA-11), 2005.
- [179]
-
Q. Wu, V. Reddi, Y. Wu, J. Lee, D. Connors, D. Brooks, M. Martonosi, and D. W.
Clark.
A Dynamic Compilation Framework for Controlling Microprocessor
Energy and Performance.
In Proceedings of the 38th International Symp. on
Microarchitecture, 2005.
- [180]
-
F. Xie, M. Martonosi, and S. Malik.
Compile-Time Dynamic Voltage Scaling Settings: Opportunities and
Limits.
In Proceedings of the ACM SIGPLAN Conference on Programming
Language Design and Implementation (PLDI 2003), June 2003.
- [181]
-
T. Y. Yeh and Y. N. Patt.
Alternative implementations of two-level adaptive branch
prediction.
In 19th Annual International Symposium on Computer
Architecture, May 1992.
- [182]
-
J. J. Yi, D. J. Lilja, and D. M. Hawkins.
A Statistically Rigorous Approach for Improving Simulation
Methodology.
In Proceedings of the 9th International Symposium on
High-Performance Computer Architecture (HPCA-9), Feb. 2003.
- [183]
-
J. J. Yi, R. Sendag, L. Eeckhout, A. Joshi, D. J. Lilja, and L. K. John.
Evaluating Benchmark Subsetting Approaches.
In Proceedings of the IEEE International Symposium on Workload
Characterization, Oct. 2006.
- [184]
-
H. Zeng, X. Fan, C. Ellis, A. Lebeck, and A. Vahdat.
ECOSystem: Managing energy as a first class operating system
resource.
In Tenth International Conference on Architectural Support for
Programming Languages and Operating Systems (ASPLOS X), Oct. 2002.
- [185]
-
M. T. Zhang.
Powering Intel Pentium 4 generation processors.
In IEEE Electrical Performance of Electronic Packaging
Conference, pages 215-218, 2001.
- [186]
-
P. Zhou, V. Pandey, J. Sundaresan, A. Raghuraman, Y. Zhou, and S. Kumar.
Dynamic Tracking of Page Miss Ratio Curve for Memory Management.
In Proceedings of the 11th International Conference on
Architectural Support for Programming Languages and Operating Systems
(ASPLOS-XI), 2004.