
Operating Systems: Difference between revisions

From Modelado Foundation

{| class="wikitable"
! style="width: 200px;" | QUESTIONS
! style="width: 200px;" | XPRESS
! style="width: 200px;" | TG X-Stack
! style="width: 200px;" | DEGAS
! style="width: 200px;" | D-TEC
! style="width: 200px;" | DynAX
! style="width: 200px;" | X-TUNE
! style="width: 200px;" | GVR
! style="width: 200px;" | CORVETTE
! style="width: 200px;" | SLEEC
! style="width: 200px;" | PIPER
|-
| '''PI''' || Ron Brightwell || Shekhar Borkar || Katherine Yelick || Daniel Quinlan  || Guang Gao || Mary Hall || Andrew Chien || Koushik Sen || Milind Kulkarni || Martin Schulz
|- style="vertical-align:top;"
|'''What are the key system calls / features that you need the OS/R to support? Examples: exit, read, write, open, close, link, unlink, chdir, time, chmod, clone, uname, execv, etc.'''
|LXK is developed within XPRESS as an independent lightweight kernel to fully support the HPX runtime system and XPI programming interfaces through the RIOS (runtime interface to OS) protocol.
|We currently require some method to get memory from the system (equivalent of sbrk) and some method of input/output. The requirements will be extended as we see the need but we want to limit the dependence on system calls to make our runtime as general and applicable to as wide a range of targets as possible.
|(DEGAS)
|(D-TEC)
|The SWARM runtime requires access to hardware threads, memory, and network interconnect(s), whether by system call or direct access.  On commodity clusters, SWARM additionally needs access to I/O facilities, such as the POSIX select, read, and write calls.
|(X-TUNE)
|(GVR) 
|(CORVETTE)
|(SLEEC)
|sockets, ptrace access, dynamic linking, timer interrupts/signals, access to hardware counters, file I/O
|- style="vertical-align:top;"
|'''Does your project implement its own lightweight thread package, or does it rely on the threads provided by the OS/R? If you implement your own threads, what are the key features that required developing a new implementation? If you don't implement your own thread package, what are the key performance characteristics and APIs needed to support your project?'''
|HPX provides its own lightweight thread package that relies on heavyweight thread execution by the LXK OS. Required features include threads as first-class objects, efficient context switching, application-level dynamic scheduling policies, and message-driven remote thread creation.
|Our runtime can rely on existing thread frameworks (we use pthreads, for example), but we do not need them: we use the existing threading framework only to emulate a computing resource.
|(DEGAS)
|(D-TEC)
|SWARM uses codelets to mediate between hardware cores and function/method calls. The general requirement is direct allocation of hardware resources. On Linux-like platforms, threads are created and bound to cores at runtime startup; codelets are bound to particular threads only when they are dispatched, unless a more specific binding is arranged before the codelet is readied.
|(X-TUNE)
|(GVR)
|(CORVETTE)
|N/A
|We rely on native threads, typically pthreads, which is appropriate for our needs.
|- style="vertical-align:top;"
|'''Do you currently view virtualization as a key requirement to support your x-stack project?  If so, why?'''
|Virtualization is a key requirement for advanced multi-modal use of LXK supporting a diversity of runtime environments including but not limited to conventional legacy codes and future runtime systems like HPX.
|No.
|(DEGAS)
| Virtualization is an orthogonal issue for D-TEC.  We do not assume either the presence or absence of virtualization in our work.
|No.
|(X-TUNE)
|(GVR)
|(CORVETTE)
|N/A
|No
|- style="vertical-align:top;"
|'''What energy/power and reliability APIs do you expect from the OS/R?''' 
|While XPRESS is not explicitly addressing energy and reliability issues, under related support it does incorporate the microcheckpointing compute-validate-commit cycle for reliability and the side-path energy suppression strategy for power reduction. Access to fault detection from the hardware and OS is required, as are measurements of power and thermal conditions on a per-socket basis. Control of processor core operation and clocks is also required.
|We hope that this will all be managed inside the runtime layer and that our EDT and DB programming model will enable the user to be unaware of it. We may, however, allow for some hints from higher level tools (for example having a compiler expose multiple versions of an EDT with differing energy envelopes). The runtime will rely on the underlying system-software/OS to provide introspection capabilities as well as "knobs" to turn (mostly voltage and frequency regulation as well as fine-grained on/off).
|(DEGAS)
|
* Fault detection notifications from hardware and OS
* Fine-grained power measurements (per socket)
* Thermal information (per socket)
* Access to DVFS settings
|The ability to dynamically modify power state and clock frequency for subsets of the system, and a mechanism to detect failures, are both very important for meeting power and reliability goals.
|(X-TUNE)
|(GVR)
|(CORVETTE)
|N/A
|Power: power measurements at varying granularity using processor-internal counters and external sensors, plus access to DVFS settings and power capping for autotuning. Resilience: access to fault notifications (for both corrected and uncorrected errors).
|- style="vertical-align:top;"
|'''Please describe how parallel programs are "composed" within your project, and what APIs and support are required from the OS/R?'''
|Parallel programs are composed through ParalleX Processes interfaces, comprising message-driven method instantiation for value/synchronization passing and data-object access. OS support for memory addressing and a global name space is required.
|See the [[Traleika_Glacier#About_OCR_-_Open_Community_Runtime|OCR spec]].
|(DEGAS)
|(D-TEC)
|Parallel programs comprise a directed acyclic graph of non-blocking tasks, called "codelets," which form the nodes of the graph; the data they produce and consume form its edges. The SWARM runtime tracks when each task's input dependencies are met, so it knows when the task can run, and chooses where it runs to maximize locality and minimize data movement. Fundamentally, this only requires the OS or hardware to permit SWARM to allocate hardware cores and memory, and to send data around the system. To reduce runtime overhead, hardware features such as DMA would be beneficial; if such features are protected from direct access by the SWARM runtime, the OS should expose an API for them.
|(X-TUNE)
|(GVR)
|(CORVETTE)
|N/A
|N/A
|- style="vertical-align:top;"
|'''What is your model for extreme-scale I/O, and how do you expect the OS/R to support your plans?'''
|OpenX supports a multiplicity of I/O support interfaces from conventional file systems to a unified namespace and asynchronous control (under separate funding).
|Proxy call to external hosts.
|(DEGAS)
|(D-TEC)
|At exascale, it is expected that network latencies will be very high and that hardware failures will be common. Therefore, at the OS/runtime level, what is needed is an interface for asynchronous I/O that is resilient in the face of failures. Further details will be dictated largely by the needs of applications.
|(X-TUNE)
|(GVR)
|(CORVETTE)
|N/A
|For PIPER this refers to performance information: hierarchical aggregation through MRNet to reduce data and perform online analysis. OS support is required for bootstrapping; optionally, this could be integrated with OS backplanes.
|- style="vertical-align:top;"
|'''Does your project include support for multiple, hierarchical memory regions? If so, how will they be allocated and managed: by your X-Stack project or by the OS/R? What APIs do you expect from the OS/R to support complex and deep memory?'''
|Physical memory is managed by LXK and allocated to HPX ParalleX processes, which aggregate physical blocks to comprise a hierarchy of logical memory resources. Memory is globally accessible through capabilities-based addressing and exported process methods.
|Yes, OCR supports multiple, hierarchical memory regions. We need support from the underlying system-software/OS to inform the runtime of the regions that it can use and of their characteristics.
|(DEGAS)
|Yes. We consider SMT solvers in this respect most promising.
|The SWARM runtime supports distributed memory and deep NUMA machines through a hierarchy of locale structures that roughly parallels the hardware topology. SWARM requires the ability to allocate the relevant memory regions from the OS.
|(X-TUNE)
|(GVR)
|(CORVETTE)
|N/A
|N/A
|}
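Several projects above (HPX most explicitly) implement lightweight user-level threads rather than relying on OS threads. As an illustrative sketch only, not any project's implementation, a cooperative user-level scheduler can be modeled in Python with generators standing in for thread contexts, making the cheap "context switch" visible:

```python
from collections import deque

def run(tasks):
    """Round-robin scheduler: each generator is a lightweight 'thread'
    that yields to hand control back (a cheap context switch)."""
    ready = deque(tasks)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            trace.append(next(task))   # resume until the next yield
            ready.append(task)         # still runnable: requeue
        except StopIteration:
            pass                       # thread finished
    return trace

def worker(name, steps):
    # A toy 'thread' that yields control after each step.
    for i in range(steps):
        yield f"{name}:{i}"

# Two lightweight threads interleave without any OS thread switch.
print(run([worker("a", 2), worker("b", 2)]))
```

Real lightweight-thread packages switch machine contexts (stack and registers) rather than generators, but the scheduling structure is analogous.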
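Several answers in the energy/power row ask for per-socket power measurement and DVFS control. On current Linux systems these are commonly exposed through sysfs via the Intel RAPL powercap counters and the cpufreq interface. The paths below are real Linux interfaces, but their availability and required permissions vary by machine, so treat this as a hedged sketch rather than a portable API:

```python
from pathlib import Path

# Package-0 cumulative energy counter (Intel RAPL; may be absent).
RAPL = Path("/sys/class/powercap/intel-rapl:0/energy_uj")
# Per-core frequency controls (cpufreq; writing needs root).
CPUFREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq")

def read_energy_uj():
    """Cumulative package energy in microjoules (requires RAPL support)."""
    return int(RAPL.read_text())

def average_power_watts(e0_uj, e1_uj, seconds):
    """Average power over an interval from two RAPL energy samples."""
    return (e1_uj - e0_uj) / 1e6 / seconds

def set_max_freq_khz(khz):
    """Cap this core's frequency; governor semantics differ across kernels."""
    (CPUFREQ / "scaling_max_freq").write_text(str(khz))

# 10 J consumed over 2 s -> 5 W
print(average_power_watts(0, 10_000_000, 2.0))
```

RAPL counters wrap around periodically, so production code must handle counter overflow between samples.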
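The codelet model DynAX describes (non-blocking tasks as DAG nodes, data as edges, tasks enabled once their inputs are satisfied) reduces to a dependence-counting scheduler. The sketch below is an illustrative simplification, not the SWARM API; all names are invented:

```python
class Codelet:
    """A non-blocking task: a DAG node, enabled when its
    input-dependence count drops to zero."""
    def __init__(self, name, fn, deps=0):
        self.name, self.fn, self.deps = name, fn, deps
        self.successors = []   # codelets enabled by our completion

def schedule(codelets):
    """Run ready codelets; completing one decrements its successors'
    dependence counts and may make them ready."""
    order = []
    ready = [c for c in codelets if c.deps == 0]
    while ready:
        c = ready.pop()
        c.fn()
        order.append(c.name)
        for s in c.successors:
            s.deps -= 1
            if s.deps == 0:
                ready.append(s)
    return order

# Diamond DAG: a -> {b, c} -> d
a = Codelet("a", lambda: None)
b = Codelet("b", lambda: None, deps=1)
c = Codelet("c", lambda: None, deps=1)
d = Codelet("d", lambda: None, deps=2)
a.successors = [b, c]
b.successors = [d]
c.successors = [d]

order = schedule([a, b, c, d])
print(order)
```

A real runtime would dispatch ready codelets to worker threads chosen for locality; here a single loop suffices to show the enabling logic.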
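The extreme-scale I/O answers call for asynchronous I/O that tolerates failures. A minimal illustration of that pattern (asynchronous issue plus retry on transient failure) is sketched below; the `flaky` store and all names are invented for the example:

```python
import concurrent.futures as cf

def resilient_write(store, key, value, attempts=3):
    """Retry a possibly failing write up to `attempts` times."""
    for i in range(attempts):
        try:
            store(key, value)
            return i + 1          # number of attempts used
        except OSError:
            continue
    raise OSError(f"write of {key!r} failed after {attempts} attempts")

def async_write(pool, store, key, value):
    """Issue the write asynchronously; the caller checks the future later."""
    return pool.submit(resilient_write, store, key, value)

# A fake store that fails once, then succeeds.
state = {"fails": 1, "data": {}}
def flaky(key, value):
    if state["fails"] > 0:
        state["fails"] -= 1
        raise OSError("transient failure")
    state["data"][key] = value

with cf.ThreadPoolExecutor() as pool:
    fut = async_write(pool, flaky, "ckpt", b"payload")
    print(fut.result())   # succeeded on the second attempt
```

Real exascale I/O stacks would additionally replicate or re-route around failed storage targets; the future-based structure lets computation overlap the I/O.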
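DynAX's hierarchy of locale structures for deep memory (a tree roughly paralleling the hardware topology) suggests an allocation policy of "closest locale first, spill toward the root." The sketch below illustrates that idea; the `Locale` class and its fields are assumptions for illustration, not SWARM's types:

```python
class Locale:
    """One node in a tree mirroring the machine: machine -> socket -> core."""
    def __init__(self, name, capacity, parent=None):
        self.name, self.capacity, self.parent = name, capacity, parent
        self.used = 0

    def allocate(self, size):
        """Allocate as close to this locale as possible, walking toward
        the root when the local region is full."""
        node = self
        while node is not None:
            if node.capacity - node.used >= size:
                node.used += size
                return node.name
            node = node.parent
        raise MemoryError("no locale can satisfy the request")

machine = Locale("machine", capacity=64)
socket0 = Locale("socket0", capacity=16, parent=machine)
core0 = Locale("core0", capacity=4, parent=socket0)

print(core0.allocate(2))   # fits in the core-local region
print(core0.allocate(3))   # spills to the socket level
```

Capacity units and the depth of the tree would come from the OS-provided topology description the TG and DynAX answers ask for.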

Latest revision as of 05:01, May 20, 2014

Sonia requested that Pete Beckman initiate this page. For comments, please contact Pete Beckman. This page is still in development.
