Topics Related to HPC in 2019

Algorithms

The development, evaluation and optimization of scalable, general-purpose, high performance algorithms.

» Algorithmic techniques to improve energy and power efficiency
» Algorithmic techniques to improve load balance
» Data-intensive parallel algorithms
» Discrete and combinatorial problems
» Fault-tolerant algorithms
» Graph and network algorithms
» Hybrid/heterogeneous/accelerated algorithms
» Numerical methods and algebraic systems
» Scheduling algorithms
» Uncertainty quantification
» Other high performance algorithms

Applications

The development and enhancement of algorithms, parallel implementations, models, software and problem solving environments for specific applications that require high performance resources.

» Bioinformatics and computational biology
» Computational earth and atmospheric sciences
» Computational materials science and engineering
» Computational astrophysics/astronomy, chemistry, and physics
» Computational fluid dynamics and mechanics
» Computation- and data-enabled social science
» Computational design optimization for aerospace, energy, manufacturing, and industrial applications
» Computational medicine and bioengineering
» Other high performance applications
» Use of uncertainty quantification, statistical, and machine-learning techniques to improve a specific HPC application
» Improved models, algorithms, performance or scalability of specific applications and respective software

Architecture and Networks

All aspects of high performance hardware including the optimization and evaluation of processors and networks.

» Memory systems: caches, memory technology, non-volatile memory, memory system architecture (including address translation for cores and accelerators)
» I/O architecture/hardware and emerging storage technologies
» Network protocols, quality of service, congestion control, collective communication
» Scalable and composable coherence (for cores and accelerators)
» Multi-processor architecture and micro-architecture (e.g., reconfigurable, vector, stream, dataflow, GPUs, and custom/novel architectures)
» Interconnect technologies, topology, switch architecture, optical networks, software-defined networks
» Architectures to support extremely heterogeneous composable systems (e.g., chiplets)
» Secure architectures, side-channel attacks, and mitigation
» Power-efficient design and power-management strategies
» Resilience, error correction, high availability architectures
» Software/hardware co-design, domain specific language support
» Evaluation and measurement on testbed or production hardware systems
» Hardware acceleration of containerization and virtualization mechanisms for HPC

Clouds and Distributed Computing

All software aspects of clouds and distributed computing that are related to HPC systems, including software architecture, configuration, optimization and evaluation.

» Compute and storage cloud architectures including many-core computing and accelerators in the cloud
» HPC and cloud convergence at infrastructure and software level
» Innovative methods for using cloud-based systems for HPC applications
» Support and tuning of Big Data cloud data ecosystems on HPC infrastructures
» Parallel programming models and tools at the intersection of cloud and HPC
» Virtualization and containerization for HPC, virtualized high performance I/O network interconnects, parallel and distributed file systems in virtual environments
» Cloud workflow, data, and resource management including dynamic resource provisioning
» Methods, systems, and architectures for scalable data stream processing
» Scheduling, load balancing, resource provisioning, energy efficiency, fault tolerance, and reliability for cloud computing
» Self-configuration, management, information services, and monitoring
» Service-oriented architectures and tools for integration of clouds, clusters, and distributed computing
» Cloud security and identity management
» Science case studies on cloud infrastructure
» Machine learning for science in the cloud

Data Analytics, Visualization, and Storage

All aspects of data analytics, visualization, storage, and storage I/O related to HPC systems. Submissions on work done at scale are highly favored.

» Databases and scalable structured storage for HPC
» Data mining, analysis, and visualization for modeling and simulation
» Data analytics and frameworks supporting data analytics
» Ensemble analysis and visualization
» I/O performance tuning, benchmarking, and middleware
» Scalable storage systems
» Next-generation storage systems and media
» Parallel file, object, key-value, campaign, and archival systems
» Provenance, metadata, and data management
» Reliability and fault tolerance in HPC storage
» Scalable storage, metadata, namespaces, and data management
» Storage tiering, both internal tiering entirely on premises and tiering between on-premises and cloud storage
» Storage innovations using machine learning, such as predictive tiering, failure prediction, etc.
» Storage networks
» Cloud-based storage

Machine Learning and HPC

The development and enhancement of algorithms, systems, and software for scalable machine learning utilizing high-performance and cloud computing platforms.

» Machine learning and optimization models for extreme scale systems
» Enhancing applicability of machine learning in HPC (e.g., usability)
» Learning large models / optimizing hyperparameters (e.g., deep learning, representation learning)
» Facilitating very large ensembles in extreme scale systems
» Training machine learning models on large datasets and scientific data
» Overcoming the machine learning problems inherent to large datasets (e.g., noisy labels, missing data, scalable ingest)
» Large scale machine learning applications utilizing HPC
» Future research challenges for machine learning at large scale
» Hybrid machine learning algorithms for hybrid HPC compute architectures
» Systems, compilers, and languages for machine learning at scale

Performance Measurement, Modeling, and Tools

Novel methods and tools for measuring, evaluating, and/or analyzing performance for large scale systems.

» Analysis, modeling, prediction, or simulation methods
» Empirical measurement techniques on HPC systems
» Scalable tools and instrumentation infrastructure for measurement, monitoring, and/or visualization of performance
» Novel and broadly applicable performance optimization techniques
» Methodologies, metrics, and formalisms for performance analysis and tools
» Workload characterization and benchmarking techniques
» Performance studies of HPC subsystems such as processor, network, memory, accelerators, and storage
» System-design tradeoffs between different measures of performance (e.g., performance and resilience, performance and security)

Programming Systems

Technologies that support parallel programming for large-scale systems as well as smaller-scale components that will plausibly serve as building blocks for next-generation HPC architectures.

» Parallel programming languages, libraries, models, and notations
» Programming language and compilation techniques for reducing energy and data movement (e.g., precision allocation, use of approximations, tiling)
» Solutions for parallel-programming challenges (e.g., interoperability, memory consistency, determinism, race detection, work stealing, or load balancing)
» Parallel application frameworks
» Tools for parallel program development (e.g., debuggers and integrated development environments)
» Program analysis, synthesis, and verification to enhance cross-platform portability, maintainability, result reproducibility, and resilience (e.g., combined static and dynamic analysis methods, testing, formal methods)
» Compiler analysis and optimization; program transformation
» Runtime systems as they interact with programming systems

State of the Practice

All R&D aspects of the pragmatic practices of HPC, including operational IT infrastructure, services, facilities, large-scale application executions and benchmarks.

» Bridging of cloud data centers and supercomputing centers
» Comparative system benchmarking over a wide spectrum of workloads
» Deployment experiences of large-scale infrastructures and facilities
» Facilitation of “big data” associated with supercomputing
» Long-term infrastructural management experiences
» Pragmatic resource management strategies and experiences
» Procurement, technology investment and acquisition best practices
» Quantitative results of education, training and dissemination activities
» User support experiences with large-scale and novel machines
» Infrastructural policy issues, especially international experiences
» Software engineering best practices for HPC

System Software

Operating system (OS), runtime system and other low-level software research & development that enables allocation and management of hardware resources for HPC applications and services.

» Alternative and specialized parallel operating systems and runtime systems
» Approaches for enabling adaptive and introspective system software
» Communication optimization
» Software distributed shared memory systems
» System-software support for global address spaces
» OS and runtime system enhancements for attached and integrated accelerators
» Interactions among the OS, runtime, compiler, middleware, and tools
» Parallel/networked file system integration with the OS and runtime
» Resource management
» Runtime and OS management of complex memory hierarchies
» System software strategies for controlling energy and temperature
» Support for fault tolerance and resilience
» Virtualization and virtual machines