Supercomputing and Big Data
given by Prof. Dr.-Ing. Morris Riedel
The fast training of traditional machine learning models and of more innovative deep learning networks on ever-growing quantities of scientific and engineering data (aka ‘Big Data’) requires high performance computing (HPC) on modern supercomputers. HPC technologies such as those developed within the European DEEP-EST project provide innovative approaches with respect to processing, memory, and modular supercomputer usage during training, testing, and validation. This workshop therefore focuses on parallel and scalable machine learning driven by HPC and will pave the way for participants to use parallel processing on supercomputers as a key enabler for a wide variety of machine learning and deep learning algorithms in use today.

Examples include scientific and engineering applications that leverage traditional machine learning techniques such as scalable feature engineering, density-based spatial clustering of applications with noise (DBSCAN), and support vector machines (SVMs) with kernel methods. These traditional approaches will also be compared with innovative deep learning models built with Keras and TensorFlow, taking advantage of convolutional neural networks (CNNs) for image datasets as well as long short-term memory (LSTM) networks for sequence data. While working through these concrete models, participants will further learn the required aspects of statistical learning theory and how to avoid overfitting in applications, using various regularization and cross-validation techniques. Minimal illustrative sketches of the covered techniques follow the agenda below. The agenda is as follows:
10:00 – 11:30 HPC Introduction & Parallel and Scalable Clustering using DBSCAN
11:30 – 12:00 coffee break
12:00 – 13:30 Parallel and Scalable Classification using SVMs with Applications
13:30 – 14:30 lunch
14:30 – 16:00 Deep Learning using CNNs driven by HPC & GPUs
16:00 – 16:30 coffee break
16:30 – 17:30 Deep Learning using LSTMs driven by HPC & GPUs
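For the morning clustering session, the following is a minimal sketch of DBSCAN on a small synthetic dataset, using scikit-learn (an assumption; the workshop itself covers parallel and scalable HPC implementations, which this serial example does not reproduce):

```python
# Illustrative (serial) DBSCAN run; the two key parameters are the
# neighborhood radius eps and the density threshold min_samples.
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Synthetic two-cluster dataset with noise, standing in for real point data
X, _ = make_moons(n_samples=1000, noise=0.05, random_state=42)

labels = DBSCAN(eps=0.1, min_samples=5).fit_predict(X)

# DBSCAN assigns the label -1 to points it considers noise
print("clusters found:", len(set(labels) - {-1}))
```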
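For the SVM session, this sketch combines a kernel SVM with cross-validation, touching on the regularization and cross-validation themes from the abstract; the library (scikit-learn), dataset, and parameter values are illustrative assumptions, not the workshop's own material:

```python
# Kernel SVM with 5-fold cross-validation as a guard against overfitting.
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# RBF-kernel SVM; C is the regularization parameter trading off
# margin width against training error
clf = SVC(kernel="rbf", C=1.0, gamma="scale")

scores = cross_val_score(clf, X, y, cv=5)
print("mean CV accuracy:", scores.mean())
```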
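For the afternoon CNN session, here is a minimal Keras/TensorFlow CNN for image classification; the input shape (MNIST-style 28x28 grayscale images) and layer sizes are illustrative choices, not the exact architecture used in the workshop. On a GPU-equipped node, TensorFlow places this model on the GPU automatically:

```python
# Small CNN for 10-class image classification, with dropout as regularization.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),           # grayscale images
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                        # regularization against overfitting
    layers.Dense(10, activation="softmax"),     # 10 output classes
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```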
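For the closing LSTM session, this sketch shows a Keras LSTM for sequence classification; the sequence length, feature count, and layer sizes are purely illustrative assumptions:

```python
# LSTM that summarizes a sequence into one vector for binary classification.
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(100, 8)),            # 100 time steps, 8 features each
    layers.LSTM(64),                         # sequence -> fixed-size representation
    layers.Dense(1, activation="sigmoid"),   # binary classification head
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```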