The various HPC systems operated by GWDG are also the subject of GWDG's research on HPC methods, the
results of which are used to improve operation and/or user experience. In this context, GWDG also
participates in several third-party-funded projects.
In addition to service-oriented research, academic teaching in computer science is a focus of our
work. For this reason, we are involved in the education of students in a variety of ways. GWDG
currently has three research groups whose teaching is anchored at the
Institute of Computer Science at the University of Göttingen and
whose course content is part of various degree programs.
A complete list of GWDG research projects can be found
here.
Recent HPC Publications
2022
Improve the Deep Learning Models in Forestry Based on Explanations and Expertise
(Ximeng Cheng, Ali Doosthosseini, Julian Kunkel),
In Frontiers in Plant Science,
Frontiers Media SA,
ISSN: 1664-462X,
2022-05-01
2021
User-Centric System Fault Identification Using IO500 Benchmark
(Radita Liem, Dmytro Povaliaiev, Jay Lofstead, Julian Kunkel, Christian Terboven),
In 2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW),
pp. 35-40,
IEEE,
2021-12-01
Understanding I/O Behavior in Scientific and Data-Intensive Computing (Dagstuhl Seminar 21332)
(Philip Carns, Julian Kunkel, Kathryn Mohror, Martin Schulz),
In Dagstuhl Reports,
pp. 16-75,
Schloss Dagstuhl -- Leibniz-Zentrum für Informatik,
ISSN: 2192-5283,
2021-09-14
BibTeX: Improve the Deep Learning Models in Forestry Based on Explanations and Expertise
@article{ITDLMIFBOE22,
abstract = {In forestry studies, deep learning models have achieved excellent performance in many application scenarios (e.g., detecting forest damage). However, the unclear model decisions (i.e., black-box) undermine the credibility of the results and hinder their practicality. This study intends to obtain explanations of such models through the use of explainable artificial intelligence methods, and then use feature unlearning methods to improve their performance, which is the first such attempt in the field of forestry. Results of three experiments show that the model training can be guided by expertise to gain specific knowledge, which is reflected by explanations. For all three experiments based on synthetic and real leaf images, the improvement of models is quantified in the classification accuracy (up to 4.6%) and three indicators of explanation assessment (i.e., root-mean-square error, cosine similarity, and the proportion of important pixels). Besides, the introduced expertise in annotation matrix form was automatically created in all experiments. This study emphasizes that studies of deep learning in forestry should not only pursue model performance (e.g., higher classification accuracy) but also focus on the explanations and try to improve models according to the expertise.},
author = {Ximeng Cheng and Ali Doosthosseini and Julian Kunkel},
doi = {10.3389/fpls.2022.902105},
issn = {1664-462X},
journal = {Frontiers in Plant Science},
month = {05},
publisher = {Frontiers Media SA},
title = {Improve the Deep Learning Models in Forestry Based on Explanations and Expertise},
type = {article},
year = {2022},
}
BibTeX: User-Centric System Fault Identification Using IO500 Benchmark
@inproceedings{USFIUIBLPL21,
abstract = {I/O performance in a multi-user environment is difficult to predict. Users do not know what I/O performance to expect when running and tuning applications. We propose to use the IO500 benchmark as a way to guide user expectations on their application’s performance and to aid identifying root causes of their I/O problems that might come from the system. Our experiments describe how we manage user expectation with IO500 and provide a mechanism for system fault identification. This work also provides us with information of the tail latency problem that needs to be addressed and granular information about the impact of I/O technique choices (POSIX and MPI-IO).},
author = {Radita Liem and Dmytro Povaliaiev and Jay Lofstead and Julian Kunkel and Christian Terboven},
booktitle = {2021 IEEE/ACM Sixth International Parallel Data Systems Workshop (PDSW)},
conference = {International Parallel Data Systems Workshop (PDSW)},
doi = {10.1109/PDSW54622.2021.00011},
location = {St. Louis},
month = {12},
pages = {35--40},
publisher = {IEEE},
title = {User-Centric System Fault Identification Using IO500 Benchmark},
type = {inproceedings},
year = {2021},
}
BibTeX: Understanding I/O Behavior in Scientific and Data-Intensive Computing (Dagstuhl Seminar 21332)
@article{UIBISADCSC21,
abstract = {Two key changes are driving an immediate need for deeper understanding of I/O workloads in high-performance computing (HPC): applications are evolving beyond the traditional bulk-synchronous models to include integrated multistep workflows, in situ analysis, artificial intelligence, and data analytics methods; and storage systems designs are evolving beyond a two-tiered file system and archive model to complex hierarchies containing temporary, fast tiers of storage close to compute resources with markedly different performance properties. Both of these changes represent a significant departure from the decades-long status quo and require investigation from storage researchers and practitioners to understand their impacts on overall I/O performance. Without an in-depth understanding of I/O workload behavior, storage system designers, I/O middleware developers, facility operators, and application developers will not know how best to design or utilize the additional tiers for optimal performance of a given I/O workload. The goal of this Dagstuhl Seminar was to bring together experts in I/O performance analysis and storage system architecture to collectively evaluate how our community is capturing and analyzing I/O workloads on HPC systems, identify any gaps in our methodologies, and determine how to develop a better in-depth understanding of their impact on HPC systems. Our discussions were lively and resulted in identifying critical needs for research in the area of understanding I/O behavior. We document those discussions in this report.},
author = {Philip Carns and Julian Kunkel and Kathryn Mohror and Martin Schulz},
doi = {10.4230/DagRep.11.7.16},
issn = {2192-5283},
journal = {Dagstuhl Reports},
month = {09},
pages = {16--75},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum für Informatik},
title = {Understanding I/O Behavior in Scientific and Data-Intensive Computing (Dagstuhl Seminar 21332)},
type = {article},
url = {https://drops.dagstuhl.de/opus/volltexte/2021/15589},
year = {2021},
}
An overview of all GWDG publications can be found here.
Open Projects and Bachelor's, Master's, and Doctoral Theses
Topic
Professor
Type
Token Management for an API to utilise HPC resources in generic workflows
Parallel applications on HPC systems often rely on system-specific MPI (Message Passing Interface) and interconnect libraries, for example for InfiniBand or Omni-Path networks. This partially offsets one main advantage of containerizing such applications, namely portability between different platforms. The goal of this project is to evaluate different ways of integrating system-specific communication libraries into containers so that these containers can be ported to a different platform with minimal effort. A proof of concept (PoC) should be implemented and benchmarked against running natively on a system. Supervisor: Christian Boehme
Digital Twin of the data center: creation of a 3D model of the GWDG data center for walkthroughs in virtual reality
Prof. Julian Kunkel
BSc, MSc
Supervisor:
Digital teaching: development of examination scenarios for HPC skills
Prof. Julian Kunkel
BSc, MSc
Supervisor:
Development of a provenance-aware ad hoc interface for a data lake