Head of Group

Prof. Dr. Philipp Wieder

Biography

Professor Philipp Wieder has been deputy head of the GWDG, a joint institution of the Georg August University of Göttingen and the Max Planck Society, since October 2011. He also heads the “eScience” working group and held an honorary professorship at the Georg August University of Göttingen from 2019 to 2025. Since 2025, he has been a full professor of Data Science Infrastructures there.

In 2000, he completed his studies in electrical engineering at RWTH Aachen University. He then devoted himself to research in the field of distributed systems and grid computing at what is now the Jülich Supercomputing Centre and earned his doctorate on the topic of scheduling in distributed infrastructures at TU Dortmund University. At the same time, he led the “Service Computing” working group at TU Dortmund University.

Philipp Wieder is involved in various initiatives on topics such as research data management and distributed service infrastructures. In addition to the European Open Science Cloud, these include, in particular, the National Research Data Infrastructure (NFDI), where he is co-spokesperson for the Text+ consortium. He is also responsible for numerous national and international research projects and teaches in the field of data science.

Research Interests

Data Science Infrastructure
Research Data Management
Distributed Systems

Recent Publications

All publications by Prof. Dr. Philipp Wieder are available on the Goettingen Research Online platform.

2023

A Survey on Identity and Access Management for Cross-Domain Dynamic Users: Issues, Solutions, and Challenges (Aytaj Badirova, Shirin Dabbaghi, Faraz Fatemi Moghaddam, Philipp Wieder, Ramin Yahyapour), 2023-01-01

2022

Canonical Workflow for Experimental Research (Dirk Betz, Claudia Biniossek, Christophe Blanchi, Felix Henninger, Thomas Lauer, Philipp Wieder, Peter Wittenburg, Martin Zünkeler), 2022-01-01
Data Depositing Services und der Text+ Datenraum (Andreas Witt, Andreas Henrich, Jonathan Blumtritt, Christoph Draxler, Axel Herold, Marius Hug, Christoph Kudella, Peter Leinen, Philipp Wieder), 2022-01-01
Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes (Hendrik Nolte, Philipp Wieder), 2022-01-01
Toward data lakes as central building blocks for data management and analysis (Philipp Wieder, Hendrik Nolte), 2022-01-01
Toward data lakes as central building blocks for data management and analysis (Hendrik Nolte, Philipp Wieder), 2022-01-01

2021

An Optimized Single Sign-On Schema for Reliable Multi-Level Security Management in Clouds (Aytaj Badirova, Shirin Dabbaghi, Faraz Fatemi-Moghaddam, Philipp Wieder, Ramin Yahyapour), In Proceedings of FiCloud 2021 – 8th International Conference on Future Internet of Things and Cloud, 2021-01-01
Certification Schemes for Research Infrastructures (Felix Helfer, Stefan Buddenbohm, Thomas Eckart, Philipp Wieder), 2021-01-01
Sekundäre Nutzung von hausärztlichen Routinedaten ist machbar – Bericht vom RADAR Projekt (Johannes Hauswaldt, Thomas Bahls, Arne Blumentritt, Iris Demmer, Johannes Drepper, Roland Groh, Stephanie Heinemann, Wolfgang Hoffmann, Valérie Kempter, Johannes Pung, Otto Rienhoff, Falk Schlegelmilch, Philipp Wieder, Ramin Yahyapour, Eva Hummers), 2021-01-01 DOI

BibTeX: A Survey on Identity and Access Management for Cross-Domain Dynamic Users: Issues, Solutions, and Challenges

@article{2_131999,

	author = {Aytaj Badirova and Shirin Dabbaghi and Faraz Fatemi Moghaddam and Philipp Wieder and Ramin Yahyapour},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/131999},
	month = {01},
	title = {A Survey on Identity and Access Management for Cross-Domain Dynamic Users: Issues, Solutions, and Challenges},
	type = {article},
	year = {2023},
}

BibTeX: Toward data lakes as central building blocks for data management and analysis

@article{2_129372,

	abstract = {Data lakes are a fundamental building block for many industrial data analysis solutions and becoming increasingly popular in research. Often associated with big data use cases, data lakes are, for example, used as central data management systems of research institutions or as the core entity of machine learning pipelines. The basic underlying idea of retaining data in its native format within a data lake facilitates a large range of use cases and improves data reusability, especially when compared to the schema-on-write approach applied in data warehouses, where data is transformed prior to the actual storage to fit a predefined schema. Storing such massive amounts of raw data, however, has its very own challenges, spanning from the general data modeling, and indexing for concise querying to the integration of suitable and scalable compute capabilities. In this contribution, influential papers of the last decade have been selected to provide a comprehensive overview of developments and obtained results. The papers are analyzed with regard to the applicability of their input to data lakes that serve as central data management systems of research institutions. To achieve this, contributions to data lake architectures, metadata models, data provenance, workflow support, and FAIR principles are investigated. Last, but not least, these capabilities are mapped onto the requirements of two common research personae to identify open challenges. With that, potential research topics are determined, which have to be tackled toward the applicability of data lakes as central building blocks for research data management.},
	author = {Hendrik Nolte and Philipp Wieder},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/129372},
	month = {01},
	title = {Toward data lakes as central building blocks for data management and analysis},
	type = {article},
	url = {https://publications.goettingen-research-online.de/handle/2/114449},
	year = {2022},
}

BibTeX: Toward data lakes as central building blocks for data management and analysis

@article{2_114449,

	abstract = {Data lakes are a fundamental building block for many industrial data analysis solutions and becoming increasingly popular in research. Often associated with big data use cases, data lakes are, for example, used as central data management systems of research institutions or as the core entity of machine learning pipelines. The basic underlying idea of retaining data in its native format within a data lake facilitates a large range of use cases and improves data reusability, especially when compared to the schema-on-write approach applied in data warehouses, where data is transformed prior to the actual storage to fit a predefined schema. Storing such massive amounts of raw data, however, has its very own challenges, spanning from the general data modeling, and indexing for concise querying to the integration of suitable and scalable compute capabilities. In this contribution, influential papers of the last decade have been selected to provide a comprehensive overview of developments and obtained results. The papers are analyzed with regard to the applicability of their input to data lakes that serve as central data management systems of research institutions. To achieve this, contributions to data lake architectures, metadata models, data provenance, workflow support, and FAIR principles are investigated. Last, but not least, these capabilities are mapped onto the requirements of two common research personae to identify open challenges. With that, potential research topics are determined, which have to be tackled toward the applicability of data lakes as central building blocks for research data management.},
	author = {Philipp Wieder and Hendrik Nolte},
	doi = {10.3389/fdata.2022.945720},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/114449},
	month = {01},
	title = {Toward data lakes as central building blocks for data management and analysis},
	type = {article},
	year = {2022},
}

BibTeX: Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes

@article{2_129373,

	author = {Hendrik Nolte and Philipp Wieder},
	doi = {10.1162/dint_a_00141},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/129373},
	month = {01},
	title = {Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes},
	type = {article},
	url = {https://publications.goettingen-research-online.de/handle/2/121151},
	year = {2022},
}

BibTeX: Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes

@article{2_121151,

	author = {Hendrik Nolte and Philipp Wieder},
	doi = {10.1162/dint_a_00141},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/121151},
	month = {01},
	title = {Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes},
	type = {article},
	year = {2022},
}

BibTeX: Data Depositing Services und der Text+ Datenraum

@misc{2_127235,

	author = {Andreas Witt and Andreas Henrich and Jonathan Blumtritt and Christoph Draxler and Axel Herold and Marius Hug and Christoph Kudella and Peter Leinen and Philipp Wieder},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/127235},
	month = {01},
	title = {Data Depositing Services und der Text+ Datenraum},
	type = {misc},
	year = {2022},
}

BibTeX: Canonical Workflow for Experimental Research

@article{2_121152,

	author = {Dirk Betz and Claudia Biniossek and Christophe Blanchi and Felix Henninger and Thomas Lauer and Philipp Wieder and Peter Wittenburg and Martin Zünkeler},
	doi = {10.1162/dint_a_00123},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/121152},
	month = {01},
	title = {Canonical Workflow for Experimental Research},
	type = {article},
	year = {2022},
}

BibTeX: Sekundäre Nutzung von hausärztlichen Routinedaten ist machbar – Bericht vom RADAR Projekt

@article{2_97749,

	abstract = {Zusammenfassung. Ziel der Studie: „Real world“-Daten aus der ambulanten Gesundheitsversorgung sind in Deutschland nur schwer systematisch und longitudinal zu erlangen. Unsere Vision ist eine permanente Datenablage mit repräsentativen, de-identifizierten Patienten- und Versorgungsdaten, längsschnittlich, fortwährend aktualisiert und von verschiedenen Versorgern, mit der Möglichkeit zur Verknüpfung mit weiteren Daten, etwa aus Patientenbefragungen oder biologischer Forschung, zugänglich für andere Forscher. Wir berichten methodische Vorgehensweisen und Ergebnisse aus dem RADAR Projekt. Methodik: Untersuchung des Rechtsrahmens, Entwicklung prototypischer technischer Abläufe und Lösungen, mit Machbarkeitsstudie zur Evaluation von technischer und inhaltlicher Funktionalität sowie Eignung für Fragestellungen der Versorgungsforschung. Ergebnisse: Ab 2016 entwickelte ein interdisziplinäres Wissenschaftlerteam ein Datenschutzkonzept für Exporte von Versorgungsdaten aus elektronischen Praxisverwaltungssystemen. Eine technische und organisatorische Forschungsinfrastruktur im ambulanten Sektor wurden entwickelt und im Anwendungsfall „Orale Antikoagulation“ (OAK) umgesetzt. In 7 niedersächsischen Hausarztpraxen wurden 100 Patienten gewonnen und nach informierter Einwilligung ihre ausgewählten Behandlungsdaten, reduziert auf 40 relevante Datenfelder, über die Behandlungsdatentransfer-Schnittstelle extrahiert, unmittelbar vor Ort in identifizierende bzw. medizinische Daten getrennt und verschlüsselt zur Treuhandstelle (THS) bzw. an den Datenhalter übertragen. 75 Patienten, die die Einschlusskriterien erfüllten (mind. 1 Jahr Behandlung mit OAK), erhielten einen Lebensqualitäts-Fragebogen über die THS per Post. Von 66 Rücksendungen wurden 63 Fragebogenergebnisse mit den Behandlungsdaten in der Datenablage verknüpft. Schlussfolgerung: Die rechtskonforme Machbarkeit der Gewinnung von pseudonymisierten hausärztlichen Routinedaten mit expliziter informierter Patienteneinwilligung und deren wissenschaftliche Nutzung einschließlich Re-Kontaktierung und Einbindung von Fragebogendaten konnte nachgewiesen werden. Die Schutzkonzepte Privacy by design und Datenminimierung (Artikel 25 mit Erwägungsgrund 78 DSGVO) wurden systematisch in das RADAR Projekt integriert und begründen wesentlich, dass der Machbarkeitsnachweis rechtskonformer Primärdatengewinnung und sekundärer Nutzung für Forschungszwecke gelang. Eine Nutzung hinreichend anonymisierter, aber noch sinnvoller hausärztlicher Gesundheitsdaten ohne individuelle Einwilligung ist im bestehenden Rechtsrahmen in Deutschland schwerlich umsetzbar.},
	author = {Johannes Hauswaldt and Thomas Bahls and Arne Blumentritt and Iris Demmer and Johannes Drepper and Roland Groh and Stephanie Heinemann and Wolfgang Hoffmann and Valérie Kempter and Johannes Pung and Otto Rienhoff and Falk Schlegelmilch and Philipp Wieder and Ramin Yahyapour and Eva Hummers},
	doi = {10.1055/a-1676-4020},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/97749},
	month = {01},
	title = {Sekundäre Nutzung von hausärztlichen Routinedaten ist machbar – Bericht vom RADAR Projekt},
	type = {article},
	year = {2021},
}

BibTeX: Certification Schemes for Research Infrastructures

@misc{2_108259,

	abstract = {This working paper discusses the use and importance of various certification systems for the field of modern research infrastructures. For infrastructures such as CLARIAH-DE, reliable storage, management and dissemination of research data is an essential task. The certification of various areas, such as the technical architecture used, the work processes used or the qualification level of the staff, is an established procedure to ensure compliance with a variety of standards and quality criteria and to demonstrate the quality and reliability of an infrastructure to researchers, funders and comparable consortia. The working paper conducts this discussion based on an overview of selected certification systems that are of particular importance for CLARIAH-DE, but also for other research infrastructures. In addition to formalised certifications, the paper also addresses the areas of software-specific and self-assessment-based procedures and the different roles of the actors involved.},
	address = {Göttingen},
	author = {Felix Helfer and Stefan Buddenbohm and Thomas Eckart and Philipp Wieder},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/108259},
	month = {01},
	title = {Certification Schemes for Research Infrastructures},
	type = {misc},
	year = {2021},
}

BibTeX: An Optimized Single Sign-On Schema for Reliable Multi-Level Security Management in Clouds

@inproceedings{2_121153,

	author = {Aytaj Badirova and Shirin Dabbaghi and Faraz Fatemi-Moghaddam and Philipp Wieder and Ramin Yahyapour},
	grolink = {https://resolver.sub.uni-goettingen.de/purl?gro-2/121153},
	booktitle = {Proceedings of FiCloud 2021 – 8th International Conference on Future Internet of Things and Cloud},
	month = {01},
	title = {An Optimized Single Sign-On Schema for Reliable Multi-Level Security Management in Clouds},
	type = {inproceedings},
	year = {2021},
}

Teaching

Winter term 2026

Lecture: Anwendungsgebiete der Data Science, Prof. Dr. Philipp Wieder
Lecture: Data Science Infrastructures, Prof. Dr. Philipp Wieder, Dr. Sven Bingert
Colloquium: Kolloquium zu Projekten in den Digital Humanities, Prof. Dr. Philipp Wieder

Summer term 2026

Colloquium: Kolloquium zu Projekten in den Digital Humanities, Prof. Dr. Philipp Wieder

Winter term 2025

Lecture: Anwendungsgebiete der Data Science, Prof. Dr. Philipp Wieder
Lecture: Data Science Infrastructures, Prof. Dr. Philipp Wieder, Dr. Sven Bingert
Colloquium: Kolloquium zu Projekten in den Digital Humanities, Prof. Dr. Philipp Wieder

Summer term 2025

Colloquium: Kolloquium zu Projekten in den Digital Humanities, Prof. Dr. Philipp Wieder

Theses and Projects

Topic	Professor	Type
Metadata quality dashboard for the Deutsche Digitale Bibliothek	Prof. Ramin Yahyapour	BSc, MSc
SSO Keycloak integration and self-services for a community portal	Prof. Ramin Yahyapour	BSc, MSc
Knowledge Graphs and NLP techniques	Prof. Ramin Yahyapour	BSc, MSc
Implementation of an API specification to enhance the functionality of a text and data mining system	Prof. Ramin Yahyapour	BSc, MSc
Token Management for an API to utilise HPC resources in generic workflows	Prof. Ramin Yahyapour	BSc, MSc
Benchmarking and Characterization of Workflow Execution in Heterogeneous HPC Systems	Prof. Julian Kunkel	BSc, MSc
Hybrid Scheduling: Combining Exact Solvers and Learning-Based Methods for HPC Workflows	Prof. Julian Kunkel	BSc, MSc
Quantum-Inspired Optimization for Workflow Mapping in Heterogeneous HPC Systems	Prof. Julian Kunkel	BSc, MSc
Ethical and Responsible AI Considerations in Automated HPC Scheduling Systems	Prof. Julian Kunkel	BSc, MSc
Constraint-Based Workflow Scheduling Using MILP and CP-SAT: A Comparative Study	Prof. Julian Kunkel	BSc, MSc
Modeling System and Workload Characteristics for Workflow Scheduling in the HPC Compute Continuum	Prof. Julian Kunkel	BSc, MSc
Designing an Environmental Sustainability Labeling System for AI Services Based on Resource Usage	Prof. Julian Kunkel	BSc, MSc
Meta Machine Intelligence (MMI) for Error Detection in High-Performance Computing Systems	Prof. Julian Kunkel	BSc, MSc
Multi-Model Job Scheduling for Mixed Computing Environments	Prof. Julian Kunkel	BSc, MSc
Lightweight AI for Detecting Irregular Behavior in Device Logs	Prof. Julian Kunkel	BSc, MSc
Interactive Dashboard for Monitoring AI Performance in System Maintenance	Prof. Julian Kunkel	BSc, MSc
How do students use AI in their studies?	Prof. Julian Kunkel	BSc, MSc
AI-assisted programming learning	Prof. Julian Kunkel	BSc, MSc
Exploring Quantum Computing Use Cases	Prof. Julian Kunkel	BSc, MSc
Comparison of Distributed Computing Frameworks	Prof. Julian Kunkel	BSc, MSc
Implementation of a preCICE adapter for the particle transport simulator LIGGGHTS	Prof. Julian Kunkel	BSc, MSc
Integrated Analysis of High Performance Computing Training Materials: A Fusion of Web Scraping, Machine Learning, and Statistical Insights	Prof. Julian Kunkel	BSc, MSc
Advancing Education in High Performance Computing: Exploring Personalized Teaching Strategies and Adaptive Learning Technologies	Prof. Julian Kunkel	BSc, MSc
Evaluating Pedagogical Strategies in High Performance Computing Training: A Machine Learning-driven Investigation into Effective Didactic Approaches	Prof. Julian Kunkel	BSc, MSc
Reimagining and Porting a Prototype for High Performance Computing Certification: Enhancing Knowledge and Skills Validation	Prof. Julian Kunkel	BSc, MSc
Processing of experimental videos for object tracking	Prof. Julian Kunkel	BSc, MSc
Enabling particle simulations with a deformable boundary	Prof. Julian Kunkel	BSc, MSc
Framework for automated ML and empirical model generation	Prof. Julian Kunkel	BSc, MSc
An Agentic Retrieval-Augmented Generation Framework for Modular Knowledge Pipelines	Prof. Julian Kunkel	BSc, MSc
A Collaborative Human–Agent RAG Chat System with Role-Aware Reasoning and Expert-Owned Knowledge	Prof. Julian Kunkel	BSc, MSc
Regulation-Aware AI Supervision: A RAG-Based Evaluation and Policy Enforcement Framework	Prof. Julian Kunkel	BSc, MSc
AgentFlow: A Modular Pipeline for Coordinated Collaboration of AI Agents	Prof. Julian Kunkel	BSc, MSc
Advanced Retrieval-Augmented Generation: Improving Quality, Latency, and Adaptability	Prof. Julian Kunkel	BSc, MSc
Federated Fine-Tuning of Large Language Models in Distributed and Privacy-Preserving Environments	Prof. Julian Kunkel	BSc, MSc
Agentic Retrieval-Augmented Generation System for Modular Knowledge Pipelines	Prof. Julian Kunkel	BSc, MSc
Segment-Wise Sequential Fine-Tuning of Large Language Models Under Memory Constraints	Prof. Julian Kunkel	BSc, MSc
Performance optimization of numerical simulation of condensed matter systems	Prof. Julian Kunkel	BSc, MSc
Benchmarking Applications on Cloud vs. HPC Systems	Prof. Julian Kunkel	BSc, MSc
Putting RISC-V eval board Linux and HPC toolchains into operation	Prof. Julian Kunkel	BSc, MSc
Performance Evaluation of LLM Inference Engines	Prof. Julian Kunkel	BSc, MSc
Operating Kubernetes with AI Engineers	Prof. Julian Kunkel	BSc, MSc
Prototyping a Geo-Redundancy Engine	Prof. Julian Kunkel	BSc, MSc
Development of a new application for the SpiNNaker-2 neuromorphic computing platform	Prof. Julian Kunkel	BSc, MSc
Development of Text-to-SQL/XML Conversational AI for Planarian Research Database	Prof. Julian Kunkel	BSc, MSc