1-Slider-Barcamp-Background

October 8 – 9, 2025

Posters & Demos

Day 1

(1) RADAR – Enhancing FAIR Research Data Management with AI Support

Felix Bach, Kerstin Soltau, Stefan Hofmann

Organization(s): FIZ Karlsruhe – Leibniz-Institut für Informationsinfrastruktur

RADAR, developed and operated by FIZ Karlsruhe, is a well-established research data repository supporting secure archiving, publication, and long-term preservation of data across disciplines. Since its launch in 2017, RADAR has continuously evolved to meet the growing demands of open science. It offers comprehensive metadata support, persistent identifiers, semantic enrichment (e. g. Schema.org, FAIR Signposting), discipline-specific terminologies via TS4NFDI, and integration with platforms such as GitHub and GitLab. Flexible deployment options (RADAR Cloud, RADAR Local) and tailored services (e. g. RADAR4Chem, RADAR4Memory) ensure broad usability and community alignment.

As part of our ongoing innovation efforts, we are currently exploring AI-driven enhancements that further support FAIR data practices. These include:

AI-assisted metadata review and enrichment, enabling e. g. the automatic extraction of relevant metadata from documents linked within submissions;
AI-assisted FAIRness checks, offering feedback and suggestions to improve the FAIRness of datasets.

These developments aim to help researchers meet growing expectations for quality metadata and data stewardship while reducing manual effort. A live notebook demo will be available to trial our AI-enhanced features, currently in testing. Our poster, which uses a timeline to illustrate RADAR´s evolution and feature set, invites discussion on AI, FAIR data, open science, and future-ready repository services.

(2) Helping SSH Researchers to Explore and Exploit Knowledge Graphs through Artificial Intelligence: the GRAPHIA Project

Matteo Romanello (1), Luca De Santis (2), Julien Homo (3), Stefano de Paoli (4), Sy Holsinger (5)

Organization(s): 1: Odoma; 2: Net7; 3: FoxCub; 4: Abertay University; 5: OPERAS

Many research institutions worldwide have invested over the years in the creation of knowledge graphs (KGs) with deep semantic descriptions of their data. Despite the semantic richness of this information, its accessibility and usefulness for researchers remain limited due to the complexity of their ontologies and the specialized query languages required to retrieve information (e.g., SPARQL) [1]. Helping researchers in social sciences and humanities (SSH) to more easily access information contained in KGs is one of the aims of the GRAPHIA project, a recent EU-funded project, which aims not only to create a comprehensive KG for SSH, but also to exploit large language models (LLMs) as a means to access and analyse the information contained in them.

In GRAPHIA, we will expand the current implementations of AI done in the context of GoTriple, a massive repository holding more than 19 million SSH publications. The first AI implementation is the GoTriple ChatBot [2], a RAG system (Retrieval Augmented Generation) fed with the full text of selected documents to answer user prompts. As a complete RAG indexation of GoTriple turned out to be too costly, a new implementation allows users to create notebooks starting from GoTriple documents, and then pass them to the RAG system for AI-assisted analysis. Finally, GRAPHIA partners’ experiments with AI include the usage of LLMs to convert natural language questions into structured queries against KGs, and the usage of the Model Context Protocol (MCP) to empower AI-assisted interactions with massive KGs.

[1] http://dx.doi.org/10.3233/SW-223117

[2] http://dx.doi.org/10.5281/zenodo.10977163

(3) AI-assisted research data annotation in biomedical consortia

Felix Engel, Gita Benadi, Claudia Giuliani, Harald Binder, Klaus Kaier

Organization(s): Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center – University of Freiburg, Germany

Annotation of research data is a key element of Open Science and has gained additional value as training input for artificial intelligence. However, developing metadata schemas poses a series of challenges, including optimisation and securing both complete coverage and constant completeness and quality. We employ large language models (LLMs) to address some of these challenges while keeping researchers in the loop to ensure reliability of annotations.

Our research data management group currently supports seven biomedical research consortia. We develop customised metadata schemas together with consortium members, drawing on established controlled vocabularies (Engel et al. 2025). Schemas are implemented on the fredato research data platform developed at the IMBI (Watter et al. 2023). Schemas are documented and published as knowledge graphs adhering to the Resource Description Framework (RDF), relating metadata to research processes as modelled by commonly used ontologies.

LLMs are employed to develop initial schema drafts from related research literature and to predict dataset annotations from scientific papers (Giuliani et al. 2025). The models have proved to perform well with these tasks, supporting researchers with improving metadata coverage in their consortia.

References

Engel, F. et al. (2025). Development of Metadata Schemas For Collaborative Research Centers. FreiData. https://doi.org/10.60493/K1XE3-NPC10

Giuliani, C. et al. (2025). Identifying biomedical entities for datasets in scientific articles – A 4-step cache-augmented generation approach using GPT-4o and PubTator 3.0. medRxiv. https://doi.org/10.1101/2025.03.04.25323310 Watter, M. et al. (2023). Standardized metadata collection in a research data management tool to strengthen collaboration in Collaborative Research Centers. E-Science-Tage, Heidelberg. https://doi.org/10.11588/HEIDOK.00033131

(4) A Human-Centered Open Source Platform for Image and Text Processing in Humanities Research

Ari Häyrinen

Organization(s): University of Jyväskylä

We present an open source application designed to support non-technical users in the fields of humanities and social sciences by enabling intuitive processing of visual and textual data. Functioning as an ETL-like pipeline tool, the platform integrates multiple OCR engines and supports the use of large language models (both open source and proprietary) to analyze and enrich image and text datasets. Built with simplicity and openness at its core, the tool fosters accessible, transparent, and reproducible research workflows.

This contribution addresses the intersection of AI and Open Science by (1) lowering the technical barrier for deploying LLMs in scholarly analysis, (2) supporting the use of open source models and tools, and (3) encouraging data sharing and methodological transparency. By prioritizing usability and interoperability, the application contributes to training and capacity-building efforts in digital humanities and enhances the responsible adoption of AI technologies within open research infrastructures.

The poster will demonstrate real-world use cases, showcase how the platform enables explainable AI workflows for humanistic inquiry, and invite collaboration for extending its modular design.

Github: https://github.com/OSC-JYU/MessyDesk

(5) Enhancing Discovery, Policy, and Practice: A Hands-On Demo of the European Open Science Resources Registry

Tereza Szybisty, Natalia Manola, Stefania Martziou, Antonis Lempesis

Organization(s): OpenAIRE AMKE

The EOSC Open Science Observatory is a policy intelligence platform that tracks and visualizes the progress of Open Science across Europe. It offers stakeholders—from policymakers to practitioners—access to trustworthy data on Open Science policies, practices, and impacts.

This demo will offer a hands-on exploration of the EOSC Open Science Observatory dashboard, guiding participants through its latest data visualization features. Participants will learn to navigate the platform’s new features that visualize key OS dimensions—from Open Access to FAIR data, infrastructure, and skills.

A highlight of the session will be the European Open Science Resources Registry, an integrated component of the EOSC Open Science Observatory. This AI-enhanced registry curates essential documents—including policies, strategies, and best practices—and supports advanced search and classification using Natural Language Processing (NLP) and Machine Learning (ML).

Participants will discover how these technologies automate document retrieval, extract metadata, classify content, and produce summaries—enhancing discoverability and understanding of Open Science efforts across Europe.

After this demo, participants will:

learn to explore and interpret Open Science indicators using the EOSC OS Observatory
understand how the European Open Science Resource Registry streamlines access to key policy resources
get involved in the discussion how AI tools support smarter, more connected Open Science governance

This demo bridges technology and transparency, showcasing how AI and open infrastructures can reinforce the core principles of Open Science.

(6) Bridging Open Science and large language models: Enhancing research accuracy through Knowledge graphs

Nicolaus Wilder, Marie Alavi, Julia Priess-Buchheit

Organization(s): Kiel University

Open Science (OS) emphasises reproducibility, factual accuracy and originality to promote responsible conduct of research, share reliable data, minimise resource waste, and foster innovation.

In contrast, large language models (LLMs) process vast amounts of (non-) scientific data by probabilistic modeling, prioritising quantity over reliability of data.

As LLMs are increasingly used in research, a critical question arises: How can the two different logics (1)—efficiency through openness and efficiency through volume—coexist and be utilised responsibly in research? We expand this question and argue for knowledge-augmented systems to enhance LLMs’ accuracy and reliability, proposing a combination with an OS knowledge graph (KG).

The poster visualises a triangular relationship among OS, KG, and LLM, highlighting two workflows.

(A) OS—KG—LLM: OS resources serve as the knowledge base structured within a KG ontology (2), connected upstream of an LLM, providing a constantly updated reference that guides the LLM’s data selection through specific nodes (entities) and branches (relationships), mitigating hallucinations and delivering contextually relevant content aligned with the prompt.

(B) OS—LLM—KG: OS resources are automatically prepared with the help of an LLM, which identifies nodes and relationships, extracts relevant data from the OS pool, and transforms publications into KG-compliant data. Thus, LLMs can assist in creating an ontology represented in a KG based on the principles, data, and results of OS.

References:

(1) https://doi.org/10.5281/zenodo.11562117

(2) https://doi.org/10.48550/arXiv.2406.08223

(7) Open Data and Open LVLMs: How to Explore Scientific Collections Differently

Iris Vogel (1), Florian Schneider (2), Narges Baba Ahmadi (2), Niloufar Baba Ahmadi (2), Chris Biemann (2), Martin Semmann (2)

Organization(s): 1: Center for Sustainable Research Data Management, University of Hamburg; 2: Hub of Computing and Data Science, University of Hamburg

We exemplify how multimodal agentic chatbots based on Large-Vision-Language-Models (LVLMs) can be leveraged as a tool for interactive and engaging exploration of scientific collections from diverse disciplines. Thereby, we demonstrate a potential beyond digital showcases. Specifically, with our tool we aim to make interesting scientific artifacts, often hidden behind complex search interfaces, more accessible and engaging, for non-experts and the general public.

In bringing together the vast amount of openly available scientific collections with open source LVLMs, we showcase how open science leverages hidden potentials in academic institutions. Our contribution focuses on the perspective of scientific collection of data as data, which not only serves the research community but also the interested public. By using more explorative layers for data linkage and retrieval through application of state-of-the-art Artificial Intelligence technology, we open up new and more accessible entry points for the data and its underlying relations

Search via conventional portals requires some expertise in the field. Enhancing it with an interactive chat interface, which can answer questions about the scientific collection portal in general, their collections, and single objects within, opens up an intuitive approach for all kinds of users of the collection portal.

https://fundus-murag.ltdemos.informatik.uni-hamburg.de

(8) An open challenge to develop automated assessment of research finding8

Timothy Errington

Organization(s): Center for Open Science

Assessing the validity and trustworthiness of research claims is a central, ongoing, and labor-intensive part of the scientific process. Confidence assessment strategies range from expert judgment to aggregating existing evidence to systematic replication efforts, requiring substantial time and effort. What if we could create automated methods that achieve similar accuracy in a few seconds? The Predicting Replicability Challenge is a public competition to advance automated assessment of research claims. The challenge invites teams to develop algorithmic approaches that predict the likelihood of research claims being successfully replicated. Participants will have access to training data drawn from the Framework for Open and Reproducible Research Training (FORRT) database that documents 3,000+ replication effects. New research claims will then be used to test the algorithmic approaches’ ability to predict replication outcomes. The first set of held out social-behavioral claims will be shared with participating teams in August 2025. Teams will have one month to submit confidence scores, which will be evaluated in October 2025 with prizes awarded for the top-performing teams. A tentative second round will be held in October 2025, and the final round in February 2026. The initiative encourages innovation and interdisciplinary collaboration, including partnerships between AI/ML experts and domain specialists in social-behavioral sciences. The challenge is open to participants from academic and non-academic spaces worldwide.

(9) Improving Research Data findability with FAIR Signposting: implementation insights from KonsortSWD Data Centers
Janete Saldanha Bach, Brigitte Mathiak, Yudong Zhang, Peter Mutschke

Organization(s): GESIS – Leibniz-Institut für Sozialwissenschaften

The FAIR Principles are open to interpretation, resulting in varying assessments of compliance by different FAIR assessment tools, such as F-UJI. FAIR Signposting plays a critical role in standardizing these assessments, reducing inconsistencies, and enabling a more consistent interpretation across platforms. This is especially relevant in a pilot project by the KonsortSWD consortium of the German National Research Data Infrastructure (NFDI), which addresses challenges in evaluating and improving the FAIRness—particularly the findability—of research data.

The project implements the FAIR Signposting standard, a set of machine-readable, HTTP-based link relations that standardize metadata exposure and support automated FAIR assessments. It uses relation types (e.g., cite-as, described by, license, author, item, and collection) embedded in HTML headers, HTTP responses, or standalone linkset documents to guide automated agents such as search engines to metadata, persistent identifiers (PIDs), and related resources.

The two-part strategy included the deployment of a prototype at GESIS – Leibniz Institute for the Social Sciences and partner data centers such as LIfBi, DIPF, DIW/SOEP, and DZHW; and the creation of a best practices document based on implementation experiences. The application of FAIR Signposting led to significant improvements in FAIRness scores (e.g., GESIS: 43% to 79%).

This project demonstrates that embedding standard link relations enhances metadata interoperability, discoverability, and machine-readability. Tools like F-UJI were used to measure these improvements. The contribution offers practical guidance, implementation examples, and FAIR Signposting validation tools to support broader adoption across research data centers.

(10) Beyond the Numbers: Using AI to measure the impact of Open Science through data reuse

Agata Morka (1), Parth Sarin (2), Tim Vines (2), Iain Hrynaszkiewicz (1)

Organization(s): 1: PLOS; 2: DataSeer

Can language models help us capture how open science practices translate into real-world impact? PLOS and DataSeer collaborated on a project exploring how AI can address a critical gap in Open Science metrics: measuring data reuse. While data sharing is a widely encouraged Open Science practice, the reuse of shared data—one of the most meaningful impacts of Open Science—remains difficult to detect and quantify. Our project developed an AI-based methodology to tackle this challenge, combined with a consultation process with key scholarly communications stakeholders.

Following the community consultation, we refined definitions and criteria for identifying data reuse and fine-tuned a large language model (Llama-3.1-8B-Instruct) using proximal policy optimization (PPO) to classify instances of data reuse. The model was trained on a curated set of 421 annotated research articles and applied to a broader corpus of 4,328 PLOS publications. Results showed that 47% of the articles demonstrated data reuse, although persistent identifiers (e.g., DOIs, accession numbers) were only detected in a small fraction (176 cases). Importantly, the model’s reasoning summaries were adapted with RLHF and iterative feedback from researchers, editors, and policymakers to produce interpretable insights about how data were reused.

This work highlights that current bibliometric methods likely underestimate data reuse and demonstrates that AI can offer scalable, nuanced assessments of Open Science’s impacts. By shifting the focus from practices (such as data sharing) to measurable impacts (such as data reuse), our new indicator offers a more meaningful way to evaluate and incentivize Open Science practices.

Day 2

(1) The power of open polyglot plain text tooling for reproducible AI research

Alva Seltmann (1,2), Christian Eggeling (1,2)

Organization(s): 1: Institute for Applied Optics and Biophysics, Friedrich Schiller University Jena, Jena, Germany; 2: Leibniz Institute of Photonic Technology, Jena, Germany

Methods reproducibility is the ability to record and implement all experimental and computational procedures of the experiment with the same data and tools, obtaining the same results. It is a crucial step for applied Artificial Intelligence (AI) research in an Open Science context. While AI-specific reproducibility tools exist, the typical researcher employs AI in a larger scientific context. Here, we explore synergies of existing plain-text-based, polyglot tooling to capture this whole context. First, literate programming using Org-mode allows executing, capturing and annotating all research steps. It includes research hypotheses, dataset creation, study protocols, the training and evaluation process, and the model application with analysis code and reproducible figures and papers. We detail Org-mode’s polyglot capabilities of combining arbitrary programming languages and concepts like tangling or transclusion, which integrate source and artifact files in the literate document. Second, the Nix package manager allows declaring the full polyglot environment programmatically, making it reproducible on arbitrary machines. Third, recording every change with DataLad, built on Git and git-annex, provides provenance of all results and entry points to reproduce experiments. This Open Science approach to AI experiments integrates computational reproducibility with context readability for machines and humans.

(2) When Measures Become Targets: Lessons from Open Science and Machine Learning on the Fragility of Reform

Moritz Herrmann

Organization(s): Munich Center for Machine Learning, LMU Munich

Building on the position paper “Why We Must Rethink Empirical Research in Machine Learning” [1], which examines methodological and epistemic challenges of machine learning (ML) as an empirical science, we highlight parallels between the replication crisis in the applied sciences and current issues in ML/AI research. Both have long been subject to epistemic critiques: warnings about weak inferential foundations, misapplied statistical reasoning, and overreliance on performance metrics. Yet these concerns are routinely overlooked in practice, sidelined by institutional incentives and publication pressures. Moreover, because research communities function as social systems, measures of scientific quality—such as prediction performance or statistical significance—can themselves become targets, prone to corruption pressures. This dynamic also affects core practices promoted by the Open Science movement: code and data sharing, computational reproducibility, and preregistration are promising tools for improving research integrity, but as alternative measures of scientific quality they are not immune to the same social and systemic pressures. Without critical reflection, they risk being treated as shallow technical fixes rather than as components of a deeper epistemic shift. By drawing these connections, we advocate for a more epistemologically grounded approach to Open Science—within AI in particular, and across scientific practice more broadly.

[1] Herrmann, M., Lange, F. J. D., Eggensperger, K., Casalicchio, G., Wever, M., Feurer, M., Rügamer, D., Hüllermeier, E., Boulesteix, A.-L., & Bischl, B. (2024). Position: Why We Must Rethink Empirical Research in Machine Learning. Proceedings of the 41st International Conference on Machine Learning (ICML 2024), Proceedings of Machine Learning Research, 235, 18228–18247. https://proceedings.mlr.press/v235/herrmann24b.html

(3) Small-scale Domain-specific Web Crawling for Complementing Established LLM Data Sources

Thomas Eckart (1), Frank Binder (2), Erik Körner (1), Christopher Schröder (2,3), Felix Helfer (1)

Organization(s): 1: Saxon Academy of Sciences and Humanities in Leipzig; 2: Institute for Applied Informatics (InfAI); 3: Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI) Dresden/Leipzig

The multilingual Leipzig Corpora Collection (LCC) [1] operates its own crawling infrastructure since 2003. Despite the availability of established datasets for Large Language Model (LLM) training, we aim to continue this small-scale infrastructure as it guarantees complete control over source selection and thematic focus and still provides a significant gain in LLM training material over established resources.

For German, we compare LCC’s annual news and web crawls against the German subset of the popular OSCAR corpus of the same time [2]. Using identical content extraction and sentence-level deduplication on both resources and considering their levels of content density, we establish solid estimates of their degree of complementarity.

Accordingly, the 396-billion-raw-token German LCC crawl from 2023 yields at least 35.9 billion tokens of cleaned and deduplicated document-level text data, with less than 15% overlap with the German OSCAR subset and less than 28% overlap with the LCC data from 2021-2022. Thus, for 2023, our sovereign crawling infrastructure and content extraction gained over

22.1 billion complementary tokens in high-quality document text, to be freely used in research contexts.

Within the scope of the UrhBiMaG [3], we use this data to investigate effects of data quality and training schemes on the quality and performance of neural language models of different sizes and architectures in the ongoing CORAL project [4].

[1] https://wortschatz-leipzig.de

[2] https://oscar-project.org/

[3] https://www.bmj.de/SharedDocs/Gesetzgebungsverfahren/DE/2020_Gesetz_Anpassung-Urheberrecht-dig-Binnenmarkt.html

[4] https://coral-nlp.github.io/

(4) Enhancing Open Cosmology with Emulator Packaging

Ian Harrison, Hidde Jense

Organization(s): Cardiff University

In the field of cosmological astrophysics there is growing adoption of AI emulators to speed up numerical calculations necessary for inferring properties of the Universe, such as the behaviours of dark energy, dark matter and the theory of gravity. In our work we have leveraged OS principles to enhance this use in research, by creating a standard framework for the testing, training, accuracy validation, use and – importantly – sharing of these emulators. This involved the creation of a standard “packaging” of the necessary files and metadata, as well as extending existing popular frameworks to make use of this packaging. This standardisation and packaging improves reproducibility and reduces wasted effort and resources due to duplication of otherwise un-reusable emulators, which are often created at great computational expense on HPC clusters. It also reduces barriers to entry for new members of the community in providing ready-made tools which can be used with confidence on a laptop, allowing more diverse sets of analyses of new and interesting cosmological models.

(5) The Intelligence Behind the OpenAIRE Graph: Linking Science with AI

Stefania Amodeo (1), Paolo Manghi (1,2), Natalia Manola (1)

Organization(s): 1: OpenAIRE AMKE; 2: ISTI-CNR

The OpenAIRE Graph stands at the forefront of research infrastructure innovation, combining cutting-edge AI techniques with Open Science principles to process and analyze 400M+ research records monthly, including 290M+ publications, 82M+ datasets, and 1M+ software entries. More than a metadata aggregator, the OpenAIRE Graph fuses diverse sources into a richly linked, machine-actionable research ecosystem, powered by an advanced AI-driven analytical workflow that elevates data quality, connectivity, and usability through:

automated metadata enrichment of persistent identifiers (e.g. ORCID, ROR), Fields of Science classifications, Open Access status, licensing terms, and semantic types using Natural Language Processing;
entity recognition and disambiguation using ML models to connect authors, institutions, projects, and funders across heterogeneous sources;
knowledge graph embeddings and similarity scoring to detect and link conceptually related research artefacts, enabling cross-disciplinary exploration and contextualization;
relationship extraction and network mapping, to uncover latent connections among research outputs, such as citations, co-authorships, and funding dependencies.

These mechanisms are continuously refined using feedback loops, benchmarking datasets, and community input, ensuring the Graph remains a trusted foundation for Open Science monitoring, research assessment, and discovery.

Our demonstration will show how these AI capabilities operationalize the FAIR principles, support evidence-based policymaking, and streamline research workflows. This session will offer practical insights for those exploring AI-enhanced infrastructures for scholarly communication and assessment.

(6) Powering Open Science Across Borders: A Live Demonstration of the EOSC EU Node

Maja Dolinar (1), Natalia Manola (1), Spiros Athanasiou (2)

Organization(s): 1: OpenAIRE AMKE; 2: Athena Research Center

This demonstration introduces the EOSC EU Node—a European-level operational node of the European Open Science Cloud Federation—showcasing its role in accelerating transparent, reproducible scientific practices across disciplines and borders. The EOSC EU Node serves as a federated infrastructure for researchers, enabling access to FAIR-compliant data, computational tools, and collaborative workspaces in line with Open Science values.

In this live demo, we will explore how users can log in using institutional credentials, search and access open datasets through the EOSC Resource Hub, launch compute-intensive workflows via interactive notebooks or use any of the other available services and tools—entirely within a secure, GDPR-compliant environment. Special focus will be given to AI-supported services that facilitate data annotation, automated metadata generation, and reproducible pipelines, all aligned with the EOSC Interoperability Framework and the FAIR principles.

We will illustrate how the EOSC EU Node lowers barriers for smaller institutions, citizen science projects, and cross-disciplinary teams by integrating open tools into reusable research workflows. Attendees will engage with real-time interfaces, receive guidance on service onboarding, and explore how their organizations can benefit from and contribute to the EOSC ecosystem.

This session invites researchers, infrastructure providers, and policy actors to reflect on how federated infrastructures and responsible AI can converge to support a more equitable and innovative Open Science landscape. Join us to experience the EOSC EU Node in action and see how your research community can take part in building the future of science.

(7) Initiatives and networks- Irelands National research landscape

Ruth Patricia Moran (1), Jose Ignacio Flores (2)

Organization(s): 1: Atlantic Technological University; 2: UNIR, Madrid

Ireland’s focus on areas like AI ethics, Open science and research integrity has grown collaboratively in the last number of years.

The National Research integrity forum (NRIF) established in 2015 is a collaborative organisation that promotes good practices in areas like ethical use of AI, the responsible use of AI in areas like publications and drives the research integrity agenda in Ireland.

Ireland’s research eco-system also established the National Open research forum (NORF) to drive open research practices in Ireland.

National Academic integrity network (NAIN) established in 2019 to provide training and education around how to avoid any breaches of academic integrity such as data fabrication, text plagiarism using AI algorithms.

Ethics, research integrity and open science and the evolving change landscape of AI place a large demand on researchers, students and educators. NRIF, NORF and NAIN encourage researchers in Ireland to use AI tools honestly, in a transparent manner. As AI evolves, continuous education of the ethical use of AI policies and practices need to be continuously updated to reflect the ever-changing landscape.

References:

ALLEA – The European Code of Conduct for Research Integrity- The European Code of Conduct for Research Integrity – ALLEA
National Academic Integrity Network (NAIN) (2023) Generative Artificial Intelligence: Guidelines for Educators. Quality and Qualifications Ireland (QQI). NAIN Generative AI Guidelines for Educators 2023.pdf (accessed: 05 May 2025)
National Open research forum (accessed: 5th May 2025) NORF – Digital Repository of Ireland
National Policy Statement on Ensuring Research Integrity in Ireland

(8) OS Policies – Vanguard of a Cultural Shift or Institutional Window Dressing?

Verena Weimer (1), Tamara Heck (1), Florian Papilion (1), Tim Höffler (2), Kerstin Hoenig (3), Guido Scherp (4)

Organization(s): 1: DIPF | Leibniz Institute for Research and Information in Education; 2: Leibniz-Instituts für die Pädagogik der Naturwissenschaften (IPN); 3: German Institute for Adult Education – Leibniz Centre for Lifelong Learning; 4: ZBW – Leibniz Information Centre for Economics

Open Science (OS) has emerged as a normative ideal in research, yet the institutional uptake is highly uneven. Nosek [1] places policy at the top of his pyramid for achieving cultural change toward OS, characterizing it with the imperative to “make it required”. However, the impact of policies lies in the detail of the actual translation of OS practices. Developing institutional policies and determining concrete commitments and measures is a complex endeavour, framed by disciplinary contexts and their underlying practices.

In the project IvOS, we raise the question: Do Open Science policies truly drive cultural transformation in research, or do they serve primarily symbolic functions under the guise of compliance? This study addresses this tension by empirically investigating the content and perceived impact of institutional OS policies within a highly interdisciplinary and heterogeneous research network.

The database consists of OS policy documents (n=92) of Leibniz Institutions. These are coded applying qualitative content analysis (results of a pre-study are available [2]). The analysis focus lies on the OS dimensions (Open Access, Open Data, OER, etc.) [3], the practices addressed and their binding character [4], the consideration of discipline-specific concepts and inclusion [5], the implementation into institutional strategies and good scientific practice [6] as well as the monitoring to measure the policy impact [7].

The results provide information on the function of OS policies, whether they have the potential to act as vanguard of a cultural shift and how well they translate the principles to concrete research practices.

References

[1] Nosek, B. (2019). Strategy for Culture Change. Center for Open Science. URL: https://www.cos.io/blog/strategy-for-culture-change

[2] Weimer, V., Heck, T., Scherp, G., Hoenig, K., & Höffler, T. (2024). Open Science Policy Documents of the Leibniz Institutions. Open Science Festival 2024, Mainz. Zenodo. https://doi.org/10.5281/zenodo.13862317

[3] Leibniz Association (2022). Leibniz Open Science Policy. URL: https://www.leibniz-gemeinschaft.de/fileadmin/user_upload/Bilder_und_Downloads/Forschung/Open_Science/Open_Science_Policy.pdf

[4] SPARC. (2018). An Analysis of Open Data and Open Science Policies in Europe. https://sparceurope.org/download/3674

[5] Chtena, N., Alperin, J. P., Morales, E., Fleerackers, A., Dorsch, I., Pinfield, S., & Simard,M.-A. (2023). The neglect of equity and inclusion in open science policies of Europe and the Americas. In SciELO Preprints. https://doi.org/10.1590/SciELOPreprints.7366

[6] Barcelona Declaration on Open Research Information, Kramer, B., Neylon, C., & Waltman, L. (2024). Barcelona Declaration on Open Research Information (1.0). Zenodo. https://doi.org/10.5281/zenodo.10958522 [7] European Commission: Directorate-General for Research and Innovation, Wouters, P., Ràfols, I., Oancea, A., Kamerlin, S. C. L. et al., Indicator frameworks for fostering open knowledge practices in science and scholarship, Publications Office of the European Union, 2019, https://data.europa.eu/doi/10.2777/445286

(9) Reproducibility the New Frontier in AI Governance

Israel Mason-Williams (1,2), Gabryel Mason-Williams (3)

Organization(s): 1: Kings College London; 2: Imperial College London; 3: Queen Mary University of London

Policymakers for AI are responsible for delivering effective governance mechanisms that can provide oversight into safety concerns. However, the information environment offered to policymakers is characterized by an unnecessarily low signal-to-noise ratio, favouring regulatory capture and creating deep uncertainty and divides on which risks should be prioritized from a governance perspective. We posit that the current speed of publication in AI, combined with the lack of strong scientific standards via weak reproducibility protocols, effectively erodes the power of policymakers to enact meaningful policy and governance protocols. Our paper outlines how AI research could adopt stricter reproducibility guidelines to assist governance endeavours and improve consensus on the risk landscapes posed by AI. We evaluate the forthcoming reproducibility crisis within AI research through the lens of reproducibility crises in other scientific domains and provide a commentary on how adopting preregistration, increased statistical power and negative result publication reproducibility protocols can enable effective AI governance. While we maintain that AI governance must be reactive due to AI’s significant societal implications, we argue that policymakers and governments must consider reproducibility protocols as a core tool in the governance arsenal and demand higher standards for AI research.

(10) Enhancing transparency and reusability through Diamond publishing model: Transformations, A DARIAH Journal

Françoise Gouzi (1), Francesco Gelati (2), Anne Baillot (1), Toma Tasovac (1)

Organization(s): 1: Digital Research Infrastructure for the Arts and Humanities (DARIAH); 2: University of Hamburg

Researchers often face financial issues or legal constraints when it comes to publishing their research output, may it be an article, a dataset or a piece of code.

In 2024, DARIAH (Digital Research Infrastructure for the Arts and Humanities) launched the Diamond open access, overlay journal Transformations: A DARIAH Journal (https://transformations.episciences.org/) in order to support Social Sciences and Humanities (SSH) scholars to better embrace Open Science practices. This new serial provides a trusted, non-commercial platform for documenting innovative research activities in the SSH. Submission types include data papers; workflows; pieces of software or code; training materials, as well as traditional scholarly papers. Such a focus on ephemeral or experimental outputs not only offers scholars more options to share efficiently and FAIRly their work; it also aims to promote (and give a greater visibility to) human labour-intensive, data-driven, genuine research compared to traditional scholarly papers.

At a time when it gets harder and harder to identify machine (and especially artificial intelligence)-enhanced forgeries and AI-powered bad scientific practices, a Diamond open access journal offering high openness and transparency (in both editorial and management processes) may be a safe space for good research.