Posters
iNaturalist is a program that encourages people to document the biodiversity around them using a smartphone. With 230 million observations, iNaturalist is one of the biggest sources of biodiversity data in the world. In February 2025, the Natural History Museum of Los Angeles County held a workshop to teach community scientists and community organizers how to analyze iNaturalist data using R. Many iNaturalist projects encourage community scientists to collect data but leave data analysis to “real” scientists. Our goal was to teach community scientists some basic skills so that they could look for answers to their own questions.
We believe it is important to teach community scientists that their voices and questions matter. We drew ideas from the software, open science, and open data worlds to develop the workshop. During the first class, we covered how to download iNaturalist CSV data, create maps, and create charts using R. During the second class, each attendee presented their analysis. This presentation will cover what we learned from teaching the workshop.
This dissertation explores the social dynamics of Open Source Software (OSS) communities leading up to forking events.
It employs statistical modeling of longitudinal collaboration graphs to analyze community evolution. The research aims to identify key factors influencing forking, including measures of influence, conflict indicators, and early warning signs of community changes.
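To make the object of study concrete, here is a minimal, purely illustrative sketch (with hypothetical event data, not the dissertation's actual method) of a longitudinal collaboration graph: one snapshot per time window, with a simple centrality measure as a crude influence proxy.

```python
# Illustrative only: build per-month collaboration graph snapshots and compute
# degree centrality as a rough proxy for contributor influence over time.
import networkx as nx

# Hypothetical records of joint activity: (contributor, contributor, month)
events = [
    ("ana", "bo", "2024-01"),
    ("bo", "cy", "2024-01"),
    ("ana", "cy", "2024-02"),
    ("cy", "dee", "2024-02"),
]

snapshots: dict[str, nx.Graph] = {}
for a, b, month in events:
    snapshots.setdefault(month, nx.Graph()).add_edge(a, b)

for month in sorted(snapshots):
    print(month, nx.degree_centrality(snapshots[month]))
```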
To address the lack of software development and engineering training for intermediate and advanced developers of research software, we present the NSF-sponsored INnovative Training Enabled by a Research Software Engineering Community of Trainers (INTERSECT) project, which delivers software development and engineering training to this audience. INTERSECT has three main goals:
1. Develop an open-source modular training framework conducive to community contribution
2. Deliver RSE-led research software engineering training targeting research software developers
3. Grow and deepen the connections within the national community of Research Software Engineers
The majority of INTERSECT’s funded focus is on activities surrounding the development and delivery of higher-level specialized research software engineering training.
We have conducted two INTERSECT-sponsored Research Software Engineering Bootcamps (https://intersect-training.org/bootcamp23/ and https://intersect-training.org/bootcamp24/) at Princeton University. Each bootcamp included ~35 participants from a broad range of US-based institutions representing a diverse set of research domains. The 4.5-day bootcamp consisted of a series of stand-alone hands-on training modules. We designed the modules to be related, but not to rely on successful completion or understanding of previous modules. The primary goal of this design was to allow others to use the modules as needed (either as instructors or as self-guided learners) without having to participate in the entire bootcamp.
The topics covered in the bootcamp modules were: Software Design, Packaging and Distribution, Working Collaboratively, Collaborative Git, Issue Tracking, Making Good Pull Requests, Documentation, Project Management, Licensing, Code Review & Pair Programming, Software Testing, and Continuous Integration/Continuous Deployment.
We are organizing a third INTERSECT bootcamp in July 2025. We expect to again have approximately 35 attendees from a wide range of institutions covering a diverse set of research areas. Because the format and content of the first two bootcamps were well received, we plan to follow a very similar format for the third.
We were recently notified that our renewal proposal to fund the INTERSECT bootcamps was awarded. We will therefore host four additional annual summer bootcamps in 2026-2029.
In this poster we will provide an overview of the INTERSECT project and more details on the content of the bootcamp. We will discuss outcomes of both editions of the bootcamp, including curriculum specifics, lessons learned, participant survey results, and long-term objectives. We will also describe how people can get involved as contributors or participants.
Software developers face increasing complexity in computational models, computer architectures, and emerging workflows. In this environment, Research Software Engineers need to continually improve software practices and constantly hone their craft. To address this need, the Better Scientific Software (BSSw) Fellowship Program launched in 2018 to seed a community of like-minded individuals interested in improving all aspects of the work of software development. To this end, the BSSw Fellowship Program fosters and promotes practices, processes, and tools to improve developer productivity and software sustainability.
Our community of BSSw Fellowship alums serves as leaders, mentors, and consultants, thereby increasing the visibility of all those involved in research software production and sustainability in the pursuit of discovery. This poster presents the BSSw Fellowship (BSSwF) Program, highlighting our successes in developing a community around software development practices and providing information about applying for the upcoming 2026 awards. As many in the BSSwF community identify as RSEs, and BSSwF projects are of particular relevance, this poster will inform the community about the fellowship, build up best practices for productivity and sustainability, and amplify the connections between research software engineers.
The Research Software Alliance (ReSA) has established a Task Force dedicated to translating the FAIR Principles for Research Software (FAIR4RS Principles) into practical, actionable guidelines. Existing field-specific actionable guidelines, such as the FAIR Biomedical Research Software (FAIR-BioRS) guidelines, lack cross-discipline community input. The Actionable Guidelines for FAIR Research Software Task Force, formed in December 2024, brings together a diverse team of researchers and research software developers to address this gap.
The Task Force began by analyzing the FAIR4RS Principles and identified six key requirement categories: Identifiers, Metadata for software publication and discovery, Standards for inputs/outputs, Qualified references, Metadata for software reuse, and License. To address these requirements, six sub-groups are conducting literature reviews and community outreach to define actionable practices for each category. Challenges include identifying suitable identifiers, archival repositories, metadata standards, and best practices across research domains. This poster provides an overview of the Task Force, presents its current progress, and outlines opportunities for community involvement. Given the progressive adoption of the FAIR4RS Principles, including by funders, we expect this poster will provide attendees at USRSE’25 with an understanding of the FAIR4RS Principles and how they can make their software FAIR through actionable, easy-to-follow, and easy-to-implement guidelines being established by our Task Force.
Member states of the Treaty on the Non-Proliferation of Nuclear Weapons that are not listed as nuclear weapons states are subject to international safeguards to ensure that no nuclear material is diverted and no facilities are misused with the aim of building nuclear weapons. To that end, the International Atomic Energy Agency (IAEA) and other safeguards authorities use technical measures such as seals, closed-circuit television (CCTV) cameras, radiation detectors, and laser scanners in civil nuclear facilities. During nuclear waste management, safeguards are applied in interim storage facilities and deep geological repositories to spent nuclear fuel and other nuclear waste forms, as well as to the casks and containers holding this material. These monitoring systems have grown in complexity over time, produce large amounts of data, and have become increasingly interconnected and automated. At the same time, digital twin concepts are gaining popularity in industry contexts while enabling technologies, e.g.,
high-performance computing and machine learning, become more easily available. This poster explores the topic of digital twins for safeguards in nuclear waste management by presenting models and software modules. At the core of our approach lies a monitoring system model implemented via a PostgreSQL database that incorporates data traces obtained from inspection data and facility operators’ declarations. To support interaction with the model, we provide a Python API that enables manipulation and tracking of the model state over time from both an operator and an inspectorate perspective. Also presented here are the project’s continuous integration tests, the automated deployment of its documentation, and its graphical user interface (GUI), implemented as a plotly-powered Dash app. Beyond modeling, data storage, and visualization, the presented software can simulate different physical aspects such as neutron and gamma radiation as well as light detection and ranging (LiDAR), and it offers the possibility to generate synthetic data for compliance and diversion scenarios. The use of this synthetic data for training machine learning algorithms for experimental design optimization and anomaly detection is discussed. Finally, the poster will outline the envisioned scaling of this prototype software into a larger digital twin framework capable of processing and analyzing real measurement data alongside the synthetic data. The presented work aims to support and facilitate remote monitoring, the development of new safeguards techniques, and education and training, while aspiring to incorporate good practices of research software engineering.
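As a purely illustrative sketch (class and method names are hypothetical, not the project's actual API), a Python interface over such a PostgreSQL-backed monitoring model might expose the two perspectives like this:

```python
# Hypothetical sketch of an operator/inspectorate API over a monitoring model;
# a real implementation would persist events to PostgreSQL rather than a list.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Declaration:
    cask_id: str
    location: str
    timestamp: datetime

class MonitoringModel:
    def __init__(self, dsn: str) -> None:
        self.dsn = dsn  # e.g. "postgresql://user@host/safeguards" (placeholder)
        self._events: list[Declaration] = []

    def declare_movement(self, cask_id: str, location: str) -> None:
        """Operator perspective: record a declared cask movement."""
        self._events.append(
            Declaration(cask_id, location, datetime.now(timezone.utc))
        )

    def state_at(self, when: datetime) -> dict[str, str]:
        """Inspectorate perspective: reconstruct cask locations at a given time."""
        state: dict[str, str] = {}
        for event in self._events:
            if event.timestamp <= when:
                state[event.cask_id] = event.location
        return state
```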
The Sage project operates a national-scale distributed cyberinfrastructure of AI-enabled edge computing platforms connected to a wide range of sensors. AI algorithms running on Sage can analyze images, audio, LiDAR, and meteorological data to understand ecosystems and urban activity and to detect wildfires. With over 33 million images collected, there is a critical need for tools that enable fast, meaningful exploration of massive visual datasets. Traditional metadata-based methods (e.g., tags or timestamps) fall short for domain-specific retrieval at this scale. We present a modular, end-to-end image search system designed for distributed sensor networks.
The workflow integrates automated captioning (Gemma-3), semantic embeddings (DFN5B-CLIP), keyword indexing (BM25), and reranking (ms-marco-MiniLM-L6-v2), all backed by a scalable Weaviate vector database. Images and their generated captions are embedded into a unified semantic space, enabling hybrid search that combines semantic similarity and keyword relevance when users submit natural language queries. This architecture supports both real-time monitoring and historical analysis, scaling efficiently with large datasets and user demand. Although demonstrated on Sage, the system is model-agnostic and infrastructure-flexible, making it applicable to other distributed sensor networks. It empowers researchers in fields such as ecology and atmospheric science to search image collections by content, not just metadata, enhancing access to sensor data for scientific analysis and decision-making.
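As a sketch of the retrieval step (the collection name, properties, and query are hypothetical, and the captioning and embedding models are assumed to be configured server-side), a hybrid query through the Weaviate v4 Python client might look like:

```python
# Hybrid search mixes vector similarity with BM25 keyword relevance;
# alpha=1.0 is purely semantic, alpha=0.0 purely keyword-based.
import weaviate

client = weaviate.connect_to_local()
try:
    images = client.collections.get("SageImage")  # hypothetical collection
    results = images.query.hybrid(
        query="smoke plume rising over a ridgeline",
        alpha=0.6,
        limit=10,
    )
    for obj in results.objects:
        print(obj.properties.get("caption"))
finally:
    client.close()
```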
The Carpentries is a community building global capacity in essential data and computational skills for conducting efficient, open, and reproducible research. In addition to certified Instructors teaching Data Carpentry, Library Carpentry, and Software Carpentry workshops around the world, the community also includes many people contributing to and maintaining Open Source lessons. Recent years have seen enormous growth in the number and diversity of lessons the community is creating, including many teaching skills and concepts essential to Research Software Engineering: software packaging and publication, environment and workflow management, containerised computing, etc.
As this curriculum community has developed, demand has been growing for training opportunities that teach how to design and develop Open Source curricula effectively and in collaboration with others. Launched in 2023, The Carpentries Collaborative Lesson Development Training teaches good practices in lesson design and development, and open source collaboration skills, using The Carpentries Workbench, an Open Source infrastructure for building accessible lesson websites. As the discipline of Research Software Engineering continues to develop and mature, there is an increasing need for high-quality, Open Source, community-maintained training, and for the expertise to develop those resources. This poster will provide an overview of the training, explain how it meets this need, and describe how it fits into The Carpentries ecosystem for lesson development. It will also explain how RSEs can enroll in the training, and give examples of lesson projects that have already benefited from it.
Research laboratories require a professional and up-to-date website to showcase their work, attract collaborators, and establish a strong online presence. However, developing and maintaining such a website can be challenging and time-consuming. We present Lab Website Template, an open-source website template specifically designed for the unique needs of labs, allowing lab members to focus more on research activities and less on web development.
A critical feature of our template is its ability to automatically generate citations (titles, authors, publishing info, etc.) from simple identifiers like DOIs, PubMed IDs, ORCIDs, and thousands of other types using the open-source tool Manubot. Automated pipelines update citations when site content changes, on a periodic schedule, or on demand. The user chooses which sources to pull from and how to filter and display the resulting citations on their site.
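For a sense of what that automation does under the hood, here is a minimal sketch (assuming the manubot package is installed; the DOI is just an example) that resolves an identifier to CSL JSON citation metadata via the Manubot command line:

```python
# Resolve a persistent identifier to CSL JSON metadata with the manubot CLI.
import json
import subprocess

result = subprocess.run(
    ["manubot", "cite", "doi:10.1098/rsif.2017.0387"],  # any supported identifier
    capture_output=True,
    text=True,
    check=True,
)
csl_items = json.loads(result.stdout)
print(csl_items[0]["title"])
```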
To reduce administrative burden and to remain accessible to non-technical users, we automate everything we can, not just citations. This includes: first time setup of the repository, generating live previews of pull requests, configuring custom URLs, and more. To the same end, though the full codebase is exposed to the user, it is structured around the most common use cases and separates user content from internal logic. This allows users to concentrate on the content of their site without wading through implementation details, but still enables advanced customization if needed.
A key strength is our singular commitment to user support, demonstrated by consistently low issue response times and a high resolution rate. We foster an environment where user input is highly valued and carefully considered, balanced against maintaining an un-bloated design. Recognizing that source code and comments alone are insufficient, we also maintain a searchable, comprehensive, dedicated documentation suite so users can find answers immediately, before needing to ask for help.
The success of our design and approach is evidenced by our template's widespread adoption, with 450+ stars on GitHub, 350+ forks, and a growing gallery of dozens of active labs using it. This work reflects not just a powerful software tool, but an evolving, open, community resource that effectively serves the needs of its users.
The proposed poster will describe the history, process, and future work of the DHTech Code Review Working Group.
The working group has implemented a community code review process, in which volunteers from across organizations worldwide review code submitted by digital humanities projects. To date, the group has facilitated nine code reviews. The poster will present how reviews are conducted and what lessons have been learned.
A new wave of programming aids powered by large language models (LLMs) offers new capabilities for producing code, and tools ranging from ChatGPT to GitHub Copilot are currently being used by scientists across disciplines [1]. Because scientific programmers are often under-trained in software development [2], there is considerable excitement about how these tools may enhance their programming abilities and productivity [3]. However, a confluence of evidence from human-computer interaction studies as well as the literature on research software development suggests there is a major possibility for LLM-based tools to inject scientifically invalidating code errors without detection. The incidence of this would be almost impossible to measure in the near term owing to sparse code sharing and post-publication review practices, and the effect on scientific research quality could be substantial.
This poster lays out arguments for why scientific programmers are at heightened risk for LLM-injected code errors: that (1) this population possesses risk factors for tool overreliance identified by a range of user studies [4-5], (2) LLM-generated code tends to contain more subtle errors than human-written code, as it will often execute without throwing runtime errors [6], and (3) historically, scientific programmers have often used informal code quality control measures, if any [7-8].
Beyond raising an argument for why these issues should be taken seriously as a matter of scientific integrity, this poster provides a setting for collaborative brainstorming about the unique leverage points and opportunities Research Software Engineers may have to proactively lead in upholding code quality in the coming years.
In 2025, DHTech, a community of people doing technical work in the digital humanities, ran a survey to better understand who is developing code in the digital humanities.
The survey contained questions about the technologies DH RSEs use, their training, and their career paths. We propose a poster that presents the results of that survey.
This poster describes the work presented in a published paper [1], which reports the results of a survey of research software developers. The focus of the study was to understand how research software developers design test cases, handle output challenges, use metrics, execute tests, and select tools. Specifically, the study explores the following research questions:
- RQ1: What are the characteristics of the research software testing process?
- RQ2: What challenges do developers face throughout the testing process of research software?
- RQ3: How do research software developers design test inputs?
- RQ4: What specific challenges do research software developers face when determining the expected output of test cases?
- RQ5: What metrics do research software developers use to measure software quality and the quality of tests?
- RQ6: How do research software developers execute their tests?
- RQ7: What testing tools do research software developers use and what are their limitations?
- RQ8: What features should a testing tool specifically developed for research software contain?
- RQ9: Are demographic characteristics of research software projects related to the testing process?
The poster will contain graphs reporting the detailed results related to these questions. Our overall findings show that research software testing practices vary widely. The primary challenges faced by research software developers include test case design, evaluating test quality, and evaluating the correctness of test outputs. Our findings highlight the need to allocate more resources to software testing and to provide more education and training on software testing to research software developers.
Packaging Python code, especially for scientific applications, has historically been hard, with poor documentation, many obscure fragile hacks into a rigid system, and challenges distributing the result to users. All that has dramatically improved over the last few years with a series of PEPs (Python Enhancement Proposals) creating a unified standard to build on, innovative packages built on these standards, and extensive documentation around best practices. We will look at how the Scientific Python Development Guide [1], along with its template and repo-review [2] tooling, teaches modern scientific Python software development. This is part of an effort to document recommendations for Python development with several key tools, like uv, ruff, cibuildwheel [3], and scikit-build-core [4].
The Scientific Python Development Guide started as a guide in the Scikit-HEP organization, written to facilitate the growing number of packages with a growing number of maintainers. Over time, a cookiecutter-based template was added, followed by a tool to evaluate a repository’s configuration against the guide. At the first Scientific Python Developer Summit in 2023, the guide was moved to the Scientific-Python organization, the tools were reworked and generalized, automation was expanded, parts of the NSLS-II guidelines were merged in, and the three separate components were combined into one repo. Since then, the guide has remained one of the most up-to-date resources available, with new packaging changes like Trusted Publishing and SPDX license identifiers being added days and sometimes even hours after they are announced. We will look at some of the recommendations, especially newer ones.
The cookiecutter template, now part of the guide as well, provides a fast start for users who have read the guide. It supports both the cookiecutter and copier programs, and can be used with “update” tooling (cruft or copier). It supports around 10 build backends, including the most popular pure Python and compiled extension backends. You can select classic or VCS versioning, and can pick from several licenses. We will see how easy it is to start a project with all tooling in place.
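For example, a minimal sketch (assuming the copier package is installed; the destination directory name is arbitrary) of instantiating the template from Python:

```python
# Instantiate the Scientific Python template; copier will still prompt for any
# template questions that have no default.
import copier

copier.run_copy(
    "gh:scientific-python/cookie",  # the guide's template repository
    "my-pkg",                       # destination directory
    defaults=True,                  # accept template defaults where possible
)
```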
Repo-review is a framework for writing checks for configuration; it is written in Python and can be run using WebAssembly. The sp-repo-review plugin contains all the checks described in the guide, with cross-links that take a user to the badge in the guide for each check. Some projects, like Astropy, have added this to their CI, though it can also be used entirely from the guide’s website.
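Running the checks locally is a one-liner; a sketch, assuming sp-repo-review is installed with its CLI extra (which provides the repo-review command):

```python
# Run the guide's checks against the current repository; prints a pass/fail
# table with cross-links into the guide.
import subprocess

subprocess.run(["repo-review", "."], check=False)
```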
A dedicated section on binary packaging will include scikit-build-core, the native build backend that can adapt CMake projects into Python wheels. When combined with pybind11 [5], you can write working compiled extensions with just three files and a small handful of lines. Combined with cibuildwheel, the wheel builder tool used throughout the Python ecosystem, and a good CI service like GitHub Actions, building redistributable wheels for all major platforms, and even WebAssembly, iOS, and Android, is easy. All the required code to get started can easily be shown on a poster.
References
1. The Scientific Python Development Guide, Henry Schreiner, et al., June 2023. Retrieved 2025-07-17. https://learn.scientific-python.org/development
2. repo-review, Henry Schreiner, et al., June 2023. https://repo-review.readthedocs.io
3. cibuildwheel, Joe Rickerby, Henry Schreiner, et al., 2016. https://cibuildwheel.pypa.io
4. scikit-build-core, Henry Schreiner, et al., 2022. https://scikit-build-core.readthedocs.io
5. pybind11, Wenzel Jakob, Henry Schreiner, et al., 2017. https://pybind11.readthedocs.io
Science thrives when researchers and software engineers share their expertise. As scholarship across disciplines becomes more dependent on computational methods, effective collaboration between Research Software Engineers (RSEs) and Software Engineering Researchers (SERs) is essential to drive innovation, optimization, productivity, reproducibility, stewardship, and solutions for interdisciplinary problems [1, 2, 3, 4]. This poster presents ten strategic guidelines to foster productive partnerships between these two distinct yet complementary communities. We very recently published these rules in a longer paper [5]; this poster summarizes the key elements to foster and encourage future RSE-SER collaborations.
RSEs are deeply embedded in research contexts, often balancing software development with domain-specific knowledge. They focus on creating software to meet evolving research needs, including flexibility and experimental workflows. SERs are trained to develop robust, scalable, and maintainable systems emphasizing engineering principles, often aligning with industry best practices. Career paths, incentives, and constraints all differ between these communities. Thus, achieving collaboration between SERs and RSEs requires intentional effort and the application of change theories [1, 6]. To build synergistic relationships between SERs and RSEs that enhance the quality and impact of research outcomes, we recommend these ten principles:
1. Recognize The Two Communities Are Different: RSEs and SERs must appreciate one another's unique roles and cultures, celebrate their strengths, and avoid assumptions.
2. Acknowledge Collaboration Is Not Going to Just Happen: Partnerships must be deliberate; proactive initiation, inquiry into mutual goals, and consistent investment are needed.
3. Define Clear Goals and Outcomes: Open discussion, measurable and well-documented objectives, and regular check-ins ensure progress, though flexibility is needed too.
4. SERs Must Engage with RSEs in Their Professional Environments: SERs should appreciate RSEs' institutional obligations and resource constraints. SERs can gain these insights and build trust by attending RSE conferences, workshops, and talks.
5. Identify the Intersection of Shared Research Software Challenges: Collaborations must address common challenges that are practically significant and academically valuable.
6. Ensure Mutual Benefit in Collaboration: Both parties must gain immediate and long-term value. Incentive schemes must be understood. Authorship and leadership responsibilities should be explicitly addressed.
7. Maintain an Open Mind Toward Emerging Challenges: Collaborators should be adaptable and ensure continuous dialogue to identify new challenges and solutions.
8. Actively Advocate for Each Other: RSEs and SERs should showcase one another's work to demonstrate respect and the value of collaboration.
9. Maintain Vigilance and Recognize When Collaborations Are Off Course: To sustain a win-win relationship, SERs and RSEs must regularly review progress and concerns.
10. Secure Institutional Support: All parties should seek funding, advocate for institutional recognition of RSE and SER roles, and promote frameworks that support collaboration (e.g., joint appointments and recognition programs).
Recognizing the distinct cultures, priorities, and workflows of RSEs and SERs is fundamental to building productive partnerships and high-quality research software.
The Oak Ridge National Laboratory, in partnership with NOAA, manages the Gaea supercomputer, an HPE Cray EX 3000 machine with an aggregate peak performance of 20 petaflops, dedicated to earth and climate science research. The majority of the workloads involve running models like AM4 [1], ESM [2], and MOM [3]. These are developed with strict reproducibility requirements so that new developments in the models do not change expected results. These models are tightly integrated with the Intel compiler suite and the Cray system software, such that even small changes to those libraries could result in variations in the results of the models. The development cycle of these models needs to ensure that system and library upgrades do not change the expected results. The Gaea supercomputer undergoes mandatory upgrades when old software goes out of support. The developers use a testing and development system (TDS) of the same architecture as Gaea with the updated environment to upgrade the models to run correctly on the new software stack before the main system upgrade. Nevertheless, developers and users sometimes need continued access to the old software stack, so that they have ample time for testing or can continue running the old models without changes in results. To address this, we make use of the HPE Cray Programming Environment (CPE) containers [4], which HPE has released beginning with CPE 23.05. The HPE Cray 23.03 programming environment on Gaea, which comprised cray-mpich 8.1.25, libfabric 1.12.1, libsci 23.02.1.1, and Intel 2022.2.1 (providing the icc 2021.7.1 compiler), was scheduled to be removed and replaced with an upgraded CPE.
Since there was no 23.03 container, we had to rely on a 23.09 CPE container and significantly modify it to produce a container with the 23.03 programming environment. Gaea uses Apptainer as its runtime, but the HPE containers, which are built with Docker and include environment modules [5], do not initialize those modules correctly when started with Apptainer, because the initialization scripts do not run as they do under Docker to set the environment variables. We solve this by starting the container with Podman (which is equivalent to Docker), copying the properly initialized environment variables into a file, then creating and running the Apptainer container with these saved environment variables loaded at startup to set up the modules. Alongside this, we remove the 23.09-specific software from the CPE 23.09 container and install the 23.03-specific software. We also install the required old Intel compiler version, since the CPE containers do not distribute it. To verify the result, we used a reproducer running a development version of the GFDL Atmospheric Model (AM5), based on AM4 [1]: a binary and supporting files built in the older software environment that would not run in the updated environment. We run them within the 23.03 container and verify that the binaries work in the container environment, comparing checksums of the zonal wind components calculated across data spread over MPI processes against the checksum produced by an older run predating the system upgrade. Previous papers [6,7,8,9] have covered container performance for high performance computing workloads and found it close to native, so we do not reiterate that here.
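A simplified sketch of that environment-capture workflow (image names and file paths are hypothetical, and error handling is omitted):

```python
# 1) Let Podman run the image's Docker-style initialization and dump the
#    resulting environment; 2) replay it under Apptainer via --env-file so the
#    environment modules behave as they do under Docker/Podman.
import subprocess

PODMAN_IMAGE = "localhost/cpe:23.03"  # modified CPE image in Podman (placeholder)
APPTAINER_SIF = "cpe-23.03.sif"       # same container converted for Apptainer

env_text = subprocess.run(
    ["podman", "run", "--rm", PODMAN_IMAGE, "bash", "-lc", "env"],
    capture_output=True, text=True, check=True,
).stdout
with open("cpe-env.txt", "w") as f:
    f.write(env_text)

subprocess.run(
    ["apptainer", "exec", "--env-file", "cpe-env.txt", APPTAINER_SIF, "bash"],
    check=True,
)
```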
The Interdependent Networked Community Resilience Modeling Environment (IN-CORE) is a state-of-the-art computational environment developed by the NIST Center of Excellence for Risk-Based Community Resilience Planning to advance community resilience measurement science and provide a decision support system. IN-CORE integrates various models, including physics-based modeling of infrastructure, network models, data-based socio-economic models, and resilience metrics, to help communities prepare for, mitigate, and rapidly recover from hazard events. IN-CORE spans over 28 GitHub repositories managing various components of the system, including Helm charts used for deployment in a Kubernetes cluster, several Python libraries for modeling, visualization, and community contribution, as well as the Java-based IN-CORE web services. Each software component has continuous build and integration through a series of GitHub Actions that help automate the software development process, from unit testing and linting to automatic GitHub releases to GitOps for software deployment. While all projects can benefit from this automation, it is particularly beneficial on a large-scale project with many components to help manage the complexity of the entire process, help with quality control, and ensure smooth delivery of releases and deployments. On the modeling side, the IN-CORE development team at NCSA has developed a process for managing the intake of new models from researchers that depends on where the researcher is in their development process.
For the case where a researcher has code that is ready, they can submit it for review by a research software engineer (RSE). An RSE is assigned to work with the researcher to review their code and finish the implementation in IN-CORE. If a researcher is at an earlier stage of development, they can submit for consultation so that an RSE can help them with library selection, requirements gathering, and other implementation details to get their code ready for inclusion in IN-CORE. Regardless of readiness, this often involves an iterative process of working with the researcher and communicating with the research team through online meetings, Slack, and email. In some cases, there may be scientific questions that need to be resolved, and a separate Science Committee (SciCom) is involved to help resolve any issues before the implementation can continue. The software development life cycle is more complex on larger projects with many moving parts. Having good processes in place can help manage this complexity.
Motorcyclists are often unfairly blamed in insurance claims after crashes. Investigations can take a long time, and a lack of evidence or the influence of stereotypes can lead to incorrect judgments.
Our mobile app helps motorcyclists capture accurate 3D reconstructions of crash scenes using their phones. These 3D models can be shown to insurance adjusters to provide a clearer picture of what actually happened. The goal is to make the claims process more fair by offering better visual evidence and reducing bias against riders.
CODA [1] is a data analysis and machine learning pipeline originally developed by the Kiemen Lab at Johns Hopkins University and now enhanced through collaboration with the Johns Hopkins Data Science & AI Institute (JH-DSAI). CODA uses classical image analysis to reconstruct 3D volumes of biological tissue samples at cellular resolution from sets of serially-sectioned multimodal microscopy images. It employs machine learning-based segmentation algorithms to identify the locations, extents, and relationships of cellular and tissue microstructures, facilitating novel spatial and volumetric analyses of disease pathology in situ.
Analyses of tissue volumes generated using the CODA framework have advanced understanding of tumorigenesis in pancreatic cancer [1, 2, 3, 4], and integration of spatial genomics data has provided insights into the genetic variety of precancerous lesions observed in the pancreas [5]. Our poster provides technical detail on the classical and machine learning algorithms used in the next-generation CODA pipeline, and gives an overview of the software development and data engineering practices that are central to the continued support of professional-quality software. The next-generation CODA pipeline is one of the first software products developed through collaboration with the JH-DSAI. It is a landmark example of how the Institute can provide scientific domain experts with innovative, ready-to-run tools that impact research, along with the technical expertise needed to deploy those tools accessibly, continue their development, and manage their lifecycles so that new collaborators and communities can benefit from the work of individual groups.
Large language models (LLMs) with billions of parameters present substantial challenges in terms of memory, computation, and energy requirements. Their scale often renders them impractical for deployment on resource-constrained devices such as personal computers, laptops, and mobile phones. These limitations significantly hinder their usability in real-time, on-device applications, such as home security systems requiring immediate threat detection.
While training a large language model is a resource-intensive, one-time process, the operational costs of inference (the repeated running of the model to serve users) also accumulate rapidly. Industry data indicates inference makes up the majority of machine learning workloads; for example, Amazon Web Services and Nvidia estimate that 80-90% of overall ML demand is devoted to inference rather than training. As LLM usage scales into millions or billions of queries, the ongoing inference energy consumption can ultimately surpass the energy cost of initial training, especially when factoring in growth in users and applications. Inference workloads not only drive up electricity usage and data center cooling demands, but also carry significant carbon and water footprints, particularly when powered by fossil-fuel-based energy sources. This ongoing operational burden is a critical factor in the overall environmental and economic impact of LLM deployment.
To address these issues, model compression techniques such as pruning, quantization, distillation, and factorization have been proposed. These methods aim to reduce model size and computational complexity without significant loss in performance. Compressed models not only reduce storage and cloud computing costs but also improve inference speed and energy efficiency. Importantly, they expand accessibility by enabling advanced AI capabilities on low-power and edge devices. These advancements are essential for building sustainable, scalable, and widely deployable AI systems.
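As a minimal illustration of one such technique, the sketch below applies post-training dynamic quantization with PyTorch; the model here is a small stand-in, not an LLM.

```python
# Dynamic quantization replaces Linear layer weights with int8, cutting memory
# roughly 4x versus float32 and typically speeding up CPU inference.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
)
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)  # Linear layers are now dynamically quantized variants
```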
Simulating the Earth's climate is an important and complex problem, and Earth system models are correspondingly complex, comprising millions of lines of code. To appropriately utilize the latest advancements in computational and Earth system science while running on modern hybrid computing architectures, and to improve model performance, precision, accuracy, or all three, it is important to ensure that model simulations are repeatable and robust. This introduces the need to establish statistical, non-bit-for-bit reproducibility, since bit-for-bit reproducibility may not always be achievable. Here, we propose a short-simulation, ensemble-based test for an atmosphere model to evaluate the null hypothesis that modified model simulation results are statistically equivalent to those of the original model, and implement this test in the US Department of Energy's Energy Exascale Earth System Model (E3SM; Golaz et al., 2022). The test evaluates a standard set of output variables across the two simulation ensembles and uses a false discovery rate correction (Benjamini and Hochberg, 1995) to account for multiple simultaneous testing.
The false positive rates of the test are examined using re-sampling techniques on large simulation ensembles and are found to be lower than the currently implemented bootstrapping-based testing approach in E3SM (Mahajan et al. 2019). We also evaluate the statistical power of the test using perturbed simulation ensemble suites, each with a progressively larger magnitude of change to a tuning parameter. The new test is generally found to exhibit greater statistical power than the current approach, being able to detect smaller changes in parameter values with higher confidence.
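A sketch of the multiple-testing step described above (the p-values are illustrative; in the real test each one would come from comparing a single output variable across the two ensembles):

```python
# Benjamini-Hochberg false discovery rate correction over per-variable tests.
import numpy as np
from statsmodels.stats.multitest import multipletests

pvals = np.array([0.001, 0.040, 0.200, 0.030, 0.760])  # one per output variable
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(reject)      # which variables show a statistically significant difference
print(p_adjusted)  # FDR-adjusted p-values
```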
The field of computational paleontology is rapidly advancing, with many recently developed open-source R packages leading the charge for more standardized, reproducible, and open research. This push is a relief for many data-science-minded paleontologists who have previously toiled over writing their own scripts to download, clean, analyze, and visualize their data. Many of these steps are now covered by functions in these new packages (and those of other packages in the R universe). However, this push for more script-based research may throw a wrench into the existing scientific workflows of less technical researchers who lack a background in coding, and may steepen the learning curve for new researchers entering the field. Therefore, bridging the gap between visual, hands-on workflows and digital, code-based workflows is imperative to the collaborative future of computational paleontology. Here I present a new open-source Shiny app, paleopal (https://github.com/willgearty/paleopal), that provides a user-friendly interface for building paleontological data science workflows without any programming knowledge.
The app connects existing paleontological R packages such as palaeoverse [1] and deeptime [2] with the tidyverse [3] suite of R packages to encourage standardized scientific pipelines. Specifically, paleopal presents users with a curated set of workflow “steps” (e.g., data upload, data cleaning, and data visualization) that they can choose from, customize, and reorder to develop their pipeline. The app is built on top of the shinypal [4] package, which uses the shinymeta [5] R package to provide a live code and results panel and a downloadable RMarkdown script as the user develops their pipeline. To increase accessibility, I have hosted the Shiny app as a serverless application on GitHub Pages (http://williamgearty.com/paleopal/) using the shinylive [6] R package and the webR framework. To my knowledge, this is the first use of shinymeta within a webR project, which has presented many technological hurdles to overcome, including dealing with browser filesystems, restricted access to operating system software, and cross-browser support. Nonetheless, paleopal aims to spearhead the next generation of training of computational paleontologists, regardless of age, background, or technical expertise. Additionally, the extensible nature of paleopal makes it easy to add further curated workflow “steps”, and the underlying shinypal package could also be used to create similar Shiny apps for other scientific fields.
Generative AI tools are rapidly impacting software development practices, including the development of graphical user interfaces (GUIs). These coding and design tools offer an attractive promise of accelerating the pace of development while reducing onerous or mundane engineering tasks. Generative AI gives Research Software Engineers (RSEs) an opportunity to design and implement the GUI of a scientific web application rapidly, even when they may not be trained in web development. Generative AI tools such as Cursor [1], Builder.io [2], Lovable [3], and others take a prompt (e.g., “I need a scientific data dashboard to monitor experiments and their data outputs”) and rapidly produce an attractive, seemingly useful interactive design that looks like it can be connected to a developer’s pre-existing application. However, the question is whether this approach can truly help produce usable and meaningful GUIs.
RSEs often do not have substantial training in user experience design and usability evaluation work. It is essential that RSEs leveraging AI tools to create UIs have resources that can provide them with confidence in the quality of the resulting user experience. The STRUDEL project [4] is dedicated to building open source scientific user experience (UX) resources, which incorporate expert UX researcher and designer knowledge that RSEs and domain scientists can easily leverage. Currently the project team is investigating the effectiveness of various AI tools to generate deployable prototypes for scientific research web applications.
Upon initial exploration, we find that while AI tools can generate useful ideas and prototypes, they often require significant refinement, revision, and expertise to produce usable and maintainable UIs for scientific software. To help improve the usability of the prototypes generated through AI tools, we experimented with using our STRUDEL design templates as starting points for the AI tools to expand upon. Our experiments show the promise of using carefully researched templates as a type of guardrail for the AI tools. These guardrails can help ensure that the UI product adheres to certain design guidelines without the need to explicitly provide those details in a prompt. We posit that the use of human-designed seed templates as starting points can be a useful method for improving the overall usability of AI-generated GUIs. Using a seed template is a much-needed way to incorporate user experience expertise in the loop while leveraging AI tools to generate UI flows and patterns.
Our initial work explores the implications of AI tool generation on the usability, development, and long-term maintenance of GUIs, but it is clear this will be an important consideration for the US-RSE community. With this poster we aim to foster discussion among the US-RSE community about the grand promises and likely perils of generative AI tools for building usable scientific web applications.
Large Language Models (LLMs) offer new possibilities for augmenting scholarly document analysis, including automating the extraction and verification of software mentions in scientific papers. While LLMs achieve high recall in identifying software mentions, we observe significant inconsistency across runs in classifying those mentions as cited or uncited. This variability raises critical questions about the reproducibility and trustworthiness of LLM-driven bibliographic tools, especially for tasks with high semantic precision like software citation tracking.
In this poster, we will present SoftciteAuditor, a tool we built to extract software mentions and their use from research papers, where the software may or may not be properly cited. The tool employs an LLM of the user's choice to analyze a paper and suggest missed software citations, with the aim of ensuring proper attribution and highlighting the importance of software in research publications.
The growing importance of research software heightens concerns about research software security, which will only intensify if not proactively addressed. Before any specific measures or interventions can be suggested, it is essential to understand the RSE community’s security behaviors, competencies, and values, collectively referred to as their ‘security culture’ [1]. While studying the climate and culture within a group of people is not a new concept or research topic, to our knowledge, no security culture research has taken place within the RSE community.
In this study, we aim to characterize the security culture of the RSE community by replicating prior work performed in the open-source software space [3]. To broaden our sample, we distributed this survey to RSE community members in both the US and Germany. By replicating an existing survey, we can compare the RSE community’s responses with those of the open-source community, which shares some characteristics with RSE [4-5]. In addition to the original survey, we added a series of vignettes to gauge the RSE community’s knowledge and perception of threat modeling, a standard “shift-left” approach to security. By doing so, we gauge RSE interest in participating in security efforts and motivate future security research in the research software domain.
Ultimately, we surveyed 104 members of the RSE community in both the US and Germany. To characterize RSE security culture, we ask the following research questions:
- RQ1: What is the security culture of the RSE community?
- RQ2: How does the RSE community's security culture compare with the Open-Source community's security culture?
- RQ3: What is the perception among RSE community members on adopting threat modeling during development?
The primary contributions of this study are: 1) A novel characterization of the RSE community’s security culture, 2) an empirical comparison of the security culture of RSEs and OSS developers, and 3) recommendations for internal and external stakeholders to improve RSE security culture. This study is a first step toward tailoring “shift-left” security principles to address the unique challenges that RSEs face.
Graphical Processing Units (GPUs) are now integrated into the world's fastest supercomputers to solve the most computationally difficult, mechanistic scientific problems in physics, chemistry, biology, material sciences, energy, earth and space sciences, national security, data analytics, and optimization. An 8-year, $1.8 billion effort called the Exascale Computing Project, involving 2,800 scientists and engineers, recently finished modernizing the underlying scientific numerical software to efficiently use this new generation of machines at more than a billion billion floating-point calculations per second, referred to as exaflops. While NSF ACCESS and the computing facilities themselves directly grant U.S. institutional and industry researchers access to these GPU-accelerated machines at no cost, research software engineers and scientists must first show that their computations scale to efficiently use multiple GPUs across many computer servers/nodes. Therefore, it is imperative to teach the data parallel software development skills, calibration techniques, and sustainable software practices now required to enable large scale, GPU-powered computational research. This poster discusses the unique challenges of semester-long training and course infrastructure to cross-train domain scientists as research software engineers.
The course is architected around solving a real-world research problem with a faculty mentor as a capstone project and is tailored to undergraduate students, graduate students, faculty, and staff. Undergraduate students and staff members are paired with a faculty mentor, ideally before the course or within the first two weeks of the course before the drop date; this helps scope the capstone project and group learners in complementary research areas together. This course does not strongly focus on AI tools (there are plenty of courses covering those) but instead focuses on high performance computing (HPC) tools and HPC numerical libraries that can be instrumented for hardware performance analysis. The capstone project is intended to support summer NSF research experiences for undergraduates (REU), collaborations between faculty and core-facility staff, and preliminary grant work of faculty, postdoctoral trainees, and graduate students; this intent is similar to introductory grant writing classes.
The course is organized into three phases. The first phase is a crash course covering efficient within-node, multi-node, and data parallel GPU programming; this provides the initial foundations for learners to begin writing their capstone project code. These topics are covered using C and C++ because C more easily allows inspecting vectorized instructions and C++ is central to vendor-agnostic GPU and numerical HPC libraries. The second phase covers basic knowledge and skills in distributed methods for parameter fitting, performance tuning, automated unit testing and performance testing, and stress-free peer code review, along with writing various types of documentation. The third phase covers advanced topics with less immediate applicability to the capstone project, deeper dives into topics of interest to the cohort, and progress reports and final presentations of individual capstone projects.
As this course is still being developed, invaluable feedback from the USRSE community about their teaching, learning, and pedagogical experiences would help make it an enjoyable experience for future learners. The course assumes experience with at least one programming language; therefore, assigned reading material with a short, low-stakes quiz before each class would help level the knowledge and increase the confidence of learners ahead of classroom instruction. Providing classroom laptops or portable PCs to deliver instruction is being investigated, because they would allow group activities such as connecting machines to form a cluster, letting students watch local resource usage, understand latencies and topologies, and gain memorable, active learning experiences complementing terminal work on remote HPC systems. It would be of great interest to learn from USRSE instructors about creating engaging opportunities for classroom learning with such portable GPU machines. Lastly, the University of Pittsburgh libraries support developing open educational resources, on which USRSE attendees can weigh in with gradable, pedagogically useful exercises often missing from documentation and training resources that help teach RSE skills. Hopefully this poster discussion and feedback from the USRSE community can help make this course a success and support similar teaching and learning endeavors.
As 3D printing moves toward full autonomy, the need for a universal, scalable, and intelligent architecture becomes increasingly critical. Modern additive manufacturing faces persistent barriers: fragmented device interfaces, lack of remote operability, ad hoc parameter tuning, and insufficient integration between physical processes and data-driven control. As additive manufacturing evolves toward more complex applications such as structural color fabrication, there is a growing demand for a universal, intelligent control framework capable of integrating diverse printer hardware, automating parameter optimization, and managing experimental data at scale.
Our work presents a novel universal edge-cloud architecture that combines edge-side hardware adaptors with a cloud-based backend to support real-time control, intelligent optimization, and robust data management for autonomous 3D printing. A centralized, user-friendly web interface allows users to configure and launch print jobs (campaigns), define experimental parameters, and review historical results. The cloud backend is built with scalable microservices, including a high-performance RabbitMQ-based messaging system for edge-cloud communication, MongoDB for structured storage of experimental metadata and print records, and Clowder for managing archival files and analysis reports. Distributed image analysis and machine learning-based prediction services run in parallel to optimize print parameters. This supports autonomous closed-loop experimentation across large-scale campaigns. To ensure flexibility and hardware independence, print jobs are abstracted using a custom PCP (Parameterized Control Protocol) file format. PCP files encapsulate device-specific instructions along with structured metadata describing print sequences, geometric layout, and adjustable parameters. This abstraction enables batch-based execution and seamless orchestration across printers running different configurations. The system exchanges messages with 3D printers via a custom-developed edge adaptor deployed near the printers.
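As a sketch of the cloud-to-edge dispatch path (the queue name, broker URL, and message fields are hypothetical, not the system's actual schema), a print job referencing a PCP file might be published over RabbitMQ with the pika client like this:

```python
# Publish a persistent print-job message for an edge adaptor to consume.
import json
import pika

connection = pika.BlockingConnection(
    pika.URLParameters("amqp://guest:guest@localhost:5672/")
)
channel = connection.channel()
channel.queue_declare(queue="printer.jobs", durable=True)

job = {
    "campaign_id": "demo-001",
    "pcp_file": "color_swatch.pcp",  # device-agnostic print description
    "parameters": {"pressure_kpa": 55, "speed_mm_s": 12, "bed_temp_c": 60},
}
channel.basic_publish(
    exchange="",
    routing_key="printer.jobs",
    body=json.dumps(job),
    properties=pika.BasicProperties(delivery_mode=2),  # survive broker restart
)
connection.close()
```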
We validate the system through an application in structural color 3D printing, focusing on the relationship between color properties, represented in the HSV color space, and key printing parameters such as pressure, speed, and bed temperature. The system was deployed on the UIUC Radiant cloud cluster and tested across multiple print shapes defined by PCP files on 3D printers running Marlin firmware. To efficiently explore the parameter space and enhance predictive performance, we incorporate a Bayesian optimizer into the machine learning workflow, as sketched below. The system’s adaptability and modular design enable scalable, reproducible experimentation across a wide range of hardware, demonstrating its potential as a universal platform for autonomous 3D printing.
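The optimization loop might be sketched as follows (the objective is a smooth stand-in for the image-analysis-derived color error, and the parameter bounds are hypothetical), here using scikit-optimize's Gaussian-process minimizer:

```python
# Bayesian optimization over (pressure, speed, bed temperature).
from skopt import gp_minimize

def color_error(params):
    pressure_kpa, speed_mm_s, bed_temp_c = params
    # The real objective would print a swatch, image it, and compare measured
    # HSV color to the target; this fake response just has a known optimum.
    return (pressure_kpa - 60) ** 2 + (speed_mm_s - 10) ** 2 + (bed_temp_c - 55) ** 2

result = gp_minimize(
    color_error,
    dimensions=[(40.0, 80.0), (5.0, 20.0), (40.0, 70.0)],
    n_calls=15,
)
print("suggested parameters:", result.x)
```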
The development and use of scientific software often entails collaborative work within and across teams. Understanding how such teams collaborate in addition to other factors that influence their likelihood of success is critical given the prevalence and necessity of teamwork for scientific software [1]. In particular, we contend that understanding the current state of collaboration in scientific software projects affords the discovery of opportunities to design more effective collaborations and produce better software and scientific outcomes. To advance the study of teamwork for scientific software, we conducted a systematic literature review (SLR) with a focus on papers that analyze or otherwise describe collaborative work in the domain of scientific software [2]. The main objective of the SLR was thus to provide a foundation for future research based on key insights and gaps identified in the literature on scientific software teams. Our search was applied to three databases (Web of Science, ACM Digital Library, IEEE Xplore) with backward and forward citation search applied to papers selected for final analysis [3]. The results of the SLR indicate that teamwork in scientific software remains understudied: collaboration is repeatedly mentioned but rarely the focus of investigation. At the same time, researchers in this domain recognize the significance of teamwork and have sought to develop tools with the goal of facilitating various aspects of collaboration like communication and data sharing. The evaluation of collaboration tools and the collaboration itself is generally not reported in these publications, suggesting that such evaluations are infrequent, undervalued, and/or constrained (e.g., due to lack of time and funding). Fitting with recent calls for research [4], our results highlight the need for a concerted effort to analyze the inputs, processes, and outcomes of collaborative work in the development and use of scientific software. We propose a path forward for studies of collaboration in scientific software aimed at enhancing both teamwork and software, with emphasis on team roles, cross-disciplinary requirements, and generative AI.
References
[1] M. Heroux et al., “Basic Research Needs in The Science of Scientific Software Development and Use: Investment in Software is Investment in Science,” United States Department of Energy, Advanced Scientific Computing Research, Workshop, Aug. 2023. doi: 10.2172/1846009.
[2] B. Kitchenham and S. Charters, “Guidelines for performing systematic literature reviews in software engineering,” Keele University & University of Durham, Keele, UK, Technical Report EBSE-2007-01, 2007.
[3] C. Wohlin, “Guidelines for snowballing in systematic literature studies and a replication in software engineering,” in Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (EASE ’14), New York, NY, USA: Association for Computing Machinery, May 2014, pp. 1–10. doi: 10.1145/2601248.2601268.
[4] M. Felderer, M. Goedicke, L. Grunske, W. Hasselbring, A.-L. Lamprecht, and B. Rumpe, “Investigating Research Software Engineering: Toward RSE Research,” Commun. ACM, vol. 68, no. 2, pp. 20–23, Feb. 2025. doi: 10.1145/3685265.
Research computing centers need flexible accounting systems to track consumption of compute and storage resources under a cost model that allows researchers (or their groups) to purchase capacity for their workloads [1]. An effective system must: (1) bind funding sources to the Principal Investigators (PIs) who control them, (2) link those funds to specific resource allocations, and (3) retain a complete transaction history for auditing and reporting. Existing tools—such as Moab Accounting Management and the native accounting features in SLURM—do not fully capture the complex organizational structures, funding flows, and policy constraints typical of academic research environments. To address these gaps, the Partnership for an Advanced Computing Environment (PACE) at Georgia Tech is developing the Research Computing Accounting Management System (RCAMS), an open‑source Python platform engineered from first principles with contemporary software design best practices [Figure 1].
RCAMS features a maintainable, three-layer architecture, shown in [Figure 2]:
1. Domain Layer: Uses SQLAlchemy ORM to represent core entities (accounts, funds, resources, organizations, owners) with built‑in data integrity checks and a full audit‑logging subsystem that records every change to the database.
2. Operations Layer: Implements generic CRUD operations via a flat inheritance model centered on _OperationBase[T], where T is the domain entity [2]; see the sketch after this list. This approach avoids deep class hierarchies and simplifies maintenance.
3. CLI Front‑End: Inspired by Git’s user experience, the rcams command exposes subcommands for each entity (e.g., rcams account add-storage, and rcams fund get-spending-history), ensuring discoverability and consistency.
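As a rough illustration of the Operations Layer pattern (only _OperationBase[T] is taken from the description above; all other names are ours, not the RCAMS API), a generic base class might look like:

```python
# Hedged sketch of a flat, generic operations layer over SQLAlchemy.
from typing import Generic, Optional, Type, TypeVar

from sqlalchemy.orm import Session

T = TypeVar("T")  # a domain entity: Account, Fund, Resource, ...

class _OperationBase(Generic[T]):
    """Generic CRUD shared by all entities; each entity gets one shallow subclass."""

    def __init__(self, model: Type[T], session: Session) -> None:
        self.model = model
        self.session = session

    def add(self, **fields) -> T:
        obj = self.model(**fields)
        self.session.add(obj)
        self.session.commit()  # an audit-logging subsystem would hook in here
        return obj

    def get(self, pk) -> Optional[T]:
        return self.session.get(self.model, pk)

class AccountOperations(_OperationBase["Account"]):
    """Entity-specific helpers live here; CRUD is inherited, keeping the hierarchy flat."""
```

One shallow subclass per entity keeps each module independent, which is what makes the per-entity compartmentalization described below possible.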
To optimize performance—especially when auditing hundreds of thousands of records—RCAMS isolates each entity’s operations and CLI handlers into separate modules. This compartmentalization reduces import overhead and facilitates parallel development by multiple Research Software Engineers (RSEs) with minimal merge conflicts.
Quality is enforced through a comprehensive CI/CD pipeline in GitHub Actions: linting with Ruff [3], static-typing validation with mypy [4], over 95% test coverage via pytest [5], automated Sphinx [6] documentation deployment to GitHub Pages, and seamless issue-tracking updates with a GitHub Projects Kanban board. Together, these practices ensure that every release is reliable, well-documented, and immediately deployable.
RCAMS is well documented, with contributor guidelines that enable efficient onboarding of new team members. Its modular design increases development parallelism and greatly reduces merge conflicts; well-defined maintainer and developer roles ensure code quality; and an internal working group forms a governance unit that decides on feature changes and policy enforcement.
By combining modular design, performance-optimized architecture, and fully automated development workflows, RCAMS empowers institutions to tailor their accounting processes, enhance transparency, and streamline resource management.
Bayesian methods provide a principled framework for modeling uncertainty, yet they remain underutilized across much of research computing due to perceived mathematical or tooling complexity. This poster introduces a reproducible and accessible pathway into Bayesian modeling using Stan and Python, designed especially for researchers, data scientists, and RSEs who may be unfamiliar with probabilistic methods. The project centers on a hands-on, interactive Google Colab notebook that introduces Bayesian regression using the `cmdstanpy` interface to Stan.
The notebook includes hierarchical modeling examples, diagnostic tools (e.g., R-hat, ESS, trace plots), and comparative outputs between Bayesian and frequentist approaches. Participants can modify priors, likelihoods, and data inputs to observe how uncertainty propagates through their models. This work was originally developed as a workshop to support researchers aiming to improve the statistical rigor and interpretability of their analyses. The notebook minimizes setup overhead, requiring only a Google account and a browser, and is ideal for self-paced learning or group training settings. By showcasing the notebook structure, example visualizations, and participant feedback, this poster highlights how well-designed tooling and pedagogy can lower barriers to entry for Bayesian inference in RSE contexts. This approach supports the broader mission of the US-RSE community by promoting statistical literacy, reproducibility, and the adoption of robust modeling frameworks in everyday research workflows.
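To give a flavor of the notebook's workflow, the following minimal sketch (assuming a working CmdStan installation via `cmdstanpy`) fits a plain Bayesian linear regression rather than the notebook's exact hierarchical example:

```python
# Minimal cmdstanpy example: simulate data, fit a Bayesian regression,
# and print diagnostics (R-hat and ESS appear in the summary table).
import numpy as np
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 5);   // weakly informative priors: try changing these
  beta ~ normal(0, 5);
  y ~ normal(alpha + beta * x, sigma);
}
"""
with open("regression.stan", "w") as f:
    f.write(stan_code)

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)

model = CmdStanModel(stan_file="regression.stan")  # compiles on first use
fit = model.sample(data={"N": len(x), "x": x, "y": y})
print(fit.summary())  # posterior means, R-hat, and effective sample sizes
```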
Modern neuroscience, among other fields, produces increasing amounts of raw data, e.g. from multi-channel electrophysiology or functional imaging. These data often originate from individual experimental sessions, each resulting in one dataset that is self-contained in the sense that it includes all the information for subsequent processing. Thus, these datasets are independent except on the conceptual level imposed by the experiment. Processing in this context refers to each step in the analysis pipeline, automatic or manual, which can be performed on datasets prior to pooling or cross-referencing them. Examples include spike detection and spike sorting, or extracting downsampled local field potentials, in electrophysiology; in functional imaging, regions of interest are defined and fluorescence signals saved as simple time series. In many cases, the extracted relevant data is much smaller than what was originally recorded. It is therefore desirable to keep only the smaller, processed datasets for final analysis on a local computer, while the original raw data and intermediate datasets are stored on suitable server infrastructure. Maintaining the integrity of datasets throughout this process, including the relationships across processing phases, is crucial for the reproducibility of analysis workflows and becomes increasingly challenging as data volumes grow. This is especially important for labs that do not have massive infrastructure or people managing it, and instead rely on whatever their institution provides, which could be as basic as a network file share.
We propose a convention-driven framework that leverages well-established software tools, namely Git [1] and Git LFS [2] (Large File Storage), to solve the outlined challenges. The main idea is simple: datasets in uniquely named directories are initially added on separate Git branches, following a customizable naming scheme, to allow independent retrieval and processing of each dataset. Git's history, with a common, empty (no files) ancestor commit, allows pooling of the fully processed datasets using simple merge strategies, while maintaining a traceable record of each dataset's history. Leveraging Git LFS for storage and transfer of large files ensures binary data integrity and prevents several scenarios of accidental data loss. Furthermore, commands like “git lfs prune” make it effortless to free up storage space on local machines without having to double-check which data was already transferred successfully, avoiding many human and organizational errors. We are also creating a new command-line tool, "didg" (still under development), to facilitate the use of this framework, especially for users less familiar with Git. We aim to make didg a helpful and easy-to-use companion for neuroscience labs to manage processing flow and data integrity in a world of increasing data volumes.
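Since didg is still under development, the sketch below only hints at what such a companion might automate on top of plain Git; the branch naming scheme and the "root" label for the empty ancestor commit are illustrative conventions, not a fixed interface.

```python
# Hypothetical automation of the convention described above; not the actual
# didg interface. Assumes a repository whose empty ancestor commit is
# reachable via a ref named "root".
import subprocess

def run(*args: str) -> None:
    """Run a git command, failing loudly on errors."""
    subprocess.run(args, check=True)

def add_dataset(dataset_id: str, directory: str) -> None:
    """Put one dataset on its own branch, rooted at the common empty commit."""
    run("git", "checkout", "-b", f"dataset/{dataset_id}", "root")
    run("git", "lfs", "track", f"{directory}/**")  # large binaries via Git LFS
    run("git", "add", ".gitattributes", directory)
    run("git", "commit", "-m", f"Add dataset {dataset_id}")

def pool_datasets(branches: list[str]) -> None:
    """Merge fully processed dataset branches into one pooled analysis branch."""
    run("git", "checkout", "-b", "pooled", "root")
    for branch in branches:
        run("git", "merge", "--no-ff", branch)  # disjoint directories merge cleanly
```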
One of the main motivations for this kind of convention-driven framework is independence from additional tools (beyond Git, which is already in widespread use). This has allowed us to freely combine and leverage existing data-processing tools rather than replacing them, whether commercial software or custom-written scripts, and even to integrate manual and semi-manual steps within the same traceable analysis pipeline, all while keeping local hardware and other requirements to a minimum.
In today’s data-driven research landscape, ensuring that datasets are discoverable, reproducible, and properly attributed is essential. Assigning a Digital Object Identifier (DOI) achieves this by providing each dataset with a unique and persistent identifier, empowering researchers to cite and link data with confidence and amplifying the impact of scientific research.
The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) data repository stores diverse Earth and environmental science data generated by projects funded by the U.S. Department of Energy. To ensure transparency and enable better data citation practices, the ESS-DIVE Digital Object Identifier (DOI) management tools allow users to reserve a DOI before a dataset is publicly released and automatically keep those DOI records in line with user-submitted metadata throughout the dataset review process, upon publication, and beyond. This ensures that researchers can cite their data in papers under review, even if the corresponding datasets have not yet been made publicly accessible. ESS-DIVE’s DOI tools leverage the DOE's OSTI Elink service to manage DOI records with DataCite. We recently performed a major upgrade of these tools to integrate with the latest version of OSTI Elink 2.0, which overhauled their existing infrastructure with modern web standards. This was a major, collaborative effort at modernizing the software infrastructure of ESS-DIVE in step with OSTI. The upgrade of the ESS-DIVE DOI management tools also contributed to advancing the underlying Metacat platform, which is used by many other DataONE projects as well, thus positively impacting the larger research community.
This modernization effort demonstrates how RSEs can successfully navigate complex collaborations while maintaining service continuity. Through this process, we enhanced ESS-DIVE's capabilities while contributing to the broader research infrastructure through Metacat platform improvements, showcasing the multiplier effect of thoughtful RSE work across the scientific community.
Efficient discovery of relevant data for scientific applications is a major challenge for researchers. The challenge is particularly compounded for general-purpose data repositories that store heterogeneous datasets with diverse metadata for different data types. The Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) is a U.S. Department of Energy repository that archives Earth sciences data. The repository recommends archiving data in specific data/metadata community standards to improve data discovery and reuse. While the repository search enables discovering data based on textual metadata provided by users, there was a need to discover data based on file contents, given the vast number of files archived within the datasets.
To address this issue, we developed a fusion database (FusionDB) that enables a deeper search of the metadata within the files of each dataset. FusionDB validates datasets against community standards and indexes the actual parameters within each file, such as location coordinates, time ranges, and measurement types, making them searchable and filterable.
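As an illustration of the indexing idea (not FusionDB's actual implementation), extracting searchable parameters from a CSV file that follows a community standard might look like this; the column names here are assumptions about that standard:

```python
# Illustrative content indexing for one tabular file; the resulting record
# would be stored in the fusion database and exposed to search filters.
import pandas as pd

def index_csv(path: str) -> dict:
    df = pd.read_csv(path)
    record = {"file": path, "variables": sorted(df.columns)}
    if {"latitude", "longitude"} <= set(df.columns):  # spatial coverage
        record["bbox"] = [df["longitude"].min(), df["latitude"].min(),
                          df["longitude"].max(), df["latitude"].max()]
    if "timestamp" in df.columns:  # temporal coverage
        times = pd.to_datetime(df["timestamp"])
        record["time_range"] = [times.min().isoformat(), times.max().isoformat()]
    return record
```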
The core engineering insight was recognizing that metadata tells you what researchers think their data contains, but automated content analysis reveals what it actually contains. The results demonstrate the power of thoughtful automation: review processes have been significantly expedited, data quality improved through automated feedback, and researchers can now preview and filter datasets based on the file contents before download. This approach will create a positive feedback loop where better tooling encourages better data practices, with the potential to raise standards across the entire community.
For fellow RSEs, FusionDB demonstrates how solving one community's pain point can catalyze systemic change. By automating the tedious work, we didn't just save time, we enabled researchers to focus on science while positioning the community for improved data quality practices. The lesson: sometimes the most impactful software engineering isn't about building new capabilities, but about removing friction from existing workflows at exactly the right point in the process.
Eye-trackers have long been used within cognitive neuroscience to better understand human development, behavior, and clinical conditions. They are also increasingly being used in commercial applications aimed at accessibility and virtual/augmented reality.
Despite Python’s prominence in scientific computing generally, and neuroscience specifically, the scientific community has not coalesced around a Python library dedicated to eye-tracking research. We present our work towards integrating eye-tracking support into MNE-Python, the most popular Python library for human neurophysiological data analyses.
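A minimal sketch of what this integration enables (assuming MNE-Python ≥ 1.3 and an EyeLink .asc recording; the file name is a placeholder):

```python
# Load an eye-tracking recording and browse it with standard MNE tooling.
import mne

raw = mne.io.read_raw_eyelink("sub-01_task-reading_eyetrack.asc")
raw.pick(["eyegaze", "pupil"])  # gaze-position and pupil-size channel types
raw.plot(scalings="auto")       # inspect traces like any other MNE Raw object
```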
Open source software (OSS) for scientific applications is becoming increasingly ubiquitous. This has led to major strides and breakthroughs in research through the sharing of computational tools and code, further promoting software openness in scientific research.
However, many repositories lack strong software practices, leading to limited contributions from the community of users and issues with overall repository maintenance. These limitations typically arise because the researchers developing open source computational software are often not formally trained in good software engineering practices. Some existing open source frameworks, like RepoScaffolder, focus on teaching researchers about the OSS lifecycle as well as the required files and best practices that make a project easier to understand for both users and contributors.
While training of researchers in how to effectively create and maintain an open source repository is always important, there may also be compliance reasons to audit and address issues with repositories associated with an institution or corporation. As an example, a particular organization might want to check that all public repositories representing the organization have the same license and basic files like READMEs and CONTRIBUTING pages.
For these reasons, we propose RepoAuditor, a software tool which analyzes and audits code repositories for OSS and software engineering best practices. Teams with limited software engineering skill sets can use RepoAuditor to identify and adopt good open-source software practices for their code repositories, and they can extend or scope the tool to perform compliance checks across many organizational repositories.
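To make the idea concrete without presuming RepoAuditor's actual API, a compliance check of the kind described above reduces to something like:

```python
# Hedged illustration of an org-wide repository audit; RepoAuditor's real
# checks and interface are described on the poster, not here.
from pathlib import Path

REQUIRED = ["README.md", "LICENSE", "CONTRIBUTING.md", "CODE_OF_CONDUCT.md"]

def audit_repo(repo_path: str) -> dict:
    """Return a pass/fail map for each required community-health file."""
    root = Path(repo_path)
    return {name: (root / name).is_file() for name in REQUIRED}

for name, present in audit_repo(".").items():
    print(f"{'PASS' if present else 'FAIL'}: {name}")
```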
This poster goes into detail on the design and flexibility of the RepoAuditor framework as well as how RSEs can utilize this tool for their own projects and institutional requirements. Additionally, we will share results from user feedback studies as well as their impacts on the design and evolution of the RepoAuditor framework.
Open source software relies on the community of volunteer developers that contribute to and maintain the project. For research software, these developers are often the researchers who actively use the software for scientific work. What motivates researchers to contribute to the research software that they use? What are the barriers that researchers face when contributing to research software? As project maintainers for jsPsych, a widely-used open source library for building behavioral experiments, we wanted to understand how researchers who used our software thought about contributing to it.
jsPsych features a hyper-modular design that allows users to add new functionality as plugins, without needing to contribute to a central repository. An innovation in one lab can be easily published and shared with other labs, and yet we have found that despite a quickly growing userbase, the number of contributors is growing at a much slower pace. To understand why, we conducted qualitative interviews to uncover users’ relationships to open source contribution. We discerned common themes in contributors’ motivations, as well as barriers articulated by those who had made modifications without contributing them.
We conducted 63 interviews with active users of the software. Participants were selected from cited uses of jsPsych and ranged from university faculty to PhD candidates and postgraduate research assistants. Only some had formal training as software engineers; the majority were novice programmers who had hacked together research solutions with the framework. Though this exploratory sample could not capture the breadth of user innovations, it became an effective base from which to launch sustained engagement with users-turned-developers. Our interviews led to the following conclusions. First, while over half of the users interviewed had made modifications, less than a quarter had ever contributed to the repository ecosystem. Second, a majority of non-contributing modders had held back due to technical insecurities or unfamiliarity with code contribution standards; a little under a third were unfamiliar with contribution repositories, and an equal proportion asserted that their code was out of date and incompatible with current versions of the framework. Third, of the few contributors interviewed, the vast majority contributed in good faith and out of gratitude; only half did so out of belief specifically in open-source principles, and only one cited peer recognition as an additional motivator.
Based on these initial conclusions, we suggest that open source project leads adopt more proactive approaches to bringing fresh contributors into the fold. Documentation should be restructured to guide users toward contribution as a readily achievable goal. We likewise recommend that project maintainers open other accessible channels of communication with their communities, so that developers can showcase their contributions and demonstrate participation in open source development to their peers.
Monolithic software architectures are increasingly being replaced by microservices architectures due to their modularity, scalability, and flexibility. While monolithic systems package all components into a single deployable unit, microservices consist of independently deployable components, each responsible for a specific functionality. Despite the growing popularity of microservices, many organizations adopt architectural styles based on trends or anecdotal success stories, often without a comprehensive understanding of their benefits and trade-offs. This research aims to provide empirical data comparing the performance, scalability, and resource efficiency of monolithic and microservices architectures under varying levels of load. The evaluation utilized the Spring Petclinic application, deployed in both monolithic and microservices configurations.
A series of controlled test cases were executed on a local machine, and performance metrics were gathered using Apache JMeter. The results indicate that the monolithic architecture outperformed the microservices configuration in terms of throughput, latency, and resource utilization under most load conditions. While microservices demonstrated lower error rates at minimal load levels, they exhibited significantly higher error rates under peak load, primarily due to service failures resulting from limited CPU and memory resources. These findings suggest that microservices architectures, although promising in scalability, require more robust infrastructure and careful orchestration to perform effectively at scale. This research contributes to the ongoing discourse on software architecture by providing quantifiable insights into when and how to adopt specific architectural paradigms based on system requirements and available resources.
Electroencephalography (EEG) is a cornerstone neuroscience technique, used in humans to diagnose and localize epilepsy and, in animal models of neurological disease, to validate models and generate mechanistic understanding. A number of free and paid software packages have been developed to analyze rodent EEG recordings; however, there is no interoperability between packages. This fragmentation of analysis tools and techniques is a challenge to researchers. The ability to unify analysis pipelines will allow collaboration and comparison across laboratories and animal models, which will in turn advance research into neurological disorders. We developed PyEEG (working title), a Python-based EEG analysis package designed for rodent EEG. As a free and open-source tool, PyEEG focuses on modularity, interoperability, and scalability of EEG analysis pipelines. A generalized framework for feature calculation is implemented so that contributors may easily extend the list of features. The package utilizes the SpikeInterface [1] and MNE [2] packages for data import and export, supporting a wide variety of file formats with syntax familiar to neuroscientists.
Development of PyEEG adopts the continuous integration practice of code development, with self-contained branches tested before merging. EEG datasets can grow to several terabytes, which introduces complications with data loading, memory usage, and compute time. Our initial implementation of serial, in-memory analysis failed to scale to the whole dataset, so we addressed the problem by integrating dataset caching into the pipeline and using Dask to parallelize feature computation on a high-performance computing cluster. Intermediate computations are saved to avoid rerunning the whole pipeline due to trivial errors.
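A hedged sketch of this strategy (function names are placeholders, not PyEEG's API): each time-binned window becomes an independent delayed task, so feature extraction scales out across cluster workers.

```python
# Parallel per-window feature extraction with Dask; illustrative only.
import dask
import numpy as np
from dask.distributed import Client

def compute_features(window: np.ndarray) -> dict:
    # Stand-in for PyEEG's feature set (RMS, band power, ...).
    return {"rms": float(np.sqrt((window ** 2).mean()))}

rng = np.random.default_rng(0)
windows = [rng.normal(size=2500) for _ in range(100)]  # time-binned EEG windows

client = Client()  # in production, connect to the HPC cluster scheduler
tasks = [dask.delayed(compute_features)(w) for w in windows]
features = dask.compute(*tasks)  # intermediate results can be cached to disk
```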
The design of the package was guided directly by questions we and our collaborators want to answer with EEG analysis. This produced a tool that was immediately useful, sufficiently fast, and tested on experimental data. The heterogeneity of EEG data formats prompted us to use a modular package structure agnostic to file types. Visualizing the dataset with plotting functions was a particular early need, so that we could periodically “sanity-check” pipeline calculations with collaborators and benchmark progress.
PyEEG uses two parallel, configurable pipelines to analyze EEG traces: Window Analysis, which extracts features on time-binned windows, and Spike Analysis, which implements electrographic spike detection. Artifacts are rejected based on outlier root-mean-square (RMS) amplitude events, high RMS amplitude, high beta-band power, or high local outlier factor; thresholds were selected by comparison with biological limits and visual inspection of filtered data. Features from both pipelines feed into plotting utilities for visualization at the experiment or animal level. We validate this toolbox on mouse intracranial EEG datasets collected from several models of epilepsy. Our analysis demonstrates the value of this tool for unifying pipelines in an ever-evolving field, along with the lessons that come with its development. The code is available at https://github.com/josephdong1000/PyEEG. Code documentation is a work in progress at the time of this writing and is available at https://josephdong1000.github.io/PyEEG/.
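As one concrete example of the rejection criteria above (the threshold value is a placeholder, chosen as the text describes by comparison with biological limits and visual inspection):

```python
# Mark time-binned windows whose RMS amplitude exceeds a threshold.
import numpy as np

def reject_high_rms(windows: np.ndarray, threshold_uv: float = 400.0) -> np.ndarray:
    """windows: (n_windows, n_samples) array in microvolts; returns artifact mask."""
    rms = np.sqrt((windows ** 2).mean(axis=1))
    return rms > threshold_uv  # True = artifact, excluded from both pipelines
```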
1. Buccino AP, Hurwitz CL, Garcia S, Magland J, Siegle JH, Hurwitz R, Hennig MH. SpikeInterface, a unified framework for spike sorting. Elife. 2020 Nov 10;9:e61834. doi: 10.7554/eLife.61834. PMID: 33170122; PMCID: PMC7704107.
2. Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, Goj R, Jas M, Brooks T, Parkkonen L, Hämäläinen MS. MEG and EEG data analysis with MNE-Python. Frontiers in Neuroscience. 2013;7(267):1–13. doi: 10.3389/fnins.2013.00267.
Testing legacy codebases often poses considerable challenges, including a scarcity of tests, monolithic architecture, inadequate documentation, and a lack of standardized development practices. These obstacles typically arise from the absence of formal testing, complicating efforts to maintain and update the code for compatibility with newer hardware and software. Consequently, writing unit tests often necessitates refactoring the code, which involves restructuring it without compromising its original functionality. However, this raises a critical question: how can one refactor without having tests in place to verify functionality?
This poster will explore the incorporation of better software practices, with a particular emphasis on testing, to facilitate the modernization of legacy codebases. By adopting a structured approach to testing, we aim to enhance the reliability and maintainability of the code while preserving the current functionality and stability of software that produces science-ready data. Discussions will focus on practical strategies for implementing testing frameworks, establishing a culture of continuous integration, and ensuring that updates do not disrupt the production of critical scientific outputs. Through these efforts, we can create a more robust and adaptable codebase that meets the demands of modern scientific research.
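One standard answer to the chicken-and-egg question above (a general technique, not necessarily the exact approach on this poster) is characterization or "golden master" testing: capture the legacy code's current outputs and pin them down before any refactoring begins. A minimal pytest sketch, with legacy_pipeline and the data files as assumed stand-ins:

```python
# Characterization test: lock in current behavior so refactoring can proceed
# safely; any deviation from the recorded output fails the test.
import json

import numpy as np
from mypackage import legacy_pipeline  # hypothetical untested legacy entry point

def test_pipeline_matches_golden_master():
    result = legacy_pipeline(input_file="tests/data/known_input.fits")
    with open("tests/data/golden_output.json") as f:
        expected = json.load(f)
    np.testing.assert_allclose(result, expected["values"], rtol=1e-9)
```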
In astronomy, researchers are increasingly turning to machine learning pipelines to process ever-growing datasets. These pipelines often require GPUs and stacks of interdependent software, making them challenging to manage and maintain across research groups and operating systems. In this poster, we present an example machine learning pipeline for galaxy redshift prediction using a dual-branch convolutional neural network (CNN) that combines multi-band galaxy images with brightness measurements (magnitudes) as inputs. The model is implemented in TensorFlow and features automated experiment tracking through MLflow, which provides environment versioning, model checkpointing, training histories, visualization plots, and model metadata. We use the GalaxiesML dataset, comprising approximately 300,000 galaxies. A Jupyter notebook offers step-by-step instructions for running the pipeline on Windows and Linux platforms (Apple Metal is not supported due to TensorFlow backend limitations).
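A hedged sketch of such a dual-branch network in TensorFlow (layer sizes, stamp dimensions, and band count are illustrative, not the exact architecture presented here):

```python
# Dual-branch CNN: convolutional branch for multi-band images, dense branch
# for magnitudes, fused for redshift regression; MLflow records the run.
import mlflow
import tensorflow as tf
from tensorflow.keras import layers

image_in = tf.keras.Input(shape=(64, 64, 5), name="images")  # 5 photometric bands
x = layers.Conv2D(32, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

mag_in = tf.keras.Input(shape=(5,), name="magnitudes")  # one magnitude per band
m = layers.Dense(32, activation="relu")(mag_in)

merged = layers.concatenate([x, m])
hidden = layers.Dense(64, activation="relu")(merged)
z_out = layers.Dense(1, name="redshift")(hidden)

model = tf.keras.Model([image_in, mag_in], z_out)
model.compile(optimizer="adam", loss="mse")

mlflow.tensorflow.autolog()  # logs hyperparameters, metrics, and checkpoints
```

Feeding both inputs from a streaming data pipeline, rather than loading arrays into memory, is what makes training on datasets larger than GPU memory practical.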
The code and datasets are openly available on GitHub and designed with novice users in mind. Our example also demonstrates how to train with datasets larger than available GPU memory, addressing a common limitation in CNN tutorials. The modular design allows researchers to modify network architectures, hyperparameters, and training configurations while maintaining a complete record of results in a reproducible local environment. We will present results, visualizations, and practical guidance for deploying scalable machine learning tools in big data astronomy. The poster will also address platform-specific challenges and best practices for getting the pipeline running across different environments. Our goal is to share practical strategies for tracking experiments, model management, and reproducible workflows for research software engineers (RSEs) and domain scientists working with machine learning pipelines.