Posters
iNaturalist is a program that encourages people to document the biodiversity around them using a smartphone. With 230 million observations, iNaturalist is one of the biggest sources of biodiversity data in the world. In February 2025, the Natural History Museum of Los Angeles County held a workshop to teach community scientists and community organizers how to analyze iNaturalist data using R. Many iNaturalist projects encourage community scientists to collect data but leave data analysis to “real” scientists. Our goal was to teach community scientists some basic skills so that they could look for answers to their own questions.
We believe it is important to teach community scientists that their voices and questions matter. We drew ideas from the software, open science, and open data worlds to develop the workshop. During the first class, we covered how to download iNaturalist CSV data, create maps, and create charts using R. During the second class, each attendee presented their analysis. This presentation will cover what we learned from teaching the workshop.
This dissertation explores the social dynamics of Open Source Software (OSS) communities leading up to forking events.
It employs statistical modeling of longitudinal collaboration graphs to analyze community evolution. The research aims to identify key factors influencing forking, including measures of influence, conflict indicators, and early warning signs of community changes.
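To make the object of study concrete, here is a minimal, purely illustrative sketch (with hypothetical event data, not the dissertation's actual method) of a longitudinal collaboration graph: one snapshot per time window, with a simple centrality measure as a crude influence proxy.

```python
# Illustrative only: build per-month collaboration graph snapshots and compute
# degree centrality as a rough proxy for contributor influence over time.
import networkx as nx

# Hypothetical records of joint activity: (contributor, contributor, month)
events = [
    ("ana", "bo", "2024-01"),
    ("bo", "cy", "2024-01"),
    ("ana", "cy", "2024-02"),
    ("cy", "dee", "2024-02"),
]

snapshots: dict[str, nx.Graph] = {}
for a, b, month in events:
    snapshots.setdefault(month, nx.Graph()).add_edge(a, b)

for month in sorted(snapshots):
    print(month, nx.degree_centrality(snapshots[month]))
```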
To address the lack of software development and engineering training for intermediate and advanced developers of research software, we present the NSF-sponsored INnovative Training Enabled by a Research Software Engineering Community of Trainers (INTERSECT) project, which delivers software development and engineering training to this audience. INTERSECT has three main goals:
1. Develop an open-source modular training framework conducive to community contribution
2. Deliver RSE-led research software engineering training targeting research software developers
3. Grow and deepen the connections within the national community of Research Software Engineers
The majority of INTERSECT’s funded focus is on activities surrounding the development and delivery of higher-level specialized research software engineering training.
We have conducted two INTERSECT-sponsored Research Software Engineering Bootcamps (https://intersect-training.org/bootcamp23/ and https://intersect-training.org/bootcamp24/) at Princeton University. Each bootcamp included ~35 participants from a broad range of US-based institutions representing a diverse set of research domains. The 4.5-day bootcamp consisted of a series of stand-alone hands-on training modules. We designed the modules to be related, but not to rely on successful completion or understanding of previous modules. The primary goal of this design was to allow others to use the modules as needed (either as instructors or as self-guided learners) without having to participate in the entire bootcamp.
The topics covered in the bootcamp modules were: Software Design, Packaging and Distribution, Working Collaboratively, Collaborative Git, Issue Tracking, Making Good Pull Requests, Documentation, Project Management, Licensing, Code Review & Pair Programming, Software Testing, and Continuous Integration/Continuous Deployment.
We are organizing a third INTERSECT bootcamp in July 2025. We expect to again have approximately 35 attendees from a wide range of institutions covering a diverse set of research areas. Because the format and content of the first two bootcamps were well received, we plan to follow a very similar format for the third.
We were recently notified that our renewal proposal to fund the INTERSECT bootcamps was awarded. We will therefore host four additional annual summer bootcamps in 2026-2029.
In this poster we will provide an overview of the INTERSECT project and more details on the content of the bootcamp. We will discuss outcomes of both editions of the bootcamp, including curriculum specifics, lessons learned, participant survey results, and long-term objectives. We will also describe how people can get involved as contributors or participants.
Software developers face increasing complexity in computational models, computer architectures, and emerging workflows. In this environment, Research Software Engineers need to continually improve software practices and constantly hone their craft. To address this need, the Better Scientific Software (BSSw) Fellowship Program launched in 2018 to seed a community of like-minded individuals interested in improving all aspects of the work of software development. To this end, the BSSw Fellowship Program fosters and promotes practices, processes, and tools to improve developer productivity and software sustainability.
Our community of BSSw Fellowship alums serves as leaders, mentors, and consultants, thereby increasing the visibility of all those involved in research software production and sustainability in the pursuit of discovery. This poster presents the BSSw Fellowship (BSSwF) Program, highlighting our successes in developing a community around software development practices and providing information about applying for the upcoming 2026 awards. As many in the BSSwF community identify as RSEs, and BSSwF projects are of particular relevance, this poster will inform the community about the fellowship, build up best practices for productivity and sustainability, and amplify the connections between research software engineers.
The Research Software Alliance (ReSA) has established a Task Force dedicated to translating the FAIR Principles for Research Software (FAIR4RS Principles) into practical, actionable guidelines. Existing field-specific actionable guidelines, such as the FAIR Biomedical Research Software (FAIR-BioRS) guidelines, lack cross-discipline community input. The Actionable Guidelines for FAIR Research Software Task Force, formed in December 2024, brings together a diverse team of researchers and research software developers to address this gap.
The Task Force began by analyzing the FAIR4RS Principles and identified six key requirement categories: Identifiers, Metadata for software publication and discovery, Standards for inputs/outputs, Qualified references, Metadata for software reuse, and License. To address these requirements, six sub-groups are conducting literature reviews and community outreach to define actionable practices for each category. Challenges include identifying suitable identifiers, archival repositories, metadata standards, and best practices across research domains. This poster provides an overview of the Task Force, presents its current progress, and outlines opportunities for community involvement. Given the progressive adoption of the FAIR4RS Principles, including by funders, we expect this poster will provide attendees at USRSE’25 with an understanding of the FAIR4RS Principles and how they can make their software FAIR through actionable, easy-to-follow, and easy-to-implement guidelines being established by our Task Force.
Member states of the Treaty on the Non-Proliferation of Nuclear Weapons that are not listed as nuclear weapons states are subject to international safeguards to ensure that no nuclear material is diverted and no facilities are misused with the aim of building nuclear weapons. To that end, the International Atomic Energy Agency (IAEA) and other safeguards authorities use technical measures such as seals, closed-circuit television (CCTV) cameras, radiation detectors, and laser scanners in civil nuclear facilities. During nuclear waste management, safeguards are applied in interim storage facilities and deep geological repositories to spent nuclear fuel and other nuclear waste forms, as well as to the casks and containers holding this material. These monitoring systems have grown in complexity over time, produce large amounts of data, and have become increasingly interconnected and automated. At the same time, digital twin concepts are gaining popularity in industry contexts while enabling technologies, e.g.,
high-performance computing and machine learning, become more easily available. This poster explores the topic of digital twins for safeguards in nuclear waste management by presenting models and software modules. At the core of our approach lies a monitoring system model implemented via a PostgreSQL database that incorporates data traces obtained from inspection data and facility operators’ declarations. To support interaction with the model, we provide a Python API that enables manipulation and tracking of the model state over time from both an operator and an inspectorate perspective. Also presented here are the project’s continuous integration tests, the automated deployment of its documentation, and its graphical user interface (GUI), implemented as a plotly-powered Dash app. Beyond modeling, data storage, and visualization, the presented software can simulate different physical aspects such as neutron and gamma radiation as well as light detection and ranging (LiDAR), and it offers the possibility to generate synthetic data for compliance and diversion scenarios. The use of this synthetic data for training machine learning algorithms for experimental design optimization and anomaly detection is discussed. Finally, the poster will outline the envisioned scaling of this prototype software into a larger digital twin framework capable of processing and analyzing real measurement data alongside the synthetic data. The presented work aims to support and facilitate remote monitoring, the development of new safeguards techniques, and education and training, while aspiring to incorporate good practices of research software engineering.
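As a purely illustrative sketch (class and method names are hypothetical, not the project's actual API), a Python interface over such a PostgreSQL-backed monitoring model might expose the two perspectives like this:

```python
# Hypothetical sketch of an operator/inspectorate API over a monitoring model;
# a real implementation would persist events to PostgreSQL rather than a list.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Declaration:
    cask_id: str
    location: str
    timestamp: datetime

class MonitoringModel:
    def __init__(self, dsn: str) -> None:
        self.dsn = dsn  # e.g. "postgresql://user@host/safeguards" (placeholder)
        self._events: list[Declaration] = []

    def declare_movement(self, cask_id: str, location: str) -> None:
        """Operator perspective: record a declared cask movement."""
        self._events.append(
            Declaration(cask_id, location, datetime.now(timezone.utc))
        )

    def state_at(self, when: datetime) -> dict[str, str]:
        """Inspectorate perspective: reconstruct cask locations at a given time."""
        state: dict[str, str] = {}
        for event in self._events:
            if event.timestamp <= when:
                state[event.cask_id] = event.location
        return state
```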
The Sage project operates a national-scale distributed cyberinfrastructure of AI-enabled edge computing platforms connected to a wide range of sensors. AI algorithms running on Sage can analyze images, audio, LiDAR, and meteorological data to understand ecosystems and urban activity and to detect wildfires. With over 33 million images collected, there is a critical need for tools that enable fast, meaningful exploration of massive visual datasets. Traditional metadata-based methods (e.g., tags or timestamps) fall short for domain-specific retrieval at this scale. We present a modular, end-to-end image search system designed for distributed sensor networks.
The workflow integrates automated captioning (Gemma-3), semantic embeddings (DFN5B-CLIP), keyword indexing (BM25), and reranking (ms-marco-MiniLM-L6-v2), all backed by a scalable Weaviate vector database. Images and their generated captions are embedded into a unified semantic space, enabling hybrid search that combines semantic similarity and keyword relevance when users submit natural language queries. This architecture supports both real-time monitoring and historical analysis, scaling efficiently with large datasets and user demand. Although demonstrated on Sage, the system is model-agnostic and infrastructure-flexible, making it applicable to other distributed sensor networks. It empowers researchers in fields such as ecology and atmospheric science to search image collections by content, not just metadata, enhancing access to sensor data for scientific analysis and decision-making.
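As a sketch of the retrieval step (the collection name, properties, and query are hypothetical, and the captioning and embedding models are assumed to be configured server-side), a hybrid query through the Weaviate v4 Python client might look like:

```python
# Hybrid search mixes vector similarity with BM25 keyword relevance;
# alpha=1.0 is purely semantic, alpha=0.0 purely keyword-based.
import weaviate

client = weaviate.connect_to_local()
try:
    images = client.collections.get("SageImage")  # hypothetical collection
    results = images.query.hybrid(
        query="smoke plume rising over a ridgeline",
        alpha=0.6,
        limit=10,
    )
    for obj in results.objects:
        print(obj.properties.get("caption"))
finally:
    client.close()
```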
The Carpentries is a community building global capacity in essential data and computational skills for conducting efficient, open, and reproducible research. In addition to certified Instructors teaching Data Carpentry, Library Carpentry, and Software Carpentry workshops around the world, the community also includes many people contributing to and maintaining Open Source lessons. Recent years have seen enormous growth in the number and diversity of lessons the community is creating, including many teaching skills and concepts essential to Research Software Engineering: software packaging and publication, environment and workflow management, containerised computing, etc.
As this curriculum community has developed, demand has been growing for training opportunities that teach how to design and develop Open Source curricula effectively and in collaboration with others. Launched in 2023, The Carpentries Collaborative Lesson Development Training teaches good practices in lesson design and development, and open source collaboration skills, using The Carpentries Workbench, an Open Source infrastructure for building accessible lesson websites. As the discipline of Research Software Engineering continues to develop and mature, there is an increasing need for high-quality, Open Source, community-maintained training, and for the expertise to develop those resources. This poster will provide an overview of the training, explain how it meets this need, and describe how it fits into The Carpentries ecosystem for lesson development. It will also explain how RSEs can enroll in the training, and give examples of lesson projects that have already benefited from it.
Research laboratories require a professional and up-to-date website to showcase their work, attract collaborators, and establish a strong online presence. However, developing and maintaining such a website can be challenging and time-consuming. We present Lab Website Template, an open-source website template specifically designed for the unique needs of labs, allowing lab members to focus more on research activities and less on web development.
A critical feature of our template is its ability to automatically generate citations (titles, authors, publishing info, etc.) from simple identifiers like DOIs, PubMed IDs, ORCIDs, and thousands of other types using the open-source tool Manubot. Automated pipelines update citations when site content changes, on a periodic schedule, or on demand. The user chooses which sources to pull from and how to filter and display the resulting citations on their site.
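For a sense of what that automation does under the hood, here is a minimal sketch (assuming the manubot package is installed; the DOI is just an example) that resolves an identifier to CSL JSON citation metadata via the Manubot command line:

```python
# Resolve a persistent identifier to CSL JSON metadata with the manubot CLI.
import json
import subprocess

result = subprocess.run(
    ["manubot", "cite", "doi:10.1098/rsif.2017.0387"],  # any supported identifier
    capture_output=True,
    text=True,
    check=True,
)
csl_items = json.loads(result.stdout)
print(csl_items[0]["title"])
```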
To reduce administrative burden and to remain accessible to non-technical users, we automate everything we can, not just citations. This includes: first time setup of the repository, generating live previews of pull requests, configuring custom URLs, and more. To the same end, though the full codebase is exposed to the user, it is structured around the most common use cases and separates user content from internal logic. This allows users to concentrate on the content of their site without wading through implementation details, but still enables advanced customization if needed.
A key strength is our singular commitment to user support, demonstrated by consistently low issue response times and a high resolution rate. We foster an environment where user input is highly valued and carefully considered, balanced against maintaining an un-bloated design. Recognizing that source code and comments alone are insufficient, we also maintain a searchable, comprehensive, dedicated documentation suite so users can find answers immediately, before needing to ask for help.
The success of our design and approach is evidenced by our template's widespread adoption, with 450+ stars on GitHub, 350+ forks, and a growing gallery of dozens of active labs using it. This work reflects not just a powerful software tool, but an evolving, open, community resource that effectively serves the needs of its users.
The proposed poster will describe the history, process, and future work of the DHTech Code Review Working Group.
The working group has implemented a community code review process, in which volunteers from across organizations worldwide review code submitted by digital humanities projects. To date, the group has facilitated nine code reviews. The poster will present how reviews are conducted and what lessons have been learned.
A new wave of programming aids powered by large language models (LLMs) offers new capabilities for producing code, and tools ranging from ChatGPT to GitHub Copilot are currently being used by scientists across disciplines [1]. Because scientific programmers are often under-trained in software development [2], there is considerable excitement about how these tools may enhance their programming abilities and productivity [3]. However, a confluence of evidence from human-computer interaction studies as well as the literature on research software development suggests there is a major possibility for LLM-based tools to inject scientifically invalidating code errors without detection. The incidence of this would be almost impossible to measure in the near term owing to sparse code sharing and post-publication review practices, and the effect on scientific research quality could be substantial.
This poster lays out arguments for why scientific programmers are at heightened risk for LLM-injected code errors: that (1) this population possesses risk factors for tool overreliance identified by a range of user studies [4-5], (2) LLM-generated code tends to contain more subtle errors than human-written code, as it will often execute without throwing runtime errors [6], and (3) historically, scientific programmers have often used informal code quality control measures, if any [7-8].
Beyond raising an argument for why these issues should be taken seriously as a matter of scientific integrity, this poster provides a setting for collaborative brainstorming about the unique leverage points and opportunities Research Software Engineers may have to proactively lead in upholding code quality in the coming years.
In 2025, DHTech, a community of people doing technical work in the digital humanities, ran a survey to better understand who is developing code in the digital humanities.
The survey contained questions about the technologies DH RSEs use, their training, and their career paths. We propose a poster that presents the results of that survey.
This poster describes the work presented in a published paper [1], which reports the results of a survey of research software developers. The focus of the study was to understand how research software developers design test cases, handle output challenges, use metrics, execute tests, and select tools. Specifically, the study explores the following research questions:
- RQ1: What are the characteristics of the research software testing process?
- RQ2: What challenges do developers face throughout the testing process of research software?
- RQ3: How do research software developers design test inputs?
- RQ4: What specific challenges do research software developers face when determining the expected output of test cases?
- RQ5: What metrics do research software developers use to measure software quality and the quality of tests?
- RQ6: How do research software developers execute their tests?
- RQ7: What testing tools do research software developers use and what are their limitations?
- RQ8: What features should a testing tool specifically developed for research software contain?
- RQ9: Are demographic characteristics of research software projects related to the testing process?
The poster will contain graphs reporting the detailed results related to these questions. Our overall findings show that research software testing practices vary widely. The primary challenges faced by research software developers include test case design, evaluating test quality, and evaluating the correctness of test outputs. Our findings highlight the need to allocate more resources to software testing and to provide more education and training on software testing to research software developers.
Packaging Python code, especially for scientific applications, has historically been hard, with poor documentation, many obscure fragile hacks into a rigid system, and challenges distributing the result to users. All that has dramatically improved over the last few years with a series of PEPs (Python Enhancement Proposals) creating a unified standard to build on, innovative packages built on these standards, and extensive documentation around best practices. We will look at how the Scientific Python Development Guide [1], along with its template and repo-review [2] tooling, teaches modern scientific Python software development. This is part of an effort to document recommendations for Python development with several key tools, like uv, ruff, cibuildwheel [3], and scikit-build-core [4].
The Scientific Python Development Guide started as a guide in the Scikit-HEP organization, written to facilitate the growing number of packages with a growing number of maintainers. Over time, a cookiecutter-based template was added, followed by a tool to evaluate a repository’s configuration against the guide. At the first Scientific Python Developer Summit in 2023, the guide was moved to the Scientific-Python organization, the tools were reworked and generalized, automation was expanded, parts of the NSLS-II guidelines were merged in, and the three separate components were combined into one repo. Since then, the guide has remained one of the most up-to-date resources available, with new packaging changes like Trusted Publishing and SPDX license identifiers being added days and sometimes even hours after they are announced. We will look at some of the recommendations, especially newer ones.
The cookiecutter template, now part of the guide as well, provides a fast start for users who have read the guide. It supports both the cookiecutter and copier programs, and can be used with “update” tooling (cruft or copier). It supports around 10 build backends, including the most popular pure Python and compiled extension backends. You can select classic or VCS versioning, and can pick from several licenses. We will see how easy it is to start a project with all tooling in place.
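For example, a minimal sketch (assuming the copier package is installed; the destination directory name is arbitrary) of instantiating the template from Python:

```python
# Instantiate the Scientific Python template; copier will still prompt for any
# template questions that have no default.
import copier

copier.run_copy(
    "gh:scientific-python/cookie",  # the guide's template repository
    "my-pkg",                       # destination directory
    defaults=True,                  # accept template defaults where possible
)
```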
Repo-review is a framework for writing checks for configuration; it is written in Python and can be run using WebAssembly. The sp-repo-review plugin contains all the checks described in the guide, with cross-links that take a user to the badge in the guide for each check. Some projects, like Astropy, have added this to their CI, though it can also be used entirely from the guide’s website.
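Running the checks locally is a one-liner; a sketch, assuming sp-repo-review is installed with its CLI extra (which provides the repo-review command):

```python
# Run the guide's checks against the current repository; prints a pass/fail
# table with cross-links into the guide.
import subprocess

subprocess.run(["repo-review", "."], check=False)
```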
A dedicated section on binary packaging will include scikit-build-core, the native build backend that can adapt CMake projects into Python wheels. When combined with pybind11 [5], you can write working compiled extensions with just three files and a small handful of lines. Combined with cibuildwheel, the wheel builder tool used throughout the Python ecosystem, and a good CI service like GitHub Actions, building redistributable wheels for all major platforms, and even WebAssembly, iOS, and Android, is easy. All the required code to get started can easily be shown on a poster.
References
1. The Scientific Python Development Guide, Henry Schreiner, et al., June 2023. Retrieved 2025-07-17. https://learn.scientific-python.org/development
2. repo-review, Henry Schreiner, et al., June 2023. https://repo-review.readthedocs.io
3. cibuildwheel, Joe Rickerby, Henry Schreiner, et al., 2016. https://cibuildwheel.pypa.io
4. scikit-build-core, Henry Schreiner, et al., 2022. https://scikit-build-core.readthedocs.io
5. pybind11, Wenzel Jakob, Henry Schreiner, et al., 2017. https://pybind11.readthedocs.io
Science thrives when researchers and software engineers share their expertise. As scholarship across disciplines becomes more dependent on computational methods, effective collaboration between Research Software Engineers (RSEs) and Software Engineering Researchers (SERs) is essential to drive innovation, optimization, productivity, reproducibility, stewardship, and solutions for interdisciplinary problems [1, 2, 3, 4]. This poster presents ten strategic guidelines to foster productive partnerships between these two distinct yet complementary communities. We very recently published these rules in a longer paper [5]; this poster summarizes the key elements to foster and encourage future RSE-SER collaborations.
RSEs are deeply embedded in research contexts, often balancing software development with domain-specific knowledge. They focus on creating software to meet evolving research needs, including flexibility and experimental workflows. SERs are trained to develop robust, scalable, and maintainable systems emphasizing engineering principles, often aligning with industry best practices. Career paths, incentives, and constraints all differ between these communities. Thus, achieving collaboration between SERs and RSEs requires intentional effort and the application of change theories [1, 6]. To build synergistic relationships between SERs and RSEs that enhance the quality and impact of research outcomes, we recommend these ten principles:
1. Recognize The Two Communities Are Different: RSEs and SERs must appreciate one another's unique roles and cultures, celebrate their strengths, and avoid assumptions.
2. Acknowledge Collaboration Is Not Going to Just Happen: Partnerships must be deliberate; proactive initiation, inquiry into mutual goals, and consistent investment are needed.
3. Define Clear Goals and Outcomes: Open discussion, measurable and well-documented objectives, and regular check-ins ensure progress, though flexibility is needed too.
4. SERs Must Engage with RSEs in Their Professional Environments: SERs should appreciate RSEs' institutional obligations and resource constraints. SERs can gain these insights and build trust by attending RSE conferences, workshops, and talks.
5. Identify the Intersection of Shared Research Software Challenges: Collaborations must address common challenges that are practically significant and academically valuable.
6. Ensure Mutual Benefit in Collaboration: Both parties must gain immediate and long-term value. Incentive schemes must be understood. Authorship and leadership responsibilities should be explicitly addressed.
7. Maintain an Open Mind Toward Emerging Challenges: Collaborators should be adaptable and ensure continuous dialogue to identify new challenges and solutions.
8. Actively Advocate for Each Other: RSEs and SERs should showcase one another's work to demonstrate respect and the value of collaboration.
9. Maintain Vigilance and Recognize When Collaborations Are Off Course: To sustain a win-win relationship, SERs and RSEs must regularly review progress and concerns.
10. Secure Institutional Support: All parties should seek funding, advocate for institutional recognition of RSE and SER roles, and promote frameworks that support collaboration (e.g., joint appointments and recognition programs).
Recognizing the distinct cultures, priorities, and workflows of RSEs and SERs is fundamental to building productive partnerships and high-quality research software.
The Oak Ridge National Laboratory, in partnership with NOAA, manages the Gaea supercomputer, an HPE Cray EX 3000 machine with an aggregate peak performance of 20 petaflops, dedicated to earth and climate science research. The majority of the workloads involve running models like AM4 [1], ESM [2], and MOM [3]. These are developed with strict reproducibility requirements so that new developments in the models do not change expected results. These models are tightly integrated with the Intel compiler suite and the Cray system software, such that even small changes to those libraries could result in variations in the results of the models. The development cycle of these models needs to ensure that system and library upgrades do not change the expected results. The Gaea supercomputer undergoes mandatory upgrades when old software goes out of support. The developers use a testing and development system (TDS) of the same architecture as Gaea with the updated environment to upgrade the models to run correctly on the new software stack before the main system upgrade. Nevertheless, developers and users sometimes need continued access to the old software stack, so that they have ample time for testing or can continue running the old models without changes in results. To address this, we make use of the HPE Cray Programming Environment (CPE) containers [4], which HPE has released beginning with CPE 23.05. The HPE Cray 23.03 programming environment on Gaea, which comprised cray-mpich 8.1.25, libfabric 1.12.1, libsci 23.02.1.1, and Intel 2022.2.1 (providing the icc 2021.7.1 compiler), was scheduled to be removed and replaced with an upgraded CPE.
Since there was no 23.03 container, we had to rely on a 23.09 CPE container and significantly modify it to produce a container with the 23.03 programming environment. Gaea uses Apptainer as its runtime, but the HPE containers, which are built with Docker and include environment modules [5], do not initialize those modules correctly when started with Apptainer, because the initialization scripts do not run as they do under Docker to set the environment variables. We solve this by starting the container with Podman (which is equivalent to Docker), copying the properly initialized environment variables into a file, then creating and running the Apptainer container with these saved environment variables loaded at startup to set up the modules. Alongside this, we remove the 23.09-specific software from the CPE 23.09 container and install the 23.03-specific software. We also install the required old Intel compiler version, since the CPE containers do not distribute it. To verify the result, we used a reproducer running a development version of the GFDL Atmospheric Model (AM5), based on AM4 [1]: a binary and supporting files built in the older software environment that would not run in the updated environment. We run them within the 23.03 container and verify that the binaries work in the container environment, comparing checksums of the zonal wind components calculated across data spread over MPI processes against the checksum produced by an older run predating the system upgrade. Previous papers [6,7,8,9] have covered container performance for high performance computing workloads and found it close to native, so we do not reiterate that here.
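A simplified sketch of that environment-capture workflow (image names and file paths are hypothetical, and error handling is omitted):

```python
# 1) Let Podman run the image's Docker-style initialization and dump the
#    resulting environment; 2) replay it under Apptainer via --env-file so the
#    environment modules behave as they do under Docker/Podman.
import subprocess

PODMAN_IMAGE = "localhost/cpe:23.03"  # modified CPE image in Podman (placeholder)
APPTAINER_SIF = "cpe-23.03.sif"       # same container converted for Apptainer

env_text = subprocess.run(
    ["podman", "run", "--rm", PODMAN_IMAGE, "bash", "-lc", "env"],
    capture_output=True, text=True, check=True,
).stdout
with open("cpe-env.txt", "w") as f:
    f.write(env_text)

subprocess.run(
    ["apptainer", "exec", "--env-file", "cpe-env.txt", APPTAINER_SIF, "bash"],
    check=True,
)
```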
The Interdependent Networked Community Resilience Modeling Environment (IN-CORE) is a state-of-the-art computational environment developed by the NIST Center of Excellence for Risk-Based Community Resilience Planning to advance community resilience measurement science and provide a decision support system. IN-CORE integrates various models, including physics-based modeling of infrastructure, network models, data-based socio-economic models, and resilience metrics, to help communities prepare for, mitigate, and rapidly recover from hazard events. IN-CORE spans over 28 GitHub repositories managing various components of the system, including Helm charts used for deployment in a Kubernetes cluster, several Python libraries for modeling, visualization, and community contribution, as well as the Java-based IN-CORE web services. Each software component has continuous build and integration through a series of GitHub Actions that help automate the software development process, from unit testing and linting to automatic GitHub releases to GitOps for software deployment. While all projects can benefit from this automation, it is particularly beneficial on a large-scale project with many components to help manage the complexity of the entire process, help with quality control, and ensure smooth delivery of releases and deployments. On the modeling side, the IN-CORE development team at NCSA has developed a process for managing the intake of new models from researchers that depends on where the researcher is in their development process.
For the case where a researcher has code that is ready, they can submit it for review by a research software engineer (RSE). An RSE is assigned to work with the researcher to review their code and finish the implementation in IN-CORE. If a researcher is at an earlier stage of development, they can submit for consultation so that an RSE can help them with library selection, requirements gathering, and other implementation details to get their code ready for inclusion in IN-CORE. Regardless of readiness, this often involves an iterative process of working with the researcher and communicating with the research team through online meetings, Slack, and email. In some cases, there may be scientific questions that need to be resolved, and a separate Science Committee (SciCom) is involved to help resolve any issues before the implementation can continue. The software development life cycle is more complex on larger projects with many moving parts. Having good processes in place can help manage this complexity.
Motorcyclists are often unfairly blamed in insurance claims after crashes. Investigations can take a long time, and a lack of evidence or the influence of stereotypes can lead to incorrect judgments.
Our mobile app helps motorcyclists capture accurate 3D reconstructions of crash scenes using their phones. These 3D models can be shown to insurance adjusters to provide a clearer picture of what actually happened. The goal is to make the claims process more fair by offering better visual evidence and reducing bias against riders.
CODA [1] is a data analysis and machine learning pipeline originally developed by the Kiemen Lab at Johns Hopkins University and now enhanced through collaboration with the Johns Hopkins Data Science & AI Institute (JH-DSAI). CODA uses classical image analysis to reconstruct 3D volumes of biological tissue samples at cellular resolution from sets of serially-sectioned multimodal microscopy images. It employs machine learning-based segmentation algorithms to identify the locations, extents, and relationships of cellular and tissue microstructures, facilitating novel spatial and volumetric analyses of disease pathology in situ.
Analyses of tissue volumes generated using the CODA framework have advanced understanding of tumorigenesis in pancreatic cancer [1, 2, 3, 4], and integration of spatial genomics data has provided insights into the genetic variety of precancerous lesions observed in the pancreas [5]. Our poster provides technical detail on the classical and machine learning algorithms used in the next-generation CODA pipeline, and gives an overview of the software development and data engineering practices that are central to the continued support of professional-quality software. The next-generation CODA pipeline is one of the first software products developed through collaboration with the JH-DSAI. It is a landmark example of how the Institute can provide scientific domain experts with innovative, ready-to-run tools that impact research, along with the technical expertise needed to deploy those tools accessibly, continue their development, and manage their lifecycles so that new collaborators and communities can benefit from the work of individual groups.
Large language models (LLMs) with billions of parameters present substantial challenges in terms of memory, computation, and energy requirements. Their scale often renders them impractical for deployment on resource-constrained devices such as personal computers, laptops, and mobile phones. These limitations significantly hinder their usability in real-time, on-device applications, such as home security systems requiring immediate threat detection.
While training a large language model is a resource-intensive, one-time process, the operational costs of inference (the repeated running of the model to serve users) also accumulate rapidly. Industry data indicates inference makes up the majority of machine learning workloads; for example, Amazon Web Services and Nvidia estimate that 80-90% of overall ML demand is devoted to inference rather than training. As LLM usage scales into millions or billions of queries, the ongoing inference energy consumption can ultimately surpass the energy cost of initial training, especially when factoring in growth in users and applications. Inference workloads not only drive up electricity usage and data center cooling demands, but also carry significant carbon and water footprints, particularly when powered by fossil-fuel-based energy sources. This ongoing operational burden is a critical factor in the overall environmental and economic impact of LLM deployment.
To address these issues, model compression techniques such as pruning, quantization, distillation, and factorization have been proposed. These methods aim to reduce model size and computational complexity without significant loss in performance. Compressed models not only reduce storage and cloud computing costs but also improve inference speed and energy efficiency. Importantly, they expand accessibility by enabling advanced AI capabilities on low-power and edge devices. These advancements are essential for building sustainable, scalable, and widely deployable AI systems.
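As a minimal illustration of one such technique, the sketch below applies post-training dynamic quantization with PyTorch; the model here is a small stand-in, not an LLM.

```python
# Dynamic quantization replaces Linear layer weights with int8, cutting memory
# roughly 4x versus float32 and typically speeding up CPU inference.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 4096),
)
quantized = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
print(quantized)  # Linear layers are now dynamically quantized variants
```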
Simulating the Earth's climate is an important and complex problem, and Earth system models are correspondingly complex, comprising millions of lines of code. To appropriately utilize the latest advancements in computational and Earth system science while running on modern hybrid computing architectures, and to improve model performance, precision, accuracy, or all three, it is important to ensure that model simulations are repeatable and robust. This introduces the need to establish statistical, non-bit-for-bit reproducibility, since bit-for-bit reproducibility may not always be achievable. Here, we propose a short-simulation, ensemble-based test for an atmosphere model to evaluate the null hypothesis that modified model simulation results are statistically equivalent to those of the original model, and implement this test in the US Department of Energy's Energy Exascale Earth System Model (E3SM; Golaz et al., 2022). The test evaluates a standard set of output variables across the two simulation ensembles and uses a false discovery rate correction (Benjamini and Hochberg, 1995) to account for multiple simultaneous testing.
The false positive rates of the test are examined using re-sampling techniques on large simulation ensembles and are found to be lower than the currently implemented bootstrapping-based testing approach in E3SM (Mahajan et al. 2019). We also evaluate the statistical power of the test using perturbed simulation ensemble suites, each with a progressively larger magnitude of change to a tuning parameter. The new test is generally found to exhibit greater statistical power than the current approach, being able to detect smaller changes in parameter values with higher confidence.
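A sketch of the multiple-testing step described above (the p-values are illustrative; in the real test each one would come from comparing a single output variable across the two ensembles):

```python
# Benjamini-Hochberg false discovery rate correction over per-variable tests.
import numpy as np
from statsmodels.stats.multitest import multipletests

pvals = np.array([0.001, 0.040, 0.200, 0.030, 0.760])  # one per output variable
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(reject)      # which variables show a statistically significant difference
print(p_adjusted)  # FDR-adjusted p-values
```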
The field of computational paleontology is rapidly advancing, with many recently developed open-source R packages leading the charge for more standardized, reproducible, and open research. This push is a relief for many data-science-minded paleontologists who have previously toiled over writing their own scripts to download, clean, analyze, and visualize their data. Many of these steps are now covered by functions in these new packages (and those of other packages in the R universe). However, this push for more script-based research may throw a wrench into the existing scientific workflows of less technical researchers who lack a background in coding, and may steepen the learning curve for new researchers entering the field. Therefore, bridging the gap between visual, hands-on workflows and digital, code-based workflows is imperative to the collaborative future of computational paleontology. Here I present a new open-source Shiny app, paleopal (https://github.com/willgearty/paleopal), that provides a user-friendly interface for building paleontological data science workflows without any programming knowledge.
The app connects existing paleontological R packages such as palaeoverse [1] and deeptime [2] with the tidyverse [3] suite of R packages to encourage standardized scientific pipelines. Specifically, paleopal presents users with a curated set of workflow “steps” (e.g., data upload, data cleaning, and data visualization) that they can choose from, customize, and reorder to develop their pipeline. The app is built on top of the shinypal [4] package, which uses the shinymeta [5] R package to provide a live code and results panel and a downloadable RMarkdown script as the user develops their pipeline. To increase accessibility, I have hosted the Shiny app as a serverless application on GitHub Pages (http://williamgearty.com/paleopal/) using the shinylive [6] R package and the webR framework. To my knowledge, this is the first use of shinymeta within a webR project, which has presented many technological hurdles to overcome, including dealing with browser filesystems, restricted access to operating system software, and cross-browser support. Nonetheless, paleopal aims to spearhead the next generation of training of computational paleontologists, regardless of age, background, or technical expertise. Additionally, the extensible nature of paleopal makes it easy to add further curated workflow “steps”, and the underlying shinypal package could also be used to create similar Shiny apps for other scientific fields.
Generative AI tools are rapidly impacting software development practices, including the development of graphical user interfaces (GUIs). These coding and design tools offer an attractive promise of accelerating the pace of development while reducing onerous or mundane engineering tasks. Generative AI gives Research Software Engineers (RSEs) an opportunity to design and implement the GUI of a scientific web application rapidly, even when they may not be trained in web development. Generative AI tools such as Cursor [1], Builder.io [2], Lovable [3], and others take a prompt (e.g., “I need a scientific data dashboard to monitor experiments and their data outputs”) and rapidly produce an attractive, seemingly useful interactive design that looks like it can be connected to a developer’s pre-existing application. However, the question is whether this approach can truly help produce usable and meaningful GUIs.
RSEs often do not have substantial training in user experience design and usability evaluation work. It is essential that RSEs leveraging AI tools to create UIs have resources that can provide them with confidence in the quality of the resulting user experience. The STRUDEL project [4] is dedicated to building open source scientific user experience (UX) resources, which incorporate expert UX researcher and designer knowledge that RSEs and domain scientists can easily leverage. Currently the project team is investigating the effectiveness of various AI tools to generate deployable prototypes for scientific research web applications.
Upon initial exploration, we find that while AI tools can generate useful ideas and prototypes, they often require significant refinement, revision, and expertise to produce usable and maintainable UIs for scientific software. To help improve the usability of the prototypes generated through AI tools, we experimented with using our STRUDEL design templates as starting points for the AI tools to expand upon. Our experiments show the promise of using carefully researched templates as a type of guardrail for the AI tools. These guardrails can help ensure that the UI product adheres to certain design guidelines without the need to explicitly provide those details in a prompt. We posit that the use of human-designed seed templates as starting points can be a useful method for improving the overall usability of AI-generated GUIs. Using a seed template is a much-needed way to incorporate user experience expertise in the loop while leveraging AI tools to generate UI flows and patterns.
Our initial work explores the implications of AI tool generation on the usability, development, and long-term maintenance of GUIs, but it is clear this will be an important consideration for the US-RSE community. With this poster we aim to foster discussion among the US-RSE community about the grand promises and likely perils of generative AI tools for building usable scientific web applications.
Large Language Models (LLMs) offer new possibilities for augmenting scholarly document analysis, including automating the extraction and verification of software mentions in scientific papers. While LLMs achieve high recall in identifying software mentions, we observe significant inconsistency across runs in classifying those mentions as cited or uncited. This variability raises critical questions about the reproducibility and trustworthiness of LLM-driven bibliographic tools, especially for tasks with high semantic precision like software citation tracking.
In this poster, we will present SoftciteAuditor, a tool we built to extract software mentions and their use from research papers, where the software may or may not be properly cited. The tool employs an LLM of the user's choice to analyze a paper and suggest missed software citations, with the aim of ensuring proper attribution and highlighting the importance of software in research publications.
The growing importance of research software heightens concerns about research software security, which will only intensify if not proactively addressed. Before any specific measures or interventions can be suggested, it is essential to understand the RSE community’s security behaviors, competencies, and values, collectively referred to as their ‘security culture’ [1]. While studying the climate and culture within a group of people is not a new concept or research topic, to our knowledge, no security culture research has taken place within the RSE community.
In this study, we aim to characterize the security culture of the RSE community by replicating prior work performed in the open-source software space [3]. To broaden our sample, we distributed this survey to RSE community members in both the US and Germany. By replicating an existing survey, we can compare the RSE community’s responses with those of the open-source community, which shares some characteristics with RSE [4-5]. In addition to the original survey, we added a series of vignettes to gauge the RSE community’s knowledge and perception of threat modeling, a standard “shift-left” approach to security. By doing so, we gauge RSE interest in participating in security efforts and motivate future security research in the research software domain.
Ultimately, we surveyed 104 members of the RSE community in both the US and Germany. To characterize RSE security culture, we ask the following research questions:
- RQ1: What is the security culture of the RSE community?
- RQ2: How does the RSE community's security culture compare with the Open-Source community's security culture?
- RQ3: What is the perception among RSE community members on adopting threat modeling during development?
The primary contributions of this study are: 1) A novel characterization of the RSE community’s security culture, 2) an empirical comparison of the security culture of RSEs and OSS developers, and 3) recommendations for internal and external stakeholders to improve RSE security culture. This study is a first step toward tailoring “shift-left” security principles to address the unique challenges that RSEs face.
Graphical Processing Units (GPUs) are now integrated into the world's fastest supercomputers to solve the most computationally difficult, mechanistic scientific problems in physics, chemistry, biology, material sciences, energy, earth and space sciences, national security, data analytics, and optimization. An 8-year, $1.8 billion effort called the Exascale Computing Project, involving 2,800 scientists and engineers, recently finished modernizing the underlying scientific numerical software to efficiently use this new generation of machines at more than a billion billion floating-point calculations per second, referred to as exaflops. While NSF ACCESS and the computing facilities themselves directly grant U.S. institutional and industry researchers access to these GPU-accelerated machines at no cost, research software engineers and scientists must first show that their computations scale to efficiently use multiple GPUs across many computer servers/nodes. Therefore, it is imperative to teach the data parallel software development skills, calibration techniques, and sustainable software practices now required to enable large scale, GPU-powered computational research. This poster discusses the unique challenges of semester-long training and course infrastructure to cross-train domain scientists as research software engineers.
The course is architected around solving a real-world research problem with a faculty mentor as a capstone project and is tailored to undergraduate students, graduate students, faculty, and staff. Undergraduate students and staff members are paired with a faculty mentor, ideally before the course or within the first two weeks of the course before the drop date; this helps scope the capstone project and group learners in complementary research areas together. This course does not strongly focus on AI tools (there are plenty of courses covering those) but instead focuses on high performance computing (HPC) tools and HPC numerical libraries that can be instrumented for hardware performance analysis. The capstone project is intended to support summer NSF research experiences for undergraduates (REU), collaborations between faculty and core-facility staff, and preliminary grant work of faculty, postdoctoral trainees, and graduate students; this intent is similar to introductory grant writing classes.
The course is organized into three phases. The first phase is a crash course covering efficient within-node, multi-node, and data parallel GPU programming; this provides the initial foundations for learners to begin writing their capstone project code. These topics are covered using C and C++ because C more easily allows inspecting vectorized instructions and C++ is central to vendor-agnostic GPU and numerical HPC libraries. The second phase covers basic knowledge and skills in distributed methods for parameter fitting, performance tuning, automated unit testing and performance testing, and stress-free peer code review, along with writing various types of documentation. The third phase covers advanced topics with less immediate applicability to the capstone project, deeper dives into topics of interest to the cohort, and progress reports and final presentations of individual capstone projects.
As this course is still being developed, invaluable feedback from the USRSE community about their teaching, learning, and pedagogical experiences would help make it an enjoyable experience for future learners. The course assumes experience with at least one programming language; therefore, assigned reading material with a short, low-stakes quiz before each class would help level the knowledge and increase the confidence of learners ahead of classroom instruction. Providing classroom laptops or portable PCs to deliver instruction is being investigated, because they would allow group activities such as connecting machines to form a cluster, letting students watch local resource usage, understand latencies and topologies, and gain memorable, active learning experiences complementing terminal work on remote HPC systems. It would be of great interest to learn from USRSE instructors about creating engaging opportunities for classroom learning with such portable GPU machines. Lastly, the University of Pittsburgh libraries support developing open educational resources, on which USRSE attendees can weigh in with gradable, pedagogically useful exercises often missing from documentation and training resources that help teach RSE skills. Hopefully this poster discussion and feedback from the USRSE community can help make this course a success and support similar teaching and learning endeavors.
As 3D printing moves toward full autonomy, the need for a universal, scalable, and intelligent architecture becomes increasingly critical. Modern additive manufacturing faces persistent barriers: fragmented device interfaces, lack of remote operability, ad hoc parameter tuning, and insufficient integration between physical processes and data-driven control. As additive manufacturing evolves toward more complex applications such as structural color fabrication, there is a growing demand for a universal, intelligent control framework capable of integrating diverse printer hardware, automating parameter optimization, and managing experimental data at scale.
Our work presents a novel universal edge-cloud architecture that combines edge-side hardware adaptors with a cloud-based backend to support real-time control, intelligent optimization, and robust data management for autonomous 3D printing. A centralized, user-friendly web interface allows users to configure and launch print jobs (campaigns), define experimental parameters, and review historical results. The cloud backend is built with scalable microservices, including a high-performance RabbitMQ-based messaging system for edge-cloud communication, MongoDB for structured storage of experimental metadata and print records, and Clowder for managing archival files and analysis reports. Distributed image analysis and machine learning-based prediction services run in parallel to optimize print parameters. This supports autonomous closed-loop experimentation across large-scale campaigns. To ensure flexibility and hardware independence, print jobs are abstracted using a custom PCP (Parameterized Control Protocol) file format. PCP files encapsulate device-specific instructions along with structured metadata describing print sequences, geometric layout, and adjustable parameters. This abstraction enables batch-based execution and seamless orchestration across printers running different configurations. The system exchanges messages with 3D printers via a custom-developed edge adaptor deployed near the printers.
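As a sketch of the cloud-to-edge dispatch path (the queue name, broker URL, and message fields are hypothetical, not the system's actual schema), a print job referencing a PCP file might be published over RabbitMQ with the pika client like this:

```python
# Publish a persistent print-job message for an edge adaptor to consume.
import json
import pika

connection = pika.BlockingConnection(
    pika.URLParameters("amqp://guest:guest@localhost:5672/")
)
channel = connection.channel()
channel.queue_declare(queue="printer.jobs", durable=True)

job = {
    "campaign_id": "demo-001",
    "pcp_file": "color_swatch.pcp",  # device-agnostic print description
    "parameters": {"pressure_kpa": 55, "speed_mm_s": 12, "bed_temp_c": 60},
}
channel.basic_publish(
    exchange="",
    routing_key="printer.jobs",
    body=json.dumps(job),
    properties=pika.BasicProperties(delivery_mode=2),  # survive broker restart
)
connection.close()
```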
We validate the system through an application in structural color 3D printing, focusing on the relationship between color properties, represented in the HSV color space, and key printing parameters such as pressure, speed, and bed temperature. The system was deployed on the UIUC Radiant cloud cluster and tested across multiple print shapes defined by PCP files on 3D printers running Marlin firmware. To efficiently explore the parameter space and enhance predictive performance, we incorporate a Bayesian optimizer into the machine learning workflow, as sketched below. The system’s adaptability and modular design enable scalable, reproducible experimentation across a wide range of hardware, demonstrating its potential as a universal platform for autonomous 3D printing.
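The optimization loop might be sketched as follows (the objective is a smooth stand-in for the image-analysis-derived color error, and the parameter bounds are hypothetical), here using scikit-optimize's Gaussian-process minimizer:

```python
# Bayesian optimization over (pressure, speed, bed temperature).
from skopt import gp_minimize

def color_error(params):
    pressure_kpa, speed_mm_s, bed_temp_c = params
    # The real objective would print a swatch, image it, and compare measured
    # HSV color to the target; this fake response just has a known optimum.
    return (pressure_kpa - 60) ** 2 + (speed_mm_s - 10) ** 2 + (bed_temp_c - 55) ** 2

result = gp_minimize(
    color_error,
    dimensions=[(40.0, 80.0), (5.0, 20.0), (40.0, 70.0)],
    n_calls=15,
)
print("suggested parameters:", result.x)
```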
The development and use of scientific software often entails collaborative work within and across teams. Understanding how such teams collaborate in addition to other factors that influence their likelihood of success is critical given the prevalence and necessity of teamwork for scientific software [1]. In particular, we contend that understanding the current state of collaboration in scientific software projects affords the discovery of opportunities to design more effective collaborations and produce better software and scientific outcomes. To advance the study of teamwork for scientific software, we conducted a systematic literature review (SLR) with a focus on papers that analyze or otherwise describe collaborative work in the domain of scientific software [2]. The main objective of the SLR was thus to provide a foundation for future research based on key insights and gaps identified in the literature on scientific software teams. Our search was applied to three databases (Web of Science, ACM Digital Library, IEEE Xplore) with backward and forward citation search applied to papers selected for final analysis [3]. The results of the SLR indicate that teamwork in scientific software remains understudied: collaboration is repeatedly mentioned but rarely the focus of investigation. At the same time, researchers in this domain recognize the significance of teamwork and have sought to develop tools with the goal of facilitating various aspects of collaboration like communication and data sharing. The evaluation of collaboration tools and the collaboration itself is generally not reported in these publications, suggesting that such evaluations are infrequent, undervalued, and/or constrained (e.g., due to lack of time and funding). Fitting with recent calls for research [4], our results highlight the need for a concerted effort to analyze the inputs, processes, and outcomes of collaborative work in the development and use of scientific software. We propose a path forward for studies of collaboration in scientific software aimed at enhancing both teamwork and software, with emphasis on team roles, cross-disciplinary requirements, and generative AI.
References
[1] M. Heroux et al., “Basic Research Needs in The Science of Scientific Software Development and Use: Investment in Software is Investment in Science,” United States Department of Energy, Advanced Scientific Computing Research, Workshop, Aug. 2023. doi: 10.2172/1846009.
[2] B. Kitchenham and S. Charters, “Guidelines for performing systematic literature reviews in software engineering,” Keele University & University of Durham, Keele, UK, Technical Report EBSE-2007-01, 2007.
[3] C. Wohlin, “Guidelines for snowballing in systematic literature studies and a replication in software engineering,” in Proceedings of the 18th International Conference on Evaluation and Assessment in Software Engineering (EASE ’14), New York, NY, USA: Association for Computing Machinery, May 2014, pp. 1–10. doi: 10.1145/2601248.2601268.
[4] M. Felderer, M. Goedicke, L. Grunske, W. Hasselbring, A.-L. Lamprecht, and B. Rumpe, “Investigating Research Software Engineering: Toward RSE Research,” Commun. ACM, vol. 68, no. 2, pp. 20–23, Feb. 2025. doi: 10.1145/3685265.
Research computing centers need flexible accounting systems to track consumption of compute and storage resources under a cost model that allows researchers (or their groups) to purchase capacity for their workloads [1]. An effective system must: (1) bind funding sources to the Principal Investigators (PIs) who control them, (2) link those funds to specific resource allocations, and (3) retain a complete transaction history for auditing and reporting. Existing tools—such as Moab Accounting Management and the native accounting features in SLURM—do not fully capture the complex organizational structures, funding flows, and policy constraints typical of academic research environments. To address these gaps, the Partnership for an Advanced Computing Environment (PACE) at Georgia Tech is developing the Research Computing Accounting Management System (RCAMS), an open‑source Python platform engineered from first principles with contemporary software design best practices [Figure 1].
RCAMS features a maintainable, three-layer architecture, shown in [Figure 2]:
1. Domain Layer: Uses SQLAlchemy ORM to represent core entities (accounts, funds, resources, organizations, owners) with built‑in data integrity checks and a full audit‑logging subsystem that records every change to the database.
2. Operations Layer: Implements generic CRUD operations via a flat inheritance model centered on _OperationBase[T], where T is the domain entity [2]; see the sketch after this list. This approach avoids deep class hierarchies and simplifies maintenance.
3. CLI Front‑End: Inspired by Git’s user experience, the rcams command exposes subcommands for each entity (e.g., rcams account add-storage, and rcams fund get-spending-history), ensuring discoverability and consistency.
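As a rough illustration of the Operations Layer pattern (only _OperationBase[T] is taken from the description above; all other names are ours, not the RCAMS API), a generic base class might look like:

```python
# Hedged sketch of a flat, generic operations layer over SQLAlchemy.
from typing import Generic, Optional, Type, TypeVar

from sqlalchemy.orm import Session

T = TypeVar("T")  # a domain entity: Account, Fund, Resource, ...

class _OperationBase(Generic[T]):
    """Generic CRUD shared by all entities; each entity gets one shallow subclass."""

    def __init__(self, model: Type[T], session: Session) -> None:
        self.model = model
        self.session = session

    def add(self, **fields) -> T:
        obj = self.model(**fields)
        self.session.add(obj)
        self.session.commit()  # an audit-logging subsystem would hook in here
        return obj

    def get(self, pk) -> Optional[T]:
        return self.session.get(self.model, pk)

class AccountOperations(_OperationBase["Account"]):
    """Entity-specific helpers live here; CRUD is inherited, keeping the hierarchy flat."""
```

One shallow subclass per entity keeps each module independent, which is what makes the per-entity compartmentalization described below possible.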
To optimize performance—especially when auditing hundreds of thousands of records—RCAMS isolates each entity’s operations and CLI handlers into separate modules. This compartmentalization reduces import overhead and facilitates parallel development by multiple Research Software Engineers (RSEs) with minimal merge conflicts.
Quality is enforced through a comprehensive CI/CD pipeline in GitHub Actions: linting with Ruff [3], static-typing validation with mypy [4], over 95% test coverage via pytest [5], automated Sphinx [6] documentation deployment to GitHub Pages, and seamless issue-tracking updates with a GitHub Projects Kanban board. Together, these practices ensure that every release is reliable, well-documented, and immediately deployable.
RCAMS is well documented, with contributor guidelines that enable efficient onboarding of new team members. Its modular design increases development parallelism and greatly reduces merge conflicts; well-defined maintainer and developer roles ensure code quality; and an internal working group forms a governance unit that decides on feature changes and policy enforcement.
By combining modular design, performance-optimized architecture, and fully automated development workflows, RCAMS empowers institutions to tailor their accounting processes, enhance transparency, and streamline resource management.
Bayesian methods provide a principled framework for modeling uncertainty, yet they remain underutilized across much of research computing due to perceived mathematical or tooling complexity. This poster introduces a reproducible and accessible pathway into Bayesian modeling using Stan and Python, designed especially for researchers, data scientists, and RSEs who may be unfamiliar with probabilistic methods. The project centers on a hands-on, interactive Google Colab notebook that introduces Bayesian regression using the `cmdstanpy` interface to Stan.
The notebook includes hierarchical modeling examples, diagnostic tools (e.g., R-hat, ESS, trace plots), and comparative outputs between Bayesian and frequentist approaches. Participants can modify priors, likelihoods, and data inputs to observe how uncertainty propagates through their models. This work was originally developed as a workshop to support researchers aiming to improve the statistical rigor and interpretability of their analyses. The notebook minimizes setup overhead, requiring only a Google account and a browser, and is ideal for self-paced learning or group training settings. By showcasing the notebook structure, example visualizations, and participant feedback, this poster highlights how well-designed tooling and pedagogy can lower barriers to entry for Bayesian inference in RSE contexts. This approach supports the broader mission of the US-RSE community by promoting statistical literacy, reproducibility, and the adoption of robust modeling frameworks in everyday research workflows.
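To give a flavor of the notebook's workflow, the following minimal sketch (assuming a working CmdStan installation via `cmdstanpy`) fits a plain Bayesian linear regression rather than the notebook's exact hierarchical example:

```python
# Minimal cmdstanpy example: simulate data, fit a Bayesian regression,
# and print diagnostics (R-hat and ESS appear in the summary table).
import numpy as np
from cmdstanpy import CmdStanModel

stan_code = """
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  alpha ~ normal(0, 5);   // weakly informative priors: try changing these
  beta ~ normal(0, 5);
  y ~ normal(alpha + beta * x, sigma);
}
"""
with open("regression.stan", "w") as f:
    f.write(stan_code)

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)

model = CmdStanModel(stan_file="regression.stan")  # compiles on first use
fit = model.sample(data={"N": len(x), "x": x, "y": y})
print(fit.summary())  # posterior means, R-hat, and effective sample sizes
```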
Modern neuroscience, among other fields, produces increasing amounts of raw data, e.g. from multi-channel electrophysiology or functional imaging. These data often originate from individual experimental sessions, each resulting in one dataset that is self-contained in the sense that it includes all the information for subsequent processing. Thus, these datasets are independent except on the conceptual level imposed by the experiment. Processing in this context refers to each step in the analysis pipeline, automatic or manual, which can be performed on datasets prior to pooling or cross-referencing them. Examples include spike detection and spike sorting, or extracting downsampled local field potentials, in electrophysiology; in functional imaging, regions of interest are defined and fluorescence signals saved as simple time series. In many cases, the extracted relevant data is much smaller than what was originally recorded. It is therefore desirable to keep only the smaller, processed datasets for final analysis on a local computer, while the original raw data and intermediate datasets are stored on suitable server infrastructure. Maintaining the integrity of datasets throughout this process, including the relationships across processing phases, is crucial for the reproducibility of analysis workflows and becomes increasingly challenging as data volumes grow. This is especially important for labs that do not have massive infrastructure or people managing it, and instead rely on whatever their institution provides, which could be as basic as a network file share.
We propose a convention-driven framework that leverages well-established software tools, namely Git [1] and Git LFS [2] (Large File Storage), to solve the outlined challenges. The main idea is simple: datasets in uniquely named directories are initially added on separate Git branches, following a customizable naming scheme, to allow independent retrieval and processing of each dataset. Git's history, with a common, empty (no files) ancestor commit, allows pooling of the fully processed datasets using simple merge strategies, while maintaining a traceable record of each dataset's history. Leveraging Git LFS for storage and transfer of large files ensures binary data integrity and prevents several scenarios of accidental data loss. Furthermore, commands like “git lfs prune” make it effortless to free up storage space on local machines without having to double-check which data was already transferred successfully, avoiding many human and organizational errors. We are also creating a new command-line tool, "didg" (still under development), to facilitate the use of this framework, especially for users less familiar with Git. We aim to make didg a helpful and easy-to-use companion for neuroscience labs to manage processing flow and data integrity in a world of increasing data volumes.
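Since didg is still under development, the sketch below only hints at what such a companion might automate on top of plain Git; the branch naming scheme and the "root" label for the empty ancestor commit are illustrative conventions, not a fixed interface.

```python
# Hypothetical automation of the convention described above; not the actual
# didg interface. Assumes a repository whose empty ancestor commit is
# reachable via a ref named "root".
import subprocess

def run(*args: str) -> None:
    """Run a git command, failing loudly on errors."""
    subprocess.run(args, check=True)

def add_dataset(dataset_id: str, directory: str) -> None:
    """Put one dataset on its own branch, rooted at the common empty commit."""
    run("git", "checkout", "-b", f"dataset/{dataset_id}", "root")
    run("git", "lfs", "track", f"{directory}/**")  # large binaries via Git LFS
    run("git", "add", ".gitattributes", directory)
    run("git", "commit", "-m", f"Add dataset {dataset_id}")

def pool_datasets(branches: list[str]) -> None:
    """Merge fully processed dataset branches into one pooled analysis branch."""
    run("git", "checkout", "-b", "pooled", "root")
    for branch in branches:
        run("git", "merge", "--no-ff", branch)  # disjoint directories merge cleanly
```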
One of the main motivations for this kind of convention-driven framework is independence from additional tools (beyond Git, which is already in widespread use). This has allowed us to freely combine and leverage existing data-processing tools rather than replacing them, whether commercial software or custom-written scripts, and even to integrate manual and semi-manual steps within the same traceable analysis pipeline, all while keeping local hardware and other requirements to a minimum.
In today’s data-driven research landscape, ensuring that datasets are discoverable, reproducible, and properly attributed is essential. Assigning a Digital Object Identifier (DOI) achieves this by providing each dataset with a unique and persistent identifier, empowering researchers to cite and link data with confidence and amplifying the impact of scientific research.
The Environmental System Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) data repository stores diverse Earth and environmental science data generated by projects funded by the U.S. Department of Energy. To ensure transparency and enable better data citation practices, the ESS-DIVE Digital Object Identifier (DOI) management tools allow users to reserve a DOI before a dataset is publicly released and automatically keep those DOI records in line with user-submitted metadata throughout the dataset review process, upon publication, and beyond. This ensures that researchers can cite their data in papers under review, even if the corresponding datasets have not yet been made publicly accessible. ESS-DIVE’s DOI tools leverage the DOE's OSTI Elink service to manage DOI records with DataCite. We recently performed a major upgrade of these tools to integrate with the latest version of OSTI Elink 2.0, which overhauled their existing infrastructure with modern web standards. This was a major, collaborative effort at modernizing the software infrastructure of ESS-DIVE in step with OSTI. The upgrade of the ESS-DIVE DOI management tools also contributed to advancing the underlying Metacat platform, which is used by many other DataONE projects as well, thus positively impacting the larger research community.
This modernization effort demonstrates how RSEs can successfully navigate complex collaborations while maintaining service continuity. Through this process, we enhanced ESS-DIVE's capabilities while contributing to the broader research infrastructure through Metacat platform improvements, showcasing the multiplier effect of thoughtful RSE work across the scientific community.
Efficient discovery of relevant data for scientific applications is a major challenge for researchers. The challenge is particularly compounded for general-purpose data repositories that store heterogeneous datasets with diverse metadata for different data types. The Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE) is a U.S. Department of Energy repository that archives Earth sciences data. The repository recommends archiving data in specific data/metadata community standards to improve data discovery and reuse. While the repository search enables discovering data based on textual metadata provided by users, there was a need to discover data based on file contents, given the vast number of files archived within the datasets.
To address this issue, we developed a fusion database (FusionDB) that enables a deeper search of the metadata within the files of each dataset. FusionDB validates datasets against community standards and indexes the actual parameters within each file, such as location coordinates, time ranges, and measurement types, making them searchable and filterable.
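As an illustration of the indexing idea (not FusionDB's actual implementation), extracting searchable parameters from a CSV file that follows a community standard might look like this; the column names here are assumptions about that standard:

```python
# Illustrative content indexing for one tabular file; the resulting record
# would be stored in the fusion database and exposed to search filters.
import pandas as pd

def index_csv(path: str) -> dict:
    df = pd.read_csv(path)
    record = {"file": path, "variables": sorted(df.columns)}
    if {"latitude", "longitude"} <= set(df.columns):  # spatial coverage
        record["bbox"] = [df["longitude"].min(), df["latitude"].min(),
                          df["longitude"].max(), df["latitude"].max()]
    if "timestamp" in df.columns:  # temporal coverage
        times = pd.to_datetime(df["timestamp"])
        record["time_range"] = [times.min().isoformat(), times.max().isoformat()]
    return record
```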
The core engineering insight was recognizing that metadata tells you what researchers think their data contains, but automated content analysis reveals what it actually contains. The results demonstrate the power of thoughtful automation: review processes have been significantly expedited, data quality improved through automated feedback, and researchers can now preview and filter datasets based on the file contents before download. This approach will create a positive feedback loop where better tooling encourages better data practices, with the potential to raise standards across the entire community.
For fellow RSEs, FusionDB demonstrates how solving one community's pain point can catalyze systemic change. By automating the tedious work, we didn't just save time, we enabled researchers to focus on science while positioning the community for improved data quality practices. The lesson: sometimes the most impactful software engineering isn't about building new capabilities, but about removing friction from existing workflows at exactly the right point in the process.
Eye-trackers have long been used within cognitive neuroscience to better understand human development, behavior, and clinical conditions. They are also increasingly being used in commercial applications aimed at accessibility and virtual/augmented reality.
Despite Python’s prominence in scientific computing generally, and neuroscience specifically, the scientific community has not coalesced around a Python library dedicated to eye-tracking research. We present our work towards integrating eye-tracking support into MNE-Python, the most popular Python library for human neurophysiological data analyses.
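A minimal sketch of what this integration enables (assuming MNE-Python ≥ 1.3 and an EyeLink .asc recording; the file name is a placeholder):

```python
# Load an eye-tracking recording and browse it with standard MNE tooling.
import mne

raw = mne.io.read_raw_eyelink("sub-01_task-reading_eyetrack.asc")
raw.pick(["eyegaze", "pupil"])  # gaze-position and pupil-size channel types
raw.plot(scalings="auto")       # inspect traces like any other MNE Raw object
```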
Open source software (OSS) for scientific applications is becoming increasingly ubiquitous. This has led to major strides and breakthroughs in research through the sharing of computational tools and code, further promoting software openness in scientific research.
However, many repositories lack strong software practices, leading to limited contributions from the community of users and issues with overall repository maintenance. These limitations typically arise because the researchers developing open source computational software are often not formally trained in good software engineering practices. Some existing open source frameworks, like RepoScaffolder, focus on teaching researchers about the OSS lifecycle as well as the required files and best practices that make a project easier to understand for both users and contributors.
While training of researchers in how to effectively create and maintain an open source repository is always important, there may also be compliance reasons to audit and address issues with repositories associated with an institution or corporation. As an example, a particular organization might want to check that all public repositories representing the organization have the same license and basic files like READMEs and CONTRIBUTING pages.
For these reasons, we propose RepoAuditor, a software tool which analyzes and audits code repositories for OSS and software engineering best practices. Teams with limited software engineering skill sets can use RepoAuditor to identify and adopt good open-source software practices for their code repositories, and they can extend or scope the tool to perform compliance checks across many organizational repositories.
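To make the idea concrete without presuming RepoAuditor's actual API, a compliance check of the kind described above reduces to something like:

```python
# Hedged illustration of an org-wide repository audit; RepoAuditor's real
# checks and interface are described on the poster, not here.
from pathlib import Path

REQUIRED = ["README.md", "LICENSE", "CONTRIBUTING.md", "CODE_OF_CONDUCT.md"]

def audit_repo(repo_path: str) -> dict:
    """Return a pass/fail map for each required community-health file."""
    root = Path(repo_path)
    return {name: (root / name).is_file() for name in REQUIRED}

for name, present in audit_repo(".").items():
    print(f"{'PASS' if present else 'FAIL'}: {name}")
```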
This poster goes into detail on the design and flexibility of the RepoAuditor framework as well as how RSEs can utilize this tool for their own projects and institutional requirements. Additionally, we will share results from user feedback studies as well as their impacts on the design and evolution of the RepoAuditor framework.
Open source software relies on the community of volunteer developers that contribute to and maintain the project. For research software, these developers are often the researchers who actively use the software for scientific work. What motivates researchers to contribute to the research software that they use? What are the barriers that researchers face when contributing to research software? As project maintainers for jsPsych, a widely-used open source library for building behavioral experiments, we wanted to understand how researchers who used our software thought about contributing to it.
jsPsych features a hyper-modular design that allows users to add new functionality as plugins, without needing to contribute to a central repository. An innovation in one lab can be easily published and shared with other labs, and yet we have found that despite a quickly growing userbase, the number of contributors is growing at a much slower pace. To understand why, we conducted qualitative interviews to uncover users’ relationships to open source contribution. We discerned common themes in contributors’ motivations, as well as barriers articulated by those who had made modifications without contributing them.
We conducted 63 interviews with active users of the software. Participants were selected from cited uses of jsPsych and ranged from university faculty to PhD candidates and postgraduate research assistants. Only some had formal training as software engineers; the majority were novice programmers who had hacked together research solutions with the framework. Though this exploratory sample could not capture the breadth of user innovations, it became an effective base from which to launch sustained engagement with users-turned-developers. Our interviews led to the following conclusions. First, while over half of the users interviewed had made modifications, less than a quarter had ever contributed to the repository ecosystem. Second, a majority of non-contributing modders had held back due to technical insecurities or unfamiliarity with code contribution standards; a little under a third were unfamiliar with contribution repositories, and an equal proportion asserted that their code was out of date and incompatible with current versions of the framework. Third, of the few contributors interviewed, the vast majority contributed in good faith and out of gratitude; only half did so out of belief specifically in open-source principles, and only one cited peer recognition as an additional motivator.
Based on these initial conclusions, we suggest that open source project leads adopt more proactive approaches to bringing fresh contributors into the fold. Documentation should be restructured to guide users toward contribution as a readily achievable goal. We likewise recommend that project maintainers open other accessible channels of communication with their communities, so that developers can showcase their contributions and demonstrate participation in open source development to their peers.
Monolithic software architectures are increasingly being replaced by microservices architectures due to their modularity, scalability, and flexibility. While monolithic systems package all components into a single deployable unit, microservices consist of independently deployable components, each responsible for a specific functionality. Despite the growing popularity of microservices, many organizations adopt architectural styles based on trends or anecdotal success stories, often without a comprehensive understanding of their benefits and trade-offs. This research aims to provide empirical data comparing the performance, scalability, and resource efficiency of monolithic and microservices architectures under varying levels of load. The evaluation utilized the Spring Petclinic application, deployed in both monolithic and microservices configurations.
A series of controlled test cases were executed on a local machine, and performance metrics were gathered using Apache JMeter. The results indicate that the monolithic architecture outperformed the microservices configuration in terms of throughput, latency, and resource utilization under most load conditions. While microservices demonstrated lower error rates at minimal load levels, they exhibited significantly higher error rates under peak load, primarily due to service failures resulting from limited CPU and memory resources. These findings suggest that microservices architectures, although promising in scalability, require more robust infrastructure and careful orchestration to perform effectively at scale. This research contributes to the ongoing discourse on software architecture by providing quantifiable insights into when and how to adopt specific architectural paradigms based on system requirements and available resources.
Electroencephalography (EEG) is a cornerstone neuroscience technique, used in humans to diagnose and localize epilepsy and, in animal models of neurological disease, to validate models and generate mechanistic understanding. A number of free and paid software packages have been developed to analyze rodent EEG recordings; however, there is no interoperability between packages. This fragmentation of analysis tools and techniques is a challenge to researchers. The ability to unify analysis pipelines will allow collaboration and comparison across laboratories and animal models, which will in turn advance research into neurological disorders. We developed PyEEG (working title), a Python-based EEG analysis package designed for rodent EEG. As a free and open-source tool, PyEEG focuses on modularity, interoperability, and scalability of EEG analysis pipelines. A generalized framework for feature calculation is implemented so that contributors may easily extend the list of features. The package utilizes the SpikeInterface [1] and MNE [2] packages for data import and export, supporting a wide variety of file formats with syntax familiar to neuroscientists.
Development of PyEEG adopts the continuous integration practice of code development, with self-contained branches tested before merging. EEG datasets can grow to several terabytes, which introduces complications with data loading, memory usage, and compute time. Our initial implementation of serial, in-memory analysis failed to scale to the whole dataset, so we addressed the problem by integrating dataset caching into the pipeline and using Dask to parallelize feature computation on a high-performance computing cluster. Intermediate computations are saved to avoid rerunning the whole pipeline due to trivial errors.
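A hedged sketch of this strategy (function names are placeholders, not PyEEG's API): each time-binned window becomes an independent delayed task, so feature extraction scales out across cluster workers.

```python
# Parallel per-window feature extraction with Dask; illustrative only.
import dask
import numpy as np
from dask.distributed import Client

def compute_features(window: np.ndarray) -> dict:
    # Stand-in for PyEEG's feature set (RMS, band power, ...).
    return {"rms": float(np.sqrt((window ** 2).mean()))}

rng = np.random.default_rng(0)
windows = [rng.normal(size=2500) for _ in range(100)]  # time-binned EEG windows

client = Client()  # in production, connect to the HPC cluster scheduler
tasks = [dask.delayed(compute_features)(w) for w in windows]
features = dask.compute(*tasks)  # intermediate results can be cached to disk
```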
The design of the package was guided directly by questions we and our collaborators want to answer with EEG analysis. This produced a tool that was immediately useful, sufficiently fast, and tested on experimental data. The heterogeneity of EEG data formats prompted us to use a modular package structure agnostic to file types. Visualizing the dataset with plotting functions was a particular early need, so that we could periodically “sanity-check” pipeline calculations with collaborators and benchmark progress.
PyEEG uses two parallel, configurable pipelines to analyze EEG traces: Window Analysis, which extracts features on time-binned windows, and Spike Analysis, which implements electrographic spike detection. Artifacts are rejected based on outlier root-mean-square (RMS) amplitude events, high RMS amplitude, high beta-band power, or high local outlier factor; thresholds were selected by comparison with biological limits and visual inspection of filtered data. Features from both pipelines feed into plotting utilities for visualization at the experiment or animal level. We validate this toolbox on mouse intracranial EEG datasets collected from several models of epilepsy. Our analysis demonstrates the value of this tool for unifying pipelines in an ever-evolving field, along with the lessons that come with its development. The code is available at https://github.com/josephdong1000/PyEEG. Code documentation is a work in progress at the time of this writing and is available at https://josephdong1000.github.io/PyEEG/.
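As one concrete example of the rejection criteria above (the threshold value is a placeholder, chosen as the text describes by comparison with biological limits and visual inspection):

```python
# Mark time-binned windows whose RMS amplitude exceeds a threshold.
import numpy as np

def reject_high_rms(windows: np.ndarray, threshold_uv: float = 400.0) -> np.ndarray:
    """windows: (n_windows, n_samples) array in microvolts; returns artifact mask."""
    rms = np.sqrt((windows ** 2).mean(axis=1))
    return rms > threshold_uv  # True = artifact, excluded from both pipelines
```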
1. Buccino AP, Hurwitz CL, Garcia S, Magland J, Siegle JH, Hurwitz R, Hennig MH. SpikeInterface, a unified framework for spike sorting. Elife. 2020 Nov 10;9:e61834. doi: 10.7554/eLife.61834. PMID: 33170122; PMCID: PMC7704107.
2. Gramfort A, Luessi M, Larson E, Engemann DA, Strohmeier D, Brodbeck C, Goj R, Jas M, Brooks T, Parkkonen L, Hämäläinen MS. MEG and EEG data analysis with MNE-Python. Frontiers in Neuroscience. 2013;7(267):1–13. doi: 10.3389/fnins.2013.00267.
Testing legacy codebases often poses considerable challenges, including a scarcity of tests, monolithic architecture, inadequate documentation, and a lack of standardized development practices. These obstacles typically arise from the absence of formal testing, complicating efforts to maintain and update the code for compatibility with newer hardware and software. Consequently, writing unit tests often necessitates refactoring the code, which involves restructuring it without compromising its original functionality. However, this raises a critical question: how can one refactor without having tests in place to verify functionality?
This poster will explore the incorporation of better software practices, with a particular emphasis on testing, to facilitate the modernization of legacy codebases. By adopting a structured approach to testing, we aim to enhance the reliability and maintainability of the code while preserving the current functionality and stability of software that produces science-ready data. Discussions will focus on practical strategies for implementing testing frameworks, establishing a culture of continuous integration, and ensuring that updates do not disrupt the production of critical scientific outputs. Through these efforts, we can create a more robust and adaptable codebase that meets the demands of modern scientific research.
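One standard answer to the chicken-and-egg question above (a general technique, not necessarily the exact approach on this poster) is characterization or "golden master" testing: capture the legacy code's current outputs and pin them down before any refactoring begins. A minimal pytest sketch, with legacy_pipeline and the data files as assumed stand-ins:

```python
# Characterization test: lock in current behavior so refactoring can proceed
# safely; any deviation from the recorded output fails the test.
import json

import numpy as np
from mypackage import legacy_pipeline  # hypothetical untested legacy entry point

def test_pipeline_matches_golden_master():
    result = legacy_pipeline(input_file="tests/data/known_input.fits")
    with open("tests/data/golden_output.json") as f:
        expected = json.load(f)
    np.testing.assert_allclose(result, expected["values"], rtol=1e-9)
```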
In astronomy, researchers are increasingly turning to machine learning pipelines to process ever-growing datasets. These pipelines often require GPUs and stacks of interdependent software, making them challenging to manage and maintain across research groups and operating systems. In this poster, we present an example machine learning pipeline for galaxy redshift prediction using a dual-branch convolutional neural network (CNN) that combines multi-band galaxy images with brightness measurements (magnitudes) as inputs. The model is implemented in TensorFlow and features automated experiment tracking through MLflow, which provides environment versioning, model checkpointing, training histories, visualization plots, and model metadata. We use the GalaxiesML dataset, comprising approximately 300,000 galaxies. A Jupyter notebook offers step-by-step instructions for running the pipeline on Windows and Linux platforms (Apple Metal is not supported due to TensorFlow backend limitations).
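A hedged sketch of such a dual-branch network in TensorFlow (layer sizes, stamp dimensions, and band count are illustrative, not the exact architecture presented here):

```python
# Dual-branch CNN: convolutional branch for multi-band images, dense branch
# for magnitudes, fused for redshift regression; MLflow records the run.
import mlflow
import tensorflow as tf
from tensorflow.keras import layers

image_in = tf.keras.Input(shape=(64, 64, 5), name="images")  # 5 photometric bands
x = layers.Conv2D(32, 3, activation="relu")(image_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

mag_in = tf.keras.Input(shape=(5,), name="magnitudes")  # one magnitude per band
m = layers.Dense(32, activation="relu")(mag_in)

merged = layers.concatenate([x, m])
hidden = layers.Dense(64, activation="relu")(merged)
z_out = layers.Dense(1, name="redshift")(hidden)

model = tf.keras.Model([image_in, mag_in], z_out)
model.compile(optimizer="adam", loss="mse")

mlflow.tensorflow.autolog()  # logs hyperparameters, metrics, and checkpoints
```

Feeding both inputs from a streaming data pipeline, rather than loading arrays into memory, is what makes training on datasets larger than GPU memory practical.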
The code and datasets are openly available on GitHub and designed with novice users in mind. Our example also demonstrates how to train with datasets larger than available GPU memory, addressing a common limitation in CNN tutorials. The modular design allows researchers to modify network architectures, hyperparameters, and training configurations while maintaining a complete record of results in a reproducible local environment. We will present results, visualizations, and practical guidance for deploying scalable machine learning tools in big data astronomy. The poster will also address platform-specific challenges and best practices for getting the pipeline running across different environments. Our goal is to share practical strategies for tracking experiments, model management, and reproducible workflows for research software engineers (RSEs) and domain scientists working with machine learning pipelines.