Research software once was a heroic and lonely activity, particularly in research computing and in HPC. But today, research software is a social activity, in the senses that most software depends on other software, and that most software that is intended to be used by more than one person is written by more than one person. These social factors have led to generally accepted practices for code development and maintenance and for interactions around code. This paper examines how these practices form, become accepted, and later change in different contexts. In addition, given that research software engineering (RSEng) and research software engineers (RSEs) are becoming accepted parts of the research software endeavor, it looks at the role of RSEs in creating, adapting, and infusing these practices. It does so by examining aspects around practices at three levels: in communities, projects, and groups. Because RSEs are often the point where new practices become accepted and then disseminated, this paper suggests that tool and practice developers should be working to get RSE champions to adopt their tools and practices, and that people who seek to understand research software practices should be studying RSEs. It also suggests areas for further research to test this idea.
Artificial intelligence (AI) and machine learning (ML) have been shown to be increasingly helpful tools in a growing number of use-cases relevant to scientific research, despite significant software-related obstacles. There exist large technical costs to setting up, using, and maintaining AI/ML models in production. This often prevents researchers from utilizing these models in their work. The growing field of machine learning operations (MLOps) aims to automate much of the AI/ML life cycle while increasing access to these models. This paper presents the initial work in creating a nuclear energy MLOps platform for use by researchers at Idaho National Laboratory (INL) and aims to reduce the barriers of using AI/ML in scientific research. Our goal is to promote the integration of the latest AI/ML technologies into researchers' workflows and create more opportunity for scientific innovation. In this paper we discuss how our MLOps efforts aim to increase usage and the impact of AI/ML models created by researchers. We also present several use-cases that are currently integrated. Finally, we evaluate the maturity of our project as well as our plans for future functionality.
Software sustainability is critical for Computational Science and Engineering (CSE) software. Highly complex code makes software more difficult to maintain and less sustainable. Code reviews are a valuable part of the software development lifecycle and can be executed in a way to manage complexity and promote sustainability. To guide the code review process, we have developed a technique that considers cyclomatic complexity levels and changes during code reviews. Using real-world examples, this paper provides analysis of metrics gathered via GitHub Actions for several pull requests and demonstrates the application of this approach in support of software maintainability and sustainability.
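As a rough illustration of the idea in the abstract above, the following Python sketch approximates cyclomatic complexity per function and flags changes a reviewer might want to look at. It is not the authors' tooling or their GitHub Actions workflow. The counting rules are a simplified McCabe count, and the function names and the threshold of 10 are assumptions made for this example only.

```python
import ast

# Node types that each add one decision point (a simplified McCabe count).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.AsyncFor,
                 ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> dict:
    """Return an approximate cyclomatic complexity for each function in `source`.

    Nested functions are also counted toward their enclosing function.
    """
    results = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1  # one path through a function with no branches
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # `a and b and c` adds two extra short-circuit paths
                    score += len(child.values) - 1
                elif isinstance(child, ast.comprehension):
                    # one for the loop, one for each `if` filter
                    score += 1 + len(child.ifs)
            results[node.name] = score
    return results


def review_flags(old_source: str, new_source: str, threshold: int = 10) -> list:
    """Compare two versions of a file and list functions worth a closer review.

    `threshold` is a hypothetical cutoff chosen for illustration.
    """
    old = cyclomatic_complexity(old_source)
    new = cyclomatic_complexity(new_source)
    flags = []
    for name, score in sorted(new.items()):
        before = old.get(name)
        if score > threshold:
            flags.append(f"{name}: complexity {score} exceeds threshold {threshold}")
        elif before is not None and score > before:
            flags.append(f"{name}: complexity rose from {before} to {score}")
    return flags
```

In a pull-request check, `old_source` and `new_source` would be the base and head versions of a changed file, and the returned messages could be posted as review comments.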
Increasingly, scientific research teams desire to incorporate machine learning into their existing computational workflows. Codebases must be augmented, and datasets must be prepared for domain-specific machine learning processes. Team members involved in software development and data maintenance, particularly research software engineers, can foster the design, implementation, and maintenance of infrastructures that allow for new methodologies in the pursuit of discovery. In this paper, we highlight some of the main challenges and offer assistance in planning and implementing machine learning projects for science.
Sandia National Laboratories is a premier United States national security laboratory which develops science-based technologies in areas such as nuclear deterrence, energy production, and climate change. Computing plays a key role in its diverse missions, and within that environment, Research Software Engineers (RSEs) and other scientific software developers utilize testing automation to ensure quality and maintainability of their work. We conducted a Participatory Action Research study to explore the challenges and strategies for testing automation through the lens of academic literature. Through the experiences collected and comparison with open literature, we identify these challenges in testing automation and then present strategies for mitigation grounded in evidence-based practice and experience reports that other, similar institutions can assess for their automation needs.
Visual Studio Code (VSCode) has emerged as one of the most popular development tools among professional developers and programmers, offering a versatile and powerful coding environment. However, configuring and setting up VSCode to work effectively within the unique environment of a shared High-Performance Computing (HPC) cluster remains a challenge. This paper discusses the configuration and integration of VSCode with the diverse and demanding environments typically found on HPC clusters. We demonstrate how to configure and set up VSCode to take full advantage of its capabilities while ensuring seamless integration with HPC-specific resources and tools. Our objective is to enable developers to harness the power of VSCode for HPC applications, resulting in improved productivity, better code quality, and accelerated scientific discovery.
We provide an overview of the software engineering efforts and their impact in QMCPACK, a production-level ab-initio Quantum Monte Carlo open-source code targeting high-performance computing (HPC) systems. Aspects included are: (i) strategic expansion of continuous integration (CI) targeting CPUs, using GitHub Actions runners, and NVIDIA and AMD GPUs in pre-exascale systems, using self-hosted hardware; (ii) incremental reduction of memory leaks using sanitizers; (iii) incorporation of Docker containers for CI and reproducibility; and (iv) refactoring efforts to improve maintainability, testing coverage, and memory lifetime management. We quantify the value of these improvements by providing metrics to illustrate the shift towards a predictive, rather than reactive, sustainable maintenance approach. Our goal, in documenting the impact of these efforts on QMCPACK, is to contribute to the body of knowledge on the importance of research software engineering (RSE) for the sustainability of community HPC codes and scientific discovery at scale.
Evidence-based practice (EBP) in software engineering aims to improve decision-making in software development by complementing practitioners' professional judgment with high-quality evidence from research. We believe the use of EBP techniques may be helpful for research software engineers (RSEs) in their work to bring software engineering best practices to scientific software development. In this study, we present an experience report on the use of a particular EBP technique, rapid reviews, within an RSE team at Sandia National Laboratories, and present practical recommendations for how to address barriers to EBP adoption within the RSE community.
As social media platforms continue to shape modern communication, understanding and harnessing the wealth of information they offer has become increasingly crucial. The nature of the data that these platforms provide makes them an emerging resource for data collection, supporting research that ranges from measuring public sentiment about a particular societal trend to the drafting of major policy by governing agencies. This paper presents PULSE, a powerful tool developed by Decision Theater™ (DT) at Arizona State University in the United States, designed to extract valuable insights from Twitter. PULSE provides researchers and organizations with access to a curated dataset of public opinions and discussions across diverse research areas. Further, the tool uses various machine learning and data analytical algorithms to derive insights on the subject under research. These insights are displayed in an interactive dashboard that helps researchers draw appropriate conclusions. The paper also illustrates the technical functionalities and visualization capabilities of the tool with a case study on Hurricane Laura.
This paper introduces CACAO, a research software platform that simplifies the use of cloud computing in scientific research and education. CACAO's cloud automation and continuous analysis features make it easy to deploy research workflows or laboratory experimental sessions in the cloud using templates. The platform has been expanding its support for different cloud service providers and improving scalability, making it more widely applicable for research and education. This paper provides an overview of CACAO's key features and highlights use cases.
Science gateways connect researchers to high-performance computing (HPC) resources by providing a graphical interface to manage data and submit jobs. Scientific research is a highly collaborative activity, and gateways can play an important role by providing shared data management tools that reduce the need for advanced expertise in file system administration. We describe a recently implemented architecture for collaborative file management in the Core science gateway architecture developed at the Texas Advanced Computing Center (TACC). Our implementation is built on the Tapis Systems API, which provides endpoints that enable users to securely manage access to their research data.
High-fidelity pattern of life (PoL) models require realistic origin points for predictive trip modeling. This paper develops and demonstrates a reproducible method using open data to match synthetic populations generated from census surveys to plausible residential locations (building footprints) based on housing attributes. This approach shows promise over extant methods based on housing density, particularly in small neighborhood areas with heterogeneous land use.
Community resilience assessment is critical for the anticipation, prevention and mitigation of natural and anthropic disaster impacts. In the digital age, this requires reliable and flexible cyberinfrastructure capable of supporting research and decision processes along multiple simultaneous, interconnected concerns. To address this need, the National Center for Supercomputing Applications (NCSA) developed the Interdependent Networked Community Resilience Modeling Environment (IN-CORE) as part of the NIST-funded Center of Excellence for Risk-Based Community Resilience Planning (CoE), headquartered at Colorado State University. The Community App is a web-based application that takes a community through the resilience planning process, using IN-CORE analyses as the measurement science for quantifying community resilience. Complex workflows are managed by DataWolf, a scientific workflow management system, running jobs on the IN-CORE platform utilizing the underlying Kubernetes cluster resources. Using the Community App, users can run realistic and complex scenarios and visualize the results to understand their resilience to different hazards and enhance their decision-making capabilities.
This paper examines the potential of containerization technology, specifically the CyVerse Discovery Environment (DE), as a solution to the reproducibility crisis in research. The DE is a platform service designed to facilitate data-driven discoveries through reproducible analyses. It offers features like data management, app integration, and app execution. The DE is built on a suite of microservices deployed within a Kubernetes cluster, handling different aspects of the system, from app management to data storage. Reproducibility is ensured by maintaining records of the software, its dependencies, and instructions for its execution. The DE also provides a RESTful web interface, the Terrain API, for creating custom user interfaces. The application of the DE is illustrated through a use case involving the University of Arizona Superfund Research Center, where the DE's data storage capabilities were utilized to manage data and processes. The paper concludes that the DE facilitates efficient and reproducible research, eliminating the need for substantial hardware investment and complex data management, thereby propelling scientific progress.
The INTERSECT Software framework project aims to create an open federated library that connects, coordinates, and controls systems in the scientific domain. It features the Adapter, a flexible and extensible interface inspired by the Adapter design pattern in object-oriented programming. By utilizing Adapters, the INTERSECT SDK enables effective communication and coordination within a diverse ecosystem of systems. This adaptability facilitates the execution of complex operations within the framework, promoting collaboration and efficient workflow management in scientific research. Additionally, the generalizability of Adapters and their patterns enhances their utility in other scientific software projects and challenges.
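The Adapter design pattern mentioned in the abstract above can be sketched in a few lines of Python. This is a generic illustration of the pattern, not the INTERSECT SDK API. Every class, method, and task field here (`SystemAdapter`, `LegacyMicroscope`, `BatchCluster`, `run_campaign`, `exposure_ms`, `script`) is hypothetical and exists only for this example.

```python
from abc import ABC, abstractmethod


class SystemAdapter(ABC):
    """Hypothetical uniform interface that a coordinator talks to."""

    @abstractmethod
    def status(self) -> str:
        """Return a short human-readable state string."""

    @abstractmethod
    def submit(self, task: dict) -> str:
        """Run or queue a task and return a result or job identifier."""


class LegacyMicroscope:
    """Stand-in for an instrument with its own idiosyncratic API."""

    def __init__(self):
        self.busy = False

    def acquire_image(self, exposure_ms: int) -> str:
        return f"image(exposure={exposure_ms}ms)"


class BatchCluster:
    """Stand-in for a compute resource with a different, incompatible API."""

    def __init__(self):
        self.queue = []

    def enqueue(self, script: str) -> int:
        self.queue.append(script)
        return len(self.queue)


class MicroscopeAdapter(SystemAdapter):
    """Translates the uniform interface into microscope-specific calls."""

    def __init__(self, scope: LegacyMicroscope):
        self._scope = scope

    def status(self) -> str:
        return "busy" if self._scope.busy else "idle"

    def submit(self, task: dict) -> str:
        return self._scope.acquire_image(task["exposure_ms"])


class ClusterAdapter(SystemAdapter):
    """Translates the uniform interface into cluster-specific calls."""

    def __init__(self, cluster: BatchCluster):
        self._cluster = cluster

    def status(self) -> str:
        return f"{len(self._cluster.queue)} queued"

    def submit(self, task: dict) -> str:
        return f"job-{self._cluster.enqueue(task['script'])}"


def run_campaign(adapters: dict, tasks: list) -> list:
    """Dispatch (system name, task) pairs without knowing any system's native API."""
    return [adapters[name].submit(task) for name, task in tasks]
```

The coordinator in `run_campaign` depends only on `SystemAdapter`, so supporting a new system means writing one more adapter class rather than changing the coordination logic.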
Full-stack research software projects typically include several components and have many dependencies. New projects benefit from co-development of these components within a well-structured monolith. While this is preferred, over time a monolith can become a burden to deploy in different contexts and environments. Ideally, components could be deployed independently to reduce size and complexity. Maintaining separate packages, however, allows for developmental drift and other problems. So-called 'monorepos' offer the best of both approaches, but not without their own difficulties. There is, however, almost no formal treatment of this particular dilemma in the literature. The technology industry has started using monorepos to solve similar challenges, but in the academic context we should perhaps be cautious not to simply replicate industry practices. This short paper invites the research software engineering (RSE) community into a discussion of the positives and negatives of structuring projects as monorepos of discrete packages.
Single-page applications (SPAs) have become indispensable in modern frontend development, with widespread adoption in scientific applications. The process of creating a single-page web application development environment which accurately reflects the production environment isn’t always straightforward. Most SPA build systems assume configuration at build time, while DevSecOps engineers prefer runtime configuration. This paper suggests a framework-agnostic approach to address issues that encompass both development and deployment, but are difficult to tackle without knowledge in both domains.
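One common way to get runtime rather than build-time configuration for an SPA is to generate a small script file when the container starts and load it before the application bundle. The Python sketch below illustrates that general pattern and is not necessarily the approach the paper proposes. The `APP_` prefix, the `__APP_CONFIG__` global, and the function name are all assumptions made for this example.

```python
import json


def render_runtime_config(env: dict, prefix: str = "APP_",
                          global_name: str = "__APP_CONFIG__") -> str:
    """Render selected environment variables as a JavaScript snippet.

    Only variables starting with `prefix` are exposed, so unrelated (and
    possibly sensitive) variables never reach the browser. The prefix and
    the global variable name are hypothetical conventions.
    """
    config = {
        key[len(prefix):]: value
        for key, value in env.items()
        if key.startswith(prefix)
    }
    # sort_keys keeps the output stable between container restarts
    return f"window.{global_name} = {json.dumps(config, sort_keys=True)};\n"


if __name__ == "__main__":
    import os
    import sys

    # Typical use at container start (the output path is an assumption):
    #   python render_config.py > /usr/share/nginx/html/config.js
    sys.stdout.write(render_runtime_config(dict(os.environ)))
```

The same built bundle can then be deployed to development, staging, and production, with the frontend reading `window.__APP_CONFIG__` at startup instead of values baked in at build time.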
Documentation is a crucial component of software development that helps users with installation and usage of the software. Documentation also helps onboard new developers to a software project with contributing guidelines and API information. The INTERSECT project is an open federated hardware/software library to facilitate the development of autonomous laboratories. A documentation strategy using Sphinx has been utilized to help developers contribute to source code and to help users understand the INTERSECT Python interface. Docstrings as well as reStructuredText files are used by Sphinx to automatically compile HTML and PDF files which can be hosted online as API documentation and user guides. The resulting documentation website is automatically built and deployed using GitLab runners to create Docker containers with NGINX servers. The approach discussed in this paper to automatically deploy documentation for a Python project can improve the user and developer experience for many scientific projects.
Collaboration networks for university research communities can be readily rendered through the interrogation of coauthorship and coinvestigator data. Subsequent computation of network metrics such as degree or various centralities offers interpretations of collaborativeness and influence, and yields distributions which can be used to contrast different cohorts. In prior work, this workflow provided quantitative evidence for ROI of centralized computing resources in contrasting researchers with and without cluster accounts, where significance was found across all metrics. In this work, two similar cohorts, those with RSE-type roles at the university and everyone else, are contrasted in a similar vein. While a significantly higher degree statistic for the RSE cohort suggests its collaborative value, a significantly lower betweenness centrality distribution indicates a target for potential impact through the implementation of a centralized RSE network.
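The two metrics named in the abstract above, degree and betweenness centrality, can be computed from coauthorship lists with a short, dependency-free Python sketch. It is not the authors' workflow. The graph construction and function names are assumptions for illustration, and betweenness is computed with Brandes' algorithm for unweighted, undirected graphs and left unnormalized.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_graph(papers: list) -> dict:
    """Build an undirected coauthorship graph from lists of author names."""
    graph = defaultdict(set)
    for authors in papers:
        for author in authors:
            graph[author]  # make sure single-author papers still add a node
        for a, b in combinations(set(authors), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree(graph: dict) -> dict:
    """Number of distinct collaborators per person."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def betweenness(graph: dict) -> dict:
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search, counting shortest paths from `source`.
        stack = []
        preds = {node: [] for node in graph}
        sigma = dict.fromkeys(graph, 0)
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance.
        delta = dict.fromkeys(graph, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2 for node, value in score.items()}
```

Computing these values per person and grouping them by cohort gives the distributions that a study like this one would then compare statistically.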