Chapter 1 Introduction

Software pervades all parts of modern scientific research, from data analysis and inference to computational science. One would be hard pressed to find an area of research that is not impacted by software. Recent surveys in the US and UK show that 90-95% of researchers rely on research software, and 63-70% of them could not continue their work if that software stopped functioning (Hettrick et al. 2014). Much of this software is developed by researchers for researchers, as the contemporary scientific process demands new methods developed in tandem with new discoveries and fields. However, despite its importance, a large proportion of research software is developed in an ad hoc manner, with little regard for the high standards that characterize other research activities. As a result, the research software ecosystem is fragile and the source of numerous problems that plague modern computational science (Carver et al. 2018).

Researchers today are under intense pressure to demonstrate expertise in their chosen domains while also maintaining current working knowledge of digital skills such as software engineering. This combination is unsustainable for most researchers. With little bandwidth to keep up with best practices and insufficient recognition of software development as a scholarly activity, much research software is developed in a manner that makes it wholly unsustainable, despite the obvious role it plays in modern research. There are several reasons for this. First, academic promotion and tenure, even in institutions with liberal policies, consider peer-reviewed publications to be the primary metric of progress in most disciplines. Even when the impact of software is made clear, it is usually not considered a traditional scholarly activity, making it very challenging to get credit (Organisation for Economic Cooperation and Development 2019). There is no shortage of horror stories of academics who have built demonstrably impactful software, only to be denied tenure. Even outside the tenure track, only a few academic jobs offer meaningful career progression for software work. A second reason, strongly correlated with the lack of recognition of software, is the lack of training opportunities. Many research software engineers are self-taught; others learn programming from bootcamps and workshops rather than in traditional academic coursework. Overworked academics are unable to take advantage of such opportunities and therefore develop software using outdated practices. Lastly, even when software is recognized as impactful, funding agencies rarely fund its maintenance and ongoing development, leading to reinvention rather than reuse (https://chanzuckerberg.com/rfa/essential-open-source-software-for-science/).

We have spent the last two years engaged in a series of activities designed to gain a deeper understanding of why research software is so unsustainable and what can be done about it. Through numerous discussions with diverse groups of researchers, we have brainstormed challenges and solutions that are highly scalable and would impact a large swath of researchers. This plan discusses the problem in this chapter, describes our activities (Chapter 2), outlines high-level plans and methods (Chapter 3), presents more detailed plans in the following five chapters (community & outreach activities in Chapter 4, education & training activities in Chapter 5, incubator activities in Chapter 6, policy activities in Chapter 7, and management & coordination in Chapter 8), presents the budget (Chapter 9), discusses metrics & evaluation (Chapter 10), and then concludes (Chapter 11). This document is the justification and plan for a new institute that will work in multiple areas to improve research software and the careers of those who produce it, with the end goal of enabling better research.

1.1 Nature of the problem

Few fields of scientific endeavor have not been substantially transformed by software. From physics to psychology, software has transformed the way we create, acquire, process, model, and draw insights from data. Much of this transformation has come from the increasing availability of open source tools, many of which have helped improve the rigor, quality, and reproducibility of research. Yet the development of research software is often not considered scholarship, making it very difficult for academics to seek funding and find meaningful career paths, especially when research software activities make up a significant part of their contributions. Such people often lead double lives, working tirelessly to meet the traditional responsibilities of academic life while developing the open source tools that enable modern research. We break the problem down into four areas:

Research software itself is not sustainably developed: In many fields, research software is developed by academics for other academics. Because these developers have spent much of their careers building deep domain expertise rather than deep software expertise, the software often does not get the same level of care as other aspects of the research enterprise (https://www.nature.com/articles/s41592-019-0686-2). The quality of the software is therefore highly variable, making it hard to sustain. Mounting technical debt often makes it easier to develop software from scratch than to reuse existing tools. The versions of software used in papers are exceedingly hard to track down, making it challenging to reproduce research findings or reuse research software. When Collberg and colleagues (Collberg et al. 2013; Collberg, Proebsting, and Warren 2014) set out to measure the extent of the problem precisely, they investigated the availability of code and data as well as the extent to which the code would actually run with reasonable effort. The results were dramatic: of 613 papers across applied computational research, 515 were potentially reproducible, but the authors ultimately managed to run the code for only 102 (less than 20%). Even these low numbers only count success in running the code, not in actually validating the results.

Lack of career opportunities: Software does not often count for career advancement (e.g., promotion and tenure) in academia, making it an invisible scholarly contribution. Research software is often not cited (31-43%), even in highly ranked journals (Howison and Bullard 2016). Besides the negative impact on career trajectories, this lack of visibility means that incentives to produce sustainable, widely shared, and collaboratively developed software are lacking. For those outside of traineeships or tenure-track positions, the Research Software Engineer (RSE) movement has begun creating a new class of academic positions that explicitly value software work, but such positions are not yet common in universities in the United States.

Lack of training opportunities: When NSF PIs in the BIO directorate were asked about their biggest challenges in leveraging the vast amounts of data currently available, lack of training was listed as the single biggest challenge (Barone, Williams, and Micklos 2017). Although this deficit concerns the ability to use existing data science software, the skills needed to develop such software are even harder to come by. While programs like The Carpentries and a handful of university courses offer training in analyzing data, very few train researchers in modern open source software development (Hettrick 2014; Hettrick et al. 2014; Nangia and Katz 2017). This gap remains to be filled.

Lack of diversity in research software: Open source communities struggle to gain participation from women and, more broadly, from underrepresented groups. Fewer than 10% of contributors to open source communities identify as female (Lee and Carver 2019), compared with approximately 25% of the overall computer science field (National Science Foundation 2017; Vasilescu, Capiluppi, and Serebrenik 2012). Cultivating a diversity of perspectives, fields, and backgrounds is important for growing a robust research software community. There is a need to understand the sources of these diversity problems and to work to improve on the current state (Daniel, Agarwal, and Stewart 2013). Contrary to initial beliefs in open source communities, the ability to contribute to a project anonymously does not solve the gender diversity issue (Nafus 2012), at least partially because project members are able to determine the gender of contributors, even those who use pseudonyms (Vasilescu et al. 2015). In addition, a large percentage of female contributors have either been subject to or witnessed gender-based discrimination (Powell, Hunsinger, and Medlin 2010) and have been discouraged from participating in these projects by the aggressive nature of the discourse and the lack of female role models (Reagle 2012). In our own URSSI survey, described in more detail later, we found evidence of the same lack of diversity: of the respondents who self-identified their gender, only 25% identified as female. The 164 US respondents to the 2018 International RSE Survey (Philippe et al. 2019) reported being 82% male, 14% female, and 4% preferring not to say. They also reported being 77% white, 11% Asian, 6% Hispanic/Latino, 5% other, and 2% Black. Additionally, 3% reported having a disability.

1.2 Valuing producers of research software

Despite the numerous barriers that prevent people from receiving recognition and career success for their research software work, some have successfully overcome them. An exemplar in this regard is Dr. Fernando Perez, currently an associate professor of statistics at the University of California, Berkeley. For much of his career he worked in a traditional untenured position as a research scientist in neuroscience and computational research, while collaboratively developing an open source notebook interface for the Python programming language as a side project. Over time, his software work started having much more impact than any of his traditional scientific contributions. The current evolution of his group’s efforts, the Jupyter ecosystem (https://jupyter.org/), is considered by many to be “universally accepted by the scientific community” and has won him and his team honors such as the Association for Computing Machinery (ACM) Software System Award. More recently, the magnitude of his software contributions and the far-reaching impact of this effort (https://www.theatlantic.com/science/archive/2017/06/gravitational-waves-black-holes/528807/, https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/) earned him a fast-tracked tenured position at the University of California, Berkeley. This type of unconventional success in a traditional public university is a sign that the recognition of software work as scholarship is changing.

While the outcome of Dr. Perez’s story is promising, it is very difficult to achieve. Other academics who have produced work of similar or greater magnitude in the past weren’t as fortunate. Travis Oliphant, for example, served as an assistant professor of Electrical and Computer Engineering at Brigham Young University in the early 2000s. Among his accomplishments during this time, he is credited as the primary creator of NumPy, the Python library for numerical arrays that is the foundation of modern data science tools, and as one of the early contributors to SciPy, the widely used library for scientific Python. These contributions were deemed insufficient for tenure. Kirk McKusick, a professor in EECS at Berkeley, was denied tenure in the early 1990s because his primary work was on the BSD software project, which went on to become a foundation of the modern internet. [TODO: Other examples would be welcome and appreciated. Are there women/people of color who have experienced similar situations that we can highlight here?]

1.3 Why we ran this conceptualization

It comes as no surprise to researchers that software is undervalued in academia. However, substantive evidence supporting this claim is scattered and mostly anecdotal, making it hard to build a convincing case that the research community and its stakeholders need to care more. In 2017 we submitted a proposal to the National Science Foundation’s Software Infrastructure for Sustained Innovation (S2I2) program to gather this evidence, to identify unique challenges not already being addressed, and to formulate a plan for an institute to implement solutions. The proposal was funded in December 2018, allowing us to engage in various activities over the following 24 months. Despite broad awareness of these issues and the existence of similar (albeit domain-centric) conceptualizations, the core challenges around sustainability (of people, software, and practices), recognition and credit, and training and workforce development remain poorly understood and poorly addressed.

We used a wide range of approaches to understand the core social and technical challenges of developing sustainable research software: an extensive survey, in-person unconferences (both general and topic-focused), a pilot training event, and a series of ethnographic studies. We mapped out the current state of software-related challenges with an extensive survey targeting researchers across the country. We also invited participants to workshops across the country to share critical challenges and brainstorm solutions in small groups. We dug deeper into two core issues that arose repeatedly, credit and incubators, by organizing a focused workshop on each. This report captures our summary of the survey, workshops, and studies and describes a core set of activities that define the work of a future US Research Software Institute.

1.4 What we plan to do and why

The conceptualization phase, with its survey, workshops, and ethnographic studies, elucidated the community’s need for different components of a potential implementation of URSSI. We identified four areas for supporting the research software community, with the aim of accelerating science in diverse research domains as well as software engineering as a research area in its own right. Each of these activities contributes to the desired impacts described in Section 3.4.

  1. Incubator: The sustainability of research software has many aspects beyond good software engineering practices, overlapping with diverse areas of expertise such as technology strategy, project management, business planning, usability, and license management. The incubator service area would provide projects with expert consultants who could support them at different stages of their life cycle, from spinning up a project, to first results and uptake of the software, to planning its sustainability after the funding ends.

  2. Education & Training: Many universities offer curricula in conceptual software engineering, but there is a lack of practical training that meets the community’s need to go into depth on different technologies. University courses offered in computer science departments are also often impractical for domain researchers. The survey showed a need for formats and timeframes that suit people in the role of Research Software Engineer (RSE) and busy domain scientists who lack formal or informal training. While The Carpentries and the RDA-CODATA summer school, for example, do an excellent job of teaching basic programming and computational and data analytic methods to researchers in a peer-to-peer model, mostly in 2-day courses, URSSI has a massive opportunity to fill the gap by teaching more in-depth software engineering, software project management, and community development practices in longer engagements, such as a 5-day summer and/or winter school. URSSI will strive to collaborate on these topics with The Carpentries and the existing software institutes, such as the Science Gateways Community Institute and the Molecular Sciences Software Institute, so that, rather than replicating effort, it identifies the training opportunities currently missing from the overall research software landscape.

  3. Policy: Policy is an important lever for improving the sustainability of research software. Policy work could include campaigns and guidelines for software citation, templates for job positions, and collections of good software engineering practices (with bad practices as counterexamples). We have already been collaborating with the UK SSI for several years and plan to expand collaborations with initiatives and projects such as US-RSE to find common ground at an international level while considering the specific situation in the US.

  4. Community & Outreach: Many researchers and developers working in the role of an RSE do so in silos; the Community & Outreach area of URSSI would connect them to peers and provide access to beneficial material, resources, and contacts. The RSE community has already achieved some successes simply by working together: the number of RSE chapters has grown from one UK university in 2013 to 28 chapters in 2020. URSSI has the potential to build on this momentum. Community engagement would include scalable communication such as a website, blogs, and newsletters; two-way communication for discussion, such as webinars and online forums; and face-to-face meetings in the form of workshops. In addition, this area will include a fellows program to support work done by community members that benefits URSSI activities.

Building these areas into URSSI would formalize its informal position as the focal point for the overall community of research software developers as well as for a set of disciplinary communities. URSSI will help accelerate science that depends on research software and improve the career paths of those who develop and maintain it, including RSEs.

References

Barone, Lindsay, Jason Williams, and David Micklos. 2017. “Unmet Needs for Analyzing Biological Big Data: A Survey of 704 NSF Principal Investigators.” PLoS Computational Biology 13 (10). https://doi.org/10.1371/journal.pcbi.1005755.
Carver, Jeffrey C., Sandra Gesing, Daniel S. Katz, Karthik Ram, and Nicholas Weber. 2018. “Conceptualization of a US Research Software Sustainability Institute (URSSI).” Computing in Science & Engineering 20 (3): 4–9. https://doi.org/10.1109/MCSE.2018.03221924.
Collberg, Christian, Todd Proebsting, Gina Moraila, Akash Shankaran, Zuoming Shi, and Alex M Warren. 2013. “Measuring Reproducibility in Computer Systems Research.” Department of Computer Science, University of Arizona TR 13-03. https://www.cs.arizona.edu/sites/default/files/TR13-03.pdf.
Collberg, Christian, Todd Proebsting, and Alex M Warren. 2014. “Repeatability and Benefaction in Computer Systems Research.” Department of Computer Science, University of Arizona TR 14-04. http://repeatability.cs.arizona.edu/v2/RepeatabilityTR.pdf.
Daniel, Sherae, Ritu Agarwal, and Katherine J. Stewart. 2013. “The Effects of Diversity in Global, Distributed Collectives: A Study of Open Source Project Success.” Information Systems Research 24 (2): 312–33. https://doi.org/10.1287/isre.1120.0435.
Hettrick, Simon. 2014. “It’s Impossible to Conduct Research Without Software, Say 7 Out of 10 UK Researchers.” https://www.software.ac.uk/blog/2014-12-04-its-impossible-conduct-research-without-software-say-7-out-10-uk-researchers.
Hettrick, Simon, Mario Antonioletti, Les Carr, Neil Chue Hong, Stephen Crouch, David De Roure, Iain Emsley, et al. 2014. “UK Research Software Survey 2014.” Zenodo. https://doi.org/10.5281/zenodo.14809.
Howison, James, and Julia Bullard. 2016. “Software in the Scientific Literature: Problems with Seeing, Finding, and Using Software Mentioned in the Biology Literature.” Journal of the Association for Information Science and Technology 67 (9): 2137–55. https://doi.org/10.1002/asi.23538.
Lee, Amanda, and Jeffrey C. Carver. 2019. “FLOSS Participants’ Perceptions about Gender and Inclusiveness: A Survey.” In IEEE/ACM 41st International Conference on Software Engineering (ICSE), 677–87. https://doi.org/10.1109/ICSE.2019.00077.
Nafus, Dawn. 2012. “’Patches Don’t Have Gender’: What Is Not Open in Open Source Software.” New Media & Society 14 (4): 669–83. https://doi.org/10.1177/1461444811422887.
Nangia, Udit, and Daniel S. Katz. 2017. “Track 1 Paper: Surveying the U.S. National Postdoctoral Association Regarding Software Use and Training in Research.” figshare. https://doi.org/10.6084/m9.figshare.5328442.v3.
National Science Foundation. 2017. “Women, Minorities, and Persons with Disabilities in Science and Engineering.” https://www.nsf.gov/statistics/2017/nsf17310/.
Organisation for Economic Cooperation and Development. 2019. OECD Skills Outlook 2019. https://doi.org/10.1787/df80bc12-en.
Philippe, Olivier, Martin Hammitzsch, Stephan Janosch, Anelda van der Walt, Ben van Werkhoven, Simon Hettrick, Daniel S. Katz, et al. 2019. “softwaresaved/international-survey: Public Release for 2018 Results (Version 2018-v.1.0.2).” Zenodo. https://doi.org/10.5281/zenodo.2585783.
Powell, Whitney E., D. Scott Hunsinger, and B. Dawn Medlin. 2010. “Gender Differences Within the Open Source Community: An Exploratory Study.” Journal of Information Technology 21 (4): 29–37. http://jitm.ubalt.edu/XXI-4/article3.pdf.
Reagle, Joseph. 2012. “‘Free as in Sexist?’ Free Culture and the Gender Gap.” First Monday 18 (1). https://doi.org/10.5210/fm.v18i1.4291.
Vasilescu, Bogdan, Andrea Capiluppi, and Alexander Serebrenik. 2012. “Gender, Representation and Online Participation: A Quantitative Study of StackOverflow.” In 2012 International Conference on Social Informatics, 332–38. https://doi.org/10.1109/SocialInformatics.2012.81.
Vasilescu, Bogdan, Daryl Posnett, Baishakhi Ray, Mark G. J. van den Brand, Alexander Serebrenik, Premkumar Devanbu, and Vladimir Filkov. 2015. “Gender and Tenure Diversity in GitHub Teams.” In 33rd Annual ACM Conference on Human Factors in Computing Systems, 3789–98. CHI ’15. https://doi.org/10.1145/2702123.2702549.