SimPEG recently won an Innovative Dissemination of Research Award from the University of British Columbia (UBC) Library. The award honours UBC faculty, staff and students who expand the boundaries of research through the creative use of new tools and technologies. Lindsey Heagy, Seogi Kang, and I were honoured to receive this award, and we are extremely proud of the community that is being built around the SimPEG framework.
SimPEG is a software package for simulation and parameter estimation in geophysics. The package provides a framework and a comprehensive set of building blocks to carry out our numerical research. It is targeted at promoting exploration, experimentation and extension of ideas in a reproducible manner. It is tested so that with any changes to the code-base, users can have confidence in the components we rely upon as a growing community. It is documented to allow new collaborators and researchers to quickly build upon our initial work. It is openly licensed to promote dissemination. All of these strategies allow geoscientists to focus on the challenges in the future, instead of solved problems of the past.
The future of geoscience research lies in the integration between disciplines (geology, geophysics, hydrology, engineering). These problems require new collaborations to be catalyzed so that new methodologies and knowledge can be developed. This can only be enabled through a radically different dissemination and engagement model. Our student team started the SimPEG project four years ago with this in mind, and we have tailored our dissemination approach to support these fundamental goals. We aim to open up conversations by inviting collaborations from a community, including researchers, industry practitioners and educators, by enabling the integration and extension of ideas across a range of applications in geophysics.
As scientists we seek to find models that reproduce the observations that we make in the world. In geophysics, we use inverse theory to mathematically create models of the earth's subsurface from measured data, and these models are used to inform decisions in a variety of industry and environmental applications. These problems are difficult to solve because they involve many moving pieces: physics, discretization, simulation, regularization, optimization, computer science, linear algebra, geology. Moreover, the environmental, resource management, and geotechnical problems we are facing require the integration of data to better characterize the subsurface and better inform the decisions we make. The future of geoscience research is focused on combining methodologies and experimenting with new data integrations to get the best, unbiased knowledge of the subsurface possible.
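To make the idea of an inverse problem concrete, here is a minimal sketch that recovers a model from noisy data. It assumes a linear forward operator and a simple Tikhonov (smallest-model) regularization, and it uses only NumPy. It is not the SimPEG API: the function name `tikhonov_inversion`, the kernel matrix `G`, the block model `m_true`, and the trade-off parameter `beta` are all hypothetical choices made for illustration.

```python
import numpy as np


def tikhonov_inversion(G, d_obs, beta):
    """Minimise ||G m - d_obs||^2 + beta * ||m||^2 for a linear operator G.

    Hypothetical helper for illustration only; not part of SimPEG.
    """
    n_model = G.shape[1]
    # Normal equations of the regularized least-squares problem.
    A = G.T @ G + beta * np.eye(n_model)
    return np.linalg.solve(A, G.T @ d_obs)


rng = np.random.default_rng(0)
n_data, n_model = 20, 50

# Assumed forward operator: decaying, oscillating kernels sampled on [0, 1].
x = np.linspace(0.0, 1.0, n_model)
G = np.array(
    [np.exp(-j * x) * np.cos(0.5 * np.pi * j * x) for j in range(n_data)]
) / n_model

# Assumed "true" model: a single block anomaly in a zero background.
m_true = np.zeros(n_model)
m_true[20:30] = 1.0

# Simulate data and add a small amount of Gaussian noise.
d_obs = G @ m_true + 1e-3 * rng.standard_normal(n_data)

# Recover models with weak and strong regularization.
m_weak = tikhonov_inversion(G, d_obs, beta=1e-4)
m_strong = tikhonov_inversion(G, d_obs, beta=1.0)

print("data misfit (weak beta):  ", np.linalg.norm(G @ m_weak - d_obs))
print("data misfit (strong beta):", np.linalg.norm(G @ m_strong - d_obs))
```

With fewer data than model parameters the problem is underdetermined, so the regularization term is what selects a single model. Increasing `beta` trades data fit for a smaller model, which is one of the many choices a practitioner has to experiment with.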
What we are building is called SimPEG: Simulation and Parameter Estimation in Geophysics. Our approach and dissemination model are radically different from those in our field. Over the last four years we have organized the methodologies (25+ years) into a framework, created a toolbox that enables rapid interrogation and ideation, and disseminated our work through an open, tested, and documented medium. The toolbox, combined with a framework, allows researchers to solve their own problems and creates opportunities for users to forge new collaborations through their common tools. These tools are necessary and invaluable in industry, education, and accelerating research. We believe that geophysics can be more innovative and informative if the tools, methods, and dissemination models we choose enable the open exchange of ideas.
Due to the perceived value of geophysical software in industry applications (e.g. oil and gas, mining, engineering), research codes and the ideas inside them are often treated as black-box, proprietary commodities. Industry terminology such as 'trade secrets' has leaked into the research vernacular and obscured assumptions; tests are non-existent (or hidden), so results (papers!) are often non-reproducible. Even when software is shared, researchers are isolated and often restricted in their ability to build upon, or extend, the software because of licensing or obfuscation. When software is walled off, and only a select group of people have access to its inner workings, these ideas remain stagnant and cannot be extended or improved. Furthermore, this paradigm actively inhibits the collaborations that are crucial to the future of our field.
SimPEG is built in the open, from scratch, by students. It is fully open-source, allowing unhindered academic, educational, and industry use, adaptation and redistribution of the code. It is tested and assumptions are stated in the documentation. With any change or addition to the software, the entire SimPEG package is tested, and the pass/fail results are available publicly (Travis-CI). Documentation, including examples, provides a starting point for users and new contributors (http://docs.simpeg.xyz). SimPEG is hosted on the social-coding platform GitHub. This provides public access to our development history as well as the mistakes, flaws, and enhancements we are actively discussing and addressing. This openness allows other researchers to join the team, enables ideas to be dynamically extended and robustly tested, and encourages reproducible research (e.g. https://goo.gl/8Mi5R6, http://simpeg.xyz/journal).
Building from a trusted framework and focusing on the common foundation of geoscience simulations and inversions has enabled the growth of a community around SimPEG. This growth started with colleagues at UBC, including students studying magnetotellurics for geothermal exploration and potential fields for mineral exploration. These individuals bring their expertise and individual research problems and have championed the development needed for the application they are focused on, all while advancing a common core of functionality. This community has recently grown beyond UBC to include researchers at UWO and UofC who have brought their expertise in applications of seismic for oil and gas and direct current resistivity for near surface hydrologic studies (e.g. https://zephyr.space).
The SimPEG development team has grown from 3 at its conception to now 12 active developers. The package is downloaded 554 times every month. We are extremely excited to see the project gaining such traction, as most software development projects in our field never grow beyond a team of 3 people and a handful of expert users. It is fundamentally the novel and open dissemination strategy that we have cultivated, and invested in, that has enabled this growth and the potential for a rich and diverse future.
By starting from a framework common to methodologies across the geosciences, conversations that span these sub-disciplines have been enabled. Differences in terminology and jargon between disciplines such as seismic, fluid flow, and electromagnetics have made the transfer of knowledge and ideas between these disciplines a rare occurrence. The SimPEG framework has served as a common foundation on which we can hold these conversations: on a daily basis and between collaborators in differing disciplines within the geosciences across Canada.
One thing that we did not expect was the uptake of this intricate scientific software in undergraduate education and industry training (see the talk: https://youtu.be/4msHJMBvzaI, which won an SCLT award: http://sclt.science.ubc.ca/Spring2015). The SimPEG building blocks have been recombined in novel ways to enhance the UBC undergraduate program, and have enabled spin-off projects. These projects use dynamically generated figures and provide students access to interactive simulation tools (http://github.com/ubcgif/gpgLabs). They have also promoted reproducible, collaborative, scientific writing and the development of web-based textbooks (http://gpg.geosci.xyz, http://em.geosci.xyz). Under the hood, these simulations are running advanced computations, but for the students, this is not the point. The point is that these tools enable them to explore and experiment with geologic models and visualize the behaviour of the geophysical responses in an interactive manner. These interactive pictures promote and heighten intuition that is often not possible to glean from examining the governing equations alone. We believe in enabling students to experiment and learn in the same way that scientists do: by using creative tools to explore physics, math and data through computation and visualization.
SimPEG started when three students brought their individual (award winning: http://goo.gl/0BqHza, https://goo.gl/Q5Nu2t) research together with the goal of collaborating on the common aspects. We distilled the core methodologies into a common terminology allowing us to have more powerful, general, actionable conversations about the field. From this we constructed a framework to organize the tools and ideas that we are building and enabled them to work together. With these new tools and ideas, we have been afforded new research opportunities with broader scope (also winning awards! http://goo.gl/hmHTrM, https://goo.gl/P52Hgx, http://goo.gl/x5m1LG). This has grabbed the attention of industry, and we have been approached by professionals in mining exploration, geologic modeling, oil and gas, hydrogeology, civil engineering, geothermal, and USA National Labs about the use and development of the software. Many of these collaborations are in progress, and, perhaps most importantly, these collaborations are happening outside the core team working on this project.
Geophysical simulation tools are too often reserved for industry experts and select laboratories who know how to make, or have the funding to buy, all the pieces. This community is exclusive and this paradigm does not appeal to our scientific morals. An open dissemination strategy is novel in our field and attracts a much wider audience as is demonstrated by our growing list of collaborators and the collaborations that we have enabled. The repackaging of these tools for education is not possible when starting from a proprietary mindset. Our open dissemination model targets the users of geophysical simulations directly and enables all of us to raise our game by working together.
As geoscientists, explaining the methodology used to address or investigate a problem is essential. As educators, enabling students to explore and experiment with concepts, ask 'What if...?', and cultivate their understanding of the world, is the goal toward which we strive.
Accomplishing these goals requires multiple forms of engaging media, of which software is an essential building block. By running simulations, we can generate images and videos that add visuals to the narrative. Open source tools enable this to be taken a step further: towards reproducible scientific media, which can be shared and cited. This past September, our research group initiated an effort to create an open-source web-based textbook for electromagnetics (http://em.geosci.xyz). It is an effort that has involved 10+ scientific writers/researchers. We have brought the lessons learned from building and managing a community of contributors and we have used the SimPEG resources to create interactive, reproducible figures. This web-based interactive textbook is a novel dissemination strategy for scientific writing: being a 'reader' is now more than a passive act. Source code for the figures, videos, and examples can be downloaded and executed, and it serves as a jumping off point for exploration. It provides the opportunity to catapult undergrad students into an active research community (e.g. a directed study: http://goo.gl/JGdPUm). 'This is left as an exercise for the reader' takes on new meaning when it is enabled by tools at the forefront of the field.
Our essential functions as researchers are the pursuit and dissemination of knowledge through research and education. As numerical scientists we have the opportunity to reach a wide audience through software; the code is simply an executable record of an idea, a set of instructions for how to solve a problem. As scientists, we have a responsibility to enable others to use our ideas in a reproducible manner. Our response to these ideals in the context of the field of geophysics has been the SimPEG project. This is a student initiative: from conception, to technical development, to collaborations initiated, to driving educational initiatives, to putting these interactive materials online and in the hands of industry and external researchers. We are cognizant of the tools we employ, whether they be technical, legal or social, to enhance the dissemination of our research to the widest possible audience. We are driven by the ideal that we can do better than stand-alone tools used as a black box by an exclusive group. We require tools that augment our thoughts, bring reproducible rigour to our visualizations, and allow us to bring together multiple disciplines in geophysics, hydrogeology, and geology. The future of our field is rooted in integrating knowledge between disciplines. This is only possible if we build on a common foundation. This is only possible when supported by an open, thriving community.