Software Carpentry in High Performance Computing
ATPESC 2015

Aron Ahmadia
Continuum Analytics

11 August, 2015

Copy This Lecture!

Creative Commons License
Fundamentals of Computational Software Engineering by Aron Ahmadia and Christopher Kees is licensed under a Creative Commons Attribution 3.0 Unported License.

Introduction

There is a plethora of best practices available to help you compute on HPC systems, and you will likely only have time to learn a subset of them. The purpose of this lecture is to help orient you on the path to writing software as part of your research.

The 8 Essential Practices

  1. Write Programs for People, Not Computers
  2. Let the Computer Do the Work
  3. Make Incremental Changes
  4. Don't Repeat Yourself (or Others)
  5. Plan for Mistakes
  6. Document Design and Purpose, Not Mechanics
  7. Design Flexibly for Performance, Build Accessibly for Correctness
  8. Collaborate

1. Write Programs For People, Not Computers

2. Let the Computer Do the Work

3. Make Incremental Changes

Organize with Wikis

Use Version Control for Checkpointing and Collaboration

4. Don't Repeat Yourself (or Others)

Automate common actions by saving simple blocks of code into scripts

Refactor commonly used blocks of code into functions
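
As a minimal sketch of this practice, assume an analysis that repeatedly rescales lists of measurements onto the unit interval. The function name `rescale` and the sample data are hypothetical, chosen only for illustration. Instead of copying the same arithmetic for every data set, the shared logic lives in one function:

```python
def rescale(values):
    """Linearly map values onto [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid dividing by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical measurements; the same function serves both data sets.
temperatures = [12.0, 18.0, 24.0]
pressures = [990.0, 1000.0, 1010.0]

print(rescale(temperatures))
print(rescale(pressures))
```

A bug fix or an edge case (such as the constant-input check) now only has to be handled in one place.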

Group commonly used functions into libraries

5. Plan for Mistakes

Verify and Validate your Code
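
One common verification technique is to check that a numerical method converges at its theoretical rate on a problem with a known answer. The sketch below is an illustration under stated assumptions, not a method taken from this lecture: it applies the composite trapezoid rule to the integral of sin(x) over [0, pi], whose exact value is 2, and estimates the observed order of accuracy by halving the step size. The helper names `trapezoid` and `observed_order` are hypothetical. Validation, which compares the model against experimental data, is a separate step that this sketch does not cover.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoid rule for f on [a, b] with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


def observed_order(f, a, b, exact, n):
    """Estimate the convergence order from the errors at n and 2n subintervals."""
    err_coarse = abs(trapezoid(f, a, b, n) - exact)
    err_fine = abs(trapezoid(f, a, b, 2 * n) - exact)
    return math.log2(err_coarse / err_fine)


# The trapezoid rule is second-order accurate, so the estimate should be near 2.
order = observed_order(math.sin, 0.0, math.pi, 2.0, 16)
print(f"observed order of accuracy: {order:.2f}")
```

An observed order well below the theoretical one usually points to a bug in the implementation or to a problem that violates the method's smoothness assumptions.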

6. Document Design and Purpose, Not Mechanics

Principles of documentation

7. Design Flexibly for Performance, Build Accessibly for Correctness

Be fluent in multiple languages

You speak multiple languages when interacting with a computer. Choosing to use a new tool, library, or language can be similar to learning a new language.

Use domain specific languages and libraries to increase your expressivity

Use REPL Environments for Development

REPL (read-eval-print-loop) environments tighten the coupling between the code you write and the results you see, increasing productivity.

REPL                    non-REPL
IPython and Python      C/C++
Julia                   Fortran
Interactive Sessions    Batch Systems

8. Collaborate

Closing Thoughts

Reduce Complexity

Aim for reproducibility
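
One small step toward reproducibility is to fix every source of randomness and to store the inputs alongside the outputs. The sketch below is only an illustration: the function `run_experiment`, the parameter names, and the choice of JSON for the record are all assumptions, not something prescribed by this lecture.

```python
import json
import platform
import random


def run_experiment(seed, n_samples):
    """Hypothetical experiment: the mean of n_samples seeded uniform draws."""
    rng = random.Random(seed)  # private generator, so the seed fully controls the run
    samples = [rng.random() for _ in range(n_samples)]
    return sum(samples) / n_samples


# Keep the parameters, the environment, and the result together in one record.
params = {"seed": 42, "n_samples": 1000}
record = {
    "params": params,
    "python_version": platform.python_version(),
    "result": run_experiment(**params),
}
print(json.dumps(record, indent=2))
```

Rerunning with the same seed on the same Python version should reproduce the stored result, and the record documents what would need to be held fixed to repeat the run elsewhere.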

References and Further Reading

Books

The Pragmatic Programmer: From Journeyman to Master

Andrew Hunt and David Thomas

ISBN 978-0132119177

Describes the important principles and practices of being an effective programmer, instead of teaching a specific language or technique.

Code Complete, Second Edition

Steve McConnell

ISBN 978-0735619678

Focuses on principles of software construction, with attention to skills, testing, and design.

Verification and Validation in Scientific Computing

William L. Oberkampf and Christopher J. Roy

ISBN 978-0521113601

Focuses on verification and validation of numerical solutions to models described by systems of partial differential and integral equations.

Research Literature

Programming Languages for Scientific Computing

Matthew G. Knepley

Preprint: http://arxiv.org/pdf/1209.1711.pdf

Gives an overview of modern programming languages and techniques such as code generation, templates, and mixed-language designs. This is a preprint, so expect some rough spots.

Two Solitudes

Greg Wilson

Slides: http://www.slideshare.net/gvwilson/two-solitudes

Describes Greg's journey as a scientist and leader of the Software Carpentry project, and provides some insight into the differences between industry and academia.

Best Practices for Scientific Computing

D. A. Aruliah, C. Titus Brown, Neil P. Chue Hong, Matt Davis, Richard T. Guy, Steven H. D. Haddock, Katy Huff, Ian Mitchell, Mark Plumbley, Ben Waugh, Ethan P. White, Greg Wilson, Paul Wilson

Preprint: http://arxiv.org/abs/1210.0530

A good summary of many fundamental practices for working with and developing scientific software.

Web References

What Every Computer Scientist Should Know About Floating-Point Arithmetic

David Goldberg

Web article: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html

Introduction to the IEEE floating-point standard, its implications, and many of the common pitfalls when using floating-point numbers in scientific computing.

The Research Software Engineer

Rob Baxter, Neil Chue Hong, Dirk Gorissen, James Hetherington, and Ilian Todorov

Web article: http://dirkgorissen.com/2012/09/13/the-research-software-engineer

Discussion of the current challenges to scientific software engineering as a profession.

Science Code Manifesto

http://sciencecodemanifesto.org

Publicly signed commitment to clear licensing and curation of software associated with research publications.

US Army Engineer Research and Development Center

http://www.erdc.usace.army.mil/

Innovative solutions for a safer world.