Why Academic Software Sucks
What can academic coders learn from software developers in industry?
During my time as a PhD student I have developed software in the academic setting. At that time I was already under the impression that my work would probably not meet industry standards. Having recently transitioned to an industry job, I quickly realized how coding in academia is different from coding in industry. This post summarizes the main differences between the two fields and extrapolates what coders in academia can learn from industry.
The following list gives the key differences between academic and industry software development:
1. Teamwork
Academic software is usually developed by individuals and work in teams is uncommon. In industry, on the other hand, software is usually developed in teams. While working in a team poses its own challenges, it also affords many benefits:
- Quality of code: cross-functional teams lead to higher quality code
- Fun: working in a team is much more fun than working alone
- Learning: expertise can easily be shared and acquired
2. Code Reviews
One reason why the quality of academic code is lacking is that there is usually no code review. In industry, on the other hand, each segment of code should have been seen by at least four eyes. Why is it worth to spend additional time on code reviews?
- Quality of code: prevention of bugs, performance improvements, ease of maintenance, …
- Learning: you’ll learn from reading other’s people code as well as from being reviewed
3. Version Control
In academia is unfortunately not yet the standard. This makes proper software development difficult. Even when a version control system is used, it may not be state-of-the art (e.g. use of SVN instead of git) or may not be used to its full capacity. Although I was using git, I was using a very limited amount of its features. For example, my go-to git workflow consisted merely of:
- git pull origin master
- do some changes
- git commit -am “some changes”
- git push origin master
Looking back, there is something wrong with each step in this workflow:
- Fetch with rebase should be favored over pulling in order to avoid merges
- Code changes should strive towards a specific goal and not be a collection of various things
- Commit messages should be concise and interpretable
- Are you sure you want to push to the main repository without a code review?
Particularly the craft of creating well-structured series of commits is not taught at universities.
4. Testing and Test Automation
Code developed in academic settings is often not tested. It is therefore likely to contain many undetected bugs. In industry, however, testing is done on different scales:
- Unit tests: isolated test of a single functionality
- Component test: test a single component
- Integration test: test the interaction of multiple components
All of these tests are run in an automated fashion using continuous integration tools such as Jenkins, which run tests after every commit or at specified intervals (e.g. every night). This ensures that the software always works as expected.
5. Documentation
Badly commented academic code is the norm - probably because the person that wrote the code thought that no one would ever read it except for him. In industry, however, the expectation is that code should be understandable to other developers. Thus, the code is either well documented or so well written that it doesn’t require any commenting.
Summary
In summary, if you are writing code in an academic setting, you should implement the following five steps now:
- Search for someone to work on the code together with you
- Find someone to review your code (ideally this is done by your newfound team member)
- Use a version control system (e.g. git) and use it properly
- Test your software and automate the process by using continuous integration pipelines (e.g. Jenkins)
- Improve code documentation to make sure other people could work with what you built
I know that you are probably busy with a lot of other things. However, I promise you that these measures will pay off eventually, particularly in terms of greater software quality and less time required for code maintenance.
Comments
There aren't any comments yet. Be the first to comment!