Why are Shell Scripts Bad?

Shell scripts are bug-prone, unmaintainable, and inscrutable. But what in particular makes them bad? It's a good question without a settled answer. I have four independent theories.¹ [¹ Naturally all of them probably contribute, but assigning blame, or attributing different kinds of errors to different causes, would be valuable. "All of the above" is usually true, but without more detail it is also usually useless.]

1. Shell is a bad programming language
2. Data structures beyond text are necessary
3. Processes are a baroque, complex replacement for function calls
4. All shell scripts use the global, mutable file system

Reading the list, you likely nodded along to each item. But these theories have very different implications! Fixing the first requires only switching shells: should we all use Fish?² [² I used to, no longer do, and never saw any benefit.] To fix the second, it's enough to use a real programming language like Python.³ [³ When I replace shell scripts with Python I usually decrease the bug rate, but it remains substantially higher than in "normal" Python code.] The third suggests that Python programs that use real modules instead of subprocess should be enough. The last suggests that private file system namespaces, in the Plan 9 sense, could make shell scripting sane again.
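To make theories 2 and 3 a bit more concrete, here is a minimal Python sketch of what happens when a list crosses a process boundary as text versus staying inside one program as a real list. The child "command" (CHILD) and both function names are hypothetical, invented for this illustration; the child just prints each argument on its own line, roughly the way ls prints file names.

```python
import subprocess
import sys

# Hypothetical child "command": prints each argument on its own line.
CHILD = "import sys; print('\\n'.join(sys.argv[1:]))"


def names_via_process(names):
    """Send names through a child process; only text comes back."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *names],
        capture_output=True,
        text=True,
        check=True,
    )
    # The caller has to guess the structure: here, one name per line.
    return result.stdout.splitlines()


def names_via_function(names):
    """Send names through an ordinary function call; the list stays a list."""
    return list(names)


names = ["a.txt", "b\nc.txt"]  # the second name contains a newline
print(names_via_process(names))   # three items: the odd name was split in two
print(names_via_function(names))  # two items, unchanged
```

The process version silently turns two names into three, while the function version cannot make that mistake. This does not tell us whether the text (theory 2) or the process boundary (theory 3) deserves the blame, since here they arrive together.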

Another place to look for answers is analogous systems where only some of the above theories apply. JavaScript is a (comparatively) good programming language, with real data structures and with functions instead of processes, but one where everything has access to a global, mutable tree structure. Bugs and unexpected interactions are common, and in fact modern JS design involves virtualizing the entire DOM to enforce isolation. In Tcl the main data structure is a string, but it doesn't have Shell's reputation. Shell uses processes instead of function calls because the system API is exposed to C, a difficult language; but in Emacs those APIs are exposed directly to the usual scripting language. Emacs Lisp seems pretty bug-prone to me, and the modern style involves a lot of private buffers and save-excursion macros.

So examining this evidence… I guess I lean toward theory 4? In which case, did iOS get it right by not exposing a common file system? Should shell scripts make a lot more use of Linux's private namespaces?
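As a rough illustration of theory 4, the sketch below contrasts a script step that keeps scratch state at a shared path with one that gets a fresh private directory per run. This is not Plan 9 or Linux namespaces, only a userland approximation using temporary directories, and step, run_shared, and run_private are hypothetical names made up for the example.

```python
import os
import tempfile


def step(workdir, label):
    """A hypothetical script step that stashes state in a scratch file."""
    scratch = os.path.join(workdir, "scratch.txt")
    with open(scratch, "a") as f:
        f.write(label + "\n")
    with open(scratch) as f:
        return f.read().splitlines()


def run_shared(shared_dir, label):
    # Every run uses the same directory, much as a script that
    # hard-codes a path under /tmp would.
    return step(shared_dir, label)


def run_private(label):
    # Each run gets a fresh directory that is deleted afterwards,
    # so no state survives from one run to the next.
    with tempfile.TemporaryDirectory() as private_dir:
        return step(private_dir, label)


with tempfile.TemporaryDirectory() as shared:
    print(run_shared(shared, "first"))
    print(run_shared(shared, "second"))  # also sees "first" from the earlier run

print(run_private("first"))
print(run_private("second"))  # sees only its own line
```

In the shared version the second run observes leftovers from the first, which is the kind of unexpected interaction a global, mutable file system invites. The private version avoids it only by convention; a real namespace would enforce the isolation instead of relying on every script to behave.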

Bonus question: Is this why build systems are bad? Or is that something else?

By Pavel Panchekha

12 December 2019
