A few months ago, I had a discussion with some friends online. The premise of the discussion was that even if you account for complexity, shorter code is more likely to be bug-free code.
As a C programmer for decades, my mind rebelled against the idea. “Nonsense,” “Absurd” and “Too simple” were my knee-jerk reactions.
Taken to its ultimate end, this premise suggests that code golf — code intentionally reduced to the absolute minimum number of characters — is the most bug-free code. Code golf is, by definition, dense and barely readable code. How is that a good thing?
But the more I thought about the idea and how to debunk it, the more it seemed to make sense. Code-golf code tends to either work 100% or fail spectacularly. Once you get the code to work, it tends to be correct, as there is literally no room for error.
Python’s failings: Performance and multithreading
The Python language’s one major failing is performance. Yes, you can use C-based libraries to provide high-performance versions of what should be core functions of the language, so its performance is almost comparable to off-the-shelf Java or C#. However, this works against Python’s base architecture decisions.
Python is an interpreted language as opposed to being compiled, and it was never designed to efficiently support multithreading. Since Python’s introduction, mainstream computing has moved in the direction of multithreading. Almost every modern processor is multicore.
Once again, different versions of Python can overcome this performance restriction, but with their own tradeoffs. This leads to a fragmented code ecosystem in which code might work well on one implementation but not on another.
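A rough way to see the multithreading limitation for yourself is sketched below. It assumes the standard CPython interpreter, whose global interpreter lock lets only one thread execute Python bytecode at a time. The function names and workload sizes are illustrative choices, not a rigorous benchmark.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limits):
    # One task after another on a single thread.
    return [count_primes(limit) for limit in limits]


def run_threaded(limits):
    # One thread per task; the results come back in input order.
    with ThreadPoolExecutor(max_workers=len(limits)) as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    limits = [20_000] * 4  # arbitrary workload, chosen only for illustration
    for label, runner in (("sequential", run_sequential), ("threaded", run_threaded)):
        start = time.perf_counter()
        results = runner(limits)
        elapsed = time.perf_counter() - start
        print(f"{label}: {elapsed:.2f}s, results={results}")
```

On a typical CPython build, the threaded run usually takes about as long as the sequential one, even on a multicore machine. Exact timings will vary by machine and interpreter.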
With such performance and multithreading issues, why still consider Python? There is, in fact, a very good reason.
Python’s saving grace: Shorter, cleaner code
Python’s saving grace can be found within the original premise above: all other things being equal, shorter code is more likely to be bug-free.
When you combine Python’s dynamic typing with its generally very compact syntax, you can succinctly and clearly express complex ideas and calculations in fewer lines of code than equivalent C-family languages. This reduces the cognitive load on the programmer, and a lot of boilerplate functionality is already built into the language. The resulting code is likely to be shorter and clearer.
Here’s another way to think of this: Consider the lines of code as an attack surface for bugs. Fewer lines of code means less attack surface.
Code examples: Python vs. Java
Let’s start with the good old “Hello World” app. To create this in Java we are looking at five lines:
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}
The same thing in Python is only one line:
print("Hello, World!")
However, we could dismiss those as merely boilerplate code. A slightly more concrete example is defining a class. A simple class in Java takes about a dozen lines:
public class Person {
    private String name;
    public Person(String name) {
        this.name = name;
    }
    public void greet() {
        System.out.println("Hello, my name is " + name);
    }
    public static void main(String[] args) {
        Person person = new Person("Alice");
        person.greet();
    }
}
We achieve the same thing in Python in seven lines:
class Person:
    def __init__(self, name):
        self.name = name
    def greet(self):
        print(f"Hello, my name is {self.name}")
person = Person("Alice")
person.greet()
Sometimes the savings are not so much in the number of lines as in readability. For example, here is Java code that uses a lambda expression to square every element of a list:
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class LambdaExample {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
        List<Integer> squared = numbers.stream()
                .map(n -> n * n)
                .collect(Collectors.toList());
        System.out.println(squared);
    }
}
And here is the code to achieve the same thing in Python:
numbers = [1, 2, 3, 4, 5]
squared = [n * n for n in numbers]
print(squared)
Finally, let’s compare a round of code golf in both languages to remove duplicates from a list. In Java the function reads as below:
public static List<Integer> RemoveDuplicateHashSet(List<Integer> items) {
    return new ArrayList<>(new HashSet<>(items));
}
The equivalent in Python is as follows:
def remove_duplicate_set(items):
    return list(set(items))
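One caveat with the set-based version: a Python set does not guarantee that items come back in their original order. If order matters, a minimal sketch can lean on the fact that dictionaries preserve insertion order in Python 3.7 and later. The function name remove_duplicates_ordered is hypothetical and not part of the examples above.

```python
def remove_duplicates_ordered(items):
    # Dictionary keys are unique and keep first-seen order (Python 3.7+),
    # so this drops repeats without reordering the remaining items.
    return list(dict.fromkeys(items))


print(remove_duplicates_ordered([3, 1, 3, 2, 1]))  # [3, 1, 2]
```

As with the set-based version, this assumes the items are hashable.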
Rebuttals: Versatility, libraries, dynamic vs. static typing
There are several rebuttals to this premise that shorter code is less buggy.
Probably the first and simplest point is that a programming language’s aim is not to minimize keystrokes. (Again, code golf.) A language optimized for brevity can be less versatile than comparable, more verbose languages.
Another concession is that Python relies heavily on outside libraries to perform actions that other languages accomplish natively. This effectively hides the true size of the Python code.
Complexity is another argument against Python. It is notoriously hard to model in Python, and any shorter code produced is likely to be dense and thus complex. If the prevailing metric to measure the likelihood of bugs is the complexity of the code, then the code’s length is not a good measure of that.
Finally, Python is a dynamically typed language. Compared to a statically typed language, it is more likely to have bugs that do not emerge until runtime due to type mismatches and unexpected data casting.
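To illustrate, here is a small sketch of how such a bug can hide until runtime. The function total_price and its arguments are invented for this example. The idea is that a quantity arriving as a string (say, from user input) is accepted without complaint and only misbehaves when the code actually runs.

```python
def total_price(quantity, unit_price):
    # No type declarations: Python accepts whatever is passed in.
    return quantity * unit_price


print(total_price(3, 2.5))   # 7.5, the intended use

# A string quantity does not fail here; "3" * 2 is string repetition.
print(total_price("3", 2))   # 33 (the string "33", not the number 6)

# Only some combinations raise an error, and only when this line executes.
try:
    total_price("3", 2.5)
except TypeError as exc:
    print(f"TypeError at runtime: {exc}")
```

A statically typed language would generally reject the string argument at compile time. In Python, optional type hints combined with an external type checker can catch some of these cases before the code runs.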
Conclusion: Is Python code both shorter and less buggy?
All these rebuttals have truth to them — but so does the original premise.
All other things being equal, including programmer skill, Python requires fewer lines of typed code to achieve the same outcome as other languages.
This is a complex and emotive topic for developers, but it is worth some thought. In the end it might be the saving grace for the Python language and ecosystem.
David “Walker” Aldridge is a programmer with 40 years of experience in multiple languages and remote programming. He is also an experienced systems admin and infosec blue team member with interest in retrocomputing.