Reinvent the wheel

2011-09-02, 18:00

It is common wisdom that one should not reinvent the wheel, because doing so can be dangerous, wasteful of resources and time-consuming. The main critique is that since the wheel is already invented, available and free of conceptual flaws, reinventing it is pointless.

And while this is true when we talk about wheels, it seems less and less true when we’re thinking of software.

I am told that reinvention of the wheel is the first instinct of an incompetent programmer. I know, it’s harsh, and I might have agreed some time ago, when I didn’t yet have the experience of meeting really incompetent programmers and of hurting because of them. But now I’m pretty sure that it’s the other way around: reinvention of wheels is a quality of a very good programmer. Not only that, but the reinvention of wheels is at the core of the most successful projects out there, and their essential feature. I write this post and I dedicate it to all the programmers out there who dare to reinvent the wheels.

There is an axiom that says that no software is shipped without bugs. Some are obvious, some not, but they are there. We shall start from this axiom because it’s the foundation of how software is written these days; even if Dijkstra would be extremely disappointed in what I’m saying here, we must somehow accept that and take it as such. And from this axiom, we extract the first reason why you should reinvent your software wheel:

0. Their software is buggy as well

This is my favorite reason, but in itself is not a sufficient reason. While you might feel that this is an extreme reason, you just have to think about it, and you’ll understand what I want to say. For those unconvinced, just read ahead.

In my experience it’s not your code that causes the biggest problems. The biggest problems are caused by legacy code: code that has not been maintained for a long period of time, where the know-how has faded because either too much time passed or the team members that maintained that part of the code left the organization. The code you’re writing right now is not the problem – it’s the solution. The problem is the old code.

The problem with most of your own old code is that it lacks a few things. It lacks the overview that you gained in time over the issue that you once solved, it lacks the flexibility that you now need and it lacks the level of maturity that you gained by learning from the mistakes in that library. Life is about evolution, and I can tell that you evolved. And you can tell from your code as well.

How about the code of your colleague that was so upset about the food at the nearby restaurant that he quit and moved to India? Well, maybe he wasn’t that good of a programmer, was he? Or, perhaps, he just didn’t like to explain the things that are bugging you when you’re using his code. There are no samples of how it should be used. And it’s awfully limited.

1. Legacy code is a liability. Remove it, re-write it, or hide it in a library that’s tested to hell and back and incredibly well commented.

You want nothing from the legacy code, just to leave you alone. What you want, instead, is the capability to write new good code fast – and to test it well. If you don’t, then follow my advice: lock that in a portable library, test it in all possible cases and comment it well. Use it like it’s written by your worst enemy. After you finish that, it becomes THEIR software. To which we add the axiom: their software is buggy. And soon enough, you’ll find work-arounds for bugs in your own code. And when you have to find work-arounds for bugs in your own code, you realise that it would have been a lot better to have the wheel reinvented, and to have the ability to quickly update it and to fix that bug. But if the code is old, the risk of updating it might be too high.

What about code reuse? Isn’t that a great idea? Well, yes, of course it is, but if you do code reuse, make sure you reuse recent code. Why? Really, you didn’t read above? Let me repeat, then: because legacy code is a liability. I really wish I could give you examples, but I can’t: the examples I have in mind are confidential due to my work relations. But if you nod in agreement, you need no example.

Getting back to code reuse, the main idea behind code reuse is that you don’t want to rewrite the same thing over and over again. How about using some code written by others? Let’s say the standard C library, STL, Boost, J2EE or the .NET framework? I pick on the standard libraries because they are at the core of programming: there are very few programs that don’t use them. Modern programmers especially rely exclusively on very complex tools like the .NET framework, and that simplifies the work a lot. You can put elements together with high speed and high precision. When you piece things together they just work, without too many headaches.

Now I am thinking about large projects, large in both time and ‘space’ (a lot of source code, a lot of people involved). In a large project, the amount of code is probably a lot larger than the core frameworks, and such projects get to use and overuse certain features. And these huge projects probably influence the way the frameworks are built – it’s these projects that drive the frameworks, it’s the sponsors of these projects that decide how your core tools are built. And your project is not so big, and definitely doesn’t own as many resources as others. You’re not a deciding factor – and therefore you’re stuck with ‘stock software’.

What happens with the stock software? Well, first it’s the bugs. Some huge chunks of software even advertise features that are badly done or not implemented at all. Some bugs will never get solved, and you have to work around them. That means that in every program where you use that feature you have to work around it. Sooner or later you’ll maintain your own wrapper over some standard features that you use often. But still you gain a lot: the software that’s already written, that saves you a lot of time. So true.

Or not. Please look at the screenshot fragment below. This is Total Commander, in my opinion the most efficient file manager out there. There is nothing like the beauty of browsing your files using it: it’s keyboard driven, so users that know how to use the keyboard (and they are fewer these days, I know) will be incredibly efficient. Yet it’s incredibly hard to use the WinForms controls to implement this. Why? Well, because Microsoft has a clear opinion on what you’ll use the controls for, and once you try to break out of there, you’re pretty much on your own.

In another example, there is no way to fully customize the WinForms controls on Compact Framework. If you choose to customize them, to change the background color, for example, you’ll end up having red buttons with gray margins. Why? Because Microsoft never thought that you’d ever need to change the color of the 3D effect it creates for the buttons. You’ll end up drawing gray controls, and soon enough you’ll realize that your application is bland and actively boring. And this is because of reason #0 which brings #2:

2) Other people’s bugs are worse than your own, and costlier to work around.

Why? Because you don’t know about them until you find them. And at first you’d never suspect that ‘it is their fault’. A good programmer starts from the presumption that it’s his fault, and makes sure that he’s not doing anything wrong. But you have to rely on solid code and solid components to find your bugs fast. And when the bug is in your standard C library, it is kind of hard to see. Ah, you thought that just because it’s standard, it doesn’t have bugs? Look at this: a list of open bugs in glibc – an implementation of the C standard library. Some bugs refer to basic functionality: for example, wrong output for strtod, a call you’d usually rely on to just work. You’ll soon enough understand why everyone in his right mind uses a wrapper over the standard library, or uses his own implementation.
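To make the wrapper idea concrete, here is a minimal sketch in C++ of what such a wrapper over strtod could look like. The name parse_double and its exact rules (whole string must be consumed, no overflow, no inf or nan) are my own assumptions for illustration, not something taken from glibc or from any particular project. It cannot fix a wrong conversion inside the library, but it gives you one single place to add a work-around when you find one.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: parse the whole C string as a finite double.
// Returns true and writes `out` on success; leaves `out` untouched on failure.
// It still trusts std::strtod for the conversion, but checks everything around it.
bool parse_double(const char* text, double& out) {
    assert(text != nullptr);                  // precondition: a valid C string
    errno = 0;
    char* end = nullptr;
    const double value = std::strtod(text, &end);

    if (end == text) return false;            // nothing was converted
    if (*end != '\0') return false;           // trailing characters after the number
    if (errno == ERANGE) return false;        // overflow or underflow was reported
    if (!std::isfinite(value)) return false;  // reject "inf" and "nan"

    out = value;
    return true;
}
```

Every caller now goes through parse_double instead of calling strtod directly, so the day you hit one of those open bugs you patch one function and not fifty call sites.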

If you can ignore that other people’s bugs are worse than your own and you think you can just live with them, work around them or spend the countless hours needed to fix a bug in someone else’s code, maybe this will change your mind.

When was the last time you felt the need to write your own memory allocator? For me it was about one year ago; and maybe if you never felt the desire and need to write your own memory allocator this post is not for you. Why would I write a potentially buggy memory allocator? Well, that’s simple: because of faulty memory allocators, that’s why. And because of the lack of features. And because I want memory pools that I can free whenever I want, in a more predictable amount of time. And because of rule #2, and because of #0.

Nobody actually thought of what I want my memory for. And nobody can ever imagine the scenarios I want to create with my memory allocator. The standard offering is allocate, free and (maybe!) reallocate, which may or may not be possible without copying data. That doesn’t suit me; I want more control.
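For those who never wrote one, this is roughly what I mean by a memory pool that can be freed whenever I want in predictable time. It is a minimal sketch, not the allocator I actually wrote: the class name Arena, the fixed capacity and the bump-pointer strategy are all assumptions chosen to keep the example short. Allocation just advances an offset, and reset() releases everything at once no matter how many allocations were made.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical fixed-size arena: bump-pointer allocation, no per-object free,
// and a constant-time reset() that releases every allocation at once.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Returns `size` bytes aligned to `align`, or nullptr when the pool is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // must be a power of two
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.get());
        const std::uintptr_t mask = static_cast<std::uintptr_t>(align) - 1;
        const std::uintptr_t aligned = (base + used_ + mask) & ~mask;
        const std::size_t offset = static_cast<std::size_t>(aligned - base);
        if (offset > capacity_ || size > capacity_ - offset) return nullptr;
        used_ = offset + size;
        return buffer_.get() + offset;
    }

    // Frees everything; the cost does not depend on the number of allocations.
    // Note: no destructors are run, so this suits plain data only.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};
```

You would create one Arena per request, per frame or per job, allocate from it freely, and call reset() at the end. That is exactly the kind of control that allocate, free and reallocate don’t give me.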

I gave the example of the Total Commander browsing widget. I can’t find it anywhere in the standard offerings – that’s why Mr. Ghisler makes some money out of it, although it looks like it comes from the stone-age. I can’t improve it. The author of Total Commander invested a ton of time in it, and if I want to get there too, I have to do it too. That’s because #3:

3) Standard offerings don’t satisfy all the customers. Your needs are custom, and they are probably unmet by standard offerings.

Off-the-rack is not for everyone, and most of the time you need your things custom made. Sure, for a two-person project you’ll choose the standard offering and make the best of it. But specializing people in the intricate details of someone else’s technology is as dumb as the US exporting all its knowledge base to China, then wondering why they can’t make a Kindle at home. Does it make sense in the short term? Yes, it does. In the medium term? Oh… maybe. In the long term? That technology is already dead. I see this error made over and over again, and it was the main advantage of Microsoft all this time. This is why Microsoft is a very very very smart software company. It invents its own wheels, and makes sure that everyone tests them for it (of course they do their own testing too).

What’s the other software company that took this wheel idea further? Well, let’s see, someone who builds its own software AND hardware… and is seen today as the supreme inventor of everything. (see PS)

And what about Google? Well, let’s think for a bit: they did indeed use the Linux kernel. But they wrote their own Java. Why? Because the standard offering was not for them.

Hey, let’s even use Linus Torvalds as an example: Minix was almost good, but Linux was in time better. A reinvented wheel that now, 20 years later, redefines the way we do computing.

Why do all the big names do their own thing? Well, it’s a simple rule. If you use another man’s wheel, you’ll always be poorer than him, because you pay him to make your wheel spin the right way. Or, rule #4:

4) If you want to be on top, you have to reinvent the wheel. Leeches are not competitive, except against other leeches.

And, with no further ado, rule #5:

5) Because it’s too damn fun.

Now, there are some caveats to what I’m writing here. This does not apply to everyone. In fact it only applies to a certain category of programmers and entrepreneurs, and only they can understand what I’m saying here. I know that this category is very badly represented among the great masses. But in the end that’s how it has to be.

And nobody says that you should reinvent all the wheels. Only the ones that are at the core of your business; because those are the ones that define you as a programmer, or you as a business.

So don’t be afraid if your wheel needs some adjustments. Maybe it just needs a bit of reinventing. And it needs you, to have fun while doing it. So please, start working on your own wheel model today!

PS: yes, it’s Apple.

Later edit: I forgot to add the best reason for reinventing your wheel. Nobody will ever be able to tell you that your old wheel is inadequate, and you should upgrade to wheel 3.5. That’s because you can’t be bothered really with upgrades that you don’t need on your OWN tools.

And an addition to reason #3: One more reason is that you don’t need all those features. For example you might not need the full glibc; sometimes uClibc is just fine and dandy. And it will make your programs faster. And you can actually avoid all those features. Also, let’s say you need a smart pointer in C++. You’ll find the best implementation in Boost: but if you bring 100 MB of sources into your project, you’ll be dead by sunrise. And who maintains all that nonsense that Boost has?
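To show how small the wheel can be when you only need one feature, here is a minimal sketch of a single-owner smart pointer. The name Owned and its exact interface are assumptions made for this example; it needs a C++11 compiler for move semantics, and it deliberately leaves out arrays, custom deleters and shared ownership. If your compiler ships std::unique_ptr, that one does the same job; the point is only how little code the core idea takes.

```cpp
#include <cassert>
#include <utility>

// Hypothetical minimal owning pointer: exactly one owner, movable, not copyable.
template <typename T>
class Owned {
public:
    explicit Owned(T* raw = nullptr) : ptr_(raw) {}
    ~Owned() { delete ptr_; }

    // Copying would create two owners of the same object, so it is forbidden.
    Owned(const Owned&) = delete;
    Owned& operator=(const Owned&) = delete;

    // Moving transfers ownership and leaves the source empty.
    Owned(Owned&& other) noexcept : ptr_(other.ptr_) { other.ptr_ = nullptr; }
    Owned& operator=(Owned&& other) noexcept {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }
        return *this;
    }

    T& operator*() const { assert(ptr_ != nullptr); return *ptr_; }
    T* operator->() const { assert(ptr_ != nullptr); return ptr_; }
    T* get() const { return ptr_; }
    explicit operator bool() const { return ptr_ != nullptr; }

private:
    T* ptr_;
};
```

Usage is what you would expect: Owned<Widget> w(new Widget); w->draw(); and the Widget is deleted when w goes out of scope (Widget and draw being placeholders for your own types).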

As further reading, I recommend Joel Spolsky’s Fire and Motion. It will make you hate the way you’re kept addicted to software upgrades that you don’t need.