Closed Communication is Harmful to Open-Source

February 26, 2007

Let's say you have an itch to scratch, and you start an open-source project, uploading the initial code to a public website for anyone to view and suggest modifications to. The problem now is that there is no systematic way for people to post their suggestions, bug fixes, or enhancement patches.

In response to that, you set up an open bug-tracking system such as Bugzilla or Jira. Contributors to your project are happy, because now they have an open forum to discuss the issues they care about in your project. You find some prolific contributors who agree with your philosophy about where the project should be heading. (If they do not agree, they are willing to discuss, believing that such discussion will lead to a better roadmap. After all, isn’t the open-source movement based on the principle that two minds are better than one? Remember “Given enough eyeballs, all bugs are shallow”? That was just a fancy way of saying “the more the merrier”.) You make them committers to the project, i.e. folks who have the right to make changes to the central source repository of your project.

Time passes, and your project gets a few other contributors. They submit patches, bug fixes, and improvements to your project. Of course, now your project is big, visible, and deemed popular. Do you give up the open discussion group that you formed to foster the growth of the balm for what was once your personal itch? Do you often talk to the other committers of your project in a closed forum (such as IM or personal e-mail)? Do you weigh the merits of submitted patches in this closed forum, deciding there which patches to commit and which to reject?

If so, look at your practices again, because they show that you have not committed to developing open-source software. The main factor feeding the development of open-source software is open communication. There is a reason that open-source bug-tracking software allows anyone to subscribe to the discussion of one or many bugs, features, or issues that interest them. Contributors to open source are contributors for a reason: they cherish the development process. And open discussion is perhaps the most important feature of the open-source development model.

Closed communication is a relic of the Cathedral. The Bazaar demands open communication.

So, open-source project administrators, if you want to have a private conversation about which pub to go to for drinks this evening, it's okay. But do not discuss the project's development roadmap, the merits of contributions, or whether to approve a particular patch in private. Stick to the spirit and the letter of open source by embracing open communication.


Program in Parallel, NOW!

February 20, 2007

In the old days of parallel computing, where I would like to claim my fifteen minutes of fame, the scalability of a parallel application was THE metric for measuring the success of your approach. If your application did not scale linearly with the number of processors available to you, you were toast, period. My recent experience suggests otherwise. In a design planning meeting that I recently attended in an advisory role, the client said, “We run this app in 2 hours sequentially. We really need to get it below an hour. No matter how many CPUs we need to throw at it.” The purist in me would have said: if I rewrote your app according to my training, you would need only two machines to get to an hour. But guess what, I don’t have time to rewrite their app, nor do they have the kind of money I would deserve for rewriting it. They might as well throw money at the hardware infrastructure.
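The client's arithmetic can be made concrete with a rough sketch. Going from two hours to one hour is a speedup of 2, and with perfectly linear scaling that takes two machines. The sketch below uses Amdahl's law as a model, which is an assumption brought in here for illustration and not something measured on the client's app; the function names and the parallel fractions are hypothetical.

```python
import math
from fractions import Fraction


def amdahl_speedup(parallel_fraction, cpus):
    """Speedup predicted by Amdahl's law when `parallel_fraction` of the
    sequential run time can be spread evenly over `cpus` processors."""
    p = Fraction(parallel_fraction)
    return 1 / ((1 - p) + p / cpus)


def cpus_needed(parallel_fraction, target_speedup):
    """Smallest CPU count that reaches `target_speedup` under Amdahl's law,
    or None if the serial part alone makes the target unreachable."""
    p = Fraction(parallel_fraction)
    # Share of the original run time left over for the parallel part
    # once the serial part has been paid for.
    budget = 1 / Fraction(target_speedup) - (1 - p)
    if budget <= 0:
        return None
    return math.ceil(p / budget)


# Hypothetical parallel fractions; exact fractions avoid rounding surprises.
for p in (Fraction(1), Fraction(3, 4), Fraction(3, 5), Fraction(1, 2)):
    print(f"parallel fraction {p}: CPUs needed for 2x = {cpus_needed(p, 2)}")
```

Under this model, an app that is only three-fifths parallel needs six CPUs instead of two to halve its run time, and an app that is half serial can never quite get there. That is the gap between the purist's rewrite and simply throwing hardware at the problem.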

Parallelism has been forced upon us. The research doctrine of “publish or perish” has changed. Now it is “parallel or perish”. In the past, proponents of parallel programming cited speed-of-light arguments to promote their wares. Now they do not need to cite theoretical physics. They can point to the hardware industry’s push over the last few years to conserve power by moving to multi-CPU architectures instead of giving you, the consumer, a single 100 GHz CPU.

Speeding up gates and transistors has failed. Power leakage and similar technical factors have brought the uniprocessor camp to its knees. They have given up. IBM and Sun came to their senses early. But Intel, the behemoth of computing silicon and the one that matters most to you and me, has come to its senses only recently (in the last couple of years). Now there is no going back from the promised land of parallel computing. Instead of raising the speed of the all-powerful single processor, the largest designer of the brains of our computers (in terms of the number of instructions executed in the world) has said, “we were wrong, you were right”. It's a cause to celebrate, my fellow parallel programmers.

With the falling latency of interconnects between CPUs (AMD’s HyperTransport) or between machines (PathScale’s InfiniPath at 1.29 µs), parallel programmers have started thinking about hundreds of thousands of machines to execute their scientific computations (aka stupid matrix transformations). They have never thought about the kind of parallelism needed to parse an XML serialization of a large row in a database. (After all, these things could be sped up by fast single-threaded string processing, right?) So the addition of two or four cores to the CPU itself must have been news they would treat as trivial. But is XML parsing really trivial to parallelize? This is the chance we have been preparing for, the chance to show the world: exactly this moment of fame in the larger scheme of things!

It is a chance to show the world that we know our threads, our locks, our critical sections, our semaphores, our RPCs, our asynchronous communication. We know what it takes to get their application the required twenty percent speedup when run on two CPUs, or a hundred percent speedup on a rack full of machines.
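For readers who have not met these tools, here is a minimal sketch of threads, a lock, and a critical section working together. The function name parallel_sum and the choice of summing numbers are illustrative assumptions, not anything from a real application. Note also that in standard CPython the global interpreter lock keeps threads from speeding up CPU-bound work like this, so the sketch shows the coordination pattern and not a speedup.

```python
import threading


def parallel_sum(values, workers=2):
    """Sum `values` using `workers` threads and a lock-protected total."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        partial = sum(chunk)   # independent work, no lock needed
        with lock:             # critical section: one thread at a time
            total += partial

    # Strided slices hand each thread a disjoint share of the input.
    threads = [
        threading.Thread(target=work, args=(values[i::workers],))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()               # wait for every worker before reading total
    return total


print(parallel_sum(list(range(1001)), workers=4))
```

Each thread does its own work outside the lock and holds it only for the single shared update. Keeping the critical section that small is what leaves room for any scaling at all.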

This is the time to inflict our craft on the computing world, demonstrating that the poor sequential programmers are what they always were. Poor. And sequential.

Start coding for multiple processors. NOW.