Monday, October 22, 2007

Open Source Phenomenon

The Open Source Phenomenon has changed drastically the skills set requirement for our software development professionals today.

Traditionally, software development involves not just writing the business logic of your application, but also a lot of plumbing logic (such as data persistence, fault resilience). These plumbing logic are typically independent of your business logic, and also quite complicated (although they are well-studied engineering problem). In most projects, large portion of the engineer effort had gone to re-inventing solution for these recurring plumbing problem.

The situation is quite different today in our software development world. Thanks to the Open Source Phenomenon, most of the plumbing logic are available in some forms of software libraries or frameworks with source code available. There are different kinds of open source licensing model with difference degrees of usage or modification restrictions. Nevertheless, the software development game plan was changed drastically since then. Instead of building all plumbing from scratch, we can just leverage what is out there as Open source libraries/frameworks. Open source phenomenon bring huge savings in the project cost and also reduce time to market.

However, surviving in the Open Source world requires a different set of software engineering skill sets that is not taught in our college. Most techniques we learn from colleges assumes a green-field development environment and we learn how to design a good system from scratch rather than how to understand or adapt existing code. We learn how to "write code" but not "read code".

Skills to survive in the Open Source world are quite different. Such as ...

"Knowing how to find" is more important than "knowing how to build"

"Google", "Wikipedia" are your friend. Of course, "Apache", "SourceForge" is good place to check. Sometimes I find it is not easy for a hard-core software engineer to look for other people's code to use. Especially before she has certain familiarity of the code base. The mindset need to be changed now. The confidence should be based on its popularity and in some case, you just use it and see how it goes.

"Quick learning" and "start fast"
Looking at the usage examples is usually a good start to get a quick understanding on how things works. Then start to write some Unit Test code to get a better feel on the area that you are interested.

Be prepared to "switch"
This is very important when you are using less proven Open Source products. Even with proven open source product, the level of supports may fluctuate in future, not something under your control. Therefore, your application architecture should try hard to minimize the coupling (and dependencies) to the open source products.

In most cases, you need just a few features out of the whole suite of the open source product. A common technique that I use often to confine my usage is to define an interface that captures all the API that I need. And then write an implementation which depends on the Open Source product. Now the dependency is localized in the implementation class. Whenever I switch, I just need to rewrite the implementation of the interface. I can even use IOC (inversion of control) mechanism to achieve even zero dependencies between my application to the Open source product.

Ability to drill down

Quite often, the open source provides most of what your need, but not 100%. In other scenarios, you want to do some enhancement, or fixing a bug. You need to be able to drill down into the code and feel comfortable to make changes.

This is a tricky area. I know many good engineers who can write good code, but only very few can pick up code written by other people, and be able to familiar with that. As I said earlier, we are trained to write code, but not read code. In dealing with open source product, code reading skills is very important.

In fact, the skills of reading code is generally very important as most projects are in maintenance mode rather than green-field development mode. I will share code reading technique more detail in later blog when I talk about doing surgery on legacy code. But here is a quick summary of an important subset that are relevant to dissecting open source products ...
  1. Look at its package structure, class name, method name to get some rough ideas about the code structure and data structure.
  2. Put some trace code that prints out its execution path, then run some examples to verify your understanding.
  3. For further drill down, put in break points and step through the code
  4. Document your findings
  5. Subscribe to the community mail alias and send an email to ask

No comments: