
Richard's technical notes

Wednesday, May 25, 2005

Common Development and Distribution License

This blog is about software licensing based on my personal reading of license agreements and should not be taken as advice of any kind. If you intend to act or not act based on the contents of this blog, you do so at your own risk.

The OSI have approved Sun's Common Development and Distribution License (CDDL) so I thought I'd take a look and see what all the fuss is about.

When I license software I've usually gone for the GPL, occasionally the LGPL, or a custom commercial license (which brings joy to our solicitors).

CDDL is new for me. It's based on the Mozilla license (MPL). Compared to the GPL, I find the Mozilla license to be a tough read -- due to the phrasing, the number of parties involved and the way it extends into other areas of "intellectual property". So the place to start with this stuff is Andrew M. St. Laurent's wonderful book Understanding Open Source and Free Software Licensing. It deals with the MPL but was published before the CDDL arrived.

Here's a summary, ignoring all the details, of how I think of the licenses (so please take a moment to re-read my disclaimer at the start of this blog). The GPL says: wherever you use this source code, the resulting project must also be made available in source form for anyone to use under the terms of the GPL. My gut reaction here is that this is what I mostly want for things I give away for free: you can't "steal" my software and put it into a commercial product.

In contrast, the MPL says: if you modify the source code, document your changes, give us credit, make the modifications available as source, but you can use the source in another project and license that larger work however you want. This is quite a leap: here's my freely donated code, if you make it better I want to see the changes, but otherwise go ahead and commercialize it or do whatever you want with it. In some senses that's making the code more valuable and giving more freedoms than GPL. I'm leaning towards this style of license now.

The CDDL says: this is the MPL but cleaned up so you can use it without having to resolve disputes in, and only in, California.

It's important to note that source licensed under the GPL cannot be mixed with source licensed under the MPL or the CDDL -- but see the FSF's comments on various licenses for more information. This means you need to decide where you stand on the various freedoms offered by the various licenses, or get into dual or triple licensing and everything that entails.

Thursday, May 12, 2005

Sleepycat: Berkeley DB Java Edition

I recently had a look at Sleepycat Software's Berkeley DB Java Edition. What's on offer is a transactional data store that can be embedded inside an application, including inside web applications. Compared to other embedded products, this one has been around forever, so it should be bulletproof.

I think of the Berkeley DB as more-or-less a persistent hash table. One object is a key, another is the content, and the rest is all put and get against a transaction ("extends TupleBinding" is the magic needed on a class). It's fast, it's reliable and it's simple to use. The lack of external dependencies (no server to worry about) means it's a doddle to write unit tests for. There is a little bit of serialization/deserialization code to write, but even that's pretty easy.
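To make the "persistent hash table" picture concrete, here is a minimal sketch in plain Java. It is not the Berkeley DB API: `Customer`, `CustomerBinding` and `TinyStore` are hypothetical names invented for illustration, the store lives in memory only, and transactions are left out entirely. It just shows the shape of the work: a small piece of serialization code converts an object to and from bytes, and everything else is put and get by key.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical record type, used only for illustration.
final class Customer {
    final String name;
    final int creditLimit;

    Customer(String name, int creditLimit) {
        this.name = name;
        this.creditLimit = creditLimit;
    }
}

// Stand-in for the serialization code you write per class.
// Assumption: this is NOT Berkeley DB's TupleBinding, only the same idea.
final class CustomerBinding {
    static byte[] toBytes(Customer c) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(c.name);        // fields are written in a fixed order...
            out.writeInt(c.creditLimit);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    static Customer fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String name = in.readUTF();  // ...and read back in the same order
            int creditLimit = in.readInt();
            return new Customer(name, creditLimit);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// In-memory stand-in for the store: keys map to serialized bytes.
final class TinyStore {
    private final Map<String, byte[]> entries = new HashMap<>();

    void put(String key, Customer value) {
        entries.put(key, CustomerBinding.toBytes(value));
    }

    // Returns null when the key is absent.
    Customer get(String key) {
        byte[] data = entries.get(key);
        return data == null ? null : CustomerBinding.fromBytes(data);
    }
}
```

Because nothing here needs a server, a unit test can create a `TinyStore`, put a `Customer`, and read it back in a few lines, which is the property that makes an embedded store pleasant to test against.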

In fact, there are just two potentially fatal downsides.

When I went on my first and only proper database training course (for Illustra, which shares a history with Postgres) I'm sure one of the things I was told was: SQL is about declarative data access. You get to say what data you want without worrying about how you get it. That doesn't really sink in until you have to worry about the how-you-get-at-it part. With Berkeley DB, you do have to worry about that. For example, you're storing some data with one kind of key, and now you want to access the data some other way. Tough. You need to iterate over the data and figure it out, or implement an index for secondary keys. Pain and hassle. If you reach that point it's probably time to move to SQL.
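The two options mentioned above (scan everything, or keep your own index) can be sketched in plain Java. This is an illustration under assumptions, not Berkeley DB code: `Person`, `PersonStore` and the choice of "city" as the secondary key are all hypothetical. The point is how much bookkeeping lands on you once the data has to be reachable by anything other than its primary key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical record: id is the primary key, city is the secondary key.
final class Person {
    final String id;
    final String city;

    Person(String id, String city) {
        this.id = id;
        this.city = city;
    }
}

final class PersonStore {
    // Primary data: id -> record.
    private final Map<String, Person> byId = new HashMap<>();
    // Hand-maintained secondary index: city -> ids of people in that city.
    private final Map<String, Set<String>> idsByCity = new HashMap<>();

    void put(Person p) {
        Person old = byId.put(p.id, p);
        if (old != null) {
            // Overwrite: the old index entry must be removed or the index goes stale.
            Set<String> ids = idsByCity.get(old.city);
            if (ids != null) {
                ids.remove(old.id);
                if (ids.isEmpty()) {
                    idsByCity.remove(old.city);
                }
            }
        }
        idsByCity.computeIfAbsent(p.city, c -> new TreeSet<>()).add(p.id);
    }

    Person get(String id) {
        return byId.get(id);
    }

    // Option 1: no index, so iterate over every record and filter.
    List<Person> scanByCity(String city) {
        List<Person> matches = new ArrayList<>();
        for (Person p : byId.values()) {
            if (p.city.equals(city)) {
                matches.add(p);
            }
        }
        return matches;
    }

    // Option 2: look up ids in the secondary index, then fetch each record.
    List<Person> findByCity(String city) {
        List<Person> matches = new ArrayList<>();
        for (String id : idsByCity.getOrDefault(city, Collections.emptySet())) {
            matches.add(byId.get(id));
        }
        return matches;
    }
}
```

In SQL the equivalent is a `WHERE city = ...` clause, with the database deciding whether to scan or use an index. Here every write path has to remember to keep `idsByCity` in step with `byId`, which is the "pain and hassle" part.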

The second problem is all about money. At first glance the licensing terms look great, but digging into the definition of "redistribution" it becomes apparent that any commercial application needs to buy a license. Fair enough, so how much? "For pricing information, or if you have further questions on licensing, please contact us." So I did, and we'd be looking at US$40k to US$150k before annual support. Considering that the price is not based on the number of servers or CPUs, or on any time period, it's not so bad. But it's not so good if you can work with MySQL or PostgreSQL. (Other pricing is available, if you want to talk to the Sleepycat sales team.)

Summary: I found it rock solid and blindingly fast. It's a good product, but we all have different trade-offs for what we do. I'd consider it for specialized projects, but for the rest of the time I'll stick with SQL.

If you want to try it out, the getting started guide (PDF) is a well-written document worth reading.