Saturday, January 24, 2009

Open source software is self service software...

Jeff Atwood at Coding Horror makes an interesting point about open source software being "self service" software. That immediately struck a chord with me, as I saw his point straight away.

But he didn't quite mean what I expected him to say. What he meant was that participating in open source projects is a self service enterprise, and thus leaders of open source projects need to make their projects as easy as possible for people to serve themselves on. There are particular issues that go with any kind of self service.

But the same rings true for _users_ of open source software: those users are exactly that, self service users. With open source software the users help themselves. They fix their own bugs, get their own help and basically make their own way. They do not rely on a commercially driven entity in order to gain value from the software they are using.

And I think companies thus need to take this into account when they're considering whether to use open source software. At cinemas in South Africa there are self service kiosks for buying movie tickets (I don't know if they have them elsewhere in the world), and I really appreciate these. I do not have to wait in the queue, and I get to pick where I want to sit.

However, only particular people are able to use these kiosks. They are not everyone's cup of tea. You need to be technologically savvy, and you need to be willing and able to back yourself; in other words, to try new things and to have the confidence and wherewithal to see them through.

But let me say that not all open source is as self service as "self service" suggests. Some open source initiatives are supported by commercial entities, and some open source applications look and behave like their closed source equivalents - OpenOffice and the Apache web server being two simple examples.

In general though, using open source is analogous to choosing the self service option when it is presented, and seeing things in that light helps to clarify some of the challenges and value adds of open source software.

Friday, January 09, 2009

Hibernate: What does update actually mean?

I work on a large enterprise application that uses Hibernate for its persistence.

I regularly find the equivalent of this in the code base...
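// person is loaded inside the current session, so it is already persistent
Person person = personDao.loadById(id);
person.setLastName("newLastname");
person.setFirstName("newFirstname");
// redundant: the changes above are persisted without this call
personDao.update(person);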
The update call in the above code snippet is unnecessary and useless.

Even after being on the project for more than 2 years I still find developers not understanding the purpose of update, and I think I understand why. The obvious thought process is: I've changed a stored object, so I now need to call update to save the change. Sounds reasonable enough? That, however, is not the purpose of update.

This particular use of update betrays a misunderstanding of the nature of Hibernate, namely that it is a "transparent" persistence framework. What that means is that persistence is made to be transparent to the developer. You should not have to worry about the persistence issues. You load objects from the data store, you change the objects, and the changes are saved without any explicit save instructions. Imagine for a minute that you had to call update whenever you made a change: the increase in complexity would be significant, as you would have to keep track of every object you actually changed.
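
One way to picture this transparency is a session that remembers the state each object had when it was loaded and compares against it when the transaction ends. The sketch below is a toy model in plain Java, not Hibernate's API or its actual implementation: ToyPerson, ToySession and the map standing in for the database are all names I've made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical entity, loosely mirroring the Person in this post.
class ToyPerson {
    final long id;
    String lastName;

    ToyPerson(long id, String lastName) {
        this.id = id;
        this.lastName = lastName;
    }
}

// Toy stand-in for a session. This is NOT Hibernate, only the snapshot idea.
class ToySession {
    private final Map<Long, String> database;                     // id -> stored last name
    private final Map<Long, ToyPerson> managed = new HashMap<>(); // objects this session tracks
    private final Map<Long, String> snapshots = new HashMap<>();  // state as it was when loaded

    ToySession(Map<Long, String> database) {
        this.database = database;
    }

    // Loading hands out a tracked object and remembers its loaded state.
    ToyPerson load(long id) {
        ToyPerson existing = managed.get(id);
        if (existing != null) {
            return existing;
        }
        ToyPerson person = new ToyPerson(id, database.get(id));
        managed.put(id, person);
        snapshots.put(id, person.lastName);
        return person;
    }

    // Flushing compares each tracked object with its snapshot and writes only the differences.
    // Returns the ids that were written.
    List<Long> flush() {
        List<Long> written = new ArrayList<>();
        for (ToyPerson person : managed.values()) {
            if (!Objects.equals(person.lastName, snapshots.get(person.id))) {
                database.put(person.id, person.lastName);
                snapshots.put(person.id, person.lastName);
                written.add(person.id);
            }
        }
        return written;
    }
}

class DirtyCheckingDemo {
    public static void main(String[] args) {
        Map<Long, String> database = new HashMap<>();
        database.put(1L, "oldLastname");

        ToySession session = new ToySession(database);
        ToyPerson person = session.load(1L);
        person.lastName = "newLastname"; // no update call anywhere
        session.flush();                 // stands in for the end of the transaction

        System.out.println(database.get(1L)); // prints newLastname
    }
}
```

The calling code never tells the session which objects it changed; the session works that out for itself, which is exactly the bookkeeping you would otherwise have to do by hand.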

So then, if update is not for that, then what is it for?

The update method could have been named "attachAndPersistChanges" - that is a more accurate name, but probably a little unwieldy. It could also be described by the sentence "update this object in the data store". In other words, update is designed for objects that are not currently persistent.

The update method takes an object that is not persistent (a detached object), makes it persistent and updates the data store with any changes present in that object. No other instance with the same identifier may already be present in the level one (session bound) cache, otherwise an exception will be thrown (for that situation, use merge).

It does nothing for objects that are already persistent. I do not know if there is a performance penalty for running update on an already persistent object; if there is, it's probably minimal. It does, however, clutter up the code with needless lines.
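
To make those three behaviours concrete, here is a small sketch in plain Java. Again it is a toy model and not Hibernate itself: SketchPerson, SketchSession and the map used as a database are invented for illustration, the toy flush simply writes every tracked object, and the IllegalStateException is my stand-in for whatever exception the real framework throws.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity for illustration only.
class SketchPerson {
    final long id;
    String lastName;

    SketchPerson(long id, String lastName) {
        this.id = id;
        this.lastName = lastName;
    }
}

// Toy session: it only knows about objects it loaded or that were handed to update.
class SketchSession {
    private final Map<Long, String> database;                        // id -> stored last name
    private final Map<Long, SketchPerson> managed = new HashMap<>(); // session bound cache

    SketchSession(Map<Long, String> database) {
        this.database = database;
    }

    SketchPerson load(long id) {
        return managed.computeIfAbsent(id, key -> new SketchPerson(key, database.get(key)));
    }

    // Attach a detached object so that its state is written on the next flush.
    void update(SketchPerson detached) {
        SketchPerson existing = managed.get(detached.id);
        if (existing != null && existing != detached) {
            // A different instance with this id is already tracked (the case where merge is needed).
            throw new IllegalStateException("id " + detached.id + " is already in this session");
        }
        managed.put(detached.id, detached); // no effect if this instance is already tracked
    }

    // Writes the state of every tracked object; objects the session does not track are ignored.
    void flush() {
        for (SketchPerson person : managed.values()) {
            database.put(person.id, person.lastName);
        }
    }
}

class ReattachDemo {
    public static void main(String[] args) {
        Map<Long, String> database = new HashMap<>();
        database.put(1L, "oldLastname");

        SketchSession first = new SketchSession(database);
        SketchPerson person = first.load(1L);
        // The first session ends here, so person is now detached.

        SketchSession second = new SketchSession(database);
        person.lastName = "secondLastname";
        second.flush();
        System.out.println(database.get(1L)); // prints oldLastname

        second.update(person); // attach to the second session
        second.flush();
        System.out.println(database.get(1L)); // prints secondLastname
    }
}
```

The first flush of the second session writes nothing because that session has never seen the object; only after update does the change reach the store.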

Then when should I use it?

My current project is a large enterprise system with Java on the back end, .NET on the front end and web services in between. The front end is fairly light and thus does not know about the data model. For that reason the front end does not operate directly on the data objects. If it did, there might be a case for using update: I've received an object from the front end with changes that were made by the user, I need to persist those changes, and so I call update on the object. The pattern we follow, however, is the one outlined in the code snippet. Load an object from the store, make changes to it based on data received from the front end, and the changes are automatically persisted with no update call.

But there is one case that I've encountered where we do need to use update, and it illustrates an important thing to remember about Hibernate...

Consider the code... 
// first transaction: person is loaded and changed while persistent
transactionController.startTransaction();
Person person = personDao.loadPersonById(personId);
person.setLastName("newLastname");
person.setFirstName("newFirstname");
transactionController.endTransaction();

// second transaction: the same object is changed again
transactionController.startTransaction();
person.setLastName("secondLastname");
transactionController.endTransaction();
I know it's a silly piece of code and you wouldn't find it like this in reality, but it illustrates the case simply. It is setting up a case where there are two transactions and one transaction is interacting with an object which another transaction touched.

What will the value of the person's last name be in the database? Believe it or not, it will be unaffected by the second setLastName operation. It will be "newLastname". Why, what's going on?

The thing to remember here is that as far as Hibernate is concerned, from the perspective of the second transaction, that person object is detached. The session that it was connected to is now closed, and any change made to it is made as if it were a normal POJO with no knowledge of the persistence store, even though a new transaction has been started. That object does not know about the new transaction.

And that is where update comes in. Update needs to be called on the person object in the second transaction to persist the change to the last name. This, as explained earlier, will attach the object to the new session and apply any required updates.
transactionController.startTransaction();
person.setLastName("secondLastname");
// attach the detached person to the new session so the change is persisted
personDao.update(person);
transactionController.endTransaction();
On our project, that is the only legitimate context where update is appropriate and necessary.