Sunday, October 18, 2009

Are ORMs really a thing of the past?

Stephan Schmidt has blogged about ORMs being a thing of the past. While he emphasizes ORMs' performance problems and dismisses them as leaky abstractions that throw LazyInitializationException, he does not present any concrete alternative. In his concluding section on alternatives he mentions ..

"What about less boiler plate code due to ORMs? Good DAOs with standard CRUD implementations help there. Just use Spring JDBC for databases. Or use Scala with closures instead of templates. A generic base dao will provide create, read, update and delete operations. With much less magic than the ORM does."

Unfortunately, all these things work only on small projects with a handful of tables. Throw in a large project with a complex domain model, requirements for relational persistence and the usual stack of requirements that today's enterprise applications carry, and you will soon discover that your home-made, less boilerplated stuff goes for a toss. In most cases you will end up either rolling out your own ORM or building a concoction of domain models invaded with indelible concerns of persistence. In the former case, your ORM is unlikely to be as performant or efficient as the likes of Hibernate. In the latter case, you will either end up building an ActiveRecord model with the domain object mirroring your relational table or, if you are more unfortunate, a bigger unmanageable bloat.
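To make the discussion concrete, here is a minimal sketch of what the "generic base DAO" in the quote might look like. Every name in it (CrudDao, InMemoryDao, Customer) is hypothetical and not taken from Stephan's post or from Spring JDBC, and the in-memory map is only a stand-in for the JDBC calls a real implementation would make. It shows why this approach is pleasant for a few tables: one small interface, one implementation per entity.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of a generic base DAO; none of these names come from a real library.
class GenericDaoSketch {

    /** The four CRUD operations, parameterised by entity type T and key type K. */
    interface CrudDao<T, K> {
        void create(T entity);
        Optional<T> read(K id);
        void update(T entity);
        boolean delete(K id);
    }

    /** In-memory stand-in; a real implementation would issue SQL over JDBC instead. */
    static class InMemoryDao<T, K> implements CrudDao<T, K> {
        private final Map<K, T> rows = new LinkedHashMap<>();
        private final Function<T, K> idOf; // extracts the primary key from an entity

        InMemoryDao(Function<T, K> idOf) {
            this.idOf = idOf;
        }

        public void create(T entity) {
            // putIfAbsent returns the existing row if the key is already taken
            if (rows.putIfAbsent(idOf.apply(entity), entity) != null) {
                throw new IllegalStateException("duplicate key: " + idOf.apply(entity));
            }
        }

        public Optional<T> read(K id) {
            return Optional.ofNullable(rows.get(id));
        }

        public void update(T entity) {
            // replace returns null when there is no row to update
            if (rows.replace(idOf.apply(entity), entity) == null) {
                throw new IllegalStateException("no such row: " + idOf.apply(entity));
            }
        }

        public boolean delete(K id) {
            return rows.remove(id) != null;
        }
    }

    /** Example entity (hypothetical). */
    static final class Customer {
        final long id;
        final String name;

        Customer(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        CrudDao<Customer, Long> dao = new InMemoryDao<Customer, Long>(c -> c.id);
        dao.create(new Customer(1L, "Ada"));
        dao.update(new Customer(1L, "Ada Lovelace"));
        System.out.println(dao.read(1L).map(c -> c.name).orElse("missing")); // prints "Ada Lovelace"
        System.out.println(dao.delete(1L)); // prints "true"
    }
}
```

Note what the sketch leaves out: associations between entities, lazy loading, dirty checking, caching and transactions. Those are the parts that tend to grow into a home-made ORM once the domain model gets large.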

It's very true that none of the ORMs on the market today is without its pains. You need to know their internals to make them generate efficient queries, you need to understand all the nuances of their caching behavior and, above all, you need to manage the reams of jars that they come with.

Yet, in the Java stack, Hibernate and JPA are still the best options when we talk about big persistent domain models. Here are my points in support of this claim ..

  • If you are not designing an ActiveRecord based model, it's of paramount importance that you keep your domain model decoupled from the persistent model. And ORMs offer the most pragmatic way towards this approach. I know people will say that this is difficult to achieve in real life and that compromises typically need to be made. Yet, I think if you need to make a compromise for performance or whatever reason, it's only an exception. Ultimately you will find that the majority of your domain model is decoupled enough for a clean evolution.

  • ORMs save you from writing tons of SQL code. This is one of the compelling advantages that I have found with an ORM: my Java code is not littered with SQL that's impossible to refactor when my schema changes. Again, there will be situations when your ORM does not churn out the best optimized SQL and you will have to write it manually. But, as I said before, that's an exception, and decisions cannot be made based on exceptions only.

  • ORMs help you virtualize your data layer. And this can bring huge gains in scalability. Have a look at how grids like Terracotta can use distributed caches like EhCache to scale out your data layer seamlessly. Without the virtualization of the ORM, you may still achieve scalability using vendor specific data grids. But this comes at the price of lots of $$ and vendor lock-in.
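The first and third points can be illustrated with a small sketch. It is plain Java under stated assumptions, not how Hibernate, EhCache or Terracotta are actually implemented, and all the names (Account, AccountRepository, SlowRepository, CachingRepository) are hypothetical. The idea it shows: when the domain class knows nothing about persistence and every read goes through one interface, a cache can be slipped in behind that interface without touching the callers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: decoupled domain object plus a read-through cache behind one interface.
class DataLayerSketch {

    /** Plain domain object: no SQL, no mapping code, no persistence base class. */
    static final class Account {
        final String number;
        final long balanceCents;

        Account(String number, long balanceCents) {
            this.number = number;
            this.balanceCents = balanceCents;
        }
    }

    /** The only view of persistence that the domain code sees. */
    interface AccountRepository {
        Optional<Account> findByNumber(String number);
    }

    /** Stand-in for a database-backed implementation; counts simulated round trips. */
    static class SlowRepository implements AccountRepository {
        int roundTrips = 0;
        private final Map<String, Account> table = new HashMap<>();

        SlowRepository() {
            table.put("ACC-1", new Account("ACC-1", 12_500));
        }

        public Optional<Account> findByNumber(String number) {
            roundTrips++; // each call represents one trip to the database
            return Optional.ofNullable(table.get(number));
        }
    }

    /** Read-through cache added without changing callers or the domain class. */
    static class CachingRepository implements AccountRepository {
        private final AccountRepository delegate;
        private final Map<String, Account> cache = new HashMap<>();

        CachingRepository(AccountRepository delegate) {
            this.delegate = delegate;
        }

        public Optional<Account> findByNumber(String number) {
            Account hit = cache.get(number);
            if (hit != null) {
                return Optional.of(hit);
            }
            // Cache miss: load from the delegate and remember the result (misses are not cached)
            Optional<Account> loaded = delegate.findByNumber(number);
            loaded.ifPresent(a -> cache.put(number, a));
            return loaded;
        }
    }

    public static void main(String[] args) {
        SlowRepository db = new SlowRepository();
        AccountRepository repo = new CachingRepository(db);
        repo.findByNumber("ACC-1");
        repo.findByNumber("ACC-1");
        System.out.println(db.roundTrips); // prints 1: the second read is served from the cache
    }
}
```

A real second-level cache also has to deal with invalidation, writes and distribution across nodes, which this sketch deliberately ignores.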


Stephan also feels that the future of ORMs will be jeopardized by the advent of polyglot persistence and nosql data stores. The fact is that the use cases that nosql datastores address are very much orthogonal to those served by relational databases. Key/value lookups with semi-structured data, eventual consistency and efficient processing of web scale networked data backed by the power of map/reduce paradigms are not what your online transactional enterprise application with strict ACID requirements needs. For too long we have been trying to shoehorn every form of data processing into the single hammer of relational databases. It's indeed very refreshing to see the onset of the nosql paradigm and to see it already in use in production systems. But ORMs will still have their role to play in the complementary set of use cases.

17 comments:

Stephan.Schmidt said...

Some comments:

"[...] littered with SQL that's impossible to refactor when my schema changes."

Never saw an ORM (like Hibernate) help when changing the database. And on the contrary: I haven't seen schema changes in large databases, because too many systems (reporting, accounting) depend on a schema. Your domain model is much more likely to change, and when the gap between your domain classes and your db is too large, your ORM will break. This often prevents refactoring of domain classes.

"If you are not designing an ActiveRecord based model, it's of paramount importance that you keep your domain model decoupled from the persistent model."

As said above, ORMs do not decouple your domain classes from the database, but instead nail your domain classes to your database schema.

Ever tried splitting domain classes that are in one table? Everything besides renaming classes and attributes is out of the window if you use an ORM (just my experience, YMMV).

Cheers
Stephan
http://www.codemonkeyism.com

Unknown said...

You can also use refactoring-aware SQL DSLs like squill, jequel, empiredb.

Regarding those _big_ domain models. In DDD terms they are broken anyway as there are no modules or bounded contexts that address the relevant part of the domain model at once.

When talking to BigDaveThomas at JAOO he also stressed that most solutions today are just simple CRUD systems that are bloated with ORM. Just mapping the tables to a screen is often a simple case of generic SQL and you're done :)

Michael

Stephan.Schmidt said...

"Have a look at how grids like Terracotta can use distributed caches like EhCache to scale out your data layer seamlessly."

We use TC and it does scale out our data without Hibernate.

Cheers
Stephan

Unknown said...

I am starting to learn that Hibernate requires more understanding and time than most people want to give it, but with that understanding, it can really work for you. I was talking to a user yesterday who asserts that QueryCaching is bad for him. I listened, checkpointed with someone who knew query caching very well, and found out that it will indeed work for this user if used properly.

I see that since Terracotta stopped fighting Hibernate and embraced it in the market, and now that we build products for Hibernate users, my understanding of the technology has grown. Our ability to serve the needs of higher performance while staying within the confines of the Hibernate world has vastly improved over the last year.

Yes, Hibernate has a few problems, but I see the path forward as contributing fixes and helping, not trying to invent yet another way to do what is inevitable: marshaling data to and from an RDBMS.

--Ari

Monis Iqbal said...

Lazy Initialization is not a feature/side-effect exclusive to ORMs; it can be present in your DSLs as well.

ORMs seem like a hindrance at the start of the project or when there are fewer "objects".

I think they are well suited for Object-Oriented minded teams. But now, as we are exploring different areas, paradigms, we tend to move away from ORMs and that's natural for these kinds of projects.

Anonymous said...

ORM is nothing more than an alternate marshalling scheme. Adding layers upon layers of marshalling has never made sense. That said, db calls are tied to the network and, as is the case with all technologies that rely on slow underpinnings, caching will be essential. The best way to make an application cache resistant is to scatter the calls throughout your application. At least ORMs normalize execution paths, which makes it easier to add caching.

That said, for the moment, applications are going to need to rely on something other than RDB technologies if they are going to scale. Technologies such as memcached look very interesting in that they are very simple and highly scalable.

Kirk

Dave said...

I'm not sure what Stephan means when he says that ORMs cannot help when refactoring a database; they help a hell of a lot more than strings containing SQL all over the place would. It's trivial to write an integration test to load up all your mapped beans and try to access one. If your mappings and database are inconsistent, you will know immediately what the problem is and where.

Having seen many codebases NOT using an ORM, I have to say they were all a big, huge, mess. ORM makes the code cleaner (or can help). And clean code can be refactored, maintained and optimized a lot better than a big mess of SQL statements everywhere.

Anonymous said...

Just put everything in stored procs; then your Java code is completely shielded from the database structure. I've used this approach on a number of projects, and it has worked well. Simple and easy to maintain. This is in stark contrast to the ORM based projects I've worked on, where no one on the project truly understood what was going on with all of the complex mappings, caching, cryptic errors, etc. I can show anyone who knows SQL how to do virtually anything needed in a few hours with stored procs. ORM adds much more complexity...and I often see lazy loading all over the place causing horrible performance.

Anonymous said...

I can understand decoupling your domain model from the database in that the domain objects should be simple POJOs. That is where something like JdbcTemplate really shines (similar to using iBatis).

In this modern era of polyglot, how come we don't recognize that SQL is a language of its own? When tuning queries, it seems easier to enlist our team's database engineer to help me out by showing him queries, rather than bringing him up to speed on XYZ-QL.

Unknown said...

Maybe the real solution is some kind of "extended DDL" where one could specify validations/constraints/other business logic more easily.
Like Hibernate, but without the "object mapping" part (why should one try to map relational data to objects?). Like "stored procedures", but functional instead of procedural (though I like spaghetti with cheese).

(just an idea)

Anonymous said...

Rails doesn't use straight SQL, so there is no need to move away from ORMs. Just wait and see what the Rails guys do.

Anonymous said...

I worked on a complete rewrite of a system and the Lead Developer did not use an ORM. We basically wrote our own, and it was a total mess and a waste of time. We spent most of our time debugging our data access layer and never made meaningful progress on the true functional requirements. After a year of hell, the Lead was fired and we threw the code away and started over with an ORM. What a relief! Not having to spend a lot of time dealing with the data access layer freed us up to focus on the functional requirements which makes happy clients which makes happy managers which makes happy developers.
Somehow I ended up on a team of developers that thought 3rd party tools are for wimps, and they could write everything themselves. What a bunch of arrogant fools, and what a waste of time. If we had all the time in the world, maybe we could write a better tool, but I doubt it. I freely admit that developers who create tools like Hibernate are smarter than me. Why would I waste any time trying to reinvent the wheel? It may not be perfect, but it's better than anything I could create myself. As for the arrogant fools who liked to create their own tools instead of just finding one already built? They were all fired at various times for consistently not finishing projects.

Unknown said...

@Anonymous (of the last comment)
You echo pretty much what I wanted to say. It's true that ORMs like Hibernate are not without their warts. At the same time they offer a tonne of benefits too. My suggestion would be:
1. to use what good they offer (and they really offer a lot)
2. avoid the sucky features
3. use your judgement to selectively apply the ones that are debatable.

If you do not want to use the persistence context or automated session management, use the stateless session interface, where you use your ORM to marshal / unmarshal data out of your RDBMS and get stuff in the form of detached objects. Hibernate offers this .. check out http://docs.jboss.org/hibernate/core/3.3/reference/en/html/batch.html#batch-statelesssession ..

Rob said...

Session-less Approach
----------------------
For those wanting a "session-less" approach you can also check out Ebean ORM to see if it is more to your liking.

http://www.avaje.org

This means you don't need to worry about
- LazyInitializationException
- management of session objects (Hibernate Session / JPA EntityManager).
- merge/persist/flush replaced with save

Sorry for the blatant plug, but if you are looking for a simpler/session-less approach it would be worth a look :)

Cheers, Rob.

Unknown said...

I don't think ORMs are a thing of the past but I also don't think they are a one size fits all option.

I was wondering if you've had a chance to check out Squeryl? This is a LINQ style DSL for Scala. E.g.:

def songsInPlaylistOrder =
  from(playlistElements, songs)((ple, s) =>
    where(ple.playlistId === id and ple.songId === s.id)
    select(s)
    orderBy(ple.songNumber asc)
  )

This is translated to SQL and executed for you. If you need to refactor (assuming someone will develop adequate refactoring tools for Scala) nothing is missed because there are no hbms to worry about.

Unknown said...

I have looked at Squeryl. Then there is ScalaQuery as well, and quite a few other frameworks inspired by LINQ. All of them do a nice job of providing type safe queries on the domain objects. This way you are saved from writing a lot of SQL. But my main concern is that this process can quickly get out of hand in a large project where you may have thousands of tables. Besides hiding SQL, an ORM also does the job of virtualizing the data layer. This means you can scale up your data layer transparently using products like Terracotta, Coherence or Gigaspaces. I like the elegance of LINQ inspired frameworks, but I am still skeptical about their usage in a typical enterprise application which needs high scalability.

Unknown said...

I view Squeryl as more of a small scale option, although I'd still write a domain model that is separate from the persistent model with that tool.

The .Net world offers the best of both worlds, with the NHibernate guys supplying a LINQ provider. The LINQ provider generates criteria API calls rather than SQL.