Saturday, May 27, 2006

Functional Programming Under the Hoods - Developers Call it LINQ

The May CTP of C# 3.0 introduces LINQ along with several other language features, such as implicitly typed local variables, extension methods, and collection and object initializers. All of these features have been extensively critiqued elsewhere. But undoubtedly the features that have drawn the most attention from bloggers are lambda expressions and Language INtegrated Query (LINQ). Lambda expressions in C# form the core of LINQ and bring back memories of functional programming in the Lisp tradition. With the current generation afflicted with the perils of Java schools, it will definitely require a change in mindset, and a fresh look at the old pages of Paul Graham, to appreciate the underpinnings of the design of LINQ in the new release of C# 3.0 and Visual Basic 9. Think of collections as monads, query operators as monad primitives, and queries as monad comprehensions, and what you get in LINQ is today's incarnation of Lisp, Haskell or ML. As Erik Meijer has mentioned correctly,
Functional programming has finally reached the masses, except that it is called Visual Basic instead of Lisp, ML, or Haskell.


Code-as-Data: Unveiling Lambdas in C# 3.0

Like Lisp, the lambdas of C# 3.0 support the code-as-data paradigm. However, in Lisp or Scheme, it is up to the developer to decide whether to treat an expression as code or as data through the application of quote and quasiquote. The C# 3.0 and VB 9 compilers decide automatically whether a lambda expression becomes code or data, depending on the static type of the context in which the lambda occurs.

Expression<Func<Customer,bool>> predicate = c => c.City == "London";
Func<Customer,bool> d = predicate.Compile();


In the first statement, Expression<T> is a distinguished type, which preserves the lambda in the form of an expression tree instead of generating a traditional IL-based method body. Thus the lambda is data instead of code - Lispers will be reminded of the quoted expression, which achieves the same effect. In Lisp you use eval to translate the data into something executable - C# 3.0 offers Compile() over the expression tree, as in the second line above, which generates the IL at run time and returns an ordinary delegate.
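The code-versus-data distinction can be made concrete with a small sketch. It assumes a minimal `Customer` class with a single `City` property; the `ExpressionDemo` class and its member names are made up for illustration. It targets the released `System.Linq.Expressions` API rather than the May CTP, so details may differ from the preview bits.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the Customer type used in the snippet above.
public class Customer
{
    public string City { get; set; } = "";
}

public static class ExpressionDemo
{
    // Data form: because the target type is Expression<...>, the compiler
    // builds an expression tree instead of emitting a method body.
    public static readonly Expression<Func<Customer, bool>> Predicate =
        c => c.City == "London";

    public static void Main()
    {
        // The tree is an ordinary object graph that can be inspected.
        var body = (BinaryExpression)Predicate.Body;
        Console.WriteLine(body.NodeType);                // Equal
        Console.WriteLine(Predicate.Parameters[0].Name); // c

        // Code form: Compile() turns the tree into an executable delegate.
        Func<Customer, bool> isInLondon = Predicate.Compile();
        Console.WriteLine(isInLondon(new Customer { City = "London" })); // True
        Console.WriteLine(isInLondon(new Customer { City = "Paris" }));  // False
    }
}
```

Assigning the same lambda directly to a `Func<Customer, bool>` variable would skip the tree entirely and produce compiled code, which is the static-type-driven choice described above.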

The Grand Unification

The primary goal of LINQ is to unify programming against relational data, objects, and XML. In today's applications, developers typically employ three disparate models - SQL for relational database programming, XQuery, XSLT etc. for XML processing, and OOP for business logic. The LINQ framework presents a unified model that uses functional programming under the hood and exploits the algebraic nature of collections and of operations on collections.

The design of LINQ consists of generic operations on collections, abstracted as base patterns of query operators, plus domain-specific extensions for XML (XLINQ) and relational database programming (DLINQ). Some experts have expressed concerns that LINQ may once again prove to be bloat in the language, given its efforts towards ORM implementations. I have a completely different opinion on this. I have worked on enterprise-scale Java EE projects with a relational database at the backend. I have personally struggled with performance issues in these applications and have come to the hard realization that you need to write performant SQL queries to get good transaction throughput. And typically developers write these SQL queries as untyped string constants within the Java codebase. Popular ORM frameworks like Hibernate simplify things a little, but you still cannot avoid having typeless SQL statements buried within your Java code, so long as you have the impedance mismatch of the relational database at the backend. This is where LINQ and DLINQ rock with their typed lambdas - the developer can write typed queries that work seamlessly over collections of objects, relational tables and XML documents. The ultimate aim of Microsoft is to use the LINQ framework to provide a unified approach to programming data and to offer DSL support over multiple tuple-based data sources, resulting in a simpler programming model.

Polymorphic Queries in LINQ

Another exciting new feature that LINQ brings is the IQueryable interface, which allows writing truly polymorphic queries that can be executed in immediate or deferred mode in any target environment. The following example is from The Wayward WebLog:

public int CustomersInLondon(IQueryable<Customer> customers) {
  int count = (from c in customers
        where c.City == "London"
        select c).Count();
  return count;
}


In case you want to execute the query over a collection of objects:

List<Customer> customers = ...;
CustomersInLondon(customers.ToQueryable());

And if the query is executed against a remote DLINQ collection, the underlying engine translates it to the appropriate SQL for the underlying database. What goes on underneath is what the functional community calls a comprehension: the compiler translates the comprehension syntax of the query above into calls to the standard query operators.
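To illustrate that translation, here is a minimal sketch showing the query-expression form next to the operator calls it is rewritten into. The `Customer` class and the `ComprehensionDemo` names are hypothetical. The sketch uses the released framework, where the conversion to `IQueryable<T>` is called `AsQueryable()` rather than the CTP-era `ToQueryable()` used in the sample above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Customer type used in the samples above.
public class Customer
{
    public string City { get; set; } = "";
}

public static class ComprehensionDemo
{
    // Comprehension (query-expression) form.
    public static int CountWithQuerySyntax(IQueryable<Customer> customers) =>
        (from c in customers
         where c.City == "London"
         select c).Count();

    // The same query written directly against the standard query operators.
    // The trailing "select c" is an identity projection, so no Select call
    // is needed after Where.
    public static int CountWithOperators(IQueryable<Customer> customers) =>
        customers.Where(c => c.City == "London").Count();

    public static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { City = "London" },
            new Customer { City = "Paris" },
            new Customer { City = "London" },
        };

        // In-memory objects exposed through the same IQueryable interface
        // that a database-backed source would implement.
        IQueryable<Customer> queryable = customers.AsQueryable();

        Console.WriteLine(CountWithQuerySyntax(queryable)); // 2
        Console.WriteLine(CountWithOperators(queryable));   // 2
    }
}
```

Because both methods accept `IQueryable<Customer>`, the lambdas passed to `Where` arrive as expression trees, which is what lets a provider such as DLINQ turn them into SQL instead of running them in memory.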


All the new features in C# 3.0 and Visual Basic 9 bring back strong memories of Lisp and Haskell and let us rediscover the joy of declarative programming. It only took a couple of decades to establish them as part of a mainstream language. It remains to be seen whether other languages like Java will follow suit.
