Monday, December 01, 2008

Guido's thoughts on Scala

Many people who have reacted to Guido's gripes with Scala have bent the post towards a static-typing-is-bad-and-dynamic-typing-is-good syndrome. I think the main point of concern that Guido expresses relates to the complexity of Scala's type system and the many exceptions to the rules for writing idiomatic Scala.

While it is true that there are quite a few rough edges in Scala's syntax today, none of them look insurmountable and will surely be addressed by the language designers in the versions to come. We need to remember that Scala is still very much a growing language and needs time and community feedback to evolve into one of the mainstream general purpose languages. As Martin has mentioned elsewhere, Scala's grammar is no bigger than that of Java, OCaml or C#, and hence the language should not impose a more complex mental model for using the same set of features that the other statically typed languages espouse.

However, Scala's type system is expressive enough to offer succinct solutions to many problems that would look much more obtuse in Java. A better modeling of the Expression Problem is a classic example, which, as this paper demonstrates, looks much more verbose, inelegant and non-extensible using Visitors in Java. I have been programming in Scala for some time now, and I have the feeling that Scala's extremely rich type system can be exploited to implement domain models that encapsulate most of the business rules through declarative type constraints. I have blogged on this very recently and described my experiences of how Scala's type system, specifically features like abstract types and self type annotations, works for you to implement expressive domain models without writing a single line of runtime type checking code. Less code, more succinct abstractions, and most importantly fewer tests to write and maintain. After all, everything has a cost; it depends on whether you abstract the complexity within the language or within your application.
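To make the point about abstract types and self type annotations concrete, here is a minimal sketch. It is not taken from the earlier post: Instrument, Equity, Bond, Trading, Settlement and EquityDesk are all hypothetical names invented for illustration, and the fixed price is a placeholder.

```scala
// Hypothetical domain model; every name below is an illustrative assumption.
trait Instrument { def symbol: String }
case class Equity(symbol: String) extends Instrument
case class Bond(symbol: String) extends Instrument

// Abstract type member: each concrete desk fixes the instrument type it handles.
trait Trading {
  type I <: Instrument
  def price(instrument: I): BigDecimal
}

// Self type annotation: Settlement can only be mixed into something that is also Trading,
// so it may use Trading's members (price, I) without extending it.
trait Settlement { self: Trading =>
  def settle(instrument: I, quantity: Int): BigDecimal =
    price(instrument) * BigDecimal(quantity)
}

object EquityDesk extends Trading with Settlement {
  type I = Equity
  // Placeholder pricing rule, assumed for the example only.
  def price(instrument: Equity): BigDecimal = BigDecimal(120)
}
```

With these definitions, EquityDesk.settle(Equity("IBM"), 100) compiles, while EquityDesk.settle(Bond("T-BILL"), 100) is rejected by the compiler because a Bond is not an Equity, and an object that extends Settlement without Trading does not compile either. The business rule lives in the types, so there is no runtime check to write or test.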

Another excellent point that has been raised by David Pollak in one of the comments to Guido's post is that we need to judge the complexity of language features separately depending on whether we are library producers or library consumers. As a library producer, you need to be aware of many aspects of Scala's type system that may not be that essential to a library consumer. Some time back, I demonstrated an example of an internal DSL design technique in Scala, where, given a sufficiently typed library for financial trading systems, you can write DSLs in Scala of the form ..

new Order to buy(100 sharesOf "IBM")
  maxUnitPrice 300
  using premiumPricing


Another great example is the Scala parser combinator library, which hides all nuances of Scala's type system and allows you to design external DSLs. Have a look at this example of another financial trading system DSL built using Scala parser combinators ..

On the whole I think Scala has a compelling appeal to the programming community on the JVM. The statically checked duck typing that Scala offers makes your code look and feel like Ruby and yet offers the safety net of compile time checking. Scala is object oriented, though most idiomatic Scala is functional. In a real world application development environment, where statefulness is the way of life, Scala offers a very good compromise with its hybrid model of programming. Model your states using objects, and model your computations using functional Scala.
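As a small illustration of statically checked duck typing, the sketch below uses a structural type. The names DuckTyping, Quacker, Duck, Robot and makeItQuack are hypothetical, and the import assumes Scala 2.10 or later, where structural member access is a language feature that has to be enabled explicitly.

```scala
import scala.language.reflectiveCalls

object DuckTyping {
  // Structural type: any value with a quack(): String method qualifies,
  // regardless of its class hierarchy.
  type Quacker = { def quack(): String }

  class Duck  { def quack(): String = "Quack" }
  class Robot { def quack(): String = "Beep" } // unrelated to Duck

  // Accepts anything that structurally conforms to Quacker.
  def makeItQuack(q: Quacker): String = q.quack()
}
```

Both DuckTyping.makeItQuack(new DuckTyping.Duck) and DuckTyping.makeItQuack(new DuckTyping.Robot) are accepted, but passing a value without a quack() method is a compile time error rather than a runtime surprise.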

Regarding the other concern raised by Guido on the non-uniformity of Scala's syntax, it is indeed true that there are some rough edges to deal with: () versus {} in for-comprehensions, inconsistent inference of semicolons, the number of interpretations that an underscore (_) can have, some subversions in partial function application syntax, and the syntax for def foo() { .. }. I guess quite a few of them are currently being debated as possible candidates for change. The good part is that the language experts are working with Martin to iron these issues out and smooth out the rough edges, even at the risk of losing some amount of backwards compatibility.
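A few of these can be seen side by side in a short sketch. SyntaxNotes and its members are hypothetical names chosen for illustration; the snippet shows only some of the underscore uses and the two for-comprehension forms, not every case under debate.

```scala
object SyntaxNotes {
  val xs = List(1, 2, 3)

  // Underscore standing in for the missing argument of an anonymous function.
  val doubled = xs.map(_ * 2)

  // Underscore in partial function application: the second argument is left open.
  def add(a: Int, b: Int): Int = a + b
  val addTen = add(10, _: Int)

  // Underscore as a wildcard in a pattern match.
  def describe(x: Any): String = x match {
    case _: Int => "int"
    case _      => "other"
  }

  // for-comprehension with parentheses: generators are separated by explicit semicolons.
  val pairsParen = for (x <- xs; y <- xs if x < y) yield (x, y)

  // The same comprehension with braces: newlines separate the generators.
  val pairsBrace = for {
    x <- xs
    y <- xs
    if x < y
  } yield (x, y)
}
```

The two comprehensions produce the same list of pairs; only the delimiter and the separator rules differ, which is the kind of non-uniformity in question.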

16 comments:

martin said...

Hi Debasish:

Nice post. The only point I want to raise is your mention about underscores. I think it's simply not true that underscores have "a number of interpretations". Fundamentally, it's always the same: an underscore denotes something that's missing. When used as an expression, the missing thing needs to be supplied as a parameter. When used as a type, the type inferencer will find a type for you. That's all. I find it very consistent, really.

Unknown said...

Hi Martin:

Thanks for reading my blog.

I also find the usage of underscores to be very consistent across Scala - it serves as a universal placeholder and allows powerful binding idioms. But very frequently I come across opinions where people gripe about underscores and (maybe) I fail to explain them clearly enough. Your two-line explanation clears it all up very succinctly.

Thanks again - I am sure we will have a huge 2.8.

Cedric said...

Hi Debasish,

You write:

"
none of them look insurmountable and will surely be addressed by the language designers in the versions to come
"

I don't think this is realistic. Over time, languages get more complex, not less complex, and I don't know how it would be possible for Scala to become less complex (both in syntax and semantics) without completely breaking backward compatibility.

Also, I'm surprised to see Martin use the size of the grammar as an argument to claim that Scala is not complex: it's possible to create very obfuscated syntaxes with just a few lines of EBNF.

The bottom line is that gurus in a language will never see the language they are comfortable with as complex: the only credible judges are newcomers, and so far, the verdict has been pretty negative for Scala.

--
Cedric

Unknown said...

Hi Cedric -

I didn't mean reducing the complexity of Scala's syntax. I meant smoothing out the currently existing inconsistencies, some of which have been mentioned by Guido himself in his post. As of today, there are some rough edges in Scala syntax which are considered exceptions to the rules for writing idiomatic Scala. I have mentioned a few of them in my post; you can get a more comprehensive list if you follow some of the discussions in the scala-debate forum.

I know you have some reservations over the apparent complexity of Scala syntax (particularly compared to Java). I have a different opinion on this - I feel the basic syntax of Scala is a good compromise between expressiveness and succinctness. The language just needs to evolve and mature over time, and Martin has mentioned a more mature release for 2.8.

Raoul Duke said...

personally i am so absolutely no longer willing to buy the "oh the complexity will only impact people writing libraries-for-other-people" claim. that's the same thing people said about some of the weirdnesses of generics in java.

i can be writing code that is just for me, and want to use abstraction to make the code Not Suck. which pretty much then means *bang* i suddenly have to deal with the type system. this is my experience with generics in Java, and i do not believe there is any reason it would be too different were i working in Scala; inevitably i would run into the various weirdnesses and pain points of the system.

Ricky Clarkson said...

"bang! I have to deal with the type system".

Yes, you're using a typed language. Scala has a better and more pleasant type system than Java does, so "bang! I have to deal with the type system" is replaced by "great! I can use the type system".

As for Cedric's assertions that newcomers find Scala too complex, I don't suppose he considers that most Scala users were newcomers once. As I've said elsewhere, most of the initial complexities in Scala are quite superficial; you won't need to think about them much to write Scala code, only if you want to have a kneejerk reaction to a language you've never used.

That's not to diminish the point that there are real problems in Scala that could be, and might well be, cleaned up. There are already some breaking changes planned for 2.8.0, so it might be a good time to make some other cosmetic improvements.

Cedric said...

Ricky,

Of course, everyone is a newcomer to Scala at some point, but don't underestimate the power of incremental improvements.

The reason why C, C++, Java and C# became so popular is because each of these languages was an incremental improvement over the previous one, and programmers of one language could usually dive in pretty quickly into the next language of that family.

I'm afraid Scala has made a leap in syntax and semantics that will turn off all but the most determined programmers.

For example, I picked the following from a Scala blog:

def flatMap[TT <: T, X](g: R => RichFunction1[TT, X]) =
  rich[TT, X](t => g(apply(t))(t))

Ricky Clarkson said...

Cedric,

I can tell without even checking, that you took that code snippet from Tony Morris's blog. He is probably the Scala programmer who cares least about readability. I know you chose it on purpose.

I suggest you look at (and post snippets of) Specs instead. And be fair, post Specs uses rather than backend implementation code.

I don't mean to rubbish Tony here, he's a great programmer, but he doesn't really care about identifier names, and this is deliberate.

You'll find Tony's Java code just as readable, and apparently he's writing some C#, so prepare yourself.

Anyway, if you have any difficulties with Scala, drop me a line and I'll be pleased to help, or find someone who can.

Unknown said...

Cedric -

I have 2 observations to make ..

1. One can very well limit oneself to the Java/C# syntax subset even in languages like Scala. Only if you want the additional batteries of the more powerful type system or the power of functional languages do you need to use the additional stuff.

2. With the incremental improvement in syntax, don't you feel that we will be stuck with the C/C++ legacy syntax forever? Only today I read the following quote in Bill De hOra's blog, regarding Clojure ..

"The downside is that it's a Lisp. As a syntax, Lisp is a fail, which if you understand what Lisp syntax provides is a (the) great irony of programming. When you see how difficult it is for JRuby/Jython or even Groovy to break the syntax stranglehold Algol/C based languages have on the industry, you despair. Arguably, it loses out to those more syntactically appealing languages, who have a hard enough time knocking down the door."

James Iry said...

Cedric, if one dense code sample proves that an entire language is unreadable then there are no readable languages.

Guido van Rossum said...

Thanks for the response. I hope the Scala community can live with backwards incompatibilities -- if you stick to strict compatibility this early, you'll end up with C++-scale warts. (Or the horrible "tab" rule in Unix Make, which was an early design mistake that couldn't be fixed "because there were already a dozen users".)

My concern about the DSL example is that I would expect that typos and other mistakes by users who don't know the whole language might cause really confusing error messages, since the compiler doesn't know that the user is using a very constrained subset of the language. If the user accidentally uses syntax they don't know but which has some advanced meaning, they won't be able to understand the error messages at all.

So you could say that I don't buy the "library consumers don't need to know the whole language" argument.

Ricky Clarkson said...

Guido,

Scala is stuck with some arguably dubious syntax, but there is a lot that is still malleable, and some breaking changes are coming in the next major release.

Regarding DSLs, I would guess that half of all user errors are "cannot find symbol" errors due to a missing import, etc. For example, Specs has an experimental 'data tables' feature, and uses both ! and | as operator names. I got this wrong the first time I used it, and so did my boss. These are trivial problems.

Of the other half, probably you would understand 80% of them if you understand generics (from Scala, Java or C#). That leaves 10% of the original whole, where you would (mock horror) have to read the source for the DSL, or consult the author.

TIMTOWTDI, and yes, some of them suck.

David Pollak said...

My argument is not that you can write DSLs in Scala for non-programmers to use.

Digression. I have done a significant amount of work related to DSLs over the years. There's definitely a choice between DSLs that are simply libraries in a given language and have access to all the language features. This is the "Ruby Way" of doing DSLs and there are a couple of DSLs of this type in Lift (SiteMap and Machine). There's also the "new language, new syntax" mechanism where you create a mini language. Examples of this are spreadsheets, Textile/Markdown, etc. Once again, Scala shines for dealing with these kinds of DSLs. Finally, there are DSLs expressed in XML that are generated using external GUI tools and interpreted by your program. Ant and Maven are two examples. While I haven't written these kinds of DSLs with Scala, Scala's awesome XML support should make working with DSLs in XML very simple.

Undigression... My argument is that being a library producer requires different thinking and reasoning. I don't think you can dismiss my argument until you've spent 6 months writing libraries in Scala, OCaml, F#, or some other language that allows for reasoning about code using types. It's not dissimilar from someone who has never designed a language dismissing your comments about design decisions in Python.

As a practical matter, there's only been one case on the Lift list of someone getting confused about a compiler error message. On a list that averages 15-20 messages a day, one problem in 18 months of the type you're describing seems to indicate that your assertion is unfounded.

Tony Morris said...

I'd like to correct you Ricky. I care more about readability - perhaps more than you do. To use Cedric's example, this is exactly why I write that. Would you like to see the alternative in Java? I don't mean a direct translation; rather how a typical Java programmer would solve that computational problem.

The code snippet that Cedric gave is entirely readable. There is some excess Scala noise of course; type annotations and type argument declarations.

To use another example, let reverse = foldl' (flip (:)) [] is far more readable than an explicit pattern match; however, I'll bet any poorly disciplined programmer is going to object. This programmer has failed to understand what it means to be readable.

An appropriate response to Cedric is "so?" but I suspect he has a lot of learning to do before a meaningful discussion can take place around that level of abstraction.

Kris Nuttycombe said...

Tony, I think that Ricky's complaint is valid simply due to the lack of semantic cues from single-character variable names (t) and names with highly generalized English semantics (apply). As much as anything, I personally find it difficult to rapidly visually distinguish single-character names from the surrounding syntactic markup. It's not that your code doesn't make sense; it's that it's difficult to take in its meaning at a glance due to the lack of adequate semantic cues and your penchant for terseness.

Ricky Clarkson said...

Mine was not a complaint about Tony, but about Cedric.