Sunday, June 2, 2013

"Why do you prefer Java to C++?"


This is a question I was asked during an interview a long time ago. Whilst it is a nice question, it is not an easy one, not only because your interviewer could be a C++ evangelist and you'd never know (shit happens!), but also because there might be tons of reasons that are hard to explain in a context where you only have time to name a few. So let's go through the long answer here.
Apart from the usual and quite widespread considerations about the major shortcomings of the C++ language, such as:

  • The syntax. It is really annoying and hard to read. It's definitely the worst I have ever seen (yes, I know, Perl is also bad but I'm not a Perl developer). Stroustrup once said that C++ was designed to be an elegant language, but other languages, such as Python, make this statement quite embarrassing. As for Java, it is somewhat too verbose and bureaucratic, requiring a lot of boilerplate code at times, but to some extent I agree with those who think that this redundancy helped make it an easy-to-read and robust language, well suited to large development teams. By the way, take a look at this slide by Josh Bloch about this topic.

  • No real platform, just an overly wide and complicated language (with the addition of a scanty yet intricate STL), which also implies that you have to be constantly trained. I really can't stand the C++ policy that whenever you need something you are “free” (well, rather forced) to do it your own way, even for very basic needs. More often than not you waste a lot of time searching for libraries or writing things from scratch that prove to be not as good as they would be if they were supported by the language/platform itself. In general, frequent and basic tasks are almost always more complicated than in any other OO language, which is clearly not a pro. Moreover, my own experience tells me that it is easier to change part of the design inside a Java application than inside a C++ one: simpler languages tend to be more “agile”.

  • Low productivity. There is still a fair number of developers who either neglect this or simply justify it by asserting that, on the other hand, you get a dramatic increase in performance. That can actually be true sometimes, but most of the time you might not care, so it's not a compelling reason. I tend to see myself as a software architect (or at least this is what I wish to become, although right now I'm nothing in particular), hence I'm rather “obsessed” with finding a good, future-proof and/or scalable design; usually performance comes later (remember, premature optimization is an anti-pattern).

  • Dated design, old-fashioned. Think about how modern Python was when it came out in 1991 and how old C++ looked when it was standardized in 1998. For instance, consider std::function or std::tuple. I mean, 1991 vs. 2011. OK, they are very different languages and the comparison is a bit far-fetched, but the point is that it is not reasonable to wait until 2010/2011 for stuff like smart pointers, portable multi-threading, regex, type-safe null pointers and enums, hash maps and hash sets and so on. We are talking about pretty basic stuff, software building blocks. Does it look like a modern language? Really? Even in the early days it looked exactly like what it was, that is, an extension to the C language rather than a new design with a C-like syntax. And today, being C compatible is a con rather than a pro (if you are interested in this topic have a look at this message from Bruce Eckel).

  • Many more, such as the lack of memory safety, bad unit testing support, the madness of operator overloading coupled with RAII, friend classes/methods, private subclassing and so on.
 
Apart from the above, here are some other deficiencies I happened to see during my working life using C++:
 
Memory management
While smart pointers are a fundamental and useful tool, they don't come without a price. First off, no covariant return types. Also, they require you to pay attention to circular references. Nonetheless they are so beneficial that many experienced programmers say that smart pointers should be used everywhere (more or less), and I do agree with them. But... hey, wait a minute! If I have to use std::shared_ptr<> (and siblings) all over the place, cluttering my code, why should I use C++ in the first place? Is timely resource release a must-have?
Well, usually it isn't. Aside from the bookkeeping overhead that reference-counted pointers carry (a control block per managed object and a counter update on every copy), the real point is that memory management steals time and attention from the developer, and to some extent this remains true even when using smart pointers. Although I'm familiar with them, I'm also aware that they will never make reasoning about pointers/references as easy and natural as it is in Java.
But I'm just scratching the surface: complicated concurrent programs are not easy to get right without a Garbage Collector. This is the reason why in this talk you can see Rob Pike from Google stating that the adoption of a GC was a day-one decision when designing the Go language (alternatively take a look at http://talks.golang.org/2012/splash.article#TOC_14.). The topic would require an entire post of its own, but I'd like to mention a few concrete examples related to non-blocking synchronization. Usually non-blocking data structures take advantage of the Compare And Swap (CAS) instruction (or emulate it, e.g. on architectures that have LL/SC), which unfortunately suffers from the ABA problem. This is even more problematic where you have no GC [see Java Concurrency in Practice, 15.4.4, or What every programmer should know about memory, 8.1]. In other cases a reference counting system is used to avoid memory leaks, increasing complexity (think of Read Copy Update (RCU)). Instead, using CopyOnWriteArrayList and CopyOnWriteArraySet is very easy and natural.
Back to C++, RAII is usually nice (well, not always, that's why the C++ committee introduced the move constructor), but heap memory management can really be a huge source of trouble, thus a GC is almost necessary today.
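To make the last point concrete, here is a minimal sketch of what "easy and natural" means with CopyOnWriteArrayList. The class name, the listeners field and the helper method are made up for illustration. Every write creates a fresh copy of the backing array, readers iterate over the snapshot they started with, and the garbage collector reclaims old snapshots once nobody references them, so there is no reclamation scheme to design by hand.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowListenersDemo {
    // Hypothetical listener registry; the names are illustrative only.
    static final List<String> listeners = new CopyOnWriteArrayList<>();

    // Iterates over the list while adding to it and returns how many
    // elements the loop actually visited.
    static int notifyAllAndRegister() {
        int visited = 0;
        // The iterator works on the snapshot taken when the loop starts:
        // adding during iteration neither throws
        // ConcurrentModificationException nor becomes visible to this loop.
        for (String name : listeners) {
            listeners.add(name + "-late");
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        listeners.add("a");
        listeners.add("b");
        int visited = notifyAllAndRegister();
        // prints "2 visited, 4 registered"
        System.out.println(visited + " visited, " + listeners.size() + " registered");
    }
}
```

The price is a full array copy on every write, so this only pays off when reads vastly outnumber writes (listener lists being the classic case).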

Concurrency
Wow, as of 2011 we have std::thread. But what do you do when you need to stop a std::thread? Just design your own “cooperative interruption” mechanism (you may use a poison pill [see Java Concurrency in Practice, 7.2.3]). So, what about interrupt in std::thread? Can you Java developers out there imagine getting rid of the interrupt signaling mechanism provided by the language? And what about the concept of executors? It is good to see std::atomic<T>, but it would have been nice to see a new Java-like volatile keyword too.
To get to the point, C++11 introduced some other stuff like async and promise (well, async is not really async...), but let's be honest, all the “new” features are way too few and too poor for modern concurrency scenarios. Solid threading support is a must-have for a modern programming language, and C++ just doesn't have it. Someone might point out that even Java cannot compete with different, more recent concurrency models like Scala's agents or goroutines, and it's true. But still, every time I use C++ I really miss the concurrency package and all that amazing code written by Doug Lea, Josh Bloch and many more.
By the way, here are some interesting resources about std::thread, showing that the multi-thread support is not mature yet (will it ever be?):
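For readers less familiar with the Java side, this is roughly what the built-in interruption plus an executor gives you. It is only a sketch: the class and method names are mine, and the lambda syntax assumes Java 8 or later (on older versions an anonymous Runnable does the same job).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class InterruptDemo {
    // Starts a worker that would run forever, cancels it, and returns
    // true if the worker observed the interrupt and stopped.
    static boolean runAndCancel() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(1);
        try {
            Future<?> task = executor.submit(() -> {
                started.countDown();
                try {
                    while (true) {
                        // Blocking calls such as sleep() respond to
                        // interruption by throwing InterruptedException.
                        Thread.sleep(1_000);
                    }
                } catch (InterruptedException e) {
                    stopped.countDown(); // clean up and leave
                }
            });
            started.await();   // make sure the worker is really running
            task.cancel(true); // true = interrupt the worker thread
            return stopped.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped by interrupt: " + runAndCancel());
    }
}
```

Interruption in Java is still cooperative (code that never blocks has to poll Thread.currentThread().isInterrupted() itself), but the signaling mechanism is standard, and every blocking call in the platform already understands it.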

Lack of reflection (and annotations)
Whenever you need to perform double dispatching in a language that does not support it natively, the usual approach adopted to avoid the bad-looking if-else RTTI chain in either Java or C++ is the visitor pattern. A smarter solution, one that does not require the other party to accept the visitor, is to use a dispatching table that retrieves a specific handler given a parameter that acts as a key. This is the solution my colleague adopted in our C++ DROP middleware when necessary, even though it required a considerable amount of partially obscure code. Now let's talk about another solution that may be adopted in Java: a few lines of the reflection API. No doubt it would be better to avoid reflection altogether, and a fast method is preferable to a slow one (quite often good design and performance are at odds with each other), but in C++ you are left with no option.
More generally, reflection is a powerful tool that C++ lacks. Granted that, as we all know, it should be used with care, the reflection API is a cornerstone of many frameworks: it is at the foundation of almost every serialization framework (e.g. standard Java serialization, Google's Protocol Buffers, etc.), code inspection tool and plug-in based functionality, and, along with annotations, it makes Java flexible. By the way, annotations really deserve a mention too, just think of how handy they are when you use JAXB, JPA, JUnit or any other kind of source code analysis. When I switch from Java to C++ it takes some time to accept that neither reflection nor annotations are there.
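Here is what I mean by "a few lines of the reflection API", as a minimal sketch. The event hierarchy and the handler are hypothetical, invented for this example, and a real implementation would probably cache the Method lookups.

```java
import java.lang.reflect.Method;

public class ReflectiveDispatchDemo {
    // Hypothetical event hierarchy, for illustration only.
    public static class Event {}
    public static class Login extends Event {}
    public static class Logout extends Event {}

    // One public overload per concrete event type; no accept() needed
    // on the event side, unlike the visitor pattern.
    public static class Handler {
        public String handle(Login e)  { return "login"; }
        public String handle(Logout e) { return "logout"; }
    }

    // Picks the overload matching the *runtime* class of the event.
    static String dispatch(Object handler, Event event) {
        try {
            // getMethod() needs an exact parameter type match, so an
            // event type without its own overload ends up in the catch.
            Method m = handler.getClass().getMethod("handle", event.getClass());
            return (String) m.invoke(handler, event);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no handler for " + event.getClass(), e);
        }
    }

    public static void main(String[] args) {
        Handler handler = new Handler();
        Event[] events = { new Login(), new Logout() };
        for (Event e : events) {
            System.out.println(dispatch(handler, e)); // "login", then "logout"
        }
    }
}
```

Note that the static type of each array element is Event, so a plain handler.handle(e) would not even compile; the reflective lookup is what selects the overload at run time.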

Static init fiasco
Consider a messaging/event system, where you have a hierarchy of events/messages and some factories or dispatchers which perform specific actions depending on the class type. I'm referring to what Thinking in Java 4th ed. calls Registered factories. The approach is to use some static construct to let the factories know the whole event/message hierarchy and, if you have already dealt with this problem in Java at least once, you surely know that it is not possible to place this self-registration logic inside the static block of the derived classes. This is due to the presence of a class loader, which loads and initializes classes on first use. Conversely, in C++ such a technique works because static code is usually executed before the main (entry point) function gets called. However, you should really take care of the initialization order between the events and the factory, otherwise you'll end up with the so-called “static init fiasco”. Fortunately you can avoid it by using the construct-on-first-use idiom.
In my opinion the problem here is that the solution doesn't really fit with the language design and looks like a dirty hack: having an object whose destructor never gets executed is a bad language corner case, and it's not the only one you can come across in C++. In this particular example Java isn't really shining either; nonetheless, if you can live with this and a few other small downsides, class loaders will compensate you with other benefits. Think of their use in Java EE application containers for instance.
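The Java half of the story is easy to reproduce. In this minimal sketch (the registry and the PingEvent class are hypothetical names of mine) the static block that performs the self-registration simply does not run until the class is first used:

```java
import java.util.ArrayList;
import java.util.List;

public class LazyRegistrationDemo {
    // Hypothetical registry that factories would consult.
    static final List<String> REGISTRY = new ArrayList<>();

    // Hypothetical event type that tries to register itself.
    static class PingEvent {
        static {
            // Runs only when PingEvent is initialized, i.e. on first use.
            REGISTRY.add("PingEvent");
        }
    }

    // Returns the registry size before and after the first use of PingEvent.
    static int[] sizesBeforeAndAfterFirstUse() {
        int before = REGISTRY.size(); // PingEvent not initialized yet
        new PingEvent();              // first use triggers the static block
        int after = REGISTRY.size();
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] sizes = sizesBeforeAndAfterFirstUse();
        // prints "before: 0, after: 1"
        System.out.println("before: " + sizes[0] + ", after: " + sizes[1]);
    }
}
```

So a factory that looks up the registry before anyone has touched PingEvent finds nothing, which is why the registration has to live somewhere else (for instance in an explicit list kept by the factory itself).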

Header and namespace hell
The C/C++ include system is crap and the namespace mechanism is just a lousy technique to prevent name clashes. I won't spend time here elaborating on either statement, as many people before me have already done so thoroughly (one such example is again this talk about Go and its design). I'm simply going to report that I have experienced or heard of every possible problem, from include ordering issues and double declarations to weird clashes, especially during integration. I don't think that packages are the best solution ever, but they are a clean design decision; namespaces, conversely, are just a limited solution.


So, thanks to the lack of many standard frameworks and the complexity of the language, I spend a considerable amount of time reasoning about how to translate my ideas into C++ rather than about the ideas themselves, while writing Java feels way more natural, even after some time away from it. In other words, with the latter I concentrate more on where I want to go (the design) than on how to get there (the mere code and syntax). Albeit not perfect, Java provides a decent set of tools (language features and toolkits/frameworks) to produce useful work with a reasonable amount of time and effort, but I can't say quite the same for C++.

Fortunately, since C++11 came out, you are no longer forced to use the boost libraries even for basic functionality, but the language keeps going down a road I really don't like. It is now absurdly huge, way too complex, yet not really effective. To be clear, the latest revision is definitely a good improvement, but I see it as a late (if not desperate) and only partially successful attempt to port some well-established features that have long been present in other languages (mainly Java), before C++ loses a considerable number of developers for not being competitive enough against languages that provide a whole platform (e.g. Java & C#) or great strength in some specific domains (e.g. Ruby on Rails, JavaScript, Dart, etc.). Lately I have become more and more intrigued by Scala and Go, given their use at Twitter and Google respectively, especially for their modern approach to concurrency. Here the gap with C++ is enormous.

There is one more aspect that I consider relevant: the JVM. The one from Sun/Oracle is an excellent piece of software fitted with many valuable technologies and ideas, and I see many positive aspects in using a managed language with an “insulation” layer, much like a hypervisor for servers, even when portability is not a requirement. Moreover it is reasonable to say that many dynamic optimizations now yield very good performance, not to mention a bunch of other crucial and well-known advantages. Even if you are not aiming at coding in Java, good knowledge of and experience with the JVM and a GC is a good starting point for Scala, Clojure, Groovy, but also JavaScript or Dart.

Technical aspects aside, I do believe C++ is the past. I had to move from C++ to C++11 while working on the DROP project and I admit that to some extent it's been helpful. But at the same time I'm more convinced than ever that I don't want C++ to be my everyday coding tool. I have no problems with it, it's simply not much fun; there are much better and newer languages I enjoy much more, and Java is one of them. As soon as my TODO list shrinks a bit I will pick up a few books about Scala and Go... and maybe JavaScript (likely abandoning my will to improve my Python skills, although I like it, and to learn Ruby).

In the end, it's just a matter of fun, that's why “I prefer Java to C++”.

Tuesday, May 28, 2013

Microsoft vs. Google vs. Linux


Last week I read this news http://mashable.com/2013/05/15/google-stock-900/ and a few others about the fact that, according to the number and the value of its shares, Google is now worth more billions than Microsoft. But money is not what I want to talk about; I would instead spend a few words on this recent and somehow related “story” I came across these days: http://blog.zorinaq.com/?e=74
For the lazy people not willing to read that blog post (in my opinion you really should), just know that a Microsoft insider has made a few critical statements, reported there, that point out how bad management decisions and bad hiring practices are destroying the company. Aside from the many technical considerations and comparisons with Linux, which are really worth reading, I'm going to report here a few things I consider particularly interesting:
“Another reason for the quality gap [with Linux] is that that we've been having trouble keeping talented people. Google and other large Seattle-area companies keep poaching our best, most experienced developers, and we hire youths straight from college to replace them.”
Then:
“our good people keep retiring or moving to other large technology companies, and there are few new people achieving the level of technical virtuosity needed to replace the people who leave. We fill headcount with nine-to-five-with-kids types, desperate-to-please H1Bs, and Google rejects. We occasionally get good people anyway, as if by mistake, but not enough. Is it any wonder we're falling behind?”
Sure, a few points are debatable, but what can I say? Did he really tell the truth? Does it sound reasonable to you?

My short answer is: well, I think he is very right but somehow very wrong as well.
Honestly, I can't help but say that I really believe his words, because it is a very common situation these days, especially in Europe, and in Italy in particular. Cutting costs is now the rule for staying competitive, and the vast majority of engineering companies seem to forget that the real value of this kind of company lies in its employees, their skills, their know-how and their ideas. And even worse, they have a pair of legs and can leave the company, taking some knowledge away with them. I can say that here in Italy every medium to large business seems to care about just one thing: paying the very least for its engineering staff. The result is that people start being very frustrated, discouraged and less productive. Hence the most talented people (try to) leave the company (usually landing in smaller companies where their opinions and skills matter), which in turn makes these companies hire younger and less trained people, often unable to carry on because there are no longer enough experienced engineers to guide and train them. Of course you can't keep this up for long; in the long run the company is going to lose considerable know-how and market share.

So, sure, I believe it is indeed a true story, but only half of the story. In fact he fails to see that a company cannot be judged good or bad and valuable on a technical basis only; there is much more. For instance, people's attitude, attention to the customers, meritocracy and many more aspects. The real problem with Microsoft is that it keeps copying and never innovates. It is not relevant how many skilled people you have: no innovation, no future. I'm neither saying that copying is always bad, as you might still make money by providing more value than pre-existing products, nor that innovation is sufficient, as you are still required to translate it into a (good) product. But, seriously, I can't remember an innovation coming from Microsoft. Not even one! Not even MS-DOS. I'm not going to discuss and elaborate on whether Windows was better than OS/2 & MacOS or not, whether MS-Office was better than StarOffice or not, whether Windows Server + IIS + ASP was better than LAMP, IE was better than Firefox, whether the .NET platform (C#, F#) is/was better than Java/Scala/Clojure, the Xbox is/was better than the PlayStation, Windows Phone is better than Android and so on. I'm just pointing out that I can't remember anything really new from Microsoft, at least targeting the mass market. If your products are good and the context is favorable you might perform well, but sooner or later you will fall behind, as the market is a moving target. “Is it any wonder we're falling behind?” Indeed, not at all, but mostly because of this latter fact rather than the former one he was arguing!

What surprises me most is that this problem has been there for at least 10 years and nothing has really changed in the meantime. During my first year at the university back in 2001 there were plenty of Microsoft advocates who would have joined the company immediately if only they had got an offer. Because Microsoft is Microsoft and Bill Gates is a rich, famous computer genius (while Richard Stallman is a horrifying loser and Linus Torvalds... who is Linus Torvalds?). There I was maybe the only person seriously interested in Linux (and not because of Stallman's ideas, I mostly don't give a shit even today), because it was something new, dynamic and different. I really thought about new possibilities. I admit it, Linux has never become what I expected and hoped for (though it turned out to be not bad at all), but it really opened the road to many new and innovative products we have today that are not related to the data center world. And it keeps changing and improving!
So, it comes as no surprise to learn that Google, which takes both skills and culture very seriously, is now more valuable than Microsoft and that Linux is faster than Windows.