
nerd nouveau

On C being the new Assembly

A few days ago, Daniel Jalkut (whose blog I obviously read) posted an entry titled “C Is The New Assembly”. It’s about the introduction of (more) scripting language bridges into the upcoming MacOS 10.5 Leopard. In short, this means that developers will be able to write applications for Leopard using scripting languages such as Python or Ruby, opening the system up to a wider audience of developers. Code with tighter performance constraints could still be written in C (or Obj-C), hence the comparison of the new role of C to the one traditionally associated with Assembly language.

I’ve been thinking about the implications of these language bridges a lot during the last two days. First, some comments to the post pointed out that the reason for having to use Assembly language in the earlier days was that the code generated by compilers sucked. It was often much slower than what a skilled programmer could achieve by coding in Assembly language directly. However, it is wrong to think that this problem is nonexistent today. In fact, compiler technology is still vastly lagging behind hardware technology. For example, let’s have a look at the vector units modern CPUs are equipped with, such as Altivec on PowerPC, or MMX/SSE[1-4] on Intel. GCC hardly ever manages to generate well-vectorized code, even if there is potential for parallelization and the code is fairly straightforward (i.e. no branching). Even Intel’s C Compiler (ICC) doesn’t do a particularly good job. Then again, where ICC really shines (of course, given that Intel knows their own hardware) is instruction scheduling: the same vectorized code, compiled to Assembly using GCC and ICC, can run up to 2x faster if ICC is used rather than GCC. That’s quite a speedup!

Now let’s take into account that modern CPUs have far more features than just vector units. They also employ complex caching methods (and constraints), branch prediction and multiple parallel execution units, just to name a few. Sometimes there are even quirky constraints, such as the G5’s two independent FPUs being hardwired in a way that the ith and the (i+2)th instruction always go to the same FPU. Hence, if the ith and the (i+2)th instruction use the same execution units inside the CPU, performance is sacrificed, even if the other FPU is idle! Compilers hardly exploit any of this. It’s not that compiler developers suck, it’s just that it’s extremely hard to exploit these features without imposing heavy constraints on the structure of the C code.
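To illustrate what “fairly straightforward, no branching” code and its hand-vectorized counterpart can look like, here is a minimal sketch in C. It is not taken from any real library: the function names `madd_scalar` and `madd_sse` are made up for this example, and it uses SSE intrinsics rather than raw Assembly to keep it short. On CPUs without SSE, the second function simply falls back to the plain loop.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_loadu_ps, ... */
#endif

/* Plain C version: out[i] = a[i] * b[i] + c[i].
 * The loop body has no branches, so it is a candidate for vectorization. */
void madd_scalar(float *out, const float *a, const float *b,
                 const float *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i] + c[i];
}

/* Hand-vectorized version (hypothetical name): processes four floats per
 * iteration where SSE is available, then finishes the rest one by one. */
void madd_sse(float *out, const float *a, const float *b,
              const float *c, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned loads of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }
#endif
    for (; i < n; i++)                     /* scalar tail, or everything without SSE */
        out[i] = a[i] * b[i] + c[i];
}
```

Both functions compute the same result. Whether the second one is actually faster depends on the compiler, its flags and the CPU, which is exactly the kind of thing that has to be measured rather than assumed.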

In short, for most applications, hand-coded Assembly functions crafted by skilled programmers still perform vastly better than what a good compiler can achieve. For example, the math kernel libraries distributed by Intel are mostly hand-tuned. I’ve worked a bit with people doing a research project related to automatic vectorization at my university, so I have some background here.

Second, scripting languages such as Ruby and Python are very expressive and easy to learn. This is a good thing. They allow even inexperienced programmers to develop applications quickly. However, the easier a programming language is to learn, i.e. the higher the level of abstraction it offers, the less a programmer is forced to learn about the underlying mechanisms of the language. For example, if you learn C, you will sooner or later be confronted with memory management issues (such as what the stack, the heap and the data segment are). This is not true for scripting languages. Similarly, scripting languages employ high-level libraries that are extremely simple to use. What I am afraid of is that inexperienced programmers in particular may be deceived by the apparent simplicity of scripting languages. Bugs could be the result, especially where high-level concepts clash with low-level concepts, for example when using “dynamic typing”.

Let’s look at Dashboard Widgets, introduced in MacOS 10.4 Tiger: They are extremely easy to develop, being mostly based on HTML, CSS and Javascript (even though plugins written in Cocoa are possible). Thus, there are lots of Widget developers, many of them without a prior background in programming, and many Widgets are extremely buggy. If you have some Dashboard Widgets open, just take a look at your Console messages from time to time. I’m sure you’ll see lots of Javascript errors from your Widgets there. In fact, I believe that the average Dashboard Widget is way buggier than the average Cocoa application. The reason, at least to my mind, is that Widget development seems so accessible that it attracts less experienced programmers more than Cocoa does. I believe that the same could happen to normal Mac GUI apps once it becomes possible to write them entirely in simple scripting languages. I don’t want the average GUI application to be as buggy as the average Dashboard Widget!

Don’t get me wrong, I love Dashboard Widgets and have even written some myself (App Update and Widget Update). However, I tried to employ a coding style that I’ve learned using lower-level languages: extensive error checking, type checking, checking of result values and so on. You wouldn’t believe how often you see a null-pointer related error in your Console messages, caused by various Widgets. All of these could be show-stopper bugs and/or crash a script that is bridged to Cocoa!
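For readers who haven’t worked in a lower-level language, the habit described above looks roughly like this in C. This is only a sketch: `parse_port` is a hypothetical helper invented for this example, not part of any Widget or library. The point is that every input and every result value is checked before it is used.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal integer in the range [0, 65535].
 * Returns 0 on success and stores the value in *out.
 * Returns -1 on any error and leaves *out untouched. */
int parse_port(const char *text, int *out)
{
    char *end = NULL;
    long value;

    if (text == NULL || out == NULL)        /* reject null pointers up front */
        return -1;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0')        /* no digits, or trailing garbage */
        return -1;
    if (errno == ERANGE || value < 0 || value > 65535)   /* out of range */
        return -1;

    *out = (int)value;
    return 0;
}
```

A caller would write `if (parse_port(input, &port) != 0) { /* handle the error */ }` instead of assuming that the input is well-formed. The same discipline carries over to a scripting language, even though nothing there forces you to adopt it.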

Third, GUI programs written in scripting languages will definitely be slower (and use more memory, too) than a comparable program written in Cocoa. That’s what bothers me least, though, considering that the interpreters are constantly being enhanced and optimized, and that experienced programmers know how to write code that minimizes its resource consumption. Languages like OCaml prove that extremely fast code can be written in high-level languages, too.

To sum it up, I’ve always been a great fan of abstraction in programming languages. In fact, a scripting language can be an extremely powerful tool in the hands of a skilled programmer. My main concern, however, is that the simpler a language is, the less experienced the people it attracts are. The average PHP programmer (or Python programmer, whatever) possibly doesn’t have the technical background of the average C or even Assembly programmer. And this could mean that we’re probably going to see some buggy MacOS X GUI software in the future, just as we’re currently seeing a lot of buggy Dashboard Widgets.

1 Comment


  1. assembler of god
    wrote on Dec 25, 2007 at 8:29

    nice blog….a good site to learn assembly language especially for beginners


Hi, how are you? My name is Georg Kaindl, and I'm a twenty-something from Vienna, Austria. During the day, I'm a CS student at the Vienna University of Technology, but at night, I turn into an independent software developer for the Macintosh platform, social nerd, lazy entrepreneur and intuitive researcher.

I like to write about everything that matters to considerate technology enthusiasts, but humbly retain the right to go off-topic from time to time.

My posts are licensed under a Creative Commons Attribution 3.0 License.


You can reach me by email if you have something to say that's not related to a blog post or that you don't want to have publicly available as a comment to a post.

However, you'll have to prove that you are human! Even though I personally like robots very much, I'm less of a fan of SPAM. It's just a simple riddle to solve, but a SPAM bot won't cut it!

To get my email address, take the word before the .com in my domain name first (Hint: The word you are looking for starts with a "g" and ends with an "l"). Next, simply attach to this word.

Boom, there's my email address. Simple, isn't it?
