GHC Haskell switches to an LLVM backend (haskell.org)
87 points by dons on Feb 19, 2010 | 19 comments


Although this is really cool stuff and it looks like it's going to get merged in, the title is a little misleading here. GHC hasn't switched to an LLVM backend -- they're currently reviewing this patch and possibly merging in this support for LLVM output (not necessarily switching over to it entirely).


The title is accurate: the GCC backend is being dropped at the same time. [1]

1. http://www.haskell.org/pipermail/glasgow-haskell-users/2010-...


But that link only says that via-C is being dropped, which, as I understand it, is not currently the default backend; NCG is [1]. This means that (1) via-C is being dropped; (2) NCG (the native code generator) has been, and still is, the default; and (3) support for this LLVM backend is currently being considered for merging.

[1] http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler...


That's right. The GCC (-fvia-C) backend is being dropped in favor of LLVM. Work on the native codegen continues separately.


Is this a big deal? (not a rhetorical question!)



Yes; at present, adding code generation for new architectures is kind of a nightmare (google "haskell Evil Mangler"). LLVM will make this much simpler (or, at least, foist it on the LLVM guys :) )


What about GC? When I last looked at LLVM (the current stable release, in fact), it didn't seem to have a very good story there. Plenty planned out, but little implemented, and even less of it solid.


LLVM is the low-level virtual machine. (Basically portable assembly; except much smarter.)

GC is orthogonal. There is some work to add auto-gc to anything on LLVM, but while that's brewing, you can also do it in your runtime. GHC's runtime already does GC, so this is not a big deal.
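To make "do it in your runtime" concrete, here is a toy mark-and-sweep sketch in which the runtime keeps its own explicit list of roots, so the collector needs no help from the code generator. This is an illustration only, not GHC's collector: the heap layout (a dict from object id to the ids it references) and the names `mark`, `sweep`, and `collect` are all assumptions made up for the example.

```python
def mark(heap, roots):
    """Return the set of object ids reachable from the given roots."""
    reachable = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in reachable:
            continue  # already visited; also guards against cycles
        reachable.add(obj)
        worklist.extend(heap[obj])  # follow outgoing references
    return reachable


def sweep(heap, reachable):
    """Keep only the objects that the mark phase reached."""
    return {obj: refs for obj, refs in heap.items() if obj in reachable}


def collect(heap, root_stack):
    """One full collection: mark from the runtime's root stack, then sweep."""
    return sweep(heap, mark(heap, root_stack))


# Hypothetical heap: "c" and the self-referencing "d" are unreachable garbage.
heap = {"a": ["b"], "b": [], "c": ["a"], "d": ["d"]}
live = collect(heap, root_stack=["a"])
```

The point of the design is that the roots are data the runtime already owns, so the layer that emits machine code does not have to describe its stack frames or registers to the collector.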


Errr ... OK. So the GHC runtime that supports (supported) GCC (and that now will support LLVM) doesn't depend on any special cooperation by that level below it, and does the GC on top?

Hmmm, Gambit-C shows you can do this with good results, but ... uck. Still, I could well see how this new arrangement would be much better than using GCC as a back end.


You eventually have to have code to run on the CPU. LLVM lets your compiler output LLVM instructions instead of assembly, and then LLVM converts those instructions to assembly. The level of expressiveness is about the same.
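As a rough sketch of what "output LLVM instructions instead of assembly" can look like, the toy compiler below turns a small expression tree into textual LLVM IR for a function over 32-bit integers. Everything in it (the `Num`, `Var`, and `BinOp` node types, `compile_function`, and the `%t1`-style temporary names) is hypothetical and unrelated to how GHC's backend is actually written.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str  # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]

# Map source operators to LLVM integer instructions.
_OPCODES = {"+": "add", "-": "sub", "*": "mul"}


def compile_function(name, params, body):
    """Emit textual LLVM IR for an i32 function computing `body`."""
    lines = []
    counter = 0

    def emit(expr):
        nonlocal counter
        if isinstance(expr, Num):
            return str(expr.value)   # integer literals are used inline
        if isinstance(expr, Var):
            return f"%{expr.name}"   # parameters are named SSA values
        lhs = emit(expr.left)
        rhs = emit(expr.right)
        counter += 1
        tmp = f"%t{counter}"         # fresh temporary for each operation
        lines.append(f"  {tmp} = {_OPCODES[expr.op]} i32 {lhs}, {rhs}")
        return tmp

    result = emit(body)
    args = ", ".join(f"i32 %{p}" for p in params)
    header = f"define i32 @{name}({args}) {{"
    return "\n".join([header, "entry:", *lines, f"  ret i32 {result}", "}"])


# x + y * 2
ir = compile_function("f", ["x", "y"], BinOp("+", Var("x"), BinOp("*", Var("y"), Num(2))))
print(ir)
```

The front end only decides which operations happen in which order; instruction selection and register allocation are left to whatever consumes the IR.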

Your CPU doesn't do GC, so LLVM is not necessarily the best place to do it, either. (LLVM does do other things your CPU doesn't do, which is why C compiled to LLVM runs faster than C compiled to native code in many cases. I imagine this will work well for GHC, as well, although GHC's native codegen can outperform gcc too.)

It's also worth noting that Apple (and others) have poured hundreds of thousands of dollars into LLVM. Taking advantage of that is always good, and the whole point of "free software", in fact.


Yes, I've looked hard at using LLVM for the backend of a project, so I'm familiar with the above.

However, LLVM does at least one thing that assembly doesn't, which is implement a stack. It also obviously handles CPU registers (and opaquely). To do accurate garbage collection (http://llvm.org/docs/GarbageCollection.html; note, the whole LLVM site is not loading for me right now) you have to work with it to find roots in the stack(s) and registers. And the state of the implementation of that is what I was talking about.

Does this GHC backend use a conservative collector such as the Boehm collector? As I recall VMKit does.


Remember, this isn't the only backend. Native code generation continues on some architectures. LLVM just has its own benefits.


LLVM is mainly developed by Apple, and Objective-C (via clang) requires garbage collection, so garbage collection shouldn't be a problem.


Without going into depth, this looks like very impressive honors thesis work compared with some of the other work I have seen.


Especially given that the author writes that he had "no haskell knowledge at first."


What happened to C--?


GHC uses a modified form of C-- internally, which is fed to LLVM, GCC or the native codegen.

The standalone C-- compilers never reached the level of maturity (of, e.g., LLVM) that would justify the port.


Yeah, as far as I can tell C-- is lying in the street slowly bleeding. The group at Harvard has run out of money for the compiler they were working on, and as soon as the principal investigator can catch his breath (he had to assume the duties of a TA) he'll finish packaging it into something that others can easily work with (e.g. a Git repository, as I recall).

As it is, it's supposed to be pretty hard to build, depending on a lot of old and/or exotic tools.

GHC forked off of this effort sometime before the current C-- standard.



