10 June 2011

Don't generate 400 lines of IR for 5 trivial lines of source code. There's just no way LLVM could possibly optimize all of that away.

In particular, never, ever generate diamond-shaped control flow. Don't annoy the compiler with lots of compensation code in the slow path, either. Just branch to a fallback routine. Don't assume LLVM is going to inline a branchy helper function and reduce it to nothing -- inline the fast path yourself.
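To make the shape concrete, here is a minimal sketch of a front end that emits only the fast path for an integer add. The textual IR, the `guard ... else goto` form, and the names `emit_int_add` and `fallback_add` are all made up for illustration; none of this is LLVM syntax or API. The point is only that a failed guard leaves through a side exit and never rejoins, so there is no merge point and no diamond.

```python
def emit_int_add(dst, lhs, rhs, fallback="fallback_add"):
    """Emit a toy, hypothetical IR for `dst = lhs + rhs`, integer case only.

    Each guard branches to the out-of-line fallback routine when it fails.
    Control never comes back from the fallback, so the fast path stays
    straight-line code with no join block.
    """
    return [
        f"guard is_int({lhs}) else goto {fallback}",
        f"guard is_int({rhs}) else goto {fallback}",
        f"{dst} = int_add {lhs}, {rhs}",
    ]


if __name__ == "__main__":
    for line in emit_int_add("t3", "t1", "t2"):
        print(line)
```

The fallback routine is assumed to handle everything else (other types, overflow, metamethods) on its own, for instance by returning to an interpreter.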

The source language (or a properly designed bytecode) provides much more high-level information. You lose all of this by mindlessly lowering it into a low-level IR. LLVM would need this info for better alias analysis. But since the info is already gone, it can't effectively weed out the forest of loads and stores.

Better forget about the C-centric features of LLVM, e.g. the standard calling conventions. And due to the nature of dynamic code, regular LICM doesn't work well, either. All those expensive optimizations in LLVM don't buy you much, since most of them won't work with dynamic code that includes tons of guards.

Basically you'd need to do a lot of optimizations yourself at a higher level, e.g. forwarding/sinking of stack slot accesses. Only pass the cleaned-up IR to LLVM -- use it more like a dumb assembler. That raises the question, though, whether LLVM is the right tool for the job.
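As a rough illustration of what forwarding of stack slot accesses could look like before any low-level IR exists, here is a small sketch over a toy straight-line IR. The tuple format, the op names, and the rule that a call invalidates every slot are assumptions made for the example, not anything taken from LLVM or from a particular VM. It also assumes every destination name is assigned exactly once, and it does forwarding only, not sinking.

```python
def forward_stack_slots(ops):
    """Forward stored values to later loads of the same stack slot.

    Toy IR (hypothetical): ("store", slot, value), ("load", dst, slot),
    ("call", name), or a generic (opname, dst, *operands).
    Assumes each destination name is defined once (SSA-like).
    """
    known = {}   # slot -> name of the value currently held in that slot
    alias = {}   # removed load destination -> value that replaces it
    out = []
    for op in ops:
        kind = op[0]
        if kind == "store":
            _, slot, value = op
            value = alias.get(value, value)
            known[slot] = value
            out.append(("store", slot, value))
        elif kind == "load":
            _, dst, slot = op
            if slot in known:
                alias[dst] = known[slot]   # forwarded, no load emitted
            else:
                known[slot] = dst          # later loads reuse this one
                out.append(op)
        elif kind == "call":
            known.clear()                  # assumed: a call may write any slot
            out.append(op)
        else:
            name, dst, *args = op
            out.append((name, dst, *[alias.get(a, a) for a in args]))
    return out


if __name__ == "__main__":
    ops = [
        ("store", 0, "a"),
        ("load", "t1", 0),
        ("load", "t2", 1),
        ("add", "t3", "t1", "t2"),
        ("store", 0, "t3"),
        ("load", "t4", 0),
        ("load", "t5", 1),
        ("mul", "t6", "t4", "t5"),
    ]
    for op in forward_stack_slots(ops):
        print(op)
```

On the sample input, the eight ops shrink to five: three of the four loads disappear because the front end knows that distinct slot numbers never alias, which is exactly the kind of information that is gone once everything has been lowered to plain memory accesses.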

Mike Pall LLVM comment

A Rant About PHP Compilers and HipHop

