Monday, March 26, 2012

Code gen redo preview

Rewriting the code generation phase of a compiler is not for the faint of heart. Nor, perhaps, for the sound of mind.

I've nearly completed a rewrite of the code gen code of the ClojureCLR compiler.  There are still a few things on my punch list (see below), but all the clojure.test-clojure.* tests run now.  I hope an intrepid few will give it a spin before I push the changes to master.  The new code can be found in the nodlr branch in the GitHub repo.

When I wrote the ClojureCLR compiler, I was interested in seeing what kind of mileage I could get out of the Dynamic Language Runtime (DLR), specifically the DLR's expression tree (ET) mechanism.  The DLR's ETs extended the ETs used in LINQ by providing enhanced control flow capabilities.  They are central to other dynamic language implementations on the CLR such as IronPython and IronRuby.

The first version of the ClojureCLR compiler mimicked the JVM compiler through its initial phases.  The Lisp reader translates source code to Lisp data structures that are parsed to generate abstract syntax trees.  The ClojureJVM compiler traverses the ASTs to produce JVM bytecodes.  The ClojureCLR compiler instead generates DLR ETs from the ASTs.   Those ETs are then asked to generate MSIL code.

I got a lot of mileage out of using the DLR for code generation.  I got to avoid some of the hairier aspects of MSIL and the CLR-- value types, generic types, and nullable types, for example, are handled nicely by ETs.  I also found it easier to experiment.  However, using ETs had at least two drawbacks. One was that going from ASTs to MSIL through ETs likely came close to doubling the work of MSIL generation. Another was that ETs were restricted to producing static methods only.  Working around this restriction introduced several inefficiencies in the resulting code.

The Clojure model for functions maps each defined function to a class.  For example, compiling

(defn f
  ([x] ... )
  ([x y] ... ))

yields a class named something like user$f__1295 that implements clojure.lang.IFn, with overrides for virtual methods invoke(Object) and invoke(Object,Object).  (The actual value to which f would be bound would be an instance of this class.)
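As a rough sketch of that shape, here is what such a class might look like if written by hand in Java. The names FnBase and UserF are hypothetical stand-ins (the real base type is clojure.lang.IFn and the real class name is generated), and the method bodies are placeholders for the elided function bodies.

```java
// Hypothetical stand-in for clojure.lang.IFn, trimmed to two arities.
abstract class FnBase {
    Object invoke(Object x) {
        throw new IllegalArgumentException("Wrong number of args (1)");
    }
    Object invoke(Object x, Object y) {
        throw new IllegalArgumentException("Wrong number of args (2)");
    }
}

// One class for the function, one instance-method override per arity.
// The bodies are placeholders for the "..." in the defn above.
final class UserF extends FnBase {
    @Override Object invoke(Object x) {
        return "one arg: " + x;
    }
    @Override Object invoke(Object x, Object y) {
        return "two args: " + x + ", " + y;
    }
}

final class FnShapeDemo {
    public static void main(String[] args) {
        // The var f would be bound to an instance of the class.
        FnBase f = new UserF();
        System.out.println(f.invoke(1));     // prints "one arg: 1"
        System.out.println(f.invoke(1, 2));  // prints "two args: 1, 2"
    }
}
```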

Note that the invoke overrides of necessity are instance methods.  Recall from above that DLR ETs cannot produce instance methods.  Toss in another little problem with referring to unfinished types.  Shake and stir.  You end up with the following abomination:  Where ClojureJVM generates one class and two methods for the example above, ClojureCLR would have to generate two classes and four methods.  Each invoke override method is just a passthrough to a DLR-generated static method taking the function object as a first parameter.
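The passthrough arrangement can be pictured with a hand-written Java analogue. This is only a sketch of the pattern: the names Fn2, FImpl and FPassthrough are hypothetical, and which class holds which methods in the old ClojureCLR output is an assumption made for illustration.

```java
// Hypothetical stand-in for clojure.lang.IFn, trimmed to two arities.
interface Fn2 {
    Object invoke(Object x);
    Object invoke(Object x, Object y);
}

// Class 1 of 2: stands in for the DLR-generated code, which could only be
// static methods. The function object arrives as an explicit first parameter.
final class FImpl {
    static Object invoke1(Fn2 self, Object x) {
        return "one arg: " + x;
    }
    static Object invoke2(Fn2 self, Object x, Object y) {
        return "two args: " + x + ", " + y;
    }
}

// Class 2 of 2: the class whose instance is bound to f.
// Each instance method does nothing except forward to the static method.
final class FPassthrough implements Fn2 {
    public Object invoke(Object x) {
        return FImpl.invoke1(this, x);
    }
    public Object invoke(Object x, Object y) {
        return FImpl.invoke2(this, x, y);
    }
}

final class PassthroughDemo {
    public static void main(String[] args) {
        Fn2 f = new FPassthrough();
        System.out.println(f.invoke(1));     // prints "one arg: 1"
        System.out.println(f.invoke(1, 2));  // prints "two args: 1, 2"
    }
}
```

Every call pays for an extra method call and an extra argument, and every function costs an extra class, which is where the inefficiencies come from.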

For several years I hoped that the DLR team would get around to looking into class and instance method generation.  This now seems unlikely.  So I finally decided to rewrite the code generation phase to eliminate most uses of the DLR.

The new code gen code yields significant improvements in compilation time and code size.  Compiling the bootstrap clojure core environment is roughly twice as fast. The generated assemblies are about 20% smaller.  Startup time (non-NGEN'd) is 11% faster.  A few benchmarks I've run show speedups ranging from 4% to 16%.  This is in line with my best hopes.

One other benefit: with code generation more closely modeled after the JVM version, future maintainers will need less knowledge of the DLR.

There are drawbacks to this move.  The DLR guys know a lot more about generating MSIL code than I do.  Some wonderful goodness with names like Expression.Convert and Expression.Call were my best friends.  They are (mostly) gone now.  And, oh, the beauty of DebugView for ETs for debugging code gen--this will be missed.  My new best friends are peverify and .NET Reflector, the caped duo for rooting out bad MSIL.  Wonderful in their own way, but I have a sense of loss.

So, where are we?  I have a little more work to do before putting this on the master branch.    I plan to make one last traversal of the old code looking at all occurrences of my former best friends  to make sure I've been consistent in handling the complexities they hid.  I also plan to reimplement a 'light compile' variation to be used during evaluation.  The current version has it.  (What this is and why it matters I leave to another time.)  Neither task will take long.

In the meantime, the nodlr branch is ready for a workout by those who care and dare. Have at it.

P.S.  The DLR is still being used to provide CLR interop and polymorphic inline caching.  Another topic-for-another-day.


  1. Mostly looks pretty good. I really like the fact that functions defined at the repl get a meaningful type name so that the stacktraces are more readable.

    I'm getting the following error with one part of my code:
    CompilerException System.InvalidOperationException: Can't embed object in code, maybe print-dup not defined: System.Xaml.Schema.XamlMemberInvoker
    at clojure.lang.CljCompiler.Ast.ObjExpr.EmitValue(Object value, CljILGen ilg)
    at clojure.lang.CljCompiler.Ast.ObjExpr.EmitConstantFieldInits(CljILGen ilg)
    at clojure.lang.CljCompiler.Ast.ObjExpr.EmitStaticConstructorBody(CljILGen ilg)
    at clojure.lang.CljCompiler.Ast.ObjExpr.DefineStaticConstructor(TypeBuilder fnTB)
    at clojure.lang.CljCompiler.Ast.ObjExpr.Compile(Type superType, Type stubType, IPersistentVector interfaces, Boolean onetimeUse, GenContext context)
    at clojure.lang.CljCompiler.Ast.FnExpr.Parse(ParserContext pcon, ISeq form, String name)
    at clojure.lang.Compiler.AnalyzeSeq(ParserContext pcon, ISeq form, String name), compiling: (NO_SOURCE_PATH:50)

    This is in a library I am writing for creating WPF apps in Clojure. With all the appropriate WPF assemblies (PresentationFramework, etc.) loaded and this namespace loaded, entering something like (caml (:StackPanel (:TextBlock))) at the REPL will produce the above stacktrace. The code works fine with the ClojureCLR master branch. Other than that, most things seem to load up just fine. If there is anything I can do to help with debugging this, let me know.

    1. This is the 'live object' problem. I'll have to look at this particular example to see what the deal is. ClojureJVM has the same problem -- it would be interesting to find a parallel example.

      The reason that it works in the master branch and not in the nodlr branch is that the master branch has a special 'light compilation' mode that uses the DLR to play some tricks that allow embedding of otherwise-disallowed live objects. My last remaining task on this code gen redo is reproducing light compilation without using the DLR. It makes playing at the REPL easier and faster -- you're seeing 'easier' here.

      Whether or not I pull off light compilation, I'll do a blog entry with the details on what's going on. Your example might well serve as an illustration.

      Speaking of which: is there anything I need to know in order to pull your code into the REPL for testing?

    2. I think the main thing is having the WPF assemblies (PresentationFramework, WindowsBase, etc.) loaded in your AppDomain.

    3. So, this might not actually be a problem. Would you in general say that it is better to not embed the live objects, at least in terms of performance? The reason for embedding the live objects was actually to reduce reflection which may be defeated anyway because of this light compilation.

    4. You won't be able to embed live objects in compiled code without a print-dup method for the type. That is actually true in the current master branch.

      Light compilation is only used when not compiling. It does not apply to deftypes and related. (Internally, it is only for FnExpr, not for NewInstanceExpr.) Light compilation vs regular:

      Regular: every fn generates a class
      Light: simple fns reuse the same class

      Regular: non-emittable constants are held in static fields in the class for the fn -- require static initialization code.
      Light: non-emittable constants are held in an array. The value stored in the array is the value passed in.

      Regular: closed-over variables become instance fields in the class for the fn
      Light: closed-over variable accesses become references to a 'closure' array

      Regular: methods are Reflection.Emit-generated
      Light: methods are DynamicMethods -- faster generation

      So the tradeoff is: save a class gen, faster method generation vs array accesses instead of field accesses.

      Reflection is not an issue -- same in either case.
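      A hand-written Java analogue of the field-versus-array tradeoff, for a fn like (fn [x] (+ x n)) that closes over n. This is a sketch only: AdderRegular, LightFn and Body are hypothetical names, and the Body interface merely stands in for a DynamicMethod.

```java
// Regular: a dedicated class per fn; the closed-over n is an instance field.
final class AdderRegular {
    private final Object n;

    AdderRegular(Object n) {
        this.n = n;
    }

    Object invoke(Object x) {
        return (Long) x + (Long) n;  // field access
    }
}

// Stand-in for a DynamicMethod: a method body that receives the closure array.
interface Body {
    Object call(Object[] closure, Object x);
}

// Light: one shared class reused by many simple fns.
// Closed-over values live in an array instead of in per-fn fields.
final class LightFn {
    private final Object[] closure;
    private final Body body;

    LightFn(Object[] closure, Body body) {
        this.closure = closure;
        this.body = body;
    }

    Object invoke(Object x) {
        return body.call(closure, x);
    }
}

final class LightVsRegularDemo {
    public static void main(String[] args) {
        AdderRegular regular = new AdderRegular(10L);
        // Same fn in the light style: n is read as closure[0].
        LightFn light = new LightFn(new Object[] {10L},
                (closure, x) -> (Long) x + (Long) closure[0]);

        System.out.println(regular.invoke(5L));  // prints 15
        System.out.println(light.invoke(5L));    // prints 15
    }
}
```

      The light version defines no new class for the fn, at the cost of an array access (plus a cast) every time n is read.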

    5. Ok, so is the lack of light compilation the reason why I am now seeing meaningful stack traces for my repl-defined functions? Personally, I'd take the better stack traces over faster compilation.

      Also, I found another way to write my macros that doesn't require that live object and the code is actually faster.

    6. Probably so. Not having made any errors lately, I hadn't noticed. :)

      I was planning on making it switchable in the new version. I may need it more than anyone else. It has an effect when loading the clojure core evaluated during bootstrap compilation. It also keeps devenv.exe from going memory-crazy when running the test suite under the debugger. So, I may be the only one who cares.

    7. Oh, well if I had to vote, I would keep the default for the repl as it is now (i.e. full compile). 1) I really appreciate the meaningful class names in stacktraces and 2) it more accurately reflects how code will behave when AOT compiled. Before, I mistakenly wrote code that I thought was what I wanted because it worked at the REPL. With the new version, although it was frustrating at first, I eventually came up with a better and more performant solution.

      One other question, is there a reason that GenClass doesn't generate a type in the *.clj.dll assembly and instead creates its own somewhat oddly named .dll? Couldn't it be made to work just like GenProxy where the type is defined dynamically when at the REPL and in the output assembly when compiling?

    8. I can leave full compile as the default. As I said, I may be the only one who really finds light-compile all that useful.

      Consistency of behavior between eval and AOT-compiled is useful.

      I'll look at the GenClass situation. Probably an artifact from long ago, not necessary.

    9. Also, I'm not able to get by-ref working. Maybe it's broken... Looking at the AOT compiled code in reflector it seems like the parameter is being passed with "ref" to the __interop_ method, but no values change. I can't seem to get type hinting to generate code that invokes the method without reflection when I use by-ref.

    10. I'll take a look. I did have to rewrite interop code for this redo, may have mangled it. That's a piece of interop not covered by the standard Clojure tests--a gap I need to fill. Something I'll work on before going live with this work.

    11. Aaron: Try the latest commit in the nodlr branch. I worked on the by-ref problem.

    12. I'm finally ready to push the new code gen over to the master branch.

      The hard question: squash the 65 commits on the nodlr branch down to 2 or 3 milestones or let the master branch preserve history?

  2. My vote is to preserve the history. Always better to be able to go back and see what was done.

    I'll see if I can look at the latest commit soon - will be going on vacation in a few days so it might be tough. Btw, did you get a chance to look at any of my changes?

  3. There's some pretty ugly history in there. :) It was about 30 commits before I got the new code gen to work on core.clj. Then there were the tests. But they are mostly marked by WIP on the comment, so the two people who will look at it (you and me, I'm guessing) will know to go whistling past those commits.

    My plan after moving the new code gen to master is first to catch up on the last four months of commits on ClojureJVM. This will allow me to get an official 1.4 release out. Your changes are next on the list.

  4. Hey Dave, did you ever get a chance to look at my changes? Maybe there's some way we could chat about how to best approach this? I'm wondering if maybe I should create a patch for you rather than trying to merge the branches.