Monday, December 26, 2011

Using ngen to improve ClojureCLR startup time

Startup speed ranked fairly low in importance in the 2011 ClojureCLR survey.  Still, it can be annoying and is an impediment to certain uses of ClojureCLR.

I've used several profiling tools to examine the startup period.   The only conclusion I've been able to draw:  JIT-compilation is the culprit.  The percentage of startup time devoted to JITting is in excess of 90%.  One solution to this: pre-JIT.

If you run ngen.exe on the primary DLLs involved in ClojureCLR startup, you will experience significant startup time improvement.  I ran

time ./Clojure.Main.exe -e  "(println :a)"

on a 4.0 Debug build, on a 4.0 Release build, and on a 4.0 Release build with ngen.  For comparison, I also ran
 time java -jar clojure.jar -e "(println :a)"
(I used my git-bash shell via msysgit, so time was available via mingw.)

Results:

DebugReleaseRelease / ngenJVM
real
0m5.118s
0m8.130s
0m0.231s
0m1.053s
user
0m0.000s
0m0.015s
0m0.000s
0m0.000s
sys
0m0.000s
0m0.000s
0m0.015s
0m0.000s

The slower startup time of Debug vs non-ngen'd Release is no doubt due to the more extensive JIT optimizations taking place in the latter.  Approximately four times as fast as the JVM version is good enough for me.

I did the following ngens.  All these DLLs are loaded on an intial startup through one eval and printing.

ngen install Clojure.dll
ngen install clojure.clr.io.clj.dll
ngen install clojure.core.clj.dll
ngen install clojure.core.protocols.clj.dll
ngen install clojure.core_clr.clj.dll
ngen install clojure.core_deftype.clj.dll
ngen install clojure.core_print.clj.dll
ngen install clojure.core_proxy.clj.dll
ngen install clojure.genclass.clj.dll
ngen install clojure.gvec.clj.dll
ngen install clojure.main.clj.dll
ngen install clojure.pprint.cl_format.clj.dll
ngen install clojure.pprint.clj.dll
ngen install clojure.pprint.column_writer.clj.dll
ngen install clojure.pprint.dispatch.clj.dll
ngen install clojure.pprint.pprint_base.clj.dll
ngen install clojure.pprint.pretty_writer.clj.dll
ngen install clojure.pprint.print_table.clj.dll
ngen install clojure.pprint.utilities.clj.dll
ngen install clojure.repl.clj.dll
ngen install clojure.walk.clj.dll

You will get errors, mostly of the form
1>Common Language Runtime detected an invalid program. while compiling method clojure/walk$macroexpand_all$fn__12034__12039..ctor

It works nevertheless.  Someday I'll have to figure out exactly what sin I'm committing in my constructor code.

I cannot competently hypothesize why ClojureCLR kills the JITter like this, in comparison to the JVM, or in comparison to other similar sized programs.  Delayed JITting may be one reason.  Also, the generated ClojureCLR code contains twice as many classes as the JVM code does, due to the tactic I use to workaround the inability of the DLR to generate instance methods.


9 comments:

  1. Interesting read -- I'd be very interested in similar optimisations when using mono.

    ReplyDelete
    Replies
    1. At present, I'm rewriting all the code gen of the ClojureCLR compiler to see if I can remove some inefficiencies forced on the implementation by the use of the DLR. (Specifically, that DLR can only generate static methods, forcing an extra method call from instance to static on every Clojure IFn invoke.) Once I complete that (succesfully or not), I'll be looking at Mono.

      Delete
  2. What version of ClojureCLR were you using for this? I am using 1.3 and I am definitely getting much higher start-up times.

    By the way, when I tried to build the github master branch, I am missing pnunit.framework. Maybe this DLL could be posted as a github download?

    Thanks so much for your work by the way!

    ReplyDelete
    Replies
    1. I was using the master branch at that time, most likely. Some flavor of 1.4-master-SNAPSHOT. I was using Java Client, most likely.

      Re punit, two solutions: (1) It's only used in the Clojure.Test project -- just delete it from your solution if you aren't planning to test. (2) Install it to Clojure.Test using NuGet. (My preferred approach.)

      -David

      Delete
  3. Ok, thanks. Unfortunately, I'm still not noticing much of a speed up on my machine. It actually takes about 9 seconds to run the example above. I wonder what I'm doing wrong. It would be great if I could get clojure-clr to load faster because then I could justify using it in some of my projects at work.

    ReplyDelete
    Replies
    1. Can you give me a sense of the times you are seeing, ngnen'd and not? The speedup I (and a few others) have gotten from this is substantial. I don't think I was doing anything special, just the obvious.

      Delete
    2. It appears that ngen is not working at all on my system. A typical run is timed as follows whether I use ngen or not:

      real 0m8.634s
      user 0m0.031s
      sys 0m0.000s

      I tried various things to get the native images working and then I ran fuslogvw.exe to see what was happening. Basically my system is not finding the native images. I'll try to look into this and see if I can find the cause.

      Delete
    3. I finally got this working when I ILMerge'd the .clj.dll files into Clojure.dll. fuslogvw indicated that it wouldn't look for assemblies loaded with Assembly.LoadFrom() in the native image cache. With the ILMerge'd version, everything is in one DLL, so it loads for me with no problem. I still wonder why we had different experiences with this, but anyway, it works well enough now...

      Delete
  4. Hi

    I have exactly the same issue with IronScheme. Normal around 3 seconds, NGEN'd gets down to 0.1 seconds. As aaronc mentioned though, NGEN does not work form LoadFrom which is a pain if you want to integrate it with VS.

    I have not been able to pinpoint the issue. Would love to hear your ideas of what could cause this. Perhaps relocation?

    Cheers

    leppie

    ReplyDelete