Thursday, December 29, 2011

Calling generic methods

To take advantage of some of the recent API goodness coming from Microsoft, interoperating with generic methods is a must.  Here's an example from System.Reactive.Linq.Observable:

public static IObservable<TResult> Generate<TState, TResult>( 
    TState initialState, Func<TState, bool> condition, 
    Func<TState, TState> iterate, 
    Func<TState, TResult> resultSelector, 
    Func<TState, TimeSpan> timeSelector )

and from the land of Linq, in System.Linq.Enumerable:

public static IEnumerable<TResult> GroupBy<TSource, TKey, TElement, TResult>( 
    this IEnumerable<TSource> source, 
    Func<TSource, TKey> keySelector, 
    Func<TSource, TElement> elementSelector, 
    Func<TKey, IEnumerable<TElement>, TResult> resultSelector, 
    IEqualityComparer<TKey> comparer )

Much of the goodness of Linq, the Reactive Framework and others comes from the ability to chain together generic method calls while spelling out few of their type arguments. If you are in a statically typed language such as C#, there are plenty of types floating around to do inferencing on. Of course, with dynamic, C# is not quite the paragon of static typing it once was. The mechanisms C# uses for dynamic call sites surface in the Dynamic Language Runtime and so are available to the wider world. Following the path blazed by IronPython, we have recently enhanced ClojureCLR's ability to interoperate with generic methods.

Start a REPL and start typing:

(import 'System.Linq.Enumerable)
(def r1 (Enumerable/Where [1 2 3 4 5] even?))
(seq r1)                                        ;=> (2 4)

"But of course", you say. Not so fast; under the covers there is merry mischief.

The generic method Where is overloaded with the following signatures.

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate)

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, int, bool> predicate)

In the Where call above, the [1 2 3 4 5] is a clojure.lang.PersistentVector. This class implements IEnumerable<Object> and so matches the IEnumerable<TSource> in the first argument position.

The value of even? is a clojure.lang.IFn, more specifically a clojure.lang.AFn. In ClojureCLR, clojure.lang.AFn implements interfaces that allow it to take part in the DLR's generic method type inferencing protocol. One interface answers queries about the arities supported by the function. The function even? reports that it supports one argument and does not support two arguments. Therefore, it supports casting to Func<TSource, bool> but not to Func<TSource, int, bool>, allowing discrimination between the two overloads. (Func<TSource, bool> is a delegate class representing a method taking one argument of type TSource and returning a value of type Boolean.)

From this information, we can pick the overload of Where to use. The value returned by the Where call is of type System.Linq.Enumerable+WhereEnumerableIterator<Object>. This type implements IEnumerable and seq can do the expected thing to it.
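
To see the arity check pick the other overload, one could pass a function that accepts only two arguments. This is a sketch that follows from the mechanism described above, not something checked against a particular ClojureCLR build; the name by-index is made up for illustration.

```clojure
(import 'System.Linq.Enumerable)

;; Assumption: a function with only a two-argument arity reports that it
;; supports two arguments and not one, so it should convert to
;; Func<TSource, int, bool> (element, index) rather than Func<TSource, bool>.
(def by-index
  (Enumerable/Where [10 20 30 40 50]
                    (fn [x i] (even? i))))   ; keep elements at even positions

(seq by-index)
```

If the indexed overload is selected as expected, the predicate sees each element together with its zero-based position.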

If we had a function f that supports one and two arguments, say

(defn f
  ([x] x)  
  ([x y] [x y]))

the type inferencing illustrated in the previous example would not work.

(Enumerable/Where [1 2 3 4 5] f)   ;=> FAIL!

The error message is quite clear:

ArgumentTypeException Multiple targets could match:  
  Where(IEnumerable`1, Func`2),
  Where(IEnumerable`1, Func`3)

There are two ways around this. The simplest is to use an anonymous function of one argument that calls f:

(Enumerable/Where [1 2 3 4 5] #(f %1))

The second way is to declare the types explicitly. The macros sys-func and sys-action create a function with a delegate type matching System.Func<,...> and System.Action<,...>, respectively.

(Enumerable/Where [1 2 3 4 5] (sys-func [Object Boolean] [x] (f x)))

The first way is clearly preferable. However, when type inferencing does not suffice, sys-func and sys-action can be used. (If delegates of types other than Func<,...> and Action<,...> are required, gen-delegate is available.)

Not just Clojure data structures can participate as sources:

(def r2 (Enumerable/Range 1 10))
(seq r2) ;=> (1 2 3 4 5 6 7 8 9 10)
(seq (Enumerable/Where r2 even?)) ;=> (2 4 6 8 10) 

In fact, any of the following calls will work:

(Enumerable/Where r2 (sys-func [Int32 Boolean] [x] (even? x)))
(Enumerable/Where r2 (sys-func [Object Boolean] [x] (even? x)))
(Enumerable/Where (seq r2) even?)

There are situations where you need to supply type arguments explicitly to a generic method. For example, the following fails:

(def r3 (Enumerable/Repeat 2 5)) ;=> FAILS!

The error message states:

InvalidOperationException Late bound operations cannot be performed 
on types or methods for which ContainsGenericParameters is true.

We can cause the type arguments on the Repeat<T> method to be filled in using the type-args macro:

(def r3 (Enumerable/Repeat (type-args Int32) 2 5))
(seq r3) ;=> (2 2 2 2 2)

If you'd like to do Linq-style method chaining, don't forget the threading macro:

(seq (-> (Enumerable/Range 1 10)
            (Enumerable/Where even?)
            (Enumerable/Select #(* %1 %1))))  ;=> (4 16 36 64 100)

Of course, if you'd like to have the Linq syntax that is available in C#, you are free to write a macro.
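
As a starting point for such a macro, here is a minimal sketch. The name linq-> is hypothetical and not part of ClojureCLR. It rewrites each step of the form (Method args...) into a call to the static method of the same name on System.Linq.Enumerable and threads the source through with ->.

```clojure
(import 'System.Linq.Enumerable)

;; Hypothetical helper, not part of ClojureCLR.
;; Each step (Method arg ...) becomes (System.Linq.Enumerable/Method arg ...),
;; and -> then inserts the previous result as the first argument.
(defmacro linq-> [source & steps]
  `(-> ~source
       ~@(for [[method & args] steps]
           `(~(symbol "System.Linq.Enumerable" (name method)) ~@args))))

;; Equivalent to the threaded Range/Where/Select example above.
(seq (linq-> (Enumerable/Range 1 10)
             (Where even?)
             (Select #(* %1 %1))))
```

This only saves typing the Enumerable/ prefix. Anything closer to C# query syntax (from ... where ... select) would need a macro that also rewrites the clauses into functions.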

Tuesday, December 27, 2011

Working with enums

Working with enum types is straightforward. We have provided a few core methods to simplify certain operations.

Accessing/creating enum values

An enum type is a value type derived from System.Enum. The named constants in an enum type are implemented as static fields on the type.  For example, System.IO.FileMode, defined in C# as

public enum FileMode
{
    Append = 6,
    Create = 2,
    CreateNew = 1,
    Open = 3,
    OpenOrCreate = 4,
    Truncate = 5
}

has an MSIL implementation more-or-less equivalent to

public sealed class FileMode : System.Enum
{
    public static System.IO.FileMode CreateNew = 1;
    public static System.IO.FileMode Create = 2;
    public static System.IO.FileMode Open = 3;
    public static System.IO.FileMode OpenOrCreate = 4;
    public static System.IO.FileMode Truncate = 5;
    public static System.IO.FileMode Append = 6;
}

Thus, we can use our regular static-field access interop syntax to retrieve these values:

(import 'System.IO.FileMode) ;=> System.IO.FileMode
FileMode/CreateNew           ;=> CreateNew

These are not integral-type values. They retain the enumeration type.

(class FileMode/CreateNew) ;=> System.IO.FileMode

You can convert them to an integer value if you desire:

(int FileMode/CreateNew) ;=> 1

If you want to convert from an integer value to an enumeration value, try:

(Enum/ToObject FileMode 4) ;=> OpenOrCreate

If you want to convert from the name of an enumeration constant to an enumeration value, the enum-val function will work with strings or anything that name works on:

(enum-val FileMode "CreateNew") ;=> CreateNew
(enum-val FileMode :CreateNew)  ;=> CreateNew

Working with bit fields

Enumeration types that have the Flags attribute are often used to represent bit fields. For convenience, we provide the functions enum-or and enum-and to combine and mask bit field values. For example, System.IO.FileShare has the Flags attribute. It is defined as follows:

[Serializable, ComVisible(true), Flags]
public enum FileShare
{
    Delete = 4,
    Inheritable = 0x10,
    None = 0,
    Read = 1,
    ReadWrite = 3,
    Write = 2
}

Use enum-or to combine values.

(import 'System.IO.FileShare)
(enum-or FileShare/Read FileShare/Write) ;=> ReadWrite

Use enum-and to mask values.

(def r (enum-or FileShare/ReadWrite FileShare/Inheritable))
(= (enum-and r FileShare/Write) FileShare/Write) ;=> true
(= (enum-and r FileShare/Write) FileShare/None)  ;=> false
(= (enum-and r FileShare/Delete) FileShare/None) ;=> true

You can also use the HasFlag method to test if a bit is set:

(.HasFlag r FileShare/Write)  ;=> true
(.HasFlag r FileShare/Delete) ;=> false
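
The masking idiom above can be wrapped in small helpers. This is a sketch only: flags-of and flag-set? are hypothetical names, not ClojureCLR core functions, and they rely solely on enum-or, enum-and and = behaving as shown in the examples above.

```clojure
(import 'System.IO.FileShare)

;; Hypothetical helpers built only on enum-or and enum-and from the text.
(defn flags-of
  "Combines one or more flag values of the same enum type."
  [flag & more]
  (reduce enum-or flag more))

(defn flag-set?
  "True when every bit of flag is also set in value."
  [value flag]
  (= (enum-and value flag) flag))

(def share (flags-of FileShare/Read FileShare/Write FileShare/Inheritable))

(flag-set? share FileShare/Write)   ;=> true
(flag-set? share FileShare/Delete)  ;=> false
```

Note that flag-set? returns true for any value when the flag is FileShare/None, since masking with zero always gives zero.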

Monday, December 26, 2011

Using ngen to improve ClojureCLR startup time

Startup speed ranked fairly low in importance in the 2011 ClojureCLR survey.  Still, it can be annoying and is an impediment to certain uses of ClojureCLR.

I've used several profiling tools to examine the startup period.   The only conclusion I've been able to draw:  JIT-compilation is the culprit.  The percentage of startup time devoted to JITting is in excess of 90%.  One solution to this: pre-JIT.

If you run ngen.exe on the primary DLLs involved in ClojureCLR startup, you will experience significant startup time improvement.  I ran

time ./Clojure.Main.exe -e  "(println :a)"

on a 4.0 Debug build, on a 4.0 Release build, and on a 4.0 Release build with ngen.  For comparison, I also ran

time java -jar clojure.jar -e "(println :a)"

(I used my git-bash shell via msysgit, so time was available via mingw.)


Debug | Release | Release / ngen | JVM

The slower startup time of Debug vs non-ngen'd Release is no doubt due to the more extensive JIT optimizations taking place in the latter.  Approximately four times as fast as the JVM version is good enough for me.

I did the following ngens.  All these DLLs are loaded on an initial startup through one eval and printing.

ngen install Clojure.dll
ngen install
ngen install clojure.core.clj.dll
ngen install clojure.core.protocols.clj.dll
ngen install clojure.core_clr.clj.dll
ngen install clojure.core_deftype.clj.dll
ngen install clojure.core_print.clj.dll
ngen install clojure.core_proxy.clj.dll
ngen install clojure.genclass.clj.dll
ngen install clojure.gvec.clj.dll
ngen install clojure.main.clj.dll
ngen install clojure.pprint.cl_format.clj.dll
ngen install clojure.pprint.clj.dll
ngen install clojure.pprint.column_writer.clj.dll
ngen install clojure.pprint.dispatch.clj.dll
ngen install clojure.pprint.pprint_base.clj.dll
ngen install clojure.pprint.pretty_writer.clj.dll
ngen install clojure.pprint.print_table.clj.dll
ngen install clojure.pprint.utilities.clj.dll
ngen install clojure.repl.clj.dll
ngen install clojure.walk.clj.dll

You will get errors, mostly of the form

1>Common Language Runtime detected an invalid program. while compiling method clojure/walk$macroexpand_all$fn__12034__12039..ctor

It works nevertheless.  Someday I'll have to figure out exactly what sin I'm committing in my constructor code.

I cannot competently hypothesize why ClojureCLR kills the JITter like this, in comparison to the JVM, or in comparison to other similarly sized programs.  Delayed JITting may be one reason.  Also, the generated ClojureCLR code contains twice as many classes as the JVM code does, due to the tactic I use to work around the inability of the DLR to generate instance methods.

Saturday, December 24, 2011

ClojureCLR has a new home

The survey action items had as item #1 the following:
Action item: Ask Rich Hickey and Clojure/core to clarify their position and/or plans re ClojureCLR.
You can refer to the post on viability to get a better sense of what respondents were looking for.

As a result of discussions with core, ClojureCLR has moved to a better neighborhood.  The repo has moved under the clojure group at GitHub, joining the mainline Clojure project, ClojureScript and the contrib libs.

ClojureCLR is now an official project under
Issues will be managed through this site.   We will continue to maintain RHCAH: Rich Hickey Contributor Agreement Hygiene.  Please have a CA on file with Rich before submitting patches.

I've used the move as an opportunity to rewrite and reorganize the wiki pages.  I'll be posting entries here on some of the newest material, such as the improved support for generic method interop.

This (much appreciated) token of core's regard is likely to be the extent of official support for ClojureCLR for the time being.  Core does not have sufficient resources for more aggressive stewardship of the project.

The answer to the action item: ClojureCLR is a community-supported project.  If you are interested in the long-term success of Clojure on the CLR, roll up your sleeves.