F# vs C# performance

I do not understand why F# so slow

let inline swap<'a> (a: byref<'a>, b: byref<'a>) =
    let t = a
    a <- b
    b <- t

let sort arr =
    let ln = Array.length arr

    for i = 0 to ln - 1 do
        for j = 0 to ln - 2 - i do
            if arr.[j + 1] < arr.[j] then
                swap (&arr.[j + 1], &arr.[j])

let doTest () =
    let max = 100_000
    let value = float (max)

    let arr =
        Array.init max (fun i -> value - float (i))

    sort arr
    printfn "%A" (arr.[0..5])

#time

doTest ()

It is very slow (few minutes).
Analog on C# take above few seconds. Why?
Original cod here vostok/example/BubbleSort.mod at master · Vostok-space/vostok (github.com)
(Interesting post about result for it on Oberon(simple language from Nicolaus Wirth) Восток: Скорость пузырьковой сортировки на Oberon, C++, Go, Rust (vostok-space.blogspot.com))

using System.Runtime.CompilerServices;

var m = 100_000;
var a = new double[m];
var s = (double)m;
for (var i = 0; i < m; i++)
{
    a[i] = s;
    s = s - 1;
}

Sort(a);

Console.WriteLine(a[0]);

static void Sort(double[] a)
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    void Swap(ref double x, ref double y)
    {
        var t = x;
        x = y;
        y = t;
    };

    for (var i = 0; i < a.Length; i++)
        for (var j = 0; j < (a.Length - 1 - i); j++)
            if (a[j + 1] < a[j])
                Swap(ref a[j + 1], ref a[j]);
}

I’m new to F# so I can’t really offer an explanation, but the following change on my local machine dramatically improves performance:

let sort arr =

to

let sort (arr: float array) =

Without the type annotation I ended up having to kill FSI after a few minutes. With the annotation I get the following:

[|1.0; 2.0; 3.0; 4.0; 5.0; 6.0|]
Real: 00:00:08.292, CPU: 00:00:00.201, GC gen0: 0, gen1: 0, gen2: 0

I would love to know what’s going on here myself. It must have something to do with the generic function not properly passing by reference or something?

I’m on an M1 Mac with .NET 7.0.100.

3 Likes

The arr parameter in the first version using type inference is constrained to be 'a when 'a: comparison, so the what the fifth line does is to call LanguagePrimitives.HashCompare.GenericLessThanIntrinsic which is costly due to boxing. In C# version or non-polymorphic version, there’s no such a problem.

Update: Mark the function as inline then the compiler will expand the difinition at the callsite according to concrete type of arr’s elements instead of uniformly apply intrinsic function.