F# Performance vs C# and Garbage Collector - 2x slower

Hi all,
I am new to this forum and F#.
I'm trying to learn F# by writing things first in C# and then converting them to F#.

I wanted to learn how compilers work, so I tried to write an arithmetic expression evaluator as if it were a small language.

I put some limits on my C# code:

  • Recursion only
  • Immutable only
  • Static methods only
  • Immutable Linked List (I created one)

The first thing I wrote is the Lexer: it converts a string like “30 - 10 * 5 % 3 + 20” into a list of tokens.
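For example, that input should conceptually produce something like this (illustrative, using the F# types shown further down):

[ { Type = Float; Value = "30" }; { Type = Operation; Value = "-" }
  { Type = Float; Value = "10" }; { Type = Operation; Value = "*" }
  { Type = Float; Value = "5" };  { Type = Operation; Value = "%" }
  { Type = Float; Value = "3" };  { Type = Operation; Value = "+" }
  { Type = Float; Value = "20" } ]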

I wrote the function one-to-one in the two languages, and what I found out is that the F# code was more than 2.7 times slower (I ran the function 10M times):

17s: C#
45s: F#
31s: F# Tail Rec
25s: F# Tail Rec with Char list.

What I can see from the memory graph is that the Garbage Collector runs about 3x more often with the F# code.

So, my question is: can anyone help me figure out why this happens?

C#

using System.Collections.Generic;

namespace CalculatorCSharp
{
    public enum TokenType
    {
        Operation,
        Float
    }

    public class Token
    {
        public Token(TokenType type, string value)
        {
            Type = type;
            Value = value;
        }

        public TokenType Type { get; }
        public string Value { get; }
    }

    public static class Lexer
    {
        private static readonly List<char> _operations = new List<char> { '+', '-', '*', '/', '%' };
        private static bool IsOperation(this char c) => _operations.Contains(c);
        private static bool IsNumber(this char c) => c >= '0' && c <= '9' || c == '.';
        
        // Counts how many leading characters of the input belong to the number.
        private static int GetFloatLength(this string input, int length)
        {
            if (input.Length == 0 || !input[0].IsNumber())
            {
                return length;
            }

            return GetFloatLength(input[1..], length + 1);
        }

        // LinkedList<Token> here is the custom immutable list mentioned above,
        // not System.Collections.Generic.LinkedList<T>.
        public static LinkedList<Token> Tokenize(this string input)
        {
            if (input.Length == 0)
            {
                return LinkedList.Empty<Token>();
            }

            var head = input[0];

            if (head.IsOperation())
            {
                var token = new Token(TokenType.Operation, head.ToString());
                return input[1..].Tokenize().Add(token);
            }

            if (head.IsNumber())
            {
                var floatLength = input.GetFloatLength(0);
                var token = new Token(TokenType.Float, input[0..floatLength]);
                return input[floatLength..].Tokenize().Add(token);
            }

            return input[1..].Tokenize();
        }
    }
}

F#

module Lexer 

type TokenType =
    | Operation
    | Float

type Token = {
    Type: TokenType
    Value: string
}

let private operations = [|'+'; '-'; '*'; '/'; '%'|]

let private isOperation c = Array.contains c operations

let private isNumber c =
    (c >= '0' && c <= '9') || c = '.'

let rec private getFloatLength input length =
    match input with
    | "" -> length
    | _ ->
        match input.[0] with
        | c when isNumber c -> getFloatLength input.[1..] (length + 1)
        | _ -> length

let rec tokenize input =
    match input with
    | "" -> []
    | _ ->
        match input.[0] with
        | c when isOperation c ->
            let token = { Type = TokenType.Operation; Value = input.[0].ToString() }
            token :: (tokenize input.[1..])
        | c when isNumber c ->
            let floatLength = getFloatLength input 0
            let token = { Type = TokenType.Float; Value = input.[0..(floatLength - 1)] }
            token :: (tokenize input.[floatLength..])
        | _ -> tokenize input.[1..]
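For completeness, the tail-recursive variant from the timings above looks roughly like this (a sketch with an accumulator and List.rev, not the exact code I benchmarked):

let tokenizeTailRec (input: string) =
    // Accumulate tokens in reverse and flip once at the end,
    // so every recursive call is in tail position.
    let rec loop (input: string) acc =
        match input with
        | "" -> List.rev acc
        | _ ->
            match input.[0] with
            | c when isOperation c ->
                let token = { Type = TokenType.Operation; Value = input.[0].ToString() }
                loop input.[1..] (token :: acc)
            | c when isNumber c ->
                let floatLength = getFloatLength input 0
                let token = { Type = TokenType.Float; Value = input.[0..(floatLength - 1)] }
                loop input.[floatLength..] (token :: acc)
            | _ -> loop input.[1..] acc
    loop input []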

I hope the answer is not super obvious 🙂

The results are quite expected.

A Discriminated Union (DU) is not the same as an enum type.

An enum type is much simpler:

type TokenType =
    | Operation = 0
    | Float = 1
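One practical difference: pattern matches on the enum version need a catch-all case, because a .NET enum value can hold any underlying int (illustrative snippet):

let describe (t: TokenType) =
    match t with
    | TokenType.Operation -> "operator"
    | TokenType.Float -> "number"
    | _ -> "unknown" // required for enums; a DU match would be complete without it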

To check what the compiler generates, you can use the extremely helpful SharpLab.

Here is a link to the DU version: TokenTypeDU

and here is the enum version: TokenTypeEnum

F# records are also not the same as C# classes:

Differences Between Records and Classes

type Token(value:string, token:TokenType) =
    member __.Type = token
    member __.Value = value

F# classes
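If allocation is the concern, a struct record may also be worth trying (just an idea, not something I benchmarked here):

[<Struct>]
type Token = {
    Type: TokenType
    Value: string
}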

Side note: for creating benchmarks you can use BenchmarkDotNet.
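A minimal F# setup could look like this (a sketch; the class and method names are mine):

open BenchmarkDotNet.Attributes
open BenchmarkDotNet.Running

type LexerBenchmarks() =
    [<Benchmark>]
    member _.TokenizeShort() = Lexer.tokenize "30 - 10 * 5 % 3 + 20"

[<EntryPoint>]
let main _ =
    BenchmarkRunner.Run<LexerBenchmarks>() |> ignore
    0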

Hi,

thanks for the quick response, I learned something new!

However, I updated the code to use enums and classes and gained just 2 seconds, so 43s.

Just to be sure, I swapped my custom list for FSharpList<> in the C# version (thanks for SharpLab), and now the C# version is even faster.

There must be something fundamentally wrong in my F# code.

However, changing the char check to the following shaved off another 10 seconds (35s).
I couldn't find a way to declare a constant list, so I used string literals instead:

[<Literal>]
let Operations = "+-*/%"
[<Literal>]
let Numbers = "1234567890."

let private isOperation (c: char) = Operations.Contains(c)

let private isNumber (c: char) = Numbers.Contains(c)

With this change I get:
16s: C#
35s: F#
25s: F# Tail Rec
19s: F# Tail Rec with Char list.
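Another idea I might try is dropping the lookup entirely in favor of direct comparisons (untested):

let private isOperation c =
    c = '+' || c = '-' || c = '*' || c = '/' || c = '%'

let private isNumber c =
    (c >= '0' && c <= '9') || c = '.'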

Can you put the benchmark project in a GitHub repo or something?

Sure!

Here it is: https://github.com/matteobortolazzo/FunctionalArithmeticEvaluator

Though I don't know why the "char list" version is slower under BenchmarkDotNet…

BTW thanks

Perfect.

  1. You’re still using F# records.
  2. To understand where the difference lies, you can compare the generated IL.

And then you can easily notice that input.[1..] generates something very suspicious, while the C# version is clean and uses Substring internally.

Also, the F# version pattern-matches against a string, while the C# version just compares the string's length.
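Roughly the shape of those two changes (my sketch, not the literal PR diff):

let rec tokenize (input: string) =
    // Compare the length instead of matching against "",
    // and call Substring directly instead of slicing.
    if input.Length = 0 then []
    else
        let head = input.[0]
        if isOperation head then
            let token = { Type = TokenType.Operation; Value = string head }
            token :: tokenize (input.Substring 1)
        elif isNumber head then
            let floatLength = getFloatLength input 0
            let token = { Type = TokenType.Float; Value = input.Substring(0, floatLength) }
            token :: tokenize (input.Substring floatLength)
        else
            tokenize (input.Substring 1)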

I’ve opened a PR with these changes. Should be faster than it was.

To create a better benchmark, you can add more cases (not only a simple short string but also a big one, plus some corner cases).

Thank you for the PR and the time you spent on this!!
I moved back to records because, as I tested later, there is no significant performance difference in this particular application.

The real problem was F# string slicing! As you said, C#'s input[x..] is very different from F#'s input.[x..].
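For anyone skimming: the fix boils down to substitutions like this (illustrative):

// Before: F# slicing, the slow path in this benchmark
let rest = input.[1..]
// After: plain Substring, same result
let rest = input.Substring 1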

So here are the end results:

Now F# is faster than C#!!

I just wonder why string slicing behaves so differently, and why the F# version is so inefficient when a simple Substring fixes it; presumably the slicing behavior is worth it for other kinds of applications.

Thank you again for your time.
I had fun doing this and I learned a lot!
