Hi all,
I am new to this forum and to F#.
I am trying to learn F# by first writing the things I want to learn in C# and then converting them into F#.
I wanted to learn how compilers work, so I tried to write an arithmetic expression evaluator as if it were a small language.
I put some limits on my C# code:
- Recursion only
- Immutable only
- Static methods only
- Immutable Linked List (I created one)
The first thing I wrote is the Lexer: it converts a string like “30 - 10 * 5 % 3 + 20” into a list of tokens.
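Concretely, the token list I expect for that input is the following (whitespace is skipped; the record type is the F# one defined further down):

    [ { Type = Float; Value = "30" }
      { Type = Operation; Value = "-" }
      { Type = Float; Value = "10" }
      { Type = Operation; Value = "*" }
      { Type = Float; Value = "5" }
      { Type = Operation; Value = "%" }
      { Type = Float; Value = "3" }
      { Type = Operation; Value = "+" }
      { Type = Float; Value = "20" } ]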
I wrote the function in both languages as close to 1:1 as I could, and what I found is that the F# code was roughly 2.6 times slower (I ran the function 10M times):
- C#: 17 s
- F#: 45 s
- F#, tail-recursive: 31 s
- F#, tail-recursive with a char list: 25 s
What I can see from the memory graph is that the garbage collector runs about 3x more often with the F# code.
So my question is: can anyone help me figure out why this happens?
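For reference, my measurement loop is roughly this (a sketch; the real harness may differ in details):

    open System.Diagnostics

    // Runs tokenize 10M times on the sample input and prints the elapsed time.
    let benchmark () =
        let sw = Stopwatch.StartNew ()
        for _ in 1 .. 10_000_000 do
            "30 - 10 * 5 % 3 + 20" |> tokenize |> ignore
        sw.Stop ()
        printfn "Elapsed: %O" sw.Elapsed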
C#
using System.Collections.Generic;

namespace CalculatorCSharp
{
    public enum TokenType
    {
        Operation,
        Float
    }

    public class Token
    {
        public Token(TokenType type, string value)
        {
            Type = type;
            Value = value;
        }

        public TokenType Type { get; }
        public string Value { get; }
    }

    public static class Lexer
    {
        private static readonly List<char> _operations = new List<char> { '+', '-', '*', '/', '%' };

        private static bool IsOperation(this char c) => _operations.Contains(c);

        private static bool IsNumber(this char c) => c >= '0' && c <= '9' || c == '.';

        private static int GetFloatLength(this string input, int length)
        {
            if (input.Length == 0 || !input[0].IsNumber())
            {
                return length;
            }
            return GetFloatLength(input[1..], length + 1);
        }

        // LinkedList here is my own immutable linked list,
        // not System.Collections.Generic.LinkedList.
        public static LinkedList<Token> Tokenize(this string input)
        {
            if (input.Length == 0)
            {
                return LinkedList.Empty<Token>();
            }
            var head = input[0];
            if (head.IsOperation())
            {
                var token = new Token(TokenType.Operation, head.ToString());
                return input[1..].Tokenize().Add(token);
            }
            if (head.IsNumber())
            {
                var floatLength = input.GetFloatLength(0);
                var token = new Token(TokenType.Float, input[0..floatLength]);
                return input[floatLength..].Tokenize().Add(token);
            }
            return input[1..].Tokenize();
        }
    }
}
F#
module Lexer

type TokenType =
    | Operation
    | Float

type Token = {
    Type: TokenType
    Value: string
}

let private operations = [| '+'; '-'; '*'; '/'; '%' |]

let private isOperation c = Array.contains c operations

let private isNumber c =
    match c with
    | c when c >= '0' && c <= '9' || c = '.' -> true
    | _ -> false

let rec private getFloatLength input length =
    match input with
    | "" -> length
    | _ ->
        match input.[0] with
        | c when isNumber c -> getFloatLength input.[1..] (length + 1)
        | _ -> length

let rec tokenize input =
    match input with
    | "" -> []
    | _ ->
        match input.[0] with
        | c when isOperation c ->
            let token = { Type = TokenType.Operation; Value = input.[0].ToString() }
            token :: (tokenize input.[1..])
        | c when isNumber c ->
            let floatLength = getFloatLength input 0
            let token = { Type = TokenType.Float; Value = input.[0 .. floatLength - 1] }
            token :: (tokenize input.[floatLength..])
        | _ -> tokenize input.[1..]
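For completeness, the tail-recursive F# variant from the timings above looks roughly like this (a sketch reusing isOperation and isNumber from the module above, with the number-length helper inlined; my benchmarked version may differ slightly):

    let tokenizeTail (input: string) =
        // The inner loop carries an accumulator so every recursive call is in
        // tail position; the list is built in reverse and flipped at the end.
        let rec numberLength (s: string) i =
            if i < s.Length && isNumber s.[i] then numberLength s (i + 1) else i
        let rec loop (s: string) acc =
            if s.Length = 0 then List.rev acc
            else
                let head = s.[0]
                if isOperation head then
                    loop s.[1..] ({ Type = Operation; Value = string head } :: acc)
                elif isNumber head then
                    let n = numberLength s 0
                    loop s.[n..] ({ Type = Float; Value = s.[0 .. n - 1] } :: acc)
                else
                    loop s.[1..] acc
        loop input []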
I hope the answer is not super obvious.