Why Does F# Rely On Ref Type Wrappers?

I was watching Don Syme talk about the F# compiler and was impressed that it implicitly promotes a mutable local to the heap (wrapping it in FSharpRef<'T>) when it escapes its stack frame:

let g () =
    let mutable x = 3
    let f () = x <- x + 1
    f // x escapes along with f, so the compiler promotes it to a heap-allocated FSharpRef<int>
But that weird corner case in which the compiler performed so gracefully got me thinking about all the reference type wrapping F# does by default.

First, I’d like to understand why Option<'T> is a reference type. I see Option<'T> as a textbook case for a struct rather than a class, but since ValueOption<'T> exists I feel I must be missing something. From the .NET design guidelines, a struct should be used when the type:

  • is immutable
  • is short-lived
  • logically represents a single value
  • is small (under 16 bytes)
  • is boxed infrequently

F# is immutable by default and this wrapper is typically short-lived: a method returns an option that is immediately decomposed with a match expression. The corner case of all corner cases would be for someone to declare an option as mutable and let it escape the context, but as shown above, the compiler handles that perfectly.
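The short-lived pattern described above looks something like this (a minimal sketch; the names are hypothetical):

```fsharp
// A typical short-lived option: produced by one call, consumed by the next match.
let tryParseAge (s: string) : int option =
    match System.Int32.TryParse s with
    | true, n when n >= 0 -> Some n
    | _ -> None

let describe s =
    match tryParseAge s with
    | Some age -> sprintf "age = %d" age
    | None -> sprintf "invalid age: %s" s
```

The Some wrapper exists only for the moment between the return and the match, which is exactly the lifetime profile the guidelines describe.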

ValueOption<'T> doesn’t automatically meet the 16-byte criterion, but it only adds a small tag on top of 'T. Option<'T> is a class with a single backing field for the payload (internal T item;), and its None case is compiled as a null reference, which is what pattern matching checks. Since value types always have a default value, ValueOption<'T> can’t use the null trick; instead it adds an extra 32-bit tag field to 'T’s size to distinguish ValueNone from ValueSome.
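For comparison, the struct flavor in use, assuming nothing beyond the standard ValueOption<'T> (alias 'T voption) in FSharp.Core:

```fsharp
// ValueSome/ValueNone ('T voption) keep the wrapper on the stack;
// Some/None ('T option) allocate a small object on the heap.
let tryHalve (n: int) : int voption =
    if n % 2 = 0 then ValueSome (n / 2) else ValueNone

let half =
    match tryHalve 10 with
    | ValueSome h -> h
    | ValueNone -> 0
```

The call site reads almost identically to the reference version; only the Value prefix and the allocation behavior differ.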

The same logic applies to single-case DUs. This ticks the remaining box on the when-to-use-a-struct checklist:

  • commonly embedded in other objects

type OrderID = OrderID of int gets embedded in type Order, but by default it heap-allocates a wrapper around the primitive. This behavior put me off using them, but I do see the value (I just spent a few hours debugging a method that was passed a userID instead of a groupID). I think I’ll use the struct version in the future.
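The struct version mentioned above is a one-attribute change (a sketch with hypothetical Order/UserID types):

```fsharp
// [<Struct>] keeps the single-case wrapper unboxed: OrderID is laid out
// inline inside Order instead of pointing at a heap-allocated wrapper.
[<Struct>]
type OrderID = OrderID of int

[<Struct>]
type UserID = UserID of int

type Order = { Id: OrderID; PlacedBy: UserID }

// The type checker still catches the userID-vs-orderID mix-up:
let lookupOrder (OrderID id) = sprintf "order %d" id
```

You keep the domain-modeling safety while the wrapper costs nothing at runtime.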

I’ll save regular DUs for another question. There’s a bit more to the struct version of that wrapper, and I’m not sure why it creates a backing field for each case (internal Case1 case1; internal Case2 case2;) instead of one shared field (internal T item;).
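To illustrate the layout being described, a minimal multi-case struct DU (note that F# requires uniquely named case fields here):

```fsharp
// A struct DU: the compiled struct carries a tag plus one backing field per
// case, so its size grows with the number of cases rather than the largest case.
[<Struct>]
type Measurement =
    | Celsius of c: float
    | Fahrenheit of f: float
```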

So, these are good and interesting questions. While I don’t have a complete answer for you, I can say that the answer is at least partially historical. F# first appeared on top of .NET 2.0 (and, indeed, retained compatibility with .NET 2.0 for quite a long time). The CLR looked very different back then, especially in terms of performance. This, in turn, drove many of the low-level design decisions (sub-classing versus delegates, reference type versus value type, et cetera). Over the years, as the runtime has changed, F# has tried to change with it without sacrificing backwards compatibility. That means things like adding ValueOption instead of changing Option, and so forth. There’s obviously a lot more involved in providing a complete answer to your questions, but hopefully this gives some useful context.

I get the legacy cost, but it still raises the question of whether compiling Option down to ValueOption could be added as a compiler optimization for new code.

Short of that, it’d be up to the community to push struct-based single-value wrappers over ref-based ones in the docs, advent calendars, etc. As someone new to the language, ValueOption is not all that visible. I only discovered it after scratching my head at some GC stats.

It’s primarily legacy, but the other thing to keep in mind is that the large majority of types in .NET are reference types, and the runtime was built to handle them well. In certain situations, like a struct wrapper for domain modeling (e.g., a single-case DU), structs are clearly better, but very often the performance characteristics depend on enough factors that it’s not obvious a struct is the right choice.

Sure, one DB call outweighs a thousand options. The point, though, is that the F# way of doing things, the pit-of-success style, is perfect for high-performance applications.

Just imagine you’re looking at F# for a real-time stream processor where latency is an issue. You do all the tutorials, read the docs, etc., and translate some C# code into F# for a quick performance test. You see extra latency spikes from GC, and you come away assuming that safety comes at the cost of performance.

I’m looking at doing a CAD plugin with F# after I finish my Discord bot. On the CAD forum, another dev was having some perf issues: his matrix transformation in C# was 100 times slower than the C++ version. The issue was entirely GC-related, in an operation that should never cause the GC to be invoked (reference-type points in freshly allocated arrays vs. struct points in an array pool).
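The struct-points-in-a-pool shape mentioned above can be sketched like this (hypothetical names; assumes System.Buffers.ArrayPool, with the caller responsible for returning the rented array):

```fsharp
open System.Buffers

[<Struct>]
type Point = { X: float; Y: float }

// Translate into a pooled array of struct points: no per-point objects and
// no fresh array allocation, so the hot loop does not feed the GC at all.
let translate (pts: Point[]) (count: int) (dx: float) (dy: float) : Point[] =
    let out = ArrayPool<Point>.Shared.Rent count
    for i in 0 .. count - 1 do
        out.[i] <- { X = pts.[i].X + dx; Y = pts.[i].Y + dy }
    out // caller returns it via ArrayPool<Point>.Shared.Return when finished
```

The reference-type version allocates one object per point plus a new array per call, which is exactly the kind of churn that shows up as GC latency spikes.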
