Is F# development and promotion lagging and ignoring data science?

I was checking the MBrace project (mbrace.io) for cloud computing with F#/C#. The project seems to be either dead or barely maintained. The latest ROADMAP update is over 3 years old and most of the git commits seem to be over one year old, which does not seem right for a young project. While Spark has risen sharply in use and has greatly helped the adoption of Scala, MBrace seems to be going the way of the dodo.
I have always just enjoyed the elegance of coding in F#, but prohibitive issues have remained constantly present over the last 6-7 years. Recently, I was talking about F# to a data scientist and researcher and he posed several questions:

  • Do I have reliable and well-documented access to libraries and packages for exploration, visualization, and modeling?
  • Do I have reliable and well-documented utilities for doing GPU programming?
  • Do I have reliable and well-documented access to tools like Hadoop and Spark?
  • Do I have reliable and well-documented tools for utilizing Keras, CNTK, TensorFlow, and the like?

I did not have good answers for any of them. First of all, even when you have something available for F#, documentation and examples are severely lacking. Most of the time, though, you don’t get anything native, or even an F#-friendly wrapper. You can use Mobius but it is primarily a C# API. You can use AleaGPU but its documentation and examples are still mostly geared towards C#. Accord.NET is a very capable package but it is still C#-ish. ML.NET and Encog are very interesting and capable but again C#-oriented. You are even likely to get a Python interface for a Microsoft product much sooner than an F# one.

Another point of worry is the insufficient levels of investment (money and people) by Microsoft. When you have a corporate giant giving birth to and purporting to support a technology, you kind of expect a lot more. Both cloud computing platforms (OneNet/Prajna and MBrace) seem to be goners, and they do not seem to have anything in the pipeline. The simple fact is, people doing things in their free-time for nothing, when the community is not so big and diverse, is simply not going to cut it. I just hope that they develop the willingness to throw some money and resources at critical and valuable projects. Letting go of big data and cloud computing competing with JVM products just does not seem to be very farsighted and wise.
BTW - I really do not see this as an opinion-based topic. Whether or not you like a BMW or whatever brand is an opinion but whether or not it has certain features to make it a safe automobile has a lot more to do with facts. Whether or not there is sufficient investment in a certain product by some company and whether appropriate tooling and support are present and being actively developed in some ecosystem to make it competitive for particular tasks/projects, should not be matters of personal taste and opinion.

2 Likes

ML.NET currently works with F#. However, you are correct that it is currently implemented with C# in mind. That being said, the team has an initiative to better support F#, and discussion is under way about how to do that. As for your concerns about Microsoft, I would not worry about them. F# is useful today, and it will be actively maintained by the community for the foreseeable future. It is supported in Microsoft’s latest tools and products such as Azure. I wouldn’t depend on Microsoft to dump large resources into F# any time soon but it’s hardly going away. The fact that the language is open means it is here to stay with or without Microsoft. There are even some here who would consider Microsoft’s gentle influence on the language a blessing.

4 Likes

I want to comment a bit on the ‘Microsoft’ part.

Over the last 10 years, Microsoft has made a huge shift from a very corporate-y for-profit entity, to contributing a lot of public effort towards open-source initiatives. Additionally, I don’t think you can really ‘measure’ the resources Microsoft is putting into F#. Sure, we see @cartermp and Don Syme regularly, but Microsoft is not particularly well-known for divulging the entirety of a (the) team. We may not see all the resources it puts in, because it does have internal workings.

Additionally, Microsoft’s involvement is less necessary due to the open-source setup F# follows. By using the RFC process, and open-sourcing the entire compiler and editing tools, Microsoft has allowed those using the language to make first-class changes to it. It is no longer considering only its own interests; it is actually allowing us, the community and developers using it, to put what we need into the language. I think the biggest reason we don’t see as much public Microsoft involvement is that, I’m 99% sure, they don’t use it internally. C# is more heavily invested in because it is the de facto tool for Microsoft work, as well as a very standard tool for third-party developers (us).

4 Likes

Also, how are we supposed to have ownership of our tools if we depend on Microsoft making all of them? If it’s “open source” and nobody reads it then it provides little value. Community ownership must naturally come from a community. We should rely on Microsoft for things we cannot provide for ourselves but not literally everything; the goal is interdependence, not codependence.

Tech-ppl being smart and-all-that, happen to usually kinda always miss the points other than tech. Every product requires momentum to take off. There can be hypes that generate momentum en masse under the right circumstances at the right times, but the majority of products and services require the support of an organisation, money, and dedicated expertise, for development, production, promotion, positioning, advertisement, marketing, tooling, support, …

In the breakneck, competition-infested capitalism we live in right now, even those with the proper jazz do not always get ahead or even remain competitive, and those without the appropriate bells and whistles are in many cases doomed. Now, granted, some things are lucky enough to generate hype for whatever reasons at certain times:

  • Java made life so much easier for developers, who suddenly flocked to it (+100 other reasons)
  • R/Python offering somewhat simple scripting solutions especially in data analytics (+100 other reasons)
  • Scala getting Spark and what-not hand-in-hand which became a star product which in-turn made Scala popular (+100 other reasons)

In any case, they happened at the right time offering the right solutions perhaps offering the best problem-solving package for the first time, in their kind of way. I admit, being free had and still has a lot to do with them taking-off. Being open-source and free of legal hurdles comes next. Should the same products be offered now, I doubt they would garner much interest. In any case, F# is over a decade old and if the so-called open-source community wanted to embrace it (or in fact FP as a whole), they would have done it by now - SIGH!

Lisp is a brilliant functional language and by some accounts at the very pinnacle - hate the whole parentheses clutter though - and it is from the 1950s. Where is it used now? How many people use it? How much is it used? How many really useful, widely used things have been developed in it? Are there enough jobs in it? Is it fulfilling its potential? It was more-or-less a niche then, it is so now …

For F# to become a success and reach its potential it requires excellent tooling - instead, it has gotten worse - e.g., compare VS 2017 and 2015. It needs to be taken seriously and properly supported as a first-class citizen by MS so other businesses and corporations feel comfortable and can trust it enough to use it. It needs true cross-platform support and tooling, as in an actually working .NET Core on Linux, Mac, etc. rather than Mono. It would go a long way if they offered the same polished VS cross-platform. I am not saying there need to be native libraries, but nice, polished, consistent, documented official F# wrappers would go a long way. Forget about the wrappers even, I will make do with proper documentation.

.NET is severely lacking when it comes to data analytics/data science tools and it is a huge and ever-growing market. Surprisingly, I haven’t seen F# wrappers for any of Accord.NET, ML.NET, CNTK, or Encog. They gave up on their own cloud-computing platform (OneNet/Prajna) and they did not take an interest in MBrace and that appears dead as well. They were perhaps right to do so given the momentum that Hadoop/Spark had/have, but perhaps they should have done a much better job at providing nice interfaces from .NET into Python or JVM based standard tools of the trade. FYI, Mobius also has no F# wrapper. You want concurrency, Orleans is your best bet and Akka is next, and both are, eh, C#.

You want to utilise your GPU, there is Alea (C#). The dominant AI tools that utilise the GPU, like Keras and PyTorch, are Pythonic with no production-ready .NET interface. TensorFlow and CNTK have C# interfaces but that’s it -> having seen the C# code for CNTK and TensorFlow, I can really understand the need for a nice easy-to-use interface like Keras that combines the capabilities of all major frameworks. There is no such thing in .NET and no interface to it in .NET. That’s just lame and lazy.

Then, there is the paramount need for libraries/packages to handle various tasks. What exists on .NET is simply insufficient. The strength of R and Python is their very rich ecosystem. They do suffer from the issues relating to open-source software developed by people who are by-and-large not professional developers, but they get the job done, and in the end, that is what counts.

Now, I can even forget about the wrappers if there was good documentation for doing stuff in F# but again, in many cases, I would have to resort to answers in C# and convert them to F#, which does not always work as well as one would like or need. Even when it comes to its own stuff, it’s lacking: a couple of months after F# 4.5, I was checking and I still couldn’t find any resource that could tell me why and in what situations I should use ValueOptions rather than Options.

All this being said, when we are talking about tools that want/need/depend on parallelism and concurrency - as with Machine Learning and AI, Cloud Computing, GPU Computing, Data Processing, … - F#, with its functional-first nature, would have been a much more suitable choice for all such tools to be developed in. There are also proprietary software packages that remain popular for both support and legal reasons. SAS and MATLAB are really outdated and clunky yet widely used, and a commercial Microsoft platform could do really well there. F# could easily be the backbone there.

Developing such interfaces, frameworks, libraries, documentation and self-learning resources takes a lot of man-hours and resources that a relatively small or even a mid-size community cannot handle on its own. Microsoft is a giant valued at over half a trillion dollars, sitting atop tens of billions of dollars in cash. What’s wrong with spending a few million to promote a product that is going to increase their own market penetration?

F# is a beautiful, elegant, succinct, simple, concise language. It has a ton of exhilarating, great core features (algebraic types, computation expressions, async workflows, type providers, parallel sequences and arrays, lazy eval, etc). It is an excellent teaching language and fun to learn. With its functional nature, it is well suited to the multi-core nature of the moment and the future. Parallelism, concurrency, and reactive coding come naturally to it. No design patterns, simple functions, fewer errors, … but it needs proper, well-placed investment and support.

1 Like

Even when it comes to its own stuff, it’s lacking: a couple of months after F# 4.5, I was checking and I still couldn’t find any resource that could tell me why and in what situations I should use ValueOptions rather than Options.

Which resources were you looking at? The official documentation on ValueOptions has mentioned when you would use one since June of last year - coinciding with the Preview release and two months before the full release of F# 4.5.

2 Likes

Well, I would rather not pointlessly argue. I clearly recall it was sometime after the release, I looked it up and could not find anything decent. Under certain conditions structs enjoy much better performance than classes. The way I see it, since a ValueOption is a struct, similar to a primitive type, small, immutable, and usually not subject to frequent boxing/unboxing, it should almost always perform better than an Option, which I presume is a class. If that is the case, it should simply be mentioned as such on the page, and if there are exceptions or cases where an Option might be better, then that should be made clear as well. It is a functional-first programming language - inherently a strong mathematical and logical structure - and there is no need for vagueness.
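
For what it is worth, the struct-versus-class presumption above can be checked directly with reflection. The snippet below is only a minimal sketch: it assumes an FSharp.Core version that ships `ValueOption` (F# 4.5 or later), and it says nothing about performance, only about how the two types are represented.

```fsharp
// Ask the runtime whether each option flavour is a value type (struct).
let optionIsStruct = typeof<Option<int>>.IsValueType
let valueOptionIsStruct = typeof<ValueOption<int>>.IsValueType

printfn "Option<int> is a struct: %b" optionIsStruct
printfn "ValueOption<int> is a struct: %b" valueOptionIsStruct
```

`Option` is a reference type, so the first line reports `false` and the second `true`. Whether that translates into a speed-up in a given program is a separate question that depends on copying and boxing.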

1 Like

Check the text below from the page. It leaves to the developer the task of drilling into the abstraction to determine the underlying type and whether a struct is preferable to a class. What is wrapped in my ValueOption/Option and how big is it? Is it going to be copied much? Is it going to be boxed/unboxed much? etc … Adding to the issue: What exactly are the big/small/limit-number of copies/boxings/unboxings/… that determine whether it is best to use a struct or a class? Frankly, I like the minimalistic approach of F# - Less Is More! The smart lads who spent time and effort adding this feature probably had much clearer and better explained reasons for doing this than what is on that page. I would rather see that than some neither-here-nor-there intro.

From the Value Options page:

The Value Option type in F# is used when the following two circumstances hold:

  1. A scenario is appropriate for an F# Option.
  2. Using a struct provides a performance benefit in your scenario.

Not all performance-sensitive scenarios are “solved” by using structs. You must consider the additional cost of copying when using them instead of reference types. However, large F# programs commonly instantiate many optional types that flow through hot paths, because structs can sometimes yield better overall performance over the lifetime of a program.

I really do not know what to take from this last bit.
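
To make the two circumstances in the quoted passage a bit more concrete, here is a small sketch of the same function written against both types. `tryParseOption`, `tryParseValueOption`, and `orZero` are hypothetical names made up for this illustration; only `Some`/`None` and `ValueSome`/`ValueNone` come from FSharp.Core.

```fsharp
open System

// Reference-type flavour: each successful parse allocates a Some on the heap.
let tryParseOption (s: string) : Option<int> =
    match Int32.TryParse(s) with
    | true, n -> Some n
    | _ -> None

// Struct flavour: the wrapper is returned by value, with no heap allocation for it.
let tryParseValueOption (s: string) : ValueOption<int> =
    match Int32.TryParse(s) with
    | true, n -> ValueSome n
    | _ -> ValueNone

// Consuming a ValueOption mirrors consuming an Option, only the case names differ.
let orZero (v: ValueOption<int>) : int =
    match v with
    | ValueSome n -> n
    | ValueNone -> 0

printfn "%d" (orZero (tryParseValueOption "42"))
printfn "%d" (orZero (tryParseValueOption "not a number"))
```

This prints 42 and then 0. If such a function sits on a hot path and is called millions of times, the struct version avoids one small allocation per successful call; whether that is a net win still depends on how the results are copied and boxed afterwards, which seems to be the caveat the page is making.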

1 Like

On the other hand, I liked this one very much: Span Support.

@sashab If you want to discuss when to use an Option or a ValueOption please do that in a separate thread and without the aggression. Taken individually, your comments may be useful contributions to a number of conversations, but as a whole they are a stream of consciousness that no one would be able or willing to reply to.

.Net libraries with nice APIs are perfectly usable from F#. If any open source .Net libraries don’t have nice APIs it’s better to work to make them nice rather than writing wrappers.

2 Likes

Dear @charlesroddie,

Should you feel in-anyway-unpleasant about a topic, be it in form, content, length, names, people, etc, you are free to disregard such discussions and not read them. If you feel something is inappropriate, please document the facts and report the issues to the authorities, or if you are an authority take appropriate action and notify the concerned parties through appropriate private channels, rather than public schooling and scolding. Otherwise, while I thank you for kindly expressing your artistic critique of my writing, please keep your thoughts and opinions to yourself, more so in public. We certainly do not know each other well-enough for me to afford you such privilege.

First and foremost, I would like to emphasize my all-due-respect attitude. Second, your post/reply merited a detailed hard response. Third, F# is a technically-excellent language that is aptly suited for data-science/mathematical/statistical/number-crunching work, and despite the fact that F# 1.0 was released in 2004, in terms of ecosystem growth, market expansion, adoption, and tooling, it is barely keeping up with Julia, whose 1.0 release came in mid-2018. We can ignore the fact that Scala (an OOP-first language of similar age), despite clear and significant inferiority in a great many aspects, is the tool of choice for doing number-crunching in a functional way, but nonetheless, the sad fact is there. Now, we can accept the status quo or try to improve F#. Clearly, doing what we have been doing so far has not yielded the fruits we all desire and deserve. Logically, maybe we can try to ask new questions and take different approaches.

I like things to be precise so let’s break it down and go about it piece-by-piece.

If you want to discuss when to use an Option or a ValueOption please do that in a separate thread and without the aggression.

My latest few “responses” on the topic of ValueOptions were in direct response to @cartermp. Not “response” as in taking-a-position but as in discussing-to-discover. On one hand, I was taught better than to let a question posed directly to me go unanswered - it is simply extremely rude for me to do so. On the other hand, given that he is a moderator, a member of the Board of Trustees, an active highly-regarded developer in F#, C#, .NET, ML.NET, etc, I am sure he has more than enough authority and expertise to enlighten me further. Also, he would probably be the first to warn someone were they out of line.
You accused me of “aggression”, and while there is a certain subjectivity in assigning attributes to things, there is barely enough negative attitude in my posts to call it “frustration”. It is important not to carelessly and casually assign attributes to people and their thoughts, words, and actions. When we do, we best accompany that with strong evidence. I was deeply offended there.
The only thing that I noticed that might remotely be considered negative is: Tech-ppl being smart and-all-that, happen to usually kinda always miss the points other than tech. This is an expression of a common trend not an insult nor a fact as in “Dentists always miss tech points”. Of course, there are dentists who are tech-savvy and may even know how to code, but as a rule of thumb, tech is not their cup of tea and they view things as a medical professional would which naturally comes with years of practicing something all day long.

Taken individually, your comments may be useful contributions to a number of conversations, but as a whole they are a stream of consciousness that no one would be able or willing to reply to.

If you are a moderator, kindly clearly advertise that fact and express why and according to what rules and bylaws, you are giving me a warning. From my point of view (and based on the universal Principle of Innocence):

  1. All the posts in the thread are F#-specific and on-topic perfectly suitable for the forum.
  2. If I deviated from the topic of my own thread on ValueOptions, it was simply to discuss a point brought up by a moderator and a valued member of our community.
  3. I did not break any character/length limits. Unlike what the last couple of years have made us believe, important issues usually take more than “140” or “280” characters to address. In some cases, like this one, I think issues are numerous and complex and take a lot of attention, time, and focus to address - more than the 2-minute attention-span that is the norm nowadays.
  4. While I thank you for kindly expressing your artistic critique of my writing as “stream of consciousness that no one would read …”, please keep your thoughts and opinions to yourself as far as my person is concerned. We certainly do not know each other enough for us to allow one another any such privileges. This is another personal insult that you graciously afforded me.

Prior to learning to respect ideology, fate, race, colour, language, culture, opinions, etc of other people, we should learn to at least respect their very personal choices. Should you feel you do not like the topic, form, content, length, names, people, you have complete freedom to simply ignore such threads and not read any of it. If you feel something is inappropriate, kindly document the facts and report the issues to the moderators, or if you are a moderator take appropriate action and notify me through private channels, rather than publicly trying to school and scold.

.Net libraries with nice APIs are perfectly usable from F#. If any open source .Net libraries don’t have nice APIs it’s better to work to make them nice rather than writing wrappers.

Thank you for pointing these out. I am well aware of them just like any JVM libraries are accessible from Scala or Kotlin. That does not mean it is optimal or nice to have to work with something with API and design in another language. I do not recall once mentioning .NET libraries that do not have nice APIs. In fact, since they are mostly developed by professional developers, they tend to be of the highest quality. Wrappers are a subject unrelated to the quality of API. We need wrappers for utilities we use so that we would not have to switch to idiomatic C# so frequently.
I am not sure if you wanted to convey the meaning: “Go do something about it and improve .NET libraries.” But we always need to consider the fact that users far outnumber developers in any area and their feedback and criticism must be carefully considered and cherished. Even if I (or anyone else for that matter) am only posing questions/problems/criticism, they should be viewed in a positive light and taken seriously. F# was not made to be a hobby or a toy or a source of amusement for computer scientists. At the end of the day it was made with the goal to become a real tool for real users to solve real problems in a quick and efficient manner …

1 Like

As one of the “smart lads” who spent the time and effort adding and shipped the feature, and the author of the article, I can only repeat what the article mentions: the primary reasons for using a struct are not straightforward. They are not a catch-all speed dial for any scenario, and can often yield worse overall performance if misapplied. ValueOption, by being a struct, is subject to these limitations.

You can also check the history of the document about availability: https://github.com/dotnet/docs/commits/master/docs/fsharp/language-reference/value-options.md

3 Likes

Well, the way I see it, there is something awfully and utterly wrong if I have to think about 10,000 things when doing something as simple as selecting a data structure or type or container or whatever. If one checks the code of, say, the top 100 F# programmers, one can clearly see patterns as to where and why choices are made. Having clear rules of thumb is beneficial and I just cannot accept that you have to trace back through your code and do an analysis for hours just to decide between Option and ValueOption. There should be 5-10 things at most that you consider before making that choice; if it is more complicated than that, then something is seriously wrong. If we could distill them into “Rules of Thumb”, it might be beneficial.

1 Like

Performance is like this in any language. The rule of thumb for structs is that they should have a small memory footprint. 16 bytes used to be the magic number but with changing processors I don’t know if that’s still the case. However, there are edge cases that pop up as you get closer to the hardware and you’ll have to consider them just as you would in any language that gives you this level of control. Sometimes a rule of thumb, while pithy, can create an illusion of understanding that gets people to do something that they really should have read up on first. I would say the best rule of thumb is to avoid structs unless you understand specifically why they can help performance.

Apologies for resurrecting this dead thread I didn’t notice the time stamps. :frowning:

1 Like

Thank you for taking the time to provide direction. However, a vague statement such as “The primary reasons for using a struct are not straightforward. They are not a catch-all speed dial for any scenario, and can often yield worse overall performance if misapplied.”, not only is not helpful, but adds to the problem by its inherent lack of useful data. It is as if devised by a legal team somewhere just to offload any burden of responsibility and retain everything in a shroud of mystery.

In any case, the following excerpt from Microsoft Docs (https://docs.microsoft.com/en-us/dotnet/standard/design-guidelines/choosing-between-class-and-struct) provides a very decent rule-of-thumb that can be of use to most people:

✓ CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.

X AVOID defining a struct unless the type has all of the following characteristics:

  • It logically represents a single value, similar to primitive types ( int , double , etc.).
  • It has an instance size under 16 bytes.
  • It is immutable.
  • It will not have to be boxed frequently.
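
Applied to F#, those four criteria could look like the sketch below. `Celsius` is a hypothetical type invented for this illustration, and the size check uses the built-in `sizeof` operator; treat it as an illustration of the checklist, not as a benchmark.

```fsharp
// Hypothetical single-value type: one float field, so well under 16 bytes.
[<Struct>]
type Celsius = { Degrees: float }

// Records are immutable by default; "with" builds a modified copy.
let freezing = { Degrees = 0.0 }
let roomTemperature = { freezing with Degrees = 21.5 }

// Boxing a struct copies it to the heap, which the last criterion says to keep rare.
let boxed : obj = box roomTemperature

let size = sizeof<Celsius>
printfn "size in bytes: %d" size
printfn "original unchanged: %b" (freezing.Degrees = 0.0)
```

A type with many fields, mutable state, or frequent trips through `obj`-typed APIs would fail one or more of the criteria and is probably better left as an ordinary (reference-type) record or class.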