I’m a beginner in machine learning; I just completed the Stanford Coursera classes with Andrew Ng.
I’m now looking to play more with real-world examples. I’m a big fan of C#, as I have worked with it for many years. I was looking at F#, but the learning curve is just too steep for me to apply my novice ML understanding in F#, so I will have no choice but to start with Python first.
It would be great to have something similar to https://www.dataquest.io/ in order to improve the adoption of F# among ML beginners.
Do you have other ideas or training materials that I haven’t discovered yet?
I just completed that Andrew Ng course too. And I’m learning Python, although I would prefer to use F# for everything if I could.
I also plan to start this course soon: https://course.fast.ai/
I’m almost done reading this book (with F# examples), which I recommend: https://www.amazon.com/Machine-Learning-Projects-NET-Developers/dp/1430267674
And these links are helpful for anyone who decides to try ML.Net with F#:
Configure F# Interactive to use Microsoft.ML.Net | F# Snippets
Minimal AutoML Binary Classification Sample | F# Snippets
There are a few:
Of course, for Python or R, you get datacamp.com or dataquest.io and that’s due to their immensely large user base and commercial pull. You are not going to get that for F# or Julia. Sites like that make life much more pleasant as you can find professional training in almost any relevant application area quickly with real worked-out examples to follow or to utilise as solution templates. They are also very expensive to set up and maintain/update (content-wise).
The most important platform Microsoft rolled out in the form of ML.NET is strangely C#-ish and has no decent F# wrapper. It could have really helped F# adoption and growth the way Spark did with Scala, but with Microsoft, F# seems to be that ignored in-the-back child that never gets the really cool stuff.
Now to ML:
- Data preparation, as in transforming information (structured/unstructured) into structured data as input: cleaning data, manipulating data, handling erroneous data and missing values, … In most jobs, that’s like 50% (if you’re very lucky) to 90% of your time and effort. This bit --> get data and make it ready for analysis.
- Do exploratory and visual analysis. For a lot of jobs, this is where it ends: you provide simple statistics (measures of central tendency, dispersion, skewness, kurtosis, and the like), along with plots and graphs. Then you either compile them into a report or develop dashboards with live graphics that can be drilled up and down. This bit --> overview of data and what you might be able to do with it.
- Modelling - Inferential or Predictive, Frequentist or Bayesian, …?
3.a) Hypothesis Testing (t Test, F Test, A/B Testing, ANOVA, …), Regression, Classification, Clustering. These are your bread and butter unless you land in the hot seat at some cutting-edge mathematics/computer-science research centre at a university or at one of a few dozen companies like Microsoft, Google, …, Goldman Sachs, Lockheed Martin, …, etc.
3.b) GLMs, LDA, SVMs, Non-Linear Regression, Bayesian Graph Networks, ANNs, Deep Learning, … you could theoretically spend time endlessly learning algorithms one after the other until you turn into a skeleton, and there would still be new stuff coming out. Within each area of application, there are a few widely adopted techniques and methodologies you can hack together, and most people will only ever work in a few areas.
3.c) Most important is to know which class of models and techniques to use for each situation, what data you need for it and in what format to make it a viable input, what you get out and how to interpret important bits of output, and how to assess/monitor the performance of whatever tools you are employing (e.g., accuracy, precision, specificity, F1-statistic, ROC Curve, …).
- CLOUD? Azure, AWS + Keras, Tensorflow, ML.NET, Spark, Hadoop, etc
You go there simply because the size of your problem/data and the complexity of your solution mean it can no longer be handled by your local computer’s resources, such as RAM and processing power. You just have to use different syntax and write a few more lines to access the cloud resources rather than your PC or local server.
- Your end result is invariably a report or a dashboard or such that is, most of the time, dry and devoid of any formulas, messy tables, numerous numbers, or crowded graphics. The target audience is rarely an ML expert, and mathematically or programming-savvy people are in general not that many.
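To make the performance measures named in the modelling step above (accuracy, precision, specificity, F1) a bit more concrete, here is a minimal sketch in plain Python of how they fall out of the four confusion-matrix counts for a binary classifier. The function names and the toy labels are made up for illustration; in practice you would most likely take these from a library rather than hand-roll them.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives/negatives for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Return common binary-classification metrics as a dict.

    A ratio with a zero denominator is reported as 0.0 (a convention
    chosen for this sketch, not a universal rule).
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)    # of predicted positives, how many are right
    recall = ratio(tp, tp + fn)       # of actual positives, how many were found
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),  # of actual negatives, how many were found
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels, invented for this example only.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 1]
for name, value in classification_metrics(actual, predicted).items():
    print(f"{name}: {value:.3f}")
```

Which of these numbers matters most depends on the problem: accuracy can look fine on imbalanced data while precision or specificity tells a very different story.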
So for this whole workflow, I find that F# is still lacking: lack of libraries, lack of native libraries, lack of documentation, lack of article-quality graphics facilities, and no interactive visualisation, meaning no dashboards or reports. You can do a simple y ~ x regression on artificial data in R, Python, and F# (ML.NET), then decide which looks and feels simpler/faster. Data analytics by and large is still scripting and prototyping, and in that R/Julia/Matlab still outshine full-blown general-purpose programming languages like F# or Python. On the positive side, F# is brilliant for data prep: manipulating/merging data sources and producing inputs. I usually move on to R/Python for the rest of the workflow. Also, F# code generally runs much faster than pure R or Python code, more akin to C#.
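For anyone who wants to try that comparison, the Python side of a simple y ~ x regression on artificial data could look roughly like the sketch below. It uses only the standard library and the closed-form least-squares formulas; the helper names, the "true" intercept and slope (1.0 and 2.0), the noise level, and the seed are all arbitrary choices for this example, not anything from a particular library.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def make_artificial_data(n=200, intercept=1.0, slope=2.0, noise_sd=0.5, seed=42):
    """Generate y = intercept + slope * x + Gaussian noise, reproducibly."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


xs, ys = make_artificial_data()
b0, b1 = fit_line(xs, ys)
print(f"estimated intercept: {b0:.3f}, estimated slope: {b1:.3f}")
```

The estimates should land close to the values used to generate the data. Writing the same thing in R and in F# with ML.NET, and comparing how much ceremony each needs, is the exercise suggested above.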
F# has great potential. Its syntax sits nicely with math, stats, and data science/analytics in theory and practice. FP makes life simple because things are immutable and you only need functions everywhere, maybe some procedural code peppered here and there, rarely OO, which makes everything simple and easy to parallelize. For me, writing F# is somewhat therapeutic and meditative, I admit: I find it wonderfully intuitive and fun, with a minimalistic tendency that I always appreciate. And to be honest, I think FP as a whole has an uptake issue, and F# is perhaps doing better than most if not all FP languages. Scala is an OO language with FP sugar added on, after all, and it is on the JVM, which gives it a bigger user base anyway, especially in the open-source community where all the hype usually resides. I personally hope that once .NET 5 settles down, perhaps within 1-2 years of its introduction, F# finally settles down as well and gets all it deserves in terms of tooling and support from Microsoft and the community, along with much larger adoption. It is a brilliant language for math/stats/data, scripting, prototyping, production, whatever … my overall wildly vain hope/dream is for Julia/F# to replace R/Python …