https://github.com/dotnet/machinelearning is now out on github and nuget. I get the feeling that it’s very early days, which means it’s a great opportunity for F# developers to make sure it works nicely with F#.
It is extremely early, yes. I can speak a bit more about the goals of ML.NET:
- First and foremost, be a good .NET API which meets table stakes for what developers need out of machine learning APIs now and in the future.
- Run where developers need it to run. GPU compute, FPGA compute, etc. Needs to run everywhere.
- Have good tooling support, somehow, in developer tooling that Microsoft delivers. Visual Studio is one such tool, but the team is well aware that ML work is not done primarily on Windows and VS.
- Be a good citizen for the .NET ecosystem, including F#.
The first point is the most important. From my point of view, a good .NET API is a good F# API. Proper naming, not exposing inheritance hierarchies, not exposing mutation, etc. When a good .NET API is delivered, it is much easier to have F# wrappers built atop it. When it’s a poorly-formed API, then wrappers are more challenging.
The second point is less interesting to me right now, because it’s really just a bunch of hard work that the team needs to do. It’ll come over time.
The third point is interesting, but I don’t think it’s a tremendous concern for F# (certainly not at this point). The team cares about building a complete and enjoyable API. I’d love to have amazing visualization tools that work everywhere too, but realistically there is not capacity to develop everything at once. I’ve been an advocate for focusing primarily on bringing up algorithms and exposing them with a pleasant .NET API above all else.
The fourth point is already in progress. For starters, I’ve been involved in some parts of ML.NET and will likely be involved further into the future. We also reached out to @mathias.brandewinder before the announcement went live, and we’re dedicated to keeping an open line of communication with folks who want to provide feedback and try things out to see how they “feel”.
Hopefully this helps shed a bit more light.
Excuse my maybe naive question, but is this going to be some kind of TensorFlow equivalent for .NET?
Have you seen the .NET bindings for TensorFlow?
.NET will also have a Tensor<T> eventually: https://blogs.msdn.microsoft.com/dotnet/2017/11/15/introducing-tensor-for-multi-dimensional-machine-learning-and-ai-data/
(Sorry, could not post two links in one post…)
I know there are some bindings in the meantime, but I’d still prefer to see a solution designed for .NET.
If you are looking for a TensorFlow alternative, I’d have a look at CNTK. @mathias.brandewinder built an F# API for it, and there is even a talk you can watch here which should give you a nice intro to his goals for the library.
If I understand @cartermp and MSFT correctly (I watched the Build keynotes plus the AI talk on day 3 - so that is not the most complete knowledge one can have) - ML.NET will be as much AI/deep learning/etc. as you can cram into the .NET space. It might be the thing you are looking for once it is “developer ready”. What you are using right now should not matter too much: the underlying concepts hold true regardless of the framework, and MSFT is trying to make switching frameworks as easy as humanly possible with ONNX.
Kevin linked in Slack / #datascience some code he wrote, converting one of the C# samples into F#:
A sample just reproduced from the ML.NET test cases: https://gist.github.com/kevmal/07c0a27ef83282e799cf4e3962311594
Pretty straightforward; you just need to make sure CpuMathNative.dll can be found, and the TextLoader wants fields specifically (records do not work).
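For reference, here’s a rough sketch of that field-based shape in F# (based on the 0.x `LearningPipeline` sentiment sample; the `Column` attribute arguments, namespaces, and `TextLoader` constructor are reconstructed from memory, so treat the names as assumptions):

```fsharp
open Microsoft.ML.Data
open Microsoft.ML.Runtime.Api

// TextLoader in ML.NET 0.x reflects over public *fields*, which is why an
// F# class with [<DefaultValue>] val mutable fields works where a plain
// F# record does not.
type SentimentData() =
    [<Column("0"); DefaultValue>]
    val mutable SentimentText : string

    [<Column("1", "Label"); DefaultValue>]  // maps this field to the Label column
    val mutable Sentiment : float32

// Hypothetical data path; the loader discovers the fields above via reflection.
let loader =
    TextLoader<SentimentData>("sentiment-data.tsv", useHeader = false, separator = "tab")
```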
I tried and have so far been unable to get the basic sample working (https://docs.microsoft.com/en-gb/dotnet/machine-learning/tutorials/sentiment-analysis). The script I created is here https://github.com/dotnet/machinelearning/issues/92. If anyone has any ideas what I’ve done wrong that would be great - completely unsure at the moment if it’s an F# thing, a script thing (ML .NET seems to use some reflection to dynamically load assemblies at runtime) or an ML .NET bug.
It works fine in C# in VS 2017 15.7.1. Of course, there might be a bug on the F# side. Are you definitely in an F# .NET Core context? If you’re in VS, make sure you select the F# .NET Core project type (new in 15.7, I think). I made that mistake initially with the C# project and IIRC got a similar kind of error.
Hi Kevin. As you can see from the sample code, it’s not a console app or project but just a pure standalone script.
I’m wondering if you have any ideas of what might be on the “F# side” - the only thing that’s F# is the code in the script; the DLLs are just the same ones that come from the NuGet package (or at least, they should be).
I haven’t had the time to play with this yet (//Build was crazy, lots of in-flight work to land right now), but how does it work for an F# console app on .NET Core? I would be wary of anything involving .NET Standard and F# scripts right now, given that we haven’t completed the work to properly handle things in that space yet.
Seems like it will work with classes. Related question:
It should work with records as long as they have the CLIMutable attribute on them. I don’t think that’s the problem.
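For context, the record shape being discussed would look something like this (a sketch only; whether the attribute targets and the compiler-generated properties satisfy ML.NET 0.1’s reflection is exactly what’s in question):

```fsharp
open Microsoft.ML.Runtime.Api

// CLIMutable makes the F# compiler emit a parameterless constructor and
// property setters for the record, which reflection-based libraries
// usually require in order to populate instances.
[<CLIMutable>]
type IrisData =
    { [<Column("0")>] SepalLength : float32
      [<Column("1")>] SepalWidth  : float32
      [<Column("2")>] PetalLength : float32
      [<Column("3")>] PetalWidth  : float32
      [<Column("4")>] Label       : string }
```

(The `Column` namespace and whether the attributes need an explicit `field:` or `property:` target are assumptions here.)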
@cartermp I should definitely test it in a console app, you’re right.
However, even if that does work, we need to understand why it doesn’t work in scripts. I’m doing F# scripts on .NET Standard elsewhere without a problem, which suggests that there must be something different about this ML package - whether it’s the reflection / dynamic loading that’s causing an issue or something else.
Either way - scripts are IMHO the #1 way that we should be approaching this, not console apps.
Yeah, it should work with records. I ran your code in an F# console app (with version 0.1.0 of ML.NET) and got the same error.
Thanks. I’ve reproed here and added https://github.com/dotnet/machinelearning/issues/180 as a result.
Mind you, the script version still fails, just now for a different reason (it can’t find some other assembly dynamically).
Finally got a bit of time to play with this. It’s… interesting.
Thanks @FoggyFinder for pointing out the StackOverflow example - I was trying on my own to convert the Iris sample, and I initially missed that these were fields and not properties. That’s an odd choice. I’ll need to dig into the docs, too, because I wonder how you would go about creating your own custom features; say you wanted a feature that is SepalWidth * SepalLength.
The thing which really puzzles me at that point is the pipeline where you just “add stuff”. I tried for fun to add the same thing twice, like
… and the whole thing explodes at runtime. It strikes me as odd to make everything, from data loaders to feature definitions to the algorithm itself, share one common interface.
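For context, the “just add stuff” pattern looks roughly like this (names taken from the 0.x `LearningPipeline` Iris sample as I remember it, with an `IrisData` input type assumed; the duplicated `Add` is a hypothetical version of the experiment that blows up):

```fsharp
open Microsoft.ML
open Microsoft.ML.Data
open Microsoft.ML.Trainers
open Microsoft.ML.Transforms

// Loaders, transforms and learners all go through the same Add method on
// the pipeline, so the compiler happily accepts duplicates - the failure
// only surfaces when the pipeline is actually trained.
let pipeline = LearningPipeline()
pipeline.Add(TextLoader<IrisData>("iris-data.txt", separator = ","))
pipeline.Add(ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"))
pipeline.Add(ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth")) // added twice: compiles fine, explodes at runtime
pipeline.Add(StochasticDualCoordinateAscentClassifier())
```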
There are some things I like so far - like the fact that it seems to do a lot of auto-tuning magically behind the scenes - but the overall design is a bit puzzling to me at this point.