I’ve been doing Java heavily for a few years, and last year I got a really good grip on Kotlin. My workplace has asked if I could learn R and machine learning. I just started doing that and it has piqued my interest.
I know there are a few Java libraries for machine learning and some book titles are slowly emerging this year on Java for machine learning. I’m not quite sure if this will cause a flood of ML users to switch to Java.
Part of me thinks that Java never really caught on for machine learning because it is cumbersome and not as tactical or nimble as Python, and it isn’t mathematics/statistics focused like R. I suppose it could just be catching up to the machine-learning movement that recently gained prominence.
But is it possible Kotlin could make the Java platform a practical alternative for ML instead of Python? I’ve noticed Kotlin is a highly nimble, tactical language without compromising performance and scalability. Do you think it could make a sizable entrance as a machine-learning solution, contingent on the right libraries being developed?
It might help a bit in the sense that it’s a better language than Java and the syntax is more suited to prototyping than Java’s. But for heavy numeric/scientific work on the JVM, the Valhalla project will be needed, as that’s how to get support for things like very wide numbers, complex numbers, etc., with acceptable performance.
Also, Kotlin can be a bit clunky for doing lots of math. If you’ve got some functions that really want Double and others that really want Float (e.g., all the trig functions in java.lang.Math take Double, and most of the Android drawing calls want Float), then you’re doing a ton of .toDouble() and .toFloat(). Kotlin, for better or for worse, has all sorts of places where you find yourself writing these sorts of explicit conversions. (Kotlin hides this in many common cases, and I know they’ve got good reasons, but… it still feels painful sometimes.)
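To make the conversion noise concrete, here is a minimal sketch. The function names are hypothetical stand-ins for a Float-based drawing helper, not real Android calls. The first version goes through java.lang.Math and needs a conversion in each direction; the second assumes a Kotlin version where kotlin.math provides Float overloads of sin and cos (1.2 or later), which removes the conversions for this particular case.

```kotlin
import kotlin.math.cos
import kotlin.math.sin

// Hypothetical helper standing in for code that feeds a Float-based drawing API.
// java.lang.Math.cos/sin only accept Double, so we convert in and back out.
fun pointOnCircle(radius: Float, angleRadians: Float): Pair<Float, Float> {
    val x = (radius * Math.cos(angleRadians.toDouble())).toFloat()
    val y = (radius * Math.sin(angleRadians.toDouble())).toFloat()
    return Pair(x, y)
}

// Same computation using the Float overloads in kotlin.math: no explicit conversions.
fun pointOnCircleFloat(radius: Float, angleRadians: Float): Pair<Float, Float> =
    Pair(radius * cos(angleRadians), radius * sin(angleRadians))
```

The overloads only help where a library author has supplied them; mixing a Double-only Java API with a Float-only one still requires the explicit calls shown in the first function.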
I don’t know much of anything about ML, but I assume that most of the heavy lifting happens in libraries, and many of those will in turn want to dump the heavy lifting out to GPUs. Consequently, the code you have to write to make your ML do its thing might not be too bad at all, and Kotlin’s general-purpose integration with Java would enable you to work with any Java ML library. To pick one recent popular example, Google’s TensorFlow only natively supports APIs for Python and C++, but they tell you to go ahead and use SWIG to adapt it to whatever, and SWIG certainly supports Java…
Yeah, I was giving it some thought as I was studying yesterday and today. You are right that the libraries do a lot of the heavy lifting, and they are written in much lower-level languages than Java. I can see why the JVM would be clunky for processing pure numbers, but it could be useful for certain ML situations that are not as volume-intensive. It’s a shame, but I guess that is why there is no “one-size-fits-all technology”.
Yes! As expressive as Kotlin is, one of its few disadvantages (by design) is that it does not do implicit numeric conversion. Where similarly targeted languages do too much for you (and hide operations, which is also problematic), Kotlin requires you to be very explicit. @dwallach - I agree with you 100% on the first paragraph. However, having worked with Scala’s implicit system and seen what it does to the IDE, compilation, and complexity, I am mostly willing to pay the price of being explicit, even though in this day and age it often feels unnatural.
I think Java is more widely used for machine learning than you think, with libraries like Apache Spark, Weka, and even my own QuickML (shameless plug ;)).
The major advantage of Java is that you can implement your own ML algorithms and have them be very performant without having to resort to native code as Python does.
Of course, there are a lot of people who were forced to learn Java at university and have hated it ever since. I think the popularity of Python among the ML community has more to do with that than with any inherent advantage Python has over Java for ML.
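As a rough illustration of writing a numeric routine directly on the JVM with no native code, here is a small sketch in Kotlin. It is not QuickML's API or any library's method; fitLine is a hypothetical name, and the routine is plain ordinary least squares for a single feature, chosen only because it is short. It works on DoubleArray, which maps to a primitive double[] on the JVM.

```kotlin
// Hypothetical example: fit y = slope * x + intercept by ordinary least squares.
// Uses primitive arrays only, so the inner loop involves no boxing.
fun fitLine(xs: DoubleArray, ys: DoubleArray): Pair<Double, Double> {
    require(xs.size == ys.size && xs.size >= 2) { "need at least two paired points" }
    val meanX = xs.average()
    val meanY = ys.average()
    var covariance = 0.0
    var varianceX = 0.0
    for (i in xs.indices) {
        val dx = xs[i] - meanX
        covariance += dx * (ys[i] - meanY)
        varianceX += dx * dx
    }
    require(varianceX > 0.0) { "x values must not all be equal" }
    val slope = covariance / varianceX
    // Returns (slope, intercept).
    return Pair(slope, meanY - slope * meanX)
}
```

Calling fitLine on points that lie on y = 2x + 1 should recover a slope near 2 and an intercept near 1.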
That QuickML looks like an interesting library. I’ll need to play with that and see what models it supports.
Anyway, that’s what I was wondering. It confused me why Java couldn’t do what Python/R could do when it came to machine learning. The productionizing of ML especially left me with questions. I have never been a fan of dynamic typing anyway, and in my line of work exploration inevitably leads to putting findings in production (which further solidified my bias for JVM ML).
I’ll probably have to learn R anyway just to have that as a mature data science tool to measure against. But it would be cool if I am able to do ML with Kotlin at some point.
Wrappers and integrations are almost always possible.
The question is whether there is a way to effectively reuse data in memory with no re-allocation. That pitfall happens often enough within a single language / single package, let alone across varying languages and packages.
And I don’t know the space well enough yet, but if SWIG were a viable alternative for TensorFlow, then it would have been implemented, and maybe it has. A quick search shows this issue, which attempts a direct Java interface with no SWIG, which sounds like one less translation layer: Java interface · Issue #5 · tensorflow/tensorflow · GitHub
Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. During a machine learning course I learned that it focuses on developing computer programs that can access data and use it to learn for themselves.
Yes, I was giving it some thought while studying yesterday and today. You’re right that the libraries do a lot of the work, and that they are written in much lower-level languages than Java. I can see why the JVM would be clunky for processing pure numbers, but it might be useful for certain ML situations that aren’t as volume-intensive.
You can also opt in to using Arrow-Meta with the typeproofs plugin and have an internal @Coercion extension function that automatically converts from Float to Double and vice versa.
The JVM is actually excellent at processing numbers. You can look into some realistic benchmarks run by the ojAlgo team. The problem arises when you have to process generic numbers (something you can’t do in native code anyway).
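A small sketch of the distinction between primitive and generic numbers, using hypothetical function names. The first function works on DoubleArray, which is a primitive double[] on the JVM. The second accepts any Number subtype, so each element is a boxed object that has to be unwrapped through toDouble(). No timings are claimed here; how much the difference matters depends on the workload and the JIT.

```kotlin
// Primitive path: DoubleArray is a JVM double[], so the loop touches raw doubles.
fun sumPrimitive(values: DoubleArray): Double {
    var total = 0.0
    for (v in values) total += v
    return total
}

// Generic path: elements are boxed Number instances, unwrapped one by one.
fun <T : Number> sumGeneric(values: List<T>): Double {
    var total = 0.0
    for (v in values) total += v.toDouble()
    return total
}
```

Both return the same result for the same values; the difference is in how the data is laid out in memory and how each element is read.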