Python and R are undoubtedly the most widely used languages for machine learning, and yet there is no dearth of developers who use Java for the same purpose. In fact, the language is slowly catching up with Python.
Meanwhile, LinkedIn and Oracle released the Dagli and Tribuo frameworks, respectively, in 2020, both of which add to a Java machine learning ecosystem that also includes the Java Machine Learning Library (JavaML). The library gives users access to an extensive range of machine learning tools, along with wrappers and APIs for integrating different frameworks with Java.
How Java is used in ML
Java is a capable tool for many machine learning tasks. Users can create algorithms, build models, and launch applications with the language. Its main strength is flexibility: it can handle everything from preparing data to building models.
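As a rough illustration of that range, the sketch below goes from a data-preparation step to a fitted model using only the Java standard library. The class `SimpleLinearModel` and its methods are hypothetical names made up for this example and do not come from any framework; a real project would more likely rely on an established library.

```java
import java.util.Arrays;

// Hypothetical example class, not taken from any library.
class SimpleLinearModel {

    // Data preparation: rescale values to zero mean and unit variance.
    static double[] standardize(double[] values) {
        double mean = Arrays.stream(values).average().orElse(0.0);
        double variance = Arrays.stream(values)
                .map(v -> (v - mean) * (v - mean))
                .average()
                .orElse(0.0);
        double std = Math.sqrt(variance);
        if (std == 0.0) {
            return new double[values.length]; // constant input maps to all zeros
        }
        return Arrays.stream(values).map(v -> (v - mean) / std).toArray();
    }

    // Model building: fit y = slope * x + intercept by ordinary least squares.
    // Returns {slope, intercept}.
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired samples");
        }
        double meanX = Arrays.stream(x).average().orElse(0.0);
        double meanY = Arrays.stream(y).average().orElse(0.0);
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        return new double[] {slope, meanY - slope * meanX};
    }

    // Prediction with a fitted {slope, intercept} pair.
    static double predict(double[] model, double x) {
        return model[0] * x + model[1];
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};
        double[] model = fit(standardize(x), y); // prepare the data, then fit
        System.out.println("slope=" + model[0] + ", intercept=" + model[1]);
    }
}
```

The same three steps (prepare, fit, predict) are what a full framework wraps in richer abstractions; the point here is only that plain Java can express each of them directly.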
Evelyn Miller, data science lead at Magnimind Academy, said, “You should remember that Java gives support for development in any field you want, and data science is no different.”
Java also makes it easy for different parts of an application to communicate with its ML features. Using third-party open source libraries and frameworks, developers can do in Java what they would otherwise do in other languages. For instance, the open source library TensorFlow Java can run on any JVM for building, training, and deploying machine learning models.
Java also helps make the launch of machine learning applications smooth and offers libraries with specific tools for different tasks. Weka, a popular Java machine learning toolkit, provides a graphical interface for data preprocessing, modelling, and evaluation.
This library, developed by the University of Waikato, is nearly as old as the language itself. However, it remains among the most widely used libraries available, and its popularity continues to rise because of its flexible data mining tools.
Even big tech companies, including Google, Amazon, and Microsoft, are leveraging Java for machine learning. Google developers use Java for various applications; in fact, the Google Suite is reportedly built largely in Java.
Apart from Weka, Apache Mahout is another framework widely used by enterprises like Facebook, LinkedIn, Twitter, and Yahoo, mostly because it is scalable. Java can also manipulate complex data structures in ways that might not be as practical in Python.
Different frameworks approach this differently. For example, Mahout uses a distributed linear algebra framework, while ADAMS (Advanced Data mining And Machine learning System) uses a tree-like structure. This allows data to be manipulated in a variety of ways.
Adopting Java
There are an estimated 8-10 million Java developers in the world. Frank Greco, a senior consultant at Google, said in a talk, “All the big tech companies are interested to know more about using Java for ML.”
He, along with his peers, is working on promoting the language for ML. “Java’s role in ML will come as a revelation,” Greco said. His team has engaged with major players such as Twitter, Oracle, IBM, and Amazon.
Interest in using Java for ML is shared across these industry giants, all of whom want to explore how the language could be harnessed for it. “It isn’t a case of dismissing Java in favour of Python; instead, all are keen to understand Java’s potential in the ML realm,” he explained.
Greco built JSR 381, a Java-friendly API for visual recognition and a generic ML API that can be used for high-level abstractions. The API is not tied to any ML framework, so developers can choose the framework that best suits their needs.
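To make the idea of a framework-independent API concrete, here is a minimal sketch of the general pattern: application code depends on a small interface, and the implementation behind it can be swapped. Every name below (`Classifier`, `BrightnessClassifier`, `ClassifierDemo`) is hypothetical and invented for illustration. These are not the actual JSR 381 types, and the toy backend merely stands in for an adapter around a real ML framework.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical abstraction: callers see only this interface,
// never the framework that produces the scores.
interface Classifier<T> {
    Map<String, Double> classify(T input); // label -> score
}

// Toy backend standing in for a framework adapter (an assumption for this
// sketch): it scores a grayscale image by its average brightness.
class BrightnessClassifier implements Classifier<int[]> {
    @Override
    public Map<String, Double> classify(int[] grayPixels) {
        double sum = 0.0;
        for (int pixel : grayPixels) {
            sum += pixel;
        }
        // Pixels are assumed to lie in the range 0..255.
        double brightness = grayPixels.length == 0
                ? 0.0
                : sum / (255.0 * grayPixels.length);
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("light", brightness);
        scores.put("dark", 1.0 - brightness);
        return scores;
    }
}

class ClassifierDemo {
    // Application code written against the interface only, so a different
    // backend can be substituted without changing this method.
    static <T> String topLabel(Classifier<T> classifier, T input) {
        String best = "unknown";
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> entry : classifier.classify(input).entrySet()) {
            if (entry.getValue() > bestScore) {
                bestScore = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Classifier<int[]> classifier = new BrightnessClassifier();
        System.out.println(topLabel(classifier, new int[] {250, 240, 255}));
    }
}
```

Because `topLabel` knows nothing about how the scores are computed, replacing `BrightnessClassifier` with an adapter for another engine would leave the calling code untouched, which is the kind of choice the paragraph above describes.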
“The goal was to make visual recognition and ML easy to use by non-experts,” he said. Amazon implemented this API, and Greco says it is a good starting point for the language. He said, “I believe that with feedback from the community, we can move this forward.”