With the recent release of Anthropic’s Claude 3.5 Sonnet, the company has gained a temporary lead in a long-standing AI race. However, while companies like Anthropic and OpenAI fall over themselves to improve their models’ capabilities, Databricks appears to be taking a different approach by focusing on compound AI systems.
At the recent Data+AI Summit, Databricks announced several updates to its Mosaic AI platform, allowing its customers to build their own compound AI systems.
Speaking to AIM, Databricks’ vice president of field engineering APJ, Nick Eayrs, said that compound AI systems offer huge value compared to building a single large model.
“How do we make these models better at integrating with other systems, both upstream and downstream? How do we build tools around the models so that they can operate in these compound systems to provide better insights and capabilities for customers and citizens?” he asked.
Where Do Compound AI Systems Fit into the Ecosystem?
While there are comparisons to multimodal systems like GPT-4o, “compound AI systems” is a much broader term. It encompasses multimodal systems as well as other capabilities, such as combining multiple AI models and techniques for better and more complex reasoning.
“We believe that compound AI systems will be the best way to maximise the quality, reliability, and measurement of AI applications going forward, and may be one of the most important trends in AI in 2024,” said Databricks co-founder and CTO Matei Zaharia.
This is unsurprising, as many AI companies are slowly pivoting towards offering enterprise AI services. Among them is Microsoft, which recently killed GPT Builder in Copilot for consumers to focus on the enterprise and commercial sectors.
Databricks’ Data Intelligence platform is meant to help companies make the best use of their data. As Databricks CEO Ali Ghodsi put it, the aim is essentially to democratise a company’s data, giving anyone within the company access to it while also making sure that the data is not at the mercy of outside vendors.
Data Governance is Key
For all of this to work, data is key: not just the data itself, but also how it is formatted and labelled.
Using financial data and software company FactSet, which is a Databricks client, as an example, Databricks co-founder and VP of engineering Patrick Wendell pointed out that the company has a huge amount of data for existing queries “with labelled English examples so they could tune a model that understands their data extremely well”.
Another major announcement from the summit came in the form of Databricks officially open-sourcing Unity Catalog. This lets companies standardise their data, making for more accurate training and information retrieval.
Will Databricks’ Focus on Democratising AI Prevail?
However, with Databricks effectively coming out with an entire ecosystem for companies to use in the name of democratising AI systems, this could change slowly.
With factors such as industries pivoting towards AI use, the need to use data effectively, and concerns about commercial data privacy all combining, Databricks has managed to corner an untapped market.
But Zaharia is correct in saying that this has slowly become a massive trend in 2024. While Databricks has focused on ensuring that everything remains democratised and largely accessible, this doesn’t stop other AI companies from leveraging their technology and pushing their own formats for enterprises to use.
However, unlike with its Tabular acquisition, the company may not be able to enforce its democratic AI ideal as big tech companies venture into the domain.
This may not necessarily be a bad thing. By putting democratic AI first and foremost, and by tapping the market early, the company has essentially set a standard for how things need to be done when it comes to leveraging compound AI systems.
Even if larger companies try to commercialise these systems in their own formats, Databricks’ early cornering of the market could help it remain in the lead for years to come.