At its annual user conference, Snowflake Summit 2024, Snowflake today announced Polaris Catalog, a vendor-neutral, open catalog implementation for Apache Iceberg. Iceberg is an open standard of choice for implementing data lakehouses, data lakes, and other modern architectures.
Snowflake plans to open-source Polaris Catalog in the next 90 days, giving enterprises and the entire Iceberg community new levels of choice, flexibility, and control over their data. It comes with full enterprise security and Apache Iceberg interoperability with Amazon Web Services (AWS), Confluent, Dremio, Google Cloud, Microsoft Azure, Salesforce, and more.
Apache Iceberg graduated from incubation to become a top-level Apache Software Foundation project in May 2020 and has since surged in popularity to become a leading open-source data table format.
With Polaris Catalog, users now gain a single, centralised place for any engine to find and access an organisation’s Iceberg tables with full, open interoperability.
Polaris Catalog relies on Iceberg’s open-source REST protocol, which provides an open standard for users to access and retrieve data from any engine that supports the Iceberg REST API, including Apache Flink, Apache Spark, Dremio, Python, Trino, and more.
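To give a rough idea of what "any engine that supports the Iceberg REST API" means in practice, the sketch below builds the kind of URLs a client would request from a REST catalog. The route shapes follow the publicly documented Iceberg REST catalog OpenAPI specification. The host name, prefix, namespace, and table names are hypothetical placeholders, not Polaris Catalog values. No network calls are made, and authentication is left out.

```python
from urllib.parse import quote

# Multi-level namespaces are joined with the unit separator (0x1F) in URL paths,
# as described in the Iceberg REST catalog OpenAPI specification.
NAMESPACE_SEPARATOR = "\x1f"


def config_url(base_url):
    """URL a client fetches first to read catalog-level configuration."""
    return f"{base_url.rstrip('/')}/v1/config"


def namespace_path(levels):
    """Percent-encode a namespace such as ["sales", "emea"] for use in a path."""
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")


def list_tables_url(base_url, prefix, levels):
    """URL a client would GET to list the tables in one namespace."""
    root = base_url.rstrip("/")
    return f"{root}/v1/{prefix}/namespaces/{namespace_path(levels)}/tables"


def load_table_url(base_url, prefix, levels, table):
    """URL a client would GET to load one table's metadata."""
    return f"{list_tables_url(base_url, prefix, levels)}/{quote(table, safe='')}"


# Hypothetical values, for illustration only.
BASE = "https://catalog.example.com/api"
print(config_url(BASE))
print(list_tables_url(BASE, "demo", ["sales", "emea"]))
print(load_table_url(BASE, "demo", ["sales", "emea"], "orders"))
```

Because every engine requests the same routes, the catalog does not need to know which engine is on the other end. That is the property the open REST protocol is meant to provide.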
“Organisations want open storage and interoperable query engines without lock-in. Now, with the support of industry leaders, we are further simplifying how any organisation can easily access their data across diverse systems with increased flexibility and control,” said Christian Kleinerman, EVP of product, Snowflake.
Snowflake also said that organisations can get started within minutes by running Polaris Catalog hosted in Snowflake’s AI Data Cloud (the Snowflake-hosted option is coming to public preview soon), or self-host it in their own infrastructure using container technologies such as Docker or Kubernetes.
Since Polaris Catalog’s backend implementation will be open source, organisations can freely swap the hosting infrastructure without vendor lock-in.
To ensure Polaris Catalog can meet the evolving needs of the wider community and landscape, Snowflake is collaborating with the Iceberg ecosystem to drive the project forward. Part of what makes Apache Iceberg so powerful is its vibrant community of diverse adopters, contributors, and commercial offerings.