Google DeepMind announced the release of Gemma 2, the next generation of its family of open models, available in 9 billion (9B) and 27 billion (27B) parameter sizes.
The model is accessible on Google AI Studio, Kaggle, Hugging Face Models, and soon on Vertex AI Model Garden. Researchers can apply for the Gemma 2 Academic Research Program for Google Cloud credits, with applications open until August 9.
Gemma 2 offers significant improvements over its predecessor, including performance competitive with much larger proprietary models at a lower serving cost. The 27B model can run inference on a single NVIDIA H100 Tensor Core GPU or a single TPU host, reducing deployment costs.
The new models integrate easily with major AI frameworks: they are supported in Hugging Face Transformers, and in JAX, PyTorch, and TensorFlow through native Keras 3.0. Developers can deploy Gemma 2 on a range of hardware setups, from cloud-based environments to local CPUs and GPUs.
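As a minimal sketch of the Hugging Face Transformers path, the snippet below formats a prompt in Gemma's chat-turn convention and defines an inference helper. The model id `google/gemma-2-9b-it` and the `<start_of_turn>` delimiters are assumptions based on the Gemma model cards; verify both against the official documentation, and note that the gated weights require Hugging Face access approval.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma's chat-turn delimiters
    (assumed format; check the model card's chat template)."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )


def run_inference(user_message: str,
                  model_id: str = "google/gemma-2-9b-it",
                  max_new_tokens: int = 128) -> str:
    """Load the (assumed) instruction-tuned checkpoint and generate a reply.
    Requires `pip install transformers accelerate` and gated-model access."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(format_gemma_prompt(user_message),
                       return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

The same checkpoints can also be loaded through Keras 3.0 on a JAX, PyTorch, or TensorFlow backend; the helper above only illustrates the Transformers route.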
Gemma 2 is available under a commercially friendly license, encouraging innovation and commercialization. Google Cloud customers will be able to deploy and manage Gemma 2 on Vertex AI starting next month. Additionally, Google provides the Gemma Cookbook, offering practical examples for building and fine-tuning applications with Gemma 2.
Google emphasises responsible AI development with Gemma 2, incorporating robust safety processes, pre-training data filtering, and rigorous evaluation against bias and risk metrics. The LLM Comparator tool and the SynthID text-watermarking technology are part of these efforts.
The initial release of Gemma resulted in over 10 million downloads. Gemma 2 aims to support even more ambitious projects, with future plans to release a 2.6B parameter model to balance accessibility and performance.