Japanese AI startup Sakana AI has introduced EvoSDXL-JP, an image generation model built via Evolutionary Model Merge, which generates Japanese-style images 10 times faster than existing Japanese models. EvoSDXL-JP is now publicly available on Hugging Face for research and educational purposes, along with a demo for immediate testing.
The model supports Japanese and generates Japanese-style images, and was built by fusing several different open models. Compared with existing Japanese models, its inference is 10 times faster, yet it also performs better on benchmarks, the company said in a blog post.
According to Sakana AI, EvoSDXL-JP is capable of high-speed, low-cost image generation, making it an ideal model for easily trying out and experiencing generative AI. The company said it expects the model to be used in educational settings in Japan so that more people can enjoy the benefits of generative AI.
Sakana AI recently introduced “Evolutionary Model Merge,” an approach to model construction that uses evolutionary algorithms. The company says Evolutionary Model Merge is not limited to specific modalities and can, in principle, be applied to models of any modality.
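To give a rough sense of what merging models with an evolutionary algorithm can look like, here is a toy Python sketch. It is not Sakana AI's implementation: the function names (`merge`, `fitness`, `evolve_merge`) are hypothetical, each "model" is just a short list of numbers standing in for real weights, and the benchmark score is replaced by the distance to a made-up target. The search itself is a simple keep-the-best-child evolution strategy, chosen here only for illustration.

```python
import random


def merge(model_a, model_b, weights):
    """Blend two toy models parameter by parameter: w * a + (1 - w) * b."""
    return [w * a + (1.0 - w) * b for w, a, b in zip(weights, model_a, model_b)]


def fitness(params, target):
    """Toy stand-in for a benchmark score (higher is better)."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve_merge(model_a, model_b, target,
                 generations=200, children=20, sigma=0.1, seed=0):
    """Search for blend weights by mutating a parent and keeping improvements."""
    rng = random.Random(seed)
    parent = [0.5] * len(model_a)  # start from an even blend
    parent_score = fitness(merge(model_a, model_b, parent), target)
    for _ in range(generations):
        candidates = []
        for _ in range(children):
            # Mutate every weight with Gaussian noise, clamped to [0, 1].
            child = [min(1.0, max(0.0, w + rng.gauss(0.0, sigma)))
                     for w in parent]
            score = fitness(merge(model_a, model_b, child), target)
            candidates.append((score, child))
        best_score, best_child = max(candidates, key=lambda c: c[0])
        if best_score > parent_score:  # keep the parent unless a child beats it
            parent, parent_score = best_child, best_score
    return parent, parent_score


if __name__ == "__main__":
    # Hypothetical toy data: two "models" and an invented target behaviour.
    weights, score = evolve_merge([0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
                                  [0.2, 0.8, 0.5])
    print("blend weights:", weights)
    print("score:", score)
```

In a real setting the fitness function would be an actual benchmark evaluation of the merged model, which is far more expensive than this toy score.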
The company has also released EvoLLM-JP, a Japanese large language model, and EvoVLM-JP, a vision-language model, both constructed through Evolutionary Model Merge. These models are based on autoregressive Transformer models designed for language generation.
EvoLLM-JP was made by merging a Japanese large language model (LLM) with a math LLM, and was found to perform well not only in mathematics but also in overall Japanese language ability.
In addition, EvoVLM-JP, which was made by merging a Japanese LLM with a vision-language model (VLM), can draw on knowledge of Japanese culture in its responses and achieved the best results on benchmarks using Japanese images and Japanese text.