Google is rolling out a swathe of updates on the generative AI front, including a new text-to-image tool. What sets ImageFX apart is an interface that features “expressive chips.” The idea is that these will help you “quickly experiment with adjacent dimensions of your creation and ideas.”
Alongside the debut of ImageFX, Google says it has improved MusicFX and TextFX. The company claims that it’s made upgrades to the MusicLM model that include faster generation of music and higher-quality audio, along with new features. Generated songs can now last up to 70 seconds. As for TextFX, Google has rolled out usability updates with the aim of improving navigation and the overall user experience.
ImageFX-generated images and audio made with MusicFX are tagged with SynthID, a digital watermark that aims to make it clear that these were created using AI, especially when they appear in Search or Chrome. ImageFX creations will also include IPTC metadata. This, according to Google, will offer “people more information whenever they encounter our AI-generated images.”
Folks in the US, Kenya, New Zealand and Australia can try out these new and revamped tools in the AI Test Kitchen starting today. They’re only available in English for now.
The Imagen 2 model is powering the new image generation features of ImageFX. It’s also the tech that’s driving new generative AI options in Bard, Search, Ads, Duet AI in Workspace and Vertex AI. Google says that Imagen 2 helps to deliver its highest-quality AI-generated images yet. The company notes that the model helps keep images clear of artifacts and improves on areas of image generation that such tools have struggled with until now.
In addition, Google says it has made “significant investments” in Imagen 2 training data safety while adding guardrails to “limit problematic outputs like violent, offensive or sexually explicit content as well as applying filters to reduce the risk of generating images of named individuals.” These safeguards are a response to the model’s upgraded ability to generate photorealistic images. The company claims it also carries out “extensive adversarial testing” to detect and clamp down on potentially problematic and harmful content.
Elsewhere, Gemini Pro in Bard is more broadly available starting today. It’s now accessible in more than 40 languages and north of 230 countries and territories. Also as of today, Google says people in most countries can generate images in Bard in English for free. These images will include SynthID watermarks.