As visual generative AI matures from research to the enterprise domain, businesses are seeking responsible ways to integrate the technology into their products.
Bria, a startup based in Tel Aviv, is responding with an open platform for visual generative AI that emphasizes model transparency alongside fair attribution and copyright protections. The company currently offers models that convert text prompts to images or transform existing images, and it will add text-to-video and image-to-video AI this year.
“Creating generative AI models requires time and expertise,” said Yair Adato, co-founder and CEO of Bria. “We do the heavy lifting so product teams can adopt our models to achieve a technical edge and go to market quickly, without investing as many resources.”
Advertising agencies and retailers can use Bria’s tools to quickly generate visuals for marketing campaigns. And creative studios can adopt the models to develop stock imagery or edit visuals. Dozens of enterprise clients have integrated the startup’s pretrained models or use its application programming interfaces.
Bria develops its models with the NVIDIA NeMo framework, which is available on NGC, NVIDIA’s hub for accelerated software. The company uses reference implementations from the NeMo Multimodal collection, trained on NVIDIA Tensor Core GPUs, to enable high-throughput, low-latency image generation. It’s also adopting NVIDIA Picasso, a foundry for visual generative AI models, to run inference.
“We were looking for a framework to train our models efficiently — one that would minimize compute cost while scaling AI training to more quickly reach model convergence,” said Misha Feinstein, vice president of research and development at Bria. “NeMo features optimization techniques that allow us to maximize the GPUs’ performance during both training and inference.”
Creative Solutions to Creative Challenges
Bria, founded in 2020, offers flexible options for enterprises adopting visual generative AI. With Bria’s platform, customers can gain a competitive edge by creating visual content at scale while retaining control of their data and technology. Developers can access its pretrained models through APIs or by directly licensing the source code and model weights for further fine-tuning.
“We want to build a company where we respect privacy, content ownership, data ownership and copyright,” said Adato. “To create a healthy, sustainable industry, it’s important to incentivize individuals to keep creating and innovating.”
Adato likens Bria’s attribution program to a music streaming service that pays artists each time one of their songs is played. Participation in the program is required for all customers who use Bria’s models, even if they further train and fine-tune the models on their own.
Using licensed datasets provides additional benefits: the Bria team doesn’t need to spend time cleaning the data or sorting out inappropriate content and misinformation.
A Growing Suite of NVIDIA-Accelerated Models
Bria offers two versions of its text-to-image model. One is latency-optimized to rapidly accomplish tasks like image background generation. The other offers higher image resolution. Additional foundation models enable super-resolution, object removal, object generation, inpainting and outpainting.
The company is working to continuously increase the resolution of its generated images, further reduce latency and develop domain-specific models for industries such as ecommerce and stock imagery. Inference is accelerated by the NVIDIA Triton Inference Server software and the NVIDIA TensorRT software development kit.
“We’re running on NVIDIA frameworks, hardware and software,” said Feinstein. “NVIDIA experts have helped us optimize these tools for our needs — we would probably run much slower without their help.”
To keep up with the latest hardware and networking infrastructure, Bria uses cloud computing resources: NVIDIA H100 Tensor Core GPUs for AI training and a variety of NVIDIA Tensor Core GPUs for inference.
Bria is a member of NVIDIA Inception, a program that provides startups with technological support and AI platform guidance. Visit Bria in the Inception Pavilion at NVIDIA GTC, running March 18-21 in San Jose and online.
To train optimized text-to-image models, check out the NeMo Multimodal user guide and GitHub repository. NeMo Multimodal is also available as part of the NeMo container on NGC.