Multimodal RAG is growing, here’s the best way to get started

by huewire
November 9, 2024
in TECHNOLOGY

November 8, 2024 5:51 PM

Image: robot search algorithm. Credit: VentureBeat with DALL-E 3

As enterprises begin experimenting with multimodal retrieval-augmented generation (RAG), providers of multimodal embeddings, which transform data into a form RAG systems can read, advise them to start small when embedding images and videos.

Multimodal RAG, which can surface a variety of file types including text, images and videos, relies on embedding models that transform data into numerical representations that AI models can read. Embeddings that can process all kinds of files let enterprises find information in financial graphs, product catalogs or any informational video they have, and get a more holistic view of their company.
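To make the idea of "numerical representations" concrete, here is a minimal sketch of how retrieval over a shared embedding space might work. The vectors in `TOY_INDEX` are hand-written stand-ins, not output from any real embedding model, and the file names and the `rank_items` helper are hypothetical. The sketch only assumes that text, images and videos have all been mapped into vectors of the same length, so one similarity measure can rank them together.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_items(query_vector, index, top_k=2):
    """Return the names of the top_k items most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), name)
              for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical toy vectors; a real system would get these from an
# embedding model and they would have hundreds of dimensions.
TOY_INDEX = {
    "q3_revenue_chart.png": [0.9, 0.1, 0.0],
    "product_catalog.pdf": [0.1, 0.9, 0.1],
    "onboarding_video.mp4": [0.0, 0.2, 0.9],
}

# A made-up query vector that sits close to the chart's vector.
top = rank_items([0.8, 0.2, 0.1], TOY_INDEX, top_k=2)
print(top)
```

Because every item lives in the same vector space, the ranking code does not need to know which entries are images, documents or videos.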

Cohere, which updated its embeddings model, Embed 3, to process images and videos last month, said enterprises need to prepare their data differently, ensure suitable performance from the embeddings, and make better use of multimodal RAG.

“Before committing extensive resources to multimodal embeddings, it’s a good idea to test it on a more limited scale. This enables you to assess the model’s performance and suitability for specific use cases and should provide insights into any adjustments needed before full deployment,” a blog post from Cohere staff solutions architect Yann Stoneman said. 
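One way to run such a limited-scale test is to score retrieval quality on a small, hand-labeled set of queries. The sketch below uses recall@k, a common retrieval metric; it is not something the Cohere post prescribes. The queries, file names and the `evaluate_pilot` helper are all hypothetical, and the retrieved lists stand in for whatever a candidate embedding model would return.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def evaluate_pilot(results, labels, k=3):
    """Average recall@k over every labeled query in the pilot set."""
    scores = [recall_at_k(results[query], labels[query], k)
              for query in labels]
    return sum(scores) / len(scores)


# Hypothetical pilot data: ranked results per query from the system
# under test, and the items a human judged relevant for each query.
PILOT_RESULTS = {
    "q3 revenue trend": ["chart_q3.png", "chart_q2.png", "memo.txt"],
    "red running shoes": ["boots.jpg", "shoes_red.jpg", "socks.jpg"],
}
PILOT_LABELS = {
    "q3 revenue trend": ["chart_q3.png"],
    "red running shoes": ["shoes_red.jpg", "shoes_red_side.jpg"],
}

pilot_score = evaluate_pilot(PILOT_RESULTS, PILOT_LABELS, k=3)
print(f"mean recall@3: {pilot_score:.2f}")
```

A score like this, tracked per use case, gives a rough signal of whether a model is suitable or needs adjustment before a wider rollout.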

The company said many of the processes discussed in the post are common to other multimodal embedding models.

Stoneman said that, depending on the industry, models may also need “additional training to pick up fine-grain details and variations in images.” He used medical applications as an example, where radiology scans or photos of microscopic cells require a specialized embedding system that understands the nuances in those kinds of images.

Data preparation is key

Before images are fed into a multimodal RAG system, they must be pre-processed so the embedding model can read them well.

Images may need to be resized to a consistent size. Organizations also need to decide whether to enhance low-resolution photos so important details are not lost, or to downscale very high-resolution pictures so they do not strain processing time.
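The resizing decision can be sketched as a small planning step that runs before any pixels are touched. The thresholds below (`target_long_side=1024`, `min_long_side=256`) and the `plan_resize` function are illustrative assumptions, not values recommended by Cohere. The sketch also shows only one possible policy, capping the longer side while preserving aspect ratio; the actual resampling would be done by an imaging library.

```python
def plan_resize(width, height, target_long_side=1024, min_long_side=256):
    """Decide what to do with an image based on its pixel dimensions.

    Returns (action, (new_width, new_height)). Thresholds are
    hypothetical defaults chosen for illustration only.
    """
    long_side = max(width, height)
    if long_side < min_long_side:
        # Too small: details may be lost, so flag for enhancement or review.
        return "upscale_or_review", (width, height)
    if long_side > target_long_side:
        # Too large: shrink so the long side matches the target,
        # keeping the aspect ratio.
        new_width = max(1, round(width * target_long_side / long_side))
        new_height = max(1, round(height * target_long_side / long_side))
        return "downscale", (new_width, new_height)
    return "keep", (width, height)


for size in [(4000, 3000), (800, 600), (120, 90)]:
    print(size, plan_resize(*size))
```

Separating the decision from the resampling makes it easy to audit how many images in a collection would be downscaled or flagged before committing processing time.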

“The system should be able to process image pointers (e.g. URLs or file paths) alongside text data, which may not be possible with text-based embeddings. To create a smooth user experience, organizations may need to implement custom code to integrate image retrieval with existing text retrieval,” the blog said. 
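The "custom code" the post mentions could take many forms. As one hedged illustration, the sketch below stores text and image entries in a single record type, where image entries carry a pointer (a URL or file path) instead of inline content. The `IndexRecord` schema, the `to_result` helper and the example paths are all hypothetical and do not reflect any particular vendor's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexRecord:
    """One entry in a mixed text-and-image index (hypothetical schema)."""
    record_id: str
    modality: str                  # "text" or "image"
    embedding: List[float]
    text: Optional[str] = None     # set for text records
    pointer: Optional[str] = None  # URL or file path, set for image records


def to_result(record):
    """Shape a retrieved record so the caller can display it uniformly."""
    if record.modality == "image":
        return {"id": record.record_id, "type": "image",
                "source": record.pointer}
    return {"id": record.record_id, "type": "text",
            "content": record.text}


# Hypothetical records; the embeddings are placeholders.
records = [
    IndexRecord("doc-1", "text", [0.1, 0.9],
                text="Q3 revenue grew year over year."),
    IndexRecord("img-1", "image", [0.8, 0.2],
                pointer="images/q3_revenue_chart.png"),
]

results = [to_result(r) for r in records]
print(results)
```

Keeping the pointer in the record means the retrieval layer can return image hits and text hits in one list, and the application fetches the image itself only when it needs to display it.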

Multimodal embeddings become more useful 

Most RAG systems deal mainly with text because text is easier to embed than images or videos. However, since most enterprises hold all kinds of data, RAG that can search both pictures and text has become more popular. Previously, organizations often had to implement separate RAG systems and databases, which prevented mixed-modality searches.

Multimodal search is nothing new, as OpenAI and Google offer the same on their respective chatbots. OpenAI launched its latest generation of embeddings models in January. Other companies also provide a way for businesses to harness their different data for multimodal RAG. For example, Uniphore released a way to help enterprises prepare multimodal datasets for RAG.
