
For enterprises, a major challenge with large language models (LLMs) is that they don’t natively integrate with proprietary datasets, such as emails, customer support tickets and corporate documents. As a result, the output can be misleading, inaccurate or irrelevant. 

One way to address this is to fine-tune the LLM, retraining it on a company’s specific data by adjusting the model’s weights and parameters to better align with the proprietary dataset. However, fine-tuning can be complex and resource-intensive to implement effectively.
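
For illustration, here is a minimal sketch of what a fine-tuning run might look like using the Hugging Face transformers and datasets libraries as an assumed toolchain. The base model, the two toy documents and the hyperparameters are placeholders; a real run would use the company’s own corpus and far more careful configuration.

```python
# Minimal fine-tuning sketch (assumed toolchain: Hugging Face transformers/datasets).
# The model name, toy corpus and hyperparameters are illustrative placeholders.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in for whatever base LLM the enterprise licenses
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Proprietary documents would be loaded here; two toy strings keep the sketch self-contained.
corpus = ["Support ticket: customer reports login failure after password reset.",
          "Policy doc: refunds are processed within 14 business days."]
dataset = Dataset.from_dict({"text": corpus}).map(
    lambda row: tokenizer(row["text"], truncation=True, max_length=128))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # adjusts the model's weights toward the proprietary data
```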

A more accessible and cost-effective solution is Retrieval-Augmented Generation (RAG). Instead of changing the model itself, RAG retrieves relevant information from a vector database at query time and supplies it to the LLM as context for its answer.
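
The core loop is simple to sketch. The toy example below uses a bag-of-words “embedding” and an in-memory list in place of a real embedding model and vector database, but the shape of the flow is the same: embed the query, find the closest documents, and fold them into the prompt sent to the LLM.

```python
# Toy retrieval-augmented generation flow. The bag-of-words "embedding" and the
# in-memory index stand in for a real embedding model and vector database.
import math
from collections import Counter

documents = [
    "Refunds are processed within 14 business days of the return being received.",
    "Enterprise support tickets are triaged within four business hours.",
    "The travel policy caps hotel reimbursement at $250 per night.",
]

def embed(text):
    """Placeholder embedding: a term-frequency vector (a real system would call a model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

index = [(doc, embed(doc)) for doc in documents]  # the "vector database"

def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

query = "How long do refunds take?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the LLM
```

In production the pattern is the same; only the components change: a learned embedding model, a dedicated vector store and an actual LLM call at the end.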

“RAG flows are gaining significant popularity and momentum in recent months, with good reason,” said Jeremy Kelway, who is the VP of Engineering for Analytics, Data and AI at EDB. “They enable access to information in ways that facilitate the human experience, saving time by automating and filtering data and information output that would otherwise require significant manual effort and time to be created.”

RAG is certainly powerful, and it is making major inroads in the enterprise. However, like any emerging technology, it has drawbacks as well. It’s not a one-size-fits-all solution and requires a careful approach to implementation.

Types of RAG and Data 

Generally, RAG is not difficult to use. There are plenty of YouTube videos that show how to set up a basic system.

Yet if you want to create a sophisticated, enterprise-grade application, the complexity will increase significantly. Note that there are different flavors of RAG.

“RAG can be classified by its core components,” said Yi Fang, who holds a Ph.D. and is an associate professor of Computer Science and Engineering at Santa Clara University. “First, there is the retriever and the generator. Different retrievers offer varying trade-offs between retrieval efficiency and quality. Some prioritize speed and may fetch results quickly but with less accuracy, while others focus on delivering higher-quality, more relevant data at the cost of slower retrieval time. Generators can be default models, like GPT-series models, or retrieval-augmented generators, such as RETRO, which blend retrieved data into responses.”
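
To make the trade-off concrete, here is a hypothetical sketch in which retrievers share one interface, so the rest of the pipeline does not care which one is plugged in. The class names and keyword-overlap scoring are illustrative, not any particular library’s API.

```python
# Hypothetical sketch: interchangeable retrievers behind one interface.
# Class names and scoring are illustrative, not a specific library's API.
from typing import List, Protocol

class Retriever(Protocol):
    def retrieve(self, query: str, k: int) -> List[str]: ...

class KeywordRetriever:
    """Fast but coarse: ranks documents by raw keyword overlap, no embeddings."""
    def __init__(self, docs: List[str]):
        self.docs = docs

    def retrieve(self, query: str, k: int) -> List[str]:
        terms = set(query.lower().split())
        ranked = sorted(self.docs,
                        key=lambda d: len(terms & set(d.lower().split())),
                        reverse=True)
        return ranked[:k]

# A slower, higher-quality retriever (dense embeddings plus a vector index, or a
# reranking step) would implement the same retrieve() signature, so the generator
# side of the pipeline never needs to change.

def build_prompt(query: str, retriever: Retriever) -> str:
    context = "\n".join(retriever.retrieve(query, k=3))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = ["Refunds are processed within 14 business days.",
        "Hotel reimbursement is capped at $250 per night."]
print(build_prompt("How long do refunds take?", KeywordRetriever(docs)))
```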

Besides selecting the type of RAG, you need to deal with the perennial challenge of AI systems: data.

“What will my RAG model do when it encounters bad, missing or misformatted data?” asked Eric Best, who is the CEO of SoundCommerce. “Suppose I’m expecting to build a quarter-over-quarter report on my company’s financial performance, but some of the date fields in my database are stored as text strings rather than numerical date fields. How will my model overcome this discrepancy?”
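
The date-field problem Best describes is usually caught with a validation pass before anything is embedded or indexed. Below is a small, standard-library-only sketch; the accepted formats and field names are examples, and a real pipeline would route rejected rows to a data owner rather than drop them silently.

```python
# Sketch: normalize date fields stored as text before they reach the RAG pipeline.
from datetime import datetime

KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %b %Y"]  # example formats seen in the data

def parse_date(value):
    """Return a datetime if the string matches a known format, else None."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
    return None

rows = [{"revenue": 1200, "closed": "2024-03-31"},
        {"revenue": 950, "closed": "Q1 FY24"}]   # malformed: cannot be parsed

clean, rejected = [], []
for row in rows:
    parsed = parse_date(row["closed"])
    if parsed:
        clean.append({**row, "closed": parsed})
    else:
        rejected.append(row)   # surface to a data owner instead of indexing silently

print(f"{len(clean)} rows indexed, {len(rejected)} flagged for review")
```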

To get the best results, you need someone with a background in data science. But you will also need SMEs (subject-matter experts) who understand the nature of the data and how it relates to the organization.

“It’s essential to connect and contextualize data throughout the organization to truly unlock the promise of RAG,” said Philip Miller, who is an AI Strategist at Progress.

Integration 

Integrating RAG into existing enterprise systems presents significant challenges. One major concern is handling sensitive data. “Without proper safeguards, there’s a risk that personally identifiable information (PII) could leak into LLMs,” said Dorian Selz, who is the CEO at Squirro. “To mitigate these issues, organizations can use advanced versions of RAG, such as EnhancedRAG or graphRAG alone. Masking PII with privacy and security layers and strict access control listing, and implementing strong data governance frameworks are essential to minimize risks.”
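
One common safeguard is a masking layer that scrubs obvious PII from documents and prompts before they ever reach the retriever or the LLM. The sketch below uses simple regular expressions for illustration; a production deployment would lean on a dedicated PII-detection service and named-entity recognition to catch things like personal names, which regexes alone miss.

```python
# Sketch: mask obvious PII before documents or prompts are sent to the LLM.
# Regexes are illustrative; production systems typically use a dedicated PII-detection layer.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

ticket = "Customer Jane Roe (jane.roe@example.com, 415-555-0123) requests a refund."
print(mask_pii(ticket))
# Customer Jane Roe ([EMAIL], [PHONE]) requests a refund.
# Note: the personal name is untouched; catching names requires NER, not shown here.
```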

There are also issues with integrating RAG within existing workflows and legacy systems. For example, integrating RAG with customer support software or data management systems might require custom development work to ensure that data flows seamlessly between systems. This is especially difficult when the systems involve different data formats or protocols. 
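
That glue code is often mundane but unavoidable. As a hypothetical example, the adapter below maps a support-ticket record (with made-up field names) onto the common document schema an index might expect; a real integration would also handle authentication, pagination and retries.

```python
# Sketch: adapter that maps a support-ticket record (hypothetical field names) onto the
# common document schema the RAG index expects.
from datetime import datetime, timezone

def ticket_to_document(ticket: dict) -> dict:
    return {
        "id": f"ticket-{ticket['ticket_id']}",
        "text": f"{ticket['subject']}\n{ticket['body']}",
        "source": "support-system",
        "updated_at": datetime.fromtimestamp(ticket["updated_unix"], tz=timezone.utc).isoformat(),
    }

raw = {"ticket_id": 8841, "subject": "Login failure",
       "body": "Customer cannot reset password.", "updated_unix": 1717430400}
print(ticket_to_document(raw))
```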

And what if you already have AI tools? How will these work with RAG? Without coordination, this can lead to inconsistencies or duplicated data. Thus, a robust data integration strategy is necessary before deploying RAG across the enterprise.

Deployment and Monitoring

For a RAG system to be effective, extensive testing is crucial. While there are some tools designed to assist in this process, many of these are still relatively immature. Much of the testing tends to be manual, requiring developers to fine-tune retrieval mechanisms, assess the accuracy of generated responses, and ensure the system performs well under various conditions.
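
Some of that testing can still be automated as simple regression checks. The sketch below computes recall@k over a small set of hand-labeled query-to-document pairs; the toy keyword retriever and the labeled examples are stand-ins for whatever retriever and evaluation set a team actually maintains.

```python
# Sketch: automated retrieval check (recall@k) over a small labeled set of queries.
LABELED_QUERIES = [
    ("how long do refunds take", "refund-policy.md"),
    ("hotel reimbursement limit", "travel-policy.md"),
]

DOCS = {
    "refund-policy.md": "refunds are processed within 14 business days",
    "travel-policy.md": "hotel reimbursement is capped at 250 per night",
    "support-sla.md":   "support tickets are triaged within four business hours",
}

def toy_retrieve(query, k):
    """Stand-in retriever: ranks documents by keyword overlap with the query."""
    terms = set(query.split())
    ranked = sorted(DOCS, key=lambda d: len(terms & set(DOCS[d].split())), reverse=True)
    return ranked[:k]

def recall_at_k(retrieve_fn, labeled, k=3):
    hits = sum(1 for query, expected in labeled if expected in retrieve_fn(query, k))
    return hits / len(labeled)

print(f"recall@3 = {recall_at_k(toy_retrieve, LABELED_QUERIES):.2f}")
```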

Once a RAG system is deployed, it needs ongoing monitoring, which can also be a costly and onerous process.
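
At a minimum, that usually means logging what each query retrieved and how long it took, and flagging queries where the best match was weak. The sketch below wraps an arbitrary retriever with that kind of lightweight instrumentation; the threshold and the stubbed retriever are illustrative.

```python
# Sketch: lightweight per-query monitoring around the retrieval step.
# Logs latency and the top similarity score, flagging queries that retrieved weak context.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("rag-monitor")

LOW_SIMILARITY_THRESHOLD = 0.3  # illustrative threshold; tune against real traffic

def monitored_retrieve(retrieve_fn, query):
    start = time.perf_counter()
    results = retrieve_fn(query)          # expected to return (document, score) pairs
    latency_ms = (time.perf_counter() - start) * 1000
    top_score = results[0][1] if results else 0.0
    log.info("query=%r latency=%.1fms top_score=%.2f", query, latency_ms, top_score)
    if top_score < LOW_SIMILARITY_THRESHOLD:
        log.warning("weak retrieval for %r; consider adding documents or tuning the index", query)
    return results

# Example with a stubbed retriever:
monitored_retrieve(lambda q: [("refund policy", 0.12)], "how do I expense a drone?")
```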

Finally, the system will need to be regularly updated to reflect new information. Whether it’s incorporating new customer interactions, corporate documents, or industry-specific developments, keeping the knowledge base current is a continuous effort. This requires not only updating the vector database but also re-embedding and re-indexing changed content so that retrieval stays relevant. Each of these updates adds another layer of complexity and cost to the ongoing maintenance of a RAG system.
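
An incremental refresh job is one common way to keep that cost down: only documents whose content has actually changed get re-embedded. The sketch below uses a content hash to make that decision; the embed() placeholder and the in-memory index stand in for a real embedding model and vector database.

```python
# Sketch: incremental refresh of the knowledge base. New or changed documents are
# re-embedded and upserted; embed() and the dict index are stand-ins for real components.
import hashlib

index = {}  # doc_id -> {"hash": content hash, "vector": embedding}

def embed(text):
    return [len(text)]  # placeholder; a real system would call an embedding model

def refresh(doc_id, text):
    digest = hashlib.sha256(text.encode()).hexdigest()
    entry = index.get(doc_id)
    if entry and entry["hash"] == digest:
        return "unchanged"                      # skip re-embedding identical content
    index[doc_id] = {"hash": digest, "vector": embed(text)}
    return "updated" if entry else "inserted"

print(refresh("refund-policy.md", "Refunds are processed within 14 business days."))
print(refresh("refund-policy.md", "Refunds are processed within 14 business days."))
print(refresh("refund-policy.md", "Refunds are processed within 10 business days."))
# inserted / unchanged / updated
```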

Conclusion

From data quality issues and integration complexities to the ongoing need for monitoring and updates, organizations must approach RAG implementation carefully. While it offers significant benefits, it requires a well-planned strategy, technical expertise, and continuous maintenance to deliver meaningful results.
