Ragnarock: AI
Best Practices
What is Ragnarock?
Ragnarock is the framework for best practices we’ve developed through building generative AI solutions on AWS. It is a combination of several techniques, a common set of patterns, and an overall architecture we’ve found particularly efficient, accurate, and effective when building generative AI solutions. The name itself is derived from the combination of Retrieval-Augmented Generation, a Neoteric* Agent, and Amazon Bedrock (RAG-NA-rock).
Why Ragnarock?
How do you handle the inherent randomness of large language models when you need to build a reliable, accurate, and consistent solution?
Traditionally there have been two answers to this problem.
-
Avoid use cases which require highly deterministic behavior
-
Tolerate randomness and aim for consistency elsewhere in the system
Ragnarock is a third way—our collection of best practices for creating consistent and performant generative AI solutions.
Benefits of Ragnarock
- Cost Efficiency (Token Usage)
- Prompting Flexibility
- Latency (Time-to-Output)
- Throughput (Generations Per Second)
- Security
- Data Privacy
- Reliability (Request Timeout / Error Handling)
- Accuracy (Hallucination Rate)
- Safety
- Leverage Your Data
- Leverage Your Infrastructure
Project Fit Criteria
- Knowledge Management
- AI Assistants & Chatbots
- Customer Support
- Search & Research Tooling
- Keyword Extraction
- Concept-Specific Image Generation
- Document Processing & Creation
- Any Solution Needing Agent Capabilities
How Ragnarock Works
We rely on “semantic distillation,” a process for extracting and augmenting information while retaining meaning. We do this to the underlying data as well as the prompts, ensuring that the linkages between the two are embedded into the system and sufficiently generalized. This allows the AI to handle a wide variety of inputs while still arriving at the intended output.
To do this, we use a combination of
-
An Agent Orchestrator
-
Meta-Prompting
-
Chain-of-Thought
-
A Vector Database
-
Retrieval-Augmented Generation
-
Embedding
-
AWS Step Functions
-
Web Crawling (optional)
-
Long-Term Storage
-
Amazon Bedrock
With these elements and techniques, we tune and prepare a language model for its use case and simulate a large variety of prompting scenarios, testing and refining until we achieve the accuracy required of the solution.
Learn more about the technical details of this approach here.
Elements of the Solution
Retrieval-Augmented Generation (RAG)
What is RAG? RAG is a technique which queries data which has been vectorized and uses it to inform how a large language model responds. (The response the model generates is augmented by retrieving information from this data.) By vectorizing relevant data, you put it into a form which the model can recognize and query, improving the accuracy and efficacy of its responses.
Vector Databases we use: Amazon OpenSearch, Postgres Vector (pgvector), Pinecone, Chroma, FAISS, Llama Index
Agents
Many generative AI solutions can benefit from agentic work— research on the web, connecting to proprietary data sources, or integrating with other applications. To understand the importance of agents, think of them as a model’s eyes and hands; they allow models to interact with the outside world, manipulate or reference that world through their outputs.Agents we use: LangChain, AutoGPT, Amazon Q
Bedrock
We have found Amazon Bedrock to be unsurpassed for generative AI solutions. Bedrock has the benefits of accessing models via API— which eliminates the costs and infrastructural complexity of hosting—but without sacrificing performance, stability, or security. By keeping traffic local to AWS, Bedrock greatly reduces solution latency and lets you easily integrate with your data and infrastructure. And because Bedrock offers best-of-breed solutions from a number of model producers, the array of possibilities with Bedrock continues to grow, keeping up with the rate of innovation industry wide.
Some Foundation Models we use: Claude, Amazon Titan, Mistral, Llama, StableDiffusion, Jurassic-2, Cohere – accessed via Amazon Bedrock
Our Process
Get Your Pilot Launched
Would you like to see how the Ragnarock best practices can impact the performance of your solution? Talk to one of our generative AI specialists today.