# RAG Pipeline API

This page documents the components of the RAG (Retrieval-Augmented Generation) pipeline.
## RAGPipeline

```python
class RAGPipeline:
    """Manages the RAG pipeline for document question answering."""

    def __init__(self, model_name: str, embeddings: Embeddings):
        """Initialize the RAG pipeline with a model and an embeddings backend."""
```
### Methods

#### create_vector_store

```python
def create_vector_store(self, documents: List[Document]) -> VectorStore:
    """Create a vector store from documents."""
```

Parameters:

- `documents`: list of processed documents

Returns: an initialized vector store
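The source does not show how `create_vector_store` is implemented, but the idea can be sketched with a self-contained toy: embed each document and keep (embedding, document) pairs in memory. The `InMemoryVectorStore` class and the character-frequency `toy_embed` function below are stand-ins invented for illustration, not part of this API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Document:
    page_content: str


class InMemoryVectorStore:
    """Toy vector store: keeps (embedding, document) pairs in a plain list."""

    def __init__(self) -> None:
        self.entries: List[Tuple[List[float], Document]] = []

    def add(self, embedding: List[float], document: Document) -> None:
        self.entries.append((embedding, document))


def toy_embed(text: str) -> List[float]:
    # Stand-in embedding: a character-frequency vector over a-z.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def create_vector_store(documents: List[Document]) -> InMemoryVectorStore:
    """Embed every document once at indexing time and store the pairs."""
    store = InMemoryVectorStore()
    for doc in documents:
        store.add(toy_embed(doc.page_content), doc)
    return store


store = create_vector_store([Document("retrieval"), Document("generation")])
print(len(store.entries))  # 2
```

A real pipeline would swap `toy_embed` for the `Embeddings` backend passed to the constructor and use an indexed store rather than a linear list.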
#### get_relevant_documents

```python
def get_relevant_documents(self, query: str, k: int = 4) -> List[Document]:
    """Retrieve the k most relevant documents for a query."""
```

Parameters:

- `query`: user question
- `k`: number of documents to retrieve (default: 4)

Returns: a list of relevant documents
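Retrieval of this kind typically means scoring the query embedding against every stored embedding and keeping the top `k`. A minimal sketch, assuming cosine similarity as the scoring function (the source does not specify one):

```python
import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(
    query_vec: List[float],
    entries: List[Tuple[List[float], str]],
    k: int = 4,
) -> List[str]:
    """Return the k documents whose embeddings score highest against the query."""
    scored = sorted(entries, key=lambda e: cosine(query_vec, e[0]), reverse=True)
    return [doc for _, doc in scored[:k]]


entries = [([1.0, 0.0], "doc-a"), ([0.0, 1.0], "doc-b"), ([0.7, 0.7], "doc-c")]
print(top_k([1.0, 0.1], entries, k=2))  # ['doc-a', 'doc-c']
```

Production stores replace the `sorted` pass with an approximate nearest-neighbor index, but the ranking semantics are the same.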
#### generate_response

```python
def generate_response(self, query: str, context: List[Document]) -> str:
    """Generate a response using the LLM and the retrieved context."""
```

Parameters:

- `query`: user question
- `context`: retrieved documents

Returns: the generated response
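Before the LLM is called, the retrieved documents are typically joined into a single prompt alongside the question. The exact template used by `generate_response` is not shown in the source; `build_prompt` below is a hypothetical sketch of that step.

```python
from typing import List


def build_prompt(query: str, contexts: List[str]) -> str:
    """Assemble a grounded prompt: retrieved passages first, then the question."""
    joined = "\n\n".join(contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\nAnswer:"
    )


prompt = build_prompt(
    "What is RAG?",
    ["RAG combines retrieval with generation.", "Context grounds the answer."],
)
print(prompt)
```

The assembled string would then be sent to the configured model (e.g. the `model_name` passed to the constructor), whose completion becomes the returned response.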
## Usage Example

```python
# Initialize the pipeline
pipeline = RAGPipeline(
    model_name="llama2",
    embeddings=NonicEmbeddings()
)

# Process a query
docs = pipeline.get_relevant_documents("What is RAG?")
response = pipeline.generate_response(
    query="What is RAG?",
    context=docs
)
```
## Configuration

The pipeline can be configured with:

- Model selection
- Embedding type
- Retrieval parameters
- Response templates
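The source does not show how these options are passed. One common pattern is to gather them in a single config object; the `RAGConfig` dataclass below is an assumed sketch of that shape, with field names and defaults invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class RAGConfig:
    """Hypothetical bundle of the four configuration areas listed above."""

    model_name: str = "llama2"          # model selection
    embedding_type: str = "default"     # embedding type
    k: int = 4                          # retrieval parameter: documents to fetch
    response_template: str = (          # response template
        "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


cfg = RAGConfig(k=6)
print(cfg.k)  # 6
```

Defaults live in one place, and callers override only what they need.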
## Performance Tuning

Optimize the pipeline by adjusting:

- The number of retrieved documents (`k`)
- Context window size
- Temperature setting
- Response length