Embeddings API
This page documents the text embedding components used for semantic search.
NomicEmbeddings
class NomicEmbeddings:
    """Manages text embeddings using Nomic's embedding model."""

    def __init__(self, model_name: str = "nomic-embed-text"):
        """Initialize embeddings with model name."""
Methods
embed_documents
def embed_documents(self, texts: List[str]) -> List[List[float]]:
    """Generate embeddings for a list of texts."""
Parameters:
- texts: List of text strings
Returns:
- List of embedding vectors
embed_query
Parameters:
- text: Query text
Returns:
- Embedding vector
Usage Example
# Initialize embeddings
embeddings = NomicEmbeddings()
# Embed documents
docs = ["First document", "Second document"]
doc_embeddings = embeddings.embed_documents(docs)
# Embed query
query = "Sample query"
query_embedding = embeddings.embed_query(query)
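For semantic search, the query embedding is typically compared against each document embedding. The sketch below ranks documents by cosine similarity in plain Python. The three-dimensional vectors are made-up stand-ins for what `embed_documents` and `embed_query` would return, and `cosine_similarity` and `rank_documents` are hypothetical helper names, not part of this API.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: List[float], doc_vecs: List[List[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up vectors standing in for real embeddings.
doc_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query_embedding = [0.9, 0.1, 0.0]
ranking = rank_documents(query_embedding, doc_embeddings)
print(ranking[0][0])  # index of the closest document
```

With real embeddings the same ranking step applies unchanged, since it depends only on the vectors and not on how they were produced.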
Configuration
Configure embeddings with:
- Model selection
- Batch size
- Normalization
- Caching options
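This page does not say how these options are passed, so the following is only a sketch of one way to group them. `EmbeddingConfig` and all of its field names are hypothetical; the only value taken from this page is the default model name from `NomicEmbeddings.__init__`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingConfig:
    """Hypothetical settings container; field names are illustrative only."""

    model_name: str = "nomic-embed-text"  # default shown in NomicEmbeddings.__init__
    batch_size: int = 32                  # assumed default, not documented here
    normalize: bool = True                # assumed: L2-normalize output vectors
    cache_size: Optional[int] = None      # assumed: None disables caching

    def __post_init__(self) -> None:
        # Reject values that could never be valid before any model is loaded.
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        if self.cache_size is not None and self.cache_size < 0:
            raise ValueError("cache_size must be non-negative")


config = EmbeddingConfig(batch_size=16)
print(config)
```

Keeping the settings in one immutable object makes it easy to log or compare the configuration used for a given index.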
Performance
Optimization options:
- Batch processing
- GPU acceleration
- Caching
- Dimensionality
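Batch processing and caching can be combined in a thin wrapper around any embedding function. The sketch below is one possible approach under stated assumptions: `batched` and `embed_with_cache` are hypothetical helpers, and `fake_embed` stands in for a real call such as `embed_documents`.

```python
from typing import Callable, Dict, Iterator, List

Vector = List[float]


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_with_cache(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[Vector]],
    cache: Dict[str, Vector],
    batch_size: int = 32,
) -> List[Vector]:
    """Embed texts in batches, calling embed_fn only for texts not yet cached."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    missing = [t for t in dict.fromkeys(texts) if t not in cache]
    for batch in batched(missing, batch_size):
        for text, vector in zip(batch, embed_fn(batch)):
            cache[text] = vector
    return [cache[t] for t in texts]


# Stand-in embedder: records each batch and returns one-dimensional vectors.
calls: List[List[str]] = []


def fake_embed(batch: List[str]) -> List[Vector]:
    calls.append(list(batch))
    return [[float(len(t))] for t in batch]


cache: Dict[str, Vector] = {}
vectors = embed_with_cache(["a", "bb", "a", "ccc"], fake_embed, cache, batch_size=2)
print(len(calls), vectors)
```

The cache here is an unbounded dictionary keyed on the exact text; a long-running service would likely want a size limit or eviction policy.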
Best Practices
- Text Preparation
  - Clean input text
  - Handle special characters
  - Normalize length
- Resource Management
  - Batch similar lengths
  - Monitor memory usage
  - Cache frequent queries
- Quality Control
  - Validate embeddings
  - Check dimensions
  - Monitor similarity scores
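The text preparation and quality control items above can be sketched as two small helpers. Both `clean_text` and `validate_embeddings` are hypothetical names, the `max_chars` limit is an arbitrary illustrative value, and `expected_dim` must be supplied by the caller because this page does not state the model's output dimension.

```python
import math
import unicodedata
from typing import List


def clean_text(text: str, max_chars: int = 2000) -> str:
    """Normalize Unicode, replace control characters, collapse whitespace, truncate."""
    text = unicodedata.normalize("NFKC", text)
    # Replace control characters (category "Cc", which includes newlines and tabs).
    text = "".join(" " if unicodedata.category(ch) == "Cc" else ch for ch in text)
    text = " ".join(text.split())  # collapse runs of whitespace
    return text[:max_chars]        # crude length cap, measured in characters


def validate_embeddings(vectors: List[List[float]], expected_dim: int) -> None:
    """Raise ValueError if any vector has the wrong size or non-finite values."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            raise ValueError(
                f"vector {i} has dimension {len(vec)}, expected {expected_dim}"
            )
        if not all(math.isfinite(x) for x in vec):
            raise ValueError(f"vector {i} contains NaN or infinity")


print(clean_text("  Hello\u00a0 world\n\x00"))
validate_embeddings([[0.1, 0.2], [0.3, 0.4]], expected_dim=2)
```

Running validation right after embedding, before vectors are written to an index, keeps malformed output from surfacing later as unexplained similarity scores.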