Embeddings API

This page documents the text embedding components used for semantic search.

NomicEmbeddings

class NomicEmbeddings:
    """Manages text embeddings using Nomic's embedding model."""

    def __init__(self, model_name: str = "nomic-embed-text"):
        """Initialize embeddings with model name."""

Methods

embed_documents

def embed_documents(self, texts: List[str]) -> List[List[float]]:
    """Generate embeddings for a list of texts."""

Parameters:

  • texts: List of text strings

Returns:

  • List of embedding vectors, one per input text

embed_query

def embed_query(self, text: str) -> List[float]:
    """Generate embedding for a single query text."""

Parameters:

  • text: Query text

Returns:

  • Embedding vector

Usage Example

# Initialize embeddings
embeddings = NomicEmbeddings()

# Embed documents
docs = ["First document", "Second document"]
doc_embeddings = embeddings.embed_documents(docs)

# Embed query
query = "Sample query"
query_embedding = embeddings.embed_query(query)
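
Once documents and a query are embedded, semantic search comes down to comparing vectors. The sketch below ranks documents by cosine similarity to the query. The helper names `cosine_similarity` and `rank_documents` are illustrative and not part of the documented API, and the toy vectors stand in for real model output.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_embedding: List[float],
    doc_embeddings: List[List[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, best match first."""
    scores = [
        (i, cosine_similarity(query_embedding, emb))
        for i, emb in enumerate(doc_embeddings)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-D vectors stand in for the output of embed_documents / embed_query.
toy_doc_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query_embedding = [1.0, 0.1]
ranking = rank_documents(toy_query_embedding, toy_doc_embeddings)
```

With real embeddings, `doc_embeddings` and `query_embedding` from the usage example would be passed in place of the toy vectors.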

Configuration

Configure embeddings with:

  • Model selection (the model_name constructor argument, default "nomic-embed-text")
  • Batch size
  • Normalization
  • Caching options

Performance

Optimization options:

  • Batch processing
  • GPU acceleration
  • Caching
  • Embedding dimensionality
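
To make the batch-processing option concrete, here is a minimal sketch that groups texts of similar length into fixed-size batches and then restores the original order. Whether length sorting actually helps depends on the backend (it mainly reduces padding on models that pad each batch to its longest item), so treat that as an assumption. The function name `embed_in_length_sorted_batches` and the `embed_batch` callable are illustrative. Any function with the same shape as `embed_documents` fits.

```python
from typing import Callable, List


def embed_in_length_sorted_batches(
    texts: List[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 32,  # assumed default; tune for your hardware
) -> List[List[float]]:
    """Embed texts in batches of similar length, preserving input order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    # Indices of the texts, shortest first, so each batch holds similar lengths.
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))

    results: List[List[float]] = [[] for _ in texts]
    for start in range(0, len(order), batch_size):
        indices = order[start:start + batch_size]
        vectors = embed_batch([texts[i] for i in indices])
        if len(vectors) != len(indices):
            raise ValueError("embed_batch returned an unexpected number of vectors")
        # Write each vector back to the slot of the text it came from.
        for i, vector in zip(indices, vectors):
            results[i] = vector
    return results
```

A call might look like `embed_in_length_sorted_batches(docs, embeddings.embed_documents, batch_size=16)`.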

Best Practices

  1. Text Preparation
       • Clean input text
       • Handle special characters
       • Normalize length

  2. Resource Management
       • Batch similar lengths
       • Monitor memory usage
       • Cache frequent queries

  3. Quality Control
       • Validate embeddings
       • Check dimensions
       • Monitor similarity scores
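
The text preparation and quality control items above can be sketched as two small helpers. `clean_text` and `validate_embeddings` are illustrative names, not part of the documented API. The 2000-character cap is an arbitrary assumption and only a crude proxy for a model's token limit, and the expected dimension must come from whichever model is in use.

```python
import math
import re
import unicodedata
from typing import List


def clean_text(text: str, max_chars: int = 2000) -> str:
    """Normalize Unicode, drop control characters, collapse whitespace, truncate."""
    text = unicodedata.normalize("NFKC", text)
    # Remove characters in the Unicode "Other" categories (control, format, ...),
    # but keep newlines and tabs so they collapse into single spaces below.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or not unicodedata.category(ch).startswith("C")
    )
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_chars]


def validate_embeddings(vectors: List[List[float]], expected_dim: int) -> None:
    """Raise ValueError if any vector has the wrong size, NaN/inf, or is all zeros."""
    for i, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {i} has dimension {len(vector)}, expected {expected_dim}"
            )
        if not all(math.isfinite(x) for x in vector):
            raise ValueError(f"vector {i} contains NaN or infinite values")
        if all(x == 0.0 for x in vector):
            raise ValueError(f"vector {i} is all zeros")
```

Running `validate_embeddings` on the result of `embed_documents` before indexing catches dimension mismatches early, for example after switching models.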