LiteLLM is a library that simplifies calling Anthropic, Azure, Hugging Face, Replicate, and many other LLM providers through a single unified interface.

Installation and setup

pip install langchain-litellm

Chat models

from langchain_litellm import ChatLiteLLM
from langchain_litellm import ChatLiteLLMRouter

See the LiteLLM chat guide for full usage details, including streaming, tool calling, structured output, and Vertex AI Grounding.

Embeddings

from langchain_litellm import LiteLLMEmbeddings
from langchain_litellm import LiteLLMEmbeddingsRouter

LiteLLMEmbeddings embeds text across 100+ providers with a single consistent interface. All configuration is explicit, with no environment variables required.

from langchain_litellm import LiteLLMEmbeddings

embeddings = LiteLLMEmbeddings(
    model="openai/text-embedding-3-small",
    api_key="sk-...",
)

vectors = embeddings.embed_documents(["hello", "world"])
query_vector = embeddings.embed_query("hello")
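
The vectors returned by embed_documents and embed_query are usually compared with cosine similarity. The following is a minimal sketch of that step using only the standard library. The 3-dimensional vectors are made up for illustration and stand in for real embeddings, and cosine_similarity and rank_by_similarity are hypothetical helpers, not part of langchain-litellm.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero-length vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, vectors, texts):
    """Return (text, score) pairs sorted from most to least similar."""
    scored = [
        (text, cosine_similarity(query_vector, vector))
        for text, vector in zip(texts, vectors)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embed_documents / embed_query output (assumed values).
texts = ["hello", "world"]
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query_vector = [0.9, 0.1, 0.0]

ranking = rank_by_similarity(query_vector, vectors, texts)
print(ranking[0][0])  # the text closest to the query
```

With real embeddings, vectors and query_vector would come from the embed_documents and embed_query calls shown above; a vector store would normally handle this ranking at scale.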

Switch providers by changing model. The interface stays the same:

# Cohere
embeddings = LiteLLMEmbeddings(
    model="cohere/embed-english-v3.0",
    api_key="...",
    document_input_type="search_document",
    query_input_type="search_query",
)

# Azure OpenAI
embeddings = LiteLLMEmbeddings(
    model="azure/my-embedding-deployment",
    api_key="...",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
)

# Bedrock
embeddings = LiteLLMEmbeddings(
    model="bedrock/amazon.titan-embed-text-v1",
)
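
Because only the constructor arguments differ between providers, one way to organise this is a small lookup table keyed by provider name. The sketch below assumes that layout; PROVIDER_KWARGS, embedding_kwargs, and make_embeddings are hypothetical names, and the keyword arguments are copied from the examples above with their placeholder credentials left as-is.

```python
# Constructor kwargs per provider, copied from the examples above.
# Credential values are placeholders, not working keys.
PROVIDER_KWARGS = {
    "cohere": {
        "model": "cohere/embed-english-v3.0",
        "api_key": "...",
        "document_input_type": "search_document",
        "query_input_type": "search_query",
    },
    "azure": {
        "model": "azure/my-embedding-deployment",
        "api_key": "...",
        "api_base": "https://my-resource.openai.azure.com",
        "api_version": "2024-02-01",
    },
    "bedrock": {
        "model": "bedrock/amazon.titan-embed-text-v1",
    },
}


def embedding_kwargs(provider):
    """Look up constructor kwargs for a provider name (hypothetical helper)."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return dict(PROVIDER_KWARGS[provider])
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None


def make_embeddings(provider):
    """Build the embeddings object; needs langchain-litellm installed."""
    from langchain_litellm import LiteLLMEmbeddings  # imported lazily

    return LiteLLMEmbeddings(**embedding_kwargs(provider))
```

Calling make_embeddings("cohere") would then be equivalent to the Cohere example above, and the rest of the application code stays unchanged when the provider name changes.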

For load balancing across multiple deployments of the same model, use LiteLLMEmbeddingsRouter:

from litellm import Router
from langchain_litellm import LiteLLMEmbeddingsRouter

router = Router(model_list=[
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {
            "model": "openai/text-embedding-3-small",
            "api_key": "sk-key1",
        },
    },
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {
            "model": "openai/text-embedding-3-small",
            "api_key": "sk-key2",
        },
    },
])

embeddings = LiteLLMEmbeddingsRouter(router=router)
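
The two deployment entries above differ only in their API key, so the list can be generated rather than written by hand. A minimal sketch, assuming one deployment per key; build_model_list is a hypothetical helper and the keys are the placeholders from the example above.

```python
def build_model_list(model_name, litellm_model, api_keys):
    """Return one deployment entry per API key, all sharing model_name."""
    return [
        {
            "model_name": model_name,
            "litellm_params": {"model": litellm_model, "api_key": key},
        }
        for key in api_keys
    ]


model_list = build_model_list(
    "text-embedding-3-small",
    "openai/text-embedding-3-small",
    ["sk-key1", "sk-key2"],  # placeholder keys
)
print(len(model_list))  # one entry per key
```

The resulting list has the same shape as the literal passed to Router(model_list=...) above. Entries that share a model_name are what the router balances across.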

Document loaders

from langchain_litellm import LiteLLMOCRLoader

LiteLLMOCRLoader loads documents via a LiteLLM proxy’s OCR endpoint (e.g. Azure Document Intelligence). The proxy handles all provider-specific authentication and configuration.

from langchain_litellm import LiteLLMOCRLoader

loader = LiteLLMOCRLoader(
    proxy_base_url="http://localhost:4000",
    api_key="my-bearer-token",
    url_path="https://example.com/document.pdf",
    model="azure-document",
    mode="page",  # "page" = one Document per page; "single" = concatenate all pages
)
documents = loader.load()
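
To make the difference between the two mode values concrete, here is a rough illustration on plain strings. It is not the loader's implementation: group_pages is a hypothetical function, and the blank-line separator used for "single" is an assumption.

```python
def group_pages(page_texts, mode="page", separator="\n\n"):
    """Illustrate "page" vs "single" on plain strings (not the loader's code)."""
    if mode == "page":
        # One output item per input page.
        return list(page_texts)
    if mode == "single":
        # All pages joined into one item; the separator is an assumption.
        return [separator.join(page_texts)]
    raise ValueError(f"unsupported mode: {mode!r}")


pages = ["Page one text", "Page two text"]
per_page = group_pages(pages, mode="page")
combined = group_pages(pages, mode="single")
print(len(per_page), len(combined))
```

In "page" mode each result can carry its own page-level metadata, while "single" mode is convenient when the whole document should be handled as one unit.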

Async loading is also supported:

# Run inside an async function, or in a notebook that supports top-level await
documents = await loader.aload()
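
From a regular script, the await call needs an event loop. The sketch below shows one way to drive aload() with asyncio.run and to load several sources concurrently. StubLoader is a hypothetical stand-in so the example runs without a proxy; a real LiteLLMOCRLoader would take its place.

```python
import asyncio


class StubLoader:
    """Stand-in exposing an async aload() method (hypothetical, for illustration)."""

    def __init__(self, pages):
        self.pages = pages

    async def aload(self):
        await asyncio.sleep(0)  # yields control, as a real network call would
        return list(self.pages)


async def load_all(loaders):
    """Await aload() on every loader concurrently; results keep input order."""
    results = await asyncio.gather(*(loader.aload() for loader in loaders))
    # Flatten the per-loader lists into one list of documents.
    return [doc for docs in results for doc in docs]


loaders = [StubLoader(["a1", "a2"]), StubLoader(["b1"])]
documents = asyncio.run(load_all(loaders))
print(len(documents))
```

asyncio.gather returns results in the order the awaitables were passed, so the documents from the first loader come before those from the second.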

API reference

For detailed documentation of all classes and configurations, see the langchain-litellm API reference.

| Class | Description |
| --- | --- |
| ChatLiteLLM | LangChain chat model wrapper for LiteLLM |
| ChatLiteLLMRouter | Router-backed chat model for load balancing and fallbacks |
| LiteLLMEmbeddings | Embed text across 100+ providers with a single consistent interface |
| LiteLLMEmbeddingsRouter | Router-backed embeddings for load balancing across deployments |
| LiteLLMOCRLoader | Document loader via LiteLLM proxy’s OCR endpoint |