LiteLLM is a library that simplifies calling Anthropic, Azure, Hugging Face, Replicate, and many other LLM providers through a single unified interface.
Installation and setup
pip install langchain-litellm
Chat models
from langchain_litellm import ChatLiteLLM
from langchain_litellm import ChatLiteLLMRouter
See the LiteLLM chat guide for full usage details, including streaming, tool calling, structured output, and Vertex AI Grounding.
Embeddings
from langchain_litellm import LiteLLMEmbeddings
from langchain_litellm import LiteLLMEmbeddingsRouter
LiteLLMEmbeddings embeds text across 100+ providers with a single consistent interface. All configuration is passed explicitly to the constructor, so no environment variables are required.
from langchain_litellm import LiteLLMEmbeddings
embeddings = LiteLLMEmbeddings(
    model="openai/text-embedding-3-small",
    api_key="sk-...",
)
vectors = embeddings.embed_documents(["hello", "world"])
query_vector = embeddings.embed_query("hello")
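Both methods return plain lists of floats (a list of vectors for embed_documents, a single vector for embed_query), so the results can be compared directly. The following is a minimal sketch of ranking documents by cosine similarity to a query. The short hard-coded vectors stand in for real embedding output, and the helper names cosine_similarity and rank_by_similarity are illustrative, not part of langchain-litellm.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, vectors):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vector, v) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)


# Placeholder vectors (assumption): in practice these would come from
# embeddings.embed_documents(...) and embeddings.embed_query(...).
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query_vector = [0.9, 0.1, 0.0]

ranking = rank_by_similarity(query_vector, vectors)
print(ranking)
```

For larger collections a vector store would normally handle this comparison; the sketch only shows what the returned vectors are for.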
Switch providers by changing the model argument. The interface stays the same:
# Cohere
embeddings = LiteLLMEmbeddings(
    model="cohere/embed-english-v3.0",
    api_key="...",
    document_input_type="search_document",
    query_input_type="search_query",
)
# Azure OpenAI
embeddings = LiteLLMEmbeddings(
    model="azure/my-embedding-deployment",
    api_key="...",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
)
# Bedrock
embeddings = LiteLLMEmbeddings(
    model="bedrock/amazon.titan-embed-text-v1",
)
For load-balancing across multiple deployments of the same model, use LiteLLMEmbeddingsRouter:
from litellm import Router
from langchain_litellm import LiteLLMEmbeddingsRouter
# Two deployments share one model_name, so the router can balance between them.
router = Router(
    model_list=[
        {
            "model_name": "text-embedding-3-small",
            "litellm_params": {
                "model": "openai/text-embedding-3-small",
                "api_key": "sk-key1",
            },
        },
        {
            "model_name": "text-embedding-3-small",
            "litellm_params": {
                "model": "openai/text-embedding-3-small",
                "api_key": "sk-key2",
            },
        },
    ]
)
embeddings = LiteLLMEmbeddingsRouter(router=router)
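To make the idea of load balancing concrete, here is a toy sketch that cycles through the deployments sharing one model_name. It only illustrates the concept under a simple round-robin assumption; it is not how LiteLLM's Router is implemented, and the make_picker and pick_deployment helpers are hypothetical.

```python
from itertools import cycle

# Same shape as the model_list entries above, with placeholder keys.
model_list = [
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {"model": "openai/text-embedding-3-small", "api_key": "sk-key1"},
    },
    {
        "model_name": "text-embedding-3-small",
        "litellm_params": {"model": "openai/text-embedding-3-small", "api_key": "sk-key2"},
    },
]


def make_picker(model_list, model_name):
    """Return a function that yields matching deployments in round-robin order."""
    matching = [d for d in model_list if d["model_name"] == model_name]
    if not matching:
        raise ValueError(f"no deployment named {model_name!r}")
    rotation = cycle(matching)
    return lambda: next(rotation)


# Hypothetical helper: each call selects the next deployment in turn.
pick_deployment = make_picker(model_list, "text-embedding-3-small")
keys_used = [pick_deployment()["litellm_params"]["api_key"] for _ in range(4)]
print(keys_used)
```

With the real router, the selection happens inside LiteLLM; calling code only sees the usual embed_documents and embed_query methods.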
Document loaders
from langchain_litellm import LiteLLMOCRLoader
LiteLLMOCRLoader loads documents via a LiteLLM proxy’s OCR endpoint (e.g. Azure Document Intelligence). The proxy handles all provider-specific authentication and configuration.
from langchain_litellm import LiteLLMOCRLoader
loader = LiteLLMOCRLoader(
    proxy_base_url="http://localhost:4000",
    api_key="my-bearer-token",
    url_path="https://example.com/document.pdf",
    model="azure-document",
    mode="page",  # "page" = one Document per page; "single" = concatenate all pages
)
documents = loader.load()
Async loading is also supported:
documents = await loader.aload()
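Outside a notebook, await is only valid inside a coroutine, so a script needs an event loop to call aload(). A minimal sketch with asyncio follows. StubLoader is a hypothetical stand-in that returns strings so the example runs without a proxy; in real code LiteLLMOCRLoader instances would take its place and return Document objects.

```python
import asyncio


class StubLoader:
    """Hypothetical stand-in exposing the same async method name, aload()."""

    def __init__(self, name):
        self.name = name

    async def aload(self):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return [f"{self.name}: page 1", f"{self.name}: page 2"]


async def load_all(loaders):
    """Run several aload() calls concurrently and flatten the results."""
    results = await asyncio.gather(*(loader.aload() for loader in loaders))
    return [doc for docs in results for doc in docs]


documents = asyncio.run(load_all([StubLoader("a.pdf"), StubLoader("b.pdf")]))
print(documents)
```

asyncio.gather returns results in the order the loaders were passed, so the flattened list keeps each file's pages together.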
API reference
For detailed documentation of all classes and configurations, see the langchain-litellm API reference.
| Class | Description |
|---|---|
| ChatLiteLLM | LangChain chat model wrapper for LiteLLM |
| ChatLiteLLMRouter | Router-backed chat model for load balancing and fallbacks |
| LiteLLMEmbeddings | Embed text across 100+ providers with a single consistent interface |
| LiteLLMEmbeddingsRouter | Router-backed embeddings for load balancing across deployments |
| LiteLLMOCRLoader | Document loader via LiteLLM proxy’s OCR endpoint |