Log retriever traces

Many LLM applications retrieve documents from vector databases, knowledge graphs, or other indexes as part of a retrieval-augmented generation (RAG) pipeline. LangSmith provides dedicated rendering for retriever steps, which makes it easier to inspect retrieved documents and diagnose retrieval issues.

To enable retriever-specific rendering, complete the following two steps. Both are optional: if you skip them, your retriever data is still logged, but LangSmith does not render it with retriever-specific formatting.

Set `run_type` to retriever

Pass `run_type="retriever"` to the `traceable` decorator (Python), or set `run_type: "retriever"` in the options object of the `traceable` wrapper (TypeScript). This tells LangSmith to treat the step as a retrieval run and apply retriever-specific rendering in the LangSmith UI. In Python:

from langsmith import traceable

@traceable(run_type="retriever")
def retrieve_docs(query):
    ...

If you are using the `RunTree` API instead of `traceable`, pass `run_type="retriever"` when creating the `RunTree` object.

Return documents in the expected format

Return a list of dictionaries (Python) or objects (TypeScript) from your retriever function. Each item in the list represents a retrieved document and must contain the following fields:

| Field | Type | Description |
| --- | --- | --- |
| `page_content` | string | The text content of the retrieved document. |
| `type` | string | Must always be `"Document"`. |
| `metadata` | object | Key-value pairs with metadata about the document, such as source URL, chunk ID, or score. This metadata is displayed alongside the document in the trace. |
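
The field requirements above can be checked before a retriever returns its results. The following is a minimal sketch, not part of the LangSmith SDK: `validate_retriever_docs` is a hypothetical helper name, and it checks only the three fields listed in the table.

```python
from typing import Any, List


def validate_retriever_docs(docs: Any) -> List[str]:
    """Return a list of problems; an empty list means the shape looks right.

    Hypothetical helper for illustration only (not a LangSmith API).
    """
    if not isinstance(docs, list):
        return ["return value must be a list of documents"]

    problems: List[str] = []
    for i, doc in enumerate(docs):
        if not isinstance(doc, dict):
            problems.append(f"item {i}: expected a dict, got {type(doc).__name__}")
            continue
        # page_content holds the document text and must be a string.
        if not isinstance(doc.get("page_content"), str):
            problems.append(f"item {i}: 'page_content' must be a string")
        # type must be the literal string "Document".
        if doc.get("type") != "Document":
            problems.append(f"item {i}: 'type' must be \"Document\"")
        # metadata must be a dict of key-value pairs (it may be empty here).
        if not isinstance(doc.get("metadata"), dict):
            problems.append(f"item {i}: 'metadata' must be a dict")
    return problems
```

Calling this helper on the return value during development, for example in a unit test, is one way to catch a missing `type` field before the trace is rendered without retriever formatting.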

The following examples show a complete retriever implementation with both requirements applied, first in Python and then in TypeScript:

from langsmith import traceable

def _convert_docs(results):
    return [
        {
            "page_content": r,
            "type": "Document",
            "metadata": {"foo": "bar"}
        }
        for r in results
    ]

@traceable(run_type="retriever")
def retrieve_docs(query):
    # Returning hardcoded placeholder documents.
    # In production, replace with a real vector database or document index.
    contents = ["Document contents 1", "Document contents 2", "Document contents 3"]
    return _convert_docs(contents)

retrieve_docs("User query")

import { traceable } from "langsmith/traceable";

interface Document {
    page_content: string;
    type: string;
    metadata: { foo: string };
}

function convertDocs(results: string[]): Document[] {
    return results.map((r) => ({
        page_content: r,
        type: "Document",
        metadata: { foo: "bar" }
    }));
}

const retrieveDocs = traceable((query: string): Document[] => {
    // Returning hardcoded placeholder documents.
    // In production, replace with a real vector database or document index.
    const contents = ["Document contents 1", "Document contents 2", "Document contents 3"];
    return convertDocs(contents);
}, {
    name: "retrieveDocs",
    run_type: "retriever"
});

await retrieveDocs("User query");

In the LangSmith UI, you’ll find each retrieved document with its contents and metadata.
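
Because the metadata is displayed next to each document, it can help to populate it with fields such as the source, chunk ID, and score rather than placeholder values. The sketch below assumes a hypothetical `SearchHit` record standing in for whatever your vector database or index returns; `hits_to_documents` is likewise an illustrative name, not a LangSmith function.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class SearchHit:
    """Hypothetical shape of one result from a vector store or index."""
    text: str
    source: str
    chunk_id: int
    score: float


def hits_to_documents(hits: List[SearchHit]) -> List[Dict[str, Any]]:
    """Convert raw search hits into the document format expected for retriever runs."""
    return [
        {
            "page_content": hit.text,
            "type": "Document",
            "metadata": {
                "source": hit.source,
                "chunk_id": hit.chunk_id,
                "score": hit.score,
            },
        }
        for hit in hits
    ]


# Placeholder hits; in practice these would come from your index.
hits = [
    SearchHit("LangSmith traces retriever runs.", "https://example.com/a", 0, 0.92),
    SearchHit("Documents carry metadata.", "https://example.com/b", 3, 0.87),
]
documents = hits_to_documents(hits)
```

Returning `hits_to_documents(hits)` from a function decorated with `@traceable(run_type="retriever")` would then satisfy both steps described on this page.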

Related

Annotate code for tracing: Overview of all tracing methods, including `traceable`, `RunTree`, and the REST API.
Log LLM calls: Similar custom logging requirements for LLM steps.
