LangSmith Cloud changelog - Docs by LangChain

Weekly updates to LangSmith Cloud and LangSmith Fleet.

Subscribe: This changelog includes an RSS feed that can integrate with Slack, email, Discord bots like Readybot or RSS Feeds to Discord Bot, and other subscription tools.

If you use self-hosted LangSmith, see the self-hosted changelog for updates.

LangSmith Cloud
LangSmith Fleet

June 15-19, 2026

Observability and evaluations

Automations

Automations now let you control trace retention per action, so traces matched by a rule can stay at base retention instead of being upgraded.

Engine

The Engine issue board now shows a Connect GitHub action when GitHub is not connected, so you can set up pull request creation without leaving the board.
Engine now has a unified enablement screen with access requests, and organization settings consolidate Engine usage and limits in one place.
Organization admins now receive Engine spend emails when spend crosses each configured threshold, and pausing or disabling Engine now asks for confirmation.

Datasets and experiments

Experiments now show live loading progress in the header and the Progress column, so you can track completed and evaluated runs in real time.
Evaluators now include a trace-retention toggle in the advanced options, so scored traces can stay at base retention when that fits your workflow.
Evaluator prompt editing now offers an advanced mode for editing Mustache templates directly with separate variable mappings.
You can now apply resource tags when creating a dataset, including from scratch, file upload, or a clone.
Auto-attached Assertions evaluators now read assertions from the reference output, so experiment scores reflect actual pass and fail results.

Prompts and playground

OAuth client credentials now support per-workspace setup on model configurations, so workspace admins can self-serve OAuth on saved prompts and models.
The Playground now exposes a Reasoning Summary option for OpenAI reasoning models on the Responses API.
The model dropdown no longer suggests OpenAI models for an OpenAI Compatible Endpoint, so you can enter your own custom model name.

Tracing

Trace query syntax now has a full operator reference, field table, and quick examples, so API filtering is easier to discover.
The OpenTelemetry guide now explains how to link spans to an existing LangSmith SDK trace and what happens when a parent span never arrives, so cross-process traces are easier to debug.

Monitoring and alerting

Dashboards now include a chart builder with chart templates, a create and edit pane, and brush and series controls on time series charts.
You can now send alerts to Slack as a native notification target and connect or disconnect the Slack app from the UI.

Deployment

Preview deployments now build the image for the preview commit instead of reusing the parent deployment’s image.

Sandboxes

Sandbox auth proxy now documents GCP rules and service-account handling, so Google API access through the proxy is clearer.
The Sandboxes region table now marks AWS US SaaS as generally available, reflecting the current rollout.
Sandboxes now support Git mounts and Google Cloud Storage bucket mounts.

Admin and billing

Administration

Organization settings now clarify that SSO/SCIM group names can omit spaces, so enterprise IdPs that disallow spaces still work cleanly.
The Vanta MCP integration is now generally available to all workspaces.
Applying tags when creating datasets, prompts, and projects is now governed by dedicated tag-on-create permissions.

LLM Gateway

The LLM gateway now supports native Gemini routes for Vertex AI and the OpenAI embeddings endpoint.
Gateway guard policies now accept a granular PII configuration and a configurable timeout action.

Usage and billing

Granular billable usage now clarifies org scoping, so you can interpret usage totals more accurately.

June 8-12, 2026

Observability and evaluations

Engine

Engine now shows only project-level spend in project view, so org-wide spend stays in the org settings surface.
Engine now keeps the Slack issue-alert deck pinned above the scrolling issues list, so the callout stays visible as you browse.

Datasets and experiments

The experiments table now displays loading progress bars showing the number of runs completed and evaluated, and experiments that predate this feature show a placeholder progress bar.

Dashboards now support time series bar and line charts backed by the v2 chart API, so monitored metrics can use the newer chart type.
Categorical feedback now shows derived percentages in experiment tables, so pass/fail metrics are easier to scan.
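
As a rough illustration of what a derived percentage means for categorical feedback, the sketch below computes each category's share of all labels. The function name `category_percentages` and the sample labels are hypothetical; this is not LangSmith's implementation, only the arithmetic behind a pass/fail percentage.

```python
from collections import Counter


def category_percentages(labels):
    """Map each categorical feedback value to its share of all labels, in percent."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No feedback recorded yet, so there is nothing to derive.
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical feedback values for four runs in one experiment.
print(category_percentages(["pass", "pass", "fail", "pass"]))
# prints {'pass': 75.0, 'fail': 25.0}
```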

Prompts and playground

The Playground now mints OAuth bearer tokens end to end for OAuth-enabled presets, so long-running batches and streams keep working.

Sandboxes

Sandbox auth proxy now supports GCP auth flows, so sandbox workloads can reach Google APIs through the proxy.

Fixes

The Engine trial modal no longer shows the rough-math LCU bullet, so the pricing copy is less misleading.

June 1-5, 2026

Observability and evaluations

Automations

Run rule webhook payloads now include a trace deep link for each run, so downstream systems can jump straight back to the trace.

Engine

Per-workspace Engine spend is now generally available: you can view LCU and USD spend directly on the Engine settings page, including session-level spend.
The Engine settings page now surfaces additional Engine details in one place.
You can rotate Engine issue-board webhook signing secrets from both the API and the webhook settings UI.
The Engine issues list adds a sort option by trace count.

Datasets and experiments

A new out-of-the-box Assertions evaluator scores outputs against an explicit list of criteria specified in the reference output, and an Assertions rule is auto-attached when you add assertion-style examples to a dataset.
Evaluator metrics are improved in the experiment detail, comparison, and global experiments tables.

Prompts and playground

The Playground supports Amazon Bedrock API key authentication, letting you authenticate with a bearer token instead of AWS credentials.

Tracing

The trace view now shows an unread indicator on a run’s actions menu when the run has reviewer notes you have not seen yet.
The waterfall view is now full-height with sticky turn headers, so you keep your place while scrolling through long traces.
Global search now includes context and sandboxes.

Deployment

You can now trigger a LangSmith Deployment from the Studio page.
LangSmith Deployment now supports deploying Google Agent Development Kit (ADK) agents.

Sandboxes

Sandbox proxy rules now support configuring AWS authentication, so sandboxes can reach AWS services through the proxy with signed requests.
Sandboxes can create snapshots from a Dockerfile build source.

Admin and billing

Administration

Organization admins can now disable personal access token creation from the organization settings page.

Usage and billing

Granular billable usage now supports filtering and grouping by retention tier, separating long-lived from short-lived traces.
The Granular Billable Usage page now surfaces LangSmith Deployment usage, including nodes executed, agent runs, and agent uptime, alongside trace usage.

Fixes

Large traces now load faster.
Filter values for metadata are now preserved when you reopen a filter dropdown to edit it.
Dataset creation now uses a multi-select dropdown for choosing CSV fields.

February 16-20, 2026

Observability and evaluations

Insights

The Insights Agent now supports scheduled reports on daily, weekly, or custom cron intervals, so report generation runs without manual triggering. Time ranges compute dynamically, so a “last 24 hours” report always reflects the most recent window when it runs, not when you configured it.
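
The difference between a window fixed at configuration time and one computed at run time can be sketched as follows. The helper `report_window` and the dates are hypothetical and only illustrate the idea; they do not reflect how the Insights Agent is implemented.

```python
from datetime import datetime, timedelta, timezone


def report_window(run_time, lookback=timedelta(hours=24)):
    """Return (start, end) for a report, anchored to the moment the report runs."""
    return run_time - lookback, run_time


# The schedule was configured on Feb 16, but the report runs on Feb 18.
configured_at = datetime(2026, 2, 16, 9, 0, tzinfo=timezone.utc)
run_at = datetime(2026, 2, 18, 9, 0, tzinfo=timezone.utc)

start, end = report_window(run_at)
print(start.isoformat(), end.isoformat())
# prints 2026-02-17T09:00:00+00:00 2026-02-18T09:00:00+00:00
```

Because the window is derived from `run_at` rather than `configured_at`, each scheduled run covers the most recent 24 hours.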

Datasets and experiments

You can now pin any experiment as a baseline. The pinned experiment stays at the top of the Experiments view and serves as the automatic comparison point for later runs, surfacing performance deltas across every column so improvements and regressions are immediately clear.
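
A performance delta against a pinned baseline is simply a per-metric difference. The sketch below assumes each experiment is summarized as a dictionary of metric names to scores; the function name, metric names, and values are all hypothetical.

```python
def deltas_vs_baseline(baseline, candidate):
    """Per-metric difference (candidate minus baseline) for metrics present in both."""
    return {
        metric: candidate[metric] - baseline[metric]
        for metric in baseline
        if metric in candidate
    }


# Hypothetical summary scores for a pinned baseline and a later experiment.
baseline = {"correctness": 0.5, "helpfulness": 0.75}
candidate = {"correctness": 0.75, "helpfulness": 0.5}

print(deltas_vs_baseline(baseline, candidate))
# prints {'correctness': 0.25, 'helpfulness': -0.25}
```

A positive delta marks an improvement over the baseline and a negative delta marks a regression, assuming higher scores are better for that metric.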

February 2-6, 2026

Observability and evaluations

Cost tracking

Cost tracking now extends beyond LLM calls. Submit custom cost metadata for any run, such as an expensive tool call, a third-party API, or a retrieval step, to monitor, debug, and optimize spend across your entire agent stack from a single dashboard.

Tracing

You can now configure which parts of a trace’s inputs and outputs appear in the tracing table, so teams working with custom trace formats can surface the most relevant fields, reduce clutter, and identify traces that need a closer look faster.

December 15-19, 2025

Observability and evaluations

Annotation and human feedback

New pairwise annotation queues let reviewers compare two runs side by side and choose whether option A is better, option B is better, or the two are equal across rubric items. LangSmith automatically pairs runs between two experiments and manages queues, reviewer assignments, and trace access, so you can run A/B evaluations across agents, prompts, and models, including for subjective dimensions like tone, correctness, usefulness, or style.

December 8-12, 2025

Observability and evaluations

Tracing

LangSmith Fetch, a new command-line tool, brings LangSmith traces directly into your terminal, coding environment, or IDE. Install it with pip install langsmith-fetch, then retrieve traces with filters such as --limit, --after, and --last-n-minutes, or bulk-export traces and threads to files for analysis, scripting, or dataset creation.

December 1-5, 2025

Observability and evaluations

Cost tracking

Cost tracking now automatically records token usage and derived costs for major model providers, and you can submit custom cost data for tools, retrieval steps, and other operations. Costs appear across trace trees, project stats, and dashboards, with an editable price map for non-standard pricing.

November 17-21, 2025

Admin and billing

Administration

LangSmith is now on the Okta Integration Network, so enterprise teams can provision and deprovision users with SCIM and configure SSO through Okta’s guided setup. See the administration overview for access control options.

October 20-24, 2025

Observability and evaluations

Insights

The Insights Agent is now generally available for Plus and Enterprise plans. It analyzes production traces to surface usage patterns, agent behaviors, and failure modes, with usage-pattern clustering, poor-interaction analysis, and custom grouping and filtering.

Datasets and experiments

Multi-turn evals measure end-to-end agent conversations across multiple exchanges, scoring semantic intent, semantic outcomes, and agent trajectory, including tool calls and decisions.

October 13-17, 2025

Observability and evaluations

Datasets and experiments

Dataset creation now infers schema automatically from uploaded CSV and JSONL files, supports adding metadata fields during upload, supports column mapping and renaming, and supports bulk additions to existing datasets from new uploads.

Deployment

LangGraph Platform is now LangSmith Deployment and LangGraph Studio is now LangSmith Studio. LangSmith now spans three services: Observability, Evaluation, and Deployment. Existing deployments, APIs, workflows, pricing, and contracts are unchanged, and no action is required.

October 6-10, 2025

Observability and evaluations

Datasets and experiments

You can now write custom code evaluators in JavaScript in addition to Python, so TypeScript teams can stay in their ecosystem end to end.

September 22-26, 2025

Observability and evaluations

Datasets and experiments

Composite evaluators combine multiple evaluator scores into a single metric using a weighted average or weighted sum, with customizable weights.
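
The two aggregation modes differ only in whether the result is divided by the total weight. The sketch below shows that arithmetic under stated assumptions: the function `composite_score`, the evaluator names, and the weights are hypothetical and are not the LangSmith API.

```python
def composite_score(scores, weights, method="weighted_average"):
    """Combine evaluator scores into one metric.

    scores and weights are dicts keyed by evaluator name (hypothetical names).
    """
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"no score for evaluators: {sorted(missing)}")

    total = sum(scores[name] * weight for name, weight in weights.items())
    if method == "weighted_sum":
        return total
    if method == "weighted_average":
        weight_sum = sum(weights.values())
        if weight_sum == 0:
            raise ValueError("weights must not sum to zero")
        return total / weight_sum
    raise ValueError(f"unknown method: {method}")


scores = {"correctness": 1.0, "conciseness": 0.5}
weights = {"correctness": 3, "conciseness": 1}

print(composite_score(scores, weights, "weighted_sum"))      # prints 3.5
print(composite_score(scores, weights, "weighted_average"))  # prints 0.875
```

A weighted average stays on the same scale as the input scores, while a weighted sum grows with the weights, which matters when comparing composite metrics across experiments.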

September 8-12, 2025

Admin and billing

Administration

You can now create service keys at the organization level, scoped to multiple workspaces or the entire organization, and assign roles, including custom roles, for granular permissions.

August 25-29, 2025

Deployment

LangSmith Deployment now queues revisions automatically, processing each new revision only after the current one finishes to prevent overlapping deployments and conflicts.
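
The queueing behavior can be pictured as a first-in, first-out queue with at most one active revision. The class below is a minimal sketch of that idea; `RevisionQueue` and the revision IDs are hypothetical, and the real service may differ in how it orders or coalesces revisions.

```python
from collections import deque


class RevisionQueue:
    """Process revisions one at a time, in arrival order."""

    def __init__(self):
        self._pending = deque()
        self.current = None  # the revision being deployed, if any

    def submit(self, revision_id):
        """Start the revision immediately if idle; otherwise wait in line."""
        if self.current is None:
            self.current = revision_id
        else:
            self._pending.append(revision_id)

    def finish_current(self):
        """Mark the active revision done and promote the next one, if any."""
        finished = self.current
        self.current = self._pending.popleft() if self._pending else None
        return finished


queue = RevisionQueue()
queue.submit("rev-1")
queue.submit("rev-2")   # queued, because rev-1 is still deploying
print(queue.current)    # prints rev-1
queue.finish_current()
print(queue.current)    # prints rev-2
```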

August 11-15, 2025

Deployment

Studio now includes Trace Mode, which shows your LangSmith traces directly in Studio and supports annotating runs and adding them to datasets for evaluation.

July 28 - August 1, 2025

Observability and evaluations

Datasets and experiments

Align Evals provides a playground-like interface for iterating on evaluator prompts and comparing human-graded scores side by side with LLM-generated scores to surface misaligned cases.

Deployment

LangSmith now links traces to the server logs in LangSmith Deployment, so you can open user and system logs directly from a trace.

July 21-25, 2025

Observability and evaluations

Tracing

Data export now supports scheduled exports of traces, so external systems such as data warehouses, monitoring platforms, and dashboards stay in sync without custom infrastructure.

July 7-11, 2025

Deployment

A new Monitoring tab shows deployment metrics, including CPU and memory usage, API request latency, and active run counts, over a customizable time range.

June 30 - July 4, 2025

Observability and evaluations

Datasets and experiments

You can now create custom views of evaluation results by breaking fields from inputs, outputs, and reference outputs into their own columns, hiding or reordering columns, and adjusting decimal precision on feedback scores.

Admin and billing

Administration

LangSmith API keys now support expiration dates, so you can scope access for temporary tasks or team members.

June 16-20, 2025

Observability and evaluations

Prompts and playground

The Playground now supports calling built-in tools from OpenAI and Anthropic, such as web search and MCP, so you can verify tool selection and argument passing.

Deployment

Studio now lets you run agent evaluations in the UI without code, comparing against reference outputs and grading responses with custom criteria.

June 2-6, 2025

Observability and evaluations

Cost tracking

Cost tracking now accounts for cached tokens, multiple token modalities such as text and image, and reasoning tokens, and supports tracking costs for arbitrary token types.
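
Tracking costs for arbitrary token types amounts to pricing each type separately and summing. The sketch below assumes prices are expressed per million tokens; the function `run_cost`, the token type names, and every price shown are hypothetical placeholders rather than real provider pricing.

```python
def run_cost(token_counts, price_per_million):
    """Sum the cost of a run across token types.

    Raises KeyError if a token type has no configured price.
    """
    return sum(
        count * price_per_million[token_type] / 1_000_000
        for token_type, count in token_counts.items()
    )


# Hypothetical USD prices per million tokens, keyed by token type.
prices = {"input": 2.0, "cached_input": 0.5, "output": 8.0, "reasoning": 8.0}

# Hypothetical token usage for one run.
usage = {"input": 1_000_000, "cached_input": 2_000_000, "output": 500_000}

print(run_cost(usage, prices))  # prints 7.0
```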

May 26-30, 2025

Observability and evaluations

Prompts and playground

Prompts now support webhook triggers that sync a prompt to external systems such as GitHub, databases, or CI/CD pipelines when it is updated.

May 19-23, 2025

Deployment

Every agent deployed on LangSmith now exposes its own Model Context Protocol (MCP) endpoint, so the agent can be used as a tool in any client that supports streamable HTTP for MCP, with no custom code or infrastructure.

Admin and billing

Usage and billing

SaaS customers can now view monthly usage charts that track all billable metrics in one place.

May 12-16, 2025

Observability and evaluations

Monitoring and alerting

Agent observability surfaces tool calls and run stats, including the most-used tools and runs, their latency, and which ones generate the most errors.

Deployment

LangGraph Platform, now LangSmith Deployment, reached general availability for deploying and managing long-running, stateful agents at scale, with one-click GitHub-to-production deployment, integrated memory and persistence, scalable APIs, and an agent registry across cloud, hybrid, self-hosted, and developer deployment options.
Studio v2 runs locally without the desktop app, supports editing prompts and configuration in the UI, integrates with the Playground, and lets you download production traces to debug them locally.

May 5-9, 2025

Observability and evaluations

Tracing

LangSmith now supports multimodal content for images, PDFs, and audio across the playground, annotation queues, and datasets, including attaching files to dataset examples without base64 encoding and visualizing the content in the app.

April 21-25, 2025

Observability and evaluations

Monitoring and alerting

Alerts send real-time notifications on error rates, run latency, and feedback scores, so you can catch production failures proactively.

March 31 - April 4, 2025

Observability and evaluations

Prompts and playground

The Playground now lets you create datasets inline and add examples to existing datasets without leaving the Playground.

March 24-28, 2025

Observability and evaluations

Tracing

LangSmith now has end-to-end native OpenTelemetry support for LangChain and LangGraph applications, including distributed tracing across microservices.

Datasets and experiments

You can now define evaluators for datasets and tracing projects directly in the UI with no code, including LLM-as-a-judge evaluators with prebuilt templates, customizable prompts, variable mapping, scoring, and few-shot support.

March 17-21, 2025

Deployment

Studio now lets you view and edit node logic in the UI by tagging configuration fields with langgraph_nodes, edit prompts without code changes, and sync Playground experiments back to the graph.

March 10-14, 2025

Observability and evaluations

Tracing

LangSmith now supports tracing OpenAI Agents SDK applications with two lines of code, for step-by-step observability of agent execution and reasoning.

Datasets and experiments

You can now rename an experiment in the UI, either from the Playground table header after a run or with the pencil icon in the Experiments view.

February 24-28, 2025

Observability and evaluations

Datasets and experiments

You can now group experiment results by metadata to analyze evaluation performance across segments such as user groups or subject areas.
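
Grouping by metadata is conceptually a group-by followed by an aggregate. The sketch below assumes each result row carries a `metadata` dict and a numeric `score`; that row shape, the function name, and the sample values are hypothetical and do not mirror the LangSmith data model.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_metadata(results, key):
    """Average score per value of one metadata key; rows missing the key go to 'unknown'."""
    groups = defaultdict(list)
    for row in results:
        groups[row["metadata"].get(key, "unknown")].append(row["score"])
    return {group: mean(scores) for group, scores in groups.items()}


# Hypothetical experiment rows tagged with a subject area.
results = [
    {"metadata": {"subject": "billing"}, "score": 1.0},
    {"metadata": {"subject": "billing"}, "score": 0.0},
    {"metadata": {"subject": "onboarding"}, "score": 1.0},
]

print(mean_score_by_metadata(results, "subject"))
# prints {'billing': 0.5, 'onboarding': 1.0}
```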

Fixes

A new ingest-backend service separates trace ingestion from frontend request handling, improving average request processing and high-traffic response times.

February 17-21, 2025

Observability and evaluations

Prompts and playground

The Playground can now use workspace secrets saved in LangSmith, for consistent credential management across environments.

February 3-7, 2025

Observability and evaluations

Datasets and experiments

A new experiment view gives each feedback key its own column and adds filtering, sorting, and a heat map to spot patterns and performance areas.

Deployment

You can now open LLM runs from Studio in the LangSmith Playground for debugging, visualization, and prompt experimentation within threads.

January 27-31, 2025

Observability and evaluations

Tracing

Traces now include a waterfall graph that highlights latency bottlenecks and shows which components run in parallel versus sequentially.

January 20-24, 2025

Observability and evaluations

Prompts and playground

The Playground adds a streamlined prompt settings UI, a default model configuration, an enhanced tool management modal, and improved side-by-side comparison.

Datasets and experiments

New Pytest and Vitest integrations let you run evaluations using familiar testing frameworks, with debugging, metrics tracking, and built-in evaluation functions.

LangSmith Fleet

June 15-19, 2026

New features

Fleet tools now include Salesforce OAuth provider setup for self-hosted users, so you can configure the provider end to end.
Agent sharing is redesigned around two choices, who can use and who can edit an agent, plus a Publish as template option that lets others fork their own editable copy.
Fleet agents now post a notification to the originating thread, such as Slack, when they pause at a human-in-the-loop interrupt, with a link back to the agent chat.
You can now complete Fleet integration OAuth through your own callback URL, so headless setups can finish authentication without the LangSmith UI.
Agent cards now show the agent owner.
New first-party templates, Brand Copywriter and Applicant Screening, are available in the gallery.

Fixes

Switching threads in the agent chat now clears the previous thread immediately and shows a loading state instead of stale messages.
The skills list now degrades gracefully when one skill fails to load, so the remaining skills still appear.

June 8-12, 2026

New features

Templates now show “by Fleet” with the Fleet logo, so curated templates match Fleet branding.

Fixes

The Fleet list-threads endpoint now returns items instead of threads, so the response shape matches the rest of the API.
Fleet thread requests now return a clearer error when a large response would have triggered a 5xx, so long lists fail gracefully.
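
Client code that read the old `threads` key may need a small adjustment after the rename. The helper below is a sketch that accepts either key; only the `items` and `threads` key names come from the note above, while the function name and the rest of the response shape are assumptions.

```python
def extract_threads(payload):
    """Return the thread list from a list-threads response body.

    Prefers the current `items` key and falls back to the older `threads` key.
    """
    if "items" in payload:
        return payload["items"]
    return payload.get("threads", [])


# Hypothetical response bodies; the fields inside each thread are assumed.
print(extract_threads({"items": [{"id": "t1"}]}))    # prints [{'id': 't1'}]
print(extract_threads({"threads": [{"id": "t0"}]}))  # prints [{'id': 't0'}]
```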

June 1-5, 2026

New features

Skills load faster: the skills list fetches lightweight metadata first and loads file contents only when you open a skill.
The agent creation menu adds a Templates entry.
The remote MCP authorization screen now shows the connecting application’s name and logo, along with links to its homepage, terms, and privacy policy, instead of its raw client ID.
The Slack integration is now available in AWS and APAC regions.

Fixes

Scheduled (cron) execution is restored for enterprise Fleet agents.
Long-running agent runs and agent-builder generations are no longer cut off after 60 seconds.
The Gmail read-emails tool now returns results when you search sent mail with an in:sent query.
Scrolling is improved for long toolbox, skill, and sub-agent lists in the agent editor, and webhook dialogs now scroll within the viewport.

March 16-20, 2026

New features

Agent Builder is now LangSmith Fleet. The new name reflects Fleet’s focus on building and managing agents for your whole team: creating them, sharing them, managing their tasks, and controlling agent access and identity. All existing agents, configurations, integrations, plans, and contracts continue to work unchanged, with no action required on your end.

February 16-20, 2026

New features

A central Chat agent connects to all of your workspace tools, including Slack, Gmail, Linear, and MCP servers, so you can ask questions and take actions without setting up a dedicated agent first.
Turn a useful conversation into a recurring agent with one click, with no prompt engineering or conditional logic required.
Upload files directly into chat, including CSVs, images, documents, and style guides, for the agent to act on immediately.
A central tool registry lets workspace admins connect tools, manage authentication, and control access across the organization.

October 27-31, 2025

New features

LangSmith Agent Builder launched in private preview as a no-code way for non-developers to build agents, with conversational setup, built-in memory, MCP integrations, automated triggers, and subagent support. Agent Builder later became LangSmith Fleet.
