# Documentation
Source: https://docs.langchain.com/index



<div class="mx-auto max-w-8xl px-0 lg:px-5" style={{ paddingBottom: "8rem" }}>
  <div class="mdx-content prose prose-gray dark:prose-invert mx-4 pt-10">
    <h1 class="flex whitespace-pre-wrap group font-semibold text-2xl sm:text-3xl mt-8">Documentation</h1>

    LangChain is the platform for agent engineering. AI teams at Replit, Clay, Rippling, Cloudflare, Workday, and more trust LangChain's products to engineer reliable agents.

    <h2 class="flex whitespace-pre-wrap group font-semibold">Open source agent frameworks</h2>

    <Tabs>
      <Tab title="Python" icon="python">
        <CardGroup cols={3}>
          <Card title="LangChain (Python)" href="/oss/python/langchain/overview" icon="link" cta="Learn more">
            Quickly get started building agents, with any model provider of your choice.
          </Card>

          <Card title="LangGraph (Python)" href="/oss/python/langgraph/overview" icon="circle-nodes" cta="Learn more">
            Control every step of your custom agent with low-level orchestration, memory, and human-in-the-loop support.
          </Card>

          <Card title="Deep Agents (Python)" href="/oss/python/deepagents/overview" icon="robot" cta="Learn more">
            Build agents that can tackle complex, multi-step tasks.
          </Card>
        </CardGroup>
      </Tab>

      <Tab title="TypeScript" icon="js">
        <CardGroup cols={3}>
          <Card title="LangChain (TypeScript)" href="/oss/javascript/langchain/overview" icon="link" cta="Learn more">
            Quickly get started building agents, with any model provider of your choice.
          </Card>

          <Card title="LangGraph (TypeScript)" href="/oss/javascript/langgraph/overview" icon="circle-nodes" cta="Learn more">
            Control every step of your custom agent with low-level orchestration, memory, and human-in-the-loop support.
          </Card>

          <Card title="Deep Agents (TypeScript)" href="/oss/javascript/deepagents/overview" icon="robot" cta="Learn more">
            Build agents that can tackle complex, multi-step tasks.
          </Card>
        </CardGroup>
      </Tab>
    </Tabs>

    <h2 class="flex whitespace-pre-wrap group font-semibold">LangSmith</h2>

    [**LangSmith**](/langsmith/home) is a platform that helps AI teams use live production data for continuous testing and improvement. LangSmith provides:

    <CardGroup cols={4}>
      <Card title="Observability" href="/langsmith/observability" icon="magnifying-glass" cta="Learn more">
        See exactly how your agent thinks and acts with detailed tracing and aggregate trend metrics.
      </Card>

      <Card title="Evaluation" href="/langsmith/evaluation" icon="chart-simple" cta="Learn more">
        Test and score agent behavior on production data or offline datasets to continuously improve performance.
      </Card>

      <Card title="Prompt Engineering" href="/langsmith/prompt-engineering" icon="terminal" cta="Learn more">
        Iterate on prompts with version control, prompt optimization, and collaboration features.
      </Card>

      <Card title="Deployment" href="/langsmith/deployments" icon="rocket-launch" cta="Learn more">
        Ship your agent in one click, using scalable infrastructure built for long-running tasks.
      </Card>
    </CardGroup>

    <Callout icon="lock" color="#DFC5FE" iconType="regular">
      LangSmith meets the highest standards of data security and privacy with HIPAA, SOC 2 Type 2, and GDPR compliance. For more information, see the [Trust Center](https://trust.langchain.com/).
    </Callout>

    <h2 class="flex whitespace-pre-wrap group font-semibold">Get started</h2>

    <CardGroup cols={4}>
      <Card title="Build your first agent with LangChain" icon="gear" href="/oss/python/langchain/quickstart" cta="Get started" />

      <Card title="Sign up for LangSmith" icon="screwdriver-wrench" href="https://smith.langchain.com/" cta="Try LangSmith" />

      <Card title="Build an advanced agent with LangGraph" icon="robot" href="/oss/python/langgraph/quickstart" cta="Get started" />

      <Card title="Enroll in LangChain Academy" icon="graduation-cap" href="https://academy.langchain.com/" cta="Get started" />
    </CardGroup>
  </div>
</div>


***


<Tip icon="terminal" iconType="regular">
  [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.
</Tip>


# Connect an authentication provider
Source: https://docs.langchain.com/langsmith/add-auth-server



In [the last tutorial](/langsmith/resource-auth), you added resource authorization to give users private conversations. However, you are still using hard-coded tokens for authentication, which is not secure. Now you'll replace those tokens with real user accounts using [OAuth2](/langsmith/deployment-quickstart).

You'll keep the same [`Auth`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) object and [resource-level access control](/langsmith/auth#single-owner-resources), but upgrade authentication to use Supabase as your identity provider. While Supabase is used in this tutorial, the concepts apply to any OAuth2 provider. You'll learn how to:

1. Replace test tokens with real JWT tokens
2. Integrate with OAuth2 providers for secure user authentication
3. Handle user sessions and metadata while maintaining your existing authorization logic

## Background

OAuth2 involves three main roles:

1. **Authorization server**: The identity provider (e.g., Supabase, Auth0, Google) that handles user authentication and issues tokens
2. **Application backend**: Your LangGraph application. This validates tokens and serves protected resources (conversation data)
3. **Client application**: The web or mobile app where users interact with your service

A standard OAuth2 flow works something like this:

```mermaid  theme={null}
sequenceDiagram
    participant User
    participant Client
    participant AuthServer
    participant LangGraph Backend

    User->>Client: Initiate login
    User->>AuthServer: Enter credentials
    AuthServer->>Client: Send tokens
    Client->>LangGraph Backend: Request with token
    LangGraph Backend->>AuthServer: Validate token
    AuthServer->>LangGraph Backend: Token valid
    LangGraph Backend->>Client: Serve request (e.g., run agent or graph)
```
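To make the roles concrete, here is a toy, in-memory sketch of the flow in the diagram. The class names (`ToyAuthServer`, `ToyBackend`) are hypothetical, the tokens are opaque random strings rather than JWTs, and passwords are stored in plain text only to keep the example short. It is an illustration of who talks to whom, not how Supabase or the Agent Server are implemented.

```python
import secrets


class ToyAuthServer:
    """Stands in for the identity provider (Supabase, Auth0, ...)."""

    def __init__(self):
        self._passwords = {}  # email -> password (plain text: toy only)
        self._tokens = {}     # token -> email

    def sign_up(self, email, password):
        self._passwords[email] = password

    def log_in(self, email, password):
        """Check credentials and issue an opaque token."""
        if self._passwords.get(email) != password:
            raise PermissionError("invalid credentials")
        token = secrets.token_urlsafe(16)
        self._tokens[token] = email
        return token

    def validate(self, token):
        """Return the user behind a token, or None if it is unknown."""
        return self._tokens.get(token)


class ToyBackend:
    """Stands in for the application backend that serves protected resources."""

    def __init__(self, auth_server):
        self._auth_server = auth_server

    def handle(self, token):
        # The backend does not check passwords itself; it asks the auth server.
        user = self._auth_server.validate(token)
        if user is None:
            return 401, "invalid token"
        return 200, f"hello {user}"


auth_server = ToyAuthServer()
backend = ToyBackend(auth_server)
auth_server.sign_up("ada@example.com", "s3cret")

token = auth_server.log_in("ada@example.com", "s3cret")  # the client obtains a token
print(backend.handle(token))           # (200, 'hello ada@example.com')
print(backend.handle("forged-token"))  # (401, 'invalid token')
```

The point to notice is that the backend never sees the user's password. It only receives a token and delegates the decision about its validity to the authorization server.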

## Prerequisites

Before you start this tutorial, ensure you have:

* The [bot from the second tutorial](/langsmith/resource-auth) running without errors.
* A [Supabase project](https://supabase.com/dashboard) to use its authentication server.

## 1. Install dependencies

Install the required dependencies. Start in your `custom-auth` directory and ensure you have the `langgraph-cli` installed:

<CodeGroup>
  ```bash pip theme={null}
  cd custom-auth
  pip install -U "langgraph-cli[inmem]"
  ```

  ```bash uv theme={null}
  cd custom-auth
  uv add "langgraph-cli[inmem]"
  ```
</CodeGroup>

<a id="setup-auth-provider" />

## 2. Set up the authentication provider

Next, fetch the URL of your auth server and the private key for authentication.
Since you're using Supabase for this, you can do this in the Supabase dashboard:

1. In the left sidebar, click "⚙ Project Settings" and then click "API".
2. Copy your project URL and add it to your `.env` file:

```shell  theme={null}
echo "SUPABASE_URL=your-project-url" >> .env
```

3. Copy your service role secret key and add it to your `.env` file:

```shell  theme={null}
echo "SUPABASE_SERVICE_KEY=your-service-role-key" >> .env
```

4. Copy your "anon public" key and note it down. You will use it later when you set up the client code.

Your `.env` file should now contain:

```bash  theme={null}
SUPABASE_URL=your-project-url
SUPABASE_SERVICE_KEY=your-service-role-key
```
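Because the handler in the next step reads both values with `os.environ[...]`, a missing entry only shows up as a `KeyError` at import time. As a minimal sketch, you could check for the two settings up front. The helper name `missing_settings` is hypothetical and not part of any LangChain or Supabase library.

```python
import os

# The two settings written to `.env` in the steps above.
REQUIRED_SETTINGS = ("SUPABASE_URL", "SUPABASE_SERVICE_KEY")


def missing_settings(env=os.environ, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment:
example_env = {"SUPABASE_URL": "https://your-project.supabase.co"}
print(missing_settings(example_env))  # ['SUPABASE_SERVICE_KEY']
```

Calling `missing_settings()` with no arguments checks the real process environment, which is only useful once the `.env` values have been loaded into it.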

## 3. Implement token validation

In the previous tutorials, you used the [`Auth`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) object to [validate hard-coded tokens](/langsmith/set-up-custom-auth) and [add resource ownership](/langsmith/resource-auth).

Now you'll upgrade your authentication to validate real JWT tokens from Supabase. The main changes will all be in the [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) decorated function:

* Instead of checking against a hard-coded list of tokens, you'll make an HTTP request to Supabase to validate the token.
* You'll extract real user information (ID, email) from the validated token.
* The existing resource authorization logic remains unchanged.

Update `src/security/auth.py` to implement this:

```python {highlight={8-9,20-30}} title="src/security/auth.py" theme={null}
import os
import httpx
from langgraph_sdk import Auth

auth = Auth()

# This is loaded from the `.env` file you created above
SUPABASE_URL = os.environ["SUPABASE_URL"]
SUPABASE_SERVICE_KEY = os.environ["SUPABASE_SERVICE_KEY"]


@auth.authenticate
async def get_current_user(authorization: str | None):
    """Validate JWT tokens and extract user information."""
    assert authorization
    scheme, token = authorization.split()
    assert scheme.lower() == "bearer"

    try:
        # Verify token with auth provider
        async with httpx.AsyncClient() as client:
            response = await client.get(
                f"{SUPABASE_URL}/auth/v1/user",
                headers={
                    "Authorization": authorization,
                    "apiKey": SUPABASE_SERVICE_KEY,
                },
            )
            assert response.status_code == 200
            user = response.json()
            return {
                "identity": user["id"],  # Unique user identifier
                "email": user["email"],
                "is_authenticated": True,
            }
    except Exception as e:
        raise Auth.exceptions.HTTPException(status_code=401, detail=str(e))

# ... the rest is the same as before

# Keep our resource authorization from the previous tutorial
@auth.on
async def add_owner(ctx, value):
    """Make resources private to their creator using resource metadata."""
    filters = {"owner": ctx.user.identity}
    metadata = value.setdefault("metadata", {})
    metadata.update(filters)
    return filters
```

The most important change is that you're now validating tokens with a real authentication server. The authentication handler holds the service role key for your Supabase project and uses it to validate the user's token and extract their information.
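The handler checks the `Authorization` header with `assert` statements that sit outside the `try` block. Python strips `assert` when run with the `-O` flag, and a failed assertion there is not converted into the explicit 401 raised in the `except` branch. A minimal sketch of a stricter parser follows; the helper name `parse_bearer_token` is hypothetical and not part of `langgraph_sdk`.

```python
def parse_bearer_token(authorization):
    """Return the token from a 'Bearer <token>' header value.

    Raises ValueError with a readable message when the header is missing
    or malformed.
    """
    if not authorization:
        raise ValueError("Missing Authorization header")
    parts = authorization.split()
    if len(parts) != 2:
        raise ValueError("Authorization header must look like 'Bearer <token>'")
    scheme, token = parts
    if scheme.lower() != "bearer":
        raise ValueError(f"Unsupported authorization scheme: {scheme}")
    return token


print(parse_bearer_token("Bearer abc.def.ghi"))  # abc.def.ghi
```

Inside `get_current_user`, you could call this helper within the `try` block so that a `ValueError` is turned into `Auth.exceptions.HTTPException(status_code=401, ...)` like any other validation failure.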

## 4. Test authentication flow

Let's test out the new authentication flow. You can run the following code in a file or notebook. You will need to provide:

* A valid email address
* A Supabase project URL (from [above](#setup-auth-provider))
* A Supabase anon **public key** (also from [above](#setup-auth-provider))

```python  theme={null}
import os
import httpx
from getpass import getpass
from langgraph_sdk import get_client


# Get email from command line
email = getpass("Enter your email: ")
base_email = email.split("@")
password = "secure-password"  # CHANGEME
email1 = f"{base_email[0]}+1@{base_email[1]}"
email2 = f"{base_email[0]}+2@{base_email[1]}"

SUPABASE_URL = os.environ.get("SUPABASE_URL")
if not SUPABASE_URL:
    SUPABASE_URL = getpass("Enter your Supabase project URL: ")

# This is your PUBLIC anon key (which is safe to use client-side)
# Do NOT mistake this for the secret service role key
SUPABASE_ANON_KEY = os.environ.get("SUPABASE_ANON_KEY")
if not SUPABASE_ANON_KEY:
    SUPABASE_ANON_KEY = getpass("Enter your public Supabase anon key: ")


async def sign_up(email: str, password: str):
    """Create a new user account."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{SUPABASE_URL}/auth/v1/signup",
            json={"email": email, "password": password},
            headers={"apiKey": SUPABASE_ANON_KEY},
        )
        assert response.status_code == 200
        return response.json()

# Create two test users
print(f"Creating test users: {email1} and {email2}")
await sign_up(email1, password)
await sign_up(email2, password)
```

⚠️ Before continuing, check your email and click both confirmation links. Supabase rejects login requests until you have confirmed each user's email address.

Now test that users can only see their own data. Make sure the server is running (run `langgraph dev`) before proceeding. The following snippet requires the "anon public" key that you copied from the Supabase dashboard while [setting up the auth provider](#setup-auth-provider) previously.

```python  theme={null}
async def login(email: str, password: str):
    """Get an access token for an existing user."""
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{SUPABASE_URL}/auth/v1/token?grant_type=password",
            json={
                "email": email,
                "password": password
            },
            headers={
                "apikey": SUPABASE_ANON_KEY,
                "Content-Type": "application/json"
            },
        )
        assert response.status_code == 200
        return response.json()["access_token"]


# Log in as user 1
user1_token = await login(email1, password)
user1_client = get_client(
    url="http://localhost:2024", headers={"Authorization": f"Bearer {user1_token}"}
)

# Create a thread as user 1
thread = await user1_client.threads.create()
print(f"✅ User 1 created thread: {thread['thread_id']}")

# Try to access without a token
unauthenticated_client = get_client(url="http://localhost:2024")
try:
    await unauthenticated_client.threads.create()
    print("❌ Unauthenticated access should fail!")
except Exception as e:
    print("✅ Unauthenticated access blocked:", e)

# Try to access user 1's thread as user 2
user2_token = await login(email2, password)
user2_client = get_client(
    url="http://localhost:2024", headers={"Authorization": f"Bearer {user2_token}"}
)

try:
    await user2_client.threads.get(thread["thread_id"])
    print("❌ User 2 shouldn't see User 1's thread!")
except Exception as e:
    print("✅ User 2 blocked from User 1's thread:", e)
```

The output should look like this:

```shell  theme={null}
✅ User 1 created thread: d6af3754-95df-4176-aa10-dbd8dca40f1a
✅ Unauthenticated access blocked: Client error '403 Forbidden' for url 'http://localhost:2024/threads'
✅ User 2 blocked from User 1's thread: Client error '404 Not Found' for url 'http://localhost:2024/threads/d6af3754-95df-4176-aa10-dbd8dca40f1a'
```

Your authentication and authorization are working together:

1. Users must log in to access the bot
2. Each user can only see their own threads

All users are managed by the Supabase auth provider, so you don't need to implement any additional user management logic.
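The per-user isolation comes from the `add_owner` handler: it stamps the caller's identity into the resource metadata and returns the same dictionary as a filter. The sketch below reuses that handler logic without the decorator and pairs it with a toy `visible` function. `visible` is a hypothetical stand-in for the server-side filtering, assumed here to be an exact match on metadata keys, and the `SimpleNamespace` objects only mimic the shape of `ctx.user.identity`.

```python
from types import SimpleNamespace


def add_owner(ctx, value):
    """Same logic as the tutorial handler, without the @auth.on decorator."""
    filters = {"owner": ctx.user.identity}
    metadata = value.setdefault("metadata", {})
    metadata.update(filters)
    return filters


def visible(resource, filters):
    """Toy stand-in for server-side filtering: exact match on metadata."""
    metadata = resource.get("metadata", {})
    return all(metadata.get(key) == expected for key, expected in filters.items())


user1 = SimpleNamespace(user=SimpleNamespace(identity="user-1"))
user2 = SimpleNamespace(user=SimpleNamespace(identity="user-2"))

thread = {}               # payload for a new thread created by user 1
add_owner(user1, thread)  # stamps the owner into the thread metadata
print(thread)             # {'metadata': {'owner': 'user-1'}}

print(visible(thread, add_owner(user1, {})))  # True
print(visible(thread, add_owner(user2, {})))  # False
```

This mirrors the test output above: user 2's filter does not match the thread's metadata, so from user 2's point of view the thread does not exist.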

## Next steps

You've successfully built a production-ready authentication system for your LangGraph application! Let's review what you've accomplished:

1. Set up an authentication provider (Supabase in this case)
2. Added real user accounts with email/password authentication
3. Integrated JWT token validation into your Agent Server
4. Implemented proper authorization to ensure users can only access their own data
5. Created a foundation that's ready to handle your next authentication challenge 🚀

Now that you have production authentication, consider:

1. Build a web UI with your preferred framework (see the [Custom Auth](https://github.com/langchain-ai/custom-auth) template for an example).
2. Learn more about the other aspects of authentication and authorization in the [conceptual guide on authentication](/langsmith/auth).
3. Customize your handlers and setup further after reading the [reference docs](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth).



# Human-in-the-loop using server API
Source: https://docs.langchain.com/langsmith/add-human-in-the-loop



To review, edit, and approve tool calls in an agent or workflow, use LangGraph's [human-in-the-loop](/oss/python/langgraph/interrupts) features.

## Dynamic interrupts

<Tabs>
  <Tab title="Python">
    ```python {highlight={2,34}} theme={null}
    from langgraph_sdk import get_client
    from langgraph_sdk.schema import Command
    client = get_client(url=<DEPLOYMENT_URL>)

    # Using the graph deployed with the name "agent"
    assistant_id = "agent"

    # create a thread
    thread = await client.threads.create()
    thread_id = thread["thread_id"]

    # Run the graph until the interrupt is hit.
    result = await client.runs.wait(
        thread_id,
        assistant_id,
        input={"some_text": "original text"}   # (1)!
    )

    print(result['__interrupt__']) # (2)!
    # > [
    # >     {
    # >         'value': {'text_to_revise': 'original text'},
    # >         'resumable': True,
    # >         'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'],
    # >         'when': 'during'
    # >     }
    # > ]


    # Resume the graph
    print(await client.runs.wait(
        thread_id,
        assistant_id,
        command=Command(resume="Edited text")   # (3)!
    ))
    # > {'some_text': 'Edited text'}
    ```

    1. The graph is invoked with some initial state.
    2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata.
    3. The graph is resumed with a `Command(resume=...)`, injecting the human's input and continuing execution.
  </Tab>

  <Tab title="JavaScript">
    ```javascript {highlight={32}} theme={null}
    import { Client } from "@langchain/langgraph-sdk";
    const client = new Client({ apiUrl: <DEPLOYMENT_URL> });

    // Using the graph deployed with the name "agent"
    const assistantID = "agent";

    // create a thread
    const thread = await client.threads.create();
    const threadID = thread["thread_id"];

    // Run the graph until the interrupt is hit.
    const result = await client.runs.wait(
      threadID,
      assistantID,
      { input: { "some_text": "original text" } }   // (1)!
    );

    console.log(result['__interrupt__']); // (2)!
    // > [
    // >     {
    // >         'value': {'text_to_revise': 'original text'},
    // >         'resumable': true,
    // >         'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'],
    // >         'when': 'during'
    // >     }
    // > ]

    // Resume the graph
    console.log(await client.runs.wait(
        threadID,
        assistantID,
        { command: { resume: "Edited text" }}   // (3)!
    ));
    // > {'some_text': 'Edited text'}
    ```

    1. The graph is invoked with some initial state.
    2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata.
    3. The graph is resumed with a `{ resume: ... }` command object, injecting the human's input and continuing execution.
  </Tab>

  <Tab title="cURL">
    Create a thread:

    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads \
    --header 'Content-Type: application/json' \
    --data '{}'
    ```

    Run the graph until the interrupt is hit:

    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
    --header 'Content-Type: application/json' \
    --data "{
      \"assistant_id\": \"agent\",
      \"input\": {\"some_text\": \"original text\"}
    }"
    ```

    Resume the graph:

    ```bash  theme={null}
    curl --request POST \
     --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
     --header 'Content-Type: application/json' \
     --data "{
       \"assistant_id\": \"agent\",
       \"command\": {
         \"resume\": \"Edited text\"
       }
     }"
    ```
  </Tab>
</Tabs>
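All three tabs send the same two request bodies to `/threads/<THREAD_ID>/runs/wait`: one carrying `input` to start the run and one carrying `command.resume` to continue it. As a small sketch, the bodies from the cURL tab can be built in Python like this. The helper name `run_body` is hypothetical and only covers the fields shown above.

```python
import json


def run_body(assistant_id, graph_input=None, resume=None):
    """Build the JSON body for POST /threads/<THREAD_ID>/runs/wait."""
    body = {"assistant_id": assistant_id}
    if graph_input is not None:
        body["input"] = graph_input            # start the run with initial state
    if resume is not None:
        body["command"] = {"resume": resume}   # resume after an interrupt
    return json.dumps(body)


print(run_body("agent", graph_input={"some_text": "original text"}))
# {"assistant_id": "agent", "input": {"some_text": "original text"}}
print(run_body("agent", resume="Edited text"))
# {"assistant_id": "agent", "command": {"resume": "Edited text"}}
```

Note that this sketch treats `None` as "not provided", so it cannot express a resume whose value is itself `None`.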

<Accordion title="Extended example: using `interrupt`">
  This is an example graph you can run in the Agent Server.
  See [LangSmith quickstart](/langsmith/deployment-quickstart) for more details.

  ```python {highlight={7,13}} theme={null}
  from typing import TypedDict
  import uuid

  from langgraph.checkpoint.memory import InMemorySaver
  from langgraph.constants import START
  from langgraph.graph import StateGraph
  from langgraph.types import interrupt, Command

  class State(TypedDict):
      some_text: str

  def human_node(state: State):
      value = interrupt( # (1)!
          {
              "text_to_revise": state["some_text"] # (2)!
          }
      )
      return {
          "some_text": value # (3)!
      }


  # Build the graph
  graph_builder = StateGraph(State)
  graph_builder.add_node("human_node", human_node)
  graph_builder.add_edge(START, "human_node")

  graph = graph_builder.compile()
  ```

  1. `interrupt(...)` pauses execution at `human_node`, surfacing the given payload to a human.
  2. Any JSON serializable value can be passed to the [`interrupt`](https://reference.langchain.com/python/langgraph/types/#langgraph.types.interrupt) function. Here, a dict containing the text to revise.
  3. Once resumed, the return value of `interrupt(...)` is the human-provided input, which is used to update the state.

  Once you have a running Agent Server, you can interact with it using the
  [LangGraph SDK](/langsmith/langgraph-python-sdk):

  <Tabs>
    <Tab title="Python">
      ```python {highlight={2,34}} theme={null}
      from langgraph_sdk import get_client
      from langgraph_sdk.schema import Command
      client = get_client(url=<DEPLOYMENT_URL>)

      # Using the graph deployed with the name "agent"
      assistant_id = "agent"

      # create a thread
      thread = await client.threads.create()
      thread_id = thread["thread_id"]

      # Run the graph until the interrupt is hit.
      result = await client.runs.wait(
          thread_id,
          assistant_id,
          input={"some_text": "original text"}   # (1)!
      )

      print(result['__interrupt__']) # (2)!
      # > [
      # >     {
      # >         'value': {'text_to_revise': 'original text'},
      # >         'resumable': True,
      # >         'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'],
      # >         'when': 'during'
      # >     }
      # > ]


      # Resume the graph
      print(await client.runs.wait(
          thread_id,
          assistant_id,
          command=Command(resume="Edited text")   # (3)!
      ))
      # > {'some_text': 'Edited text'}
      ```

      1. The graph is invoked with some initial state.
      2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata.
      3. The graph is resumed with a `Command(resume=...)`, injecting the human's input and continuing execution.
    </Tab>

    <Tab title="JavaScript">
      ```javascript {highlight={32}} theme={null}
      import { Client } from "@langchain/langgraph-sdk";
      const client = new Client({ apiUrl: <DEPLOYMENT_URL> });

      // Using the graph deployed with the name "agent"
      const assistantID = "agent";

      // create a thread
      const thread = await client.threads.create();
      const threadID = thread["thread_id"];

      // Run the graph until the interrupt is hit.
      const result = await client.runs.wait(
        threadID,
        assistantID,
        { input: { "some_text": "original text" } }   // (1)!
      );

      console.log(result['__interrupt__']); // (2)!
      // > [
      // >     {
      // >         'value': {'text_to_revise': 'original text'},
      // >         'resumable': true,
      // >         'ns': ['human_node:fc722478-2f21-0578-c572-d9fc4dd07c3b'],
      // >         'when': 'during'
      // >     }
      // > ]

      // Resume the graph
      console.log(await client.runs.wait(
          threadID,
          assistantID,
          { command: { resume: "Edited text" }}   // (3)!
      ));
      // > {'some_text': 'Edited text'}
      ```

      1. The graph is invoked with some initial state.
      2. When the graph hits the interrupt, it returns an interrupt object with the payload and metadata.
      3. The graph is resumed with a `{ resume: ... }` command object, injecting the human's input and continuing execution.
    </Tab>

    <Tab title="cURL">
      Create a thread:

      ```bash  theme={null}
      curl --request POST \
      --url <DEPLOYMENT_URL>/threads \
      --header 'Content-Type: application/json' \
      --data '{}'
      ```

      Run the graph until the interrupt is hit:

      ```bash  theme={null}
      curl --request POST \
      --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
      --header 'Content-Type: application/json' \
      --data "{
        \"assistant_id\": \"agent\",
        \"input\": {\"some_text\": \"original text\"}
      }"
      ```

      Resume the graph:

      ```bash  theme={null}
      curl --request POST \
      --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
      --header 'Content-Type: application/json' \
      --data "{
        \"assistant_id\": \"agent\",
        \"command\": {
          \"resume\": \"Edited text\"
        }
      }"
      ```
    </Tab>
  </Tabs>
</Accordion>

## Static interrupts

Static interrupts (also known as static breakpoints) are triggered either before or after a node executes.

<Warning>
  Static interrupts are **not** recommended for human-in-the-loop workflows. They are best used for debugging and testing.
</Warning>

You can set static interrupts by specifying `interrupt_before` and `interrupt_after` at compile time:

```python {highlight={1,2,3}} theme={null}
graph = graph_builder.compile( # (1)!
    interrupt_before=["node_a"], # (2)!
    interrupt_after=["node_b", "node_c"], # (3)!
)
```

1. The breakpoints are set during `compile` time.
2. `interrupt_before` specifies the nodes where execution should pause before the node is executed.
3. `interrupt_after` specifies the nodes where execution should pause after the node is executed.
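To see how the two lists combine, here is a toy model that lists the pause points for a graph whose nodes run one at a time in a fixed order. The function `pause_points` is hypothetical and ignores branching and parallel execution, so treat it as an illustration of the before/after semantics rather than LangGraph's scheduler.

```python
def pause_points(nodes, interrupt_before=(), interrupt_after=()):
    """List where a linear run over `nodes` would pause.

    Toy model: nodes run one at a time, in the order given.
    """
    pauses = []
    for node in nodes:
        if node in interrupt_before:
            pauses.append(f"before {node}")
        if node in interrupt_after:
            pauses.append(f"after {node}")
    return pauses


print(pause_points(
    ["node_a", "node_b", "node_c"],
    interrupt_before=["node_a"],
    interrupt_after=["node_b", "node_c"],
))
# ['before node_a', 'after node_b', 'after node_c']
```

With the configuration from the snippet above, a linear run would therefore need three resumes to reach the end.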

Alternatively, you can set static interrupts at run time:

<Tabs>
  <Tab title="Python">
    ```python {highlight={1,5,6}} theme={null}
    await client.runs.wait( # (1)!
        thread_id,
        assistant_id,
        input=inputs,
        interrupt_before=["node_a"], # (2)!
        interrupt_after=["node_b", "node_c"] # (3)!
    )
    ```

    1. `client.runs.wait` is called with the `interrupt_before` and `interrupt_after` parameters. This is a run-time configuration and can be changed for every invocation.
    2. `interrupt_before` specifies the nodes where execution should pause before the node is executed.
    3. `interrupt_after` specifies the nodes where execution should pause after the node is executed.
  </Tab>

  <Tab title="JavaScript">
    ```javascript {highlight={1,6,7}} theme={null}
    await client.runs.wait( // (1)!
        threadID,
        assistantID,
        {
            input: input,
            interruptBefore: ["node_a"], // (2)!
            interruptAfter: ["node_b", "node_c"] // (3)!
        }
    )
    ```

    1. `client.runs.wait` is called with the `interruptBefore` and `interruptAfter` parameters. This is a run-time configuration and can be changed for every invocation.
    2. `interruptBefore` specifies the nodes where execution should pause before the node is executed.
    3. `interruptAfter` specifies the nodes where execution should pause after the node is executed.
  </Tab>

  <Tab title="cURL">
    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
    --header 'Content-Type: application/json' \
    --data "{
        \"assistant_id\": \"agent\",
        \"interrupt_before\": [\"node_a\"],
        \"interrupt_after\": [\"node_b\", \"node_c\"],
        \"input\": <INPUT>
    }"
    ```
  </Tab>
</Tabs>

The following example shows how to add static interrupts:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    from langgraph_sdk import get_client
    client = get_client(url=<DEPLOYMENT_URL>)

    # Using the graph deployed with the name "agent"
    assistant_id = "agent"

    # create a thread
    thread = await client.threads.create()
    thread_id = thread["thread_id"]

    # Run the graph until the breakpoint
    result = await client.runs.wait(
        thread_id,
        assistant_id,
        input=inputs   # (1)!
    )

    # Resume the graph
    await client.runs.wait(
        thread_id,
        assistant_id,
        input=None   # (2)!
    )
    ```

    1. The graph is run until the first breakpoint is hit.
    2. The graph is resumed by passing in `None` for the input. This will run the graph until the next breakpoint is hit.
  </Tab>

  <Tab title="JavaScript">
    ```js  theme={null}
    import { Client } from "@langchain/langgraph-sdk";
    const client = new Client({ apiUrl: <DEPLOYMENT_URL> });

    // Using the graph deployed with the name "agent"
    const assistantID = "agent";

    // create a thread
    const thread = await client.threads.create();
    const threadID = thread["thread_id"];

    // Run the graph until the breakpoint
    const result = await client.runs.wait(
      threadID,
      assistantID,
      { input: input }   // (1)!
    );

    // Resume the graph
    await client.runs.wait(
      threadID,
      assistantID,
      { input: null }   // (2)!
    );
    );
    ```

    1. The graph is run until the first breakpoint is hit.
    2. The graph is resumed by passing in `null` for the input. This will run the graph until the next breakpoint is hit.
  </Tab>

  <Tab title="cURL">
    Create a thread:

    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads \
    --header 'Content-Type: application/json' \
    --data '{}'
    ```

    Run the graph until the breakpoint:

    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
    --header 'Content-Type: application/json' \
    --data "{
      \"assistant_id\": \"agent\",
      \"input\": <INPUT>
    }"
    ```

    Resume the graph:

    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/wait \
    --header 'Content-Type: application/json' \
    --data "{
      \"assistant_id\": \"agent\"
    }"
    ```
  </Tab>
</Tabs>
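When a graph has several static breakpoints, each resume only runs until the next one. The sketch below keeps resuming until no nodes are pending. It assumes `client` behaves like the object returned by `langgraph_sdk.get_client`, that `client.threads.get_state(thread_id)` returns a mapping whose `"next"` entry is empty once the run has finished, and it introduces a hypothetical `max_resumes` guard. Verify these assumptions against your SDK version before relying on it.

```python
async def run_through_breakpoints(client, thread_id, assistant_id, inputs, max_resumes=10):
    """Run a graph and resume past each static breakpoint until it finishes."""
    # First call: run until the first breakpoint (or to completion).
    result = await client.runs.wait(thread_id, assistant_id, input=inputs)
    resumes = 0
    while True:
        state = await client.threads.get_state(thread_id)
        if not state["next"]:  # no pending nodes: the run is complete
            return result
        if resumes >= max_resumes:
            raise RuntimeError(f"graph still paused after {max_resumes} resumes")
        # Passing input=None resumes from the current breakpoint.
        result = await client.runs.wait(thread_id, assistant_id, input=None)
        resumes += 1
```

In a notebook you would call it as `await run_through_breakpoints(client, thread_id, "agent", inputs)`. Because static interrupts are meant for debugging, a loop like this is mainly useful in test scripts where you want to step through every breakpoint automatically.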

## Learn more

* [Human-in-the-loop conceptual guide](/oss/python/langgraph/interrupts): learn more about LangGraph human-in-the-loop features.
* [Common patterns](/oss/python/langgraph/interrupts#common-patterns): learn how to implement patterns like approving/rejecting actions, requesting user input, tool call review, and validating human input.



# Add metadata and tags to traces
Source: https://docs.langchain.com/langsmith/add-metadata-tags



LangSmith supports sending arbitrary metadata and tags along with traces.

Tags are strings that can be used to categorize or label a trace. Metadata is a dictionary of key-value pairs that can be used to store additional information about a trace.

Both are useful for associating additional information with a trace, such as the environment in which it was executed, the user who initiated it, or an internal correlation ID. For more information on tags and metadata, see the [Concepts](/langsmith/observability-concepts#tags) page. For information on how to query traces and runs by metadata and tags, see the [Filter traces in the application](/langsmith/filter-traces-in-application) page.

<CodeGroup>
  ```python Python theme={null}
  import openai
  import langsmith as ls
  from langsmith.wrappers import wrap_openai

  client = openai.Client()
  messages = [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
  ]

  # You can set metadata and tags **statically** when decorating a function.
  # Use the @traceable decorator with tags and metadata.
  # Ensure that the LANGSMITH_TRACING environment variable is set for @traceable to work.
  @ls.traceable(
      run_type="llm",
      name="OpenAI Call Decorator",
      tags=["my-tag"],
      metadata={"my-key": "my-value"}
  )
  def call_openai(
      messages: list[dict], model: str = "gpt-4o-mini"
  ) -> str:
      # You can also dynamically set metadata and tags on the current run:
      rt = ls.get_current_run_tree()
      rt.metadata["some-conditional-key"] = "some-val"
      rt.tags.extend(["another-tag"])
      return client.chat.completions.create(
          model=model,
          messages=messages,
      ).choices[0].message.content

  call_openai(
      messages,
      # To add tags and metadata at **invocation time**, pass them
      # via the langsmith_extra parameter when calling the function.
      langsmith_extra={"tags": ["my-other-tag"], "metadata": {"my-other-key": "my-value"}}
  )

  # Alternatively, you can use the context manager
  with ls.trace(
      name="OpenAI Call Trace",
      run_type="llm",
      inputs={"messages": messages},
      tags=["my-tag"],
      metadata={"my-key": "my-value"},
  ) as rt:
      chat_completion = client.chat.completions.create(
          model="gpt-4o-mini",
          messages=messages,
      )
      rt.metadata["some-conditional-key"] = "some-val"
      rt.end(outputs={"output": chat_completion})

  # You can use the same techniques with the wrapped client
  patched_client = wrap_openai(
      client, tracing_extra={"metadata": {"my-key": "my-value"}, "tags": ["a-tag"]}
  )
  chat_completion = patched_client.chat.completions.create(
      model="gpt-4o-mini",
      messages=messages,
      langsmith_extra={
          "tags": ["my-other-tag"],
          "metadata": {"my-other-key": "my-value"},
      },
  )
  ```

  ```typescript TypeScript theme={null}
  import OpenAI from "openai";
  import { traceable, getCurrentRunTree } from "langsmith/traceable";
  import { wrapOpenAI } from "langsmith/wrappers";

  const client = wrapOpenAI(new OpenAI());
  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello!" },
  ];

  const traceableCallOpenAI = traceable(
      async (messages: OpenAI.Chat.ChatCompletionMessageParam[]) => {
          const completion = await client.chat.completions.create({
              model: "gpt-4o-mini",
              messages,
          });
          const runTree = getCurrentRunTree();
          runTree.extra.metadata = {
              ...runTree.extra.metadata,
              someKey: "someValue",
          };
          runTree.tags = [...(runTree.tags ?? []), "runtime-tag"];
          return completion.choices[0].message.content;
      },
      {
          run_type: "llm",
          name: "OpenAI Call Traceable",
          tags: ["my-tag"],
          metadata: { "my-key": "my-value" },
      }
  );

  // Call the traceable function
  await traceableCallOpenAI(messages);
  ```
</CodeGroup>

***




# Overview
Source: https://docs.langchain.com/langsmith/administration-overview



This overview covers topics related to managing users, organizations, and workspaces within LangSmith.

## Resource Hierarchy

### Organizations

An organization is a logical grouping of users within LangSmith with its own billing configuration. Typically, there is one organization per company. An organization can have multiple workspaces. For more details, see the [setup guide](/langsmith/set-up-a-workspace#set-up-an-organization).

When you log in for the first time, a personal organization will be created for you automatically. If you'd like to collaborate with others, you can create a separate organization and invite your team members to join. There are a few important differences between your personal organization and shared organizations:

| Feature             | Personal            | Shared                                                                                       |
| ------------------- | ------------------- | -------------------------------------------------------------------------------------------- |
| Maximum workspaces  | 1                   | Variable, depending on plan (see [pricing page](https://www.langchain.com/pricing-langsmith)) |
| Collaboration       | Cannot invite users | Can invite users                                                                             |
| Billing: paid plans | Developer plan only | All other plans available                                                                    |

### Workspaces

<Info>
  Workspaces were formerly called Tenants. Some code and APIs may still reference the old name for a period of time during the transition.
</Info>

A workspace is a logical grouping of users and resources within an organization. A workspace separates trust boundaries for resources and access control. Users may have permissions in a workspace that grant them access to the resources in that workspace, including tracing projects, datasets, annotation queues, and prompts. For more details, see the [setup guide](/langsmith/set-up-a-workspace).

It is recommended to create a separate workspace for each team within your organization. To organize resources even further, you can use [Resource Tags](#resource-tags) to group resources within a workspace.

The following image shows a sample workspace settings page: <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=a2af70dec53a98f502132de92445aed2" alt="Sample Workspace" data-og-width="3008" width="3008" data-og-height="956" height="956" data-path="langsmith/images/sample-workspace.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=439802e412d8f96c45a4af9829659f5a 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=0ce5170a93ee736faf115533f46467f2 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=e3d47902f66ad077ae9a893be183c348 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=b5af7b353e10c44b844f60a677740eb7 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=6acccca51bf1962f2abcd82ab3dca96e 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/sample-workspace.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=213d4b9372f31ca2babcbe02e69a0e3e 2500w" />

The following diagram explains the relationship between organizations, workspaces, and the different resources scoped to and within a workspace: <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=bc51967d88ce39120547d174ab6a442e" alt="Resource Hierarchy" data-og-width="1403" width="1403" data-og-height="272" height="272" data-path="langsmith/images/resource-hierarchy.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=a679974443061378a17aef96f8118bc2 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=15fa6c782376f97d79a2c2f87a13d5a7 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=f6ff24a91b5c00e17c44c07bd5868a06 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=7eaab6991daaca255d41da41fad268a9 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=fd2c292577912b0cf04c6d14ba636fe7 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-hierarchy.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=da5879f0ae4b12f5e8b64d447ddee8fe 2500w" />

See the table below for details on which features are available in which scope (organization or workspace):

| Resource/Setting                                                            | Scope            |
| --------------------------------------------------------------------------- | ---------------- |
| Trace Projects                                                              | Workspace        |
| Annotation Queues                                                           | Workspace        |
| Deployments                                                                 | Workspace        |
| Datasets & Experiments                                                      | Workspace        |
| Prompts                                                                     | Workspace        |
| Resource Tags                                                               | Workspace        |
| API Keys                                                                    | Workspace        |
| Settings including Secrets, Feedback config, Models, Rules, and Shared URLs | Workspace        |
| User management: Invite User to Workspace                                   | Workspace        |
| RBAC: Assigning Workspace Roles                                             | Workspace        |
| Data Retention, Usage Limits                                                | Workspace\*      |
| Plans and Billing, Credits, Invoices                                        | Organization     |
| User management: Invite User to Organization                                | Organization\*\* |
| Adding Workspaces                                                           | Organization     |
| Assigning Organization Roles                                                | Organization     |
| RBAC: Creating/Editing/Deleting Custom Roles                                | Organization     |

\* Data retention settings and usage limits will be available soon at the organization level as well.

\*\* Self-hosted installations may enable workspace-level invites of users to the organization via a feature flag. See the [self-hosted user management docs](/langsmith/self-host-user-management) for details.

### Resource tags

Resource tags allow you to organize resources within a workspace. Each tag is a key-value pair that can be assigned to a resource. Tags can be used to filter workspace-scoped resources in the UI and API: Projects, Datasets, Annotation Queues, Deployments, and Experiments.

Each new workspace comes with two default tag keys: `Application` and `Environment`; as the names suggest, these tags can be used to categorize resources based on the application and environment they belong to. More tags can be added as needed.

LangSmith resource tags are very similar to tags in cloud services like [AWS](https://docs.aws.amazon.com/tag-editor/latest/userguide/tagging.html).

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=99d640e56b981397219e3ad1a5d3f3b2" alt="Sample Resource Tags" data-og-width="1152" width="1152" data-og-height="233" height="233" data-path="langsmith/images/resource-tags.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=d819b9e316cd0ac5cd065f51999cbfe5 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=d2389e8878ee353d730f13ad63f59482 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=bbdafff1f16e1036d0f243e8aa75481d 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=5f02b3dcfd6c6c842ad8899cd19d43ee 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=5d362a1f024b376c19c0dc03d06406ac 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/resource-tags.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=0f5fc4688726e872dbb53a1d3d4f0b00 2500w" />

## User Management and RBAC

### Users

A user is a person who has access to LangSmith. Users can be members of one or more organizations and workspaces within those organizations.

Organization members are managed in organization settings:

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=93249dc8c55c6d8001b350e497b2a0ac" alt="Sample Organization Members" data-og-width="3020" width="3020" data-og-height="1246" height="1246" data-path="langsmith/images/org-members-settings.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=d2c0bbf572906894c972ea1047ff1c3a 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=fe38d538de5f0436f05fa1fa0679f612 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=cf5e551ff7c68793da8a3f5ecdef7e49 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=a401368eebe3fc19a5bec603ad9bb9c8 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=be6d182043d052dece018f54bd546916 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-members-settings.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=450b385d226c53c1f327612e44abc5d1 2500w" />

And workspace members are managed in workspace settings:

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=678c90095209d5e1440e452297478dc2" alt="Sample Workspace Members" data-og-width="3016" width="3016" data-og-height="1248" height="1248" data-path="langsmith/images/org-settings-workspaces-tab.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=a1a6827b0fc3fdc068b98a02bbb58a05 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=79c438ec6b20a407d607a56b66dfa431 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=1c622e1fcd3afdff3f001fe4f642a9cf 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=bda1583680e7f19c2cc65a72a5158fae 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=c80bcd52cb503a6652978dad756123f0 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/org-settings-workspaces-tab.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=d8350437e643a858e138674d38ed8751 2500w" />

### API keys

<Warning>
  Support for legacy API keys prefixed with `ls__` ended on October 22, 2024, in favor of personal access tokens (PATs) and service keys. Keys with the `ls__` prefix no longer work; PATs or service keys are required for all new integrations.
</Warning>

#### Expiration Dates

When you create an API key, you have the option to set an expiration date. Adding an expiration date to keys enhances security and minimizes the risk of unauthorized access. For example, you may set expiration dates on keys for temporary tasks that require elevated access.

By default, keys never expire. Once expired, an API key is no longer valid and cannot be reactivated or have its expiration modified.

#### Personal Access Tokens (PATs)

Personal Access Tokens (PATs) are used to authenticate requests to the LangSmith API. They are created by users and scoped to a user. The PAT will have the same permissions as the user that created it. We recommend not using these to authenticate requests from your application, but rather using them for personal scripts or tools that interact with the LangSmith API. If the user associated with the PAT is removed from the organization, the PAT will no longer work.

PATs are prefixed with `lsv2_pt_`.

#### Service keys

Service keys are similar to PATs, but are used to authenticate requests to the LangSmith API on behalf of a service account. Only admins can create service keys. We recommend using them for applications and services that need to interact with the LangSmith API, such as LangGraph agents or other integrations. A service key may be scoped to a single workspace, multiple workspaces, or the entire organization, and can authenticate requests to the LangSmith API for whichever workspaces it has access to.

Service keys are prefixed with `lsv2_sk_`.
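
Because the key type is encoded in the prefix, a script can tell the kinds of key apart before sending a request. The following is a minimal sketch: `classify_api_key` is a hypothetical helper, not part of the LangSmith SDK, and it inspects only the prefixes documented on this page.

```python
def classify_api_key(key: str) -> str:
    """Return the key kind implied by the documented prefixes."""
    if key.startswith("lsv2_pt_"):
        return "personal_access_token"
    if key.startswith("lsv2_sk_"):
        return "service_key"
    if key.startswith("ls__"):
        # Legacy keys stopped working on October 22, 2024.
        return "legacy"
    return "unknown"


print(classify_api_key("lsv2_pt_example"))  # personal_access_token
```

A check like this could be used to reject legacy keys early with a clear error message instead of waiting for an authentication failure.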

<Warning>
  Use the `X-Tenant-Id` header to specify the target workspace.

  * **When using PATs**: If this header is omitted, requests will run against the default workspace associated with the key.
  * **When using organization-scoped service keys**: You must include the `X-Tenant-Id` header when accessing workspace-scoped resources. Without it, the request will fail with a `403 Forbidden` error.
</Warning>
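
The header rules above can be sketched as a small helper that builds request headers. `build_headers` is a hypothetical name, and sending the key in an `x-api-key` header is an assumption made for illustration; only the `X-Tenant-Id` header comes from this page.

```python
from typing import Optional


def build_headers(api_key: str, workspace_id: Optional[str] = None) -> dict[str, str]:
    """Build HTTP headers that target a specific workspace when one is given."""
    headers = {"x-api-key": api_key}  # assumed name of the auth header
    if workspace_id is not None:
        # Needed for organization-scoped service keys that access
        # workspace-scoped resources; optional for PATs.
        headers["X-Tenant-Id"] = workspace_id
    return headers


print(build_headers("lsv2_sk_example", "<WORKSPACE_ID>"))
```

Passing the workspace ID explicitly, even with a PAT, avoids depending on whichever workspace happens to be the default for the key.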

<Note>
  To see how to create a service key or Personal Access Token, see the [setup guide](/langsmith/create-account-api-key).
</Note>

### Organization roles

Organization roles are distinct from the [Enterprise feature workspace RBAC](#workspace-roles-rbac) and are used in the context of multiple [workspaces](#workspaces). Your organization role determines your workspace membership characteristics and your [organization-level permissions](/langsmith/organization-workspace-operations).

The organization role selected also impacts workspace membership as described here:

* [Organization Admin](/langsmith/rbac#organization-admin) grants full access to manage all organization configuration, users, billing, and workspaces.
  * An Organization Admin has `Admin` access to all workspaces in an organization.
* [Organization User](/langsmith/rbac#organization-user) may read organization information but cannot execute any write actions at the organization level. An Organization User may create [Personal Access Tokens](#personal-access-tokens-pats).
  * An Organization User can be added to a subset of workspaces and assigned workspace roles as usual (if RBAC is enabled), which specify permissions at the workspace level.
* [Organization Viewer](/langsmith/rbac#organization-viewer) is equivalent to Organization User but **cannot** create Personal Access Tokens (for self-hosted installations, available in Helm chart version 0.11.25+).

<Info>
  The Organization User and Organization Viewer roles are only available in organizations on [plans](https://langchain.com/pricing) with multiple workspaces. In organizations limited to a single workspace, all users have the Organization Admin role.

  See [security settings](/langsmith/manage-organization-by-api#security-settings) for instructions on how to disable PAT creation for the entire organization.
</Info>

For more information on setting up organizations and workspaces, refer to the [organization setup guide](/langsmith/set-up-a-workspace#organization-roles).

The following table provides an overview of organization-level permissions:

|                                             | Organization Viewer | Organization User | Organization Admin |
| ------------------------------------------- | ------------------- | ----------------- | ------------------ |
| View organization configuration             | ✅                   | ✅                 | ✅                  |
| View organization roles                     | ✅                   | ✅                 | ✅                  |
| View organization members                   | ✅                   | ✅                 | ✅                  |
| View data retention settings                | ✅                   | ✅                 | ✅                  |
| View usage limits                           | ✅                   | ✅                 | ✅                  |
| Create personal access tokens (PATs)        | ❌                   | ✅                 | ✅                  |
| Admin access to all workspaces              | ❌                   | ❌                 | ✅                  |
| Manage billing settings                     | ❌                   | ❌                 | ✅                  |
| Create workspaces                           | ❌                   | ❌                 | ✅                  |
| Create, edit, and delete organization roles | ❌                   | ❌                 | ✅                  |
| Invite new users to organization            | ❌                   | ❌                 | ✅                  |
| Delete user invites                         | ❌                   | ❌                 | ✅                  |
| Remove users from an organization           | ❌                   | ❌                 | ✅                  |
| Update data retention settings              | ❌                   | ❌                 | ✅                  |
| Update usage limits                         | ❌                   | ❌                 | ✅                  |

For a comprehensive list of required permissions along with the operations and roles that can perform them, refer to the [Organization and workspace reference](/langsmith/organization-workspace-operations).
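
In the table above, each role can do everything the role to its left can do, so the permissions can be modeled as a minimum required role per action. The sketch below encodes a few rows that way. The action names, `MIN_ROLE`, and `is_allowed` are hypothetical and exist only for illustration; they are not LangSmith API identifiers.

```python
# Roles ordered from least to most privileged, following the table columns.
ROLE_RANK = {"viewer": 0, "user": 1, "admin": 2}

# Minimum organization role required for a few of the actions in the table.
MIN_ROLE = {
    "view_organization_members": "viewer",
    "view_usage_limits": "viewer",
    "create_personal_access_tokens": "user",
    "manage_billing_settings": "admin",
    "create_workspaces": "admin",
    "invite_users_to_organization": "admin",
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the organization role meets the action's minimum role."""
    return ROLE_RANK[role] >= ROLE_RANK[MIN_ROLE[action]]


print(is_allowed("user", "create_personal_access_tokens"))  # True
print(is_allowed("user", "create_workspaces"))              # False
```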

### Workspace roles (RBAC)

<Note>
  RBAC (Role-Based Access Control) is a feature that is only available to Enterprise customers. If you are interested in this feature, [contact our sales team](https://www.langchain.com/contact-sales). Other plans default to using the Admin role for all users.
</Note>

Roles are used to define the set of permissions that a user has within a workspace. There are three built-in system roles that cannot be edited:

* [Workspace Admin](/langsmith/rbac#workspace-admin) has full access to all resources within the workspace.
* [Workspace Editor](/langsmith/rbac#workspace-editor) has full permissions except for workspace management (adding/removing users, changing roles, configuring service keys).
* [Workspace Viewer](/langsmith/rbac#workspace-viewer) has read-only access to all resources within the workspace.

[Organization admins](/langsmith/rbac#organization-admin) can also create/edit custom roles with specific permissions for different resources.

Roles can be managed in **Organization Settings** under the **Roles** tab:

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=8c9f1e2b8e18b6f40b59ecc3c18d0e5f" alt="The Organization members and roles view showing a list of the roles." data-og-width="3018" width="3018" data-og-height="1546" height="1546" data-path="langsmith/images/roles-tab-rbac.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=94ab2c109a053547c6bbe9ac0364d870 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=fd6bc62163b03e32aca8fd40d831111f 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=f85073d5302051fd6febed8762615018 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=5c1f79814e3b54e747cc33d7096364e5 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=78bfc784d69847b1c08d40671e30d64e 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/roles-tab-rbac.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=a4651067818a7de28a672307d4ecdbee 2500w" />

* For comprehensive documentation on roles and permissions, refer to the [Role-based access control](/langsmith/rbac) guide.
* For more details on assigning and creating roles, refer to the [User Management](/langsmith/user-management) guide.
* For a comprehensive list of required permissions along with the operations and roles that can perform them, refer to the [Organization and workspace reference](/langsmith/organization-workspace-operations).

## Best Practices

### Environment Separation

Use [resource tags](#resource-tags) to organize resources by environment using the default tag key `Environment` and different values for the environment (e.g., `dev`, `staging`, `prod`). We do not recommend using separate workspaces for environment separation because resources cannot be shared across workspaces, which would prevent you from promoting resources (like prompts) between environments.

<Note>
  **Resource tags vs. commit tags for prompt management**

  While both types of tags can use environment terminology like `dev`, `staging`, and `prod`, they serve different purposes:

  * **Resource tags** (`Environment: prod`): Use these to *organize and filter* resources across your workspace. Apply resource tags to tracing projects, datasets, and other resources (including prompts) to group them by environment, which enables filtering in the UI.
  * [Commit tags](/langsmith/manage-prompts#commit-tags) (`prod` tag): Use these to manage which [prompt version](/langsmith/prompt-engineering) your code references. Commit tags are labels that point to specific commits in a prompt's history. When your code pulls a prompt by tag name (e.g., `client.pull_prompt("prompt-name:prod")`), it retrieves whichever commit that tag currently points to. To promote a prompt from `staging` to `prod`, move the commit tag to point to the desired version.

  Resource tags organize **which resources** belong to an environment. Commit tags let you control **which version** of a prompt your code references without changing the code itself.
</Note>

## Usage and Billing

### Data Retention

This section covers how data retention works and how it's priced in LangSmith.

#### Why retention matters

* **Privacy**: Many data privacy regulations, such as GDPR in Europe or CCPA in California, require organizations to delete personal data once it's no longer necessary for the purposes for which it was collected. Setting retention periods aids in compliance with such regulations.
* **Cost**: LangSmith charges less for traces that have low data retention. See our tutorial on how to [optimize spend](/langsmith/billing#optimize-your-tracing-spend) for details.

#### How it works

LangSmith has two tiers of traces based on Data Retention with the following characteristics:

|                      | Base               | Extended        |
| -------------------- | ------------------ | --------------- |
| **Price**            | \$0.50 / 1k traces | \$5 / 1k traces |
| **Retention Period** | 14 days            | 400 days        |

**Data deletion after retention ends**

After the specified retention period, traces are no longer accessible in the tracing project UI or via the API. All user data associated with the trace (e.g. inputs and outputs) is deleted from our internal systems within a day thereafter. Some metadata associated with each trace may be retained indefinitely for analytics and billing purposes.
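
To make the two retention periods concrete, the sketch below computes when a trace would stop being accessible. It assumes the retention clock starts at the timestamp passed in, which this page does not specify, and `retention_end` is a hypothetical helper. The example date is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Retention periods per tier, from the table above.
RETENTION_DAYS = {"base": 14, "extended": 400}


def retention_end(start: datetime, tier: str) -> datetime:
    """Return the moment the retention period ends for the given tier."""
    return start + timedelta(days=RETENTION_DAYS[tier])


start = datetime(2024, 6, 30, tzinfo=timezone.utc)
print(retention_end(start, "base").date())      # 2024-07-14
print(retention_end(start, "extended").date())  # 2025-08-04
```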

**Data retention auto-upgrades**

<Warning>
  Auto upgrades can have an impact on your bill. Please read this section carefully to fully understand your estimated LangSmith tracing costs.
</Warning>

When you use certain features with `base` tier traces, their data retention is automatically upgraded to the `extended` tier. This increases both the retention period and the cost of the trace.

A trace is upgraded in any of the following scenarios:

* **Feedback** is added to any run on the trace (or any trace in the thread), whether through [manual annotation](/langsmith/annotate-traces-inline#annotate-traces-and-runs-inline), automatically with [an online evaluator](/langsmith/online-evaluations), or programmatically [via the SDK](/langsmith/attach-user-feedback#log-user-feedback-using-the-sdk).
* An **[annotation queue](/langsmith/annotation-queues#assign-runs-to-an-annotation-queue)** receives any run from the trace.
* An **[automation rule](/langsmith/rules#set-up-automation-rules)** matches any run within a trace.
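
The three conditions above amount to a simple rule: any one of them moves the trace to the `extended` tier. The following sketch models that rule. `TraceActivity` and `retention_tier` are hypothetical names for illustration and do not correspond to LangSmith SDK objects.

```python
from dataclasses import dataclass


@dataclass
class TraceActivity:
    has_feedback: bool = False             # feedback on any run in the trace or its thread
    in_annotation_queue: bool = False      # any run from the trace added to a queue
    matched_automation_rule: bool = False  # any run matched by an automation rule


def retention_tier(activity: TraceActivity) -> str:
    """Return "extended" if any auto-upgrade condition holds, else "base"."""
    upgraded = (
        activity.has_feedback
        or activity.in_annotation_queue
        or activity.matched_automation_rule
    )
    return "extended" if upgraded else "base"


print(retention_tier(TraceActivity()))                   # base
print(retention_tier(TraceActivity(has_feedback=True)))  # extended
```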

**Why auto-upgrade traces?**

We have two reasons behind the auto-upgrade model for tracing:

1. We think that traces that match any of these conditions are fundamentally more interesting than other traces, and therefore it is good for users to be able to keep them around longer.
2. We want to charge customers an order of magnitude less for traces that may not be interacted with meaningfully. We think auto-upgrades align our pricing model with the value that LangSmith brings, where only traces with meaningful interaction are charged at a higher rate.

If you have questions or concerns about our pricing model, please feel free to reach out to [support@langchain.dev](mailto:support@langchain.dev) and let us know your thoughts!

**How does data retention affect downstream features?**

* **Annotation Queues, Run Rules, and Feedback**: Traces that use these features will be [auto-upgraded](#data-retention-auto-upgrades).
* **Monitoring**: The monitoring tab will continue to work even after a base tier trace's data retention period ends. It is powered by trace metadata that exists for >30 days, meaning that your monitoring graphs will continue to stay accurate even on `base` tier traces.
* **Datasets**: Datasets have an indefinite data retention period. Restated differently, if you add a trace's inputs and outputs to a dataset, they will never be deleted. We suggest that if you are using LangSmith for data collection, you take advantage of the datasets feature.

#### Billing model

**Billable metrics**

On your LangSmith invoice, you will see two metrics that we charge for:

* LangSmith Traces (Base Charge)
* LangSmith Traces (Extended Data Retention Upgrades)

The first metric includes all traces, regardless of tier. The second metric just counts the number of extended retention traces.

**Why measure all traces + upgrades instead of base and extended traces?**

A natural question to ask when considering our pricing is why not just show the number of `base` tier and `extended` tier traces directly on the invoice?

While we understand this would be more straightforward, it doesn't fit trace upgrades properly. Consider a `base` tier trace that was recorded on June 30, and upgraded to `extended` tier on July 3. The `base` tier trace occurred in the June billing period, but the upgrade occurred in the July billing period. Therefore, we need to be able to measure these two events independently to properly bill our customers.

If your trace was recorded as an extended retention trace, then the `base` and `extended` metrics will both be recorded with the same timestamp.
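To illustrate why the two metrics are measured independently, the sketch below buckets billing events by calendar month. The event list, the year, and the `bill_by_month` helper are illustrative assumptions, not part of any LangSmith API.

```python
from collections import Counter
from datetime import date

# Hypothetical billing events for one trace: recorded on June 30,
# upgraded to extended retention on July 3 (the year is arbitrary).
events = [
    ("LangSmith Traces (Base Charge)", date(2024, 6, 30)),
    ("LangSmith Traces (Extended Data Retention Upgrades)", date(2024, 7, 3)),
]


def bill_by_month(events):
    """Count each metric in the billing month in which its event occurred."""
    counts = Counter()
    for metric, day in events:
        counts[(day.strftime("%Y-%m"), metric)] += 1
    return counts


for (month, metric), count in sorted(bill_by_month(events).items()):
    print(month, metric, count)
```

The base charge lands in the June period and the upgrade in the July period, which a single "extended traces" count per invoice could not express.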

**Cost breakdown**

The Base Charge for a trace is 0.05¢ per trace. We priced the upgrade such that an `extended` retention trace costs 10x the price of a `base` tier trace (0.50¢ per trace) including both metrics. Thus, each upgrade costs 0.45¢.
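As a worked example of this arithmetic, the sketch below estimates the tracing charge from the two billable metrics. The `invoice_dollars` helper and the trace counts are illustrative assumptions; the calculation ignores any plan-specific allowances or discounts.

```python
from decimal import Decimal

# Per-trace prices in cents, taken from the cost breakdown above.
BASE_CHARGE_CENTS = Decimal("0.05")
UPGRADE_CHARGE_CENTS = Decimal("0.45")


def invoice_dollars(total_traces: int, extended_traces: int) -> Decimal:
    """Estimate the tracing charge in dollars.

    Every trace incurs the base charge; each extended retention trace
    additionally incurs one upgrade charge.
    """
    if extended_traces > total_traces:
        raise ValueError("extended_traces cannot exceed total_traces")
    cents = total_traces * BASE_CHARGE_CENTS + extended_traces * UPGRADE_CHARGE_CENTS
    return cents / 100


# Hypothetical month: 100,000 traces, 10,000 of them extended retention.
print(invoice_dollars(100_000, 10_000))
```

Here the base charge contributes $50 (100,000 × 0.05¢) and the upgrades contribute $45 (10,000 × 0.45¢).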

### Rate Limits

LangSmith has rate limits which are designed to ensure the stability of the service for all users.

To ensure access and stability, LangSmith will respond with HTTP Status Code 429 indicating that rate or usage limits have been exceeded under the following circumstances:

#### Temporary throughput limit over a 1 minute period at our application load balancer

This 429 is the result of exceeding a fixed number of API calls over a 1-minute window on a per-API-key/access-token basis. The start of the window varies slightly (it is not guaranteed to align with the start of a clock minute) and may change depending on application deployment events.

After the maximum number of events is received, we respond with a 429 until 60 seconds have elapsed from the start of the evaluation window, and then the process repeats.

This 429 is thrown by our application load balancer and is a mechanism in place for all LangSmith users independent of plan tier to ensure continuity of service for all users.

| Method            | Endpoints     | Limit | Window   |
| ----------------- | ------------- | ----- | -------- |
| `DELETE`          | `/sessions*`  | 30    | 1 minute |
| `POST` OR `PATCH` | `/runs*`      | 5000  | 1 minute |
| `GET`             | `/runs/:id`   | 30    | 1 minute |
| `POST`            | `/feedbacks*` | 5000  | 1 minute |
| `*`               | `*`           | 2000  | 1 minute |

<Note>
  The LangSmith SDK takes steps to minimize the likelihood of reaching these limits on run-related endpoints by batching up to 100 runs from a single session ID into a single API call.
</Note>
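The fixed-window behavior described in this section can be modeled with a small counter. This is a simplified sketch for reasoning about when 429 responses start and stop, not LangSmith's load balancer implementation; `FixedWindowCounter` and its `allow` method are hypothetical names.

```python
class FixedWindowCounter:
    """Model of a fixed-window limit: `limit` calls per `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = None  # set when the first call arrives
        self.count = 0

    def allow(self, now: float) -> bool:
        """Return True if a call at time `now` (in seconds) is accepted."""
        # Open a new window on the first call or once the current one has elapsed.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return False  # corresponds to a 429 response
        self.count += 1
        return True


# Example: the DELETE /sessions* limit of 30 calls per minute.
limiter = FixedWindowCounter(limit=30)
accepted = sum(limiter.allow(now=float(second)) for second in range(45))
print(accepted)
```

Calls beyond the limit are rejected until 60 seconds after the window opened, regardless of how the earlier calls were spaced.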

#### Plan-level hourly trace event limit

This 429 is the result of reaching your maximum hourly events ingested and is evaluated in a fixed window starting at the beginning of each clock hour in UTC and resets at the top of each new hour.

An event in this context is the creation or update of a run. So if a run is created and then updated in the same hourly window, that counts as 2 events against this limit.
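The counting rule can be sketched as follows. The `hour_window` and `count_events` helpers and the sample timestamps are illustrative assumptions; the sketch only mirrors the rule that each create or update counts as one event in the UTC clock hour in which it occurs.

```python
from collections import Counter
from datetime import datetime, timezone


def hour_window(ts: datetime) -> datetime:
    """Return the start of the UTC clock hour containing a timezone-aware `ts`."""
    return ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)


def count_events(events):
    """Count run events per hourly window; creates and updates count equally."""
    return Counter(hour_window(ts) for ts, _kind in events)


# One run created and updated in the same hour, then updated again in the next hour.
events = [
    (datetime(2024, 1, 1, 10, 5, tzinfo=timezone.utc), "create"),
    (datetime(2024, 1, 1, 10, 40, tzinfo=timezone.utc), "update"),
    (datetime(2024, 1, 1, 11, 1, tzinfo=timezone.utc), "update"),
]

for window, count in sorted(count_events(events).items()):
    print(window.isoformat(), count)
```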

This is thrown by our application and varies by plan tier, with organizations on our Startup/Plus and Enterprise plan tiers having higher hourly limits than our Free and Developer Plan Tiers which are designed for personal use.

| Plan                             | Limit          | Window |
| -------------------------------- | -------------- | ------ |
| Developer (no payment on file)   | 50,000 events  | 1 hour |
| Developer (with payment on file) | 250,000 events | 1 hour |
| Startup/Plus                     | 500,000 events | 1 hour |
| Enterprise                       | Custom         | Custom |

#### Plan-level hourly trace data ingest limit

This 429 is the result of reaching the maximum amount of data ingested across your trace inputs, outputs, and metadata and is evaluated in a fixed window starting at the beginning of each clock hour in UTC and resets at the top of each new hour.

Typically, inputs, outputs, and metadata are sent on both run creation and update events. So if a run is 2.0MB in size at creation and 3.0MB in size when updated in the same hourly window, that counts as 5.0MB of data against this limit.

This is thrown by our application and varies by plan tier, with organizations on our Startup/Plus and Enterprise plan tiers having higher hourly limits than our Free and Developer Plan Tiers which are designed for personal use.

| Plan                             | Limit  | Window |
| -------------------------------- | ------ | ------ |
| Developer (no payment on file)   | 500MB  | 1 hour |
| Developer (with payment on file) | 2.5GB  | 1 hour |
| Startup/Plus                     | 5.0GB  | 1 hour |
| Enterprise                       | Custom | Custom |

#### Plan-level monthly unique traces limit

This 429 is the result of reaching your maximum monthly traces ingested and is evaluated in a fixed window starting at the beginning of each calendar month in UTC and resets at the beginning of each new month.

This is thrown by our application and applies only to the Developer Plan Tier when there is no payment method on file.

| Plan                           | Limit        | Window  |
| ------------------------------ | ------------ | ------- |
| Developer (no payment on file) | 5,000 traces | 1 month |

#### Self-configured monthly usage limits

This 429 is the result of reaching your usage limit as configured by your organization admin and is evaluated in a fixed window starting at the beginning of each calendar month in UTC and resets at the beginning of each new month.

This is thrown by our application and varies by organization based on their configured settings.

#### Handling 429 responses in your application

Since some 429 responses are temporary and may succeed on a successive call, if you are directly calling the LangSmith API in your application we recommend implementing retry logic with exponential backoff and jitter.

For convenience, LangChain applications built with the LangSmith SDK have this capability built in.
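If you call the API directly, the retry logic might look like the minimal sketch below. It assumes a hypothetical `send` callable that performs one request and returns a `(status_code, body)` tuple; the function name, defaults, and the choice of "full jitter" (sleeping a random fraction of the capped exponential delay) are illustrative, not a description of the SDK's internals.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (status_code, body).
    `sleep` and `rng` are injectable so the behavior can be tested.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff capped at max_delay, with full jitter.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)
    # All attempts returned 429; surface the last response to the caller.
    return status, body


# Usage with a fake sender that is rate limited twice before succeeding.
responses = iter([(429, "rate limited"), (429, "rate limited"), (200, "ok")])
print(call_with_backoff(lambda: next(responses), sleep=lambda seconds: None))
```

Jitter spreads retries from many clients over time so they do not all hit the start of the next window together.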

<Note>
  If you saturate the endpoints for extended periods of time, retries may not be effective, because your application will eventually build up backlogs large enough to exhaust all retries.

  If that is the case, we would like to discuss your needs more specifically. Please reach out to [LangSmith Support](mailto:support@langchain.dev) with details about your application's throughput needs and sample code, and we can work with you to better understand whether the best approach is fixing a bug, changing your application code, or moving to a different LangSmith plan.
</Note>

### Usage Limits

LangSmith lets you configure usage limits on tracing. Note that these are *usage* limits, not *spend* limits, which means they let you limit the number of occurrences of some event rather than the total amount you will spend.

LangSmith lets you set two different monthly limits, mirroring our Billable Metrics discussed in the aforementioned data retention guide:

* All traces limit
* Extended data retention traces limit

These let you limit the number of total traces and the number of extended data retention traces, respectively.

#### Properties of usage limiting

Usage limiting is approximate, meaning that we do not guarantee the exactness of the limit. In rare cases, there may be a small period of time where additional traces are processed above the limit threshold before usage limiting begins to apply.

#### Side effects of extended data retention traces limit

The extended data retention traces limit has side effects. If the limit is already reached, any feature that could cause an auto-upgrade of tracing tiers becomes inaccessible. This is because an auto-upgrade of a trace would cause another extended retention trace to be created, which in turn should not be allowed by the limit. Therefore, you can no longer:

1. match run rules
2. add feedback to traces
3. add runs to annotation queues

Each of these features may cause an auto upgrade, so we shut them off when the limit is reached.

#### Updating usage limits

Usage limits can be updated from the `Settings` page under `Usage and Billing`. Limit values are cached, so it may take a minute or two before the new limits apply.

### Related content

* Tutorial on how to [optimize spend](/langsmith/billing#optimize-your-tracing-spend)

## Additional Resources

* **[Release Versions](/langsmith/release-versions)**: Learn about LangSmith's version support policy, including Active, Critical, End of Life, and Deprecated support levels.

***

<Callout icon="pen-to-square" iconType="regular">
  [Edit the source of this page on GitHub.](https://github.com/langchain-ai/docs/edit/main/src/langsmith/administration-overview.mdx)
</Callout>

<Tip icon="terminal" iconType="regular">
  [Connect these docs programmatically](/use-these-docs) to Claude, VSCode, and more via MCP for real-time answers.
</Tip>


# Set up Agent Auth (Beta)
Source: https://docs.langchain.com/langsmith/agent-auth

Enable secure access from agents to any system using OAuth 2.0 credentials with Agent Auth.

<Note>Agent Auth is in **Beta** and under active development. To provide feedback or use this feature, reach out to the [LangChain team](https://forum.langchain.com/c/help/langsmith/).</Note>

## Installation

Install the Agent Auth client library from PyPI:

<CodeGroup>
  ```bash pip theme={null}
  pip install langchain-auth
  ```

  ```bash uv theme={null}
  uv add langchain-auth
  ```
</CodeGroup>

## Quickstart

### 1. Initialize the client

```python  theme={null}
from langchain_auth import Client

client = Client(api_key="your-langsmith-api-key")
```

### 2. Set up OAuth providers

Before agents can authenticate, you need to configure an OAuth provider using the following process:

1. Select a unique identifier for your OAuth provider to use in LangChain's platform (e.g., "github-local-dev", "google-workspace-prod").

2. Go to your OAuth provider's developer console and create a new OAuth application.

3. Set LangChain's API as an available callback URL using this structure:
   ```
   https://api.host.langchain.com/v2/auth/callback/{provider_id}
   ```
   For example, if your provider\_id is "github-local-dev", use:
   ```
   https://api.host.langchain.com/v2/auth/callback/github-local-dev
   ```

4. Use `client.create_oauth_provider()` with the credentials from your OAuth app:

```python  theme={null}
new_provider = await client.create_oauth_provider(
    provider_id="{provider_id}", # Provide any unique ID. Not formally tied to the provider.
    name="{provider_display_name}", # Provide any display name
    client_id="{your_client_id}",
    client_secret="{your_client_secret}",
    auth_url="{auth_url_of_your_provider}",
    token_url="{token_url_of_your_provider}",
)
```

### 3. Authenticate from an agent

The client `authenticate()` API is used to get OAuth tokens from pre-configured providers. On the first call, it takes the caller through an OAuth 2.0 auth flow.

#### In LangGraph context

By default, tokens are scoped to the calling agent using the Assistant ID parameter.

```python  theme={null}
auth_result = await client.authenticate(
    provider="{provider_id}",
    scopes=["scopeA"],
    user_id="your_user_id" # Any unique identifier to scope this token to the human caller
)

# Or if you'd like a token that can be used by any agent, set agent_scoped=False
auth_result = await client.authenticate(
    provider="{provider_id}",
    scopes=["scopeA"],
    user_id="your_user_id",
    agent_scoped=False
)
```

During execution, if authentication is required, the SDK will throw an [interrupt](https://langchain-ai.github.io/langgraph/how-tos/human_in_the_loop/add-human-in-the-loop/#pause-using-interrupt). The agent execution pauses and presents the OAuth URL to the user:

<img src="https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=94f84dd7ec822ca69f9a27b4458dca9f" alt="Studio interrupt showing OAuth URL" data-og-width="1197" width="1197" data-og-height="530" height="530" data-path="images/langgraph-auth-interrupt.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?w=280&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=8e2f6ddeb7ae2b7e3f349a23ed69270a 280w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?w=560&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=ed5f6697e44784a6a937f6bfd3248780 560w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?w=840&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=bb34295ee4128adb77cdf6dd1a76d88a 840w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?w=1100&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=09df9030e048467ca35ab70bf73b2272 1100w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?w=1650&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=ebfe20351ac52045b30713007da5ba61 1650w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/langgraph-auth-interrupt.png?w=2500&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=ff3b2fcebfdb6fc76e7269d8aef34077 2500w" />

After the user completes OAuth authentication and we receive the callback from the provider, they will see the auth success page.

<img src="https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=72e6492f074507bc8888804066205fcb" alt="GitHub OAuth success page" data-og-width="447" width="447" data-og-height="279" height="279" data-path="images/github-auth-success.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?w=280&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=031b2f9d30e4da4240059cb25fba6d15 280w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?w=560&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=eb4d01516b4691158a47a8b2632d22e3 560w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?w=840&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=e49b04f99e4c2f485769443da039bca1 840w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?w=1100&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=930aee5e270d2fcb4d6bdfb001150d81 1100w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?w=1650&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=3b5dc841251462c3ed140800564c0ad8 1650w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/images/github-auth-success.png?w=2500&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=0e53fbc4c56b16bf1db88c98ab2e631d 2500w" />

The agent then resumes execution from where it left off, and the token can be used for any API calls. We store and refresh OAuth tokens so that future uses of the service by either the user or agent do not require an OAuth flow.

```python  theme={null}
token = auth_result.token
```
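As a sketch of how the token might then be used, the snippet below attaches it to an outgoing request. It assumes the provider accepts OAuth 2.0 bearer tokens in the `Authorization` header; the URL, the token value, and the `build_authorized_request` helper are hypothetical placeholders.

```python
import urllib.request


def build_authorized_request(url: str, token: str) -> urllib.request.Request:
    """Build a request that carries the OAuth token as a bearer credential."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


# Placeholder values; in an agent, pass auth_result.token instead.
request = build_authorized_request("https://api.example.com/v1/me", "example-token")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.get_header("Authorization"))
```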

#### Outside LangGraph context

Provide the `auth_url` to the user for out-of-band OAuth flows.

```python  theme={null}
# Default: user-scoped token (works for any agent under this user)
auth_result = await client.authenticate(
    provider="{provider_id}",
    scopes=["scopeA"],
    user_id="your_user_id"
)

if auth_result.needs_auth:
    print(f"Complete OAuth at: {auth_result.auth_url}")
    # Wait for completion
    completed_auth = await client.wait_for_completion(auth_result.auth_id)
    token = completed_auth.token
else:
    token = auth_result.token
```

***



# Agent Builder
Source: https://docs.langchain.com/langsmith/agent-builder



<Callout icon="wand-magic-sparkles" color="#2563EB" iconType="regular">
  Agent Builder is in Beta.
</Callout>

Agent Builder lets you turn natural-language ideas into production agents. It's powered by [deep-agents](https://github.com/langchain-ai/deepagents), and is not <Tooltip tip="Predetermined code paths that are designed to operate in a certain order.">workflow based</Tooltip>.

## Memory and updates

Agent Builder includes persistent agent memory and supports self-updates. This lets agents adapt over time and refine how they work without manual edits.

* Persistent memory: Agents retain relevant information across runs to inform future decisions.
* What can be updated: Tools (add, remove, or reconfigure), and instructions/system prompts.
* What cannot be updated: The agent's name, description, and attached triggers.

## Triggers

Triggers define when your agent should start running. You can connect your agent to external tools or time-based schedules, letting it respond automatically to messages, emails, or recurring events.

The following examples show some of the apps you can use to trigger your agent:

<CardGroup cols={3}>
  <Card title="Slack" icon="slack">
    Activate your agent when messages are received in specific Slack channels.
  </Card>

  <Card title="Gmail" icon="envelope">
    Trigger your agent when emails are received.
  </Card>

  <Card title="Cron schedules" icon="clock">
    Run your agent on a time-based schedule for recurring tasks.
  </Card>
</CardGroup>

## Sub-agents

Agent Builder lets you create sub-agents within a main agent. Sub-agents are smaller, specialized agents that handle specific parts of a larger task. They can operate with their own tools, permissions, or goals while coordinating with the main agent.

Using sub-agents makes it easier to build complex systems by dividing work into focused, reusable components. This modular approach helps keep your agents organized, scalable, and easier to maintain.

Below are a few ways sub-agents can be used in your projects:

* Handle distinct parts of a broader workflow (for example, data retrieval, summarization, or formatting).
* Use different tools or context windows for specialized tasks.
* Run independently but report results back to the main agent.

## Human in the loop

Human-in-the-loop functionality allows you to review and approve agent actions before they execute, giving you control over critical decisions.

### Enabling interrupts

<Steps>
  <Step title="Select a tool">
    When configuring your agent in Agent Builder, select the tool you want to add human oversight to.
  </Step>

  <Step title="Enable interrupts">
    Look for the interrupt option when selecting the tool and toggle it on.
  </Step>

  <Step title="Agent pauses for approval">
    The agent will pause and wait for human approval before executing that tool.
  </Step>
</Steps>

### Actions on interrupts

When your agent reaches an interrupt point, you can take one of three actions:

<CardGroup cols={3}>
  <Card title="Accept" icon="check">
    Approve the agent's proposed action and allow it to proceed as planned.
  </Card>

  <Card title="Edit" icon="pen-to-square">
    Modify the agent's message or parameters before allowing it to continue.
  </Card>

  <Card title="Send feedback" icon="comment">
    Provide feedback to the agent.
  </Card>
</CardGroup>

***



# LangSmith Tool Server
Source: https://docs.langchain.com/langsmith/agent-builder-mcp-framework



The LangSmith Tool Server is our MCP Framework that powers the tools available in the LangSmith Agent Builder. This framework enables you to build and deploy custom tools that can be integrated with your agents. It provides a standardized way to create, deploy, and manage tools with built-in authentication and authorization.

The PyPI package that defines the framework is available [here](https://pypi.org/project/langsmith-tool-server/).

## Quick start

Install the LangSmith Tool Server and LangChain CLI:

```bash  theme={null}
pip install langsmith-tool-server
pip install langchain-cli-v2
```

Create a new toolkit:

```bash  theme={null}
langchain tools new my-toolkit
cd my-toolkit
```

This creates a toolkit with the following structure:

```
my-toolkit/
├── pyproject.toml
├── toolkit.toml
└── my_toolkit/
    ├── __init__.py
    ├── auth.py
    └── tools/
        ├── __init__.py
        └── ...
```

Define your tools using the `@tool` decorator:

```python  theme={null}
from langsmith_tool_server import tool

@tool
def hello(name: str) -> str:
    """Greet someone by name."""
    return f"Hello, {name}!"

@tool
def add(x: int, y: int) -> int:
    """Add two numbers."""
    return x + y

TOOLS = [hello, add]
```

Run the server:

```bash  theme={null}
langchain tools serve
```

Your tool server will start on `http://localhost:8000`.

## Simple client example

Here's a simple example that lists available tools and calls the `add` tool:

```python  theme={null}
import asyncio
import aiohttp

async def mcp_request(url: str, method: str, params: dict = None):
    async with aiohttp.ClientSession() as session:
        payload = {"jsonrpc": "2.0", "method": method, "params": params or {}, "id": 1}
        async with session.post(f"{url}/mcp", json=payload) as response:
            return await response.json()

async def main():
    url = "http://localhost:8000"

    tools = await mcp_request(url, "tools/list")
    print(f"Tools: {tools}")

    # Argument names must match the parameters of the `add` tool defined above (x and y)
    result = await mcp_request(url, "tools/call", {"name": "add", "arguments": {"x": 5, "y": 3}})
    print(f"Result: {result}")

asyncio.run(main())
```

## Adding OAuth authentication

For tools that need to access third-party APIs (like Google, GitHub, Slack, etc.), you can use OAuth authentication with [Agent Auth](/langsmith/agent-auth).

Before using OAuth in your tools, you'll need to configure an OAuth provider in your LangSmith workspace settings. See the [Agent Auth documentation](/langsmith/agent-auth) for setup instructions.

Once configured, specify the `auth_provider` in your tool decorator:

```python  theme={null}
from langsmith_tool_server import tool, Context
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

@tool(
    auth_provider="google",
    scopes=["https://www.googleapis.com/auth/gmail.readonly"],
    integration="gmail"
)
async def read_emails(context: Context, max_results: int = 10) -> str:
    """Read recent emails from Gmail."""
    credentials = Credentials(token=context.token)
    service = build('gmail', 'v1', credentials=credentials)
    # ... Gmail API calls
    return f"Retrieved {max_results} emails"
```

Tools with `auth_provider` must:

* Have `context: Context` as the first parameter
* Specify at least one scope
* Use `context.token` to make authenticated API calls

## Using as an MCP gateway

The LangSmith Tool Server can act as an MCP gateway, aggregating tools from multiple MCP servers into a single endpoint. Configure MCP servers in your `toolkit.toml`:

```toml  theme={null}
[toolkit]
name = "my-toolkit"
tools = "./my_toolkit/__init__.py:TOOLS"

[[mcp_servers]]
name = "weather"
transport = "streamable_http"
url = "http://localhost:8001/mcp/"

[[mcp_servers]]
name = "math"
transport = "stdio"
command = "python"
args = ["-m", "mcp_server_math"]
```

All tools from connected MCP servers are exposed through your server's `/mcp` endpoint. MCP tools are prefixed with their server name to avoid conflicts (e.g., `weather.get_forecast`, `math.add`).
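To call a gateway tool, a client would use the prefixed name in the same JSON-RPC payload shape as the simple client example. The sketch below only builds the payload; the `build_tool_call` helper and the argument names passed to `math.add` are hypothetical, since the upstream server's tool schema is not shown here.

```python
import json


def build_tool_call(server: str, tool: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC `tools/call` payload for a server-prefixed tool name."""
    return {
        "jsonrpc": "2.0",
        "method": "tools/call",
        "params": {"name": f"{server}.{tool}", "arguments": arguments},
        "id": request_id,
    }


# Hypothetical arguments; check the tool's schema via `tools/list` first.
payload = build_tool_call("math", "add", {"a": 2, "b": 3})
print(json.dumps(payload, indent=2))
```

The payload can be posted to the server's `/mcp` endpoint in the same way as in the simple client example.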

## Custom authentication

Custom authentication allows you to validate requests and integrate with your identity provider. Define an authentication handler in your `auth.py` file:

```python  theme={null}
from langsmith_tool_server import Auth

auth = Auth()

@auth.authenticate
async def authenticate(authorization: str = None) -> dict:
    """Validate requests and return user identity."""
    if not authorization or not authorization.startswith("Bearer "):
        raise auth.exceptions.HTTPException(
            status_code=401,
            detail="Unauthorized"
        )

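    # Strip only the leading scheme prefix (str.removeprefix requires Python 3.9+)
    token = authorization.removeprefix("Bearer ")
    # Validate token with your identity provider.
    # verify_token_with_idp is a placeholder for your own async validation function.
    user = await verify_token_with_idp(token)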

    return {"identity": user.id}
```

The handler runs on every request and must return a dict with `identity` (and optionally `permissions`).

***



# Agent Builder setup
Source: https://docs.langchain.com/langsmith/agent-builder-setup

Add required workspace secrets for models and tools used by Agent Builder.

This page lists the workspace secrets you need to add before using Agent Builder. Add these in your LangSmith workspace settings under Secrets. Keep values scoped to your workspace and avoid placing credentials in prompts or code.

## How to add workspace secrets

In the [LangSmith UI](https://smith.langchain.com), ensure that your Anthropic API key is set as a [workspace secret](/langsmith/administration-overview#workspace-secrets).

1. Navigate to <Icon icon="gear" /> **Settings** and then move to the **Secrets** tab.
2. Select **Add secret**, enter `ANTHROPIC_API_KEY` as the key, and enter your API key as the **Value**.
3. Select **Save secret**.

<Note> When adding workspace secrets in the LangSmith UI, make sure the secret keys match the environment variable names expected by your model provider.</Note>

## Required model key

* `ANTHROPIC_API_KEY`: Required for Agent Builder models. The agent graphs load this key from workspace secrets for inference.

## Optional tool keys

Add keys for any tools you enable. These are read from workspace secrets at runtime.

* `EXA_API_KEY`: Required for Exa search tools (general web and LinkedIn profile search).
* `TAVILY_API_KEY`: Required for Tavily web search.
* `TWITTER_API_KEY` and `TWITTER_API_KEY_SECRET`: Required for Twitter/X read operations (app‑only bearer). Posting/media upload is not enabled.

***



# LangSmith Agent Builder Slack App
Source: https://docs.langchain.com/langsmith/agent-builder-slack-app

Connect the LangSmith Agent Builder to your Slack workspace to power AI agents.

The LangSmith Agent Builder Slack app integrates your agents with Slack for secure, context-aware communication inside your Slack workspace.

After installation, your agents will be able to:

* Send direct messages.
* Post to channels.
* Read thread messages.
* Reply in threads.
* Read conversation history.

## How to install

To install the LangSmith Agent Builder for Slack:

1. Navigate to Agent Builder in your [LangSmith workspace](https://smith.langchain.com).
2. Create or edit an agent.
3. Add Slack as a trigger or enable Slack tools.
4. When prompted, authorize the Slack connection.
5. Follow the OAuth flow to grant permissions to your Slack workspace.

The app will be installed automatically when you complete the authorization.

## Permissions

The LangSmith Agent Builder requires the following permissions to your Slack workspace:

* **Send messages** - Send direct messages and post to channels
* **Read messages** - Read channel history and thread messages
* **View channels** - Access basic channel information
* **View users** - Look up user information for messaging

These permissions enable agents to communicate effectively within your Slack workspace.

## Privacy policy

The LangSmith Agent Builder Slack app collects, manages, and stores third-party data in accordance with our privacy policy. For full details on how your data is handled, please see [our privacy policy](https://www.langchain.com/privacy-policy).

## AI components and disclaimers

The LangSmith Agent Builder uses large language models (LLMs) to power AI agents that interact with users in Slack. While these models are powerful, they have the potential to generate inaccurate responses, summaries, or other outputs.

### What you should know

* **AI-generated content**: All responses from agents are generated by AI and may contain errors or inaccuracies. Always verify important information.
* **Data usage**: Slack data is not used to train LLMs. Your workspace data remains private and is only used to provide agent functionality.
* **Transparency**: The Agent Builder is transparent about the actions it will take once added to your workspace, as outlined in the permissions section above.

### Technical details

The Agent Builder uses the following approach to AI:

* **Model**: Uses LLMs provided through the LangSmith platform
* **Data retention**: User data is retained according to LangSmith's data retention policies
* **Data tenancy**: Data is handled according to your LangSmith organization settings
* **Data residency**: Data residency follows your LangSmith configuration

For more information about AI safety and best practices, see the [Agent Builder documentation](/langsmith/agent-builder).

## Pricing

The LangSmith Agent Builder Slack app itself does not have any direct pricing. However, agent runs and traces are billed through the [LangSmith platform](https://smith.langchain.com) according to your organization's plan.

For current pricing information, see the [LangSmith pricing page](https://www.langchain.com/pricing).

***



# Supported tools
Source: https://docs.langchain.com/langsmith/agent-builder-tools



Use these built-in tools to give your agents access to email, calendars, chat, project management, search, social, and general web utilities.

<Note> Google, Slack, Linear, and LinkedIn use OAuth. Exa, Tavily, and Twitter/X use workspace secrets.</Note>

<CardGroup cols={2}>
  <Card title="Gmail" icon="google">
    Read and send email

    <ul>
      <li>Read emails (optionally include body, filter with search)</li>
      <li>Send email or reply to an existing message</li>
      <li>Create draft emails</li>
      <li>Mark messages as read</li>
      <li>Get a conversation thread</li>
      <li>Apply or create labels</li>
      <li>List mailbox labels</li>
    </ul>
  </Card>

  <Card title="Slack" icon="slack">
    Send and read messages

    <ul>
      <li>Send a direct message to a user</li>
      <li>Post a message to a channel</li>
      <li>Reply in a thread</li>
      <li>Read channel history</li>
      <li>Read thread messages</li>
    </ul>
  </Card>

  <div style={{ position: 'relative' }}>
    <Card title="Search" icon="magnifying-glass">
      <ul>
        <li>Exa web search (optionally fetch page contents)</li>
        <li>Exa LinkedIn profile search</li>
        <li>Tavily web search</li>
      </ul>
    </Card>

    <div style={{ position: 'absolute', top: 16, right: 16 }}>
      <Tooltip tip="Exa: EXA_API_KEY; Tavily: TAVILY_API_KEY">
        <Icon icon="key" size={16} />
      </Tooltip>
    </div>
  </div>

  <Card title="LinkedIn" icon="linkedin">
    Post to profile

    <ul>
      <li>Publish a post with optional image or link</li>
    </ul>
  </Card>

  <Card title="Google Calendar" icon="google">
    Manage events

    <ul>
      <li>List events for a date</li>
      <li>Get event details</li>
      <li>Create new events</li>
    </ul>
  </Card>

  <Card title="Linear" icon="list-check">
    Manage issues and teams

    <ul>
      <li>List teams and team members</li>
      <li>List issues with filters</li>
      <li>Get issue details</li>
      <li>Create, update, or delete issues</li>
    </ul>
  </Card>

  <div style={{ position: 'relative' }}>
    <Card title="Twitter/X" icon="twitter">
      <ul>
        <li>Read a tweet by ID</li>
        <li>Read recent posts from a list</li>
      </ul>
    </Card>

    <div style={{ position: 'absolute', top: 16, right: 16 }}>
      <Tooltip tip="Required keys: TWITTER_API_KEY, TWITTER_API_KEY_SECRET">
        <Icon icon="key" size={16} />
      </Tooltip>
    </div>
  </div>

  <Card title="Web utilities" icon="globe">
    <ul>
      <li>Read webpage text content</li>
      <li>Extract image URLs and metadata</li>
      <li>Notify user (for confirmations/updates)</li>
    </ul>
  </Card>
</CardGroup>
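
As a rough illustration of the workspace-secret requirement, the sketch below reports which key-based tools are still missing secrets. The secret names come from the tooltips above; the `REQUIRED_SECRETS` mapping and the `missing_secrets` helper are hypothetical and not part of Agent Builder.

```python
from typing import Dict, Iterable, List

# Secret names are taken from the tooltips on this page.
# The mapping and helper are illustrative only.
REQUIRED_SECRETS: Dict[str, List[str]] = {
    "exa": ["EXA_API_KEY"],
    "tavily": ["TAVILY_API_KEY"],
    "twitter": ["TWITTER_API_KEY", "TWITTER_API_KEY_SECRET"],
}


def missing_secrets(configured: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per tool, the required secret names that are not configured."""
    available = set(configured)
    report: Dict[str, List[str]] = {}
    for tool, keys in REQUIRED_SECRETS.items():
        missing = [key for key in keys if key not in available]
        if missing:
            report[tool] = missing
    return report
```

Calling `missing_secrets(["EXA_API_KEY"])` would flag the Tavily and Twitter/X tools as not yet usable.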



# Agent Server
Source: https://docs.langchain.com/langsmith/agent-server



LangSmith Deployment's **Agent Server** offers an API for creating and managing agent-based applications. It is built on the concept of [assistants](/langsmith/assistants), which are agents configured for specific tasks, and includes built-in [persistence](/oss/python/langgraph/persistence#memory-store) and a **task queue**. This versatile API supports a wide range of agentic application use cases, from background processing to real-time interactions.

Use Agent Server to create and manage [assistants](/langsmith/assistants), [threads](/oss/python/langgraph/persistence#threads), [runs](/langsmith/assistants#execution), [cron jobs](/langsmith/cron-jobs), [webhooks](/langsmith/use-webhooks), and more.

<Tip>
  **API reference**<br />
  For detailed information on the API endpoints and data models, refer to the [API reference docs](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref.html).
</Tip>

To use the Enterprise version of Agent Server, you need a license key, which you specify when running the Docker image. To acquire a license key, [contact our sales team](https://www.langchain.com/contact-sales).

You can run the Enterprise version of the Agent Server on the following LangSmith [platform](/langsmith/platform-setup) options:

* [Cloud](/langsmith/cloud)
* [Hybrid](/langsmith/hybrid)
* [Self-hosted](/langsmith/self-hosted)

## Application structure

To deploy an Agent Server application, you need to specify the graph(s) you want to deploy, as well as any relevant configuration settings, such as dependencies and environment variables.

Read the [application structure](/langsmith/application-structure) guide to learn how to structure your LangGraph application for deployment.

## Parts of a deployment

When you deploy Agent Server, you are deploying one or more [graphs](#graphs), a database for [persistence](/oss/python/langgraph/persistence), and a task queue.

### Graphs

When you deploy a graph with Agent Server, you are deploying a "blueprint" for an [Assistant](/langsmith/assistants).

An [Assistant](/langsmith/assistants) is a graph paired with specific configuration settings. You can create multiple assistants per graph, each with unique settings to accommodate different use cases that can be served by the same graph.

Upon deployment, Agent Server will automatically create a default assistant for each graph using the graph's default configuration settings.

<Note>
  We often think of a graph as implementing an [agent](/oss/python/langgraph/workflows-agents), but a graph does not necessarily need to implement an agent. For example, a graph could implement a simple chatbot that only supports back-and-forth conversation, without the ability to influence any application control flow. In reality, as applications get more complex, a graph will often implement a more complex flow that may use [multiple agents](/oss/python/langchain/multi-agent) working in tandem.
</Note>
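
The relationship between graphs and assistants can be modeled in a few lines. This is a conceptual sketch only: the `Assistant` dataclass, the `default_assistants` helper, and the naming scheme are hypothetical and do not reflect Agent Server's internal data model or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Assistant:
    """A graph ID paired with one set of configuration values."""

    graph_id: str
    name: str
    config: Dict[str, Any] = field(default_factory=dict)


def default_assistants(graph_defaults: Dict[str, Dict[str, Any]]) -> List[Assistant]:
    """Mimic 'one default assistant per deployed graph' (illustrative naming)."""
    return [
        Assistant(graph_id=graph_id, name=f"{graph_id}-default", config=dict(config))
        for graph_id, config in graph_defaults.items()
    ]


# Two assistants can share one graph while carrying different settings.
defaults = default_assistants({"support_agent": {"model": "small"}})
custom = Assistant(graph_id="support_agent", name="escalations", config={"model": "large"})
```

Here `defaults[0]` and `custom` point at the same graph blueprint but would behave differently because their configuration differs.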

### Persistence and task queue

Agent Server leverages a database for [persistence](/oss/python/langgraph/persistence) and a task queue.

Agent Server supports [PostgreSQL](https://www.postgresql.org/) as its database and [Redis](https://redis.io/) as its task queue.

If you're deploying using [LangSmith cloud](/langsmith/cloud), these components are managed for you. If you're deploying Agent Server on your [own infrastructure](/langsmith/self-hosted), you'll need to set up and manage these components yourself.

For more information on how these components are set up and managed, review the [hosting options](/langsmith/platform-setup) guide.
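
When you manage these components yourself, a quick sanity check of the connection URIs can catch misconfiguration before deployment. The sketch below is an assumption-laden example: the `check_backing_services` helper is hypothetical, and the accepted URI schemes are common conventions rather than a list taken from Agent Server.

```python
from typing import List
from urllib.parse import urlparse

# Commonly used URI schemes; assumed here, not an official list.
POSTGRES_SCHEMES = {"postgres", "postgresql"}
REDIS_SCHEMES = {"redis", "rediss"}


def check_backing_services(postgres_uri: str, redis_uri: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means both URIs look sane."""
    problems: List[str] = []
    for label, uri, schemes in (
        ("database", postgres_uri, POSTGRES_SCHEMES),
        ("task queue", redis_uri, REDIS_SCHEMES),
    ):
        parsed = urlparse(uri)
        if parsed.scheme not in schemes:
            problems.append(f"{label}: unexpected scheme {parsed.scheme!r}")
        elif not parsed.hostname:
            problems.append(f"{label}: missing host")
    return problems
```

This only validates the shape of the URIs; it does not open a connection to either service.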

## Learn more

* The [Application Structure](/langsmith/application-structure) guide explains how to structure your application for deployment.
* The [API Reference](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref.html) provides detailed information on the API endpoints and data models.



# Agent Server
Source: https://docs.langchain.com/langsmith/agent-server-api-ref





# Agent Server changelog
Source: https://docs.langchain.com/langsmith/agent-server-changelog



[Agent Server](/langsmith/agent-server) is an API platform for creating and managing agent-based applications. It provides built-in persistence and a task queue, and supports deploying, configuring, and running assistants (agentic workflows) at scale. This changelog documents all notable updates, features, and fixes to Agent Server releases.

<a id="2025-11-21" />

## v0.5.24

* Added executor metrics for Datadog and enhanced core stream API metrics for better performance tracking.
* Disabled Redis Go maintenance notifications to prevent startup errors with unsupported commands in Redis versions below 8.

<a id="2025-11-20" />

## v0.5.20

* Resolved an error in the executor service that occurred when handling large messages.

<a id="2025-11-19" />

## v0.5.19

* Upgraded built-in langchain-core to version 1.0.7 to address a prompt formatting vulnerability.

<a id="2025-11-19" />

## v0.5.18

* Introduced persistent cron threads with `on_run_completed: {keep,delete}` for enhanced cron management and retrieval options.
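
A client might validate the `on_run_completed` value before sending a cron request. The following is a minimal sketch: the `cron_options` helper and the shape of the returned dictionary are hypothetical, and only the two allowed values (`keep`, `delete`) come from this entry.

```python
from typing import Any, Dict

# The two values named in this changelog entry.
ALLOWED_ON_RUN_COMPLETED = {"keep", "delete"}


def cron_options(schedule: str, on_run_completed: str) -> Dict[str, Any]:
    """Build an illustrative cron options dict, rejecting unknown completion policies."""
    if on_run_completed not in ALLOWED_ON_RUN_COMPLETED:
        allowed = ", ".join(sorted(ALLOWED_ON_RUN_COMPLETED))
        raise ValueError(f"on_run_completed must be one of: {allowed}")
    return {"schedule": schedule, "on_run_completed": on_run_completed}
```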

<a id="2025-11-19" />

## v0.5.17

* Enhanced task handling to support multiple interrupts, aligning with open-source functionality.

<a id="2025-11-18" />

## v0.5.15

* Added custom JSON unmarshalling for `Resume` and `Goto` commands to fix map-style null resume interpretation issues.

<a id="2025-11-14" />

## v0.5.14

* Ensured `pg make start` command functions correctly with core-api enabled.

<a id="2025-11-13" />

## v0.5.13

* Added support for `include` and `exclude` (the singular forms of the `includes` and `excludes` keys), because the documentation incorrectly claimed they were supported. The server now accepts either form.
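
One way a server could accept both spellings is to fold them into a single canonical key. This sketch is an assumption, not the server's actual code: the `normalize_filters` helper is hypothetical, and merging values when both spellings are present is a design choice made for the example.

```python
from typing import Any, Dict, List, Mapping


def normalize_filters(settings: Mapping[str, Any]) -> Dict[str, List[str]]:
    """Fold `include`/`exclude` into the canonical `includes`/`excludes` keys."""
    normalized: Dict[str, List[str]] = {}
    for plural, singular in (("includes", "include"), ("excludes", "exclude")):
        # Assumption: when both spellings are given, their values are concatenated.
        values = list(settings.get(plural, [])) + list(settings.get(singular, []))
        if values:
            normalized[plural] = values
    return normalized
```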

<a id="2025-11-10" />

## v0.5.11

* Ensured auth handlers are applied consistently when streaming threads, aligning with recent security practices.
* Bumped `undici` dependency from version 6.21.3 to 7.16.0, introducing various performance improvements and bug fixes.
* Updated `p-queue` from version 8.0.1 to 9.0.0, introducing new features and breaking changes, including the removal of the `throwOnTimeout` option.

<a id="2025-11-10" />

## v0.5.10

* Implemented healthcheck calls in the queue /ok handler to improve Kubernetes liveness and readiness probe compatibility.

<a id="2025-11-09" />

## v0.5.9

* Resolved an issue causing an "unbound local error" for the `elapsed` variable during a SIGINT interruption.
* Mapped the "interrupted" status to A2A's "input-required" status for better task status alignment.

<a id="2025-11-07" />

## v0.5.8

* Ensured environment variables are passed as a dictionary when starting langgraph-ui for compatibility with `uvloop`.
* Implemented CRUD operations for runs in Go, simplifying JSON merges and improving transaction readability, with PostgreSQL as a reference.

<a id="2025-11-07" />

## v0.5.7

* Replaced no-retry Redis client with a retry client to handle connection errors more effectively and reduced corresponding logging severity.

<a id="2025-11-06" />

## v0.5.6

* Added pending time metrics to provide better insights into task waiting times.
* Replaced `pb.Value` with `ChannelValue` to streamline code structure.

<a id="2025-11-05" />

## v0.5.5

* Made the Redis `health_check_interval` more frequent and configurable for better handling of idle connections.

<a id="2025-11-05" />

## v0.5.4

* Implemented `ormsgpack` with `OPT_REPLACE_SURROGATES` and updated for compatibility with the latest FastAPI release affecting custom authentication dependencies.

<a id="2025-11-03" />

## v0.5.2

* Added retry logic for PostgreSQL connections during startup to enhance deployment reliability and improved error logging for easier debugging.
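
Startup retries of this kind are often written as a bounded loop with exponential backoff. The sketch below shows that general pattern under stated assumptions: `connect_with_retry` is a hypothetical helper, the attempt count and delays are arbitrary, and a real driver would raise its own exception type rather than the built-in `ConnectionError`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds, doubling the wait after each failure."""
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError as exc:  # assumption: driver errors map to this type
            last_error = exc
            if attempt == attempts:
                break
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"could not connect after {attempts} attempts") from last_error
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.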

<a id="2025-11-03" />

## v0.5.1

* Resolved an issue where persistence was not functioning correctly with LangChain.js's createAgent feature.
* Optimized assistants CRUD performance by improving database connection pooling and gRPC client reuse, reducing latency for large payloads.

<a id="2025-10-31" />

## v0.5.0

* Updated dependency requirements to support the latest security patch, removed JSON fallback for serialization, and adjusted deserialization behavior for enhanced security.

<a id="2025-10-29" />

## v0.4.47

* Validated and auto-corrected environment configuration types using TypeAdapter.
* Added support for LangChain.js and LangGraph.js version 1.x, ensuring compatibility.
* Updated hono library from version 4.9.7 to 4.10.3, addressing a CORS middleware security issue and enhancing JWT audience validation.
* Introduced a modular benchmark framework, adding support for assistants and streams, with improvements to the existing ramp benchmark methodology.
* Introduced a gRPC API for core threads CRUD operations, with updated Python and TypeScript clients.
* Updated `hono` package from version 4.9.7 to 4.10.2, including security improvements for JWT audience validation.
* Updated `hono` dependency from version 4.9.7 to 4.10.3 to fix a security issue and improve CORS middleware handling.
* Introduced basic CRUD operations for threads, including create, get, patch, delete, search, count, and copy, with support for Go, gRPC server, and Python and TypeScript clients.

<a id="2025-10-21" />

## v0.4.46

* Added an option to enable message streaming from subgraph events, giving users more control over event notifications.

<a id="2025-10-21" />

## v0.4.45

* Implemented support for authorization on custom routes, controlled by the `enable_custom_route_auth` flag.
* Set default tracing to off for improved performance and simplified debugging.

<a id="2025-10-18" />

## v0.4.44

* Used Redis key prefix for license-related keys to prevent conflicts with existing setups.

<a id="2025-10-16" />

## v0.4.43

* Implemented a health check for Redis connections to prevent them from idling out.

<a id="2025-10-15" />

## v0.4.40

* Prevented duplicate messages in resumable run and thread streams by addressing a race condition and adding tests to ensure consistent behavior.
* Ensured that runs don't start until the pubsub subscription is confirmed to prevent message drops on startup.
* Renamed platform from langgraph to improve clarity and branding.
* Reset PostgreSQL connections after use to prevent lock holding and improved error reporting for transaction issues.

<a id="2025-10-10" />

## v0.4.39

* Upgraded `hono` from version 4.7.6 to 4.9.7, addressing a security issue related to the `bodyLimit` middleware.
* Allowed customization of the base authentication URL to enhance flexibility.
* Pinned the 'ty' dependency to a stable version using 'uv' to prevent unexpected linting failures.

<a id="2025-10-08" />

## v0.4.38

* Replaced `LANGSMITH_API_KEY` with `LANGSMITH_CONTROL_PLANE_API_KEY` to support hybrid deployments requiring license verification.
* Introduced self-hosted log ingestion support, configurable via `SELF_HOSTED_LOGS_ENABLED` and `SELF_HOSTED_LOGS_ENDPOINT` environment variables.

<a id="2025-10-06" />

## v0.4.37

* Required create permissions for copying threads to ensure proper authorization.

<a id="2025-10-03" />

## v0.4.36

* Improved error handling and added a delay to the sweep loop for smoother operation during Redis downtime or cancellation errors.
* Updated the queue entrypoint to start the core-api gRPC server when `FF_USE_CORE_API` is enabled.
* Introduced checks for invalid configurations in assistant endpoints to ensure consistency with other endpoints.

<a id="2025-10-02" />

## v0.4.35

* Resolved a timezone issue in the core API, ensuring accurate time data retrieval.
* Introduced a new `middleware_order` setting to apply authentication middleware before custom middleware, allowing finer control over protected route configurations.
* Logged the Redis URL when errors occur during Redis client creation.
* Improved Go engine/runtime context propagation to ensure consistent execution flow.
* Removed the unnecessary `assistants.put` call from the executor entrypoint to streamline the process.

<a id="2025-10-01" />

## v0.4.34

* Blocked unauthorized users from updating thread TTL settings to enhance security.

<a id="2025-10-01" />

## v0.4.33

* Improved error handling for Redis locks by logging `LockNotOwnedError` and extending initial pool migration lock timeout to 60 seconds.
* Updated the BaseMessage schema to align with the latest langchain-core version and synchronized build dependencies for consistent local development.

<a id="2025-09-30" />

## v0.4.32

* Added a Go persistence layer to the API image, enabling gRPC server operation with PostgreSQL support and enhancing configurability.
* Set the status to error when a timeout occurs to improve error handling.

<a id="2025-09-30" />

## v0.4.30

* Added support for context when using `stream_mode="events"` and included new tests for this functionality.
* Added support for overriding the server port using `$LANGGRAPH_SERVER_PORT` and removed an unnecessary Dockerfile `ARG` for cleaner configuration.
* Applied authorization filters to all table references in thread delete CTE to enhance security.
* Introduced self-hosted metrics ingestion capability, allowing metrics to be sent to an OTLP collector every minute when the corresponding environment variables are set.
* Ensured that the `set_latest` function properly updates the name and description of the version.
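
The port override mentioned in this release can be pictured as a small environment lookup. This is a sketch only: `resolve_port` is a hypothetical helper, and the fallback of `8000` is an assumed default that this changelog does not state.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 8000  # assumed fallback, not taken from the changelog


def resolve_port(env: Optional[Mapping[str, str]] = None) -> int:
    """Return the port from LANGGRAPH_SERVER_PORT, or the assumed default."""
    env = os.environ if env is None else env
    raw = env.get("LANGGRAPH_SERVER_PORT")
    if not raw:
        return DEFAULT_PORT
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```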

<a id="2025-09-26" />

## v0.4.29

* Ensured proper cleanup of Redis pubsub connections in all scenarios.

<a id="2025-09-25" />

## v0.4.28

* Added a format parameter to the queue metrics server for enhanced customization.
* Corrected `MOUNT_PREFIX` environment variable usage in CLI for consistency with documentation and to prevent confusion.
* Added a feature to log warnings when messages are dropped due to no subscribers, controllable via a feature flag.
* Added support for Bookworm and Bullseye distributions in Node images.
* Consolidated executor definitions by moving them from the `langgraph-go` repository, improving manageability and updating the checkpointer setup method for server migrations.
* Ensured correct response headers are sent for A2A, improving compatibility and communication.
* Consolidated PostgreSQL checkpoint implementation, added CI testing for the `/core` directory, fixed RemoteStore test errors, and enhanced the Store implementation with transactions.
* Added PostgreSQL migrations to the queue server to prevent errors from graphs being added before migrations are performed.

<a id="2025-09-23" />

## v0.4.27

* Replaced `coredis` with `redis-py` to improve connection handling and reliability under high traffic loads.

<a id="2025-09-22-v0.4.24" />

## v0.4.24

* Added functionality to return full message history for A2A calls in accordance with the A2A spec.
* Added a `LANGGRAPH_SERVER_HOST` environment variable to Dockerfiles to support custom host settings for dual stack mode.

<a id="2025-09-22" />

## v0.4.23

* Used a faster message codec for Redis streaming.

<a id="2025-09-19" />

## v0.4.22

* Ported long-stream handling to the run stream, join, and cancel endpoints for improved stream management.

<a id="2025-09-18" />

## v0.4.21

* Added A2A streaming functionality and enhanced testing with the A2A SDK.
* Added Prometheus metrics to track language usage in graphs, middleware, and authentication for improved insights.
* Fixed bugs in the open-source library related to message conversion for chunks.
* Removed await from pubsub subscribes to reduce flakiness in cluster tests and added retries in the shutdown suite to enhance API stability.

<a id="2025-09-11" />

## v0.4.20

* Optimized Pubsub initialization to prevent overhead and address subscription timing issues, ensuring smoother run execution.

<a id="2025-09-11" />

## v0.4.19

* Removed warnings from psycopg by addressing function checks introduced in version 3.2.10.

<a id="2025-09-11" />

## v0.4.17

* Filtered out logs with mount prefix to reduce noise in logging output.

<a id="2025-09-10" />

## v0.4.16

* Added support for implicit thread creation in A2A to streamline operations.
* Improved error serialization and emission in distributed runtime streams, enabling more comprehensive testing.

<a id="2025-09-09" />

## v0.4.13

* Monitored queue status in the health endpoint to ensure correct behavior when PostgreSQL fails to initialize.
* Addressed an issue with unequal swept ID lengths to improve log clarity.
* Enhanced streaming outputs by avoiding re-serialization of DR payloads, using msgpack byte inspection for json-like parsing.

<a id="2025-09-04" />

## v0.4.12

* Ensured metrics are returned even when experiencing database connection issues.
* Optimized update streams to prevent unnecessary data transmission.
* Upgraded `hono` from version 4.9.2 to 4.9.6 in the `storage_postgres/langgraph-api-server` for improved URL path parsing security.
* Added retries and an in-memory cache for LangSmith access calls to improve resilience against single failures.

<a id="2025-09-04" />

## v0.4.11

* Added support for TTL (time-to-live) in thread updates.

<a id="2025-09-04" />

## v0.4.10

* Updated the serde logic in the distributed runtime that sets thread values from the final checkpoint.

<a id="2025-09-02" />

## v0.4.9

* Added support for filtering search results by IDs in the search endpoint for more precise queries.
* Included configurable headers for assistant endpoints to enhance request customization.
* Implemented a simple A2A endpoint with support for agent card retrieval, task creation, and task management.

<a id="2025-08-30" />

## v0.4.7

* Stopped the inclusion of x-api-key to enhance security.

<a id="2025-08-29" />

## v0.4.6

* Fixed a race condition when joining streams, preventing duplicate start events.

<a id="2025-08-29" />

## v0.4.5

* Ensured the checkpointer starts and stops correctly before and after the queue to improve shutdown and startup efficiency.
* Resolved an issue where workers were being prematurely cancelled when the queue was cancelled.
* Prevented queue termination by adding a fallback for cases when Redis fails to wake a worker.

<a id="2025-08-28" />

## v0.4.4

* Set the custom auth `thread_id` to `None` for stateless runs to prevent conflicts.
* Improved Redis signaling in the Go runtime by adding a wakeup worker and Redis lock implementation, and updated sweep logic.

<a id="2025-08-27" />

## v0.4.3

* Added stream mode to thread stream for improved data processing.
* Added a durability parameter to runs for improved data persistence.

<a id="2025-08-27" />

## v0.4.2

* Ensured pubsub is initialized before creating a run to prevent errors from missing messages.

<a id="2025-08-25" />

## v0.4.0

* Emitted attempt messages correctly within the thread stream.
* Reduced cluster conflicts by using only the thread ID for hashing in cluster mapping, prioritizing efficiency with `stream_thread_cache`.
* Introduced a stream endpoint for threads to track all outputs across sequentially executed runs.
* Made the filter query builder in PostgreSQL more robust against malformed expressions and improved validation to prevent potential security risks.

<a id="2025-08-25" />

## v0.3.4

* Added custom Prometheus metrics for Redis/PG connection pools and switched the queue server to Uvicorn/Starlette for improved monitoring.
* Restored Wolfi image build by correcting shell command formatting and added a Makefile target for testing with nginx.

<a id="2025-08-22" />

## v0.3.3

* Added timeouts to specific Redis calls to prevent workers from being left active.
* Updated the Golang runtime and added pytest skips for unsupported functionalities, including initial support for passing store to node and message streaming.
* Introduced a reverse proxy setup for serving combined Python and Node.js graphs, with nginx handling server routing, to facilitate a Postgres/Redis backend for the Node.js API server.

<a id="2025-08-21" />

## v0.3.1

* Added a statement timeout to the pool to prevent long-running queries.

<a id="2025-08-21" />

## v0.3.0

* Set a default 15-minute statement timeout and implemented monitoring for long-running queries to ensure system efficiency.
* Stopped propagating run configurable values to the thread configuration, because doing so can cause issues on subsequent runs if you specify a `checkpoint_id`. This is a **slight breaking change** in behavior, since the thread value will no longer automatically reflect the unioned configuration of the most recent run. However, we believe the new behavior is more intuitive.
* Enhanced compatibility with older worker versions by handling event data in channel names within ops.py.

<a id="2025-08-20" />

## v0.2.137

* Fixed an unbound local error and improved logging for thread interruptions or errors, along with type updates.

<a id="2025-08-20" />

## v0.2.136

* Added enhanced logging to aid in debugging metaview issues.
* Upgraded executor and runtime to the latest version for improved performance and stability.

<a id="2025-08-19" />

## v0.2.135

* Ensured async coroutines are properly awaited to prevent potential runtime errors.

<a id="2025-08-18" />

## v0.2.134

* Enhanced search functionality to improve performance by allowing users to select specific columns for query results.

<a id="2025-08-18" />

## v0.2.133

* Added count endpoints for crons, threads, and assistants to enhance data tracking (#1132).
* Improved SSH functionality for better reliability and stability.
* Updated `@langchain/langgraph-api` to version 0.0.59 to fix an invalid state schema issue.

<a id="2025-08-15" />

## v0.2.132

* Added Go language images to enhance project compatibility and functionality.
* Printed internal PIDs for JS workers to facilitate process inspection via SIGUSR1 signal.
* Resolved a `run_pkey` error that occurred when attempting to insert duplicate runs.
* Added `ty run` command and switched to using uuid7 for generating run IDs.
* Implemented the initial Golang runtime to expand language support.
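
UUID version 7 places a millisecond timestamp in the most significant bits, so IDs generated later sort after earlier ones. The changelog does not describe how the server generates these IDs; the hand-rolled `uuid7` function below is a hypothetical sketch of the standard bit layout (48-bit timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits).

```python
import os
import time
import uuid
from typing import Optional


def uuid7(timestamp_ms: Optional[int] = None) -> uuid.UUID:
    """Build a version 7 UUID from a Unix timestamp in milliseconds plus random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant
    value = (timestamp_ms & ((1 << 48) - 1)) << 80
    value |= 0x7 << 76                            # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # RFC variant bits
    value |= rand_b
    return uuid.UUID(int=value)
```

Because the timestamp leads, sorting such IDs also sorts the records roughly by creation time, which tends to be friendlier to database indexes than fully random IDs.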

<a id="2025-08-14" />

## v0.2.131

* Added support for `object agent spec` with descriptions in JS.

<a id="2025-08-13" />

## v0.2.130

* Added a feature flag (`FF_RICH_THREADS=false`) to disable thread updates on run creation, reducing lock contention and simplifying thread status handling.
* Utilized existing connections for `aput` and `apwrite` operations to improve performance.
* Improved error handling for decoding issues to enhance data processing reliability.
* Excluded headers from logs to improve security while maintaining runtime functionality.
* Fixed an error that prevented mapping slots to a single node.
* Added debug logs to track node execution in JS deployments for improved issue diagnosis.
* Changed the default multitask strategy to enqueue, improving throughput by eliminating the need to fetch inflight runs during new run insertions.
* Optimized database operations for `Runs.next` and `Runs.sweep` to reduce redundant queries and improve efficiency.
* Improved run creation speed by skipping unnecessary inflight runs queries.

<a id="2025-08-11" />

## v0.2.129

* Stopped passing internal LGP fields to context to prevent breaking type checks.
* Exposed content-location headers to ensure correct resumability behavior in the API.

<a id="2025-08-08" />

## v0.2.128

* Ensured synchronized updates between `configurable` and `context` in assistants, preventing setup errors and supporting smoother version transitions.

<a id="2025-08-08" />

## v0.2.127

* Excluded unrequested stream modes from the resumable stream to optimize functionality.

<a id="2025-08-08" />

## v0.2.126

* Made access logger headers configurable to enhance logging flexibility.
* Debounced the Runs.stats function to reduce the frequency of expensive calls and improve performance.
* Introduced debouncing for sweepers to enhance performance and efficiency (#1147).
* Acquired a lock for TTL sweeping to prevent database spamming during scale-out operations.
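
The idea of limiting how often an expensive call runs can be sketched as a wrapper that reuses the previous result for calls arriving within a minimum interval. This is a simplification that is closer to throttling than to a strict debounce, and it is not the server's implementation; `RateLimitedCall` and its parameters are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class RateLimitedCall:
    """Run `func` at most once per `min_interval` seconds, returning the cached result otherwise."""

    def __init__(
        self,
        func: Callable[[], Any],
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._func = func
        self._min_interval = min_interval
        self._clock = clock
        self._last_run: Optional[float] = None
        self._last_result: Any = None

    def __call__(self) -> Any:
        now = self._clock()
        # Only invoke the expensive function when enough time has passed.
        if self._last_run is None or now - self._last_run >= self._min_interval:
            self._last_result = self._func()
            self._last_run = now
        return self._last_result
```

The injectable `clock` makes the behavior deterministic in tests.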

<a id="2025-08-06" />

## v0.2.125

* Updated tracing context replicas to use the new format, ensuring compatibility.

<a id="2025-08-06" />

## v0.2.123

* Added an entrypoint to the queue replica for improved deployment management.

<a id="2025-08-06" />

## v0.2.122

* Utilized persisted interrupt status in `join` to ensure correct handling of user's interrupt state after completion.

<a id="2025-08-06" />

## v0.2.121

* Consolidated events to a single channel to prevent race conditions and optimize startup performance.
* Ensured custom lifespans are invoked on queue workers for proper setup, and added tests.

<a id="2025-08-04" />

## v0.2.120

* Restored the original streaming behavior of runs, ensuring consistent inclusion of interrupt events based on `stream_mode` settings.
* Optimized `Runs.next` query to reduce average execution time from \~14.43ms to \~2.42ms, improving performance.
* Added support for stream mode "tasks" and "checkpoints", normalized the UI namespace, and upgraded `@langchain/langgraph-api` for enhanced functionality.

<a id="2025-07-31" />

## v0.2.117

* Added a composite index on threads for faster searches with owner-based authentication and updated the default sort order to `updated_at` for improved query performance.

<a id="2025-07-31" />

## v0.2.116

* Reduced the default number of history checkpoints from 10 to 1 to optimize performance.

<a id="2025-07-31" />

## v0.2.115

* Optimized cache re-use to enhance application performance and efficiency.

<a id="2025-07-30" />

## v0.2.113

* Improved thread search pagination by updating response headers with `X-Pagination-Total` and `X-Pagination-Next` for better navigation.
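
A client could read these headers to decide whether to request another page. The sketch below assumes both header values are integers, with `X-Pagination-Next` holding the next offset; the changelog does not specify their format, and `read_pagination` is a hypothetical helper.

```python
from typing import Mapping, Optional, Tuple


def read_pagination(headers: Mapping[str, str]) -> Tuple[Optional[int], Optional[int]]:
    """Return (total, next_offset) parsed from the pagination headers, or None when absent."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name: str) -> Optional[int]:
        value = lowered.get(name)
        return int(value) if value else None

    return as_int("x-pagination-total"), as_int("x-pagination-next")
```

A missing `X-Pagination-Next` value is treated here as the signal that no further pages remain.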

<a id="2025-07-30" />

## v0.2.112

* Ensured sync logging methods are awaited and added a linter to prevent future occurrences.
* Fixed an issue where JavaScript tasks were not being populated correctly for JS graphs.

<a id="2025-07-29" />

## v0.2.111

* Fixed JS graph streaming failure by starting the heartbeat as soon as the connection opens.

<a id="2025-07-29" />

## v0.2.110

* Added interrupts as default values for join operations while preserving stream behavior.

<a id="2025-07-28" />

## v0.2.109

* Fixed an issue where config schema was missing when `config_type` was not set, ensuring more reliable configurations.

<a id="2025-07-28" />

## v0.2.108

* Prepared for LangGraph v0.6 compatibility with new context API support and bug fixes.

<a id="2025-07-27" />

## v0.2.107

* Implemented caching for authentication processes to enhance performance and efficiency.
* Optimized database performance by merging count and select queries.

<a id="2025-07-27" />

## v0.2.106

* Made log streams resumable, enhancing reliability and improving user experience when reconnecting.

<a id="2025-07-27" />

## v0.2.105

* Added a heapdump endpoint to save memory heap information to a file.

<a id="2025-07-25" />

## v0.2.103

* Used the correct metadata endpoint to resolve issues with data retrieval.

<a id="2025-07-24" />

## v0.2.102

* Captured interrupt events in the wait method to preserve previous behavior from langgraph 0.5.0.
* Added support for SDK structlog in the JavaScript environment for enhanced logging capabilities.

<a id="2025-07-24" />

## v0.2.101

* Corrected the metadata endpoint for self-hosted deployments.

<a id="2025-07-22" />

## v0.2.99

* Improved license check by adding an in-memory cache and handling Redis connection errors more effectively.
* Reloaded assistants to preserve manually created ones while discarding those removed from the configuration file.
* Reverted changes to ensure the UI namespace for gen UI is a valid JavaScript property name.
* Ensured that the UI namespace for generated UI is a valid JavaScript property name, improving API compliance.
* Enhanced error handling to return a 422 status code for unprocessable entity requests.

<a id="2025-07-19" />

## v0.2.98

* Added context to langgraph nodes to improve log filtering and trace visibility.

<a id="2025-07-19" />

## v0.2.97

* Improved interoperability with the ckpt ingestion worker on the main loop to prevent task scheduling issues.
* Delayed queue worker startup until after migrations are completed to prevent premature execution.
* Enhanced thread state error handling by adding specific metadata and improved response codes for better clarity when state updates fail during creation.
* Exposed the interrupt ID when retrieving the thread state to improve API transparency.

<a id="2025-07-17" />

## v0.2.96

* Added a fallback mechanism for configurable header patterns to handle exclude/include settings more effectively.

<a id="2025-07-17" />

## v0.2.95

* Avoided setting the future if it is already done to prevent redundant operations.
* Resolved compatibility errors in CI by switching from `typing.TypedDict` to `typing_extensions.TypedDict` for Python versions below 3.12.

<a id="2025-07-16" />

## v0.2.94

* Improved performance by omitting pending sends for langgraph versions 0.5 and above.
* Improved server startup logs to provide clearer warnings when the DD\_API\_KEY environment variable is set.

<a id="2025-07-16" />

## v0.2.93

* Removed the GIN index for run metadata to improve performance.

<a id="2025-07-16" />

## v0.2.92

* Enabled copying functionality for blobs and checkpoints, improving data management flexibility.

<a id="2025-07-16" />

## v0.2.91

* Reduced writes to the `checkpoint_blobs` table by inlining small values (null, numeric, str, etc.). This means we don't need to store extra values for channels that haven't been updated.

<a id="2025-07-16" />

## v0.2.90

* Improved checkpoint writes via node-local background queueing.

<a id="2025-07-15" />

## v0.2.89

* Decoupled checkpoint writing from thread/run state by removing foreign keys and updated logger to prevent timeout-related failures.

<a id="2025-07-14" />

## v0.2.88

* Removed the foreign key constraint for `thread` in the `run` table to simplify database schema.

<a id="2025-07-14" />

## v0.2.87

* Added more detailed logs for Redis worker signaling to improve debugging.

<a id="2025-07-11" />

## v0.2.86

* Honored tool descriptions in the `/mcp` endpoint to align with expected functionality.

<a id="2025-07-10" />

## v0.2.85

* Added support for the `on_disconnect` field to `runs/wait` and included disconnect logs for better debugging.

<a id="2025-07-09" />

## v0.2.84

* Removed unnecessary status updates to streamline thread handling and updated version to 0.2.84.

<a id="2025-07-09" />

## v0.2.83

* Reduced the default time-to-live for resumable streams to 2 minutes.
* Enhanced data submission logic to send data to both Beacon and LangSmith instance based on license configuration.
* Enabled submission of self-hosted data to a LangSmith instance when the endpoint is configured.

<a id="2025-07-03" />

## v0.2.82

* Addressed a race condition in background runs by implementing a lock using join, ensuring reliable execution across CTEs.

<a id="2025-07-03" />

## v0.2.81

* Optimized run streams by reducing initial wait time to improve responsiveness for older or non-existent runs.

<a id="2025-07-03" />

## v0.2.80

* Corrected parameter passing in the `logger.ainfo()` API call to resolve a TypeError.

<a id="2025-07-02" />

## v0.2.79

* Fixed a JsonDecodeError in checkpointing with remote graph by correcting JSON serialization to handle trailing slashes properly.
* Introduced a configuration flag to disable webhooks globally across all routes.

<a id="2025-07-02" />

## v0.2.78

* Added timeout retries to webhook calls to improve reliability.
* Added HTTP request metrics, including a request count and latency histogram, for enhanced monitoring capabilities.

<a id="2025-07-02" />

## v0.2.77

* Added HTTP metrics to improve performance monitoring.
* Changed the Redis cache delimiter to reduce conflicts with subgraph message names and updated caching behavior.

<a id="2025-07-01" />

## v0.2.76

* Updated Redis cache delimiter to prevent conflicts with subgraph messages.

<a id="2025-06-30" />

## v0.2.74

* Scheduled webhooks in an isolated loop to ensure thread-safe operations and prevent errors with PYTHONASYNCIODEBUG=1.

<a id="2025-06-27" />

## v0.2.73

* Fixed an infinite frame loop issue and removed the dict\_parser due to structlog's unexpected behavior.
* Returned a 409 error when a deadlock occurs during run cancellation, so that lock conflicts are handled gracefully.

<a id="2025-06-27" />

## v0.2.72

* Ensured compatibility with future langgraph versions.
* Implemented a 409 response status to handle deadlock issues during cancellation.

<a id="2025-06-26" />

## v0.2.71

* Improved logging for better clarity and detail regarding log types.

<a id="2025-06-26" />

## v0.2.70

* Improved error handling to better distinguish and log TimeoutErrors caused by users from internal run timeouts.

<a id="2025-06-26" />

## v0.2.69

* Added sorting and pagination to the crons API and updated schema definitions for improved accuracy.

<a id="2025-06-26" />

## v0.2.66

* Fixed a 404 error when creating multiple runs with the same thread\_id using `on_not_exist="create"`.

<a id="2025-06-25" />

## v0.2.65

* Ensured that only fields from `assistant_versions` are returned when necessary.
* Ensured consistent data types for in-memory and PostgreSQL users, improving internal authentication handling.

<a id="2025-06-24" />

## v0.2.64

* Added descriptions to version entries for better clarity.

<a id="2025-06-23" />

## v0.2.62

* Improved user handling for custom authentication in the JS Studio.
* Added Prometheus-format run statistics to the metrics endpoint for better monitoring.

<a id="2025-06-20" />

## v0.2.61

* Set a maximum idle time for Redis connections to prevent unnecessary open connections.

<a id="2025-06-20" />

## v0.2.60

* Enhanced error logging to include traceback details for dictionary operations.
* Added a `/metrics` endpoint to expose queue worker metrics for monitoring.

<a id="2025-06-18" />

## v0.2.57

* Removed CancelledError from retriable exceptions to allow local interrupts while maintaining retriability for workers.
* Introduced middleware to gracefully shut down the server after completing in-flight requests upon receiving a SIGINT.
* Reduced metadata stored in checkpoint to only include necessary information.
* Improved error handling in join runs to return error details when present.

<a id="2025-06-17" />

## v0.2.56

* Improved application stability by adding a handler for SIGTERM signals.

<a id="2025-06-17" />

## v0.2.55

* Improved the handling of cancellations in the queue entrypoint.

<a id="2025-06-16" />

## v0.2.54

* Enhanced error message for LuaLock timeout during license validation.
* Fixed the \$contains filter in custom auth by requiring an explicit ::text cast and updated tests accordingly.
* Ensured project and tenant IDs are formatted as UUIDs for consistency.

<a id="2025-06-13" />

## v0.2.53

* Resolved a timing issue to ensure the queue starts only after the graph is registered.
* Improved performance by setting thread and run status in a single query and enhanced error handling during checkpoint writes.
* Reduced the default background grace period to 3 minutes.

<a id="2025-06-12" />

## v0.2.52

* Logged the expected graphs when one is omitted to improve traceability.
* Implemented a time-to-live (TTL) feature for resumable streams.
* Improved query efficiency and consistency by adding a unique index and optimizing row locking.

<a id="2025-06-12" />

## v0.2.51

* Handled `CancelledError` by marking tasks as ready to retry, improving error management in worker processes.
* Added LG API version and request ID to metadata and logs for better tracking.
* Improved database performance by creating indexes concurrently.
* Ensured postgres write is committed only after the Redis running marker is set to prevent race conditions.
* Enhanced query efficiency and reliability by adding a unique index on thread\_id/running, optimizing row locks, and ensuring deterministic run selection.

<a id="2025-06-07" />

## v0.2.46

* Introduced a new connection for each operation while preserving transaction characteristics in Threads state `update()` and `bulk()` commands.

<a id="2025-06-05" />

## v0.2.45

* Enhanced streaming feature by incorporating tracing contexts.
* Removed an unnecessary query from the Crons.search function.
* Resolved connection reuse issue when scheduling next run for multiple cron jobs.

<a id="2025-06-04" />

## v0.2.44

* Enhanced the worker logic to exit the pipeline before continuing when the Redis message limit is reached.
* Introduced a ceiling for Redis message size with an option to skip messages larger than 128 MB for improved performance.
* Ensured the pipeline always closes properly to prevent resource leaks.

<a id="2025-06-04" />

## v0.2.43

* Improved performance by omitting logs in metadata calls and ensuring output schema compliance in value streaming.
* Ensured the connection is properly closed after use.
* Aligned output format to strictly adhere to the specified schema.
* Stopped sending internal logs in metadata requests to improve privacy.

<a id="2025-06-04" />

## v0.2.42

* Added timestamps to track the start and end of a request's run.
* Added tracer information to the configuration settings.
* Added support for streaming with tracing contexts.

<a id="2025-06-03" />

## v0.2.41

* Added locking mechanism to prevent errors in pipelined executions.



# Configure LangSmith Agent Server for scale
Source: https://docs.langchain.com/langsmith/agent-server-scale



The default configuration for LangSmith Agent Server is designed to handle substantial read and write load across a variety of different workloads. By following the best practices outlined below, you can tune your Agent Server to perform optimally for your specific workload. This page describes scaling considerations for the Agent Server and provides examples to help configure your deployment.

For example self-hosted configurations, refer to the [Example self-hosted Agent Server configurations](#example-self-hosted-agent-server-configurations) section.

## Scaling for write load

Write load is primarily driven by the following factors:

* Creation of new [runs](/langsmith/background-run)
* Creation of new checkpoints during run execution
* Writing to long-term memory
* Creation of new [threads](/langsmith/use-threads)
* Creation of new [assistants](/langsmith/assistants)
* Deletion of runs, checkpoints, threads, assistants and cron jobs

The following components are primarily responsible for handling write load:

* API server: Handles initial request and persistence of data to the database.
* Queue worker: Handles the execution of runs.
* Redis: Handles the storage of ephemeral data about ongoing runs.
* Postgres: Handles the storage of all data, including run, thread, assistant, cron job, checkpointing, and long-term memory.

### Best practices for scaling the write path

#### Change `N_JOBS_PER_WORKER` based on assistant characteristics

The default value of [`N_JOBS_PER_WORKER`](/langsmith/env-var#n-jobs-per-worker) is 10. You can change this value to scale the maximum number of runs that can be executed at a time by a single queue worker based on the characteristics of your assistant.

Some general guidelines for changing `N_JOBS_PER_WORKER`:

* If your assistant is CPU-bound, the default value of 10 is likely sufficient. You might lower `N_JOBS_PER_WORKER` if you notice excessive CPU usage on queue workers or delays in run execution.
* If your assistant is I/O-bound, increase `N_JOBS_PER_WORKER` to handle more concurrent runs per worker.

There is no upper limit to `N_JOBS_PER_WORKER`. However, queue workers are greedy when fetching new runs, which means they will try to pick up as many runs as they have available jobs and begin executing them immediately. Setting `N_JOBS_PER_WORKER` too high in environments with bursty traffic can lead to uneven worker utilization and increased run execution times.

#### Avoid synchronous blocking operations

Avoid synchronous blocking operations in your code and prefer asynchronous operations. Long synchronous operations can block the main event loop, causing longer request and run execution times and potential timeouts.

For example, consider an application that needs to sleep for 1 second. Instead of using synchronous code like this:

```python  theme={null}
import time

def my_function():
    time.sleep(1)
```

Prefer asynchronous code like this:

```python  theme={null}
import asyncio

async def my_function():
    await asyncio.sleep(1)
```

If an assistant requires synchronous blocking operations, set [`BG_JOB_ISOLATED_LOOPS`](/langsmith/env-var#bg-job-isolated-loops) to `True` to execute each run in a separate event loop.
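When a blocking call comes from a synchronous library and cannot be rewritten as a coroutine, one common option in standard Python is to offload it to a worker thread with `asyncio.to_thread`. The following is a minimal sketch under that assumption; `blocking_lookup` is a hypothetical stand-in for your own synchronous call, not part of Agent Server:

```python
import asyncio
import time


def blocking_lookup(delay: float) -> str:
    """Hypothetical stand-in for a synchronous call, e.g. a legacy client library."""
    time.sleep(delay)
    return "done"


async def my_function(delay: float = 1.0) -> str:
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop stays free to serve other work in the meantime.
    return await asyncio.to_thread(blocking_lookup, delay)


async def main() -> list[str]:
    # The two calls overlap instead of running back to back.
    return await asyncio.gather(my_function(0.1), my_function(0.1))
```

Running `asyncio.run(main())` returns the results of both calls. This approach suits occasional blocking calls; CPU-heavy work still competes for the same process and may be a better fit for a lower `N_JOBS_PER_WORKER`.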

#### Minimize redundant checkpointing

Minimize redundant checkpointing by setting [`durability`](/oss/python/langgraph/durable-execution#durability-modes) to the minimum value necessary to ensure your data is durable.

The default durability mode is `"async"`, meaning checkpoints are written after each step asynchronously. If an assistant needs to persist only the final state of the run, `durability` can be set to `"exit"`, storing only the final state of the run. This can be set when creating the run:

```python  theme={null}
from langgraph_sdk import get_client

client = get_client(url="<DEPLOYMENT_URL>")  # replace with your deployment URL
thread = await client.threads.create()
run = await client.runs.create(
    thread_id=thread["thread_id"],
    assistant_id="agent",
    durability="exit"
)
```

#### Self-hosted

<Note>
  These settings are only required for [self-hosted](/langsmith/self-hosted) deployments. By default, [cloud](/langsmith/cloud) deployments already have these best practices enabled.
</Note>

##### Enable the use of queue workers

By default, the API server manages the queue and does not use queue workers. You can enable the use of queue workers by setting the `queue.enabled` configuration to `true`.

```yaml  theme={null}
queue:
  enabled: true
```

This will allow the API server to offload the queue management to the queue workers, significantly reducing the load on the API server and allowing it to focus on handling requests.

##### Support a number of jobs equal to expected throughput

The more runs you execute in parallel, the more jobs you will need to handle the load. There are two main parameters to scale the available jobs:

* `number_of_queue_workers`: The number of queue workers provisioned.
* `N_JOBS_PER_WORKER`: The number of runs that a single queue worker can execute at a time. Defaults to 10.

You can calculate the available jobs with the following equation:

```
available_jobs = number_of_queue_workers * N_JOBS_PER_WORKER
```

Throughput is then the number of runs that can be executed per second by the available jobs:

```
throughput_per_second = available_jobs / average_run_execution_time_seconds
```

Therefore, the minimum number of queue workers you should provision to support your expected steady state throughput is:

```
number_of_queue_workers = throughput_per_second * average_run_execution_time_seconds / N_JOBS_PER_WORKER
```
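As a worked example, the equations above can be sketched as a small helper. The function name `min_queue_workers` is hypothetical, and rounding up to a whole worker (with a floor of one) is an assumption added here, since a fractional worker cannot be provisioned:

```python
import math


def min_queue_workers(
    throughput_per_second: float,
    average_run_execution_time_seconds: float,
    n_jobs_per_worker: int = 10,  # default value of N_JOBS_PER_WORKER
) -> int:
    """Return the minimum number of queue workers for a steady-state throughput."""
    if n_jobs_per_worker <= 0:
        raise ValueError("n_jobs_per_worker must be positive")
    # Jobs that must be available at once to sustain the target throughput.
    required_jobs = throughput_per_second * average_run_execution_time_seconds
    # Round up to a whole worker and always keep at least one.
    return max(1, math.ceil(required_jobs / n_jobs_per_worker))
```

For instance, `min_queue_workers(500, 1.0, 50)` gives 10 workers and `min_queue_workers(50, 1.0)` gives 5. Passing your peak expected throughput instead of the steady-state value gives a candidate upper bound for an autoscaler.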

##### Configure autoscaling for bursty workloads

Autoscaling is disabled by default, but should be configured for bursty workloads. Using the same calculations as the [previous section](#support-a-number-of-jobs-equal-to-expected-throughput), you can determine the maximum number of queue workers you should allow the autoscaler to scale to based on maximum expected throughput.

## Scaling for read load

Read load is primarily driven by the following factors:

* Getting the results of a [run](/langsmith/background-run)
* Getting the state of a [thread](/langsmith/use-threads)
* Searching for [runs](/langsmith/background-run), [threads](/langsmith/use-threads), [cron jobs](/langsmith/cron-jobs) and [assistants](/langsmith/assistants)
* Retrieving checkpoints and long-term memory

The following components are primarily responsible for handling read load:

* API server: Handles the request and direct retrieval of data from the database.
* Postgres: Handles the storage of all data, including run, thread, assistant, cron job, checkpointing, and long-term memory.
* Redis: Handles the storage of ephemeral data about ongoing runs, including streaming messages from queue workers to API servers.

### Best practices for scaling the read path

#### Use filtering to reduce the number of resources returned per request

[Agent Server](/langsmith/agent-server) provides a search API for each resource type. These APIs implement pagination by default and offer many filtering options. Use filtering to reduce the number of resources returned per request and improve performance.

#### Set TTLs to automatically delete old data

Set a [TTL on threads](/langsmith/configure-ttl) to automatically clean up old data. Runs and checkpoints are automatically deleted when the associated thread is deleted.

#### Avoid polling and use /join to monitor the state of a run

Avoid polling the state of a run by using the `/join` API endpoint. This method returns the final state of the run once the run is complete.

If you need to monitor the output of a run in real-time, use the `/stream` API endpoint. This method streams the run output including the final state of the run.
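With the Python SDK, waiting on a run without a polling loop can be sketched as below. This assumes `client` is a `langgraph_sdk` client created with `get_client`, whose `runs.join` method calls the `/join` endpoint; the helper name `wait_for_final_state` is hypothetical:

```python
from typing import Any


async def wait_for_final_state(client: Any, thread_id: str, run_id: str) -> Any:
    """Wait for a run to finish and return its final state.

    `client` is expected to be a langgraph_sdk client. A single call to
    `runs.join` replaces a loop that repeatedly fetches the run status.
    """
    return await client.runs.join(thread_id, run_id)
```

A single long-lived request like this places less read load on the API server and Postgres than repeated status lookups for the same run.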

#### Self-hosted

<Note>
  These settings are only required for [self-hosted](/langsmith/self-hosted) deployments. By default, [cloud](/langsmith/cloud) deployments already have these best practices enabled.
</Note>

##### Configure autoscaling for bursty workloads

Autoscaling is disabled by default, but should be configured for bursty workloads. You can determine the maximum number of API servers you should allow the autoscaler to scale to based on maximum expected throughput. The default for [cloud](/langsmith/cloud) deployments is a maximum of 10 API servers.

## Example self-hosted Agent Server configurations

<Note>
  The exact optimal configuration depends on your application complexity, request patterns, and data requirements. Use the following examples in combination with the information in the previous sections and your specific usage to update your deployment configuration as needed. If you have any questions, reach out to the LangChain team at [support@langchain.dev](mailto:support@langchain.dev).
</Note>

The following table provides an overview comparing different LangSmith Agent Server configurations for various load patterns (read requests per second / write requests per second) and standard assistant characteristics (average run execution time of 1 second, moderate CPU and memory usage):

|                                                                                                                          | **[Low / low](#low-reads-low-writes)** | **[Low / high](#low-reads-high-writes)** | **[High / low](#high-reads-low-writes)** | [Medium / medium](#medium-reads-medium-writes) | [High / high](#high-reads-high-writes) |
| :----------------------------------------------------------------------------------------------------------------------- | :------------------------------------- | :--------------------------------------- | :--------------------------------------- | :--------------------------------------------- | :------------------------------------- |
| <Tooltip tip="Number of write requests being processed by the deployment per second">Write requests per second</Tooltip> | 5                                      | 500                                      | 5                                        | 50                                             | 500                                    |
| <Tooltip tip="Number of read requests being processed by the deployment per second">Read requests per second</Tooltip>   | 5                                      | 5                                        | 500                                      | 50                                             | 500                                    |
| **API servers**<br />(1 CPU, 2Gi per server)                                                                             | 1 (default)                            | 6                                        | 10                                       | 3                                              | 15                                     |
| **Queue workers**<br />(1 CPU, 2Gi per worker)                                                                           | 1 (default)                            | 10                                       | 1 (default)                              | 5                                              | 10                                     |
| **`N_JOBS_PER_WORKER`**                                                                                                  | 10 (default)                           | 50                                       | 10                                       | 10                                             | 50                                     |
| **Redis resources**                                                                                                      | 2 Gi (default)                         | 2 Gi (default)                           | 2 Gi (default)                           | 2 Gi (default)                                 | 2 Gi (default)                         |
| **Postgres resources**                                                                                                   | 2 CPU<br />8 Gi (default)              | 4 CPU<br />16 Gi memory                  | 4 CPU<br />16 Gi                         | 4 CPU<br />16 Gi memory                        | 8 CPU<br />32 Gi memory                |

The following sample configurations enable each of these setups. Load levels are defined as:

* Low means approximately 5 requests per second
* Medium means approximately 50 requests per second
* High means approximately 500 requests per second

### Low reads, low writes <a name="low-reads-low-writes" />

The default [LangSmith Deployment](/langsmith/deployments) configuration will handle this load. No custom resource configuration is needed here.

### Low reads, high writes <a name="low-reads-high-writes" />

You have a high volume of write requests (500 per second) being processed by your deployment, but relatively few read requests (5 per second).

For this, we recommend a configuration like this:

```yaml  theme={null}
# Example configuration for low reads, high writes (5 read/500 write requests per second)
api:
  replicas: 6
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

queue:
  replicas: 10
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

config:
  numberOfJobsPerWorker: 50

redis:
  resources:
    requests:
      memory: "2Gi"
    limits:
      memory: "2Gi"

postgres:
  resources:
    requests:
      cpu: "4"
      memory: "16Gi"
    limits:
      cpu: "8"
      memory: "32Gi"
```

### High reads, low writes <a name="high-reads-low-writes" />

You have a high volume of read requests (500 per second) but relatively few write requests (5 per second).

For this, we recommend a configuration like this:

```yaml  theme={null}
# Example configuration for high reads, low writes (500 read/5 write requests per second)
api:
  replicas: 10
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

queue:
  replicas: 1  # Default, minimal write load
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

redis:
  resources:
    requests:
      memory: "2Gi"
    limits:
      memory: "2Gi"

postgres:
  resources:
    requests:
      cpu: "4"
      memory: "16Gi"
    limits:
      cpu: "8"
      memory: "32Gi"
  # Consider read replicas for high read scenarios
  readReplicas: 2
```

### Medium reads, medium writes <a name="medium-reads-medium-writes" />

This is a balanced configuration that should handle moderate read and write loads (50 read/50 write requests per second).

For this, we recommend a configuration like this:

```yaml  theme={null}
# Example configuration for medium reads, medium writes (50 read/50 write requests per second)
api:
  replicas: 3
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

queue:
  replicas: 5
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

redis:
  resources:
    requests:
      memory: "2Gi"
    limits:
      memory: "2Gi"

postgres:
  resources:
    requests:
      cpu: "4"
      memory: "16Gi"
    limits:
      cpu: "8"
      memory: "32Gi"
```

### High reads, high writes <a name="high-reads-high-writes" />

You have high volumes of both read and write requests (500 read/500 write requests per second).

For this, we recommend a configuration like this:

```yaml  theme={null}
# Example configuration for high reads, high writes (500 read/500 write requests per second)
api:
  replicas: 15
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

queue:
  replicas: 10
  resources:
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "4Gi"

config:
  numberOfJobsPerWorker: 50

redis:
  resources:
    requests:
      memory: "2Gi"
    limits:
      memory: "2Gi"

postgres:
  resources:
    requests:
      cpu: "8"
      memory: "32Gi"
    limits:
      cpu: "16"
      memory: "64Gi"
```

### Autoscaling

If your deployment experiences bursty traffic, you can enable autoscaling to scale the number of API servers and queue workers to handle the load.

Here is a sample configuration for autoscaling for high reads and high writes:

```yaml  theme={null}
api:
  autoscaling:
    enabled: true
    minReplicas: 15
    maxReplicas: 25

queue:
  autoscaling:
    enabled: true
    minReplicas: 10
    maxReplicas: 20
```

<Note>
  Ensure that your deployment environment has sufficient resources to scale to the recommended size. Monitor your applications and infrastructure to ensure optimal performance. Consider implementing monitoring and alerting to track resource usage and application performance.
</Note>



# Alerts in LangSmith
Source: https://docs.langchain.com/langsmith/alerts



<Note>
  **Self-hosted Version Requirement**

  Access to alerts requires Helm chart version **0.10.3** or later.
</Note>

## Overview

Effective observability in LLM applications requires proactive detection of failures, performance degradations, and regressions. LangSmith's alerts feature helps identify critical issues such as:

* API rate limit violations from model providers
* Latency increases for your application
* Application changes that affect feedback scores reflecting end-user experience

Alerts in LangSmith are project-scoped, requiring separate configuration for each monitored project.

## Configuring an alert

### Step 1: Navigate To Create Alert

First, navigate to the Tracing project for which you want to configure alerts. Click the Alerts icon in the top-right corner of the page to view existing alerts for that project and set up a new alert.

### Step 2: Select Metric Type

<br />

<div style={{ textAlign: 'center' }}>
    <img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=932f55b512d866906160e3ebe9a78ad7" alt="Alert Metrics" data-og-width="597" width="597" data-og-height="134" height="134" data-path="langsmith/images/alert-metric.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=9a0140bfcf9df907ccaeffc0abc6d324 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=774b40c4cf122330c3b7e7e39bffecde 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=599617a29917cffe79547c1a85d110c3 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=4e963933afa346141fc2623286f55b48 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=fcb38466705fd5d8b94443ec9916a6ee 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-metric.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=d738df80eee5db727e6627c4a0e85ce9 2500w" />
</div>

LangSmith offers threshold-based alerting on three core metrics:

| Metric Type        | Description                         | Use Case                                                                                                                                                |
| ------------------ | ----------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Errored Runs**   | Track runs with an error status     | Monitors for failures in an application.                                                                                                                |
| **Feedback Score** | Measures the average feedback score | Track [feedback from end users](/langsmith/attach-user-feedback) or [online evaluation results](/langsmith/online-evaluations) to alert on regressions. |
| **Latency**        | Measures average run execution time | Tracks the latency of your application to alert on spikes and performance bottlenecks.                                                                  |

Additionally, for **Errored Runs** and **Latency**, you can define filters to narrow down the runs that trigger alerts. For example, you might create an error alert filter for all `llm` runs tagged with `support_agent` that encounter a `RateLimitExceeded` error.

<div style={{ textAlign: 'center' }}>
    <img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=b2dd48ba21e857c8a99a26a0d896f950" alt="Alert Metrics" data-og-width="407" width="407" data-og-height="273" height="273" data-path="langsmith/images/alerts-filter.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=d776aa4bb261605c45f4691b95822ad1 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=1cace263d141b044c73a8615c4c9cd15 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=a77dfdb2a2e5a119d11675fc01a857ce 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=d582ea675732440f5b4bae57ae35b766 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=8780c7b52bc0a61c938a7c75357cd068 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alerts-filter.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6d2d7349e8856d8575bed75ccde61871 2500w" />
</div>

### Step 2: Define Alert Conditions

Alert conditions consist of several components:

* **Aggregation Method**: Average, Percentage, or Count
* **Comparison Operator**: `>=`, `<=`, or exceeds threshold
* **Threshold Value**: Numerical value triggering the alert
* **Aggregation Window**: Time period for metric calculation (currently 5 or 15 minutes)
* **Feedback Key** (Feedback Score alerts only): Specific feedback metric to monitor

<br />

<div style={{ textAlign: 'center' }}>
    <img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=d92406d84dec4f1b827b82a989df30b9" alt="Alert Condition Configuration" data-og-width="597" width="597" data-og-height="112" height="112" data-path="langsmith/images/define-conditions.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=3311a45f1a32527a54c71d4966fdac3b 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=6ed12bea3c447c20bfff16e4e58d27e6 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=78955506ecd68ba0bac2ea7053837d6e 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4a0bf3da7b34bdd56777a350315b3f6a 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=56a4a9e40b9c2a870b999c52dd13dd68 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/define-conditions.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=5207fb3afe3b40873280d9f23e3e0e24 2500w" />
</div>

**Example:** The configuration shown above would generate an alert when more than 5% of runs within the past 5 minutes result in errors.
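To make the pieces of a condition concrete, here is a minimal Python sketch of how a percentage-of-errors condition over a 5-minute window could be evaluated. It is an illustration only, not LangSmith's implementation; the `Run` class and the `error_percentage` and `should_alert` helpers are hypothetical names.

```python
import operator
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    """Hypothetical stand-in for a traced run."""
    end_time: datetime
    errored: bool


def error_percentage(runs, now, window_minutes=5):
    """Percentage of runs inside the aggregation window that errored."""
    cutoff = now - timedelta(minutes=window_minutes)
    recent = [r for r in runs if cutoff <= r.end_time <= now]
    if not recent:
        return 0.0
    return 100.0 * sum(r.errored for r in recent) / len(recent)


def should_alert(value, threshold, compare=operator.gt):
    """Apply the comparison operator to the aggregated value."""
    return compare(value, threshold)


now = datetime(2025, 1, 1, 12, 0)
# 20 runs one minute ago, 2 of them errored.
runs = [Run(now - timedelta(minutes=1), errored=(i < 2)) for i in range(20)]
# An older errored run that falls outside the 5-minute window.
runs.append(Run(now - timedelta(minutes=10), errored=True))

pct = error_percentage(runs, now)
print(pct, should_alert(pct, threshold=5.0))
```

Swapping `operator.gt` for `operator.ge` or `operator.le` changes the comparison operator without touching the aggregation.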

You can preview alert behavior over a historical time window to understand how many datapoints—and which ones—would have triggered an alert at a chosen threshold (indicated in red). For example, setting an average latency threshold of 60 seconds for a project lets you visualize potential alerts, as shown in the image below.

<div style={{ textAlign: 'center' }}>
    <img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=d7f26bce1113c50bec8f5853c6448415" alt="Alert Metrics" data-og-width="863" width="863" data-og-height="545" height="545" data-path="langsmith/images/alert-preview.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=a508e02a73579624ae120276664e0e6a 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=4f7c5616752dfea80a346be50532f442 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=dd7d2d27fdb2335640d5ac43b6747baf 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=abbaea739f003fcbe97ee00e55e68927 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=55672ba9518816caf74921bc26694ffa 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=f67b99c4b5d709b1756d5b674a20dba1 2500w" />
</div>

### Step 3: Configure Notification Channel

LangSmith supports the following notification channels:

1. [PagerDuty Integration](/langsmith/alerts-pagerduty)
2. [Webhook Notifications](/langsmith/alerts-webhook)

Select the appropriate channel to ensure notifications reach the responsible team members.

## Best Practices

* Adjust sensitivity based on application criticality
* Start with broader thresholds and refine based on observed patterns
* Ensure alert routing reaches appropriate on-call personnel

***



# Configure webhook notifications for LangSmith alerts
Source: https://docs.langchain.com/langsmith/alerts-webhook



## Overview

This guide explains how to set up webhook notifications for [LangSmith alerts](/langsmith/alerts). Before proceeding, follow the [alerts guide](./alerts) up to the notification step of creating an alert. Webhooks enable integration with custom services and third-party platforms by sending HTTP POST requests when alert conditions are triggered. Use webhooks to forward alert data to ticketing systems, chat applications, or custom monitoring solutions.

## Prerequisites

* An endpoint that can receive HTTP POST requests
* Appropriate authentication credentials for your receiving service (if required)

## Integration Configuration

### Step 1: Prepare Your Receiving Endpoint

Before configuring the webhook in LangSmith, ensure your receiving endpoint:

* Accepts HTTP POST requests
* Can process JSON payloads
* Is accessible from external services
* Has appropriate authentication mechanisms (if required)

Additionally, if you are on a custom deployment of LangSmith, make sure no firewall settings block egress traffic from LangSmith services.
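As a rough sketch of an endpoint that meets the requirements above, the following uses only the Python standard library. The `parse_alert`, `AlertHandler`, and `serve` names and the port are assumptions made for illustration; a production service would add authentication and HTTPS termination.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_alert(body: bytes) -> dict:
    """Decode a JSON request body; raise ValueError if it is not a JSON object."""
    payload = json.loads(body.decode("utf-8"))
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    return payload


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_alert(self.rfile.read(length))
        except ValueError:
            # Malformed or non-object JSON: reject with 400.
            self.send_response(400)
            self.end_headers()
            return
        print("received alert:", payload)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Listen on all interfaces until interrupted."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the listener; the endpoint still needs to be reachable from wherever LangSmith runs.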

### Step 2: Configure Webhook Parameters

<img src="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=fecb6275ad3d576a864d1c6a2771c847" alt="Webhook Setup" data-og-width="754" width="754" data-og-height="523" height="523" data-path="langsmith/images/webhook-setup.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?w=280&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=ef03d3ab887113e73dbdc1097076d103 280w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?w=560&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=a25fcaedcbed92c9c3f2e2bddd8d88bd 560w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?w=840&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=4785471ce1e58f3c48ce19b7be3889c5 840w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?w=1100&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=8c7dd40aeb5635cdf4ddf207d0dfe7c7 1100w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?w=1650&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=aff125529b9db8fbf861999e70bcdb26 1650w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/webhook-setup.png?w=2500&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=e9de7c4f0dcc440d734f4f3d09d2abf4 2500w" />

In the notification section of your alert, complete the webhook configuration with the following parameters:

**Required Fields**

* **URL**: The complete URL of your receiving endpoint
  * Example: `https://api.example.com/incident-webhook`

**Optional Fields**

* **Headers**: JSON key-value pairs sent with the webhook request

  * Common headers include:

    * `Authorization`: For authentication tokens
    * `Content-Type`: Usually set to `application/json` (default)
    * `X-Source`: To identify the source as LangSmith

  * If you don't need any headers, use `{}`

* **Request Body Template**: Customize the JSON payload sent to your endpoint

  * Default: LangSmith sends the payload you define, with the following additional key-value pairs appended:

    * `project_name`: Name of the project whose alert was triggered
    * `alert_rule_id`: A UUID to identify the LangSmith alert. This can be used as a de-duplication key in the webhook service.
    * `alert_rule_name`: The name of the alert rule.
    * `alert_rule_type`: The type of alert (as of 04/01/2025 all alerts are of type `threshold`).
    * `alert_rule_attribute`: The attribute associated with the alert rule - `error_count`, `feedback_score` or `latency`.
    * `triggered_metric_value`: The value of the metric at the time the threshold was triggered.
    * `triggered_threshold`: The threshold that triggered the alert.
    * `timestamp`: The time at which the alert was triggered.
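The sketch below shows one way a receiving service might read these appended keys and use `alert_rule_id` for de-duplication. The sample body is hypothetical: every value, including the timestamp format and the numeric types, is made up for illustration, and `summarize` and `is_duplicate` are not part of any LangSmith API.

```python
import json

# Keys the list above says LangSmith appends to the payload.
APPENDED_KEYS = {
    "project_name", "alert_rule_id", "alert_rule_name", "alert_rule_type",
    "alert_rule_attribute", "triggered_metric_value", "triggered_threshold",
    "timestamp",
}

# Hypothetical request body; all values are invented for this example.
sample_body = json.dumps({
    "project_name": "support-bot",
    "alert_rule_id": "00000000-0000-0000-0000-000000000000",
    "alert_rule_name": "High error rate",
    "alert_rule_type": "threshold",
    "alert_rule_attribute": "error_count",
    "triggered_metric_value": 0.12,
    "triggered_threshold": 0.05,
    "timestamp": "2025-04-01T12:00:00Z",
})


def summarize(payload: dict) -> str:
    """Build a one-line summary, failing loudly if an expected key is absent."""
    missing = APPENDED_KEYS - payload.keys()
    if missing:
        raise KeyError(f"payload is missing keys: {sorted(missing)}")
    return (
        f"{payload['alert_rule_name']} ({payload['alert_rule_attribute']}): "
        f"{payload['triggered_metric_value']} vs threshold "
        f"{payload['triggered_threshold']}"
    )


_seen_rule_ids = set()


def is_duplicate(payload: dict) -> bool:
    """Return True if this alert rule has already been seen by this process."""
    rule_id = payload["alert_rule_id"]
    if rule_id in _seen_rule_ids:
        return True
    _seen_rule_ids.add(rule_id)
    return False


payload = json.loads(sample_body)
print(summarize(payload))
print(is_duplicate(payload), is_duplicate(payload))
```

An in-memory set is the simplest possible de-duplication store; a real service would likely expire entries so that a rule can alert again later.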

### Step 3: Test the Webhook

Click **Send Test Alert** to send a test webhook notification and confirm that it is delivered as intended.

## Troubleshooting

If webhook notifications aren't being delivered:

* Verify the webhook URL is correct and accessible
* Ensure any authentication headers are properly formatted
* Check that your receiving endpoint accepts POST requests
* Examine your endpoint's logs for received but rejected requests
* Verify your custom payload template is valid JSON
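For the last check, a quick way to validate a template before pasting it into LangSmith is to parse it locally. This is a small sketch using Python's standard `json` module; `check_template` is a hypothetical helper name.

```python
import json


def check_template(template: str):
    """Return None if the template parses as JSON, else a short error message."""
    try:
        json.loads(template)
    except json.JSONDecodeError as exc:
        return f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


good = '{"channel": "{channel_id}", "text": "alert"}'
bad = '{"channel": "{channel_id}", "text": "alert",}'  # trailing comma

print(check_template(good))
print(check_template(bad))
```

Placeholders written inside string values, as in the `good` example, do not affect JSON validity.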

## Security Considerations

* Use HTTPS for your webhook endpoints
* Implement authentication for your webhook endpoint
* Consider adding a shared secret in your headers to verify webhook sources
* Validate incoming webhook requests before processing them
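One possible way to implement the shared-secret check on the receiving side is sketched below. The header name `X-Webhook-Secret` is an assumption chosen for this example; use whatever header you configured in the alert's **Headers** field.

```python
import hmac

SECRET_HEADER = "X-Webhook-Secret"  # hypothetical header name


def is_authorized(headers: dict, expected_secret: str) -> bool:
    """Compare the supplied secret to the expected one in constant time."""
    supplied = headers.get(SECRET_HEADER, "")
    return hmac.compare_digest(supplied.encode("utf-8"), expected_secret.encode("utf-8"))


print(is_authorized({SECRET_HEADER: "s3cret"}, "s3cret"))
print(is_authorized({}, "s3cret"))
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not leak how much of the secret matched.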

## Sending alerts to Slack using a webhook

The following example configures LangSmith alerts to send notifications to Slack channels using the [`chat.postMessage`](https://api.slack.com/methods/chat.postMessage) API.

### Prerequisites

* Access to a Slack workspace
* A LangSmith project to set up alerts
* Permissions to create Slack applications

### Step 1: Create a Slack App

1. Visit the [Slack API Applications page](https://api.slack.com/apps)
2. Click **Create New App**
3. Select **From scratch**
4. Provide an **App Name** (e.g., "LangSmith Alerts")
5. Select the workspace where you want to install the app
6. Click **Create App**

### Step 2: Configure Bot Permissions

1. In the left sidebar of your Slack app configuration, click **OAuth & Permissions**

2. Scroll down to **Bot Token Scopes** under **Scopes** and click **Add an OAuth Scope**

3. Add the following scopes:

   * `chat:write` (Send messages as the app)
   * `chat:write.public` (Send messages to channels the app isn't in)
   * `channels:read` (View basic channel information)

### Step 3: Install the App to Your Workspace

1. Scroll up to the top of the **OAuth & Permissions** page
2. Click **Install to Workspace**
3. Review the permissions and click **Allow**
4. Copy the **Bot User OAuth Token** that appears (begins with `xoxb-`)

### Step 4: Configure the Webhook Alert in LangSmith

1. In LangSmith, navigate to your project
2. Select **Alerts → Create Alert**
3. Define your alert metrics and conditions
4. In the notification section, select **Webhook**
5. Configure the webhook with the following settings:

**Webhook URL**

```json  theme={null}
https://slack.com/api/chat.postMessage
```

**Headers**

```json  theme={null}
{
  "Content-Type": "application/json",
  "Authorization": "Bearer xoxb-your-token-here"
}
```

> **Note:** Replace `xoxb-your-token-here` with your actual Bot User OAuth Token.

**Request Body Template**

```json  theme={null}
{
  "channel": "{channel_id}",
  "text": "{alert_name} triggered for {project_name}",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "🚨{alert_name} has been triggered"
      }
    },
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "Please check the following link for more information:"
      }
    },
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "<{project_url}|View in LangSmith>"
      }
    }
  ]
}
```

**Note:** Fill in `channel_id`, `alert_name`, `project_name`, and `project_url` when creating the alert. To find your `project_url`, copy the URL from your browser's address bar up to, but not including, any query parameters.

6. Click **Save** to activate the webhook configuration

### Step 5: Test the Integration

1. In the LangSmith alert configuration, click **Test Alert**
2. Check your specified Slack channel for the test notification
3. Verify that the message contains the expected alert information
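If the test notification does not arrive, it can help to check the bot token and channel independently of LangSmith. The sketch below builds an equivalent `chat.postMessage` request with the Python standard library; `build_request` and `send` are hypothetical helper names, and the token, channel ID, and URL are placeholders to replace with your own values.

```python
import json
import urllib.request

SLACK_URL = "https://slack.com/api/chat.postMessage"


def build_request(token, channel_id, alert_name, project_name, project_url):
    """Build a POST request mirroring the headers and body configured above."""
    body = {
        "channel": channel_id,
        "text": f"{alert_name} triggered for {project_name}",
        "blocks": [
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": f"🚨{alert_name} has been triggered"},
            },
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": f"<{project_url}|View in LangSmith>"},
            },
        ],
    }
    return urllib.request.Request(
        SLACK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json; charset=utf-8",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return Slack's decoded JSON response."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder values; nothing is sent until send(request) is called.
request = build_request(
    token="xoxb-your-token-here",
    channel_id="C0123456789",
    alert_name="High error rate",
    project_name="support-bot",
    project_url="https://example.com/your-project-url",
)
```

Calling `send(request)` returns Slack's JSON response, whose `ok` field indicates whether the message was accepted.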

### (Optional) Step 6: Link to the Alert Preview in the Request Body

After creating an alert, you can optionally link to its preview in the webhook's request body.

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=286ebb8f90bafbdcacf9a0602aaf749c" alt="Alert Preview Pane" data-og-width="832" width="832" data-og-height="773" height="773" data-path="langsmith/images/alert-preview-pane.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=20a409a30bff44a1a8bb1b79a6a2216b 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=414bb4719617bd23452273c73327d601 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6bc7bc7aaee65f7f4afac42102047ad2 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=491244ac56f6f4bcbb64419b267df0fe 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=d47f5ba127c3f61e3cb7498f8b7568fe 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/alert-preview-pane.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6a70706db839b2d211024116ba19acef 2500w" />

To configure this:

1. Save your alert
2. Find your saved alert in the alerts table and click it
3. Copy the displayed URL
4. Click **Edit Alert**
5. Replace the existing project URL with the copied alert preview URL

## Additional Resources

* [LangSmith Alerts Documentation](/langsmith/alerts)
* [Slack chat.postMessage API Documentation](https://api.slack.com/methods/chat.postMessage)
* [Slack Block Kit Builder](https://app.slack.com/block-kit-builder/)

***



# Analyze an experiment
Source: https://docs.langchain.com/langsmith/analyze-an-experiment



This page describes some of the essential tasks for working with [*experiments*](/langsmith/evaluation-concepts#experiment) in LangSmith:

* **[Analyze a single experiment](#analyze-a-single-experiment)**: View and interpret experiment results, customize columns, filter data, and compare runs.
* **[Download experiment results as a CSV](#download-experiment-results-as-a-csv)**: Export your experiment data for external analysis and sharing.
* **[Rename an experiment](#rename-an-experiment)**: Update experiment names in both the Playground and Experiments view.

## Analyze a single experiment

After running an experiment, you can use LangSmith's experiment view to analyze the results and draw insights about your experiment's performance.

### Open the experiment view

To open the experiment view, select the relevant [*dataset*](/langsmith/evaluation-concepts#datasets) from the **Datasets & Experiments** page and then select the experiment you want to view.

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=74207f0a2422f89fdc75b23f0a88c58f" alt="Open experiment view" data-og-width="1640" width="1640" data-og-height="899" height="899" data-path="langsmith/images/select-experiment.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=fa173f885a87adac0c9ced9b3d553876 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=011a588eeca2032ab40c3612345a0b4f 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=7baa325ce358d73c61dbab0cce54222b 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=2810aba5a4ce3f7f0098f167bff7a78f 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=78d38c66b183cee29fa66959f339954c 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-experiment.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=0bf7c03ef6f6ca6e269053a984f29c3a 2500w" />

### View experiment results

#### Customize columns

By default, the experiment view shows the input, output, and reference output for each [example](/langsmith/evaluation-concepts#examples) in the dataset, along with feedback scores from evaluations and experiment metrics such as cost, token counts, latency, and status.

You can customize the columns using the **Display** button to make it easier to interpret experiment results:

* **Break out fields from inputs, outputs, and reference outputs** into their own columns. This is especially helpful if you have long inputs/outputs/reference outputs and want to surface important fields.
* **Hide and reorder columns** to create focused views for analysis.
* **Control decimal precision on feedback scores**. By default, LangSmith surfaces numerical feedback scores with a decimal precision of 2, but you can customize this setting to be up to 6 decimals.
* **Set the Heat Map thresholds** (high, middle, and low) for numeric feedback scores in your experiment. These determine the points at which score chips render as red or green:

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=b0203a449f0f7df70900735ba540d712" alt="Column heatmap configuration" data-og-width="1780" width="1780" data-og-height="1688" height="1688" data-path="langsmith/images/column-heat-map.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=1ac06a00a4d11c8455d3996e3b3cc7ea 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4c2207a21fea1078e5002d0d96c8c989 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ea673195e58c5bae676a782cb03bbbaa 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ac4708a7612a3f7e7f82db28ac3a7b91 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=da649bbe343e2fd5e87d1845d1b19944 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/column-heat-map.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=f06ba0a04ef257e2d69193021938e761 2500w" />

<Tip>
  You can set default configurations for an entire dataset or temporarily save settings just for yourself.
</Tip>

#### Sort and filter

To sort or filter feedback scores, you can use the actions in the column headers.

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=067490743d1229ae233f15e46236ed67" alt="Sort and filter" data-og-width="1633" width="1633" data-og-height="788" height="788" data-path="langsmith/images/sort-filter.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=a116c731329b3fc088b57eae1d2f41a4 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=d4ab881725f50ca23dc2650aa5376efd 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=62027a48b9fd97a335caf3bd7e99d0dc 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=137a7384faf8c1958dd309d5cfeba998 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=3ea99639ad38e02f997f66f504663b8a 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/sort-filter.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=2372af19c9bd6d45b420e870f7b77402 2500w" />

#### Table views

Depending on which view is most useful for your analysis, you can change the formatting of the table by toggling between a compact view, a full view, and a diff view.

* The **Compact** view shows each run as a one-line row, for ease of comparing scores at a glance.
* The **Full** view shows the full output for each run for digging into the details of individual runs.
* The **Diff** view shows the text difference between the reference output and the output for each run.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=fb916d33cea2f344f3483b42d3670696" alt="Diff view" data-og-width="1638" width="1638" data-og-height="969" height="969" data-path="langsmith/images/diff-mode.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=d30d2281df2f18a7c45fcae5eb839ded 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=edcf8cc4a3aca2bc3498a7c8f97b31b1 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=3d0b55249029b6d72a98fea716d4dae7 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ef89e5bf58911f5362635cfefe8ead27 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=337796d8a22712dea4c848ba8bc0e94a 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/diff-mode.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4b1019a7ec0249a9b58da199421689bc 2500w" />

#### View the traces

Hover over any of the output cells, and click on the trace icon to view the trace for that run. This will open up a trace in the side panel.

To view the entire tracing project, click on the **View Project** button in the top right of the header.

<img src="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=c94c0d2ecedf248c639c971bf29196e6" alt="View trace" data-og-width="1634" width="1634" data-og-height="835" height="835" data-path="langsmith/images/view-trace.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?w=280&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=45822be0900e0aaf9a06b9ddf7a8d91c 280w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?w=560&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=c0bc6ecc7cb67144c7899b8345d6ccef 560w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?w=840&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=85387bf5a2dd30a8411759069d4b3bbc 840w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?w=1100&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=f5edba01b477e0c8a4f7a5e81123ffb7 1100w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?w=1650&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=0241b8036e5e409a7d365e39d8d72bf1 1650w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-trace.png?w=2500&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=9a7f0042a8319a0398427665d50347d3 2500w" />

#### View evaluator runs

For evaluator scores, you can view the source run by hovering over the evaluator score cell and clicking on the arrow icon. This will open up a trace in the side panel. If you're running an [LLM-as-a-judge evaluator](/langsmith/llm-as-judge), you can view the prompt used for the evaluator in this run. If your experiment has [repetitions](/langsmith/evaluation-concepts#repetitions), you can click on the aggregate average score to find links to all of the individual runs.

<img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=fc8df7233285b0f5a4ca9b44c06fcb47" alt="View evaluator runs" data-og-width="1634" width="1634" data-og-height="831" height="831" data-path="langsmith/images/evaluator-run.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=368b84b566407689f0e1b69f7e2d1ec8 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=9f7888c2163589263d5ed0827bf9b55a 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=a9cf41bd478fa7bb86b5594f4a0f163a 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=644229f6557ca6b1b68df6be37b79cf9 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=da59385aa685d9e2c163f18385ee201c 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluator-run.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=039cad7ccdd3c0ed5d5e0e30151ed916 2500w" />

### Group results by metadata

You can add metadata to examples to categorize and organize them. For example, if you're evaluating factual accuracy on a question answering dataset, the metadata might include which subject area each question belongs to. Metadata can be added either [via the UI](/langsmith/manage-datasets-in-application#edit-example-metadata) or [via the SDK](/langsmith/manage-datasets-programmatically#update-single-example).

To analyze results by metadata, use the **Group by** dropdown in the top right corner of the experiment view and select your desired metadata key. This displays average feedback scores, latency, total tokens, and cost for each metadata group.

<Info>
  You will only be able to group by example metadata on experiments created after February 20th, 2025. Any experiments before that date can still be grouped by metadata, but only if the metadata is on the experiment traces themselves.
</Info>
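To illustrate what grouping by a metadata key computes, here is a small Python sketch over made-up rows. The row layout, the `subject` key, and the `group_average` helper are assumptions for this example and do not reflect LangSmith's internal data model.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical experiment rows: example metadata plus per-run metrics.
rows = [
    {"metadata": {"subject": "history"}, "score": 1.0, "latency": 1.2},
    {"metadata": {"subject": "history"}, "score": 0.0, "latency": 0.8},
    {"metadata": {"subject": "science"}, "score": 1.0, "latency": 2.0},
]


def group_average(rows, key, field):
    """Average `field` for each distinct value of the metadata `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["metadata"].get(key, "(none)")].append(row[field])
    return {group: mean(values) for group, values in groups.items()}


print(group_average(rows, "subject", "score"))
print(group_average(rows, "subject", "latency"))
```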

### Repetitions

If you've run your experiment with [*repetitions*](/langsmith/evaluation-concepts#repetitions), there will be arrows in the output results column so you can view outputs in the table. To view each run from the repetition, hover over the output cell and click the expanded view.

When you run an experiment with repetitions, LangSmith displays the average for each feedback score in the table. Click on the feedback score to view the feedback scores from individual runs, or to view the standard deviation across repetitions.

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=60962de04e5533d7718ca60fa9c7dcce" alt="Repetitions" data-og-width="1636" width="1636" data-og-height="959" height="959" data-path="langsmith/images/repetitions.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=8be83801a53f2544883faf173bc16ef1 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=7a924559be193efcc2c77dba3fea1231 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=25cbd580d06bda48419b83401c268c2d 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=9da3908c81d1c8fd44dde6d3ec7dfe1d 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=775af0be371e662bea7ba7e29c2f21fd 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/repetitions.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=4d593460688be852a64638f092cba9f3 2500w" />
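As a worked example of the aggregate statistics, the sketch below computes the average and standard deviation for one example's feedback score across three hypothetical repetitions. Whether LangSmith reports the sample or the population standard deviation is not stated here, so both are shown.

```python
from statistics import mean, pstdev, stdev

# Hypothetical feedback scores for one example across three repetitions.
scores = [1.0, 0.0, 1.0]

average = mean(scores)
sample_sd = stdev(scores)       # divides by n - 1
population_sd = pstdev(scores)  # divides by n

print(round(average, 3), round(sample_sd, 3), round(population_sd, 3))
```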

### Compare to another experiment

In the top right of the experiment view, you can select another experiment to compare to. This will open up a comparison view, where you can see how the two experiments compare. To learn more about the comparison view, see [how to compare experiment results](/langsmith/compare-experiment-results).

## Download experiment results as a CSV

LangSmith lets you download experiment results as a CSV file, which allows you to analyze and share your results.

To download as a CSV, click the download icon at the top of the experiment view. The icon is directly to the left of the [Compact toggle](/langsmith/compare-experiment-results#adjust-the-table-display).

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=f237eb4b252a1018097be113434c22fa" alt="Download CSV" data-og-width="1705" width="1705" data-og-height="1345" height="1345" data-path="langsmith/images/download-experiment-results-as-csv.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=679144e623fdd6a5ff643d66378f5f21 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ca7f980d6a11cf6ac6f54a2e8799ac08 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ae945f1172fad11a20e16c15ae859409 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=e64dbaffad47fb30e1d44f032533bdcb 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=72f1815228051375821c98a26edd0452 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/download-experiment-results-as-csv.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=51944ba9dddf5dddaa77b12a8a1c8d8a 2500w" />
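Once downloaded, the file can be analyzed with any CSV tooling. The following is a minimal standard-library sketch; the column names in the sample data are hypothetical, so inspect the header row of your own export to see which columns it actually contains.

```python
import csv
import io
from statistics import mean


def column_mean(csv_file, column):
    """Average a numeric column, skipping empty cells; None if no values."""
    values = []
    for row in csv.DictReader(csv_file):
        cell = (row.get(column) or "").strip()
        if cell:
            values.append(float(cell))
    return mean(values) if values else None


# Made-up stand-in for a downloaded export.
sample = io.StringIO("input,correctness\nq1,1\nq2,0\nq3,\n")
print(column_mean(sample, "correctness"))
```

For a real export, open the file with `open(path, newline="")` and pass the file object in place of `sample`.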

## Rename an experiment

<Note>
  Experiment names must be unique per workspace.
</Note>

You can rename an experiment in the LangSmith UI in:

* The [Playground](#renaming-an-experiment-in-the-playground). When running experiments in the Playground, a default name with the format `pg::prompt-name::model::uuid` (e.g. `pg::gpt-4o-mini::897ee630`) is automatically assigned.

  You can rename an experiment immediately after running it by editing its name in the Playground table header.

  <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=5b647ff1894376bbb727dabc4d73f039" alt="Edit name in playground" data-og-width="1372" width="1372" data-og-height="200" height="200" data-path="langsmith/images/rename-in-playground.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=9d505597b1d2e180ebfa05d4361d3225 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=d09703c543257203a19434a8a30458e1 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=b3a8f6f2bd9fbfb9ba0c3f6c7477e04c 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=591176ffbeb1f0a16f084f31f6717c6f 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=381a59f8752307204b3e9f82a2fdbd16 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-playground.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=cba501d5c0d826f6f61b57d01e1d94c9 2500w" />

* The [Experiments view](#renaming-an-experiment-in-the-experiments-view). When viewing results in the experiments view, you can rename an experiment by using the pencil icon beside the experiment name.

  <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=16afa853361ec265a0c7917d815f3132" alt="Edit name in experiments view" data-og-width="1628" width="1628" data-og-height="224" height="224" data-path="langsmith/images/rename-in-experiments-view.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=d87e21875d807e1553289f096137af3f 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=761aab57b06c1e3644ac633f10565e26 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=0dd7d731b4aae11a4b08644e06ac0eb9 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=333a93f0e1d2ff38f9bdcfcfed33c825 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=48d1b7cacab8e71bb4d0fe79573d6f11 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rename-in-experiments-view.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=8185ba3772c13f40ea57501dd4982bc3 2500w" />



# Custom instrumentation
Source: https://docs.langchain.com/langsmith/annotate-code



<Note>
  To stop tracing your runs, unset the `LANGSMITH_TRACING` environment variable. This does not affect `RunTree` objects or direct API users, because these low-level interfaces ignore the tracing toggle.
</Note>

There are several ways to log traces to LangSmith.

<Check>
  If you are using LangChain (either Python or JS/TS), you can skip this section and go directly to the [LangChain-specific instructions](/langsmith/trace-with-langchain).
</Check>

## Use `@traceable` / `traceable`

LangSmith makes it easy to log traces with minimal changes to your existing code with the `@traceable` decorator in Python and `traceable` function in TypeScript.

<Note>
  The `LANGSMITH_TRACING` environment variable must be set to `'true'` in order for traces to be logged to LangSmith, even when using `@traceable` or `traceable`. This allows you to toggle tracing on and off without changing your code.

  Additionally, you will need to set the `LANGSMITH_API_KEY` environment variable to your API key (see [Setup](/) for more information).

  By default, the traces will be logged to a project named `default`. To log traces to a different project, see [this section](/langsmith/log-traces-to-project).
</Note>
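These variables are usually set in your shell or deployment configuration. As a minimal sketch, assuming you prefer to set them from Python before any traced code runs, you could do something like the following. The `tracing_enabled` helper is hypothetical and only illustrates the toggle; the API key value is a placeholder.

```python
import os

# Toggle tracing without touching application code.
os.environ["LANGSMITH_TRACING"] = "true"

# Placeholder only: supply a real key from your shell or secrets manager.
os.environ.setdefault("LANGSMITH_API_KEY", "<your-api-key>")

def tracing_enabled() -> bool:
    """Hypothetical helper: report whether the toggle is currently on."""
    return os.environ.get("LANGSMITH_TRACING", "").lower() == "true"

print(tracing_enabled())
```

Setting `LANGSMITH_TRACING` to any other value, or unsetting it, turns the toggle off again.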

The `@traceable` decorator is a simple way to log traces from the LangSmith Python SDK. Simply decorate any function with `@traceable`.

In TypeScript, when you wrap a synchronous function with `traceable` (e.g. `formatPrompt` in the example below), use the `await` keyword when calling it to ensure the trace is logged correctly.

<CodeGroup>
  ```python Python theme={null}
  from langsmith import traceable
  from openai import Client

  openai = Client()

  @traceable
  def format_prompt(subject):
    return [
        {
            "role": "system",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "user",
            "content": f"What's a good name for a store that sells {subject}?"
        }
    ]

  @traceable(run_type="llm")
  def invoke_llm(messages):
    return openai.chat.completions.create(
        messages=messages, model="gpt-4o-mini", temperature=0
    )

  @traceable
  def parse_output(response):
    return response.choices[0].message.content

  @traceable
  def run_pipeline():
    messages = format_prompt("colorful socks")
    response = invoke_llm(messages)
    return parse_output(response)

  run_pipeline()
  ```

  ```typescript TypeScript theme={null}
  import { traceable } from "langsmith/traceable";
  import OpenAI from "openai";

  const openai = new OpenAI();

  const formatPrompt = traceable((subject: string) => {
    return [
      {
        role: "system" as const,
        content: "You are a helpful assistant.",
      },
      {
        role: "user" as const,
        content: `What's a good name for a store that sells ${subject}?`,
      },
    ];
  },{ name: "formatPrompt" });

  const invokeLLM = traceable(
    async ({ messages }: { messages: { role: string; content: string }[] }) => {
        return openai.chat.completions.create({
            model: "gpt-4o-mini",
            messages: messages,
            temperature: 0,
        });
    },
    { run_type: "llm", name: "invokeLLM" }
  );

  const parseOutput = traceable(
    (response: any) => {
        return response.choices[0].message.content;
    },
    { name: "parseOutput" }
  );

  const runPipeline = traceable(
    async () => {
        const messages = await formatPrompt("colorful socks");
        const response = await invokeLLM({ messages });
        return parseOutput(response);
    },
    { name: "runPipeline" }
  );

  await runPipeline();
  ```
</CodeGroup>

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-code-trace.gif?s=bb81d0cb45382f2d793d43624db6e9ba" alt="" data-og-width="822" width="822" data-og-height="480" height="480" data-path="langsmith/images/annotate-code-trace.gif" data-optimize="true" data-opv="3" />

## Use the `trace` context manager (Python only)

In Python, you can use the `trace` context manager to log traces to LangSmith. This is useful in situations where:

1. You want to log traces for a specific block of code.
2. You want control over the inputs, outputs, and other attributes of the trace.
3. It is not feasible to use a decorator or wrapper.
4. Any or all of the above.

The context manager integrates seamlessly with the `traceable` decorator and `wrap_openai` wrapper, so you can use them together in the same application.

```python  theme={null}
import openai
import langsmith as ls
from langsmith.wrappers import wrap_openai

client = wrap_openai(openai.Client())

@ls.traceable(run_type="tool", name="Retrieve Context")
def my_tool(question: str) -> str:
    return "During this morning's meeting, we solved all world conflict."

def chat_pipeline(question: str):
    context = my_tool(question)
    messages = [
        { "role": "system", "content": "You are a helpful assistant. Please respond to the user's request only based on the given context." },
        { "role": "user", "content": f"Question: {question}\nContext: {context}"}
    ]
    chat_completion = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages
    )
    return chat_completion.choices[0].message.content

app_inputs = {"input": "Can you summarize this morning's meetings?"}

with ls.trace("Chat Pipeline", "chain", project_name="my_test", inputs=app_inputs) as rt:
    output = chat_pipeline("Can you summarize this morning's meetings?")
    rt.end(outputs={"output": output})
```

## Use the `RunTree` API

Another, more explicit way to log traces to LangSmith is via the `RunTree` API. This API gives you more control over your tracing: you can manually create runs and child runs to assemble your trace. You still need to set your `LANGSMITH_API_KEY`, but `LANGSMITH_TRACING` is not necessary for this method.

This method is not recommended, as it's easier to make mistakes in propagating trace context.

<CodeGroup>
  ```python Python theme={null}
  import openai
  from langsmith.run_trees import RunTree

  # This can be a user input to your app
  question = "Can you summarize this morning's meetings?"

  # Create a top-level run
  pipeline = RunTree(
    name="Chat Pipeline",
    run_type="chain",
    inputs={"question": question}
  )
  pipeline.post()

  # This can be retrieved in a retrieval step
  context = "During this morning's meeting, we solved all world conflict."
  messages = [
    { "role": "system", "content": "You are a helpful assistant. Please respond to the user's request only based on the given context." },
    { "role": "user", "content": f"Question: {question}\nContext: {context}"}
  ]

  # Create a child run
  child_llm_run = pipeline.create_child(
    name="OpenAI Call",
    run_type="llm",
    inputs={"messages": messages},
  )
  child_llm_run.post()

  # Generate a completion
  client = openai.Client()
  chat_completion = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages
  )

  # End the runs and log them
  child_llm_run.end(outputs=chat_completion)
  child_llm_run.patch()
  pipeline.end(outputs={"answer": chat_completion.choices[0].message.content})
  pipeline.patch()
  ```

  ```typescript TypeScript theme={null}
  import OpenAI from "openai";
  import { RunTree } from "langsmith";

  // This can be a user input to your app
  const question = "Can you summarize this morning's meetings?";

  const pipeline = new RunTree({
    name: "Chat Pipeline",
    run_type: "chain",
    inputs: { question }
  });
  await pipeline.postRun();

  // This can be retrieved in a retrieval step
  const context = "During this morning's meeting, we solved all world conflict.";
  const messages = [
    { role: "system", content: "You are a helpful assistant. Please respond to the user's request only based on the given context." },
    { role: "user", content: `Question: ${question}\nContext: ${context}` }
  ];

  // Create a child run
  const childRun = await pipeline.createChild({
    name: "OpenAI Call",
    run_type: "llm",
    inputs: { messages },
  });
  await childRun.postRun();

  // Generate a completion
  const client = new OpenAI();
  const chatCompletion = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: messages,
  });

  // End the runs and log them
  childRun.end(chatCompletion);
  await childRun.patchRun();
  pipeline.end({ outputs: { answer: chatCompletion.choices[0].message.content } });
  await pipeline.patchRun();
  ```
</CodeGroup>

## Example usage

You can extend the utilities above to conveniently trace any code. Below are some example extensions:

Trace any public method in a class:

```python  theme={null}
from typing import Any, Type, TypeVar

from langsmith import traceable

T = TypeVar("T")

def traceable_cls(cls: Type[T]) -> Type[T]:
    """Instrument all public methods in a class."""
    def wrap_method(name: str, method: Any) -> Any:
        if callable(method) and not name.startswith("__"):
            return traceable(name=f"{cls.__name__}.{name}")(method)
        return method

    # Handle __dict__ case
    for name in dir(cls):
        if not name.startswith("_"):
            try:
                method = getattr(cls, name)
                setattr(cls, name, wrap_method(name, method))
            except AttributeError:
                # Skip attributes that can't be set (e.g., some descriptors)
                pass

    # Handle __slots__ case
    if hasattr(cls, "__slots__"):
        for slot in cls.__slots__:  # type: ignore[attr-defined]
            if not slot.startswith("__"):
                try:
                    method = getattr(cls, slot)
                    setattr(cls, slot, wrap_method(slot, method))
                except AttributeError:
                    # Skip slots that don't have a value yet
                    pass

    return cls

@traceable_cls
class MyClass:
    def __init__(self, some_val: int):
        self.some_val = some_val

    def combine(self, other_val: int):
        return self.some_val + other_val

# See trace: https://smith.langchain.com/public/882f9ecf-5057-426a-ae98-0edf84fdcaf9/r
MyClass(13).combine(29)
```

## Ensure all traces are submitted before exiting

LangSmith's tracing is done in a background thread to avoid obstructing your production application. This means that your process may end before all traces are successfully posted to LangSmith. Here are some options for ensuring all traces are submitted before exiting your application.
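To see why an explicit flush matters, here is a toy sketch built only on the Python standard library. It is not LangSmith's implementation: `BackgroundLogger` and its methods are hypothetical names that mimic the general pattern of queuing records and sending them from a worker thread.

```python
import queue
import threading
import time

class BackgroundLogger:
    """Toy stand-in for a client that sends records from a worker thread."""

    def __init__(self) -> None:
        self._queue = queue.Queue()
        self.sent = []
        # daemon=True: the thread does not keep the process alive, so
        # anything still queued at exit would be lost without flush().
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def log(self, record: dict) -> None:
        # Returns immediately; the caller is never blocked on the "network".
        self._queue.put(record)

    def _drain(self) -> None:
        while True:
            record = self._queue.get()
            time.sleep(0.01)  # simulate upload latency
            self.sent.append(record)
            self._queue.task_done()

    def flush(self) -> None:
        # Block until every queued record has been processed.
        self._queue.join()

logger = BackgroundLogger()
for i in range(5):
    logger.log({"run": i})

logger.flush()
print(len(logger.sent))
```

Without the `flush()` call, the script could reach its end while records are still sitting in the queue, which is the situation the options below are meant to prevent.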

### Using the LangSmith SDK

If you are using the LangSmith SDK standalone, you can use the `flush` method before exit:

<CodeGroup>
  ```python Python theme={null}
  import asyncio

  from langsmith import Client, traceable

  client = Client()

  @traceable(client=client)
  async def my_traced_func():
    # Your code here...
    pass

  try:
    asyncio.run(my_traced_func())
  finally:
    # flush() is synchronous: it blocks until queued traces are sent
    client.flush()
  ```

  ```typescript TypeScript theme={null}
  import { Client } from "langsmith";
  import { traceable } from "langsmith/traceable";

  const langsmithClient = new Client({});

  const myTracedFunc = traceable(async () => {
    // Your code here...
  },{ client: langsmithClient });

  try {
    await myTracedFunc();
  } finally {
    await langsmithClient.flush();
  }
  ```
</CodeGroup>

### Using LangChain

If you are using LangChain, please refer to our [LangChain tracing guide](/langsmith/trace-with-langchain#ensure-all-traces-are-submitted-before-exiting).

If you prefer a video tutorial, check out the [Tracing Basics video](https://academy.langchain.com/pages/intro-to-langsmith-preview) from the Introduction to LangSmith Course.



# Annotate traces and runs inline
Source: https://docs.langchain.com/langsmith/annotate-traces-inline



LangSmith allows you to manually annotate traces with feedback within the application. This can be useful for adding context to a trace, such as a user's comment or a note about a specific issue.
You can annotate a trace either inline or by sending the trace to an annotation queue, which allows you to closely inspect runs and log feedback on them one at a time.
Feedback tags are associated with your [workspace](/langsmith/administration-overview#workspaces).

<Note>
  **You can attach user feedback to ANY intermediate run (span) of the trace, not just the root span.**

  This is useful for critiquing specific parts of the LLM application, such as the retrieval step or generation step of the RAG pipeline.
</Note>

To annotate a trace inline, click `Annotate` in the upper-right corner of the trace view for any run that is part of the trace.

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=193363baef8b46e592fa63b299b407af" alt="" data-og-width="1722" width="1722" data-og-height="1035" height="1035" data-path="langsmith/images/annotate-trace-inline.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=3751be3488a8a5a488eb4277b4bc574e 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=e4c94e427fd4a4c14e5fca6b1a4fce14 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=3cc298e64a52af87bb4d46760352c958 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6f13e3f345430b443c6f45642d6031eb 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=4259e830834b95f09e4b457b1e9a2807 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-trace-inline.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=f65fa998f3962f7b971c8433c8be36aa 2500w" />

This will open up a pane that allows you to choose from feedback tags associated with your workspace and add a score for particular tags. You can also add a standalone comment. Follow [this guide](./set-up-feedback-criteria) to set up feedback tags for your workspace.
You can also set up new feedback criteria from within the pane itself.

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6a16e79d91b435f6c5de94d0d58daa59" alt="" data-og-width="1376" width="1376" data-og-height="758" height="758" data-path="langsmith/images/annotation-sidebar.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6b0ee76da4d19ca7d7b5640865f738a7 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=2e1acb5f129b2f865eef0399b0b06217 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=ac7a1f2abf6faeb918eee11b702bb450 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=1454e680d2c557541ea68c794018ceae 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=5f45921db9ec69f138be197f71ab6c22 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotation-sidebar.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=c0c75ec7d51de574a3b7b1283ed06907 2500w" />

You can use the labeled keyboard shortcuts to streamline the annotation process.



# Use annotation queues
Source: https://docs.langchain.com/langsmith/annotation-queues



*Annotation queues* provide a streamlined, directed view for human annotators to attach feedback to specific [runs](/langsmith/observability-concepts#runs). While you can always annotate [traces](/langsmith/observability-concepts#traces) inline, annotation queues provide another option to group runs together, then have annotators review and provide [feedback](/langsmith/observability-concepts#feedback) on them.

## Create an annotation queue

To create an annotation queue:

1. Navigate to the **Annotation queues** section on the left-hand navigation panel of the [LangSmith UI](https://smith.langchain.com).
2. Click **+ New annotation queue** in the top right corner.

   <img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=c5c28c10a5522af0a37f40236ed57510" alt="Create Annotation Queue form with Basic Details, Annotation Rubric, and Feedback sections." data-og-width="3456" width="3456" data-og-height="1912" height="1912" data-path="langsmith/images/create-annotation-queue-new.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=daa5c44976804eae5ca8bbfef1d0a9d0 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=167955e0202671425e6cd1476c31a756 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=71627eeab271c6d4581f00506731cc09 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=cd30341efa9d5eea82d85b63518b53a0 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=3e71d31b42b3411946f73d79e8735599 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-queue-new.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=832e7c8b99d332176bc9d9de702a6bac 2500w" />

### Basic Details

1. Fill in the form with the **Name** and **Description** of the queue. You can also assign a **default dataset** to the queue, which will streamline the process of sending the inputs and outputs of certain runs to datasets in your LangSmith [workspace](/langsmith/administration-overview#workspaces).

### Annotation Rubric

1. Draft some high-level instructions for your annotators, which will be shown in the sidebar on every run.
2. Click **+ Desired Feedback** to add feedback keys to your annotation queue. Annotators will be presented with these feedback keys on each run.
3. Add a description for each feedback key and, if the feedback is categorical, a short description of each category.

   <img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=8adfdba2649847f82543674978b0d1b1" alt="Annotation queue rubric form with instructions and desired feedback entered." data-og-width="3456" width="3456" data-og-height="1914" height="1914" data-path="langsmith/images/create-annotation-rubric.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=5d73a7688b61b3b9489aacac1223f7c6 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ede9b22be4e3ce82e4feabf86575e8a5 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=768747aa9e314c66631f27e794d9174b 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=f3061b6ba68c4d9cab997bbed2efe76e 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=27545c5b64b3b82ae9ebed853ff02168 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-annotation-rubric.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=801397056da06004808b6c38df30c139 2500w" />

   For example, with the descriptions in the previous screenshot, reviewers will see the **Annotation Rubric** details in the right-hand pane of the UI.

   <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=44452f7da89329acc06672beba4e4c0e" alt="The rendered rubric for reviewers from the example instructions." data-og-width="3456" width="3456" data-og-height="1912" height="1912" data-path="langsmith/images/rubric-for-annotators.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=fa18a86229854c27a85c341da2638501 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=67f5880b9d79e05e2dcd3703db92f79a 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=01436264374f87f9baab4ac4f3f8c161 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=80b8569a8ef3acc5b7d94a9276dc0d26 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=b4c6f43fe2e91f57de4a2790757a2083 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/rubric-for-annotators.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=aae4a68208b5b6838c3fdf77f2c57efe 2500w" />

### Collaborator Settings

When there are multiple annotators for a run:

* **Number of reviewers per run**: This determines the number of reviewers that must mark a run as **Done** for it to be removed from the queue. If you check **All workspace members review each run**, then a run will remain in the queue until all [workspace](/langsmith/administration-overview#workspaces) members have marked their review as **Done**.

  * Reviewers cannot view the feedback left by other reviewers.
  * Comments on runs are visible to all reviewers.

* **Enable reservations on runs**: When a reviewer views a run, the run is reserved for that reviewer for the specified **Reservation length**. If there are multiple reviewers per run as specified above, the run can be reserved by multiple reviewers (up to the number of reviewers per run) at the same time.

  <Tip>
    We recommend enabling reservations. This will prevent multiple annotators from reviewing the same run at the same time.
  </Tip>

  If a reviewer has viewed a run and then leaves the run without marking it **Done**, the reservation will expire after the specified **Reservation length**. The run is then released back into the queue and can be reserved by another reviewer.

  <Note>
    Clicking **Requeue** for a run's annotation will only move the current run to the end of the current user's queue; it won't affect the queue order of any other user. It will also release the reservation that the current user has on that run.
  </Note>

As a result of the **Collaborator settings**, it's possible (and likely) that the number of runs visible to an individual in an annotation queue differs from the total number of runs in the queue, and from the number visible to another user.
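The interaction between reviewers per run and reservations can be easier to follow as a small model. The sketch below is a toy illustration only, not LangSmith's implementation: the `QueuedRun` class and its methods are hypothetical, times are plain numbers in seconds, and it assumes that only concurrent reservations count against the reviewer limit.

```python
from dataclasses import dataclass, field

@dataclass
class QueuedRun:
    """Toy model of one run in an annotation queue."""
    reviewers_required: int      # "Number of reviewers per run"
    reservation_length: float    # seconds a reservation stays valid
    reservations: dict = field(default_factory=dict)  # reviewer -> expiry time
    done_by: set = field(default_factory=set)

    def reserve(self, reviewer: str, now: float) -> bool:
        # Release reservations whose length has elapsed.
        self.reservations = {
            r: expiry for r, expiry in self.reservations.items() if expiry > now
        }
        already_holds = reviewer in self.reservations
        if not already_holds and len(self.reservations) >= self.reviewers_required:
            return False  # every slot is currently reserved
        self.reservations[reviewer] = now + self.reservation_length
        return True

    def mark_done(self, reviewer: str) -> None:
        self.reservations.pop(reviewer, None)
        self.done_by.add(reviewer)

    def in_queue(self) -> bool:
        # The run leaves the queue once enough reviewers marked it Done.
        return len(self.done_by) < self.reviewers_required

run = QueuedRun(reviewers_required=2, reservation_length=60)
print(run.reserve("alice", now=0))   # True
print(run.reserve("bob", now=10))    # True
print(run.reserve("carol", now=20))  # False: both slots are reserved
print(run.reserve("carol", now=61))  # True: alice's reservation expired at t=60
```

In this model, a reviewer who cannot reserve a run simply does not see it, which is one way two reviewers can end up with different queue sizes.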

You can update these settings at any time by clicking on the pencil icon <Icon icon="pencil" /> in the **Annotation Queues** section.

## Assign runs to an annotation queue

To assign runs to an annotation queue, do one of the following:

* Click **Add to Annotation Queue** in the top-right corner of any [trace](/langsmith/observability-concepts#traces) view. You can add any intermediate [run](/langsmith/observability-concepts#runs) (span) of the trace to an annotation queue, but not the root span.

  <img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=fc604c7f91bc8795dc688c4f9db73ce9" alt="Trace view with the Add to Annotation Queue button highlighted at the top of the screen." data-og-width="1373" width="1373" data-og-height="1028" height="1028" data-path="langsmith/images/add-to-annotation-queue.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=0ff1545d09984dfb766067ad65ecbfb9 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=055554579c01cc8c48471e0d74fe27d6 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=10412c26d4042358e098631386cddbf2 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=be2c6162599ab18ef34975a439f16e93 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=4e0b51a16246b1e28d415f23d86643c7 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/add-to-annotation-queue.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=c7744f75389cd270f60bbbee9571582a 2500w" />

* Select multiple runs in the runs table, then click **Add to Annotation Queue** at the bottom of the page.

  <img src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=c6781e6a7345ef7e16ea7a0bb306a474" alt="View of the runs table with runs selected. Add to Annotation Queue button at the bottom of the page." data-og-width="1323" width="1323" data-og-height="1317" height="1317" data-path="langsmith/images/multi-select-annotation-queue.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=03fff2a1f8cc40bf86f4b9251dacc0e1 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=2cc50c1d4f1b9ec24f9e3bc4d5fabbff 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=b37ce7b181457582652a22405f240750 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=badc9ef126c220b9aa8f5e8212421b93 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=ae02333d194a955c1a6c86a2bf87c75f 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multi-select-annotation-queue.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=9dce63a0eedfc4a02a5ad822cee67bce 2500w" />

* [Set up an automation rule](/langsmith/rules) that automatically assigns runs that pass a certain filter and sampling condition to an annotation queue.

* Navigate to the **Datasets & Experiments** page and select a dataset. On the dataset's page select one or multiple [experiments](/langsmith/evaluation-concepts#experiment). At the bottom of the page, click **<Icon icon="pencil" /> Annotate**. From the resulting popup, you can either create a new queue or add the runs to an existing one.

  <img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=7622e6db855711542de24270ddc129dc" alt="Selected experiments with the Annotate button at the bottom of the page." data-og-width="3456" width="3456" data-og-height="1914" height="1914" data-path="langsmith/images/annotate-experiment.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6bc0abf70504c439b413dfb6a3ff59f7 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=c511826c68b2ecf9fb75addc50df48c3 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=450345b069f8de91a982d66bfe6ce8a9 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=fb67c24cd3f6fee0042ad3ef94fc9a59 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=83aab2b2cd3cc7a84440caa0d57a68be 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/annotate-experiment.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=08a045816ca42285ad46570280ee7061 2500w" />

<Check>
  It is often a good idea to assign runs that have a particular type of user feedback score (e.g., thumbs up, thumbs down) from the application to an annotation queue. This way, you can identify and address issues that are causing user dissatisfaction. To learn more about how to capture user feedback from your LLM application, follow the guide on [attaching user feedback](/langsmith/attach-user-feedback).
</Check>

## Review runs in an annotation queue

To review runs in an annotation queue:

1. Navigate to the **Annotation Queues** section through the left-hand navigation bar.
2. Click on the queue you want to review. This will take you to a focused, cyclical view of the runs in the queue that require review.
3. You can attach a comment, attach a score for a particular [feedback](/langsmith/observability-concepts#feedback) criterion, add the run to a dataset, or mark the run as reviewed. You can also remove the run from the queue for all users, regardless of any current reservations or settings for the queue, by clicking the **Trash** icon <Icon icon="trash" /> next to **View run**.

   <Tip>
     The keyboard shortcuts that are next to each option can help streamline the review process.
   </Tip>

   <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=9065d4b85e6165084b65d3908d61778a" alt="View of a run with the Annotate side panel. Keyboard shortcuts visible for options." data-og-width="1532" width="1532" data-og-height="1080" height="1080" data-path="langsmith/images/review-runs.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=f69f94916e247ff498e1d9e5ed2755a2 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=64f16b8b83ffc56d1ad078ae1bfacd65 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=085b9e1ea9b3797bc10117c326a80018 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=4321324e5d276d12864e446cd2a97c82 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=7eeb685e4d4064518613283e502c9395 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/review-runs.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=059f57fb238aa0cd1ebcf6a2fb0ede95 2500w" />

## Video guide

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/rxKYHA-2KS0?si=V4EnrUmzJaUVJh0m" title="YouTube video player" frameBorder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />



# Control plane API reference for LangSmith Deployment
Source: https://docs.langchain.com/langsmith/api-ref-control-plane



The control plane API is part of [LangSmith Deployment](/langsmith/deployments). With the control plane API, you can programmatically create, manage, and automate your [Agent Server](/langsmith/agent-server) deployments—for example, as part of a custom CI/CD workflow.

<Card title="API Reference" href="https://api.host.langchain.com/docs" icon="book">
  View the full Control Plane API reference documentation
</Card>

## Host

The control plane hosts for Cloud data regions:

| US                               | EU                                  |
| -------------------------------- | ----------------------------------- |
| `https://api.host.langchain.com` | `https://eu.api.host.langchain.com` |

**Note**: Self-hosted deployments of LangSmith will have a custom host for the control plane. The control plane APIs can be accessed at the path `/api-host`. For example, `http(s)://<host>/api-host/v2/deployments`. See [here](../langsmith/self-host-usage#configuring-the-application-you-want-to-use-with-langsmith) for more details.

## Authentication

To authenticate with the control plane API, set the `X-Api-Key` header to a valid LangSmith API key and set the `X-Tenant-Id` header to a valid workspace ID to target.

Example `curl` command:

```shell  theme={null}
curl --request GET \
  --url http://localhost:8124/v2/deployments \
  --header 'X-Api-Key: LANGSMITH_API_KEY' \
  --header 'X-Tenant-Id: WORKSPACE_ID'
```
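The host and header rules above can be combined into two small helpers. This is a minimal sketch, not part of any LangSmith SDK: `control_plane_url`, `auth_headers`, and `CLOUD_HOSTS` are hypothetical names, and the self-hosted host used in the usage lines is a placeholder.

```python
from typing import Optional

# Cloud control plane hosts, taken from the table in the Host section.
CLOUD_HOSTS = {
    "us": "https://api.host.langchain.com",
    "eu": "https://eu.api.host.langchain.com",
}


def control_plane_url(
    path: str,
    region: str = "us",
    self_hosted_host: Optional[str] = None,
) -> str:
    """Build a control plane URL for a Cloud region or a self-hosted instance."""
    path = "/" + path.lstrip("/")
    if self_hosted_host is not None:
        # Self-hosted: the control plane APIs are served under /api-host.
        return self_hosted_host.rstrip("/") + "/api-host" + path
    return CLOUD_HOSTS[region] + path


def auth_headers(api_key: str, workspace_id: str) -> dict:
    """Return the two headers the control plane API expects."""
    return {"X-Api-Key": api_key, "X-Tenant-Id": workspace_id}


# Usage: the self-hosted host below is a placeholder, not a real instance.
print(control_plane_url("/v2/deployments", region="eu"))
print(control_plane_url("/v2/deployments", self_hosted_host="https://langsmith.example.com"))
```

Keeping URL construction separate from the headers means the same request code can target either Cloud region or a self-hosted instance by changing only the arguments.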

## Versioning

Each endpoint path is prefixed with a version (e.g. `v1`, `v2`).

## Quick Start

1. Call `POST /v2/deployments` to create a new Deployment. The response body contains the Deployment ID (`id`) and the ID of the latest (and first) revision (`latest_revision_id`).
2. Call `GET /v2/deployments/{deployment_id}` to retrieve the Deployment. Set `deployment_id` in the URL to the value of Deployment ID (`id`).
3. Poll for revision `status` until `status` is `DEPLOYED` by calling `GET /v2/deployments/{deployment_id}/revisions/{latest_revision_id}`.
4. Call `PATCH /v2/deployments/{deployment_id}` to update the deployment.

## Example Code

Below is example Python code that demonstrates how to orchestrate the control plane APIs to create a deployment, update the deployment, and delete the deployment.

```python  theme={null}
import os
import time

import requests
from dotenv import load_dotenv


load_dotenv()

# required environment variables
CONTROL_PLANE_HOST = os.getenv("CONTROL_PLANE_HOST")
LANGSMITH_API_KEY = os.getenv("LANGSMITH_API_KEY")
WORKSPACE_ID = os.getenv("WORKSPACE_ID")
INTEGRATION_ID = os.getenv("INTEGRATION_ID")
MAX_WAIT_TIME = 1800  # 30 mins


def get_headers() -> dict:
    """Return common headers for requests to the control plane API."""
    return {
        "X-Api-Key": LANGSMITH_API_KEY,
        "X-Tenant-Id": WORKSPACE_ID,
    }


def create_deployment() -> str:
    """Create deployment. Return deployment ID."""
    headers = get_headers()
    headers["Content-Type"] = "application/json"

    deployment_name = "my_deployment"

    request_body = {
        "name": deployment_name,
        "source": "github",
        "source_config": {
            "integration_id": INTEGRATION_ID,
            "repo_url": "https://github.com/langchain-ai/langgraph-example",
            "deployment_type": "dev",
            "build_on_push": False,
            "custom_url": None,
            "resource_spec": None,
        },
        "source_revision_config": {
            "repo_ref": "main",
            "langgraph_config_path": "langgraph.json",
            "image_uri": None,
        },
        "secrets": [
            {
                "name": "OPENAI_API_KEY",
                "value": "test_openai_api_key",
            },
            {
                "name": "ANTHROPIC_API_KEY",
                "value": "test_anthropic_api_key",
            },
            {
                "name": "TAVILY_API_KEY",
                "value": "test_tavily_api_key",
            },
        ],
    }

    response = requests.post(
        url=f"{CONTROL_PLANE_HOST}/v2/deployments",
        headers=headers,
        json=request_body,
    )

    if response.status_code != 201:
        raise Exception(f"Failed to create deployment: {response.text}")

    deployment_id = response.json()["id"]
    print(f"Created deployment {deployment_name} ({deployment_id})")
    return deployment_id


def get_deployment(deployment_id: str) -> dict:
    """Get deployment."""
    response = requests.get(
        url=f"{CONTROL_PLANE_HOST}/v2/deployments/{deployment_id}",
        headers=get_headers(),
    )

    if response.status_code != 200:
        raise Exception(f"Failed to get deployment ID {deployment_id}: {response.text}")

    return response.json()


def list_revisions(deployment_id: str) -> dict:
    """List revisions.

    The "resources" list in the response is sorted by created_at in
    descending order (latest first).
    """
    response = requests.get(
        url=f"{CONTROL_PLANE_HOST}/v2/deployments/{deployment_id}/revisions",
        headers=get_headers(),
    )

    if response.status_code != 200:
        raise Exception(
            f"Failed to list revisions for deployment ID {deployment_id}: {response.text}"
        )

    return response.json()


def get_revision(
    deployment_id: str,
    revision_id: str,
) -> dict:
    """Get revision."""
    response = requests.get(
        url=f"{CONTROL_PLANE_HOST}/v2/deployments/{deployment_id}/revisions/{revision_id}",
        headers=get_headers(),
    )

    if response.status_code != 200:
        raise Exception(f"Failed to get revision ID {revision_id}: {response.text}")

    return response.json()


def patch_deployment(deployment_id: str) -> None:
    """Patch deployment."""
    headers = get_headers()
    headers["Content-Type"] = "application/json"

    # This creates a new revision because source_revision_config is included
    response = requests.patch(
        url=f"{CONTROL_PLANE_HOST}/v2/deployments/{deployment_id}",
        headers=headers,
        json={
            "source_config": {
                "build_on_push": True,
            },
            "source_revision_config": {
                "repo_ref": "main",
                "langgraph_config_path": "langgraph.json",
            },
        },
    )

    if response.status_code != 200:
        raise Exception(f"Failed to patch deployment: {response.text}")

    print(f"Patched deployment ID {deployment_id}")


def wait_for_deployment(deployment_id: str, revision_id: str) -> None:
    """Wait for revision status to be DEPLOYED."""
    start_time = time.time()
    revision, status = None, None
    while time.time() - start_time < MAX_WAIT_TIME:
        revision = get_revision(deployment_id, revision_id)
        status = revision["status"]
        if status == "DEPLOYED":
            break
        elif "FAILED" in status:
            raise Exception(f"Revision ID {revision_id} failed: {revision}")

        print(f"Waiting for revision ID {revision_id} to be DEPLOYED...")
        time.sleep(60)

    if status != "DEPLOYED":
        raise Exception(
            f"Timeout waiting for revision ID {revision_id} to be DEPLOYED: {revision}"
        )


def delete_deployment(deployment_id: str) -> None:
    """Delete deployment."""
    response = requests.delete(
        url=f"{CONTROL_PLANE_HOST}/v2/deployments/{deployment_id}",
        headers=get_headers(),
    )

    if response.status_code != 204:
        raise Exception(
            f"Failed to delete deployment ID {deployment_id}: {response.text}"
        )

    print(f"Deployment ID {deployment_id} deleted")


if __name__ == "__main__":
    # create deployment and get the latest revision
    deployment_id = create_deployment()
    revisions = list_revisions(deployment_id)
    latest_revision = revisions["resources"][0]
    latest_revision_id = latest_revision["id"]

    # wait for latest revision to be DEPLOYED
    wait_for_deployment(deployment_id, latest_revision_id)

    # patch the deployment and get the latest revision
    patch_deployment(deployment_id)
    revisions = list_revisions(deployment_id)
    latest_revision = revisions["resources"][0]
    latest_revision_id = latest_revision["id"]

    # wait for latest revision to be DEPLOYED
    wait_for_deployment(deployment_id, latest_revision_id)

    # delete the deployment
    delete_deployment(deployment_id)
```



# App development in LangSmith Deployment
Source: https://docs.langchain.com/langsmith/app-development



**LangSmith Deployment** builds on the open-source [LangGraph](/oss/python/langgraph/overview) framework for developing stateful, multi-agent applications.
LangGraph provides the core abstractions and execution model, while LangSmith adds managed infrastructure, observability, deployment options, assistants, and concurrency controls—supporting the full lifecycle from development to production.

<Callout icon="cubes" color="#4F46E5" iconType="regular">
  LangSmith Deployment is framework-agnostic: you can deploy agents built with LangGraph or [other frameworks](/langsmith/autogen-integration). To get started with LangGraph itself, refer to the [LangGraph quickstart](/oss/python/langgraph/quickstart).
</Callout>

<CardGroup>
  <Card title="Assistants" cta="Explore assistants" href="/langsmith/assistants" icon="user-gear">
    Manage agent configurations, connect to threads, and build interactive assistants.
  </Card>

  <Card title="Runs" cta="Learn about runs" href="/langsmith/background-run" icon="play">
    Execute background jobs, stateless runs, cron jobs, and manage configurable headers.
  </Card>

  <Card title="Core capabilities" cta="See core features" href="/langsmith/streaming" icon="gear">
    Streaming, human-in-the-loop, webhooks, and concurrency controls like double-texting.
  </Card>

  <Card title="Tutorials" cta="View tutorials" href="/langsmith/autogen-integration" icon="graduation-cap">
    Step-by-step examples: AutoGen integration, streaming UI, and generative UI in React.
  </Card>
</CardGroup>



# Application structure
Source: https://docs.langchain.com/langsmith/application-structure



To deploy on LangSmith, an application must consist of one or more graphs, a configuration file (`langgraph.json`), a file that specifies dependencies, and an optional `.env` file that specifies environment variables.

This page explains how a LangSmith application is organized and how to provide the configuration details required for deployment.

## Key Concepts

To deploy using LangSmith, provide the following information:

1. A [configuration file](#configuration-file-concepts) (`langgraph.json`) that specifies the dependencies, graphs, and environment variables to use for the application.
2. The [graphs](#graphs) that implement the logic of the application.
3. A file that specifies [dependencies](#dependencies) required to run the application.
4. [Environment variables](#environment-variables) that are required for the application to run.

<Tip>
  **Framework agnostic**

  LangSmith Deployment supports deploying a [LangGraph](/oss/python/langgraph/overview) *graph*. However, the implementation of a *node* of a graph can contain arbitrary Python code. This means any framework can be implemented within a node and deployed on LangSmith Deployment. This lets you keep your core application logic outside LangGraph while still using LangSmith for [deployment](/langsmith/deployments), scaling, and [observability](/langsmith/observability).
</Tip>

## File structure

The following are examples of directory structures for Python and JavaScript applications:

<Tabs>
  <Tab title="Python (requirements.txt)">
    ```plaintext  theme={null}
    my-app/
    ├── my_agent # all project code lies within here
    │   ├── utils # utilities for your graph
    │   │   ├── __init__.py
    │   │   ├── tools.py # tools for your graph
    │   │   ├── nodes.py # node functions for your graph
    │   │   └── state.py # state definition of your graph
    │   ├── __init__.py
    │   └── agent.py # code for constructing your graph
    ├── .env # environment variables
    ├── requirements.txt # package dependencies
    └── langgraph.json # configuration file for LangGraph
    ```
  </Tab>

  <Tab title="Python (pyproject.toml)">
    ```plaintext  theme={null}
    my-app/
    ├── my_agent # all project code lies within here
    │   ├── utils # utilities for your graph
    │   │   ├── __init__.py
    │   │   ├── tools.py # tools for your graph
    │   │   ├── nodes.py # node functions for your graph
    │   │   └── state.py # state definition of your graph
    │   ├── __init__.py
    │   └── agent.py # code for constructing your graph
    ├── .env # environment variables
    ├── langgraph.json  # configuration file for LangGraph
    └── pyproject.toml # dependencies for your project
    ```
  </Tab>

  <Tab title="JS (package.json)">
    ```plaintext  theme={null}
    my-app/
    ├── src # all project code lies within here
    │   ├── utils # optional utilities for your graph
    │   │   ├── tools.ts # tools for your graph
    │   │   ├── nodes.ts # node functions for your graph
    │   │   └── state.ts # state definition of your graph
    │   └── agent.ts # code for constructing your graph
    ├── package.json # package dependencies
    ├── .env # environment variables
    └── langgraph.json # configuration file for LangGraph
    ```
  </Tab>
</Tabs>

<Note>
  The directory structure of an application can vary depending on the programming language and the package manager used.
</Note>

<a id="configuration-file-concepts" />

## Configuration file

The `langgraph.json` file is a JSON file that specifies the dependencies, graphs, environment variables, and other settings required to deploy an application.

For details on all supported keys in the JSON file, refer to the [LangGraph configuration file reference](/langsmith/cli#configuration-file).

<Tip>
  The [LangGraph CLI](/langsmith/cli) defaults to using the configuration file `langgraph.json` in the current directory.
</Tip>

### Examples

<Tabs>
  <Tab title="Python">
    * The dependencies involve a custom local package and the `langchain_openai` package.
    * A single graph will be loaded from the file `./your_package/your_file.py` with the variable `agent`.
    * The environment variables are loaded from the `.env` file.

    ```json  theme={null}
    {
        "dependencies": [
            "langchain_openai",
            "./your_package"
        ],
        "graphs": {
            "my_agent": "./your_package/your_file.py:agent"
        },
        "env": "./.env"
    }
    ```
  </Tab>

  <Tab title="JavaScript">
    * The dependencies will be loaded from a dependency file in the local directory (e.g., `package.json`).
    * A single graph will be loaded from the file `./your_package/your_file.js` with the function `agent`.
    * The environment variable `OPENAI_API_KEY` is set inline.

    ```json  theme={null}
    {
        "dependencies": [
            "."
        ],
        "graphs": {
            "my_agent": "./your_package/your_file.js:agent"
        },
        "env": {
            "OPENAI_API_KEY": "secret-key"
        }
    }
    ```
  </Tab>
</Tabs>

## Dependencies

An application may depend on other Python packages or JavaScript libraries (depending on the programming language in which the application is written).

You will generally need to specify the following information for dependencies to be set up correctly:

1. A file in the directory that specifies the dependencies (e.g., `requirements.txt`, `pyproject.toml`, or `package.json`).
2. A `dependencies` key in the [configuration file](#configuration-file-concepts) that specifies the dependencies required to run the application.
3. Optionally, a `dockerfile_lines` key in the [LangGraph configuration file](#configuration-file-concepts) that specifies any additional binaries or system libraries.

## Graphs

Use the `graphs` key in the [configuration file](#configuration-file-concepts) to specify which graphs will be available in the deployed application.

You can specify one or more graphs in the configuration file. Each graph is identified by a unique name and a path to either (1) a compiled graph or (2) a function that defines a graph.
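Each entry under `graphs` pairs a file path with the name of a variable or function in that file, separated by a colon (as in `./your_package/your_file.py:agent`). The sketch below only illustrates that convention. `GraphSpec` and `parse_graph_spec` are hypothetical names, and this is not how LangSmith loads graphs internally.

```python
from typing import NamedTuple


class GraphSpec(NamedTuple):
    """A graph entry split into its file path and attribute name."""

    path: str
    attribute: str


def parse_graph_spec(spec: str) -> GraphSpec:
    """Split a "path:attribute" string on its last colon."""
    path, sep, attribute = spec.rpartition(":")
    if not sep or not path or not attribute:
        raise ValueError(f"Expected '<path>:<attribute>', got {spec!r}")
    return GraphSpec(path, attribute)


# Usage with the "graphs" mapping from the Python configuration example.
graphs = {"my_agent": "./your_package/your_file.py:agent"}
for name, spec in graphs.items():
    parsed = parse_graph_spec(spec)
    print(f"{name}: file={parsed.path} attribute={parsed.attribute}")
```

A check like this can catch a malformed entry (for example, a missing `:agent` suffix) before a deployment is attempted.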

## Environment variables

If you're working with a deployed LangGraph application [locally](/langsmith/local-server), you can configure environment variables in the `env` key of the [configuration file](#configuration-file-concepts).

For a production deployment, you will typically want to configure the environment variables in the deployment environment.
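The configuration examples on this page show two forms of the `env` key: a path to a `.env` file and an inline mapping. As a rough illustration, both forms can be normalized into one dictionary. `resolve_env` is a hypothetical helper, and its parser handles only simple `KEY=VALUE` lines; it is a simplified stand-in, not the LangGraph CLI's loader.

```python
from pathlib import Path
from typing import Union


def resolve_env(env: Union[str, dict], base_dir: str = ".") -> dict:
    """Return environment variables from an inline mapping or a .env file path."""
    if isinstance(env, dict):
        # Inline form: {"OPENAI_API_KEY": "..."}
        return {str(key): str(value) for key, value in env.items()}

    variables = {}
    for raw_line in (Path(base_dir) / env).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        variables[key.strip()] = value.strip().strip("\"'")
    return variables


# Usage with the inline form from the JavaScript configuration example.
print(resolve_env({"OPENAI_API_KEY": "secret-key"}))
```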



# Assistants
Source: https://docs.langchain.com/langsmith/assistants



**Assistants** allow you to manage configurations (like prompts, LLM selection, tools) separately from your graph's core logic, enabling rapid changes that don't alter the graph architecture. They are a way to create multiple specialized versions of the same graph architecture, each optimized for different use cases through configuration variations rather than structural changes.

For example, imagine a general-purpose writing agent built on a common graph architecture. While the structure remains the same, different writing styles—such as blog posts and tweets—require tailored configurations to optimize performance. To support these variations, you can create multiple assistants (e.g., one for blogs and another for tweets) that share the underlying graph but differ in model selection and system prompt.

<img src="https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=05402316c8fe86fead077ec774e873f0" alt="assistant versions" data-og-width="1824" width="1824" data-og-height="692" height="692" data-path="langsmith/images/assistants.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?w=280&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=3ac250197ee8463950b74dc5f6bcd37f 280w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?w=560&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=d6b01c6ae96bd96b580bf43228610224 560w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?w=840&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=6125bf9aed49385ec8422e27cb377dad 840w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?w=1100&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=c54cde5d8a052ceac26d67131407aa73 1100w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?w=1650&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=780c08f1695bc2e5ba0b6261febb1954 1650w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/assistants.png?w=2500&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=ed8fba40ce7c1b3455027df735f9bdba 2500w" />

The LangGraph API provides several endpoints for creating and managing assistants and their versions. See the [API reference](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref/#tag/assistants) for more details.

<Info>
  Assistants are a [LangSmith](/langsmith/home) concept. They are not available in the open source LangGraph library.
</Info>

## Configuration

Assistants build on the LangGraph open source concept of [configuration](/oss/python/langgraph/graph-api#runtime-context).

While configuration is available in the open source LangGraph library, assistants are only present in [LangSmith](/langsmith/home). This is because assistants are tightly coupled to your deployed graph. Upon deployment, Agent Server will automatically create a default assistant for each graph using the graph's default configuration settings.

In practice, an assistant is just an *instance* of a graph with a specific configuration. Therefore, multiple assistants can reference the same graph but can contain different configurations (e.g. prompts, models, tools). The LangSmith Deployment API provides several endpoints for creating and managing assistants. See the [API reference](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref/) and [this how-to](/langsmith/configuration-cloud) for more details on how to create assistants.
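The idea that an assistant is a graph plus a specific configuration can be pictured with plain Python. This is a conceptual sketch only: `writing_graph`, `Assistant`, and the model names are hypothetical stand-ins and do not use the LangSmith Deployment API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


def writing_graph(topic: str, config: Dict[str, Any]) -> str:
    """Stand-in for a deployed graph: fixed logic, behavior driven by config."""
    return f"[{config['model']}] {config['system_prompt']} Topic: {topic}"


@dataclass(frozen=True)
class Assistant:
    """Pairs one shared graph with one configuration."""

    name: str
    graph: Callable[[str, Dict[str, Any]], str]
    config: Dict[str, Any] = field(default_factory=dict)

    def run(self, topic: str) -> str:
        # A run invokes the shared graph with this assistant's configuration.
        return self.graph(topic, self.config)


# Two assistants share the same graph but differ in model and system prompt.
blog = Assistant(
    "blog",
    writing_graph,
    {"model": "model-a", "system_prompt": "Write a long-form blog post."},
)
tweet = Assistant(
    "tweet",
    writing_graph,
    {"model": "model-b", "system_prompt": "Write a short tweet."},
)

print(blog.run("LangGraph"))
print(tweet.run("LangGraph"))
```

Both assistants call the same `writing_graph` function, so changing a prompt or model means editing a configuration rather than the graph itself.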

## Versioning

Assistants support versioning to track changes over time.
Once you've created an assistant, subsequent edits to that assistant will create new versions. See [this how-to](/langsmith/configuration-cloud#create-a-new-version-for-your-assistant) for more details on how to manage assistant versions.

## Execution

A **run** is an invocation of an assistant. Each run may have its own input, configuration, and metadata, which may affect execution and output of the underlying graph. A run can optionally be executed on a [thread](/oss/python/langgraph/persistence#threads).

The LangSmith API provides several endpoints for creating and managing runs. See the [API reference](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref/) for more details.

## Video guide

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/fMsQX6pwXkE?si=6Q28l0taGOynO7sU" title="YouTube video player" frameBorder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />



# Log user feedback using the SDK
Source: https://docs.langchain.com/langsmith/attach-user-feedback



<Tip>
  **Key concepts**

  * [Conceptual guide on tracing and feedback](/langsmith/observability-concepts)
  * [Reference guide on feedback data format](/langsmith/feedback-data-format)
</Tip>

LangSmith makes it easy to attach feedback to traces.
This feedback can come from users, annotators, automated evaluators, etc., and is crucial for monitoring and evaluating applications.

## Use [create\_feedback()](https://docs.smith.langchain.com/reference/python/client/langsmith.client.Client#langsmith.client.Client.create_feedback) / [createFeedback()](https://docs.smith.langchain.com/reference/js/classes/client.Client#createfeedback)

Here we'll walk through how to log feedback using the SDK.

<Info>
  **Child runs**
  You can attach user feedback to ANY child run of a trace, not just the trace (root run) itself.
  This is useful for critiquing specific steps of the LLM application, such as the retrieval step or generation step of a RAG pipeline.
</Info>

<Tip>
  **Non-blocking creation (Python only)**
  The Python client will automatically background feedback creation if you pass `trace_id=` to [create\_feedback()](https://docs.smith.langchain.com/reference/python/client/langsmith.client.Client#langsmith.client.Client.create_feedback).
  This is essential for low-latency environments, where you want to make sure your application isn't blocked on feedback creation.
</Tip>

<CodeGroup>
  ```python Python theme={null}
  from langsmith import trace, traceable, Client

  @traceable
  def foo(x):
      return {"y": x * 2}

  @traceable
  def bar(y):
      return {"z": y - 1}

  client = Client()

  inputs = {"x": 1}
  with trace(name="foobar", inputs=inputs) as root_run:
      result = foo(**inputs)
      result = bar(**result)
      root_run.outputs = result
      trace_id = root_run.id
      child_runs = root_run.child_runs

  # Provide feedback for a trace (a.k.a. a root run)
  client.create_feedback(
      key="user_feedback",
      score=1,
      trace_id=trace_id,
      comment="the user said that ..."
  )

  # Provide feedback for a child run
  foo_run_id = [run for run in child_runs if run.name == "foo"][0].id
  client.create_feedback(
      key="correctness",
      score=0,
      run_id=foo_run_id,
      # trace_id= is optional but recommended to enable batched and backgrounded
      # feedback ingestion.
      trace_id=trace_id,
  )
  ```

  ```typescript TypeScript theme={null}
  import { Client } from "langsmith";
  const client = new Client();

  // ... Run your application and get the run_id...
  // This information can be the result of a user-facing feedback form

  await client.createFeedback(
      runId,
      "feedback-key",
      {
          score: 1.0,
          comment: "comment",
      }
  );
  ```
</CodeGroup>

You can even log feedback for in-progress runs using `create_feedback() / createFeedback()`. See [this guide](/langsmith/access-current-span) for how to get the run ID of an in-progress run.

To learn more about how to filter traces based on various attributes, including user feedback, see [this guide](/langsmith/filter-traces-in-application).



# How to audit evaluator scores
Source: https://docs.langchain.com/langsmith/audit-evaluator-scores



LLM-as-a-judge evaluators don't always get it right. Because of this, it is often useful for a human to manually audit the scores left by an evaluator and correct them where necessary. LangSmith allows you to make corrections on evaluator scores in the UI or SDK.

## In the comparison view

In the comparison view, click any feedback tag to bring up the feedback details. From there, click the "edit" icon on the right to bring up the corrections view. You can then type your desired score in the text box under "Make correction". You can also attach an explanation to your correction. The explanation is useful if you are using a [few-shot evaluator](/langsmith/create-few-shot-evaluators): it is automatically inserted into your few-shot examples in place of the `few_shot_explanation` prompt variable.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=5b815b771c18f291a9ef1b7defb9feb3" alt="Audit Evaluator Comparison View" data-og-width="3426" width="3426" data-og-height="1878" height="1878" data-path="langsmith/images/corrections-comparison-view.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4840ceb8c340713fef6a7999c5d9c6cb 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=08b128d085701f17e20fdc6d314253a8 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=7d6300071894c9ff3f1fc80c6954c13d 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ffe472be6a4b33d741782a1bc3269c60 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=454422187b095a4ad0ec3ad9074d4301 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-comparison-view.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=93b0585577eb98e6d76db3cba6868473 2500w" />

## In the runs table

In the runs table, find the "Feedback" column and click on the feedback tag to bring up the feedback details. Again, click the "edit" icon on the right to bring up the corrections view.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=5e64530681ac9125751af2383b67ba35" alt="Audit Evaluator Runs Table" data-og-width="1734" width="1734" data-og-height="1002" height="1002" data-path="langsmith/images/corrections-runs-table.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=46a1a8328ad238d876d3b003a7ab836a 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=60183b8a46938ccfe97a694cb941e7e3 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=7a2b5008a78d18d283e81eae9a8e23c0 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=e58f3c26472e5e78209927d662ab72c1 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=eeb374392b18fa613e43564269cd8ff8 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/corrections-runs-table.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=117673d7c311c45f0b954313a35efb32 2500w" />

## In the SDK

Corrections can be made via the SDK's `update_feedback` function by passing a `correction` dict. The dict must include a `score` key with a numeric value for the correction to be rendered in the UI.

<CodeGroup>
  ```python Python theme={null}
  import langsmith

  client = langsmith.Client()

  client.update_feedback(
      my_feedback_id,
      correction={
          "score": 1,
      },
  )
  ```

  ```typescript TypeScript theme={null}
  import { Client } from 'langsmith';

  const client = new Client();

  await client.updateFeedback(
      myFeedbackId,
      {
          correction: {
              score: 1,
          }
      }
  )
  ```
</CodeGroup>
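
Because a correction without a numeric `score` is not rendered in the UI, it can help to check the dict before sending it. The helper below is a minimal sketch of such a check; `validate_correction` is a hypothetical name and is not part of the LangSmith SDK.

```python
from __future__ import annotations

from numbers import Real
from typing import Any


def validate_correction(correction: dict[str, Any]) -> dict[str, Any]:
    """Return the correction unchanged if it carries a numeric score.

    Hypothetical helper: raises ValueError when "score" is missing or
    is not a number, so the problem surfaces before the update call.
    """
    score = correction.get("score")
    if not isinstance(score, Real):
        raise ValueError("correction['score'] must be a number")
    return correction
```

You could then pass `correction=validate_correction({"score": 1})` to `update_feedback`.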



# Authentication & access control
Source: https://docs.langchain.com/langsmith/auth



LangSmith provides a flexible authentication and authorization system that can integrate with most authentication schemes.

## Core Concepts

### Authentication vs authorization

While often used interchangeably, these terms represent distinct security concepts:

* [**Authentication**](#authentication) ("AuthN") verifies *who* you are. This runs as middleware for every request.
* [**Authorization**](#authorization) ("AuthZ") determines *what you can do*. This validates the user's privileges and roles on a per-resource basis.

In LangSmith, authentication is handled by your [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) handler, and authorization is handled by your [`@auth.on`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.on) handlers.

## Default security models

LangSmith provides different security defaults:

### LangSmith

* Uses LangSmith API keys by default
* Requires valid API key in `x-api-key` header
* Can be customized with your auth handler

<Note>
  **Custom auth**
  Custom auth **is supported** for all plans in LangSmith.
</Note>

### Self-hosted

* No default authentication
* Complete flexibility to implement your security model
* You control all aspects of authentication and authorization

## System architecture

A typical authentication setup involves three main components:

1. **Authentication Provider** (Identity Provider/IdP)

   * A dedicated service that manages user identities and credentials
   * Handles user registration, login, password resets, etc.
   * Issues tokens (JWT, session tokens, etc.) after successful authentication
   * Examples: Auth0, Supabase Auth, Okta, or your own auth server

2. **LangGraph Backend** (Resource Server)

   * Your LangGraph application that contains business logic and protected resources
   * Validates tokens with the auth provider
   * Enforces access control based on user identity and permissions
   * Doesn't store user credentials directly

3. **Client Application** (Frontend)

   * Web app, mobile app, or API client
   * Collects time-sensitive user credentials and sends them to the auth provider
   * Receives tokens from the auth provider
   * Includes these tokens in requests to the LangGraph backend

Here's how these components typically interact:

```mermaid  theme={null}
sequenceDiagram
    participant Client as Client App
    participant Auth as Auth Provider
    participant LG as LangGraph Backend

    Client->>Auth: 1. Login (username/password)
    Auth-->>Client: 2. Return token
    Client->>LG: 3. Request with token
    Note over LG: 4. Validate token (@auth.authenticate)
    LG-->>Auth:  5. Fetch user info
    Auth-->>LG: 6. Confirm validity
    Note over LG: 7. Apply access control (@auth.on.*)
    LG-->>Client: 8. Return resources
```

Your [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) handler in LangGraph handles steps 4-6, while your [`@auth.on`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.on) handlers implement step 7.
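
The token round trip in the diagram can be sketched in plain Python. This is a simplified illustration using only the standard library, not how LangGraph or any real identity provider works: the HMAC-signed token format, the shared `SECRET`, and the function names are all assumptions, and the backend checks the signature locally instead of calling the provider.

```python
from __future__ import annotations

import hashlib
import hmac

# Hypothetical secret known to the auth provider and the backend (illustration only).
SECRET = b"demo-secret"


def _sign(user_id: str) -> str:
    return hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str) -> str:
    """Steps 1-2: after login, the auth provider returns a signed token."""
    return f"{user_id}.{_sign(user_id)}"


def validate_token(token: str) -> str | None:
    """Steps 4-6: return the user ID if the signature is valid, else None."""
    user_id, _, signature = token.rpartition(".")
    if not user_id:
        return None
    return user_id if hmac.compare_digest(signature, _sign(user_id)) else None


def handle_request(token: str, resources: dict[str, list[str]]) -> list[str]:
    """Steps 3, 7, 8: authenticate, then return only the caller's resources."""
    user_id = validate_token(token)
    if user_id is None:
        raise PermissionError("401: invalid token")
    return resources.get(user_id, [])
```

The split between `validate_token` and the per-user lookup in `handle_request` mirrors the split between authentication and authorization described above.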

## Authentication

Authentication in LangGraph runs as middleware on every request. Your [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) handler receives request information and should:

1. Validate the credentials
2. Return [user info](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.MinimalUserDict) containing the user's identity and user information if valid
3. Raise an [HTTP exception](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.exceptions.HTTPException) or AssertionError if invalid

```python  theme={null}
from langgraph_sdk import Auth

auth = Auth()

@auth.authenticate
async def authenticate(headers: dict) -> Auth.types.MinimalUserDict:
    # Validate credentials (e.g., API key, JWT token)
    # is_valid_key is a placeholder for your own key-checking logic
    api_key = headers.get(b"x-api-key")
    if not api_key or not is_valid_key(api_key):
        raise Auth.exceptions.HTTPException(
            status_code=401,
            detail="Invalid API key"
        )

    # Return user info - only identity is required
    # Add any additional fields you need for authorization
    return {
        "identity": "user-123",        # Required: unique user identifier
        "is_authenticated": True,      # Optional: assumed True by default
        "permissions": ["read", "write"],  # Optional: for permission-based auth
        # You can add more custom fields if you want to implement other auth patterns
        "role": "admin",
        "org_id": "org-456",
    }
```

The returned user information is available:

* To your authorization handlers via [`ctx.user`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.AuthContext)
* In your application via `config["configurable"]["langgraph_auth_user"]`

<Accordion title="Supported Parameters">
  The [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) handler can accept any of the following parameters by name:

  * request (Request): The raw ASGI request object
  * path (str): The request path, e.g., `"/threads/abcd-1234-abcd-1234/runs/abcd-1234-abcd-1234/stream"`
  * method (str): The HTTP method, e.g., `"GET"`
  * path\_params (dict\[str, str]): URL path parameters, e.g., `{"thread_id": "abcd-1234-abcd-1234", "run_id": "abcd-1234-abcd-1234"}`
  * query\_params (dict\[str, str]): URL query parameters, e.g., `{"stream": "true"}`
  * headers (dict\[bytes, bytes]): Request headers
  * authorization (str | None): The Authorization header value (e.g., `"Bearer <token>"`)

  In many of our tutorials, we show only the `authorization` parameter for brevity, but you can accept more information as needed
  to implement your custom authentication scheme.
</Accordion>
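
When a handler accepts only the `authorization` parameter, the first step is usually to separate the scheme from the token. The helper below is a minimal sketch of that step; `extract_bearer_token` is a hypothetical name, not part of `langgraph_sdk`.

```python
from __future__ import annotations


def extract_bearer_token(authorization: str | None) -> str | None:
    """Return the token from a "Bearer <token>" header value, or None.

    Hypothetical helper: returns None when the header is missing, uses
    a different scheme, or carries an empty token.
    """
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token
```

Inside an `@auth.authenticate` handler, a `None` result would typically lead to raising an HTTP 401 exception.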

### Agent authentication

Custom authentication permits delegated access. The values you return from `@auth.authenticate` are added to the run context, giving agents user-scoped credentials that let them access resources on the user's behalf.

```mermaid  theme={null}
sequenceDiagram
  %% Actors
  participant ClientApp as Client
  participant AuthProv  as Auth Provider
  participant LangGraph as LangGraph Backend
  participant SecretStore as Secret Store
  participant ExternalService as External Service

  %% Platform login / AuthN
  ClientApp  ->> AuthProv: 1. Login (username / password)
  AuthProv   -->> ClientApp: 2. Return token
  ClientApp  ->> LangGraph: 3. Request with token

  Note over LangGraph: 4. Validate token (@auth.authenticate)
  LangGraph  -->> AuthProv: 5. Fetch user info
  AuthProv   -->> LangGraph: 6. Confirm validity

  %% Fetch user tokens from secret store
  LangGraph  ->> SecretStore: 6a. Fetch user tokens
  SecretStore -->> LangGraph: 6b. Return tokens

  Note over LangGraph: 7. Apply access control (@auth.on.*)

  %% External Service round-trip
  LangGraph  ->> ExternalService: 8. Call external service (with header)
  Note over ExternalService: 9. External service validates header and executes action
  ExternalService  -->> LangGraph: 10. Service response

  %% Return to caller
  LangGraph  -->> ClientApp: 11. Return resources
```

After authentication, the platform creates a special configuration object that is passed to your graph and all nodes via the configurable context.
This object contains information about the current user, including any custom fields you return from your [`@auth.authenticate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth.authenticate) handler.

To enable an agent to act on behalf of the user, use [custom authentication middleware](/langsmith/custom-auth). This will allow the agent to interact with external systems like MCP servers, external databases, and even other agents on behalf of the user.

For more information, see the [Use custom auth](/langsmith/custom-auth#enable-agent-authentication) guide.
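
As a rough sketch of how a node might use this object, the snippet below reads the user from the configurable context and builds a header for an external call. It treats the config and user as plain dictionaries, and `github_token` is a hypothetical custom field that your `@auth.authenticate` handler would need to return; neither function is part of LangGraph.

```python
from __future__ import annotations

from typing import Any


def get_auth_user(config: dict[str, Any]) -> dict[str, Any]:
    """Return the authenticated user stored in the configurable context."""
    user = config.get("configurable", {}).get("langgraph_auth_user")
    if user is None:
        raise PermissionError("No authenticated user in the run config")
    return user


def build_service_headers(config: dict[str, Any]) -> dict[str, str]:
    """Build request headers for an external service on the user's behalf."""
    user = get_auth_user(config)
    # "github_token" is an assumed custom field, used here for illustration.
    return {"Authorization": f"Bearer {user['github_token']}"}
```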

### Agent authentication with MCP

For information on how to authenticate an agent to an MCP server, see the [MCP conceptual guide](/oss/python/langchain/mcp).

## Authorization

After authentication, LangGraph calls your [`@auth.on`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) handlers to control access to specific resources (e.g., threads, assistants, crons). These handlers can:

1. Add metadata to be saved during resource creation by mutating the `value["metadata"]` dictionary directly. See the [supported actions table](#supported-actions) for the list of types the value can take for each action.
2. Filter resources by metadata during search/list or read operations by returning a [filter dictionary](#filter-operations).
3. Raise an HTTP exception if access is denied.

If you want to just implement simple user-scoped access control, you can use a single [`@auth.on`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) handler for all resources and actions. If you want to have different control depending on the resource and action, you can use [resource-specific handlers](#resource-specific-handlers). See the [Supported Resources](#supported-resources) section for a full list of the resources that support access control.

```python  theme={null}
@auth.on
async def add_owner(
    ctx: Auth.types.AuthContext,
    value: dict  # The payload being sent to this access method
) -> dict:  # Returns a filter dict that restricts access to resources
    """Authorize all access to threads, runs, crons, and assistants.

    This handler does two things:
        - Adds a value to resource metadata (to persist with the resource so it can be filtered later)
        - Returns a filter (to restrict access to existing resources)

    Args:
        ctx: Authentication context containing user info, permissions, and the path.
        value: The request payload sent to the endpoint. For creation
              operations, this contains the resource parameters. For read
              operations, this contains the resource being accessed.

    Returns:
        A filter dictionary that LangGraph uses to restrict access to resources.
        See [Filter Operations](#filter-operations) for supported operators.
    """
    # Create filter to restrict access to just this user's resources
    filters = {"owner": ctx.user.identity}

    # Get or create the metadata dictionary in the payload
    # This is where we store persistent info about the resource
    metadata = value.setdefault("metadata", {})

    # Add owner to metadata - if this is a create or update operation,
    # this information will be saved with the resource
    # So we can filter by it later in read operations
    metadata.update(filters)

    # Return filters to restrict access
    # These filters are applied to ALL operations (create, read, update, search, etc.)
    # to ensure users can only access their own resources
    return filters
```

<a id="resource-specific-handlers" />

### Resource-specific handlers

You can register handlers for specific resources and actions by chaining the resource and action names together with the [`@auth.on`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) decorator.
When a request is made, the most specific handler that matches that resource and action is called. Below is an example of how to register handlers for specific resources and actions. For the following setup:

1. Authenticated users are able to create threads, read threads, and create runs on threads
2. Only users with the "assistants:create" permission are allowed to create new assistants
3. All other endpoints (e.g., delete assistant, crons, store) are disabled for all users.

<Tip>
  **Supported Handlers**
  For a full list of supported resources and actions, see the [Supported Resources](#supported-resources) section below.
</Tip>

```python  theme={null}
from typing import Any

# Generic / global handler catches calls that aren't handled by more specific handlers
@auth.on
async def reject_unhandled_requests(ctx: Auth.types.AuthContext, value: Any) -> None:
    print(f"Request to {ctx.path} by {ctx.user.identity}")
    raise Auth.exceptions.HTTPException(
        status_code=403,
        detail="Forbidden"
    )

# Matches the "thread" resource and all actions - create, read, update, delete, search
# Since this is **more specific** than the generic @auth.on handler, it will take precedence
# over the generic handler for all actions on the "threads" resource
@auth.on.threads
async def on_thread(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.value
):
    # Setting metadata on the thread being created
    # will ensure that the resource contains an "owner" field
    # Then any time a user tries to access this thread or runs within the thread,
    # we can filter by owner
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}


# Thread creation. This will match only on thread create actions
# Since this is **more specific** than both the generic @auth.on handler and the @auth.on.threads handler,
# it will take precedence for any "create" actions on the "threads" resources
@auth.on.threads.create
async def on_thread_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.create.value
):
    # Reject if the user does not have write access
    if "write" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="User lacks the required permissions."
        )
    # Setting metadata on the thread being created
    # will ensure that the resource contains an "owner" field
    # Then any time a user tries to access this thread or runs within the thread,
    # we can filter by owner
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}

# Reading a thread. Since this is also more specific than the generic @auth.on handler, and the @auth.on.threads handler,
# it will take precedence for any "read" actions on the "threads" resource
@auth.on.threads.read
async def on_thread_read(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.read.value
):
    # Since we are reading (and not creating) a thread,
    # we don't need to set metadata. We just need to
    # return a filter to ensure users can only see their own threads
    return {"owner": ctx.user.identity}

# Run creation, streaming, updates, etc.
# This takes precedence over the generic @auth.on handler and the @auth.on.threads handler
@auth.on.threads.create_run
async def on_run_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.threads.create_run.value
):
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    # Inherit thread's access control
    return {"owner": ctx.user.identity}

# Assistant creation
@auth.on.assistants.create
async def on_assistant_create(
    ctx: Auth.types.AuthContext,
    value: Auth.types.on.assistants.create.value
):
    if "assistants:create" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="User lacks the required permissions."
        )
```

Notice that the example above mixes global and resource-specific handlers. Since each request is handled by the most specific matching handler, a request to create a `thread` matches the `on_thread_create` handler but NOT the `on_thread` or `reject_unhandled_requests` handlers. A request to `update` a thread, however, is handled by the `on_thread` resource handler, since no handler is registered for that specific action. A request with no matching resource handler, such as deleting an assistant, falls through to the global handler and is rejected.

<a id="filter-operations" />

### Filter operations

Authorization handlers can return `None`, a boolean, or a filter dictionary.

* `None` and `True` mean "authorize access to all underlying resources"
* `False` means "deny access to all underlying resources (raises a 403 exception)"
* A metadata filter dictionary will restrict access to resources

A filter dictionary is a dictionary whose keys match the resource metadata keys. It supports a shorthand form and two operators:

* The default value is a shorthand for exact match, or "\$eq", below. For example, `{"owner": user_id}` will include only resources with metadata containing `{"owner": user_id}`
* `$eq`: Exact match (e.g., `{"owner": {"$eq": user_id}}`) - this is equivalent to the shorthand above, `{"owner": user_id}`
* `$contains`: List membership (e.g., `{"allowed_users": {"$contains": user_id}}`) or list containment (e.g., `{"allowed_users": {"$contains": [user_id_1, user_id_2]}}`). The value here must be an element of the list or a subset of the elements of the list, respectively. The metadata in the stored resource must be a list/container type.

A dictionary with multiple keys is treated using a logical `AND` filter. For example, `{"owner": org_id, "allowed_users": {"$contains": user_id}}` will only match resources with metadata whose "owner" is `org_id` and whose "allowed\_users" list contains `user_id`.
See the [`Auth`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.Auth) reference for more information.
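
To make the matching rules concrete, here is a plain-Python sketch of how a filter dictionary could be evaluated against a resource's metadata. It only illustrates the semantics described above (shorthand, `$eq`, `$contains`, and logical `AND` across keys). It is not LangGraph's implementation, and `matches` is a hypothetical name.

```python
from __future__ import annotations

from typing import Any


def matches(metadata: dict[str, Any], filters: dict[str, Any]) -> bool:
    """Return True if the metadata satisfies every filter key (logical AND)."""
    for key, condition in filters.items():
        if key not in metadata:
            return False
        stored = metadata[key]
        if isinstance(condition, dict) and "$contains" in condition:
            wanted = condition["$contains"]
            # A single value means membership; a list means every element must be present.
            wanted = wanted if isinstance(wanted, list) else [wanted]
            if not isinstance(stored, (list, tuple, set)):
                return False
            if not all(item in stored for item in wanted):
                return False
        elif isinstance(condition, dict) and "$eq" in condition:
            if stored != condition["$eq"]:
                return False
        elif stored != condition:  # shorthand for exact match
            return False
    return True
```

For example, with metadata `{"owner": "org-1", "allowed_users": ["alice", "bob"]}`, the filter `{"owner": "org-1", "allowed_users": {"$contains": "alice"}}` matches, while the same filter with `"carol"` does not.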

## Common access patterns

Here are some typical authorization patterns:

### Single-owner resources

This common pattern lets you scope all threads, assistants, crons, and runs to a single user. It's useful for common single-user use cases like regular chatbot-style apps.

```python  theme={null}
@auth.on
async def owner_only(ctx: Auth.types.AuthContext, value: dict):
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}
```
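
The effect of this pattern can be simulated without a server. The sketch below is a plain-Python stand-in: the in-memory `store` list and the `create_thread` / `search_threads` helpers are hypothetical and only mimic how stamped metadata and the returned filter work together.

```python
from __future__ import annotations

from typing import Any

# Hypothetical in-memory stand-in for persisted threads.
store: list[dict[str, Any]] = []


def owner_only(identity: str, value: dict[str, Any]) -> dict[str, str]:
    """Stamp the owner into the payload metadata and return the owner filter."""
    value.setdefault("metadata", {})["owner"] = identity
    return {"owner": identity}


def create_thread(identity: str, payload: dict[str, Any]) -> None:
    """On creation, the stamped metadata is saved with the resource."""
    owner_only(identity, payload)
    store.append(payload)


def search_threads(identity: str) -> list[dict[str, Any]]:
    """On search, the returned filter restricts results to the caller's threads."""
    filters = owner_only(identity, {})
    return [
        thread
        for thread in store
        if all(thread["metadata"].get(k) == v for k, v in filters.items())
    ]


create_thread("alice", {"thread_id": "t1"})
create_thread("bob", {"thread_id": "t2"})
```

After these two calls, `search_threads("alice")` returns only the thread that Alice created.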

### Permission-based access

This pattern lets you control access based on **permissions**. It's useful if you want certain roles to have broader or more restricted access to resources.

```python  theme={null}
# In your auth handler:
@auth.authenticate
async def authenticate(headers: dict) -> Auth.types.MinimalUserDict:
    ...
    return {
        "identity": "user-123",
        "is_authenticated": True,
        "permissions": ["threads:write", "threads:read"]  # Define permissions in auth
    }

def _default(ctx: Auth.types.AuthContext, value: dict):
    metadata = value.setdefault("metadata", {})
    metadata["owner"] = ctx.user.identity
    return {"owner": ctx.user.identity}

@auth.on.threads.create
async def create_thread(ctx: Auth.types.AuthContext, value: dict):
    if "threads:write" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="Unauthorized"
        )
    return _default(ctx, value)


@auth.on.threads.read
async def read_thread(ctx: Auth.types.AuthContext, value: dict):
    if "threads:read" not in ctx.permissions and "threads:write" not in ctx.permissions:
        raise Auth.exceptions.HTTPException(
            status_code=403,
            detail="Unauthorized"
        )
    return _default(ctx, value)
```
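
If your identity provider reports roles rather than permissions, one option is to expand roles into permission strings before returning them from your authentication handler. The mapping below is a minimal sketch; the role names and `permissions_for` are assumptions made for illustration and are not part of `langgraph_sdk`.

```python
from __future__ import annotations

# Hypothetical role-to-permission mapping; the names are illustrative only.
ROLE_PERMISSIONS: dict[str, list[str]] = {
    "viewer": ["threads:read"],
    "editor": ["threads:read", "threads:write"],
}


def permissions_for(roles: list[str]) -> list[str]:
    """Merge the permissions of every role, keeping first-seen order."""
    merged: list[str] = []
    for role in roles:
        for permission in ROLE_PERMISSIONS.get(role, []):
            if permission not in merged:
                merged.append(permission)
    return merged
```

The resulting list could be returned as the `"permissions"` field of the user dictionary, so that handlers like the ones above can check it.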

## Supported resources

LangGraph provides three levels of authorization handlers, from most general to most specific:

1. **Global Handler** (`@auth.on`): Matches all resources and actions
2. **Resource Handler** (e.g., `@auth.on.threads`, `@auth.on.assistants`, `@auth.on.crons`): Matches all actions for a specific resource
3. **Action Handler** (e.g., `@auth.on.threads.create`, `@auth.on.threads.read`): Matches a specific action on a specific resource

The most specific matching handler will be used. For example, `@auth.on.threads.create` takes precedence over `@auth.on.threads` for thread creation.
If a more specific handler is registered, the more general handler will not be called for that resource and action.
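
The precedence rule can be sketched as a lookup that tries the action handler first, then the resource handler, then the global handler. This is only an illustration of the rule as stated here, not LangGraph's internal dispatch code; the `HANDLERS` registry and `resolve_handler` are hypothetical.

```python
from __future__ import annotations

# Hypothetical registry keyed by (resource, action); None acts as a wildcard.
HANDLERS: dict[tuple[str | None, str | None], str] = {
    (None, None): "global handler (@auth.on)",
    ("threads", None): "resource handler (@auth.on.threads)",
    ("threads", "create"): "action handler (@auth.on.threads.create)",
}


def resolve_handler(resource: str, action: str) -> str:
    """Return the most specific registered handler for a request."""
    for key in ((resource, action), (resource, None), (None, None)):
        if key in HANDLERS:
            return HANDLERS[key]
    raise LookupError("no handler registered")
```

With this registry, creating a thread resolves to the action handler, deleting a thread resolves to the resource handler, and any request on crons resolves to the global handler.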

<Tip>
  **Type safety**
  Each handler has type hints available for its `value` parameter at `Auth.types.on.<resource>.<action>.value`. For example:

  ```python  theme={null}
  @auth.on.threads.create
  async def on_thread_create(
      ctx: Auth.types.AuthContext,
      value: Auth.types.on.threads.create.value  # Specific type for thread creation
  ):
      ...

  @auth.on.threads
  async def on_threads(
      ctx: Auth.types.AuthContext,
      value: Auth.types.on.threads.value  # Union type of all thread actions
  ):
      ...

  @auth.on
  async def on_all(
      ctx: Auth.types.AuthContext,
      value: dict  # Union type of all possible actions
  ):
      ...
  ```

  More specific handlers provide better type hints since they handle fewer action types.
</Tip>

<a id="supported-actions" />

#### Supported actions and types

Here are all the supported action handlers:

| Resource       | Handler                       | Description                | Value Type                                                                                                                       |
| -------------- | ----------------------------- | -------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
| **Threads**    | `@auth.on.threads.create`     | Thread creation            | [`ThreadsCreate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.ThreadsCreate)       |
|                | `@auth.on.threads.read`       | Thread retrieval           | [`ThreadsRead`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.ThreadsRead)           |
|                | `@auth.on.threads.update`     | Thread updates             | [`ThreadsUpdate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.ThreadsUpdate)       |
|                | `@auth.on.threads.delete`     | Thread deletion            | [`ThreadsDelete`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.ThreadsDelete)       |
|                | `@auth.on.threads.search`     | Listing threads            | [`ThreadsSearch`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.ThreadsSearch)       |
|                | `@auth.on.threads.create_run` | Creating or updating a run | [`RunsCreate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.RunsCreate)             |
| **Assistants** | `@auth.on.assistants.create`  | Assistant creation         | [`AssistantsCreate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.AssistantsCreate) |
|                | `@auth.on.assistants.read`    | Assistant retrieval        | [`AssistantsRead`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.AssistantsRead)     |
|                | `@auth.on.assistants.update`  | Assistant updates          | [`AssistantsUpdate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.AssistantsUpdate) |
|                | `@auth.on.assistants.delete`  | Assistant deletion         | [`AssistantsDelete`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.AssistantsDelete) |
|                | `@auth.on.assistants.search`  | Listing assistants         | [`AssistantsSearch`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.AssistantsSearch) |
| **Crons**      | `@auth.on.crons.create`       | Cron job creation          | [`CronsCreate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.CronsCreate)           |
|                | `@auth.on.crons.read`         | Cron job retrieval         | [`CronsRead`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.CronsRead)               |
|                | `@auth.on.crons.update`       | Cron job updates           | [`CronsUpdate`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.CronsUpdate)           |
|                | `@auth.on.crons.delete`       | Cron job deletion          | [`CronsDelete`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.CronsDelete)           |
|                | `@auth.on.crons.search`       | Listing cron jobs          | [`CronsSearch`](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.auth.types.CronsSearch)           |

<Note>
  **About runs**

  Runs are scoped to their parent thread for access control. This means permissions are typically inherited from the thread, reflecting the conversational nature of the data model. All run operations (reading, listing) except creation are controlled by the thread's handlers.
  There is a specific `create_run` handler for creating new runs because run creation takes additional arguments that you can inspect in the handler.
</Note>

## Next steps

For implementation details:

* Check out the introductory tutorial on [setting up authentication](/langsmith/set-up-custom-auth)
* See the how-to guide on implementing [custom auth handlers](/langsmith/custom-auth)



# Authentication methods
Source: https://docs.langchain.com/langsmith/authentication-methods



LangSmith supports multiple authentication methods for easy sign-up and login.

## Cloud

### Email/Password

Users can use an email address and password to sign up and log in to LangSmith.

### Social Providers

Users can alternatively use their credentials from GitHub or Google.

### SAML SSO

Enterprise customers can configure [SAML SSO](/langsmith/user-management) and [SCIM](/langsmith/user-management).

## Self-Hosted

Self-hosted customers have more control over how their users log in to LangSmith. For more in-depth coverage of configuration options, see [the self-hosting docs](/langsmith/self-hosted) and [Helm chart](https://github.com/langchain-ai/helm/tree/main/charts/langsmith).

### SSO with OAuth 2.0 and OIDC

Production installations should configure SSO to use an external identity provider. This enables users to log in through an identity platform like Auth0 or Okta. LangSmith supports almost any OIDC-compliant provider. Learn more about configuring SSO in the [SSO configuration guide](/langsmith/self-host-sso).

### Email/Password a.k.a. basic auth

This auth method requires very little configuration because it does not depend on an external identity provider. It is most appropriate for self-hosted trials. Learn more in the [basic auth configuration guide](/langsmith/self-host-basic-auth).

### None

<Warning>
  This authentication mode will be removed after the launch of Basic Auth.
</Warning>

If no authentication methods are enabled, a self-hosted installation does not require any login or sign-up. Use this configuration only to verify the installation at the infrastructure level, because the feature set in this mode is restricted to a single organization and workspace.



# How to integrate LangGraph with AutoGen, CrewAI, and other frameworks
Source: https://docs.langchain.com/langsmith/autogen-integration



This guide shows how to integrate AutoGen agents with LangGraph to leverage features like persistence, streaming, and memory, and then deploy the integrated solution to LangSmith for scalable production use. In this guide we show how to build a LangGraph chatbot that integrates with AutoGen, but you can follow the same approach with other frameworks.

Integrating AutoGen with LangGraph provides several benefits:

* Enhanced features: Add [persistence](/oss/python/langgraph/persistence), [streaming](/langsmith/streaming), [short and long-term memory](/oss/python/concepts/memory) and more to your AutoGen agents.
* Multi-agent systems: Build [multi-agent systems](/oss/python/langchain/multi-agent) where individual agents are built with different frameworks.
* Production deployment: Deploy your integrated solution to [LangSmith](/langsmith/home) for scalable production use.

## Prerequisites

* Python 3.9+
* AutoGen: `pip install autogen`
* LangGraph: `pip install langgraph`
* OpenAI API key

## Setup

Set up your environment:

```python  theme={null}
import getpass
import os


def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("OPENAI_API_KEY")
```

## 1. Define AutoGen agent

Create an AutoGen agent that can execute code. This example is adapted from AutoGen's [official tutorials](https://github.com/microsoft/autogen/blob/0.2/notebook/agentchat_web_info.ipynb):

```python  theme={null}
import autogen
import os

config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]

llm_config = {
    "timeout": 600,
    "cache_seed": 42,
    "config_list": config_list,
    "temperature": 0,
}

autogen_agent = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    max_consecutive_auto_reply=10,
    is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
    code_execution_config={
        "work_dir": "web",
        "use_docker": False,
    },  # Please set use_docker=True if docker is available to run the generated code. Using docker is safer than running the generated code directly.
    llm_config=llm_config,
    system_message="Reply TERMINATE if the task has been solved at full satisfaction. Otherwise, reply CONTINUE, or the reason why the task is not solved yet.",
)
```

## 2. Create the graph

We will now create a LangGraph chatbot graph that calls the AutoGen agent.

```python  theme={null}
from langchain_core.messages import convert_to_openai_messages
from langgraph.graph import StateGraph, MessagesState, START
from langgraph.checkpoint.memory import MemorySaver

def call_autogen_agent(state: MessagesState):
    # Convert LangGraph messages to OpenAI format for AutoGen
    messages = convert_to_openai_messages(state["messages"])

    # Get the last user message
    last_message = messages[-1]

    # Pass previous message history as context (excluding the last message)
    carryover = messages[:-1] if len(messages) > 1 else []

    # Initiate chat with AutoGen
    response = user_proxy.initiate_chat(
        autogen_agent,
        message=last_message,
        carryover=carryover
    )

    # Extract the final response from the agent
    final_content = response.chat_history[-1]["content"]

    # Return the response in LangGraph format
    return {"messages": {"role": "assistant", "content": final_content}}

# Create the graph with memory for persistence
checkpointer = MemorySaver()

# Build the graph
builder = StateGraph(MessagesState)
builder.add_node("autogen", call_autogen_agent)
builder.add_edge(START, "autogen")

# Compile with checkpointer for persistence
graph = builder.compile(checkpointer=checkpointer)
```
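
The message-splitting step inside `call_autogen_agent` can be tried in isolation. The sketch below uses plain dictionaries in OpenAI message format instead of real LangGraph state; `split_for_autogen` is a hypothetical helper name, not part of either library.

```python
from typing import Any


def split_for_autogen(
    messages: list[dict[str, Any]],
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Return (latest message, earlier messages) from an OpenAI-style list."""
    if not messages:
        raise ValueError("expected at least one message")
    # The latest message becomes the new prompt; the rest is context.
    return messages[-1], messages[:-1]


history = [
    {"role": "user", "content": "Find numbers between 10 and 30 in fibonacci sequence"},
    {"role": "assistant", "content": "13 and 21"},
    {"role": "user", "content": "Multiply the last number by 3"},
]
last_message, carryover = split_for_autogen(history)
print(last_message["content"])  # Multiply the last number by 3
print(len(carryover))           # 2
```

With a single-message history, `carryover` is an empty list, which matches the `else []` branch in the node function.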

```python  theme={null}
from IPython.display import display, Image

display(Image(graph.get_graph().draw_mermaid_png()))
```

<img src="https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=1165c5d1a5c154b2491d6a5fca30853f" alt="LangGraph chatbot with one step: START routes to autogen, where call_autogen_agent sends the latest user message (with prior context) to the AutoGen agent." data-og-width="180" width="180" data-og-height="134" height="134" data-path="langsmith/images/autogen-output.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?w=280&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=6a6671038776cd1784c968ee2ecf973e 280w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?w=560&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=94c98b5b118ae49006d2f56179e1dc0d 560w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?w=840&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=703b4822dfc1c3395c16dd9e7d0f1462 840w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?w=1100&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=0f6b4e65d2f036d5b28dde44afbb5fd8 1100w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?w=1650&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=17f044c125e4a480d4bd8814c19a0949 1650w, https://mintcdn.com/langchain-5e9cc07a/IMK8wJkjSpMCGODD/langsmith/images/autogen-output.png?w=2500&fit=max&auto=format&n=IMK8wJkjSpMCGODD&q=85&s=841b246f7eabe796de9c4ac0af4816dd 2500w" />

## 3. Test the graph locally

Before deploying to LangSmith, you can test the graph locally:

```python {highlight={2,13}} theme={null}
# pass the thread ID to persist agent outputs for future interactions
config = {"configurable": {"thread_id": "1"}}

for chunk in graph.stream(
    {
        "messages": [
            {
                "role": "user",
                "content": "Find numbers between 10 and 30 in fibonacci sequence",
            }
        ]
    },
    config,
):
    print(chunk)
```

**Output:**

```
user_proxy (to assistant):

Find numbers between 10 and 30 in fibonacci sequence

--------------------------------------------------------------------------------
assistant (to user_proxy):

To find numbers between 10 and 30 in the Fibonacci sequence, we can generate the Fibonacci sequence and check which numbers fall within this range. Here's a plan:

1. Generate Fibonacci numbers starting from 0.
2. Continue generating until the numbers exceed 30.
3. Collect and print the numbers that are between 10 and 30.

...
```

Because we're using LangGraph's [persistence](/oss/python/langgraph/persistence) features, we can continue the conversation with the same thread ID. LangGraph automatically passes the previous history to the AutoGen agent:

```python {highlight={10}} theme={null}
for chunk in graph.stream(
    {
        "messages": [
            {
                "role": "user",
                "content": "Multiply the last number by 3",
            }
        ]
    },
    config,
):
    print(chunk)
```

**Output:**

```
user_proxy (to assistant):

Multiply the last number by 3
Context:
Find numbers between 10 and 30 in fibonacci sequence
The Fibonacci numbers between 10 and 30 are 13 and 21.

These numbers are part of the Fibonacci sequence, which is generated by adding the two preceding numbers to get the next number, starting from 0 and 1.

The sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

As you can see, 13 and 21 are the only numbers in this sequence that fall between 10 and 30.

TERMINATE

--------------------------------------------------------------------------------
assistant (to user_proxy):

The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:

21 * 3 = 63

TERMINATE

--------------------------------------------------------------------------------
{'autogen': {'messages': {'role': 'assistant', 'content': 'The last number in the Fibonacci sequence between 10 and 30 is 21. Multiplying 21 by 3 gives:\n\n21 * 3 = 63\n\nTERMINATE'}}}
```
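
To see why reusing the same `thread_id` carries history forward, here is a toy model of thread-scoped persistence. It is not how `MemorySaver` is implemented; the `ToyThreadStore` class and `echo_node` function are hypothetical and only show that state is looked up by thread ID and extended on each call.

```python
from collections import defaultdict


class ToyThreadStore:
    """Keeps one message list per thread ID, in memory."""

    def __init__(self):
        self._threads = defaultdict(list)

    def invoke(self, node, new_messages, thread_id):
        # Load prior history for this thread and append the new input.
        history = self._threads[thread_id]
        history.extend(new_messages)
        # The node sees the full history and returns one reply message.
        reply = node(list(history))
        history.append(reply)
        return reply


def echo_node(messages):
    """Hypothetical node: reports how many earlier messages it received."""
    return {"role": "assistant", "content": f"saw {len(messages) - 1} earlier message(s)"}


store = ToyThreadStore()
first = store.invoke(echo_node, [{"role": "user", "content": "hi"}], thread_id="1")
second = store.invoke(echo_node, [{"role": "user", "content": "again"}], thread_id="1")
other = store.invoke(echo_node, [{"role": "user", "content": "hi"}], thread_id="2")
```

The second call on thread `"1"` sees the first exchange, while thread `"2"` starts from an empty history.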

## 4. Prepare for deployment

To deploy to LangSmith, create a file structure like the following:

```
my-autogen-agent/
├── agent.py          # Your main agent code
├── requirements.txt  # Python dependencies
└── langgraph.json   # LangGraph configuration
```

<Tabs>
  <Tab title="agent.py">
    ```python  theme={null}
    import os
    import autogen
    from langchain_core.messages import convert_to_openai_messages
    from langgraph.graph import StateGraph, MessagesState, START
    from langgraph.checkpoint.memory import MemorySaver

    # AutoGen configuration
    config_list = [{"model": "gpt-4o", "api_key": os.environ["OPENAI_API_KEY"]}]

    llm_config = {
        "timeout": 600,
        "cache_seed": 42,
        "config_list": config_list,
        "temperature": 0,
    }

    # Create AutoGen agents
    autogen_agent = autogen.AssistantAgent(
        name="assistant",
        llm_config=llm_config,
    )

    user_proxy = autogen.UserProxyAgent(
        name="user_proxy",
        human_input_mode="NEVER",
        max_consecutive_auto_reply=10,
        is_termination_msg=lambda x: x.get("content", "").rstrip().endswith("TERMINATE"),
        code_execution_config={
            "work_dir": "/tmp/autogen_work",
            "use_docker": False,
        },
        llm_config=llm_config,
        system_message="Reply TERMINATE if the task has been solved at full satisfaction.",
    )

    def call_autogen_agent(state: MessagesState):
        """Node function that calls the AutoGen agent"""
        messages = convert_to_openai_messages(state["messages"])
        last_message = messages[-1]
        carryover = messages[:-1] if len(messages) > 1 else []

        response = user_proxy.initiate_chat(
            autogen_agent,
            message=last_message,
            carryover=carryover
        )

        final_content = response.chat_history[-1]["content"]
        return {"messages": {"role": "assistant", "content": final_content}}

    # Create and compile the graph
    def create_graph():
        checkpointer = MemorySaver()
        builder = StateGraph(MessagesState)
        builder.add_node("autogen", call_autogen_agent)
        builder.add_edge(START, "autogen")
        return builder.compile(checkpointer=checkpointer)

    # Export the graph for LangSmith
    graph = create_graph()
    ```
  </Tab>

  <Tab title="requirements.txt">
    ```
    langgraph>=0.1.0
    pyautogen>=0.2.0
    langchain-core>=0.1.0
    langchain-openai>=0.0.5
    ```
  </Tab>

  <Tab title="langgraph.json">
    ```json  theme={null}
    {
      "dependencies": ["."],
      "graphs": {
        "autogen_agent": "./agent.py:graph"
      },
      "env": ".env"
    }
    ```
  </Tab>
</Tabs>
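
Before deploying, it can help to confirm that each entry under `graphs` points at a file that exists. The following is a minimal sketch, assuming every entry uses the `<file>:<variable>` form shown above; `check_graph_paths` is a hypothetical helper and not part of the LangGraph CLI.

```python
import json
from pathlib import Path


def check_graph_paths(config_path: str) -> list[str]:
    """Return a list of problems found in the "graphs" section of the config."""
    config_file = Path(config_path)
    config = json.loads(config_file.read_text())
    problems = []
    for name, target in config.get("graphs", {}).items():
        # Each entry is expected to look like "./agent.py:graph".
        file_part, sep, attr = target.rpartition(":")
        if not sep or not attr:
            problems.append(f"{name}: expected '<file>:<variable>', got {target!r}")
            continue
        # Resolve the file relative to the directory holding the config.
        if not (config_file.parent / file_part).is_file():
            problems.append(f"{name}: file not found: {file_part}")
    return problems
```

Calling `check_graph_paths("langgraph.json")` from the project root returns an empty list when every entry resolves to a file. It does not import the module, so it cannot tell whether the named variable actually exists.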

## 5. Deploy to LangSmith

Deploy the graph with the LangSmith CLI:

<CodeGroup>
  ```bash pip theme={null}
  pip install -U langgraph-cli
  ```

  ```bash uv theme={null}
  uv add langgraph-cli
  ```
</CodeGroup>

```bash  theme={null}
langgraph deploy --config langgraph.json
```



# How to kick off background runs
Source: https://docs.langchain.com/langsmith/background-run



This guide covers how to kick off background runs for your agent.
This can be useful for long-running jobs.

## Setup

First let's set up our client and thread:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    from langgraph_sdk import get_client

    client = get_client(url=<DEPLOYMENT_URL>)
    # Using the graph deployed with the name "agent"
    assistant_id = "agent"
    # create thread
    thread = await client.threads.create()
    print(thread)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    import { Client } from "@langchain/langgraph-sdk";

    const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
    // Using the graph deployed with the name "agent"
    const assistantID = "agent";
    // create thread
    const thread = await client.threads.create();
    console.log(thread);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
      --url <DEPLOYMENT_URL>/threads \
      --header 'Content-Type: application/json' \
      --data '{}'
    ```
  </Tab>
</Tabs>

Output:

```
{
  'thread_id': '5cb1e8a1-34b3-4a61-a34e-71a9799bd00d',
  'created_at': '2024-08-30T20:35:52.062934+00:00',
  'updated_at': '2024-08-30T20:35:52.062934+00:00',
  'metadata': {},
  'status': 'idle',
  'config': {},
  'values': None
}
```

## Check runs on thread

If we list the current runs on this thread, we will see that it's empty:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    runs = await client.runs.list(thread["thread_id"])
    print(runs)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    let runs = await client.runs.list(thread['thread_id']);
    console.log(runs);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request GET \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs
    ```
  </Tab>
</Tabs>

Output:

```
[]
```

## Start runs on thread

Now let's kick off a run:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    input = {"messages": [{"role": "user", "content": "what's the weather in sf"}]}
    run = await client.runs.create(thread["thread_id"], assistant_id, input=input)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    let input = {"messages": [{"role": "user", "content": "what's the weather in sf"}]};
    let run = await client.runs.create(thread["thread_id"], assistantID, { input });
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs \
        --header 'Content-Type: application/json' \
        --data '{
            "assistant_id": "<ASSISTANT_ID>",
            "input": {"messages": [{"role": "user", "content": "what'\''s the weather in sf"}]}
        }'
    ```
  </Tab>
</Tabs>

The first time we poll it, we can see `status=pending`:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    print(await client.runs.get(thread["thread_id"], run["run_id"]))
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    console.log(await client.runs.get(thread["thread_id"], run["run_id"]));
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request GET \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/<RUN_ID>
    ```
  </Tab>
</Tabs>

Output:

```
{
  "run_id": "1ef6a5f8-bd86-6763-bbd6-bff042db7b1b",
  "thread_id": "7885f0cf-94ad-4040-91d7-73f7ba007c8a",
  "assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca",
  "created_at": "2024-09-04T01:46:47.244887+00:00",
  "updated_at": "2024-09-04T01:46:47.244887+00:00",
  "metadata": {},
  "status": "pending",
  "kwargs": {
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "what's the weather in sf"
        }
      ]
    },
    "config": {
      "metadata": {
        "created_by": "system"
      },
      "configurable": {
        "run_id": "1ef6a5f8-bd86-6763-bbd6-bff042db7b1b",
        "user_id": "",
        "graph_id": "agent",
        "thread_id": "7885f0cf-94ad-4040-91d7-73f7ba007c8a",
        "assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca",
        "checkpoint_id": null
      }
    },
    "webhook": null,
    "temporary": false,
    "stream_mode": [
      "values"
    ],
    "feedback_keys": null,
    "interrupt_after": null,
    "interrupt_before": null
  },
  "multitask_strategy": "reject"
}
```
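
Checking the status once can be turned into a polling loop. The sketch below is synchronous and self-contained: `fetch_run` is a hypothetical callable standing in for a call such as `client.runs.get(...)`, and treating every status other than `pending` and `running` as final is an assumption, not something this guide states.

```python
import time
from typing import Callable


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> dict:
    """Poll fetch_run() until the run leaves the in-progress statuses."""
    in_progress = {"pending", "running"}  # assumption: all other statuses are final
    deadline = time.monotonic() + timeout
    while True:
        run = fetch_run()
        if run["status"] not in in_progress:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {run['status']!r} after {timeout} seconds")
        time.sleep(interval)


# Stand-in for the real API call: returns canned statuses in order.
_statuses = iter(["pending", "running", "success"])
result = wait_for_run(lambda: {"status": next(_statuses)}, interval=0.01)
print(result["status"])  # success
```

With the async SDK client, the same loop would need an `async` variant that awaits the fetch and uses `asyncio.sleep` between attempts.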

Now we can join the run to wait for it to finish, then check its status again:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    await client.runs.join(thread["thread_id"], run["run_id"])
    print(await client.runs.get(thread["thread_id"], run["run_id"]))
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    await client.runs.join(thread["thread_id"], run["run_id"]);
    console.log(await client.runs.get(thread["thread_id"], run["run_id"]));
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request GET \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/<RUN_ID>/join &&
    curl --request GET \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/<RUN_ID>
    ```
  </Tab>
</Tabs>

Output:

```
{
  "run_id": "1ef6a5f8-bd86-6763-bbd6-bff042db7b1b",
  "thread_id": "7885f0cf-94ad-4040-91d7-73f7ba007c8a",
  "assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca",
  "created_at": "2024-09-04T01:46:47.244887+00:00",
  "updated_at": "2024-09-04T01:46:47.244887+00:00",
  "metadata": {},
  "status": "success",
  "kwargs": {
    "input": {
      "messages": [
        {
          "role": "user",
          "content": "what's the weather in sf"
        }
      ]
    },
    "config": {
      "metadata": {
        "created_by": "system"
      },
      "configurable": {
        "run_id": "1ef6a5f8-bd86-6763-bbd6-bff042db7b1b",
        "user_id": "",
        "graph_id": "agent",
        "thread_id": "7885f0cf-94ad-4040-91d7-73f7ba007c8a",
        "assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca",
        "checkpoint_id": null
      }
    },
    "webhook": null,
    "temporary": false,
    "stream_mode": [
      "values"
    ],
    "feedback_keys": null,
    "interrupt_after": null,
    "interrupt_before": null
  },
  "multitask_strategy": "reject"
}
```

The run succeeded, as expected. We can double-check the result by printing the final state:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    final_result = await client.threads.get_state(thread["thread_id"])
    print(final_result)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    let finalResult = await client.threads.getState(thread["thread_id"]);
    console.log(finalResult);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request GET \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/state
    ```
  </Tab>
</Tabs>

Output:

```
{
"values": {
"messages": [
{
"content": "what's the weather in sf",
"additional_kwargs": {},
"response_metadata": {},
"type": "human",
"name": null,
"id": "beba31bf-320d-4125-9c37-cadf526ac47a",
"example": false
},
{
"content": [
{
"id": "toolu_01AaNPSPzqia21v7aAKwbKYm",
"input": {},
"name": "tavily_search_results_json",
"type": "tool_use",
"index": 0,
"partial_json": "{\"query\": \"weather in san francisco\"}"
}
],
"additional_kwargs": {},
"response_metadata": {
"stop_reason": "tool_use",
"stop_sequence": null
},
"type": "ai",
"name": null,
"id": "run-f220faf8-1d27-4f73-ad91-6bb3f47e8639",
"example": false,
"tool_calls": [
{
"name": "tavily_search_results_json",
"args": {
"query": "weather in san francisco"
},
"id": "toolu_01AaNPSPzqia21v7aAKwbKYm",
"type": "tool_call"
}
],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 273,
"output_tokens": 61,
"total_tokens": 334
}
},
{
"content": "[{\"url\": \"https://www.weatherapi.com/\", \"content\": \"{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1725052131, 'localtime': '2024-08-30 14:08'}, 'current': {'last_updated_epoch': 1725051600, 'last_updated': '2024-08-30 14:00', 'temp_c': 21.1, 'temp_f': 70.0, 'is_day': 1, 'condition': {'text': 'Partly cloudy', 'icon': '//cdn.weatherapi.com/weather/64x64/day/116.png', 'code': 1003}, 'wind_mph': 11.9, 'wind_kph': 19.1, 'wind_degree': 290, 'wind_dir': 'WNW', 'pressure_mb': 1018.0, 'pressure_in': 30.07, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 59, 'cloud': 25, 'feelslike_c': 21.1, 'feelslike_f': 70.0, 'windchill_c': 18.6, 'windchill_f': 65.5, 'heatindex_c': 18.6, 'heatindex_f': 65.5, 'dewpoint_c': 12.2, 'dewpoint_f': 54.0, 'vis_km': 16.0, 'vis_miles': 9.0, 'uv': 5.0, 'gust_mph': 15.0, 'gust_kph': 24.2}}\"}]",
"additional_kwargs": {},
"response_metadata": {},
"type": "tool",
"name": "tavily_search_results_json",
"id": "686b2487-f332-4e58-9508-89b3a814cd81",
"tool_call_id": "toolu_01AaNPSPzqia21v7aAKwbKYm",
"artifact": {
"query": "weather in san francisco",
"follow_up_questions": null,
"answer": null,
"images": [],
"results": [
{
"title": "Weather in San Francisco",
"url": "https://www.weatherapi.com/",
"content": "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1725052131, 'localtime': '2024-08-30 14:08'}, 'current': {'last_updated_epoch': 1725051600, 'last_updated': '2024-08-30 14:00', 'temp_c': 21.1, 'temp_f': 70.0, 'is_day': 1, 'condition': {'text': 'Partly cloudy', 'icon': '//cdn.weatherapi.com/weather/64x64/day/116.png', 'code': 1003}, 'wind_mph': 11.9, 'wind_kph': 19.1, 'wind_degree': 290, 'wind_dir': 'WNW', 'pressure_mb': 1018.0, 'pressure_in': 30.07, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 59, 'cloud': 25, 'feelslike_c': 21.1, 'feelslike_f': 70.0, 'windchill_c': 18.6, 'windchill_f': 65.5, 'heatindex_c': 18.6, 'heatindex_f': 65.5, 'dewpoint_c': 12.2, 'dewpoint_f': 54.0, 'vis_km': 16.0, 'vis_miles': 9.0, 'uv': 5.0, 'gust_mph': 15.0, 'gust_kph': 24.2}}",
"score": 0.976148,
"raw_content": null
}
],
"response_time": 3.07
},
"status": "success"
},
{
"content": [
{
"text": "\n\nThe search results provide the current weather conditions in San Francisco. According to the data, as of 2:00 PM on August 30, 2024, the temperature in San Francisco is 70\u00b0F (21.1\u00b0C) with partly cloudy skies. The wind is blowing from the west-northwest at around 12 mph (19 km/h). The humidity is 59% and visibility is 9 miles (16 km). Overall, it looks like a nice late summer day in San Francisco with comfortable temperatures and partly sunny conditions.",
"type": "text",
"index": 0
}
],
"additional_kwargs": {},
"response_metadata": {
"stop_reason": "end_turn",
"stop_sequence": null
},
"type": "ai",
"name": null,
"id": "run-8fecc61d-3d9f-4e16-8e8a-92f702be498a",
"example": false,
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 837,
"output_tokens": 124,
"total_tokens": 961
}
}
]
},
"next": [],
"tasks": [],
"metadata": {
"step": 3,
"run_id": "1ef67140-eb23-684b-8253-91d4c90bb05e",
"source": "loop",
"writes": {
"agent": {
"messages": [
{
"id": "run-8fecc61d-3d9f-4e16-8e8a-92f702be498a",
"name": null,
"type": "ai",
"content": [
{
"text": "\n\nThe search results provide the current weather conditions in San Francisco. According to the data, as of 2:00 PM on August 30, 2024, the temperature in San Francisco is 70\u00b0F (21.1\u00b0C) with partly cloudy skies. The wind is blowing from the west-northwest at around 12 mph (19 km/h). The humidity is 59% and visibility is 9 miles (16 km). Overall, it looks like a nice late summer day in San Francisco with comfortable temperatures and partly sunny conditions.",
"type": "text",
"index": 0
}
],
"example": false,
"tool_calls": [],
"usage_metadata": {
"input_tokens": 837,
"total_tokens": 961,
"output_tokens": 124
},
"additional_kwargs": {},
"response_metadata": {
"stop_reason": "end_turn",
"stop_sequence": null
},
"invalid_tool_calls": []
}
]
}
},
"user_id": "",
"graph_id": "agent",
"thread_id": "5cb1e8a1-34b3-4a61-a34e-71a9799bd00d",
"created_by": "system",
"assistant_id": "fe096781-5601-53d2-b2f6-0d3403f7e9ca"
},
"created_at": "2024-08-30T21:09:00.079909+00:00",
"checkpoint_id": "1ef67141-3ca2-6fae-8003-fe96832e57d6",
"parent_checkpoint_id": "1ef67141-2129-6b37-8002-61fc3bf69cb5"
}
```

We can also just print the content of the last AIMessage:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    print(final_result['values']['messages'][-1]['content'][0]['text'])
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    console.log(finalResult['values']['messages'][finalResult['values']['messages'].length-1]['content'][0]['text']);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request GET \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/state | jq -r '.values.messages[-1].content[0].text'
    ```
  </Tab>
</Tabs>

Output:

```
The search results provide the current weather conditions in San Francisco. According to the data, as of 2:00 PM on August 30, 2024, the temperature in San Francisco is 70°F (21.1°C) with partly cloudy skies. The wind is blowing from the west-northwest at around 12 mph (19 km/h). The humidity is 59% and visibility is 9 miles (16 km). Overall, it looks like a nice late summer day in San Francisco with comfortable temperatures and partly sunny conditions.
```
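
Indexing with `['content'][0]['text']` works for the state shown above, where the AI message content is a list of blocks. Message content can also be a plain string, so a slightly more defensive extraction may be useful. This is a sketch only: `last_ai_text` is a hypothetical helper, and it assumes the state dictionary has the shape printed earlier.

```python
def last_ai_text(state: dict) -> str:
    """Return the text of the most recent message whose type is "ai"."""
    for message in reversed(state["values"]["messages"]):
        if message.get("type") != "ai":
            continue
        content = message["content"]
        if isinstance(content, str):
            return content
        # Otherwise treat content as a list of blocks and join the text ones.
        return "".join(
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        )
    raise ValueError("no AI message found in state")


example_state = {
    "values": {
        "messages": [
            {"type": "human", "content": "what's the weather in sf"},
            {"type": "ai", "content": [{"type": "text", "text": "Partly cloudy.", "index": 0}]},
        ]
    }
}
print(last_ai_text(example_state))  # Partly cloudy.
```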



# Manage billing in your account
Source: https://docs.langchain.com/langsmith/billing



This page describes how to manage billing for your LangSmith organization:

* [Set up billing for your account](#set-up-billing-for-your-account): Complete the billing setup process for Developer and Plus plans, including special instructions for legacy accounts.
* [Update your information](#update-your-information): Modify invoice email addresses, business information, and tax IDs for your organization.
* [Enforce spend limits](#enforce-spend-limits): Learn how to manage your spend through usage limits and data retention.

## Set up billing for your account

<Note>
  Before using this guide, note the following:

  * If you are interested in the [Enterprise](https://www.langchain.com/pricing) plan, please [contact sales](https://www.langchain.com/contact-sales). This guide is only for our self-serve billing plans.
</Note>

To set up billing for your LangSmith organization, navigate to the [Billing and Usage](https://smith.langchain.com/settings/payments) page under **Settings**. Depending on your organization's settings, there are different setup guides:

* [Developer plan](#developer-plan%3A-set-up-billing-on-your-personal-organization)
* [Plus plan](#plus-plan%3A-set-up-billing-on-a-shared-organization)

### Developer Plan: set up billing on your personal organization

Personal organizations are limited to 5,000 traces per month until a credit card is added. To add a card:

1. Click **Add card to remove trace limit**.
2. Add your credit card information.
3. Once complete, you will no longer be rate limited to 5,000 traces, and you will be charged for any excess traces at rates specified on the [pricing](https://www.langchain.com/pricing-langsmith) page.

### Plus Plan: set up billing on a shared organization

Team organizations are given an initial 10,000 traces per month. Any excess traces will be charged at rates specified on the [pricing](https://www.langchain.com/pricing-langsmith) page.
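
As a rough illustration of how excess traces relate to the monthly allocation, the arithmetic can be sketched as follows. The 10,000 included traces come from the text above; the usage figure, the rate, and the per-1,000-traces pricing unit are placeholders for illustration only, so check the pricing page for actual values.

```python
def billable_traces(traces_used: int, included_traces: int) -> int:
    """Number of traces beyond the monthly allocation (never negative)."""
    return max(0, traces_used - included_traces)


def excess_charge(traces_used: int, included_traces: int, rate_per_1k: float) -> float:
    """Charge for excess traces, given a caller-supplied rate per 1,000 traces."""
    return billable_traces(traces_used, included_traces) / 1000 * rate_per_1k


# The usage figure and rate below are made-up placeholders, not real prices.
print(billable_traces(12_500, 10_000))                  # 2500
print(excess_charge(12_500, 10_000, rate_per_1k=1.0))   # 2.5
```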

<Note>
  New organizations that you manually create are required to be on the Plus Plan. If you see a message about needing to upgrade to Plus to use this organization, follow these steps.
</Note>

1. Click **Upgrade to Plus**.
2. Invite members to your organization, as desired.
3. Enter your credit card information. Then, enter business information, invoice email, and tax ID. If this organization belongs to a business, check the **This is a business** checkbox and enter the information accordingly. For more information, refer to the [Update your information section](#update-your-information).

## Update your information (Paid plans only)

To update business information for your LangSmith organization, head to the [Billing and Usage](https://smith.langchain.com/settings/payments) page under **Settings**.

### Invoice email

To update the email address for invoices, follow these steps:

1. Navigate to the **Plans and Billing** tab.
2. Locate the section beneath the payment method, where the current invoice email is displayed.
3. Enter the new email address for invoices in the provided field.
4. The new email address will be automatically saved.

All future invoices will be sent to the updated email address.

### Business information and tax ID

<Note>
  In certain jurisdictions, LangSmith is required to collect sales tax. If you are a business, providing your tax ID may qualify you for a sales tax exemption.
</Note>

To update your organization's business information, follow these steps:

1. Navigate to the **Plans and Billing** tab.
2. Below the invoice email section, you will find a checkbox labeled **Business**.
3. Check the **Business** checkbox if your organization belongs to a business.
4. A business information section will appear, allowing you to enter or update the following details:
   * Business Name
   * Address
   * Tax ID for applicable jurisdictions
5. A Tax ID field will appear for applicable jurisdictions after you select a country.
6. After entering the necessary information, click the **Save** button to save your changes.

This ensures that your business information is up-to-date and accurate for billing and tax purposes.

## Enforce spend limits

<Check>
  You may find it helpful to read the following pages before continuing with this section on optimizing your tracing spend:

  * [Data Retention Conceptual Docs](/langsmith/administration-overview#data-retention)
  * [Usage Limiting Conceptual Docs](/langsmith/administration-overview#usage-limits)
</Check>

<Note>
  Some of the features mentioned in this guide are not currently available on the Enterprise plan because of its custom billing. If you are on the Enterprise plan and have questions about cost optimization, reach out to your sales rep or [support@langchain.dev](mailto:support@langchain.dev).
</Note>

### Understand your current usage

The first step of any optimization process is to understand current usage. LangSmith provides two ways to do this: [Usage graph](#usage-graph) and [Invoices](#invoices).

LangSmith usage is measured per workspace, because workspaces often represent development environments or teams within an organization.

#### Usage graph

The usage graph lets you examine how much of each usage-based pricing metric you have consumed. It does not directly show spend (which you will review later in the draft invoice).

Navigate to the usage graph under **Settings** -> **Billing and Usage** -> **Usage Graph**.

There are two usage metrics that LangSmith charges for:

* LangSmith Traces (Base Charge): tracks all traces that you send to LangSmith.
* LangSmith Traces (Extended Data Retention Upgrades): tracks all traces that also have our Extended 400 Day Data Retention.

For more details, refer to the [data retention conceptual docs](/langsmith/administration-overview#data-retention).

#### Invoices

To understand how your usage translates to spend, navigate to the **Invoices** tab. The first invoice that will appear on screen is a draft of your current month's invoice, which shows your running spend thus far this month.

<Note>
  LangSmith's Usage Graph and Invoice use the term `tenant_id` to refer to a workspace ID. They are interchangeable.
</Note>

### Set limits on usage

<img src="https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=27addecc92b87dd4131683fb8500f96c" alt="" data-og-width="2598" width="2598" data-og-height="1582" height="1582" data-path="langsmith/images/p2usagelimitsempty-v2.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?w=280&fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=1b544ddf00fd88a470ec19db0283a9ec 280w, https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?w=560&fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=91a0e4559da14d35d8644e20ff2210ec 560w, https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?w=840&fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=0f77081378606e6bb4061e4c7b694381 840w, https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?w=1100&fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=e5ea7731ac5e71276ed58f8f8df03d18 1100w, https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?w=1650&fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=f36ec6fc852a08f1d83bb8e0a0cc0fa8 1650w, https://mintcdn.com/langchain-5e9cc07a/-XAfdD9knKGGfZBx/langsmith/images/p2usagelimitsempty-v2.png?w=2500&fit=max&auto=format&n=-XAfdD9knKGGfZBx&q=85&s=88f9537239e3a5250b056070d3b57a93 2500w" />

#### Set spend limit for workspace

1. To set limits, navigate to **Settings** -> **Billing and Usage** -> **Usage limits**.
2. Input a spend limit for your selected workspace. LangSmith will determine an appropriate number of base and extended trace limits to match that spend. The trace limits include the free trace allocation that comes with your plan (see details on [pricing page](https://smith.langchain.com/settings/payments)).

<Note>
  For organizations with **multiple workspaces only**: For simplicity, LangSmith incorporates the free traces into the cost calculation of the **first workspace only**. In actuality, the free traces can be "consumed" by any workspace. Therefore, although workspace-level spend limits are approximate for multi-workspace organizations, the organization-level spend limit is absolute.
</Note>

#### Configure trace tier distribution

LangSmith has two trace tiers: base traces and extended traces. Base traces have the base retention and are short-lived (14 days), while extended traces have extended retention and are long-lived (400 days). For more information, refer to the [data retention conceptual docs](/langsmith/administration-overview#data-retention).

Set the desired default trace tier by selecting an option below the **Default data retention** label. All traces will have this tier by default when they are registered. Because extended traces cost more than base traces, selecting **Extended** as your default data retention option results in fewer traces allowed overall in the billing period. By default, updating this setting applies only to future incoming traces. To apply it to all existing traces in the workspace, select the checkbox.

If the default data retention is set to **Base**, you can optionally use the slider to distribute trace limits across base and extended traces. LangSmith automatically suggests a distribution, but you can tailor it to your needs. For example, if you are running lots of automations or other features that may upgrade a trace to extended, you may want to increase your extended trace limits. To see the complete list of features that may upgrade a trace, [see here](https://docs.langchain.com/langsmith/administration-overview#how-it-works:~:text=Data%20retention%20auto%2Dupgrades).

<Note>
  The extended data retention limit can cause features other than tracing to stop working once reached. If you plan to use this feature, read more about its [functionality and side effects](/langsmith/administration-overview#side-effects-of-extended-data-retention-traces-limit).
</Note>

### Other methods of managing traces

#### Change project-level default retention

Data retention settings are adjustable per tracing project.

Navigate to **Projects** > ***Your project name***, select **Retention**, and choose the desired default retention. This will only affect retention (and pricing) for **traces going forward**.

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=394b513df5ef31d0309f5f3c78bd315a" alt="" data-og-width="1358" width="1358" data-og-height="452" height="452" data-path="langsmith/images/p1projectretention.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=0ebc83ac05d14858da153e707bd02f6b 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=0baaf55ed3a4719c2b3c3545778e1ded 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=30989a6636683bca7e456ea1897c2986 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=ee7c87a83e1a15bc4d843bae3ad32811 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=7947cbde975873a53d197019453485e5 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/p1projectretention.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=53e1f6edb92789ee5034d5f7f1153af6 2500w" />

#### Apply extended data retention to a percentage of traces

You may not want all traces to expire after 14 days. You can automatically extend the retention of traces that match some criteria by creating an [automation rule](/langsmith/rules). You might want to apply extended data retention to specific types of traces, such as:

* 10% of all traces: For general analysis or analyzing trends long term.
* Errored traces: To investigate and debug issues thoroughly.
* Traces with specific metadata: For long-term examination of particular features or user flows.

To configure this:

1. Navigate to **Projects** > ***Your project name***, then select **+ New** > **New Automation**.
2. Name your rule and optionally apply filters or a sample rate. For more information on configuring filters, refer to [filtering techniques](/langsmith/filter-traces-in-application#filter-operators).

<Note>
  When an automation rule matches any [run](/langsmith/observability-concepts#runs) within a [trace](/langsmith/observability-concepts#traces), then all runs within the trace are upgraded to be retained for 400 days.
</Note>
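
The trace-level upgrade described in the note can be illustrated with a small sketch. This is not LangSmith's implementation: the run dictionaries, the `traces_to_upgrade` helper, and the per-run sampling are all assumptions made for illustration.

```python
import random
from typing import Callable, Iterable, Optional, Set

Run = dict  # illustrative stand-in for a LangSmith run record


def traces_to_upgrade(
    runs: Iterable[Run],
    matches: Callable[[Run], bool],
    sample_rate: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Set[str]:
    """Return the IDs of traces in which at least one run matched the rule."""
    rng = rng or random.Random()
    upgraded: Set[str] = set()
    for run in runs:
        # A single matching (and sampled) run is enough to upgrade the whole trace.
        if matches(run) and rng.random() < sample_rate:
            upgraded.add(run["trace_id"])
    return upgraded


runs = [
    {"id": "r1", "trace_id": "t1", "error": None},
    {"id": "r2", "trace_id": "t1", "error": "Timeout"},
    {"id": "r3", "trace_id": "t2", "error": None},
]
errored = traces_to_upgrade(runs, lambda r: r["error"] is not None)
print(errored)  # {'t1'}
```

Only `r2` errored, yet every run in trace `t1` would be retained for the extended period, which is why a narrow filter can still upgrade more data than expected.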

For example, this is the expected configuration to keep 10% of all traces for extended data retention:

<img src="https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=09bbdf5ef7cf3a5a99d6bf0a704e2143" alt="" data-og-width="640" width="640" data-og-height="610" height="610" data-path="langsmith/images/P2SampleTraces.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?w=280&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=b40ff90a0e3ef5f52ecda25965ec1daa 280w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?w=560&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=4381f3239c4d5c71c8ccd9dcdb4fe859 560w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?w=840&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=81da02c26dcc067fbbc4a4d7de318dee 840w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?w=1100&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=5f703039c242ba31d192fdb8f4742362 1100w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?w=1650&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=e293a6cca1f7da6c8ca62b64e249f6b2 1650w, https://mintcdn.com/langchain-5e9cc07a/Xbr8HuVd9jPi6qTU/langsmith/images/P2SampleTraces.png?w=2500&fit=max&auto=format&n=Xbr8HuVd9jPi6qTU&q=85&s=558b6d18d598241a2f81e8327b13ce3d 2500w" />

If you want to keep a subset of traces for **longer than 400 days** for data collection purposes, you can create another run rule that sends some runs to a dataset of your choosing. A dataset allows you to store the trace inputs and outputs (e.g., as a key-value dataset), and will persist indefinitely, even after the trace gets deleted.

### Summary

If you have questions about further managing your spend, please reach out to [support@langchain.dev](mailto:support@langchain.dev).



# Automatically run evaluators on experiments
Source: https://docs.langchain.com/langsmith/bind-evaluator-to-dataset



LangSmith supports two ways to grade experiments created via the SDK:

* **Programmatically**, by specifying evaluators in your code (see [this guide](/langsmith/evaluate-llm-application) for details)
* By **binding evaluators to a dataset** in the UI. This will automatically run the evaluators on any new experiments created, in addition to any evaluators you've set up via the SDK. This is useful when you're iterating on your application (target function), and have a standard set of evaluators you want to run for all experiments.

## Configuring an evaluator on a dataset

1. Click on the **Datasets and Experiments** tab in the sidebar.
2. Select the dataset you want to configure the evaluator for.
3. Click on the **+ Evaluator** button to add an evaluator to the dataset. This will open a pane you can use to configure the evaluator.

<Note>
  When you configure an evaluator for a dataset, it will only affect the experiment runs that are created after the evaluator is configured. It will not affect the evaluation of experiment runs that were created before the evaluator was configured.
</Note>

## LLM-as-a-judge evaluators

The process for binding evaluators to a dataset is very similar to the process for configuring an LLM-as-a-judge evaluator in the Playground. View the instructions for [configuring an LLM-as-a-judge evaluator in the Playground](/langsmith/llm-as-judge?mode=ui).

## Custom code evaluators

The process for binding a code evaluator to a dataset is very similar to the process for configuring a code evaluator in online evaluation. View the instructions for [configuring code evaluators](/langsmith/online-evaluations#configure-a-custom-code-evaluator).

The only difference between configuring a code evaluator in online evaluation and binding a code evaluator to a dataset is that the custom code evaluator can reference outputs that are part of the dataset's `Example`.

For custom code evaluators bound to a dataset, the evaluator function takes in two arguments:

* A `Run` ([reference](/langsmith/run-data-format)). This represents the new run in your experiment. For example, if you ran an experiment via SDK, this would contain the input/output from your chain or model you are testing.
* An `Example` ([reference](/langsmith/example-data-format)). This represents the reference example in your dataset that the chain or model you are testing uses. The `inputs` to the Run and Example should be the same. If your Example has a reference `outputs`, then you can use this to compare to the run's output for scoring.

The code below shows an example of a simple evaluator function that checks that the outputs exactly equal the reference outputs.

<CodeGroup>
  ```python Python theme={null}
  import numpy as np

  def perform_eval(run, example):
      # run is a Run object
      # example is an Example object
      output = run['outputs']['output']
      ref_output = example['outputs']['outputs']
      output_match = np.array_equal(output, ref_output)

      return { "exact_match": output_match }
  ```

  ```javascript JavaScript theme={null}
  function perform_eval(run, example) {
      // run is a Run object
      // example is an Example object
      const output = run.outputs.output;
      const refOutput = example.outputs.outputs;

      // Deep equality check for arrays/objects
      const outputMatch = JSON.stringify(output) === JSON.stringify(refOutput);

      return { "exact_match": outputMatch };
  }
  ```
</CodeGroup>
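
Not every example has a reference output, so a more defensive variant may be useful. The sketch below assumes the run and example arrive as dictionaries with the same `output` and `outputs` keys used above; treating a missing reference as a non-match is a design choice, not documented behavior.

```python
def perform_eval(run, example):
    """Exact-match evaluator that tolerates a missing reference output."""
    # Key names follow the example above; adjust them to your own schema.
    output = (run.get("outputs") or {}).get("output")
    ref_outputs = example.get("outputs") or {}
    if "outputs" not in ref_outputs:
        # No reference to compare against: report a non-match explicitly.
        return {"exact_match": False}
    return {"exact_match": output == ref_outputs["outputs"]}


# Local check with plain dictionaries standing in for Run and Example.
run = {"outputs": {"output": [1, 2, 3]}}
example = {"outputs": {"outputs": [1, 2, 3]}}
print(perform_eval(run, example))  # {'exact_match': True}
```

Testing the function locally with dictionaries like these is a quick way to catch key-name mistakes before binding the evaluator to a dataset.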

## Next steps

* Analyze your experiment results in the [experiments tab](/langsmith/analyze-an-experiment)
* Compare your experiment results in the [comparison view](/langsmith/compare-experiment-results)



# Implement a CI/CD pipeline using LangSmith Deployments and Evaluation
Source: https://docs.langchain.com/langsmith/cicd-pipeline-example



This guide demonstrates how to implement a comprehensive CI/CD pipeline for AI agent applications deployed in LangSmith Deployments. In this example, you'll use the [LangGraph](/oss/python/langgraph/overview) open source framework for orchestrating and building the agent, and [LangSmith](/langsmith/home) for observability and evaluations. This pipeline is based on the [cicd-pipeline-example repository](https://github.com/langchain-ai/cicd-pipeline-example).

## Overview

The CI/CD pipeline provides:

* <Icon icon="check-circle" /> **Automated testing**: Unit, integration, and end-to-end tests.
* <Icon icon="chart-line" /> **Offline evaluations**: Performance assessment using [AgentEvals](https://github.com/langchain-ai/agentevals), [OpenEvals](https://github.com/langchain-ai/openevals) and [LangSmith](https://docs.langchain.com/langsmith/home).
* <Icon icon="rocket" /> **Preview and production deployments**: Automated staging and quality-gated production releases using the Control Plane API.
* <Icon icon="eye" /> **Monitoring**: Continuous evaluation and alerting.

## Pipeline architecture

The CI/CD pipeline consists of several key components that work together to ensure code quality and reliable deployments:

```mermaid  theme={null}
graph TD
    A1[Code or Graph Change] --> B1[Trigger CI Pipeline]
    A2[Prompt Commit in PromptHub] --> B1
    A3[Online Evaluation Alert] --> B1
    A4[PR Opened] --> B1

    subgraph "Testing"
        B1 --> C1[Run Unit Tests]
        B1 --> C2[Run Integration Tests]
        B1 --> C3[Run End to End Tests]
        B1 --> C4[Run Offline Evaluations]

        C4 --> D1[Evaluate with OpenEvals or AgentEvals]
        C4 --> D2[Assertions: Hard and Soft]

        C1 --> E1[Run LangGraph Dev Server Test]
        C2 --> E1
        C3 --> E1
        D1 --> E1
        D2 --> E1
    end

    E1 --> F1[Push to Staging Deployment - Deploy to LangSmith as Development Type]

    F1 --> G1[Run Online Evaluations on Live Data]
    G1 --> H1[Attach Scores to Traces]

    H1 --> I1[If Quality Below Threshold]
    I1 --> J1[Send to Annotation Queue]
    I1 --> J2[Trigger Alert via Webhook]
    I1 --> J3[Push Trace to Golden Dataset]

    F1 --> K1[Promote to Production if All Pass - Deploy to LangSmith Production]

    J2 --> L1[Slack or PagerDuty Notification]

    subgraph Manual Review
        J1 --> M1[Human Labeling]
        M1 --> J3
    end
```
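
The "Assertions: Hard and Soft" step in the diagram can be read as a quality gate over evaluation scores. The sketch below is one possible interpretation, assuming that hard assertions block promotion while soft assertions only produce warnings. The metric names, thresholds, and the `quality_gate` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class GateResult:
    passed: bool
    failures: List[str]
    warnings: List[str]


def _below(scores: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """List every metric whose score is under its threshold (missing counts as 0)."""
    return [
        f"{name}: {scores.get(name, 0.0):.2f} < {minimum:.2f}"
        for name, minimum in thresholds.items()
        if scores.get(name, 0.0) < minimum
    ]


def quality_gate(
    scores: Dict[str, float],
    hard_thresholds: Dict[str, float],
    soft_thresholds: Dict[str, float],
) -> GateResult:
    """Hard assertions block promotion; soft assertions only warn."""
    failures = _below(scores, hard_thresholds)
    warnings = _below(scores, soft_thresholds)
    return GateResult(passed=not failures, failures=failures, warnings=warnings)


result = quality_gate(
    scores={"correctness": 0.92, "trajectory_match": 0.70},
    hard_thresholds={"correctness": 0.85},
    soft_thresholds={"trajectory_match": 0.80},
)
print(result.passed, result.warnings)
```

In a CI job, a failed gate would typically exit with a non-zero status so that the deployment step does not run.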

### Trigger sources

There are multiple ways you can trigger this pipeline, either during development or if your application is already live. The pipeline can be triggered by:

* <Icon icon="code-branch" /> **Code changes**: Pushes to main/development branches where you can modify the LangGraph architecture, try different models, update agent logic, or make any code improvements.
* <Icon icon="edit" /> **PromptHub updates**: Changes to prompt templates stored in LangSmith PromptHub. Whenever there's a new prompt commit, the system triggers a webhook to run the pipeline.
* <Icon icon="exclamation-triangle" /> **Online evaluation alerts**: Performance degradation notifications from live deployments.
* <Icon icon="webhook" /> **LangSmith traces webhooks**: Automated triggers based on trace analysis and performance metrics.
* <Icon icon="play" /> **Manual trigger**: Manual initiation of the pipeline for testing or emergency deployments.

### Testing layers

Compared to traditional software, testing AI agent applications also requires assessing response quality, so it is important to test each part of the workflow. The pipeline implements multiple testing layers:

1. <Icon icon="puzzle-piece" /> **Unit tests**: Individual node and utility function testing.
2. <Icon icon="link" /> **Integration tests**: Component interaction testing.
3. <Icon icon="route" /> **End-to-end tests**: Full graph execution testing.
4. <Icon icon="brain" /> **Offline evaluations**: Performance assessment with real-world scenarios including end-to-end evaluations, single-step evaluations, agent trajectory analysis, and multi-turn simulations.
5. <Icon icon="server" /> **LangGraph dev server tests**: Use the [langgraph-cli](/langsmith/cli) tool to spin up a local server (inside the GitHub Action) that runs the LangGraph agent. The test polls the `/ok` server API endpoint for up to 30 seconds; if the server is still unavailable after that, it throws an error.
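
The readiness polling in the dev server test can be sketched as follows. This is a minimal illustration, not the script used in the example repository. The helper names, the one-second interval, and the use of `urllib` are assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the server is not up yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
) -> None:
    """Poll `probe` until it returns True or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"server not ready after {timeout_s} seconds")
        time.sleep(interval_s)
```

In a workflow step, this could be called as `wait_until_ready(lambda: http_ok("http://127.0.0.1:2024/ok"))`, where the address is an assumption about where the local dev server listens. Passing the probe as a callable keeps the timeout logic testable without a running server.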

## GitHub Actions workflow

The CI/CD pipeline uses GitHub Actions with the [Control Plane API](/langsmith/api-ref-control-plane) and [LangSmith API](https://api.smith.langchain.com/redoc) to automate deployment. A helper script manages API interactions and deployments: [https://github.com/langchain-ai/cicd-pipeline-example/blob/main/.github/scripts/langgraph\_api.py](https://github.com/langchain-ai/cicd-pipeline-example/blob/main/.github/scripts/langgraph_api.py).

The workflow includes:

* **New agent deployment**: When a new PR is opened and tests pass, a new preview deployment is created in LangSmith Deployments using the [Control Plane API](/langsmith/api-ref-control-plane). This allows you to test the agent in a staging environment before promoting to production.

* **Agent deployment revision**: A revision happens when an existing deployment with the same ID is found, or when the PR is merged into main. In the case of merging to main, the preview deployment is deleted and a production deployment is created. This ensures that any updates to the agent are properly deployed and integrated into the production infrastructure.

  <img src="https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=3ef7d51a322b8b5e2f9c2c70579fcc97" alt="Agent Deployment Revision Workflow" data-og-width="1022" width="1022" data-og-height="196" height="196" data-path="langsmith/images/cicd-new-lgp-revision.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?w=280&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=a3d06c339e84a1af99450d23e8bd617f 280w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?w=560&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=30589c8727af3ecb1d97881fd6692554 560w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?w=840&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=c05ab515ea0901fb2d076dee256ad108 840w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?w=1100&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=b939ad6842110227f70cc0526468d21d 1100w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?w=1650&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=0559d5b2a85414e954a72377b2eed9ec 1650w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-new-lgp-revision.png?w=2500&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=b8b96047a8b37f31b78d793cd7d18f45 2500w" />

* **Testing and evaluation workflow**: In addition to the more traditional testing phases (unit tests, integration tests, end-to-end tests, etc.), the pipeline includes [offline evaluations](/langsmith/evaluation-concepts#offline-evaluation) and [Agent dev server testing](/langsmith/local-server) because you want to test the quality of your agent. These evaluations provide comprehensive assessment of the agent's performance using real-world scenarios and data.

  <img src="https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=477c3f5ec3d9bb9dfc354b9a57860636" alt="Test with Results Workflow" data-og-width="2050" width="2050" data-og-height="996" height="996" data-path="langsmith/images/cicd-test-with-results.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?w=280&fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=7c5885b5f85c1c408fda449c5a0c706a 280w, https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?w=560&fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=3b9a25332a9f6b56edfc9fbbfec248c1 560w, https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?w=840&fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=380cb346fffbaf13365b37c6fa955c05 840w, https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?w=1100&fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=8994d1e816e725865f90a2ac6601f7a4 1100w, https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?w=1650&fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=42b752f1e5f0043dd6998ae372e83874 1650w, https://mintcdn.com/langchain-5e9cc07a/MrTet_AXQVddxOlO/langsmith/images/cicd-test-with-results.png?w=2500&fit=max&auto=format&n=MrTet_AXQVddxOlO&q=85&s=043be8ed1ef59cea171f30146790a877 2500w" />

  <AccordionGroup>
    <Accordion title="Final Response Evaluation" icon="check-circle">
      Evaluates the final output of your agent against expected results. This is the most common type of evaluation that checks if the agent's final response meets quality standards and answers the user's question correctly.
    </Accordion>

    <Accordion title="Single Step Evaluation" icon="step-forward">
      Tests individual steps or nodes within your LangGraph workflow. This allows you to validate specific components of your agent's logic in isolation, ensuring each step functions correctly before testing the full pipeline.
    </Accordion>

    <Accordion title="Agent Trajectory Evaluation" icon="route">
      Analyzes the complete path your agent takes through the graph, including all intermediate steps and decision points. This helps identify bottlenecks, unnecessary steps, or suboptimal routing in your agent's workflow. It also evaluates whether your agent invoked the right tools in the right order or at the right time.
    </Accordion>

    <Accordion title="Multi-Turn Evaluation" icon="comments">
      Tests conversational flows where the agent maintains context across multiple interactions. This is crucial for agents that handle follow-up questions, clarifications, or extended dialogues with users.
    </Accordion>
  </AccordionGroup>

  See the [LangGraph testing documentation](/oss/python/langgraph/test) for specific testing approaches and the [evaluation approaches guide](/langsmith/evaluation-approaches) for a comprehensive overview of offline evaluations.
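
The deployment rules in the workflow above (preview on a new PR, revision when a deployment already exists, production on merge) can be summarized as a small decision function. The event names and the `plan_deployment` helper are hypothetical. The actual helper script in the example repository may structure this differently.

```python
from enum import Enum
from typing import List, Optional


class Action(Enum):
    CREATE_PREVIEW = "create_preview"
    CREATE_REVISION = "create_revision"
    DELETE_PREVIEW = "delete_preview"
    CREATE_PRODUCTION = "create_production"


def plan_deployment(
    event: str,
    tests_passed: bool,
    preview_deployment_id: Optional[str] = None,
) -> List[Action]:
    """Decide which deployment steps to run for a CI event."""
    if not tests_passed:
        # Nothing is deployed until the test and evaluation jobs pass.
        return []
    if event == "pr_opened":
        # Reuse an existing preview deployment by creating a revision of it.
        if preview_deployment_id is not None:
            return [Action.CREATE_REVISION]
        return [Action.CREATE_PREVIEW]
    if event == "pr_merged":
        # On merge to main, the preview is removed and production is created.
        steps = [Action.CREATE_PRODUCTION]
        if preview_deployment_id is not None:
            steps.insert(0, Action.DELETE_PREVIEW)
        return steps
    raise ValueError(f"unsupported event: {event!r}")


print(plan_deployment("pr_opened", tests_passed=True))
```

Keeping the decision separate from the API calls makes the pipeline logic easy to unit test before it touches a real deployment.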

### Prerequisites

Before setting up the CI/CD pipeline, ensure you have:

* <Icon icon="robot" /> An AI agent application (in this case built using [LangGraph](/oss/python/langgraph/overview))
* <Icon icon="user" /> A [LangSmith account](https://smith.langchain.com/)
* <Icon icon="key" /> A [LangSmith API key](/langsmith/create-account-api-key) needed to deploy agents and retrieve experiment results
* <Icon icon="cog" /> Project-specific environment variables configured in your repository secrets (e.g., LLM model API keys, vector store credentials, database connections)

<Note>
  While this example uses GitHub, the CI/CD pipeline works with other Git hosting platforms including GitLab, Bitbucket, and others.
</Note>

## Deployment options

LangSmith supports multiple deployment methods, depending on how your [LangSmith instance is hosted](/langsmith/platform-setup):

* <Icon icon="cloud" /> **Cloud LangSmith**: Direct GitHub integration or Docker image deployment.
* <Icon icon="server" /> **Self-Hosted/Hybrid**: Container registry-based deployments.

The deployment flow starts by modifying your agent implementation. At minimum, you must have a [`langgraph.json`](/langsmith/application-structure) and a dependency file in your project (`requirements.txt` or `pyproject.toml`). Use the `langgraph dev` CLI tool to check for errors and fix any that it reports; otherwise, the deployment is likely to fail when deployed to LangSmith Deployments.

```mermaid  theme={null}
graph TD
    A[Agent Implementation] --> B[langgraph.json + dependencies]
    B --> C[Test Locally with langgraph dev]
    C --> D{Errors?}
    D -->|Yes| E[Fix Issues]
    E --> C
    D -->|No| F[Choose LangSmith Instance]

    F --> G[Cloud LangSmith]
    F --> H[Self-Hosted/Hybrid LangSmith]

    subgraph "Cloud LangSmith"
        G --> I[Method 1: Connect GitHub Repo in UI]
        G --> J[Method 2: Control Plane API with GitHub Repo]
        I --> K[Deploy via LangSmith UI]
        J --> L[Deploy via Control Plane API]
    end

    subgraph "Self-Hosted/Hybrid LangSmith"
        H --> S[Build Docker Image langgraph build]
        S --> T[Push to Container Registry]
        T --> U{Deploy via?}
        U -->|UI| V[Specify Image URI in UI]
        U -->|API| W[Use Control Plane API]
        V --> X[Deploy via LangSmith UI]
        W --> Y[Deploy via Control Plane API]
    end

    K --> AA[Agent Ready for Use]
    L --> AA
    X --> AA
    Y --> AA

    AA --> BB{Connect via?}
    BB -->|LangGraph SDK| CC[Use LangGraph SDK]
    BB -->|RemoteGraph| DD[Use RemoteGraph]
    BB -->|REST API| EE[Use REST API]
    BB -->|LangGraph Studio UI| FF[Use LangGraph Studio UI]
```

### Prerequisites for manual deployment

Before deploying your agent, ensure you have:

1. <Icon icon="project-diagram" /> **LangGraph graph**: Your agent implementation (e.g., `./agents/simple_text2sql.py:agent`).
2. <Icon icon="box" /> **Dependencies**: Either `requirements.txt` or `pyproject.toml` with all required packages.
3. <Icon icon="cog" /> **Configuration**: `langgraph.json` file specifying:
   * Path to your agent graph
   * Dependencies location
   * Environment variables
   * Python version

Example `langgraph.json`:

```json  theme={null}
{
    "graphs": {
        "simple_text2sql": "./agents/simple_text2sql.py:agent"
    },
    "env": ".env",
    "python_version": "3.11",
    "dependencies": ["."],
    "image_distro": "wolfi"
}
```
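
Before running the CLI, a quick structural check of the file can catch obvious mistakes. The sketch below is not an official schema: the checks are assumptions drawn from the list above, and `langgraph dev` remains the authoritative validation.

```python
import json
from typing import Any, Dict, List


def check_langgraph_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed langgraph.json."""
    problems: List[str] = []
    graphs = config.get("graphs")
    if not isinstance(graphs, dict) or not graphs:
        problems.append("'graphs' must map at least one name to a graph path")
    else:
        for name, target in graphs.items():
            # Each entry is expected to look like "./path/to/file.py:variable".
            if not isinstance(target, str) or ":" not in target:
                problems.append(f"graph '{name}' should be '<file>:<variable>'")
    if not config.get("dependencies"):
        problems.append("'dependencies' must list at least one location")
    return problems


config = json.loads("""
{
    "graphs": {"simple_text2sql": "./agents/simple_text2sql.py:agent"},
    "env": ".env",
    "python_version": "3.11",
    "dependencies": ["."]
}
""")
print(check_langgraph_config(config))  # []
```

An empty list means the basic structure looks plausible; it does not guarantee that the referenced graph imports cleanly.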

### Local development and testing

<img src="https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=425460d3401221ab441e21fc706c9cf1" alt="Studio CLI Interface" data-og-width="2972" width="2972" data-og-height="1354" height="1354" data-path="langsmith/images/cicd-studio-cli.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?w=280&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=35e64359dba47f4db4962148073cfadb 280w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?w=560&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=c12eb479d5c46921633c56bdead978bc 560w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?w=840&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=b36efc12f81027b7364cea82a4600fc3 840w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?w=1100&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=131c3fa2e989fbb8ebc4748a5790dc36 1100w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?w=1650&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=afa56b4e5ca02495ef5e7cb69d8e1329 1650w, https://mintcdn.com/langchain-5e9cc07a/-UAx6PdOIJpPyTy2/langsmith/images/cicd-studio-cli.png?w=2500&fit=max&auto=format&n=-UAx6PdOIJpPyTy2&q=85&s=774ec3dcf76a4b0e61989cd12e41e0c3 2500w" />

First, test your agent locally using [Studio](/langsmith/studio):

```bash  theme={null}
# Start local development server with Studio
langgraph dev
```

This will:

* Spin up a local server with Studio.
* Allow you to visualize and interact with your graph.
* Validate that your agent works correctly before deployment.

<Note>
  If your agent runs locally without any errors, it means that deployment to LangSmith will likely succeed. This local testing helps catch configuration issues, dependency problems, and agent logic errors before attempting deployment.
</Note>

See the [LangGraph CLI documentation](/langsmith/cli#dev) for more details.

### Method 1: LangSmith Deployment UI

Deploy your agent using the LangSmith deployment interface:

1. Go to your [LangSmith dashboard](https://smith.langchain.com).
2. Navigate to the **Deployments** section.
3. Click the **+ New Deployment** button in the top right.
4. Select your GitHub repository containing your LangGraph agent from the dropdown menu.

**Supported deployments:**

* <Icon icon="cloud" /> **Cloud LangSmith**: Direct GitHub integration with dropdown menu
* <Icon icon="server" /> **Self-Hosted/Hybrid LangSmith**: Specify your image URI in the Image Path field (e.g., `docker.io/username/my-agent:latest`)

<Info>
  **Benefits:**

  * Simple UI-based deployment
  * Direct integration with your GitHub repository (cloud)
  * No manual Docker image management required (cloud)
</Info>

### Method 2: Control Plane API

Deploy using the Control Plane API with different approaches for each deployment type:

**For Cloud LangSmith:**

* Use the Control Plane API to create deployments by pointing to your GitHub repository
* No Docker image building required for cloud deployments

**For Self-Hosted/Hybrid LangSmith:**

```bash  theme={null}
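# Build Docker image
langgraph build -t my-agent:latest

# Tag the image with your registry path (example path shown), then push it
docker tag my-agent:latest docker.io/username/my-agent:latest
docker push docker.io/username/my-agent:latest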
```

You can push to any container registry (Docker Hub, AWS ECR, Azure ACR, Google GCR, etc.) that your deployment environment has access to.

**Supported deployments:**

* <Icon icon="cloud" /> **Cloud LangSmith**: Use the Control Plane API to create deployments from your GitHub repository
* <Icon icon="server" /> **Self-Hosted/Hybrid LangSmith**: Use the Control Plane API to create deployments from your container registry

See the [LangGraph CLI build documentation](/langsmith/cli#build) for more details.

### Connect to your deployed agent

* <Icon icon="code" /> **[LangGraph SDK](https://langchain-ai.github.io/langgraph/cloud/reference/sdk/python_sdk_ref/#langgraph-sdk-python)**: Use the LangGraph SDK for programmatic integration.
* <Icon icon="project-diagram" /> **[RemoteGraph](/langsmith/use-remote-graph)**: Use RemoteGraph to call your deployed graph from within other graphs.
* <Icon icon="globe" /> **[REST API](/langsmith/server-api-ref)**: Use HTTP-based interactions with your deployed agent.
* <Icon icon="desktop" /> **[Studio](/langsmith/studio)**: Access the visual interface for testing and debugging.

### Environment configuration

#### Database & cache configuration

By default, LangSmith Deployments create PostgreSQL and Redis instances for you. To use external services, set the following environment variables in your new deployment or revision:

```bash  theme={null}
# Set environment variables for external services
export POSTGRES_URI_CUSTOM="postgresql://user:pass@host:5432/db"
export REDIS_URI_CUSTOM="redis://host:6379/0"
```

See the [environment variables documentation](/langsmith/env-var#postgres-uri-custom) for more details.
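
Before setting these variables, it can help to sanity-check the URI format. The following is a minimal sketch using only the Python standard library. The helper name `check_service_uri` and the lists of accepted schemes are illustrative assumptions, not part of LangSmith:

```python
from urllib.parse import urlsplit

# Assumption for illustration: schemes commonly used for each service.
ALLOWED_SCHEMES = {
    "POSTGRES_URI_CUSTOM": {"postgresql", "postgres"},
    "REDIS_URI_CUSTOM": {"redis", "rediss"},
}


def check_service_uri(var_name: str, uri: str) -> dict:
    """Parse a connection URI and fail fast on obvious mistakes (hypothetical helper)."""
    parts = urlsplit(uri)
    if parts.scheme not in ALLOWED_SCHEMES[var_name]:
        raise ValueError(f"{var_name}: unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError(f"{var_name}: missing host")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


postgres = check_service_uri("POSTGRES_URI_CUSTOM", "postgresql://user:pass@host:5432/db")
redis = check_service_uri("REDIS_URI_CUSTOM", "redis://host:6379/0")
print(postgres)
print(redis)
```

A check like this only validates the shape of the URI. It does not confirm that the service is reachable from your deployment.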

## Troubleshooting

### Wrong API endpoints

If you're experiencing connection issues, verify you're using the correct endpoint format for your LangSmith instance. There are two different APIs with different endpoints:

#### LangSmith API (Traces, Ingestion, etc.)

For LangSmith API operations (traces, evaluations, datasets):

| Region | Endpoint                             |
| ------ | ------------------------------------ |
| US     | `https://api.smith.langchain.com`    |
| EU     | `https://eu.api.smith.langchain.com` |

For self-hosted LangSmith instances, use `http(s)://<langsmith-url>/api` where `<langsmith-url>` is your self-hosted instance URL.

<Note>
  If you're setting the endpoint in the `LANGSMITH_ENDPOINT` environment variable, you need to add `/v1` at the end (e.g., `https://api.smith.langchain.com/v1` or `http(s)://<langsmith-url>/api/v1` if self-hosted).
</Note>
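
The rules above can be sketched as a small helper that returns the right base URL for a region or a self-hosted instance. The function name `langsmith_endpoint` and its parameters are hypothetical. Only the URLs and the `/v1` rule come from this section:

```python
from typing import Optional

# Cloud endpoints copied from the table above.
LANGSMITH_API = {
    "US": "https://api.smith.langchain.com",
    "EU": "https://eu.api.smith.langchain.com",
}


def langsmith_endpoint(
    region: Optional[str] = None,
    self_hosted_url: Optional[str] = None,
    for_env_var: bool = False,
) -> str:
    """Return the LangSmith API base URL (hypothetical helper for illustration)."""
    if self_hosted_url:
        base = self_hosted_url.rstrip("/") + "/api"
    elif region in LANGSMITH_API:
        base = LANGSMITH_API[region]
    else:
        raise ValueError("pass region 'US' or 'EU', or a self-hosted URL")
    # LANGSMITH_ENDPOINT needs the /v1 suffix, per the note above.
    return base + "/v1" if for_env_var else base


print(langsmith_endpoint(region="EU"))
print(langsmith_endpoint(self_hosted_url="https://langsmith.example.com", for_env_var=True))
```

Here `https://langsmith.example.com` is a placeholder for your own instance URL.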

#### LangSmith Deployments API (Deployments)

For LangSmith Deployments operations (deployments, revisions):

| Region | Endpoint                            |
| ------ | ----------------------------------- |
| US     | `https://api.host.langchain.com`    |
| EU     | `https://eu.api.host.langchain.com` |

For self-hosted LangSmith instances, use `http(s)://<langsmith-url>/api-host` where `<langsmith-url>` is your self-hosted instance URL.
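
When debugging a connection issue, a quick way to confirm which API a configured URL points at is to compare it against the hosts and path prefixes listed in this section. The sketch below does that. The helper `identify_api` is hypothetical, and the self-hosted check assumes the `/api` and `/api-host` prefixes described above:

```python
from urllib.parse import urlsplit

# Cloud hosts taken from the two tables in this section.
KNOWN_HOSTS = {
    "api.smith.langchain.com": ("LangSmith API", "US"),
    "eu.api.smith.langchain.com": ("LangSmith API", "EU"),
    "api.host.langchain.com": ("LangSmith Deployments API", "US"),
    "eu.api.host.langchain.com": ("LangSmith Deployments API", "EU"),
}


def identify_api(url: str) -> tuple:
    """Return (api_name, region) for a configured URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.hostname in KNOWN_HOSTS:
        return KNOWN_HOSTS[parts.hostname]
    # Self-hosted instances are distinguished by path prefix: /api-host vs /api.
    path = parts.path.rstrip("/")
    if path == "/api-host" or path.startswith("/api-host/"):
        return ("LangSmith Deployments API", "self-hosted")
    if path == "/api" or path.startswith("/api/"):
        return ("LangSmith API", "self-hosted")
    raise ValueError(f"unrecognized endpoint: {url}")


print(identify_api("https://eu.api.host.langchain.com"))
print(identify_api("https://langsmith.example.com/api/v1"))
```

If a deployments call reports the LangSmith API (or the reverse), the client is pointed at the wrong endpoint.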



# LangGraph CLI
Source: https://docs.langchain.com/langsmith/cli



**LangGraph CLI** is a command-line tool for building and running the [Agent Server](/langsmith/agent-server) locally. The resulting server exposes all API endpoints for runs, threads, assistants, etc., and includes supporting services such as a managed database for checkpointing and storage.

## Installation

1. Ensure Docker is installed (e.g., `docker --version`).

2. Install the CLI:

   <CodeGroup>
     ```bash [Python (pip)] theme={null}
     pip install langgraph-cli
     ```

     ```bash JavaScript theme={null}
     # Use latest on demand
     npx @langchain/langgraph-cli

     # Or install globally (available as `langgraphjs`)
     npm install -g @langchain/langgraph-cli
     ```
   </CodeGroup>

3. Verify the installation:

   <CodeGroup>
     ```bash [Python (pip)] theme={null}
     langgraph --help
     ```

     ```bash JavaScript theme={null}
     npx @langchain/langgraph-cli --help
     ```
   </CodeGroup>

### Quick commands

| Command                               | What it does                                                                                                                         |
| ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| [`langgraph dev`](#dev)               | Starts a lightweight local dev server (no Docker required), ideal for rapid testing.                                                 |
| [`langgraph build`](#build)           | Builds a Docker image of your LangGraph API server for deployment.                                                                   |
| [`langgraph dockerfile`](#dockerfile) | Emits a Dockerfile derived from your config for custom builds.                                                                       |
| [`langgraph up`](#up)                 | Starts the LangGraph API server locally in Docker. Requires Docker running; LangSmith API key for local dev; license for production. |

For JS, use `npx @langchain/langgraph-cli <command>` (or `langgraphjs` if installed globally).

## Configuration file

To build and run a valid application, the LangGraph CLI requires a JSON configuration file that follows this [schema](https://raw.githubusercontent.com/langchain-ai/langgraph/refs/heads/main/libs/cli/schemas/schema.json). The tabs below list the supported properties.

<Note>The LangGraph CLI defaults to using the configuration file named <strong>langgraph.json</strong> in the current directory.</Note>
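
As a rough illustration of the file's shape, the sketch below assembles a minimal Python configuration using only keys documented in the reference tables that follow, then checks the two keys marked as required. The graph ID `agent`, the `validate` helper, and the chosen values are illustrative assumptions:

```python
import json

# Illustrative values: the graph ID "agent" is hypothetical, and the path
# reuses the example form from the reference table.
config = {
    "dependencies": ["."],
    "graphs": {"agent": "./your_package/your_file.py:variable"},
    "env": ".env",
    "python_version": "3.11",
}


def validate(cfg: dict) -> None:
    """Check the keys the Python table marks as required (hypothetical helper)."""
    for key in ("dependencies", "graphs"):
        if not cfg.get(key):
            raise ValueError(f"missing required key: {key}")
    for graph_id, path in cfg["graphs"].items():
        # Graph paths use the form <file path>:<variable or function>.
        if ":" not in path:
            raise ValueError(f"graph {graph_id!r} must use 'file:attribute' form")


validate(config)
text = json.dumps(config, indent=2)
print(text)
```

Saving the printed JSON as `langgraph.json` in the project root matches the default file name the CLI looks for.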

<Tabs>
  <Tab title="Python">
    | Key                                                              | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
    | ---------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
    | <span style={{ whiteSpace: "nowrap" }}>`dependencies`</span>     | **Required**. Array of dependencies for LangSmith API server. Dependencies can be one of the following: <ul><li>A single period (`"."`), which will look for local Python packages.</li><li>The directory path where `pyproject.toml`, `setup.py` or `requirements.txt` is located.<br />For example, if `requirements.txt` is located in the root of the project directory, specify `"./"`. If it's located in a subdirectory called `local_package`, specify `"./local_package"`. Do not specify the string `"requirements.txt"` itself.</li><li>A Python package name.</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
    | <span style={{ whiteSpace: "nowrap" }}>`graphs`</span>           | **Required**. Mapping from graph ID to path where the compiled graph or a function that makes a graph is defined. Example: <ul><li>`./your_package/your_file.py:variable`, where `variable` is an instance of `langgraph.graph.state.CompiledStateGraph`</li><li>`./your_package/your_file.py:make_graph`, where `make_graph` is a function that takes a config dictionary (`langchain_core.runnables.RunnableConfig`) and returns an instance of `langgraph.graph.state.StateGraph` or `langgraph.graph.state.CompiledStateGraph`. See [how to rebuild a graph at runtime](/langsmith/graph-rebuild) for more details.</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
    | <span style={{ whiteSpace: "nowrap" }}>`auth`</span>             | *(Added in v0.0.11)* Auth configuration containing the path to your authentication handler. Example: `./your_package/auth.py:auth`, where `auth` is an instance of `langgraph_sdk.Auth`. See [authentication guide](/langsmith/auth) for details.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
    | <span style={{ whiteSpace: "nowrap" }}>`base_image`</span>       | Optional. Base image to use for the LangGraph API server. Defaults to `langchain/langgraph-api` or `langchain/langgraphjs-api`. Use this to pin your builds to a particular version of the langgraph API, such as `"langchain/langgraph-server:0.2"`. See [https://hub.docker.com/r/langchain/langgraph-server/tags](https://hub.docker.com/r/langchain/langgraph-server/tags) for more details. (added in `langgraph-cli==0.2.8`)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
    | <span style={{ whiteSpace: "nowrap" }}>`image_distro`</span>     | Optional. Linux distribution for the base image. Must be one of `"debian"`, `"wolfi"`, `"bookworm"`, or `"bullseye"`. If omitted, defaults to `"debian"`. Available in `langgraph-cli>=0.2.11`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
    | <span style={{ whiteSpace: "nowrap" }}>`env`</span>              | Path to `.env` file or a mapping from environment variable to its value.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
    | <span style={{ whiteSpace: "nowrap" }}>`store`</span>            | Configuration for adding semantic search and/or time-to-live (TTL) to the BaseStore. Contains the following fields: <ul><li>`index` (optional): Configuration for semantic search indexing with fields `embed`, `dims`, and optional `fields`.</li><li>`ttl` (optional): Configuration for item expiration. An object with optional fields: `refresh_on_read` (boolean, defaults to `true`), `default_ttl` (float, lifespan in **minutes**; applied to newly created items only; existing items are unchanged; defaults to no expiration), and `sweep_interval_minutes` (integer, how often to check for expired items, defaults to no sweeping).</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
    | <span style={{ whiteSpace: "nowrap" }}>`ui`</span>               | Optional. Named definitions of UI components emitted by the agent, each pointing to a JS/TS file. (added in `langgraph-cli==0.1.84`)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
    | <span style={{ whiteSpace: "nowrap" }}>`python_version`</span>   | `3.11`, `3.12`, or `3.13`. Defaults to `3.11`.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
    | <span style={{ whiteSpace: "nowrap" }}>`node_version`</span>     | Specify `node_version: 20` to use LangGraph.js.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
    | <span style={{ whiteSpace: "nowrap" }}>`pip_config_file`</span>  | Path to `pip` config file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
    | <span style={{ whiteSpace: "nowrap" }}>`pip_installer`</span>    | *(Added in v0.3)* Optional. Python package installer selector. It can be set to `"auto"`, `"pip"`, or `"uv"`. From version 0.3 onward the default strategy is to run `uv pip`, which typically delivers faster builds while remaining a drop-in replacement. In the uncommon situation where `uv` cannot handle your dependency graph or the structure of your `pyproject.toml`, specify `"pip"` here to revert to the earlier behaviour.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
    | <span style={{ whiteSpace: "nowrap" }}>`keep_pkg_tools`</span>   | *(Added in v0.3.4)* Optional. Control whether to retain Python packaging tools (`pip`, `setuptools`, `wheel`) in the final image. Accepted values: <ul><li><code>true</code> : Keep all three tools (skip uninstall).</li><li><code>false</code> / omitted : Uninstall all three tools (default behaviour).</li><li><code>list\[str]</code> : Names of tools <strong>to retain</strong>. Each value must be one of "pip", "setuptools", "wheel".</li></ul> |
    | <span style={{ whiteSpace: "nowrap" }}>`dockerfile_lines`</span> | Array of additional lines to add to Dockerfile following the import from parent image.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    | <span style={{ whiteSpace: "nowrap" }}>`checkpointer`</span>     | Configuration for the checkpointer. Supports: <ul><li>`ttl` (optional): Object with `strategy`, `sweep_interval_minutes`, `default_ttl` controlling checkpoint expiry.</li><li>`serde` (optional, 0.5+): Object with `allowed_json_modules` and `pickle_fallback` to tune deserialization behavior.</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
    | <span style={{ whiteSpace: "nowrap" }}>`http`</span>             | HTTP server configuration with the following fields: <ul><li>`app`: Path to custom Starlette/FastAPI app (e.g., `"./src/agent/webapp.py:app"`). See [custom routes guide](/langsmith/custom-routes).</li><li>`cors`: CORS configuration with fields such as `allow_origins`, `allow_methods`, `allow_headers`, `allow_credentials`, `allow_origin_regex`, `expose_headers`, and `max_age`.</li><li>`configurable_headers`: Define which request headers to expose as configurable values via `includes` / `excludes` patterns.</li><li>`logging_headers`: Mirror of `configurable_headers` for excluding sensitive headers from logs.</li><li>`middleware_order`: Choose how custom middleware and auth interact. `auth_first` runs authentication hooks before custom middleware, while `middleware_first` (default) runs your middleware first.</li><li>`enable_custom_route_auth`: Apply auth checks to routes added through `app`.</li><li>`disable_assistants`, `disable_mcp`, `disable_meta`, `disable_runs`, `disable_store`, `disable_threads`, `disable_ui`, `disable_webhooks`: Disable built-in routes or hooks.</li><li>`mount_prefix`: Prefix for mounted routes (e.g., "/my-deployment/api").</li></ul> |
    | <span style={{ whiteSpace: "nowrap" }}>`api_version`</span>      | *(Added in v0.3.7)* Which semantic version of the LangGraph API server to use (e.g., `"0.3"`). Defaults to latest. Check the server [changelog](/langsmith/agent-server-changelog) for details on each release.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
  </Tab>

  <Tab title="JS">
    | Key                                                              | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
    | ---------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
    | <span style={{ whiteSpace: "nowrap" }}>`graphs`</span>           | **Required**. Mapping from graph ID to path where the compiled graph or a function that makes a graph is defined. Example: <ul><li>`./src/graph.ts:variable`, where `variable` is an instance of [`CompiledStateGraph`](https://reference.langchain.com/python/langgraph/graphs/#langgraph.graph.state.CompiledStateGraph)</li><li>`./src/graph.ts:makeGraph`, where `makeGraph` is a function that takes a config dictionary (`LangGraphRunnableConfig`) and returns an instance of [`StateGraph`](https://reference.langchain.com/python/langgraph/graphs/#langgraph.graph.state.StateGraph) or [`CompiledStateGraph`](https://reference.langchain.com/python/langgraph/graphs/#langgraph.graph.state.CompiledStateGraph). See [how to rebuild a graph at runtime](/langsmith/graph-rebuild) for more details.</li></ul> |
    | <span style={{ whiteSpace: "nowrap" }}>`env`</span>              | Path to `.env` file or a mapping from environment variable to its value.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
    | <span style={{ whiteSpace: "nowrap" }}>`store`</span>            | Configuration for adding semantic search and/or time-to-live (TTL) to the BaseStore. Contains the following fields: <ul><li>`index` (optional): Configuration for semantic search indexing with fields `embed`, `dims`, and optional `fields`.</li><li>`ttl` (optional): Configuration for item expiration. An object with optional fields: `refresh_on_read` (boolean, defaults to `true`), `default_ttl` (float, lifespan in **minutes**; applied to newly created items only; existing items are unchanged; defaults to no expiration), and `sweep_interval_minutes` (integer, how often to check for expired items, defaults to no sweeping).</li></ul>                                                                                                                                                                |
    | <span style={{ whiteSpace: "nowrap" }}>`node_version`</span>     | Specify `node_version: 20` to use LangGraph.js.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
    | <span style={{ whiteSpace: "nowrap" }}>`dockerfile_lines`</span> | Array of additional lines to add to Dockerfile following the import from parent image.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
    | <span style={{ whiteSpace: "nowrap" }}>`checkpointer`</span>     | Configuration for the checkpointer. Supports: <ul><li>`ttl` (optional): Object with `strategy`, `sweep_interval_minutes`, `default_ttl` controlling checkpoint expiry.</li><li>`serde` (optional, 0.5+): Object with `allowed_json_modules` and `pickle_fallback` to tune deserialization behavior.</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
    | <span style={{ whiteSpace: "nowrap" }}>`http`</span>             | HTTP server configuration mirroring the Python options: <ul><li>`cors` with `allow_origins`, `allow_methods`, `allow_headers`, `allow_credentials`, `allow_origin_regex`, `expose_headers`, `max_age`.</li><li>`configurable_headers` and `logging_headers` pattern lists.</li><li>`middleware_order` (`auth_first` or `middleware_first`).</li><li>`enable_custom_route_auth` plus the same boolean route toggles as above.</li></ul>                                                                                                                                                                                                                                                                                                                                                                                     |
    | <span style={{ whiteSpace: "nowrap" }}>`api_version`</span>      | *(Added in v0.3.7)* Which semantic version of the LangGraph API server to use (e.g., `"0.3"`). Defaults to latest. Check the server [changelog](/langsmith/agent-server-changelog) for details on each release.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
  </Tab>
</Tabs>

### Examples

<Tabs>
  <Tab title="Python">
    #### Basic configuration

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "chat": "chat.graph:graph"
      }
    }
    ```

    #### Using Wolfi base images

    You can specify the Linux distribution for your base image using the `image_distro` field. Valid options are `debian`, `wolfi`, `bookworm`, or `bullseye`. Wolfi is the recommended option as it provides smaller and more secure images. This is available in `langgraph-cli>=0.2.11`.

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "chat": "chat.graph:graph"
      },
      "image_distro": "wolfi"
    }
    ```

    #### Adding semantic search to the store

    All deployments come with a DB-backed BaseStore. Adding an `index` configuration under the `store` key of your `langgraph.json` enables [semantic search](/langsmith/semantic-search) within the BaseStore of your deployment.

    The `index.fields` configuration determines which parts of your documents to embed:

    * If omitted or set to `["$"]`, the entire document will be embedded
    * To embed specific fields, use JSON path notation: `["metadata.title", "content.text"]`
    * Documents missing specified fields will still be stored but won't have embeddings for those fields
    * You can still override which fields to embed on a specific item at `put` time using the `index` parameter

    ```json  theme={null}
    {
      "dependencies": ["."],
      "graphs": {
        "memory_agent": "./agent/graph.py:graph"
      },
      "store": {
        "index": {
          "embed": "openai:text-embedding-3-small",
          "dims": 1536,
          "fields": ["$"]
        }
      }
    }
    ```
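
    As a rough illustration of how `fields` selects what gets embedded, the sketch below resolves each configured path against a document and skips paths that are missing. The helper names (`resolve_path`, `texts_to_embed`) are hypothetical, only simple dotted keys are handled, and serializing non-string values with `json.dumps` is an assumption; this is not the server's implementation.

    ```python
    import json
    from typing import Any, Optional


    def resolve_path(doc: Any, path: str) -> Any:
        """Follow a dotted path such as "metadata.title"; return None if any key is missing."""
        current = doc
        for part in path.split("."):
            if not isinstance(current, dict) or part not in current:
                return None
            current = current[part]
        return current


    def texts_to_embed(doc: dict, fields: Optional[list] = None) -> dict:
        """Map each configured field to the text that would be embedded for it."""
        selected = fields or ["$"]  # omitted fields behave like ["$"]
        texts = {}
        for field in selected:
            value = doc if field == "$" else resolve_path(doc, field)
            if value is None:
                continue  # the document is still stored, just without this embedding
            # Assumption: non-string values are serialized to JSON text before embedding.
            texts[field] = value if isinstance(value, str) else json.dumps(value, sort_keys=True)
        return texts


    doc = {"metadata": {"title": "Trip notes"}, "content": {"text": "Pack light."}}
    print(texts_to_embed(doc, ["metadata.title", "content.text", "summary"]))
    # {'metadata.title': 'Trip notes', 'content.text': 'Pack light.'}
    ```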

    <Note>
      **Common model dimensions**

      * `openai:text-embedding-3-large`: 3072
      * `openai:text-embedding-3-small`: 1536
      * `openai:text-embedding-ada-002`: 1536
      * `cohere:embed-english-v3.0`: 1024
      * `cohere:embed-english-light-v3.0`: 384
      * `cohere:embed-multilingual-v3.0`: 1024
      * `cohere:embed-multilingual-light-v3.0`: 384
    </Note>

    #### Semantic search with a custom embedding function

    To use semantic search with a custom embedding function, set `embed` to the path of that function:

    ```json  theme={null}
    {
      "dependencies": ["."],
      "graphs": {
        "memory_agent": "./agent/graph.py:graph"
      },
      "store": {
        "index": {
          "embed": "./embeddings.py:embed_texts",
          "dims": 768,
          "fields": ["text", "summary"]
        }
      }
    }
    ```

    The `embed` field in store configuration can reference a custom function that takes a list of strings and returns a list of embeddings. Example implementation:

    ```python  theme={null}
    # embeddings.py
    DIMS = 768  # Must match `store.index.dims` in langgraph.json

    def embed_texts(texts: list[str]) -> list[list[float]]:
        """Custom embedding function for semantic search."""
        # Placeholder: replace with a call to your preferred embedding model.
        # Returns one DIMS-dimensional vector per input text.
        return [[0.0] * DIMS for _ in texts]
    ```
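
    Before deploying, it can help to confirm that a custom function honors this contract and returns vectors whose length matches `dims`. The following is a minimal sketch: `toy_embed_texts` is a hypothetical, hash-based stand-in for a real model, and `check_embed_fn` is a hypothetical helper, neither of which is part of LangGraph.

    ```python
    import hashlib

    DIMS = 768  # assumed to match `store.index.dims` in langgraph.json


    def toy_embed_texts(texts: list) -> list:
        """Deterministic stand-in for a real embedding model (illustrative only)."""
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()  # 32 bytes
            # Repeat the digest bytes to fill DIMS slots, scaled into [0, 1].
            vectors.append([digest[i % len(digest)] / 255.0 for i in range(DIMS)])
        return vectors


    def check_embed_fn(embed, dims: int, samples=("hello", "world")) -> None:
        """Raise ValueError if `embed` does not return one `dims`-long vector per text."""
        vectors = embed(list(samples))
        if len(vectors) != len(samples):
            raise ValueError(f"expected {len(samples)} vectors, got {len(vectors)}")
        for vector in vectors:
            if len(vector) != dims:
                raise ValueError(f"expected {dims} dimensions, got {len(vector)}")


    check_embed_fn(toy_embed_texts, DIMS)  # passes silently when the contract holds
    ```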

    #### Adding custom authentication

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "chat": "chat.graph:graph"
      },
      "auth": {
        "path": "./auth.py:auth",
        "openapi": {
          "securitySchemes": {
            "apiKeyAuth": {
              "type": "apiKey",
              "in": "header",
              "name": "X-API-Key"
            }
          },
          "security": [{ "apiKeyAuth": [] }]
        },
        "disable_studio_auth": false
      }
    }
    ```

    See the [authentication conceptual guide](/langsmith/auth) for details, and the [setting up custom authentication](/langsmith/set-up-custom-auth) guide for a practical walkthrough of the process.

    <a id="ttl" />

    #### Configuring store item Time-to-Live

    You can configure default data expiration for items/memories in the BaseStore using the `store.ttl` key. This determines how long items are retained after they are last accessed (reads refresh the timer when `refresh_on_read` is enabled). These defaults can be overridden on a per-call basis by passing the corresponding arguments to `get`, `search`, etc.

    The `ttl` configuration is an object containing optional fields:

    * `refresh_on_read`: If `true` (the default), accessing an item via `get` or `search` resets its expiration timer. Set to `false` to only refresh TTL on writes (`put`).
    * `default_ttl`: The default lifespan of an item in **minutes**. Applies only to newly created items; existing items are not modified. If not set, items do not expire by default.
    * `sweep_interval_minutes`: How frequently (in minutes) the system should run a background process to delete expired items. If not set, sweeping does not occur automatically.

    Here is an example enabling a 7-day TTL (10080 minutes), refreshing on reads, and sweeping every hour:

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "memory_agent": "./agent/graph.py:graph"
      },
      "store": {
        "ttl": {
          "refresh_on_read": true,
          "sweep_interval_minutes": 60,
          "default_ttl": 10080
        }
      }
    }
    ```
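
    To make the interaction between `default_ttl` and `refresh_on_read` concrete, here is a toy in-memory model of the semantics described above. `ToyTTLStore` and `minutes` are hypothetical names, time is passed in explicitly as minutes, and the class only mimics the documented behavior; it is not how the server stores or sweeps data.

    ```python
    from datetime import timedelta


    def minutes(delta: timedelta) -> float:
        """Convert a timedelta to the minutes unit used by `default_ttl`."""
        return delta.total_seconds() / 60


    class ToyTTLStore:
        """Illustrative model of store TTL behavior (not the real BaseStore)."""

        def __init__(self, default_ttl=None, refresh_on_read=True):
            self.default_ttl = default_ttl          # minutes, or None for no expiration
            self.refresh_on_read = refresh_on_read
            self._items = {}                        # key -> (value, expires_at or None)

        def put(self, key, value, now):
            expires = None if self.default_ttl is None else now + self.default_ttl
            self._items[key] = (value, expires)

        def get(self, key, now):
            entry = self._items.get(key)
            if entry is None:
                return None
            value, expires = entry
            if expires is not None and now >= expires:
                return None                         # expired; a later sweep removes it
            if self.refresh_on_read and expires is not None:
                self._items[key] = (value, now + self.default_ttl)  # reset the timer
            return value

        def sweep(self, now):
            """Delete expired items and return how many were removed."""
            expired = [k for k, (_, exp) in self._items.items() if exp is not None and now >= exp]
            for key in expired:
                del self._items[key]
            return len(expired)


    store = ToyTTLStore(default_ttl=minutes(timedelta(days=7)))  # 10080.0 minutes
    store.put("k", "v", now=0)
    store.get("k", now=10_000)          # read before expiry pushes expiry to 20080
    print(store.get("k", now=20_000))   # v
    ```

    With `refresh_on_read=False`, the same item would stop being returned once 10080 minutes had passed since the `put`.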

    <a id="ttl" />

    #### Configuring checkpoint Time-to-Live

    You can configure the time-to-live (TTL) for checkpoints using the `checkpointer` key. This determines how long checkpoint data is retained before being automatically handled according to the specified strategy (e.g., deletion). Two optional sub-objects are supported:

    * `ttl`: Includes `strategy`, `sweep_interval_minutes`, and `default_ttl`, which collectively set how checkpoints expire.
    * `serde` *(Agent Server 0.5+)*: Lets you control deserialization behavior for checkpoint payloads.

    Here's an example setting a default TTL of 30 days (43200 minutes):

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "chat": "chat.graph:graph"
      },
      "checkpointer": {
        "ttl": {
          "strategy": "delete",
          "sweep_interval_minutes": 10,
          "default_ttl": 43200
        }
      }
    }
    ```

    In this example, checkpoints older than 30 days will be deleted, and the check runs every 10 minutes.

    #### Configuring checkpointer serde

    The `checkpointer.serde` object shapes deserialization:

    * `allowed_json_modules`: An allow list of custom Python objects that the server may deserialize from payloads saved in "json" mode. Each entry is a `[path, to, module, file, symbol]` sequence. If omitted, only LangChain-safe defaults are allowed. Setting it to `true` allows any module to be deserialized, which is unsafe.
    * `pickle_fallback`: Whether to fall back to pickle deserialization when JSON decoding fails.

    ```json  theme={null}
    {
      "checkpointer": {
        "serde": {
          "allowed_json_modules": [
            ["my_agent", "auth", "SessionState"]
          ]
        }
      }
    }
    ```
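
    If you have the class object at hand, you can derive an entry rather than typing it out. The sketch below assumes, based on the example above, that each entry is the dotted module path split into components followed by the symbol name; `allowed_module_entry` is a hypothetical helper, and `json.JSONDecoder` stands in for your own class only so that the snippet runs on its own.

    ```python
    import json
    from json import JSONDecoder  # stand-in for a class from your own package


    def allowed_module_entry(cls: type) -> list:
        """Build a [path, to, module, symbol] entry from a class object."""
        # Assumption: nested classes (whose __qualname__ contains dots) are not handled here.
        return [*cls.__module__.split("."), cls.__qualname__]


    config = {
        "checkpointer": {
            "serde": {"allowed_json_modules": [allowed_module_entry(JSONDecoder)]}
        }
    }
    print(json.dumps(config))
    # {"checkpointer": {"serde": {"allowed_json_modules": [["json", "decoder", "JSONDecoder"]]}}}
    ```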

    #### Customizing HTTP middleware and headers

    The `http` block lets you fine-tune request handling:

    * `middleware_order`: Choose `"auth_first"` to run authentication before your middleware, or `"middleware_first"` (default) to invert that order.
    * `enable_custom_route_auth`: Extend authentication to routes you mount through `http.app`.
    * `configurable_headers` / `logging_headers`: Each accepts an object with optional `includes` and `excludes` arrays; wildcards are supported and exclusions run before inclusions.
    * `cors`: In addition to `allow_origins`, `allow_methods`, and `allow_headers`, you can set `allow_credentials`, `allow_origin_regex`, `expose_headers`, and `max_age` for detailed browser control.
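
    The include/exclude rule for `configurable_headers` and `logging_headers` can be pictured with a small matcher. This is only a sketch of the ordering described above (exclusions are checked before inclusions); the function name `header_selected`, the glob-style wildcard syntax, and the case-insensitive comparison are assumptions rather than documented server behavior.

    ```python
    from fnmatch import fnmatchcase


    def header_selected(name: str, includes=(), excludes=()) -> bool:
        """Return True if a header passes the exclude-then-include check."""
        lowered = name.lower()  # HTTP header names are case-insensitive
        if any(fnmatchcase(lowered, pattern.lower()) for pattern in excludes):
            return False  # exclusions run first, so they always win
        return any(fnmatchcase(lowered, pattern.lower()) for pattern in includes)


    rules = {"includes": ["x-*"], "excludes": ["x-secret-*"]}
    print(header_selected("X-Tenant-Id", **rules))     # True
    print(header_selected("X-Secret-Token", **rules))  # False
    print(header_selected("Authorization", **rules))   # False
    ```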

    <a id="api-version" />

    #### Pinning API version

    *(Added in v0.3.7)*

    By default, builds in Cloud deployments use the latest stable version of the Agent Server. To ensure that your deployment runs a specific version instead, set the `api_version` key:

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "chat": "chat.graph:graph"
      },
      "api_version": "0.2"
    }
    ```
  </Tab>

  <Tab title="JS">
    #### Basic configuration

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "graphs": {
        "chat": "./src/graph.ts:graph"
      }
    }
    ```

    <a id="api-version" />

    #### Pinning API version

    *(Added in v0.3.7)*

    By default, builds in Cloud deployments use the latest stable version of the Agent Server. To ensure that your deployment runs a specific version instead, set the `api_version` key:

    ```json  theme={null}
    {
      "$schema": "https://langgra.ph/schema.json",
      "dependencies": ["."],
      "graphs": {
        "chat": "./src/chat/graph.ts:graph"
      },
      "api_version": "0.2"
    }
    ```
  </Tab>
</Tabs>

## Commands

**Usage**

<Tabs>
  <Tab title="Python">
    The base command for the LangGraph CLI is `langgraph`.

    ```
    langgraph [OPTIONS] COMMAND [ARGS]
    ```
  </Tab>

  <Tab title="JS">
    The base command for the LangGraph.js CLI is `langgraphjs`.

    ```
    npx @langchain/langgraph-cli [OPTIONS] COMMAND [ARGS]
    ```

    We recommend using `npx` to always use the latest version of the CLI.
  </Tab>
</Tabs>

### `dev`

<Tabs>
  <Tab title="Python">
    Run the LangGraph API server in development mode with hot reloading and debugging capabilities. This lightweight server requires no Docker installation and is suitable for development and testing. State is persisted to a local directory.

    <Note>Currently, the CLI only supports Python >= 3.11.</Note>

    **Installation**

    This command requires the "inmem" extra to be installed:

    ```bash  theme={null}
    pip install -U "langgraph-cli[inmem]"
    ```

    **Usage**

    ```
    langgraph dev [OPTIONS]
    ```

    **Options**

    | Option                        | Default          | Description                                                                                                                                                                  |
    | ----------------------------- | ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
    | `-c, --config FILE`           | `langgraph.json` | Path to configuration file declaring dependencies, graphs and environment variables                                                                                          |
    | `--host TEXT`                 | `127.0.0.1`      | Host to bind the server to                                                                                                                                                   |
    | `--port INTEGER`              | `2024`           | Port to bind the server to                                                                                                                                                   |
    | `--no-reload`                 |                  | Disable auto-reload                                                                                                                                                          |
    | `--n-jobs-per-worker INTEGER` | `10`             | Number of jobs per worker                                                                                                                                                    |
    | `--debug-port INTEGER`        |                  | Port for debugger to listen on                                                                                                                                               |
    | `--wait-for-client`           | `False`          | Wait for a debugger client to connect to the debug port before starting the server                                                                                           |
    | `--no-browser`                |                  | Skip automatically opening the browser when the server starts                                                                                                                |
    | `--studio-url TEXT`           |                  | URL of the Studio instance to connect to. Defaults to [https://smith.langchain.com](https://smith.langchain.com)                                                             |
    | `--allow-blocking`            | `False`          | Do not raise errors for synchronous I/O blocking operations in your code (added in `0.2.6`)                                                                                  |
    | `--tunnel`                    | `False`          | Expose the local server via a public tunnel (Cloudflare) for remote frontend access. This avoids issues with browsers like Safari or networks blocking localhost connections |
    | `--help`                      |                  | Display command documentation                                                                                                                                                |
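
    If you launch the dev server from a script, one option is to assemble the command from the flags in the table above. This is a sketch only: `dev_command` is a hypothetical helper, and it covers just a handful of the listed options.

    ```python
    import shlex


    def dev_command(config="langgraph.json", host="127.0.0.1", port=2024,
                    reload=True, open_browser=True) -> list:
        """Build an argv list for `langgraph dev` using flags from the options table."""
        args = ["langgraph", "dev", "--config", config, "--host", host, "--port", str(port)]
        if not reload:
            args.append("--no-reload")   # disable auto-reload
        if not open_browser:
            args.append("--no-browser")  # skip opening the browser on start
        return args


    print(shlex.join(dev_command(port=8080, open_browser=False)))
    # langgraph dev --config langgraph.json --host 127.0.0.1 --port 8080 --no-browser
    ```

    The resulting list can be passed to `subprocess.run(args, check=True)` in an environment where `langgraph-cli[inmem]` is installed.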
  </Tab>

  <Tab title="JS">
    Run the LangGraph API server in development mode with hot reloading capabilities. This lightweight server requires no Docker installation and is suitable for development and testing. State is persisted to a local directory.

    **Usage**

    ```
    npx @langchain/langgraph-cli dev [OPTIONS]
    ```

    **Options**

    | Option                        | Default          | Description                                                                                                                                                      |
    | ----------------------------- | ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
    | `-c, --config FILE`           | `langgraph.json` | Path to configuration file declaring dependencies, graphs and environment variables                                                                              |
    | `--host TEXT`                 | `127.0.0.1`      | Host to bind the server to                                                                                                                                       |
    | `--port INTEGER`              | `2024`           | Port to bind the server to                                                                                                                                       |
    | `--no-reload`                 |                  | Disable auto-reload                                                                                                                                              |
    | `--n-jobs-per-worker INTEGER` | `10`             | Number of jobs per worker                                                                                                                                        |
    | `--debug-port INTEGER`        |                  | Port for debugger to listen on                                                                                                                                   |
    | `--wait-for-client`           | `False`          | Wait for a debugger client to connect to the debug port before starting the server                                                                               |
    | `--no-browser`                |                  | Skip automatically opening the browser when the server starts                                                                                                    |
    | `--studio-url TEXT`           |                  | URL of the Studio instance to connect to. Defaults to [https://smith.langchain.com](https://smith.langchain.com)                                                 |
    | `--allow-blocking`            | `False`          | Do not raise errors for synchronous I/O blocking operations in your code                                                                                         |
    | `--tunnel`                    | `False`          | Expose the local server via a public tunnel (Cloudflare) for remote frontend access. This avoids issues with browsers or networks blocking localhost connections |
    | `--help`                      |                  | Display command documentation                                                                                                                                    |
  </Tab>
</Tabs>

### `build`

<Tabs>
  <Tab title="Python">
    Build a LangSmith API server Docker image.

    **Usage**

    ```
    langgraph build [OPTIONS]
    ```

    **Options**

    | Option                                | Default          | Description                                                                                                                                             |
    | ------------------------------------- | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
    | `--platform TEXT`                     |                  | Target platform(s) to build the Docker image for. Example: `langgraph build --platform linux/amd64,linux/arm64`                                         |
    | `-t, --tag TEXT`                      |                  | **Required**. Tag for the Docker image. Example: `langgraph build -t my-image`                                                                          |
    | `--pull / --no-pull`                  | `--pull`         | Build with latest remote Docker image. Use `--no-pull` for running the LangSmith API server with locally built images.                                  |
    | `-c, --config FILE`                   | `langgraph.json` | Path to configuration file declaring dependencies, graphs and environment variables.                                                                    |
    | `--build-command TEXT`<sup>\*</sup>   |                  | Build command to run. Runs from the directory where your `langgraph.json` file lives. Example: `langgraph build --build-command "yarn run turbo build"` |
    | `--install-command TEXT`<sup>\*</sup> |                  | Install command to run. Runs from the directory where you call `langgraph build`. Example: `langgraph build --install-command "yarn install"`           |
    | `--help`                              |                  | Display command documentation.                                                                                                                          |

    <sup>\*</sup>Only supported for JS deployments; has no effect on Python deployments.
  </Tab>

  <Tab title="JS">
    Build a LangSmith API server Docker image.

    **Usage**

    ```
    npx @langchain/langgraph-cli build [OPTIONS]
    ```

    **Options**

    | Option              | Default          | Description                                                                                                     |
    | ------------------- | ---------------- | --------------------------------------------------------------------------------------------------------------- |
    | `--platform TEXT`   |                  | Target platform(s) to build the Docker image for. Example: `npx @langchain/langgraph-cli build --platform linux/amd64,linux/arm64` |
    | `-t, --tag TEXT`    |                  | **Required**. Tag for the Docker image. Example: `npx @langchain/langgraph-cli build -t my-image` |
    | `--no-pull`         |                  | Use locally built images. Defaults to `false` to build with latest remote Docker image.                         |
    | `-c, --config FILE` | `langgraph.json` | Path to configuration file declaring dependencies, graphs and environment variables.                            |
    | `--help`            |                  | Display command documentation.                                                                                  |
  </Tab>
</Tabs>

### `up`

<Tabs>
  <Tab title="Python">
    Start the LangGraph API server. Local testing requires a LangSmith API key with access to LangSmith; production use requires a license key.

    **Usage**

    ```
    langgraph up [OPTIONS]
    ```

    **Options**

    | Option                       | Default                   | Description                                                                                                             |
    | ---------------------------- | ------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
    | `--wait`                     |                           | Wait for services to start before returning. Implies `--detach`                                                         |
    | `--base-image TEXT`          | `langchain/langgraph-api` | Base image to use for the LangGraph API server. Pin to specific versions using version tags.                            |
    | `--image TEXT`               |                           | Docker image to use for the langgraph-api service. If specified, skips building and uses this image directly.           |
    | `--postgres-uri TEXT`        | Local database            | Postgres URI to use for the database.                                                                                   |
    | `--watch`                    |                           | Restart on file changes                                                                                                 |
    | `--debugger-base-url TEXT`   | `http://127.0.0.1:[PORT]` | URL used by the debugger to access LangGraph API.                                                                       |
    | `--debugger-port INTEGER`    |                           | Pull the debugger image locally and serve the UI on specified port                                                      |
    | `--verbose`                  |                           | Show more output from the server logs.                                                                                  |
    | `-c, --config FILE`          | `langgraph.json`          | Path to configuration file declaring dependencies, graphs and environment variables.                                    |
    | `-d, --docker-compose FILE`  |                           | Path to docker-compose.yml file with additional services to launch.                                                     |
    | `-p, --port INTEGER`         | `8123`                    | Port to expose. Example: `langgraph up --port 8000`                                                                     |
    | `--pull / --no-pull`         | `--pull`                  | Pull latest images. Use `--no-pull` for running the server with locally-built images. Example: `langgraph up --no-pull` |
    | `--recreate / --no-recreate` | `--no-recreate`           | Recreate containers even if their configuration and image haven't changed                                               |
    | `--help`                     |                           | Display command documentation.                                                                                          |
  </Tab>

  <Tab title="JS">
    Start the LangGraph API server. Local testing requires a LangSmith API key with access to LangSmith; production use requires a license key.

    **Usage**

    ```
    npx @langchain/langgraph-cli up [OPTIONS]
    ```

    **Options**

    | Option                                                                    | Default                                                                 | Description                                                                                                   |
    | ------------------------------------------------------------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- |
    | <span style={{ whiteSpace: "nowrap" }}>`--wait`</span>                    |                                                                         | Wait for services to start before returning. Implies `--detach`                                               |
    | <span style={{ whiteSpace: "nowrap" }}>`--base-image TEXT`</span>         | <span style={{ whiteSpace: "nowrap" }}>`langchain/langgraph-api`</span> | Base image to use for the LangGraph API server. Pin to specific versions using version tags.                  |
    | <span style={{ whiteSpace: "nowrap" }}>`--image TEXT`</span>              |                                                                         | Docker image to use for the langgraph-api service. If specified, skips building and uses this image directly. |
    | <span style={{ whiteSpace: "nowrap" }}>`--postgres-uri TEXT`</span>       | Local database                                                          | Postgres URI to use for the database.                                                                         |
    | <span style={{ whiteSpace: "nowrap" }}>`--watch`</span>                   |                                                                         | Restart on file changes                                                                                       |
    | <span style={{ whiteSpace: "nowrap" }}>`-c, --config FILE`</span>         | `langgraph.json`                                                        | Path to configuration file declaring dependencies, graphs and environment variables.                          |
    | <span style={{ whiteSpace: "nowrap" }}>`-d, --docker-compose FILE`</span> |                                                                         | Path to docker-compose.yml file with additional services to launch.                                           |
    | <span style={{ whiteSpace: "nowrap" }}>`-p, --port INTEGER`</span>        | `8123`                                                                  | Port to expose. Example: `langgraph up --port 8000`                                                           |
    | <span style={{ whiteSpace: "nowrap" }}>`--no-pull`</span>                 |                                                                         | Use locally built images. Defaults to `false` to build with latest remote Docker image.                       |
    | <span style={{ whiteSpace: "nowrap" }}>`--recreate`</span>                |                                                                         | Recreate containers even if their configuration and image haven't changed                                     |
    | <span style={{ whiteSpace: "nowrap" }}>`--help`</span>                    |                                                                         | Display command documentation.                                                                                |
  </Tab>
</Tabs>

### `dockerfile`

<Tabs>
  <Tab title="Python">
    Generate a Dockerfile for building a LangSmith API server Docker image.

    **Usage**

    ```
    langgraph dockerfile [OPTIONS] SAVE_PATH
    ```

    **Options**

    | Option              | Default          | Description                                                                                                     |
    | ------------------- | ---------------- | --------------------------------------------------------------------------------------------------------------- |
    | `-c, --config FILE` | `langgraph.json` | Path to the [configuration file](#configuration-file) declaring dependencies, graphs and environment variables. |
    | `--help`            |                  | Show this message and exit.                                                                                     |

    Example:

    ```bash  theme={null}
    langgraph dockerfile -c langgraph.json Dockerfile
    ```

    This generates a Dockerfile that looks similar to:

    ```dockerfile  theme={null}
    FROM langchain/langgraph-api:3.11

    ADD ./pipconf.txt /pipconfig.txt

    RUN PIP_CONFIG_FILE=/pipconfig.txt PYTHONDONTWRITEBYTECODE=1 pip install --no-cache-dir -c /api/constraints.txt langchain_community langchain_anthropic langchain_openai wikipedia scikit-learn

    ADD ./graphs /deps/__outer_graphs/src
    RUN set -ex && \
        for line in '[project]' \
                    'name = "graphs"' \
                    'version = "0.1"' \
                    '[tool.setuptools.package-data]' \
                    '"*" = ["**/*"]'; do \
            echo "$line" >> /deps/__outer_graphs/pyproject.toml; \
        done

    RUN PIP_CONFIG_FILE=/pipconfig.txt PYTHONDONTWRITEBYTECODE=1 pip install --no-cache-dir -c /api/constraints.txt -e /deps/*

    ENV LANGSERVE_GRAPHS='{"agent": "/deps/__outer_graphs/src/agent.py:graph", "storm": "/deps/__outer_graphs/src/storm.py:graph"}'
    ```

    <Note>The `langgraph dockerfile` command translates all the configuration in your `langgraph.json` file into Dockerfile commands. When using this command, you will have to re-run it whenever you update your `langgraph.json` file. Otherwise, your changes will not be reflected when you build or run the dockerfile.</Note>
  </Tab>

  <Tab title="JS">
    Generate a Dockerfile for building a LangSmith API server Docker image.

    **Usage**

    ```
    npx @langchain/langgraph-cli dockerfile [OPTIONS] SAVE_PATH
    ```

    **Options**

    | Option              | Default          | Description                                                                                                     |
    | ------------------- | ---------------- | --------------------------------------------------------------------------------------------------------------- |
    | `-c, --config FILE` | `langgraph.json` | Path to the [configuration file](#configuration-file) declaring dependencies, graphs and environment variables. |
    | `--help`            |                  | Show this message and exit.                                                                                     |

    Example:

    ```bash  theme={null}
    npx @langchain/langgraph-cli dockerfile -c langgraph.json Dockerfile
    ```

    This generates a Dockerfile that looks similar to:

    ```dockerfile  theme={null}
    FROM langchain/langgraphjs-api:20

    ADD . /deps/agent

    RUN cd /deps/agent && yarn install

    ENV LANGSERVE_GRAPHS='{"agent":"./src/react_agent/graph.ts:graph"}'

    WORKDIR /deps/agent

    RUN (test ! -f /api/langgraph_api/js/build.mts && echo "Prebuild script not found, skipping") || tsx /api/langgraph_api/js/build.mts
    ```

    <Note>The `npx @langchain/langgraph-cli dockerfile` command translates all the configuration in your `langgraph.json` file into Dockerfile commands. When using this command, you will have to re-run it whenever you update your `langgraph.json` file. Otherwise, your changes will not be reflected when you build or run the dockerfile.</Note>
  </Tab>
</Tabs>
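
Because the generated Dockerfile is a snapshot of `langgraph.json`, a build script can check whether it needs regenerating. The following is a minimal sketch, assuming both files live in the same working directory; `dockerfile_is_stale` and `regenerate` are hypothetical helper names, and only the `langgraph dockerfile -c FILE SAVE_PATH` invocation comes from the usage shown above.

```python
import subprocess
from pathlib import Path


def dockerfile_is_stale(config: Path, dockerfile: Path) -> bool:
    """Return True if the Dockerfile is missing or older than the config file."""
    if not dockerfile.exists():
        return True
    return config.stat().st_mtime > dockerfile.stat().st_mtime


def regenerate(config: Path, dockerfile: Path) -> None:
    """Re-run the documented CLI command to rewrite the Dockerfile."""
    subprocess.run(
        ["langgraph", "dockerfile", "-c", str(config), str(dockerfile)],
        check=True,  # raise if the CLI exits with a non-zero status
    )
```

A build step could call `dockerfile_is_stale(Path("langgraph.json"), Path("Dockerfile"))` and invoke `regenerate(...)` only when it returns `True`.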



# Cloud
Source: https://docs.langchain.com/langsmith/cloud



<Callout icon="rocket" color="#4F46E5" iconType="regular">
  If you're ready to deploy your app to Cloud, follow the [Cloud deployment quickstart](/langsmith/deployment-quickstart) or the [full setup guide](/langsmith/deploy-to-cloud). This page explains the Cloud managed architecture for reference.
</Callout>

The **Cloud** option is a fully managed model where LangChain hosts and operates all LangSmith infrastructure and services:

* **Fully managed infrastructure**: LangChain handles all infrastructure, updates, scaling, and maintenance.
* **Deploy from GitHub**: Connect your repositories and deploy with a few clicks.
* **Automated CI/CD**: Build process is handled automatically by the platform.
* **LangSmith UI**: Full access to [observability](/langsmith/observability), [evaluation](/langsmith/evaluation), [deployment management](/langsmith/deployments), and [Studio](/langsmith/studio).

|                                               | **Who manages it** | **Where it runs** |
| --------------------------------------------- | ------------------ | ----------------- |
| **LangSmith platform (UI, APIs, datastores)** | LangChain          | LangChain's cloud |
| **Your Agent Servers**                        | LangChain          | LangChain's cloud |
| **CI/CD for your apps**                       | LangChain          | LangChain's cloud |

<img src="https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=3f0316122425895270d0ecd47b12e139" alt="Cloud deployment: LangChain hosts and manages all components including the UI, APIs, and your Agent Servers." data-og-width="1425" width="1425" data-og-height="1063" height="1063" data-path="langsmith/images/langgraph-cloud-architecture.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?w=280&fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=b34a10fb40ca5188dddfb6be69696c75 280w, https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?w=560&fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=d310ce86e8421878575cf4d4fa72bf78 560w, https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?w=840&fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=7cc3fcee9c36a8d1e8da4cc4abbde3b9 840w, https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?w=1100&fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=a184c950c8585e2fdb5e75ac7f0f5642 1100w, https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?w=1650&fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=cb778e4a6707b00eaf703886d32569bd 1650w, https://mintcdn.com/langchain-5e9cc07a/JOyLr_spVEW0t2KF/langsmith/images/langgraph-cloud-architecture.png?w=2500&fit=max&auto=format&n=JOyLr_spVEW0t2KF&q=85&s=00f9d9df44512dded307613502d03299 2500w" />

## Get started

To deploy your first application to Cloud, follow the [Cloud deployment quickstart](/langsmith/deployment-quickstart) or refer to the [comprehensive setup guide](/langsmith/deploy-to-cloud).

## Cloud architecture and scalability

<Note>
  This section is only relevant for the cloud-managed LangSmith services available at [https://smith.langchain.com](https://smith.langchain.com) and [https://eu.smith.langchain.com](https://eu.smith.langchain.com).

  For information on the self-hosted LangSmith solution, please refer to the [self-hosted documentation](/langsmith/self-hosted).
</Note>

LangSmith is deployed on Google Cloud Platform (GCP) and is designed to be highly scalable. Many customers run production workloads on LangSmith for LLM application observability, evaluation, and agent deployment.

### Architecture

The US-based LangSmith service is deployed in the `us-central1` (Iowa) region of GCP.

<Note>
  The [EU-based LangSmith service](https://eu.smith.langchain.com) is now available (as of mid-July 2024) and is deployed in the `europe-west4` (Netherlands) region of GCP. If you are interested in an enterprise plan in this region, [contact our sales team](https://www.langchain.com/contact-sales).
</Note>

#### Regional storage

The resources and services in this table are stored in the location corresponding to the URL where sign-up occurred (either the US or EU). Cloud-managed LangSmith uses [Supabase](https://supabase.com) for authentication and authorization and [ClickHouse Cloud](https://clickhouse.com/cloud) as its data warehouse.

|                                                | US                                                                 | EU                                                                       |
| ---------------------------------------------- | ------------------------------------------------------------------ | ------------------------------------------------------------------------ |
| URL                                            | [https://smith.langchain.com](https://smith.langchain.com)         | [https://eu.smith.langchain.com](https://eu.smith.langchain.com)         |
| API URL                                        | [https://api.smith.langchain.com](https://api.smith.langchain.com) | [https://eu.api.smith.langchain.com](https://eu.api.smith.langchain.com) |
| GCP                                            | us-central1 (Iowa)                                                 | europe-west4 (Netherlands)                                               |
| Supabase                                       | AWS us-east-1 (N. Virginia)                                        | AWS eu-central-1 (Germany)                                               |
| ClickHouse Cloud                               | us-central1 (Iowa)                                                 | europe-west4 (Netherlands)                                               |
| [LangSmith deployment](/langsmith/deployments) | us-central1 (Iowa)                                                 | europe-west4 (Netherlands)                                               |

See the [Regions FAQ](/langsmith/regions-faq) for more information.
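
A client has to talk to the API URL of the region where the account was created. The following is a minimal sketch of selecting it by region, assuming the LangSmith SDK reads its endpoint from the `LANGSMITH_ENDPOINT` environment variable; `API_URLS` and `configure_region` are hypothetical names used only for illustration.

```python
import os

# API URLs copied from the regional storage table above.
API_URLS = {
    "us": "https://api.smith.langchain.com",
    "eu": "https://eu.api.smith.langchain.com",
}


def configure_region(region: str) -> str:
    """Set LANGSMITH_ENDPOINT for the given region and return the URL."""
    try:
        url = API_URLS[region.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown region {region!r}; expected one of {sorted(API_URLS)}"
        ) from None
    os.environ["LANGSMITH_ENDPOINT"] = url
    return url
```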

#### Region-independent storage

Data listed here is stored exclusively in the US:

* Payment and billing information with Stripe and Metronome

#### GCP services

LangSmith is composed of the following services, all deployed on Google Kubernetes Engine (GKE):

* LangSmith Frontend: serves the LangSmith UI.
* LangSmith Backend: serves the LangSmith API.
* LangSmith Platform Backend: handles authentication and other high-volume tasks. (Internal service)
* LangSmith Playground: handles forwarding requests to various LLM providers for the Playground feature.
* LangSmith Queue: handles processing of asynchronous tasks. (Internal service)

LangSmith uses the following GCP storage services:

* Google Cloud Storage (GCS) for run inputs and outputs.
* Google Cloud SQL PostgreSQL for transactional workloads.
* Google Cloud Memorystore for Redis for queuing and caching.
* ClickHouse Cloud on GCP for trace ingestion and analytics. Our services connect to ClickHouse Cloud, which is hosted in the same GCP region, via a private endpoint.

Some additional GCP services we use include:

* Google Cloud Load Balancer for routing traffic to the LangSmith services.
* Google Cloud CDN for caching static assets.
* Google Cloud Armor for security and rate limits. For more information on rate limits we enforce, please refer to [this guide](/langsmith/administration-overview#rate-limits).

<div style={{ textAlign: 'center' }}>
  <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=0790cbdf4fe131c74d1e60bb120834e3" alt="Light mode overview" data-og-width="2210" width="2210" data-og-height="1463" height="1463" data-path="langsmith/images/cloud-arch-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?w=280&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=c04d8a044d221559fe2f7b9121275638 280w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?w=560&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=a15351b254f11cc149ce237ba8853e91 560w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?w=840&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=d4a409e73830e588519cb1d0b2a17f3b 840w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?w=1100&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=6dbeda77b57083efb988e15af38f0a6e 1100w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?w=1650&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=24aadbe2e79db02d76fd5deaea6564e1 1650w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-light.png?w=2500&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=a126aa1f02d36de0a8e391f0e1059b8e 2500w" />

  <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=767f3bc3dc73ffe1a806f54e0aaa428b" alt="Dark mode overview" data-og-width="2210" width="2210" data-og-height="1463" height="1463" data-path="langsmith/images/cloud-arch-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?w=280&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=f7367df5b782c821882605418c50563f 280w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?w=560&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=60759ef9e927ba0985e21e38acacae6d 560w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?w=840&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=383ac38ba52733548d8d97ffabfe384e 840w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?w=1100&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=b045b8e19a9926d4d10ec8ad2d2767c1 1100w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?w=1650&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=23778aa891c1b42336b274ab1b2f8bec 1650w, https://mintcdn.com/langchain-5e9cc07a/rqYqeBEA_2oeiw17/langsmith/images/cloud-arch-dark.png?w=2500&fit=max&auto=format&n=rqYqeBEA_2oeiw17&q=85&s=5a64734b4e9fb5dd4af690edf3fa6248 2500w" />
</div>

### Allowlisting IP addresses

#### Egress from LangChain SaaS

All traffic leaving LangSmith services will be routed through a NAT gateway. All traffic will appear to originate from the following IP addresses:

| US             | EU             |
| -------------- | -------------- |
| 34.59.65.97    | 34.13.192.67   |
| 34.67.51.221   | 34.147.105.64  |
| 34.46.212.37   | 34.90.22.166   |
| 34.132.150.88  | 34.147.36.213  |
| 35.188.222.201 | 34.32.137.113  |
| 34.58.194.127  | 34.91.238.184  |
| 34.59.97.173   | 35.204.101.241 |
| 104.198.162.55 | 35.204.48.32   |

It may be helpful to allowlist these IP addresses if you connect LangSmith to your own Azure OpenAI service or to other endpoints that the Playground or Online Evaluation may need to reach.
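
On the receiving side, an allowlist check compares the source address of a request with the addresses above. The following is a minimal sketch using only the standard library; `LANGSMITH_EGRESS_IPS` and `is_langsmith_egress` are hypothetical names, and the addresses are copied from the table, so they need updating whenever the table changes.

```python
import ipaddress
from typing import Optional

# NAT gateway egress addresses copied from the table above.
LANGSMITH_EGRESS_IPS = {
    "us": [
        "34.59.65.97", "34.67.51.221", "34.46.212.37", "34.132.150.88",
        "35.188.222.201", "34.58.194.127", "34.59.97.173", "104.198.162.55",
    ],
    "eu": [
        "34.13.192.67", "34.147.105.64", "34.90.22.166", "34.147.36.213",
        "34.32.137.113", "34.91.238.184", "35.204.101.241", "35.204.48.32",
    ],
}


def is_langsmith_egress(source_ip: str, region: Optional[str] = None) -> bool:
    """Return True if source_ip is one of the listed NAT gateway addresses.

    If region is None, both the US and EU lists are checked.
    """
    addr = ipaddress.ip_address(source_ip)  # raises ValueError on malformed input
    regions = [region.lower()] if region else list(LANGSMITH_EGRESS_IPS)
    return any(
        addr == ipaddress.ip_address(ip)
        for r in regions
        for ip in LANGSMITH_EGRESS_IPS[r]
    )
```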

#### Ingress into LangChain SaaS

The LangChain endpoints map to the following static IP addresses:

| US             | EU           |
| -------------- | ------------ |
| 34.8.121.39    | 34.95.92.214 |
| 34.107.251.234 | 34.13.73.122 |

You may need to allowlist these to enable traffic from your private network to LangSmith SaaS endpoints (`api.smith.langchain.com`, `smith.langchain.com`, `beacon.langchain.com`, `eu.api.smith.langchain.com`, `eu.smith.langchain.com`, `eu.beacon.langchain.com`).



# How to define a code evaluator
Source: https://docs.langchain.com/langsmith/code-evaluator



<Info>
  * [Evaluators](/langsmith/evaluation-concepts#evaluators)
</Info>

Code evaluators are just functions that take a dataset example and the resulting application output, and return one or more metrics. These functions can be passed directly into [evaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate) / [aevaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._arunner.aevaluate).

## Basic example

<CodeGroup>
  ```python Python theme={null}
  from langsmith import evaluate

  def correct(outputs: dict, reference_outputs: dict) -> bool:
      """Check if the answer exactly matches the expected answer."""
      return outputs["answer"] == reference_outputs["answer"]

  def dummy_app(inputs: dict) -> dict:
      return {"answer": "hmm i'm not sure", "reasoning": "i didn't understand the question"}

  results = evaluate(
      dummy_app,
      data="dataset_name",
      evaluators=[correct]
  )
  ```

  ```typescript TypeScript theme={null}
  import type { EvaluationResult } from "langsmith/evaluation";

  const correct = async ({ outputs, referenceOutputs }: {
    outputs: Record<string, any>;
    referenceOutputs?: Record<string, any>;
  }): Promise<EvaluationResult> => {
    const score = outputs?.answer === referenceOutputs?.answer;
    return { key: "correct", score };
  }
  ```
</CodeGroup>

## Evaluator args

Code evaluator functions must have specific argument names. They can take any subset of the following arguments:

* `run: Run`: The full [Run](/langsmith/run-data-format) object generated by the application on the given example.
* `example: Example`: The full dataset [Example](/langsmith/example-data-format), including the example inputs, outputs (if available), and metadata (if available).
* `inputs: dict`: A dictionary of the inputs corresponding to a single example in a dataset.
* `outputs: dict`: A dictionary of the outputs generated by the application on the given `inputs`.
* `reference_outputs/referenceOutputs: dict`: A dictionary of the reference outputs associated with the example, if available.

For most use cases you'll only need `inputs`, `outputs`, and `reference_outputs`. `run` and `example` are useful only if you need some extra trace or example metadata outside of the actual inputs and outputs of the application.

When using JS/TS these should all be passed in as part of a single object argument.
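
The argument names decide what each evaluator receives, so two evaluators can ask for different subsets. The following is a minimal sketch, assuming the dataset stores a `"question"` input and an `"answer"` output (both keys are assumptions about your schema, as are the function names):

```python
def correct_ignoring_case(outputs: dict, reference_outputs: dict) -> bool:
    """Compare the answer to the reference, ignoring case and surrounding whitespace."""
    return outputs["answer"].strip().lower() == reference_outputs["answer"].strip().lower()


def answer_not_echo(inputs: dict, outputs: dict) -> bool:
    """Fail answers that merely repeat the question back."""
    return outputs["answer"].strip().lower() != inputs["question"].strip().lower()


# Evaluators are plain functions, so they can be called directly with dicts
# before being passed to evaluate(..., evaluators=[...]).
print(correct_ignoring_case({"answer": " Paris "}, {"answer": "paris"}))  # True
```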

## Evaluator output

Code evaluators are expected to return one of the following types:

Python and JS/TS

* `dict`: dicts of the form `{"score" | "value": ..., "key": ...}` allow you to customize the metric type ("score" for numerical and "value" for categorical) and metric name. This is useful if, for example, you want to log an integer as a categorical metric.

Python only

* `int | float | bool`: this is interpreted as a continuous metric that can be averaged, sorted, etc. The function name is used as the name of the metric.
* `str`: this is interpreted as a categorical metric. The function name is used as the name of the metric.
* `list[dict]`: return multiple metrics using a single function.
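
The return types above can be sketched as follows. This is an illustrative example that assumes the application output has an `"answer"` key; the metric names and length thresholds are arbitrary choices, not LangSmith conventions.

```python
def answer_length_bucket(outputs: dict) -> dict:
    """Log a categorical metric with an explicit name via the "value" key."""
    n = len(outputs["answer"])
    bucket = "short" if n < 100 else "medium" if n < 1000 else "long"
    return {"key": "length_bucket", "value": bucket}


def length_metrics(outputs: dict) -> list[dict]:
    """Return several metrics from a single evaluator (Python only)."""
    n = len(outputs["answer"])
    return [
        {"key": "num_chars", "score": n},      # numerical metric
        {"key": "is_empty", "score": n == 0},  # boolean metric
    ]
```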

## Additional examples

Requires `langsmith>=0.2.0`

<CodeGroup>
  ```python Python theme={null}
  from langsmith import evaluate, wrappers
  from langsmith.schemas import Run, Example
  from openai import AsyncOpenAI
  # Assumes you've installed pydantic.
  from pydantic import BaseModel

  # We can still pass in Run and Example objects if we'd like
  def correct_old_signature(run: Run, example: Example) -> dict:
      """Check if the answer exactly matches the expected answer."""
      return {"key": "correct", "score": run.outputs["answer"] == example.outputs["answer"]}

  # Just evaluate actual outputs
  def concision(outputs: dict) -> int:
      """Score how concise the answer is. 1 is the most concise, 5 is the least concise."""
      return min(len(outputs["answer"]) // 1000, 4) + 1

  # Use an LLM-as-a-judge
  oai_client = wrappers.wrap_openai(AsyncOpenAI())

  async def valid_reasoning(inputs: dict, outputs: dict) -> bool:
      """Use an LLM to judge if the reasoning and the answer are consistent."""
      instructions = """
  Given the following question, answer, and reasoning, determine if the reasoning for the
  answer is logically valid and consistent with question and the answer."""

      class Response(BaseModel):
          reasoning_is_valid: bool

      msg = f"Question: {inputs['question']}\nAnswer: {outputs['answer']}\nReasoning: {outputs['reasoning']}"
      response = await oai_client.beta.chat.completions.parse(
          model="gpt-4o-mini",
          messages=[{"role": "system", "content": instructions,}, {"role": "user", "content": msg}],
          response_format=Response
      )
      return response.choices[0].message.parsed.reasoning_is_valid

  def dummy_app(inputs: dict) -> dict:
      return {"answer": "hmm i'm not sure", "reasoning": "i didn't understand the question"}

  results = evaluate(
      dummy_app,
      data="dataset_name",
      evaluators=[correct_old_signature, concision, valid_reasoning]
  )
  ```

  ```typescript TypeScript theme={null}
  import { Client } from "langsmith";
  import { evaluate } from "langsmith/evaluation";
  import { Run, Example } from "langsmith/schemas";
  import OpenAI from "openai";

  // Type definitions
  interface AppInputs {
      question: string;
  }

  interface AppOutputs {
      answer: string;
      reasoning: string;
  }

  interface Response {
      reasoning_is_valid: boolean;
  }

  // Old signature evaluator
  function correctOldSignature(run: Run, example: Example) {
      return {
          key: "correct",
          score: run.outputs?.["answer"] === example.outputs?.["answer"],
      };
  }

  // Output-only evaluator
  function concision({ outputs }: { outputs: AppOutputs }) {
      return {
          key: "concision",
          score: Math.min(Math.floor(outputs.answer.length / 1000), 4) + 1,
      };
  }

  // LLM-as-judge evaluator
  const openai = new OpenAI();

  async function validReasoning({
      inputs,
      outputs
  }: {
      inputs: AppInputs;
      outputs: AppOutputs;
  }) {
      const instructions = `\
    Given the following question, answer, and reasoning, determine if the reasoning for the \
    answer is logically valid and consistent with the question and the answer. \
    Respond with a JSON object of the form {"reasoning_is_valid": true | false}.`;

      const msg = `Question: ${inputs.question}
  Answer: ${outputs.answer}
  Reasoning: ${outputs.reasoning}`;

      // JSON mode needs a model that supports it and the word "JSON" in the messages.
      const response = await openai.chat.completions.create({
          model: "gpt-4o-mini",
          messages: [
              { role: "system", content: instructions },
              { role: "user", content: msg }
          ],
          response_format: { type: "json_object" }
      });

      const parsed = JSON.parse(response.choices[0].message.content ?? "{}") as Response;
      return {
          key: "valid_reasoning",
          score: parsed.reasoning_is_valid ? 1 : 0
      };
  }

  // Example application
  function dummyApp(inputs: AppInputs): AppOutputs {
      return {
          answer: "hmm i'm not sure",
          reasoning: "i didn't understand the question"
      };
  }

  const results = await evaluate(dummyApp, {
      data: "dataset_name",
      evaluators: [correctOldSignature, concision, validReasoning],
      client: new Client()
  });
  ```
</CodeGroup>

## Related

* [Evaluate aggregate experiment results](/langsmith/summary): Define summary evaluators, which compute metrics for an entire experiment.
* [Run an evaluation comparing two experiments](/langsmith/evaluate-pairwise): Define pairwise evaluators, which compute metrics by comparing two (or more) experiments against each other.



# Beta LangSmith Collector-Proxy
Source: https://docs.langchain.com/langsmith/collector-proxy



<Note>
  This is a beta feature. The API may change in future releases.
</Note>

The LangSmith Collector-Proxy is a lightweight, high-performance proxy server that sits between your application and the LangSmith backend. It batches and compresses trace data before sending it to LangSmith, reducing network overhead and improving performance.

## When to Use the Collector-Proxy

The Collector-Proxy is particularly valuable when:

* You're running multiple instances of your application in parallel and need to efficiently aggregate traces
* You want more efficient tracing than direct OTEL API calls to LangSmith (the collector optimizes batching and compression)
* You're using a language that doesn't have a native LangSmith SDK

## Key Features

* **Efficient Data Transfer**: Batches multiple spans into fewer, larger uploads.
* **Compression**: Uses zstd to minimize payload size.
* **OTLP Support**: Accepts OTLP JSON and Protobuf over HTTP POST.
* **Semantic Translation**: Maps GenAI/OpenInference conventions to the LangSmith Run model.
* **Flexible Batching**: Flush by span count or time interval.

## Configuration

Configure via environment variables:

| Variable             | Description                       | Default                           |
| -------------------- | --------------------------------- | --------------------------------- |
| `HTTP_PORT`          | Port to run the proxy server      | `4318`                            |
| `LANGSMITH_ENDPOINT` | LangSmith backend URL             | `https://api.smith.langchain.com` |
| `LANGSMITH_API_KEY`  | API key for LangSmith             | **Required** (env var or header)  |
| `LANGSMITH_PROJECT`  | Default tracing project           | Default project if not specified  |
| `BATCH_SIZE`         | Spans per upload batch            | `100`                             |
| `FLUSH_INTERVAL_MS`  | Flush interval in milliseconds    | `1000`                            |
| `MAX_BUFFER_BYTES`   | Max uncompressed buffer size      | `10485760` (10 MB)                |
| `MAX_BODY_BYTES`     | Max incoming request body size    | `209715200` (200 MB)              |
| `MAX_RETRIES`        | Retry attempts for failed uploads | `3`                               |
| `RETRY_BACKOFF_MS`   | Initial backoff in milliseconds   | `100`                             |
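
To make the interplay of `BATCH_SIZE` and `FLUSH_INTERVAL_MS` concrete, the following toy buffer flushes when either limit is reached. It is an illustrative sketch only, not the Collector-Proxy's actual implementation; the `SpanBuffer` class and its methods are hypothetical, and the time-based rule (measured from the oldest buffered span) is an assumption.

```python
import time
from typing import Callable, List, Optional


class SpanBuffer:
    """Toy buffer: flush at batch_size spans or once flush_interval_ms has elapsed."""

    def __init__(
        self,
        batch_size: int = 100,
        flush_interval_ms: int = 1000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.batch_size = batch_size
        self.flush_interval_s = flush_interval_ms / 1000
        self._clock = clock
        self._spans: List[dict] = []
        self._oldest = clock()  # arrival time of the oldest buffered span

    def add(self, span: dict) -> Optional[List[dict]]:
        """Buffer a span; return a batch if either flush condition is met."""
        if not self._spans:
            self._oldest = self._clock()
        self._spans.append(span)
        return self.maybe_flush()

    def maybe_flush(self) -> Optional[List[dict]]:
        """Return and clear the buffered spans if a flush is due, else None."""
        if not self._spans:
            return None
        full = len(self._spans) >= self.batch_size
        stale = self._clock() - self._oldest >= self.flush_interval_s
        if full or stale:
            batch, self._spans = self._spans, []
            return batch
        return None
```

A larger `BATCH_SIZE` means fewer, bigger uploads, while a shorter `FLUSH_INTERVAL_MS` bounds how long a span can wait when traffic is low.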

### Project Configuration

The Collector-Proxy supports LangSmith project configuration with the following priority:

1. If a project is specified in the request headers (`Langsmith-Project`), that project will be used
2. If no project is specified in headers, it will use the project set in the `LANGSMITH_PROJECT` environment variable
3. If neither is set, it will trace to the `default` project.
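
The priority order above can be sketched as a small lookup. This illustrates the documented rules rather than the proxy's own code; `resolve_project` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def resolve_project(
    headers: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the tracing project: request header, then env var, then "default"."""
    env = os.environ if env is None else env
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return (
        normalized.get("langsmith-project")
        or env.get("LANGSMITH_PROJECT")
        or "default"
    )
```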

### Authentication

The API key can be provided either:

* As an environment variable (`LANGSMITH_API_KEY`)
* In the request headers (`X-API-Key`)

## Deployment (Docker)

You can deploy the Collector-Proxy with Docker:

1. **Build the image**

   ```bash  theme={null}
   docker build \
     -t langsmith-collector-proxy:beta .
   ```

2. **Run the container**

   ```bash  theme={null}
   docker run -d \
     -p 4318:4318 \
     -e LANGSMITH_API_KEY=<your_api_key> \
     -e LANGSMITH_PROJECT=<your_project> \
     langsmith-collector-proxy:beta
   ```

## Usage

Point any OTLP-compatible client or the OpenTelemetry Collector exporter at:

```bash  theme={null}
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://<host>:4318/v1/traces
export OTEL_EXPORTER_OTLP_HEADERS="X-API-Key=<your_api_key>,Langsmith-Project=<your_project>"
```

Send a test trace:

```bash  theme={null}
curl -X POST http://localhost:4318/v1/traces \
  -H "Content-Type: application/json" \
  --data '{
    "resourceSpans": [
      {
        "resource": {
          "attributes": [
            {
              "key": "service.name",
              "value": { "stringValue": "test-service" }
            }
          ]
        },
        "scopeSpans": [
          {
            "scope": {
              "name": "example/instrumentation",
              "version": "1.0.0"
            },
            "spans": [
              {
                "traceId": "T6nh/mMkIONaoHewS9UWIw==",
                "spanId": "0tEqJwCpvU0=",
                "name": "parent-span",
                "kind": "SPAN_KIND_INTERNAL",
                "startTimeUnixNano": 1747675155185223936,
                "endTimeUnixNano":   1747675156185223936,
                "attributes": [
                  {
                    "key": "gen_ai.prompt",
                    "value": {
                      "stringValue": "{\"text\":\"Hello, world!\"}"
                    }
                  },
                  {
                    "key": "gen_ai.usage.input_tokens",
                    "value": {
                      "intValue": "5"
                    }
                  },
                  {
                    "key": "gen_ai.completion",
                    "value": {
                      "stringValue": "{\"text\":\"Hi there!\"}"
                    }
                  },
                  {
                    "key": "gen_ai.usage.output_tokens",
                    "value": {
                      "intValue": "3"
                    }
                  }
                ],
                "droppedAttributesCount": 0,
                "events": [],
                "links": [],
                "status": {}
              }
            ]
          }
        ]
      }
    ]
  }'
```
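The same test request can also be sent from Python. The following is a minimal sketch that uses only the standard library. It assumes the Collector-Proxy is reachable at `http://localhost:4318`, as in the curl command above, and the helper names (`attribute`, `build_test_trace`, `send_trace`) are illustrative, not part of any LangSmith API.

```python
import json
import urllib.request

# Assumption: the Collector-Proxy listens locally on port 4318,
# matching the curl example.
COLLECTOR_URL = "http://localhost:4318/v1/traces"


def attribute(key, value):
    """Build one OTLP/JSON attribute; integers are sent as string-encoded intValue."""
    if isinstance(value, int):
        return {"key": key, "value": {"intValue": str(value)}}
    return {"key": key, "value": {"stringValue": value}}


def build_test_trace():
    """Return the same single-span payload used in the curl example."""
    span = {
        "traceId": "T6nh/mMkIONaoHewS9UWIw==",
        "spanId": "0tEqJwCpvU0=",
        "name": "parent-span",
        "kind": "SPAN_KIND_INTERNAL",
        "startTimeUnixNano": 1747675155185223936,
        "endTimeUnixNano": 1747675156185223936,
        "attributes": [
            attribute("gen_ai.prompt", '{"text":"Hello, world!"}'),
            attribute("gen_ai.usage.input_tokens", 5),
            attribute("gen_ai.completion", '{"text":"Hi there!"}'),
            attribute("gen_ai.usage.output_tokens", 3),
        ],
    }
    return {
        "resourceSpans": [
            {
                "resource": {"attributes": [attribute("service.name", "test-service")]},
                "scopeSpans": [
                    {
                        "scope": {"name": "example/instrumentation", "version": "1.0.0"},
                        "spans": [span],
                    }
                ],
            }
        ]
    }


def send_trace(payload, url=COLLECTOR_URL):
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Requires a running Collector-Proxy:
# print(send_trace(build_test_trace()))
```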

## Health & Scaling

* **Liveness**: `GET /live` → 200
* **Readiness**: `GET /ready` → 200
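A deployment script or monitor could poll both endpoints before routing traffic to an instance. The sketch below is one way to do that with the standard library. The base URL is an assumption, and `http_status` and `probe` are hypothetical helper names introduced for illustration.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (for example 503).
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def probe(base_url: str, fetch: Callable[[str], int] = http_status) -> dict:
    """Check both health endpoints; each counts as healthy only on HTTP 200."""
    base = base_url.rstrip("/")
    return {
        "live": fetch(f"{base}/live") == 200,
        "ready": fetch(f"{base}/ready") == 200,
    }


# Assumed address of a locally running Collector-Proxy:
# print(probe("http://localhost:4318"))
```

Passing `fetch` as a parameter keeps the check testable without a running proxy.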

## Horizontal Scaling

To ensure full traces are batched correctly, route spans with the same trace ID to the same instance (e.g., via consistent hashing).
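To make the routing idea concrete, here is a minimal sketch of a consistent-hash ring that maps a trace ID to one instance. It is an illustration only: in practice this logic usually lives in a load balancer, and the class name `TraceRouter`, the instance names, and the choice of SHA-256 with 100 virtual nodes per instance are all assumptions.

```python
import bisect
import hashlib


class TraceRouter:
    """Map trace IDs to instances using a consistent-hash ring."""

    def __init__(self, instances: list, replicas: int = 100) -> None:
        if not instances:
            raise ValueError("at least one instance is required")
        # Each instance gets several virtual nodes so load spreads evenly.
        self._ring = sorted(
            (self._hash(f"{instance}#{i}"), instance)
            for instance in instances
            for i in range(replicas)
        )
        self._keys = [key for key, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def route(self, trace_id: str) -> str:
        """Return the instance that owns this trace ID."""
        # Pick the first virtual node clockwise from the trace's hash.
        index = bisect.bisect_right(self._keys, self._hash(trace_id)) % len(self._ring)
        return self._ring[index][1]


router = TraceRouter(["proxy-0", "proxy-1", "proxy-2"])
target = router.route("T6nh/mMkIONaoHewS9UWIw==")  # every span of this trace goes here
```

Because the mapping depends only on the trace ID and the set of instances, every span of a trace reaches the same instance, and removing an instance only remaps the traces that instance owned.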

## Fork & Extend

Fork the [Collector-Proxy repo on GitHub](https://github.com/langchain-ai/langsmith-collector-proxy) and implement your own converter:

* Create a custom `GenAiConverter` or modify the existing one in `internal/translator/otel_converter.go`
* Register the custom converter in `internal/translator/translator.go`



# How to compare experiment results
Source: https://docs.langchain.com/langsmith/compare-experiment-results



When you are iterating on your LLM application (such as changing the model or the prompt), you will want to compare the results of different [*experiments*](/langsmith/evaluation-concepts#experiment).

LangSmith supports a comparison view that lets you home in on key differences, regressions, and improvements between different experiments.

## Open the comparison view

1. To access the experiment comparison view, navigate to the **Datasets & Experiments** page.
2. Select a dataset, which will open the **Experiments** tab.
3. Select two or more experiments and then click **Compare**.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=67d4d6068012e92b101f900595734977" alt="The Experiments view in the UI with 3 experiments selected and the Compare button highlighted." data-og-width="1626" width="1626" data-og-height="966" height="966" data-path="langsmith/images/compare-select.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=8e520c4ec316531d45a9f538c3f36f78 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=0aed7d0b1cb5d70321ea536ce1decea9 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=71ed0d1faed09fe9bd845e8049327948 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=b8ccc3f4aa28630b633021b18eb64dd0 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=6e8b473e44e88e18aa743d1f75b9f6b5 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-select.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=c0784778d3b730d9e8b34400c852462b 2500w" />

## Adjust the table display

You can toggle between different views by clicking **Full** or **Compact** at the top of the **Comparing Experiments** page.

Toggling **Full** will show the full text of the input, output, and reference output for each run. If the reference output is too long to display in the table, you can click on **Expand detailed view** to view the full content.

You can also select and hide individual feedback keys or individual metrics in the **Display** settings dropdown to isolate the information you need in the comparison view.

## View regressions and improvements

In the comparison view, runs that *regressed* on your specified feedback key against your baseline experiment will be highlighted in red, while runs that *improved* will be highlighted in green. At the top of each column, you can find how many runs in that experiment did better and how many did worse than your baseline experiment.

Click the regressions or improvements button at the top of a column to filter to the runs that regressed or improved in that specific experiment.
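The highlighting can be pictured as a per-example comparison of each experiment's score against the baseline score for the same feedback key. The sketch below illustrates that idea and is not LangSmith's implementation: the function names, the `higher_is_better` flag, and the choice to count missing or equal scores as unchanged are all assumptions.

```python
from typing import Optional


def classify(baseline: Optional[float], candidate: Optional[float],
             higher_is_better: bool = True) -> str:
    """Label one example as improved, regressed, or unchanged versus the baseline."""
    # Assumption: a missing score on either side is treated as unchanged.
    if baseline is None or candidate is None or baseline == candidate:
        return "unchanged"
    better = candidate > baseline if higher_is_better else candidate < baseline
    return "improved" if better else "regressed"


def summarize(baseline_scores: dict, experiment_scores: dict,
              higher_is_better: bool = True) -> dict:
    """Count improved and regressed examples, keyed by example ID."""
    counts = {"improved": 0, "regressed": 0, "unchanged": 0}
    for example_id, base in baseline_scores.items():
        label = classify(base, experiment_scores.get(example_id), higher_is_better)
        counts[label] += 1
    return counts


# Hypothetical scores for one feedback key, keyed by example ID.
baseline = {"ex1": 0.5, "ex2": 0.9, "ex3": 0.7}
experiment = {"ex1": 0.8, "ex2": 0.4, "ex3": 0.7}
print(summarize(baseline, experiment))
# {'improved': 1, 'regressed': 1, 'unchanged': 1}
```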

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=14f3a9b65dec55c9e4a0f5688d9e8f43" alt="The comparison view comparing 2 experiments with the regressions and improvements highlighted in red and green respectively." data-og-width="1632" width="1632" data-og-height="739" height="739" data-path="langsmith/images/regression-view.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=c7256046a4a6c5d28f350dd8d26bb7e3 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=78c6a2cb4baa784abeb3336d5bea94c4 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=9772371f026939748db446fff60fe19b 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=d5639c88a1a0214a0955ffbcda23faf0 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=2e4f232fbef17ff53192d8ac6db3913e 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/regression-view.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=d55c8e280732ba5d9e3866564af29de9 2500w" />

## Update baseline experiment and metric

In order to track regressions, you need to:

1. In the **Baseline** dropdown at the top of the comparison view, select a **Baseline experiment** against which to compare. By default, the newest experiment is selected as the baseline.
2. Select the **Feedback key** (evaluation metric) you want to compare against. One will be assigned by default, but you can adjust as needed.
3. Configure whether a higher score is better for the selected feedback key. This preference will be stored.

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=3df57789664fd92aba18ea2f438934bd" alt="The Baseline dropdown highlighted with a selected experiment and feedback key of &#x22;hallucination&#x22;." data-og-width="1627" width="1627" data-og-height="898" height="898" data-path="langsmith/images/select-baseline.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=173b5ead77441998fc19669be3290aff 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=46f3bd52bad1fe8fc11ea269b37b15f9 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=a7cf328e494a3b674ad60f8a082d6b44 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=ae7ec9831c8c36855d085f48a8d8e6b3 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=4ef21b8740df8f02f7579239e9a9792a 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-baseline.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=c87682dbb3438f9c59a837a740d56979 2500w" />

## Open a trace

If the example you're evaluating is from an ingested [run](/langsmith/observability-concepts#runs), you can hover over the output cell and click on the trace icon to open the trace view for that run. This will open up a trace in the side panel.

<img src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=b5feaa0894a645f4642c7422de937c7d" alt="The View trace icon highlighted from an ingested run." data-og-width="1632" width="1632" data-og-height="662" height="662" data-path="langsmith/images/open-source-trace.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=49a42505f2992ab6caeab6f1c52c8e81 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=c05cf153d430c062d42d1bf18ef73d6a 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=734ad1f9a66a03615b259305298480f0 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=3ad016ed31dc9c43eaa74cdffcc66a5e 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=991be6dc57ad88ef21aaba9c90ea24fa 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/open-source-trace.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=c4110db6d5e60b4e8df840c409cb5c38 2500w" />

## Expand detailed view

From any cell, you can click on the expand icon in the hover state to open up a detailed view of all experiment results on that particular example input, along with feedback keys and scores.

<img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=1ff7f02d5ba6ea89902b4de0e37967e7" alt="An example in the Comparing Experiments view of an expanded view of the repetitions." data-og-width="1643" width="1643" data-og-height="926" height="926" data-path="langsmith/images/expanded-view.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=4803efab7ae4a4363145857941d50055 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=98f67feac8ad53ddb18be06955ab2895 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=628927173d27260eb7f76d82e66ff2d4 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=6de1f01362d8658d9f673e91deec77b6 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=8f68caed39717af5314beb187c3f542f 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-view.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=00fa65a0580a10973a91d506b7cdbc09 2500w" />

## View summary charts

View summary charts by clicking on the **Charts** tab at the top of the page.

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=ada485c324964c8df96caa566ad11b1d" alt="The Charts summary page with 8 summary charts for the comparison." data-og-width="1639" width="1639" data-og-height="1147" height="1147" data-path="langsmith/images/charts-tab.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=e21f174643b26892cae4280962653263 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=40f98d83faf7c13a05a9b410787c9a75 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=6a49ddad4c2c02bc14a0d765dc7dc261 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=8aec566c2fd07b7d6e12a3e639cf6660 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=716edd3ad3626831a88c6db07f0fa566 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/charts-tab.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=08e8ea763e825413a3aff09c061593ca 2500w" />

## Use experiment metadata as chart labels

You can configure the x-axis labels for the charts based on [experiment metadata](/langsmith/filter-experiments-ui#background-add-metadata-to-your-experiments).

Select a metadata key in the **x-axis** dropdown to change the chart labels.

<img src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=cb175478369f9a7a2314e44f6becc9e4" alt="x-axis dropdown highlighted with a list of the metadata attached to the experiment." data-og-width="1637" width="1637" data-og-height="1141" height="1141" data-path="langsmith/images/metadata-in-charts.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=b34f45a5f0736b6b83a73a00f94212f6 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=7084f0126711a63d6f539904f4e9091f 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=dee53003f1c958d09f839d24cb54eb63 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=7a0323fef549e2aef3fcf22295d40d19 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=81c37649a07ffa594bad4fd57cc8fb32 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/metadata-in-charts.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=70d700760e7ba9dd0edce54429d11f95 2500w" />



# Compare traces
Source: https://docs.langchain.com/langsmith/compare-traces



To compare traces, click the **Compare** button in the upper-right corner of any trace view.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=794720bc9ef293984be25f19bffc89e7" alt="" data-og-width="2936" width="2936" data-og-height="1860" height="1860" data-path="langsmith/images/compare-button.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=f64c79f89cf94802ed0676cb4d175be7 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=111db8eecaadd0458001072616ef95fa 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=1df1b17af22d13e24a613bea71cb2e98 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=476560e917219525466629c49284de13 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=12cf8bf414cd5abd48355693f93d3bcf 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-button.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=7242812503868254fde6688b27e3586f 2500w" />

This will show the trace run table. Select the trace you want to compare against the original trace.

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=89c9c16780ac8d129736ee800124625a" alt="" data-og-width="2388" width="2388" data-og-height="1856" height="1856" data-path="langsmith/images/select-trace.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=1a7d4eb56fb4c2f8814b0584bce967b1 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=72c89c532151d35facb58cb62b825213 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=acd522e4aa3f54caaa3ef4036eb654b2 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=aac705cc1574309b670d54e96bff666f 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=db82f5ea3c9ea13b3919f892db65bd9c 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/select-trace.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=b926b0a266e8945063a71a7496034e97 2500w" />

The pane opens with both traces shown in a side-by-side comparison view.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=0f3b4479d399d18c3d6d48fb91681f73" alt="" data-og-width="2930" width="2930" data-og-height="1868" height="1868" data-path="langsmith/images/compare-trace.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=05c151e7c9d0733ca6bfee6b6ce5ccc6 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=579b34d6aa74caea5e055cba22f6fea8 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=9a06b98a43f3a195eed05292cacae2d2 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=968d1316e524123352412791515af047 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=29065b3b65b98841a47ebe6487a26974 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-trace.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=dfe7e02a6db6693d4ff8490e170014bf 2500w" />

To stop comparing, close the pane or click **Stop comparing** in the upper-right corner of the pane.



# LangSmith Deployment components
Source: https://docs.langchain.com/langsmith/components



When running self-hosted [LangSmith Deployment](/langsmith/deploy-self-hosted-full-platform), your installation includes several key components. Together these tools and services provide a complete solution for building, deploying, and managing graphs (including agentic applications) in your own infrastructure:

```mermaid  theme={null}
flowchart
    subgraph LangSmith Deployment
        A[LangGraph CLI] -->|creates| B(Agent Server deployment)
        B <--> D[Studio]
        B <--> E[SDKs]
        B <--> F[RemoteGraph]
    end
```

* [Agent Server](/langsmith/agent-server): Defines an opinionated API and runtime for deploying graphs and agents. Handles execution, state management, and persistence so you can focus on building logic rather than server infrastructure.
* [LangGraph CLI](/langsmith/cli): A command-line interface to build, package, and interact with graphs locally and prepare them for deployment.
* [Studio](/langsmith/studio): A specialized IDE for visualization, interaction, and debugging. Connects to a local Agent Server for developing and testing your graph.
* [Python/JS SDK](/langsmith/sdk): Provides a programmatic way to interact with deployed graphs and agents from your applications.
* [RemoteGraph](/langsmith/use-remote-graph): Allows you to interact with a deployed graph as though it were running locally.
* [Control plane](/langsmith/control-plane): The UI and APIs for creating, updating, and managing Agent Server deployments.
* [Data plane](/langsmith/data-plane): The runtime layer that executes your graphs, including Agent Servers, their backing services (PostgreSQL, Redis, etc.), and the listener that reconciles state from the control plane.



# Composite evaluators
Source: https://docs.langchain.com/langsmith/composite-evaluators



*Composite evaluators* combine multiple evaluator scores into a single [score](/langsmith/evaluation-concepts#evaluator-outputs). This is useful when you evaluate several aspects of your application and want one overall result.

## Create a composite evaluator using the UI

You can create composite evaluators on a [tracing project](/langsmith/observability-concepts#projects) (for [online evaluations](/langsmith/evaluation-concepts#online-evaluation)) or a [dataset](/langsmith/evaluation-concepts#datasets) (for [offline evaluations](/langsmith/evaluation-concepts#offline-evaluation)). With composite evaluators in the UI, you can compute a weighted average or weighted sum of multiple evaluator scores, with configurable weights.

<div style={{ textAlign: 'center' }}>
  <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=b3859ada8b576ebeaf5399ff15359b10" alt="LangSmith UI showing an LLM call trace called ChatOpenAI with a system and human input followed by an AI Output." data-og-width="756" width="756" data-og-height="594" height="594" data-path="langsmith/images/create_composite_evaluator-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?w=280&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=9bab5ad812328acdd6ffe858f487262b 280w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?w=560&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=4637a2dc732f945d98b0214023266180 560w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?w=840&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=c3e7b24dde21ed45f481b7a513ecc256 840w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?w=1100&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=1310a99e2a8b37d68d78f794b8ce6606 1100w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?w=1650&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=6beb89dcc6ec734b2ad012bc46c58821 1650w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-light.png?w=2500&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=ba7fd7ba48a3e46d8701b6f64bb68f66 2500w" />

  <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=ac13f4d2d4a5e3b67285284150b7d592" alt="LangSmith UI showing an LLM call trace called ChatOpenAI with a system and human input followed by an AI Output." data-og-width="761" width="761" data-og-height="585" height="585" data-path="langsmith/images/create_composite_evaluator-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?w=280&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=bfc19d802f0327a579d90e519441cf9a 280w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?w=560&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=23ab26db75e25795c17abf07e487ba5d 560w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?w=840&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=7ce9597b62f3e68b2dc1afa5f17f0e8c 840w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?w=1100&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=ee7058d60185a820fe23decf003bd2c1 1100w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?w=1650&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=cff38ad541c55d6834edfa67f5650818 1650w, https://mintcdn.com/langchain-5e9cc07a/cRRwi1N4-QohYC73/langsmith/images/create_composite_evaluator-dark.png?w=2500&fit=max&auto=format&n=cRRwi1N4-QohYC73&q=85&s=0f85093799a489eff72dae01ed5b6d94 2500w" />
</div>

### 1. Navigate to the tracing project or dataset

To start configuring a composite evaluator, navigate to the **Tracing Projects** or **Datasets & Experiments** tab and select a project or dataset.

* From within a tracing project: **+ New** > **Evaluator** > **Composite score**
* From within a dataset: **+ Evaluator** > **Composite score**

### 2. Configure the composite evaluator

1. Name your evaluator.
2. Select an aggregation method, either **Average** or **Sum**.
   * **Average**: ∑(weight\*score) / ∑(weight).
   * **Sum**: ∑(weight\*score).
3. Add the feedback keys you want to include in the composite score.
4. Add the weights for the feedback keys. By default, the weights are equal for each feedback key. Adjust the weights to increase or decrease the importance of specific feedback keys in the final score.
5. Click **Create** to save the evaluator.
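The two aggregation methods above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the LangSmith implementation: the function name `composite_score` and the example feedback keys are hypothetical, and returning `None` when a feedback key has no score is an assumption about how a missing score could be represented.

```python
from typing import Optional


def composite_score(scores: dict, weights: dict, method: str = "average") -> Optional[float]:
    """Combine per-key scores into one weighted score."""
    # Assumption: no composite score is produced if any weighted key is missing.
    if any(key not in scores for key in weights):
        return None
    weighted_sum = sum(weights[key] * scores[key] for key in weights)
    if method == "sum":
        # Sum: sum(weight * score)
        return weighted_sum
    if method == "average":
        # Average: sum(weight * score) / sum(weight)
        total_weight = sum(weights.values())
        if total_weight == 0:
            raise ValueError("weights must not sum to zero")
        return weighted_sum / total_weight
    raise ValueError(f"unknown aggregation method: {method}")


# Hypothetical feedback keys and weights.
scores = {"summary": 1.0, "tone": 0.0, "formatting": 1.0}
weights = {"summary": 2.0, "tone": 1.0, "formatting": 1.0}
print(composite_score(scores, weights, "sum"))      # 3.0
print(composite_score(scores, weights, "average"))  # 0.75
```

Doubling the weight on `summary` makes it count as much as the other two keys combined, which is why the average lands at 0.75 rather than 0.67.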

<Tip> If you need to adjust the weights for the composite scores, they can be updated after the evaluator is created. The resulting scores will be updated for all runs that have the evaluator configured. </Tip>

### 3. View composite evaluator results

Composite scores are attached to a run as **feedback**, similarly to feedback from a single evaluator. How you can view them depends on where the evaluation was run:

**On a tracing project**:

* Composite scores appear as feedback on runs.
* [Filter for runs](/langsmith/filter-traces-in-application) with a composite score, or where the composite score meets a certain threshold.
* [Create a chart](/langsmith/dashboards#custom-dashboards) to visualize trends in the composite score over time.

**On a dataset**:

* View the composite scores in the experiments tab. You can also filter and sort experiments based on the average composite score of their runs.
* Click into an experiment to view the composite score for each run.

<Note> If any of the constituent evaluators are not configured on the run, the composite score will not be calculated for that run. </Note>

## Create composite feedback with the SDK

This guide describes setting up an evaluation that uses multiple evaluators and combines their scores with a custom aggregation function.

<Note> Requires `langsmith>=0.4.29`. </Note>

### 1. Configure evaluators on a dataset

Start by configuring your evaluators. In this example, the application generates a tweet from a blog introduction and uses three evaluators — summary, tone, and formatting — to assess the output.

If you already have your own dataset with evaluators configured, you can skip this step.

<Accordion title="Configure evaluators on a dataset.">
  ```python  theme={null}
  import os
  from dotenv import load_dotenv
  from openai import OpenAI
  from langsmith import Client
  from pydantic import BaseModel
  import json

  # Load environment variables from .env file
  load_dotenv()

  # Access environment variables
  openai_api_key = os.getenv('OPENAI_API_KEY')
  langsmith_api_key = os.getenv('LANGSMITH_API_KEY')
  langsmith_project = os.getenv('LANGSMITH_PROJECT', 'default')


  # Create a dataset. Only need to do this once.
  client = Client()
  oai_client = OpenAI()

  examples = [
    {
      "inputs": {"blog_intro": "Today we’re excited to announce the general availability of LangSmith — our purpose-built infrastructure and management layer for deploying and scaling long-running, stateful agents. Since our beta last June, nearly 400 companies have used LangSmith to deploy their agents into production. Agent deployment is the next hard hurdle for shipping reliable agents, and LangSmith dramatically lowers this barrier with: 1-click deployment to go live in minutes, 30 API endpoints for designing custom user experiences that fit any interaction pattern, Horizontal scaling to handle bursty, long-running traffic, A persistence layer to support memory, conversational history, and async collaboration with human-in-the-loop or multi-agent workflows, Native Studio, the agent IDE, for easy debugging, visibility, and iteration "},
    },
    {
      "inputs": {"blog_intro": "Klarna has reshaped global commerce with its consumer-centric, AI-powered payment and shopping solutions. With over 85 million active users and 2.5 million daily transactions on its platform, Klarna is a fintech leader that simplifies shopping while empowering consumers with smarter, more flexible financial solutions. Klarna’s flagship AI Assistant is revolutionizing the shopping and payments experience. Built on LangGraph and powered by LangSmith, the AI Assistant handles tasks ranging from customer payments, to refunds, to other payment escalations. With 2.5 million conversations to date, the AI Assistant is more than just a chatbot; it’s a transformative agent that performs the work equivalent of 700 full-time staff, delivering results quickly and improving company efficiency."},
    },
  ]

  dataset = client.create_dataset(dataset_name="Blog Intros")

  client.create_examples(
    dataset_id=dataset.id,
    examples=examples,
  )

  # Define a target function. In this case, we're using a simple function that generates a tweet from a blog intro.
  def generate_tweet(inputs: dict) -> dict:
      instructions = (
        "Given the blog introduction, please generate a catchy yet professional tweet that can be used to promote the blog post on social media. Summarize the key point of the blog post in the tweet. Use emojis in a tasteful manner."
      )
      messages = [
          {"role": "system", "content": instructions},
          {"role": "user", "content": inputs["blog_intro"]},
      ]
      result = oai_client.responses.create(
          input=messages, model="gpt-5-nano"
      )
      return {"tweet": result.output_text}

  # Define evaluators. In this case, we're using three evaluators: summary, formatting, and tone.
  def summary(inputs: dict, outputs: dict) -> bool:
      """Judge whether the tweet is a good summary of the blog intro."""
      instructions = "Given the following text and summary, determine if the summary is a good summary of the text."

      class Response(BaseModel):
          summary: bool

      msg = f"Question: {inputs['blog_intro']}\nAnswer: {outputs['tweet']}"
      response = oai_client.responses.parse(
          model="gpt-5-nano",
          input=[{"role": "system", "content": instructions,}, {"role": "user", "content": msg}],
          text_format=Response
      )

      parsed_response = json.loads(response.output_text)
      return parsed_response["summary"]

  def formatting(inputs: dict, outputs: dict) -> bool:
      """Judge whether the tweet is formatted for easy human readability."""
      instructions = "Given the following text, determine if it is formatted well so that a human can easily read it. Pay particular attention to spacing and punctuation."

      class Response(BaseModel):
          formatting: bool

      msg = f"{outputs['tweet']}"
      response = oai_client.responses.parse(
          model="gpt-5-nano",
          input=[{"role": "system", "content": instructions,}, {"role": "user", "content": msg}],
          text_format=Response
      )

      parsed_response = json.loads(response.output_text)
      return parsed_response["formatting"]

  def tone(inputs: dict, outputs: dict) -> bool:
      """Judge whether the tweet's tone is informative, friendly, and engaging."""
      instructions = "Given the following text, determine if the tweet is informative, yet friendly and engaging."

      class Response(BaseModel):
          tone: bool

      msg = f"{outputs['tweet']}"
      response = oai_client.responses.parse(
          model="gpt-5-nano",
          input=[{"role": "system", "content": instructions,}, {"role": "user", "content": msg}],
          text_format=Response
      )
      parsed_response = json.loads(response.output_text)
      return parsed_response["tone"]

  # Calling evaluate() with the dataset, target function, and evaluators.
  results = client.evaluate(
      generate_tweet,
      data=dataset.name,
      evaluators=[summary, tone, formatting],
      experiment_prefix="gpt-5-nano",
  )

  # Get the experiment name to be used in client.get_experiment_results() in the next section
  experiment_name = results.experiment_name
  ```
</Accordion>

### 2. Create composite feedback

Create composite feedback that aggregates the individual evaluator scores using your custom function. This example uses a weighted average of the individual evaluator scores.

<Accordion title="Create a composite feedback.">
  ```python  theme={null}
  from typing import Dict
  import math
  from langsmith import Client
  from dotenv import load_dotenv

  load_dotenv()

  # TODO: Replace with your experiment name. Can be found in UI or from the above client.evaluate() result
  YOUR_EXPERIMENT_NAME = "placeholder_experiment_name"

  # Set weights for the individual evaluator scores
  DEFAULT_WEIGHTS: Dict[str, float] = {
      "summary": 0.7,
      "tone": 0.2,
      "formatting": 0.1,
  }
  WEIGHTED_FEEDBACK_NAME = "weighted_summary"

  # Pull experiment results
  client = Client()
  results = client.get_experiment_results(
      name=YOUR_EXPERIMENT_NAME,
  )

  # Calculate weighted score for each run
  def calculate_weighted_score(feedback_stats: dict) -> float:
      if not feedback_stats:
          return float("nan")

      # Check if all required metrics are present and have data
      required_metrics = set(DEFAULT_WEIGHTS.keys())
      available_metrics = set(feedback_stats.keys())

      if not required_metrics.issubset(available_metrics):
          return float("nan")

      # Calculate weighted score
      total_score = 0.0
      for metric, weight in DEFAULT_WEIGHTS.items():
          metric_data = feedback_stats[metric]
          if metric_data.get("n", 0) > 0 and "avg" in metric_data:
              total_score += metric_data["avg"] * weight
          else:
              return float("nan")

      return total_score

  # Process each run and write feedback
  # Note that experiment results need to finish processing before this should be called.
  for example_with_runs in results["examples_with_runs"]:
      for run in example_with_runs.runs:
          if run.feedback_stats:
              score = calculate_weighted_score(run.feedback_stats)
              if not math.isnan(score):
                  client.create_feedback(
                      run_id=run.id,
                      key=WEIGHTED_FEEDBACK_NAME,
                      score=float(score)
                  )
  ```
</Accordion>
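
As a worked check of the weighting above, the following minimal sketch computes the composite score for a single run without calling LangSmith. The `weighted_score` helper and the sample scores are illustrative assumptions, not part of the LangSmith SDK; only the weights come from the example above.

```python
import math

# Weights copied from the example above.
WEIGHTS = {"summary": 0.7, "tone": 0.2, "formatting": 0.1}

def weighted_score(scores: dict[str, float]) -> float:
    """Return the weighted sum of evaluator scores, or NaN if any score is missing."""
    if not set(WEIGHTS).issubset(scores):
        return float("nan")
    return sum(scores[name] * weight for name, weight in WEIGHTS.items())

# Hypothetical run: the summary and formatting checks pass (1.0), the tone check fails (0.0).
score = weighted_score({"summary": 1.0, "tone": 0.0, "formatting": 1.0})
print(round(score, 2))
print(math.isnan(weighted_score({"summary": 1.0})))
```

Here the composite is 0.7 × 1 + 0.2 × 0 + 0.1 × 1 = 0.8, so a failed tone check costs the run 0.2 of its composite score.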

***



# Configurable headers
Source: https://docs.langchain.com/langsmith/configurable-headers



LangGraph allows runtime configuration to modify agent behavior and permissions dynamically. When using [LangSmith Deployment](/langsmith/deployment-quickstart), you can pass this configuration in the request body (`config`) or in specific request headers. This enables adjustments based on user identity or other request data.

For privacy, control which headers are passed to the runtime configuration via the `http.configurable_headers` section in your [`langgraph.json`](/langsmith/application-structure#configuration-file) file.

Here's how to customize the included and excluded headers:

```json  theme={null}
{
  "http": {
    "configurable_headers": {
      "includes": ["x-user-id", "x-organization-id", "my-prefix-*"],
      "excludes": ["authorization", "x-api-key"]
    }
  }
}
```

The `includes` and `excludes` lists accept exact header names or patterns using `*` to match any number of characters. For your security, no other regex patterns are supported.
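
The matching rules can be illustrated with a small sketch. This is not the Agent Server implementation: the function names `matches_pattern` and `is_header_forwarded` are hypothetical, and the sketch assumes case-insensitive comparison, that a header must match an `includes` pattern to be forwarded, and that a match in `excludes` always wins (the precedence rule stated on this page).

```python
import re

def matches_pattern(pattern: str, header: str) -> bool:
    """Match a header name against a pattern where only '*' is special."""
    # Escape every literal piece, then let each '*' match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, header, flags=re.IGNORECASE) is not None

def is_header_forwarded(header: str, includes: list[str], excludes: list[str]) -> bool:
    """Assumed rule: exclusions are checked first, then inclusions."""
    if any(matches_pattern(p, header) for p in excludes):
        return False
    return any(matches_pattern(p, header) for p in includes)

includes = ["x-user-id", "x-organization-id", "my-prefix-*"]
excludes = ["authorization", "x-api-key"]

for name in ["x-user-id", "my-prefix-tenant", "authorization", "x-request-id"]:
    print(name, is_header_forwarded(name, includes, excludes))
```

Translating the pattern by hand, rather than using a general glob or regex library, keeps `*` as the only special character, which mirrors the restriction described above.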

## Using within your graph

You can access the included headers in your graph using the `config` argument of any node.

```python  theme={null}
def my_node(state, config):
  organization_id = config["configurable"].get("x-organization-id")
  ...
```

Or by fetching from context (useful in tools or within other nested functions).

```python  theme={null}
from langgraph.config import get_config

def search_everything(query: str):
  organization_id = get_config()["configurable"].get("x-organization-id")
  ...
```

You can even use this to dynamically compile the graph.

```python  theme={null}
# my_graph.py.
import contextlib

@contextlib.asynccontextmanager
async def generate_agent(config):
  organization_id = config["configurable"].get("x-organization-id")
  if organization_id == "org1":
    graph = ...
    yield graph
  else:
    graph = ...
    yield graph

```

```json  theme={null}
{
  "graphs": {"agent": "my_graph.py:generate_agent"}
}
```

### Opt-out of configurable headers

If you'd like to opt out of configurable headers, you can simply set a wildcard pattern in the `excludes` list:

```json  theme={null}
{
  "http": {
    "configurable_headers": {
      "excludes": ["*"]
    }
  }
}
```

This will exclude all headers from being added to your run's configuration.

Note that exclusions take precedence over inclusions.

***



# Manage assistants
Source: https://docs.langchain.com/langsmith/configuration-cloud



In this guide we will show how to create, configure, and manage an [assistant](/langsmith/assistants).

First, as a brief refresher on the concept of context, consider the following simple `call_model` node and context schema.
Observe that this node tries to read and use the `model_name` as defined by the `context` object's `model_name` field.

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
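    from typing import TypedDict

    from langgraph.graph import StateGraph
    from langgraph.runtime import Runtime

    # AgentState and _get_model are assumed to be defined elsewhere in your application.
    class ContextSchema(TypedDict):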
        model_name: str

    builder = StateGraph(AgentState, context_schema=ContextSchema)

    def call_model(state, runtime: Runtime[ContextSchema]):
        messages = state["messages"]
        model = _get_model(runtime.context.get("model_name", "anthropic"))
        response = model.invoke(messages)
        # We return a list, because this will get added to the existing list
        return {"messages": [response]}
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    import { Annotation, StateGraph, type Runtime } from "@langchain/langgraph";

    const ContextSchema = Annotation.Root({
        model_name: Annotation<string>,
        system_prompt: Annotation<string>,
    });

    const builder = new StateGraph(AgentState, ContextSchema)

    async function callModel(state: State, runtime: Runtime<typeof ContextSchema.State>) {
      const messages = state.messages;
      const model = _getModel(runtime.context.model_name ?? "anthropic");
      const response = await model.invoke(messages);
      // We return a list, because this will get added to the existing list
      return { messages: [response] };
    }
    ```
  </Tab>
</Tabs>

For more information on configurations, [see here](/langsmith/configuration-cloud#configuration).

## Create an assistant

### LangGraph SDK

To create an assistant, use the [LangGraph SDK](/langsmith/sdk) `create` method. See the [Python](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.client.AssistantsClient.create) and [JS](https://reference.langchain.com/javascript/classes/_langchain_langgraph-sdk.client.AssistantsClient.html#create) SDK reference docs for more information.

This example uses the same context schema as above, and creates an assistant with `model_name` set to `openai`.

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    from langgraph_sdk import get_client

    client = get_client(url=<DEPLOYMENT_URL>)
    openai_assistant = await client.assistants.create(
        # "agent" is the name of a graph we deployed
        "agent", context={"model_name": "openai"}, name="Open AI Assistant"
    )

    print(openai_assistant)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    import { Client } from "@langchain/langgraph-sdk";

    const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
    const openAIAssistant = await client.assistants.create({
        graphId: 'agent',
        name: "Open AI Assistant",
        context: { "model_name": "openai" },
    });

    console.log(openAIAssistant);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
        --url <DEPLOYMENT_URL>/assistants \
        --header 'Content-Type: application/json' \
        --data '{"graph_id":"agent", "context":{"model_name":"openai"}, "name": "Open AI Assistant"}'
    ```
  </Tab>
</Tabs>

Output:

```
{
  "assistant_id": "62e209ca-9154-432a-b9e9-2d75c7a9219b",
  "graph_id": "agent",
  "name": "Open AI Assistant",
  "context": {
    "model_name": "openai"
  },
  "metadata": {},
  "created_at": "2024-08-31T03:09:10.230718+00:00",
  "updated_at": "2024-08-31T03:09:10.230718+00:00"
}
```

### LangSmith UI

You can also create assistants from the LangSmith UI.

Inside your deployment, select the "Assistants" tab. This will load a table of all of the assistants in your deployment, across all graphs.

To create a new assistant, select the "+ New assistant" button. This will open a form where you can specify the graph this assistant is for, as well as provide a name, description, and the desired configuration for the assistant based on the configuration schema for that graph.

To confirm, click "Create assistant". This will take you to [Studio](/langsmith/studio) where you can test the assistant. If you go back to the "Assistants" tab in the deployment, you will see the newly created assistant in the table.

## Use an assistant

### LangGraph SDK

We have now created an assistant called "Open AI Assistant" that has `model_name` defined as `openai`. We can now use this assistant with this configuration:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    thread = await client.threads.create()
    input = {"messages": [{"role": "user", "content": "who made you?"}]}
    async for event in client.runs.stream(
        thread["thread_id"],
        # this is where we specify the assistant id to use
        openai_assistant["assistant_id"],
        input=input,
        stream_mode="updates",
    ):
        print(f"Receiving event of type: {event.event}")
        print(event.data)
        print("\n\n")
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    const thread = await client.threads.create();
    const input = { "messages": [{ "role": "user", "content": "who made you?" }] };

    const streamResponse = client.runs.stream(
      thread["thread_id"],
      // this is where we specify the assistant id to use
      openAIAssistant["assistant_id"],
      {
        input,
        streamMode: "updates"
      }
    );

    for await (const event of streamResponse) {
      console.log(`Receiving event of type: ${event.event}`);
      console.log(event.data);
      console.log("\n\n");
    }
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    thread_id=$(curl --request POST \
        --url <DEPLOYMENT_URL>/threads \
        --header 'Content-Type: application/json' \
        --data '{}' | jq -r '.thread_id') && \
    curl --request POST \
        --url "<DEPLOYMENT_URL>/threads/${thread_id}/runs/stream" \
        --header 'Content-Type: application/json' \
        --data '{
            "assistant_id": <OPENAI_ASSISTANT_ID>,
            "input": {
                "messages": [
                    {
                        "role": "user",
                        "content": "who made you?"
                    }
                ]
            },
            "stream_mode": [
                "updates"
            ]
        }' | \
        sed 's/\r$//' | \
        awk '
        /^event:/ {
            if (data_content != "") {
                print data_content "\n"
            }
            sub(/^event: /, "Receiving event of type: ", $0)
            printf "%s...\n", $0
            data_content = ""
        }
        /^data:/ {
            sub(/^data: /, "", $0)
            data_content = $0
        }
        END {
            if (data_content != "") {
                print data_content "\n\n"
            }
        }
    '
    ```
  </Tab>
</Tabs>

Output:

```
Receiving event of type: metadata
{'run_id': '1ef6746e-5893-67b1-978a-0f1cd4060e16'}



Receiving event of type: updates
{'agent': {'messages': [{'content': 'I was created by OpenAI, a research organization focused on developing and advancing artificial intelligence technology.', 'additional_kwargs': {}, 'response_metadata': {'finish_reason': 'stop', 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_157b3831f5'}, 'type': 'ai', 'name': None, 'id': 'run-e1a6b25c-8416-41f2-9981-f9cfe043f414', 'example': False, 'tool_calls': [], 'invalid_tool_calls': [], 'usage_metadata': None}]}}
```

### LangSmith UI

Inside your deployment, select the "Assistants" tab. For the assistant you would like to use, click the **Studio** button. This will open Studio with the selected assistant. When you submit an input (either in Graph or Chat mode), the selected assistant and its configuration will be used.

## Create a new version for your assistant

### LangGraph SDK

To edit the assistant, use the `update` method. This will create a new version of the assistant with the provided edits. See the [Python](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.client.AssistantsClient.update) and [JS](https://reference.langchain.com/javascript/classes/_langchain_langgraph-sdk.client.AssistantsClient.html#update) SDK reference docs for more information.

<Note>
  **Note**
  You must pass in the ENTIRE context (and metadata if you are using it). The update endpoint creates new versions completely from scratch and does not rely on previous versions.
</Note>

For example, to update your assistant's system prompt:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    openai_assistant_v2 = await client.assistants.update(
        openai_assistant["assistant_id"],
        context={
              "model_name": "openai",
              "system_prompt": "You are an unhelpful assistant!",
        },
    )
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    const openaiAssistantV2 = await client.assistants.update(
        openAIAssistant["assistant_id"],
        {
            context: {
                model_name: 'openai',
                system_prompt: 'You are an unhelpful assistant!',
            },
        },
    );
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request PATCH \
    --url <DEPLOYMENT_URL>/assistants/<ASSISTANT_ID> \
    --header 'Content-Type: application/json' \
    --data '{
    "context": {"model_name": "openai", "system_prompt": "You are an unhelpful assistant!"}
    }'
    ```
  </Tab>
</Tabs>

This will create a new version of the assistant with the updated parameters and set this as the active version of your assistant. If you now run your graph and pass in this assistant id, it will use this latest version.
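
Because an update replaces the whole context, one way to change a single field is to merge the assistant's current context with your changes before calling `update`. The sketch below shows only the merge step; `build_full_context` is a hypothetical helper, and the commented usage assumes the assistant returned by the SDK exposes its context under a `context` key, as in the output shown earlier.

```python
def build_full_context(current: dict, changes: dict) -> dict:
    """Return a new context holding every existing key plus the requested changes."""
    # A new dict is built so the original context is left untouched.
    return {**current, **changes}

current_context = {"model_name": "openai"}
new_context = build_full_context(
    current_context, {"system_prompt": "You are an unhelpful assistant!"}
)
print(new_context)

# Hypothetical usage with the SDK client from the examples above:
# assistant = await client.assistants.get(assistant_id)
# full = build_full_context(assistant.get("context") or {}, {"system_prompt": "..."})
# await client.assistants.update(assistant_id, context=full)
```

Sending only the changed key would produce a version whose context lacks `model_name`, which is the situation the note above warns about.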

### LangSmith UI

You can also edit assistants from the LangSmith UI.

Inside your deployment, select the "Assistants" tab. This will load a table of all of the assistants in your deployment, across all graphs.

To edit an existing assistant, select the "Edit" button for the specified assistant. This will open a form where you can edit the assistant's name, description, and configuration.

Additionally, if using Studio, you can edit the assistants and create new versions via the "Manage Assistants" button.

## Use a previous assistant version

### LangGraph SDK

You can also change the active version of your assistant. To do so, use the `set_latest` method (`setLatest` in the JS SDK).

In the example above, to roll back to the first version of the assistant:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    await client.assistants.set_latest(openai_assistant['assistant_id'], 1)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    await client.assistants.setLatest(openAIAssistant['assistant_id'], 1);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/assistants/<ASSISTANT_ID>/latest \
    --header 'Content-Type: application/json' \
    --data '{
    "version": 1
    }'
    ```
  </Tab>
</Tabs>

If you now run your graph and pass in this assistant id, it will use the first version of the assistant.

### LangSmith UI

If using Studio, to set the active version of your assistant, click the "Manage Assistants" button and locate the assistant you would like to use. Select the assistant and the version, and then click the "Active" toggle. This will update the assistant to make the selected version active.

<Warning>
  **Deleting Assistants**
  Deleting an assistant will delete ALL of its versions. There is currently no way to delete a single version, but by pointing your assistant to the correct version you can skip any versions that you don't wish to use.
</Warning>

***



# How to add TTLs to your application
Source: https://docs.langchain.com/langsmith/configure-ttl



<Tip>
  **Prerequisites**
  This guide assumes familiarity with [LangSmith](/langsmith/home), [Persistence](/oss/python/langgraph/persistence), and [Cross-thread persistence](/oss/python/langgraph/persistence#memory-store) concepts.
</Tip>

LangSmith persists both [checkpoints](/oss/python/langgraph/persistence#checkpoints) (thread state) and [cross-thread memories](/oss/python/langgraph/persistence#memory-store) (store items). Configure Time-to-Live (TTL) policies in `langgraph.json` to automatically manage the lifecycle of this data, preventing indefinite accumulation.

## Configuring checkpoint TTL

Checkpoints capture the state of conversation threads. Setting a TTL ensures old checkpoints and threads are automatically deleted.

Add a `checkpointer.ttl` configuration to your `langgraph.json` file:

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent.py:graph"
  },
  "checkpointer": {
    "ttl": {
      "strategy": "delete",
      "sweep_interval_minutes": 60,
      "default_ttl": 43200
    }
  }
}
```

* `strategy`: Specifies the action taken on expiration. Currently, only `"delete"` is supported, which deletes all checkpoints in the thread upon expiration.
* `sweep_interval_minutes`: Defines how often, in minutes, the system checks for expired checkpoints.
* `default_ttl`: Sets the default lifespan of threads (and corresponding checkpoints) in minutes (e.g., 43200 minutes = 30 days). Applies only to checkpoints created after this configuration is deployed; existing checkpoints/threads are not changed. To clear older data, delete it explicitly.
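
Since `default_ttl` is expressed in minutes, it can help to derive the value from a number of days rather than hard-coding it. The helper below is a hypothetical convenience for preparing `langgraph.json` values and is not part of any LangGraph or LangSmith API.

```python
MINUTES_PER_DAY = 24 * 60

def days_to_ttl_minutes(days: float) -> int:
    """Convert a retention period in days to the minutes expected by `default_ttl`."""
    return int(days * MINUTES_PER_DAY)

print(days_to_ttl_minutes(30))  # 43200, the checkpoint TTL used above
print(days_to_ttl_minutes(7))   # 10080, one week
```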

## Configuring store item TTL

Store items allow cross-thread data persistence. Configuring TTL for store items helps manage memory by removing stale data.

Add a `store.ttl` configuration to your `langgraph.json` file:

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent.py:graph"
  },
  "store": {
    "ttl": {
      "refresh_on_read": true,
      "sweep_interval_minutes": 120,
      "default_ttl": 10080
    }
  }
}
```

* `refresh_on_read`: (Optional, default `true`) If `true`, accessing an item via `get` or `search` resets its expiration timer. If `false`, TTL only refreshes on `put`.
* `sweep_interval_minutes`: (Optional) Defines how often, in minutes, the system checks for expired items. If omitted, no sweeping occurs.
* `default_ttl`: (Optional) Sets the default lifespan of store items in minutes (e.g., 10080 minutes = 7 days). Applies only to items created after this configuration is deployed; existing items are not changed. If you need to clear older items, delete them manually. If omitted, items do not expire by default.
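
The effect of `refresh_on_read` can be seen in a toy in-memory model. This is only a sketch of the behavior described above, not the LangGraph store: `ToyTtlStore` and its methods are hypothetical, time is passed in explicitly as minutes, and the sketch assumes an expired item is treated as missing even before a sweep removes it.

```python
from dataclasses import dataclass, field

@dataclass
class ToyTtlStore:
    default_ttl: float                 # lifespan in minutes
    refresh_on_read: bool = True
    items: dict = field(default_factory=dict)  # key -> (value, expires_at)

    def put(self, key, value, now):
        # Writing always starts a fresh expiration timer.
        self.items[key] = (value, now + self.default_ttl)

    def get(self, key, now):
        entry = self.items.get(key)
        if entry is None or entry[1] <= now:
            return None
        value, _ = entry
        if self.refresh_on_read:
            # Reading resets the timer only when refresh_on_read is enabled.
            self.items[key] = (value, now + self.default_ttl)
        return value

    def sweep(self, now):
        # Periodic cleanup: delete everything whose timer has run out.
        expired = [k for k, (_, expires_at) in self.items.items() if expires_at <= now]
        for k in expired:
            del self.items[k]
        return len(expired)

refreshing = ToyTtlStore(default_ttl=10080)
refreshing.put("profile", {"lang": "en"}, now=0)
refreshing.get("profile", now=10000)          # read shortly before expiry
print(refreshing.get("profile", now=20000))   # still present: the read reset the timer

fixed = ToyTtlStore(default_ttl=10080, refresh_on_read=False)
fixed.put("profile", {"lang": "en"}, now=0)
fixed.get("profile", now=10000)
print(fixed.get("profile", now=10081))        # expired: reads did not extend the TTL
```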

## Combining TTL configurations

You can configure TTLs for both checkpoints and store items in the same `langgraph.json` file to set different policies for each data type. Here is an example:

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./agent.py:graph"
  },
  "checkpointer": {
    "ttl": {
      "strategy": "delete",
      "sweep_interval_minutes": 60,
      "default_ttl": 43200
    }
  },
  "store": {
    "ttl": {
      "refresh_on_read": true,
      "sweep_interval_minutes": 120,
      "default_ttl": 10080
    }
  }
}
```

## Configure per-thread TTL

You can apply [TTL configurations per-thread](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.client.ThreadsClient.create).

```python  theme={null}
thread = await client.threads.create(
    ttl={
        "strategy": "delete",
        "ttl": 43200  # 30 days in minutes
    }
)
```

## Runtime overrides

The default `store.ttl` settings from `langgraph.json` can be overridden at runtime by providing specific TTL values in SDK method calls like `get`, `put`, and `search`.

## Deployment process

After configuring TTLs in `langgraph.json`, deploy or restart your LangGraph application for the changes to take effect. Use `langgraph dev` for local development or `langgraph up` for Docker deployment.

See the [langgraph.json CLI reference](https://langchain-ai.github.io/langgraph/reference/configuration/#configuration-file) for more details on the other configurable options.

***



# LangSmith control plane
Source: https://docs.langchain.com/langsmith/control-plane



The *control plane* is the part of LangSmith that manages deployments. It includes the control plane UI, where users create and update [Agent Servers](/langsmith/agent-server), and the control plane APIs, which support the UI and provide programmatic access.

When you make an update through the control plane, the update is stored in control plane state. The [data plane](/langsmith/data-plane) “listener” polls for these updates by calling the control plane APIs.

## Control plane UI

From the control plane UI, you can:

* View a list of outstanding deployments.
* View details of an individual deployment.
* Create a new deployment.
* Update a deployment.
* Update environment variables for a deployment.
* View build and server logs of a deployment.
* View deployment metrics such as CPU and memory usage.
* Delete a deployment.

The control plane UI is embedded in [LangSmith](https://docs.smith.langchain.com).

## Control plane API

This section describes the data model of the control plane API. The API is used to create, update, and delete deployments. See the [control plane API reference](/langsmith/api-ref-control-plane) for more details.

### Integrations

An integration is an abstraction for a `git` repository provider (e.g. GitHub). It contains all of the required metadata needed to connect with and deploy from a `git` repository.

### Deployments

A deployment is an instance of an Agent Server. A single deployment can have many revisions.

### Revisions

A revision is an iteration of a deployment. When a new deployment is created, an initial revision is automatically created. To deploy code changes or update secrets for a deployment, a new revision must be created.
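
The relationship between deployments and revisions can be sketched as a small data model. The class and field names below are hypothetical and do not reflect the actual control plane API schema; the sketch only encodes the two rules stated above: creating a deployment creates an initial revision, and every later change is a new revision.

```python
from dataclasses import dataclass, field

@dataclass
class Revision:
    number: int
    note: str

@dataclass
class Deployment:
    name: str
    revisions: list[Revision] = field(default_factory=list)

    def __post_init__(self):
        # A new deployment always starts with an automatically created revision.
        if not self.revisions:
            self.revisions.append(Revision(1, "initial revision"))

    def create_revision(self, note: str) -> Revision:
        # Code changes and secret updates are modeled as additional revisions.
        revision = Revision(len(self.revisions) + 1, note)
        self.revisions.append(revision)
        return revision

deployment = Deployment("my-agent")
deployment.create_revision("update secrets")
print([(r.number, r.note) for r in deployment.revisions])
```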

### Listeners

A listener is an instance of a ["listener" application](/langsmith/data-plane#”listener”-application). A listener contains metadata about the application (e.g. version) and metadata about the compute infrastructure it can deploy to (e.g. Kubernetes namespaces).

The listener data model only applies for [Hybrid](/langsmith/hybrid) and [Self-Hosted](/langsmith/self-hosted) deployments.

## Control plane features

This section describes various features of the control plane.

### Deployment types

For simplicity, the control plane offers two deployment types with different resource allocations: `Development` and `Production`.

| **Deployment Type** | **CPU/Memory**  | **Scaling**       | **Database**                                                                     |
| ------------------- | --------------- | ----------------- | -------------------------------------------------------------------------------- |
| Development         | 1 CPU, 1 GB RAM | Up to 1 replica   | 10 GB disk, no backups                                                           |
| Production          | 2 CPU, 2 GB RAM | Up to 10 replicas | Autoscaling disk, automatic backups, highly available (multi-zone configuration) |

CPU and memory resources are per replica.
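
Because the figures are per replica, the upper bound for a deployment is the per-replica allocation multiplied by the maximum replica count. The following sketch works this out from the table; the dictionary layout and the `max_capacity` function are illustrative assumptions, and the numbers are the default values listed above.

```python
# Per-replica resources and replica limits, copied from the table above.
DEPLOYMENT_TYPES = {
    "Development": {"cpu": 1, "memory_gb": 1, "max_replicas": 1},
    "Production": {"cpu": 2, "memory_gb": 2, "max_replicas": 10},
}

def max_capacity(deployment_type: str) -> dict:
    """Return total CPU and memory when a deployment is scaled to its replica limit."""
    spec = DEPLOYMENT_TYPES[deployment_type]
    return {
        "cpu": spec["cpu"] * spec["max_replicas"],
        "memory_gb": spec["memory_gb"] * spec["max_replicas"],
    }

print(max_capacity("Development"))  # {'cpu': 1, 'memory_gb': 1}
print(max_capacity("Production"))   # {'cpu': 20, 'memory_gb': 20}
```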

<Warning>
  **Immutable Deployment Type**
  Once a deployment is created, the deployment type cannot be changed.
</Warning>

<Info>
  **Self-Hosted Deployment**
  Resources for [Hybrid](/langsmith/hybrid) and [Self-Hosted](/langsmith/self-hosted) deployments can be fully customized. Deployment types are only applicable for [Cloud](/langsmith/cloud) deployments.
</Info>

#### Production

`Production` type deployments are suitable for "production" workloads. For example, select `Production` for customer-facing applications in the critical path.

Resources for `Production` type deployments can be manually increased on a case-by-case basis depending on use case and capacity constraints. Contact [support@langchain.dev](mailto:support@langchain.dev) to request an increase in resources.

#### Development

`Development` type deployments are suitable for development and testing. For example, select `Development` for internal testing environments. `Development` type deployments are not suitable for "production" workloads.

<Danger>
  **Preemptible Compute Infrastructure**
  `Development` type deployments (API server, queue server, and database) are provisioned on preemptible compute infrastructure. This means the compute infrastructure **may be terminated at any time without notice**. This may result in intermittent...

  * Redis connection timeouts/errors
  * Postgres connection timeouts/errors
  * Failed or retrying background runs

  This behavior is expected. Preemptible compute infrastructure **significantly reduces the cost to provision a `Development` type deployment**. By design, Agent Server is fault-tolerant. The implementation will automatically attempt to recover from Redis/Postgres connection errors and retry failed background runs.

  `Production` type deployments are provisioned on durable compute infrastructure, not preemptible compute infrastructure.
</Danger>

Database disk size for `Development` type deployments can be manually increased on a case-by-case basis depending on use case and capacity constraints. For most use cases, [TTLs](/langsmith/configure-ttl) should be configured to manage disk usage. Contact [support@langchain.dev](mailto:support@langchain.dev) to request an increase in resources.

### Database provisioning

The control plane and [data plane](/langsmith/data-plane) "listener" application coordinate to automatically create a Postgres database for each deployment. The database serves as the [persistence layer](/oss/python/langgraph/persistence#memory-store) for the deployment.

When implementing a LangGraph application, a [checkpointer](/oss/python/langgraph/persistence#checkpointer-libraries) does not need to be configured by the developer. Instead, a checkpointer is automatically configured for the graph. Any checkpointer configured for a graph will be replaced by the one that is automatically configured.

There is no direct access to the database. All access to the database occurs through the [Agent Server](/langsmith/agent-server).

The database is never deleted until the deployment itself is deleted.

<Info>
  A custom Postgres instance can be configured for [Hybrid](/langsmith/hybrid) and [Self-Hosted](/langsmith/self-hosted) deployments.
</Info>

### Asynchronous deployment

Infrastructure for deployments and revisions is provisioned and deployed asynchronously. It is not deployed immediately after submission. Currently, deployment can take up to several minutes.

* When a new deployment is created, a new database is created for the deployment. Database creation is a one-time step. This step contributes to a longer deployment time for the initial revision of the deployment.
* When a subsequent revision is created for a deployment, there is no database creation step. The deployment time for a subsequent revision is significantly faster compared to the deployment time of the initial revision.
* The deployment process for each revision contains a build step, which can take up to a few minutes.

The control plane and [data plane](/langsmith/data-plane) "listener" application coordinate to achieve asynchronous deployments.

### Monitoring

After a deployment is ready, the control plane monitors the deployment and records various metrics, such as:

* CPU and memory usage of the deployment.
* Number of container restarts.
* Number of replicas (this will increase with [autoscaling](/langsmith/data-plane#autoscaling)).
* [PostgreSQL](/langsmith/data-plane#postgres) CPU, memory usage, and disk usage.
* [Agent Server queue](/langsmith/agent-server#persistence-and-task-queue) pending/active run count.
* [Agent Server API](/langsmith/agent-server) success response count, error response count, and latency.

These metrics are displayed as charts in the Control Plane UI.

### LangSmith integration

A [LangSmith](/langsmith/home) tracing project is automatically created for each deployment. The tracing project has the same name as the deployment. When creating a deployment, the `LANGCHAIN_TRACING` and `LANGSMITH_API_KEY`/`LANGCHAIN_API_KEY` environment variables do not need to be specified; they are set automatically by the control plane.

When a deployment is deleted, the traces and the tracing project are not deleted.



# Cost tracking
Source: https://docs.langchain.com/langsmith/cost-tracking



Building agents at scale introduces non-trivial, usage-based costs that can be difficult to track. LangSmith automatically records LLM token usage and costs for major providers, and also allows you to submit custom cost data for any additional components.

This gives you a single, unified view of costs across your entire application, which makes it easy to monitor, understand, and debug your spend.

This guide covers:

* [Viewing costs in the LangSmith UI](#viewing-costs-in-the-langsmith-ui)
* [How cost tracking works](#cost-tracking)
* [How to send custom cost data](#send-custom-cost-data)

## Viewing costs in the LangSmith UI

In the [LangSmith UI](https://smith.langchain.com), you can explore usage and spend in three main ways: first by understanding how tokens and costs are broken down, then by viewing those details within individual traces, and finally by inspecting aggregated metrics in project stats and dashboards.

### Token and cost breakdowns

Token usage and costs are broken down into three categories:

* **Input**: Tokens in the prompt sent to the model. Subtypes include cache reads, text tokens, image tokens, etc.
* **Output**: Tokens generated in the response from the model. Subtypes include reasoning tokens, text tokens, image tokens, etc.
* **Other**: Costs from tool calls, retrieval steps, or any custom runs.

You can view detailed breakdowns by hovering over cost sections in the UI. When available, each section is further categorized by subtype.

<img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=49971715854df465e81e53ad6b7b297c" alt="Cost tooltip" data-og-width="894" width="894" data-og-height="400" height="400" data-path="langsmith/images/cost-tooltip-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?w=280&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=0eefe6caadcf4d9a7a93c6c378122476 280w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?w=560&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=24a18c4afc2274abd598238598dfdf7d 560w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?w=840&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=fb04f0d82dbdb3e26a3fd58b4bcdc895 840w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?w=1100&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=6740a97c3545dc0df28415d2d7c67f6e 1100w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?w=1650&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=e7a8c02294dc5dbf08461118e820af11 1650w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-light.png?w=2500&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=d83617c271e7b701794589caa5964ba4 2500w" />

<img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=a51c9bc7bbd1836231b80d7d5a8db735" alt="Cost tooltip" data-og-width="900" width="900" data-og-height="394" height="394" data-path="langsmith/images/cost-tooltip-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?w=280&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=55e6e557896671cf177be070b53853ca 280w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?w=560&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=5aacd2afe8bb68d48f1b8718b04b337e 560w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?w=840&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=b18555fd40e07821742940bcf23776f4 840w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?w=1100&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=f5ba1370e3dd595cfc0af949ffe454f4 1100w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?w=1650&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=f25ef24cc27cc01c3da8e76f402aeb12 1650w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tooltip-dark.png?w=2500&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=0a1d6251fbab47105ca63b4dfe6ef809 2500w" />

You can inspect these breakdowns in several places in the LangSmith UI, as described in the following section.

### Where to view token and cost breakdowns

<AccordionGroup>
  <Accordion title="In the trace tree">
    The trace tree shows the most detailed view of token usage and cost for a single trace. It displays the total usage for the entire trace, aggregated values for each parent run, and token and cost breakdowns for each child run.

    Open any run inside a tracing project to view its trace tree.

    <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=a25bf30084d96292ba00ca84c07653d6" alt="Cost tooltip" data-og-width="2062" width="2062" data-og-height="1530" height="1530" data-path="langsmith/images/trace-tree-costs-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?w=280&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=e8a79cea1a5bb04adcbf1ee0e62533e7 280w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?w=560&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=3af7a8b874fcd58778412d260f1ab586 560w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?w=840&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=4697febea4d4ece0924f34dc87ddba8f 840w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?w=1100&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=5b306c9a32e9ce77bc2f92eaac315c2e 1100w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?w=1650&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=d5145faa36dd98b6f0442cb0bfaa4fa7 1650w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-light.png?w=2500&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=238b220a634d50adcf0a2754c167cee6 2500w" />

    <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=e2037cd8309e754f8753278d334c8344" alt="Cost tooltip" data-og-width="2052" width="2052" data-og-height="1490" height="1490" data-path="langsmith/images/trace-tree-costs-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?w=280&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=9f273466040d5f3178a1f32903b23578 280w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?w=560&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=4848270e7454f48070aa7a3585d9cafa 560w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?w=840&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=da0f8036b96015eb75b8721dc0c10425 840w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?w=1100&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=4ee69aee2b0dfae76923f09528a10977 1100w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?w=1650&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=0fc6303cc3c060aa35db962d6cfcb211 1650w, https://mintcdn.com/langchain-5e9cc07a/GpRpLUps9-PFSAXx/langsmith/images/trace-tree-costs-dark.png?w=2500&fit=max&auto=format&n=GpRpLUps9-PFSAXx&q=85&s=3e9abf61bc4da61f930fd923bec0bcfb 2500w" />
  </Accordion>

  <Accordion title="In project stats">
    The project stats panel shows the total token usage and cost for all traces in a project.

    <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=c9168cc335b0d9ccdde0ebe6ab1abd91" alt="Cost tracking chart" data-og-width="1257" width="1257" data-og-height="544" height="544" data-path="langsmith/images/stats-pane-cost-tracking-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?w=280&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=72360ffba7901bae6a32dccffbc8098a 280w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?w=560&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=9876ac6aa567436835c169cd416320ce 560w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?w=840&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=fb2ac554cae4b7634d73872df0be735d 840w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?w=1100&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=44027849f87a5644bc0c791be2c21ffe 1100w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?w=1650&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=e9a7ea651c412dd20cea34c7b91e15e4 1650w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-light.png?w=2500&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=b35b55da99133863bb1bf80c57b15fc7 2500w" />

    <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=e0be66ec244c134421af0475f83c3b1d" alt="Cost tracking chart" data-og-width="1253" width="1253" data-og-height="546" height="546" data-path="langsmith/images/stats-pane-cost-tracking-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?w=280&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=a87ee7f1da026cd212eadda5616c9e76 280w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?w=560&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=506dae4c2035c37fd846b4b96575130e 560w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?w=840&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=0883314216050e125bb733953b506a4e 840w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?w=1100&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=1781f3039d06b932a126d403b5060f99 1100w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?w=1650&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=6a2ebfeeb609516d33eec4affe08af2d 1650w, https://mintcdn.com/langchain-5e9cc07a/yIWcej3jR6iH0nDR/langsmith/images/stats-pane-cost-tracking-dark.png?w=2500&fit=max&auto=format&n=yIWcej3jR6iH0nDR&q=85&s=f9490ee9b5dc79060f6ef8cb072a2c73 2500w" />
  </Accordion>

  <Accordion title="In dashboards">
    Dashboards help you explore cost and token usage trends over time. The [prebuilt dashboard](/langsmith/dashboards/#prebuilt-dashboards) for a tracing project shows total costs and a cost breakdown by input and output tokens.

    You may also configure custom cost tracking charts in [custom dashboards](https://docs.langchain.com/langsmith/dashboards#custom-dashboards).

    <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=18b74d9ee26db0fe17877b3dc3c2c120" alt="Cost tracking chart" data-og-width="1206" width="1206" data-og-height="866" height="866" data-path="langsmith/images/cost-tracking-chart-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?w=280&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=1f8119ffc4dbe3647884e83b9e600b2a 280w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?w=560&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=35fc237732844199920fe48c16c06c68 560w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?w=840&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=2697b4965b63dfaebdff1604fb509abd 840w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?w=1100&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=19a096bc55c074465060e6d7f1c0a5b3 1100w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?w=1650&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=0d524ec464b630a04312873f3847887a 1650w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-light.png?w=2500&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=d27dcf301e56083dfefb5d1f1b06baa6 2500w" />

    <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=134115cab7e741a5b7f6d784f9d51b76" alt="Cost tracking chart" data-og-width="1202" width="1202" data-og-height="920" height="920" data-path="langsmith/images/cost-tracking-chart-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?w=280&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=ad9e895cdf72d959bce04ec03321a78f 280w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?w=560&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=e8c0d349174089891b2cd20d13be7d41 560w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?w=840&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=a0cc08bf1d4aaff36f11f906c6b19729 840w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?w=1100&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=268a4acd73f13045ff8de90733a50cde 1100w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?w=1650&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=21c6c9d062c67e9f6900ac3deb7825d2 1650w, https://mintcdn.com/langchain-5e9cc07a/S029Harmw-iSrSVw/langsmith/images/cost-tracking-chart-dark.png?w=2500&fit=max&auto=format&n=S029Harmw-iSrSVw&q=85&s=ac2d88cf00ccf7daf7d367433bbf6d62 2500w" />
  </Accordion>
</AccordionGroup>

## Cost tracking

You can track costs in two ways:

1. Costs for LLM calls can be **automatically derived from token counts and model prices**
2. Cost for LLM calls or any other run type can be **manually specified as part of the run data**

The approach you use depends on what you're tracking and how your model pricing is structured:

| Method            | Run type: LLM                                                                                                                                                                                                                                                                                                                                                                                                                                                    | Run type: Other                                                |
| ----------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------- |
| **Automatically** | <ul><li>Calling LLMs with [LangChain](/oss/python/langchain/overview)</li><li>Tracing LLM calls to OpenAI, Anthropic or models that follow an OpenAI-compliant format with `@traceable`</li><li> Using LangSmith wrappers for [OpenAI](/langsmith/trace-openai) or [Anthropic](/langsmith/trace-anthropic)</li><li>For other model providers, read the [token and cost information guide](/langsmith/log-llm-trace#provide-token-and-cost-information)</li></ul> | Not applicable.                                                |
| **Manually**      | If LLM call costs are non-linear (e.g., they follow a custom cost function) | Send costs for any run type, e.g., tool calls or retrieval steps |

### LLM calls: Automatically track costs based on token counts

To compute cost automatically from token usage, you need to provide **token counts**, the **model and provider** and the **model price**.

<Note>
  Follow the instructions below if you’re using a model provider whose responses don’t follow the same format as OpenAI or Anthropic.

  These steps are **only required** if you are *not*:

  * Calling LLMs with [LangChain](/oss/python/langchain/overview)
  * Using `@traceable` to trace LLM calls to OpenAI, Anthropic or models that follow an OpenAI-compliant format
  * Using LangSmith wrappers for [OpenAI](/langsmith/trace-openai) or [Anthropic](/langsmith/trace-anthropic).
</Note>

**1. Send token counts**

Many models include token counts as part of the response. You must extract this information and include it in your run using one of the following methods:

<Accordion title="A. Set a `usage_metadata` field on the run’s metadata">
  Set a `usage_metadata` field on the run's metadata. The advantage of this approach is that you do not need to change your traced function’s runtime outputs.

  <CodeGroup>
    ```python Python theme={null}
    from langsmith import traceable, get_current_run_tree

    inputs = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "I'd like to book a table for two."},
    ]

    @traceable(
        run_type="llm",
        metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
    )
    def chat_model(messages: list):
        # Imagine this is the real model output format your application expects
        assistant_message = {
            "role": "assistant",
            "content": "Sure, what time would you like to book the table for?"
        }

        # Token usage you compute or receive from the provider
        token_usage = {
            "input_tokens": 27,
            "output_tokens": 13,
            "total_tokens": 40,
            "input_token_details": {"cache_read": 10}
        }

        # Attach token usage to the LangSmith run
        run = get_current_run_tree()
        run.set(usage_metadata=token_usage)

        return assistant_message

    chat_model(inputs)
    ```

    ```typescript TypeScript theme={null}
    import { traceable, getCurrentRunTree } from "langsmith/traceable";

    const inputs = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "I'd like to book a table for two." },
    ];

    const chatModel = traceable(
      async ({ messages }: { messages: { role: string; content: string }[] }) => {
        // The output your application expects
        const assistantMessage = {
          role: "assistant",
          content: "Sure, what time would you like to book the table for?",
        };

        // Token usage you compute or receive from the provider
        const tokenUsage = {
          input_tokens: 27,
          output_tokens: 13,
          total_tokens: 40,
          input_token_details: { cache_read: 10 },
        };

        // Attach usage to the LangSmith run
        const runTree = getCurrentRunTree();
        runTree.metadata.usage_metadata = tokenUsage;

        return assistantMessage;
      },
      {
        run_type: "llm",
        name: "chat_model",
        metadata: {
          ls_provider: "my_provider",
          ls_model_name: "my_model",
        },
      }
    );

    await chatModel({ messages: inputs });
    ```
  </CodeGroup>
</Accordion>

<Accordion title="B. Return a `usage_metadata` field in your traced function's outputs.">
  Include the `usage_metadata` key directly within the object returned by your traced function. LangSmith will extract it from the output.

  <CodeGroup>
    ```python Python theme={null}
    from langsmith import traceable

    inputs = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "I'd like to book a table for two."},
    ]
    output = {
        "choices": [
            {
                "message": {
                    "role": "assistant",
                    "content": "Sure, what time would you like to book the table for?"
                }
            }
        ],
        "usage_metadata": {
            "input_tokens": 27,
            "output_tokens": 13,
            "total_tokens": 40,
            "input_token_details": {"cache_read": 10}
        },
    }

    @traceable(
        run_type="llm",
        metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
    )
    def chat_model(messages: list):
        return output

    chat_model(inputs)
    ```

    ```typescript TypeScript theme={null}
    import { traceable } from "langsmith/traceable";

    const messages = [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "I'd like to book a table for two." }
    ];
    const output = {
        choices: [
            {
                message: {
                    role: "assistant",
                    content: "Sure, what time would you like to book the table for?",
                },
            },
        ],
        usage_metadata: {
            input_tokens: 27,
            output_tokens: 13,
            total_tokens: 40,
        },
    };

    const chatModel = traceable(
        async ({
            messages,
        }: {
            messages: { role: string; content: string }[];
        }) => {
            return output;
        },
        {
            run_type: "llm",
            name: "chat_model",
            metadata: {
                ls_provider: "my_provider",
                ls_model_name: "my_model"
            }
        }
    );

    await chatModel({ messages });
    ```
  </CodeGroup>
</Accordion>
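
Providers often report usage under their own field names, so a small adapter can translate them into the `usage_metadata` shape used in both methods above. The following is a minimal sketch only: the provider payload keys (`prompt_tokens`, `completion_tokens`, `cached_tokens`) and the helper name `to_usage_metadata` are hypothetical and will differ by provider.

```python
from typing import Any


def to_usage_metadata(provider_usage: dict[str, Any]) -> dict[str, Any]:
    """Translate a hypothetical provider usage payload into usage_metadata."""
    # Hypothetical provider field names; adjust to your provider's response.
    input_tokens = int(provider_usage.get("prompt_tokens", 0))
    output_tokens = int(provider_usage.get("completion_tokens", 0))

    usage_metadata: dict[str, Any] = {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }

    # Only report a breakdown when the provider actually returned one.
    cached = provider_usage.get("cached_tokens")
    if cached:
        usage_metadata["input_token_details"] = {"cache_read": int(cached)}

    return usage_metadata


usage = to_usage_metadata(
    {"prompt_tokens": 27, "completion_tokens": 13, "cached_tokens": 10}
)
print(usage)
```

The resulting dict can be passed to `run.set(usage_metadata=...)` or returned under the `usage_metadata` key of the traced function's output.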

In either case, the usage metadata should contain a subset of the following LangSmith-recognized fields:

<Accordion title="Usage Metadata Schema and Cost Calculation">
  The following fields in the `usage_metadata` dict are recognized by LangSmith. You can view the full [Python types](https://github.com/langchain-ai/langsmith-sdk/blob/e705fbd362be69dd70229f94bc09651ef8056a61/python/langsmith/schemas.py#L1196-L1227) or [TypeScript interfaces](https://github.com/langchain-ai/langsmith-sdk/blob/e705fbd362be69dd70229f94bc09651ef8056a61/js/src/schemas.ts#L637-L689) directly.

  <ParamField path="input_tokens" type="number">
    Number of tokens used in the model input. Sum of all input token types.
  </ParamField>

  <ParamField path="output_tokens" type="number">
    Number of tokens used in the model response. Sum of all output token types.
  </ParamField>

  <ParamField path="total_tokens" type="number">
    Total number of tokens used in the input and output. Optional; can be inferred as the sum of `input_tokens` and `output_tokens`.
  </ParamField>

  <ParamField path="input_token_details" type="object">
    Breakdown of input token types. Keys are token-type strings, values are counts. Example: `{"cache_read": 5}`.

    Known fields include: `audio`, `text`, `image`, `cache_read`, `cache_creation`. Additional fields are possible depending on the model or provider.
  </ParamField>

  <ParamField path="output_token_details" type="object">
    Breakdown of output token types. Keys are token-type strings, values are counts. Example: `{"reasoning": 5}`.

    Known fields include: `audio`, `text`, `image`, `reasoning`. Additional fields are possible depending on the model or provider.
  </ParamField>

  <ParamField path="input_cost" type="number">
    Cost of the input tokens.
  </ParamField>

  <ParamField path="output_cost" type="number">
    Cost of the output tokens.
  </ParamField>

  <ParamField path="total_cost" type="number">
    Total cost of the tokens. Optional; can be inferred as the sum of `input_cost` and `output_cost`.
  </ParamField>

  <ParamField path="input_cost_details" type="object">
    Details of the input cost. Keys are token-type strings, values are cost amounts.
  </ParamField>

  <ParamField path="output_cost_details" type="object">
    Details of the output cost. Keys are token-type strings, values are cost amounts.
  </ParamField>

  **Cost Calculations**

  The cost for a run is computed greedily from most-to-least specific token type. Suppose you set a price of \$2 per 1M input tokens with a detailed price of \$1 per 1M `cache_read` input tokens, and \$3 per 1M output tokens. If you uploaded the following usage metadata:

  ```python  theme={null}
  {
    "input_tokens": 20,
    "input_token_details": {"cache_read": 5},
    "output_tokens": 10,
    "total_tokens": 30,
  }
  ```

  Then, the token costs would be computed as follows:

  ```python  theme={null}
  # LangSmith first prices the cache_read tokens at their detailed rate,
  # then applies the default input price to the remaining input tokens.
  input_cost = 5 * 1e-6 + (20 - 5) * 2e-6  # 3.5e-5
  output_cost = 10 * 3e-6  # 3e-5
  total_cost = input_cost + output_cost  # 6.5e-5
  ```
</Accordion>
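
The greedy calculation described in the accordion above can be sketched in plain Python. This is an illustration under stated assumptions, not LangSmith's implementation: the function name `estimate_cost` and its arguments are hypothetical, prices are expressed per token, and token types without a detailed price are assumed to fall back to the default price.

```python
import math


def estimate_cost(
    total_tokens: int,
    token_details: dict[str, int],
    default_price: float,
    detail_prices: dict[str, float],
) -> float:
    """Price detailed token types first, then the remainder at the default price."""
    cost = 0.0
    remaining = total_tokens
    for token_type, count in token_details.items():
        price = detail_prices.get(token_type)
        if price is None:
            # Assumption: no detailed price means the default price applies.
            continue
        cost += count * price
        remaining -= count
    # Whatever is left is billed at the default per-token price.
    cost += max(remaining, 0) * default_price
    return cost


# Prices from the example: $2 per 1M input, $1 per 1M cache_read, $3 per 1M output.
input_cost = estimate_cost(20, {"cache_read": 5}, 2e-6, {"cache_read": 1e-6})
output_cost = estimate_cost(10, {}, 3e-6, {})
total_cost = input_cost + output_cost

assert math.isclose(input_cost, 3.5e-5)
assert math.isclose(output_cost, 3e-5)
assert math.isclose(total_cost, 6.5e-5)
```

Keeping the default price as a fallback means a run that reports only `input_tokens` and `output_tokens`, with no detail breakdown, is still priced.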

**2. Specify model name**

When using a custom model, the following fields need to be specified in a [run's metadata](/langsmith/add-metadata-tags) in order to associate token counts with costs. It's also helpful to provide these metadata fields to identify the model when viewing traces and when filtering.

* `ls_provider`: The provider of the model, e.g., `"openai"` or `"anthropic"`.
* `ls_model_name`: The name of the model, e.g., `"gpt-4o-mini"` or `"claude-3-opus-20240229"`.

**3. Set model prices**

LangSmith uses a model pricing map to look up per-token prices by model name and compute costs from token counts. You can view this map in LangSmith's [model pricing table](https://smith.langchain.com/settings/workspaces/models).

<Note>
  The table comes with pricing information for most OpenAI, Anthropic, and Gemini models. You can [add prices for other models](/langsmith/cost-tracking#create-a-new-model-price-entry), or [overwrite pricing for default models](/langsmith/cost-tracking#update-an-existing-model-price-entry) if you have custom pricing.
</Note>

For models that have different pricing for different token types (e.g., multimodal or cached tokens), you can specify a breakdown of prices for each token type. Hovering over the `...` next to the input/output prices shows you the price breakdown by token type.

<img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=ae82f1ff59cfc57923d63869cb0608c0" alt="Model price map" data-og-width="1256" width="1256" data-og-height="494" height="494" data-path="langsmith/images/model-price-map-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?w=280&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=ed5d4889252f7a890b8b86705a07aa31 280w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?w=560&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=5a801e45f2f628bbb5565b240a1060a3 560w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?w=840&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=5c43622f4b5259a41b911c1a1a686d1e 840w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?w=1100&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=73bfe48bbbc86df27b3c2705eb3ec850 1100w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?w=1650&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=a37f360aa7b50e29e984daf69a969be5 1650w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-light.png?w=2500&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=81a6f4ce7a883d757bd35d80ed950de4 2500w" />

<img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=739bb0123e9a238944452048578a4c49" alt="Model price map" data-og-width="1265" width="1265" data-og-height="486" height="486" data-path="langsmith/images/model-price-map-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?w=280&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=3ef5d46c55b6d2a5a697bec31a908db4 280w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?w=560&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=92cdeb0a4009cd7fb8f942ba28242571 560w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?w=840&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=96100a58c8821063940522dd24758305 840w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?w=1100&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=6e1e6a811ff7a28cf316b712deb62cd9 1100w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?w=1650&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=61748722943b6fb76b5a45af10caabbc 1650w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/model-price-map-dark.png?w=2500&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=badfe7dda009e8c5ec675228331b3ed9 2500w" />

<Note>
  Updates to the model pricing map are not reflected in the costs for traces already logged. We do not currently support backfilling model pricing changes.
</Note>

<Accordion title="Create a new or modify an existing model price entry">
  To modify the default model prices, create a new entry with the same model, provider and match pattern as the default entry.

  To create a *new entry* in the model pricing map, click on the `+ Model` button in the top right corner.

  <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=63dbd6e59b279a1f4ae692c892223af9" alt="New price map entry interface" data-og-width="467" width="467" data-og-height="854" height="854" data-path="langsmith/images/new-price-map-entry-light.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?w=280&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=0553163c010622eeb61af856cc6c41c4 280w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?w=560&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=95cb41aa3695ea32e701f6a620cb778e 560w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?w=840&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=37c244bfcfbf969a717dac9b7a01c58f 840w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?w=1100&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=ca82285f5323216bb59b1be9b6cf1a2e 1100w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?w=1650&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=a92a5257db7e1323b3301e5ed1aef7b0 1650w, https://mintcdn.com/langchain-5e9cc07a/PYCacG42leg3Zt_8/langsmith/images/new-price-map-entry-light.png?w=2500&fit=max&auto=format&n=PYCacG42leg3Zt_8&q=85&s=537332d988ad5b674bf5e5bd1f5584cc 2500w" />

  <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=2df87e349db00b8560f3d44824f2df13" alt="New price map entry interface" data-og-width="958" width="958" data-og-height="1762" height="1762" data-path="langsmith/images/new-price-map-entry.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=4c49d72012424c80dc831b7f19125206 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=ce3f75b14759e43c35723726933177a8 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=2fb82f3bc83ee9dcecfad5243edb0844 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=57aebbd5b56a0ccbf91a664f7e930bef 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=fbd6624fff7cf22139e022491e3188c3 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/new-price-map-entry.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=9ced4164e44f666c9e3dba0a9be9a188 2500w" />

  Here, you can specify the following fields:

  * **Model Name**: The human-readable name of the model.
  * **Input Price**: The cost per 1M input tokens for the model. The prompt cost is the number of prompt tokens, divided by 1M, multiplied by this price.
  * **Input Price Breakdown** (Optional): The breakdown of price for each different type of input token, e.g. `cache_read`, `video`, `audio`.
  * **Output Price**: The cost per 1M output tokens for the model. The completion cost is the number of completion tokens, divided by 1M, multiplied by this price.
  * **Output Price Breakdown** (Optional): The breakdown of price for each different type of output token, e.g. `reasoning`, `image`.
  * **Model Activation Date** (Optional): The date from which the pricing is applicable. Only runs after this date use this model price.
  * **Match Pattern**: A regex pattern to match the model name. This is used to match the value for `ls_model_name` in the run metadata.
  * **Provider** (Optional): The provider of the model. If specified, this is matched against `ls_provider` in the run metadata.

  Once you have set up the model pricing map, LangSmith will automatically calculate and aggregate the token-based costs for traces based on the token counts provided in the LLM invocations.
</Accordion>
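
As a rough illustration of how the fields above fit together, the following sketch matches a run's `ls_model_name` against an entry's match pattern and applies per-1M-token prices. The `PriceEntry` class, the `estimate_cost` helper, and the example prices are hypothetical and are not LangSmith's implementation. Whether LangSmith searches for the pattern anywhere in the name or anchors it is also an assumption here.

```python
import re
from dataclasses import dataclass
from typing import Optional

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per 1M tokens


@dataclass
class PriceEntry:
    """Hypothetical stand-in for one row of the model pricing map."""
    match_pattern: str              # regex tested against ls_model_name
    input_price: float              # dollars per 1M input tokens
    output_price: float             # dollars per 1M output tokens
    provider: Optional[str] = None  # optional, compared with ls_provider


def estimate_cost(entry, model_name, provider, input_tokens, output_tokens):
    """Return (prompt_cost, completion_cost) in dollars, or None if no match."""
    if not re.search(entry.match_pattern, model_name):
        return None
    if entry.provider is not None and entry.provider != provider:
        return None
    prompt_cost = input_tokens / TOKENS_PER_UNIT * entry.input_price
    completion_cost = output_tokens / TOKENS_PER_UNIT * entry.output_price
    return prompt_cost, completion_cost


# Example prices are made up for illustration only.
entry = PriceEntry(match_pattern=r"^my_model", input_price=2.0,
                   output_price=8.0, provider="my_provider")
costs = estimate_cost(entry, "my_model", "my_provider",
                      input_tokens=1_500, output_tokens=500)
```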

### LLM calls: Sending costs directly

If your model follows a non-linear pricing scheme, we recommend calculating costs client-side and sending them to LangSmith as `usage_metadata`.

<Note>
  Gemini 3 Pro Preview and Gemini 2.5 Pro follow a pricing scheme with a stepwise cost function. We support this pricing scheme for Gemini by default. For any other models with non-linear pricing, you will need to follow these instructions to calculate costs.
</Note>

<CodeGroup>
  ```python Python theme={null}
  from langsmith import traceable, get_current_run_tree

  inputs = [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "I'd like to book a table for two."},
  ]

  @traceable(
      run_type="llm",
      metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
  )
  def chat_model(messages: list):
      llm_output = {
          "choices": [
              {
                  "message": {
                      "role": "assistant",
                      "content": "Sure, what time would you like to book the table for?"
                  }
              }
          ],
          "usage_metadata": {
              # Specify cost (in dollars) for the inputs and outputs
              "input_cost": 1.1e-6,
              "input_cost_details": {"cache_read": 2.3e-7},
              "output_cost": 5.0e-6,
          },
      }
      run = get_current_run_tree()
      run.set(usage_metadata=llm_output["usage_metadata"])
      return llm_output["choices"][0]["message"]

  chat_model(inputs)
  ```

  ```typescript TypeScript theme={null}
  import { traceable, getCurrentRunTree } from "langsmith/traceable";

  const messages = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "I'd like to book a table for two." }
  ];

  const chatModel = traceable(
    async (messages: { role: string; content: string }[]) => {
      const llmOutput = {
        choices: [
          {
            message: {
              role: "assistant",
              content: "Sure, what time would you like to book the table for?",
            },
          },
        ],
        // Specify cost (in dollars) for the inputs and outputs
        usage_metadata: {
          input_cost: 1.1e-6,
          input_cost_details: { cache_read: 2.3e-7 },
          output_cost: 5.0e-6,
        },
      };

      // Attach usage metadata to the run
      const runTree = getCurrentRunTree();
      runTree.metadata.usage_metadata = llmOutput.usage_metadata;

      // Return only the assistant message
      return llmOutput.choices[0].message;
    },
    {
      run_type: "llm",
      name: "chat_model",
      metadata: {
        ls_provider: "my_provider",
        ls_model_name: "my_model",
      },
    }
  );

  await chatModel(messages);
  ```
</CodeGroup>
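
The hard-coded cost values in the examples above would normally come from your own pricing logic. As a minimal sketch, assuming a hypothetical two-tier scheme in which the per-token rate depends on whether the prompt exceeds a token threshold, the costs could be computed client-side like this. The threshold, the rates, and the `stepwise_usage_metadata` helper are illustrative assumptions, not any provider's actual prices.

```python
def stepwise_usage_metadata(input_tokens: int, output_tokens: int) -> dict:
    """Build cost fields for a hypothetical two-tier (stepwise) price list."""
    threshold = 200_000  # hypothetical prompt-size boundary between tiers

    # Hypothetical dollar prices per 1M tokens: (input, output) for each tier.
    if input_tokens <= threshold:
        input_price, output_price = 1.00, 4.00
    else:
        input_price, output_price = 2.00, 6.00

    return {
        "input_cost": input_tokens / 1_000_000 * input_price,
        "output_cost": output_tokens / 1_000_000 * output_price,
    }


# The same output size costs more once the prompt crosses the threshold.
small = stepwise_usage_metadata(input_tokens=100_000, output_tokens=1_000)
large = stepwise_usage_metadata(input_tokens=300_000, output_tokens=1_000)
```

The returned dictionary uses the same `input_cost` and `output_cost` keys as the `usage_metadata` in the examples above, so it can be attached to the run in the same way.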

### Other runs: Sending costs

You can also send cost information for non-LLM runs, such as tool calls. The cost must be specified in the `total_cost` field of the run's `usage_metadata`.

<Accordion title="A. Set a `total_cost` field on the run’s usage_metadata">
  Set a `total_cost` field on the run’s `usage_metadata`. The advantage of this approach is that you do not need to change your traced function’s runtime outputs.

  <CodeGroup>
    ```python Python theme={null}
    from langsmith import traceable, get_current_run_tree

    # Example tool: get_weather
    @traceable(run_type="tool", name="get_weather")
    def get_weather(city: str):
        # Your tool logic goes here
        result = {
            "temperature_f": 68,
            "condition": "sunny",
            "city": city,
        }

        # Cost for this tool call (computed however you like)
        tool_cost = 0.0015

        # Attach usage metadata to the LangSmith run
        run = get_current_run_tree()
        run.set(usage_metadata={"total_cost": tool_cost})

        # Return only the actual tool result (no usage info)
        return result

    tool_response = get_weather("San Francisco")
    ```

    ```typescript TypeScript theme={null}
    import { traceable, getCurrentRunTree } from "langsmith/traceable";

    // Example tool: get_weather
    const getWeather = traceable(
      async ({ city }) => {
        // Your tool logic goes here
        const result = {
          temperature_f: 68,
          condition: "sunny",
          city,
        };

        // Cost for this tool call (computed however you like)
        const toolCost = 0.0015;

        // Attach usage metadata to the LangSmith run
        const runTree = getCurrentRunTree();
        runTree.metadata.usage_metadata = {
          total_cost: toolCost,
        };

        // Return only the actual tool result (no usage info)
        return result;
      },
      {
        run_type: "tool",
        name: "get_weather",
      }
    );

    const toolResponse = await getWeather({ city: "San Francisco" });
    ```
  </CodeGroup>
</Accordion>

<Accordion title="B. Return a `total_cost` field in your traced function's outputs.">
  Include the `usage_metadata` key directly within the object returned by your traced function. LangSmith will extract it from the output.

  <CodeGroup>
    ```python Python theme={null}
    from langsmith import traceable

    # Example tool: get_weather
    @traceable(run_type="tool", name="get_weather")
    def get_weather(city: str):
        # Your tool logic goes here
        result = {
            "temperature_f": 68,
            "condition": "sunny",
            "city": city,
        }

        # Attach tool call costs here
        return {
            **result,
            "usage_metadata": {
                "total_cost": 0.0015,   # <-- cost for this tool call
            },
        }

    tool_response = get_weather("San Francisco")
    ```

    ```typescript TypeScript theme={null}
    import { traceable } from "langsmith/traceable";

    // Example tool: get_weather
    const getWeather = traceable(
      async ({ city }) => {
        // Your tool logic goes here
        const result = {
          temperature_f: 68,
          condition: "sunny",
          city,
        };

        // Attach tool call costs here
        return {
          ...result,
          usage_metadata: {
            total_cost: 0.0015,  // <-- cost for this tool call
          },
        };
      },
      {
        run_type: "tool",
        name: "get_weather",
      }
    );

    const toolResponse = await getWeather({ city: "San Francisco" });
    ```
  </CodeGroup>
</Accordion>



# Create a prompt
Source: https://docs.langchain.com/langsmith/create-a-prompt



Navigate to the Playground in the left-hand sidebar or from the application homepage.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=2bede4ae9332bdf43ae20580d5bb957d" alt="" data-og-width="1747" width="1747" data-og-height="1285" height="1285" data-path="langsmith/images/empty-playground.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=20602d3b2e7b4219a8dc3612fee194b7 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=141b0ab234c54970f4e86f24ed13a954 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=1fa1d0cb9075f4fbdcae487ea4348116 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=bb3583d9357a597754396ee94e52c0da 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=c21b3e9881e95a0b954d578fa1d8aa47 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/empty-playground.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=3dcdde6a3fd4f3365059e181f53f68a0 2500w" />

## Compose your prompt

On the left is an editable view of the prompt.

The prompt is made up of messages, each of which has a role, such as `system`, `human`, or `ai`.

### Template format

The default template format is `f-string`, but you can change the prompt template format to `mustache` by clicking on the settings icon next to the model -> prompt format -> template format. Learn more about template formats [here](/langsmith/prompt-engineering-concepts#f-string-vs-mustache).
<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=417fa567135babb46d2bf080b7eb44f0" alt="" data-og-width="938" width="938" data-og-height="352" height="352" data-path="langsmith/images/template-format.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=5bbcf83ad9e251ea9fdeb6c0f7dd49eb 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=02776bb45c6ffe0df98775886e75eaeb 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=c63b67b6afe893f7944ee2fb8b76bbaf 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9df27878d7eaf60e953b9438fcf3f8c4 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=a8b2e1b3fd0977e5b6beb984119bc6fd 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/template-format.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=d30b1d76394200f9473f0b8fb08d2a5e 2500w" />

### Add a template variable

Variables let you add dynamic content to your prompt. Add a template variable in one of two ways:

1. Add `{{variable_name}}` to your prompt (with one curly brace on each side for `f-string` and two for `mustache`). <img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=b250a57ef0e0a40a56822af750d52810" alt="" data-og-width="726" width="726" data-og-height="169" height="169" data-path="langsmith/images/prompt-with-variable.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=ea82252d99c7ee63c15d1c1036db8c55 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=fc56d047cf631b1e66ecfd09ab4c03a5 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=7cc8cb861e9361445dcee545cbac84b7 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=a7ed018754ebb8dad675a75c4aa638cb 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=5d360aac28b60689789e8d3274103611 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-with-variable.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=b9b98555c36f929004d234cc7fadaea5 2500w" />

2. Highlight text you want to templatize and click the tooltip button that shows up. Enter a name for your variable, and convert. <img src="https://mintlify.s3.us-west-1.amazonaws.com/langchain-5e9cc07a/langsmith/images/convert-to-variable.gif" alt="" />

When we add a variable, we see a place to enter sample inputs for our prompt variables. Fill these in with values to test the prompt. <img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=35674518c53e340d02719cbb7b5fd782" alt="" data-og-width="775" width="775" data-og-height="134" height="134" data-path="langsmith/images/prompt-inputs.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=59775af2fccde924b7ad7657db2b4656 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=df9b24f1eb306578590ee772720ced08 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=df549771817326a72a9387d0133ea590 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=4bc399d02630e4873ae34afbbb60057e 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=78917ac5c4c938d926a5fe420d4d2780 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-inputs.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=89368aa76e0764d3f3fcb90623b2f28b 2500w" />
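
To make the difference between the two template formats concrete, here is a small sketch that fills the same variable in both styles outside of LangSmith. It uses Python's built-in `str.format` for the single-brace `f-string` style and a hypothetical `render_mustache` helper that handles only simple `{{name}}` substitution, not the full mustache language.

```python
import re


def render_mustache(template: str, variables: dict) -> str:
    """Replace simple {{name}} placeholders; sections and partials are not supported."""
    def substitute(match: re.Match) -> str:
        return str(variables[match.group(1)])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


variables = {"question": "What is LangSmith?"}

# f-string style: one curly brace on each side of the variable name.
f_string_prompt = "Answer the question: {question}".format(**variables)

# mustache style: two curly braces on each side of the variable name.
mustache_prompt = render_mustache("Answer the question: {{question}}", variables)
```

Both calls produce the same rendered text; only the placeholder syntax in the template differs.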

### Structured output

Adding an output schema to your prompt will get output in a structured format. Learn more about structured output [here](/langsmith/prompt-engineering-concepts#structured-output). <img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=51cdef35a620c225896dbf2f3ab07528" alt="" data-og-width="814" width="814" data-og-height="574" height="574" data-path="langsmith/images/structured-output.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=7af3797e6106f2c7858cc67dac7cfe60 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=7a3b8bb3d32eda103213856e4e05e74c 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=e973d61622e18ab678631c734c4b16a8 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=c63196b1604baef88db3f56375458005 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=e672a8f8af25d414d61592a1b22205c0 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/structured-output.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9d57196b5940203b66a31196ea464243 2500w" />

### Tools

You can also add a tool by clicking the `+ Tool` button at the bottom of the prompt editor. See [here](/langsmith/use-tools) for more information on how to use tools.

## Run the prompt

Click "Start" to run the prompt.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=9dec20b94326e5c8b11775b56eca55b4" alt="" data-og-width="1525" width="1525" data-og-height="766" height="766" data-path="langsmith/images/create-a-prompt-run.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=10cdcbb52db991d37478cdd199f51baf 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=14986d2ef1ad2aafd69d5005413604c8 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=20cc523bb2ffa9756d95908b0b434d91 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=5bdcb8118c13d9e9710ccc663058f47c 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=1f6760ed7f81ce2d9c7907c8539d6122 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-a-prompt-run.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4c47787a81ad1ffed842e630ec80a362 2500w" />

## Save your prompt

To save your prompt, click the "Save" button, name your prompt, and decide if you want it to be "private" or "public". Private prompts are only visible to your workspace, while public prompts are discoverable by anyone.

The model and configuration you select in the Playground settings will be saved with the prompt. When you reopen the prompt, the model and configuration will automatically load from the saved version. <img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=180e2c79fb9d1ee8d7869fc279e2d94a" alt="" data-og-width="465" width="465" data-og-height="306" height="306" data-path="langsmith/images/save-prompt.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=2a2ea17f2ffb787ce2bbbfc88302636e 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=7120851531e79e13d8717ba14eb64483 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=6b7d47e08120d2329fa923be8c19a612 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=b6a4ed33c82c54f093ad4f07db1b7a1c 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=5746af64213ab0bc08d7315a895a0d28 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/save-prompt.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=470e4700c6ed64348f25ae23a8d462c5 2500w" />

<Check>
  The first time you create a public prompt, you'll be asked to set a LangChain Hub handle. All your public prompts will be linked to this handle. In a shared workspace, this handle will be set for the whole workspace.
</Check>

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=56dbd809c50fb2abc816c73c599f0baf" alt="" data-og-width="575" width="575" data-og-height="357" height="357" data-path="langsmith/images/public-handle.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=2e87d4538d4413b9ce13aed26f43e075 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=df539980aec2f06214ee13668d271e38 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=0b19f257c2d9eb78b88c0df6e7920d8d 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=06c1efd1463437913cbdd6e19274f354 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=648394098abe45d6d916ee1fe3f9066b 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/public-handle.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=844bf001cfa48ba449c873d228cc5fa3 2500w" />

## View your prompts

You've just created your first prompt! View a table of your prompts in the prompts tab.

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=9f8f5567bb93a0add181a51531474796" alt="" data-og-width="1508" width="1508" data-og-height="309" height="309" data-path="langsmith/images/prompt-table.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=6421b33ff02f9af3f994e665fbcddf96 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=bf3d973fd239d93c0c4b0dd45e2e7129 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=d2d60a1b18eb7baaeaa5a04fa67d4ac7 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=9d11895d724a60910231494927f86e24 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=52afb51d257f64c5e59ad77e3a3cb67a 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prompt-table.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=15e2b24c349a9f21729b572fdb23734f 2500w" />

## Add metadata

To add metadata to your prompt, click the prompt and then click the "Edit" pencil icon next to the name. There you can add more information about the prompt, including a description, a README, and use cases. For public prompts, this information is visible to anyone who views your prompt in the LangChain Hub.

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=e54ca6c0c8283b027f5848f79d1cf064" alt="" data-og-width="1167" width="1167" data-og-height="1067" height="1067" data-path="langsmith/images/pencil.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=0336448912188bb85366b0c49556d207 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=6456cdae3e5743b770ca530748efd21b 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=7f16e9adaa3e2d3063623317344fb7d7 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=12e5f2c3a2cc2f91d66d467cff910809 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=ce6cb1b3011638d9bec990c1e8131837 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pencil.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=b61eb5331a47c9e6998261799729fdbf 2500w" /> <img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=febb5f53b2f917cf3e5a2ff2566eaef4" alt="" data-og-width="1508" width="1508" data-og-height="1084" height="1084" data-path="langsmith/images/edit-prompt.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=440242ac2e20e092c50810ca84d37286 280w, 
https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=46ad1078bdd4342cc49ecd1a70837149 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=f4998ff77574085ec2eb2fdc8c755fdc 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=2065392afa627416297cc8dc5a10780d 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=bda747e312cbabdd107a7b54b4e1c03e 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-prompt.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=846eaa58fed8f3a14aa76dbb610a0e41 2500w" />

## Next steps

Now that you've created a prompt, you can use it in your application code. See [how to pull a prompt programmatically](/langsmith/manage-prompts-programmatically#pull-a-prompt).



# Create an account and API key
Source: https://docs.langchain.com/langsmith/create-account-api-key



To get started with LangSmith, you need to create an account. You can sign up for a free account in the [LangSmith UI](https://smith.langchain.com). LangSmith supports sign in with Google, GitHub, and email.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=2a18e674c9b6e96dd0d5af16ddeeaf1a" alt="Create account" data-og-width="1768" width="1768" data-og-height="1252" height="1252" data-path="langsmith/images/create-account.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=45abdaeb33706739a680080f52a5457c 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=991caa9cc07e229e8a6db0a2ac38fdb9 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=0ff78a93a7e28cfe9fccbfd5a7d54ec5 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=04f93900e425e47461c882b25e5298f7 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=866363d551391983b36edfc481d67404 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/create-account.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=186e16f729e0a637c0eaab35002aa4ad 2500w" />

## API keys

LangSmith supports two types of API keys: Service Keys and Personal Access Tokens. Both can be used to authenticate requests to the LangSmith API, but they have different use cases.

For more details on Service Keys and Personal Access Tokens, refer to the [Administration overview page](/langsmith/administration-overview).

## Create an API key

To log traces and run evaluations with LangSmith, you will need to create an API key to authenticate your requests. API keys can be scoped to a set of [workspaces](/langsmith/administration-overview#workspaces), or the entire [organization](/langsmith/administration-overview#organizations).

To create either type of API key:

1. Navigate to the [Settings page](https://smith.langchain.com/settings) and scroll to the **API Keys** section.
2. For service keys, choose between an organization-scoped and workspace-scoped key. If the key is workspace-scoped, the workspaces must then be specified.

   Enterprise users are also able to [assign specific roles](/langsmith/administration-overview#workspace-roles-rbac) to the key, which adjusts its permissions.
3. Set the key's expiration. The key becomes unusable after the chosen number of days, or never expires if you select that option.
4. Click **Create API Key**.

<Note>
  The API key will be shown only once, so make sure to copy it and store it in a safe place.
</Note>

<img src="https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=e27b419a9c317a78f8a98ff5024e1235" alt="Create API key" data-og-width="1224" width="1224" data-og-height="1137" height="1137" data-path="langsmith/images/create-api-key.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?w=280&fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=b0148d07161d0f214a9e6442297e83a3 280w, https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?w=560&fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=305cf6904089edcee1db749552e41b5f 560w, https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?w=840&fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=88ead9f8f16475135f43f32ecdfae35a 840w, https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?w=1100&fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=0803eebefa356974e485217545b9cf13 1100w, https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?w=1650&fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=fe671c360990ca8e4e86f5749a7f59db 1650w, https://mintcdn.com/langchain-5e9cc07a/RZqwlMMHZpKJks4w/langsmith/images/create-api-key.png?w=2500&fit=max&auto=format&n=RZqwlMMHZpKJks4w&q=85&s=02431a31fddd5ee58d30296fed75a238 2500w" />

## Delete an API key

To delete an API key:

1. Navigate to the [Settings page](https://smith.langchain.com/settings) and scroll to the **API Keys** section.
2. Find the API key you need to delete from the table. Toggle **Personal** or **Service** as needed.
3. Select the trash icon <Icon icon="trash" iconType="solid" /> in the **Actions** column and confirm deletion.

## Configure the SDK

In addition to `LANGSMITH_API_KEY`, you may need to set the following environment variables.

Required only if you use the EU instance:

`LANGSMITH_ENDPOINT=https://eu.api.smith.langchain.com`

Required only for keys scoped to more than one workspace:

`LANGSMITH_WORKSPACE_ID=<Workspace ID>`
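
As a rough illustration of the rules above, the following sketch assembles the environment variables in Python. The helper name `build_langsmith_env` and its parameters are hypothetical and not part of the LangSmith SDK; only the variable names and the EU endpoint URL come from this page.

```python
import os

EU_ENDPOINT = "https://eu.api.smith.langchain.com"


def build_langsmith_env(api_key, *, eu_instance=False, workspace_id=None):
    """Return the LangSmith environment variables as a dict (hypothetical helper)."""
    env = {"LANGSMITH_API_KEY": api_key}
    if eu_instance:
        # Only needed when using the EU instance
        env["LANGSMITH_ENDPOINT"] = EU_ENDPOINT
    if workspace_id is not None:
        # Only needed for keys scoped to more than one workspace
        env["LANGSMITH_WORKSPACE_ID"] = workspace_id
    return env


# Placeholder values; substitute your own key and workspace ID.
env = build_langsmith_env(
    "<your-api-key>", eu_instance=True, workspace_id="<workspace-id>"
)
os.environ.update(env)
```

Setting the variables in the process environment before creating an SDK client is one option; exporting them in your shell or a `.env` file works equally well.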

## Using API keys outside of the SDK

See [instructions for managing your organization via API](/langsmith/manage-organization-by-api).

***



# How to improve your evaluator with few-shot examples
Source: https://docs.langchain.com/langsmith/create-few-shot-evaluators



LLM-as-a-judge evaluators can be very helpful when you can't evaluate your system programmatically. However, their effectiveness depends on their quality and how well they align with human reviewer feedback. LangSmith lets you improve the alignment of an LLM-as-a-judge evaluator with human preferences using few-shot examples.

Human corrections are automatically inserted into your evaluator prompt as few-shot examples. This technique is inspired by [few-shot prompting](https://www.promptingguide.ai/techniques/fewshot), which guides the model's output with a few high-quality examples.

This guide covers how to set up few-shot examples as part of your LLM-as-a-judge evaluator and apply corrections to feedback scores.

## How few-shot examples work

* Few-shot examples are added to your evaluator prompt using the `{{Few-shot examples}}` variable.
* Creating an evaluator with few-shot examples automatically creates a dataset, which is auto-populated with few-shot examples once you start making corrections.
* At runtime, these examples are inserted into the evaluator prompt to guide its outputs, which helps the evaluator align better with human preferences.

## Configure your evaluator

<Note>
  Few-shot examples are not currently supported in LLM-as-a-judge evaluators that use the prompt hub and are only compatible with prompts that use mustache formatting.
</Note>

Before enabling few-shot examples, set up your LLM-as-a-judge evaluator. If you haven't done this yet, follow the steps in the [LLM-as-a-judge evaluator guide](/langsmith/llm-as-judge).

### 1. Configure variable mapping

Each few-shot example is formatted according to the variable mapping specified in the configuration. The variable mapping for few-shot examples should contain the same variables as your main prompt, plus a `few_shot_explanation` variable and a score variable that has the same name as your feedback key.

For example, if your main prompt has the variables `question` and `response`, and your evaluator outputs a `correctness` score, then your few-shot prompt should have the variables `question`, `response`, `few_shot_explanation`, and `correctness`.
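
The naming rule above can be checked with a few lines of Python. This is only a sketch: the helper names `few_shot_variables` and `missing_variables` and the example values are made up for illustration and are not part of LangSmith.

```python
def few_shot_variables(prompt_variables, feedback_key):
    """Variables a few-shot example needs: the main prompt variables,
    plus `few_shot_explanation`, plus a score named after the feedback key."""
    return [*prompt_variables, "few_shot_explanation", feedback_key]


def missing_variables(example, prompt_variables, feedback_key):
    """Return the required variables that are absent from a candidate example."""
    required = few_shot_variables(prompt_variables, feedback_key)
    return [name for name in required if name not in example]


# Illustrative example for a prompt with `question` and `response`
# and an evaluator that outputs a `correctness` score.
example = {
    "question": "What is 2 + 2?",
    "response": "5",
    "few_shot_explanation": "The correct answer is 4.",
    "correctness": 0,
}

print(few_shot_variables(["question", "response"], "correctness"))
print(missing_variables(example, ["question", "response"], "correctness"))
```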

### 2. Specify the number of few-shot examples to use

You may also specify the number of few-shot examples to use. The default is 5. If your examples are very long, you may want to lower this number to save tokens; if they tend to be short, you can raise it to give your evaluator more examples to learn from. If your dataset contains more examples than this number, LangSmith randomly selects which ones to use.
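
To make the selection behavior concrete, here is a minimal sketch of capping a corrections dataset at a maximum number of examples by random sampling. It assumes uniform sampling without replacement; the page does not specify how LangSmith picks examples beyond "randomly", and `pick_few_shot_examples` is a hypothetical name.

```python
import random


def pick_few_shot_examples(examples, max_examples=5, seed=None):
    """Return at most `max_examples` items, sampled at random when there are more."""
    if len(examples) <= max_examples:
        return list(examples)
    rng = random.Random(seed)  # seed is optional, for reproducible picks
    return rng.sample(examples, max_examples)


corrections = [f"example-{i}" for i in range(20)]
chosen = pick_few_shot_examples(corrections, max_examples=5)
print(len(chosen))
```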

## Make corrections

<Info>
  [Audit evaluator scores](/langsmith/audit-evaluator-scores)
</Info>

As you start logging traces or running experiments, you will likely disagree with some of the scores that your evaluator has given. When you [make corrections to these scores](/langsmith/audit-evaluator-scores), you will begin seeing examples populated inside your corrections dataset. As you make corrections, make sure to attach explanations - these will get populated into your evaluator prompt in place of the `few_shot_explanation` variable.

The inputs to the few-shot examples will be the relevant fields from the inputs, outputs, and reference (if this is an offline evaluator) of your chain/dataset. The outputs will be the corrected evaluator score and the explanations that you created when you left the corrections. Feel free to edit these to your liking. Here is an example of a few-shot example in a corrections dataset:

<img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=8c7bfcc6cc4ab86c18240c3cbf2ea44c" alt="Few-shot example" data-og-width="1572" width="1572" data-og-height="790" height="790" data-path="langsmith/images/few-shot-example.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=91f4e17fd853ba23c1b04934144dfa77 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=e0a5e2a026e4166c341900dd49316f35 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=88aec2ef5c37c16c67e0eefecd3fbc0a 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=bfbbd35cf503ce2f3dbf743fab8fb75b 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=bf0765bfeabc8d34ef49626dca0135ae 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/few-shot-example.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=f9d2cf9437ee0e160a511903bd88238a 2500w" />

Note that the corrections may take a minute or two to be populated into your few-shot dataset. Once they are there, future runs of your evaluator will include them in the prompt!

## View your corrections dataset

In order to view your corrections dataset:

* **Online evaluators**: Select your run rule and click **Edit Rule**
* **Offline evaluators**: Select your evaluator and click **Edit Evaluator**

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=03453ef08f1c272d5d9aaf71d1fb7301" alt="Edit Evaluator" data-og-width="800" width="800" data-og-height="284" height="284" data-path="langsmith/images/edit-evaluator.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=d495c791e6c8ae9d241085795d4b67b5 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=b1ddde8054744862494e4d3f02a460b0 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=9838e073d1d7e61c6b79d8f35ba1a1b3 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=67322be4166f4479a466427a9b270ca1 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4cd490404d1616498fed810b3ce75a21 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/edit-evaluator.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4c8a87ad332661a0b8b472dd34f1f4ab 2500w" />

Head to your dataset of corrections linked in the **Improve evaluator accuracy using few-shot examples** section. You can view and update your few-shot examples in the dataset.

<img src="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=3215f3f24a08186fd76c6dbad18a3cf5" alt="View few-shot dataset" data-og-width="1470" width="1470" data-og-height="478" height="478" data-path="langsmith/images/view-few-shot-ds.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?w=280&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=ad702a532a8f083c71056baff4370f30 280w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?w=560&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=d45ebd4263adc9c10598fad633167ca3 560w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?w=840&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=fe060c8d000a41566949ff35d6c62135 840w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?w=1100&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=f735943da46a1e57328b86246f5da25f 1100w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?w=1650&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=9861b9651d5d63a07662e7aa1bc68491 1650w, https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-few-shot-ds.png?w=2500&fit=max&auto=format&n=1RIJxfRpkszanJLL&q=85&s=6084c3697ffd582e30301540906a5698 2500w" />

***



# Use cron jobs
Source: https://docs.langchain.com/langsmith/cron-jobs



There are many situations in which it is useful to run an assistant on a schedule.

For example, say that you're building an assistant that runs daily and sends an email summary
of the day's news. You could use a cron job to run the assistant every day at 8:00 PM.

LangSmith Deployment supports cron jobs, which run on a user-defined schedule. The user specifies a schedule, an assistant, and some input. After that, on the specified schedule, the server will:

* Create a new thread with the specified assistant
* Send the specified input to that thread

Note that this sends the same input to the thread every time.

The LangSmith Deployment API provides several endpoints for creating and managing cron jobs. See the [API reference](https://langchain-ai.github.io/langgraph/cloud/reference/api/api_ref/) for more details.

Sometimes you want to run your graph on a schedule rather than in response to user interaction, for example to compose and send a weekly email of to-dos for your team. LangSmith Deployment lets you do this without writing your own script by using the `Crons` client. To schedule a graph job, pass a [cron expression](https://crontab.cronhub.io/) that tells the client when to run the graph. `Cron` jobs run in the background and do not interfere with normal invocations of the graph.
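
If cron expressions are unfamiliar, the sketch below splits a standard five-field expression into named fields. It assumes the conventional field order (minute, hour, day of month, month, day of week); `describe_cron` is a hypothetical helper and not part of the SDK.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression):
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# "27 15 * * *" means minute 27 of hour 15 (3:27 PM), every day
print(describe_cron("27 15 * * *"))

# The daily 8:00 PM example would be written as "0 20 * * *"
print(describe_cron("0 20 * * *"))
```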

## Setup

First, let's set up our SDK client, assistant, and thread:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    from langgraph_sdk import get_client

    client = get_client(url=<DEPLOYMENT_URL>)
    # Using the graph deployed with the name "agent"
    assistant_id = "agent"
    # create thread
    thread = await client.threads.create()
    print(thread)
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    import { Client } from "@langchain/langgraph-sdk";

    const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
    // Using the graph deployed with the name "agent"
    const assistantId = "agent";
    // create thread
    const thread = await client.threads.create();
    console.log(thread);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
        --url <DEPLOYMENT_URL>/assistants/search \
        --header 'Content-Type: application/json' \
        --data '{
            "limit": 10,
            "offset": 0
        }' | jq -c 'map(select(.config == null or .config == {})) | .[0].graph_id' && \
    curl --request POST \
        --url <DEPLOYMENT_URL>/threads \
        --header 'Content-Type: application/json' \
        --data '{}'
    ```
  </Tab>
</Tabs>

Output:

```
{
'thread_id': '9dde5490-2b67-47c8-aa14-4bfec88af217',
'created_at': '2024-08-30T23:07:38.242730+00:00',
'updated_at': '2024-08-30T23:07:38.242730+00:00',
'metadata': {},
'status': 'idle',
'config': {},
'values': None
}
```

## Cron job on a thread

To create a cron job associated with a specific thread, you can write:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    # This schedules a job to run at 15:27 (3:27PM) every day
    cron_job = await client.crons.create_for_thread(
        thread["thread_id"],
        assistant_id,
        schedule="27 15 * * *",
        input={"messages": [{"role": "user", "content": "What time is it?"}]},
    )
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    // This schedules a job to run at 15:27 (3:27PM) every day
    const cronJob = await client.crons.createForThread(
      thread["thread_id"],
      assistantId,
      {
        schedule: "27 15 * * *",
        input: { messages: [{ role: "user", content: "What time is it?" }] }
      }
    );
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
        --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/crons \
        --header 'Content-Type: application/json' \
        --data '{
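            "assistant_id": "<ASSISTANT_ID>",
            "schedule": "27 15 * * *",
            "input": {"messages": [{"role": "user", "content": "What time is it?"}]}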
        }'
    ```
  </Tab>
</Tabs>

Note that it is **very** important to delete `Cron` jobs that are no longer useful. Otherwise you could rack up unwanted API charges to the LLM! You can delete a `Cron` job using the following code:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    await client.crons.delete(cron_job["cron_id"])
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    await client.crons.delete(cronJob["cron_id"]);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request DELETE \
        --url <DEPLOYMENT_URL>/runs/crons/<CRON_ID>
    ```
  </Tab>
</Tabs>

## Cron job stateless

You can also create stateless cron jobs by using the following code:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    # This schedules a job to run at 15:27 (3:27PM) every day
    cron_job_stateless = await client.crons.create(
        assistant_id,
        schedule="27 15 * * *",
        input={"messages": [{"role": "user", "content": "What time is it?"}]},
    )
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    // This schedules a job to run at 15:27 (3:27PM) every day
    const cronJobStateless = await client.crons.create(
      assistantId,
      {
        schedule: "27 15 * * *",
        input: { messages: [{ role: "user", content: "What time is it?" }] }
      }
    );
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
        --url <DEPLOYMENT_URL>/runs/crons \
        --header 'Content-Type: application/json' \
        --data '{
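            "assistant_id": "<ASSISTANT_ID>",
            "schedule": "27 15 * * *",
            "input": {"messages": [{"role": "user", "content": "What time is it?"}]}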
        }'
    ```
  </Tab>
</Tabs>

Again, remember to delete your job once you are done with it!

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    await client.crons.delete(cron_job_stateless["cron_id"])
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    await client.crons.delete(cronJobStateless["cron_id"]);
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request DELETE \
        --url <DEPLOYMENT_URL>/runs/crons/<CRON_ID>
    ```
  </Tab>
</Tabs>

***



# Add custom authentication
Source: https://docs.langchain.com/langsmith/custom-auth



This guide shows you how to add custom authentication to your LangSmith application. The steps on this page apply to both [cloud](/langsmith/cloud) and [self-hosted](/langsmith/self-hosted) deployments. It does not apply to isolated usage of the [LangGraph open source library](/oss/python/langgraph/overview) in your own custom server.

## Add custom authentication to your deployment

To leverage custom authentication and access user-level metadata in your deployments, set up custom authentication to automatically populate the `config["configurable"]["langgraph_auth_user"]` object through a custom authentication handler. You can then access this object in your graph with the `langgraph_auth_user` key to [allow an agent to perform authenticated actions on behalf of the user](#enable-agent-authentication).

1. Implement authentication:

   <Note>
     Without a custom `@auth.authenticate` handler, LangGraph sees only the API-key owner (usually the developer), so requests aren’t scoped to individual end-users. To propagate custom tokens, you must implement your own handler.
   </Note>

   ```python  theme={null}
   from langgraph_sdk import Auth
   import requests

   auth = Auth()

   def is_valid_key(api_key: str) -> bool:
       # Replace with your own API key validation logic
       raise NotImplementedError("add your API key validation logic")

   @auth.authenticate
   async def authenticate(headers: dict) -> Auth.types.MinimalUserDict:
       # Validate the caller using the request headers
       api_key = headers.get(b"x-api-key")
       if not api_key or not is_valid_key(api_key):
           raise Auth.exceptions.HTTPException(status_code=401, detail="Invalid API key")

       # Fetch user-specific tokens from your secret store.
       # `fetch_user_tokens` is a placeholder for your own lookup function.
       user_tokens = await fetch_user_tokens(api_key)

       # Return a dictionary with at least an "identity" field
       return {
           "identity": api_key,  # fetch user ID from LangSmith
           "github_token": user_tokens.github_token,
           "jira_token": user_tokens.jira_token,
           # ... custom fields/secrets here
       }
   ```

   * This handler receives the request (headers, etc.), validates the user, and returns a dictionary with at least an `identity` field.
   * You can add any custom fields you want (e.g., OAuth tokens, roles, or org IDs).

2. In your [`langgraph.json`](/langsmith/application-structure#configuration-file), add the path to your auth file:

   ```json highlight={7-9} theme={null}
   {
       "dependencies": ["."],
       "graphs": {
           "agent": "./agent.py:graph"
       },
       "env": ".env",
       "auth": {
           "path": "./auth.py:auth"
       }
   }
   ```
3. Once you've set up authentication in your server, requests must include the required authorization information based on your chosen scheme. Assuming you are using JWT token authentication, you could access your deployments using any of the following methods:

   <Tabs>
     <Tab title="Python Client">
       ```python  theme={null}
       from langgraph_sdk import get_client

       my_token = "your-token" # In practice, you would generate a signed token with your auth provider
       client = get_client(
           url="http://localhost:2024",
           headers={"Authorization": f"Bearer {my_token}"}
       )
       threads = await client.threads.search()
       ```
     </Tab>

     <Tab title="Python RemoteGraph">
       ```python  theme={null}
       from langgraph.pregel.remote import RemoteGraph

       my_token = "your-token" # In practice, you would generate a signed token with your auth provider
       remote_graph = RemoteGraph(
           "agent",
           url="http://localhost:2024",
           headers={"Authorization": f"Bearer {my_token}"}
       )
       result = await remote_graph.ainvoke(...)
       ```
     </Tab>

     <Tab title="JavaScript Client">
       ```javascript  theme={null}
       import { Client } from "@langchain/langgraph-sdk";

       const my_token = "your-token"; // In practice, you would generate a signed token with your auth provider
       const client = new Client({
         apiUrl: "http://localhost:2024",
         defaultHeaders: { Authorization: `Bearer ${my_token}` },
       });
       const threads = await client.threads.search();
       ```
     </Tab>

     <Tab title="JavaScript RemoteGraph">
       ```javascript  theme={null}
       import { RemoteGraph } from "@langchain/langgraph/remote";

       const my_token = "your-token"; // In practice, you would generate a signed token with your auth provider
       const remoteGraph = new RemoteGraph({
         graphId: "agent",
         url: "http://localhost:2024",
         headers: { Authorization: `Bearer ${my_token}` },
       });
       const result = await remoteGraph.invoke(...);
       ```
     </Tab>

     <Tab title="CURL">
       ```bash  theme={null}
       # MY_TOKEN holds the signed token from your auth provider
       curl -H "Authorization: Bearer ${MY_TOKEN}" http://localhost:2024/threads
       ```
     </Tab>
   </Tabs>

   For more details on RemoteGraph, refer to the [Use RemoteGraph](/langsmith/use-remote-graph) guide.
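
The `is_valid_key` helper in step 1 is left for you to implement. One possible shape, sketched under stated assumptions, compares a hash of the presented key against stored digests using a constant-time comparison. The in-memory `VALID_KEY_DIGESTS` set and the demo key are hypothetical; in practice the digests would come from your own key store.

```python
import hashlib
import hmac

# Hypothetical store of SHA-256 digests for issued keys.
# A real deployment would load these from a database or secret manager.
VALID_KEY_DIGESTS = {hashlib.sha256(b"demo-key-123").hexdigest()}


def is_valid_key(api_key) -> bool:
    """Return True if the presented key matches a stored digest."""
    if isinstance(api_key, str):
        api_key = api_key.encode()
    digest = hashlib.sha256(api_key).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return any(hmac.compare_digest(digest, known) for known in VALID_KEY_DIGESTS)


print(is_valid_key(b"demo-key-123"))
print(is_valid_key("not-a-real-key"))
```

Storing digests rather than raw keys means a leaked table does not directly expose usable credentials.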

## Enable agent authentication

After [authentication](#add-custom-authentication-to-your-deployment), the platform creates a special configuration object (`config`) that is passed to your LangSmith deployment. This object contains information about the current user, including any custom fields you return from your `@auth.authenticate` handler.

To allow an agent to perform authenticated actions on behalf of the user, access this object in your graph with the `langgraph_auth_user` key:

```python  theme={null}
def my_node(state, config):
    user_config = config["configurable"].get("langgraph_auth_user")
    # token was resolved during the @auth.authenticate function
    token = user_config.get("github_token","")
    ...
```

<Note>
  Fetch user credentials from a secure secret store. Storing secrets in graph state is not recommended.
</Note>
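
To try the access pattern locally, the sketch below calls a node with a hand-built `config` dictionary. The config contents and the token value are made up for illustration, and the user object is assumed to behave like a dictionary, as in the snippet above. The node returns only a flag, so the token itself never enters graph state.

```python
def my_node(state, config):
    # Fall back to an empty dict when no authenticated user is present
    user = config.get("configurable", {}).get("langgraph_auth_user") or {}
    token = user.get("github_token", "")
    # Use the token for an authenticated call here; do not write it to state
    return {"has_github_token": bool(token)}


# Hand-built config standing in for what the platform would pass
fake_config = {
    "configurable": {
        "langgraph_auth_user": {
            "identity": "user-1",
            "github_token": "example-token",
        }
    }
}

print(my_node({}, fake_config))
print(my_node({}, {"configurable": {}}))
```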

### Authorizing a user for Studio

By default, if you add custom authorization on your resources, this will also apply to interactions made from [Studio](/langsmith/studio). If you want, you can handle logged-in Studio users differently by checking [is\_studio\_user()](https://langchain-ai.github.io/langgraph/cloud/reference/sdk/python_sdk_ref/#langgraph_sdk.auth.types.StudioUser).

<Note>
  `is_studio_user` was added in version 0.1.73 of the langgraph-sdk. If you're on an older version, you can still check whether `isinstance(ctx.user, StudioUser)`.
</Note>

```python  theme={null}
from langgraph_sdk.auth import is_studio_user, Auth
auth = Auth()

# ... Setup authenticate, etc.

@auth.on
async def add_owner(
    ctx: Auth.types.AuthContext,
    value: dict  # The payload being sent to this access method
) -> dict:  # Returns a filter dict that restricts access to resources
    if is_studio_user(ctx.user):
        return {}

    filters = {"owner": ctx.user.identity}
    metadata = value.setdefault("metadata", {})
    metadata.update(filters)
    return filters
```

Only use this if you want to permit developer access to a graph deployed on the managed LangSmith SaaS.

## Learn more

* [Authentication & Access Control](/langsmith/auth)
* [Setting up custom authentication tutorial](/langsmith/set-up-custom-auth)

***



# How to customize the Dockerfile
Source: https://docs.langchain.com/langsmith/custom-docker



You can add extra lines to the Dockerfile, placed after the import from the parent LangGraph image. To do this, pass the commands you want to run as an array to the `dockerfile_lines` key in your `langgraph.json` file. For example, to use `Pillow` in your graph, add the following:

```
{
    "dependencies": ["."],
    "graphs": {
        "openai_agent": "./openai_agent.py:agent"
    },
    "env": "./.env",
    "dockerfile_lines": [
        "RUN apt-get update && apt-get install -y libjpeg-dev zlib1g-dev libpng-dev",
        "RUN pip install Pillow"
    ]
}
```

This installs Pillow along with the system packages it needs to work with the `jpeg` and `png` image formats.
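
Because strict JSON parsers reject small mistakes such as trailing commas, it can help to generate or validate the file programmatically. The following is a minimal sketch using Python's standard `json` module; it only builds and checks the text and does not write to disk.

```python
import json

config = {
    "dependencies": ["."],
    "graphs": {"openai_agent": "./openai_agent.py:agent"},
    "env": "./.env",
    "dockerfile_lines": [
        "RUN apt-get update && apt-get install -y libjpeg-dev zlib1g-dev libpng-dev",
        "RUN pip install Pillow",
    ],
}

# Serialize, then parse again to confirm the text is valid JSON
text = json.dumps(config, indent=4)
parsed = json.loads(text)
print(parsed["dockerfile_lines"])
```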

***



# Connect to a custom model
Source: https://docs.langchain.com/langsmith/custom-endpoint



The LangSmith playground allows you to use your own custom models. You can deploy a model server that exposes your model's API via [LangServe](https://github.com/langchain-ai/langserve), an open source library for serving LangChain applications. Behind the scenes, the playground will interact with your model server to generate responses.

## Deploy a custom model server

For your convenience, we have provided a sample model server that you can use as a reference. You can find the sample model server [here](https://github.com/langchain-ai/langsmith-model-server). We highly recommend using it as a starting point.

Depending on whether your model is an instruct-style or chat-style model, you will need to implement either `custom_model.py` or `custom_chat_model.py`, respectively.

## Adding configurable fields

It is often useful to configure your model with different parameters. These might include temperature, model\_name, max\_tokens, etc.

To make your model configurable in the LangSmith playground, you need to add configurable fields to your model server. These fields can be used to change model parameters from the playground.

You can add configurable fields by implementing the `with_configurable_fields` function in the `config.py` file, for example:

```python  theme={null}
# Assumes: from langchain_core.runnables import ConfigurableField, Runnable
def with_configurable_fields(self) -> Runnable:
    """Expose fields you want to be configurable in the playground. We will automatically expose these to the
    playground. If you don't want to expose any fields, you can remove this method."""
    return self.configurable_fields(n=ConfigurableField(
        id="n",
        name="Num Characters",
        description="Number of characters to return from the input prompt.",
    ))
```

## Use the model in the LangSmith Playground

Once you have deployed a model server, you can use it in the LangSmith Playground. Enter the playground and select either the `ChatCustomModel` or the `CustomModel` provider, for chat-style or instruct-style models respectively.

Enter the `URL`. The playground will automatically detect the available endpoints and configurable fields. You can then invoke the model with the desired parameters.

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=7a2889af5f55cc73661033837a50fad6" alt="ChatCustomModel in Playground" data-og-width="2816" width="2816" data-og-height="1676" height="1676" data-path="langsmith/images/playground-custom-model.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=c6509706fee0c85205e039f6868a5ead 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=deafe903353d9bec02143ebd578d5599 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=928818d42fc58d83e1b5a04ecaa36630 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=552046bb4c04947154a2c8fa3457beca 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=2735d4eed015cafa0861079133c5220c 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/playground-custom-model.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=f59ef79d897acce3ae4835ce949d61b6 2500w" />

If everything is set up correctly, you should see the model's response in the playground, along with the configurable fields specified in `with_configurable_fields`.

To store your model configuration for later use, see [Configure prompt settings](/langsmith/managing-model-configurations).

***



# How to add custom lifespan events
Source: https://docs.langchain.com/langsmith/custom-lifespan



When deploying agents to LangSmith, you often need to initialize resources like database connections when your server starts up, and ensure they're properly closed when it shuts down. Lifespan events let you hook into your server's startup and shutdown sequence to handle these critical setup and teardown tasks.

This works the same way as [adding custom routes](/langsmith/custom-routes). You just need to provide your own [`Starlette`](https://www.starlette.io/applications/) app (including [`FastAPI`](https://fastapi.tiangolo.com/), [`FastHTML`](https://fastht.ml/) and other compatible apps).

Below is an example using FastAPI.

<Note>
  **Python only**
  We currently only support custom lifespan events in Python deployments with `langgraph-api>=0.0.26`.
</Note>

## Create app

Starting from an **existing** LangSmith application, add the following lifespan code to your `webapp.py` file. If you are starting from scratch, you can create a new app from a template using the CLI.

```bash  theme={null}
langgraph new --template=new-langgraph-project-python my_new_project
```

Once you have a LangGraph project, add the following app code:

```python {highlight={19}} theme={null}
# ./src/agent/webapp.py
from contextlib import asynccontextmanager
from fastapi import FastAPI
from sqlalchemy.ext.asyncio import create_async_engine, AsyncSession
from sqlalchemy.orm import sessionmaker

@asynccontextmanager
async def lifespan(app: FastAPI):
    # for example...
    engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/db")
    # Create reusable session factory
    async_session = sessionmaker(engine, class_=AsyncSession)
    # Store in app state
    app.state.db_session = async_session
    yield
    # Clean up connections
    await engine.dispose()

app = FastAPI(lifespan=lifespan)

# ... can add custom routes if needed.
```
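To see when each half of a lifespan function runs without starting a server, the following stdlib-only sketch drives the same `asynccontextmanager` pattern by hand. `fake_app`, `serve`, and the `events` list are hypothetical stand-ins introduced for illustration; they are not part of FastAPI or the Agent Server. The `try`/`finally` is an addition (not in the example above) that keeps cleanup running even if the body raises.

```python
import asyncio
from contextlib import asynccontextmanager
from types import SimpleNamespace

events = []  # records the order in which each phase runs


@asynccontextmanager
async def lifespan(app):
    # Startup phase: everything before `yield`.
    events.append("startup")
    app.state.resource = "connected"  # stand-in for a real session factory
    try:
        yield
    finally:
        # Shutdown phase: everything after `yield`.
        events.append("shutdown")
        app.state.resource = None


async def serve(app):
    # Hypothetical stand-in for the server: enter the lifespan, do work, exit.
    async with lifespan(app):
        events.append(f"serving with resource={app.state.resource}")


fake_app = SimpleNamespace(state=SimpleNamespace())
asyncio.run(serve(fake_app))
print(events)
```

The resource is available only between the two phases, which is why route handlers should read it from `app.state` at request time rather than at import time.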

## Configure `langgraph.json`

Add the following to your `langgraph.json` configuration file. Make sure the path points to the `webapp.py` file you created above.

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./src/agent/graph.py:graph"
  },
  "env": ".env",
  "http": {
    "app": "./src/agent/webapp.py:app"
  }
  // Other configuration options like auth, store, etc.
}
```

## Start server

Test the server out locally:

```bash  theme={null}
langgraph dev --no-browser
```

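The server runs the code before `yield` when it starts and the code after `yield` when you stop it with `Ctrl+C`. The example above does not print anything, so add `print` or logging calls on either side of the `yield` if you want to confirm both phases in the console.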

## Deploying

You can deploy your app as-is to the cloud or to your self-hosted platform.

## Next steps

Now that you've added lifespan events to your deployment, you can use similar techniques to add [custom routes](/langsmith/custom-routes) or [custom middleware](/langsmith/custom-middleware) to further customize your server's behavior.

***



# How to add custom middleware
Source: https://docs.langchain.com/langsmith/custom-middleware



When deploying agents to LangSmith, you can add custom middleware to your server to handle concerns like logging request metrics, injecting or checking headers, and enforcing security policies without modifying core server logic. This works the same way as [adding custom routes](/langsmith/custom-routes). You just need to provide your own [`Starlette`](https://www.starlette.io/applications/) app (including [`FastAPI`](https://fastapi.tiangolo.com/), [`FastHTML`](https://fastht.ml/) and other compatible apps).

Adding middleware lets you intercept and modify requests and responses globally across your deployment, whether they're hitting your custom endpoints or the built-in LangSmith APIs.

Below is an example using FastAPI.

<Note>
  **Python only**
  We currently only support custom middleware in Python deployments with `langgraph-api>=0.0.26`.
</Note>

## Create app

Starting from an **existing** LangSmith application, add the following middleware code to your `webapp.py` file. If you are starting from scratch, you can create a new app from a template using the CLI.

```bash  theme={null}
langgraph new --template=new-langgraph-project-python my_new_project
```

Once you have a LangGraph project, add the following app code:

```python {highlight={5}} theme={null}
# ./src/agent/webapp.py
from fastapi import FastAPI, Request
from starlette.middleware.base import BaseHTTPMiddleware

app = FastAPI()

class CustomHeaderMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request: Request, call_next):
        response = await call_next(request)
        response.headers['X-Custom-Header'] = 'Hello from middleware!'
        return response

# Add the middleware to the app
app.add_middleware(CustomHeaderMiddleware)
```
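`BaseHTTPMiddleware` is one way to write middleware. As an alternative sketch, the same header can be added with a pure ASGI middleware, which is a plain class with no framework imports; Starlette's `add_middleware` also accepts classes of this shape. `CustomHeaderASGIMiddleware`, `demo_app`, and `run_once` are hypothetical names, and the last two exist only to exercise the middleware without a server.

```python
import asyncio


class CustomHeaderASGIMiddleware:
    """Pure ASGI middleware that appends one header to every HTTP response."""

    def __init__(self, app, header_name="X-Custom-Header",
                 header_value="Hello from middleware!"):
        self.app = app
        # ASGI headers are pairs of bytes; names are lowercased by convention.
        self.header = (header_name.lower().encode("latin-1"),
                       header_value.encode("latin-1"))

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_header(message):
            # Headers travel in the "http.response.start" message.
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append(self.header)
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_header)


async def demo_app(scope, receive, send):
    # Minimal downstream ASGI app used only for this demonstration.
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_once(app):
    # Call the app once and collect every message it sends.
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent


messages = asyncio.run(run_once(CustomHeaderASGIMiddleware(demo_app)))
print(messages[0]["headers"])
```

In a real app you would register it the same way as the class above, with `app.add_middleware(CustomHeaderASGIMiddleware)`.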

## Configure `langgraph.json`

Add the following to your `langgraph.json` configuration file. Make sure the path points to the `webapp.py` file you created above.

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./src/agent/graph.py:graph"
  },
  "env": ".env",
  "http": {
    "app": "./src/agent/webapp.py:app"
  }
  // Other configuration options like auth, store, etc.
}
```

### Customize middleware ordering

By default, custom middleware runs before authentication logic. To run custom middleware *after* authentication, set `middleware_order` to `auth_first` in your `http` configuration. (This option is supported in API server v0.4.35 and later.)

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./src/agent/graph.py:graph"
  },
  "env": ".env",
  "http": {
    "app": "./src/agent/webapp.py:app",
    "middleware_order": "auth_first"
  },
  "auth": {
    "path": "./auth.py:my_auth"
  }
}
```

## Start server

Test the server out locally:

```bash  theme={null}
langgraph dev --no-browser
```

Now every response from your server will include the custom header `X-Custom-Header`.

## Deploying

You can deploy this app as-is to the cloud or to your self-hosted platform.

## Next steps

Now that you've added custom middleware to your deployment, you can use similar techniques to add [custom routes](/langsmith/custom-routes) or define [custom lifespan events](/langsmith/custom-lifespan) to further customize your server's behavior.

***



# Connect to an OpenAI compliant model provider/proxy
Source: https://docs.langchain.com/langsmith/custom-openai-compliant-model



The LangSmith playground allows you to use any model that is compliant with the OpenAI API. To use your model, select the **OpenAI Compatible Endpoint** provider in the playground and enter your endpoint's base URL.

## Deploy an OpenAI compliant model

Many providers offer OpenAI compliant models or proxy services. Examples include:

* [LiteLLM Proxy](https://github.com/BerriAI/litellm?tab=readme-ov-file#quick-start-proxy---cli)
* [Ollama](https://ollama.com/)

You can use these providers to deploy your model and get an API endpoint that is compliant with the OpenAI API.

Take a look at the full [specification](https://platform.openai.com/docs/api-reference/chat) for more information.
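As a rough illustration of what "compliant with the OpenAI API" means in practice, the sketch below builds a chat completions request with the standard library. The helper names, the base URL `http://localhost:8000/v1`, and the model name `my-model` are placeholders, not values from LangSmith or any provider; whether the base URL carries a `/v1` suffix depends on your provider.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str,
                       api_key: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        # OpenAI-style endpoints expect a bearer token when auth is enabled.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of a chat completions response body."""
    return response_body["choices"][0]["message"]["content"]


# Placeholder values; replace them with your own endpoint and model.
request = build_chat_request("http://localhost:8000/v1", "my-model", "Hello")
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` and passing the decoded JSON to `extract_reply` is a quick way to confirm the endpoint responds before you enter it in the playground.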

## Use the model in the LangSmith Playground

Once you have deployed a model server, you can use it in the LangSmith [Playground](/langsmith/prompt-engineering-concepts#prompt-playground).

To access the **Prompt Settings** menu:

1. Under the **Prompts** heading, select the gear <Icon icon="gear" iconType="solid" /> icon next to the model name.
2. In the **Model Configuration** tab, select the model to edit in the dropdown.
3. For the **Provider** dropdown, select **OpenAI Compatible Endpoint**.
4. Add your OpenAI Compatible Endpoint to the **Base URL** input.

   <div style={{ textAlign: 'center' }}>
     <img className="block dark:hidden" src="https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=fdbe548e512ed40fb512578d02986b45" alt="Model Configuration window in the LangSmith UI with a model selected and the Provider dropdown with OpenAI Compatible Endpoint selected." data-og-width="897" width="897" data-og-height="572" height="572" data-path="langsmith/images/openai-compatible-endpoint.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?w=280&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=75921e7b30edbac5263ee10178977383 280w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?w=560&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=468265c2ad0a6c1740eb18590dab27a5 560w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?w=840&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=9f9c3f68df77205264eb9221373790f2 840w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?w=1100&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=ccb9820653e1e57b529bc46ac7d20e40 1100w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?w=1650&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=c320b3f4051643b71fba4faa350daf9b 1650w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint.png?w=2500&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=e9d1be7c69021b7fa575a2a466dbfe58 2500w" />

     <img className="hidden dark:block" src="https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=97459563da21d17228a1bb94a1b9edf3" alt="Model Configuration window in the LangSmith UI with a model selected and the Provider dropdown with OpenAI Compatible Endpoint selected." data-og-width="896" width="896" data-og-height="552" height="552" data-path="langsmith/images/openai-compatible-endpoint-dark.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?w=280&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=c3e8e46813ec673fbc3ac4e4748a4ab6 280w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?w=560&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=381a567b022c71ed6f74abbb7e3cecbd 560w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?w=840&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=7df617e39d5bd098f4e80d523ef85778 840w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?w=1100&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=06408cd348f56fcc409af8273c799a97 1100w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?w=1650&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=6b79b55e2868ed11e9d2c2eb396b004f 1650w, https://mintcdn.com/langchain-5e9cc07a/cemWY9w7h0W8uMbk/langsmith/images/openai-compatible-endpoint-dark.png?w=2500&fit=max&auto=format&n=cemWY9w7h0W8uMbk&q=85&s=4e0439d889aa2175c0802e9bf5db399b 2500w" />
   </div>

If everything is set up correctly, you should see the model's response in the playground. You can also use this functionality to invoke downstream pipelines.

For information on how to store your model configuration, refer to [Configure prompt settings](/langsmith/managing-model-configurations).

***



# Custom output rendering
Source: https://docs.langchain.com/langsmith/custom-output-rendering



Custom output rendering allows you to visualize run outputs and dataset reference outputs using your own custom HTML pages. This is particularly useful for:

* **Domain-specific formatting**: Display medical records, legal documents, or other specialized data types in their native format.
* **Custom visualizations**: Create charts, graphs, or diagrams from numeric or structured output data.

On this page, you'll learn how to:

* **[Configure custom rendering](#configure-custom-output-rendering)** in the LangSmith UI.
* **[Build a custom renderer](#build-a-custom-renderer)** to display output data.
* **[Understand where custom rendering appears](#where-custom-rendering-appears)** in LangSmith.

## Configure custom output rendering

Configure custom rendering at two levels:

* **For datasets**: Apply custom rendering to all runs associated with that dataset, wherever they appear—in experiments, run detail panes, or annotation queues.
* **For annotation queues**: Apply custom rendering to all runs within a specific annotation queue, regardless of which dataset they come from. This takes precedence over dataset-level configuration.

### For datasets

To configure custom output rendering for a dataset:

<img src="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=7daf042ebae80eec20cd90a25c1d6087" alt="Dataset page with three-dot menu showing Custom Output Rendering option" data-og-width="3456" width="3456" data-og-height="2156" height="2156" data-path="langsmith/images/custom-output-rendering-menu.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?w=280&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=432f6a5bb6c79797c17fa8faa74169b1 280w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?w=560&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=c77ec3be9ea681180d5c82d2588814b5 560w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?w=840&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=3f8fdb966f809ad049585efe3271182b 840w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?w=1100&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=3bdd2738b8dc1d17eeea3951536dedc8 1100w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?w=1650&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=5428362058ab8e8f4f71009779e3134f 1650w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-menu.png?w=2500&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=a31530b7bd16f2fe6f8c6b622f55d9de 2500w" />

1. Navigate to your dataset in the **Datasets & Experiments** page.
2. Click **⋮** (three-dot menu) in the top right corner.
3. Select **Custom Output Rendering**.
4. Toggle **Enable custom output rendering**.
5. Enter the webpage URL in the **URL** field.
6. Click **Save**.

<img src="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=bffd3b40ca14bbebc05c998d1cb5fa7e" alt="Custom Output Rendering modal with fields filled in" data-og-width="3456" width="3456" data-og-height="2156" height="2156" data-path="langsmith/images/custom-output-rendering-modal.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?w=280&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=2fd8bdfcce6fdf5e9bd865590a8e0f79 280w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?w=560&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=c110a35a03909e146be961ac5386c888 560w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?w=840&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=1959a6a045a04e0d3290de4379a358bc 840w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?w=1100&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=4a154ab114547ff7e243acb36c9bd9a3 1100w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?w=1650&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=c2eae83b1d6ec160a37fd998511e9794 1650w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-modal.png?w=2500&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=025b243cf994c130439ef55f6dc5e856 2500w" />

### For annotation queues

To configure custom output rendering for an annotation queue:

<img src="https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=579aad04fa6990b220514280eef799f4" alt="Annotation queue settings showing custom output rendering configuration" data-og-width="3456" width="3456" data-og-height="1914" height="1914" data-path="langsmith/images/annotation-queue-custom-output-rendering-settings.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?w=280&fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=525d5f11afb5bf8a5fd86f5de5063d72 280w, https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?w=560&fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=0d58600ddf45ff1e7bbad5cc31489f24 560w, https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?w=840&fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=d5aaa11700f5f58bcf157dc0fd32877a 840w, https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?w=1100&fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=51794d75925516c728177e91689dbdcd 1100w, https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?w=1650&fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=cf7c47a38a18a548f627dc56e94abfde 1650w, https://mintcdn.com/langchain-5e9cc07a/optUJrLvYf4z4j5I/langsmith/images/annotation-queue-custom-output-rendering-settings.png?w=2500&fit=max&auto=format&n=optUJrLvYf4z4j5I&q=85&s=bcd50a9547635188f2e7e458af0baa46 2500w" />

1. Navigate to the **Annotation Queues** page.
2. Click on an existing annotation queue or create a new one.
3. In the annotation queue settings pane, scroll to the **Custom Output Rendering** section.
4. Toggle **Enable custom output rendering**.
5. Enter the webpage URL in the **URL** field.
6. Click **Save** or **Create**.

<Info>
  When custom rendering is configured at both levels, annotation queue configuration takes precedence over dataset configuration for runs viewed within that queue.
</Info>

## Build a custom renderer

### Understand the message format

Your HTML page will receive output data via the [postMessage API](https://developer.mozilla.org/en-US/docs/Web/API/Window/postMessage). LangSmith sends messages with the following structure:

```typescript  theme={null}
{
  type: "output" | "reference",
  data: {
    // The outputs (actual output or reference output)
    // Structure varies based on your application
  },
  metadata: {
    inputs: {
      // The inputs that generated this output
      // Structure varies based on your application
    }
  }
}
```

* `type`: Indicates whether this is an actual output (`"output"`) or a reference output (`"reference"`).
* `data`: The output data itself.
* `metadata.inputs`: The input data that generated this output, provided for context.

<Note>
  **Message delivery timing**: LangSmith uses an exponential backoff retry mechanism to ensure your page receives the data even if it loads slowly. Messages are sent up to 6 times with increasing delays (100ms, 200ms, 400ms, 800ms, 1600ms, 3200ms).
</Note>
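The retry schedule in the note can be worked through numerically. This sketch assumes the delays are waited one after another, each before its corresponding send; the source lists only the delay values, so treat the cumulative times as an estimate rather than documented behavior.

```python
from itertools import accumulate

INITIAL_DELAY_MS = 100  # first delay from the note above
MAX_ATTEMPTS = 6        # number of sends from the note above

# Each delay doubles the previous one: 100, 200, 400, 800, 1600, 3200.
delays_ms = [INITIAL_DELAY_MS * 2**attempt for attempt in range(MAX_ATTEMPTS)]

# Assumed elapsed time at each send if the delays run back to back:
# 100, 300, 700, 1500, 3100, 6300.
elapsed_ms = list(accumulate(delays_ms))

print(delays_ms)
print(elapsed_ms)
```

Under that assumption, a page that registers its `message` listener within roughly 6.3 seconds of loading would still catch the last attempt.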

### Example implementation

This example listens for incoming postMessage events and displays them on the page. Each message is numbered and formatted as JSON, making it easy to inspect the data structure LangSmith sends to your renderer.

```html  theme={null}
<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>PostMessage Echo</title>
    <link rel="stylesheet" href="https://unpkg.com/sakura.css/css/sakura.css" />
  </head>
  <body>
    <h1>PostMessage Messages</h1>
    <div id="messages"></div>
    <script>
      let count = 0;
      window.addEventListener("message", (event) => {
        count++;
        const header = document.createElement("h3");
        header.appendChild(document.createTextNode(`Message ${count}`));
        const code = document.createElement("code");
        code.appendChild(
          document.createTextNode(JSON.stringify(event.data, null, 2))
        );
        const pre = document.createElement("pre");
        pre.appendChild(code);
        document.getElementById("messages").appendChild(header);
        document.getElementById("messages").appendChild(pre);
      });
    </script>
  </body>
</html>
```

## Where custom rendering appears

When enabled, your custom rendering will replace the default output view in:

* **Experiment comparison view**: When comparing outputs across multiple experiments:

<img src="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=6f1fd9d3ca4be55aa9a0b40140771e08" alt="Experiment comparison view showing custom rendering" data-og-width="3456" width="3456" data-og-height="2156" height="2156" data-path="langsmith/images/custom-output-rendering-experiment-comparison.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?w=280&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=4010749197ec99216b6cbc004f8e2aed 280w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?w=560&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=3eb5a8825d76c92f531c34912c476fb5 560w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?w=840&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=1977c3020b1b2150beefc4980db857f9 840w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?w=1100&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=2f8f86c1ad7ff02b65f78cd43bead446 1100w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?w=1650&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=979e5426ee86673cd088aef614c52a6e 1650w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-experiment-comparison.png?w=2500&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=292b4ac1322414a059b2ba5a4285caf3 2500w" />

* **Run detail panes**: When viewing runs that are associated with a dataset:

<img src="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=abec759e27bb3dfa827354d13746cf61" alt="Run detail pane showing custom rendering" data-og-width="3456" width="3456" data-og-height="2156" height="2156" data-path="langsmith/images/custom-output-rendering-run-details.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?w=280&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=9d4d8e9e4d3e2f9e856d9da23e16d805 280w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?w=560&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=1adfc78a0ba06a3b9b18433fc2e1f82c 560w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?w=840&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=a882e8fb03c81526070c0e9f49de157b 840w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?w=1100&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=39d6a85d5578c58cf2c6853ce1d5e704 1100w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?w=1650&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=d9c37910f147b76685b018a620aa9d40 1650w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-run-details.png?w=2500&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=24538e055a2363976a7f3ce5d0111356 2500w" />

* **Annotation queues**: When reviewing runs in annotation queues:

<img src="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=8d1b66541ea7dcd0246354fca1568719" alt="Annotation queue showing custom rendering" data-og-width="3456" width="3456" data-og-height="2156" height="2156" data-path="langsmith/images/custom-output-rendering-annotation-queue.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?w=280&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=a42373180e5ab88ee80c3a670102f290 280w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?w=560&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=47be4504dd9d4a0c191cc426d7eaa7c4 560w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?w=840&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=36eda0a27d923057fbb4f0fc661e311e 840w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?w=1100&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=82593a5f663686eb61d9b90d9b1b8c39 1100w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?w=1650&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=dec8e45c1c77bd1cdd9b3aec093bf566 1650w, https://mintcdn.com/langchain-5e9cc07a/l7rhdSRpjWBkaCke/langsmith/images/custom-output-rendering-annotation-queue.png?w=2500&fit=max&auto=format&n=l7rhdSRpjWBkaCke&q=85&s=a8a978b3ec3d119fb16002d1a432c91a 2500w" />

***



# How to add custom routes
Source: https://docs.langchain.com/langsmith/custom-routes



When deploying agents to LangSmith Deployment, your server automatically exposes routes for creating runs and threads, interacting with the long-term memory store, managing configurable assistants, and other core functionality ([see all default API endpoints](/langsmith/server-api-ref)).

You can add custom routes by providing your own [`Starlette`](https://www.starlette.io/applications/) app (including [`FastAPI`](https://fastapi.tiangolo.com/), [`FastHTML`](https://fastht.ml/) and other compatible apps). You make LangSmith aware of this by providing a path to the app in your `langgraph.json` configuration file.

Defining a custom app object lets you add any routes you'd like, so you can do anything from adding a `/login` endpoint to writing an entire full-stack web-app, all deployed in a single Agent Server.

Below is an example using FastAPI.

## Create app

Starting from an **existing** LangSmith application, add the following custom route code to your `webapp.py` file. If you are starting from scratch, you can create a new app from a template using the CLI.

```bash  theme={null}
langgraph new --template=new-langgraph-project-python my_new_project
```

Once you have a LangGraph project, add the following app code:

```python {highlight={4}} theme={null}
# ./src/agent/webapp.py
from fastapi import FastAPI

app = FastAPI()


@app.get("/hello")
def read_root():
    return {"Hello": "World"}

```

## Configure `langgraph.json`

Add the following to your `langgraph.json` configuration file. Make sure the path points to the FastAPI application instance `app` in the `webapp.py` file you created above.

```json  theme={null}
{
  "dependencies": ["."],
  "graphs": {
    "agent": "./src/agent/graph.py:graph"
  },
  "env": ".env",
  "http": {
    "app": "./src/agent/webapp.py:app"
  }
  // Other configuration options like auth, store, etc.
}
```
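Because a wrong path in `http.app` is an easy mistake, a small preflight check can help. The sketch below splits a reference of the form `path/to/file.py:attribute` and confirms the file exists. `split_app_reference` and `resolve_http_app` are hypothetical helpers, not part of the LangGraph CLI, and this is not how the server itself loads the app.

```python
from pathlib import Path


def split_app_reference(reference: str) -> tuple[str, str]:
    """Split "path/to/file.py:attribute" into file path and attribute name."""
    file_path, separator, attribute = reference.rpartition(":")
    if not separator or not file_path or not attribute:
        raise ValueError(f"expected 'path/to/file.py:attribute', got {reference!r}")
    return file_path, attribute


def resolve_http_app(config: dict, project_root: Path) -> tuple[Path, str]:
    """Return the resolved file and attribute named by config["http"]["app"]."""
    file_path, attribute = split_app_reference(config["http"]["app"])
    resolved = (project_root / file_path).resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"http.app points to a missing file: {resolved}")
    return resolved, attribute


# The same "http" entry as in the configuration above, as a Python dict.
config = {"http": {"app": "./src/agent/webapp.py:app"}}
print(split_app_reference(config["http"]["app"]))
```

If you load the configuration with `json.loads`, remove the `//` comment line from the sample first, since Python's `json` module rejects comments.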

## Start server

Test the server out locally:

```bash  theme={null}
langgraph dev --no-browser
```

If you navigate to `localhost:2024/hello` in your browser (`2024` is the default development port), you should see the `/hello` endpoint returning `{"Hello": "World"}`.
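The same check can be scripted instead of done in a browser. This is a minimal stdlib sketch; `fetch_json` is a hypothetical helper name, and the URL in the usage note assumes the default development port mentioned above.

```python
import json
import urllib.request


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the development server running, calling `fetch_json("http://localhost:2024/hello")` should return the dictionary produced by the `/hello` route.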

<Note>
  **Shadowing default endpoints**
  The routes you create in the app are given priority over the system defaults, meaning you can shadow and redefine the behavior of any default endpoint.
</Note>

## Deploying

You can deploy this app as-is to LangSmith or to your self-hosted platform.

## Next steps

Now that you've added a custom route to your deployment, you can use this same technique to further customize how your server behaves, such as defining [custom middleware](/langsmith/custom-middleware) and [custom lifespan events](/langsmith/custom-lifespan).

***



# Monitor projects with dashboards
Source: https://docs.langchain.com/langsmith/dashboards



Dashboards give you high-level insights into your trace data, helping you spot trends and monitor the health of your applications. Dashboards are available in the **Monitoring** tab in the left sidebar.

LangSmith offers two dashboard types:

* **Prebuilt dashboards**: Automatically generated for every tracing project.
* **Custom dashboards**: Fully configurable collections of charts tailored to your needs.

## Prebuilt dashboards

Prebuilt dashboards are created automatically for each project and cover essential metrics, such as trace count, error rates, token usage, and more. By default, the prebuilt dashboard for your tracing project can be accessed using the **Dashboard** button on the top right of the tracing project page.

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/prebuilt.gif?s=cbb7410db7c9036ed6c03af251f13a99" alt="prebuilt" data-og-width="1392" width="1392" data-og-height="1080" height="1080" data-path="langsmith/images/prebuilt.gif" data-optimize="true" data-opv="3" />

<Note>**Prebuilt dashboards cannot be modified. In the future, we plan to let you clone a prebuilt dashboard as a starting point for customization.**</Note>

### Dashboard sections

Prebuilt dashboards are broken down into the following sections:

| Section         | What it shows                                                                                                                                                                                                                                                                                                    |
| :-------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Traces          | Trace count, latency and error rates. A [trace](/langsmith/observability-concepts#traces) is a collection of [runs](/langsmith/observability-concepts#runs) related to a single operation. For example, if a user request triggers an agent, all runs for that agent invocation would be part of the same trace. |
| LLM Calls       | LLM call count and latency. Includes all runs where run type is "llm".                                                                                                                                                                                                                                           |
| Cost & Tokens   | Total and per-trace token counts and costs, broken down by token type. Costs are measured using [LangSmith's cost tracking](/langsmith/log-llm-trace#manually-provide-token-counts).                                                                                                                             |
| Tools           | Run counts, error rates, and latency stats for tool runs broken down by tool name. Includes runs where run type is "tool". Limits to top 5 most frequently occurring tools.                                                                                                                                      |
| Run Types       | Run counts, error rates, and latency stats for runs that are immediate children of the root run. This helps in understanding the high-level execution path of agents. Limits to top 5 most frequently occurring run names. Refer to the image following this table.                                              |
| Feedback Scores | Aggregate stats for the top 5 most frequently occurring types of feedback. Charts show average score for numerical feedback and category counts for categorical feedback.                                                                                                                                        |

Immediate children of the root run have a depth of 1. For example, the following trace shows which runs have a depth of 1:

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=382b7d6064e09c23efc6770fcd983a69" alt="" data-og-width="524" width="524" data-og-height="810" height="810" data-path="langsmith/images/run-depth-explained.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=03503ce7bfd170a22ff5152a7564e130 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=a84d5785aaa93bb3e8ab7d2a4143f063 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=5d6df2d7b6dbe6e358ad2f03df341661 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=74abfc6a192c9a433dcf62b01a571594 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=9e3089b4311d828f80594649647f85bd 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/run-depth-explained.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=c6e3a31aded7c2ad7ae96f948613c983 2500w" />

### Group by

Group by [run tag or metadata](/langsmith/add-metadata-tags) to split data over attributes that are important to your application. The global group by setting appears on the top right-hand side of the dashboard. The Tools and Run Types charts already have a group by applied, so the global group by does not affect them; it applies to all other charts.

<Note>When adding metadata to runs, we recommend attaching the same metadata to both the trace (root run) and the specific run (for example, an LLM call). Metadata and tags are not propagated from parent to child runs, or vice versa. To see both your trace charts and your LLM call charts grouped by the same metadata key, attach that metadata to both your traces (root runs) and your LLM runs.</Note>
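
The effect of missing metadata can be illustrated with a small, self-contained sketch. The run dictionaries below are hypothetical and heavily simplified, and the `(missing)` bucket is only an illustration of what happens when a run lacks the key; it is not a description of how the dashboard labels such runs.

```python
from collections import defaultdict

# Hypothetical, simplified runs: one root run (the trace) and two LLM child runs.
runs = [
    {"name": "agent", "run_type": "chain", "is_root": True, "metadata": {"env": "prod"}},
    {"name": "ChatOpenAI", "run_type": "llm", "is_root": False, "metadata": {"env": "prod"}},
    {"name": "ChatOpenAI", "run_type": "llm", "is_root": False, "metadata": {}},
]


def group_counts(runs, key, predicate):
    """Count the runs matching `predicate`, grouped by their own metadata[key]."""
    counts = defaultdict(int)
    for run in runs:
        if predicate(run):
            # No propagation: a run is grouped only by metadata attached to it directly.
            counts[run["metadata"].get(key, "(missing)")] += 1
    return dict(counts)


trace_groups = group_counts(runs, "env", lambda r: r["is_root"])
llm_groups = group_counts(runs, "env", lambda r: r["run_type"] == "llm")
print(trace_groups)
print(llm_groups)
```

Here the second LLM run never received the `env` key, so it cannot be grouped with the `prod` runs even though its parent trace carries that metadata.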

## Custom dashboards

Create tailored collections of charts for tracking metrics that matter most for your application.

### Creating a new dashboard

1. Navigate to the **Monitoring** tab in the left sidebar.
2. Click on the **+ New Dashboard** button.
3. Give your dashboard a name and a description.
4. Click on **Create**.

### Adding charts to your dashboard

1. Within a dashboard, click on the **+ New Chart** button to open up the chart creation pane.
2. Give your chart a name and a description.
3. Configure the chart.

### Chart configuration

#### Select tracing projects and filter runs

* Select one or more tracing projects to track metrics for.
* Use the **Chart filters** section to refine the matching runs. This filter applies to all data series in the chart. For more information on filtering traces, view our guide on [filtering traces in application](./filter-traces-in-application).

#### Pick a metric

* Choose a metric from the dropdown menu to set the y-axis of your chart. With a project and a metric selected, you'll see a preview of your chart and the matching runs.
* For certain metrics (such as latency, token usage, cost), we support comparing multiple metrics with the same unit. For example, you may want one chart where you can see prompt tokens and completion tokens. Each metric appears as a separate line.

<img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=3543d454f25f5d11046fbd5bcab7aeff" alt="Multiple metrics" data-og-width="1475" width="1475" data-og-height="741" height="741" data-path="langsmith/images/compare-metrics.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=75f80a706d3e3ddd09117bc9d317ecdc 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=2606ae254c47349e2c5bd14ae4fc49b8 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=4239ab64861d8e3768f90f837dbefe67 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=0e8d5f3e7fcf40fe6f33b6066afa346f 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=d45519451d443538ad76fdcbb0d21f62 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/compare-metrics.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=33024dfb462b8229b16c4fbd9cc650ba 2500w" />

#### Split the data

There are two ways to create multiple series in a chart (i.e. create multiple lines in a chart):

1. **Group by**: Group runs by [run tag or metadata](/langsmith/add-metadata-tags), run name, or run type. Group by automatically splits the data into multiple series based on the field selected. Note that group by is limited to the top 5 elements by frequency.

2. **Data series**: Manually define multiple series with individual filters. This is useful for comparing granular data within a single metric.

<img src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=e5ed30b317abdb90aea612702d94cf04" alt="Multiple data series" data-og-width="2796" width="2796" data-og-height="1396" height="1396" data-path="langsmith/images/multiple-data-series.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=75134cada0ab88b532d6073c3317dc36 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=05a0b345dd25435cae0c6e25fee62942 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=aede683fd53d369cbfe7c1649d128953 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=e96e54ecf149d6a5881bc44d81941220 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=fee4a18237d105668c227eafb014d0d5 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/multiple-data-series.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=6a93006b73bc0451c68acf10189155a4 2500w" />

#### Pick a chart type

* Choose between a line chart and a bar chart to visualize your data.

### Save and manage charts

* Click **Save** to save your chart to the dashboard.
* Edit or delete a chart by clicking the three-dot button in the top right of the chart.
* Clone a chart by clicking the triple line button in the top right of the chart and selecting **+ Clone**. This will open a new chart creation pane with the same configurations as the original.

<img src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=b40f5f132a2c0770cfe04a125dcbc7f2" alt="More actions bar" data-og-width="2102" width="2102" data-og-height="758" height="758" data-path="langsmith/images/more-actions-bar.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=b4033fc996ce15ac3eee9e42430f3089 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=0176aa9470908b2ee287c6588f8898ae 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=38af0f07ca3394b40dbdd4f6d83af5ba 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=b41c74e251946e2f6f31739c71a42096 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=c5a6f79532146e2596857be1f85f342a 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/more-actions-bar.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=623e4b00a5a04e950e8312cf2a2233b6 2500w" />

<img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=d0c5597d6adfa7d02888ef4b89ac616d" alt="Expanded chart" data-og-width="2238" width="2238" data-og-height="1662" height="1662" data-path="langsmith/images/expanded-chart.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=73fa53c7cdb5a89eaac5e685035bfc82 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=df3f46036e1692777f6c0d460b78ca0c 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=9c4abdd5d8cf517a9e537d09f6d6e761 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=91a2ff0f83ad36710058479fc5f30698 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=7f16a10cf75c87fb38f3cf6db39e1fe4 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/expanded-chart.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=97d6ff4e7841cdf169c4e069605c7089 2500w" />

## Linking to a dashboard from a tracing project

You can link to any dashboard directly from a tracing project. By default, the prebuilt dashboard for your tracing project is selected. If you have a custom dashboard that you would like to link instead:

1. In your tracing project, click the three dots next to the **Dashboard** button.
2. Choose a dashboard to set as the new default.

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=06b63ab9c4f8d7185c72a77e84862f3a" alt="Tracing project to dashboard" data-og-width="2080" width="2080" data-og-height="770" height="770" data-path="langsmith/images/tracing-project-to-dashboard.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=b06a5e1a4fe6b82260032624bdc1ce68 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=ea5dd2397337b290c996ef26948836ea 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=26832107ff0613d0575432fc4f2424d6 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=2ace3a16bd11cd8a8906bbdab77c3f10 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=c5b29e7497d837a0d84b97f5c3c522a6 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/tracing-project-to-dashboard.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=6055757e9ac1f5fd9381d14bd92cf088 2500w" />

## Example: user-journey monitoring

Use monitoring charts for mapping the decisions made by an agent at a particular node.

Consider an email assistant agent. At a particular node, it decides how to handle an incoming email:

* send an email back
* notify the user
* do nothing, because no response is needed

We can create a chart to track and visualize the breakdown of these decisions.

**Creating the chart**

1. **Metric Selection**: Select the metric `Run count`.

2. **Chart Filters**: Add a tree filter to include all of the traces with name `triage_input`. This means we only include traces that hit the `triage_input` node. Also add a chart filter for `Is Root` is `true`, so our count is not inflated by the number of nodes in the trace.
   <img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=f769fb23624b3b1b6c042ff5cfd910e6" alt="Decision at node" data-og-width="2620" width="2620" data-og-height="1698" height="1698" data-path="langsmith/images/chart-filters-for-node-decision.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=e09f22192fa9cb75b32f6812071e1e07 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=e3108ae084660ecc151a68a47fab6912 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=fb75ff0f6c0e9140bc4554cceb93134b 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=704fb96e453fdd92dcce1e34a35a48f2 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=902d095d27fb3f20b8e403f4ddc18e24 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chart-filters-for-node-decision.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=9afff6512a968017060e6fde0a5fb992 2500w" />

3. **Data Series**: Create a data series for each decision made at the `triage_input` node. The output of the decision is stored in the `triage.response` field of the output object, and the value of the decision is either `no`, `email`, or `notify`. Each of these decisions generates a separate data series in the chart.
   <img src="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=eb98a2c2c7988b5b6c5c3db9740ed172" alt="Decision at node" data-og-width="2578" width="2578" data-og-height="1692" height="1692" data-path="langsmith/images/decision-at-node.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?w=280&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=665c6cf0c3368c8f180db15cc7bf4cba 280w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?w=560&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=e68f21187ec95e74e033850200893c66 560w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?w=840&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=33f729b74f697b7b7554badb6c76cec0 840w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?w=1100&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=1ebed1b633a724ed5c9e2b93d62b1c2e 1100w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?w=1650&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=8d3287007e200f37151955edd6cc2632 1650w, https://mintcdn.com/langchain-5e9cc07a/aKRoUGXX6ygp4DlC/langsmith/images/decision-at-node.png?w=2500&fit=max&auto=format&n=aKRoUGXX6ygp4DlC&q=85&s=ee0d68172a80dce21c1f1afad533298d 2500w" />

Now we can visualize the decisions made at the `triage_input` node over time.
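
To make the chart's arithmetic concrete, the following sketch counts hypothetical root runs per day for each decision value, which is roughly what the three data series plot. The run shape (a `day` string and a nested `outputs` dictionary) is an assumption made for illustration and is not the LangSmith run schema.

```python
from collections import Counter

# Hypothetical root runs from traces that hit the `triage_input` node.
root_runs = [
    {"day": "2025-01-01", "outputs": {"triage": {"response": "email"}}},
    {"day": "2025-01-01", "outputs": {"triage": {"response": "no"}}},
    {"day": "2025-01-02", "outputs": {"triage": {"response": "notify"}}},
    {"day": "2025-01-02", "outputs": {"triage": {"response": "email"}}},
    {"day": "2025-01-02", "outputs": {"triage": {"response": "email"}}},
]

# The three decision values named in the example above.
DECISIONS = ("no", "email", "notify")


def decision_series(runs):
    """Return {decision: Counter({day: run_count})}, one entry per data series."""
    series = {decision: Counter() for decision in DECISIONS}
    for run in runs:
        decision = run["outputs"]["triage"]["response"]
        if decision in series:
            series[decision][run["day"]] += 1
    return series


series = decision_series(root_runs)
for decision, per_day in series.items():
    print(decision, dict(per_day))
```

Each key in `series` corresponds to one manually defined data series, and each per-day count corresponds to one point on that series' line.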

## Video guide

<iframe className="w-full aspect-video rounded-xl" src="https://www.youtube.com/embed/VxsIvf9NdxI?si=7ksp9qyw-i0lcwxg" title="YouTube video player" frameBorder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowFullScreen />



# Bulk Exporting Trace Data
Source: https://docs.langchain.com/langsmith/data-export



<Info>
  **Plan restrictions apply**

  Data export is only supported on the [LangSmith Plus or Enterprise tiers](https://www.langchain.com/pricing-langsmith).
</Info>

LangSmith's bulk data export functionality allows you to export your traces to an external destination. This can be useful if you want to analyze the
data offline in a tool such as BigQuery, Snowflake, Redshift, or Jupyter Notebooks.

An export can be launched to target a specific LangSmith project and date range. Once a batch export is launched, our system will handle the orchestration and resilience of the export process.
Exporting may take some time, depending on the size of your data. There is also a limit on how many of your exports can run at the same time.
Bulk exports also have a runtime timeout of 24 hours.

## Destinations

Currently, we support exporting to an S3 bucket or an S3 API-compatible bucket that you provide. The data is exported in
[Parquet](https://parquet.apache.org/docs/overview/) columnar format, which makes it easy to import the data into
other systems. The export contains the same data fields as the [Run data format](/langsmith/run-data-format).

## Exporting Data

### Destinations - Providing an S3 bucket

To export LangSmith data, you will need to provide an S3 bucket where the data will be exported to.

The following information is needed for the export:

* **Bucket Name**: The name of the S3 bucket where the data will be exported to.
* **Prefix**: The root prefix within the bucket where the data will be exported to.
* **S3 Region**: The region of the bucket - this is needed for AWS S3 buckets.
* **Endpoint URL**: The endpoint URL for the S3 bucket - this is needed for S3 API compatible buckets.
* **Access Key**: The access key for the S3 bucket.
* **Secret Key**: The secret key for the S3 bucket.

We support any S3-compatible bucket. For non-AWS buckets, such as GCS or MinIO, you will need to provide the endpoint URL.

### Preparing the Destination

<Note>
  **For self-hosted and EU region deployments**

  Update the LangSmith URL appropriately for self-hosted installations or organizations in the EU region in the requests below.
  For the EU region, use `eu.api.smith.langchain.com`.
</Note>

<Note>
  **Permissions required**

  Both the `backend` and `queue` services require write access to the destination bucket:

  * The `backend` service attempts to write a test file to the destination bucket when the export destination is created.
    It will delete the test file if it has permission to do so (delete access is optional).
  * The `queue` service is responsible for bulk export execution and uploading the files to the bucket.
</Note>

The following example demonstrates how to create a destination using cURL. Replace the placeholder values with your actual configuration details.
Note that credentials will be stored securely in an encrypted form in our system.

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/destinations' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "destination_type": "s3",
    "display_name": "My S3 Destination",
    "config": {
      "bucket_name": "your-s3-bucket-name",
      "prefix": "root_folder_prefix",
      "region": "your aws s3 region",
      "endpoint_url": "your endpoint url for s3 compatible buckets"
    },
    "credentials": {
      "access_key_id": "YOUR_S3_ACCESS_KEY_ID",
      "secret_access_key": "YOUR_S3_SECRET_ACCESS_KEY"
    }
  }'
```

Use the returned `id` to reference this destination in subsequent bulk export operations.

**If you receive an error while creating a destination, see [debug destination errors](#debugging-destination-errors) for details on how to debug this.**

#### Credentials configuration

<Note>**Requires LangSmith Helm version >= `0.10.34` (application version >= `0.10.91`)**</Note>

We support the following additional credentials formats besides static `access_key_id` and `secret_access_key`:

* To use [temporary credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html) that include an AWS session token,
  additionally provide the `credentials.session_token` key when creating the bulk export destination.
* (Self-hosted only): To use environment-based credentials such as with [AWS IAM Roles for Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) (IRSA),
  omit the `credentials` key from the request when creating the bulk export destination.
  In this case, the [standard Boto3 credentials locations](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#credentials) will be checked in the order defined by the library.
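
The three credential variants differ only in the `credentials` part of the request body. The sketch below builds that body in Python; the function name `build_destination_body` and the placeholder values are illustrative assumptions, while the field names are the ones used in the requests on this page.

```python
import json


def build_destination_body(display_name, bucket_name, prefix, *,
                           region=None, endpoint_url=None,
                           access_key_id=None, secret_access_key=None,
                           session_token=None):
    """Build the JSON body for creating a destination (hypothetical helper)."""
    config = {"bucket_name": bucket_name, "prefix": prefix}
    if region:
        config["region"] = region  # needed for AWS S3 buckets
    if endpoint_url:
        config["endpoint_url"] = endpoint_url  # needed for S3 API compatible buckets

    body = {"destination_type": "s3", "display_name": display_name, "config": config}

    # Leaving out `credentials` entirely selects environment-based credentials
    # (self-hosted only), as described above.
    if access_key_id and secret_access_key:
        credentials = {
            "access_key_id": access_key_id,
            "secret_access_key": secret_access_key,
        }
        if session_token:
            credentials["session_token"] = session_token  # temporary credentials
        body["credentials"] = credentials
    return body


static = build_destination_body(
    "My AWS S3 Destination", "my-bucket", "data_exports", region="us-east-1",
    access_key_id="YOUR_S3_ACCESS_KEY_ID",
    secret_access_key="YOUR_S3_SECRET_ACCESS_KEY",
)
temporary = build_destination_body(
    "My AWS S3 Destination", "my-bucket", "data_exports", region="us-east-1",
    access_key_id="YOUR_S3_ACCESS_KEY_ID",
    secret_access_key="YOUR_S3_SECRET_ACCESS_KEY",
    session_token="YOUR_SESSION_TOKEN",
)
env_based = build_destination_body(
    "My AWS S3 Destination", "my-bucket", "data_exports", region="us-east-1",
)
print(json.dumps(temporary, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the `--data` payload of the destination request.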

#### AWS S3 bucket

For AWS S3, you can leave off the `endpoint_url` and supply the region that matches the region of your bucket.

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/destinations' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "destination_type": "s3",
    "display_name": "My AWS S3 Destination",
    "config": {
      "bucket_name": "my-bucket",
      "prefix": "data_exports",
      "region": "us-east-1"
    },
    "credentials": {
      "access_key_id": "YOUR_S3_ACCESS_KEY_ID",
      "secret_access_key": "YOUR_S3_SECRET_ACCESS_KEY"
    }
  }'
```

#### Google GCS XML S3 compatible bucket

When using a Google Cloud Storage (GCS) bucket, you need to use the S3-compatible XML API and supply the `endpoint_url`,
which is typically `https://storage.googleapis.com`.
Here is an example of the API request when using the GCS XML API, which is compatible with S3:

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/destinations' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "destination_type": "s3",
    "display_name": "My GCS Destination",
    "config": {
      "bucket_name": "my_bucket",
      "prefix": "data_exports",
      "endpoint_url": "https://storage.googleapis.com"
    },
    "credentials": {
      "access_key_id": "YOUR_S3_ACCESS_KEY_ID",
      "secret_access_key": "YOUR_S3_SECRET_ACCESS_KEY"
    }
  }'
```

See the [Google documentation](https://cloud.google.com/storage/docs/interoperability#xml_api) for more information.

### Create an export job

To export data, you will need to create an export job. This job will specify the destination, the project, the date range, and filter expression of the data to export. The filter expression is used to narrow down the set of runs exported and is optional. Not setting the filter field will export all runs. Refer to our [filter query language](/langsmith/trace-query-syntax#filter-query-language) and [examples](/langsmith/export-traces#use-filter-query-language) to determine the correct filter expression for your export.

You can use the following cURL command to create the job:

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-01-02T23:59:59Z",
    "filter": "and(eq(run_type, \"llm\"), eq(name, \"ChatOpenAI\"), eq(input_key, \"messages.content\"), like(input_value, \"%messages.content%\"))"
  }'
```

<Note>
  The `session_id` is also known as the Tracing Project ID, which can be copied from the individual project view by clicking into the project in the Tracing Projects list.
</Note>

Use the returned `id` to reference this export in subsequent bulk export operations.
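
If you prefer to assemble the request in Python, a minimal sketch with the standard library could look like this. It only builds the request and does not send it. The helper names `to_utc_string` and `build_export_request` are illustrative, and the single-condition filter is just an example expression; the URL, headers, and body fields mirror the cURL command above.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request

API_URL = "https://api.smith.langchain.com/api/v1/bulk-exports"


def to_utc_string(moment: datetime) -> str:
    """Format a timezone-aware datetime in the `YYYY-MM-DDTHH:MM:SSZ` form used above."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_export_request(api_key, workspace_id, destination_id, project_id,
                         start, end, filter_expr=None) -> Request:
    """Build (but do not send) the POST request that creates an export job."""
    body = {
        "bulk_export_destination_id": destination_id,
        "session_id": project_id,  # the Tracing Project ID
        "start_time": to_utc_string(start),
        "end_time": to_utc_string(end),
    }
    if filter_expr is not None:
        body["filter"] = filter_expr  # optional; leaving it out exports all runs
    headers = {
        "Content-Type": "application/json",
        "X-API-Key": api_key,
        "X-Tenant-Id": workspace_id,
    }
    return Request(API_URL, data=json.dumps(body).encode("utf-8"),
                   headers=headers, method="POST")


request = build_export_request(
    "YOUR_API_KEY", "YOUR_WORKSPACE_ID", "your_destination_id", "project_uuid",
    start=datetime(2024, 1, 1, tzinfo=timezone.utc),
    end=datetime(2024, 1, 2, 23, 59, 59, tzinfo=timezone.utc),
    filter_expr='eq(run_type, "llm")',
)
payload = json.loads(request.data)
print(json.dumps(payload, indent=2))
```

Passing `request` to `urllib.request.urlopen` would submit the job. Building the body with `json.dumps` also avoids the manual quote escaping that the `filter` string needs in a shell command.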

#### Limiting exported fields

<Note>
  Requires LangSmith Helm version >= `0.12.11` (application version >= `0.12.42`). This feature **is supported** in [scheduled bulk exports](#scheduled-exports) and [standard bulk exports](#create-an-export-job).
</Note>

You can improve bulk export speed and reduce row size by limiting which fields are included in the exported Parquet files using the `export_fields` parameter. When `export_fields` is provided, only the specified fields are exported as columns in the Parquet files. When `export_fields` is not provided, all exportable fields are included.

This is particularly useful when you want to exclude larger fields like `inputs` and `outputs`.

The following example creates an export job that only includes specific fields:

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "end_time": "2024-01-02T23:59:59Z",
    "export_fields": ["id", "name", "run_type", "start_time", "end_time", "status", "total_tokens", "total_cost"]
  }'
```

The `export_fields` parameter accepts an array of field names. Available fields include the [Run data format](/langsmith/run-data-format) fields as well as additional export-only fields:

* `tenant_id`
* `is_root`

<Tip>
  **Performance tip**: Excluding `inputs` and `outputs` from your export can significantly improve export performance and reduce file sizes, especially for large runs. Only include these fields if you need them for your analysis.
</Tip>

### Scheduled exports

<Note>
  Requires LangSmith Helm version >= `0.10.42` (application version >= `0.10.109`)
</Note>

Scheduled exports collect runs periodically and export to the configured destination.
To create a scheduled export, include `interval_hours` and remove `end_time`:

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "filter": "and(eq(run_type, \"llm\"), eq(name, \"ChatOpenAI\"), eq(input_key, \"messages.content\"), like(input_value, \"%messages.content%\"))",
    "interval_hours": 1
  }'
```

You can also use `export_fields` with scheduled exports to limit which fields are exported:

```bash  theme={null}
curl --request POST \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "bulk_export_destination_id": "your_destination_id",
    "session_id": "project_uuid",
    "start_time": "2024-01-01T00:00:00Z",
    "interval_hours": 1,
    "export_fields": ["id", "name", "run_type", "start_time", "end_time", "status", "total_tokens", "total_cost"]
  }'
```

**Details**

* `interval_hours` must be between 1 hour and 168 hours (1 week) inclusive.
* For spawned exports, the first time range exported is `start_time=(scheduled_export_start_time), end_time=(start_time + interval_hours)`.
  Then `start_time=(previous_export_end_time), end_time=(this_export_start_time + interval_hours)`, and so on.
* `end_time` must be omitted for scheduled exports. `end_time` is still required for non-scheduled exports.
* Scheduled exports can be stopped by [cancelling the export](#stop-an-export).
  * Exports that have been spawned by a scheduled export have the `source_bulk_export_id` attribute filled.
  * Spawned bulk exports must be cancelled separately from the source scheduled bulk export:
    cancelling the source bulk export **does not** cancel the spawned bulk exports.
* Spawned exports run at `end_time + 10 minutes` to account for any runs that are submitted with `end_time` in the recent past.

**Example**

If a scheduled bulk export is created with `start_time=2025-07-16T00:00:00Z` and `interval_hours=6`:

| Export | Start Time           | End Time             | Runs At              |
| ------ | -------------------- | -------------------- | -------------------- |
| 1      | 2025-07-16T00:00:00Z | 2025-07-16T06:00:00Z | 2025-07-16T06:10:00Z |
| 2      | 2025-07-16T06:00:00Z | 2025-07-16T12:00:00Z | 2025-07-16T12:10:00Z |
| 3      | 2025-07-16T12:00:00Z | 2025-07-16T18:00:00Z | 2025-07-16T18:10:00Z |
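
The window arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not LangSmith's implementation: the function name `scheduled_windows` and the `count` parameter are hypothetical, while the 10-minute delay and the 1 to 168 hour bounds come from the details listed earlier.

```python
from datetime import datetime, timedelta, timezone

# Spawned exports run at end_time + 10 minutes (from the details above).
RUN_DELAY = timedelta(minutes=10)


def scheduled_windows(start_time, interval_hours, count):
    """Return (start, end, runs_at) tuples for the first `count` spawned exports."""
    if not 1 <= interval_hours <= 168:
        raise ValueError("interval_hours must be between 1 and 168 inclusive")
    step = timedelta(hours=interval_hours)
    windows = []
    window_start = start_time
    for _ in range(count):
        window_end = window_start + step
        windows.append((window_start, window_end, window_end + RUN_DELAY))
        # The next export starts where the previous one ended.
        window_start = window_end
    return windows


start = datetime(2025, 7, 16, tzinfo=timezone.utc)
for window in scheduled_windows(start, interval_hours=6, count=3):
    print(" | ".join(t.strftime("%Y-%m-%dT%H:%M:%SZ") for t in window))
```

Running the sketch with `start_time=2025-07-16T00:00:00Z` and `interval_hours=6` prints the same three rows as the table.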

## Monitoring the Export Job

### Monitor Export Status

To monitor the status of an export job, use the following cURL command:

```bash  theme={null}
curl --request GET \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/{export_id}' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID'
```

Replace `{export_id}` with the ID of the export you want to monitor. This command retrieves the current status of the specified export job.

### List Runs for an Export

An export is typically broken up into multiple runs, each of which corresponds to a specific date partition.
To list all runs associated with a specific export, use the following cURL command:

```bash  theme={null}
curl --request GET \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/{export_id}/runs' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID'
```

This command fetches all runs related to the specified export, providing details such as run ID, status, creation time, rows exported, etc.

### List All Exports

To retrieve a list of all export jobs, use the following cURL command:

```bash  theme={null}
curl --request GET \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID'
```

This command returns a list of all export jobs along with their current statuses and creation timestamps.

### Stop an Export

To stop an existing export, use the following cURL command:

```bash  theme={null}
curl --request PATCH \
  --url 'https://api.smith.langchain.com/api/v1/bulk-exports/{export_id}' \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: YOUR_API_KEY' \
  --header 'X-Tenant-Id: YOUR_WORKSPACE_ID' \
  --data '{
    "status": "Cancelled"
}'
```

Replace `{export_id}` with the ID of the export you wish to cancel. A job cannot be restarted once it has been cancelled.
You will need to create a new export job instead.

## Partitioning Scheme

Data is exported into your bucket in the following Hive-partitioned format:

```
<bucket>/<prefix>/export_id=<export_id>/tenant_id=<tenant_id>/session_id=<session_id>/runs/year=<year>/month=<month>/day=<day>
```
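
When post-processing exported files, it can help to recover the partition values from an object path. The following is a minimal sketch that assumes only the `key=value` segment layout shown above; the helper name `parse_hive_partitions` and the example path values are hypothetical.

```python
def parse_hive_partitions(path):
    """Extract key=value path segments into a dict, ignoring other segments."""
    partitions = {}
    for segment in path.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


# Hypothetical object path; every value here is a placeholder.
example = (
    "my-bucket/my-prefix/export_id=abc/tenant_id=t1/session_id=s1"
    "/runs/year=2024/month=1/day=2"
)
print(parse_hive_partitions(example))
```

Values are returned as strings, so the sketch makes no assumption about how `month` and `day` are padded in real exports.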

## Importing Data into other systems

Importing data from S3 and Parquet format is commonly supported by the majority of analytical systems. See below for documentation links:

### BigQuery

To import your data into BigQuery, see [Loading Data from Parquet](https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-parquet) and also
[Hive Partitioned loads](https://cloud.google.com/bigquery/docs/hive-partitioned-loads-gcs).

### Snowflake

You can load data into Snowflake from S3 by following the [Load from Cloud Document](https://docs.snowflake.com/en/user-guide/tutorials/load-from-cloud-tutorial).

### Redshift

You can COPY data from S3 or Parquet into Amazon Redshift by following the [AWS COPY command documentation](https://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html).

### ClickHouse

You can directly query data in S3 / Parquet format in ClickHouse. As an example, if using GCS, you can query the data as follows:

```sql  theme={null}
SELECT count(distinct id) FROM s3('https://storage.googleapis.com/<bucket>/<prefix>/export_id=<export_id>/**',
 'access_key_id', 'access_secret', 'Parquet')
```

See [ClickHouse S3 Integration Documentation](https://clickhouse.com/docs/en/engines/table-engines/integrations/s3) for more information.

### DuckDB

You can query the data from S3 in-memory with SQL using DuckDB. See [S3 import Documentation](https://duckdb.org/docs/guides/network_cloud_storage/s3_import.html).

## Error Handling

### Debugging Destination Errors

The destinations API endpoint validates that the destination and credentials are valid and that write access
is present for the bucket.

If you receive an error and would like to debug it, you can use the [AWS CLI](https://aws.amazon.com/cli/)
to test connectivity to the bucket. You should be able to write a file with the CLI using the same
values that you supplied to the destinations API above.

**AWS S3:**

```bash  theme={null}
aws configure

# When prompted, enter the same access key credentials and region as you used for the destination:
#   AWS Access Key ID: <access_key_id>
#   AWS Secret Access Key: <secret_access_key>
#   Default region name [us-east-1]: <region>

# List buckets
aws s3 ls /

# test write permissions
touch ./test.txt
aws s3 cp ./test.txt s3://<bucket-name>/tmp/test.txt
```

**GCS Compatible Buckets:**

You will need to supply the `endpoint_url` with the `--endpoint-url` option.
For GCS, the `endpoint_url` is typically `https://storage.googleapis.com`:

```bash  theme={null}
aws configure

# When prompted, enter the same access key credentials and region as you used for the destination:
#   AWS Access Key ID: <access_key_id>
#   AWS Secret Access Key: <secret_access_key>
#   Default region name [us-east-1]: <region>

# List buckets
aws s3 --endpoint-url=<endpoint_url> ls /

# test write permissions
touch ./test.txt
aws s3 --endpoint-url=<endpoint_url> cp ./test.txt s3://<bucket-name>/tmp/test.txt
```

### Monitoring Runs

You can monitor your runs using the [List Runs API](#list-runs-for-an-export). If a run fails with a known error, the error is added to the `errors` field of the run.

### Common Errors

Here are some common errors:

| Error                              | Description                                                                                                                                                                                                                                                                                              |
| ---------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Access denied                      | The blob store credentials or bucket are not valid. This error occurs when the provided access key and secret key combination doesn't have the necessary permissions to access the specified bucket or perform the required operations.                                                                  |
| Bucket is not valid                | The specified blob store bucket is not valid. This error is thrown when the bucket doesn't exist or there is not enough access to perform writes on the bucket.                                                                                                                                          |
| Key ID you provided does not exist | The blob store credentials provided are not valid. This error occurs when the access key ID used for authentication is not a valid key.                                                                                                                                                                  |
| Invalid endpoint                   | The endpoint\_url provided is invalid. This error is raised when the specified endpoint is an invalid endpoint. Only S3 compatible endpoints are supported, for example `https://storage.googleapis.com` for GCS, `https://play.min.io` for minio, etc. If using AWS, you should omit the endpoint\_url. |



# LangSmith data plane
Source: https://docs.langchain.com/langsmith/data-plane



The *data plane* consists of your [Agent Servers](/langsmith/agent-server) (deployments), their supporting infrastructure, and the "listener" application that continuously polls for updates from the [LangSmith control plane](/langsmith/control-plane).

## Server infrastructure

In addition to the [Agent Server](/langsmith/agent-server) itself, the following infrastructure components for each server are also included in the broad definition of "data plane":

* **PostgreSQL**: persistence layer for user, run, and memory data.
* **Redis**: communication and ephemeral metadata for workers.
* **Secrets store**: secure management of environment secrets.
* **Autoscalers**: scale server containers based on load.

## "Listener" application

The data plane "listener" application periodically calls [control plane APIs](/langsmith/control-plane#control-plane-api) to:

* Determine if new deployments should be created.
* Determine if existing deployments should be updated (i.e. new revisions).
* Determine if existing deployments should be deleted.

In other words, the data plane "listener" reads the latest state of the control plane (desired state) and takes action to reconcile outstanding deployments (current state) to match the latest state.

## PostgreSQL

PostgreSQL is the persistence layer for all user, run, and long-term memory data in an Agent Server. It stores checkpoints (see [persistence](/oss/python/langgraph/persistence)), server resources (threads, runs, assistants, and crons), and items saved in the long-term memory store (see [memory store](/oss/python/langgraph/persistence#memory-store)).

## Redis

Redis is used in each Agent Server as a way for server and queue workers to communicate, and to store ephemeral metadata. No user or run data is stored in Redis.

### Communication

All runs in an Agent Server are executed by a pool of background workers that are part of each deployment. In order to enable some features for those runs (such as cancellation and output streaming) we need a channel for two-way communication between the server and the worker handling a particular run. We use Redis to organize that communication.

1. A Redis list is used as a mechanism to wake up a worker as soon as a new run is created. Only a sentinel value is stored in this list, no actual run information. The run information is then retrieved from PostgreSQL by the worker.
2. A combination of a Redis string and Redis PubSub channel is used for the server to communicate a run cancellation request to the appropriate worker.
3. A Redis PubSub channel is used by the worker to broadcast streaming output from an agent while the run is being handled. Any open `/stream` request in the server will subscribe to that channel and forward any events to the response as they arrive. No events are stored in Redis at any time.

### Ephemeral metadata

Runs in an Agent Server may be retried for specific failures (currently only for transient PostgreSQL errors encountered during the run). In order to limit the number of retries (currently limited to 3 attempts per run) we record the attempt number in a Redis string when it is picked up. This contains no run-specific info other than its ID, and expires after a short delay.

## Data plane features

This section describes various features of the data plane.

### Data region

<Info>
  **Only for Cloud**
  Data regions are only applicable for [Cloud](/langsmith/cloud) deployments.
</Info>

Deployments can be created in two data regions: US and EU.

The data region for a deployment is implied by the data region of the LangSmith organization where the deployment is created. Deployments and the underlying database for the deployments cannot be migrated between data regions.

### Autoscaling

[`Production` type](/langsmith/control-plane#deployment-types) deployments automatically scale up to 10 containers. Scaling is based on 3 metrics:

1. CPU utilization
2. Memory utilization
3. Number of pending (in progress) [runs](/langsmith/assistants#execution)

For CPU utilization, the autoscaler targets 75% utilization. This means the autoscaler will scale the number of containers up or down to ensure that CPU utilization is at or near 75%. For memory utilization, the autoscaler targets 75% utilization as well.

For number of pending runs, the autoscaler targets 10 pending runs. For example, if the current number of containers is 1, but the number of pending runs is 20, the autoscaler will scale up the deployment to 2 containers (20 pending runs / 2 containers = 10 pending runs per container).

Each metric is computed independently and the autoscaler will determine the scaling action based on the metric that results in the largest number of containers.
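
The pending-runs example and the "largest recommendation wins" rule can be sketched as follows. This is an illustration under stated assumptions, not the autoscaler's actual code: the function names are hypothetical, and rounding up and the floor of one container are assumptions. The target of 10 pending runs per container and the cap of 10 containers come from the text above.

```python
import math

MAX_CONTAINERS = 10        # Production deployments scale up to 10 containers
TARGET_PENDING_RUNS = 10   # target number of pending runs per container


def containers_for_pending_runs(pending_runs):
    """Containers needed so each handles at most the target number of pending runs.

    Rounding up and the minimum of one container are assumptions of this sketch.
    """
    return max(1, math.ceil(pending_runs / TARGET_PENDING_RUNS))


def desired_containers(per_metric_recommendations):
    """Pick the largest per-metric recommendation, capped at the maximum."""
    return min(MAX_CONTAINERS, max(per_metric_recommendations))


# 20 pending runs with a target of 10 per container gives 2 containers.
by_pending = containers_for_pending_runs(20)
print(by_pending)

# Hypothetical CPU and memory recommendations of 1 container each:
# the pending-runs metric wins because it asks for the most containers.
print(desired_containers([1, 1, by_pending]))
```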

Scale down actions are delayed for 30 minutes before any action is taken. In other words, if the autoscaler decides to scale down a deployment, it will first wait for 30 minutes before scaling down. After 30 minutes, the metrics are recomputed and the deployment will scale down if the recomputed metrics result in a lower number of containers than the current number. Otherwise, the deployment remains scaled up. This "cool down" period ensures that deployments do not scale up and down too frequently.

### Static IP addresses

<Info>
  **Only for Cloud**
  Static IP addresses are only available for [Cloud](/langsmith/cloud) deployments.
</Info>

All traffic from deployments created after January 6th 2025 will come through a NAT gateway. This NAT gateway will have several static IP addresses depending on the data region. Refer to the table below for the list of static IP addresses:

| US             | EU             |
| -------------- | -------------- |
| 35.197.29.146  | 34.13.192.67   |
| 34.145.102.123 | 34.147.105.64  |
| 34.169.45.153  | 34.90.22.166   |
| 34.82.222.17   | 34.147.36.213  |
| 35.227.171.135 | 34.32.137.113  |
| 34.169.88.30   | 34.91.238.184  |
| 34.19.93.202   | 35.204.101.241 |
| 34.19.34.50    | 35.204.48.32   |
| 34.59.244.194  |                |
| 34.9.99.224    |                |
| 34.68.27.146   |                |
| 34.41.178.137  |                |
| 34.123.151.210 |                |
| 34.135.61.140  |                |
| 34.121.166.52  |                |
| 34.31.121.70   |                |

### Payload size

<Info>
  **Only for Cloud**
  Payload size restrictions are only applicable to [Cloud](/langsmith/cloud) deployments.
</Info>

The maximum payload size for all requests sent to [Cloud](/langsmith/cloud) deployments is 25 MB. Attempting to send a request with a payload larger than 25 MB will result in a `413 Payload Too Large` error.

### Custom PostgreSQL

<Info>
  Custom PostgreSQL instances are only available for [hybrid](/langsmith/hybrid) and [self-hosted](/langsmith/self-hosted) deployments.
</Info>

A custom PostgreSQL instance can be used instead of the [one automatically created by the control plane](/langsmith/control-plane#database-provisioning). Specify the [`POSTGRES_URI_CUSTOM`](/langsmith/env-var#postgres-uri-custom) environment variable to use a custom PostgreSQL instance.

Multiple deployments can share the same PostgreSQL instance. For example, for `Deployment A`, `POSTGRES_URI_CUSTOM` can be set to `postgres://<user>:<password>@/<database_name_1>?host=<hostname_1>` and for `Deployment B`, `POSTGRES_URI_CUSTOM` can be set to `postgres://<user>:<password>@/<database_name_2>?host=<hostname_1>`. `<database_name_1>` and `<database_name_2>` are different databases within the same instance, but `<hostname_1>` is shared. **The same database cannot be used for separate deployments**.

### Custom Redis

<Info>
  Custom Redis instances are only available for [Hybrid](/langsmith/hybrid) and [Self-Hosted](/langsmith/self-hosted) deployments.
</Info>

A custom Redis instance can be used instead of the one automatically created by the control plane. Specify the [`REDIS_URI_CUSTOM`](/langsmith/env-var#redis-uri-custom) environment variable to use a custom Redis instance.

Multiple deployments can share the same Redis instance. For example, for `Deployment A`, `REDIS_URI_CUSTOM` can be set to `redis://<hostname_1>:<port>/1` and for `Deployment B`, `REDIS_URI_CUSTOM` can be set to `redis://<hostname_1>:<port>/2`. `1` and `2` are different database numbers within the same instance, but `<hostname_1>` is shared. **The same database number cannot be used for separate deployments**.
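
Before rolling out several deployments against one Redis instance, a quick check for accidentally shared database numbers can be useful. The sketch below is a hypothetical helper, not part of LangSmith. It assumes the `redis://<hostname>:<port>/<db>` form shown above and uses only the Python standard library.

```python
from urllib.parse import urlparse


def find_shared_redis_databases(uris_by_deployment):
    """Return groups of deployment names that use the same host, port, and database number."""
    seen = {}
    for name, uri in uris_by_deployment.items():
        parsed = urlparse(uri)
        # The URL path ("/1", "/2", ...) carries the database number.
        key = (parsed.hostname, parsed.port, parsed.path.lstrip("/"))
        seen.setdefault(key, []).append(name)
    return [names for names in seen.values() if len(names) > 1]


# Hypothetical deployment names and URIs.
uris = {
    "Deployment A": "redis://hostname-1:6379/1",
    "Deployment B": "redis://hostname-1:6379/2",
    "Deployment C": "redis://hostname-1:6379/1",
}
print(find_shared_redis_databases(uris))
```

In this example, `Deployment A` and `Deployment C` are reported together because both point at database number `1`.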

### LangSmith tracing

Agent Server is automatically configured to send traces to LangSmith. See the table below for details with respect to each deployment option.

| Cloud                                  | Hybrid                                                    | Self-Hosted                                                                                |
| -------------------------------------- | --------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| Required<br />Trace to LangSmith SaaS. | Optional<br />Disable tracing or trace to LangSmith SaaS. | Optional<br />Disable tracing, trace to LangSmith SaaS, or trace to Self-Hosted LangSmith. |

### Telemetry

Agent Server is automatically configured to report telemetry metadata for billing purposes. See the table below for details with respect to each deployment option.

| Cloud                             | Hybrid                            | Self-Hosted                                                                                                              |
| --------------------------------- | --------------------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| Telemetry sent to LangSmith SaaS. | Telemetry sent to LangSmith SaaS. | Self-reported usage (audit) for air-gapped license key.<br />Telemetry sent to LangSmith SaaS for LangSmith License Key. |

### Licensing

Agent Server is automatically configured to perform license key validation. See the table below for details with respect to each deployment option.

| Cloud                                               | Hybrid                                              | Self-Hosted                                                                      |
| --------------------------------------------------- | --------------------------------------------------- | -------------------------------------------------------------------------------- |
| LangSmith API Key validated against LangSmith SaaS. | LangSmith API Key validated against LangSmith SaaS. | Air-gapped license key or Platform License Key validated against LangSmith SaaS. |



# Data purging for compliance
Source: https://docs.langchain.com/langsmith/data-purging-compliance



This guide covers the various features available after data reaches LangSmith Cloud servers to help you achieve your privacy goals.

## Data retention

LangSmith provides automatic data retention capabilities to help with compliance and storage management. Data retention policies can be configured at the organization and project levels.

For detailed information about data retention configuration and management, please refer to the [Data Retention concepts](/langsmith/administration-overview#data-retention) documentation.

## Trace deletes

You can delete traces using the API. The API supports two methods for deleting traces:

1. **By trace IDs and session ID**: Delete specific traces by providing a list of trace IDs and their corresponding session ID (up to 1000 traces per request)
2. **By metadata**: Delete traces across a workspace that match any of the specified metadata key-value pairs

For more details, refer to the [API spec](https://api.smith.langchain.com/redoc#tag/run/operation/delete_runs_api_v1_runs_delete_post).

<Warning>
  All trace deletions will delete related entities like feedbacks, aggregations, and stats across all data storages.
</Warning>

### Deletion timeline

Trace deletions are processed during non-peak usage times and are not instant. LangChain runs the delete job on the weekend. There is no confirmation of deletion - you'll need to query the data again to verify it has been removed.

### Delete specific traces

To delete specific traces by their trace IDs from a single session:

```bash  theme={null}
curl -X POST "https://api.smith.langchain.com/api/v1/runs/delete" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "trace_ids": ["trace-id-1", "trace-id-2", "trace-id-3"],
    "session_id": "session-id-1"
  }'
```

### Delete by metadata

When deleting by metadata:

* The request accepts a `metadata` object of key-value pairs. Matching uses an **or** condition: a trace matches if it has **any** of the specified key-value pairs (not all).
* You don't need to specify a session ID when deleting by metadata. Deletes apply across the workspace.

To delete traces based on metadata across a workspace (matches **any** of the metadata key-value pairs):

```bash  theme={null}
curl -X POST "https://api.smith.langchain.com/api/v1/runs/delete" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "user_id": "user123",
      "environment": "staging"
    }
  }'
```

This will delete traces that have either `user_id: "user123"` **or** `environment: "staging"` in their metadata.
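
To make the **or** semantics concrete, here is a small local sketch of the matching rule. It only illustrates the described behavior and is not the server's implementation. The function name `matches_any` is hypothetical.

```python
def matches_any(trace_metadata, criteria):
    """Return True if the trace has at least one of the key-value pairs in `criteria`."""
    return any(
        key in trace_metadata and trace_metadata[key] == value
        for key, value in criteria.items()
    )


criteria = {"user_id": "user123", "environment": "staging"}

# Matches: the environment pair is present even though user_id differs.
print(matches_any({"user_id": "someone-else", "environment": "staging"}, criteria))

# Does not match: neither pair is present.
print(matches_any({"user_id": "someone-else", "environment": "prod"}, criteria))
```

Because a single matching pair is enough, broad values such as `environment: "staging"` can select far more traces than intended.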

<Warning>
  When deleting by trace IDs, you can schedule up to 1000 traces for deletion per session per request. For larger deletions, you'll need to make multiple requests.
</Warning>
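
For larger deletions by trace ID, one way to stay within the 1000-trace limit is to split the IDs into batches and send one request body per batch. The sketch below only builds the request bodies in the shape shown in the "Delete specific traces" example. The helper name `delete_payloads` is hypothetical, and sending the requests is left to your HTTP client.

```python
def delete_payloads(trace_ids, session_id, batch_size=1000):
    """Yield one request body per batch of at most `batch_size` trace IDs."""
    for start in range(0, len(trace_ids), batch_size):
        yield {
            "trace_ids": trace_ids[start:start + batch_size],
            "session_id": session_id,
        }


# Hypothetical IDs: 2500 traces from one session become three request bodies.
ids = [f"trace-id-{n}" for n in range(2500)]
payloads = list(delete_payloads(ids, "session-id-1"))
print([len(p["trace_ids"]) for p in payloads])
```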

## Example deletes

You can delete dataset examples self-serve via our API, which supports both soft and hard deletion methods depending on your data retention needs.

<Warning>
  Hard deletes will permanently remove inputs, outputs, and metadata from ALL versions of the specified examples across the entire dataset history.
</Warning>

### Deleting examples is a two-step process

For bulk operations, example deletion follows a two-step process:

#### 1. Search for examples by metadata

Find all examples with matching metadata across all datasets in a workspace.

[GET /examples](https://api.smith.langchain.com/redoc#tag/examples/operation/read_examples_api_v1_examples_get)

* `as_of` must be explicitly specified as a timestamp. Only examples created before the `as_of` date will be returned.

```bash  theme={null}
curl -X GET "https://api.smith.langchain.com/api/v1/examples?as_of=2024-01-01T00:00:00Z" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "metadata": {
      "user_id": "user123",
      "environment": "staging"
    }
  }'
```

This will return examples that have either `user_id: "user123"` **or** `environment: "staging"` in their metadata across all datasets in your workspace.

#### 2. Hard delete examples

Once you have the example IDs, send a delete request. This will zero out the inputs, outputs, and metadata from all versions of the dataset for that example.

[DELETE /examples](https://api.smith.langchain.com/redoc#tag/examples/operation/delete_examples_api_v1_examples_delete)

* Specify the example IDs and add `hard_delete=true` to the query parameters of the request.

```bash  theme={null}
curl -X DELETE "https://api.smith.langchain.com/api/v1/examples?hard_delete=true" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "example_ids": ["example-id-1", "example-id-2", "example-id-3"]
  }'
```

### Deletion types

#### Soft delete (default)

* Creates tombstoned entries with NULL inputs/outputs in the dataset
* Preserves historical data and maintains dataset versioning
* Only affects the current version of the dataset

#### Hard delete

* Permanently removes inputs, outputs, and metadata from ALL dataset versions
* Provides complete data removal when compliance requires zeroing out data across all versions
* Add `hard_delete=true` to the query parameters

For more details, refer to the [API spec](https://api.smith.langchain.com/redoc#tag/examples/operation/delete_examples_api_v1_examples_delete).



# Data storage and privacy
Source: https://docs.langchain.com/langsmith/data-storage-and-privacy



This document describes how data is processed in the LangGraph CLI and the Agent Server for both the in-memory server (`langgraph dev`) and the local Docker server (`langgraph up`). It also describes what data is tracked when interacting with the hosted Studio frontend.

## CLI

LangGraph **CLI** is the command-line interface for building and running LangGraph applications; see the [CLI guide](/langsmith/cli) to learn more.

By default, calls to most CLI commands log a single analytics event upon invocation. This helps us better prioritize improvements to the CLI experience. Each telemetry event contains the calling process's OS, OS version, Python version, the CLI version, the command name (`dev`, `up`, `run`, etc.), and booleans representing whether a flag was passed to the command. You can see the full analytics logic [here](https://github.com/langchain-ai/langgraph/blob/main/libs/cli/langgraph-cli/analytics.py).

You can disable all CLI telemetry by setting `LANGGRAPH_CLI_NO_ANALYTICS=1`.

<a id="in-memory-docker" />

## Agent Server

The [Agent Server](/langsmith/agent-server) provides a durable execution runtime that relies on persisting checkpoints of your application state, long-term memories, thread metadata, assistants, and similar resources to the local file system or a database. Unless you have deliberately customized the storage location, this information is either written to local disk (for `langgraph dev`) or a PostgreSQL database (for `langgraph up` and in all deployments).

### LangSmith Tracing

When running the Agent Server (either in-memory or in Docker), LangSmith tracing may be enabled to facilitate faster debugging and offer observability of graph state and LLM prompts in production. You can always disable tracing by setting `LANGSMITH_TRACING=false` in your server's runtime environment.

<a id="langgraph-dev" />

### In-memory development server

`langgraph dev` runs an [in-memory development server](/langsmith/local-server) as a single Python process, designed for quick development and testing. It saves all checkpointing and memory data to disk within a `.langgraph_api` directory in the current working directory. Apart from the telemetry data described in the [CLI](#cli) section, no data leaves the machine unless you have enabled tracing or your graph code explicitly contacts an external service.

<a id="langgraph-up" />

### Standalone Server

`langgraph up` builds your local package into a Docker image and runs the server as the [data plane](/langsmith/self-hosted) consisting of three containers: the API server, a PostgreSQL container, and a Redis container. All persistent data (checkpoints, assistants, etc.) are stored in the PostgreSQL database. Redis is used as a pubsub connection for real-time streaming of events. You can encrypt all checkpoints before saving to the database by setting a valid `LANGGRAPH_AES_KEY` environment variable. You can also specify [TTLs](/langsmith/configure-ttl) for checkpoints and cross-thread memories in `langgraph.json` to control how long data is stored. All persisted threads, memories, and other data can be deleted via the relevant API endpoints.

Additional API calls are made to confirm that the server has a valid license and to track the number of executed runs and tasks. Periodically, the API server validates the provided license key (or API key).

If you've disabled [tracing](#langsmith-tracing), no user data is persisted externally unless your graph code explicitly contacts an external service.

## Studio

[Studio](/langsmith/studio) is a graphical interface for interacting with your Agent Server. It does not persist any private data (the data you send to your server is not sent to LangSmith). Though the Studio interface is served at [smith.langchain.com](https://smith.langchain.com), it is run in your browser and connects directly to your local Agent Server so that no data needs to be sent to LangSmith.

If you are logged in, LangSmith does collect some usage analytics to help improve the debugging user experience. This includes:

* Page visits and navigation patterns
* User actions (button clicks)
* Browser type and version
* Screen resolution and viewport size

Importantly, no application data or code (or other sensitive configuration details) are collected. All of that is stored in the persistence layer of your Agent Server. When using Studio anonymously, no account creation is required and usage analytics are not collected.

## Quick reference

In summary, you can opt out of server-side telemetry by turning off CLI analytics and disabling tracing.

| Variable                       | Purpose                   | Default                |
| ------------------------------ | ------------------------- | ---------------------- |
| `LANGGRAPH_CLI_NO_ANALYTICS=1` | Disable CLI analytics     | Analytics enabled      |
| `LANGSMITH_API_KEY`            | Enable LangSmith tracing  | Tracing disabled       |
| `LANGSMITH_TRACING=false`      | Disable LangSmith tracing | Depends on environment |
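
As a minimal sketch of the summary above, the following Python helper builds an environment for a child process (for example, one that runs `langgraph up`) with CLI analytics and tracing turned off. The helper name `telemetry_opt_out_env` is hypothetical; only the two variable names and values come from the table.

```python
import os


def telemetry_opt_out_env(base=None):
    """Return a copy of an environment with CLI analytics and tracing disabled.

    `base` defaults to the current process environment. The function name is
    an illustration, not part of any LangChain package.
    """
    env = dict(os.environ if base is None else base)
    env["LANGGRAPH_CLI_NO_ANALYTICS"] = "1"  # disable CLI analytics
    env["LANGSMITH_TRACING"] = "false"       # disable LangSmith tracing
    return env


env = telemetry_opt_out_env({"PATH": "/usr/bin", "LANGSMITH_TRACING": "true"})
print(env["LANGGRAPH_CLI_NO_ANALYTICS"], env["LANGSMITH_TRACING"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the CLI from a script.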



# Dataset prebuilt JSON schema types
Source: https://docs.langchain.com/langsmith/dataset-json-types



LangSmith recommends that you set a schema on the inputs and outputs of your datasets to ensure data consistency and that your examples are in the right format for downstream processing, like running evals.

To better support LLM workflows, LangSmith provides a few prebuilt types. These schemas are hosted publicly by the LangSmith API and can be referenced in your dataset schemas using [JSON Schema references](https://json-schema.org/understanding-json-schema/structuring#dollarref). The available schemas are listed below.

| Type    | JSON Schema Reference Link                                                                                                       | Usage                                                                                                                     |
| ------- | -------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| Message | [https://api.smith.langchain.com/public/schemas/v1/message.json](https://api.smith.langchain.com/public/schemas/v1/message.json) | Represents messages sent to a chat model, following the OpenAI standard format.                                           |
| Tool    | [https://api.smith.langchain.com/public/schemas/v1/tooldef.json](https://api.smith.langchain.com/public/schemas/v1/tooldef.json) | Tool definitions available to chat models for function calling, defined in OpenAI's JSON Schema inspired function format. |

LangSmith lets you define a series of transformations that collect the above prebuilt types from your traces and add them to your dataset. For more information on available transformations, see the [dataset transformations reference](/langsmith/dataset-transformations).
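
The following sketch shows one way a dataset inputs schema might reference the two hosted types with `$ref`. The field names `messages` and `tools` and the helper `array_of` are illustrative choices, not requirements; only the two schema URLs come from the table above.

```python
import json

MESSAGE_REF = "https://api.smith.langchain.com/public/schemas/v1/message.json"
TOOL_REF = "https://api.smith.langchain.com/public/schemas/v1/tooldef.json"


def array_of(ref: str) -> dict:
    """JSON Schema for an array whose items follow a hosted prebuilt type."""
    return {"type": "array", "items": {"$ref": ref}}


# Hypothetical inputs schema: a required list of messages, optional tools.
inputs_schema = {
    "type": "object",
    "properties": {
        "messages": array_of(MESSAGE_REF),
        "tools": array_of(TOOL_REF),
    },
    "required": ["messages"],
}

print(json.dumps(inputs_schema, indent=2))
```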



# Dataset transformations
Source: https://docs.langchain.com/langsmith/dataset-transformations



LangSmith allows you to attach transformations to fields in your dataset's schema. These transformations are applied to your data before it is added to the dataset, whether it comes from the UI, the API, or run rules.

Coupled with [LangSmith's prebuilt JSON schema types](/langsmith/dataset-json-types), these allow you to do easy preprocessing of your data before saving it into your datasets.

## Transformation types

| Transformation Type         | Target Types                                                               | Functionality                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| --------------------------- | -------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `remove_system_messages`    | `Array[Message]`                                                           | Filters a list of messages to remove any system messages.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| `convert_to_openai_message` | `Message`, `Array[Message]`                                                | Converts any incoming data from LangChain's internal serialization format to OpenAI's standard message format using LangChain's [`convert_to_openai_messages`](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.convert_to_openai_messages.html). If the target field is marked as required and no matching message is found upon entry, it attempts to extract a message (or list of messages) from several well-known LangSmith tracing formats (e.g., any traced LangChain [`BaseChatModel`](https://reference.langchain.com/python/langchain_core/language_models/#langchain_core.language_models.chat_models.BaseChatModel) run or traced run from the [LangSmith OpenAI wrapper](/langsmith/annotate-code#wrap-the-openai-client)) and removes the original key containing the message. |
| `convert_to_openai_tool`    | `Array[Tool]`. Only available on top-level fields in the inputs dictionary. | Converts any incoming data into OpenAI's standard tool format using LangChain's [`convert_to_openai_tool`](https://reference.langchain.com/python/langchain_core/utils/#langchain_core.utils.function_calling.convert_to_openai_tool). If no tools are found at the specified key, it extracts tool definitions from the run's invocation parameters when present. This is useful because LangChain chat models trace tool definitions to the `extra.invocation_params` field of the run rather than inputs. |
| `remove_extra_fields`       | `Object`                                                                   | Removes any field not defined in the schema for this target object.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
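
To make the table concrete, here is a small local illustration of what `remove_system_messages` and `remove_extra_fields` do to an example. This is not LangSmith's implementation (the real transformations are applied by LangSmith when data is added to a dataset); the two Python functions below only mimic the described behavior on OpenAI-style message dicts, and the `debug_id` field is made up.

```python
def remove_system_messages(messages: list) -> list:
    """Drop messages whose role is "system" (OpenAI-style message dicts)."""
    return [m for m in messages if m.get("role") != "system"]


def remove_extra_fields(obj: dict, schema: dict) -> dict:
    """Keep only the keys listed under the schema's "properties"."""
    allowed = schema.get("properties", {})
    return {k: v for k, v in obj.items() if k in allowed}


schema = {"type": "object", "properties": {"messages": {"type": "array"}}}
raw_inputs = {
    "messages": [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Hi"},
    ],
    "debug_id": "abc123",  # not in the schema, so it is dropped
}

cleaned = remove_extra_fields(raw_inputs, schema)
cleaned["messages"] = remove_system_messages(cleaned["messages"])
print(cleaned)  # {'messages': [{'role': 'user', 'content': 'Hi'}]}
```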

## Chat Model prebuilt schema

The main use case for transformations is to simplify collecting production traces into datasets in a format that can be standardized across model providers for downstream use in evaluations, few-shot prompting, and so on.

To simplify setup of transformations for our end users, LangSmith offers a predefined schema that will do the following:

* Extract messages from your collected runs and transform them into the OpenAI standard format, which makes them compatible with all LangChain chat models and most model providers' SDKs for downstream evaluation and experimentation
* Extract any tools used by your LLM and add them to your example's input to be used for reproducibility in downstream evaluation

<Check>
  If you want to iterate on your system prompts, consider also adding the Remove System Messages transformation to your input messages when using the Chat Model schema. This prevents the system prompt from being saved to your dataset.
</Check>

### Compatibility

The LLM run collection schema is built to collect data from LangChain [`BaseChatModel`](https://reference.langchain.com/python/langchain_core/language_models/#langchain_core.language_models.chat_models.BaseChatModel) runs or traced runs from the [LangSmith OpenAI wrapper](/langsmith/annotate-code#wrap-the-openai-client).

Please reach out to [support@langchain.dev](mailto:support@langchain.dev) if you have an LLM run you are tracing that is not compatible and we can extend support.

If you want to apply transformations to other sorts of runs (for example, representing LangGraph state with message history), please define your schema directly and manually add the relevant transformations.

### Enablement

When adding a run from a tracing project or annotation queue to a dataset, if it has the LLM run type, we will apply the Chat Model schema by default.

For enablement on new datasets, see our [dataset management how-to guide](/langsmith/manage-datasets-in-application).

### Specs

For the full API specs of the prebuilt schema, see the sections below:

#### Input schema

```json  theme={null}
{
  "type": "object",
  "properties": {
    "messages": {
      "type": "array",
      "items": {
        "$ref": "https://api.smith.langchain.com/public/schemas/v1/message.json"
      }
    },
    "tools": {
      "type": "array",
      "items": {
        "$ref": "https://api.smith.langchain.com/public/schemas/v1/tooldef.json"
      }
    }
  },
  "required": ["messages"]
}
```

#### Output schema

```json  theme={null}
{
  "type": "object",
  "properties": {
    "message": {
      "$ref": "https://api.smith.langchain.com/public/schemas/v1/message.json"
    }
  },
  "required": ["message"]
}
```

#### Transformations

And the transformations look as follows:

```json  theme={null}
[
  {
    "path": ["inputs"],
    "transformation_type": "remove_extra_fields"
  },
  {
    "path": ["inputs", "messages"],
    "transformation_type": "convert_to_openai_message"
  },
  {
    "path": ["inputs", "tools"],
    "transformation_type": "convert_to_openai_tool"
  },
  {
    "path": ["outputs"],
    "transformation_type": "remove_extra_fields"
  },
  {
    "path": ["outputs", "message"],
    "transformation_type": "convert_to_openai_message"
  }
]
```



# How to define a target function to evaluate
Source: https://docs.langchain.com/langsmith/define-target-function



There are three main pieces needed to run an evaluation:

1. A [dataset](/langsmith/evaluation-concepts#datasets) of test inputs and expected outputs.
2. A target function which is what you're evaluating.
3. [Evaluators](/langsmith/evaluation-concepts#evaluators) that score your target function's outputs.

This guide shows you how to define the target function depending on the part of your application you are evaluating. See here for [how to create a dataset](/langsmith/manage-datasets-programmatically) and [how to define evaluators](/langsmith/code-evaluator), and here for an [end-to-end example of running an evaluation](/langsmith/evaluate-llm-application).

## Target function signature

To evaluate an application in code, we need a way to run the application. When using `evaluate()` ([Python](https://docs.smith.langchain.com/reference/python/client/langsmith.client.Client#langsmith.client.Client.evaluate)/[TypeScript](https://docs.smith.langchain.com/reference/js/functions/evaluation.evaluate)), we do this by passing in a *target function* argument. This is a function that takes in a dataset [Example's](/langsmith/evaluation-concepts#examples) inputs and returns the application output as a dict. Within this function we can call our application and format the output however we'd like. The key is that any evaluator functions we define must work with the output format returned by the target function.

```python  theme={null}
from langsmith import Client

# 'inputs' will come from your dataset.
def dummy_target(inputs: dict) -> dict:
    return {"foo": 1, "bar": "two"}

# 'inputs' will come from your dataset.
# 'outputs' will come from your target function.
def evaluator_one(inputs: dict, outputs: dict) -> bool:
    return outputs["foo"] == 2

def evaluator_two(inputs: dict, outputs: dict) -> bool:
    return len(outputs["bar"]) < 3

client = Client()
results = client.evaluate(
    dummy_target,  # <-- target function
    data="your-dataset-name",
    evaluators=[evaluator_one, evaluator_two],
    # ... any additional arguments
)
```

<Check>
  `evaluate()` will automatically trace your target function. This means that if you run any traceable code within your target function, this will also be traced as child runs of the target trace.
</Check>
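
To see the contract between the target function and the evaluators without calling LangSmith, the sketch below runs both locally on a hand-written example. The `dry_run` helper is hypothetical and is not how `evaluate()` is implemented: it does no tracing and uploads nothing. It only shows that each evaluator receives the example's `inputs` and the dict returned by the target.

```python
def dummy_target(inputs: dict) -> dict:
    return {"foo": 1, "bar": "two"}


def evaluator_one(inputs: dict, outputs: dict) -> bool:
    return outputs["foo"] == 2


def evaluator_two(inputs: dict, outputs: dict) -> bool:
    return len(outputs["bar"]) < 3


def dry_run(target, evaluators, examples):
    """Call the target on each example's inputs, then score its outputs."""
    rows = []
    for inputs in examples:
        outputs = target(inputs)
        scores = {ev.__name__: ev(inputs=inputs, outputs=outputs) for ev in evaluators}
        rows.append({"inputs": inputs, "outputs": outputs, "scores": scores})
    return rows


# The example inputs here are made up; real inputs come from your dataset.
rows = dry_run(dummy_target, [evaluator_one, evaluator_two], [{"question": "hi"}])
print(rows[0]["scores"])  # {'evaluator_one': False, 'evaluator_two': False}
```

Both scores are `False` because the dummy target returns `foo = 1` and a three-character `bar`. A quick local check like this can catch key mismatches between the target's output and the evaluators before a full evaluation run.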

## Example: Single LLM call

<CodeGroup>
  ```python Python theme={null}
  from langsmith import wrappers
  from openai import OpenAI

  # Optionally wrap the OpenAI client to automatically
  # trace all model calls.
  oai_client = wrappers.wrap_openai(OpenAI())

  def target(inputs: dict) -> dict:
    # This assumes your dataset has inputs with a 'messages' key.
    # You can update to match your dataset schema.
    messages = inputs["messages"]
    response = oai_client.chat.completions.create(
        messages=messages,
        model="gpt-4o-mini",
    )
    return {"answer": response.choices[0].message.content}
  ```

  ```typescript TypeScript theme={null}
  import OpenAI from 'openai';
  import { wrapOpenAI } from "langsmith/wrappers";

  const client = wrapOpenAI(new OpenAI());

  // This is the function you will evaluate.
  const target = async(inputs) => {
    // This assumes your dataset has inputs with a `messages` key
    const messages = inputs.messages;
    const response = await client.chat.completions.create({
        messages: messages,
        model: 'gpt-4o-mini',
    });
    return { answer: response.choices[0].message.content };
  }
  ```

  ```python Python (LangChain) theme={null}
  from langchain.chat_models import init_chat_model

  model = init_chat_model("gpt-4o-mini")

  def target(inputs: dict) -> dict:
    # This assumes your dataset has inputs with a `messages` key
    messages = inputs["messages"]
    response = model.invoke(messages)
    return {"answer": response.content}
  ```

  ```typescript TypeScript (LangChain) theme={null}
  import { ChatOpenAI } from '@langchain/openai';

  // This is the function you will evaluate.
  const target = async(inputs) => {
    // This assumes your dataset has inputs with a `messages` key
    const messages = inputs.messages;
    const model = new ChatOpenAI({ model: "gpt-4o-mini" });
    const response = await model.invoke(messages);
    return {"answer": response.content};
  }
  ```
</CodeGroup>

## Example: Non-LLM component

<CodeGroup>
  ```python Python theme={null}
  from langsmith import traceable

  # Optionally decorate with '@traceable' to trace all invocations of this function.
  @traceable
  def calculator_tool(operation: str, number1: float, number2: float) -> str:
    if operation == "add":
        return str(number1 + number2)
    elif operation == "subtract":
        return str(number1 - number2)
    elif operation == "multiply":
        return str(number1 * number2)
    elif operation == "divide":
        return str(number1 / number2)
    else:
        raise ValueError(f"Unrecognized operation: {operation}.")

  # This is the function you will evaluate.
  def target(inputs: dict) -> dict:
    # This assumes your dataset has inputs with `operation`, `num1`, and `num2` keys.
    operation = inputs["operation"]
    number1 = inputs["num1"]
    number2 = inputs["num2"]
    result = calculator_tool(operation, number1, number2)
    return {"result": result}
  ```

  ```typescript TypeScript theme={null}
  import { traceable } from "langsmith/traceable";

  // Optionally wrap in 'traceable' to trace all invocations of this function.
  const calculatorTool = traceable(async ({ operation, number1, number2 }) => {
    // Functions must return strings
    if (operation === "add") {
      return (number1 + number2).toString();
    } else if (operation === "subtract") {
      return (number1 - number2).toString();
    } else if (operation === "multiply") {
      return (number1 * number2).toString();
    } else if (operation === "divide") {
      return (number1 / number2).toString();
    } else {
      throw new Error("Invalid operation.");
    }
  });

  // This is the function you will evaluate.
  const target = async (inputs) => {
    // This assumes your dataset has inputs with `operation`, `num1`, and `num2` keys
    // A function wrapped in 'traceable' is called like the original function.
    const result = await calculatorTool({
      operation: inputs.operation,
      number1: inputs.num1,
      number2: inputs.num2,
    });
    return { result };
  };
  ```
</CodeGroup>
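
Because the target above returns its result as a string under the `result` key, an evaluator for it has to read that same key and convert the value before comparing. The sketch below assumes the dataset's reference outputs also store the expected value under a `result` key; that key and the evaluator name `correct_result` are illustrative assumptions.

```python
import math


def correct_result(outputs: dict, reference_outputs: dict) -> bool:
    """Compare the target's string result with the expected value numerically."""
    return math.isclose(float(outputs["result"]), float(reference_outputs["result"]))


print(correct_result({"result": "5.0"}, {"result": 5}))  # True
```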

## Example: Application or agent

<CodeGroup>
  ```python Python theme={null}
  from my_agent import agent

  # This is the function you will evaluate.
  def target(inputs: dict) -> dict:
    # This assumes your dataset has inputs with a `messages` key
    messages = inputs["messages"]
    # Replace `invoke` with whatever you use to call your agent
    response = agent.invoke({"messages": messages})
    # This assumes your agent output is in the right format
    return response
  ```

  ```typescript TypeScript theme={null}
  import { agent } from 'my_agent';

  // This is the function you will evaluate.
  const target = async (inputs) => {
    // This assumes your dataset has inputs with a `messages` key
    const messages = inputs.messages;
    // Replace `invoke` with whatever you use to call your agent
    const response = await agent.invoke({ messages });
    // This assumes your agent output is in the right format
    return response;
  };
  ```
</CodeGroup>

<Check>
  If you have a LangGraph/LangChain agent that accepts the inputs defined in your dataset and that returns the output format you want to use in your evaluators, you can pass that object in as the target directly:

  ```python  theme={null}
  from my_agent import agent
  from langsmith import Client
  client = Client()
  client.evaluate(agent, ...)
  ```
</Check>
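
When the agent's raw output is not in the shape your evaluators expect, the target function can reshape it. The sketch below uses a stub in place of `my_agent.agent` and assumes the agent returns a dict with a `messages` list of OpenAI-style dicts; both are assumptions for illustration, so adapt the indexing to whatever your agent actually returns.

```python
class StubAgent:
    """Hypothetical stand-in for `my_agent.agent`; appends a canned reply."""

    def invoke(self, state: dict) -> dict:
        reply = {"role": "assistant", "content": "stub reply"}
        return {"messages": state["messages"] + [reply]}


agent = StubAgent()


def target(inputs: dict) -> dict:
    # This assumes your dataset has inputs with a `messages` key.
    response = agent.invoke({"messages": inputs["messages"]})
    # Keep only the final message so evaluators see a small, stable shape.
    return {"answer": response["messages"][-1]["content"]}


print(target({"messages": [{"role": "user", "content": "hi"}]}))  # {'answer': 'stub reply'}
```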



# Set up hybrid LangSmith
Source: https://docs.langchain.com/langsmith/deploy-hybrid



<Info>
  **Important**
  The Hybrid deployment option requires an [Enterprise](https://langchain.com/pricing) plan.
</Info>

The [**hybrid**](/langsmith/hybrid) model lets you run the [data plane](/langsmith/data-plane)—your Agent Server deployments and agent workloads—in your own cloud, while LangChain hosts and manages the [control plane](/langsmith/control-plane) (the LangSmith UI and orchestration). This setup gives you the flexibility of self-hosting your runtime environments with the convenience of a managed LangSmith instance.

The following steps describe how to connect your self-hosted data plane to the managed LangSmith control plane.

## Kubernetes

### Prerequisites

1. `KEDA` is installed on your cluster.
   ```bash  theme={null}
     helm repo add kedacore https://kedacore.github.io/charts
     helm install keda kedacore/keda --namespace keda --create-namespace
   ```
2. A valid `Ingress` controller is installed on your cluster. For more information about configuring ingress for your deployment, refer to [Create an ingress for installations](/langsmith/self-host-ingress). We highly recommend using the modern [Gateway API](/langsmith/self-host-ingress#option-2%3A-gateway-api) in a production setup.
3. If you plan to have the listener watch multiple namespaces, you **MUST** use the [Gateway API](/langsmith/self-host-ingress#option-2%3A-gateway-api) or an [Istio Gateway](/langsmith/self-host-ingress#option-3%3A-istio-gateway) instead of the [standard ingress](/langsmith/self-host-ingress#option-1%3A-standard-ingress) resource. A standard ingress resource can only route traffic to services in the same namespace, whereas a Gateway or Istio Gateway can route traffic to services across multiple namespaces.
4. You have slack space in your cluster for multiple deployments. `Cluster-Autoscaler` is recommended to automatically provision new nodes.
5. You will need to enable egress to two control plane URLs. The listener polls these endpoints for deployments:
   * [https://api.host.langchain.com](https://api.host.langchain.com)
   * [https://api.smith.langchain.com](https://api.smith.langchain.com)

### Setup

1. Provide your LangSmith organization ID to us. Your LangSmith organization will be configured to deploy the data plane in your cloud.
2. Create a listener from the LangSmith UI. The `Listener` data model is configured for the actual ["listener" application](/langsmith/data-plane#”listener”-application).
   1. In the left-hand navigation, select `Deployments` > `Listeners`.
   2. In the top-right of the page, select `+ Create Listener`.
   3. Enter a unique `Compute ID` for the listener. The `Compute ID` is a user-defined identifier that should be unique across all listeners in the current LangSmith workspace. The `Compute ID` is displayed to end users when they are creating a new deployment. Ensure that the `Compute ID` provides context to the end user about where their Agent Server deployments will be deployed to. For example, a `Compute ID` can be set to `k8s-cluster-name-dev-01`. In this example, the name of the Kubernetes cluster is `k8s-cluster-name`, `dev` denotes that the cluster is reserved for "development" workloads, and `01` is a numerical suffix to reduce naming collisions.
   4. Enter one or more Kubernetes namespaces. Later, the "listener" application will be configured to deploy to each of these namespaces.
   5. In the top-right on the page, select `Submit`.
   6. After the listener is created, copy the listener ID. You will use it later when installing the actual "listener" application in the Kubernetes cluster (step 5).
   <Info>
     **Important**
     Creating a listener from the LangSmith UI does not install the "listener" application in the Kubernetes cluster.
   </Info>
3. A [Helm chart](https://github.com/langchain-ai/helm/tree/main/charts/langgraph-dataplane) is provided to install the necessary components in your Kubernetes cluster.
   * `langgraph-dataplane-listener`: This is a service that listens to LangChain's [control plane](/langsmith/control-plane) for changes to your deployments and creates/updates downstream CRDs. This is the ["listener" application](/langsmith/data-plane#”listener”-application).
   * `LangGraphPlatform CRD`: A CRD for LangSmith Deployment. This contains the spec for managing an instance of a LangSmith Deployment.
   * `langgraph-dataplane-operator`: This operator handles changes to your LangSmith CRDs.
   * `langgraph-dataplane-redis`: A Redis instance is used by the `langgraph-dataplane-listener` to manage various tasks (mainly creating and deleting deployments).
4. Configure your `langgraph-dataplane-values.yaml` file.
   ```yaml  theme={null}
     config:
       langsmithApiKey: "" # API Key of your Workspace
       langsmithWorkspaceId: "" # Workspace ID
       hostBackendUrl: "https://api.host.langchain.com" # Only override this if on EU
       smithBackendUrl: "https://api.smith.langchain.com" # Only override this if on EU
       langgraphListenerId: "" # Listener ID from Step 2f
       watchNamespaces: "" # comma-separated list of Kubernetes namespaces that the listener and operator will deploy to
       enableLGPDeploymentHealthCheck: true # enable/disable health check step for deployments

     ingress:
       hostname: "" # specify a hostname that will be configured for all deployments

     operator:
       enabled: true
       createCRDs: true # set this to `false` if the CRD has been previously installed in the current Kubernetes cluster
   ```
   * `config.langsmithApiKey`: The `langgraph-listener` deployment authenticates with LangChain's LangGraph control plane API with the `langsmithApiKey`.
   * `config.langsmithWorkspaceId`: The `langgraph-listener` deployment is coupled to Agent Server deployments in the LangSmith workspace. In other words, the `langgraph-listener` deployment can only manage Agent Server deployments in the specified LangSmith workspace ID.
   * `config.langgraphListenerId`: In addition to being coupled with a LangSmith workspace, the `langgraph-listener` deployment is also coupled to a listener. When a new Agent Server deployment is created, it is automatically coupled to a `langgraphListenerId`. Specifying `langgraphListenerId` ensures that the `langgraph-listener` deployment can only manage Agent Server deployments that are coupled to `langgraphListenerId`.
   * `config.watchNamespaces`: A comma-separated list of Kubernetes namespaces that the `langgraph-listener` deployment will deploy to. This list should match the list of namespaces specified in step 2d.
   * `config.enableLGPDeploymentHealthCheck`: To disable the Agent Server health check, set this to `false`.
   * `ingress.hostname`: As part of the deployment workflow, the `langgraph-listener` deployment attempts to call the Agent Server health check endpoint (`GET /ok`) to verify that the application has started up correctly. A typical setup involves creating a shared DNS record or domain for Agent Server deployments. This is not managed by LangSmith. Once created, set `ingress.hostname` to the domain, which will be used to complete the health check.
   * `operator.createCRDs`: Set this value to `false` if the Kubernetes cluster already has the `LangGraphPlatform CRD` installed. During installation, an error will occur if the CRD is already installed. This situation may occur if multiple listeners are deployed on the same Kubernetes cluster.
5. Deploy the `langgraph-dataplane` Helm chart.
   ```bash  theme={null}
     helm repo add langchain https://langchain-ai.github.io/helm/
     helm repo update
     helm upgrade -i langgraph-dataplane langchain/langgraph-dataplane --values langgraph-dataplane-values.yaml --wait --debug
   ```
6. If successful, you will see three services start up in your namespace.

   ```bash  theme={null}
     NAME                                            READY   STATUS              RESTARTS   AGE
     langgraph-dataplane-listener-6dd4749445-zjmr4   0/1     ContainerCreating   0          26s
     langgraph-dataplane-operator-6b88879f9b-t76gk   1/1     Running             0          26s
     langgraph-dataplane-redis-0                     1/1     Running             0          25s
   ```

   Your hybrid infrastructure is now ready to create deployments.
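
If you script this check, the pod listing can be parsed to find pods that are not yet ready. The sketch below assumes the text is the tabular output of a command such as `kubectl get pods` with `NAME`, `READY`, and `STATUS` as the first three columns, as in the sample above; the helper name `pods_not_ready` is hypothetical.

```python
SAMPLE = """
  NAME                                            READY   STATUS              RESTARTS   AGE
  langgraph-dataplane-listener-6dd4749445-zjmr4   0/1     ContainerCreating   0          26s
  langgraph-dataplane-operator-6b88879f9b-t76gk   1/1     Running             0          26s
  langgraph-dataplane-redis-0                     1/1     Running             0          25s
"""


def pods_not_ready(listing: str) -> list:
    """Return names of pods that are not Running with all containers ready."""
    not_ready = []
    for line in listing.strip().splitlines()[1:]:  # skip the header row
        name, ready, status = line.split()[:3]
        have, want = ready.split("/")
        if status != "Running" or have != want:
            not_ready.append(name)
    return not_ready


print(pods_not_ready(SAMPLE))
```

In the sample, only the listener pod is still starting, so it is the one reported.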

### Configuring additional data planes in the same cluster

To create a data plane in a different namespace in the same cluster, repeat the above steps and pass a `-n` option to `helm upgrade` to specify a different namespace.

**When installing multiple data planes in the same cluster, it is very important to follow the rules below:**

1. The `config.watchNamespaces` list should never intersect with another installation's `config.watchNamespaces`. For example, if installation A is watching namespaces `foo,bar`, installation B cannot watch either `foo` or `bar`. Multiple operators or listeners watching the same namespace will lead to unexpected behavior. This means that multiple LangSmith workspaces cannot deploy to the same namespace! Please review the [cluster organization](/langsmith/hybrid#kubernetes-cluster-organization) section to understand this better.
2. It is required to use the [Gateway API](/langsmith/self-host-ingress#option-2%3A-gateway-api) or an [Istio Gateway](/langsmith/self-host-ingress#option-3%3A-istio-gateway). Relying on the [standard ingress](/langsmith/self-host-ingress#option-1%3A-standard-ingress) resource can cause conflicts with Ingress objects created by other data planes in the same cluster. Because behavior in these cases depends on the specific ingress controller, this may result in unpredictable or undesired outcomes.
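
The first rule can be checked mechanically before installing another data plane. The sketch below takes the comma-separated `config.watchNamespaces` value of each installation and reports any namespaces watched by more than one of them. The installation names and the helper functions are hypothetical; how you collect the values (for example, from each values file) is up to you.

```python
from itertools import combinations


def parse_namespaces(value: str) -> set:
    """Split a comma-separated `watchNamespaces` string into a set of names."""
    return {ns.strip() for ns in value.split(",") if ns.strip()}


def find_overlaps(installations: dict) -> list:
    """Return (installation_a, installation_b, shared_namespaces) for every clash."""
    overlaps = []
    for (a, ns_a), (b, ns_b) in combinations(sorted(installations.items()), 2):
        shared = parse_namespaces(ns_a) & parse_namespaces(ns_b)
        if shared:
            overlaps.append((a, b, sorted(shared)))
    return overlaps


installations = {
    "installation-a": "foo,bar",
    "installation-b": "baz",
    "installation-c": "bar,qux",  # clashes with installation-a on `bar`
}
print(find_overlaps(installations))  # [('installation-a', 'installation-c', ['bar'])]
```

An empty list means no two installations watch the same namespace.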

## Next steps

Once your infrastructure is set up, you're ready to deploy applications. See the deployment guides in the [Deployment tab](/langsmith/deployments) for instructions on building and deploying your applications.



# Enable LangSmith Deployment
Source: https://docs.langchain.com/langsmith/deploy-self-hosted-full-platform



This guide shows you how to enable **LangSmith Deployment** on your [self-hosted LangSmith instance](/langsmith/kubernetes). This adds a [control plane](/langsmith/control-plane) and [data plane](/langsmith/data-plane) that let you deploy, scale, and manage agents and applications directly through the LangSmith UI.

After completing this guide, you'll have access to LangSmith [Observability](/langsmith/observability), [Evaluation](/langsmith/evaluation), and [Deployment](/langsmith/deployments).

<Info>**Important**<br /> Enabling LangSmith Deployment requires an [Enterprise](https://langchain.com/pricing) plan. </Info>

<Note>
  **This setup page is for enabling [LangSmith Deployment](/langsmith/deployments) on an existing LangSmith instance.**

  Review the [self-hosted options](/langsmith/self-hosted) to understand:

  * [LangSmith (observability)](/langsmith/self-hosted#langsmith): What you should install first.
  * [LangSmith Deployment](/langsmith/self-hosted#langsmith-deployment): What this guide enables.
  * [Standalone Server](/langsmith/self-hosted#standalone-server): Lightweight alternative without the UI.
</Note>

## Overview

This guide builds on top of the [Kubernetes installation guide](/langsmith/kubernetes). **You must complete that guide first** before continuing. This page covers the additional setup steps required to enable LangSmith Deployment:

* Installing the LangGraph operator
* Configuring your ingress
* Connecting to the control plane

## Prerequisites

1. You are using Kubernetes.
2. You have an instance of [self-hosted LangSmith](/langsmith/kubernetes) running.
3. `KEDA` is installed on your cluster.

```bash  theme={null}
  helm repo add kedacore https://kedacore.github.io/charts
  helm install keda kedacore/keda --namespace keda --create-namespace
```

4. Ingress configuration
   1. You must set up an ingress or gateway, or use Istio, for your LangSmith instance. All agents will be deployed as Kubernetes services behind this ingress. Use this guide to [set up an ingress](/langsmith/self-host-ingress) for your instance.
5. You must have slack space in your cluster for multiple deployments. `Cluster-Autoscaler` is recommended to automatically provision new nodes.
6. A valid Dynamic PV provisioner or PVs available on your cluster. You can verify this by running:

```bash  theme={null}
  kubectl get storageclass
```

7. Egress to `https://beacon.langchain.com` from your network. This is required for license verification and usage reporting if not running in air-gapped mode. See the [Egress documentation](/langsmith/self-host-egress) for more details.

## Setup

1. As part of configuring your self-hosted LangSmith instance, you enable the `deployment` option. This will provision a few key resources.
   1. `listener`: This is a service that listens to the [control plane](/langsmith/control-plane) for changes to your deployments and creates/updates downstream CRDs.
   2. `LangGraphPlatform CRD`: A CRD for LangSmith Deployment. This contains the spec for managing an instance of a LangSmith deployment.
   3. `operator`: This operator handles changes to your LangSmith CRDs.
   4. `host-backend`: This is the [control plane](/langsmith/control-plane).

<Note>
  As of v0.12.0, the `langgraphPlatform` option is deprecated. Use `config.deployment` for any version after v0.12.0.
</Note>

2. Two additional images will be used by the chart. Use the images that are specified in the latest release.

```yaml  theme={null}
  hostBackendImage:
    repository: "docker.io/langchain/hosted-langserve-backend"
    pullPolicy: IfNotPresent
  operatorImage:
    repository: "docker.io/langchain/langgraph-operator"
    pullPolicy: IfNotPresent
```

3. In your LangSmith config file (usually `langsmith_config.yaml`), enable the `deployment` option. Note that you must also have a valid ingress setup:

```yaml  theme={null}
  config:
    deployment:
      enabled: true
    # As of v0.12.0, this section is deprecated. Use config.deployment for any version after v0.12.0.
    langgraphPlatform:
      enabled: true
      langgraphPlatformLicenseKey: "YOUR_LANGGRAPH_PLATFORM_LICENSE_KEY"
```

4. In your `values.yaml` file, configure the `hostBackendImage` and `operatorImage` options (if you need to mirror images). If you are using a private container registry that requires authentication, you must also configure `imagePullSecrets`. Refer to [Configure authentication for private registries](#optional-configure-authentication-for-private-registries).

5. You can also configure base templates for your agents by overriding the base templates [here](https://github.com/langchain-ai/helm/blob/main/charts/langsmith/values.yaml#L898).

   Your self-hosted infrastructure is now ready to create deployments.
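
Before running `helm upgrade`, it can help to sanity-check the `config` block from step 3. The following is a minimal sketch, not part of the LangSmith tooling: `check_deployment_config` is a hypothetical helper, and it assumes the config file has already been parsed into a plain dictionary (for example with a YAML library of your choice).

```python
from typing import Any


def check_deployment_config(values: dict[str, Any]) -> list[str]:
    """Return human-readable findings for the `config` section of a values file."""
    findings: list[str] = []
    config = values.get("config", {})

    # The guide enables deployments through `config.deployment.enabled`.
    if not config.get("deployment", {}).get("enabled", False):
        findings.append("config.deployment.enabled is not true")

    # `langgraphPlatform` is the older key that the guide marks as deprecated.
    if "langgraphPlatform" in config:
        findings.append("config.langgraphPlatform is deprecated; prefer config.deployment")

    return findings


# Mirrors the YAML from step 3, already parsed into a dict (assumed shape).
example = {
    "config": {
        "deployment": {"enabled": True},
        "langgraphPlatform": {"enabled": True},
    }
}
for finding in check_deployment_config(example):
    print(finding)
```

An empty result means the two rules checked here are satisfied. The helper does not validate anything else in the chart.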

## (Optional) Configure additional data planes

In addition to the existing data plane already created in the above steps, you can create more data planes that reside in different Kubernetes clusters or the same cluster in a different namespace.

### Prerequisites

1. Read through the cluster organization guide in the [hybrid deployment documentation](/langsmith/hybrid#listeners) to understand how to best organize this for your use case.
2. Verify the prerequisites mentioned in the [hybrid](/langsmith/deploy-hybrid#prerequisites) section are met for the new cluster. Note that in step 5 of [this section](/langsmith/deploy-hybrid#prerequisites), you need to enable egress to your [self-hosted LangSmith instance](/langsmith/self-host-usage#configuring-the-application-you-want-to-use-with-langsmith) instead of [https://api.host.langchain.com](https://api.host.langchain.com) and [https://api.smith.langchain.com](https://api.smith.langchain.com).
3. Run the following commands against your LangSmith Postgres instance to enable this feature. This is the [Postgres instance](/langsmith/kubernetes#validate-your-deployment%3A) that comes with your self-hosted LangSmith setup.

```sql  theme={null}
update organizations set config = config || '{"enable_lgp_listeners_page": true}' where id = '<org id here>';
update tenants set config = config || '{"langgraph_remote_reconciler_enabled": true}' where id = '<workspace id here>';
```

Note down the workspace ID you choose as you will need this for future steps.
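
The `||` operator in the statements above is Postgres JSON concatenation: for two objects it performs a shallow merge in which keys from the right-hand side win. The sketch below mimics that behavior in Python so you can see what the update does to an existing value. It assumes the `config` column holds a JSON object; `jsonb_concat` and the sample flag name are illustrative only.

```python
import json


def jsonb_concat(existing: str, patch: str) -> str:
    """Mimic Postgres `jsonb || jsonb` for two JSON objects (shallow merge)."""
    merged = {**json.loads(existing), **json.loads(patch)}
    return json.dumps(merged, sort_keys=True)


# Hypothetical existing config value; the real column contents will differ.
before = '{"some_existing_flag": false}'
after = jsonb_concat(before, '{"enable_lgp_listeners_page": true}')
print(after)
```

Existing keys are preserved and only the named flag is added or overwritten, which is why the statements can be run safely against a populated `config` value.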

### Deploying to a different cluster

1. Follow steps 2-6 in the [hybrid setup guide](/langsmith/deploy-hybrid#setup). The `config.langsmithWorkspaceId` value should be set to the workspace ID you noted in the prerequisites.
2. To deploy more than one data plane to the cluster, follow the rules listed [here](/langsmith/deploy-hybrid#configuring-additional-data-planes-in-the-same-cluster).

### Deploying to a different namespace in the same cluster

1. You will need to make some modifications to the `langsmith_config.yaml` file you created in step 3 of the [above setup instructions](/langsmith/deploy-self-hosted-full-platform#setup):
   * Set the `operator.watchNamespaces` field to the current namespace your self-hosted LangSmith instance is running in. This is to prevent clashes with the operator that will be added as part of the new data plane.
   * It is required to use the [Gateway API](/langsmith/self-host-ingress#option-2%3A-gateway-api) or an [Istio Gateway](/langsmith/self-host-ingress#option-3%3A-istio-gateway). Please adjust your `langsmith_config.yaml` file accordingly.
2. Run a `helm upgrade` to update your self-hosted LangSmith instance with the new config.
3. Follow steps 2-6 in the [hybrid setup guide](/langsmith/deploy-hybrid#setup). The `config.langsmithWorkspaceId` value should be set to the workspace ID you noted in the prerequisites. Remember that `config.watchNamespaces` should be set to different namespaces than the one used by the existing data plane!
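
The requirement that each data plane watches its own namespaces can be checked mechanically. This is a rough sketch under stated assumptions: `find_namespace_clashes` is a hypothetical helper, and the data plane names and namespaces are placeholders that you would replace with the values from your own config files.

```python
def find_namespace_clashes(data_planes: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each namespace watched by more than one data plane to those data planes."""
    watchers: dict[str, list[str]] = {}
    for plane, namespaces in data_planes.items():
        for namespace in namespaces:
            watchers.setdefault(namespace, []).append(plane)
    return {ns: sorted(planes) for ns, planes in watchers.items() if len(planes) > 1}


# Placeholder names: the existing data plane and one additional data plane.
planes = {
    "self-hosted-langsmith": {"langsmith"},
    "extra-data-plane": {"agents-team-a", "agents-team-b"},
}
print(find_namespace_clashes(planes))
```

An empty dictionary means no namespace is watched by two operators at once.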

## (Optional) Configure authentication for private registries

If your [Agent Server deployments](/langsmith/agent-server) will use images from private container registries (e.g., AWS ECR, Azure ACR, GCP Artifact Registry, private Docker registry), configure image pull secrets. This is a one-time infrastructure configuration that allows all deployments to automatically authenticate with your private registry.

**Step 1: Create a Kubernetes image pull secret**

```bash  theme={null}
kubectl create secret docker-registry langsmith-registry-secret \
    --docker-server=myregistry.com \
    --docker-username=your-username \
    --docker-password=your-password \
    --docker-email=your-email@example.com \
    -n langsmith
```

Replace the values with your registry credentials:

* `myregistry.com`: Your registry URL
* `your-username`: Your registry username
* `your-password`: Your registry password or access token
* `langsmith`: The Kubernetes namespace where LangSmith is installed
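
To make it clearer what the `kubectl create secret docker-registry` command stores, here is a minimal sketch that builds an equivalent Secret manifest by hand. It assumes the standard `kubernetes.io/dockerconfigjson` secret layout; the function names are hypothetical and the credentials are the same placeholders used above.

```python
import base64
import json


def build_dockerconfigjson(server: str, username: str, password: str, email: str) -> str:
    """Build the JSON document stored under `.dockerconfigjson` in the secret."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    entry = {"username": username, "password": password, "email": email, "auth": auth}
    return json.dumps({"auths": {server: entry}})


def build_secret_manifest(name: str, namespace: str, dockerconfigjson: str) -> dict:
    """Wrap the config in a Secret manifest of type kubernetes.io/dockerconfigjson."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        # Values under `data` are base64-encoded.
        "data": {".dockerconfigjson": base64.b64encode(dockerconfigjson.encode()).decode()},
    }


config_json = build_dockerconfigjson(
    "myregistry.com", "your-username", "your-password", "your-email@example.com"
)
manifest = build_secret_manifest("langsmith-registry-secret", "langsmith", config_json)
print(json.dumps(manifest, indent=2))
```

In practice the `kubectl` command is simpler. The sketch is mainly useful when secrets are generated by automation rather than typed by hand.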

**Step 2: Configure the secret in your `values.yaml`**

```yaml  theme={null}
images:
    imagePullSecrets:
    - name: langsmith-registry-secret
```

**Step 3: Apply during Helm installation/upgrade**

When you deploy or upgrade your LangSmith instance using Helm, this configuration will be applied. All user deployments created through the LangSmith UI will automatically inherit these registry credentials.

For registry-specific authentication methods (AWS ECR, Azure ACR, GCP Artifact Registry, etc.), refer to the [Kubernetes documentation on pulling images from private registries](https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/).

## Next steps

Once your infrastructure is set up, you're ready to deploy applications. See the deployment guides in the [Deployment tab](/langsmith/deployments) for instructions on building and deploying your applications.



# Self-host standalone servers
Source: https://docs.langchain.com/langsmith/deploy-standalone-server



This guide shows you how to deploy **standalone <Tooltip tip="The server that runs your LangGraph applications.">Agent Servers</Tooltip>** without the LangSmith UI or control plane. This is the most lightweight self-hosting option for running one or a few agents as independent services.

<Warning>
  This deployment option provides flexibility but requires you to manage your own infrastructure and configuration.

  For production workloads, consider [self-hosting the full LangSmith platform](/langsmith/self-hosted) or [deploying with the control plane](/langsmith/deploy-with-control-plane), which offer standardized deployment patterns and UI-based management.
</Warning>

<Note>
  **This is the setup page for deploying Agent Servers directly without the LangSmith platform.**

  Review the [self-hosted options](/langsmith/self-hosted) to understand:

  * [Standalone Server](/langsmith/self-hosted#standalone-server): What this guide covers (no UI, just servers).
  * [LangSmith](/langsmith/self-hosted#langsmith): For the full LangSmith platform with UI.
  * [LangSmith Deployment](/langsmith/self-hosted#langsmith-deployment): For UI-based deployment management.

  Before continuing, review the [standalone server overview](/langsmith/self-hosted#standalone-server).
</Note>

## Prerequisites

1. Use the [LangGraph CLI](/langsmith/cli) to [test your application locally](/langsmith/local-server).
2. Use the [LangGraph CLI](/langsmith/cli) to build a Docker image (i.e. `langgraph build`).
3. The following environment variables are needed for a data plane deployment:
   1. `REDIS_URI`: Connection details to a Redis instance. Redis will be used as a pub-sub broker to enable streaming real-time output from background runs. The value of `REDIS_URI` must be a valid [Redis connection URI](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url).

      <Note>
        **Shared Redis Instance**
        Multiple self-hosted deployments can share the same Redis instance. For example, for `Deployment A`, `REDIS_URI` can be set to `redis://<hostname_1>:<port>/1` and for `Deployment B`, `REDIS_URI` can be set to `redis://<hostname_1>:<port>/2`.

        `1` and `2` are different database numbers within the same instance, but `<hostname_1>` is shared. **The same database number cannot be used for separate deployments**.
      </Note>
   2. `DATABASE_URI`: Postgres connection details. Postgres will be used to store assistants, threads, runs, persist thread state and long-term memory, and to manage the state of the background task queue with 'exactly once' semantics. The value of `DATABASE_URI` must be a valid [Postgres connection URI](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING-URIS).

      <Note>
        **Shared Postgres Instance**
        Multiple self-hosted deployments can share the same Postgres instance. For example, for `Deployment A`, `DATABASE_URI` can be set to `postgres://<user>:<password>@/<database_name_1>?host=<hostname_1>` and for `Deployment B`, `DATABASE_URI` can be set to `postgres://<user>:<password>@/<database_name_2>?host=<hostname_1>`.

        `<database_name_1>` and `<database_name_2>` are different databases within the same instance, but `<hostname_1>` is shared. **The same database cannot be used for separate deployments**.
      </Note>
   3. `LANGSMITH_API_KEY`: LangSmith API key.
   4. `LANGGRAPH_CLOUD_LICENSE_KEY`: LangSmith license key. This will be used to authenticate once at server startup.
   5. `LANGSMITH_ENDPOINT`: To send traces to a [self-hosted LangSmith](/langsmith/self-hosted) instance, set `LANGSMITH_ENDPOINT` to the hostname of the self-hosted LangSmith instance.
4. Egress to `https://beacon.langchain.com` from your network. This is required for license verification and usage reporting if not running in air-gapped mode. See the [Egress documentation](/langsmith/self-host-egress) for more details.
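
When several deployments share one Redis or Postgres instance, it is easy to reuse a database by accident. The sketch below builds URIs in the formats shown in the notes above and flags deployments that point at the same target. The helper names, hostnames, and credentials are all hypothetical placeholders, and credentials containing special characters would need URL-encoding, which this sketch does not do.

```python
from urllib.parse import urlsplit


def redis_uri(host: str, port: int, db: int) -> str:
    """Redis URI with an explicit database number, as in the shared-instance note."""
    return f"redis://{host}:{port}/{db}"


def postgres_uri(user: str, password: str, database: str, host: str) -> str:
    """Postgres URI in the `@/<database>?host=<hostname>` form used in the note."""
    return f"postgres://{user}:{password}@/{database}?host={host}"


def find_shared_targets(uris: dict[str, str]) -> list[list[str]]:
    """Group deployments whose URIs resolve to the same database (credentials ignored)."""
    groups: dict[tuple, list[str]] = {}
    for deployment, uri in uris.items():
        parts = urlsplit(uri)
        key = (parts.scheme, parts.hostname, parts.port, parts.path, parts.query)
        groups.setdefault(key, []).append(deployment)
    return [sorted(names) for names in groups.values() if len(names) > 1]


redis_uris = {
    "Deployment A": redis_uri("redis.internal", 6379, 1),
    "Deployment B": redis_uri("redis.internal", 6379, 2),
}
print(find_shared_targets(redis_uris))
```

An empty list means each deployment has its own database number or database name, which is what the notes above require.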

<a id="helm" />

## Kubernetes

Use this [Helm chart](https://github.com/langchain-ai/helm/blob/main/charts/langgraph-cloud/README.md) to deploy an Agent Server to a Kubernetes cluster.

## Docker

Run the following `docker` command:

```shell  theme={null}
docker run \
    --env-file .env \
    -p 8123:8000 \
    -e REDIS_URI="foo" \
    -e DATABASE_URI="bar" \
    -e LANGSMITH_API_KEY="baz" \
    my-image
```

<Note>
  * Replace `my-image` with the name of the image you built in the prerequisite steps (from `langgraph build`), and provide appropriate values for `REDIS_URI`, `DATABASE_URI`, and `LANGSMITH_API_KEY`.
  * If your application requires additional environment variables, you can pass them in a similar way.
</Note>

## Docker Compose

Docker Compose YAML file:

```yml  theme={null}
volumes:
    langgraph-data:
        driver: local
services:
    langgraph-redis:
        image: redis:6
        healthcheck:
            test: redis-cli ping
            interval: 5s
            timeout: 1s
            retries: 5
    langgraph-postgres:
        image: postgres:16
        ports:
            - "5432:5432"
        environment:
            POSTGRES_DB: postgres
            POSTGRES_USER: postgres
            POSTGRES_PASSWORD: postgres
        volumes:
            - langgraph-data:/var/lib/postgresql/data
        healthcheck:
            test: pg_isready -U postgres
            start_period: 10s
            timeout: 1s
            retries: 5
            interval: 5s
    langgraph-api:
        image: ${IMAGE_NAME}
        ports:
            - "8123:8000"
        depends_on:
            langgraph-redis:
                condition: service_healthy
            langgraph-postgres:
                condition: service_healthy
        env_file:
            - .env
        environment:
            REDIS_URI: redis://langgraph-redis:6379
            LANGSMITH_API_KEY: ${LANGSMITH_API_KEY}
            DATABASE_URI: postgres://postgres:postgres@langgraph-postgres:5432/postgres?sslmode=disable
```

You can run the command `docker compose up` with this Docker Compose file in the same folder.

This will launch an Agent Server on port `8123`. To use a different port, change the `ports` mapping of the `langgraph-api` service. You can test if the application is healthy by running:

```shell  theme={null}
curl --request GET --url 0.0.0.0:8123/ok
```

Assuming everything is running correctly, you should see a response like:

```shell  theme={null}
{"ok":true}
```
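
The same health check can be scripted, for example as a readiness gate in a deployment pipeline. This is a minimal sketch that assumes the `/ok` endpoint and `{"ok":true}` response shown above. `is_healthy` and `fetch` are hypothetical helper names, and the fetch function is injectable so the logic can be exercised without a running server.

```python
import json
from typing import Callable
from urllib.request import urlopen


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def is_healthy(base_url: str, fetch_fn: Callable[[str], bytes] = fetch) -> bool:
    """Return True only if `<base_url>/ok` answers with a JSON object where ok is true."""
    try:
        payload = json.loads(fetch_fn(f"{base_url}/ok"))
    except (OSError, ValueError):
        # Connection failures and malformed bodies both count as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


# Usage against the Docker Compose setup above (requires the server to be running):
#   is_healthy("http://localhost:8123")
```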



# LangSmith on Cloud
Source: https://docs.langchain.com/langsmith/deploy-to-cloud



This is the comprehensive setup and management guide for deploying applications to LangSmith Cloud.

<Callout icon="zap" color="#4F46E5" iconType="regular">
  **Looking for a quick setup?** Try the [quickstart guide](/langsmith/deployment-quickstart) first.
</Callout>

Before setting up, review the [Cloud overview page](/langsmith/cloud) to understand the Cloud hosting model.

## Prerequisites

1. LangSmith applications are deployed from GitHub repositories. Configure and upload a LangSmith application to a GitHub repository in order to deploy it to LangSmith.
2. [Verify that the LangGraph API runs locally](/langsmith/local-server). If the API does not run successfully locally (for example, with `langgraph dev`), deploying to LangSmith will fail as well.

## Create new deployment

Starting from the <a href="https://smith.langchain.com/" target="_blank">LangSmith UI</a>:

1. In the left-hand navigation panel, select **Deployments**, which contains a list of existing deployments.
2. In the top-right corner, select **+ New Deployment** to create a new deployment.
3. In the `Create New Deployment` panel, fill out the required fields.
   1. `Deployment details`
      1. Select `Import from GitHub` and follow the GitHub OAuth workflow to install and authorize LangChain's `hosted-langserve` GitHub app to access the selected repositories. After installation is complete, return to the `Create New Deployment` panel and select the GitHub repository to deploy from the dropdown menu. **Note**: The GitHub user installing LangChain's `hosted-langserve` GitHub app must be an [owner](https://docs.github.com/en/organizations/managing-peoples-access-to-your-organization-with-roles/roles-in-an-organization#organization-owners) of the organization or account.
      2. Specify a name for the deployment.
      3. Specify the desired `Git Branch`. A deployment is linked to a branch. When a new revision is created, code for the linked branch will be deployed. The branch can be updated later in the [Deployment Settings](#deployment-settings).
      4. Specify the full path to the [LangGraph API config file](/langsmith/cli#configuration-file) including the file name. For example, if the file `langgraph.json` is in the root of the repository, simply specify `langgraph.json`.
      5. Check/uncheck checkbox to `Automatically update deployment on push to branch`. If checked, the deployment will automatically be updated when changes are pushed to the specified `Git Branch`. This setting can be enabled/disabled later in the [Deployment Settings](#deployment-settings).
   2. Select the desired `Deployment Type`.
      1. `Development` deployments are meant for non-production use cases and are provisioned with minimal resources.
      2. `Production` deployments can serve up to 500 requests/second and are provisioned with highly available storage with automatic backups.
   3. Determine if the deployment should be `Shareable through Studio`.
      1. If unchecked, the deployment will only be accessible with a valid LangSmith API key for the workspace.
      2. If checked, the deployment will be accessible through Studio to any LangSmith user. A direct URL to Studio for the deployment will be provided to share with other LangSmith users.
   4. Specify `Environment Variables` and secrets. See the [Environment Variables reference](/langsmith/env-var) to configure additional variables for the deployment.
      1. Sensitive values such as API keys (e.g. `OPENAI_API_KEY`) should be specified as secrets.
      2. Additional non-secret environment variables can be specified as well.
   5. A new LangSmith `Tracing Project` is automatically created with the same name as the deployment.
4. In the top-right corner, select `Submit`. After a few seconds, the `Deployment` view appears and the new deployment will be queued for provisioning.

## Create new revision

When [creating a new deployment](#create-new-deployment), a new revision is created by default. Subsequent revisions can be created to deploy new code changes.

Starting from the <a href="https://smith.langchain.com/" target="_blank">LangSmith UI</a>:

1. In the left-hand navigation panel, select **Deployments**, which contains a list of existing deployments.
2. Select an existing deployment to create a new revision for.
3. In the `Deployment` view, in the top-right corner, select `+ New Revision`.
4. In the `New Revision` modal, fill out the required fields.
   1. Specify the full path to the [LangGraph API config file](/langsmith/cli#configuration-file) including the file name. For example, if the file `langgraph.json` is in the root of the repository, simply specify `langgraph.json`.
   2. Determine if the deployment should be `Shareable through Studio`.
      1. If unchecked, the deployment will only be accessible with a valid LangSmith API key for the workspace.
      2. If checked, the deployment will be accessible through Studio to any LangSmith user. A direct URL to Studio for the deployment will be provided to share with other LangSmith users.
   3. Specify `Environment Variables` and secrets. Existing secrets and environment variables are prepopulated. See the [Environment Variables reference](/langsmith/env-var) to configure additional variables for the revision.
      1. Add new secrets or environment variables.
      2. Remove existing secrets or environment variables.
      3. Update the value of existing secrets or environment variables.
5. Select `Submit`. After a few seconds, the `New Revision` modal will close and the new revision will be queued for deployment.

## View build and server logs

Build and server logs are available for each revision.

Starting from the **Deployments** view:

1. Select the desired revision from the `Revisions` table. A panel slides open from the right-hand side and the `Build` tab is selected by default, which displays build logs for the revision.
2. In the panel, select the `Server` tab to view server logs for the revision. Server logs are only available after a revision has been deployed.
3. Within the `Server` tab, adjust the date/time range picker as needed. By default, the date/time range picker is set to the `Last 7 days`.

## View deployment metrics

Starting from the <a href="https://smith.langchain.com/" target="_blank">LangSmith UI</a>:

1. In the left-hand navigation panel, select **Deployments**, which contains a list of existing deployments.
2. Select an existing deployment to monitor.
3. Select the `Monitoring` tab to view the deployment metrics. See a list of [all available metrics](/langsmith/control-plane#monitoring).
4. Within the `Monitoring` tab, use the date/time range picker as needed. By default, the date/time range picker is set to the `Last 15 minutes`.

## Interrupt revision

Interrupting a revision will stop deployment of the revision.

<Warning>
  **Undefined Behavior**
  Interrupted revisions have undefined behavior. This is only useful if you need to deploy a new revision and you already have a revision "stuck" in progress. In the future, this feature may be removed.
</Warning>

Starting from the **Deployments** view:

1. Select the menu icon (three dots) on the right-hand side of the row for the desired revision from the `Revisions` table.
2. Select `Interrupt` from the menu.
3. A modal will appear. Review the confirmation message. Select `Interrupt revision`.

## Delete deployment

Starting from the <a href="https://smith.langchain.com/" target="_blank">LangSmith UI</a>:

1. In the left-hand navigation panel, select **Deployments**, which contains a list of existing deployments.
2. Select the menu icon (three dots) on the right-hand side of the row for the desired deployment and select `Delete`.
3. A `Confirmation` modal will appear. Select `Delete`.

## Deployment settings

Starting from the **Deployments** view:

1. In the top-right corner, select the gear icon (`Deployment Settings`).
2. Update the `Git Branch` to the desired branch.
3. Check/uncheck checkbox to `Automatically update deployment on push to branch`.
   1. Branch creation/deletion and tag creation/deletion events will not trigger an update. Only pushes to an existing branch will trigger an update.
   2. Pushes in quick succession to a branch will queue subsequent updates. Once a build completes, the most recent commit will begin building and the other queued builds will be skipped.
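
The queuing rule for pushes in quick succession can be illustrated with a small simulation. This is only a model of the behavior described above, not LangSmith's actual implementation: `built_commits`, the event names, and the commit identifiers are all made up for illustration.

```python
def built_commits(events: list[tuple[str, str]]) -> list[str]:
    """Return the commits that get built, given chronological events.

    Each event is ("push", <commit>) or ("done", "") when the running build completes.
    """
    built: list[str] = []
    building = False
    pending = None  # only the most recent queued commit is kept

    for kind, commit in events:
        if kind == "push":
            if building:
                pending = commit  # replaces any older queued push
            else:
                building = True
                built.append(commit)
        elif kind == "done":
            building = False
            if pending is not None:
                # The most recent commit starts building; older queued ones were skipped.
                building = True
                built.append(pending)
                pending = None
    return built


events = [("push", "a"), ("push", "b"), ("push", "c"), ("done", ""), ("done", "")]
print(built_commits(events))
```

In this run, commit `b` is never built because `c` arrived before the build of `a` finished.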

## Add or remove GitHub repositories

After installing and authorizing LangChain's `hosted-langserve` GitHub app, repository access for the app can be modified to add new repositories or remove existing repositories. If a new repository is created, it may need to be added explicitly.

1. From the GitHub profile, navigate to `Settings` > `Applications` > `hosted-langserve` > click `Configure`.
2. Under `Repository access`, select `All repositories` or `Only select repositories`. If `Only select repositories` is selected, new repositories must be explicitly added.
3. Click `Save`.
4. When creating a new deployment, the list of GitHub repositories in the dropdown menu will be updated to reflect the repository access changes.

## Allowlisting IP addresses

All traffic from LangSmith deployments created after January 6, 2025 comes through a NAT gateway.
This NAT gateway has several static IP addresses, depending on the region you are deploying in. Refer to the table below for the list of IP addresses to allowlist:

| US             | EU             |
| -------------- | -------------- |
| 35.197.29.146  | 34.90.213.236  |
| 34.145.102.123 | 34.13.244.114  |
| 34.169.45.153  | 34.32.180.189  |
| 34.82.222.17   | 34.34.69.108   |
| 35.227.171.135 | 34.32.145.240  |
| 34.169.88.30   | 34.90.157.44   |
| 34.19.93.202   | 34.141.242.180 |
| 34.19.34.50    | 34.32.141.108  |
| 34.59.244.194  |                |
| 34.9.99.224    |                |
| 34.68.27.146   |                |
| 34.41.178.137  |                |
| 34.123.151.210 |                |
| 34.135.61.140  |                |
| 34.121.166.52  |                |
| 34.31.121.70   |                |
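
If you maintain allowlists in code, the table above can be captured as data and queried. The following is a minimal sketch using Python's standard `ipaddress` module. The addresses are copied from the table, while `NAT_GATEWAY_IPS` and `region_for` are hypothetical names.

```python
import ipaddress
from typing import Optional

# Addresses copied from the allowlist table above.
NAT_GATEWAY_IPS = {
    "US": [
        "35.197.29.146", "34.145.102.123", "34.169.45.153", "34.82.222.17",
        "35.227.171.135", "34.169.88.30", "34.19.93.202", "34.19.34.50",
        "34.59.244.194", "34.9.99.224", "34.68.27.146", "34.41.178.137",
        "34.123.151.210", "34.135.61.140", "34.121.166.52", "34.31.121.70",
    ],
    "EU": [
        "34.90.213.236", "34.13.244.114", "34.32.180.189", "34.34.69.108",
        "34.32.145.240", "34.90.157.44", "34.141.242.180", "34.32.141.108",
    ],
}


def region_for(address: str) -> Optional[str]:
    """Return the region whose NAT gateway list contains `address`, or None."""
    candidate = ipaddress.ip_address(address)
    for region, addresses in NAT_GATEWAY_IPS.items():
        if any(candidate == ipaddress.ip_address(entry) for entry in addresses):
            return region
    return None


print(region_for("34.9.99.224"))
print(region_for("10.0.0.1"))
```

Parsing through `ipaddress` rejects malformed input with a `ValueError` instead of silently failing a string comparison.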



# Deploy with control plane
Source: https://docs.langchain.com/langsmith/deploy-with-control-plane



This guide shows you how to deploy your applications to [hybrid](/langsmith/hybrid) or [self-hosted](/langsmith/self-hosted) instances with a [control plane](/langsmith/control-plane). With a control plane, you build Docker images locally, push them to a registry that your Kubernetes cluster has access to, and deploy them with the [LangSmith UI](https://smith.langchain.com).

<Note>
  **This guide is for deploying applications, not setting up infrastructure.**

  Before using this guide, you must have already completed infrastructure setup:

  * **[Hybrid setup](/langsmith/deploy-hybrid)**: For hybrid hosting.
  * **[Enable LangSmith Deployment](/langsmith/deploy-self-hosted-full-platform)**: For self-hosted with control plane.

  If you haven't set up your infrastructure yet, start with the [Platform setup section](/langsmith/platform-setup).
</Note>

## Overview

Applications deployed to hybrid or self-hosted LangSmith instances with control plane use Docker images. In this guide, the application deployment workflow is:

1. Test your application locally using `langgraph dev` or [Studio](/langsmith/studio).
2. Build a Docker image using the `langgraph build` command.
3. Push the image to a container registry accessible by your infrastructure.
4. Deploy from the [control plane UI](/langsmith/control-plane#control-plane-ui) by specifying the image URL.

## Prerequisites

Before completing this guide, you'll need the following:

* Completed infrastructure setup to enable your [data plane](/langsmith/data-plane) to receive application deployments:
  * [Hybrid setup](/langsmith/deploy-hybrid): Installs data plane components (listener, operator, CRDs) in your Kubernetes cluster that connect to LangChain's managed control plane.
  * [Enable LangSmith Deployment](/langsmith/deploy-self-hosted-full-platform): Enables LangSmith Deployment on your self-hosted LangSmith instance.
* Access to the [LangSmith UI](https://smith.langchain.com) with LangSmith Deployment enabled.
* A container registry accessible by your Kubernetes cluster. If using a private registry that requires authentication, you must configure image pull secrets as part of your infrastructure setup. Refer to [Private registry authentication](#private-registry-authentication).

## Step 1. Test locally

Before deploying, test your application locally. You can use the [LangGraph CLI](/langsmith/cli#dev) to run an Agent Server in development mode:

```bash  theme={null}
langgraph dev
```

For a full guide to local testing, refer to the [Local server quickstart](/langsmith/local-server).

## Step 2. Build Docker image

Build a Docker image of your application using the [`langgraph build`](/langsmith/cli#build) command:

```bash  theme={null}
langgraph build -t my-image
```

Build command options include:

| Option               | Default          | Description                                                       |
| -------------------- | ---------------- | ----------------------------------------------------------------- |
| `-t, --tag TEXT`     | Required         | Tag for the Docker image                                          |
| `--platform TEXT`    |                  | Target platform(s) to build for (e.g., `linux/amd64,linux/arm64`) |
| `--pull / --no-pull` | `--pull`         | Build with latest remote Docker image                             |
| `-c, --config FILE`  | `langgraph.json` | Path to configuration file                                        |

Example with platform specification:

```bash  theme={null}
langgraph build --platform linux/amd64 -t my-image:v1.0.0
```

For full details, see the [CLI reference](/langsmith/cli#build).
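
In build scripts it can be convenient to assemble the `langgraph build` invocation from parameters. The sketch below uses only the options listed in the table above. `langgraph_build_args` is a hypothetical helper, and it only constructs the argument list rather than running the command.

```python
import shlex
from typing import Optional


def langgraph_build_args(
    tag: str,
    platform: Optional[str] = None,
    pull: bool = True,
    config: Optional[str] = None,
) -> list[str]:
    """Assemble a `langgraph build` argument list from the documented options."""
    args = ["langgraph", "build", "-t", tag]
    if platform:
        args += ["--platform", platform]
    if not pull:
        args.append("--no-pull")  # `--pull` is the default, so only the opt-out is emitted
    if config:
        args += ["-c", config]
    return args


args = langgraph_build_args("my-image:v1.0.0", platform="linux/amd64")
print(shlex.join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where the LangGraph CLI is installed.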

## Step 3. Push to container registry

Push your image to a container registry accessible by your Kubernetes cluster. The specific commands depend on your registry provider.

<Tip>
  Tag your images with version information (e.g., `my-registry.com/my-app:v1.0.0`) to make rollbacks easier.
</Tip>
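
As one possible way to script this step for a Docker-compatible registry, the sketch below composes a versioned image reference and the matching `docker tag` and `docker push` commands. The registry host, application name, and helper functions are illustrative assumptions. Your registry provider may require an additional login step that is not shown here.

```python
def image_reference(registry: str, app: str, version: str) -> str:
    """Compose a versioned image reference such as my-registry.com/my-app:v1.0.0."""
    return f"{registry}/{app}:{version}"


def push_commands(local_image: str, remote_image: str) -> list[list[str]]:
    """Return the docker commands that retag a local image and push it."""
    return [
        ["docker", "tag", local_image, remote_image],
        ["docker", "push", remote_image],
    ]


remote = image_reference("my-registry.com", "my-app", "v1.0.0")
for command in push_commands("my-image:v1.0.0", remote):
    print(" ".join(command))
```

Keeping the version in the reference means the previous image stays available in the registry, so rolling back only requires pointing a new revision at the earlier tag.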

## Step 4. Deploy with the control plane UI

The [control plane UI](/langsmith/control-plane#control-plane-ui) allows you to create and manage deployments, view logs and metrics, and update configurations. To create a new deployment in the [LangSmith UI](https://smith.langchain.com):

1. In the left-hand navigation panel, select **Deployments**.
2. In the top-right corner, select **+ New Deployment**.
3. In the deployment configuration panel, provide:
   * **Image URL**: The full image URL you pushed in [Step 3](#step-3-push-to-container-registry).
   * **Listener/Compute ID**: Select the listener configured for your infrastructure.
   * **Namespace**: The Kubernetes namespace to deploy to.
   * **Environment variables**: Any required configuration (API keys, etc.).
   * Other deployment settings as needed.
4. Select **Submit**.

The control plane will coordinate with your [data plane](/langsmith/data-plane) listener to deploy your application.

After creating a deployment, the infrastructure is [provisioned asynchronously](/langsmith/control-plane#asynchronous-deployment). Deployment can take up to several minutes, with initial deployments taking longer due to database creation.

From the control plane UI, you can view build logs, server logs, and deployment metrics including CPU/memory usage, replicas, and API performance. For more details, refer to the [control plane monitoring documentation](/langsmith/control-plane#monitoring).

<Note>
  A [LangSmith Observability tracing project](/langsmith/observability) is automatically created for each deployment with the same name as the deployment. Tracing environment variables are set automatically by the control plane.
</Note>

## Update deployment

To deploy a new version of your application, create a [new revision](/langsmith/control-plane#revisions):

Starting from the LangSmith UI:

1. In the left-hand navigation panel, select **Deployments**.
2. Select an existing deployment.
3. In the Deployment view, select **+ New Revision** in the top-right corner.
4. Update the configuration:
   * Update the **Image URL** to your new image version.
   * Update environment variables if needed.
   * Adjust other settings as needed.
5. Select **Submit**.

## Private registry authentication

If your container registry requires authentication (e.g., AWS ECR, Azure ACR, GCP Artifact Registry, private Docker registry), you must configure Kubernetes image pull secrets before deploying applications. This is a one-time infrastructure configuration.

<Note>
  **This configuration is done at the infrastructure level, not per-deployment.** Once configured, all deployments automatically inherit the registry credentials.
</Note>

The configuration steps depend on your deployment type:

* **Self-hosted with control plane**: Configure `imagePullSecrets` in your LangSmith Helm chart's `values.yaml` file. See the detailed steps in the [Enable LangSmith Deployment guide](/langsmith/deploy-self-hosted-full-platform#setup).
* **Hybrid**: Configure `imagePullSecrets` in your `langgraph-dataplane-values.yaml` file using the same format.

For detailed steps on creating image pull secrets for different registry providers, refer to the [Kubernetes documentation on pulling images from private registries](https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/).

## Next steps

* **[Control plane](/langsmith/control-plane)**: Learn more about control plane features.
* **[Data plane](/langsmith/data-plane)**: Understand data plane architecture.
* **[Observability](/langsmith/observability)**: Monitor your deployments with automatic tracing.
* **[Studio](/langsmith/studio)**: Test and debug deployed applications.
* **[LangGraph CLI](/langsmith/cli)**: Full CLI reference documentation.



# Deploy your app to Cloud
Source: https://docs.langchain.com/langsmith/deployment-quickstart



This is a quickstart guide for deploying your first application to LangSmith Cloud.

<Tip>
  For a comprehensive Cloud deployment guide with all configuration options, refer to the [Cloud deployment setup guide](/langsmith/deploy-to-cloud).
</Tip>

## Prerequisites

Before you begin, ensure you have the following:

* A [GitHub account](https://github.com/)
* A [LangSmith account](https://smith.langchain.com/) (free to sign up)

## 1. Create a repository on GitHub

To deploy an application to **LangSmith**, your application code must reside in a GitHub repository. Both public and private repositories are supported. For this quickstart, use the [`new-langgraph-project` template](https://github.com/langchain-ai/new-langgraph-project) for your application:

1. Go to the [`new-langgraph-project` repository](https://github.com/langchain-ai/new-langgraph-project) or [`new-langgraphjs-project` template](https://github.com/langchain-ai/new-langgraphjs-project).
2. Click the `Fork` button in the top right corner to fork the repository to your GitHub account.
3. Click **Create fork**.

## 2. Deploy to LangSmith

1. Log in to [LangSmith](https://smith.langchain.com/).
2. In the left sidebar, select **Deployments**.
3. Click the **+ New Deployment** button. A pane will open where you can fill in the required fields.
4. If you are a first time user or adding a private repository that has not been previously connected, click the **Import from GitHub** button and follow the instructions to connect your GitHub account.
5. Select your New LangGraph Project repository.
6. Click **Submit** to deploy.
   This may take about 15 minutes to complete. You can check the status in the **Deployment details** view.

## 3. Test your application in Studio

Once your application is deployed:

1. Select the deployment you just created to view more details.
2. Click the **Studio** button in the top right corner. [Studio](/langsmith/studio) will open to display your graph.

## 4. Get the API URL for your deployment

1. In the **Deployment details** view, click the **API URL** to copy it to your clipboard.

## 5. Test the API

You can now test the API:

<Tabs>
  <Tab title="Python SDK (Async)">
    1. Install the LangGraph Python SDK:

    ```shell  theme={null}
    pip install langgraph-sdk
    ```

    2. Send a message to the assistant (threadless run):

    ```python  theme={null}
    from langgraph_sdk import get_client

    client = get_client(url="your-deployment-url", api_key="your-langsmith-api-key")

    async for chunk in client.runs.stream(
        None,  # Threadless run
        "agent", # Name of assistant. Defined in langgraph.json.
        input={
            "messages": [{
                "role": "human",
                "content": "What is LangGraph?",
            }],
        },
        stream_mode="updates",
    ):
        print(f"Receiving new event of type: {chunk.event}...")
        print(chunk.data)
        print("\n\n")
    ```
  </Tab>

  <Tab title="Python SDK (Sync)">
    1. Install the LangGraph Python SDK:

    ```shell  theme={null}
    pip install langgraph-sdk
    ```

    2. Send a message to the assistant (threadless run):

    ```python  theme={null}
    from langgraph_sdk import get_sync_client

    client = get_sync_client(url="your-deployment-url", api_key="your-langsmith-api-key")

    for chunk in client.runs.stream(
        None,  # Threadless run
        "agent", # Name of assistant. Defined in langgraph.json.
        input={
            "messages": [{
                "role": "human",
                "content": "What is LangGraph?",
            }],
        },
        stream_mode="updates",
    ):
        print(f"Receiving new event of type: {chunk.event}...")
        print(chunk.data)
        print("\n\n")
    ```
  </Tab>

  <Tab title="JavaScript SDK">
    1. Install the LangGraph JS SDK:

    ```shell  theme={null}
    npm install @langchain/langgraph-sdk
    ```

    2. Send a message to the assistant (threadless run):

    ```js  theme={null}
    const { Client } = await import("@langchain/langgraph-sdk");

    const client = new Client({ apiUrl: "your-deployment-url", apiKey: "your-langsmith-api-key" });

    const streamResponse = client.runs.stream(
        null, // Threadless run
        "agent", // Assistant ID
        {
            input: {
                "messages": [
                    { "role": "user", "content": "What is LangGraph?"}
                ]
            },
            streamMode: "messages",
        }
    );

    for await (const chunk of streamResponse) {
        console.log(`Receiving new event of type: ${chunk.event}...`);
        console.log(JSON.stringify(chunk.data));
        console.log("\n\n");
    }
    ```
  </Tab>

  <Tab title="Rest API">
    ```bash  theme={null}
    curl -s --request POST \
        --url <DEPLOYMENT_URL>/runs/stream \
        --header 'Content-Type: application/json' \
        --header "X-Api-Key: <LANGSMITH API KEY>" \
        --data "{
            \"assistant_id\": \"agent\",
            \"input\": {
                \"messages\": [
                    {
                        \"role\": \"human\",
                        \"content\": \"What is LangGraph?\"
                    }
                ]
            },
            \"stream_mode\": \"updates\"
        }"
    ```
  </Tab>
</Tabs>
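
If you prefer not to install an SDK, the REST call shown above can also be assembled with Python's standard library. The following is a minimal sketch rather than part of the LangGraph SDK: `build_stream_request` is a hypothetical helper, and the URL and API key values are placeholders you need to replace. It mirrors the endpoint, headers, and JSON body of the curl example and only constructs the request.

```python
import json
import urllib.request


def build_stream_request(deployment_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the deployment's /runs/stream endpoint.

    Hypothetical helper for illustration; it mirrors the curl example's
    URL, headers, and JSON body.
    """
    payload = {
        "assistant_id": "agent",  # Name of the assistant, defined in langgraph.json.
        "input": {"messages": [{"role": "human", "content": question}]},
        "stream_mode": "updates",
    }
    return urllib.request.Request(
        url=deployment_url.rstrip("/") + "/runs/stream",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Api-Key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your own deployment URL and API key.
    request = build_stream_request(
        "https://example.invalid", "your-langsmith-api-key", "What is LangGraph?"
    )
    print(request.get_method(), request.full_url)
    # To send the request, iterate over the streamed response line by line:
    # with urllib.request.urlopen(request) as response:
    #     for line in response:
    #         print(line.decode("utf-8"), end="")
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.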

## Next steps

You've successfully deployed your application to LangSmith Cloud. Here are some next steps:

* **Explore Studio**: Use [Studio](/langsmith/studio) to visualize and debug your graph interactively.
* **Monitor your app**: Set up [observability](/langsmith/observability) with traces, dashboards, and alerts.
* **Learn more about Cloud**: See the [complete Cloud setup guide](/langsmith/deploy-to-cloud) for all configuration options.



# LangSmith Deployment
Source: https://docs.langchain.com/langsmith/deployments



<Callout icon="rocket" color="#4F46E5" iconType="regular">
  **Start here if you're building or operating agent applications.** This section is about deploying **your application**. If you need to set up LangSmith infrastructure, the [Platform setup section](/langsmith/platform-setup) covers infrastructure options (cloud, hybrid, self-hosted) and setup guides for hybrid and self-hosted deployments.
</Callout>

This section covers how to package, build, and deploy your *agents* and applications as [Agent Servers](/langsmith/agent-server).

A typical deployment workflow consists of the following steps:

<Steps>
  <Step title={<a href="/langsmith/local-server">Test locally</a>}>
    Run your application on a local server.
  </Step>

  <Step title={<a href="/langsmith/application-structure">Configure app for deployment</a>}>
    Set up dependencies, project structure, and environment configuration.
  </Step>

  <Step title={<a href="/langsmith/platform-setup">Choose hosting</a>}>
    (Required for deployment) Select Cloud, Hybrid, or Self-hosted.
  </Step>

  <Step title="Deploy your app">
    * [**Cloud**](/langsmith/deploy-to-cloud): Push code from a git repository
    * [**Hybrid or Self-hosted with control plane**](/langsmith/deploy-with-control-plane): Build and push Docker images, deploy via UI
    * [**Standalone servers**](/langsmith/deploy-standalone-server): Deploy directly without control plane
  </Step>

  <Step title={<a href="/langsmith/observability">Monitor & manage</a>}>
    Track traces, alerts, and dashboards.
  </Step>
</Steps>

## What you'll learn

* Configure your [app for deployment](/langsmith/application-structure) (dependencies, [project setup](/langsmith/setup-app-requirements-txt), and [monorepo support](/langsmith/monorepo-support)).
* Build, deploy, and update [Agent Servers](/langsmith/agent-server).
* Secure your deployments with [authentication and access control](/langsmith/auth).
* Customize your server runtime ([lifespan hooks](/langsmith/custom-lifespan), [middleware](/langsmith/custom-middleware), and [routes](/langsmith/custom-routes)).
* Debug, observe, and troubleshoot deployed agents using the [Studio UI](/langsmith/studio).

<Columns cols={1}>
  <Card title="Get started with deployment" icon="robot" href="/langsmith/application-structure" cta="Configure your app">
    Package, build, and deploy your agents and graphs to Agent Server.
  </Card>
</Columns>

### Related

* [Agent Server](/langsmith/agent-server)
* [Application structure](/langsmith/application-structure)
* [Local server testing](/langsmith/local-server)



# Implement distributed tracing
Source: https://docs.langchain.com/langsmith/distributed-tracing



Sometimes, you need to trace a request across multiple services.

LangSmith supports distributed tracing out of the box, linking runs within a trace across services using context propagation headers (`langsmith-trace` and optional `baggage` for metadata/tags).

Example client-server setup:

* Trace starts on client
* Continues on server

## Distributed tracing in Python

```python  theme={null}
# client.py
from langsmith.run_helpers import get_current_run_tree, traceable
import httpx

@traceable
async def my_client_function():
    headers = {}
    async with httpx.AsyncClient(base_url="...") as client:
        if run_tree := get_current_run_tree():
            # Add the `langsmith-trace` (and optional `baggage`) headers
            headers.update(run_tree.to_headers())
        return await client.post("/my-route", headers=headers)
```

The server (or other service) can then continue the trace by handling the headers appropriately. If you are using an ASGI app such as Starlette or FastAPI, you can connect the distributed trace using LangSmith's `TracingMiddleware`.

<Info>
  The `TracingMiddleware` class was added in `langsmith==0.1.133`.
</Info>

Example using FastAPI:

```python  theme={null}
from langsmith import traceable
from langsmith.middleware import TracingMiddleware
from fastapi import FastAPI, Request

app = FastAPI()  # TracingMiddleware targets ASGI apps such as FastAPI and Starlette
app.add_middleware(TracingMiddleware)

@traceable
async def some_function():
    ...

@app.post("/my-route")
async def fake_route(request: Request):
    return await some_function()
```

Or in Starlette:

```python  theme={null}
from starlette.applications import Starlette
from starlette.middleware import Middleware
from langsmith.middleware import TracingMiddleware

routes = ...
middleware = [
    Middleware(TracingMiddleware),
]
app = Starlette(..., middleware=middleware)
```

If you are using other server frameworks, you can always "receive" the distributed trace by passing the request headers in as the parent run context, either with the `tracing_context` context manager or through `langsmith_extra`:

```python  theme={null}
# server.py
import langsmith as ls
from fastapi import FastAPI, Request

@ls.traceable
async def my_application():
    ...

app = FastAPI()  # Or Flask, Django, or any other framework

@app.post("/my-route")
async def fake_route(request: Request):
    # request.headers:  {"langsmith-trace": "..."}
    # as well as optional metadata/tags in `baggage`
    with ls.tracing_context(parent=request.headers):
        return await my_application()
```

The example above uses the `tracing_context` context manager. You can also directly specify the parent run context in the `langsmith_extra` parameter of a method wrapped with `@traceable`.

```python  theme={null}
# ... same as above

@app.post("/my-route")
async def fake_route(request: Request):
    # request.headers:  {"langsmith-trace": "..."}
    # Await the coroutine and return its result so the traced function actually runs
    return await my_application(langsmith_extra={"parent": request.headers})
```
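
When a service sits between the client and the traced server (for example, a gateway that does not use LangSmith itself), the trace can only continue if that service forwards the propagation headers. The sketch below assumes nothing about LangSmith beyond the two header names mentioned above; `extract_trace_headers` is a hypothetical helper and is not part of the `langsmith` package.

```python
from typing import Dict, Mapping

# Header names used for context propagation, as described above.
TRACE_HEADER_NAMES = ("langsmith-trace", "baggage")


def extract_trace_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Return only the tracing headers from an incoming request.

    HTTP header names are case-insensitive, so matching is done on
    lowercased names. Headers that are absent are simply skipped.
    """
    lowered = {name.lower(): value for name, value in incoming.items()}
    return {name: lowered[name] for name in TRACE_HEADER_NAMES if name in lowered}


if __name__ == "__main__":
    # Placeholder header values for illustration only.
    incoming = {
        "Content-Type": "application/json",
        "Langsmith-Trace": "<trace-value>",
        "Baggage": "<baggage-value>",
    }
    # Pass the result as the headers of the outgoing request to the next service.
    print(extract_trace_headers(incoming))
```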

## Distributed tracing in TypeScript

<Note>
  Distributed tracing in TypeScript requires `langsmith` version `>=0.1.31`.
</Note>

First, we obtain the current run tree from the client and convert it to `langsmith-trace` and `baggage` header values, which we can pass to the server:

```typescript  theme={null}
// client.mts
import { getCurrentRunTree, traceable } from "langsmith/traceable";

const client = traceable(
    async () => {
        const runTree = getCurrentRunTree();
        return await fetch("...", {
            method: "POST",
            headers: runTree.toHeaders(),
        }).then((a) => a.text());
    },
    { name: "client" }
);

await client();
```

Then, the server converts the headers back to a run tree, which it uses to further continue the tracing.

To pass the newly created run tree to a traceable function, we can use the `withRunTree` helper, which will ensure the run tree is propagated within traceable invocations.

<CodeGroup>
  ```typescript Express.JS theme={null}
  // server.mts
  import { RunTree } from "langsmith";
  import { traceable, withRunTree } from "langsmith/traceable";
  import express from "express";
  import bodyParser from "body-parser";

  const server = traceable(
      (text: string) => `Hello from the server! Received "${text}"`,
      { name: "server" }
  );

  const app = express();
  app.use(bodyParser.text());

  app.post("/", async (req, res) => {
      const runTree = RunTree.fromHeaders(req.headers);
      const result = await withRunTree(runTree, () => server(req.body));
      res.send(result);
  });
  ```

  ```typescript Hono theme={null}
  // server.mts
  import { RunTree } from "langsmith";
  import { traceable, withRunTree } from "langsmith/traceable";
  import { Hono } from "hono";

  const server = traceable(
      (text: string) => `Hello from the server! Received "${text}"`,
      { name: "server" }
  );

  const app = new Hono();

  app.post("/", async (c) => {
      const body = await c.req.text();
      const runTree = RunTree.fromHeaders(c.req.raw.headers);
      const result = await withRunTree(runTree, () => server(body));
      return c.body(result);
  });
  ```
</CodeGroup>



# Self-host LangSmith with Docker
Source: https://docs.langchain.com/langsmith/docker



<Info>
  Self-hosting LangSmith is an add-on to the Enterprise Plan designed for our largest, most security-conscious customers. See our [pricing page](https://www.langchain.com/pricing) for more detail, and [contact our sales team](https://www.langchain.com/contact-sales) if you want to get a license key to trial LangSmith in your environment.
</Info>

This guide provides instructions for running the **LangSmith platform** locally using Docker for development and testing purposes.

<Warning>
  **For development/testing only**. Do not use Docker Compose for production. For production deployments, use [Kubernetes](/langsmith/kubernetes).
</Warning>

<Note>
  This page describes how to install the base [LangSmith platform](/langsmith/self-hosted#langsmith) for local testing. It does **not** include deployment management features. For more details, review the [self-hosted options](/langsmith/self-hosted).
</Note>

Note that Docker Compose is limited to local development environments only and does not extend support to container services such as AWS Elastic Container Service, Azure Container Instances, and Google Cloud Run.

## Prerequisites

1. Ensure Docker is installed and running on your system. You can verify this by running:

   ```bash  theme={null}
   docker info
   ```

   If you don't see any server information in the output, make sure Docker is installed correctly and launch the Docker daemon.

   1. Recommended: At least 4 vCPUs, 16GB Memory available on your machine.
      * You may need to tune resource requests/limits for all of our different services based off of organization size/usage
   2. Disk Space: LangSmith can potentially require a lot of disk space. Ensure you have enough disk space available.

2. LangSmith License Key
   1. You can get this from your LangChain representative. [Contact our sales team](https://www.langchain.com/contact-sales) for more information.

3. API Key Salt

   1. This is a secret key that you can generate. It should be a random string of characters.
   2. You can generate this using the following command:

   ```bash  theme={null}
   openssl rand -base64 32
   ```

4. Egress to `https://beacon.langchain.com` (if not running in offline mode)
   1. LangSmith requires egress to `https://beacon.langchain.com` for license verification and usage reporting. This is required for LangSmith to function properly. You can find more information on egress requirements in the [Egress](/langsmith/self-host-egress) section.

5. Configuration
   1. There are several configuration options that you can set in the `.env` file. You can find more information on the available configuration options in the [Configuration](/langsmith/self-host-scale) section.
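
If `openssl` is not available on your machine, the API key salt from prerequisite 3 can be generated with Python's standard library instead. This is a minimal sketch of an equivalent to `openssl rand -base64 32`; `generate_api_key_salt` is a hypothetical helper name, not something shipped with LangSmith.

```python
import base64
import secrets


def generate_api_key_salt(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure random data, base64-encoded."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Prints a fresh random value on every run; store it as a secret.
    print(generate_api_key_salt())
```

The `secrets` module draws from the operating system's secure random source, which makes it suitable for generating secret values like this one.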

## Running via Docker Compose

The following explains how to run LangSmith using Docker Compose. This is the most flexible way to run LangSmith without Kubernetes. The default configuration for Docker Compose is intended for local testing only and not for instances where any services are exposed to the public internet. **In production, we highly recommend using a secured Kubernetes environment.**

### 1. Fetch the LangSmith `docker-compose.yml` file

You can find the `docker-compose.yml` file and related files in the LangSmith SDK repository here: [*LangSmith Docker Compose File*](https://github.com/langchain-ai/helm/blob/main/charts/langsmith/docker-compose/docker-compose.yaml)

Copy the `docker-compose.yml` file and all files in that directory from the LangSmith SDK to your project directory.

* Ensure that you copy the `users.xml` file as well.

### 2. Configure environment variables

1. Copy the `.env.example` file from the LangSmith SDK to your project directory and rename it to `.env`.
2. Configure the appropriate values in the `.env` file. You can find the available configuration options in the [Configuration](/langsmith/self-hosted) section.

You can also set these environment variables in the `docker-compose.yml` file directly or export them in your terminal. We recommend setting them in the `.env` file.

### 3. Start server

Start the LangSmith application by executing the following command in your terminal:

```bash  theme={null}
docker-compose up
```

You can also run the server in the background by running:

```bash  theme={null}
docker-compose up -d
```

### Validate your deployment:

1. Curl the exposed port of the `cli-langchain-frontend-1` container:

   ```bash  theme={null}
   curl localhost:1980/info
   {"version":"0.5.7","license_expiration_time":"2033-05-20T20:08:06","batch_ingest_config":{"scale_up_qsize_trigger":1000,"scale_up_nthreads_limit":16,"scale_down_nempty_trigger":4,"size_limit":100,"size_limit_bytes":20971520}}
   ```

2. Visit the exposed port of the `cli-langchain-frontend-1` container in your browser.

   The LangSmith UI should be visible/operational at `http://localhost:1980`

   <img src="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=5310f686e7b9eebaaee4fe2a152a8675" alt=".langsmith_ui.png" data-og-width="2886" width="2886" data-og-height="1698" height="1698" data-path="langsmith/images/langsmith-ui.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?w=280&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=5f155ce778ca848f89fefff237b69bcb 280w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?w=560&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=1d55d4068a9f53387c129b4688b0971e 560w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?w=840&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=feb20198d67249ece559e5fd0e6d8e98 840w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?w=1100&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=3e5eba764d911e567d5aaa9e5702327b 1100w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?w=1650&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=d45af56632578a8d1b05e546dfc8d01d 1650w, https://mintcdn.com/langchain-5e9cc07a/4kN8yiLrZX_amfFn/langsmith/images/langsmith-ui.png?w=2500&fit=max&auto=format&n=4kN8yiLrZX_amfFn&q=85&s=16a49517a6c224930fdb81c9ccde5527 2500w" />
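
The same `/info` check can be scripted, for example as a smoke test after `docker-compose up -d`. The following is a minimal sketch using only the standard library. It assumes the default `http://localhost:1980` address used above and the response fields shown in the sample output; `parse_info` and `fetch_info` are hypothetical helper names.

```python
import json
import urllib.request


def parse_info(body: str) -> dict:
    """Extract the fields of interest from an /info response body."""
    info = json.loads(body)
    return {
        "version": info["version"],
        "license_expiration_time": info["license_expiration_time"],
    }


def fetch_info(base_url: str = "http://localhost:1980", timeout: float = 5.0) -> dict:
    """Query the /info endpoint of a running instance and summarize the response.

    Raises urllib.error.URLError if the instance is not reachable.
    """
    with urllib.request.urlopen(base_url + "/info", timeout=timeout) as response:
        return parse_info(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_info())
```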

### Checking the logs

If, at any point, you want to check whether the server is running and see the logs, run:

```bash  theme={null}
docker-compose logs
```

### Stopping the server

```bash  theme={null}
docker-compose down
```

## Using LangSmith

Now that LangSmith is running, you can start using it to trace your code. You can find more information on how to use self-hosted LangSmith in the [self-hosted usage guide](/langsmith/self-hosted).

Your LangSmith instance is now running but may not be fully set up yet.

If you used one of the basic configs, you may have deployed a no-auth configuration. In this state, there is no authentication and no concept of user accounts or API keys, so traces can be submitted directly without an API key as long as the hostname is passed to the LangChain tracer or LangSmith SDK.

As a next step, it is strongly recommended you work with your infrastructure administrators to:

* Set up DNS for your LangSmith instance to enable easier access
* Configure SSL to ensure in-transit encryption of traces submitted to LangSmith
* Configure LangSmith for [OAuth authentication](/langsmith/self-host-sso) or [basic authentication](/langsmith/self-host-basic-auth) to secure your LangSmith instance
* Secure access to your Docker environment to limit access to only the LangSmith frontend and API
* Connect LangSmith to secured Postgres and Redis instances



# Double texting
Source: https://docs.langchain.com/langsmith/double-texting



<Info>
  **Prerequisites**

  * [Agent Server](/langsmith/agent-server)
</Info>

Many times users might interact with your graph in unintended ways.
For instance, a user may send one message and before the graph has finished running send a second message.
More generally, users may invoke the graph a second time before the first run has finished.
We call this "double texting".

<Note>
  Double texting is a feature of LangSmith Deployment. It is not available in the [LangGraph open source framework](/oss/python/langgraph/overview).
</Note>

<img src="https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=1cae1e8cd4920872e7992460b081f76d" alt="Double-text strategies across first vs. second run: Reject keeps only the first; Enqueue runs the second afterward; Interrupt halts the first to run the second; Rollback reverts the first and reruns with the second." data-og-width="1886" width="1886" data-og-height="648" height="648" data-path="langsmith/images/double-texting.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?w=280&fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=67fc4d3817141da00d0f0e0b5c6de093 280w, https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?w=560&fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=2c9cf620db602c51a7e3804cb0815058 560w, https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?w=840&fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=7e16e946f3c616476fd99b40aa731a3c 840w, https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?w=1100&fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=98032b626677cf73744a3922112abda4 1100w, https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?w=1650&fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=2537ca1524e871001cd454f41dca6597 1650w, https://mintcdn.com/langchain-5e9cc07a/Hucw5hmCzWXDanL-/langsmith/images/double-texting.png?w=2500&fit=max&auto=format&n=Hucw5hmCzWXDanL-&q=85&s=d37e7e8cd1c4a5bad8cc399d0b878382 2500w" />

## Reject

This option rejects any additional incoming runs while a current run is in progress and prevents concurrent execution or double texting.

For configuring the reject double text option, refer to the [how-to guide](/langsmith/reject-concurrent).

## Enqueue

This option allows the current run to finish before processing any new input. Incoming requests are queued and executed sequentially once prior runs complete.

For configuring the enqueue double text option, refer to the [how-to guide](/langsmith/enqueue-concurrent).

## Interrupt

This option halts the current execution and preserves the progress made up to the interruption point. The new user input is then inserted, and execution continues from that state.

When using this option, your graph must account for potential edge cases. For example, a tool call may have been initiated but not yet completed at the time of interruption. In these cases, handling or removing partial tool calls may be necessary to avoid unresolved operations.

For configuring the interrupt double text option, refer to the [how-to guide](/langsmith/interrupt-concurrent).

## Rollback

This option halts the current execution and reverts all progress—including the initial run input—before processing the new user input. The new input is treated as a fresh run, starting from the initial state.

For configuring the rollback double text option, refer to the [how-to guide](/langsmith/rollback-concurrent).
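
To make the differences between the four strategies concrete, here is a toy simulation of a single thread. It is a sketch for building intuition only: `ToyThread` and its methods are hypothetical, the "state" is a plain list of strings, and nothing here reflects how Agent Server is actually implemented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyThread:
    """Toy model of a single thread. Illustrative only."""

    state: List[str] = field(default_factory=list)  # everything recorded on the thread
    queue: List[str] = field(default_factory=list)  # inputs waiting for their turn
    running: Optional[str] = None                   # input of the run in progress
    _run_start: int = 0                             # len(state) before the current run began

    def _start(self, user_input: str) -> None:
        self._run_start = len(self.state)
        self.state.append(user_input)
        self.running = user_input

    def make_progress(self) -> None:
        """Record one unit of work done by the current run."""
        self.state.append(f"{self.running}:progress")

    def finish(self) -> None:
        """Complete the current run, then start the next queued input, if any."""
        self.running = None
        if self.queue:
            self._start(self.queue.pop(0))

    def submit(self, user_input: str, strategy: str = "reject") -> str:
        if self.running is None:
            self._start(user_input)
            return "started"
        if strategy == "reject":
            return "rejected"  # the new run is refused; the first run continues
        if strategy == "enqueue":
            self.queue.append(user_input)  # runs after the first one completes
            return "queued"
        if strategy == "interrupt":
            self._start(user_input)  # keep the progress so far, continue with the new input
            return "interrupted"
        if strategy == "rollback":
            del self.state[self._run_start:]  # drop the first run's input and progress
            self._start(user_input)
            return "rolled back"
        raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for strategy in ("reject", "enqueue", "interrupt", "rollback"):
        thread = ToyThread()
        thread.submit("first")
        thread.make_progress()
        outcome = thread.submit("second", strategy)
        print(strategy, outcome, thread.state, thread.queue)
```

In this model, `interrupt` leaves the first run's input and partial progress in the state before the second input, while `rollback` leaves only the second input.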



# Enqueue concurrent
Source: https://docs.langchain.com/langsmith/enqueue-concurrent



This guide assumes knowledge of what double-texting is, which you can learn about in the [double-texting conceptual guide](/langsmith/double-texting).

This guide covers the `enqueue` option for double texting, which adds interrupting runs to a queue and executes them in the order they are received. Below is a quick example of using the `enqueue` option.

## Setup

First, we will define a quick helper function for printing out JS and CURL model outputs (you can skip this if using Python):

<Tabs>
  <Tab title="Javascript">
    ```js  theme={null}
    function prettyPrint(m) {
      const padded = " " + m['type'] + " ";
      const sepLen = Math.floor((80 - padded.length) / 2);
      const sep = "=".repeat(sepLen);
      const secondSep = sep + (padded.length % 2 ? "=" : "");

      console.log(`${sep}${padded}${secondSep}`);
      console.log("\n\n");
      console.log(m.content);
    }
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    # PLACE THIS IN A FILE CALLED pretty_print.sh
    pretty_print() {
      local type="$1"
      local content="$2"
      local padded=" $type "
      local total_width=80
      local sep_len=$(( (total_width - ${#padded}) / 2 ))
      local sep=$(printf '=%.0s' $(eval "echo {1.."${sep_len}"}"))
      local second_sep=$sep
      if (( (total_width - ${#padded}) % 2 )); then
        second_sep="${second_sep}="
      fi

      echo "${sep}${padded}${second_sep}"
      echo
      echo "$content"
    }
    ```
  </Tab>
</Tabs>

Then, let's import our required packages and instantiate our client, assistant, and thread.

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    import asyncio

    import httpx
    from langchain_core.messages import convert_to_messages
    from langgraph_sdk import get_client

    client = get_client(url=<DEPLOYMENT_URL>)
    # Using the graph deployed with the name "agent"
    assistant_id = "agent"
    thread = await client.threads.create()
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    import { Client } from "@langchain/langgraph-sdk";


    const client = new Client({ apiUrl: <DEPLOYMENT_URL> });
    // Using the graph deployed with the name "agent"
    const assistantId = "agent";
    const thread = await client.threads.create();
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
      --url <DEPLOYMENT_URL>/threads \
      --header 'Content-Type: application/json' \
      --data '{}'
    ```
  </Tab>
</Tabs>

## Create runs

Now let's start two runs, with the second interrupting the first one with a multitask strategy of "enqueue":

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    first_run = await client.runs.create(
        thread["thread_id"],
        assistant_id,
        input={"messages": [{"role": "user", "content": "what's the weather in sf?"}]},
    )
    second_run = await client.runs.create(
        thread["thread_id"],
        assistant_id,
        input={"messages": [{"role": "user", "content": "what's the weather in nyc?"}]},
        multitask_strategy="enqueue",
    )
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    const firstRun = await client.runs.create(
      thread["thread_id"],
      assistantId,
      {
        input: { messages: [{ role: "user", content: "what's the weather in sf?" }] },
      }
    );

    const secondRun = await client.runs.create(
      thread["thread_id"],
      assistantId,
      {
        input: { messages: [{ role: "user", content: "what's the weather in nyc?" }] },
        multitaskStrategy: "enqueue",
      }
    );
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    curl --request POST \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs \
    --header 'Content-Type: application/json' \
    --data "{
      \"assistant_id\": \"agent\",
      \"input\": {\"messages\": [{\"role\": \"human\", \"content\": \"what's the weather in sf?\"}]}
    }" && curl --request POST \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs \
    --header 'Content-Type: application/json' \
    --data "{
      \"assistant_id\": \"agent\",
      \"input\": {\"messages\": [{\"role\": \"human\", \"content\": \"what's the weather in nyc?\"}]},
      \"multitask_strategy\": \"enqueue\"
    }"
    ```
  </Tab>
</Tabs>
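To build intuition for what `enqueue` does, here is a minimal local sketch that models a thread's pending runs with an `asyncio.Queue`. It does not call the Agent Server: `run_worker` and `simulate_enqueue` are hypothetical names, and the real server's scheduling is more involved than a single in-process worker.

```python
import asyncio


async def run_worker(queue: asyncio.Queue, log: list) -> None:
    """Process queued inputs one at a time, in arrival order."""
    while True:
        user_input = await queue.get()
        log.append(f"start: {user_input}")
        await asyncio.sleep(0.01)  # stand-in for the graph doing work
        log.append(f"end: {user_input}")
        queue.task_done()


async def simulate_enqueue(inputs: list) -> list:
    """Queue every input right away, then wait until all have been processed."""
    queue: asyncio.Queue = asyncio.Queue()
    log: list = []
    worker = asyncio.create_task(run_worker(queue, log))
    for user_input in inputs:
        queue.put_nowait(user_input)  # later inputs wait behind earlier ones
    await queue.join()
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return log


if __name__ == "__main__":
    for line in asyncio.run(simulate_enqueue(["weather in sf?", "weather in nyc?"])):
        print(line)
```

The second input only starts after the first one ends, which mirrors the run order shown in the output of this guide.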

## View run results

Verify that the thread has data from both runs:

<Tabs>
  <Tab title="Python">
    ```python  theme={null}
    # wait until the second run completes
    await client.runs.join(thread["thread_id"], second_run["run_id"])

    state = await client.threads.get_state(thread["thread_id"])

    for m in convert_to_messages(state["values"]["messages"]):
        m.pretty_print()
    ```
  </Tab>

  <Tab title="Javascript">
    ```js  theme={null}
    await client.runs.join(thread["thread_id"], secondRun["run_id"]);

    const state = await client.threads.getState(thread["thread_id"]);

    for (const m of state["values"]["messages"]) {
      prettyPrint(m);
    }
    ```
  </Tab>

  <Tab title="CURL">
    ```bash  theme={null}
    source pretty_print.sh && curl --request GET \
    --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/runs/<RUN_ID>/join && \
    curl --request GET --url <DEPLOYMENT_URL>/threads/<THREAD_ID>/state | \
    jq -c '.values.messages[]' | while read -r element; do
        type=$(echo "$element" | jq -r '.type')
        content=$(echo "$element" | jq -r '.content | if type == "array" then tostring else . end')
        pretty_print "$type" "$content"
    done
    ```
  </Tab>
</Tabs>

Output:

```
================================ Human Message =================================

what's the weather in sf?
================================== Ai Message ==================================

[{'id': 'toolu_01Dez1sJre4oA2Y7NsKJV6VT', 'input': {'query': 'weather in san francisco'}, 'name': 'tavily_search_results_json', 'type': 'tool_use'}]
Tool Calls:
tavily_search_results_json (toolu_01Dez1sJre4oA2Y7NsKJV6VT)
Call ID: toolu_01Dez1sJre4oA2Y7NsKJV6VT
Args:
query: weather in san francisco
================================= Tool Message =================================
Name: tavily_search_results_json

[{"url": "https://www.accuweather.com/en/us/san-francisco/94103/weather-forecast/347629", "content": "Get the current and future weather conditions for San Francisco, CA, including temperature, precipitation, wind, air quality and more. See the hourly and 10-day outlook, radar maps, alerts and allergy information."}]
================================== Ai Message ==================================

According to AccuWeather, the current weather conditions in San Francisco are:

Temperature: 57°F (14°C)
Conditions: Mostly Sunny
Wind: WSW 10 mph
Humidity: 72%

The forecast for the next few days shows partly sunny skies with highs in the upper 50s to mid 60s F (14-18°C) and lows in the upper 40s to low 50s F (9-11°C). Typical mild, dry weather for San Francisco this time of year.

Some key details from the AccuWeather forecast:

Today: Mostly sunny, high of 62°F (17°C)
Tonight: Partly cloudy, low of 49°F (9°C)
Tomorrow: Partly sunny, high of 59°F (15°C)
Saturday: Mostly sunny, high of 64°F (18°C)
Sunday: Partly sunny, high of 61°F (16°C)

So in summary, expect seasonable spring weather in San Francisco over the next several days, with a mix of sun and clouds and temperatures ranging from the upper 40s at night to the low 60s during the days. Typical dry conditions with no rain in the forecast.
================================ Human Message =================================

what's the weather in nyc?
================================== Ai Message ==================================

[{'text': 'Here are the current weather conditions and forecast for New York City:', 'type': 'text'}, {'id': 'toolu_01FFft5Sx9oS6AdVJuRWWcGp', 'input': {'query': 'weather in new york city'}, 'name': 'tavily_search_results_json', 'type': 'tool_use'}]
Tool Calls:
tavily_search_results_json (toolu_01FFft5Sx9oS6AdVJuRWWcGp)
Call ID: toolu_01FFft5Sx9oS6AdVJuRWWcGp
Args:
query: weather in new york city
================================= Tool Message =================================
Name: tavily_search_results_json

[{"url": "https://www.weatherapi.com/", "content": "{'location': {'name': 'New York', 'region': 'New York', 'country': 'United States of America', 'lat': 40.71, 'lon': -74.01, 'tz_id': 'America/New_York', 'localtime_epoch': 1718734479, 'localtime': '2024-06-18 14:14'}, 'current': {'last_updated_epoch': 1718733600, 'last_updated': '2024-06-18 14:00', 'temp_c': 29.4, 'temp_f': 84.9, 'is_day': 1, 'condition': {'text': 'Sunny', 'icon': '//cdn.weatherapi.com/weather/64x64/day/113.png', 'code': 1000}, 'wind_mph': 2.2, 'wind_kph': 3.6, 'wind_degree': 158, 'wind_dir': 'SSE', 'pressure_mb': 1025.0, 'pressure_in': 30.26, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 63, 'cloud': 0, 'feelslike_c': 31.3, 'feelslike_f': 88.3, 'windchill_c': 28.3, 'windchill_f': 82.9, 'heatindex_c': 29.6, 'heatindex_f': 85.3, 'dewpoint_c': 18.4, 'dewpoint_f': 65.2, 'vis_km': 16.0, 'vis_miles': 9.0, 'uv': 7.0, 'gust_mph': 16.5, 'gust_kph': 26.5}}"}]
================================== Ai Message ==================================

According to the weather data from WeatherAPI:

Current Conditions in New York City (as of 2:00 PM local time):

* Temperature: 85°F (29°C)
* Conditions: Sunny
* Wind: 2 mph (4 km/h) from the SSE
* Humidity: 63%
* Heat Index: 85°F (30°C)

The forecast shows sunny and warm conditions persisting over the next few days:

Today: Sunny, high of 85°F (29°C)
Tonight: Clear, low of 68°F (20°C)
Tomorrow: Sunny, high of 88°F (31°C)
Thursday: Mostly sunny, high of 90°F (32°C)
Friday: Partly cloudy, high of 87°F (31°C)

So New York City is experiencing beautiful sunny weather with seasonably warm temperatures in the mid-to-upper 80s Fahrenheit (around 30°C). Humidity is moderate in the 60% range. Overall, ideal late spring/early summer conditions for being outdoors in the city over the next several days.
```



# Environment variables
Source: https://docs.langchain.com/langsmith/env-var



The Agent Server supports specific environment variables for configuring a deployment.

## `BG_JOB_ISOLATED_LOOPS`

Set `BG_JOB_ISOLATED_LOOPS` to `True` to execute background runs in an isolated event loop separate from the serving API event loop.

This environment variable should be set to `True` if the implementation of a graph/node contains synchronous code. In this situation, the synchronous code will block the serving API event loop, which may cause the API to be unavailable. A symptom of an unavailable API is continuous application restarts due to failing health checks.

Defaults to `False`.
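The blocking behavior described above can be reproduced locally with plain `asyncio`. This is only a sketch of the symptom, not of the server's implementation: `heartbeat` stands in for the serving API, `blocking_node` for synchronous graph code, and the "isolated" case uses `asyncio.to_thread` as a simple stand-in for running work off the main event loop. All names are hypothetical.

```python
import asyncio
import time


async def heartbeat(ticks: list, interval: float = 0.01) -> None:
    """Stand-in for the serving API: record a tick every `interval` seconds."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


def blocking_node(duration: float = 0.2) -> None:
    """Synchronous work, such as a graph node calling a blocking library."""
    time.sleep(duration)


async def count_ticks(isolated: bool) -> int:
    """Count heartbeat ticks observed while `blocking_node` runs."""
    ticks: list = []
    task = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick
    before = len(ticks)
    if isolated:
        await asyncio.to_thread(blocking_node)  # work runs off the event loop
    else:
        blocking_node()  # work blocks the event loop
    after = len(ticks)
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return after - before


if __name__ == "__main__":
    print("ticks while blocked:", asyncio.run(count_ticks(isolated=False)))
    print("ticks while isolated:", asyncio.run(count_ticks(isolated=True)))
```

When the synchronous call runs on the event loop, the heartbeat records no ticks until the call returns, which is the same starvation that causes health checks to fail.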

## `BG_JOB_SHUTDOWN_GRACE_PERIOD_SECS`

Specifies, in seconds, how long the server will wait for background jobs to finish after the queue receives a shutdown signal. After this period, the server will force termination. Defaults to `180` seconds. Set this to ensure jobs have enough time to complete cleanly during shutdown. Added in `langgraph-api==0.2.16`.
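As a rough illustration of a shutdown grace period, the sketch below waits a bounded time for in-flight jobs and then cancels whatever is still running. It is a generic `asyncio` pattern under assumed names (`job`, `shutdown_with_grace`), not the server's actual shutdown code.

```python
import asyncio


async def job(name: str, seconds: float, finished: list) -> None:
    """Pretend background job that records its name when it completes."""
    await asyncio.sleep(seconds)
    finished.append(name)


async def shutdown_with_grace(durations: dict, grace_period: float) -> tuple:
    """Wait up to `grace_period` seconds for jobs, then cancel what is left."""
    finished: list = []
    tasks = {
        asyncio.create_task(job(name, secs, finished)): name
        for name, secs in durations.items()
    }
    _done, pending = await asyncio.wait(tasks.keys(), timeout=grace_period)
    for task in pending:
        task.cancel()  # force termination once the grace period is over
    await asyncio.gather(*pending, return_exceptions=True)
    cancelled = sorted(tasks[task] for task in pending)
    return sorted(finished), cancelled


if __name__ == "__main__":
    print(asyncio.run(shutdown_with_grace({"fast": 0.01, "slow": 5.0}, grace_period=0.1)))
```

A job that needs longer than the grace period is cut off, so the value should be at least as long as the slowest job you want to finish cleanly.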

## `BG_JOB_TIMEOUT_SECS`

Sets the timeout, in seconds, of a background run. The timeout can be increased. However, the infrastructure for a Cloud deployment enforces a 1-hour timeout limit for API requests. This means the connection between client and server will time out after 1 hour. This limit is not configurable.

A background run can execute for longer than 1 hour, but a client must reconnect to the server (e.g. join stream via `POST /threads/{thread_id}/runs/{run_id}/stream`) to retrieve output from the run if the run is taking longer than 1 hour.

Defaults to `3600`.

## `DD_API_KEY`

Specify `DD_API_KEY` (your [Datadog API Key](https://docs.datadoghq.com/account_management/api-app-keys/)) to automatically enable Datadog tracing for the deployment. Specify other [`DD_*` environment variables](https://ddtrace.readthedocs.io/en/stable/configuration.html) to configure the tracing instrumentation.

If `DD_API_KEY` is specified, the application process is wrapped in the [`ddtrace-run` command](https://ddtrace.readthedocs.io/en/stable/installation_quickstart.html). Other `DD_*` environment variables (e.g. `DD_SITE`, `DD_ENV`, `DD_SERVICE`, `DD_TRACE_ENABLED`) are typically needed to properly configure the tracing instrumentation. See [`DD_*` environment variables](https://ddtrace.readthedocs.io/en/stable/configuration.html) for more details. You can enable `DD_TRACE_DEBUG=true` and set `DD_LOG_LEVEL=debug` to troubleshoot.

<Note>
  Enabling `DD_API_KEY` (and thus `ddtrace-run`) can override or interfere with other auto-instrumentation solutions (such as OpenTelemetry) that you may have instrumented into your application code.
</Note>

## `LANGCHAIN_TRACING_SAMPLING_RATE`

Sampling rate for traces sent to LangSmith. Valid values: Any float between `0` and `1`.

For more details, refer to [Set a sampling rate for traces](/langsmith/sample-traces).
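To make the effect of a sampling rate concrete, here is a small simulation. It assumes the common approach of keeping each trace independently with probability equal to the rate. This is an illustration only, not LangSmith's actual sampling code, and `should_trace` and `sampled_fraction` are hypothetical helpers.

```python
import random


def should_trace(sampling_rate: float, rng: random.Random) -> bool:
    """Keep a trace with probability `sampling_rate` (0 keeps none, 1 keeps all)."""
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling rate must be between 0 and 1")
    return rng.random() < sampling_rate


def sampled_fraction(sampling_rate: float, n: int = 10_000, seed: int = 0) -> float:
    """Simulate `n` runs and return the fraction that would be traced."""
    rng = random.Random(seed)
    kept = sum(should_trace(sampling_rate, rng) for _ in range(n))
    return kept / n


if __name__ == "__main__":
    print(sampled_fraction(0.25))  # close to 0.25 for large n
```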

## `LANGGRAPH_AUTH_TYPE`

Type of authentication for the Agent Server deployment. Valid values: `langsmith`, `noop`.

For deployments to LangSmith, this environment variable is set automatically. For local development or deployments where authentication is handled externally (e.g. self-hosted), set this environment variable to `noop`.

## `LANGGRAPH_POSTGRES_POOL_MAX_SIZE`

Beginning with langgraph-api version `0.2.12`, the maximum size of the Postgres connection pool (per replica) can be controlled using the `LANGGRAPH_POSTGRES_POOL_MAX_SIZE` environment variable. By setting this variable, you can determine the upper bound on the number of simultaneous connections the server will establish with the Postgres database.

For example, if a deployment is scaled up to 10 replicas and `LANGGRAPH_POSTGRES_POOL_MAX_SIZE` is configured to `150`, then up to `1500` connections to Postgres can be established. This is particularly useful for deployments where database resources are limited (or more available) or where you need to tune connection behavior for performance or scaling reasons.

Defaults to `150` connections.
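The arithmetic above is easy to check before scaling a deployment. The helpers below are a hypothetical sketch: `max_total_connections` reproduces the replicas-times-pool-size calculation, and `pool_size_for_budget` inverts it for a database with a fixed connection budget.

```python
def max_total_connections(replicas: int, pool_max_size: int) -> int:
    """Upper bound on connections opened across all replicas."""
    if replicas < 1 or pool_max_size < 1:
        raise ValueError("replicas and pool_max_size must be positive")
    return replicas * pool_max_size


def pool_size_for_budget(replicas: int, connection_budget: int) -> int:
    """Largest per-replica pool size that keeps the total within the budget."""
    if replicas < 1:
        raise ValueError("replicas must be positive")
    return connection_budget // replicas


if __name__ == "__main__":
    print(max_total_connections(replicas=10, pool_max_size=150))
    print(pool_size_for_budget(replicas=10, connection_budget=500))
```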

## `LANGSMITH_API_KEY`

For deployments with [self-hosted LangSmith](/langsmith/self-hosted) only.

To send traces to a self-hosted LangSmith instance, set `LANGSMITH_API_KEY` to an API key created from the self-hosted instance.

## `LANGSMITH_ENDPOINT`

For deployments with [self-hosted LangSmith](/langsmith/self-hosted) only.

To send traces to a self-hosted LangSmith instance, set `LANGSMITH_ENDPOINT` to the hostname of the self-hosted instance.

## `LANGSMITH_TRACING`

Set `LANGSMITH_TRACING` to `false` to disable tracing to LangSmith.

Defaults to `true`.

## `LOG_COLOR`

This is mainly relevant in the context of using the dev server via the `langgraph dev` command. Set `LOG_COLOR` to `true` to enable ANSI-colored console output when using the default console renderer. Disabling color output by setting this variable to `false` produces monochrome logs. Defaults to `true`.

## `LOG_LEVEL`

Configure [log level](https://docs.python.org/3/library/logging.html#logging-levels). Defaults to `INFO`.

## `LOG_JSON`

Set `LOG_JSON` to `true` to render all log messages as JSON objects using the configured `JSONRenderer`. This produces structured logs that can be easily parsed or ingested by log management systems. Defaults to `false`.

## `MOUNT_PREFIX`

<Info>
  **Only Allowed in Self-Hosted Deployments**
  The `MOUNT_PREFIX` environment variable is only allowed in Self-Hosted Deployment models. LangSmith SaaS does not allow this environment variable.
</Info>

Set `MOUNT_PREFIX` to serve the Agent Server under a specific path prefix. This is useful for deployments where the server is behind a reverse proxy or load balancer that requires a specific path prefix.

For example, if the server is to be served under `https://example.com/langgraph`, set `MOUNT_PREFIX` to `/langgraph`.
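From a client's point of view, a mount prefix simply becomes part of every request URL. The helper below is a hypothetical sketch (`mounted_url` is not part of any SDK) that shows how a base URL, a prefix, and an API path such as `/threads` combine.

```python
def mounted_url(base_url: str, mount_prefix: str, path: str) -> str:
    """Join a base URL, an optional mount prefix, and an API path."""
    parts = [base_url.rstrip("/")]
    prefix = mount_prefix.strip("/")
    if prefix:
        parts.append(prefix)
    parts.append(path.lstrip("/"))
    return "/".join(parts)


if __name__ == "__main__":
    print(mounted_url("https://example.com", "/langgraph", "/threads"))
```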

## `N_JOBS_PER_WORKER`

Number of jobs per worker for the Agent Server task queue. Defaults to `10`.

## `POSTGRES_URI_CUSTOM`

<Info>
  **Only for Hybrid and Self-Hosted**
  Custom Postgres instances are only available for [Hybrid](/langsmith/hybrid) and [Self-Hosted](/langsmith/self-hosted) deployments.
</Info>

Specify `POSTGRES_URI_CUSTOM` to use a custom Postgres instance. The value of `POSTGRES_URI_CUSTOM` must be a valid [Postgres connection URI](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING-URIS).

Postgres:

* Version 15.8 or higher.
* An initial database must be present and the connection URI must reference the database.

Control Plane Functionality:

* If `POSTGRES_URI_CUSTOM` is specified, the control plane will not provision a database for the server.
* If `POSTGRES_URI_CUSTOM` is removed, the control plane will not provision a database for the server and will not delete the externally managed Postgres instance.
* If `POSTGRES_URI_CUSTOM` is removed, deployment of the revision will not succeed. Once `POSTGRES_URI_CUSTOM` is specified, it must always be set for the lifecycle of the deployment.
* If the deployment is deleted, the control plane will not delete the externally managed Postgres instance.
* The value of `POSTGRES_URI_CUSTOM` can be updated. For example, a password in the URI can be updated.

Database Connectivity:

* The custom Postgres instance must be accessible by the Agent Server. The user is responsible for ensuring connectivity.
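Before setting `POSTGRES_URI_CUSTOM`, it can help to sanity-check the URI locally. The sketch below uses only the standard library and checks two of the requirements listed above: a Postgres URI scheme and a database name in the path. `check_postgres_uri` is a hypothetical helper. It does not verify the server version or network connectivity.

```python
from urllib.parse import urlsplit


def check_postgres_uri(uri: str) -> list:
    """Return a list of problems found in a Postgres connection URI."""
    problems = []
    parts = urlsplit(uri)
    if parts.scheme not in ("postgres", "postgresql"):
        problems.append("scheme must be postgres:// or postgresql://")
    if not parts.path.strip("/"):
        problems.append("URI must reference a database name")
    return problems


if __name__ == "__main__":
    print(check_postgres_uri("postgresql://user:secret@db.example.com:5432/langgraph"))
    print(check_postgres_uri("postgresql://user:secret@db.example.com:5432"))
```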

## `REDIS_CLUSTER`

<Warning>
  This feature is in Alpha.
</Warning>

<Info>
  **Only Allowed in Self-Hosted Deployments**
  Redis Cluster mode is only available in Self-Hosted Deployment models. LangSmith SaaS will provision a Redis instance for you by default.
</Info>

Set `REDIS_CLUSTER` to `True` to enable Redis Cluster mode. When enabled, the system will connect to Redis using cluster mode. This is useful when connecting to a Redis Cluster deployment.

Defaults to `False`.

## `REDIS_KEY_PREFIX`

<Info>
  **Available in API Server version 0.1.9+**
  This environment variable is supported in API Server version 0.1.9 and above.
</Info>

Specify a prefix for Redis keys. This allows multiple Agent Server instances to share the same Redis instance by using different key prefixes.

Defaults to `''`.
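The idea behind a key prefix can be sketched with a plain dictionary standing in for a shared Redis instance. The key names and the way the prefix is concatenated are assumptions made for illustration. The server's real key layout is not documented here.

```python
def prefixed(prefix: str, key: str) -> str:
    """Build a namespaced key by prepending the configured prefix."""
    return f"{prefix}{key}"


# Stand-in for one Redis instance shared by two Agent Server deployments.
shared_store: dict = {}
shared_store[prefixed("server_a:", "run:1")] = "pending"
shared_store[prefixed("server_b:", "run:1")] = "done"

if __name__ == "__main__":
    # Both deployments use the logical key "run:1" without overwriting each other.
    print(shared_store)
```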

## `REDIS_URI_CUSTOM`

<Info>
  **Only for Hybrid and Self-Hosted**
  Custom Redis instances are only available for [Hybrid](/langsmith/hybrid) and [Self-Hosted](/langsmith/self-hosted) deployments.
</Info>

Specify `REDIS_URI_CUSTOM` to use a custom Redis instance. The value of `REDIS_URI_CUSTOM` must be a valid [Redis connection URI](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url).

## `REDIS_MAX_CONNECTIONS`

The maximum size of the Redis connection pool (per replica) can be controlled using the `REDIS_MAX_CONNECTIONS` environment variable. By setting this variable, you can determine the upper bound on the number of simultaneous connections the server will establish with the Redis instance.

For example, if a deployment is scaled up to 10 replicas and `REDIS_MAX_CONNECTIONS` is configured to `150`, then up to `1500` connections to Redis can be established.

Defaults to `2000`.

## `RESUMABLE_STREAM_TTL_SECONDS`

Time-to-live in seconds for resumable stream data in Redis.

When a run is created and the output is streamed, the stream can be configured to be resumable (e.g. `stream_resumable=True`). If a stream is resumable, output from the stream is temporarily stored in Redis. The TTL for this data can be configured by setting `RESUMABLE_STREAM_TTL_SECONDS`.

See the [Python](https://reference.langchain.com/python/langsmith/deployment/sdk/#langgraph_sdk.client.RunsClient.stream) and [JS/TS](https://langchain-ai.github.io/langgraphjs/reference/classes/sdk_client.RunsClient.html#stream) SDKs for more details on how to implement resumable streams.

Defaults to `120` seconds.

<Note>
  Setting a very high value for `RESUMABLE_STREAM_TTL_SECONDS` can result in substantial Redis memory usage when there are many concurrent runs with large or frequent streaming output. Set this value to the minimum value to enable recovery during network interruptions and prefer checkpointing for long term durability and execution snapshotting.
</Note>
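The trade-off in the note can be pictured with a toy buffer that forgets output older than the TTL. This is an in-memory sketch with an explicit clock argument, under the assumption that each stored chunk expires independently. It is not how the server stores stream data in Redis, and `TTLStreamBuffer` is a hypothetical name.

```python
class TTLStreamBuffer:
    """Toy in-memory stand-in for stream output stored with a TTL."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._chunks: list = []  # (stored_at, chunk) pairs

    def append(self, chunk: str, now: float) -> None:
        """Store a chunk together with the time it was written."""
        self._chunks.append((now, chunk))

    def replay(self, now: float) -> list:
        """Return the chunks that have not yet expired at time `now`."""
        return [
            chunk
            for stored_at, chunk in self._chunks
            if now - stored_at < self.ttl_seconds
        ]


if __name__ == "__main__":
    buffer = TTLStreamBuffer(ttl_seconds=120)
    buffer.append("first chunk", now=0)
    buffer.append("second chunk", now=100)
    print(buffer.replay(now=110))  # a quick reconnect still sees both chunks
    print(buffer.replay(now=300))  # a late reconnect sees nothing
```

A longer TTL widens the window in which a client can reconnect, at the cost of keeping more output in memory per run.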



# Evaluate a chatbot
Source: https://docs.langchain.com/langsmith/evaluate-chatbot-tutorial



In this guide we will set up evaluations for a chatbot. These allow you to measure how well your application is performing over a set of data. Being able to get this insight quickly and reliably will allow you to iterate with confidence.

At a high level, in this tutorial we will:

* *Create an initial golden dataset to measure performance*
* *Define metrics to use to measure performance*
* *Run evaluations on a few different prompts or models*
* *Compare results manually*
* *Track results over time*
* *Set up automated testing to run in CI/CD*

For more information on the evaluation workflows LangSmith supports, check out the [how-to guides](/langsmith/evaluation), or see the reference docs for [evaluate](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate) and its asynchronous [aevaluate](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._arunner.aevaluate) counterpart.

Lots to cover, let's dive in!

## Setup

First install the required dependencies for this tutorial. We happen to use OpenAI, but LangSmith can be used with any model:

<CodeGroup>
  ```bash pip theme={null}
  pip install -U langsmith openai
  ```

  ```bash uv theme={null}
  uv add langsmith openai
  ```
</CodeGroup>

And set environment variables to enable LangSmith tracing:

```bash  theme={null}
export LANGSMITH_TRACING="true"
export LANGSMITH_API_KEY="<Your LangSmith API key>"
export OPENAI_API_KEY="<Your OpenAI API key>"
```
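If you want to confirm the variables are visible to Python before running the rest of the tutorial, a quick check like the following can help. `missing_env_vars` is a hypothetical helper, not part of the LangSmith SDK.

```python
import os

REQUIRED_VARS = ("LANGSMITH_TRACING", "LANGSMITH_API_KEY", "OPENAI_API_KEY")


def missing_env_vars(env=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```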

## Create a dataset

The first step when getting ready to test and evaluate your application is to define the datapoints you want to evaluate. There are a few aspects to consider here:

* What should the schema of each datapoint be?
* How many datapoints should I gather?
* How should I gather those datapoints?

**Schema:** Each datapoint should consist of, at the very least, the inputs to the application. If you are able, it is also very helpful to define the expected outputs - these represent what you would expect a properly functioning application to output. Oftentimes you cannot define the perfect output - that's okay! Evaluation is an iterative process. Sometimes you may also want to define more information for each example - like the expected documents to fetch in RAG, or the expected steps to take as an agent. LangSmith datasets are very flexible and allow you to define arbitrary schemas.

**How many:** There's no hard and fast rule for how many you should gather. The main thing is to make sure you have proper coverage of edge cases you may want to guard against. Even 10-50 examples can provide a lot of value! Don't worry about getting a large number to start - you can (and should) always add over time!

**How to get:** This is maybe the trickiest part. Once you know you want to gather a dataset... how do you actually go about it? For most teams that are starting a new project, we generally see them start by collecting the first 10-20 datapoints by hand. After starting with these datapoints, these datasets are generally *living* constructs and grow over time. They generally grow after seeing how real users will use your application, seeing the pain points that exist, and then moving a few of those datapoints into this set. There are also methods like synthetically generating data that can be used to augment your dataset. To start, we recommend not worrying about those and just hand labeling \~10-20 examples.

Once you've got your dataset, there are a few different ways to upload them to LangSmith. For this tutorial, we will use the client, but you can also upload via the UI (or even create them in the UI).

For this tutorial, we will create 5 datapoints to evaluate on. We will be evaluating a question-answering application. The input will be a question, and the output will be an answer. Since this is a question-answering application, we can define the expected answer. Let's show how to create and upload this dataset to LangSmith!

```python  theme={null}
from langsmith import Client

client = Client()

# Define dataset: these are your test cases
dataset_name = "QA Example Dataset"
dataset = client.create_dataset(dataset_name)

client.create_examples(
    dataset_id=dataset.id,
    examples=[
        {
            "inputs": {"question": "What is LangChain?"},
            "outputs": {"answer": "A framework for building LLM applications"},
        },
        {
            "inputs": {"question": "What is LangSmith?"},
            "outputs": {"answer": "A platform for observing and evaluating LLM applications"},
        },
        {
            "inputs": {"question": "What is OpenAI?"},
            "outputs": {"answer": "A company that creates Large Language Models"},
        },
        {
            "inputs": {"question": "What is Google?"},
            "outputs": {"answer": "A technology company known for search"},
        },
        {
            "inputs": {"question": "What is Mistral?"},
            "outputs": {"answer": "A company that creates Large Language Models"},
        }
    ]
)
```

Now, if we go to the LangSmith UI and look for `QA Example Dataset` in the `Datasets & Testing` page, when we click into it we should see that we have five new examples.

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9ab5110714d009d5865ba0e2d8ee0ffa" alt="" data-og-width="1251" width="1251" data-og-height="560" height="560" data-path="langsmith/images/testing-tutorial-dataset.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=e4b38ded6968e649ed8ab507f63f1f3e 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=f7aee5327f8058dd99684cd43e44c791 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9e853ed05b0a2ad40f9e4d0403e7004c 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=331654a31885b89a93924eaac4fa95da 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=833bf2a60b392323bba47fbe42655537 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-dataset.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=3410e4bc7ac5c28f8838fc5fb88026bd 2500w" />

## Define metrics

After creating our dataset, we can now define some metrics to evaluate our responses on. Since we have an expected answer, we can compare to that as part of our evaluation. However, we do not expect our application to output those **exact** answers, but rather something that is similar. This makes our evaluation a little trickier.

In addition to evaluating correctness, let's also make sure our answers are short and concise. This will be a little easier - we can define a simple Python function to measure the length of the response.

Let's go ahead and define these two metrics.

For the first, we will use an LLM to **judge** whether the output is correct (with respect to the expected output). This **LLM-as-a-judge** is relatively common for cases that are too complex to measure with a simple function. We can define our own prompt and LLM to use for evaluation here:

```python  theme={null}
import openai
from langsmith import wrappers

openai_client = wrappers.wrap_openai(openai.OpenAI())

eval_instructions = "You are an expert professor specialized in grading students' answers to questions."

def correctness(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
    user_content = f"""You are grading the following question:
{inputs['question']}
Here is the real answer:
{reference_outputs['answer']}
You are grading the following predicted answer:
{outputs['response']}
Respond with CORRECT or INCORRECT:
Grade:"""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[
            {"role": "system", "content": eval_instructions},
            {"role": "user", "content": user_content},
        ],
    ).choices[0].message.content
    return response == "CORRECT"
```
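One thing to watch with a judge like this is that the model may reply with extra whitespace, different casing, or a trailing period, in which case an exact string comparison returns `False` for a correct answer. A small normalizing helper is one way to make the check more forgiving. `parse_grade` below is a hypothetical sketch, not part of the LangSmith SDK.

```python
def parse_grade(raw: str) -> bool:
    """Map a judge reply to True only when it reads as CORRECT."""
    grade = raw.strip().strip(".").upper()
    # Compare for equality rather than using `in`: "INCORRECT" contains "CORRECT".
    return grade == "CORRECT"


if __name__ == "__main__":
    for reply in ["CORRECT", " correct.\n", "INCORRECT"]:
        print(repr(reply), "->", parse_grade(reply))
```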

Evaluating the length of the response is a lot easier! We can just define a simple function that checks whether the actual output is less than 2x the length of the expected result.

```python  theme={null}
def concision(outputs: dict, reference_outputs: dict) -> int:
    return int(len(outputs["response"]) < 2 * len(reference_outputs["answer"]))
```
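To see how the length check behaves, we can call it by hand on made-up responses. The snippet repeats the `concision` function so it runs on its own. The sample responses are invented for illustration.

```python
def concision(outputs: dict, reference_outputs: dict) -> int:
    return int(len(outputs["response"]) < 2 * len(reference_outputs["answer"]))


# Reference answer from the dataset: 41 characters, so the limit is 82.
reference = {"answer": "A framework for building LLM applications"}

short_response = {"response": "LangChain is a framework for building LLM apps."}
long_response = {"response": "LangChain is " + "a very " * 20 + "big framework."}

if __name__ == "__main__":
    print(concision(short_response, reference))  # within the limit
    print(concision(long_response, reference))   # over the limit
```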

## Run Evaluations

Great! So now how do we run evaluations? Now that we have a dataset and evaluators, all that we need is our application! We will build a simple application that just has a system message with instructions on how to respond and then passes it to the LLM. We will build this using the OpenAI SDK directly:

```python  theme={null}
default_instructions = "Respond to the users question in a short, concise manner (one short sentence)."

def my_app(question: str, model: str = "gpt-4o-mini", instructions: str = default_instructions) -> str:
    return openai_client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": question},
        ],
    ).choices[0].message.content
```

Before running this through LangSmith evaluations, we need to define a simple wrapper that maps the input keys from our dataset to the function we want to call, and then also maps the output of the function to the output key we expect.

```python  theme={null}
def ls_target(inputs: dict) -> dict:
    return {"response": my_app(inputs["question"])}
```
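To check the key mapping without calling a model, we can swap in a stub. `fake_app` and `ls_target_stub` are hypothetical stand-ins for `my_app` and `ls_target`. The point is only that the wrapper reads `inputs["question"]` and returns a dict with the `response` key that the evaluators look up.

```python
def fake_app(question: str) -> str:
    """Hypothetical stand-in for `my_app` so the sketch runs without an API key."""
    return f"Stub answer to: {question}"


def ls_target_stub(inputs: dict) -> dict:
    """Map dataset inputs to the app call, and the app output to `response`."""
    return {"response": fake_app(inputs["question"])}


example_inputs = {"question": "What is LangChain?"}
result = ls_target_stub(example_inputs)

if __name__ == "__main__":
    print(result)
```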

Great! Now we're ready to run an evaluation. Let's do it!

```python  theme={null}
experiment_results = client.evaluate(
    ls_target, # Your AI system
    data=dataset_name, # The data to predict and grade over
    evaluators=[concision, correctness], # The evaluators to score the results
    experiment_prefix="openai-4o-mini", # A prefix for your experiment names to easily identify them
)
```

This will output a URL. If we click on it, we should see results of our evaluation!

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9517dd9f9fc23062fcba7b061fe5cdda" alt="" data-og-width="3022" width="3022" data-og-height="1128" height="1128" data-path="langsmith/images/testing-tutorial-run.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=06b6a2a70ea3f85929dca6f03653be68 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=3ddae16867d89b93f0155dc654dede93 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=5ba93e0b10423be56081602f65ec41bc 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=2088e69be2ee5efdf949296ea3c74652 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=b6fa551b7edab188da50ccc63dbd9769 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-run.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=0b79a1b6f5a351ed986810f53d14d9d9 2500w" />

If we go back to the dataset page and select the `Experiments` tab, we can now see a summary of our one run!

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=4c30f7474727d2f537c75e5f80ae1298" alt="" data-og-width="3022" width="3022" data-og-height="1532" height="1532" data-path="langsmith/images/testing-tutorial-one-run.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=4886dc0021394767078872237779a8f3 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=cd111a19a463ed45fabe53caa2fc08be 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=25363336afaecae0198f7c55f5bfe739 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=42473dfcfad36f26ad7cfea316dbb458 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=4317ea45ce5de4f3917a902d84b3e2e3 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-one-run.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=0b0371997a45b936b37c8f44539fd1aa 2500w" />

Now let's try a different model, `gpt-4-turbo`:

```python  theme={null}
def ls_target_v2(inputs: dict) -> dict:
    return {"response": my_app(inputs["question"], model="gpt-4-turbo")}

experiment_results = client.evaluate(
    ls_target_v2,
    data=dataset_name,
    evaluators=[concision, correctness],
    experiment_prefix="openai-4-turbo",
)
```

Now let's keep `gpt-4-turbo` but also update the prompt to be stricter about requiring a short answer.

```python  theme={null}
instructions_v3 = "Respond to the users question in a short, concise manner (one short sentence). Do NOT use more than ten words."

def ls_target_v3(inputs: dict) -> dict:
    response = my_app(
        inputs["question"],
        model="gpt-4-turbo",
        instructions=instructions_v3
    )
    return {"response": response}

experiment_results = client.evaluate(
    ls_target_v3,
    data=dataset_name,
    evaluators=[concision, correctness],
    experiment_prefix="strict-openai-4-turbo",
)
```

If we go back to the `Experiments` tab on the datasets page, we should see that all three runs now show up!

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9d74c18991c33d4fbd5180dbb12a4f91" alt="" data-og-width="3020" width="3020" data-og-height="1540" height="1540" data-path="langsmith/images/testing-tutorial-three-runs.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=11a8a19c576760e949137952786ad325 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=dedd52ea55db2e7c886bdbddab19bcf4 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=d10e4b9efd6dbea2c50859fde066afb7 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=bc78ab5c6cad42ee2aa0aaf5350329ac 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=84edcc8ea892ac534009c9a05212bf26 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-three-runs.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=aefc9019026a0c38b5a92c2aa5ac3462 2500w" />

## Comparing results

Awesome, we've evaluated three different runs. But how can we compare the results? The first way is to look at the runs in the `Experiments` tab, which gives a high-level view of the metrics for each run:

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=224acfbea78b8b1d0e08ce59d06b5088" alt="" data-og-width="3020" width="3020" data-og-height="1540" height="1540" data-path="langsmith/images/testing-tutorial-compare-metrics.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=abf04bbd16675f00a4a3941c27e21ac7 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=40893fc5274fb96ea5a1bd26919604a7 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=3ebe594101ccb3c8efcf3d9cf0e8d906 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=6a934bbb56ebf268636db8f3c775d73d 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=178412fa63f8011046cd44d5e4411f68 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-metrics.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=8f893b4d545e6723dc2efe8d76c4da9f 2500w" />

Great! So we can tell that GPT-4 is better than GPT-3.5 at knowing who companies are, and we can see that the strict prompt helped a lot with the length. But what if we want to explore in more detail?

To do that, we can select all the runs we want to compare (in this case, all three) and open them in a comparison view. We immediately see all three tests side by side. Some of the cells are color coded, which shows a regression of *a certain metric* compared to *a certain baseline*. We automatically choose defaults for the baseline and metric, but you can change those yourself. You can choose which columns and metrics you see with the `Display` control, and you can filter to only the runs that have improvements or regressions by clicking the icons at the top.

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=de5575b837cdf97d479e5c91aff9dc78" alt="" data-og-width="3022" width="3022" data-og-height="1548" height="1548" data-path="langsmith/images/testing-tutorial-compare-runs.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=09d73536620b794bab530f1c154ca1cd 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=3cc976e3066b5a1777bd76ef367e5cd9 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=8fa9010ec69b095f97094ea3f1322b7c 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=03f984b970ffeceafa30bc6b242dd468 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=9d7ba0782fdf722b27ff1382a8156ea9 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-compare-runs.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=64ff5b28a56fcffedb5d0daaa2c87e33 2500w" />
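The color coding can be thought of as a per-example comparison of one metric against a baseline. Here is a minimal sketch in plain Python, assuming hypothetical score dictionaries keyed by example ID. It is not how LangSmith implements the view, and `diff_against_baseline` is an illustrative name:

```python
def diff_against_baseline(
    baseline: dict[str, float], candidate: dict[str, float]
) -> dict[str, str]:
    """Label each shared example as improved, regressed, or unchanged.

    Both arguments are hypothetical {example_id: score} mappings for one
    metric, where a higher score is assumed to be better.
    """
    labels = {}
    for example_id, base_score in baseline.items():
        if example_id not in candidate:
            continue  # Only compare examples present in both experiments.
        new_score = candidate[example_id]
        if new_score > base_score:
            labels[example_id] = "improved"
        elif new_score < base_score:
            labels[example_id] = "regressed"
        else:
            labels[example_id] = "unchanged"
    return labels


# Made-up scores for illustration only.
baseline_scores = {"ex-1": 1.0, "ex-2": 0.0, "ex-3": 1.0}
candidate_scores = {"ex-1": 1.0, "ex-2": 1.0, "ex-3": 0.0}
print(diff_against_baseline(baseline_scores, candidate_scores))
```

Changing which experiment is passed as `baseline` corresponds to changing the baseline in the comparison view.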

If we want to see more information, we can also select the `Expand` button that appears when hovering over a row to open up a side panel with more detailed information:

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=a72c4924a0ad9bebceae2da9518c56cc" alt="" data-og-width="2824" width="2824" data-og-height="1546" height="1546" data-path="langsmith/images/testing-tutorial-side-panel.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=f10bab81b285ece30e110570962caeaf 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=ced32c31ada87af7f81ca78ee4d9b1a5 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=282ab40f9b90b46589a92fb9cd1da680 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=b584cee17dd93e0e121b7a30ccdc1d3f 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=b2b51c2aaf8e582eedae0586b0a08031 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-side-panel.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=fc077628c53f5684d16bb309df8fc84a 2500w" />

## Set up automated testing to run in CI/CD

Now that we've run this in a one-off manner, we can set it to run in an automated fashion. The simplest way is to include it as a pytest file that runs in CI/CD. As part of this, we can either just log the results or set up criteria that determine whether the run passes. For example, to ensure that at least 80% of generated responses always pass the `concision` check, we could set that up with a test like:

```python  theme={null}
def test_concision_score() -> None:
    """Test that the concision score is at least 80%."""
    experiment_results = client.evaluate(
        ls_target, # Your AI system
        data=dataset_name, # The data to predict and grade over
        evaluators=[concision, correctness], # The evaluators to score the results
    )
    # This will be cleaned up in the next release:
    feedback = client.list_feedback(
        run_ids=[r.id for r in client.list_runs(project_name=experiment_results.experiment_name)],
        feedback_key="concision"
    )
    scores = [f.score for f in feedback]
    assert sum(scores) / len(scores) >= 0.8, "Aggregate score should be at least .8"
```

## Track results over time

Now that we've got these experiments running in an automated fashion, we want to track the results over time. We can do this from the overall `Experiments` tab in the datasets page. By default, we show evaluation metrics over time (highlighted in red). We also automatically track git metrics, so that each experiment can easily be associated with the branch of your code (highlighted in yellow).

<img src="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=a5961747b6ea92bb2f838d025ca5e3d5" alt="" data-og-width="3020" width="3020" data-og-height="1544" height="1544" data-path="langsmith/images/testing-tutorial-over-time.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?w=280&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=e0dc96840d39b210e1ae957e8dfb5fb2 280w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?w=560&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=8f56098392ba5b2c0670ceb16e29ac0c 560w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?w=840&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=df47e336e15b6d23f36b20915db7192d 840w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?w=1100&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=e6e38b84ee80b1df85ee926a16e3b654 1100w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?w=1650&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=8ef1345d6b299b41399a1a60fbef2219 1650w, https://mintcdn.com/langchain-5e9cc07a/ImHGLQW1HnQYwnJV/langsmith/images/testing-tutorial-over-time.png?w=2500&fit=max&auto=format&n=ImHGLQW1HnQYwnJV&q=85&s=679f95e97b7ac941e3380f658b158ef5 2500w" />

## Conclusion

That's it for this tutorial!

We've gone over how to create an initial test set, define some evaluation metrics, run experiments, compare them manually, set up CI/CD, and track results over time. Hopefully this can help you iterate with confidence.

This is just the start. As mentioned earlier, evaluation is an ongoing process. For example, the datapoints you want to evaluate on will likely continue to change over time, and there are many types of evaluators you may wish to explore. For more information, check out the [how-to guides](/langsmith/evaluation).

Additionally, there are other ways to evaluate data besides in this "offline" manner (e.g. you can evaluate production data). For more information on online evaluation, check out [this guide](/langsmith/online-evaluations).

## Reference code

<Accordion title="Click to see a consolidated code snippet">
  ```python  theme={null}
  import openai
  from langsmith import Client, wrappers

  # Application code
  openai_client = wrappers.wrap_openai(openai.OpenAI())

  default_instructions = "Respond to the users question in a short, concise manner (one short sentence)."

  def my_app(question: str, model: str = "gpt-4o-mini", instructions: str = default_instructions) -> str:
      return openai_client.chat.completions.create(
          model=model,
          temperature=0,
          messages=[
              {"role": "system", "content": instructions},
              {"role": "user", "content": question},
          ],
      ).choices[0].message.content

  client = Client()

  # Define dataset: these are your test cases
  dataset_name = "QA Example Dataset"
  dataset = client.create_dataset(dataset_name)

  client.create_examples(
      dataset_id=dataset.id,
      examples=[
          {
              "inputs": {"question": "What is LangChain?"},
              "outputs": {"answer": "A framework for building LLM applications"},
          },
          {
              "inputs": {"question": "What is LangSmith?"},
              "outputs": {"answer": "A platform for observing and evaluating LLM applications"},
          },
          {
              "inputs": {"question": "What is OpenAI?"},
              "outputs": {"answer": "A company that creates Large Language Models"},
          },
          {
              "inputs": {"question": "What is Google?"},
              "outputs": {"answer": "A technology company known for search"},
          },
          {
              "inputs": {"question": "What is Mistral?"},
              "outputs": {"answer": "A company that creates Large Language Models"},
          }
      ]
  )

  # Define evaluators
  eval_instructions = "You are an expert professor specialized in grading students' answers to questions."

  def correctness(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
      user_content = f"""You are grading the following question:
  {inputs['question']}
  Here is the real answer:
  {reference_outputs['answer']}
  You are grading the following predicted answer:
  {outputs['response']}
  Respond with CORRECT or INCORRECT:
  Grade:"""
      response = openai_client.chat.completions.create(
          model="gpt-4o-mini",
          temperature=0,
          messages=[
              {"role": "system", "content": eval_instructions},
              {"role": "user", "content": user_content},
          ],
      ).choices[0].message.content
      return response == "CORRECT"

  def concision(outputs: dict, reference_outputs: dict) -> bool:
      # Pass when the response is shorter than twice the reference answer.
      return len(outputs["response"]) < 2 * len(reference_outputs["answer"])

  # Run evaluations
  def ls_target(inputs: dict) -> dict:
      return {"response": my_app(inputs["question"])}

  experiment_results_v1 = client.evaluate(
      ls_target, # Your AI system
      data=dataset_name, # The data to predict and grade over
      evaluators=[concision, correctness], # The evaluators to score the results
      experiment_prefix="openai-4o-mini", # A prefix for your experiment names to easily identify them
  )

  def ls_target_v2(inputs: dict) -> dict:
      return {"response": my_app(inputs["question"], model="gpt-4-turbo")}

  experiment_results_v2 = client.evaluate(
      ls_target_v2,
      data=dataset_name,
      evaluators=[concision, correctness],
      experiment_prefix="openai-4-turbo",
  )

  instructions_v3 = "Respond to the users question in a short, concise manner (one short sentence). Do NOT use more than ten words."

  def ls_target_v3(inputs: dict) -> dict:
      response = my_app(
          inputs["question"],
          model="gpt-4-turbo",
          instructions=instructions_v3
      )
      return {"response": response}

  experiment_results_v3 = client.evaluate(
      ls_target_v3,
      data=dataset_name,
      evaluators=[concision, correctness],
      experiment_prefix="strict-openai-4-turbo",
  )
  ```
</Accordion>



# Evaluate a complex agent
Source: https://docs.langchain.com/langsmith/evaluate-complex-agent



<Info>
  [Agent evaluation](/langsmith/evaluation-concepts#agents) | [Evaluators](/langsmith/evaluation-concepts#evaluators) | [LLM-as-judge evaluators](/langsmith/evaluation-concepts#llm-as-judge)
</Info>

In this tutorial, we'll build a customer support bot that helps users navigate a digital music store. Then, we'll go through the three most effective types of evaluations to run on chatbots:

* [Final response](/langsmith/evaluation-concepts#evaluating-an-agents-final-response): Evaluate the agent's final response.
* [Trajectory](/langsmith/evaluation-concepts#evaluating-an-agents-trajectory): Evaluate whether the agent took the expected path (e.g., of tool calls) to arrive at the final answer.
* [Single step](/langsmith/evaluation-concepts#evaluating-a-single-step-of-an-agent): Evaluate any agent step in isolation (e.g., whether it selects the appropriate first tool for a given step).

We'll build our agent using [LangGraph](https://github.com/langchain-ai/langgraph), but the techniques and LangSmith functionality shown here are framework-agnostic.

## Setup

### Configure the environment

Let's install the required dependencies:

<CodeGroup>
  ```bash pip theme={null}
  pip install -U langgraph "langchain[openai]"
  ```

  ```bash uv theme={null}
  uv add langgraph "langchain[openai]"
  ```
</CodeGroup>

Let's set up environment variables for OpenAI and [LangSmith](https://smith.langchain.com):

```python  theme={null}
import getpass
import os

def _set_env(var: str) -> None:
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"Set {var}: ")

os.environ["LANGSMITH_TRACING"] = "true"
_set_env("LANGSMITH_API_KEY")
_set_env("OPENAI_API_KEY")
```

### Download the database

We will create a SQLite database for this tutorial. SQLite is a lightweight database that is easy to set up and use. We will load the `chinook` database, which is a sample database that represents a digital media store. Find more information about the database [here](https://www.sqlitetutorial.net/sqlite-sample-database/).

For convenience, we have hosted the database in a public GCS bucket:

```python  theme={null}
import requests

url = "https://storage.googleapis.com/benchmarks-artifacts/chinook/Chinook.db"
response = requests.get(url)

if response.status_code == 200:
    # Open a local file in binary write mode
    with open("chinook.db", "wb") as file:
        # Write the content of the response (the file) to the local file
        file.write(response.content)
    print("File downloaded and saved as chinook.db")
else:
    print(f"Failed to download the file. Status code: {response.status_code}")
```

Here's a sample of the data in the db:

```python  theme={null}
import sqlite3
# ... database connection and query code
```

```
[(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith'), (4, 'Alanis Morissette'), (5, 'Alice In Chains'), (6, 'Antônio Carlos Jobim'), (7, 'Apocalyptica'), (8, 'Audioslave'), (9, 'BackBeat'), (10, 'Billy Cobham')]
```
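The query code is elided above. A minimal sketch of a query that would return rows of this shape, assuming the standard Chinook `Artist` table with `ArtistId` and `Name` columns (`sample_artists` is an illustrative helper name):

```python
import sqlite3


def sample_artists(conn: sqlite3.Connection, limit: int = 10) -> list[tuple]:
    """Return the first `limit` (ArtistId, Name) rows, ordered by ID."""
    cursor = conn.execute(
        "SELECT ArtistId, Name FROM Artist ORDER BY ArtistId LIMIT ?", (limit,)
    )
    return cursor.fetchall()


# Usage against the downloaded file (uncomment once chinook.db exists):
# conn = sqlite3.connect("chinook.db")
# print(sample_artists(conn))
# conn.close()
```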

And here's the database schema (image from [https://github.com/lerocha/chinook-database](https://github.com/lerocha/chinook-database)):

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=5da2a8dcca68f02dfcec11f9c472d341" alt="Chinook DB" data-og-width="1672" width="1672" data-og-height="1132" height="1132" data-path="langsmith/images/chinook-diagram.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=ea7b3a27e9780b556aa90f6914dcef30 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=d9cf3ddad46562213014ffb1a77b1e45 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=2c9e1e70e9be2cf07111b2211e1ef9b7 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=970f7f48c80222219b493211331ee22f 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=3188acb57c4abc0156f8687fa9e229d8 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/chinook-diagram.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=86b9655b9fd1bc38fcf9e054b11833df 2500w" />

### Define the customer support agent

We'll create a [LangGraph](https://langchain-ai.github.io/langgraph/) agent with limited access to our database. For demo purposes, our agent will support two basic types of requests:

* Lookup: The customer can look up song titles, artist names, and albums based on other identifying information. For example: "What songs do you have by Jimi Hendrix?"
* Refund: The customer can request a refund on their past purchases. For example: "My name is Claude Shannon and I'd like a refund on a purchase I made last week, could you help me?"

For simplicity in this demo, we'll implement refunds by deleting the corresponding database records. We'll skip implementing user authentication and other production security measures.

The agent's logic will be structured as two separate subgraphs (one for lookups and one for refunds), with a parent graph that routes requests to the appropriate subgraph.

#### Refund agent

Let's build the refund processing agent. This agent needs to:

1. Find the customer's purchase records in the database
2. Delete the relevant Invoice and InvoiceLine records to process the refund

We'll create two SQL helper functions:

1. A function to execute the refund by deleting records
2. A function to look up a customer's purchase history

To make testing easier, we'll add a "mock" mode to these functions. When mock mode is enabled, the functions will simulate database operations without actually modifying any data.

```python  theme={null}
import sqlite3

def _refund(invoice_id: int | None, invoice_line_ids: list[int] | None, mock: bool = False) -> float:
    ...

def _lookup( ...
```
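The helper bodies are elided above. One way the mock mode could work is to compute the refund total and run the `DELETE` statements only when `mock` is false. A minimal sketch for the whole-invoice case, assuming the standard Chinook `Invoice` (`InvoiceId`, `Total`) and `InvoiceLine` (`InvoiceId`) columns. `refund_invoice_sketch` is a hypothetical name and not the tutorial's actual `_refund`:

```python
import sqlite3


def refund_invoice_sketch(
    conn: sqlite3.Connection, invoice_id: int, mock: bool = False
) -> float:
    """Return the invoice total and, unless mocked, delete its records."""
    row = conn.execute(
        "SELECT Total FROM Invoice WHERE InvoiceId = ?", (invoice_id,)
    ).fetchone()
    if row is None:
        return 0.0  # Unknown invoice: nothing to refund.
    total = float(row[0])
    if not mock:
        # Delete the child rows first, then the invoice itself.
        conn.execute("DELETE FROM InvoiceLine WHERE InvoiceId = ?", (invoice_id,))
        conn.execute("DELETE FROM Invoice WHERE InvoiceId = ?", (invoice_id,))
        conn.commit()
    return total
```

With `mock=True` the function reports the same refund amount but leaves the database untouched, which is what lets an evaluation run repeatedly over the same examples.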

Now let's define our graph. We'll use a simple two-step architecture:

1. Extract customer and purchase information from the conversation

2. Route the request to one of three paths:

   * Refund path: If we have sufficient purchase details (Invoice ID or Invoice Line IDs) to process a refund
   * Lookup path: If we have enough customer information (name and phone) to search their purchase history
   * Response path: If we need more information, respond to the user requesting the specific details needed

The graph's state will track:

* The conversation history (messages between user and agent)
* All customer and purchase information extracted from the conversation
* The next message to send to the user (followup text)

````python  theme={null}
from typing import Literal
import json

from langchain.chat_models import init_chat_model
from langchain_core.runnables import RunnableConfig
from langgraph.graph import END, StateGraph
from langgraph.graph.message import AnyMessage, add_messages
from langgraph.types import Command, interrupt
from tabulate import tabulate
from typing_extensions import Annotated, TypedDict

# Graph state.
class State(TypedDict):
    """Agent state."""
    messages: Annotated[list[AnyMessage], add_messages]
    followup: str | None

    invoice_id: int | None
    invoice_line_ids: list[int] | None
    customer_first_name: str | None
    customer_last_name: str | None
    customer_phone: str | None
    track_name: str | None
    album_title: str | None
    artist_name: str | None
    purchase_date_iso_8601: str | None

# Instructions for extracting the user/purchase info from the conversation.
gather_info_instructions = """You are managing an online music store that sells song tracks. \
Customers can buy multiple tracks at a time and these purchases are recorded in a database as \
an Invoice per purchase and an associated set of Invoice Lines for each purchased track.

Your task is to help customers who would like a refund for one or more of the tracks they've \
purchased. In order for you to be able to refund them, the customer must specify the Invoice ID \
to get a refund on all the tracks they bought in a single transaction, or one or more Invoice \
Line IDs if they would like refunds on individual tracks.

Often a user will not know the specific Invoice ID(s) or Invoice Line ID(s) for which they \
would like a refund. In this case you can help them look up their invoices by asking them to \
specify:
- Required: Their first name, last name, and phone number.
- Optionally: The track name, artist name, album name, or purchase date.

If the customer has not specified the required information (either Invoice/Invoice Line IDs \
or first name, last name, phone) then please ask them to specify it."""

# Extraction schema, mirrors the graph state.
class PurchaseInformation(TypedDict):
    """All of the known information about the invoice / invoice lines the customer would like refunded. Do not make up values, leave fields as null if you don't know their value."""

    invoice_id: int | None
    invoice_line_ids: list[int] | None
    customer_first_name: str | None
    customer_last_name: str | None
    customer_phone: str | None
    track_name: str | None
    album_title: str | None
    artist_name: str | None
    purchase_date_iso_8601: str | None
    followup: Annotated[
        str | None,
        ...,
        "If the user hasn't enough identifying information, please tell them what the required information is and ask them to specify it.",
    ]

# Model for performing extraction.
info_llm = init_chat_model("gpt-4o-mini").with_structured_output(
    PurchaseInformation, method="json_schema", include_raw=True
)

# Graph node for extracting user info and routing to lookup/refund/END.
async def gather_info(state: State) -> Command[Literal["lookup", "refund", END]]:
    info = await info_llm.ainvoke(
        [
            {"role": "system", "content": gather_info_instructions},
            *state["messages"],
        ]
    )
    parsed = info["parsed"]
    if any(parsed[k] for k in ("invoice_id", "invoice_line_ids")):
        goto = "refund"
    elif all(
        parsed[k]
        for k in ("customer_first_name", "customer_last_name", "customer_phone")
    ):
        goto = "lookup"
    else:
        goto = END
    update = {"messages": [info["raw"]], **parsed}
    return Command(update=update, goto=goto)

# Graph node for executing the refund.
# Note that here we inspect the runtime config for an "env" variable.
# If "env" is set to "test", then we don't actually delete any rows from our database.
# This will become important when we're running our evaluations.
def refund(state: State, config: RunnableConfig) -> dict:
    # Whether to mock the deletion. True if the configurable var 'env' is set to 'test'.
    mock = config.get("configurable", {}).get("env", "prod") == "test"
    refunded = _refund(
        invoice_id=state["invoice_id"], invoice_line_ids=state["invoice_line_ids"], mock=mock
    )
    response = f"You have been refunded a total of: ${refunded:.2f}. Is there anything else I can help with?"
    return {
        "messages": [{"role": "assistant", "content": response}],
        "followup": response,
    }

# Graph node for looking up the user's purchases.
def lookup(state: State) -> dict:
    args = (
        state[k]
        for k in (
            "customer_first_name",
            "customer_last_name",
            "customer_phone",
            "track_name",
            "album_title",
            "artist_name",
            "purchase_date_iso_8601",
        )
    )
    results = _lookup(*args)
    if not results:
        response = "We did not find any purchases associated with the information you've provided. Are you sure you've entered all of your information correctly?"
        followup = response
    else:
        response = f"Which of the following purchases would you like to be refunded for?\n\n```json\n{json.dumps(results, indent=2)}\n```"
        followup = f"Which of the following purchases would you like to be refunded for?\n\n{tabulate(results, headers='keys')}"
    return {
        "messages": [{"role": "assistant", "content": response}],
        "followup": followup,
        "invoice_line_ids": [res["invoice_line_id"] for res in results],
    }

# Building our graph
graph_builder = StateGraph(State)

graph_builder.add_node(gather_info)
graph_builder.add_node(refund)
graph_builder.add_node(lookup)

graph_builder.set_entry_point("gather_info")
graph_builder.add_edge("lookup", END)
graph_builder.add_edge("refund", END)

refund_graph = graph_builder.compile()
````
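The routing rule inside `gather_info` can also be expressed as a pure function, which makes it easy to unit test without calling a model. A sketch that mirrors the logic above, with two assumptions: `route` is a hypothetical helper, and the string `"end"` is a placeholder for LangGraph's `END` constant. It uses `dict.get`, so missing keys are treated like null values:

```python
REQUIRED_CUSTOMER_FIELDS = (
    "customer_first_name",
    "customer_last_name",
    "customer_phone",
)


def route(parsed: dict) -> str:
    """Pick the next step from the extracted purchase information."""
    # Any invoice identifier is enough to process a refund directly.
    if parsed.get("invoice_id") or parsed.get("invoice_line_ids"):
        return "refund"
    # Otherwise, all three customer fields are needed to search purchases.
    if all(parsed.get(k) for k in REQUIRED_CUSTOMER_FIELDS):
        return "lookup"
    return "end"  # Placeholder for END: ask the user for more details.


print(route({"invoice_id": 42}))  # refund
```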

We can visualize our refund graph:

```
# Assumes you're in an interactive Python environment
from IPython.display import Image, display ...
```

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=a65951850208fd3b03848629bdda8ae0" alt="Refund graph" data-og-width="256" width="256" data-og-height="333" height="333" data-path="langsmith/images/refund-graph.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=8817f44b37322ab9a51fd01ee7902181 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=753a20158640cbeeeb81498d5c5ae95d 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=8d38bcff07b53e1f5648b3dd45cffa66 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=50a7f863cf45d9df7b59cc3614fdb4e9 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=cfbda86ec83a651bfe8e38235579302d 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/refund-graph.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=56745d2e7603dca7fa233e1fd5818008 2500w" />

#### Lookup agent

For the lookup (i.e., question-answering) agent, we'll use a simple ReAct architecture and give the agent tools for looking up track names, artist names, and album names based on various filters. For example, you can look up albums by a particular artist, or artists who released songs with a specific name.

```python  theme={null}
from langchain.embeddings import init_embeddings
from langchain.tools import tool
from langchain_core.vectorstores import InMemoryVectorStore
from langchain.agents import create_agent


# Our SQL queries will only work if we filter on the exact string values that are in the DB.
# To ensure this, we'll create vectorstore indexes for all of the artists, tracks and albums
# ahead of time and use those to disambiguate the user input. E.g. if a user searches for
# songs by "prince" and our DB records the artist as "Prince", ideally when we query our
# artist vectorstore for "prince" we'll get back the value "Prince", which we can then
# use in our SQL queries.
def index_fields() -> tuple[InMemoryVectorStore, InMemoryVectorStore, InMemoryVectorStore]: ...

track_store, artist_store, album_store = index_fields()

# Agent tools
@tool
def lookup_track( ...

@tool
def lookup_album( ...

@tool
def lookup_artist( ...

# Agent model
qa_llm = init_chat_model("claude-sonnet-4-5-20250929")
# The prebuilt ReACT agent only expects State to have a 'messages' key, so the
# state we defined for the refund agent can also be passed to our lookup agent.
qa_graph = create_agent(qa_llm, tools=[lookup_track, lookup_artist, lookup_album])
```

```
display(Image(qa_graph.get_graph(xray=True).draw_mermaid_png()))
```

<img src="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=fa838edc78b2b29e8c29807d8c3dd7fd" alt="QA Graph" data-og-width="214" width="214" data-og-height="249" height="249" data-path="langsmith/images/qa-graph.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?w=280&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=920e82f376d6bbbcfe02c07ac7a45b80 280w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?w=560&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=938d3bd8c19abfe27ea5efd1c996494c 560w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?w=840&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=e46ece85318d4c376cd6bb632bf41ab4 840w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?w=1100&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=3e3c715ef37db24fd0cbf8eb4ca19190 1100w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?w=1650&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=e3270477acbc50eb9e4e9736a5ec6afc 1650w, https://mintcdn.com/langchain-5e9cc07a/Fr2lazPB4XVeEA7l/langsmith/images/qa-graph.png?w=2500&fit=max&auto=format&n=Fr2lazPB4XVeEA7l&q=85&s=667b26bb91f33aaacbeb0a2ea749825a 2500w" />
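
The disambiguation idea described in the code comments above (mapping a user's "prince" to the stored value "Prince") can be illustrated without an embedding model. The following is a minimal sketch, not the tutorial's `index_fields` implementation: it swaps the vector store for the standard library's `difflib` string matching, and `resolve_name` and the `artists` list are hypothetical names used only for illustration.

```python
import difflib
from typing import Optional


def resolve_name(user_input: str, known_values: list, cutoff: float = 0.6) -> Optional[str]:
    """Return the stored value closest to user_input, or None if nothing is close enough."""
    # Compare case-insensitively, but return the exact string stored in the DB.
    by_lower = {value.lower(): value for value in known_values}
    matches = difflib.get_close_matches(
        user_input.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


# Hypothetical stand-in for the artist names stored in the database.
artists = ["Prince", "Pink Floyd", "Led Zeppelin", "James Brown"]

print(resolve_name("prince", artists))       # Prince
print(resolve_name("led zepelin", artists))  # Led Zeppelin
print(resolve_name("xyz", artists))          # None
```

The resolved value is what would then be passed as the exact-match parameter of a SQL query. A vector store plays the same role but can also match on meaning rather than spelling alone.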

#### Parent agent

Now let's define a parent agent that combines our two task-specific agents. The only job of the parent agent is to route to one of the sub-agents by classifying the user's current intent, and to compile the output into a followup message.

```python  theme={null}
# Schema for routing user intent.
# We'll use structured output to enforce that the model returns only
# the desired output.
class UserIntent(TypedDict):
    """The user's current intent in the conversation"""

    intent: Literal["refund", "question_answering"]

# Routing model with structured output
router_llm = init_chat_model("gpt-4o-mini").with_structured_output(
    UserIntent, method="json_schema", strict=True
)

# Instructions for routing.
route_instructions = """You are managing an online music store that sells song tracks. \
You can help customers in two types of ways: (1) answering general questions about \
tracks sold at your store, (2) helping them get a refund on a purchase they made at your store.

Based on the following conversation, determine if the user is currently seeking general \
information about song tracks or if they are trying to refund a specific purchase.

Return 'refund' if they are trying to get a refund and 'question_answering' if they are \
asking a general music question. Do NOT return anything else. Do NOT try to respond to \
the user.
"""

# Node for routing.
async def intent_classifier(
    state: State,
) -> Command[Literal["refund_agent", "question_answering_agent"]]:
    response = await router_llm.ainvoke(
        [{"role": "system", "content": route_instructions}, *state["messages"]]
    )
    return Command(goto=response["intent"] + "_agent")

# Node for making sure the 'followup' key is set before our agent run completes.
def compile_followup(state: State) -> dict:
    """Set the followup to be the last message if it hasn't explicitly been set."""
    if not state.get("followup"):
        return {"followup": state["messages"][-1].content}
    return {}

# Agent definition
graph_builder = StateGraph(State)
graph_builder.add_node(intent_classifier)
# Since all of our subagents have compatible state,
# we can add them as nodes directly.
graph_builder.add_node("refund_agent", refund_graph)
graph_builder.add_node("question_answering_agent", qa_graph)
graph_builder.add_node(compile_followup)

graph_builder.set_entry_point("intent_classifier")
graph_builder.add_edge("refund_agent", "compile_followup")
graph_builder.add_edge("question_answering_agent", "compile_followup")
graph_builder.add_edge("compile_followup", END)

graph = graph_builder.compile()
```
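
The router relies on a naming convention: the target node name is the classified intent plus the suffix `_agent`. A minimal sketch of that mapping in plain Python, with no model call, is shown here. `Intent` and `node_for_intent` are hypothetical helpers, and the explicit validation is an addition for illustration; with strict structured output the model should only return one of the two allowed values.

```python
from typing import Literal, get_args

# The two intents allowed by the routing schema.
Intent = Literal["refund", "question_answering"]


def node_for_intent(intent: str) -> str:
    """Map a classified intent to the name of the sub-agent node."""
    if intent not in get_args(Intent):
        raise ValueError(f"Unexpected intent: {intent!r}")
    return intent + "_agent"


print(node_for_intent("refund"))              # refund_agent
print(node_for_intent("question_answering"))  # question_answering_agent
```

Keeping the node names derivable from the intent values means the schema and the graph only need to be kept in sync in one place.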

We can visualize our compiled parent graph including all of its subgraphs:

```python  theme={null}
display(Image(graph.get_graph(xray=True).draw_mermaid_png()))
```

<img src="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=619f9b540ea69b1662b2a599ce78241b" alt="graph" data-og-width="646" width="646" data-og-height="680" height="680" data-path="langsmith/images/agent-tutorial-graph.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?w=280&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=227790d90780a4c56233650b957130df 280w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?w=560&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=30ae6a9b1bc367152a57d4a0c3e41af7 560w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?w=840&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=37f29b6e783cf2a80714c29ab0be3c5f 840w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?w=1100&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=423ad48e0266ac257b6d76962697b45d 1100w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?w=1650&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=581821d6b377b98108507712d6b08c51 1650w, https://mintcdn.com/langchain-5e9cc07a/E8FdemkcQxROovD9/langsmith/images/agent-tutorial-graph.png?w=2500&fit=max&auto=format&n=E8FdemkcQxROovD9&q=85&s=126ef194ff5042f691c8f52cf3a1cb75 2500w" />

#### Try it out

Let's give our custom support agent a whirl!

```python  theme={null}
state = await graph.ainvoke(
    {"messages": [{"role": "user", "content": "what james brown songs do you have"}]}
)
print(state["followup"])
```

```
I found 20 James Brown songs in the database, all from the album "Sex Machine". Here they are: ...
```

```python  theme={null}
state = await graph.ainvoke({"messages": [
    {
        "role": "user",
        "content": "my name is Aaron Mitchell and my number is +1 (204) 452-6452. I bought some songs by Led Zeppelin that i'd like refunded",
    }
]})
print(state["followup"])
```

```
Which of the following purchases would you like to be refunded for? ...
```

## Evaluations

Now that we've got a testable version of our agent, let's run some evaluations. Agent evaluation can focus on at least 3 things:

* [Final response](/langsmith/evaluation-concepts#evaluating-an-agents-final-response): The inputs are a prompt and an optional list of tools. The output is the final agent response.
* [Trajectory](/langsmith/evaluation-concepts#evaluating-an-agents-trajectory): As before, the inputs are a prompt and an optional list of tools. The output is the list of tool calls.
* [Single step](/langsmith/evaluation-concepts#evaluating-a-single-step-of-an-agent): As before, the inputs are a prompt and an optional list of tools. The output is the tool call.

Let's run each type of evaluation:

### Final response evaluator

First, let's create a [dataset](/langsmith/evaluation-concepts#datasets) that evaluates end-to-end performance of the agent. For simplicity we'll use the same dataset for final response and trajectory evaluation, so we'll add both ground-truth responses and trajectories for each example question. We'll cover the trajectories in the next section.

```python  theme={null}
from langsmith import Client

client = Client()

# Create a dataset
examples = [
    {
        "inputs": {
            "question": "How many songs do you have by James Brown",
        },
        "outputs": {
            "response": "We have 20 songs by James Brown",
            "trajectory": ["question_answering_agent", "lookup_track"]
        }
    },
    {
        "inputs": {
            "question": "My name is Aaron Mitchell and I'd like a refund.",
        },
        "outputs": {
            "response": "I need some more information to help you with the refund. Please specify your phone number, the invoice ID, or the line item IDs for the purchase you'd like refunded.",
            "trajectory": ["refund_agent"],
        }
    },
    {
        "inputs": {
            "question": "My name is Aaron Mitchell and I'd like a refund on my Led Zeppelin purchases. My number is +1 (204) 452-6452",
        },
        "outputs": {
            "response": 'Which of the following purchases would you like to be refunded for?\n\n  invoice_line_id  track_name                        artist_name    purchase_date          quantity_purchased    price_per_unit\n-----------------  --------------------------------  -------------  -------------------  --------------------  ----------------\n              267  How Many More Times               Led Zeppelin   2009-08-06 00:00:00                     1              0.99\n              268  What Is And What Should Never Be  Led Zeppelin   2009-08-06 00:00:00                     1              0.99',
            "trajectory": ["refund_agent", "lookup"],
        },
    },
    {
        "inputs": {
            "question": "Who recorded Wish You Were Here again? What other albums of there's do you have?",
        },
        "outputs": {
            "response": "Wish You Were Here is an album by Pink Floyd",
            "trajectory": ["question_answering_agent", "lookup_album"],
        },
    },
    {
        "inputs": {
            "question": "I want a full refund for invoice 237",
        },
        "outputs": {
            "response": "You have been refunded $0.99.",
            "trajectory": ["refund_agent", "refund"],
        }
    },
]

dataset_name = "Chinook Customer Service Bot: E2E"

if not client.has_dataset(dataset_name=dataset_name):
    dataset = client.create_dataset(dataset_name=dataset_name)
    client.create_examples(
        dataset_id=dataset.id,
        examples=examples
    )
```

We'll create a custom [LLM-as-judge](/langsmith/evaluation-concepts#llm-as-judge) evaluator that uses another model to compare our agent's output on each example to the reference response, and judge if they're equivalent or not:

```python  theme={null}
# LLM-as-judge instructions
grader_instructions = """You are a teacher grading a quiz.

You will be given a QUESTION, the GROUND TRUTH (correct) RESPONSE, and the STUDENT RESPONSE.

Here are the grading criteria to follow:
(1) Grade the student responses based ONLY on their factual accuracy relative to the ground truth answer.
(2) Ensure that the student response does not contain any conflicting statements.
(3) It is OK if the student response contains more information than the ground truth response, as long as it is factually accurate relative to the ground truth response.

Correctness:
True means that the student's response meets all of the criteria.
False means that the student's response does not meet all of the criteria.

Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct."""

# LLM-as-judge output schema
class Grade(TypedDict):
    """Compare the expected and actual answers and grade the actual answer."""
    reasoning: Annotated[str, ..., "Explain your reasoning for whether the actual response is correct or not."]
    is_correct: Annotated[bool, ..., "True if the student response is mostly or exactly correct, otherwise False."]

# Judge LLM
grader_llm = init_chat_model("gpt-4o-mini", temperature=0).with_structured_output(Grade, method="json_schema", strict=True)

# Evaluator function
async def final_answer_correct(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
    """Evaluate if the final response is equivalent to reference response."""

    # Note that we assume `outputs` has a 'response' key. We'll need to make sure
    # that the target function we define includes this key.
    user = f"""QUESTION: {inputs['question']}
    GROUND TRUTH RESPONSE: {reference_outputs['response']}
    STUDENT RESPONSE: {outputs['response']}"""

    grade = await grader_llm.ainvoke([{"role": "system", "content": grader_instructions}, {"role": "user", "content": user}])
    return grade["is_correct"]
```
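
Before spending model calls, it can help to check the evaluator's plumbing offline. The sketch below assumes a hypothetical `StubGrader` that exposes the same `ainvoke` interface as the judge model and always returns a fixed verdict. `make_evaluator` is likewise an illustrative helper, not part of LangSmith. It uses `asyncio.run` so it works as a plain script; in a notebook you would `await` the evaluator directly.

```python
import asyncio


class StubGrader:
    """Hypothetical stand-in for the judge model; always returns a fixed verdict."""

    def __init__(self, verdict: bool):
        self.verdict = verdict
        self.seen = []  # records the messages passed to the grader

    async def ainvoke(self, messages: list) -> dict:
        self.seen.append(messages)
        return {"reasoning": "stubbed", "is_correct": self.verdict}


def make_evaluator(grader):
    """Build an evaluator with the same signature as final_answer_correct."""

    async def evaluator(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
        user = (
            f"QUESTION: {inputs['question']}\n"
            f"GROUND TRUTH RESPONSE: {reference_outputs['response']}\n"
            f"STUDENT RESPONSE: {outputs['response']}"
        )
        grade = await grader.ainvoke([{"role": "user", "content": user}])
        return grade["is_correct"]

    return evaluator


grader = StubGrader(verdict=True)
result = asyncio.run(
    make_evaluator(grader)(
        {"question": "How many songs do you have by James Brown"},
        {"response": "We have 20 James Brown songs."},
        {"response": "We have 20 songs by James Brown"},
    )
)
print(result)  # True, because the stub always returns its fixed verdict
```

This only verifies that the evaluator reads the expected keys and returns a boolean; it says nothing about the quality of the real judge model.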

Now we can run our evaluation. Our evaluator assumes that our target function returns a 'response' key, so let's define a target function that does so.

Also remember that in our refund graph we made the refund node configurable: if we pass `config={"configurable": {"env": "test"}}`, the refunds are mocked out and the DB is not updated. We'll use this configurable variable in our target `run_graph` function when invoking our graph:

```python  theme={null}
# Target function
async def run_graph(inputs: dict) -> dict:
    """Run the graph and return the final response."""
    result = await graph.ainvoke({"messages": [
        { "role": "user", "content": inputs['question']},
    ]}, config={"configurable": {"env": "test"}})
    return {"response": result["followup"]}

# Evaluation job and results
experiment_results = await client.aevaluate(
    run_graph,
    data=dataset_name,
    evaluators=[final_answer_correct],
    experiment_prefix="sql-agent-gpt4o-e2e",
    num_repetitions=1,
    max_concurrency=4,
)
experiment_results.to_pandas()
```

You can see what these results look like here: [LangSmith link](https://smith.langchain.com/public/708d08f4-300e-4c75-9677-c6b71b0d28c9/d).
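
The refund node in the reference code reads the `env` flag from the `configurable` section of the runtime config. A minimal sketch of that lookup on plain dictionaries (`is_mock` is a hypothetical helper name) shows which config shapes enable mocking:

```python
def is_mock(config: dict) -> bool:
    """Mirror the refund node's check: the flag lives under 'configurable'."""
    return config.get("configurable", {}).get("env", "prod") == "test"


print(is_mock({"configurable": {"env": "test"}}))  # True
print(is_mock({"env": "test"}))                    # False: flag is at the top level
print(is_mock({}))                                 # False: defaults to "prod"
```

Because the default is `"prod"`, a misplaced flag silently falls back to real deletions, so it is worth checking the config shape before running an evaluation against a database you care about.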

### Trajectory evaluator

As agents become more complex, they have more potential points of failure. Rather than using simple pass/fail evaluations, it's often better to use evaluations that can give partial credit when an agent takes some correct steps, even if it doesn't reach the right final answer.

This is where trajectory evaluations come in. A trajectory evaluation:

1. Compares the actual sequence of steps the agent took against an expected sequence
2. Calculates a score based on how many of the expected steps were completed correctly

For this example, our end-to-end dataset contains an ordered list of steps that we expect the agent to take. Let's create an evaluator that checks the agent's actual trajectory against these expected steps and calculates what percentage were completed:

```python  theme={null}
def trajectory_subsequence(outputs: dict, reference_outputs: dict) -> float:
    """Check how many of the desired steps the agent took."""
    if len(reference_outputs['trajectory']) > len(outputs['trajectory']):
        return 0.0

    i = j = 0
    while i < len(reference_outputs['trajectory']) and j < len(outputs['trajectory']):
        if reference_outputs['trajectory'][i] == outputs['trajectory'][j]:
            i += 1
        j += 1

    return i / len(reference_outputs['trajectory'])
```
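
To make the scoring concrete, here is a self-contained sketch of the same subsequence check applied to hand-written trajectories. The function name `subsequence_score` and the example step lists are illustrative assumptions, not output from a real run.

```python
def subsequence_score(expected: list, actual: list) -> float:
    """Fraction of expected steps found, in order, within the actual steps."""
    if len(expected) > len(actual):
        return 0.0
    i = j = 0
    while i < len(expected) and j < len(actual):
        if expected[i] == actual[j]:
            i += 1  # matched the next expected step
        j += 1      # always advance through the actual steps
    return i / len(expected)


expected = ["refund_agent", "lookup"]

# All expected steps appear in order; extra steps in between are ignored.
print(subsequence_score(expected, ["intent_classifier", "refund_agent", "gather_info", "lookup"]))  # 1.0

# Only the first expected step was reached.
print(subsequence_score(expected, ["intent_classifier", "refund_agent", "gather_info"]))  # 0.5

# Fewer actual steps than expected steps short-circuits to zero.
print(subsequence_score(expected, ["refund_agent"]))  # 0.0
```

Note the last case: the length check means a run that is shorter than the reference gets no partial credit, even if the steps it did take were correct.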

Now we can run our evaluation. Our evaluator assumes that our target function returns a 'trajectory' key, so let's define a target function that does so. We'll need to use [LangGraph's streaming capabilities](https://langchain-ai.github.io/langgra/langsmith/observability-concepts/streaming/) to record the trajectory.

Note that we are reusing the same dataset as for our final response evaluation, so we could have run both evaluators together and defined a target function that returns both "response" and "trajectory". In practice it's often useful to have separate datasets for each type of evaluation, which is why we show them separately here:

```python  theme={null}
async def run_graph(inputs: dict) -> dict:
    """Run graph and track the trajectory it takes along with the final response."""
    trajectory = []
    # Set subgraphs=True to stream events from subgraphs of the main graph: https://langchain-ai.github.io/langgraph/how-tos/streaming-subgraphs/
    # Set stream_mode="debug" to stream all possible events: https://langchain-ai.github.io/langgra/langsmith/observability-concepts/streaming
    async for namespace, chunk in graph.astream({"messages": [
            {
                "role": "user",
                "content": inputs['question'],
            }
        ]}, subgraphs=True, stream_mode="debug"):
        # Event type for entering a node
        if chunk['type'] == 'task':
            # Record the node name
            trajectory.append(chunk['payload']['name'])
            # Given how we defined our dataset, we also need to track when specific tools are
            # called by our question answering ReACT agent. These tool calls can be found
            # when the ToolNode (named "tools") is invoked by looking at the AIMessage.tool_calls
            # of the latest input message.
            if chunk['payload']['name'] == 'tools':
                for tc in chunk['payload']['input']['messages'][-1].tool_calls:
                    trajectory.append(tc['name'])
    return {"trajectory": trajectory}

experiment_results = await client.aevaluate(
    run_graph,
    data=dataset_name,
    evaluators=[trajectory_subsequence],
    experiment_prefix="sql-agent-gpt4o-trajectory",
    num_repetitions=1,
    max_concurrency=4,
)
experiment_results.to_pandas()
```

You can see what these results look like here: [LangSmith link](https://smith.langchain.com/public/708d08f4-300e-4c75-9677-c6b71b0d28c9/d).
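
The trajectory extraction logic can be exercised offline on hand-written events. The sketch below assumes debug chunks shaped the way the target function above reads them (a `type` field and a `payload` with `name` and `input`); real events carry more fields. `extract_trajectory` and the sample chunks are hypothetical, and `SimpleNamespace` stands in for an `AIMessage` with a `tool_calls` attribute.

```python
from types import SimpleNamespace


def extract_trajectory(chunks: list) -> list:
    """Collect node names from 'task' events, plus tool names when the tools node runs."""
    trajectory = []
    for chunk in chunks:
        if chunk["type"] != "task":
            continue  # skip results, checkpoints, and other event types
        name = chunk["payload"]["name"]
        trajectory.append(name)
        if name == "tools":
            last_message = chunk["payload"]["input"]["messages"][-1]
            trajectory.extend(tc["name"] for tc in last_message.tool_calls)
    return trajectory


# Hand-written stand-ins for streamed events (assumed shape, not real output).
ai_message = SimpleNamespace(tool_calls=[{"name": "lookup_track", "args": {}}])
chunks = [
    {"type": "task", "payload": {"name": "intent_classifier", "input": {}}},
    {"type": "task_result", "payload": {"name": "intent_classifier"}},
    {"type": "task", "payload": {"name": "question_answering_agent", "input": {}}},
    {"type": "task", "payload": {"name": "tools", "input": {"messages": [ai_message]}}},
]

print(extract_trajectory(chunks))
# ['intent_classifier', 'question_answering_agent', 'tools', 'lookup_track']
```

The extra entries such as `intent_classifier` and `tools` do not hurt the score, because the evaluator only checks that the expected steps appear in order.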

### Single step evaluators

While end-to-end tests give you the most signal about your agent's performance, for the sake of debugging and iterating on your agent it can be helpful to pinpoint specific steps that are difficult and evaluate them directly.

In our case, a crucial part of our agent is that it routes the user's intention correctly into either the "refund" path or the "question answering" path. Let's create a dataset and run some evaluations to directly stress test this one component.

```python  theme={null}
# Create dataset
examples = [
    {
        "inputs": {"messages": [{"role": "user", "content": "i bought some tracks recently and i dont like them"}]},
        "outputs": {"route": "refund_agent"},
    },
    {
        "inputs": {"messages": [{"role": "user", "content": "I was thinking of purchasing some Rolling Stones tunes, any recommendations?"}]},
        "outputs": {"route": "question_answering_agent"},
    },
    {
        "inputs": {"messages": [{"role": "user", "content": "i want a refund on purchase 237"}, {"role": "assistant", "content": "I've refunded you a total of $1.98. How else can I help you today?"}, {"role": "user", "content": "did prince release any albums in 2000?"}]},
        "outputs": {"route": "question_answering_agent"},
    },
    {
        "inputs": {"messages": [{"role": "user", "content": "i purchased a cover of Yesterday recently but can't remember who it was by, which versions of it do you have?"}]},
        "outputs": {"route": "question_answering_agent"},
    },
]

dataset_name = "Chinook Customer Service Bot: Intent Classifier"
if not client.has_dataset(dataset_name=dataset_name):
    dataset = client.create_dataset(dataset_name=dataset_name)
    client.create_examples(
        dataset_id=dataset.id,
        examples=examples
    )

# Evaluator
def correct(outputs: dict, reference_outputs: dict) -> bool:
    """Check if the agent chose the correct route."""
    return outputs["route"] == reference_outputs["route"]

# Target function for running the relevant step
async def run_intent_classifier(inputs: dict) -> dict:
    # Note that we can access and run the intent_classifier node of our graph directly.
    command = await graph.nodes['intent_classifier'].ainvoke(inputs)
    return {"route": command.goto}

# Run evaluation
experiment_results = await client.aevaluate(
    run_intent_classifier,
    data=dataset_name,
    evaluators=[correct],
    experiment_prefix="sql-agent-gpt4o-intent-classifier",
    max_concurrency=4,
)
```

You can see what these results look like here: [LangSmith link](https://smith.langchain.com/public/f133dae2-8a88-43a0-9bfd-ab45bfa3920b/d).
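
As a rough illustration of how the per-example `correct` scores turn into an accuracy figure, the sketch below runs the same evaluator against a hypothetical keyword-based router instead of the model. `keyword_router` and the three sample inputs are assumptions for illustration only; the point is that a naive baseline misses refund requests that never use the word "refund".

```python
def correct(outputs: dict, reference_outputs: dict) -> bool:
    """Check if the chosen route matches the reference route."""
    return outputs["route"] == reference_outputs["route"]


def keyword_router(inputs: dict) -> dict:
    """Hypothetical baseline: route on whether the last message mentions 'refund'."""
    last = inputs["messages"][-1]["content"].lower()
    route = "refund_agent" if "refund" in last else "question_answering_agent"
    return {"route": route}


samples = [
    (
        {"messages": [{"role": "user", "content": "i want a refund on purchase 237"}]},
        {"route": "refund_agent"},
    ),
    (
        {"messages": [{"role": "user", "content": "did prince release any albums in 2000?"}]},
        {"route": "question_answering_agent"},
    ),
    (
        {"messages": [{"role": "user", "content": "i bought some tracks recently and i dont like them"}]},
        {"route": "refund_agent"},
    ),
]

scores = [correct(keyword_router(inputs), reference) for inputs, reference in samples]
accuracy = sum(scores) / len(scores)
print(scores)  # [True, True, False]
print(f"accuracy: {accuracy:.2f}")  # accuracy: 0.67
```

Examples like the third one are where an LLM-based classifier should outperform keyword matching, which is why the dataset includes them.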

## Reference code

Here's a consolidated script with all the above code:

<Accordion title="Reference code">
  ````python  theme={null}
  import json
  import sqlite3
  from typing import Literal

  from langchain.chat_models import init_chat_model
  from langchain.embeddings import init_embeddings
  from langchain_core.runnables import RunnableConfig
  from langchain.tools import tool
  from langchain_core.vectorstores import InMemoryVectorStore
  from langgraph.graph import END, StateGraph
  from langgraph.graph.message import AnyMessage, add_messages
  from langchain.agents import create_agent
  from langgraph.types import Command, interrupt
  from langsmith import Client
  import requests
  from tabulate import tabulate
  from typing_extensions import Annotated, TypedDict


  url = "https://storage.googleapis.com/benchmarks-artifacts/chinook/Chinook.db"

  response = requests.get(url)

  if response.status_code == 200:
      # Open a local file in binary write mode
      with open("chinook.db", "wb") as file:
          # Write the content of the response (the file) to the local file
          file.write(response.content)
      print("File downloaded and saved as chinook.db")
  else:
      print(f"Failed to download the file. Status code: {response.status_code}")


  def _refund(
      invoice_id: int | None, invoice_line_ids: list[int] | None, mock: bool = False
  ) -> float:
      """Given an Invoice ID and/or Invoice Line IDs, delete the relevant Invoice/InvoiceLine records in the Chinook DB.

      Args:
          invoice_id: The Invoice to delete.
          invoice_line_ids: The Invoice Lines to delete.
          mock: If True, do not actually delete the specified Invoice/Invoice Lines. Used for testing purposes.

      Returns:
          float: The total dollar amount that was deleted (or mock deleted).
      """

      if invoice_id is None and invoice_line_ids is None:
          return 0.0

      # Connect to the Chinook database
      conn = sqlite3.connect("chinook.db")
      cursor = conn.cursor()

      total_refund = 0.0

      try:
          # If invoice_id is provided, delete entire invoice and its lines
          if invoice_id is not None:
              # First get the total amount for the invoice
              cursor.execute(
                  """
                  SELECT Total
                  FROM Invoice
                  WHERE InvoiceId = ?
              """,
                  (invoice_id,),
              )

              result = cursor.fetchone()
              if result:
                  total_refund += result[0]

              # Delete invoice lines first (due to foreign key constraints)
              if not mock:
                  cursor.execute(
                      """
                      DELETE FROM InvoiceLine
                      WHERE InvoiceId = ?
                  """,
                      (invoice_id,),
                  )

                  # Then delete the invoice
                  cursor.execute(
                      """
                      DELETE FROM Invoice
                      WHERE InvoiceId = ?
                  """,
                      (invoice_id,),
                  )

          # If specific invoice lines are provided
          if invoice_line_ids is not None:
              # Get the total amount for the specified invoice lines
              placeholders = ",".join(["?" for _ in invoice_line_ids])
              cursor.execute(
                  f"""
                  SELECT SUM(UnitPrice * Quantity)
                  FROM InvoiceLine
                  WHERE InvoiceLineId IN ({placeholders})
              """,
                  invoice_line_ids,
              )

              result = cursor.fetchone()
              if result and result[0]:
                  total_refund += result[0]

              if not mock:
                  # Delete the specified invoice lines
                  cursor.execute(
                      f"""
                      DELETE FROM InvoiceLine
                      WHERE InvoiceLineId IN ({placeholders})
                  """,
                      invoice_line_ids,
                  )

          # Commit the changes
          conn.commit()

      except sqlite3.Error as e:
          # Roll back in case of error
          conn.rollback()
          raise e

      finally:
          # Close the connection
          conn.close()

      return float(total_refund)


  def _lookup(
      customer_first_name: str,
      customer_last_name: str,
      customer_phone: str,
      track_name: str | None,
      album_title: str | None,
      artist_name: str | None,
      purchase_date_iso_8601: str | None,
  ) -> list[dict]:
      """Find all of the Invoice Line IDs in the Chinook DB for the given filters.

      Returns:
          a list of dictionaries that contain keys: {
              'invoice_line_id',
              'track_name',
              'artist_name',
              'purchase_date',
              'quantity_purchased',
              'price_per_unit'
          }
      """

      # Connect to the database
      conn = sqlite3.connect("chinook.db")
      cursor = conn.cursor()

      # Base query joining all necessary tables
      query = """
      SELECT
          il.InvoiceLineId,
          t.Name as track_name,
          art.Name as artist_name,
          i.InvoiceDate as purchase_date,
          il.Quantity as quantity_purchased,
          il.UnitPrice as price_per_unit
      FROM InvoiceLine il
      JOIN Invoice i ON il.InvoiceId = i.InvoiceId
      JOIN Customer c ON i.CustomerId = c.CustomerId
      JOIN Track t ON il.TrackId = t.TrackId
      JOIN Album alb ON t.AlbumId = alb.AlbumId
      JOIN Artist art ON alb.ArtistId = art.ArtistId
      WHERE c.FirstName = ?
      AND c.LastName = ?
      AND c.Phone = ?
      """

      # Parameters for the query
      params = [customer_first_name, customer_last_name, customer_phone]

      # Add optional filters
      if track_name:
          query += " AND t.Name = ?"
          params.append(track_name)

      if album_title:
          query += " AND alb.Title = ?"
          params.append(album_title)

      if artist_name:
          query += " AND art.Name = ?"
          params.append(artist_name)

      if purchase_date_iso_8601:
          query += " AND date(i.InvoiceDate) = date(?)"
          params.append(purchase_date_iso_8601)

      # Execute query
      cursor.execute(query, params)

      # Fetch results
      results = cursor.fetchall()

      # Convert results to list of dictionaries
      output = []
      for row in results:
          output.append(
              {
                  "invoice_line_id": row[0],
                  "track_name": row[1],
                  "artist_name": row[2],
                  "purchase_date": row[3],
                  "quantity_purchased": row[4],
                  "price_per_unit": row[5],
              }
          )

      # Close connection
      conn.close()

      return output


  # Graph state.
  class State(TypedDict):
      """Agent state."""

      messages: Annotated[list[AnyMessage], add_messages]
      followup: str | None

      invoice_id: int | None
      invoice_line_ids: list[int] | None
      customer_first_name: str | None
      customer_last_name: str | None
      customer_phone: str | None
      track_name: str | None
      album_title: str | None
      artist_name: str | None
      purchase_date_iso_8601: str | None


  # Instructions for extracting the user/purchase info from the conversation.
  gather_info_instructions = """You are managing an online music store that sells song tracks. \
  Customers can buy multiple tracks at a time and these purchases are recorded in a database as \
  an Invoice per purchase and an associated set of Invoice Lines for each purchased track.

  Your task is to help customers who would like a refund for one or more of the tracks they've \
  purchased. In order for you to be able to refund them, the customer must specify the Invoice ID \
  to get a refund on all the tracks they bought in a single transaction, or one or more Invoice \
  Line IDs if they would like refunds on individual tracks.

  Often a user will not know the specific Invoice ID(s) or Invoice Line ID(s) for which they \
  would like a refund. In this case you can help them look up their invoices by asking them to \
  specify:
  - Required: Their first name, last name, and phone number.
  - Optionally: The track name, artist name, album name, or purchase date.

  If the customer has not specified the required information (either Invoice/Invoice Line IDs \
  or first name, last name, phone) then please ask them to specify it."""


  # Extraction schema, mirrors the graph state.
  class PurchaseInformation(TypedDict):
      """All of the known information about the invoice / invoice lines the customer would like refunded. Do not make up values, leave fields as null if you don't know their value."""

      invoice_id: int | None
      invoice_line_ids: list[int] | None
      customer_first_name: str | None
      customer_last_name: str | None
      customer_phone: str | None
      track_name: str | None
      album_title: str | None
      artist_name: str | None
      purchase_date_iso_8601: str | None
      followup: Annotated[
          str | None,
          ...,
          "If the user hasn't provided enough identifying information, please tell them what the required information is and ask them to specify it.",
      ]


  # Model for performing extraction.
  info_llm = init_chat_model("gpt-4o-mini").with_structured_output(
      PurchaseInformation, method="json_schema", include_raw=True
  )


  # Graph node for extracting user info and routing to lookup/refund/END.
  async def gather_info(state: State) -> Command[Literal["lookup", "refund", END]]:
      info = await info_llm.ainvoke(
          [
              {"role": "system", "content": gather_info_instructions},
              *state["messages"],
          ]
      )
      parsed = info["parsed"]
      if any(parsed[k] for k in ("invoice_id", "invoice_line_ids")):
          goto = "refund"
      elif all(
          parsed[k]
          for k in ("customer_first_name", "customer_last_name", "customer_phone")
      ):
          goto = "lookup"
      else:
          goto = END
      update = {"messages": [info["raw"]], **parsed}
      return Command(update=update, goto=goto)


  # Graph node for executing the refund.
  # Note that here we inspect the runtime config for an "env" variable.
  # If "env" is set to "test", then we don't actually delete any rows from our database.
  # This will become important when we're running our evaluations.
  def refund(state: State, config: RunnableConfig) -> dict:
      # Whether to mock the deletion. True if the configurable var 'env' is set to 'test'.
      mock = config.get("configurable", {}).get("env", "prod") == "test"
      refunded = _refund(
          invoice_id=state["invoice_id"],
          invoice_line_ids=state["invoice_line_ids"],
          mock=mock,
      )
      response = f"You have been refunded a total of: ${refunded:.2f}. Is there anything else I can help with?"
      return {
          "messages": [{"role": "assistant", "content": response}],
          "followup": response,
      }


  # Graph node for looking up the user's purchases.
  def lookup(state: State) -> dict:
      args = (
          state[k]
          for k in (
              "customer_first_name",
              "customer_last_name",
              "customer_phone",
              "track_name",
              "album_title",
              "artist_name",
              "purchase_date_iso_8601",
          )
      )
      results = _lookup(*args)
      if not results:
          response = "We did not find any purchases associated with the information you've provided. Are you sure you've entered all of your information correctly?"
          followup = response
      else:
          response = f"Which of the following purchases would you like to be refunded for?\n\n```json\n{json.dumps(results, indent=2)}\n```"
          followup = f"Which of the following purchases would you like to be refunded for?\n\n{tabulate(results, headers='keys')}"
      return {
          "messages": [{"role": "assistant", "content": response}],
          "followup": followup,
          "invoice_line_ids": [res["invoice_line_id"] for res in results],
      }


  # Building our graph
  graph_builder = StateGraph(State)

  graph_builder.add_node(gather_info)
  graph_builder.add_node(refund)
  graph_builder.add_node(lookup)

  graph_builder.set_entry_point("gather_info")
  graph_builder.add_edge("lookup", END)
  graph_builder.add_edge("refund", END)

  refund_graph = graph_builder.compile()


  # Our SQL queries will only work if we filter on the exact string values that are in the DB.
  # To ensure this, we'll create vectorstore indexes for all of the artists, tracks and albums
  # ahead of time and use those to disambiguate the user input. E.g. if a user searches for
  # songs by "prince" and our DB records the artist as "Prince", ideally when we query our
  # artist vectorstore for "prince" we'll get back the value "Prince", which we can then
  # use in our SQL queries.
  def index_fields() -> (
      tuple[InMemoryVectorStore, InMemoryVectorStore, InMemoryVectorStore]
  ):
      """Create an index for all artists, an index for all albums, and an index for all songs."""
      # Connect to the chinook database. Connecting before the try block ensures
      # that `conn` is always defined by the time the finally clause runs.
      conn = sqlite3.connect("chinook.db")
      try:
          cursor = conn.cursor()

          # Fetch all results
          tracks = cursor.execute("SELECT Name FROM Track").fetchall()
          artists = cursor.execute("SELECT Name FROM Artist").fetchall()
          albums = cursor.execute("SELECT Title FROM Album").fetchall()
      finally:
          # Close the connection
          conn.close()

      embeddings = init_embeddings("openai:text-embedding-3-small")

      track_store = InMemoryVectorStore(embeddings)
      artist_store = InMemoryVectorStore(embeddings)
      album_store = InMemoryVectorStore(embeddings)

      track_store.add_texts([t[0] for t in tracks])
      artist_store.add_texts([a[0] for a in artists])
      album_store.add_texts([a[0] for a in albums])
      return track_store, artist_store, album_store


  track_store, artist_store, album_store = index_fields()


  # Agent tools
  @tool
  def lookup_track(
      track_name: str | None = None,
      album_title: str | None = None,
      artist_name: str | None = None,
  ) -> list[dict]:
      """Look up a track in the Chinook DB based on identifying information about it.

      Returns:
          a list of dictionaries per matching track that contain keys {'track_name', 'artist_name', 'album_name'}
      """
      conn = sqlite3.connect("chinook.db")
      cursor = conn.cursor()

      query = """
      SELECT DISTINCT t.Name as track_name, ar.Name as artist_name, al.Title as album_name
      FROM Track t
      JOIN Album al ON t.AlbumId = al.AlbumId
      JOIN Artist ar ON al.ArtistId = ar.ArtistId
      WHERE 1=1
      """
      params = []

      if track_name:
          track_name = track_store.similarity_search(track_name, k=1)[0].page_content
          query += " AND t.Name LIKE ?"
          params.append(f"%{track_name}%")
      if album_title:
          album_title = album_store.similarity_search(album_title, k=1)[0].page_content
          query += " AND al.Title LIKE ?"
          params.append(f"%{album_title}%")
      if artist_name:
          artist_name = artist_store.similarity_search(artist_name, k=1)[0].page_content
          query += " AND ar.Name LIKE ?"
          params.append(f"%{artist_name}%")

      cursor.execute(query, params)
      results = cursor.fetchall()

      tracks = [
          {"track_name": row[0], "artist_name": row[1], "album_name": row[2]}
          for row in results
      ]

      conn.close()
      return tracks


  @tool
  def lookup_album(
      track_name: str | None = None,
      album_title: str | None = None,
      artist_name: str | None = None,
  ) -> list[dict]:
      """Look up an album in the Chinook DB based on identifying information about it.

      Returns:
          a list of dictionaries per matching album that contain keys {'album_name', 'artist_name'}
      """
      conn = sqlite3.connect("chinook.db")
      cursor = conn.cursor()

      query = """
      SELECT DISTINCT al.Title as album_name, ar.Name as artist_name
      FROM Album al
      JOIN Artist ar ON al.ArtistId = ar.ArtistId
      LEFT JOIN Track t ON t.AlbumId = al.AlbumId
      WHERE 1=1
      """
      params = []

      if track_name:
          query += " AND t.Name LIKE ?"
          params.append(f"%{track_name}%")
      if album_title:
          query += " AND al.Title LIKE ?"
          params.append(f"%{album_title}%")
      if artist_name:
          query += " AND ar.Name LIKE ?"
          params.append(f"%{artist_name}%")

      cursor.execute(query, params)
      results = cursor.fetchall()

      albums = [{"album_name": row[0], "artist_name": row[1]} for row in results]

      conn.close()
      return albums


  @tool
  def lookup_artist(
      track_name: str | None = None,
      album_title: str | None = None,
      artist_name: str | None = None,
  ) -> list[str]:
      """Look up an artist in the Chinook DB based on identifying information about them.

      Returns:
          a list of matching artist names
      """
      conn = sqlite3.connect("chinook.db")
      cursor = conn.cursor()

      query = """
      SELECT DISTINCT ar.Name as artist_name
      FROM Artist ar
      LEFT JOIN Album al ON al.ArtistId = ar.ArtistId
      LEFT JOIN Track t ON t.AlbumId = al.AlbumId
      WHERE 1=1
      """
      params = []

      if track_name:
          query += " AND t.Name LIKE ?"
          params.append(f"%{track_name}%")
      if album_title:
          query += " AND al.Title LIKE ?"
          params.append(f"%{album_title}%")
      if artist_name:
          query += " AND ar.Name LIKE ?"
          params.append(f"%{artist_name}%")

      cursor.execute(query, params)
      results = cursor.fetchall()

      artists = [row[0] for row in results]

      conn.close()
      return artists


  # Agent model
  qa_llm = init_chat_model("claude-sonnet-4-5-20250929")
  # The prebuilt ReACT agent only expects State to have a 'messages' key, so the
  # state we defined for the refund agent can also be passed to our lookup agent.
  qa_graph = create_agent(qa_llm, [lookup_track, lookup_artist, lookup_album])


  # Schema for routing user intent.
  # We'll use structured output to enforce that the model returns only
  # the desired output.
  class UserIntent(TypedDict):
      """The user's current intent in the conversation"""

      intent: Literal["refund", "question_answering"]


  # Routing model with structured output
  router_llm = init_chat_model("gpt-4o-mini").with_structured_output(
      UserIntent, method="json_schema", strict=True
  )

  # Instructions for routing.
  route_instructions = """You are managing an online music store that sells song tracks. \
  You can help customers in two types of ways: (1) answering general questions about \
  tracks sold at your store, (2) helping them get a refund on a purchase they made at your store.

  Based on the following conversation, determine if the user is currently seeking general \
  information about song tracks or if they are trying to refund a specific purchase.

  Return 'refund' if they are trying to get a refund and 'question_answering' if they are \
  asking a general music question. Do NOT return anything else. Do NOT try to respond to \
  the user.
  """


  # Node for routing.
  async def intent_classifier(
      state: State,
  ) -> Command[Literal["refund_agent", "question_answering_agent"]]:
      response = await router_llm.ainvoke(
          [{"role": "system", "content": route_instructions}, *state["messages"]]
      )
      return Command(goto=response["intent"] + "_agent")


  # Node for making sure the 'followup' key is set before our agent run completes.
  def compile_followup(state: State) -> dict:
      """Set the followup to be the last message if it hasn't explicitly been set."""
      if not state.get("followup"):
          return {"followup": state["messages"][-1].content}
      return {}


  # Agent definition
  graph_builder = StateGraph(State)
  graph_builder.add_node(intent_classifier)
  # Since all of our subagents have compatible state,
  # we can add them as nodes directly.
  graph_builder.add_node("refund_agent", refund_graph)
  graph_builder.add_node("question_answering_agent", qa_graph)
  graph_builder.add_node(compile_followup)

  graph_builder.set_entry_point("intent_classifier")
  graph_builder.add_edge("refund_agent", "compile_followup")
  graph_builder.add_edge("question_answering_agent", "compile_followup")
  graph_builder.add_edge("compile_followup", END)

  graph = graph_builder.compile()


  client = Client()

  # Create a dataset
  examples = [
      {
          "inputs": {
              "question": "How many songs do you have by James Brown"
          },
          "outputs": {
              "response": "We have 20 songs by James Brown",
              "trajectory": ["question_answering_agent", "lookup_track"]
          },
      },
      {
          "inputs": {
              "question": "My name is Aaron Mitchell and I'd like a refund.",
          },
          "outputs": {
              "response": "I need some more information to help you with the refund. Please specify your phone number, the invoice ID, or the line item IDs for the purchase you'd like refunded.",
              "trajectory": ["refund_agent"],
          }
      },
      {
          "inputs": {
              "question": "My name is Aaron Mitchell and I'd like a refund on my Led Zeppelin purchases. My number is +1 (204) 452-6452",
          },
          "outputs": {
              "response": "Which of the following purchases would you like to be refunded for?\n\n  invoice_line_id  track_name                        artist_name    purchase_date          quantity_purchased    price_per_unit\n-----------------  --------------------------------  -------------  -------------------  --------------------  ----------------\n              267  How Many More Times               Led Zeppelin   2009-08-06 00:00:00                     1              0.99\n              268  What Is And What Should Never Be  Led Zeppelin   2009-08-06 00:00:00                     1              0.99",
              "trajectory": ["refund_agent", "lookup"],
          },
      },
      {
          "inputs": {
              "question": "Who recorded Wish You Were Here again? What other albums of there's do you have?",
          },
          "outputs": {
              "response": "Wish You Were Here is an album by Pink Floyd",
              "trajectory": ["question_answering_agent", "lookup_album"],
          }
      },
      {
          "inputs": {
              "question": "I want a full refund for invoice 237",
          },
          "outputs": {
              "response": "You have been refunded $2.97.",
              "trajectory": ["refund_agent", "refund"],
          },
      },
  ]

  dataset_name = "Chinook Customer Service Bot: E2E"

  if not client.has_dataset(dataset_name=dataset_name):
      dataset = client.create_dataset(dataset_name=dataset_name)
      client.create_examples(
          dataset_id=dataset.id,
          examples=examples
      )

  # LLM-as-judge instructions
  grader_instructions = """You are a teacher grading a quiz.

  You will be given a QUESTION, the GROUND TRUTH (correct) RESPONSE, and the STUDENT RESPONSE.

  Here is the grade criteria to follow:
  (1) Grade the student responses based ONLY on their factual accuracy relative to the ground truth answer.
  (2) Ensure that the student response does not contain any conflicting statements.
  (3) It is OK if the student response contains more information than the ground truth response, as long as it is factually accurate relative to the  ground truth response.

  Correctness:
  True means that the student's response meets all of the criteria.
  False means that the student's response does not meet all of the criteria.

  Explain your reasoning in a step-by-step manner to ensure your reasoning and conclusion are correct."""


  # LLM-as-judge output schema
  class Grade(TypedDict):
      """Compare the expected and actual answers and grade the actual answer."""

      reasoning: Annotated[
          str,
          ...,
          "Explain your reasoning for whether the actual response is correct or not.",
      ]
      is_correct: Annotated[
          bool,
          ...,
          "True if the student response is mostly or exactly correct, otherwise False.",
      ]


  # Judge LLM
  grader_llm = init_chat_model("gpt-4o-mini", temperature=0).with_structured_output(
      Grade, method="json_schema", strict=True
  )


  # Evaluator function
  async def final_answer_correct(
      inputs: dict, outputs: dict, reference_outputs: dict
  ) -> bool:
      """Evaluate if the final response is equivalent to reference response."""

      # Note that we assume the outputs has a 'response' dictionary. We'll need to make sure
      # that the target function we define includes this key.
      user = f"""QUESTION: {inputs['question']}
      GROUND TRUTH RESPONSE: {reference_outputs['response']}
      STUDENT RESPONSE: {outputs['response']}"""

      grade = await grader_llm.ainvoke(
          [
              {"role": "system", "content": grader_instructions},
              {"role": "user", "content": user},
          ]
      )
      return grade["is_correct"]


  # Target function
  async def run_graph(inputs: dict) -> dict:
      """Run graph and track the trajectory it takes along with the final response."""
      result = await graph.ainvoke(
          {
              "messages": [
                  {"role": "user", "content": inputs["question"]},
              ]
          },
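          # The refund node reads config["configurable"]["env"], so the flag
          # must be nested under the "configurable" key.
          config={"configurable": {"env": "test"}},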
      )
      return {"response": result["followup"]}


  # Evaluation job and results
  experiment_results = await client.aevaluate(
      run_graph,
      data=dataset_name,
      evaluators=[final_answer_correct],
      experiment_prefix="sql-agent-gpt4o-e2e",
      num_repetitions=1,
      max_concurrency=4,
  )
  experiment_results.to_pandas()


  def trajectory_subsequence(outputs: dict, reference_outputs: dict) -> float:
      """Check how many of the desired steps the agent took."""
      if len(reference_outputs["trajectory"]) > len(outputs["trajectory"]):
          return 0.0

      i = j = 0
      while i < len(reference_outputs["trajectory"]) and j < len(outputs["trajectory"]):
          if reference_outputs["trajectory"][i] == outputs["trajectory"][j]:
              i += 1
          j += 1

      return i / len(reference_outputs["trajectory"])


  async def run_graph(inputs: dict) -> dict:
      """Run graph and track the trajectory it takes along with the final response."""
      trajectory = []
      # Set subgraphs=True to stream events from subgraphs of the main graph: https://langchain-ai.github.io/langgraph/how-tos/streaming-subgraphs/
      # Set stream_mode="debug" to stream all possible events: https://langchain-ai.github.io/langgra/langsmith/observability-concepts/streaming
      async for namespace, chunk in graph.astream(
          {
              "messages": [
                  {
                      "role": "user",
                      "content": inputs["question"],
                  }
              ]
          },
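          # Run in test mode so that the refund node mocks the deletion.
          config={"configurable": {"env": "test"}},
          subgraphs=True,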
          stream_mode="debug",
      ):
          # Event type for entering a node
          if chunk["type"] == "task":
              # Record the node name
              trajectory.append(chunk["payload"]["name"])
              # Given how we defined our dataset, we also need to track when specific tools are
              # called by our question answering ReACT agent. These tool calls can be found
              # when the ToolsNode (named "tools") is invoked by looking at the AIMessage.tool_calls
              # of the latest input message.
              if chunk["payload"]["name"] == "tools" and chunk["type"] == "task":
                  for tc in chunk["payload"]["input"]["messages"][-1].tool_calls:
                      trajectory.append(tc["name"])

      return {"trajectory": trajectory}


  experiment_results = await client.aevaluate(
      run_graph,
      data=dataset_name,
      evaluators=[trajectory_subsequence],
      experiment_prefix="sql-agent-gpt4o-trajectory",
      num_repetitions=1,
      max_concurrency=4,
  )
  experiment_results.to_pandas()

  # Create dataset
  examples = [
      {
          "inputs": {
              "messages": [
                  {
                      "role": "user",
                      "content": "i bought some tracks recently and i dont like them",
                  }
              ],
          },
          "outputs": {"route": "refund_agent"},
      },
      {
          "inputs": {
              "messages": [
                  {
                      "role": "user",
                      "content": "I was thinking of purchasing some Rolling Stones tunes, any recommendations?",
                  }
              ],
          },
          "outputs": {"route": "question_answering_agent"},
      },
      {
          "inputs": {
              "messages": [
                  {"role": "user", "content": "i want a refund on purchase 237"},
                  {
                      "role": "assistant",
                      "content": "I've refunded you a total of $1.98. How else can I help you today?",
                  },
                  {"role": "user", "content": "did prince release any albums in 2000?"},
              ],
          },
          "outputs": {"route": "question_answering_agent"},
      },
      {
          "inputs": {
              "messages": [
                  {
                      "role": "user",
                      "content": "i purchased a cover of Yesterday recently but can't remember who it was by, which versions of it do you have?",
                  }
              ],
          },
          "outputs": {"route": "question_answering_agent"},
      },
  ]

  dataset_name = "Chinook Customer Service Bot: Intent Classifier"
  if not client.has_dataset(dataset_name=dataset_name):
      dataset = client.create_dataset(dataset_name=dataset_name)
      client.create_examples(
          dataset_id=dataset.id,
          examples=examples,
      )


  # Evaluator
  def correct(outputs: dict, reference_outputs: dict) -> bool:
      """Check if the agent chose the correct route."""
      return outputs["route"] == reference_outputs["route"]


  # Target function for running the relevant step
  async def run_intent_classifier(inputs: dict) -> dict:
      # Note that we can access and run the intent_classifier node of our graph directly.
      command = await graph.nodes["intent_classifier"].ainvoke(inputs)
      return {"route": command.goto}


  # Run evaluation
  experiment_results = await client.aevaluate(
      run_intent_classifier,
      data=dataset_name,
      evaluators=[correct],
      experiment_prefix="sql-agent-gpt4o-intent-classifier",
      max_concurrency=4,
  )
  experiment_results.to_pandas()
  ````
</Accordion>
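
To make the trajectory score concrete, here is a small standalone sketch of the same two-pointer subsequence check that `trajectory_subsequence` performs in the reference code, applied to hand-written trajectories. The name `subsequence_score` is hypothetical, the empty-reference guard is an addition, and the trajectories are illustrative rather than recorded from a real run.

```python
def subsequence_score(expected: list[str], actual: list[str]) -> float:
    """Fraction of expected steps that appear, in order, in the actual trajectory."""
    if not expected:
        # Added guard: nothing was expected, so nothing can be missing.
        return 1.0
    if len(expected) > len(actual):
        # Same early exit as the reference code: too few steps were taken.
        return 0.0

    matched = 0
    for step in actual:
        if matched < len(expected) and step == expected[matched]:
            matched += 1
    return matched / len(expected)


# Illustrative trajectories (not recorded from a real run).
expected = ["refund_agent", "lookup"]
full_match = ["intent_classifier", "refund_agent", "gather_info", "lookup", "compile_followup"]
partial_match = ["intent_classifier", "refund_agent", "gather_info", "compile_followup"]

print(subsequence_score(expected, full_match))     # 1.0
print(subsequence_score(expected, partial_match))  # 0.5
```

Extra steps in the actual trajectory do not lower the score; only missing or out-of-order expected steps do.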



# How to evaluate an existing experiment (Python only)
Source: https://docs.langchain.com/langsmith/evaluate-existing-experiment



Evaluation of existing experiments is currently only supported in the Python SDK.

If you have already run an experiment and want to add additional evaluation metrics, you can apply any evaluators to the experiment using the `evaluate()` / `aevaluate()` methods as before. Just pass in the experiment name / ID instead of a target function:

```python  theme={null}
from langsmith import evaluate

def always_half(inputs: dict, outputs: dict) -> float:
    return 0.5

experiment_name = "my-experiment:abc"  # Replace with an actual experiment name or ID

evaluate(experiment_name, evaluators=[always_half])
```
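
Any evaluator with the usual signature can be attached in this way. As a slightly more realistic sketch than `always_half`, the function below scores an exact match between the stored output and the reference output. It assumes both dictionaries carry an `answer` key, which is a hypothetical schema; adjust the key to whatever your experiment actually logged.

```python
def exact_match(outputs: dict, reference_outputs: dict) -> bool:
    """True when the logged answer equals the reference, ignoring case and outer whitespace."""
    # 'answer' is an assumed key name, not something LangSmith requires.
    actual = str(outputs.get("answer", "")).strip().lower()
    expected = str(reference_outputs.get("answer", "")).strip().lower()
    return actual == expected


# Evaluators are plain functions, so they can be sanity-checked locally
# before being passed in as evaluators=[exact_match].
print(exact_match({"answer": " Paris "}, {"answer": "paris"}))  # True
print(exact_match({"answer": "Lyon"}, {"answer": "paris"}))     # False
```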



# How to evaluate a graph
Source: https://docs.langchain.com/langsmith/evaluate-graph



<Info>
  [langgraph](https://langchain-ai.github.io/langgraph/)
</Info>

`langgraph` is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Evaluating `langgraph` graphs can be challenging because a single invocation can involve many LLM calls, and which LLM calls are made may depend on the outputs of preceding calls. In this guide, we will focus on the mechanics of how to pass graphs and graph nodes to `evaluate()` / `aevaluate()`. For evaluation techniques and best practices when building agents, head to the [langgraph docs](https://langchain-ai.github.io/langgraph/tutorials/#evaluation).

## End-to-end evaluations

The most common type of evaluation is an end-to-end one, where we want to evaluate the final graph output for each example input.

### Define a graph

Let's construct a simple ReACT agent to start:

```python  theme={null}
from typing import Annotated, Literal, TypedDict
from langchain.chat_models import init_chat_model
from langchain.tools import tool
from langgraph.prebuilt import ToolNode
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages

class State(TypedDict):
    # Messages have the type "list". The 'add_messages' function
    # in the annotation defines how this state key should be updated
    # (in this case, it appends messages to the list, rather than overwriting them)
    messages: Annotated[list, add_messages]

# Define the tools for the agent to use
@tool
def search(query: str) -> str:
    """Call to surf the web."""
    # This is a placeholder, but don't tell the LLM that...
    if "sf" in query.lower() or "san francisco" in query.lower():
        return "It's 60 degrees and foggy."
    return "It's 90 degrees and sunny."

tools = [search]
tool_node = ToolNode(tools)
model = init_chat_model("claude-sonnet-4-5-20250929").bind_tools(tools)

# Define the function that determines whether to continue or not
def should_continue(state: State) -> Literal["tools", END]:
    messages = state['messages']
    last_message = messages[-1]

    # If the LLM makes a tool call, then we route to the "tools" node
    if last_message.tool_calls:
        return "tools"

    # Otherwise, we stop (reply to the user)
    return END

# Define the function that calls the model
def call_model(state: State):
    messages = state['messages']
    response = model.invoke(messages)

    # We return a list, because this will get added to the existing list
    return {"messages": [response]}

# Define a new graph
workflow = StateGraph(State)

# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("tools", tool_node)

# Set the entrypoint as 'agent'
# This means that this node is the first one called
workflow.add_edge(START, "agent")

# We now add a conditional edge
workflow.add_conditional_edges(
    # First, we define the start node. We use 'agent'.
    # This means these are the edges taken after the 'agent' node is called.
    "agent",
    # Next, we pass in the function that will determine which node is called next.
    should_continue,
)

# We now add a normal edge from 'tools' to 'agent'.
# This means that after 'tools' is called, 'agent' node is called next.
workflow.add_edge("tools", 'agent')

# Finally, we compile it!
# This compiles it into a LangChain Runnable,
# meaning you can use it as you would any other runnable.
app = workflow.compile()
```

### Create a dataset

Let's create a simple dataset of questions and expected responses:

```python  theme={null}
from langsmith import Client

questions = [
    "what's the weather in sf",
    "whats the weather in san fran",
    "whats the weather in tangier"
]

answers = [
    "It's 60 degrees and foggy.",
    "It's 60 degrees and foggy.",
    "It's 90 degrees and sunny.",
]

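ls_client = Client()
# Create the dataset first, then add the examples to it.
dataset = ls_client.create_dataset("weather agent")
ls_client.create_examples(
    dataset_id=dataset.id,
    inputs=[{"question": q} for q in questions],
    # The evaluator reads reference_outputs["answer"], so use the same key here.
    outputs=[{"answer": a} for a in answers],
)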
```
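
Because the questions and answers live in two parallel lists, a silent length mismatch would pair a question with the wrong answer. The sketch below shows one way to build the examples defensively before uploading them. `build_examples` is a hypothetical helper, and the `{"inputs": ..., "outputs": ...}` shape is the list-of-dicts form that can be passed to `create_examples(..., examples=...)`.

```python
def build_examples(questions: list[str], answers: list[str]) -> list[dict]:
    """Pair each question with its expected answer, failing loudly on a length mismatch."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    return [
        {"inputs": {"question": q}, "outputs": {"answer": a}}
        for q, a in zip(questions, answers)
    ]


examples = build_examples(
    ["what's the weather in sf", "whats the weather in tangier"],
    ["It's 60 degrees and foggy.", "It's 90 degrees and sunny."],
)
print(examples[0])
```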

### Create an evaluator

And a simple evaluator:

Requires `langsmith>=0.2.0`

```python  theme={null}
judge_llm = init_chat_model("gpt-4o")

async def correct(outputs: dict, reference_outputs: dict) -> bool:
    instructions = (
        "Given an actual answer and an expected answer, determine whether"
        " the actual answer contains all of the information in the"
        " expected answer. Respond with 'CORRECT' if the actual answer"
        " does contain all of the expected information and 'INCORRECT'"
        " otherwise. Do not include anything else in your response."
    )
    # Our graph outputs a State dictionary, which in this case means
    # we'll have a 'messages' key and the final message should
    # be our actual answer.
    actual_answer = outputs["messages"][-1].content
    expected_answer = reference_outputs["answer"]
    user_msg = (
        f"ACTUAL ANSWER: {actual_answer}"
        f"\n\nEXPECTED ANSWER: {expected_answer}"
    )
    response = await judge_llm.ainvoke(
        [
            {"role": "system", "content": instructions},
            {"role": "user", "content": user_msg}
        ]
    )
    return response.content.upper() == "CORRECT"
```
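
The final comparison above only passes when the judge replies with exactly `CORRECT`. Models sometimes add whitespace, quotes, or a trailing period, so a slightly more tolerant parser can reduce false negatives. The following is a minimal sketch; `parse_verdict` is a hypothetical helper, not part of LangSmith or LangChain.

```python
import string


def parse_verdict(text: str) -> bool:
    """Map a judge reply to True/False, tolerating case, whitespace, and outer punctuation."""
    cleaned = text.strip(string.punctuation + string.whitespace).upper()
    # Compare for equality rather than substring: "INCORRECT" contains "CORRECT".
    return cleaned == "CORRECT"


print(parse_verdict("CORRECT"))      # True
print(parse_verdict(" correct.\n"))  # True
print(parse_verdict("'CORRECT'"))    # True
print(parse_verdict("INCORRECT"))    # False
```

Equality is used deliberately: a substring check such as `"CORRECT" in text` would also accept `INCORRECT`.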

### Run evaluations

Now we can run our evaluations and explore the results. We just need to wrap our graph so that it accepts inputs in the format in which they're stored in our examples:

<Note>
  If all of your graph nodes are defined as sync functions, then you can use `evaluate` or `aevaluate`. If any of your nodes are defined as async, you'll need to use `aevaluate`.
</Note>

Requires `langsmith>=0.2.0`

```python  theme={null}
from langsmith import aevaluate

def example_to_state(inputs: dict) -> dict:
    return {"messages": [{"role": "user", "content": inputs['question']}]}

# We use LCEL declarative syntax here.
# Remember that langgraph graphs are also langchain runnables.
target = example_to_state | app

experiment_results = await aevaluate(
    target,
    data="weather agent",
    evaluators=[correct],
    max_concurrency=4,  # optional
    experiment_prefix="claude-3.5-baseline",  # optional
)
```

## Evaluating intermediate steps

Often it is valuable to evaluate not only the final output of an agent but also the intermediate steps it has taken. What's nice about `langgraph` is that the output of a graph is a state object that often already carries information about the intermediate steps taken. Usually we can evaluate whatever we're interested in just by looking at the messages in our state. For example, we can look at the messages to assert that the model invoked the 'search' tool as its first step.

Requires `langsmith>=0.2.0`

```python  theme={null}
def right_tool(outputs: dict) -> bool:
    tool_calls = outputs["messages"][1].tool_calls
    return bool(tool_calls and tool_calls[0]["name"] == "search")

experiment_results = await aevaluate(
    target,
    data="weather agent",
    evaluators=[correct, right_tool],
    max_concurrency=4,  # optional
    experiment_prefix="claude-3.5-baseline",  # optional
)
```
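
`right_tool` assumes the first model reply sits at index 1 of `messages`, which holds here because the state starts with a single user message. A position-independent variant can instead scan for the first message that carries tool calls. The sketch below uses `SimpleNamespace` objects as stand-ins for LangChain message objects so that the evaluator can be exercised without calling a model; `searched_first` is a hypothetical name.

```python
from types import SimpleNamespace


def searched_first(outputs: dict) -> bool:
    """True if the first message that contains tool calls starts with a 'search' call."""
    for message in outputs["messages"]:
        # Messages without tool calls (user, tool, final replies) are skipped.
        tool_calls = getattr(message, "tool_calls", None)
        if tool_calls:
            return tool_calls[0]["name"] == "search"
    return False


# Stand-in messages (assumed shape: attribute access, tool calls as dicts).
human = SimpleNamespace(content="what's the weather in sf")
ai_call = SimpleNamespace(content="", tool_calls=[{"name": "search", "args": {"query": "sf weather"}}])
tool_result = SimpleNamespace(content="It's 60 degrees and foggy.")
ai_final = SimpleNamespace(content="It's 60 degrees and foggy in SF.", tool_calls=[])

print(searched_first({"messages": [human, ai_call, tool_result, ai_final]}))  # True
print(searched_first({"messages": [human, ai_final]}))                        # False
```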

If we need access to information about intermediate steps that isn't in state, we can look at the Run object. This contains the full traces for all node inputs and outputs:

<Check>
  See more about what arguments you can pass to custom evaluators in this [how-to guide](/langsmith/code-evaluator).
</Check>

```python  theme={null}
from langsmith.schemas import Run, Example

def right_tool_from_run(run: Run, example: Example) -> dict:
    # Find the first child run produced by the "agent" node
    first_model_run = next(r for r in run.child_runs if r.name == "agent")
    tool_calls = first_model_run.outputs["messages"][-1].tool_calls
    right_tool = bool(tool_calls and tool_calls[0]["name"] == "search")
    return {"key": "right_tool", "value": right_tool}

experiment_results = await aevaluate(
    target,
    data="weather agent",
    evaluators=[correct, right_tool_from_run],
    max_concurrency=4,  # optional
    experiment_prefix="claude-3.5-baseline",  # optional
)
```
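
Looking a run up with `next(...)` raises `StopIteration` when no child matches, and it only inspects direct children. If the run you need may be nested deeper, a small recursive search is one option. This is a sketch under assumptions: `find_run` is a hypothetical helper, and the `SimpleNamespace` tree stands in for real `Run` objects, mirroring only their `name` and `child_runs` attributes.

```python
from types import SimpleNamespace


def find_run(run, name: str):
    """Depth-first search for the first descendant run called `name`; None if absent."""
    for child in getattr(run, "child_runs", None) or []:
        if child.name == name:
            return child
        nested = find_run(child, name)
        if nested is not None:
            return nested
    return None


# Stand-in run tree (illustrative names, not taken from a real trace).
model_call = SimpleNamespace(name="model_call", child_runs=[])
agent_run = SimpleNamespace(name="agent", child_runs=[model_call])
tools_run = SimpleNamespace(name="tools", child_runs=None)
root = SimpleNamespace(name="root", child_runs=[agent_run, tools_run])

print(find_run(root, "agent").name)       # agent
print(find_run(root, "model_call").name)  # model_call
print(find_run(root, "missing"))          # None
```

Returning `None` lets the evaluator decide how to score a missing step instead of failing the whole evaluation run.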

## Running and evaluating individual nodes

Sometimes you want to evaluate a single node directly to save time and costs. `langgraph` makes it easy to do this. In this case we can even continue using the evaluators we've been using.

```python  theme={null}
node_target = example_to_state | app.nodes["agent"]

node_experiment_results = await aevaluate(
    node_target,
    data="weather agent",
    evaluators=[right_tool_from_run],
    max_concurrency=4,  # optional
    experiment_prefix="claude-3.5-model-node",  # optional
)
```

## Related

* [`langgraph` evaluation docs](https://langchain-ai.github.io/langgraph/tutorials/#evaluation)

## Reference code

<Accordion title="Click to see a consolidated code snippet">
  ```python  theme={null}
  from typing import Annotated, Literal, TypedDict
  from langchain.chat_models import init_chat_model
  from langchain.tools import tool
  from langgraph.prebuilt import ToolNode
  from langgraph.graph import END, START, StateGraph
  from langgraph.graph.message import add_messages
  from langsmith import Client, aevaluate

  # Define a graph
  class State(TypedDict):
      # Messages have the type "list". The 'add_messages' function
      # in the annotation defines how this state key should be updated
      # (in this case, it appends messages to the list, rather than overwriting them)
      messages: Annotated[list, add_messages]

  # Define the tools for the agent to use
  @tool
  def search(query: str) -> str:
      """Call to surf the web."""
      # This is a placeholder, but don't tell the LLM that...
      if "sf" in query.lower() or "san francisco" in query.lower():
          return "It's 60 degrees and foggy."
      return "It's 90 degrees and sunny."

  tools = [search]
  tool_node = ToolNode(tools)
  model = init_chat_model("claude-sonnet-4-5-20250929").bind_tools(tools)

  # Define the function that determines whether to continue or not
  def should_continue(state: State) -> Literal["tools", END]:
      messages = state['messages']
      last_message = messages[-1]

      # If the LLM makes a tool call, then we route to the "tools" node
      if last_message.tool_calls:
          return "tools"

      # Otherwise, we stop (reply to the user)
      return END

  # Define the function that calls the model
  def call_model(state: State):
      messages = state['messages']
      response = model.invoke(messages)
      # We return a list, because this will get added to the existing list
      return {"messages": [response]}

  # Define a new graph
  workflow = StateGraph(State)

  # Define the two nodes we will cycle between
  workflow.add_node("agent", call_model)
  workflow.add_node("tools", tool_node)

  # Set the entrypoint as 'agent'
  # This means that this node is the first one called
  workflow.add_edge(START, "agent")

  # We now add a conditional edge
  workflow.add_conditional_edges(
      # First, we define the start node. We use 'agent'.
      # This means these are the edges taken after the 'agent' node is called.
      "agent",
      # Next, we pass in the function that will determine which node is called next.
      should_continue,
  )

  # We now add a normal edge from 'tools' to 'agent'.
  # This means that after 'tools' is called, 'agent' node is called next.
  workflow.add_edge("tools", 'agent')

  # Finally, we compile it!
  # This compiles it into a LangChain Runnable,
  # meaning you can use it as you would any other runnable.
  app = workflow.compile()

  questions = [
      "what's the weather in sf",
      "whats the weather in san fran",
      "whats the weather in tangier"
  ]

  answers = [
      "It's 60 degrees and foggy.",
      "It's 60 degrees and foggy.",
      "It's 90 degrees and sunny.",
  ]

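  # Create a dataset, then add one example per question/answer pair.
  # The reference key is "answer" to match what the 'correct' evaluator reads.
  ls_client = Client()
  dataset = ls_client.create_dataset("weather agent")
  ls_client.create_examples(
      dataset_id=dataset.id,
      examples=[
          {"inputs": {"question": q}, "outputs": {"answer": a}}
          for q, a in zip(questions, answers)
      ],
  )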

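  # Define evaluators
  # LLM-as-judge model used by 'correct'. The model name is an assumed choice;
  # any chat model can be substituted.
  judge_llm = init_chat_model("gpt-4o")

  async def correct(outputs: dict, reference_outputs: dict) -> bool: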
      instructions = (
          "Given an actual answer and an expected answer, determine whether"
          " the actual answer contains all of the information in the"
          " expected answer. Respond with 'CORRECT' if the actual answer"
          " does contain all of the expected information and 'INCORRECT'"
          " otherwise. Do not include anything else in your response."
      )
      # Our graph outputs a State dictionary, which in this case means
      # we'll have a 'messages' key and the final message should
      # be our actual answer.
      actual_answer = outputs["messages"][-1].content
      expected_answer = reference_outputs["answer"]
      user_msg = (
          f"ACTUAL ANSWER: {actual_answer}"
          f"\n\nEXPECTED ANSWER: {expected_answer}"
      )
      response = await judge_llm.ainvoke(
          [
              {"role": "system", "content": instructions},
              {"role": "user", "content": user_msg}
          ]
      )
      return response.content.upper() == "CORRECT"

  def right_tool(outputs: dict) -> bool:
      tool_calls = outputs["messages"][1].tool_calls
      return bool(tool_calls and tool_calls[0]["name"] == "search")

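  # Map each dataset example onto the graph's input state
  def example_to_state(inputs: dict) -> dict:
      return {"messages": [{"role": "user", "content": inputs["question"]}]}

  # LangGraph graphs are runnables, so a function can be piped into them
  target = example_to_state | app

  # Run evaluation
  experiment_results = await aevaluate(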
      target,
      data="weather agent",
      evaluators=[correct, right_tool],
      max_concurrency=4,  # optional
      experiment_prefix="claude-3.5-baseline",  # optional
  )
  ```
</Accordion>



# How to evaluate an LLM application
Source: https://docs.langchain.com/langsmith/evaluate-llm-application



This guide shows you how to run an evaluation on an LLM application using the LangSmith SDK.

<Info>
  [Evaluations](/langsmith/evaluation-concepts#applying-evaluations) | [Evaluators](/langsmith/evaluation-concepts#evaluators) | [Datasets](/langsmith/evaluation-concepts#datasets)
</Info>

We'll use the [evaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate) method in the LangSmith SDK throughout.

<Check>
  For larger evaluation jobs in Python we recommend [aevaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._arunner.aevaluate), the asynchronous version of [evaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate). The two have identical interfaces, so it is still worth reading this guide first and then the how-to guide on [running an evaluation asynchronously](/langsmith/evaluation-async).

  In JS/TS, `evaluate()` is already asynchronous, so no separate method is needed.

  For large jobs, also set the `max_concurrency`/`maxConcurrency` argument. This parallelizes evaluation by effectively splitting the dataset across threads.
</Check>
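
To make the concurrency idea concrete, here is a minimal standard-library sketch of what capping work at `max_concurrency` threads looks like. It is an illustration only, not LangSmith's implementation: `run_target_concurrently` and `fake_classifier` are hypothetical names, and the classifier is a keyword check standing in for a slow model call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_target_concurrently(target, example_inputs, max_concurrency=4):
    """Call `target` on each input using at most `max_concurrency` threads."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map keeps results in input order even if calls finish out of order.
        return list(pool.map(target, example_inputs))


def fake_classifier(inputs: dict) -> dict:
    # Hypothetical stand-in for an I/O-bound model call.
    return {"class": "Toxic" if "idiot" in inputs["text"] else "Not toxic"}


results = run_target_concurrently(
    fake_classifier,
    [{"text": "Shut up, idiot"}, {"text": "I had a great day today"}],
    max_concurrency=2,
)
```

Threads help here because model calls spend most of their time waiting on the network; a higher limit speeds up the job until you hit provider rate limits.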

## Define an application

First we need an application to evaluate. Let's create a simple toxicity classifier for this example.

<CodeGroup>
  ```python Python theme={null}
  from langsmith import traceable, wrappers
  from openai import OpenAI

  # Optionally wrap the OpenAI client to trace all model calls.
  oai_client = wrappers.wrap_openai(OpenAI())

  # Optionally add the 'traceable' decorator to trace the inputs/outputs of this function.
  @traceable
  def toxicity_classifier(inputs: dict) -> dict:
      instructions = (
        "Please review the user query below and determine if it contains any form of toxic behavior, "
        "such as insults, threats, or highly negative comments. Respond with 'Toxic' if it does "
        "and 'Not toxic' if it doesn't."
      )
      messages = [
          {"role": "system", "content": instructions},
          {"role": "user", "content": inputs["text"]},
      ]
      result = oai_client.chat.completions.create(
          messages=messages, model="gpt-4o-mini", temperature=0
      )
      return {"class": result.choices[0].message.content}
  ```

  ```typescript TypeScript theme={null}
  import { OpenAI } from "openai";
  import { wrapOpenAI } from "langsmith/wrappers";
  import { traceable } from "langsmith/traceable";

  // Optionally wrap the OpenAI client to trace all model calls.
  const oaiClient = wrapOpenAI(new OpenAI());

  // Optionally add the 'traceable' wrapper to trace the inputs/outputs of this function.
  const toxicityClassifier = traceable(
    async (text: string) => {
      const result = await oaiClient.chat.completions.create({
        messages: [
          {
            role: "system",
            content: "Please review the user query below and determine if it contains any form of toxic behavior, such as insults, threats, or highly negative comments. Respond with 'Toxic' if it does, and 'Not toxic' if it doesn't.",
          },
          { role: "user", content: text },
        ],
        model: "gpt-4o-mini",
        temperature: 0,
      });

      return result.choices[0].message.content;
    },
    { name: "toxicityClassifier" }
  );
  ```
</CodeGroup>

We've optionally enabled tracing to capture the inputs and outputs of each step in the pipeline. To understand how to annotate your code for tracing, please refer to [this guide](/langsmith/annotate-code).

## Create or select a dataset

We need a [Dataset](/langsmith/evaluation-concepts#datasets) to evaluate our application on. Our dataset will contain labeled [examples](/langsmith/evaluation-concepts#examples) of toxic and non-toxic text.

Requires `langsmith>=0.3.13`

<CodeGroup>
  ```python Python theme={null}
  from langsmith import Client
  ls_client = Client()

  examples = [
    {
      "inputs": {"text": "Shut up, idiot"},
      "outputs": {"label": "Toxic"},
    },
    {
      "inputs": {"text": "You're a wonderful person"},
      "outputs": {"label": "Not toxic"},
    },
    {
      "inputs": {"text": "This is the worst thing ever"},
      "outputs": {"label": "Toxic"},
    },
    {
      "inputs": {"text": "I had a great day today"},
      "outputs": {"label": "Not toxic"},
    },
    {
      "inputs": {"text": "Nobody likes you"},
      "outputs": {"label": "Toxic"},
    },
    {
      "inputs": {"text": "This is unacceptable. I want to speak to the manager."},
      "outputs": {"label": "Not toxic"},
    },
  ]

  dataset = ls_client.create_dataset(dataset_name="Toxic Queries")
  ls_client.create_examples(
    dataset_id=dataset.id,
    examples=examples,
  )
  ```

  ```typescript TypeScript theme={null}
  import { Client } from "langsmith";

  const langsmith = new Client();

  // create a dataset
  const labeledTexts = [
    ["Shut up, idiot", "Toxic"],
    ["You're a wonderful person", "Not toxic"],
    ["This is the worst thing ever", "Toxic"],
    ["I had a great day today", "Not toxic"],
    ["Nobody likes you", "Toxic"],
    ["This is unacceptable. I want to speak to the manager.", "Not toxic"],
  ];

  const [inputs, outputs] = labeledTexts.reduce<
    [Array<{ input: string }>, Array<{ outputs: string }>]
  >(
    ([inputs, outputs], item) => [
      [...inputs, { input: item[0] }],
      [...outputs, { outputs: item[1] }],
    ],
    [[], []]
  );

  const datasetName = "Toxic Queries";
  const toxicDataset = await langsmith.createDataset(datasetName);
  await langsmith.createExamples({ inputs, outputs, datasetId: toxicDataset.id });
  ```
</CodeGroup>
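
If your labeled data starts out as `(text, label)` pairs, as in the TypeScript snippet, a small helper can build the list of example dictionaries used in the Python snippet. This is a sketch: `to_examples` is a hypothetical helper, and the `text` and `label` keys simply mirror the ones used in this guide.

```python
labeled_texts = [
    ("Shut up, idiot", "Toxic"),
    ("You're a wonderful person", "Not toxic"),
]


def to_examples(pairs):
    """Convert (text, label) pairs into dicts with 'inputs' and 'outputs' keys."""
    return [
        {"inputs": {"text": text}, "outputs": {"label": label}}
        for text, label in pairs
    ]


examples = to_examples(labeled_texts)
```

The resulting list has the same shape as the `examples` list passed to `create_examples` above.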

For more details on datasets, refer to the [Manage datasets](/langsmith/manage-datasets) page.

## Define an evaluator

<Check>
  You can also check out LangChain's open source evaluation package [openevals](https://github.com/langchain-ai/openevals) for common pre-built evaluators.
</Check>

[Evaluators](/langsmith/evaluation-concepts#evaluators) are functions for scoring your application's outputs. They take in the example inputs, actual outputs, and, when present, the reference outputs. Since we have labels for this task, our evaluator can directly check if the actual outputs match the reference outputs.

* Python: Requires `langsmith>=0.3.13`
* TypeScript: Requires `langsmith>=0.2.9`

<CodeGroup>
  ```python Python theme={null}
  def correct(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
      return outputs["class"] == reference_outputs["label"]
  ```

  ```typescript TypeScript theme={null}
  import type { EvaluationResult } from "langsmith/evaluation";

  function correct({
    outputs,
    referenceOutputs,
  }: {
    outputs: Record<string, any>;
    referenceOutputs?: Record<string, any>;
  }): EvaluationResult {
    const score = outputs.output === referenceOutputs?.outputs;
    return { key: "correct", score };
  }
  ```
</CodeGroup>
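
Exact string equality can be brittle when the model adds whitespace, a trailing period, or different casing. The variant below is a sketch of a more tolerant evaluator; `correct_normalized` is a hypothetical name, and it assumes the same `class` and `label` keys as the Python example above.

```python
def correct_normalized(outputs: dict, reference_outputs: dict) -> bool:
    """Compare labels after trimming whitespace, trailing periods, and case."""

    def normalize(label: str) -> str:
        return label.strip().rstrip(".").strip().lower()

    return normalize(outputs["class"]) == normalize(reference_outputs["label"])
```

Whether this leniency is appropriate depends on how strictly you want to hold the application to the requested output format.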

## Run the evaluation

We'll use the [evaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate) / [aevaluate()](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._arunner.aevaluate) methods to run the evaluation.

The key arguments are:

* a target function that takes an input dictionary and returns an output dictionary. The `example.inputs` field of each [Example](/langsmith/example-data-format) is what gets passed to the target function. In this case our `toxicity_classifier` is already set up to take in example inputs so we can use it directly.
* `data` - the name OR UUID of the LangSmith dataset to evaluate on, or an iterator of examples
* `evaluators` - a list of evaluators to score the outputs of the function

Python: Requires `langsmith>=0.3.13`

<CodeGroup>
  ```python Python theme={null}
  # Can equivalently use the 'evaluate' function directly:
  # from langsmith import evaluate; evaluate(...)
  results = ls_client.evaluate(
      toxicity_classifier,
      data=dataset.name,
      evaluators=[correct],
      experiment_prefix="gpt-4o-mini, baseline",  # optional, experiment name prefix
      description="Testing the baseline system.",  # optional, experiment description
      max_concurrency=4, # optional, add concurrency
  )
  ```

  ```typescript TypeScript theme={null}
  import { evaluate } from "langsmith/evaluation";

  await evaluate((inputs) => toxicityClassifier(inputs["input"]), {
    data: datasetName,
    evaluators: [correct],
    experimentPrefix: "gpt-4o-mini, baseline",  // optional, experiment name prefix
    maxConcurrency: 4, // optional, add concurrency
  });
  ```
</CodeGroup>
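
Conceptually, an evaluation run passes each example's inputs to the target and then hands the outputs and reference outputs to every evaluator. The loop below is a local, offline sketch of that data flow under simplifying assumptions: `run_local_eval` and `keyword_classifier` are hypothetical, nothing is sent to LangSmith, and no experiment is created.

```python
def run_local_eval(target, examples, evaluators):
    """Apply `target` to each example's inputs, then score with each evaluator."""
    rows = []
    for example in examples:
        outputs = target(example["inputs"])
        scores = {
            evaluator.__name__: evaluator(outputs, example["outputs"])
            for evaluator in evaluators
        }
        rows.append({"inputs": example["inputs"], "outputs": outputs, "scores": scores})
    return rows


def keyword_classifier(inputs: dict) -> dict:
    # Hypothetical stand-in for `toxicity_classifier`; makes no model call.
    return {"class": "Toxic" if "idiot" in inputs["text"].lower() else "Not toxic"}


def correct(outputs: dict, reference_outputs: dict) -> bool:
    return outputs["class"] == reference_outputs["label"]


rows = run_local_eval(
    keyword_classifier,
    [
        {"inputs": {"text": "Shut up, idiot"}, "outputs": {"label": "Toxic"}},
        {"inputs": {"text": "Nobody likes you"}, "outputs": {"label": "Toxic"}},
    ],
    [correct],
)
accuracy = sum(row["scores"]["correct"] for row in rows) / len(rows)
```

A dry run like this can be a cheap way to check that the target and evaluators agree on key names before running a real experiment.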

## Explore the results

Each invocation of `evaluate()` creates an [Experiment](/langsmith/evaluation-concepts#experiments) which can be viewed in the LangSmith UI or queried via the SDK. Evaluation scores are stored against each actual output as feedback.

*If you've annotated your code for tracing, you can open the trace of each row in a side panel view.*

<img src="https://mintcdn.com/langchain-5e9cc07a/1RIJxfRpkszanJLL/langsmith/images/view-experiment.gif?s=252d96dbd2100a691f1d3b61716fde38" alt="" data-og-width="1132" width="1132" data-og-height="720" height="720" data-path="langsmith/images/view-experiment.gif" data-optimize="true" data-opv="3" />

## Reference code

<Accordion title="Click to see a consolidated code snippet">
  <CodeGroup>
    ```python Python theme={null}
    from langsmith import Client, traceable, wrappers
    from openai import OpenAI

    # Step 1. Define an application
    oai_client = wrappers.wrap_openai(OpenAI())

    @traceable
    def toxicity_classifier(inputs: dict) -> str:
        system = (
          "Please review the user query below and determine if it contains any form of toxic behavior, "
          "such as insults, threats, or highly negative comments. Respond with 'Toxic' if it does "
          "and 'Not toxic' if it doesn't."
        )
        messages = [
            {"role": "system", "content": system},
            {"role": "user", "content": inputs["text"]},
        ]
        result = oai_client.chat.completions.create(
            messages=messages, model="gpt-4o-mini", temperature=0
        )
        return result.choices[0].message.content

    # Step 2. Create a dataset
    ls_client = Client()
    dataset = ls_client.create_dataset(dataset_name="Toxic Queries")
    examples = [
      {
        "inputs": {"text": "Shut up, idiot"},
        "outputs": {"label": "Toxic"},
      },
      {
        "inputs": {"text": "You're a wonderful person"},
        "outputs": {"label": "Not toxic"},
      },
      {
        "inputs": {"text": "This is the worst thing ever"},
        "outputs": {"label": "Toxic"},
      },
      {
        "inputs": {"text": "I had a great day today"},
        "outputs": {"label": "Not toxic"},
      },
      {
        "inputs": {"text": "Nobody likes you"},
        "outputs": {"label": "Toxic"},
      },
      {
        "inputs": {"text": "This is unacceptable. I want to speak to the manager."},
        "outputs": {"label": "Not toxic"},
      },
    ]
    ls_client.create_examples(
      dataset_id=dataset.id,
      examples=examples,
    )

    # Step 3. Define an evaluator
    def correct(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
        return outputs["output"] == reference_outputs["label"]

    # Step 4. Run the evaluation
    # Client.evaluate() and evaluate() behave the same.
    results = ls_client.evaluate(
        toxicity_classifier,
        data=dataset.name,
        evaluators=[correct],
        experiment_prefix="gpt-4o-mini, simple",  # optional, experiment name prefix
        description="Testing the baseline system.",  # optional, experiment description
        max_concurrency=4,  # optional, add concurrency
    )
    ```

    ```typescript TypeScript theme={null}
    import { OpenAI } from "openai";
    import { Client } from "langsmith";
    import { evaluate, EvaluationResult } from "langsmith/evaluation";
    import type { Run, Example } from "langsmith/schemas";
    import { traceable } from "langsmith/traceable";
    import { wrapOpenAI } from "langsmith/wrappers";

    const oaiClient = wrapOpenAI(new OpenAI());

    const toxicityClassifier = traceable(
      async (text: string) => {
        const result = await oaiClient.chat.completions.create({
          messages: [
            {
              role: "system",
              content: "Please review the user query below and determine if it contains any form of toxic behavior, such as insults, threats, or highly negative comments. Respond with 'Toxic' if it does, and 'Not toxic' if it doesn't.",
            },
            { role: "user", content: text },
          ],
          model: "gpt-4o-mini",
          temperature: 0,
        });
        return result.choices[0].message.content;
      },
      { name: "toxicityClassifier" }
    );

    const langsmith = new Client();

    // create a dataset
    const labeledTexts = [
      ["Shut up, idiot", "Toxic"],
      ["You're a wonderful person", "Not toxic"],
      ["This is the worst thing ever", "Toxic"],
      ["I had a great day today", "Not toxic"],
      ["Nobody likes you", "Toxic"],
      ["This is unacceptable. I want to speak to the manager.", "Not toxic"],
    ];

    const [inputs, outputs] = labeledTexts.reduce<
      [Array<{ input: string }>, Array<{ outputs: string }>]
    >(
      ([inputs, outputs], item) => [
        [...inputs, { input: item[0] }],
        [...outputs, { outputs: item[1] }],
      ],
      [[], []]
    );

    const datasetName = "Toxic Queries";
    const toxicDataset = await langsmith.createDataset(datasetName);
    await langsmith.createExamples({ inputs, outputs, datasetId: toxicDataset.id });

    // Row-level evaluator
    function correct({
      outputs,
      referenceOutputs,
    }: {
      outputs: Record<string, any>;
      referenceOutputs?: Record<string, any>;
    }): EvaluationResult {
      const score = outputs.output === referenceOutputs?.outputs;
      return { key: "correct", score };
    }

    await evaluate((inputs) => toxicityClassifier(inputs["input"]), {
      data: datasetName,
      evaluators: [correct],
      experimentPrefix: "gpt-4o-mini, simple",  // optional, experiment name prefix
      maxConcurrency: 4, // optional, add concurrency
    });
    ```
  </CodeGroup>
</Accordion>

## Related

* [Run an evaluation asynchronously](/langsmith/evaluation-async)
* [Run an evaluation via the REST API](/langsmith/run-evals-api-only)
* [Run an evaluation from the prompt playground](/langsmith/run-evaluation-from-prompt-playground)



# How to evaluate an application's intermediate steps
Source: https://docs.langchain.com/langsmith/evaluate-on-intermediate-steps



Evaluating the final output of your task is often sufficient, but in some cases you might also want to evaluate the intermediate steps of your pipeline.

For example, for retrieval-augmented generation (RAG), you might want to:

1. Evaluate the retrieval step to ensure that the correct documents are retrieved with respect to the input query.
2. Evaluate the generation step to ensure that the correct answer is generated with respect to the retrieved documents.

In this guide, we will use a simple, fully custom evaluator for criterion 1 and an LLM-based evaluator for criterion 2 to highlight both scenarios.

In order to evaluate the intermediate steps of your pipeline, your evaluator function should traverse and process the `run`/`rootRun` argument, which is a `Run` object that contains the intermediate steps of your pipeline.
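
As a rough illustration of that traversal, the sketch below searches a tree of runs by name. `FakeRun` is a hypothetical stand-in that carries only the fields this guide relies on (`name`, `inputs`, `outputs`, `child_runs`); it is not the real `Run` class. The root name `qa_wrapper` is likewise an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeRun:
    """Hypothetical stand-in exposing only the run fields used in this guide."""
    name: str
    inputs: dict = field(default_factory=dict)
    outputs: dict = field(default_factory=dict)
    child_runs: list = field(default_factory=list)


def find_run(root: FakeRun, name: str) -> Optional[FakeRun]:
    """Breadth-first search for the first run called `name`."""
    queue = [root]
    while queue:
        current = queue.pop(0)
        if current.name == name:
            return current
        queue.extend(current.child_runs or [])
    return None


retrieve_run = FakeRun(
    "retrieve",
    inputs={"query": "LangChain"},
    outputs={"output": [{"page_content": "LangChain is a framework."}]},
)
root = FakeRun(
    "qa_wrapper",
    child_runs=[
        FakeRun("qa_pipeline", child_runs=[FakeRun("generate_wiki_search"), retrieve_run])
    ],
)
found = find_run(root, "retrieve")
```

Searching by name rather than by position keeps the evaluator working if steps are added to or reordered within the pipeline.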

## 1. Define your LLM pipeline

The RAG pipeline below consists of three steps: 1) generating a Wikipedia query from the input question, 2) retrieving relevant documents from Wikipedia, and 3) generating an answer from the retrieved documents.

<CodeGroup>
  ```bash Python theme={null}
  pip install -U langsmith "langchain[openai]" wikipedia
  ```

  ```bash TypeScript theme={null}
  yarn add langsmith langchain @langchain/openai wikipedia
  ```
</CodeGroup>

Requires `langsmith>=0.3.13`

<CodeGroup>
  ```python Python theme={null}
  import wikipedia as wp
  from openai import OpenAI
  from langsmith import traceable, wrappers

  oai_client = wrappers.wrap_openai(OpenAI())

  @traceable
  def generate_wiki_search(question: str) -> str:
      """Generate the query to search in wikipedia."""
      instructions = (
          "Generate a search query to pass into wikipedia to answer the user's question. "
          "Return only the search query and nothing more. "
          "This will be passed in directly to the wikipedia search engine."
      )
      messages = [
          {"role": "system", "content": instructions},
          {"role": "user", "content": question}
      ]
      result = oai_client.chat.completions.create(
          messages=messages,
          model="gpt-4o-mini",
          temperature=0,
      )
      return result.choices[0].message.content

  @traceable(run_type="retriever")
  def retrieve(query: str) -> list:
      """Get up to two Wikipedia search results."""
      results = []
      for term in wp.search(query, results=10):
          try:
              page = wp.page(term, auto_suggest=False)
              results.append({
                  "page_content": page.summary,
                  "type": "Document",
                  "metadata": {"url": page.url}
              })
          except wp.DisambiguationError:
              # Skip ambiguous titles and try the next search hit.
              pass
          if len(results) >= 2:
              break
      # Return whatever was found, even if fewer than two pages matched.
      return results

  @traceable
  def generate_answer(question: str, context: str) -> str:
      """Answer the question based on the retrieved information."""
      instructions = f"Answer the user's question based ONLY on the content below:\n\n{context}"
      messages = [
          {"role": "system", "content": instructions},
          {"role": "user", "content": question}
      ]
      result = oai_client.chat.completions.create(
          messages=messages,
          model="gpt-4o-mini",
          temperature=0
      )
      return result.choices[0].message.content

  @traceable
  def qa_pipeline(question: str) -> str:
      """The full pipeline."""
      query = generate_wiki_search(question)
      context = "\n\n".join([doc["page_content"] for doc in retrieve(query)])
      return generate_answer(question, context)
  ```

  ```typescript TypeScript theme={null}
  import OpenAI from "openai";
  import wiki from "wikipedia";
  import { Client } from "langsmith";
  import { traceable } from "langsmith/traceable";
  import { wrapOpenAI } from "langsmith/wrappers";

  const openai = wrapOpenAI(new OpenAI());

  const generateWikiSearch = traceable(
    async (input: { question: string }) => {
      const messages = [
        {
          role: "system" as const,
          content:
            "Generate a search query to pass into Wikipedia to answer the user's question. Return only the search query and nothing more. This will be passed in directly to the Wikipedia search engine.",
        },
        { role: "user" as const, content: input.question },
      ];
      const chatCompletion = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: messages,
        temperature: 0,
      });
      return chatCompletion.choices[0].message.content ?? "";
    },
    { name: "generateWikiSearch" }
  );

  const retrieve = traceable(
    async (input: { query: string; numDocuments: number }) => {
      const { results } = await wiki.search(input.query, { limit: 10 });
      const finalResults: Array<{
        page_content: string;
        type: "Document";
        metadata: { url: string };
      }> = [];
      for (const result of results) {
        if (finalResults.length >= input.numDocuments) {
          // Just return the top 2 pages for now
          break;
        }
        const page = await wiki.page(result.title, { autoSuggest: false });
        const summary = await page.summary();
        finalResults.push({
          page_content: summary.extract,
          type: "Document",
          metadata: { url: page.fullurl },
        });
      }
      return finalResults;
    },
    { name: "retrieve", run_type: "retriever" }
  );

  const generateAnswer = traceable(
    async (input: { question: string; context: string }) => {
      const messages = [
        {
          role: "system" as const,
          content: `Answer the user's question based only on the content below:\n\n${input.context}`,
        },
        { role: "user" as const, content: input.question },
      ];
      const chatCompletion = await openai.chat.completions.create({
        model: "gpt-4o-mini",
        messages: messages,
        temperature: 0,
      });
      return chatCompletion.choices[0].message.content ?? "";
    },
    { name: "generateAnswer" }
  );

  const ragPipeline = traceable(
    async ({ question }: { question: string }, numDocuments: number = 2) => {
      const query = await generateWikiSearch({ question });
      const retrieverResults = await retrieve({ query, numDocuments });
      const context = retrieverResults
        .map((result) => result.page_content)
        .join("\n\n");
      const answer = await generateAnswer({ question, context });
      return answer;
    },
    { name: "ragPipeline" }
  );
  ```
</CodeGroup>

This pipeline will produce a trace that looks something like: <img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=3b691ca56f9d60035dcba2c248692fa1" alt="evaluation_intermediate_trace.png" data-og-width="2586" width="2586" data-og-height="1676" height="1676" data-path="langsmith/images/evaluation-intermediate-trace.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=23a9b9abdb3e43e0f6326f0d4293ab7d 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=4b771691bd1afffe9f371a105f7eaebe 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=beb01776de9a5fa663c82d4380bc78cd 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=ec84bd3345df3d2cef38878b902c355b 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=043a7f22da6b158e070c853d67bacd69 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-trace.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=42a9ea799157fee30a6b243c02615a02 2500w" />

## 2. Create a dataset and examples to evaluate the pipeline

We are building a very simple dataset with a couple of examples to evaluate the pipeline.

Requires `langsmith>=0.3.13`

<CodeGroup>
  ```python Python theme={null}
  from langsmith import Client

  ls_client = Client()
  dataset_name = "Wikipedia RAG"

  if not ls_client.has_dataset(dataset_name=dataset_name):
      dataset = ls_client.create_dataset(dataset_name=dataset_name)
      examples = [
        {"inputs": {"question": "What is LangChain?"}},
        {"inputs": {"question": "What is LangSmith?"}},
      ]
      ls_client.create_examples(
        dataset_id=dataset.id,
        examples=examples,
      )
  ```

  ```typescript TypeScript theme={null}
  import { Client } from "langsmith";

  const client = new Client();
  const examples = [
    [
      "What is LangChain?",
      "LangChain is an open-source framework for building applications using large language models.",
    ],
    [
      "What is LangSmith?",
      "LangSmith is an observability and evaluation tool for LLM products, built by LangChain Inc.",
    ],
  ];
  const datasetName = "Wikipedia RAG";
  const inputs = examples.map(([input, _]) => ({ input }));
  const outputs = examples.map(([_, expected]) => ({ expected }));
  const dataset = await client.createDataset(datasetName);
  await client.createExamples({ datasetId: dataset.id, inputs, outputs });
  ```
</CodeGroup>

## 3. Define your custom evaluators

As mentioned above, we will define two evaluators: one that evaluates the relevance of the retrieved documents with respect to the input query, and another that checks whether the generated answer is grounded in the retrieved documents (that is, free of hallucination). We will use LangChain LLM wrappers, along with [`with_structured_output`](https://python.langchain.com/v0.1/docs/modules/model_io/chat/structured_output/), to define the hallucination evaluator.

The key here is that the evaluator function should traverse the `run` / `rootRun` argument to access the intermediate steps of the pipeline. The evaluator can then process the inputs and outputs of the intermediate steps to evaluate according to the desired criteria.

The example uses `langchain` for convenience; this is not required.

<CodeGroup>
  ```python Python theme={null}
  from langchain.chat_models import init_chat_model
  from langsmith.schemas import Run
  from pydantic import BaseModel, Field

  def document_relevance(run: Run) -> bool:
      """Checks if retriever input exists in the retrieved docs."""
      qa_pipeline_run = next(
          r for r in run.child_runs if r.name == "qa_pipeline"
      )
      retrieve_run = next(
          r for r in qa_pipeline_run.child_runs if r.name == "retrieve"
      )
      page_contents = "\n\n".join(
          doc["page_content"] for doc in retrieve_run.outputs["output"]
      )
      return retrieve_run.inputs["query"] in page_contents

  # Data model
  class GradeHallucinations(BaseModel):
      """Binary score for hallucination present in generation answer."""
      is_grounded: bool = Field(..., description="True if the answer is grounded in the facts, False otherwise.")

  # LLM with structured output for grading hallucinations
  # For more see: https://python.langchain.com/docs/how_to/structured_output/
  grader_llm = init_chat_model("gpt-4o-mini", temperature=0).with_structured_output(
      GradeHallucinations,
      method="json_schema",
      strict=True,
  )

  def no_hallucination(run: Run) -> bool:
      """Check if the answer is grounded in the documents.
      Return True if there is no hallucination, False otherwise.
      """
      # Get documents and answer
      qa_pipeline_run = next(
          r for r in run.child_runs if r.name == "qa_pipeline"
      )
      retrieve_run = next(
          r for r in qa_pipeline_run.child_runs if r.name == "retrieve"
      )
      retrieved_content = "\n\n".join(
          doc["page_content"] for doc in retrieve_run.outputs["output"]
      )

      # Construct prompt
      instructions = (
          "You are a grader assessing whether an LLM generation is grounded in / "
          "supported by a set of retrieved facts. Give a binary score 1 or 0, "
          "where 1 means that the answer is grounded in / supported by the set of facts."
      )
      messages = [
          {"role": "system", "content": instructions},
          {"role": "user", "content": f"Set of facts:\n{retrieved_content}\n\nLLM generation: {run.outputs['answer']}"},
      ]
      grade = grader_llm.invoke(messages)
      return grade.is_grounded
  ```

  ```typescript TypeScript theme={null}
  import { EvaluationResult } from "langsmith/evaluation";
  import { Run, Example } from "langsmith/schemas";
  import { ChatPromptTemplate } from "@langchain/core/prompts";
  import { ChatOpenAI } from "@langchain/openai";
  import { z } from "zod";

  function findNestedRun(run: Run, search: (run: Run) => boolean): Run | null {
    const queue: Run[] = [run];
    while (queue.length > 0) {
      const currentRun = queue.shift()!;
      if (search(currentRun)) return currentRun;
      queue.push(...(currentRun.child_runs ?? []));
    }
    return null;
  }

  // A very simple evaluator that checks to see if the input of the retrieval step exists
  // in the retrieved docs.
  function documentRelevance(rootRun: Run, example: Example): EvaluationResult {
    const retrieveRun = findNestedRun(rootRun, (run) => run.name === "retrieve");
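    const docs: Array<{ page_content: string }> | undefined =
      retrieveRun?.outputs?.outputs;
    const pageContents = docs?.map((doc) => doc.page_content).join("\n\n") ?? "";
    const query: string | undefined = retrieveRun?.inputs?.query;
    // Score false when the retrieve run or its query is missing
    const score = query !== undefined && pageContents.includes(query);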
    return { key: "simple_document_relevance", score };
  }

  async function hallucination(
    rootRun: Run,
    example: Example
  ): Promise<EvaluationResult> {
    const rag = findNestedRun(rootRun, (run) => run.name === "ragPipeline");
    const retrieve = findNestedRun(rootRun, (run) => run.name === "retrieve");
    const docs: Array<{ page_content: string }> | undefined =
      retrieve?.outputs?.outputs;
    const documents = docs?.map((doc) => doc.page_content).join("\n\n");

    const prompt = ChatPromptTemplate.fromMessages<{
      documents: string;
      generation: string;
    }>([
      [
        "system",
        [
          `You are a grader assessing whether an LLM generation is grounded in / supported by a set of retrieved facts. \n`,
          `Give a binary score 1 or 0, where 1 means that the answer is grounded in / supported by the set of facts.`,
        ].join("\n"),
      ],
      [
        "human",
        "Set of facts: \n\n {documents} \n\n LLM generation: {generation}",
      ],
    ]);

    const llm = new ChatOpenAI({
      model: "gpt-4o-mini",
      temperature: 0,
    }).withStructuredOutput(
      z
        .object({
          binary_score: z
            .number()
            .describe("Answer is grounded in the facts, 1 or 0"),
        })
        .describe("Binary score for hallucination present in generation answer.")
    );

    const grader = prompt.pipe(llm);
    const score = await grader.invoke({
      documents,
      generation: rag?.outputs?.outputs,
    });
    return { key: "answer_hallucination", score: score.binary_score };
  }
  ```
</CodeGroup>
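
The Python evaluators above walk exactly two levels of `child_runs`, which breaks if the nesting of the pipeline changes. A breadth-first helper in the spirit of the TypeScript `findNestedRun` is one way to make the lookup independent of depth. The sketch below is a hypothetical helper, not part of the `langsmith` SDK: it only assumes that each run exposes `name` and `child_runs` attributes, and it uses `SimpleNamespace` stand-ins so it can run without a real trace.

```python
from collections import deque
from types import SimpleNamespace
from typing import Any, Callable, Optional


def find_nested_run(run: Any, search: Callable[[Any], bool]) -> Optional[Any]:
    """Breadth-first search over a run tree; returns the first match or None."""
    queue = deque([run])
    while queue:
        current = queue.popleft()
        if search(current):
            return current
        # child_runs may be None or absent when nested runs were not loaded
        queue.extend(getattr(current, "child_runs", None) or [])
    return None


# Stand-in run tree for illustration (real evaluators receive Run objects)
retrieve = SimpleNamespace(name="retrieve", child_runs=None)
pipeline = SimpleNamespace(name="qa_pipeline", child_runs=[retrieve])
root = SimpleNamespace(name="qa_wrapper", child_runs=[pipeline])

found = find_nested_run(root, lambda r: r.name == "retrieve")
missing = find_nested_run(root, lambda r: r.name == "rerank")
```

Because the helper returns `None` when nothing matches, an evaluator built on it should decide explicitly how to score a trace that lacks the expected step.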

## 4. Evaluate the pipeline

Finally, we'll run `evaluate` with the custom evaluators defined above.

<CodeGroup>
  ```python Python theme={null}
  def qa_wrapper(inputs: dict) -> dict:
      """Wrap the qa_pipeline so it can accept the Example.inputs dict as input."""
      return {"answer": qa_pipeline(inputs["question"])}

  experiment_results = ls_client.evaluate(
      qa_wrapper,
      data=dataset_name,
      evaluators=[document_relevance, no_hallucination],
      experiment_prefix="rag-wiki-oai"
  )
  ```

  ```typescript TypeScript theme={null}
  import { evaluate } from "langsmith/evaluation";

  await evaluate((inputs) => ragPipeline({ question: inputs.input }), {
    data: datasetName,
    evaluators: [hallucination, documentRelevance],
    experimentPrefix: "rag-wiki-oai",
  });
  ```
</CodeGroup>

The experiment will contain the results of the evaluation, including the scores and comments from the evaluators: <img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=e926744573c6b9757ba22ff245a3da2c" alt="evaluation_intermediate_experiment.png" data-og-width="2446" width="2446" data-og-height="1244" height="1244" data-path="langsmith/images/evaluation-intermediate-experiment.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=7b6e321b15a06b2adc7f1cacb8e07a35 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=c677007bcc1e2af4b3767d6b44fcb327 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=a39153399b6721b7c51693f5a59cf2b0 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=1132228eba6761a724ae98d85fcf536c 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=5d74785384737df0cf67145b397b1934 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/evaluation-intermediate-experiment.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=bef00e4bdc12289d9f1e4b77ed8489cf 2500w" />

## Related

* [Evaluate a `langgraph` graph](/langsmith/evaluate-on-intermediate-steps)



# How to run a pairwise evaluation
Source: https://docs.langchain.com/langsmith/evaluate-pairwise



<Info>
  Concept: [Pairwise evaluations](/langsmith/evaluation-concepts#pairwise)
</Info>

LangSmith supports evaluating **existing** experiments in a comparative manner. Instead of evaluating one output at a time, you can score the output from multiple experiments against each other. In this guide, you'll use [`evaluate()`](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate) with two existing experiments to [define an evaluator](#define-a-pairwise-evaluator) and [run a pairwise evaluation](#run-a-pairwise-evaluation). Finally, you'll use the LangSmith UI to [view the pairwise experiments](#view-pairwise-experiments).

## Prerequisites

* If you haven't already created experiments to compare, check out the [quick start](/langsmith/evaluation-quickstart) or the [how-to guide](/langsmith/evaluate-llm-application) to get started with evaluations.
* This guide requires `langsmith` Python version `>=0.2.0` or JS version `>=0.2.9`.

<Info>
  You can also use [`evaluate_comparative()`](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate_comparative) with more than two existing experiments.
</Info>

## `evaluate()` comparative args

At its simplest, the `evaluate` / `aevaluate` function takes the following arguments:

| Argument     | Description                                                                                                                        |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------------- |
| `target`     | A list of the two **existing experiments** you would like to evaluate against each other. These can be UUIDs or experiment names.  |
| `evaluators` | A list of the pairwise evaluators that you would like to attach to this evaluation. See the section below for how to define these. |

Along with these, you can also pass in the following optional args:

| Argument                                 | Description                                                                                                                                                                                                                                                                                                                                                                    |
| ---------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `randomize_order` / `randomizeOrder`     | An optional boolean indicating whether the order of the outputs should be randomized for each evaluation. This is a strategy for minimizing positional bias in your prompt: often, the LLM will be biased towards one of the responses based on the order. This should mainly be addressed via prompt engineering, but this is another optional mitigation. Defaults to False. |
| `experiment_prefix` / `experimentPrefix` | A prefix to be attached to the beginning of the pairwise experiment name. Defaults to None.                                                                                                                                                                                                                                                                                    |
| `description`                            | A description of the pairwise experiment. Defaults to None.                                                                                                                                                                                                                                                                                                                    |
| `max_concurrency` / `maxConcurrency`     | The maximum number of concurrent evaluations to run. Defaults to 5.                                                                                                                                                                                                                                                                                                            |
| `client`                                 | The LangSmith client to use. Defaults to None.                                                                                                                                                                                                                                                                                                                                 |
| `metadata`                               | Metadata to attach to your pairwise experiment. Defaults to None.                                                                                                                                                                                                                                                                                                              |
| `load_nested` / `loadNested`             | Whether to load all child runs for the experiment. When False, only the root trace will be passed to your evaluator. Defaults to False.                                                                                                                                                                                                                                        |
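
To make the idea behind `randomize_order` concrete, the sketch below shows one way order randomization can work in principle: shuffle the pair before the judge sees it, then map the scores back to the original positions. This is an illustration only, not LangSmith's internal implementation, and `judge_in_random_order`, `prefers_longer`, and `prefers_first` are hypothetical names.

```python
import random
from typing import Callable, List, Sequence


def judge_in_random_order(
    judge: Callable[[Sequence[str]], List[int]],
    answers: Sequence[str],
    rng: random.Random,
) -> List[int]:
    """Show two answers to `judge` in random order; return scores in the original order."""
    order = [0, 1]
    rng.shuffle(order)
    shuffled_scores = judge([answers[i] for i in order])
    scores = [0, 0]
    for position, original_index in enumerate(order):
        scores[original_index] = shuffled_scores[position]
    return scores


def prefers_longer(pair: Sequence[str]) -> List[int]:
    """Hypothetical order-independent judge: the longer answer wins."""
    return [1, 0] if len(pair[0]) > len(pair[1]) else [0, 1]


def prefers_first(pair: Sequence[str]) -> List[int]:
    """Hypothetical judge with pure positional bias: the first answer always wins."""
    return [1, 0]


answers = ["short", "a much longer answer"]
unbiased = {
    tuple(judge_in_random_order(prefers_longer, answers, random.Random(seed)))
    for seed in range(20)
}
biased = {
    tuple(judge_in_random_order(prefers_first, answers, random.Random(seed)))
    for seed in range(20)
}
```

An order-independent judge gives the same scores no matter how the pair is shuffled, whereas the scores of a position-biased judge depend on the shuffle. Randomizing spreads that bias across both experiments instead of systematically favoring one.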

## Define a pairwise evaluator

Pairwise evaluators are just functions with an expected signature.

### Evaluator args

Custom evaluator functions must have specific argument names. They can take any subset of the following arguments:

* `inputs: dict`: A dictionary of the inputs corresponding to a single example in a dataset.
* `outputs: list[dict]`: A two-item list of the dict outputs produced by each experiment on the given inputs.
* `reference_outputs` / `referenceOutputs: dict`: A dictionary of the reference outputs associated with the example, if available.
* `runs: list[Run]`: A two-item list of the full [Run](/langsmith/run-data-format) objects generated by the two experiments on the given example. Use this if you need access to intermediate steps or metadata about each run.
* `example: Example`: The full dataset [Example](/langsmith/example-data-format), including the example inputs, outputs (if available), and metadata (if available).

For most use cases you'll only need `inputs`, `outputs`, and `reference_outputs` / `referenceOutputs`. `runs` and `example` are useful only if you need some extra trace or example metadata outside of the actual inputs and outputs of the application.
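
As a minimal sketch of the argument names above, the evaluator below takes `inputs`, `outputs`, and `reference_outputs` and returns a two-item list of scores, one per experiment. The `answer` key and the exact-match criterion are assumptions chosen for illustration; adapt them to your dataset schema.

```python
def pairwise_exact_match(inputs: dict, outputs: list, reference_outputs: dict) -> list:
    """Score each experiment 1 if its answer matches the reference answer, else 0."""
    # Assumes outputs and reference outputs both store the answer under 'answer'
    expected = reference_outputs.get("answer", "").strip().lower()
    return [
        int(out.get("answer", "").strip().lower() == expected)
        for out in outputs
    ]


scores = pairwise_exact_match(
    inputs={"question": "What is 2 + 2?"},
    outputs=[{"answer": "4"}, {"answer": "5"}],
    reference_outputs={"answer": "4"},
)
```

Here `inputs` is accepted but unused; since evaluators may take any subset of the supported arguments, it could equally be dropped from the signature.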

### Evaluator output

Custom evaluators are expected to return one of the following types:

Python and JS/TS

* `dict`: dictionary with keys:

  * `key`, which represents the feedback key that will be logged
  * `scores`, which is a mapping from run ID to score for that run.
  * `comment`, which is a string. Most commonly used for model reasoning.

Currently Python only

* `list[int | float | bool]`: a two-item list of scores. The list is assumed to have the same order as the `runs` / `outputs` evaluator args. The evaluator function name is used for the feedback key.

Note that you should choose a feedback key that is distinct from standard feedbacks on your run. We recommend prefixing pairwise feedback keys with `pairwise_` or `ranked_`.
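
The dictionary form can be sketched as follows. The evaluator is a toy that prefers the shorter answer, and the `answer` key and `SimpleNamespace` stand-ins for runs are assumptions made so the example runs without LangSmith; a real evaluator would receive `Run` objects and read their `id`.

```python
from types import SimpleNamespace


def pairwise_conciseness(runs: list, outputs: list) -> dict:
    """Prefer the shorter answer and return scores keyed by run ID."""
    len_a = len(outputs[0].get("answer", ""))
    len_b = len(outputs[1].get("answer", ""))
    if len_a < len_b:
        pair_scores = [1, 0]
    elif len_b < len_a:
        pair_scores = [0, 1]
    else:
        pair_scores = [0, 0]  # tie
    return {
        "key": "pairwise_conciseness",  # prefixed to avoid clashing with standard feedback
        "scores": {run.id: score for run, score in zip(runs, pair_scores)},
        "comment": f"Answer lengths: {len_a} vs {len_b} characters.",
    }


# Stand-ins for the two runs produced by each experiment on one example
runs = [SimpleNamespace(id="run-a"), SimpleNamespace(id="run-b")]
outputs = [{"answer": "Paris."}, {"answer": "The capital of France is Paris."}]
result = pairwise_conciseness(runs, outputs)
```

Keying `scores` by run ID keeps each score tied to the run that earned it, independent of the order in which the pair is listed.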

## Run a pairwise evaluation

The following example uses [a prompt](https://smith.langchain.com/hub/langchain-ai/pairwise-evaluation-2) which asks the LLM to decide which is better between two AI assistant responses. It uses structured output to parse the AI's response: 0, 1, or 2.

<Info>
  In the Python example below, we are pulling [this structured prompt](https://smith.langchain.com/hub/langchain-ai/pairwise-evaluation-2) from the [LangChain Hub](/langsmith/manage-prompts#public-prompt-hub) and using it with a LangChain chat model wrapper.

  **Usage of LangChain is totally optional.** To illustrate this point, the TypeScript example uses the OpenAI SDK directly.
</Info>

* Python: Requires `langsmith>=0.2.0`
* TypeScript: Requires `langsmith>=0.2.9`

<CodeGroup>
  ```python Python theme={null}
  from langchain_classic import hub
  from langchain.chat_models import init_chat_model
  from langsmith import evaluate

  # See the prompt: https://smith.langchain.com/hub/langchain-ai/pairwise-evaluation-2
  prompt = hub.pull("langchain-ai/pairwise-evaluation-2")
  model = init_chat_model("gpt-4o")
  chain = prompt | model

  def ranked_preference(inputs: dict, outputs: list[dict]) -> list:
      # Assumes example inputs have a 'question' key and experiment
      # outputs have an 'answer' key.
      response = chain.invoke({
          "question": inputs["question"],
          "answer_a": outputs[0].get("answer", "N/A"),
          "answer_b": outputs[1].get("answer", "N/A"),
      })
      if response["Preference"] == 1:
          scores = [1, 0]
      elif response["Preference"] == 2:
          scores = [0, 1]
      else:
          scores = [0, 0]
      return scores

  evaluate(
      ("experiment-1", "experiment-2"),  # Replace with the names/IDs of your experiments
      evaluators=[ranked_preference],
      randomize_order=True,
      max_concurrency=4,
  )
  ```

  ```typescript TypeScript theme={null}
  import { evaluate } from "langsmith/evaluation";
  import { Run } from "langsmith/schemas";
  import { wrapOpenAI } from "langsmith/wrappers";
  import OpenAI from "openai";
  import { z } from "zod";

  const openai = wrapOpenAI(new OpenAI());

  async function rankedPreference({
    inputs,
    runs,
  }: {
    inputs: Record<string, any>;
    runs: Run[];
  }) {
    const scores: Record<string, number> = {};
    const [runA, runB] = runs;
    if (!runA || !runB) throw new Error("Expected at least two runs");

    const payload = {
      question: inputs.question,
      answer_a: runA?.outputs?.output ?? "N/A",
      answer_b: runB?.outputs?.output ?? "N/A",
    };

    const output = await openai.chat.completions.create({
      model: "gpt-4-turbo",
      messages: [
        {
          role: "system",
          content: [
            "Please act as an impartial judge and evaluate the quality of the responses provided by two AI assistants to the user question displayed below.",
            "You should choose the assistant that follows the user's instructions and answers the user's question better.",
            "Your evaluation should consider factors such as the helpfulness, relevance, accuracy, depth, creativity, and level of detail of their responses.",
            "Begin your evaluation by comparing the two responses and provide a short explanation.",
            "Avoid any position biases and ensure that the order in which the responses were presented does not influence your decision.",
            "Do not allow the length of the responses to influence your evaluation. Do not favor certain names of the assistants. Be as objective as possible.",
          ].join(" "),
        },
        {
          role: "user",
          content: [
            `[User Question] ${payload.question}`,
            `[The Start of Assistant A's Answer] ${payload.answer_a} [The End of Assistant A's Answer]`,
            `[The Start of Assistant B's Answer] ${payload.answer_b} [The End of Assistant B's Answer]`,
          ].join("\n\n"),
        },
      ],
      tool_choice: {
        type: "function",
        function: { name: "Score" },
      },
      tools: [
        {
          type: "function",
          function: {
            name: "Score",
            description: [
              `After providing your explanation, output your final verdict by strictly following this format:`,
              `Output "1" if Assistant A answer is better based upon the factors above.`,
              `Output "2" if Assistant B answer is better based upon the factors above.`,
              `Output "0" if it is a tie.`,
            ].join(" "),
            parameters: {
              type: "object",
              properties: {
                Preference: {
                  type: "integer",
                  description: "Which assistant answer is preferred?",
                },
              },
            },
          },
        },
      ],
    });

    const { Preference } = z
      .object({ Preference: z.number() })
      .parse(
        JSON.parse(output.choices[0].message.tool_calls[0].function.arguments)
      );

    if (Preference === 1) {
      scores[runA.id] = 1;
      scores[runB.id] = 0;
    } else if (Preference === 2) {
      scores[runA.id] = 0;
      scores[runB.id] = 1;
    } else {
      scores[runA.id] = 0;
      scores[runB.id] = 0;
    }

    return { key: "ranked_preference", scores };
  }

  await evaluate(["earnest-name-40", "reflecting-pump-91"], {
    evaluators: [rankedPreference],
  });
  ```
</CodeGroup>

## View pairwise experiments

Navigate to the "Pairwise Experiments" tab from the dataset page:

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=dddf35fd971055d0d94ae4184c91dea3" alt="Pairwise Experiments Tab" data-og-width="3454" width="3454" data-og-height="1912" height="1912" data-path="langsmith/images/pairwise-from-dataset.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=4c1677867b832da9c3b4338a210570f8 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=80d4795bc999156850eb8092e8267c9f 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=ed2e5fb624828fb649bf33473e7dc797 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=05c2248284b4efb2f5a9f38cffef0b9b 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=3014131b2f5ae730aa354afaa7312316 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-from-dataset.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=38c4f707158930cf7d1d155db4021362 2500w" />

Click on a pairwise experiment that you would like to inspect, and you will be brought to the Comparison View:

<img src="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=8afa7467faf707c0bb5ede23b007beda" alt="Pairwise Comparison View" data-og-width="3430" width="3430" data-og-height="1886" height="1886" data-path="langsmith/images/pairwise-comparison-view.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?w=280&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=9a837cee527a1bf5dda5a77b8ce16ba6 280w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?w=560&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=ebb39f8f2fb7a542d2273cfc64c5b4f4 560w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?w=840&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=2f2de8c570a3e6401ba0220da343b3e0 840w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?w=1100&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=cf612b4c6938e856b78c7476f8cc6304 1100w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?w=1650&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=ff05d71cd12f19d0403e6a1e3e64609a 1650w, https://mintcdn.com/langchain-5e9cc07a/H9jA2WRyA-MV4-H0/langsmith/images/pairwise-comparison-view.png?w=2500&fit=max&auto=format&n=H9jA2WRyA-MV4-H0&q=85&s=5674d6403a3070935830983b9e36ac2f 2500w" />

You may filter to runs where the first experiment was better or vice versa by clicking the thumbs up/thumbs down buttons in the table header:

<img src="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=677c48099cee9848d2119c154c7b0d88" alt="Pairwise Filtering" data-og-width="3454" width="3454" data-og-height="1914" height="1914" data-path="langsmith/images/filter-pairwise.png" data-optimize="true" data-opv="3" srcset="https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?w=280&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=1ceff9156ccfdb48f246f41c7e0d16ab 280w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?w=560&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=745f3dc2bed9e3e2d8333df0ff57a43e 560w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?w=840&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=e81f10f544953ee39366866c1f4a5d71 840w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?w=1100&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=3f4af39e4da50d0ad081d03aaf7b238e 1100w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?w=1650&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=63688564868c7a0989c429c6e740e014 1650w, https://mintcdn.com/langchain-5e9cc07a/0B2PFrFBMRWNccee/langsmith/images/filter-pairwise.png?w=2500&fit=max&auto=format&n=0B2PFrFBMRWNccee&q=85&s=527537f492665790aa8380e0d75e7fb3 2500w" />



# Evaluate a RAG application
Source: https://docs.langchain.com/langsmith/evaluate-rag-tutorial



<Info>
  [RAG evaluation](/langsmith/evaluation-concepts#retrieval-augmented-generation-rag) | [Evaluators](/langsmith/evaluation-concepts#evaluators) | [LLM-as-judge evaluators](/langsmith/evaluation-concepts#llm-as-judge)
</Info>

Retrieval Augmented Generation (RAG) is a technique that enhances Large Language Models (LLMs) by providing them with relevant external knowledge. It has become one of the most widely used approaches for building LLM applications.

This tutorial will show you how to evaluate your RAG applications using LangSmith. You'll learn:

1. How to create test datasets
2. How to run your RAG application on those datasets
3. How to measure your application's performance using different evaluation metrics

## Overview

A typical RAG evaluation workflow consists of three main steps:

1. Creating a dataset with questions and their expected answers

2. Running your RAG application on those questions

3. Using evaluators to measure how well your application performed, looking at factors like:

   * Answer relevance
   * Answer accuracy
   * Retrieval quality

For this tutorial, we'll create and evaluate a bot that answers questions about a few of [Lilian Weng's](https://lilianweng.github.io/) insightful blog posts.
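
Before bringing in LangSmith, the three workflow steps can be sketched in plain Python. Everything here is a toy stand-in chosen for illustration: `toy_app` replaces a real RAG application, and `correctness` is a simple exact-match check rather than one of the evaluators built later in this tutorial.

```python
# 1. A dataset of questions and their expected answers (toy data)
dataset = [
    {"inputs": {"question": "What is 2 + 2?"}, "outputs": {"answer": "4"}},
    {"inputs": {"question": "What is the capital of France?"}, "outputs": {"answer": "Paris"}},
]


def toy_app(inputs: dict) -> dict:
    """Stand-in for a RAG application: answers from a hard-coded lookup."""
    known = {"What is 2 + 2?": "4"}
    return {"answer": known.get(inputs["question"], "I don't know")}


def correctness(outputs: dict, reference_outputs: dict) -> bool:
    """Evaluator: does the application's answer match the expected answer?"""
    return outputs["answer"] == reference_outputs["answer"]


# 2. Run the application on every question, and
# 3. score each output against its reference
results = [correctness(toy_app(ex["inputs"]), ex["outputs"]) for ex in dataset]
accuracy = sum(results) / len(results)
```

The rest of this tutorial follows the same shape, with a LangSmith dataset in place of the list, a traced RAG bot in place of `toy_app`, and LLM-as-judge evaluators in place of the exact-match check.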

## Setup

### Environment

First, let's set our environment variables:

<CodeGroup>
  ```python Python theme={null}
  import os
  os.environ["LANGSMITH_TRACING"] = "true"
  os.environ["LANGSMITH_API_KEY"] = "YOUR LANGSMITH API KEY"
  os.environ["OPENAI_API_KEY"] = "YOUR OPENAI API KEY"
  ```

  ```typescript TypeScript theme={null}
  process.env.LANGSMITH_TRACING = "true";
  process.env.LANGSMITH_API_KEY = "YOUR LANGSMITH API KEY";
  process.env.OPENAI_API_KEY = "YOUR OPENAI API KEY";
  ```
</CodeGroup>

And install the dependencies we'll need:

<CodeGroup>
  ```bash Python theme={null}
  pip install -U langsmith langchain[openai] langchain-community
  ```

  ```bash TypeScript theme={null}
  yarn add langsmith langchain @langchain/community @langchain/openai
  ```
</CodeGroup>

### Application

<Info>
  While this tutorial uses LangChain, the evaluation techniques and LangSmith functionality demonstrated here work with any framework. Feel free to use your preferred tools and libraries.
</Info>

In this section, we'll build a basic Retrieval-Augmented Generation (RAG) application.

We'll stick to a simple implementation with three stages:

* Indexing: chunks and indexes a few of Lilian Weng's blog posts in a vector store.
* Retrieval: retrieves those chunks based on the user question.
* Generation: passes the question and retrieved docs to an LLM.

#### Indexing and retrieval

First, let's load the blog posts we want to build a chatbot for and index them.

<CodeGroup>
  ```python Python theme={null}
  from langchain_community.document_loaders import WebBaseLoader
  from langchain_core.vectorstores import InMemoryVectorStore
  from langchain_openai import OpenAIEmbeddings
  from langchain_text_splitters import RecursiveCharacterTextSplitter

  # List of URLs to load documents from
  urls = [
      "https://lilianweng.github.io/posts/2023-06-23-agent/",
      "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
      "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
  ]

  # Load documents from the URLs
  docs = [WebBaseLoader(url).load() for url in urls]
  docs_list = [item for sublist in docs for item in sublist]

  # Initialize a text splitter with specified chunk size and overlap
  text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
      chunk_size=250, chunk_overlap=0
  )

  # Split the documents into chunks
  doc_splits = text_splitter.split_documents(docs_list)

  # Add the document chunks to the "vector store" using OpenAIEmbeddings
  vectorstore = InMemoryVectorStore.from_documents(
      documents=doc_splits,
      embedding=OpenAIEmbeddings(),
  )

  # With langchain we can easily turn any vector store into a retrieval component:
  retriever = vectorstore.as_retriever(search_kwargs={"k": 6})
  ```

  ```typescript TypeScript theme={null}
  import { OpenAIEmbeddings } from "@langchain/openai";
  import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
  import { BrowserbaseLoader } from "@langchain/community/document_loaders/web/browserbase";
  import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";

  // List of URLs to load documents from
  const urls = [
      "https://lilianweng.github.io/posts/2023-06-23-agent/",
      "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
      "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
  ]

  const loader = new BrowserbaseLoader(urls, {
      textContent: true,
  });

  const docs = await loader.load();

  const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 1000, chunkOverlap: 200
  });

  const allSplits = await splitter.splitDocuments(docs);

  const embeddings = new OpenAIEmbeddings({
      model: "text-embedding-3-large"
  });

  const vectorStore = new MemoryVectorStore(embeddings);  // Index chunks
  await vectorStore.addDocuments(allSplits)
  ```
</CodeGroup>
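
To build intuition for the chunk size and chunk overlap settings used above, here is a minimal character-based sliding-window splitter. It is a simplified stand-in, not the algorithm used by `RecursiveCharacterTextSplitter`, which splits on separators and, when created with `from_tiktoken_encoder`, measures chunk length in tokens rather than characters. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character chunks that share `chunk_overlap` characters."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


overlapping = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
disjoint = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=0)
```

With an overlap of 2, each chunk repeats the last two characters of the previous one, so content that straddles a chunk boundary still appears intact in at least one chunk. With an overlap of 0 the chunks are disjoint.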

#### Generation

We can now define the generative pipeline.

<CodeGroup>
  ```python Python theme={null}
  from langchain_openai import ChatOpenAI
  from langsmith import traceable

  llm = ChatOpenAI(model="gpt-4o", temperature=1)

  # Add decorator so this function is traced in LangSmith
  @traceable()
  def rag_bot(question: str) -> dict:
      # LangChain retriever will be automatically traced
      docs = retriever.invoke(question)
      docs_string = "".join(doc.page_content for doc in docs)
      instructions = f"""You are a helpful assistant who is good at analyzing source information and answering questions.
         Use the following source documents to answer the user's questions.
         If you don't know the answer, just say that you don't know.
         Use three sentences maximum and keep the answer concise.

  Documents:
  {docs_string}"""
      # langchain ChatModel will be automatically traced
      ai_msg = llm.invoke([
              {"role": "system", "content": instructions},
              {"role": "user", "content": question},
          ],
      )
      return {"answer": ai_msg.content, "documents": docs}
  ```

  ```typescript TypeScript theme={null}
  import { ChatOpenAI } from "@langchain/openai";
  import { traceable } from "langsmith/traceable";

  const llm = new ChatOpenAI({
    model: "gpt-4o",
    temperature: 1,
  })

  // Add decorator so this function is traced in LangSmith
  con