This guide walks you through the process of deploying a LangChain application to a cloud endpoint. We’ll use Itura’s free tier to host the application in a serverless environment. The deployment will be connected to a GitHub repository, so any changes to the code will automatically be reflected in the deployment.

What you need

  • About 10 minutes
  • An OpenAI API key
  • A GitHub account

What you will deploy

You will deploy a simple LangChain application to a serverless cloud endpoint. The endpoint will accept POST requests at a URL similar to https://<your-agent-slug>.agent.itura.ai/run. When a POST request is sent to this URL, the LangChain chain will be invoked. You can also add environment variables to the application; these are injected at runtime.

Our starting point is a simple LangChain application that generates a short story based on a character and their job. The key entry point for the application is the main.py file.

main.py
#!/usr/bin/env python
import os
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# 1. Instantiate the OpenAI Chat LLM.
llm = ChatOpenAI(model="gpt-4o-mini")

# 2. Create a simple prompt template.
prompt_template = "Create a 100 character story about a {character} who is a {job}."
prompt = PromptTemplate.from_template(template=prompt_template)

# 3. Create a chain.
chain = prompt | llm


def invoke_chain(inputs):
    if "OPENAI_API_KEY" not in os.environ:
        print("Error: OPENAI_API_KEY environment variable not set.")
        exit()
    try:
        # 4. Invoke the chain
        response = chain.invoke(inputs)
        # 5. Print the result content
        print("LLM-generated story:", response.content)
        return response.content
    except Exception as e:
        print(f"An error occurred: {e}")
        raise


if __name__ == "__main__":
    invoke_chain({"character": "Alice", "job": "librarian"})

To run the script locally, make sure you have installed the langchain-core and langchain-openai packages with pip, and that you have set the OPENAI_API_KEY environment variable.
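
For example, from a terminal (the export syntax assumes macOS or Linux, and the API key value is a placeholder):

pip install langchain-core langchain-openai
export OPENAI_API_KEY="<your-openai-api-key>"
python main.py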

How to complete this guide

To follow along with the guide, you’ll need to:

  • Create a new project directory.
  • Save the code above as main.py in your project directory.
  • Initialize a Git repository (git init) and commit the main.py file, as shown below.
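
For example (the directory name and commit message are just placeholders):

mkdir langchain-story-app && cd langchain-story-app
git init
git add main.py
git commit -m "Add LangChain story generator"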

The final deployable code for this guide is available in the complete branch of the langchain-deployment-example repository.

Deploying a LangChain application

1

Install Flask

To deploy the application to the Itura cloud platform, we’ll need to create a simple HTTP endpoint that will be used to invoke the LangChain chain. We’ll use Flask to create this endpoint. First, install Flask.

pip install Flask
2

Create a /run endpoint

To invoke the chain, Itura looks for a /run endpoint that accepts POST requests. Let’s modify our main.py file to include a Flask app with a /run endpoint. We’ll use Flask’s request object to read the inputs from the POST body, and return the chain’s response as a JSON object.

main.py
#!/usr/bin/env python
import os
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from flask import Flask, request, jsonify

llm = ChatOpenAI(model="gpt-4o-mini")

prompt_template = "Create a 100 character story about a {character} who is a {job}."
prompt = PromptTemplate.from_template(template=prompt_template)

chain = prompt | llm

app = Flask(__name__)


@app.route("/run", methods=["POST"])
def invoke_chain():
    if "OPENAI_API_KEY" not in os.environ:
        print("Error: OPENAI_API_KEY environment variable not set.")
        exit()

    inputs = request.json
    if not inputs or "character" not in inputs or "job" not in inputs:
        return jsonify({"error": "Missing 'character' or 'job' in request body"}), 400

    try:
        response = chain.invoke(inputs)
        print("LLM-generated story:", response.content)
        return jsonify({"story": response.content})
    except Exception as e:
        print(f"An error occurred: {e}")
        return jsonify({"error": str(e)}), 500

# Optional: run the app locally for testing
if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=5000)
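
To sanity-check the endpoint before deploying, you can run the app locally with python main.py (with OPENAI_API_KEY set in that shell) and, from a second terminal, send a test request to the port configured above:

curl --request POST \
  --url http://localhost:5000/run \
  --header 'Content-Type: application/json' \
  --data '{"character": "Alice", "job": "librarian"}'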
3

Create a requirements.txt file

Next, create a requirements.txt file to specify the dependencies for your application. Itura uses this file to install the dependencies when the application is deployed. You can generate it using pip freeze.

pip freeze > requirements.txt
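
pip freeze pins everything installed in your environment. If you prefer a smaller file, listing only the direct dependencies also works; a minimal example (the exact package set and versions depend on your environment):

requirements.txt
flask
langchain-core
langchain-openai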
4

Commit and push your code

Now that we have the Flask /run endpoint and the requirements.txt file, we should commit these changes and push the code to a GitHub repository.

git add .
git commit -m "Add Flask endpoint and requirements"
git push
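
If the project doesn’t have a GitHub remote yet, create an empty repository on GitHub and connect it before pushing (the URL and branch name are placeholders):

git remote add origin https://github.com/<your-username>/<your-repo>.git
git push -u origin main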
5

Deploy to Itura Cloud

Go to app.itura.ai and create a new agent project (the term “agent” is used broadly for deployed applications). You’ll be prompted to connect your GitHub account and select the repository and branch you want to deploy. Select the branch holding your LangChain code (with the Flask /run endpoint and requirements.txt file), and click Deploy.

The deployment might take a couple of minutes to complete as Itura builds the environment and deploys your application.

Once the deployment is complete, you’ll be able to see an auto-generated API key (e.g., sk-agent-aa3f96a3-43e9-448f-ad94-84a38e64c229). Save this key in a secure location. You can generate a new key at any time from the project dashboard.

6

Add environment variables

When the deployment is complete, you will also be able to add environment variables to the application. These will be injected at runtime with each request to the endpoint. Add your OPENAI_API_KEY as an environment variable from the Itura project UI.

7

Invoke the application

Now that the application is deployed, you can invoke it by sending a POST request to the application endpoint. From the project dashboard, you can see the application endpoint URL (e.g., https://<your-agent-slug>.agent.itura.ai/run). Using this URL and the API key, you can invoke the application by sending a POST request to the endpoint with the required inputs.

curl --request POST \
  --url https://{agentSlug}.agent.itura.ai/run \
  --header 'Authorization: Bearer <your-itura-api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
  "character": "curious cat",
  "job": "detective"
}'
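
If you prefer to call the endpoint from Python, here is a minimal sketch using the requests library (pip install requests); the URL and API key are placeholders:

import requests

url = "https://<your-agent-slug>.agent.itura.ai/run"
headers = {"Authorization": "Bearer <your-itura-api-key>"}
payload = {"character": "curious cat", "job": "detective"}

# Send the run request; a 202 response means the run has been queued.
response = requests.post(url, headers=headers, json=payload)
print(response.status_code)
print(response.json())  # includes the run_id used to check the run's status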

If successful, you should receive a 202 Accepted response. This means your request is queued. For example, the response might look like this:

response.json
{
  "run_id": "unique-run-id",
  "message": "Run request accepted and queued for execution",
  "status": "PENDING"
}
8

Check the status of the run

You can check the status of the run from Itura’s dashboard, or by sending a GET request to the /status endpoint.

The status endpoint is provided by the Itura platform, so you don’t need to add it to your code.

curl --request GET \
  --url https://{agentSlug}.agent.itura.ai/status/{run_id} \
  --header 'Authorization: Bearer <your-itura-api-key>'

Sending the request to the /status endpoint will return a JSON object with the status of the run. For example, the response might look like this:

status.json
{
  "status": "COMPLETED",
  "run_id": "8116219c-7c94-4957-af9e-d7b442030d48",
  "agent_name": "langchain-deployment-example-12345",
  "agent_url": "https://langchain-deployment-example-12345.agent.itura.ai",
  "created_at": "2025-04-19T13:27:25.515776+00:00",
  "started_at": "2025-04-19T13:27:27+00:00",
  "ended_at": "2025-04-19T13:27:36.351+00:00",
  "duration_seconds": 11
}
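
To wait for the result programmatically, you can poll the status endpoint until the run reaches a terminal state. A rough sketch using requests (the set of non-terminal status values is an assumption based on the PENDING and COMPLETED values shown above):

import time
import requests

status_url = "https://<your-agent-slug>.agent.itura.ai/status/<run_id>"
headers = {"Authorization": "Bearer <your-itura-api-key>"}

# Poll every few seconds until the run is no longer pending or running.
while True:
    status = requests.get(status_url, headers=headers).json()
    if status["status"] not in ("PENDING", "RUNNING"):
        break
    time.sleep(5)

print(status["status"], status.get("duration_seconds"))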
9

Check the logs

You can check the logs from the Itura dashboard: go to your agent’s dashboard, open the Runs tab, and you’ll see the logs for each run.

10

Update the deployment code

If you want to update the deployment code, you can do so by pushing a new commit to the connected GitHub repository. Itura will automatically detect the changes and update the deployment.

Conclusion

That’s it! In this guide, we’ve walked you through the process of deploying a LangChain application to a serverless cloud endpoint. We’ve also shown how to invoke the application and check the status of the run.