Deploy a LangChain App to the Cloud
Learn how to deploy a LangChain application to a serverless cloud endpoint
This guide walks you through the process of deploying a LangChain application to a cloud endpoint. We’ll use Itura’s free tier to host the application in a serverless environment. The deployment will be connected to a GitHub repository, so any changes to the code will automatically be reflected in the deployment.
What you need
- About 10 minutes
- An OpenAI API key
- A GitHub account
What you will deploy
You will deploy a simple LangChain application to a serverless cloud endpoint. The endpoint will accept POST requests at a URL similar to `https://<your-agent-slug>.agent.itura.ai/run`. When a POST request is sent to this URL, the LangChain chain will be invoked. Environment variables can be added to the application; they will be injected at runtime.
Our starting point is a simple LangChain application that generates a short story based on a character and their job. The key entry point for the application is the `main.py` file.
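The exact code lives in the repository linked below; as a minimal sketch, `main.py` might look something like this (the model name and prompt wording here are illustrative placeholders, not the repository's exact code):

```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# Reads the OPENAI_API_KEY environment variable automatically.
llm = ChatOpenAI(model="gpt-4o-mini")  # model choice is a placeholder

prompt = ChatPromptTemplate.from_template(
    "Write a short story about {character}, who works as a {job}."
)

# Compose the prompt and model into a chain (LCEL pipe syntax).
chain = prompt | llm

if __name__ == "__main__":
    result = chain.invoke({"character": "Ada", "job": "lighthouse keeper"})
    print(result.content)
```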
To run the script locally, ensure you have installed the `langchain-openai` package (which provides `ChatOpenAI`) and set the `OPENAI_API_KEY` environment variable, then run `python main.py`.
How to complete this guide
To follow along with the guide, you’ll need to:
- Create a new project directory.
- Save the code above as `main.py` in your project directory.
- Initialize a Git repository (`git init`) and commit the `main.py` file.
The final deployable code for this guide is available in the `complete` branch of the `langchain-deployment-example` repository.
Deploying a LangChain application
Install Flask
To deploy the application to the Itura cloud platform, we need a simple HTTP endpoint that Itura can call to invoke the LangChain chain. We'll use Flask to create this endpoint. First, install Flask with `pip install flask`.
Create a /run endpoint
To invoke the chain, Itura looks for a `/run` endpoint that accepts POST requests. Let's modify our `main.py` file to include a Flask app and a `/run` endpoint. We'll use Flask's `request` object to read the inputs from the POST request, and return the chain's response as a JSON object.
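A sketch of the updated `main.py`, consistent with the starting-point sketch above (the input field names `character` and `job` are our choice, not a platform requirement):

```python
from flask import Flask, jsonify, request
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

app = Flask(__name__)

llm = ChatOpenAI(model="gpt-4o-mini")  # model choice is a placeholder
prompt = ChatPromptTemplate.from_template(
    "Write a short story about {character}, who works as a {job}."
)
chain = prompt | llm

@app.route("/run", methods=["POST"])
def run():
    # Read the chain inputs from the JSON body of the POST request.
    data = request.get_json(force=True)
    result = chain.invoke({"character": data["character"], "job": data["job"]})
    # Return the generated story as JSON.
    return jsonify({"story": result.content})

if __name__ == "__main__":
    app.run(port=8080)  # for local testing; the platform manages its own runtime
```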
Create a requirements.txt file
Next, create a `requirements.txt` file specifying the application's dependencies. Itura uses this file to install them when the application is deployed. You can generate it with `pip freeze > requirements.txt`.
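Based on the imports in the sketches above, the file might look something like this (running `pip freeze` will instead produce the full, version-pinned set):

```
flask
langchain-core
langchain-openai
```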
Commit and push your code
Now that we have the Flask `/run` endpoint and the `requirements.txt` file, we should commit these changes and push the code to a GitHub repository.
Deploy to Itura Cloud
Go to app.itura.ai and create a new agent project (the term “agent” is used broadly for deployed applications). You’ll be prompted to connect your GitHub account and select the repository and branch you want to deploy. Select the branch holding your LangChain code (with the Flask `/run` endpoint and `requirements.txt` file), and click Deploy.
The deployment might take a couple of minutes to complete as Itura builds the environment and deploys your application.
Once the deployment is complete, you’ll be able to see an auto-generated API key (e.g., `sk-agent-aa3f96a3-43e9-448f-ad94-84a38e64c229`). Save this key in a secure location; you can generate a new one at any time from the project dashboard.
Add environment variables
When the deployment is complete, you will also be able to add environment variables to the application. These will be injected at runtime with each request to the endpoint. Add your `OPENAI_API_KEY` as an environment variable from the Itura project UI.
Invoke the application
Now that the application is deployed, you can invoke it by sending a POST request to the application endpoint. The endpoint URL is shown on the project dashboard (e.g., `https://<your-agent-slug>.agent.itura.ai/run`). Using this URL and the API key, you can invoke the application by sending a POST request to the endpoint with the required inputs.
If successful, you should receive a `202 Accepted` response, which means your request has been queued for execution.
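As a sketch, here is one way to send the request using Python's `requests` library. The payload field names match the sketch above, and the `Bearer` authorization scheme is an assumption, so check your project dashboard for the exact authentication format:

```python
import requests

URL = "https://<your-agent-slug>.agent.itura.ai/run"
API_KEY = "sk-agent-..."  # the key from your project dashboard

response = requests.post(
    URL,
    # Bearer auth is an assumption; confirm the scheme on the dashboard.
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"character": "Ada", "job": "lighthouse keeper"},
)
print(response.status_code)  # expect 202 (Accepted)
print(response.json())
```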
Check the status of the run
You can check the status of the run from Itura’s dashboard, or by sending a GET request to the `/status` endpoint.
The status endpoint is provided by the Itura platform, so you don’t need to add it to your code yourself.
Sending a GET request to the `/status` endpoint returns a JSON object describing the current status of the run.
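As a sketch, again with `requests` — note that the exact URL shape and the run-identifier parameter below are hypothetical, so consult the dashboard for the real format:

```python
import requests

BASE_URL = "https://<your-agent-slug>.agent.itura.ai"
API_KEY = "sk-agent-..."

status = requests.get(
    f"{BASE_URL}/status",
    headers={"Authorization": f"Bearer {API_KEY}"},
    # The parameter name is hypothetical; use the run id returned by /run.
    params={"run_id": "<run-id>"},
)
print(status.json())
```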
Check the logs
You can check the logs from the Itura dashboard: go to your agent dashboard and click on the Runs tab to see the logs for each run.
Update the deployment code
If you want to update the deployment code, you can do so by pushing a new commit to the connected GitHub repository. Itura will automatically detect the changes and update the deployment.
Conclusion
That’s it! In this guide, we’ve walked you through the process of deploying a LangChain application to a serverless cloud endpoint. We’ve also shown how to invoke the application and check the status of the run.