
Running Capstone 1 Experiments on Runpod

This document contains instructions for running Capstone 1 experiments using Runpod on-demand GPUs. You'll spin up a Runpod pod with the correct hardware requirements and execute this Capstone's Jupyter notebook on it.

What you'll need

  1. A Runpod account with credits ($20 of credits should be enough to run Qwen3-Coder:480B)
  2. An OpenAI API key (Optional, recommended for the judge LLM)

High-level steps

  1. (Optional) Create a Runpod secret for OPENAI_API_KEY
  2. Deploy a Pod (on-demand GPU) with an llm-agents-from-scratch Runpod template
  3. Wait for Pod to become available and finish its setup
  4. Connect to the Pod's JupyterLab service
  5. Navigate to the Capstone Jupyter notebook
  6. Run the notebook

(Optional) Step 0. Create a Runpod secret for OPENAI_API_KEY

If you're planning to use the OpenAILLM for the judge LLM, you'll need to supply an OpenAI API key. The Runpod templates that I've prepared for running these experiments will create the necessary environment variable from a Runpod secret.

To create the Runpod secret for your OpenAI API key, log in to the Runpod console and click "Secrets" found in the left sidebar menu. Afterwards, click on "Create Secret" to create a new secret.

For the Secret name, be sure to use "openai_api_key", and fill in the secret value with your API key. Click "Create Secret" button once the necessary fields have been populated.

image
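Once the Pod is running, you can confirm that the secret was injected. The sketch below assumes the template exposes the secret as the OPENAI_API_KEY environment variable, as described above:

```python
import os

def openai_key_present() -> bool:
    """Return True if OPENAI_API_KEY is set to a non-empty value."""
    return bool(os.environ.get("OPENAI_API_KEY"))
```

Running `openai_key_present()` in a notebook cell on the Pod should return True if the secret was created correctly; if it returns False, double-check that the secret name is exactly "openai_api_key".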

Step 1. Deploy a Pod with an llm-agents-from-scratch Runpod template

I've prepared two Runpod templates for Capstone 1. These templates use a Docker image that has the necessary tools installed, including CUDA, uv, and ollama. It also clones the book's GitHub repository and installs the llm-agents-from-scratch package into the container's system Python. The two templates differ in their backbone LLM:

| Template | Backbone LLM | GPU VRAM Required |
|---|---|---|
| llmagentsfromscratch-qwen3coder30b | Qwen3-Coder:30b | 48GB |
| llmagentsfromscratch-qwen3coder480b | Qwen3-Coder:480b | 300GB |

To deploy a Pod with one of these templates, go to the Runpod console, click "Pods" in the left-hand sidebar, and then click the "+ Deploy" button. Specify the amount of GPU VRAM required for the template (i.e., backbone LLM) you'd like to use.


After selecting your on-demand GPUs, you'll be prompted to configure the deployment, where you can specify the Runpod template. By default, Runpod uses a PyTorch template, so you'll need to change this to the one you'd like to use. Click the "Change Template" button.


Next, you can search for the llm-agents-from-scratch template that you'd like to use. Simply enter the name of the template in the search bar.


The llm-agents-from-scratch template should appear. Click it to select it as the template to deploy the Pod with. You'll be navigated back to the Configure Deployment page.

Finally, click on the "Deploy On-Demand" button.


Step 2. Wait for Pod to become available and finish its setup

After deploying the Pod, the Docker image used by the template is downloaded and its startup process runs. This startup downloads the Ollama LLM model, clones the llm-agents-from-scratch source code, and installs the package into the container's system Python. It also launches the JupyterLab service that you'll connect to in the next step.

Go ahead and grab a cup of coffee. Expected setup times are provided in the table below, but note that these vary depending on your selected GPU hardware.

| Template | Setup Time |
|---|---|
| llmagentsfromscratch-qwen3coder30b | 5–10 minutes |
| llmagentsfromscratch-qwen3coder480b | 20–30 minutes |

You can monitor the setup process through the Pod's system and container logs. The system logs cover the download of the Docker image, while the container logs report progress of the startup process.
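Beyond watching the logs, you can also poll the Pod's Ollama service directly to see whether the model pull has finished. This is a minimal sketch assuming Ollama's default port (11434); it queries the service's /api/tags endpoint, which lists locally available models:

```python
import json
import urllib.request

def ollama_ready(model_prefix: str, host: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama model whose name starts with model_prefix
    is listed by the local Ollama service (i.e., its download has finished)."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
            tags = json.load(resp)
    except OSError:
        return False  # service not reachable (yet)
    return any(m["name"].startswith(model_prefix) for m in tags.get("models", []))
```

Calling `ollama_ready("qwen3-coder")` from a terminal or notebook cell on the Pod returns False while the service is still starting or the model is still downloading, and True once the model is available.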

Your Pod is ready once the container logs contain the details of the JupyterLab service launching, as shown in the next screenshot.


Step 3. Connect to the Pod's JupyterLab service

Once startup has completed, you can connect to the JupyterLab service for your deployed Pod. Click on the "Connect" tab, then click the JupyterLab link.


If successful, this opens a new browser window showing the familiar JupyterLab UI.


Step 4. Navigate to the Capstone Jupyter notebook

As mentioned earlier, the template downloads a clone of the book's GitHub repo into the container. All that remains now is to navigate the repo to locate and open the Jupyter notebook for Capstone 1.

The notebook is located at: /llm-agents-from-scratch/capstone/capstone_1_ch05.ipynb.


Step 5. Run the notebook

Now, instead of running this notebook on your local machine, you can run it on your deployed Pod. Note that an Ollama service has already been launched as part of the startup process. You should be able to click through each of the cells and execute them successfully.
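If you'd like a quick smoke test of the backbone LLM before running the full notebook, you can send a single request to the local Ollama service from a notebook cell. This sketch is not part of the notebook itself; it assumes Ollama's default port (11434) and the 30B template's model tag (qwen3-coder:30b) — swap in the 480B tag if you deployed the larger template:

```python
import json
import urllib.request
from typing import Optional

def ollama_chat(prompt: str, model: str = "qwen3-coder:30b",
                host: str = "http://localhost:11434") -> Optional[str]:
    """Send one non-streaming chat request to the local Ollama service.
    Returns the model's reply text, or None if the service is unreachable."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # request a single complete response
    }).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=300) as resp:
            return json.load(resp)["message"]["content"]
    except OSError:
        return None
```

If `ollama_chat("Say hello.")` returns a reply on the Pod, the Ollama service and model are up and the notebook's cells should execute without connection issues.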

Execution times for the repeated runs section are given in the table below.

| Template | Execution Time (repeated trials) |
|---|---|
| llmagentsfromscratch-qwen3coder30b | ~15 minutes |
| llmagentsfromscratch-qwen3coder480b | ~20 minutes |

IMPORTANT

Make sure to terminate your Pod when you're finished so that you don't incur additional costs.