Chapter 7 — Episodic Memory

Setup Instructions

To run this notebook, the llm-agents-from-scratch framework must be installed in the active Jupyter kernel. From the project's root directory, you can launch the notebook with the following command:

uv run --with jupyter jupyter lab

Alternatively, if you just want to use the published version of llm-agents-from-scratch without local development, you can install it from PyPI by uncommenting the install line in the cell below.

In [ ]:

# Uncomment the line below to install `llm-agents-from-scratch` from PyPI
# !pip install llm-agents-from-scratch

Running an Ollama service

To execute the code provided in this notebook, you'll need to have Ollama installed on your local machine and have its LLM hosting service running. To download Ollama, follow the instructions found on this page: https://ollama.com/download. After downloading and installing Ollama, you can start a service by opening a terminal and running the command ollama serve.

In [1]:

import os, shutil, subprocess, time, urllib.request, urllib.error


def ensure_ollama(host="http://localhost:11434", timeout=15):
    """Start Ollama if not already running and wait until responsive."""

    def _up():
        try:
            urllib.request.urlopen(f"{host}/api/tags", timeout=1)
            return True
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            return False

    if _up():
        return print(f"✓ Ollama already running at {host}")

    # Check PATH first, then the Lightning persistent path and standard locations
    ollama_path = shutil.which("ollama")
    if ollama_path is None:
        for candidate in [
            "/teamspace/studios/this_studio/.local/bin/ollama",
            "/usr/local/bin/ollama",
            "/usr/bin/ollama",
        ]:
            if os.path.exists(candidate):
                ollama_path = candidate
                break
    if ollama_path is None:
        raise RuntimeError(
            "Could not find the ollama binary. Install with: "
            "curl -fsSL https://ollama.com/install.sh | sh"
        )

    print(f"Starting Ollama server ({ollama_path})...")
    subprocess.Popen(
        [ollama_path, "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )

    deadline = time.time() + timeout
    while time.time() < deadline:
        if _up():
            return print(f"✓ Ollama up and running at {host}")
        time.sleep(0.5)

    raise RuntimeError(f"Ollama did not start within {timeout}s")


ensure_ollama()

✓ Ollama already running at http://localhost:11434

Examples

Example 1: Constructing an Episode and Printing Format Modes

In [2]:

from llm_agents_from_scratch.data_structures import Task, TaskResult
from llm_agents_from_scratch.data_structures.memory import Episode, EpisodeFormatMode

task = Task(instruction="Compute the Hailstone sequence for 6.")
result = TaskResult(task_id=task.id_, content="6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1")
episode = Episode(
    task=task,
    rollout="mock rollout",
    result=result,
    metadata={"reflection": "Always use the hailstone tool."},
)

print("=== XML (default) ===")
print(episode.format(mode=EpisodeFormatMode.XML))

print()
print("=== CONCAT ===")
print(episode.format(mode=EpisodeFormatMode.CONCAT))

=== XML (default) ===
  <episode>
    <task>Compute the Hailstone sequence for 6.</task>
    <result>6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1</result>
    <reflection>Always use the hailstone tool.</reflection>
    <completed_at>2026-06-21 23:15:50</completed_at>
  </episode>

=== CONCAT ===
task: Compute the Hailstone sequence for 6.
result: 6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1
reflection: Always use the hailstone tool.
completed_at: 2026-06-21 23:15:50
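
The two modes above differ only in how each field is wrapped. A minimal standard-library sketch of that idea follows. It is not the framework's implementation: `FormatMode` and `format_fields` are hypothetical names, and escaping XML special characters is a choice made for this sketch rather than something the output above confirms.

```python
from enum import Enum
from xml.sax.saxutils import escape


class FormatMode(Enum):  # hypothetical stand-in for EpisodeFormatMode
    XML = "xml"
    CONCAT = "concat"


def format_fields(fields: dict[str, str], mode: FormatMode = FormatMode.XML) -> str:
    """Render ordered episode fields as an XML block or as 'key: value' lines."""
    if mode is FormatMode.XML:
        inner = "\n".join(
            f"    <{key}>{escape(value)}</{key}>" for key, value in fields.items()
        )
        return f"  <episode>\n{inner}\n  </episode>"
    return "\n".join(f"{key}: {value}" for key, value in fields.items())


fields = {
    "task": "Compute the Hailstone sequence for 6.",
    "result": "6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1",
    "reflection": "Always use the hailstone tool.",
}
xml_text = format_fields(fields, FormatMode.XML)
concat_text = format_fields(fields, FormatMode.CONCAT)
print(xml_text)
print(concat_text)
```

Tagged output keeps field boundaries unambiguous when a value spans several lines, while the concatenated form uses fewer tokens.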

Example 2a: Writing Episodes to a QdrantMemoryStore

In [3]:

import warnings

from llm_agents_from_scratch.data_structures import Task, TaskResult
from llm_agents_from_scratch.data_structures.memory import Episode
from llm_agents_from_scratch.memory_stores.qdrant import QdrantMemoryStore

warnings.filterwarnings("ignore", message="Payload indexes have no effect")

store = QdrantMemoryStore(max_results=1)

task_1 = Task(instruction="Compute the Hailstone sequence for 6.")
task_2 = Task(instruction="Compute the Hailstone sequence for 8.")

episodes = [
    Episode(
        task=task_1,
        rollout="mock rollout",
        result=TaskResult(
            task_id=task_1.id_,
            content="6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1",
        ),
        metadata={"steps": "8"},
    ),
    Episode(
        task=task_2,
        rollout="mock rollout",
        result=TaskResult(
            task_id=task_2.id_,
            content="8 → 4 → 2 → 1",
        ),
        metadata={"steps": "3"},
    ),
]

for ep in episodes:
    await store.write(ep)
    print(f"count: {await store.count()}")

count: 1
count: 2

Example 2b: Recalling from a QdrantMemoryStore

In [4]:

results = await store.recall("Hailstone sequence starting from 6")
print(
    f"recall() returned {len(results)} episode(s)"
    " -- query matched by similarity (RecallMode.SEARCH)\n"
)
print(results[0].format())

recall() returned 1 episode(s) -- query matched by similarity (RecallMode.SEARCH)

  <episode>
    <task>Compute the Hailstone sequence for 6.</task>
    <result>6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1</result>
    <steps>8</steps>
    <completed_at>2026-06-21 23:15:54</completed_at>
  </episode>
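
To make the similarity-based recall above more concrete, here is a toy in-memory version that ranks stored texts by cosine similarity and returns the top `max_results`. It is a sketch under stated assumptions: `ToyMemoryStore`, `embed`, and `cosine` are hypothetical names, the "embedding" is a plain word count rather than the dense vectors a real vector store would use, and indexing only the task instruction is an assumption made for this sketch.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word/number counts (not a real dense embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyMemoryStore:
    """Hypothetical in-memory stand-in for a vector store with top-k recall."""

    def __init__(self, max_results: int = 1) -> None:
        self.max_results = max_results
        self._items: list[tuple[Counter, str]] = []

    def write(self, text: str) -> None:
        self._items.append((embed(text), text))

    def recall(self, query: str) -> list[str]:
        query_vec = embed(query)
        ranked = sorted(
            self._items, key=lambda item: cosine(query_vec, item[0]), reverse=True
        )
        return [text for _, text in ranked[: self.max_results]]


toy_store = ToyMemoryStore(max_results=1)
toy_store.write("Compute the Hailstone sequence for 6.")
toy_store.write("Compute the Hailstone sequence for 8.")
recalled_texts = toy_store.recall("Hailstone sequence starting from 6")
print(recalled_texts)
```

The query shares three tokens with the first instruction ("hailstone", "sequence", "6") and only two with the second, so the first one is ranked highest.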

Example 3: Memory with a metadata_fn

In [5]:

import warnings

from llm_agents_from_scratch.data_structures import Task, TaskResult
from llm_agents_from_scratch.data_structures.memory import Episode
from llm_agents_from_scratch.memory import Memory
from llm_agents_from_scratch.memory_stores.qdrant import QdrantMemoryStore

warnings.filterwarnings("ignore", message="Payload indexes have no effect")

task = Task(instruction="Compute the Hailstone sequence for 6.")
episode = Episode(
    task=task,
    rollout="mock rollout",
    result=TaskResult(
        task_id=task.id_,
        content="6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1",
    ),
)


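def step_counter_fn(ep: Episode) -> str:
    # Counts the numbers in the sequence (9 for the run from 6), which is one
    # more than the number of transitions between them.
    return str(len(ep.result.content.split("→")))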


store = QdrantMemoryStore(max_results=1)
memory = Memory(store=store, metadata_fns={"steps": step_counter_fn})

await memory.record(episode)

# record() works on a deep copy — the original episode is not mutated
print("Original episode:")
print(episode)

recalled = await memory.recall(Task(instruction="Compute the Hailstone sequence for 3."))
print("\nRecalled episode (enriched by step_counter_fn at write time):")
print(recalled)

Original episode:
  <episode>
    <task>Compute the Hailstone sequence for 6.</task>
    <result>6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1</result>
    <completed_at>2026-06-21 23:17:09</completed_at>
  </episode>

Recalled episode (enriched by step_counter_fn at write time):
  <episode>
    <task>Compute the Hailstone sequence for 6.</task>
    <result>6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1</result>
    <steps>9</steps>
    <completed_at>2026-06-21 23:17:09</completed_at>
  </episode>
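
The output above shows the original episode without a `steps` field and the recalled one with it. The sketch below reproduces that pattern: copy first, then apply each metadata function to the copy. It is an illustration only, not the framework's code. `ToyEpisode`, `enrich`, and `count_terms` are hypothetical names, and the real `Memory.record()` may differ in its details.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyEpisode:  # hypothetical stand-in for Episode
    task: str
    result: str
    metadata: dict[str, str] = field(default_factory=dict)


def enrich(
    episode: ToyEpisode,
    metadata_fns: dict[str, Callable[[ToyEpisode], str]],
) -> ToyEpisode:
    """Return an enriched deep copy; the caller's episode is left untouched."""
    enriched = copy.deepcopy(episode)
    for key, fn in metadata_fns.items():
        enriched.metadata[key] = fn(enriched)
    return enriched


def count_terms(ep: ToyEpisode) -> str:
    # Same counting rule as step_counter_fn: number of terms between arrows.
    return str(len(ep.result.split("→")))


original = ToyEpisode(
    task="Compute the Hailstone sequence for 6.",
    result="6 → 3 → 10 → 5 → 16 → 8 → 4 → 2 → 1",
)
stored = enrich(original, {"steps": count_terms})
print(original.metadata)
print(stored.metadata)
```

Working on a copy means the caller can keep using the episode object after recording it without seeing fields that were added only for storage.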

Example 4: LLMAgent with similarity_memory()

In [6]:

import logging
import warnings

from llm_agents_from_scratch import LLMAgent
from llm_agents_from_scratch.data_structures import Task
from llm_agents_from_scratch.logger import enable_console_logging
from llm_agents_from_scratch.llms import OllamaLLM
from llm_agents_from_scratch.memory.recipes import similarity_memory
from llm_agents_from_scratch.tools import SimpleFunctionTool

warnings.filterwarnings("ignore", message="Payload indexes have no effect")
enable_console_logging(logging.INFO)


def next_number(n: int) -> int:
    """Returns the next number in the stop-at-one sequence."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


next_number_tool = SimpleFunctionTool(func=next_number)
llm = OllamaLLM(model="qwen3:14b", think=False)
memory = similarity_memory(max_results=1)
agent = LLMAgent(llm=llm, tools=[next_number_tool], memories=[memory])

In [7]:

# Task 1: no memory yet — the agent computes the sequence via tools and
# similarity_memory records the episode in the Qdrant vector store.
result_1 = await agent.run(
    Task(
        instruction=(
            "Compute the stop-at-one sequence for 4"
        )
    ),
)
print(result_1.content)

INFO (llm_agents_fs.LLMAgent) :      🚀 Starting task: Compute the stop-at-one sequence for 4
INFO (llm_agents_fs.TaskHandler) :      ⚙️ Processing Step: Compute the stop-at-one sequence for 4
INFO (llm_agents_fs.TaskHandler) :      🛠️ Executing Tool Call: from_scratch__use_skill
INFO (llm_agents_fs.TaskHandler) :      ✅ Successful Tool Call: <skill_content name="stop-at-one">
# Stop At One

Compute a full sequence from a starting number down to 1
using the `next_num...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) :      ✅ Step Result: I need to compute the stop-at-one sequence for the starting number 4. Let me begin by calling the `next_number` tool with the initial v...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) :      🧠 New Step: Call the `next_number` tool with the initial value of 4.
INFO (llm_agents_fs.TaskHandler) :      ⚙️ Processing Step: Call the `next_number` tool with the initial value of 4.
INFO (llm_agents_fs.TaskHandler) :      🛠️ Executing Tool Call: next_number
INFO (llm_agents_fs.TaskHandler) :      ✅ Successful Tool Call: 2
INFO (llm_agents_fs.TaskHandler) :      ✅ Step Result: The result of the `next_number` tool call with the initial value of 4 is 2. I will append this to the sequence and continue the process...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) :      🧠 New Step: Call the `next_number` tool with the current value of 2.
INFO (llm_agents_fs.TaskHandler) :      ⚙️ Processing Step: Call the `next_number` tool with the current value of 2.
INFO (llm_agents_fs.TaskHandler) :      🛠️ Executing Tool Call: next_number
INFO (llm_agents_fs.TaskHandler) :      ✅ Successful Tool Call: 1
INFO (llm_agents_fs.TaskHandler) :      ✅ Step Result: The result of the `next_number` tool call with the current value of 2 is 1. Since we have reached 1, the sequence is complete. 

The fu...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) :      No new step required.
INFO (llm_agents_fs.LLMAgent) :      🏁 Task completed: The result of the `next_number` tool call with the current value of 2 is 1. Since we have reached 1, the sequence is complete. 

The...[TRUNCATED]
The result of the `next_number` tool call with the current value of 2 is 1. Since we have reached 1, the sequence is complete. 

The full sequence is: 4 → 2 → 1.

**Starting number**: 4  
**Total steps taken**: 2  
**Maximum value reached**: 4
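
As a sanity check on the agent's answer, the same sequence can be computed without an LLM by applying the `next_number` rule in a loop. The helper `stop_at_one_sequence` is a hypothetical name introduced for this check and is not part of the framework.

```python
def next_number(n: int) -> int:
    """Same rule as the tool above: halve even numbers, map odd n to 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def stop_at_one_sequence(start: int) -> list[int]:
    """Apply next_number repeatedly until the sequence reaches 1."""
    if start < 1:
        raise ValueError("start must be a positive integer")
    sequence = [start]
    while sequence[-1] != 1:
        sequence.append(next_number(sequence[-1]))
    return sequence


seq = stop_at_one_sequence(4)
print(" → ".join(map(str, seq)))
print(f"steps: {len(seq) - 1}, max: {max(seq)}")
```

For a start value of 4 this gives 4 → 2 → 1 with 2 steps and a maximum of 4, matching the agent's reported result.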

In [8]:

# Inspect what will be injected into the system prompt as memory for Task 2.
# This is the same call the agent makes internally via memory.recall() before
# its first step — surfacing it here makes the memory context visible.
task_2 = Task(
    instruction=(
        "What is the stop-at-one sequence for 4?"
    )
)
print("Memory recalled for Task 2:\n")
print(await memory.recall(task_2))

Memory recalled for Task 2:

  <episode>
    <task>Compute the stop-at-one sequence for 4</task>
    <result>The result of the `next_number` tool call with the current value of 2 is 1. Since we have reached 1, the sequence is complete. 

The full sequence is: 4 → 2 → 1.

**Starting number**: 4  
**Total steps taken**: 2  
**Maximum value reached**: 4</result>
    <completed_at>2026-06-21 23:18:20</completed_at>
  </episode>
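
The cell's comments say the recalled text is injected into the system prompt. How the framework does this is not shown in this notebook, so the following is only a guess at the general shape. `BASE_PROMPT`, `build_system_prompt`, and the `<memory>` wrapper are all hypothetical.

```python
BASE_PROMPT = "You are a helpful agent."  # hypothetical base system prompt


def build_system_prompt(base: str, recalled: str) -> str:
    """Append recalled episodes to the base prompt; skip the block if empty."""
    if not recalled.strip():
        return base
    return f"{base}\n\n<memory>\n{recalled}\n</memory>"


recalled_text = (
    "  <episode>\n"
    "    <task>Compute the stop-at-one sequence for 4</task>\n"
    "    <result>The full sequence is: 4 → 2 → 1.</result>\n"
    "  </episode>"
)
prompt_with_memory = build_system_prompt(BASE_PROMPT, recalled_text)
prompt_without_memory = build_system_prompt(BASE_PROMPT, "")
print(prompt_with_memory)
```

Skipping the block when nothing is recalled keeps the first run's prompt identical to a memory-free agent.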

In [9]:

# Task 2: the episode from Task 1 is recalled by similarity search and
# injected into the system prompt — the agent answers directly from memory.
result_2 = await agent.run(
    task_2,
)
print(result_2.content)

INFO (llm_agents_fs.LLMAgent) :      🚀 Starting task: What is the stop-at-one sequence for 4?
INFO (llm_agents_fs.TaskHandler) :      ⚙️ Processing Step: What is the stop-at-one sequence for 4?
INFO (llm_agents_fs.TaskHandler) :      ✅ Step Result: The stop-at-one sequence for 4 is: 4 → 2 → 1.

**Starting number**: 4  
**Total steps taken**: 2  
**Maximum value reached**: 4
INFO (llm_agents_fs.TaskHandler) :      No new step required.
INFO (llm_agents_fs.LLMAgent) :      🏁 Task completed: The stop-at-one sequence for 4 is: 4 → 2 → 1.

**Starting number**: 4  
**Total steps taken**: 2  
**Maximum value reached**: 4
The stop-at-one sequence for 4 is: 4 → 2 → 1.

**Starting number**: 4  
**Total steps taken**: 2  
**Maximum value reached**: 4
