Episodic Memory — Selective Recall with Similarity Search¶

This notebook demonstrates the similarity_memory() factory, which gives an LLMAgent episodic memory that retrieves past episodes by semantic relevance rather than recency.

Two properties are illustrated:

Selective recall. After each task, the agent records an episode in a Qdrant in-memory vector store. When a new task arrives, similarity_memory() retrieves the past episodes that are most semantically similar to it, not the most recent ones. A question about Southeast Asia retrieves the Southeast Asian country episodes; a question about Western Europe retrieves the Western European ones, regardless of when they were recorded.
Contrast with recency. The notebook explicitly shows what recency_memory() would have returned for the same queries: the same three most-recent episodes for both, illustrating exactly when similarity search adds value over a simpler recency strategy.
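
To make the contrast concrete before touching the library, here is a minimal, self-contained sketch. Nothing in it comes from llm-agents-from-scratch: the word-count `embed` function, the `episodes` list, and both recall helpers are stand-ins invented for illustration, whereas the notebook itself relies on FastEmbed vectors stored in Qdrant.

```python
import math
from collections import Counter

# Hypothetical episode texts, listed oldest first.
episodes = [
    "thailand region asia subregion south-eastern asia",
    "france region europe subregion western europe",
    "kenya region africa subregion eastern africa",
    "brazil region americas subregion south america",
]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_recent(k: int) -> list[str]:
    """Recency: the last k episodes, newest first, whatever the query."""
    return episodes[-k:][::-1]


def recall_similar(query: str, k: int) -> list[str]:
    """Similarity: the k episodes whose vectors are closest to the query."""
    q = embed(query)
    ranked = sorted(episodes, key=lambda e: cosine(q, embed(e)), reverse=True)
    return ranked[:k]


print(recall_recent(1))
print(recall_similar("countries in south-eastern asia", 1))
print(recall_similar("countries in western europe", 1))
```

The recency helper returns the Brazil episode for every query, while the similarity helper returns the Thailand episode for the first query and the France episode for the second.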

This notebook uses the REST Countries API, which is free and requires no authentication.

In [ ]:

# Uncomment the line below to install `llm-agents-from-scratch` from PyPI
# !pip install llm-agents-from-scratch

Running an Ollama service¶

To execute the code provided in this notebook, you'll need to have Ollama installed on your local machine and have its LLM hosting service running. To download Ollama, follow the instructions found on this page: https://ollama.com/download. After downloading and installing Ollama, you can start a service by opening a terminal and running the command ollama serve.

Setup¶

In [1]:

import os
import shutil
import subprocess
import time
import urllib.error
import urllib.request


def ensure_ollama(host="http://localhost:11434", timeout=15):
    """Start Ollama if not already running and wait until responsive."""

    def _up():
        try:
            urllib.request.urlopen(f"{host}/api/tags", timeout=1)
            return True
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            return False

    if _up():
        return print(f"✓ Ollama already running at {host}")

    ollama_path = shutil.which("ollama")
    if ollama_path is None:
        for candidate in [
            "/teamspace/studios/this_studio/.local/bin/ollama",
            "/usr/local/bin/ollama",
            "/usr/bin/ollama",
        ]:
            if os.path.exists(candidate):
                ollama_path = candidate
                break
    if ollama_path is None:
        raise RuntimeError(
            "Could not find the ollama binary. Install with: "
            "curl -fsSL https://ollama.com/install.sh | sh",
        )

    print(f"Starting Ollama server ({ollama_path})...")
    subprocess.Popen(
        [ollama_path, "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )

    deadline = time.time() + timeout
    while time.time() < deadline:
        if _up():
            return print(f"✓ Ollama up and running at {host}")
        time.sleep(0.5)

    raise RuntimeError(f"Ollama did not start within {timeout}s")


ensure_ollama()

✓ Ollama already running at http://localhost:11434

In [2]:

import json
import logging
import urllib.error
import urllib.parse
import urllib.request

from tqdm.asyncio import tqdm

from llm_agents_from_scratch import LLMAgent
from llm_agents_from_scratch.data_structures import Task
from llm_agents_from_scratch.llms import OllamaLLM
from llm_agents_from_scratch.logger import enable_console_logging
from llm_agents_from_scratch.memory.recipes import similarity_memory
from llm_agents_from_scratch.tools.simple_function import SimpleFunctionTool

Defining the Tool¶

The REST Countries API returns geographic and demographic facts for any country by name, with no API key required.

In [3]:

def get_country(name: str) -> str:
    """Look up a country by name and return key facts.

    Uses the REST Countries API (https://restcountries.com/). No auth required.
    """
    encoded = urllib.parse.quote(name.strip())
    url = f"https://restcountries.com/v3.1/name/{encoded}?fullText=true"
    req = urllib.request.Request(
        url,
        headers={"User-Agent": "llm-agents-from-scratch/1.0"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            data = json.loads(resp.read())[0]
    except urllib.error.HTTPError as e:
        if e.code == 404:  # noqa: PLR2004
            raise ValueError(
                f"Country '{name}' not found. "
                "Check the spelling and try again.",
            ) from e
        raise
    return json.dumps(
        {
            "name": data["name"]["common"],
            "region": data["region"],
            "subregion": data.get("subregion", ""),
            "capital": data.get("capital", [""])[0],
            "population": data["population"],
            "area_km2": data.get("area", 0),
            "languages": list(data.get("languages", {}).values()),
            "currencies": [
                v["name"] for v in data.get("currencies", {}).values()
            ],
        },
    )


get_country_tool = SimpleFunctionTool(func=get_country)
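
The field extraction inside get_country can be exercised without a network call. The sketch below applies the same lookups to a hand-written payload. `SAMPLE` and `extract_facts` are hypothetical names, and the payload is a trimmed illustration of the response shape the code above assumes, not a captured API response. It also guards against an empty capital list, which is an assumption about what the API might return rather than documented behaviour.

```python
import json

# Hand-written payload mirroring only the fields get_country reads.
SAMPLE = {
    "name": {"common": "Thailand"},
    "region": "Asia",
    "subregion": "South-Eastern Asia",
    "capital": ["Bangkok"],
    "population": 65859640,
    "area": 513120,
    "languages": {"tha": "Thai"},
    "currencies": {"THB": {"name": "Thai baht"}},
}


def extract_facts(data: dict) -> dict:
    """Hypothetical helper: the same field lookups as get_country, no HTTP."""
    capitals = data.get("capital") or [""]  # also covers an empty list
    return {
        "name": data["name"]["common"],
        "region": data["region"],
        "subregion": data.get("subregion", ""),
        "capital": capitals[0],
        "population": data["population"],
        "area_km2": data.get("area", 0),
        "languages": list(data.get("languages", {}).values()),
        "currencies": [v["name"] for v in data.get("currencies", {}).values()],
    }


facts = extract_facts(SAMPLE)
print(json.dumps(facts, ensure_ascii=False))
```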

Creating the Agent with Memory¶

similarity_memory() creates a Memory backed by an in-process Qdrant vector store. Episodes are embedded at write time using FastEmbed (local ONNX, no external service needed). The max_results parameter controls how many semantically similar episodes are recalled per task. The agent holds the memory in its memories list; TaskHandler calls recall at task start and record at task end automatically.

In [4]:

memory = similarity_memory(max_results=3)
llm = OllamaLLM(model="qwen3:14b", think=False)
agent = LLMAgent(llm=llm, tools=[get_country_tool], memories=[memory])
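
The recall-at-start, record-at-end cycle described above can be pictured with a small sketch. Everything in it is hypothetical and independent of the library: `ToyMemory`, `fake_llm`, and `run_with_memory` are illustrative stand-ins, and the real TaskHandler and Memory interfaces may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemory:
    """Stand-in memory: stores (instruction, result) pairs in a list."""

    max_results: int = 3
    episodes: list[tuple[str, str]] = field(default_factory=list)

    def recall(self, instruction: str) -> list[tuple[str, str]]:
        # Rank stored episodes by how many words they share with the query.
        words = set(instruction.lower().split())
        ranked = sorted(
            self.episodes,
            key=lambda ep: len(words & set(ep[0].lower().split())),
            reverse=True,
        )
        return ranked[: self.max_results]

    def record(self, instruction: str, result: str) -> None:
        self.episodes.append((instruction, result))


def fake_llm(instruction: str, context: list[tuple[str, str]]) -> str:
    """Placeholder for the model call; reports how much context it saw."""
    return f"answered with {len(context)} recalled episode(s)"


def run_with_memory(memory: ToyMemory, instruction: str) -> str:
    context = memory.recall(instruction)     # task start: recall
    result = fake_llm(instruction, context)  # the work itself
    memory.record(instruction, result)       # task end: record
    return result


toy = ToyMemory(max_results=3)
first = run_with_memory(toy, "facts about Thailand")
second = run_with_memory(toy, "facts about Vietnam")
print(first)
print(second)
```

The first call sees an empty store, so nothing is recalled; the second call already has the first episode available.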

Part 1 — Building Up Episodic Memory¶

Seven country lookups spanning four distinct regions: Southeast Asia (Thailand, Vietnam, Indonesia), Western Europe (France, Germany), East Africa (Kenya), and South America (Brazil). Each task calls get_country once and records an episode at the end.

The first lookup runs with full logging so you can see the tool call and recording in action. The remaining six run concurrently with logging silenced to keep the output readable. After all lookups, memory.summary() shows the store state.

In [5]:

enable_console_logging(logging.INFO)

task1 = Task(
    instruction=(
        "What are Thailand's key geographic and demographic facts? "
        "Use the get_country tool. Do not rely on prior knowledge."
    ),
)
result1 = await agent.run(task1)
print(result1)

INFO (llm_agents_fs.LLMAgent) :      🚀 Starting task: What are Thailand's key geographic and demographic facts? Use the get_country tool. Do not rely on prior knowledge.
INFO (llm_agents_fs.TaskHandler) :      ⚙️ Processing Step: What are Thailand's key geographic and demographic facts? Use the get_country tool. Do not rely on prior knowledge.
INFO (llm_agents_fs.TaskHandler) :      🛠️ Executing Tool Call: get_country
INFO (llm_agents_fs.TaskHandler) :      ✅ Successful Tool Call: {"name": "Thailand", "region": "Asia", "subregion": "South-Eastern Asia", "capital": "Bangkok", "population": 65859640, "area_...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) :      ✅ Step Result: Thailand is located in the region of Asia, specifically in South-Eastern Asia. Its capital city is Bangkok. The country has a populatio...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) :      No new step required.
INFO (llm_agents_fs.LLMAgent) :      🏁 Task completed: Thailand is located in the region of Asia, specifically in South-Eastern Asia. Its capital city is Bangkok. The country has a popula...[TRUNCATED]
Thailand is located in the region of Asia, specifically in South-Eastern Asia. Its capital city is Bangkok. The country has a population of approximately 65,859,640 people and covers an area of 513,120 square kilometers. The primary language spoken in Thailand is Thai, and the official currency is the Thai baht.

In [6]:

logging.disable(logging.INFO)

countries = ["Vietnam", "Indonesia", "France", "Germany", "Kenya", "Brazil"]
lookup_tasks = [
    Task(
        instruction=(
            f"What are {c}'s key geographic and demographic facts? "
            "Use the get_country tool. Do not rely on prior knowledge."
        ),
    )
    for c in countries
]

results = await tqdm.gather(
    *[agent.run(t) for t in lookup_tasks],
    desc="Looking up countries",
)

for country, result in zip(countries, results, strict=False):
    print(f"**{country}**\n{result}\n")

Looking up countries: 100%|███████████████████| 6/6 [01:55<00:00, 19.27s/it]

**Vietnam**
The tool has successfully retrieved information about Vietnam. Here are the key geographic and demographic facts about Vietnam:

- **Region**: Asia
- **Subregion**: South-Eastern Asia
- **Capital**: Hanoi
- **Population**: 101,343,800
- **Area**: 331,212 square kilometers
- **Languages**: Vietnamese
- **Currencies**: Vietnamese đồng

Let me know if you need further details!

**Indonesia**
The key geographic and demographic facts about Indonesia are as follows:

- **Region**: Asia
- **Subregion**: South-Eastern Asia
- **Capital**: Jakarta
- **Population**: 284,438,782
- **Area**: 1,904,569 square kilometers
- **Languages**: Indonesian
- **Currencies**: Indonesian rupiah

**France**
France is located in the region of Europe, specifically in Western Europe. Its capital city is Paris. The country has a population of approximately 66,351,959 people and covers an area of 543,908 square kilometers. The primary language spoken in France is French, and the official currency is the euro.

**Germany**
The key geographic and demographic facts about Germany are as follows:

- **Region**: Europe
- **Subregion**: Western Europe
- **Capital**: Berlin
- **Population**: 83,491,249
- **Area**: 357,114 square kilometers
- **Languages**: German
- **Currencies**: Euro

Let me know if you need further details!

**Kenya**
I have retrieved the information about Kenya. Now, I will summarize the key geographic and demographic facts about Kenya based on the data provided:

- **Region**: Africa  
- **Subregion**: Eastern Africa  
- **Capital**: Nairobi  
- **Population**: Approximately 53,330,978 people  
- **Area**: 580,367 square kilometers  
- **Languages**: English and Swahili  
- **Currency**: Kenyan shilling  

Let me know if you need further details!

**Brazil**
Brazil is located in the region of the Americas, specifically in South America. Its capital city is Brasília. The country has a population of approximately 213,421,037 people and covers an area of 8,515,767 square kilometers. The primary language spoken in Brazil is Portuguese, and the official currency is the Brazilian real.

In [7]:

logging.disable(logging.NOTSET)
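
For readers unfamiliar with `logging.disable`, the standard-library sketch below shows the effect of the two calls placed around the concurrent lookups. The logger name `demo` and the list-collecting handler are illustrative choices, not part of the notebook's logging setup.

```python
import logging


class ListHandler(logging.Handler):
    """Collects messages in a list so the effect is easy to inspect."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


handler = ListHandler()
demo = logging.getLogger("demo")
demo.setLevel(logging.INFO)
demo.addHandler(handler)
demo.propagate = False  # keep the demo out of any root handlers

logging.disable(logging.INFO)  # suppress INFO and below, process-wide
demo.info("hidden")
demo.warning("still shown")

logging.disable(logging.NOTSET)  # lift the override again
demo.info("visible again")

print(handler.messages)
```

Because `logging.disable` acts on the whole process, warnings and errors raised during the silenced lookups would still be logged, and the second call is needed before INFO logs reappear in later cells.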

In [8]:

print(await memory.summary())

QdrantMemoryStore: 7 episodes | collection=episodes
  newest: 2026-06-03 00:29:19 | What are Kenya's key geographic and demographic facts? Use t
  oldest: 2026-06-03 00:27:20 | What are Thailand's key geographic and demographic facts? Us
  key_fn: <lambda>
  metadata_fns: none

The Recency Baseline¶

Before running any synthesis tasks, it is worth checking what recency_memory() would recall for any query at this point, regardless of topic. Because the six lookups ran concurrently, the recording order depends on wall-clock completion time rather than submission order. Whatever three episodes happened to finish last appear below. A recency strategy would inject these same three episodes into the system prompt for both the Southeast Asia query and the Western Europe query that follow, regardless of whether they are relevant.

In [9]:

print("Most recent 3 episodes (RecencyMemory baseline for any query):\n")
most_recent = await memory.store.read_recent(3)
for ep in most_recent:
    print(f"  - {ep.task.instruction[:65]}")

Most recent 3 episodes (RecencyMemory baseline for any query):

  - What are Kenya's key geographic and demographic facts? Use the ge
  - What are Indonesia's key geographic and demographic facts? Use th
  - What are France's key geographic and demographic facts? Use the g
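
The ordering effect behind this baseline can be reproduced with plain asyncio. In the sketch below, `lookup`, the delays, and the `completion_order` list are hypothetical stand-ins for the agent runs and the memory store: results come back in submission order, while the side effect that plays the role of episode recording happens in completion order.

```python
import asyncio

completion_order: list[str] = []


async def lookup(name: str, delay: float) -> str:
    """Pretend lookup: sleeps, then records itself as finished."""
    await asyncio.sleep(delay)
    completion_order.append(name)
    return name


async def main() -> list[str]:
    # Delays are arbitrary stand-ins for variable model and network latency.
    jobs = [("Vietnam", 0.3), ("Indonesia", 0.1), ("France", 0.2)]
    return await asyncio.gather(*(lookup(n, d) for n, d in jobs))


# In a notebook cell, use `results = await main()` instead of asyncio.run.
results = asyncio.run(main())
print("gather results:  ", results)
print("completion order:", completion_order)
```

Here the gathered results keep the order Vietnam, Indonesia, France, whereas the completion order is Indonesia, France, Vietnam, which is why the "most recent" episodes in a concurrent run need not match the order of the `countries` list.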

Part 2 — Selective Recall: Southeast Asia¶

The synthesis task asks about Southeast Asian countries. The cell below defines the query and shows which episodes are most semantically similar to it. These are exactly what similarity_memory() injects into the system prompt when the task runs. Compare this to the recency baseline above.

In [ ]:

task8 = Task(
    instruction=(
        "Which of the countries you have researched are in Southeast Asia? "
        "Summarise their key facts and compare their populations. "
        "Answer from the country information you have already gathered. "
        "No tools are needed."
    ),
)

print("Episodes that will be recalled for this query:\n")
recalled_sea = await memory.store.recall(task8.instruction)
for ep in recalled_sea:
    print(str(ep))
    print()

Now run the task. The agent receives the recalled episodes above in its system prompt. Watch the logs: you should see no 🛠️ Executing Tool Call lines. The agent answers entirely from recalled episodes.

In [11]:

result8 = await agent.run(task8)
print(result8)

INFO (llm_agents_fs.LLMAgent) : 🚀 Starting task: Which of the countries you have researched are in Southeast Asia? Summarise their key facts and compare their populations. Answer fro...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) : ⚙️ Processing Step: Which of the countries you have researched are in Southeast Asia? Summarise their key facts and compare their populations. Answer ...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) : ✅ Step Result: From the information I have gathered, all three countries—Indonesia, Thailand, and Vietnam—are located in Southeast Asia. Here are thei...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) : No new step required.
INFO (llm_agents_fs.LLMAgent) : 🏁 Task completed: From the information I have gathered, all three countries—Indonesia, Thailand, and Vietnam—are located in Southeast Asia. Here are t...[TRUNCATED]
From the information I have gathered, all three countries—Indonesia, Thailand, and Vietnam—are located in Southeast Asia. Here are their key facts and a comparison of their populations:

### Indonesia:
- **Region**: Asia
- **Subregion**: Southeastern Asia
- **Capital**: Jakarta
- **Population**: 284,438,782
- **Area**: 1,904,569 square kilometers
- **Languages**: Indonesian
- **Currencies**: Indonesian rupiah

### Thailand:
- **Region**: Asia
- **Subregion**: Southeastern Asia
- **Capital**: Bangkok
- **Population**: 65,859,640
- **Area**: 513,120 square kilometers
- **Languages**: Thai
- **Currencies**: Thai baht

### Vietnam:
- **Region**: Asia
- **Subregion**: Southeastern Asia
- **Capital**: Hanoi
- **Population**: 101,343,800
- **Area**: 331,212 square kilometers
- **Languages**: Vietnamese
- **Currencies**: Vietnamese đồng

### Comparison of Populations:
- **Indonesia** has the largest population among the three, with approximately **284 million** people.
- **Vietnam** has the second-largest population, with around **101 million** people.
- **Thailand** has the smallest population among the three, with approximately **65.9 million** people.

In summary, Indonesia, Thailand, and Vietnam are all Southeast Asian countries, but Indonesia has the largest population, followed by Vietnam, and then Thailand.
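
The ranking and rounding in this answer can be cross-checked with a few lines of plain Python. The figures are copied from the lookups shown earlier; the `populations` dictionary is only an illustrative container.

```python
# Population figures as reported in the Part 1 lookups.
populations = {
    "Thailand": 65_859_640,
    "Vietnam": 101_343_800,
    "Indonesia": 284_438_782,
}

# Sort country names from largest to smallest population.
ranked = sorted(populations, key=populations.get, reverse=True)
for name in ranked:
    print(f"{name}: {populations[name] / 1e6:.1f} million")
```

The ordering matches the agent's summary: Indonesia first, then Vietnam, then Thailand.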

Part 3 — A Different Query, A Different Subset: European Economies¶

A second synthesis task asks about Western European countries. The query is semantically distant from Southeast Asia, so a different subset of episodes should surface, led by France and Germany. In the recency baseline above, France happened to be among the three most recent episodes but Germany was not, so a recency strategy would have supplied only half of the relevant context. Similarity search surfaces both because they are the most relevant to the query.

In [ ]:

task9 = Task(
    instruction=(
        "Which of the countries you have researched are in Western Europe? "
        "How do they compare in population and area? "
        "Answer from the country information you have already gathered. "
        "No tools are needed."
    ),
)

print("Episodes that will be recalled for this query:\n")
recalled_eu = await memory.store.recall(task9.instruction)
for ep in recalled_eu:
    print(str(ep))
    print()

Now run the task and confirm the agent answers from the recalled European episodes.

In [13]:

result9 = await agent.run(task9)
print(result9)

INFO (llm_agents_fs.LLMAgent) : 🚀 Starting task: Which of the countries you have researched are in Western Europe? How do they compare in population and area? Answer from the country...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) : ⚙️ Processing Step: Which of the countries you have researched are in Western Europe? How do they compare in population and area? Answer from the coun...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) : ✅ Step Result: From the information I have gathered, Germany and France are both located in Western Europe. Here are their key facts and a comparison ...[TRUNCATED]
INFO (llm_agents_fs.TaskHandler) : No new step required.
INFO (llm_agents_fs.LLMAgent) : 🏁 Task completed: From the information I have gathered, Germany and France are both located in Western Europe. Here are their key facts and a comparis...[TRUNCATED]
From the information I have gathered, Germany and France are both located in Western Europe. Here are their key facts and a comparison of their populations and areas:

### Germany:
- **Region**: Europe
- **Subregion**: Western Europe
- **Capital**: Berlin
- **Population**: 83,491,249
- **Area**: 357,114 square kilometers
- **Languages**: German
- **Currencies**: Euro

### France:
- **Region**: Europe
- **Subregion**: Western Europe
- **Capital**: Paris
- **Population**: 66,351,959
- **Area**: 543,908 square kilometers
- **Languages**: French
- **Currencies**: Euro

### Comparison:
- **Population**: Germany has a larger population than France, with approximately **83.5 million** people compared to **66.4 million** people in France.
- **Area**: France has a larger area than Germany, with approximately **543,908 square kilometers** compared to **357,114 square kilometers** in Germany.

In summary, both Germany and France are in Western Europe, but Germany has a larger population, while France has a larger area.

Key Takeaway¶

similarity_memory() gives the agent something recency-based memory cannot: context that is relevant to the current task, not merely recent.

The two queries issued in Parts 2 and 3 touched completely different regions. A recency strategy would have injected whichever three episodes happened to finish last into both queries. Those episodes are unlikely to be relevant to both Southeast Asia and Western Europe simultaneously. Similarity search retrieved the right subset for each query regardless of when those episodes were recorded.
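
For the run shown in this notebook, the cost of a recency strategy can be tallied directly. The sketch below uses the three most recent episodes printed in the recency baseline section. The variable names are illustrative, and a different run could finish in a different order.

```python
# Most recent three episodes in the run shown above.
recent = {"Kenya", "Indonesia", "France"}

# Countries each synthesis query actually needs.
needed = {
    "Southeast Asia": {"Thailand", "Vietnam", "Indonesia"},
    "Western Europe": {"France", "Germany"},
}

# Count how many needed episodes the recency window would have supplied.
coverage = {query: len(wanted & recent) for query, wanted in needed.items()}
for query, hits in coverage.items():
    print(f"{query}: recency covers {hits} of {len(needed[query])}")
```

In this run, recency would have supplied one of the three Southeast Asian episodes and one of the two Western European ones.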

The factory takes two parameters: collection (the Qdrant collection name, defaults to "episodes") and max_results (how many similar episodes to recall, defaults to 5). Swapping in recency_memory() or reflective_memory() requires changing only the one line that constructs the memory object — the agent and the rest of the notebook are unchanged.