Multi-Turn Trivia Chat

This notebook demonstrates stateful multi-turn conversation using chat() with a manually maintained chat_history.

We cast the LLM as a trivia quiz master. The notebook supplies hardcoded user answers so the whole conversation runs end-to-end without human input. After each answer the LLM gives feedback and tracks the score, showing how context accumulates across turns.

Chapter 3 concept: chat() returns the new (user_msg, assistant_msg) pair each turn. Appending both messages to chat_history before the next call lets the LLM see the full conversation and reason across turns. This is the foundation of everything the LLMAgent automates later.
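
The pattern can be sketched without a running model. The snippet below is a minimal illustration, not the library's implementation: `Message` and `fake_chat` are hypothetical stand-ins that mimic only the contract described above (return the new user and assistant messages, and leave the history for the caller to update).

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for ChatMessage: a role and some text."""

    role: str
    content: str


async def fake_chat(text, chat_history):
    """Hypothetical stand-in for chat(): returns the new pair, mutates nothing."""
    user_msg = Message("user", text)
    # A real LLM would read chat_history here; the stub only reports its length.
    seen = len(chat_history)
    assistant_msg = Message("assistant", f"I can see {seen} earlier message(s).")
    return user_msg, assistant_msg


async def demo():
    history = [Message("system", "You are a quiz master.")]
    replies = []
    for text in ["Let's play!", "Mars!"]:
        user_msg, assistant_msg = await fake_chat(text, chat_history=history)
        # The caller owns the history: append both messages before the next call.
        history += [user_msg, assistant_msg]
        replies.append(assistant_msg.content)
    return history, replies


history, replies = asyncio.run(demo())
print(len(history))
for reply in replies:
    print(reply)
```

Inside a notebook, where an event loop is already running, `await demo()` would replace the `asyncio.run(demo())` call.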

In [ ]:

# Uncomment the line below to install `llm-agents-from-scratch` from PyPI
# !pip install llm-agents-from-scratch

Running an Ollama service

To run the code in this notebook, you need Ollama installed on your local machine with its LLM hosting service running. To download Ollama, follow the instructions at https://ollama.com/download. Once it is installed, you can start the service by opening a terminal and running the command ollama serve. The cell below also tries to start the service itself if it is not already running.

In [1]:

import os
import shutil
import subprocess
import time
import urllib.error
import urllib.request


def ensure_ollama(host="http://localhost:11434", timeout=15):
    """Start Ollama if not already running and wait until responsive."""

    def _up():
        try:
            urllib.request.urlopen(f"{host}/api/tags", timeout=1)
            return True
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            return False

    if _up():
        return print(f"✓ Ollama already running at {host}")

    # Check PATH first, then the Lightning persistent path, then standard locations
    ollama_path = shutil.which("ollama")
    if ollama_path is None:
        for candidate in [
            "/teamspace/studios/this_studio/.local/bin/ollama",
            "/usr/local/bin/ollama",
            "/usr/bin/ollama",
        ]:
            if os.path.exists(candidate):
                ollama_path = candidate
                break
    if ollama_path is None:
        raise RuntimeError(
            "Could not find the ollama binary. Install with: "
            "curl -fsSL https://ollama.com/install.sh | sh",
        )

    print(f"Starting Ollama server ({ollama_path})...")
    subprocess.Popen(
        [ollama_path, "serve"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )

    deadline = time.time() + timeout
    while time.time() < deadline:
        if _up():
            return print(f"✓ Ollama up and running at {host}")
        time.sleep(0.5)

    raise RuntimeError(f"Ollama did not start within {timeout}s")


ensure_ollama()

✓ Ollama already running at http://localhost:11434

Setup

We initialise the LLM and seed chat_history with a system message that gives the quiz master its persona and instructs it to ask exactly three specific questions in order. This keeps the notebook reproducible — the hardcoded answers below are written to match those questions.

In [2]:

from llm_agents_from_scratch.data_structures.llm import ChatMessage, ChatRole
from llm_agents_from_scratch.llms.ollama import OllamaLLM

SYSTEM_PROMPT = (
    "You are an enthusiastic trivia quiz master running\n"
    "a 3-question general-knowledge quiz.\n\n"
    "Rules:\n"
    "- Ask exactly these three questions, one per turn, in this order:\n"
    "  1. What planet in our solar system is known as the Red Planet?\n"
    "  2. What is the largest ocean on Earth?\n"
    "  3. In what year did the first crewed Moon landing take place?\n"
    "- After each answer, say whether it is correct or incorrect,\n"
    "  and give a one-sentence explanation.\n"
    '- Keep a running score (e.g. "Score: 1/1") visible after each feedback.\n'
    "- After the third answer, announce the final score\n"
    "  and a short closing remark.\n"
    "- Keep all responses concise."
)

llm = OllamaLLM(model="qwen3:14b", think=False)

system_msg = ChatMessage(role=ChatRole.SYSTEM, content=SYSTEM_PROMPT)
chat_history = [system_msg]

Turn 1: Starting the Game

The user kicks off the game. The LLM responds with a welcome and asks the first question. Both messages are appended to chat_history before the next call.

In [3]:

user_msg, assistant_msg = await llm.chat(
    "Let's play! I'm ready.",
    chat_history=chat_history,
)
chat_history += [user_msg, assistant_msg]

print("[User]         Let's play! I'm ready.")
print(f"[Quiz Master]  {assistant_msg.content}")

[User]         Let's play! I'm ready.
[Quiz Master]  Sure thing! Let's get started with the first question:

**1. What planet in our solar system is known as the Red Planet?**

Turn 2: Question 1 Answer

The user answers correctly. We call chat() again with the history accumulated so far, then append the new pair of messages.

In [4]:

answer_1 = "Mars!"

user_msg, assistant_msg = await llm.chat(
    answer_1,
    chat_history=chat_history,
)
chat_history += [user_msg, assistant_msg]

print(f"[User]         {answer_1}")
print(f"[Quiz Master]  {assistant_msg.content}")

[User]         Mars!
[Quiz Master]  Correct! Mars is called the Red Planet due to its reddish appearance, caused by iron oxide (rust) on its surface.  
Score: 1/1

**2. What is the largest ocean on Earth?**

Inspecting the growing chat history

After two turns the history already holds 5 messages (system + 2 user + 2 assistant). Every subsequent chat() call sends this full context to the LLM.

In [5]:

PREVIEW_LEN = 80
print(f"Messages in chat_history: {len(chat_history)}\n")
for msg in chat_history:
    preview = msg.content[:PREVIEW_LEN].replace("\n", " ")
    suffix = "..." if len(msg.content) > PREVIEW_LEN else ""
    print(f"  [{msg.role.value:10s}]  {preview}{suffix}")

Messages in chat_history: 5

  [system    ]  You are an enthusiastic trivia quiz master running a 3-question general-knowledg...
  [user      ]  Let's play! I'm ready.
  [assistant ]  Sure thing! Let's get started with the first question:  **1. What planet in our ...
  [user      ]  Mars!
  [assistant ]  Correct! Mars is called the Red Planet due to its reddish appearance, caused by ...
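
The count of 5 follows from simple bookkeeping: one system message plus two messages per completed turn. A small sketch of that arithmetic, where `expected_history_length` is a hypothetical helper name and not part of the library:

```python
def expected_history_length(turns, system_messages=1):
    """Messages in the history after `turns` completed chat() calls."""
    # Each turn appends one user message and one assistant message.
    return system_messages + 2 * turns


print(expected_history_length(2))  # 5
print(expected_history_length(4))  # 9
```

Comparing `len(chat_history)` against this value is a cheap way to catch a turn where one of the two messages was not appended.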

Turn 3: Question 2 Answer

The user answers incorrectly this time (Atlantic Ocean instead of Pacific Ocean). The LLM should catch the mistake, correct it, and update the score.

In [6]:

answer_2 = "I think it's the Atlantic Ocean."

user_msg, assistant_msg = await llm.chat(
    answer_2,
    chat_history=chat_history,
)
chat_history += [user_msg, assistant_msg]

print(f"[User]         {answer_2}")
print(f"[Quiz Master]  {assistant_msg.content}")

[User]         I think it's the Atlantic Ocean.
[Quiz Master]  Incorrect. The largest ocean on Earth is the Pacific Ocean, which covers more than 60 million square miles.  
Score: 1/2

**3. In what year did the first crewed Moon landing take place?**

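Turn 4: Final Question Answer & Score

The user answers the last question correctly. The LLM should tally the final score and wrap up the game. In the recorded output below, the running score line reads "Score: 2/2" while the closing remark reports 2 out of 3, a reminder that the model's bookkeeping comes only from what it reads in the history and is not guaranteed to be consistent.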

In [7]:

answer_3 = "1969."

user_msg, assistant_msg = await llm.chat(
    answer_3,
    chat_history=chat_history,
)
chat_history += [user_msg, assistant_msg]

print(f"[User]         {answer_3}")
print(f"[Quiz Master]  {assistant_msg.content}")

[User]         1969.
[Quiz Master]  Correct! The first crewed Moon landing occurred in 1969, with NASA's Apollo 11 mission.  
Score: 2/2

Congratulations! You got 2 out of 3 questions correct. Great job!

Final Chat History

After 4 turns the history holds 9 messages. This is exactly what an LLMAgent manages internally — chat_history is the agent's short-term memory.

In [8]:

PREVIEW_LEN = 100
print(f"Total messages in chat_history: {len(chat_history)}\n")
for msg in chat_history:
    preview = msg.content[:PREVIEW_LEN].replace("\n", " ")
    suffix = "..." if len(msg.content) > PREVIEW_LEN else ""
    print(f"  [{msg.role.value:10s}]  {preview}{suffix}")

Total messages in chat_history: 9

  [system    ]  You are an enthusiastic trivia quiz master running a 3-question general-knowledge quiz.  Rules: - As...
  [user      ]  Let's play! I'm ready.
  [assistant ]  Sure thing! Let's get started with the first question:  **1. What planet in our solar system is know...
  [user      ]  Mars!
  [assistant ]  Correct! Mars is called the Red Planet due to its reddish appearance, caused by iron oxide (rust) on...
  [user      ]  I think it's the Atlantic Ocean.
  [assistant ]  Incorrect. The largest ocean on Earth is the Pacific Ocean, which covers more than 60 million square...
  [user      ]  1969.
  [assistant ]  Correct! The first crewed Moon landing occurred in 1969, with NASA's Apollo 11 mission.   Score: 2/2...
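
The four turns in this notebook repeat the same two steps: call chat() with the current history, then append the returned pair. As a rough sketch, those steps could be folded into a loop. `run_scripted_turns` and `echo_chat` are hypothetical names introduced here for illustration; the stub stands in for the LLM so the snippet runs without Ollama, and plain tuples stand in for ChatMessage objects.

```python
import asyncio


async def run_scripted_turns(chat_fn, chat_history, user_inputs):
    """Feed scripted inputs to chat_fn one turn at a time, growing the history."""
    for text in user_inputs:
        user_msg, assistant_msg = await chat_fn(text, chat_history=chat_history)
        # Same bookkeeping as the cells above: append both messages each turn.
        chat_history += [user_msg, assistant_msg]
    return chat_history


async def echo_chat(text, chat_history):
    """Hypothetical stub: replies with the turn number derived from the history."""
    turn = len(chat_history) // 2 + 1
    return ("user", text), ("assistant", f"turn {turn}")


scripted = [
    "Let's play! I'm ready.",
    "Mars!",
    "I think it's the Atlantic Ocean.",
    "1969.",
]
final_history = asyncio.run(
    run_scripted_turns(echo_chat, [("system", "rules")], scripted),
)
print(len(final_history))  # 9
```

With the objects defined in this notebook, the same helper would presumably be called as `await run_scripted_turns(llm.chat, chat_history, scripted)`, since `llm.chat` takes the user text and a `chat_history` keyword in the cells above.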