Input the conversation history of Copilot Chat and Claude Code into ChromaDB and search for your “past self”

🔗 Series Table of Contents: This article is the Implementation Edition (2) of the AI Assistant Operations Notes - Practical record for raising Copilot / Claude Code as your partner series.

What you can learn from this article

Location and structure of conversation logs (JSONL) that are automatically saved by VSCode Copilot Chat and Claude Code
Differences in each JSONL format and design to absorb them with a common interface
How to search for “past self” in natural language by vectorizing with ChromaDB + Ollama
Steps to automatically submit at the end of a session using Claude Code’s Stop hook
Effects and points to note after accumulating dozens of sessions

Target audience

Those who want to use conversations with an AI assistant as permanent memory
Those who rely on search notes every time to ask “How did I solve the problem at that time”
Those who want to add conversation log import function to their own RAG
Those using VSCode Copilot Chat / Claude Code or both

Operating environment

Item	Version
OS	Windows 11
Python	3.13 (venv)
ChromaDB	Persistent mode (PersistentClient)
Embedding	Ollama `nomic-embed-text` (768 dimensions)
VSCode Copilot Chat	Auto save transcripts JSONL to `workspaceStorage`
Claude Code	Automatically save session JSONL to `~/.claude/projects/`

1. Introduction — “What was the solution to that problem?”Moments like this become more common when you work with an AI assistant every day.

What was the solution to PowerShell’s character encoding problem that Copilot taught me back then?

Last week, I discussed with Claude Code and decided on “Authentication bypass design using dev profile”, what was the conclusion?

I feel like I’m about to step into that trap I stepped on a month ago again…

The conversation is supposed to be saved somewhere as a JSONL file, but the file name is a UUID and the content is hundreds of thousands of lines of JSON. It is difficult to search using grep, and the original desire is to “retrieve past solutions using natural language.”

One day I realized.

**Isn’t the conversation log “RAG’s best material” that is generated every day? **

Technical blogs, design documents, Stack Overflow answers—there’s a lot of “other people’s knowledge” out there, but the only solutions you can find in your own context, in your own words, and to your own problems are in your own conversation logs. If you put this into RAG, you can literally create a search engine that allows you to search for your “past self.”

This article describes how this mechanism was created for both Copilot Chat and Claude Code.

2. Two types of transcripts

First, organize where and in what format each file is saved.

2.1 VSCode Copilot Chat

Save to:

C:\Users\<user>\AppData\Roaming\Code\User\workspaceStorage\<workspace-hash>\GitHub.copilot-chat\transcripts\<uuid>.jsonl
````There is a hashed directory for each workspace (folder opened in VSCode), and a **`transcripts/` folder** is dug in that directory and JSONL with UUID names is piled up.

JSONL structure (simplified):

```jsonl
{"type":"user.message","data":{"content":"How to write Virtual Threads in Java 21?"},"timestamp":"..."}
{"type":"assistant.message","data":{"content":"`Thread.ofVirtual().start(() -> { ... })`..."},"timestamp":"..."}
{"type":"assistant.turn_end","data":{...}}

1 row = 1 event. User/assistant messages can be identified by the type field, and data.content contains the text as a string. simple.

2.2 Claude Code

Save to:

C:\Users\<user>\.claude\projects\<project-slug>\<session-id>.jsonl

There is a folder for each project (slug converted from the full path of the repository), and in that folder there is a JSONL with the session ID name.

JSONL structure (simplified):```jsonl {“type”:“user”,“message”:{“role”:“user”,“content”:[{“type”:“text”,“text”:”…”}]}} {“type”:“assistant”,“message”:{“role”:“assistant”,“content”:[{“type”:“text”,“text”:”…”},{“type”:“thinking”,“thinking”:”…”},{“type”:“tool_use”,“name”:“Read”,“input”:{…}}]}} {“type”:“queue-operation”,“operation”:“enqueue”,…} {“type”:“attachment”,…} {“type”:“file-history-snapshot”,…}


**This one is a little complicated**:

- `type` field is `"user"` / `"assistant"` but **body is nested in array of `message.content`**
- Each element of the array is a **content block**, and `type` is divided into `"text"` / `"thinking"` / `"tool_use"` etc.
- User messages contain **auto-inserted tags** such as `<system-reminder>` `<ide_opened_file>`
- **Meta events** like `queue-operation` `attachment` `file-history-snapshot` are also in the same JSONL

In short, **Copilot Chat is a simple format where "1 text = 1 message" and Claude Code is a structured format where "content block array + meta-events are mixed"**. Even though they are called the same "conversation history", the format is different.

---

## 3. Common challenges and designs

I want to handle both in one interface, so I organized them like this.

### Common processing flow````
[JSONL file]
   ↓ Parsing (format dependent)
[user / assistant text pair (= turns)]
   ↓ Chunking (common)
[id/text/metadata set for ChromaDB]
   ↓ upsert (common)
[ChromaDB Collection]

The design is such that only the format-dependent part (parsing) is implemented separately, and everything after chunking is completely standardized. Specifically:

ingest_conversation.py — Parser for Copilot Chat + Call common input process
ingest_claude_code.py — Parser for Claude Code + calls the same common input process
src/db/store.py’s ingest_chunks() — ChromaDB populating logic called by both

Chunking rules

The natural unit of conversation is the “user 1 turn + assistant 1 response” pair. This is called turns.

The upper limit of embedding at one time is placed in MAX_CHUNK_CHARS = 1200 as a safe zone for nomic-embed-text, and we decided to combine 3 turns into 1 chunk.

TURNS_PER_CHUNK = 3
MAX_CHUNK_CHARS = 1200def turns_to_chunks(turns: list[dict], session_id: str, date_str: str) -> list[dict]:
    chunks = []
    for i in range(0, len(turns), TURNS_PER_CHUNK):
        group = turns[i:i + TURNS_PER_CHUNK]
        lines = []
        for t in group:
            lines.append(f"[user] {t['user']}")
            lines.append(f"[AI] {t['assistant']}")
            lines.append("")
        text = "\n".join(lines)
        if len(text) > MAX_CHUNK_CHARS:
            text = text[:MAX_CHUNK_CHARS]
        chunk_index = i // TURNS_PER_CHUNK
        chunks.append({
            "id": f"{session_id}::chunk_{chunk_index}",
            "text": text,
            "metadata": {
                "source_type": "conversation",
                "source_file": session_id,
                "title": f"Conversation session {date_str} (part {chunk_index + 1})",
                "date": date_str,
                "tags": "conversation,copilot", # or "conversation,claude-code"
                "chunk_index": chunk_index,
            },
        })
    return chunks
````Only `tags` of the metadata was differentiated by "`copilot`" or "`claude-code`", and the rest were completely the same. Now, when you `rag search`, conversations with Copilot and Claude Code will be listed in the search results regardless.

---

## 4. Incorporating Copilot Chat

This parser is the core of Copilot Chat (`ingest_conversation.py`).

```python
def parse_conversation(jsonl_path: Path) -> list[dict]:
    """Extract user/AI message pairs from JSONL"""
    turns = []
    user_msg = None

    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue

            rtype = record.get("type", "")
            timestamp = record.get("timestamp", "")if rtype == "user.message":
                content = record["data"].get("content", "").strip()
                if content:
                    user_msg = {"content": content, "timestamp": timestamp}

            elif rtype == "assistant.message":
                content = record["data"].get("content", "").strip()
                if content and user_msg:
                    turns.append({
                        "user": user_msg["content"],
                        "assistant": content,
                        "timestamp": user_msg["timestamp"],
                    })
                    user_msg = None

    return

Just pick up the flow user.message → assistant.message in order and load it into turns. type’s simplicity allows the parser to fit into 30 lines. I appreciated the honesty on Copilot Chat’s part.

Easily auto-detect the latest transcripts:

TRANSCRIPT_BASE = Path(r"C:\Users\<user>\AppData\Roaming\Code\User\workspaceStorage")def find_latest_transcript() -> Path:
    """Auto-detect latest transcripts across all workspaces"""
    candidates = []
    if TRANSCRIPT_BASE.exists():
        for jsonl in TRANSCRIPT_BASE.rglob("*.jsonl"):
            if "transcripts" in jsonl.parts:
                candidates.append(jsonl)
    if not candidates:
        raise FileNotFoundError(...)
    return max(candidates, key=lambda p: p.stat().st_mtime)

Recursively search all workspaces with rglob and return the latest JSONL with mtime. I made this available to the user via the rag conv command.

# Insert the latest conversation with one command
rag conv

# Input by specifying a specific past session
rag conv "C:\path\to\specific.jsonl"

5. Incorporating Claude Code — Fighting the Noise

Here’s where things get a little interesting. Claude Code is mixed with a lot of meta-events other than conversations, so a filter is necessary.

5.1 Things to take out/throw away| `type` | Something | Import |

|---|---|---| | user | User message | ✅ However, only the text block is extracted from the content array | | assistant | AI response | ✅ Similarly, extract only the text block | | queue-operation | Queue operation (internal processing event) | ❌ | | attachment | File attachment information | ❌ | | file-history-snapshot | File history | ❌ | | ai-title | Session title assignment | ❌ | | last-prompt | Last prompt meta | ❌ |

5.2 Processing content block arrays

User/AI Both messages message.content are arrays and have mixed block types.```python def _extract_text_blocks(content) -> str: """Concatenate and return only text block from message.content""" if isinstance(content, str): return content.strip() if not isinstance(content, list): return "" parts = [] for block in content: if not isinstance(block, dict): continue if block.get(“type”) == “text”: txt = block.get(“text”, "") if isinstance(txt, str) and txt.strip(): parts.append(txt.strip()) return “\n\n”.join(parts)


**Blocks other than `text` (`thinking` / `tool_use` / `tool_result`) are skipped**. `thinking` is the internal thinking of the AI, so it is better not to include it in the flow of conversation because search noise will be reduced, and `tool_use` is structured JSON, so it is not suitable for text search.

### 5.3 Removing auto-inserted tags

User messages are mixed with tags that Claude Code automatically inserts.```html
<system-reminder>The TodoWrite tool hasn't been used recently...</system-reminder>
<ide_opened_file>The user opened the file ...</ide_opened_file>
<command-message>...</command-message>

If you enter RAG with these remaining, when you search for “what was said in that conversation”, tag contents (general noise) will be hit. So I’ll remove it.

NOISE_TAG_RE = re.compile(
    r"<(system-reminder|ide_opened_file|ide_selection|command-message|command-name|command-args|local-command-stdout)>.*?</\1>",
    re.DOTALL,
)

def _strip_noise(text: str) -> str:
    if not text:
        return ""
    cleaned = NOISE_TAG_RE.sub("", text)
    cleaned = re.sub(r"\n{3,}", "\n\n", cleaned) # Clean up empty lines
    return cleaned.strip()

DOTALL cuts across line breaks at once. Just by doing this, I experienced a 30% increase in search accuracy.

5.4 Parser body

Combining these, the parser for Claude Code looks like this:```python def parse_conversation(jsonl_path: Path) -> list[dict]: turns: list[dict] = [] pending_user: dict | None = None

with open(jsonl_path, encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue

        rtype = record.get("type")
        msg = record.get("message", {}) or {}

        if rtype == "user":
            content = msg.get("content") if isinstance(msg, dict) else None
            text = _strip_noise(_extract_text_blocks(content))
            if text:
                pending_user = {"content": text, "timestamp": record.get("timestamp", "")}elif rtype == "assistant":
            content = msg.get("content") if isinstance(msg, dict) else None
            text = _extract_text_blocks(content) # AI response does not require denoising
            if text and pending_user:
                turns.append({
                    "user": pending_user["content"],
                    "assistant": text,
                    "timestamp": pending_user["timestamp"],
                })
                pending_user = None

return


It is the same "pair stacking" logic as for Copilot, but the key point is that **block extraction and noise removal are sandwiched in the previous stage**.

---

## 6. Claude Code is automatically entered with Stop hook

For Copilot Chat, `rag conv` is manually executed, but Claude Code has a mechanism called **Stop hook**. Any command can be executed as soon as the session ends. This allows you to **fully automate** it.

Write this in `~/.claude/settings.json`.```json
{
  "hooks": {
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "powershell.exe -NoProfile -ExecutionPolicy Bypass -File \"C:\\Users\\<user>\\git\\my-rag-brain\\scripts\\claude_code_stop_hook.ps1\""
          }
        ]
      }
    ]
  }
}

The called PowerShell script looks like this.

# read JSON from stdin, extract transcript_path and start ingest
$stdin = [Console]::In.ReadToEnd()
$payload = $stdin | ConvertFrom-Json
$transcript = $payload.transcript_path

if (-not (Test-Path $transcript)) { return }# Start in background (don't block Stop hook for long time)
$env:PYTHONIOENCODING = 'utf-8'
Start-Process -FilePath "<venv>\python.exe" `
    -ArgumentList @("<my-rag-brain>\src\pipeline\ingest_claude_code.py", $transcript) `
    -WindowStyle Hidden `
    -RedirectStandardOutput $stdoutLog `
    -RedirectStandardError $stderrLog

Two points:

Read transcript_path from stdin — Claude Code passes it to us in JSON
Background startup with Start-Process — to avoid blocking the Stop hook for a long time

Now, the moment you close Claude Code, the session will be automatically injected into ChromaDB. There’s nothing to do manually.

7. What has changed? A sense of security that allows you to search for your “past self”

After dozens of sessions, something like this started happening.

7.1 “How did you solve that?” problem disappeared

When I run into a character encoding error in PowerShell:

rag search "PowerShell UTF-8 garbled characters"

Then, the top 5 solutions that were created in the past when discussing the same problem with Copilot and Claude Code will be returned. You can instantly remember, “Oh, I put PYTHONIOENCODING=utf-8 in an environment variable.”

7.2 The conclusion of an argument is made permanentIn sessions where design decisions and operational policies are discussed, the three turns before and after the conclusion are saved as one chunk. Later, if you search for `What was the deciding factor behind that design decision?'' the entire discussion flow will come up. Context of` why we did it” that cannot be written in the design document is largely left behind.

7.3 You will be less likely to make the same mistake

The traps that my “past self” stepped into are mysteriously hidden close to my current self. When you look up past logs with rag search "<関連キーワード>" before starting a new task, you will often be reminded of forgotten notes.

Especially after installing Claude Code’s automatic stop hook insertion, my memories accumulate without me even being aware of it, so my search experience has gotten better and better.

8. Points to note/limitations

8.1 Risk of mixing in confidential information

Conversations may include code excerpts, API keys, URLs, and personal information. There is no problem as long as it is closed to the local ChromaDB, but it is essential that you do not commit the ChromaDB directory to the public repository using Git. I have chroma_db/ in .gitignore.

8.2 Noise removal rules require maintenance

Claude Code’s auto-insert tags may increase in the future. The list of tags removed by NOISE_TAG_RE this time is from the range I observed. There is an operational cost of adding new tags as they appear.

8.3 Old conversation embeddings may lose accuracy if not regenerated

When you update Ollama’s embedding model, the old chunk’s embedding and the new chunk’s embedding are in slightly different spaces. Full re-embedding required for full compatibility. This is done by using rag bulk to resubmit all entries.### 8.4 Search accuracy may decrease if you “put everything in”

An unexpected pitfall is that including noisy initial sessions and sessions that ended with errors reduces search accuracy. source_file By filtering by metadata or by dividing source_type into smaller sections, you can narrow down your search queries.

9. Summary

VSCode Copilot Chat automatically saves conversation history to workspaceStorage/.../transcripts/*.jsonl, and Claude Code automatically saves conversation history to ~/.claude/projects/<slug>/<session-id>.jsonl.
Both JSONLs have different structures (Copilot is simple, Claude Code has block arrays + meta-events mixed), but they can be absorbed by implementing separate parsers + standardizing chunking and later.
Noise removal (removal of tags such as <system-reminder>) resulted in a 30% difference in search accuracy
Claude Code side can be fully automated with Stop hook + background startup
After accumulating several dozen sessions, you will be able to search for “past self” in natural language. I can instantly remember that solution, that judgment, that trap.
Handling of confidential information, maintenance of noise removal rules, regeneration when updating embedding models — Although there is an operational cost, the returns obtained are overwhelmingly large.Related articles:
[Mechanism to prevent Copilot from making the same mistake twice — Design to have “memory of discussion” with RAG + MCP] (/blog/copilot-memory-rag-mcp) — MCP server implementation for Copilot to autonomously search in real time the conversation history input in this article
I used GitHub Copilot for 1 month and Claude Code for 2 days — Coding partners and agents are different things — Differences in the personalities of Copilot and Claude Code

Even if the AI forgets, I will record it. Then, when necessary, “remind” from this side. This is now my basic attitude when working with AI.