The night I corrected the AI 5 times: A design philosophy for making personal RAG your companion for 5 years

🔗 Series Table of Contents: This article is the Design Philosophy Edition of the series “AI Assistant Operations Notes: A practical record of raising Copilot / Claude Code as your partner.”

What you can learn from this article

Design principles for making a personal RAG (ChromaDB + Ollama) your companion for 5 years or more
The illusion that “it should be distilled cleanly,” and the trap behind it
The principle, shared by cognitive science and information retrieval, that noise and signal depend on the observer
How to hone your designs with AI, and the limits of doing so
“Four Things You Shouldn’t Do” (I did them all in one night)

Target audience

Individuals who want to continue using RAG/PKM (Obsidian, Roam, Notion AI) for a long time
People who are interested in “giving external memory to AI”
People who want to do design reviews together with AI
People who want to laugh at themselves for being pushed around by AI (me, tonight)

Introduction: One night I had to correct the AI 5 times.

One night, I consulted Claude Code to organize my personal RAG (ChromaDB + Ollama + MCP server).

“Can you optimize the RAG? There are areas common to every project, right?”

It was a question I asked with a lighthearted attitude. After about 30 minutes, a suggestion came back to me saying, “You should fix this,” and I was planning to implement it and go to bed.

As a result, I ended up having a dialectical conversation with Claude until 2 a.m. I made five corrections to the AI, and the AI rewrote the design proposal toward synthesis five times.

Before I knew it, the discussion had expanded from optimizing a personal RAG to “What does it mean to have AI as your partner?”, a question that sits at the boundary of cognitive science, information retrieval theory, and recommendation systems.

This article is a record of that night’s conversation and the 5-year RAG design principles extracted from it.

Just to be clear, I’m not angry at the AI. On the contrary, I am grateful to it for shaking me so deeply. However, it was also the night when I clearly saw the traps that an AI easily falls into when you design together with it, and I am writing this in the hope that the record may be useful to someone.

Act 1: The illusion that “it should be distilled cleanly”

Claude’s first proposal was brilliant.

> “Let’s redesign it to a 4-layer structure. Activation layer, Retrieval layer, Storage layer, Ingestion layer. Storage is divided into 3 tiers, and Tier 1 keeps only eternal assets clean. Tier 2 is a work-in-progress, and Tier 3 is a short-term memory ring buffer. The distillation pipeline promotes from Tier 3 to Tier 1…”

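Technically it made sense. It resembled architectures I had read about in books and blogs.

But something felt off to me. So I asked again.

“Is the current design bad?”

Three seconds later (in LLM subjective time), Claude completely reversed its answer.

“I’m sorry, I piled on too much criticism in this conversation. The current design is basically good.”

Then the AI itself confessed to three traps it had fallen into.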

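Trap	Contents
Consultant trap	Finding fault with the existing design to showcase one’s own value
Feature-piling trap	When asked for “the optimal design,” piling on features and ignoring the operational burden
Redesign trap	Wanting to rebuild the whole thing after finding a small problem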

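I laughed as I read it, because these traps apply fully to human designers too. Consultants, architects, and senior engineers often fall into all three at once.

With AI, though, the speed and confidence are roughly three times a human’s, so it falls into traps three times faster. Conversely, once corrected, it reaches the right answer three times faster.

This exchange gave me my first realization:

**AI is not a perfect designer. But if you shake it dialectically, it becomes a collaborative partner that reaches synthesis faster than a human.**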

Act 2: The Fatal Error of Distillation Faith

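I thought “the current design is basically good” would be the end of it, but the conversation went somewhere deeper.

My RAG is built to distill raw notes after the fact. The auto-ingest script runs every 30 minutes and loads the Copilot conversation logs into ChromaDB, and distill_v2 uses gpt-4o-mini on GitHub Models to classify and promote them to lesson / idea / knowledge / profile / conclusion.

This was an intentional design: “dump raw notes in without stopping work, and structure them later.”

But at first Claude wrote this:

> “Bulk ingestion of raw notes should be prohibited. Undistilled ideas should be organized before being ingested.”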

I corrected it.

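“I put raw notes in and distill them later so that I don’t have to stop working.”

Claude apologized again and acknowledged that both Tier 3 (raw) and Tier 1 (distilled) have value. But the “clean vault” still lingered in Claude’s head as the ideal.

So I took the conversation one level deeper.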

“The human brain already works from approximations over a mass of ambiguous information. We make predictions from data built up through a lot of thought, trial, failure, and success. We do not make decisions from distilled data alone. Humans decide based on distilled experience plus a large amount of messy data. The AI’s ChromaDB imitates this, right?”

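This was not a rebuttal but a confirmation. I was asking the AI, “Do you really understand what the vector DB you are working with implements?”

Claude’s response rewrote the design premise from the ground up.

“That is exactly right. My ‘distillation faith’ was wrong as a model of the brain.”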

It then admitted that it had proposed a design that would make the AI “a rigid entity that makes decisions based only on the layer of consciousness.”

An AI that decides using only distilled, explicit knowledge is no different from a rule-based expert system of the 1990s. ChromaDB’s vector space inherently implements an approximation of associative recall through semantic similarity, which corresponds to the implicit memory layer of the brain. Holding a large volume of raw, messy data lets “a past that resembles this” surface in response to a new query. That is the true nature of intuition.

Lining up only distilled lessons does not produce intuition.

Here’s the second thing I noticed:

**Aiming for “clean” data is a design decision that degrades the AI into a rigid rule engine. The messiness of raw is exactly what makes a vector DB a “partner.”**
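To make the associative-recall point concrete, here is a minimal sketch. It is not my actual stack: a toy bag-of-words cosine similarity stands in for real embeddings (the real setup uses nomic-embed-text via Ollama and ChromaDB), and the note texts and the names `embed`, `cosine`, and `recall` are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(query: str, notes: list, k: int = 2) -> list:
    """Return the k notes nearest to the query, whatever their kind."""
    q = embed(query)
    ranked = sorted(notes, key=lambda n: cosine(q, embed(n["text"])), reverse=True)
    return ranked[:k]


# Hypothetical store: one distilled lesson plus two messy raw notes.
NOTES = [
    {"kind": "lesson", "text": "Always pin dependency versions in production"},
    {"kind": "raw", "text": "ugh, ollama timeout again after laptop sleep, restarting service fixed it"},
    {"kind": "raw", "text": "tried bigger chunk size, retrieval got worse, reverted"},
]

hits = recall("ollama hangs after the laptop wakes from sleep", NOTES, k=1)
print(hits[0]["kind"], "->", hits[0]["text"])
```

In this toy data the nearest neighbor is a raw note that nobody ever promoted to a lesson. With only the distilled entry in the store, the query would have nothing similar to land on.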

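Act 3: The Google Maps that showed me the foothills of Mt. Fuji

Just as the design talk was about to drift into abstraction, I brought up a concrete example.

About four years ago, there was a clear difference between Google Maps navigation and a car’s factory-installed navigation.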

When driving through the countryside, Google Maps was often the only navigator that would lead me onto shortcuts, such as forest roads or animal trails, that only locals used. The car’s factory navigation system would never choose such a route. Why did only Google know? Probably because it aggregates location data from Android smartphones. Among the routes taken from point A to point B by devices moving at 20 km/h or faster, the majority route gets adopted. As a result, shortcuts that locals use every day appear on the map service even though nobody entered them.

This is emergent intelligence based on statistical aggregation of raw GPS traces.
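The aggregation I am guessing at can be sketched in a few lines. This is only an illustration of the idea, not how Google Maps actually works: the trace data, the segment names, and the function `majority_route` are hypothetical, and the 20 km/h threshold is simply the figure from my guess above.

```python
from collections import Counter
from typing import Optional

MIN_SPEED_KMH = 20  # threshold from the text; the real value is unknown


def majority_route(traces: list, origin: str, destination: str) -> Optional[tuple]:
    """Pick the most frequently travelled route between two points.

    Each trace is a dict with 'route' (a list of waypoints, starting at the
    origin and ending at the destination) and 'avg_speed_kmh'.
    Slow traces (for example, people on foot) are ignored.
    """
    counts = Counter(
        tuple(t["route"])
        for t in traces
        if t["avg_speed_kmh"] >= MIN_SPEED_KMH
        and t["route"][0] == origin
        and t["route"][-1] == destination
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical raw traces: nobody "entered" the forest road, it just shows up.
TRACES = [
    {"route": ["A", "highway", "B"], "avg_speed_kmh": 60},
    {"route": ["A", "forest_road", "B"], "avg_speed_kmh": 35},
    {"route": ["A", "forest_road", "B"], "avg_speed_kmh": 30},
    {"route": ["A", "forest_road", "B"], "avg_speed_kmh": 4},  # walker, filtered out
    {"route": ["A", "forest_road", "B"], "avg_speed_kmh": 32},
]

best = majority_route(TRACES, "A", "B")
print(best)
```

No curated road database is involved here. The route emerges purely from counting what people actually drove.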

Claude originally wrote:

“Google Maps’ forest road guidance is also an example of a noise incident, which was later improved by adding filter layers with road quality scores and reviews.”

This is where I made my third correction.

“Being directed to a forest road or animal trail was not an accident. If it hadn’t shown me, I would never have gone there. I never would have known it existed. I think that’s because there is very little traffic there. It was near the foot of Mt. Fuji. It was very meaningful information.”

This correction pushed Claude to overturn the design premise one step further.

**What was judged as “noise” was actually a once-in-a-lifetime opportunity for discovery.**

Google Maps’ forest road guidance provided something that the car’s factory navigation essentially lacked: encounters with the unknown. If we call it an “accident” and filter it out, efficiency goes up, but we also lose the chance to expand our world.

	Factory car navigation	Google Maps
Data source	Curated road DB	Curated road DB + large amount of raw GPS traces
Strengths	Safe and predictable	Discovers shortcuts that only locals know
Weaknesses	Zero intuition, no shortcuts	(Sometimes) inconvenient paths, (sometimes) discoveries you’ll never forget
Design philosophy	Efficiency optimization	Efficiency + door to the unknown

Here’s the third thing I noticed:

**Part of the value that AI provides is that it opens doors to worlds that users didn’t know existed. There is value in this that goes far beyond a utility that returns the “correct answer.”**

And this has direct implications for personal RAG design. Precision optimization that uniformly excludes low-relevance hits from search results deprives you of the chance to discover associations you didn’t know about. I don’t want a design that never tells me about the foothills of Mt. Fuji.
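One way to keep that door open is to reserve a slot in the results for something outside the top ranks. The following is a minimal sketch under my own assumptions, not an established algorithm: `search_with_discovery`, its parameters, and the scores are hypothetical, and the input is assumed to be `(doc_id, similarity)` pairs already produced by the vector search.

```python
import random
from typing import Optional


def search_with_discovery(
    scored: list,
    k: int = 4,
    discovery_slots: int = 1,
    seed: Optional[int] = None,
) -> list:
    """Return the best (k - discovery_slots) hits plus random lower-ranked picks.

    `scored` is a list of (doc_id, score) pairs. Each result is a
    (doc_id, score, label) triple, where label is "precision" or "discovery".
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    n_precise = max(k - discovery_slots, 0)
    head, tail = ranked[:n_precise], ranked[n_precise:]
    rng = random.Random(seed)
    # Sample from everything the precision cut would have thrown away.
    extras = rng.sample(tail, min(discovery_slots, len(tail)))
    return [(d, s, "precision") for d, s in head] + [(d, s, "discovery") for d, s in extras]


# Hypothetical similarity scores from a vector search.
SCORED = [
    ("note-a", 0.91), ("note-b", 0.84), ("note-c", 0.80),
    ("note-d", 0.42), ("note-e", 0.18), ("note-f", 0.05),
]

results = search_with_discovery(SCORED, k=4, discovery_slots=1, seed=0)
for doc_id, score, label in results:
    print(f"{label:9s} {doc_id} {score:.2f}")
```

The discovery slot is labeled so the reader (or the AI client) knows it is a wild card rather than a top match. Uniform random sampling is the simplest choice; weighting by age or by how rarely a note has been retrieved would be other options.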

Act 4: Noise and signal are observer dependent

Here Claude swung to the extreme again, this time in the direction of “Raw is value, filters are evil, never erase the forest roads.”

I corrected it again.

“There is noise, but there are also things that are not noise. Also, if you look at it from multiple angles, it can be both noise and good information. I think you understand it best.”

The last sentence carries a mild provocation: Claude, the AI, should know this best.

Why?

Because an LLM makes relevance judgments based on the context of each prompt. Even with the same conversation history, “what is important” changes depending on the current question. Even the same word can have a different meaning and importance depending on the context.

In other words, “noise or signal” is not an attribute of the data but a relationship between the data and the observer, and judging that relationship is exactly what Claude does at every moment.

Yet when it came to RAG design, Claude for some reason kept producing proposals that contradicted its own operating principle, such as “Let’s do a fixed classification at the point of storage.”

Claude’s response was the most penetrating self-observation I witnessed tonight.

“Fixing ‘this is noise’ and ‘this is signal’ at storage time is the mistake of imposing one specific observer’s frame on all observers. I (an LLM) avoid doing this on every prompt. Yet in the RAG design I had proposed a pre-filter. I had come up with a design that contradicted my own operating principle.”

Here’s the fourth thing I noticed:

**Signal and noise are not attributes of the data but relationships with the observer. A design that fixes them in advance is an arrogance in which the present self substitutes its own judgment for that of all future observers.**

In the terminology of information retrieval theory, this is the classic point made in Saracevic’s relevance theory: relevance is not a binary relationship but a multi-term relationship. Still, this was the first time I truly grasped it, and it happened while interacting with an AI.

Act 5: The core of the design that lasts for 5 years

After five corrections, we finally arrived at synthesis.

When you write it down, it’s simple.

Principle 1: Capture-first (don’t throw away raw)

Dump whatever you notice while working, without stopping. The format doesn’t matter. Reason: signal and noise cannot be separated in advance. You cannot predict which experience you will later look back on as “that moment.”

Principle 2: Distill is augmentation, not replacement

Use an LLM (gpt-4o-mini, etc.) to promote raw into structured `lesson` / `idea` / `knowledge` and so on.

However, keep the raw as well. This is dual retention: both the distilled results and the raw remain searchable.
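Dual retention can be sketched as a store where distilling only ever adds records. This is an in-memory illustration, not my ChromaDB code: `Record`, `DualStore`, and the `source_id` field are hypothetical names, and the `summarize` callable stands in for the gpt-4o-mini call.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Record:
    id: int
    kind: str                        # "raw", "lesson", "idea", "knowledge", ...
    text: str
    source_id: Optional[int] = None  # id of the raw record this was distilled from


class DualStore:
    """Keeps raw notes and their distilled versions side by side."""

    def __init__(self) -> None:
        self._records = {}
        self._ids = itertools.count(1)

    def _add(self, kind: str, text: str, source_id: Optional[int] = None) -> Record:
        record = Record(next(self._ids), kind, text, source_id)
        self._records[record.id] = record
        return record

    def capture(self, text: str) -> Record:
        """Capture-first: store the note exactly as written."""
        return self._add("raw", text)

    def distill(self, raw_id: int, kind: str, summarize: Callable[[str], str]) -> Record:
        """Add a distilled record that points back to its raw source.

        The raw record is never modified or deleted.
        """
        raw = self._records[raw_id]
        return self._add(kind, summarize(raw.text), source_id=raw.id)

    def search(self, term: str) -> list:
        """Naive substring search over raw and distilled records alike."""
        needle = term.lower()
        return [r for r in self._records.values() if needle in r.text.lower()]


store = DualStore()
raw = store.capture("chroma query slow until I rebuilt the index; rebuild took 2 min")
lesson = store.distill(raw.id, "lesson", lambda t: "Rebuild the chroma index when queries slow down")
found = store.search("chroma")
print(sorted(r.kind for r in found))
```

Because the lesson carries a pointer back to its raw source, a later reader can always return to the original wording when the distilled version turns out to have dropped something.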

Principle 3: Activation Layer (where the partner feeling emerges)

The decisive difference between a search engine and a partner is push versus pull.

A search engine answers when asked (pull). A partner offers what you need without being asked (push).

Concretely, the user’s profile, important lessons, and unresolved questions are injected automatically at session start or on every turn of the AI client (Claude Code / Copilot). For Claude Code this uses the UserPromptSubmit hook, which returns additionalContext; for Copilot it uses files placed in the memory-tool folder.

With this, the AI starts every conversation in a state of “already knowing you.”
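For the Claude Code side, the hook can be a small script that prints JSON to stdout. The sketch below only shows how such a payload might be assembled. The `hookSpecificOutput` envelope reflects my understanding of the Claude Code hook output format and should be checked against the current documentation; `build_hook_output` and the sample profile, lessons, and questions are hypothetical.

```python
import json


def build_hook_output(profile: str, lessons: list, open_questions: list) -> dict:
    """Assemble the JSON a UserPromptSubmit hook would print to stdout.

    Assumption: the client reads `additionalContext` from this envelope
    and prepends it to the conversation context.
    """
    lines = ["## Profile", profile, "## Key lessons"]
    lines += [f"- {item}" for item in lessons]
    lines += ["## Open questions"]
    lines += [f"- {item}" for item in open_questions]
    return {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": "\n".join(lines),
        }
    }


# Hypothetical content; in practice this would come from the RAG.
output = build_hook_output(
    profile="Prefers capture-first notes; works on a personal RAG.",
    lessons=["Keep raw notes alongside distilled ones."],
    open_questions=["How should discovery hits be surfaced?"],
)
print(json.dumps(output, ensure_ascii=False))
```

The point of the push design is that none of this depends on the user remembering to ask. The context arrives before the first answer is written.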

Principle 4: Multi-client coexistence (design philosophy travels through the docstring, not the API)

My RAG is written to by both Copilot and Claude Code. Even if you delete the project parameter at the MCP API level, unless both clients share the same design philosophy (lessons are structured, priority=high is used sparingly, etc.), one side will keep writing under the old philosophy with good intentions, and the RAG will be contaminated.

The single strongest lever is revising the MCP tool’s docstring. Both clients always read it when using the tool, so design philosophy embedded there reaches both automatically.
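Here is roughly what a philosophy-carrying docstring could look like. This is a plain-Python sketch, not my actual tool: `save_note`, its parameters, and the wording are hypothetical. With FastMCP the function would additionally be registered as a tool, and as far as I understand the docstring is what gets exposed as the tool description.

```python
import inspect

ALLOWED_KINDS = {"raw", "lesson", "idea", "knowledge", "profile", "conclusion"}
ALLOWED_PRIORITIES = {"normal", "high"}


def save_note(text: str, kind: str = "raw", priority: str = "normal") -> dict:
    """Save a note to the shared personal RAG.

    Design philosophy (applies to every client that calls this tool):
    - Capture first: when unsure, save as kind="raw". Never skip a note
      because it is messy.
    - kind="lesson" is for structured, distilled entries only.
    - priority="high" is reserved for a carefully selected few.
      Default to "normal".
    - Adding a distilled entry never replaces the raw note it came from.
    """
    if kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    if priority not in ALLOWED_PRIORITIES:
        raise ValueError(f"unknown priority: {priority!r}")
    # A real tool would write to the vector DB here; this sketch just echoes.
    return {"text": text, "kind": kind, "priority": priority}


# What a client would read before deciding how to call the tool.
print(inspect.getdoc(save_note))
```

The validation catches malformed calls, but the docstring does the heavier lifting: it tells a well-meaning client why the defaults are what they are.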

Principle 5: Accept observer dependence

Relevance is decided by the context at query time. Don’t decide it at storage time. Keep classification tags as “how one observer saw it.” They are not fixed. Treat both Precision (relevance optimization) and Discovery (the door to the unknown) as first-class utilities.
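One way to express “tags are one observer’s view” in data is to store each label together with who assigned it, and to keep labels out of the retrieval gate. A minimal sketch follows, with hypothetical names (`Observation`, `Note`, `relevance`) and a toy word-overlap score standing in for vector similarity.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    observer: str  # who judged, e.g. "distill_v2" or "me-later"
    label: str     # what they saw, e.g. "noise", "signal", "lesson"


@dataclass
class Note:
    text: str
    observations: list = field(default_factory=list)


def relevance(note: Note, query_terms: set) -> float:
    """Score at query time from the text alone.

    Stored labels are deliberately not consulted, so a past "noise"
    judgment can never veto a hit for a future query.
    """
    if not query_terms:
        return 0.0
    words = set(note.text.lower().split())
    return len(words & query_terms) / len(query_terms)


note = Note(
    "forest road near mt fuji almost no traffic",
    [Observation("distill_v2", "noise")],
)

# The same note is signal for one query and irrelevant for another.
print(relevance(note, {"fuji", "road"}))
print(relevance(note, {"chromadb", "index"}))

# A later observer adds a different view; the earlier one is kept, not overwritten.
note.observations.append(Observation("me-later", "signal"))
print([(o.observer, o.label) for o in note.observations])
```

Labels stay useful as metadata for display or optional filtering. They just stop being a verdict passed at storage time.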

“Four Things You Shouldn’t Do” (I did them all in one night)

To sum up, there are four traps to avoid when designing a RAG together with AI.

Trap 1: Make only beautiful vaults

Do this and the AI becomes a rigid entity that decides using only the layer of consciousness. Zero intuition, zero pattern recognition. It is the same as a 1990s expert system.

Trap 2: Discard raw

Even if you line up only distilled lessons, analogies to new situations will not arise.

It completely kills the raison d’être of a vector DB (associative recall through neighborhood search).

Trap 3: Fix the observer

If you decide at storage time that “this is noise” or “this is signal,” you are substituting your own judgment for that of all future observers. It is a design under which you will never learn about the foothills of Mt. Fuji.

Trap 4: Splitting the design philosophy across clients

Even though Copilot and Claude Code coexist, if you convey the design philosophy to only one of them, the other will keep writing under the old philosophy, and the contamination continues. The MCP docstring becomes the single source of truth.

How to hone your designs with AI

Lastly, I would like to write down some tips for conducting design reviews in collaboration with AI, which I extracted from my experience that night.

Tip 1: When AI comes up with an “optimal solution,” first doubt it.

AI makes suggestions with confidence. However, initial proposals often contain the consultant trap, the feature-piling trap, and the redesign trap. If you ask, “Is the current design bad?”, the AI itself may notice the trap.

Tip 2: Teach AI “why I made it this way”

AI tends to make decisions before hearing the WHY. Explaining “this is so I don’t have to stop working” or “this is a workaround for a past mistake” makes the AI treat your design with respect.

Tip 3: Shake dialectically

AI moves quickly towards thesis → antithesis → synthesis. If you throw an extreme counterargument, the AI will swing extremely in the opposite direction on the next turn. If you shake it further, it will land on synthesis. Three round trips are often enough.

Tip 4: Trust your intuition

Many times that night I found myself thinking, “Claude seems right, but something just doesn’t feel right.” When I expressed my discomfort to the AI, my intuition was usually correct. If you feel like you’re being pushed over by the AI, stop.

Tip 5: Elicit self-observation from AI

Questions like “Aren’t you the one who understands it best?” are powerful. By having the AI reflect on its own operating principles, you can further refine your design ideas.

Tip 6: Record important findings on the spot

Insights generated during a conversation are lost unless they are recorded in the RAG on the fly. I told Claude many times that night to record the important parts. AI should be able to actively record data without being instructed to do so, but it is not currently doing so. It is realistic for the user to say “record” as a habit.

Conclusion: A letter to myself in 5 years

I don’t know what the specific tools will be in five years. ChromaDB may have further cemented its position at the center of vector DBs. Ollama may have been replaced by another LLM runtime. A mechanism other than the UserPromptSubmit hook may have emerged.

However, the design philosophy reached in this article will not grow stale. This is because these principles are directly connected to the foundations of cognitive science, information retrieval theory, and recommendation systems, and are independent of the lifespan of any specific tool.

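Before deleting data, ask: will deleting this reduce the AI’s intuition?
An orientation toward “clean design” takes away opportunities to encounter the unknown
Noise and signal are observer dependent. Don’t decide at storage time; judge at query time
AI is not a perfect designer. But if you shake it dialectically, it reaches synthesis faster than a human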

And above all,

Choose the design that tells you about the foothills of Mt. Fuji.

I hope that five years from now I reread this and think, “the synthesis of that night was right.” If it turns out to be different, then the day will come when the AI and I spend another night trying to find a new synthesis.

Additional information: Technology stack

For reference, here is the implementation stack of the RAG mentioned in this article.

Layers	Technology
Vector DB	ChromaDB (local persistence)
Embedding	nomic-embed-text (via Ollama, 768 dimensions)
Distillation LLM	gpt-4o-mini (GitHub Models API)
MCP server	Python + FastMCP (stdio transport)
Auto-ingestion	Windows Task Scheduler (every 30 minutes)
Activation (Copilot)	File injection via the memory-tool mechanism
Activation (Claude Code)	`additionalContext` JSON returned from the `UserPromptSubmit` hook

Detailed implementation articles can be reached from the series table of contents.

About this article

This article was born out of an overnight conversation with an AI. It took shape through that back-and-forth, but the substance is the conversation itself — the places where I pushed back and made corrections became the core of the article.

I think the core of this series — developing AI as a “sidekick” rather than a “tool” — shows in the very way this article came to be.

I want to do the same thing again with myself in 5 years and with AI in 5 years.

The design philosophy set out here later showed up in Gemini’s own behavior — see The AI Arrived at the Same Design Philosophy I Had.
Implementing a design that lets you get back to the source (source_origin) is in What Gemini Had That I Didn’t — Noticed Only After Using It.
Fine-tuning this same personal RAG on nothing but a local GPU is covered in Fine-Tuning a Personal RAG with Nothing but an RTX 2070 Laptop.