Ilm: The Amnesiac AI

日本語 (Japanese)

思想：刹那的な記憶を持つAI

Ilm（イルム）は、アラビア語で「知識」を意味しますが、このプロジェクトにおけるその名は「刹那的なひらめき」を追求するAIを象徴しています。

現代の多くのAIは、文脈を維持することで会話の連続性を保ちますが、その代償として過去の対話に思考が「囚われ」、自由で創造的な発想が阻害されることがあります。新しいアイデアを求めても、AIは直前の会話に影響された、似通った応答を返しがちです。

Ilmは、この課題を解決するために設計されました。その記憶構造は、人間の記憶のアナロジーに基づいています。

L0: ワーキングメモリ（意識）: 応答を生成する思考の場は、毎ターン完全にリフレッシュされます。過去の具体的な会話内容は一切引き継がれず、AIは常に「今、この瞬間」の入力のみに集中します。これにより、過去の思想に囚われない、常に新鮮な応答が可能になります。
L1: エピソード記憶（残響）: 直近数ターンの会話のキーワードのみを短期的に保持します。ユーザーがそのキーワードに再び触れた際、AIはまるでデジャヴのようにその記憶の断片を「思い出す」そぶりを見せ、会話に人間らしい深みを与えます。
L2: 意味記憶（深層経験）: 具体的な会話内容ではなく、「どのような話題から、どのような話題へ遷移したか」という抽象的なパターンのみを永続的に学習します。この経験則に基づき、AIは話題の遷移が「自然な流れ」か「斬新な飛躍」かを自ら判断し、その判断に応じて応答のニュアンスを変化させます。

Ilmは、この階層型記憶システムを通じて、刹那的でリフレッシュされた思考を持ちながらも、長期的な経験則に基づいた的確な判断を下す、という人間的な知性を目指します。

実装された機能

動的なプロンプト生成: ユーザーの入力や会話の状況に応じて、AIへの指示（プロンプト）をリアルタイムで構築します。
意図・スタイルの検知: ユーザーの入力が「質問」か「アイデアの要求」か、またその文体が「丁寧」か「簡潔」かを分析し、AIの応答トーンを最適化します。
話題遷移の検知と判断: 会話のトピックが変化したことを検知し、それが過去の経験上「自然な遷移」か「斬新な遷移」かをL2長期記憶に基づき判断します。
フラッシュバック（残響する記憶）: AIが直前に使ったキーワードをユーザーが再度使うと、AIがその文脈を思い出したかのような応答を生成します。
長期経験の蓄積: 話題の遷移パターンをSQLiteデータベース（experience.db）に永続的に記録し、AIの「経験」として蓄積します。

アーキテクチャ

このアプリケーションは、以下のコンポーネントから構成される「階層型記憶システム」を実装しています。

指揮官 (Conductor): app_structured.py内のメインロジック。ユーザー入力を受け取り、各記憶階層に問い合わせ、最終的なプロンプトを構築してLLMに渡す中枢モジュール。
L0 (ワーキングメモリ): LLMに渡される、その場限りのプロンプト。
L1 (エピソード記憶): dequeで実装された短期的なキーワードのバッファ。
L2 (意味記憶): SQLiteで実装された、話題遷移パターンを記録する永続的なデータベース。

セットアップと使い方

リポジトリのクローン:

```
git clone <your-repo-url>
cd ilm-finetune
```

仮想環境の構築:

```
python3 -m venv venv
source venv/bin/activate
```

依存ライブラリのインストール:
```
pip install -r requirements.txt
```
ファインチューニング済みモデルの準備: このプロジェクトは、ファインチューニング済みのモデルが ./merged_model ディレクトリに存在することを前提としています。ファインチューニングは mlx_lm を使って行われました。 (注: ファインチューニングの具体的な手順は、このプロジェクトのスコープ外ですが、参考として開発時に使用されたコマンドを以下に示します)
```
# データセットの準備
# python split_data.py 

# LoRAファインチューニングの実行
# python -m mlx_lm lora --model <base-model-path> --train --data . --iters 1000 --adapter-path ./adapters

# アダプタのマージ
# python -m mlx_lm fuse --model <base-model-path> --adapter ./adapters --save-path ./merged_model
```
アプリケーションの実行:
```
python app_structured.py
```
チャットが開始します。「exit」と入力すると終了します。

English

Philosophy: An AI with Ephemeral Memory

Ilm (イルム) is Arabic for "knowledge," but in this project, it symbolizes an AI that pursues "ephemeral inspiration."

Many modern AIs maintain conversational continuity by retaining context. The trade-off is that their thinking can become "trapped" by past dialogue, hindering free and creative ideation. When asked for new ideas, an AI often returns similar responses influenced by the preceding conversation.

Ilm is designed to address this problem. Its memory structure is modeled on an analogy with human memory.

L0: Working Memory (Consciousness): The cognitive space for generating responses is completely refreshed with every turn. No specific conversational text is carried over, allowing the AI to focus solely on the input of the "here and now." This enables fresh responses, untethered by past thoughts.
L1: Episodic Memory (Echo): It retains only the keywords from the last few conversational turns. When the user mentions one of these keywords again, the AI shows a semblance of recalling a fragment of memory, like déjà vu, adding a human-like depth to the conversation.
L2: Semantic Memory (Deep Experience): It does not store specific conversation content. Instead, it persistently learns only abstract patterns, namely which topics the conversation moved from and which topics it moved to. Based on this accumulated experience, the AI judges whether a topic shift is a "natural flow" or a "novel leap" and adjusts the nuance of its response accordingly.

Through this hierarchical memory system, Ilm aims for a human-like intelligence that combines ephemeral, refreshed thinking with accurate judgments based on long-term, abstracted experience.
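
As an illustration of the L1 layer described above, here is a minimal sketch of a keyword buffer that forgets older turns. The class name `EchoBuffer`, its methods, and the `max_turns` parameter are hypothetical and are not taken from app_structured.py; the only detail drawn from this README is that L1 is a short-term keyword buffer built on deque.

```python
from collections import deque


class EchoBuffer:
    """Hypothetical L1 buffer: keeps keywords from only the last few turns."""

    def __init__(self, max_turns=3):
        # Each entry holds the keyword set of one turn.
        # With maxlen set, the oldest turn is dropped automatically.
        self.turns = deque(maxlen=max_turns)

    def remember(self, keywords):
        """Store the keywords of the turn that just finished."""
        self.turns.append(set(keywords))

    def echoes(self, user_keywords):
        """Return the keywords in the new input that still linger in the buffer."""
        lingering = set().union(*self.turns)
        return lingering & set(user_keywords)


buffer = EchoBuffer(max_turns=2)
buffer.remember(["ocean", "tides"])
buffer.remember(["jazz"])
buffer.remember(["chess"])  # the "ocean"/"tides" turn has now been evicted
faded = buffer.echoes(["ocean"])
recalled = buffer.echoes(["jazz", "coffee"])
```

In this sketch, a non-empty result from `echoes` would be the trigger for a flashback-style response, while an empty result leaves the turn completely fresh.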

Implemented Features

Dynamic Prompt Generation: Constructs instructions (prompts) for the AI in real-time based on user input and the conversational context.
Intent & Style Detection: Analyzes whether user input is a "Question" or a "Request for Ideas," and whether the writing style is "Polite" or "Concise," to optimize the AI's response tone.
Topic Transition Detection & Judgment: Detects shifts in conversational topics and, based on L2 long-term memory, judges whether the transition is "natural" or "novel."
Flashback (Echoing Memory): If the user reuses a keyword the AI recently mentioned, it generates a response that seems to recall the previous context.
Long-Term Experience Accumulation: Persistently records topic transition patterns in a SQLite database (experience.db), accumulating them as the AI's "experience."
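
The transition judgment and experience accumulation described above could look roughly like the following sketch. The table name `transitions`, its columns, the function names, and the `natural_threshold` value are all assumptions made for illustration; the actual schema of experience.db is not documented here. An in-memory database is used so the example leaves no files behind.

```python
import sqlite3


def open_experience(path=":memory:"):
    """Open the store. The project uses experience.db; this schema is assumed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transitions ("
        "src TEXT NOT NULL, dst TEXT NOT NULL, "
        "seen INTEGER NOT NULL DEFAULT 0, "
        "PRIMARY KEY (src, dst))"
    )
    return conn


def record_transition(conn, src, dst):
    """Insert the topic pair if it is new, then increment its counter."""
    conn.execute(
        "INSERT OR IGNORE INTO transitions (src, dst) VALUES (?, ?)", (src, dst)
    )
    conn.execute(
        "UPDATE transitions SET seen = seen + 1 WHERE src = ? AND dst = ?",
        (src, dst),
    )
    conn.commit()


def judge_transition(conn, src, dst, natural_threshold=2):
    """Call a shift 'natural' once it has been seen often enough, else 'novel'."""
    row = conn.execute(
        "SELECT seen FROM transitions WHERE src = ? AND dst = ?", (src, dst)
    ).fetchone()
    seen = row[0] if row else 0
    return "natural" if seen >= natural_threshold else "novel"


conn = open_experience()
first = judge_transition(conn, "music", "cooking")
record_transition(conn, "music", "cooking")
record_transition(conn, "music", "cooking")
later = judge_transition(conn, "music", "cooking")
```

Only topic labels and counts are stored in this sketch, never the conversation text, which matches the stated intent of L2.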

Architecture

This application implements a "Hierarchical Memory System" composed of the following components:

Conductor: The main logic within app_structured.py. It acts as the central module that receives user input, queries the memory layers, and constructs the final prompt for the LLM.
L0 (Working Memory): The ephemeral, single-use prompt passed to the LLM.
L1 (Episodic Memory): A short-term keyword buffer implemented with deque.
L2 (Semantic Memory): A persistent database implemented with SQLite that records topic transition patterns.
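
To make the Conductor's role more concrete, the sketch below assembles a single-use L0 prompt from the current input plus hints from the other layers. The function names, the keyword heuristics, and the prompt wording are hypothetical and do not reflect the actual contents of app_structured.py; the call to the LLM is omitted.

```python
def detect_intent(text):
    """Toy heuristic (an assumption): idea words mean ideation, '?' means a question."""
    lowered = text.lower()
    if any(word in lowered for word in ("idea", "brainstorm", "suggest")):
        return "idea_request"
    if "?" in text:
        return "question"
    return "statement"


def build_prompt(user_input, echoed_keywords=(), transition=None):
    """Assemble a single-use L0 prompt. No earlier turns are included, by design."""
    lines = ["You are Ilm. Answer only the current input."]
    intent = detect_intent(user_input)
    if intent == "idea_request":
        lines.append("The user wants fresh ideas; favor originality.")
    elif intent == "question":
        lines.append("The user asked a question; answer it directly.")
    if echoed_keywords:
        # L1 hint: only keywords, never the earlier text itself.
        recalled = ", ".join(sorted(echoed_keywords))
        lines.append("You faintly recall these words: " + recalled + ".")
    if transition == "novel":
        # L2 hint: this topic shift has rarely been seen before.
        lines.append("The topic shift is unusual; acknowledge the leap.")
    elif transition == "natural":
        lines.append("The topic shift feels familiar; continue smoothly.")
    lines.append("User: " + user_input)
    return "\n".join(lines)


prompt = build_prompt("Any ideas for a jazz cafe?", {"jazz"}, "novel")
```

Because the prompt is rebuilt from scratch each turn, the only traces of earlier conversation that can reach the model are the L1 keywords and the L2 judgment.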

Setup and Usage

Clone the Repository:

```
git clone <your-repo-url>
cd ilm-finetune
```

Create a Virtual Environment:

```
python3 -m venv venv
source venv/bin/activate
```

Install Dependencies:
```
pip install -r requirements.txt
```

Prepare the Fine-Tuned Model: This project assumes that a fine-tuned model exists in the ./merged_model directory. The fine-tuning was performed using mlx_lm. (Note: The specific steps for fine-tuning are outside the scope of this project, but the commands used during development are shown below for reference.)

```
# 1. Prepare dataset
# python split_data.py

# 2. Run LoRA fine-tuning
# python -m mlx_lm lora --model <base-model-path> --train --data . --iters 1000 --adapter-path ./adapters

# 3. Fuse the adapter
# python -m mlx_lm fuse --model <base-model-path> --adapter ./adapters --save-path ./merged_model
```

Run the Application:
```
python app_structured.py
```
The chat will start. Type "exit" to end the session.

Model format: Safetensors
Model size: 27B params
Tensor type: BF16, U32