LLM Workflow

This guide describes a workflow that combines the PII-Fi API with an LLM (large language model) to process text while protecting personal information.

Overview

Sending text that contains personal information to an LLM carries a risk of data leakage. With the PII-Fi API, you can use an LLM more safely through the following workflow.

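1. Original text (contains PII)

田中太郎様、ご連絡ありがとうございます。 (Dear Taro Tanaka, thank you for contacting us.)

↓ PII-Fi API (fake masking, i.e. pseudonymization)

2. Masked text + mapping information

山田花子様、ご連絡ありがとうございます。 (Dear Hanako Yamada, thank you for contacting us.)

↓ Send to the LLM (no personal information)

3. LLM response (dummy data)

山田花子様、承知いたしました。 (Dear Hanako Yamada, understood.)

↓ PII-Fi API (restore)

4. Restored response (real PII)

田中太郎様、承知いたしました。 (Dear Taro Tanaka, understood.)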
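
The four stages above can be tried locally before wiring up any API. The following is a minimal sketch that uses plain string replacement and a stub in place of both the PII-Fi API and the LLM. The helper names (`fake_mask`, `stub_llm`, `restore`) and the mapping shape (dummy value to original value) are assumptions made for illustration and are not the PII-Fi API's actual format.

```python
# Local stand-ins only: no PII-Fi or LLM calls are made here.
# The mapping shape (dummy value -> original value) is an assumption for illustration.

def fake_mask(text, replacements):
    """Replace each original value with its dummy value; return (masked_text, mapping)."""
    mapping = {}
    for original, fake in replacements.items():
        if original in text:
            text = text.replace(original, fake)
            mapping[fake] = original
    return text, mapping

def stub_llm(masked_text):
    """Pretend LLM: replies to the addressee using the dummy name it was given."""
    name = masked_text.split("様")[0]
    return f"{name}様、承知いたしました。"

def restore(text, mapping):
    """Swap every dummy value back to its original value."""
    for fake, original in mapping.items():
        text = text.replace(fake, original)
    return text

original = "田中太郎様、ご連絡ありがとうございます。"
masked, mapping = fake_mask(original, {"田中太郎": "山田花子"})
reply = stub_llm(masked)
final = restore(reply, mapping)

print(masked)  # 山田花子様、ご連絡ありがとうございます。
print(reply)   # 山田花子様、承知いたしました。
print(final)   # 田中太郎様、承知いたしました。
```

The point of the sketch is that the stub LLM only ever sees the dummy name, while the caller keeps the mapping needed to turn the reply back into the real one.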

Step 1: Masking with Mapping

Use the fake method (pseudonymization) together with include_mapping_info: true.

import requests

API_KEY = "YOUR_API_KEY"
# Demo environment (a dedicated endpoint is provided for production)
BASE_URL = "https://api.pii-fi.com/api"

def deidentify_for_llm(text):
    """Pseudonymize PII in the text and return the API's JSON response."""
    response = requests.post(
        f"{BASE_URL}/detect",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}"
        },
        json={
            "text": text,
            "deidentification_type": "fake",
            "include_mapping_info": True
        },
        timeout=30
    )
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error body
    return response.json()

# Original text
original_text = "田中太郎様、ご連絡ありがとうございます。090-1234-5678にお電話ください。"

# Mask (pseudonymize)
result = deidentify_for_llm(original_text)
deidentified_text = result["deidentified_text"]
mapping_info = result["mapping_info"]

print(f"Masked: {deidentified_text}")
# Output: 山田花子様、ご連絡ありがとうございます。080-9876-5432にお電話ください。

const API_KEY = "YOUR_API_KEY";

async function deidentifyForLLM(text) {
    const response = await fetch("https://api.pii-fi.com/api/detect", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            "Authorization": `Bearer ${API_KEY}`
        },
        body: JSON.stringify({
            text,
            deidentification_type: "fake",
            include_mapping_info: true
        })
    });
    // Fail on HTTP errors instead of parsing an error body
    if (!response.ok) {
        throw new Error(`PII-Fi request failed: ${response.status}`);
    }
    return response.json();
}

const originalText = "田中太郎様、ご連絡ありがとうございます。";
const result = await deidentifyForLLM(originalText);
const deidentifiedText = result.deidentified_text;
const mappingInfo = result.mapping_info;

Step 2: Send to LLM

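Send the masked text to the LLM. Because the detected PII has been replaced with dummy values, the text can be processed without exposing the original personal information.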

OpenAI API Example

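from openai import OpenAI

# Requires openai>=1.0 (the legacy openai.ChatCompletion interface was removed in 1.0).
# The client reads the OPENAI_API_KEY environment variable by default.
client = OpenAI()

def call_llm(deidentified_text):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            # System prompt: "You are a helpful assistant."
            {"role": "system", "content": "あなたは親切なアシスタントです。"},
            # User prompt: "Please reply to the following message:"
            {"role": "user", "content": f"以下のメッセージに返信してください:\n{deidentified_text}"}
        ]
    )
    return response.choices[0].message.content

# Send the text containing dummy data to the LLM
llm_response = call_llm(deidentified_text)
print(f"LLM response: {llm_response}")
# Output: 山田花子様、承知いたしました。080-9876-5432にご連絡いたします。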
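
Before the masked text leaves your system, it can be worth checking that none of the values you already know to be sensitive are still present verbatim. The following is a minimal sketch of such a guard. `find_leaked_values` and the `known_originals` list are hypothetical helpers for illustration and are not part of the PII-Fi API.

```python
def find_leaked_values(text, original_values):
    """Return the original PII values that still appear verbatim in the text."""
    return [value for value in original_values if value and value in text]

# Hypothetical: values you already know are sensitive for this request
known_originals = ["田中太郎", "090-1234-5678"]

masked = "山田花子様、ご連絡ありがとうございます。080-9876-5432にお電話ください。"
unmasked = "田中太郎様、ご連絡ありがとうございます。080-9876-5432にお電話ください。"

print(find_leaked_values(masked, known_originals))    # []
print(find_leaked_values(unmasked, known_originals))  # ['田中太郎']
```

A non-empty result would be a reason to stop and not call the LLM. The check only catches exact matches, so it complements detection and does not replace it.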

Step 3: Restore

Restore the original PII values in the LLM response.

def restore_pii(text, mapping_info):
    """Replace dummy values in the text with the original PII values."""
    response = requests.post(
        f"{BASE_URL}/restore",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}"
        },
        json={
            "text": text,
            "mapping_info": mapping_info,
            "mode": "simple"
        },
        timeout=30
    )
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error body
    return response.json()

# Restore the LLM response
restored = restore_pii(llm_response, mapping_info)
print(f"Restored response: {restored['restored_text']}")
# Output: 田中太郎様、承知いたしました。090-1234-5678にご連絡いたします。

async function restorePII(text, mappingInfo) {
    const response = await fetch("https://api.pii-fi.com/api/restore", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            "Authorization": `Bearer ${API_KEY}`
        },
        body: JSON.stringify({
            text,
            mapping_info: mappingInfo,
            mode: "simple"
        })
    });
    // Fail on HTTP errors instead of parsing an error body
    if (!response.ok) {
        throw new Error(`PII-Fi request failed: ${response.status}`);
    }
    return response.json();
}

// llmResponse is the text returned by the LLM in Step 2
const restored = await restorePII(llmResponse, mappingInfo);
console.log(`Restored response: ${restored.restored_text}`);
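
After restoring, it may help to check which dummy values were actually swapped back. The following is a minimal sketch that compares the LLM response with the restored text and logs a warning for anything suspicious. It assumes the mapping can be viewed as a plain dict of dummy value to original value, which may differ from the real mapping_info structure. `restoration_report` is a hypothetical helper.

```python
import logging

logger = logging.getLogger("pii_restore")

def restoration_report(llm_response, restored_text, mapping):
    """Classify each dummy value; mapping is an assumed dict of dummy -> original."""
    report = {"restored": [], "not_restored": [], "absent_from_response": []}
    for fake in mapping:
        if fake in restored_text:
            # The dummy value survived restoration.
            report["not_restored"].append(fake)
            logger.warning("Dummy value was not restored: %r", fake)
        elif fake in llm_response:
            report["restored"].append(fake)
        else:
            # The LLM may have dropped or rewritten this value.
            report["absent_from_response"].append(fake)
            logger.warning("Dummy value missing from LLM response: %r", fake)
    return report

mapping = {"山田花子": "田中太郎", "080-9876-5432": "090-1234-5678"}
# Here the model reformatted the phone number with spaces.
llm_response = "山田花子様、承知いたしました。080 9876 5432にご連絡いたします。"
restored_text = "田中太郎様、承知いたしました。080 9876 5432にご連絡いたします。"

print(restoration_report(llm_response, restored_text, mapping))
# {'restored': ['山田花子'], 'not_restored': [], 'absent_from_response': ['080-9876-5432']}
```

An entry under `absent_from_response` does not prove a failure, since the LLM may simply not have mentioned that value, but it marks the cases where a rewritten dummy value could have slipped through unrestored.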

Marker Mode

If the LLM rewrites the dummy data, using marker mode improves restoration accuracy.

Masking with Markers

result = requests.post(
    f"{BASE_URL}/detect",
    headers={...},  # same headers as in Step 1
    json={
        "text": "田中太郎様、ご連絡ありがとうございます。",
        "deidentification_type": "fake",
        "include_mapping_info": True,
        "use_markers": True  # enable marker mode
    }
).json()

mapping_info = result["mapping_info"]  # keep for the restore call
print(result["deidentified_text"])
# Output: [[PII_000:山田花子]]様、ご連絡ありがとうございます。

Restore with Marker Mode

restored = requests.post(
    f"{BASE_URL}/restore",
    headers={...},  # same headers as in Step 1
    json={
        "text": llm_response,
        "mapping_info": mapping_info,
        "mode": "marker"  # restore in marker mode
    }
).json()

💡 Why Marker Mode?

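The marker format [[PII_XXX:value]] is harder for an LLM to rewrite by accident, which raises the restoration success rate. It is especially useful when the LLM changes names or phone numbers on its own.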
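
To see why an ID-bearing marker is more robust, consider a local sketch that restores by marker ID and ignores whatever value sits inside the brackets. The marker syntax is taken from the example output above. The `originals_by_id` lookup is an assumed shape for illustration, and this is not the implementation behind the /restore endpoint.

```python
import re

# Marker syntax from the example output: [[PII_000:value]]
MARKER = re.compile(r"\[\[(PII_\d+):([^\]]*)\]\]")

def restore_markers(text, originals_by_id):
    """Replace each marker with the original value for its ID, ignoring the inner value."""
    def swap(match):
        marker_id = match.group(1)
        # Leave unknown markers untouched rather than guessing.
        return originals_by_id.get(marker_id, match.group(0))
    return MARKER.sub(swap, text)

originals_by_id = {"PII_000": "田中太郎"}  # hypothetical mapping shape

# The model altered the dummy value (added a space), but the ID is intact.
llm_reply = "[[PII_000:山田 花子]]様、承知いたしました。"
print(restore_markers(llm_reply, originals_by_id))
# 田中太郎様、承知いたしました。
```

Because the lookup key is the ID and not the dummy value, the restoration still succeeds when the text inside the marker has been changed, as long as the marker itself survives.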

Complete Example

Python

import requests
from openai import OpenAI

API_KEY = "YOUR_PII_DETECTOR_API_KEY"
OPENAI_API_KEY = "YOUR_OPENAI_API_KEY"
# Demo environment (a dedicated endpoint is provided for production)
BASE_URL = "https://api.pii-fi.com/api"

# Requires openai>=1.0 (the legacy openai.ChatCompletion interface was removed in 1.0)
client = OpenAI(api_key=OPENAI_API_KEY)

# Default system prompt: "You are a helpful assistant."
def pii_safe_llm_call(user_text, system_prompt="あなたは親切なアシスタントです。"):
    """
    Call an LLM while protecting PII.
    """
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    }

    # Step 1: Mask
    deidentify_result = requests.post(
        f"{BASE_URL}/detect",
        headers=headers,
        json={
            "text": user_text,
            "deidentification_type": "fake",
            "include_mapping_info": True
        }
    ).json()

    deidentified_text = deidentify_result["deidentified_text"]
    mapping_info = deidentify_result["mapping_info"]

    # Step 2: Send to the LLM
    llm_response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": deidentified_text}
        ]
    ).choices[0].message.content

    # Step 3: Restore
    restore_result = requests.post(
        f"{BASE_URL}/restore",
        headers=headers,
        json={
            "text": llm_response,
            "mapping_info": mapping_info,
            "mode": "simple"
        }
    ).json()

    return {
        "original_input": user_text,
        "deidentified_input": deidentified_text,
        "llm_response_deidentified": llm_response,
        "final_response": restore_result["restored_text"]
    }


# Usage example
result = pii_safe_llm_call(
    "田中太郎様、ご注文ありがとうございます。090-1234-5678に確認のお電話をいたします。"
)
print(f"Final response: {result['final_response']}")

Best Practices

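Storing mapping information: keep mapping_info for the duration of the session and use it at restore time
Using marker mode: use marker mode when the LLM may rewrite values
Error handling: log a warning when restoration fails
Batch processing: use /detect/batch when processing large volumes of text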
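
For the first practice, one way to hold mapping_info only for the life of a session is a small keyed store with an expiry time. The following is a minimal in-memory sketch. The `MappingStore` class, the session IDs, and the 15-minute default are assumptions for illustration, and a multi-process deployment would need a shared store instead.

```python
import time

class MappingStore:
    """In-memory, per-session storage for mapping_info with a time-to-live."""

    def __init__(self, ttl_seconds=900, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}

    def put(self, session_id, mapping_info):
        """Store the mapping and record when it expires."""
        self._items[session_id] = (self._clock() + self._ttl, mapping_info)

    def get(self, session_id):
        """Return the mapping, or None if it is missing or expired."""
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, mapping_info = entry
        if self._clock() >= expires_at:
            del self._items[session_id]
            return None
        return mapping_info

    def discard(self, session_id):
        """Drop the mapping as soon as the session ends."""
        self._items.pop(session_id, None)

store = MappingStore(ttl_seconds=900)
store.put("session-123", {"山田花子": "田中太郎"})  # placeholder for the real mapping_info
print(store.get("session-123"))  # {'山田花子': '田中太郎'}
store.discard("session-123")
print(store.get("session-123"))  # None
```

Since mapping_info links dummy values back to the original PII, it deserves the same care as the original text, which is why the sketch expires entries and offers an explicit `discard`.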

Next Steps

💻 Sample Code ⚙ Custom Recognizer 📖 API Reference