Why We Built a PII Firewall for AI Chat
When we added an AI companion to UseKYN, we hit a fork in the road that every finance app with AI eventually faces: do we send the user's real financial data to the model, or do we strip identity out of it first?
The easy path is to send everything. Transaction descriptions, bank names, merchant names, debt accounts, investment holdings — dump it all into the prompt and let the model figure out what matters. The AI answers better with more context. Shipping is faster.
We didn't take the easy path. We built a PII firewall. This post explains what that means, how it works, and why we think every AI-powered finance app should have one.
Short version: A PII firewall is a layer that sits between your financial data and any external AI service. It decides what the model is allowed to see — by default, nothing identity-related gets through. Only safe aggregates and summaries pass.
The problem: AI needs context, but identity is dangerous
An AI companion for your finances has to answer questions like "can I afford to skip a debt payment this month?" or "am I on track for my emergency fund?" To answer well, it needs context — balances, debts, goals, spending patterns.
The naive implementation is to pass raw data from your bank and brokerage into the prompt. That works. But it also means your identity and financial life — bank names, merchant names, account numbers, debts at specific lenders — flow to an external AI provider in every message.
Even if that provider has good privacy practices, you've now created a system where:
- A bug in prompt construction could leak more than you intended.
- A change in the AI provider's retention policy could apply retroactively.
- A future feature that logs prompts for debugging could capture identity.
- A compromised log store could expose a user's entire financial profile, not just aggregates.
The safest version of this system is one where identity can't leak, because it's never sent in the first place.
What a PII firewall is
A PII firewall is a software layer that sits between your private financial data and any external AI service. It enforces two rules:
- Allowlist, not denylist. Only explicitly approved fields are included in AI context. New fields added to the database don't automatically flow through — they have to be explicitly added to the allowlist by a developer.
- Defense in depth. Multiple layers check the data: one builds the context, a second verifies nothing identity-bearing slipped in, a third sanitizes the AI's response before it returns to you.
The allowlist approach is the important one. Denylists are fragile — they only block what you remember to block. If someone adds a new column called customer_full_name, a denylist might miss it. An allowlist catches it because the new field simply isn't on the approved list.
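To make the contrast concrete, here is a minimal sketch of allowlist-based filtering. The field names and the `customer_full_name` column are illustrative assumptions, not UseKYN's actual schema:

```python
# Hypothetical allowlist filter. Field names are illustrative only.
AI_CONTEXT_ALLOWLIST = {
    "balance_by_type",
    "spending_by_category",
    "debt_apr",
    "goal_progress",
}

def build_ai_context(record: dict) -> dict:
    """Copy only explicitly approved fields into the AI context."""
    return {k: v for k, v in record.items() if k in AI_CONTEXT_ALLOWLIST}

row = {
    "balance_by_type": {"checking": 4213},
    "debt_apr": 24.99,
    "customer_full_name": "Jane Doe",  # new column, never allowlisted
}

context = build_ai_context(row)
# The new column is dropped automatically: nothing reaches the AI
# unless a developer deliberately adds it to the allowlist.
```

Note that the failure mode inverts: with a denylist, forgetting a field leaks it; with an allowlist, forgetting a field merely means the AI doesn't see it.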
What the AI sees vs. what the database stores
| In the Database | What the AI Sees |
|---|---|
| "Chase Sapphire Preferred — $8,214.33 @ 24.99% APR" | "Credit card debt: $8,214 @ 24.99%" |
| "SHELL OIL 12451 ATLANTA GA" | "Transportation: $67" |
| "Fidelity brokerage — AAPL, NVDA, VTI..." | "Portfolio: $31,400 across 12 holdings" |
| "Bank of America checking ****4821" | "Checking balance: $4,213" |
| Full name, address, email, phone | Never sent |
The AI can still answer meaningfully: "Your highest-APR debt is your credit card at 24.99% — a $50 extra payment there saves more interest than the same $50 would on your auto loan at 6.2%." It doesn't need to know the card is Chase or the lender is Ally. The math works the same.
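The transforms behind the table above can be sketched like this. The function names, field names, and category mapping are assumptions for illustration, not UseKYN internals:

```python
# Illustrative transforms from raw records (left column) to the
# anonymized strings the AI receives (right column).

def anonymize_debt(raw: dict) -> str:
    # Keep type, balance, and APR; drop the lender/product name.
    return f"{raw['debt_type']} debt: ${raw['balance']:,.0f} @ {raw['apr']}%"

def anonymize_transaction(raw: dict, category_map: dict) -> str:
    # Replace the merchant string with its spending category.
    category = category_map.get(raw["merchant"], "Uncategorized")
    return f"{category}: ${raw['amount']:.0f}"

debt = {"debt_type": "Credit card", "lender": "Chase Sapphire Preferred",
        "balance": 8214.33, "apr": 24.99}
txn = {"merchant": "SHELL OIL 12451 ATLANTA GA", "amount": 67.0}
categories = {"SHELL OIL 12451 ATLANTA GA": "Transportation"}

anonymize_debt(debt)                       # "Credit card debt: $8,214 @ 24.99%"
anonymize_transaction(txn, categories)     # "Transportation: $67"
```

The key property: the APR and balance survive (the AI needs them for the math), while the lender and merchant strings never leave the database.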
How it works in UseKYN
Three layers:
1. AI Context Builder — the allowlist
When you ask KYN a question, UseKYN builds a context object from scratch. Every field included in that object is on an explicit allowlist of safe categories: account balances by type, spending categories and amounts, budget utilization, debt types and APRs (not lender names), savings goal names and progress, ticker symbols and portfolio values.
If a developer adds a new field to the user table tomorrow, it does not automatically flow into AI context. It has to be added to the allowlist — which means a code review that specifically discusses whether it's safe.
2. PII Guard — the verifier
Before any request goes out to the AI provider, a second layer scans the assembled payload against PII patterns: SSN formats, credit card number patterns, email addresses, phone numbers, account-number-looking strings. If anything matches, the request is blocked or scrubbed. This is defense in depth — if something slipped past the allowlist, this layer catches it.
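A minimal sketch of that second-layer scan might look like the following. The patterns here are deliberately simple placeholders; a production scanner would use a broader, well-tested pattern set:

```python
import re

# Illustrative PII patterns for the outgoing-payload scan.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scan_payload(payload: str) -> list:
    """Return the names of any PII patterns found in the payload."""
    return [name for name, pat in PII_PATTERNS.items() if pat.search(payload)]

def guard(payload: str) -> str:
    hits = scan_payload(payload)
    if hits:
        # Fail closed: block the request rather than let it leave.
        raise ValueError(f"PII detected, request blocked: {hits}")
    return payload

scan_payload("Credit card debt: $8,214 @ 24.99%")   # [] — safe aggregate
scan_payload("Contact me at jane@example.com")      # ["email"]
```

Failing closed matters here: an over-eager block costs one AI answer, while a missed leak is irreversible.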
3. Response Sanitizer — the output filter
When the AI responds, UseKYN checks the response for anything identity-bearing before returning it to you. This sounds paranoid, but LLMs sometimes echo back things they shouldn't, or hallucinate identity patterns. The sanitizer keeps that contained.
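Output-side sanitization can mirror the request-side guard: scan the model's text and redact anything identity-shaped before it reaches the user. Again, these patterns are illustrative, not UseKYN's real rule set:

```python
import re

# Illustrative redaction rules for the AI's response text.
REDACTIONS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[REDACTED-EMAIL]"),
    (re.compile(r"\b(?:\d[ -]?){12}\d{1,4}\b"), "[REDACTED-CARD]"),
]

def sanitize_response(text: str) -> str:
    """Redact identity-shaped strings the model may have echoed back."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

sanitize_response("Send the statement to jane@example.com")
# → "Send the statement to [REDACTED-EMAIL]"
```

Legitimate answers pass through untouched — "$8,214 @ 24.99%" matches none of the rules — so the sanitizer only fires on the rare echo or hallucinated identifier.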
The tradeoff: slightly less "personalized" AI
There's a real tradeoff here. An AI that sees "Chase Sapphire Preferred" can give a slightly more personalized-feeling answer than one that sees "credit_card." Some competitors use that as a selling point — "your AI knows your banks."
We think that tradeoff is worth it. The incremental quality gain from letting the model see your bank names is small. The downside — an identity leak through logs, retention, or a compromised AI vendor — is large. And the user doesn't actually need the AI to say "your Chase card" versus "your credit card" to get useful advice.
This isn't theoretical. In 2023, several consumer-facing AI products had incidents where chat logs were briefly exposed to other users. Any finance app that sends raw identity into those systems is one bug away from exposing a user's entire financial profile. A PII firewall means that even in the worst case, what leaks is anonymized aggregates.
Why every finance AI should have one
Three reasons:
- You can't put the data back in the box. Once identity flows to an external AI provider, you've lost control of it — even if that provider is careful. A firewall means it never leaves.
- Regulations are tightening. US state privacy laws (CCPA, CPRA, and follow-ons) and global frameworks (GDPR, Open Banking) increasingly demand data minimization. Sending maximum data to AI providers is architecturally the opposite of that.
- It builds trust durably. "We have a policy" is weak. "Our architecture doesn't let this happen" is strong. A firewall converts a promise into a property of the system.
The meta-point
AI in finance is going to get more capable. It will get more integrated into every app. The question each app has to answer is: what does the model get to see? The answer shapes everything downstream — what can leak, what can be logged, what regulators can later demand.
We built a PII firewall because the answer we wanted to give our users was "as little as possible, and never your identity." That decision had to be made early, because retrofitting a firewall into a system that was built without one is much harder than starting with one.
Curious what AI chat looks like when identity stays home?
Try asking KYN about your debts, spending, or goals. The answers will have real numbers — but not your bank or merchant names.