How to Build a Fraud Detection Agent Using LlamaIndex in TypeScript for Healthcare

By Cyprian Aarons · Updated 2026-04-21

Tags: fraud-detection, llamaindex, typescript, healthcare

A fraud detection agent for healthcare flags suspicious claims, billing patterns, and provider activity before they turn into financial loss or compliance exposure. It matters because healthcare fraud is not just a cost problem; it creates audit risk, delays legitimate reimbursement, and can trigger regulatory action if you can’t explain why a claim was flagged.

Architecture

  • Claim ingestion layer

    • Pulls structured claim data from your EHR, claims platform, or data warehouse.
    • Normalizes fields like CPT/HCPCS codes, diagnosis codes, provider IDs, dates of service, and billed amounts.
  • Policy and rules context

    • Stores payer policies, prior authorization rules, coding guidelines, and internal fraud heuristics.
    • Gives the agent grounding so it does not rely only on statistical similarity.
  • Vector index over historical cases

    • Indexes past fraud investigations, denied claims, audit notes, and remediation outcomes.
    • Lets the agent retrieve similar cases before making a recommendation.
  • LLM reasoning layer

    • Uses LlamaIndex to combine retrieved evidence with the current claim payload.
    • Produces a structured fraud risk assessment with reasons and supporting citations.
  • Audit trail store

    • Persists every input, retrieval result, prompt version, and final decision.
    • Required for compliance review and post-incident analysis.
  • Human review queue

    • Routes high-risk or ambiguous cases to a billing specialist or SIU analyst.
    • Prevents automatic denial without oversight.
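One way to make the boundaries between these components concrete is to sketch them as TypeScript types. The interfaces below are illustrative, not part of LlamaIndex; the names are assumptions chosen for this sketch.

```typescript
// Illustrative stage contracts for the pipeline above. Each interface marks
// a hand-off point between components; all names are hypothetical.
interface NormalizedClaim {
  claimId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  billedAmount: number;
  dateOfService: string; // ISO 8601 date
}

interface RetrievedEvidence {
  sourceCaseId: string; // id of the historical investigation or audit note
  snippet: string;      // text chunk the reasoning layer will cite
  score: number;        // retrieval similarity score
}

interface RiskAssessment {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
  evidence: RetrievedEvidence[];
}

interface AuditEntry {
  claimId: string;
  promptVersion: string;
  retrieved: RetrievedEvidence[];
  assessment: RiskAssessment;
  decidedAt: string; // ISO 8601 timestamp
}
```

Typing the hand-offs this way keeps the audit trail store honest: if the audit entry type requires the retrieved evidence and prompt version, you cannot persist a decision without them.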

Implementation

1) Install dependencies and define your data model

Use LlamaIndex’s TypeScript package plus a JSON schema validator if you want strict outputs. In healthcare, keep PHI out of logs and only pass the minimum necessary fields into the agent.

npm install llamaindex zod
import { Document } from "llamaindex";

export interface ClaimRecord {
  claimId: string;
  memberId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  placeOfService: string;
  billedAmount: number;
  dateOfService: string;
  submittedAt: string;
}

export function claimToDocument(claim: ClaimRecord): Document {
  return new Document({
    id_: claim.claimId,
    text: JSON.stringify(claim),
    metadata: {
      claimId: claim.claimId,
      providerId: claim.providerId,
      billedAmount: claim.billedAmount,
      dateOfService: claim.dateOfService,
      submittedAt: claim.submittedAt,
    },
  });
}
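The "minimum necessary" guidance above can be enforced with a small allowlist sanitizer that runs before a claim reaches a prompt or a log line. This is a hedged sketch: it re-declares `ClaimRecord` locally so it stands alone, and the specific allowlist is an assumption to adjust against your own data model and counsel's guidance.

```typescript
// Hypothetical allowlist sanitizer: keeps only the fields the agent needs
// and drops direct identifiers (memberId) before prompting or logging.
export interface ClaimRecord {
  claimId: string;
  memberId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  placeOfService: string;
  billedAmount: number;
  dateOfService: string;
  submittedAt: string;
}

// Fields considered safe to place in a prompt; memberId is deliberately absent.
const PROMPT_SAFE_FIELDS = [
  "claimId", "providerId", "cptCodes", "icd10Codes",
  "placeOfService", "billedAmount", "dateOfService",
] as const;

export function minimizeForPrompt(claim: ClaimRecord): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const field of PROMPT_SAFE_FIELDS) {
    out[field] = claim[field]; // copy only allowlisted fields
  }
  return out; // memberId never leaves this function
}
```

Passing `JSON.stringify(minimizeForPrompt(claim))` instead of the raw record keeps the same pipeline shape while shrinking the PHI surface.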

2) Build an index over historical fraud cases

This is where LlamaIndex earns its keep. You are not asking the model to invent fraud patterns from scratch; you are retrieving similar investigations and using them as evidence.

import {
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";
import { claimToDocument, ClaimRecord } from "./claims";

async function buildFraudIndex(history: ClaimRecord[]) {
  const docs = history.map(claimToDocument);

  // Create a storage context pointed at a persist directory, then build the
  // index into it so it can be reloaded later without re-embedding.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage/fraud-index",
  });
  const index = await VectorStoreIndex.fromDocuments(docs, { storageContext });

  return index;
}

If your environment needs data residency controls, point persistence to a region-locked object store or self-hosted vector DB instead of local disk.
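A simple way to keep that residency boundary enforceable in code is to resolve the persist location from the claim's jurisdiction and fail closed when no mapping exists. The region keys and paths below are illustrative assumptions.

```typescript
// Hypothetical per-region persistence config: each jurisdiction gets its own
// index location so embeddings never cross a data-residency boundary.
const REGION_PERSIST_DIRS: Record<string, string> = {
  us: "./storage/us/fraud-index",
  eu: "./storage/eu/fraud-index",
};

export function persistDirForRegion(region: string): string {
  const dir = REGION_PERSIST_DIRS[region];
  if (!dir) {
    // Fail closed: never fall back to a default that might mix regions.
    throw new Error(`No persistence location configured for region ${region}`);
  }
  return dir;
}
```

The same lookup can feed a region-locked object store prefix or a self-hosted vector DB connection string instead of a local path.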

3) Query the index with a current claim and generate a risk assessment

Use asQueryEngine() for retrieval plus synthesis. The pattern below returns a structured answer that your downstream workflow can score and route.

import {
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";
import { z } from "zod";

const FraudAssessmentSchema = z.object({
  riskLevel: z.enum(["low", "medium", "high"]),
  reasons: z.array(z.string()),
  recommendedAction: z.enum(["auto_approve", "manual_review", "escalate_siu"]),
});

async function assessClaim(claimJson: string) {
  // Reload the persisted index instead of re-embedding on every call.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage/fraud-index",
  });
  const index = await VectorStoreIndex.init({ storageContext });

  const queryEngine = index.asQueryEngine({
    retriever: index.asRetriever({ similarityTopK: 5 }),
  });

  const prompt = `
You are a healthcare fraud detection assistant.
Assess this claim for fraud indicators using retrieved historical cases and billing logic.
Return only valid JSON with keys riskLevel, reasons, recommendedAction.

Claim:
${claimJson}

Focus on:
- duplicate billing
- unbundling
- impossible dates of service
- unusual provider utilization
- mismatch between diagnosis and procedure codes
`;

  const response = await queryEngine.query({ query: prompt });

  // Reject malformed output rather than routing on a best-effort guess.
  return FraudAssessmentSchema.parse(JSON.parse(response.response));
}
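Several of the indicators the prompt lists, such as duplicate billing and impossible dates of service, are cheap to check deterministically before spending an LLM call. A hypothetical pre-screen, with illustrative flag names:

```typescript
// Hypothetical deterministic pre-screen for two of the indicators the prompt
// asks the model about. Flags found here can short-circuit the LLM entirely.
interface ClaimLine {
  claimId: string;
  providerId: string;
  cptCodes: string[];
  dateOfService: string; // ISO 8601 date
  submittedAt: string;   // ISO 8601 timestamp
}

export function preScreen(claim: ClaimLine, recent: ClaimLine[]): string[] {
  const flags: string[] = [];

  // Impossible dates: service dated after the claim was submitted.
  if (new Date(claim.dateOfService) > new Date(claim.submittedAt)) {
    flags.push("date_of_service_after_submission");
  }

  // Duplicate billing: same provider, same date, overlapping CPT codes.
  for (const prior of recent) {
    if (
      prior.claimId !== claim.claimId &&
      prior.providerId === claim.providerId &&
      prior.dateOfService === claim.dateOfService &&
      prior.cptCodes.some((c) => claim.cptCodes.includes(c))
    ) {
      flags.push(`possible_duplicate_of_${prior.claimId}`);
    }
  }

  return flags;
}
```

Claims that trip a deterministic flag can go straight to the review queue with an exact reason code, which is easier to defend in an audit than a model-generated explanation.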

4) Put it behind a review workflow

Do not let the model deny claims directly. Use it as a triage signal that feeds rules-based thresholds and human review; that keeps you safer on compliance and makes flagged decisions easier to defend in audits.

async function routeClaim(claimJson: string) {
  const assessment = await assessClaim(claimJson);

  if (assessment.riskLevel === "high") {
    return {
      action: "manual_review",
      reasonCodes: assessment.reasons,
      queue: "siu",
    };
  }

  if (assessment.riskLevel === "medium") {
    return {
      action: "manual_review",
      reasonCodes: assessment.reasons,
      queue: "billing_audit",
    };
  }

  return {
    action: "auto_approve",
    reasonCodes: assessment.reasons,
  };
}
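One way to keep deterministic business rules in charge is a hard override that refuses to auto-approve above a dollar cap, even when the model scores the claim low. The cap below is an arbitrary example, not a recommendation:

```typescript
// Hypothetical rules-first override: a low model risk score never
// auto-approves a claim above a hard dollar threshold.
interface Assessment {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
}

const AUTO_APPROVE_CAP = 10_000; // example threshold; tune per payer policy

export function applyBusinessRules(
  assessment: Assessment,
  billedAmount: number,
): { action: "auto_approve" | "manual_review"; reasonCodes: string[] } {
  if (assessment.riskLevel === "low" && billedAmount <= AUTO_APPROVE_CAP) {
    return { action: "auto_approve", reasonCodes: assessment.reasons };
  }
  const extraCode = billedAmount > AUTO_APPROVE_CAP
    ? "amount_above_auto_approve_cap"
    : "model_risk_signal";
  return {
    action: "manual_review",
    reasonCodes: [...assessment.reasons, extraCode],
  };
}
```

Layering this after `routeClaim` means a misbehaving model can only make the system more conservative, never less.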

Production Considerations

  • Compliance first

    • Treat all claim payloads as sensitive health data.
    • Minimize PHI in prompts, encrypt data at rest/in transit, and maintain access controls aligned with HIPAA or local equivalents.
  • Auditability

    • Persist the exact retrieved chunks, prompt template version, model version, and final output.
    • If an auditor asks why a claim was flagged, you need traceable evidence rather than “the model said so.”
  • Data residency

    • Keep embeddings, vector stores, and logs in the same jurisdiction as your regulated data.
    • If you serve multiple regions, isolate indexes per region instead of mixing patient data across borders.
  • Monitoring

    • Track false positives by provider specialty, payer type, and code family.
    • Watch for drift when billing policies change or new abuse patterns appear.
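The false-positive tracking described above can be as simple as grouping confirmed review outcomes by CPT code family so drift shows up per specialty instead of in one global number. A minimal sketch, with illustrative field names:

```typescript
// Hypothetical false-positive tracker: review outcomes grouped by CPT code
// family so drift in one specialty is not averaged away globally.
interface ReviewOutcome {
  cptFamily: string;        // e.g. leading digits of the CPT code
  flagged: boolean;         // the agent flagged this claim
  confirmedFraud: boolean;  // human review outcome
}

export function falsePositiveRates(
  outcomes: ReviewOutcome[],
): Record<string, number> {
  const flagged: Record<string, number> = {};
  const falsePos: Record<string, number> = {};
  for (const o of outcomes) {
    if (!o.flagged) continue; // only flagged claims can be false positives
    flagged[o.cptFamily] = (flagged[o.cptFamily] ?? 0) + 1;
    if (!o.confirmedFraud) {
      falsePos[o.cptFamily] = (falsePos[o.cptFamily] ?? 0) + 1;
    }
  }
  const rates: Record<string, number> = {};
  for (const family of Object.keys(flagged)) {
    rates[family] = (falsePos[family] ?? 0) / flagged[family];
  }
  return rates;
}
```

A sudden jump in one family's rate after a payer policy change is the drift signal to alert on.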

Common Pitfalls

  1. Using free-form LLM output in production

    • If you parse plain text decisions manually, you will eventually break routing logic.
    • Force structured JSON output with a schema like zod and reject malformed responses.
  2. Indexing raw PHI without governance

    • Dumping notes or full chart text into the vector store creates compliance debt fast.
    • Store only what is needed for fraud analysis and redact identifiers where possible.
  3. Letting the agent make final adjudication decisions

    • Fraud detection should triage; it should not be the final authority on denial or escalation.
    • Keep deterministic business rules and human review in the loop for anything high impact.
  4. Ignoring regional policy differences

    • Healthcare billing rules vary by payer, state, country, and specialty.
    • Partition retrieval context by jurisdiction so the agent does not apply one region’s policy to another’s claims.
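On pitfall 1: the pipeline above uses zod for this, but the rule matters more than the library. Even a dependency-free type guard, sketched here as an alternative, beats parsing free text and guessing:

```typescript
// Hypothetical dependency-free guard for the assessment shape. The point is
// that malformed model output must be rejected (returned as null here),
// never routed on a best-effort interpretation.
interface FraudAssessment {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
  recommendedAction: "auto_approve" | "manual_review" | "escalate_siu";
}

export function parseAssessment(raw: string): FraudAssessment | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not even JSON: reject, do not guess
  }
  const v = value as Record<string, unknown> | null;
  const riskOk = ["low", "medium", "high"].includes(v?.riskLevel as string);
  const actionOk = ["auto_approve", "manual_review", "escalate_siu"]
    .includes(v?.recommendedAction as string);
  const reasonsOk = Array.isArray(v?.reasons) &&
    (v!.reasons as unknown[]).every((r) => typeof r === "string");
  return riskOk && actionOk && reasonsOk
    ? (v as unknown as FraudAssessment)
    : null;
}
```

Callers treat `null` as "send to manual review", so a model regression degrades to extra human work rather than bad routing.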


By Cyprian Aarons, AI Consultant at Topiax.
