How to Build a Fraud Detection Agent Using LlamaIndex in TypeScript for Healthcare

By Cyprian Aarons · Updated 2026-04-21

Tags: fraud-detection, llamaindex, typescript, healthcare

A fraud detection agent for healthcare flags suspicious claims, billing patterns, and provider activity before they turn into financial loss or compliance exposure. It matters because healthcare fraud is not just a cost problem; it creates audit risk, delays legitimate reimbursement, and can trigger regulatory action if you can’t explain why a claim was flagged.

Architecture

  • Claim ingestion layer

    • Pulls structured claim data from your EHR, claims platform, or data warehouse.
    • Normalizes fields like CPT/HCPCS codes, diagnosis codes, provider IDs, dates of service, and billed amounts.
  • Policy and rules context

    • Stores payer policies, prior authorization rules, coding guidelines, and internal fraud heuristics.
    • Gives the agent grounding so it does not rely only on statistical similarity.
  • Vector index over historical cases

    • Indexes past fraud investigations, denied claims, audit notes, and remediation outcomes.
    • Lets the agent retrieve similar cases before making a recommendation.
  • LLM reasoning layer

    • Uses LlamaIndex to combine retrieved evidence with the current claim payload.
    • Produces a structured fraud risk assessment with reasons and supporting citations.
  • Audit trail store

    • Persists every input, retrieval result, prompt version, and final decision.
    • Required for compliance review and post-incident analysis.
  • Human review queue

    • Routes high-risk or ambiguous cases to a billing specialist or SIU analyst.
    • Prevents automatic denial without oversight.
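One way to make the boundaries between these components concrete is to sketch them as TypeScript types. The interfaces below are illustrative, not part of LlamaIndex; the names are assumptions chosen for this sketch.

```typescript
// Illustrative stage contracts for the pipeline above. Each interface marks
// a hand-off point between components; all names are hypothetical.
interface NormalizedClaim {
  claimId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  billedAmount: number;
  dateOfService: string; // ISO 8601 date
}

interface RetrievedEvidence {
  sourceCaseId: string; // id of the historical investigation or audit note
  snippet: string;      // text chunk the reasoning layer will cite
  score: number;        // retrieval similarity score
}

interface RiskAssessment {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
  evidence: RetrievedEvidence[];
}

interface AuditEntry {
  claimId: string;
  promptVersion: string;
  retrieved: RetrievedEvidence[];
  assessment: RiskAssessment;
  decidedAt: string; // ISO 8601 timestamp
}
```

Typing the hand-offs this way keeps the audit trail store honest: if the audit entry type requires the retrieved evidence and prompt version, you cannot persist a decision without them.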

Implementation

1) Install dependencies and define your data model

Use LlamaIndex’s TypeScript package plus a JSON schema validator if you want strict outputs. In healthcare, keep PHI out of logs and only pass the minimum necessary fields into the agent.

npm install llamaindex zod
import { Document } from "llamaindex";

export interface ClaimRecord {
  claimId: string;
  memberId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  placeOfService: string;
  billedAmount: number;
  dateOfService: string;
  submittedAt: string;
}

export function claimToDocument(claim: ClaimRecord): Document {
  return new Document({
    id_: claim.claimId,
    text: JSON.stringify(claim),
    metadata: {
      claimId: claim.claimId,
      providerId: claim.providerId,
      billedAmount: claim.billedAmount,
      dateOfService: claim.dateOfService,
      submittedAt: claim.submittedAt,
    },
  });
}
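The "minimum necessary" guidance above can be enforced with a small allowlist sanitizer that runs before a claim reaches a prompt or a log line. This is a hedged sketch: it re-declares `ClaimRecord` locally so it stands alone, and the specific allowlist is an assumption to adjust against your own data model and counsel's guidance.

```typescript
// Hypothetical allowlist sanitizer: keeps only the fields the agent needs
// and drops direct identifiers (memberId) before prompting or logging.
export interface ClaimRecord {
  claimId: string;
  memberId: string;
  providerId: string;
  cptCodes: string[];
  icd10Codes: string[];
  placeOfService: string;
  billedAmount: number;
  dateOfService: string;
  submittedAt: string;
}

// Fields considered safe to place in a prompt; memberId is deliberately absent.
const PROMPT_SAFE_FIELDS = [
  "claimId", "providerId", "cptCodes", "icd10Codes",
  "placeOfService", "billedAmount", "dateOfService",
] as const;

export function minimizeForPrompt(claim: ClaimRecord): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const field of PROMPT_SAFE_FIELDS) {
    out[field] = claim[field]; // copy only allowlisted fields
  }
  return out; // memberId never leaves this function
}
```

Passing `JSON.stringify(minimizeForPrompt(claim))` instead of the raw record keeps the same pipeline shape while shrinking the PHI surface.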

2) Build an index over historical fraud cases

This is where LlamaIndex earns its keep. You are not asking the model to invent fraud patterns from scratch; you are retrieving similar investigations and using them as evidence.

import {
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";
import { claimToDocument, ClaimRecord } from "./claims";

async function buildFraudIndex(history: ClaimRecord[]) {
  const docs = history.map(claimToDocument);

  // Create a storage context pointed at a persist directory, then build the
  // index into it so it can be reloaded later without re-embedding.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage/fraud-index",
  });
  const index = await VectorStoreIndex.fromDocuments(docs, { storageContext });

  return index;
}

If your environment needs data residency controls, point persistence to a region-locked object store or self-hosted vector DB instead of local disk.
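A simple way to keep that residency boundary enforceable in code is to resolve the persist location from the claim's jurisdiction and fail closed when no mapping exists. The region keys and paths below are illustrative assumptions.

```typescript
// Hypothetical per-region persistence config: each jurisdiction gets its own
// index location so embeddings never cross a data-residency boundary.
const REGION_PERSIST_DIRS: Record<string, string> = {
  us: "./storage/us/fraud-index",
  eu: "./storage/eu/fraud-index",
};

export function persistDirForRegion(region: string): string {
  const dir = REGION_PERSIST_DIRS[region];
  if (!dir) {
    // Fail closed: never fall back to a default that might mix regions.
    throw new Error(`No persistence location configured for region ${region}`);
  }
  return dir;
}
```

The same lookup can feed a region-locked object store prefix or a self-hosted vector DB connection string instead of a local path.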

3) Query the index with a current claim and generate a risk assessment

Use asQueryEngine() for retrieval plus synthesis. The pattern below returns a structured answer that your downstream workflow can score and route.

import {
  VectorStoreIndex,
  storageContextFromDefaults,
} from "llamaindex";
import { z } from "zod";

const FraudAssessmentSchema = z.object({
  riskLevel: z.enum(["low", "medium", "high"]),
  reasons: z.array(z.string()),
  recommendedAction: z.enum(["auto_approve", "manual_review", "escalate_siu"]),
});

async function assessClaim(claimJson: string) {
  // Reload the persisted index instead of re-embedding on every call.
  const storageContext = await storageContextFromDefaults({
    persistDir: "./storage/fraud-index",
  });
  const index = await VectorStoreIndex.init({ storageContext });

  const queryEngine = index.asQueryEngine({
    retriever: index.asRetriever({ similarityTopK: 5 }),
  });

  const prompt = `
You are a healthcare fraud detection assistant.
Assess this claim for fraud indicators using retrieved historical cases and billing logic.
Return only valid JSON with keys riskLevel, reasons, recommendedAction.

Claim:
${claimJson}

Focus on:
- duplicate billing
- unbundling
- impossible dates of service
- unusual provider utilization
- mismatch between diagnosis and procedure codes
`;

  const response = await queryEngine.query({ query: prompt });

  // Reject malformed output rather than routing on a best-effort guess.
  return FraudAssessmentSchema.parse(JSON.parse(response.response));
}
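Several of the indicators the prompt lists, such as duplicate billing and impossible dates of service, are cheap to check deterministically before spending an LLM call. A hypothetical pre-screen, with illustrative flag names:

```typescript
// Hypothetical deterministic pre-screen for two of the indicators the prompt
// asks the model about. Flags found here can short-circuit the LLM entirely.
interface ClaimLine {
  claimId: string;
  providerId: string;
  cptCodes: string[];
  dateOfService: string; // ISO 8601 date
  submittedAt: string;   // ISO 8601 timestamp
}

export function preScreen(claim: ClaimLine, recent: ClaimLine[]): string[] {
  const flags: string[] = [];

  // Impossible dates: service dated after the claim was submitted.
  if (new Date(claim.dateOfService) > new Date(claim.submittedAt)) {
    flags.push("date_of_service_after_submission");
  }

  // Duplicate billing: same provider, same date, overlapping CPT codes.
  for (const prior of recent) {
    if (
      prior.claimId !== claim.claimId &&
      prior.providerId === claim.providerId &&
      prior.dateOfService === claim.dateOfService &&
      prior.cptCodes.some((c) => claim.cptCodes.includes(c))
    ) {
      flags.push(`possible_duplicate_of_${prior.claimId}`);
    }
  }

  return flags;
}
```

Claims that trip a deterministic flag can go straight to the review queue with an exact reason code, which is easier to defend in an audit than a model-generated explanation.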

4) Put it behind a review workflow

Do not let the model deny claims directly. Use it as a triage signal that feeds rules-based thresholds and human review; that keeps you safer on compliance and makes flagged decisions easier to defend in audits.

async function routeClaim(claimJson: string) {
  const assessment = await assessClaim(claimJson);

  if (assessment.riskLevel === "high") {
    return {
      action: "manual_review",
      reasonCodes: assessment.reasons,
      queue: "siu",
    };
  }

  if (assessment.riskLevel === "medium") {
    return {
      action: "manual_review",
      reasonCodes: assessment.reasons,
      queue: "billing_audit",
    };
  }

  return {
    action: "auto_approve",
    reasonCodes: assessment.reasons,
  };
}
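One way to keep deterministic business rules in charge is a hard override that refuses to auto-approve above a dollar cap, even when the model scores the claim low. The cap below is an arbitrary example, not a recommendation:

```typescript
// Hypothetical rules-first override: a low model risk score never
// auto-approves a claim above a hard dollar threshold.
interface Assessment {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
}

const AUTO_APPROVE_CAP = 10_000; // example threshold; tune per payer policy

export function applyBusinessRules(
  assessment: Assessment,
  billedAmount: number,
): { action: "auto_approve" | "manual_review"; reasonCodes: string[] } {
  if (assessment.riskLevel === "low" && billedAmount <= AUTO_APPROVE_CAP) {
    return { action: "auto_approve", reasonCodes: assessment.reasons };
  }
  const extraCode = billedAmount > AUTO_APPROVE_CAP
    ? "amount_above_auto_approve_cap"
    : "model_risk_signal";
  return {
    action: "manual_review",
    reasonCodes: [...assessment.reasons, extraCode],
  };
}
```

Layering this after `routeClaim` means a misbehaving model can only make the system more conservative, never less.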

Production Considerations

  • Compliance first

    • Treat all claim payloads as sensitive health data.
    • Minimize PHI in prompts, encrypt data at rest/in transit, and maintain access controls aligned with HIPAA or local equivalents.
  • Auditability

    • Persist the exact retrieved chunks, prompt template version, model version, and final output.
    • If an auditor asks why a claim was flagged, you need traceable evidence rather than “the model said so.”
  • Data residency

    • Keep embeddings, vector stores, and logs in the same jurisdiction as your regulated data.
    • If you serve multiple regions, isolate indexes per region instead of mixing patient data across borders.
  • Monitoring

    • Track false positives by provider specialty, payer type, and code family.
    • Watch for drift when billing policies change or new abuse patterns appear.
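The false-positive tracking described above can be as simple as grouping confirmed review outcomes by CPT code family so drift shows up per specialty instead of in one global number. A minimal sketch, with illustrative field names:

```typescript
// Hypothetical false-positive tracker: review outcomes grouped by CPT code
// family so drift in one specialty is not averaged away globally.
interface ReviewOutcome {
  cptFamily: string;        // e.g. leading digits of the CPT code
  flagged: boolean;         // the agent flagged this claim
  confirmedFraud: boolean;  // human review outcome
}

export function falsePositiveRates(
  outcomes: ReviewOutcome[],
): Record<string, number> {
  const flagged: Record<string, number> = {};
  const falsePos: Record<string, number> = {};
  for (const o of outcomes) {
    if (!o.flagged) continue; // only flagged claims can be false positives
    flagged[o.cptFamily] = (flagged[o.cptFamily] ?? 0) + 1;
    if (!o.confirmedFraud) {
      falsePos[o.cptFamily] = (falsePos[o.cptFamily] ?? 0) + 1;
    }
  }
  const rates: Record<string, number> = {};
  for (const family of Object.keys(flagged)) {
    rates[family] = (falsePos[family] ?? 0) / flagged[family];
  }
  return rates;
}
```

A sudden jump in one family's rate after a payer policy change is the drift signal to alert on.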

Common Pitfalls

  1. Using free-form LLM output in production

    • If you parse plain text decisions manually, you will eventually break routing logic.
    • Force structured JSON output with a schema like zod and reject malformed responses.
  2. Indexing raw PHI without governance

    • Dumping notes or full chart text into the vector store creates compliance debt fast.
    • Store only what is needed for fraud analysis and redact identifiers where possible.
  3. Letting the agent make final adjudication decisions

    • Fraud detection should triage; it should not be the final authority on denial or escalation.
    • Keep deterministic business rules and human review in the loop for anything high impact.
  4. Ignoring regional policy differences

    • Healthcare billing rules vary by payer, state, country, and specialty.
    • Partition retrieval context by jurisdiction so the agent does not apply one region’s policy to another’s claims.
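On pitfall 1: the pipeline above uses zod for this, but the rule matters more than the library. Even a dependency-free type guard, sketched here as an alternative, beats parsing free text and guessing:

```typescript
// Hypothetical dependency-free guard for the assessment shape. The point is
// that malformed model output must be rejected (returned as null here),
// never routed on a best-effort interpretation.
interface FraudAssessment {
  riskLevel: "low" | "medium" | "high";
  reasons: string[];
  recommendedAction: "auto_approve" | "manual_review" | "escalate_siu";
}

export function parseAssessment(raw: string): FraudAssessment | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not even JSON: reject, do not guess
  }
  const v = value as Record<string, unknown> | null;
  const riskOk = ["low", "medium", "high"].includes(v?.riskLevel as string);
  const actionOk = ["auto_approve", "manual_review", "escalate_siu"]
    .includes(v?.recommendedAction as string);
  const reasonsOk = Array.isArray(v?.reasons) &&
    (v!.reasons as unknown[]).every((r) => typeof r === "string");
  return riskOk && actionOk && reasonsOk
    ? (v as unknown as FraudAssessment)
    : null;
}
```

Callers treat `null` as "send to manual review", so a model regression degrades to extra human work rather than bad routing.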


By Cyprian Aarons, AI Consultant at Topiax.
