How to Fix 'state not updating when scaling' in LlamaIndex (TypeScript)

By Cyprian Aarons · Updated 2026-04-21

When you see state not updating when scaling in a LlamaIndex TypeScript app, it usually means your agent or workflow state is being mutated in one execution path, but the scaled runtime is reading from a different instance, store, or process. In practice, this shows up when you move from local dev to multiple workers, serverless functions, or concurrent requests.

The symptom is usually not a crash. You’ll see stale chat history, missing Context values, tools behaving like they forgot prior steps, or logs that show WorkflowEvent firing but the next step reading old state.

The Most Common Cause

The #1 cause is treating in-memory state as if it were shared across scaled instances.

In LlamaIndex TypeScript, this often happens when you keep mutable state in a singleton service, module-level variable, or class property that lives only inside one Node process. When traffic scales horizontally, each instance gets its own copy.

Broken vs fixed pattern

  • Broken: store workflow state in module scope. Fixed: persist state in an external store keyed by session/user.
  • Broken: mutate this.state and expect all workers to see it. Fixed: load state at the start of each run and save after updates.
  • Broken: rely on a single Workflow instance for all requests. Fixed: create a per-request execution context.
// ❌ Broken: in-memory state shared only inside one process
import { Workflow } from "@llamaindex/workflow";

let conversationState = {
  step: 0,
  messages: [] as string[],
};

export class SupportWorkflow extends Workflow {
  async run(input: string) {
    conversationState.step += 1;
    conversationState.messages.push(input);

    return {
      step: conversationState.step,
      messages: conversationState.messages,
    };
  }
}
// ✅ Fixed: load/save state per request using an external store
import { Workflow } from "@llamaindex/workflow";

type ConversationState = {
  step: number;
  messages: string[];
};

interface StateStore {
  get(sessionId: string): Promise<ConversationState | null>;
  set(sessionId: string, state: ConversationState): Promise<void>;
}

export class SupportWorkflow extends Workflow {
  constructor(private store: StateStore) {
    super();
  }

  async run(sessionId: string, input: string) {
    const current =
      (await this.store.get(sessionId)) ?? { step: 0, messages: [] };

    const next = {
      step: current.step + 1,
      messages: [...current.messages, input],
    };

    await this.store.set(sessionId, next);

    return next;
  }
}

If you’re using LlamaIndex’s workflow APIs with classes like Workflow, Context, or event-driven steps such as @step, the same rule applies: anything that must survive scaling needs durable storage.
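As a concrete sketch of "durable storage", here is one way the StateStore interface from the fixed example could be implemented over any string key-value backend (Redis, DynamoDB, etc.). The KVClient interface and the key prefix are illustrative, not part of LlamaIndex:

```typescript
// Minimal sketch: a StateStore over a generic key-value client.
// KVClient is a hypothetical abstraction; in production it would wrap
// your Redis/DynamoDB/Postgres client.
type ConversationState = { step: number; messages: string[] };

interface KVClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

class KVStateStore {
  constructor(private kv: KVClient, private prefix = "conversation:") {}

  async get(sessionId: string): Promise<ConversationState | null> {
    const raw = await this.kv.get(this.prefix + sessionId);
    return raw ? (JSON.parse(raw) as ConversationState) : null;
  }

  async set(sessionId: string, state: ConversationState): Promise<void> {
    await this.kv.set(this.prefix + sessionId, JSON.stringify(state));
  }
}
```

Because the store is injected, you can point every replica at the same Redis instance in production and at an in-memory fake in tests.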

Other Possible Causes

1. Reusing a single Context across concurrent runs

A Context object should usually be request-scoped. If two requests share it, one can overwrite the other’s state.

// ❌ Broken
const ctx = new Context();

await Promise.all([
  workflow.run("request-a", { ctx }),
  workflow.run("request-b", { ctx }),
]);
// ✅ Fixed
await Promise.all([
  workflow.run("request-a", { ctx: new Context() }),
  workflow.run("request-b", { ctx: new Context() }),
]);

2. Assuming worker memory survives serverless cold starts

On Lambda, Vercel Functions, or any autoscaled container setup, memory is disposable. If your code depends on module-level caches for agent memory or retrieved documents, it will look fine locally and fail under scale.

// ❌ Broken
let cachedMemory: any;

export async function handler(req: Request) {
  cachedMemory ??= await loadMemoryFromDisk();
}

Use Redis, Postgres, DynamoDB, or another shared store instead.

// ✅ Fixed
export async function handler(req: Request) {
  const memory = await redis.get(`memory:${req.headers.get("x-session-id")}`);
}

3. Not awaiting async writes before returning

If you update state asynchronously and return early, the next request may read stale data.

// ❌ Broken
store.set(sessionId, nextState);
return { ok: true };
// ✅ Fixed
await store.set(sessionId, nextState);
return { ok: true };

This matters more under load because timing bugs become visible when requests overlap.
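Even with awaited writes, two overlapping requests can both read the same state, both update it, and the second write silently discards the first. One common guard is optimistic concurrency: store a version number with the state and only accept a write if the version the caller read is still current. A minimal in-memory sketch (the store and names are illustrative; a real implementation would use your database's conditional write, e.g. Redis WATCH/MULTI or a SQL WHERE version = ?):

```typescript
// Sketch: optimistic concurrency for read-modify-write state updates.
// A write succeeds only if no one else updated the key since it was read.
type Versioned<T> = { version: number; value: T };

class VersionedStore<T> {
  private data = new Map<string, Versioned<T>>();

  get(key: string): Versioned<T> | undefined {
    return this.data.get(key);
  }

  // Returns false (writing nothing) if the key was updated since
  // expectedVersion was read; the caller should re-read and retry.
  compareAndSet(key: string, expectedVersion: number, value: T): boolean {
    const currentVersion = this.data.get(key)?.version ?? 0;
    if (currentVersion !== expectedVersion) return false;
    this.data.set(key, { version: currentVersion + 1, value });
    return true;
  }
}
```

On a failed compareAndSet, re-read the state, re-apply the update, and retry, so overlapping requests serialize instead of clobbering each other.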

4. Version mismatch between LlamaIndex packages

If your project mixes incompatible versions of workflow-related packages, state serialization and event handling can behave inconsistently.

Check for mismatched versions:

{
  "dependencies": {
    "@llamaindex/core": "^0.3.0",
    "@llamaindex/workflow": "^0.2.1"
  }
}

Pin compatible versions and reinstall cleanly:

rm -rf node_modules package-lock.json
npm install
npm ls @llamaindex/core @llamaindex/workflow

How to Debug It

  1. Log the process identity

    • Add logs for hostname, pod name, container ID, or Lambda invocation ID.
    • If the “missing” state appears only on certain instances, you have a scaling boundary problem.
  2. Log the session key and storage reads/writes

    • Confirm every request uses the same session ID.
    • Check whether reads return older data than the most recent write.
console.log({
  sessionId,
  worker: process.env.HOSTNAME,
  before,
});
  3. Verify whether the bug disappears with one replica

    • Run with a single Node process.
    • If it works at one replica but fails at two or more, your issue is almost certainly shared-memory misuse or race conditions.
  4. Inspect where Context is created

    • Search for new Context() and make sure it happens inside request handling code.
    • Search for module-level variables holding conversation history or workflow progress.

Prevention

  • Keep all per-session agent state in Redis, Postgres, DynamoDB, or another shared store.
  • Treat Workflow, Context, and memory objects as request-scoped unless the docs explicitly say otherwise.
  • Add a scale test early:
    • run two replicas,
    • fire concurrent requests for the same session,
    • verify state updates remain consistent.
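The scale test above can be sketched as a small helper. Here runWorkflow is a stand-in for whatever entry point your deployment exposes (an HTTP endpoint, a workflow run, etc.); the session ID and message format are illustrative:

```typescript
// Sketch of a scale test: fire N concurrent updates for one session,
// then verify no update was lost. runWorkflow is your entry point.
async function scaleTest(
  runWorkflow: (sessionId: string, input: string) => Promise<{ step: number }>,
  n = 10,
): Promise<void> {
  const session = "scale-test-session";

  // Concurrent updates for the same session.
  await Promise.all(
    Array.from({ length: n }, (_, i) => runWorkflow(session, `msg-${i}`)),
  );

  // One more call: if no updates were lost, step is now n + 1.
  const final = await runWorkflow(session, "final");
  if (final.step !== n + 1) {
    throw new Error(`Lost updates: expected step ${n + 1}, got ${final.step}`);
  }
}
```

Run this against two or more replicas behind your load balancer; an in-memory implementation will typically fail it immediately, which is exactly the signal you want.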

If you want the short version: don’t let LlamaIndex TypeScript hide distributed-systems problems behind clean APIs. When scaling breaks state updates, it’s usually because your code assumed one process when production gave you many.

