How to Integrate Azure OpenAI for banking with CosmosDB for AI agents

By Cyprian Aarons
Updated 2026-04-21

Tags: azure-openai-for-banking, cosmosdb, ai-agents

Azure OpenAI for banking gives you the model layer for secure, policy-aware assistant behavior. CosmosDB gives you the persistence layer for customer context, conversation state, and audit-friendly memory. Put them together and you can build AI agents that answer banking questions, retain session context, and retrieve prior interactions without stuffing everything into prompts.

Prerequisites

  • Python 3.10+
  • An Azure subscription with:
    • Azure OpenAI resource
    • Azure Cosmos DB account
  • Deployed Azure OpenAI chat model, for example gpt-4o or gpt-4.1
  • CosmosDB database and container created
  • Azure CLI installed and logged in, or environment variables set manually
  • Python packages:
    • openai
    • azure-cosmos
    • python-dotenv

Install dependencies:

pip install openai azure-cosmos python-dotenv

Set environment variables:

export AZURE_OPENAI_ENDPOINT="https://<your-openai-resource>.openai.azure.com/"
export AZURE_OPENAI_API_KEY="<your-openai-key>"
export AZURE_OPENAI_DEPLOYMENT="banking-chat-model"

export COSMOS_ENDPOINT="https://<your-cosmos-account>.documents.azure.com:443/"
export COSMOS_KEY="<your-cosmos-key>"
export COSMOS_DATABASE="agentdb"
export COSMOS_CONTAINER="sessions"

Integration Steps

  1. Create a CosmosDB client and container handle

    Start by connecting to CosmosDB. For an AI agent, use a partition key like /sessionId so each conversation stays isolated and easy to query.

import os
from azure.cosmos import CosmosClient, PartitionKey

COSMOS_ENDPOINT = os.environ["COSMOS_ENDPOINT"]
COSMOS_KEY = os.environ["COSMOS_KEY"]
DATABASE_NAME = os.environ["COSMOS_DATABASE"]
CONTAINER_NAME = os.environ["COSMOS_CONTAINER"]

cosmos_client = CosmosClient(COSMOS_ENDPOINT, credential=COSMOS_KEY)

database = cosmos_client.create_database_if_not_exists(id=DATABASE_NAME)
container = database.create_container_if_not_exists(
    id=CONTAINER_NAME,
    partition_key=PartitionKey(path="/sessionId"),
    offer_throughput=400  # fixed RU/s; omit this argument on serverless accounts
)

  2. Initialize the Azure OpenAI client

    Use the Azure OpenAI SDK with your endpoint, API key, and deployment name. In banking workflows, keep the system prompt strict: no advice beyond policy, no fabricated account details, and always cite stored context when available.

import os
from openai import AzureOpenAI

AZURE_OPENAI_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
AZURE_OPENAI_API_KEY = os.environ["AZURE_OPENAI_API_KEY"]
AZURE_OPENAI_DEPLOYMENT = os.environ["AZURE_OPENAI_DEPLOYMENT"]

aoai_client = AzureOpenAI(
    api_key=AZURE_OPENAI_API_KEY,
    api_version="2024-02-15-preview",
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
)

  3. Load conversation memory from CosmosDB

    Before calling the model, fetch prior messages for the session. Keep the payload small; store only what the agent needs: user input, assistant output, timestamps, and optional metadata like intent or customer tier.

from typing import List, Dict

def load_session_messages(session_id: str) -> List[Dict]:
    query = """
    SELECT * FROM c
    WHERE c.sessionId = @sessionId
    ORDER BY c.createdAt ASC
    """
    items = list(container.query_items(
        query=query,
        parameters=[{"name": "@sessionId", "value": session_id}],
        enable_cross_partition_query=False
    ))

    messages = []
    for item in items:
        messages.append({"role": item["role"], "content": item["content"]})
    return messages
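Long-running sessions can grow past the model's context window, so "keep the payload small" also applies to what you send back to the model. A small, hypothetical helper that caps history at the most recent turns; the 20-message default is an arbitrary example:

```python
from typing import Dict, List

def trim_history(messages: List[Dict], max_messages: int = 20) -> List[Dict]:
    """Keep only the most recent messages, preserving chronological order."""
    if max_messages <= 0:
        return []
    return messages[-max_messages:]
```

Run it over the result of load_session_messages before building the prompt; the full transcript stays in CosmosDB either way.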

  4. Call Azure OpenAI with session context

    Build the message list from stored memory plus the current user prompt. This is where the agent becomes stateful: it can answer follow-ups like “show me that again” or “what was my last payment date?” without re-entering all context.

def get_banking_response(session_id: str, user_text: str) -> str:
    history = load_session_messages(session_id)

    messages = [
        {
            "role": "system",
            "content": (
                "You are a banking assistant. "
                "Use only provided context. "
                "Do not invent account balances or transaction details. "
                "If data is missing, ask for clarification."
            ),
        }
    ]

    messages.extend(history)
    messages.append({"role": "user", "content": user_text})

    response = aoai_client.chat.completions.create(
        model=AZURE_OPENAI_DEPLOYMENT,
        messages=messages,
        temperature=0.2,
        max_tokens=400,
    )

    return response.choices[0].message.content
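Shared model endpoints throttle under load, so calls like the one above can fail transiently. A generic retry-with-backoff sketch; in production you would catch the specific exception types your openai SDK version raises for rate limits and server errors rather than bare Exception:

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 1.0):
    """Call fn(); on failure sleep base_delay * 2**attempt, then retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))
```

Usage: `reply = with_retries(lambda: get_banking_response(session_id, user_text))`.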

  5. Persist the new turn back into CosmosDB

    After generating a response, store both sides of the exchange. This gives you durable memory for later retrieval, audit trails for compliance review, and a clean way to resume sessions across app restarts.

from datetime import datetime, timezone

def save_message(session_id: str, role: str, content: str):
    item = {
        "id": f"{session_id}-{role}-{datetime.now(timezone.utc).timestamp()}",
        "sessionId": session_id,
        "role": role,
        "content": content,
        "createdAt": datetime.now(timezone.utc).isoformat()
    }
    container.upsert_item(item)

def chat(session_id: str, user_text: str) -> str:
    # Generate first, then persist: get_banking_response already appends
    # the current user message to the prompt, so saving it beforehand
    # would duplicate it in the loaded history.
    assistant_text = get_banking_response(session_id, user_text)
    save_message(session_id, "user", user_text)
    save_message(session_id, "assistant", assistant_text)
    return assistant_text
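In a banking context you may not want raw account numbers landing in durable storage at all. A deliberately simple masking sketch to run on both sides of the turn before save_message; the 8-digit threshold and helper name are illustrative, and a real deployment would use a vetted PII-detection service:

```python
import re

def redact_account_numbers(text: str) -> str:
    """Mask runs of 8+ digits, keeping only the last four visible."""
    def mask(match: re.Match) -> str:
        digits = match.group(0)
        return "*" * (len(digits) - 4) + digits[-4:]
    return re.sub(r"\d{8,}", mask, text)
```

Applied before persistence, the audit trail keeps enough of the number to correlate a conversation without storing the full identifier.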

Testing the Integration

Run a simple round trip with a fixed session ID. The first call should create memory; the second call should reuse it.

if __name__ == "__main__":
    session_id = "acct-10001"

    first_reply = chat(session_id, "I need help understanding my mortgage payment schedule.")
    print("First reply:", first_reply)

    second_reply = chat(session_id, "Can you summarize what we discussed?")
    print("Second reply:", second_reply)

Example output (actual model responses will vary between runs):

First reply: I can help with that. Please share your mortgage type or loan reference so I can explain the payment schedule accurately.
Second reply: We discussed your mortgage payment schedule and I asked for your loan reference to provide an accurate summary.

If you want to verify CosmosDB persistence directly:

stored_items = list(container.query_items(
    query="SELECT * FROM c WHERE c.sessionId = @sessionId",
    parameters=[{"name": "@sessionId", "value": "acct-10001"}],
))
print(f"Stored messages: {len(stored_items)}")
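Beyond counting documents, you can sanity-check that the stored turns form a valid transcript: timestamps ascend and roles alternate starting with the user. A hypothetical helper operating on the plain dicts the query returns:

```python
from typing import Dict, List

def is_valid_transcript(items: List[Dict]) -> bool:
    """True if items, sorted by createdAt, alternate user/assistant roles."""
    ordered = sorted(items, key=lambda item: item["createdAt"])
    for i, item in enumerate(ordered):
        expected = "user" if i % 2 == 0 else "assistant"
        if item["role"] != expected:
            return False
    return True
```

A check like this makes a cheap assertion in integration tests and a quick triage tool when a session looks corrupted.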

Real-World Use Cases

  • Customer service banking agents

    • Answer balance-related questions only when backed by retrieved account data.
    • Keep conversation history in CosmosDB so customers can continue across channels.
  • Loan onboarding assistants

    • Guide applicants through document collection and eligibility checks.
    • Store application state per session and let Azure OpenAI generate next-step instructions.
  • Compliance-aware support workflows

    • Persist every user-assistant turn for audit review.
    • Use CosmosDB as an evidence trail while Azure OpenAI handles summarization and triage.
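For the loan-onboarding case, application state can live in the same container as the chat turns, partitioned by the same sessionId. An illustrative document shape; the field names are my own, not a prescribed schema:

```python
# One state document per session; "docType" separates it from chat turns
# so a single-partition query can fetch both together.
loan_state = {
    "id": "loan-acct-10001",
    "sessionId": "acct-10001",       # same partition key as the messages
    "docType": "loanApplication",    # chat turns could carry docType "message"
    "step": "document-collection",
    "documentsReceived": ["id-proof"],
    "documentsPending": ["income-proof", "property-valuation"],
}
# container.upsert_item(loan_state)  # persisted the same way as messages
```

Co-locating state and transcript in one partition keeps session resume to a single cheap query.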

The pattern is straightforward: Azure OpenAI generates responses; CosmosDB stores durable agent state. That separation keeps your banking agent maintainable, testable, and ready for compliance-heavy environments.


Keep learning

By Cyprian Aarons, AI Consultant at Topiax.
