How to Integrate OpenAI with Pinecone for Insurance AI Agents
Combining OpenAI for insurance with Pinecone gives you a practical pattern for building AI agents that can answer policy questions, retrieve claims context, and ground responses in your own carrier data. OpenAI handles the reasoning and response generation, while Pinecone stores and retrieves the most relevant policy clauses, endorsements, claims notes, and underwriting guidelines.
That matters because insurance workflows are document-heavy and accuracy-sensitive. An agent that can search your indexed knowledge base before answering is much safer than one relying on model memory alone.
Prerequisites
- Python 3.10+
- An OpenAI API key with access to the models you want to use
- A Pinecone account and API key
- A Pinecone index created with the correct embedding dimension for your embedding model
- The openai and pinecone SDKs installed via pip
- Basic familiarity with Python sync or async SDK calls
- Your insurance documents prepared as text chunks:
  - policy wording
  - claims manuals
  - underwriting guidelines
  - FAQ or SOP content
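Chunk preparation is often the step that makes or breaks retrieval quality. A minimal sketch of fixed-size chunking with overlap (the chunk_text helper and its parameter values are illustrative assumptions, not part of either SDK):

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split a document into fixed-size chunks with a small overlap,
    so clauses that straddle a boundary still appear intact in one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so adjacent chunks share `overlap` characters
    return chunks
```

In practice you may prefer splitting on section or sentence boundaries so a chunk never cuts a policy clause in half.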
Install the SDKs:
pip install openai pinecone
Set environment variables:
export OPENAI_API_KEY="your-openai-key"
export PINECONE_API_KEY="your-pinecone-key"
export PINECONE_INDEX_NAME="insurance-agent-index"
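Before initializing the clients, it helps to fail fast when a key is missing rather than erroring mid-request. A small sketch (require_env is a hypothetical convenience helper, not part of either SDK):

```python
import os

def require_env(*names: str) -> dict:
    """Read the given environment variables, raising one clear error
    listing everything that is missing."""
    missing = [n for n in names if not os.environ.get(n)]
    if missing:
        raise EnvironmentError("Missing environment variables: " + ", ".join(missing))
    return {n: os.environ[n] for n in names}
```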
Integration Steps
1. Initialize both clients
Start by wiring up the OpenAI client for embeddings and completions, then connect to Pinecone.
import os
from openai import OpenAI
from pinecone import Pinecone
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index_name = os.environ["PINECONE_INDEX_NAME"]
index = pc.Index(index_name)
2. Embed your insurance content and store it in Pinecone
Use OpenAI embeddings to convert each policy chunk into a vector, then upsert it into Pinecone with metadata.
documents = [
    {
        "id": "policy_001",
        "text": "Accidental damage is covered under Section 4, excluding wear and tear.",
        "metadata": {"source": "policy", "section": "4", "line_of_business": "home"}
    },
    {
        "id": "claims_014",
        "text": "Claims must be reported within 30 days of discovery.",
        "metadata": {"source": "claims_manual", "section": "reporting", "line_of_business": "commercial"}
    }
]
texts = [doc["text"] for doc in documents]
embedding_response = client.embeddings.create(
    model="text-embedding-3-small",
    input=texts
)

vectors = []
for doc, item in zip(documents, embedding_response.data):
    vectors.append({
        "id": doc["id"],
        "values": item.embedding,
        "metadata": {
            **doc["metadata"],
            "text": doc["text"]
        }
    })

index.upsert(vectors=vectors)
print("Upsert complete")
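Pinecone upserts are typically sent in batches, since the service limits request size, so for larger document sets it is worth splitting the vectors list. A minimal sketch, assuming a batch size of 100:

```python
def batches(items: list, size: int = 100):
    """Yield successive slices of `items`, each at most `size` long."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Hypothetical usage with the `vectors` list built above:
# for chunk in batches(vectors, size=100):
#     index.upsert(vectors=chunk)
```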
3. Query Pinecone from an agent request
When a user asks a question, embed the query with the same embedding model and search Pinecone for relevant context.
query = "Does this home policy cover accidental damage?"
query_embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input=[query]
).data[0].embedding

results = index.query(
    vector=query_embedding,
    top_k=3,
    include_metadata=True
)

for match in results["matches"]:
    print(match["score"], match["metadata"]["text"])
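Not every match is worth passing to the model; a low-similarity match can drag irrelevant clauses into the prompt. A simple score cutoff, sketched here with an assumed threshold of 0.3 that you would tune against your own data:

```python
def filter_matches(matches: list, min_score: float = 0.3) -> list:
    """Drop matches whose similarity score falls below the cutoff."""
    return [m for m in matches if m["score"] >= min_score]
```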
4. Pass retrieved context into OpenAI for the final answer
Take the retrieved chunks and feed them into a chat completion request so the model answers using grounded context.
context_chunks = [
    match["metadata"]["text"]
    for match in results["matches"]
]
context_block = "\n\n".join(context_chunks)

messages = [
    {
        "role": "system",
        "content": (
            "You are an insurance assistant. Answer only using the provided context. "
            "If the answer is not in context, say you do not have enough information."
        )
    },
    {
        "role": "user",
        "content": f"Context:\n{context_block}\n\nQuestion: {query}"
    }
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    temperature=0.2
)
print(response.choices[0].message.content)
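Keeping prompt assembly in one place makes it easier to audit exactly what the model sees. A sketch of a builder for this message structure (build_messages is an illustrative helper, not an OpenAI API):

```python
def build_messages(context_chunks: list[str], question: str) -> list[dict]:
    """Assemble the system and user messages used for grounded answering."""
    context_block = "\n\n".join(context_chunks)
    return [
        {
            "role": "system",
            "content": (
                "You are an insurance assistant. Answer only using the provided context. "
                "If the answer is not in context, say you do not have enough information."
            ),
        },
        {
            "role": "user",
            "content": f"Context:\n{context_block}\n\nQuestion: {question}",
        },
    ]
```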
5. Wrap retrieval + generation into one agent function
This is the pattern you actually deploy: retrieve first, then generate.
def answer_insurance_question(question: str) -> str:
    q_embedding = client.embeddings.create(
        model="text-embedding-3-small",
        input=[question]
    ).data[0].embedding

    matches = index.query(
        vector=q_embedding,
        top_k=3,
        include_metadata=True
    )["matches"]
    context = "\n\n".join(m["metadata"]["text"] for m in matches)

    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": "You are an insurance agent assistant. Use only retrieved policy context."
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}"
            }
        ],
        temperature=0.2
    )
    return resp.choices[0].message.content

print(answer_insurance_question("Is accidental damage covered?"))
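In production, both the embedding call and the query can fail transiently, so most deployments wrap the agent function in a retry. A minimal exponential-backoff sketch (the attempt count and delays are arbitrary starting points):

```python
import time

def with_retries(fn, attempts: int = 3, base_delay: float = 0.5):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Hypothetical usage:
# answer = with_retries(lambda: answer_insurance_question("Is accidental damage covered?"))
```

A real deployment would catch only transient errors such as rate limits and timeouts, rather than every exception.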
Testing the Integration
Run a simple end-to-end test with a known policy clause.
test_question = "What is the deadline to report a claim?"
answer = answer_insurance_question(test_question)
print("QUESTION:", test_question)
print("ANSWER:", answer)
Expected output should look like this:
QUESTION: What is the deadline to report a claim?
ANSWER: Claims must be reported within 30 days of discovery.
If you get an unrelated answer, check these first:
- The embedding model used at query time matches the one used at indexing time
- The Pinecone index dimension matches your embedding model (1536 for text-embedding-3-small)
- Your chunks contain enough policy text to retrieve useful matches
- You are passing the retrieved context into the chat prompt
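The dimension mismatch in that checklist is the most common silent failure: queries succeed but return noise. A quick guard, using the published dimensions for OpenAI's current embedding models:

```python
EMBEDDING_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def check_index_dimension(model: str, index_dimension: int) -> None:
    """Raise early if the Pinecone index cannot hold this model's vectors."""
    expected = EMBEDDING_DIMENSIONS.get(model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model}")
    if expected != index_dimension:
        raise ValueError(
            f"{model} produces {expected}-dim vectors, but the index is {index_dimension}-dim"
        )

# Hypothetical usage against the live index:
# check_index_dimension("text-embedding-3-small", index.describe_index_stats()["dimension"])
```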
Real-World Use Cases
- Claims intake agents: Pull claim reporting rules, required documentation, and coverage clauses before responding to customers or adjusters.
- Underwriting copilot: Retrieve appetite guidelines, exclusions, and referral rules so underwriters get grounded recommendations.
- Policy servicing assistant: Answer questions about deductibles, endorsements, cancellation terms, and renewal conditions from indexed carrier documents.
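For use cases like these, retrieval usually needs to be scoped: a home-policy question should not surface commercial claims rules. Pinecone supports metadata filters at query time; a sketch using the line_of_business field indexed earlier (the lob_filter helper is illustrative):

```python
def lob_filter(line_of_business: str) -> dict:
    """Build a Pinecone metadata filter restricting matches to one line of business."""
    return {"line_of_business": {"$eq": line_of_business}}

# Hypothetical usage:
# results = index.query(
#     vector=query_embedding,
#     top_k=3,
#     include_metadata=True,
#     filter=lob_filter("home"),
# )
```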
Keep learning
- The complete AI Agents Roadmap — my full 8-step breakdown
- Free: The AI Agent Starter Kit — PDF checklist + starter code
- Work with me — I build AI for banks and insurance companies
By Cyprian Aarons, AI Consultant at Topiax.
Want the complete 8-step roadmap?
Grab the free AI Agent Starter Kit — architecture templates, compliance checklists, and a 7-email deep-dive course.