Advanced RAG

Last update: May 18, 2026

Build production-grade RAG pipelines in ColdFusion with agent(). This article shows how to load documents from any source, transform and split them precisely, configure multi-turn query compression, route queries across multiple vector stores, aggregate results, and inject context using custom Mustache prompt templates.

The agent() function is ColdFusion's full-control RAG API. It exposes every pipeline stage individually (document loading, transformation, splitting, segment transformation, vector ingest, query transformation, routing, retrieval, aggregation, and content injection), so you can configure exactly what you need and leave the rest at defaults.

When to use agent() vs simpleRAG()

simpleRAG() is the high-level API: pass a document source, a chat model, and an optional options struct, and the platform handles everything else automatically. agent() is for cases where you need to override a specific stage that simpleRAG() does not expose.

Requirement	simpleRAG()	agent()
Minimal setup, zero config	Preferred	Works but verbose
Custom document loader (URL, UDF)	No	ingestion.documentLoader
Pre-split document transformation	No	ingestion.documentTransformer
Post-split segment transformation	No	ingestion.segmentTransformer
Nested splitter config (separators, regex)	No	ingestion.documentSplitter
Query transformation (compressing)	No	queryTransformer
Multiple retrievers / multi-store routing	No	queryRouter
Content aggregation with separator	No	contentAggregator
Custom prompt template (contentInjector)	No	contentInjector

Use agent() when you need the nested ingestion model and full retrieval configuration (retrievalAugmentor, queryRouter, and so on) rather than the flat options struct used by simpleRAG().
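
For comparison, a minimal simpleRAG() setup might look like the sketch below. It reuses the call shape that appears in the asynchronous ingestion example later in this article (document source, chat model, options struct). That ingest() is available on the returned service is inferred from the ingestion discussion, and the ./docs/ folder and application.apiKey are placeholders.

<cfscript>
  // Sketch only: the same ChatModel/VectorStore setup the agent() examples use.
  chatModel = ChatModel({
    provider: "openai",
    modelName: "gpt-4o-mini",
    apiKey: application.apiKey
  });
  vectorStore = VectorStore({
    provider: "INMEMORY",
    embeddingModel: {
      provider: "openai",
      modelName: "text-embedding-3-small",
      apiKey: application.apiKey
    }
  });

  // Flat options struct: no nested ingestion or retrievalAugmentor blocks.
  ragBot = simpleRAG(expandPath("./docs/"), chatModel, { vectorStore: vectorStore });
  ragBot.ingest();   // assumed synchronous counterpart of ingestAsync()
  answer = ragBot.ask("How to renew Adobe subscription?");
  writeOutput(answer.message);
</cfscript>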

Note: Neither style replaces the other. Use agent() ingestion when you want a declarative RAG setup; use documentService() when you need procedural control or want to reuse intermediate arrays (documents, segments).

Ingestion pipeline overview

The agent() ingestion pipeline runs in a fixed order: Loader → Document Transformer → Splitter → Segment Transformer → Vector Store Ingestor. Each stage is optional — omit any stage to use its default behavior.

Stage	Config key	Role
Document Loader	ingestion.documentLoader	Reads files, URLs, or a custom UDF into document structs.
Document Transformer	ingestion.documentTransformer	Enriches or filters documents before splitting (pre-split).
Document Splitter	ingestion.documentSplitter	Splits each document into overlapping chunks.
Segment Transformer	ingestion.segmentTransformer	Enriches or filters segments after splitting (post-split).
Vector Store Ingestor	ingestion.vectorStoreIngestor	Embeds segments and writes them to the vector store. Accepts batchSize and continueOnError.

Document loader (filesystem, URL, custom UDF)

The optional ingestion.documentLoader key makes the loader explicit. When it is omitted, the loader is inferred from ingestion.source. Three sourceType values are supported: "filesystem", "url", and "custom".

Filesystem loader

Combine ingestion.source with recursive, includePatterns, and documentLoader: { sourceType: "filesystem" }. Calling ingest() then runs the full pipeline.

<cfscript>

  chatModel = ChatModel({

    provider: "openai",

    modelName: "gpt-4o-mini",

    apiKey: application.apiKey,

    temperature: 0.7

  });

  vectorStore = VectorStore({

    provider: "INMEMORY",

    embeddingModel: {

      provider: "openai",

      modelName: "text-embedding-3-small",

      apiKey: application.apiKey

    }

  });

  docsDir = expandPath("./docs/");

  ragService = agent({

    CHATMODEL: chatModel,

    ingestion: {

      source: docsDir,

      recursive: true,

      includePatterns: ["*.pdf"],

      documentLoader: { sourceType: "filesystem" },

      documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

      vectorStoreIngestor: { vectorStore: vectorStore }

    },

    retrievalAugmentor: {

      queryRouter: {

        contentRetrievers: [{

          vectorStore: vectorStore,

          maxResults: 3,

          minScore: 0.3,

          description: "Knowledge base"

        }]

      }

    }

  });

  ragService.ingest();

  answer = ragService.chat("How to renew Adobe subscription?");

  writeOutput(answer.message);

</cfscript>

URL loader

Set ingestion.source to an HTTPS URL. Under documentLoader, use sourceType: "url" and requestOptions for connection and read timeouts.

<cfscript>

  ragService = agent({

    CHATMODEL: chatModel,

    ingestion: {

      source: "https://www.adobe.com/products/coldfusion-family.html",

      documentLoader: {

        sourceType: "url",

        requestOptions: { connectionTimeout: 5000, readTimeout: 30000 }

      },

      documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

      vectorStoreIngestor: { vectorStore: vectorStore }

    },

    retrievalAugmentor: {

      queryRouter: {

        contentRetrievers: [{

          vectorStore: vectorStore,

          maxResults: 3,

          minScore: 0.3,

          description: "Knowledge base"

        }]

      }

    }

  });

  ingest = ragService.ingest();

  writeDump(ingest);

</cfscript>

Custom UDF loader

With sourceType: "custom", you supply a UDF under the implementation key. The UDF returns an array of structs, each with text and metadata keys. Use this when content is built in CFML instead of read from disk.

<cfscript>

  customLoader = function(required struct config) {

    return [

      {

        text: "The population of India in 2024 is 1.44 billion people.",

        metadata: { source: "custom-loader", id: 1 }

      },

      {

        text: "The GDP of India in 2024 is approximately 3.94 trillion USD.",

        metadata: { source: "custom-loader", id: 2 }

      }

    ];

  };

  ragService = agent({

    CHATMODEL: chatModel,

    ingestion: {

      source: expandPath("./docs/"),

      documentLoader: {

        sourceType: "custom",

        implementation: customLoader

      },

      documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

      vectorStoreIngestor: { vectorStore: vectorStore }

    },

    retrievalAugmentor: {

      queryRouter: {

        contentRetrievers: [{

          vectorStore: vectorStore,

          maxResults: 3,

          minScore: 0.3,

          description: "Knowledge base"

        }]

      }

    }

  });

  ragService.ingest();

</cfscript>

Document transformer (pre-split enrichment)

An optional ingestion.documentTransformer UDF runs against each loaded document before splitting. Use it to enrich, filter, or restructure documents (for example, adding metadata tags, stripping boilerplate headers, or normalizing whitespace) before the splitter processes the text.

Note: The document transformer receives one document struct at a time and must return a document struct (or null/empty to drop the document). It runs after loading and before splitting, so changes affect all downstream segments.
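
As a minimal sketch of the kind of UDF described above, the function below tidies whitespace, tags the document, and drops documents that end up empty. The function name, the category metadata key, and the choice of an empty struct as the "drop" signal are assumptions for illustration; the text and metadata keys follow the document shape used by the custom loader. How the UDF is attached to ingestion.documentTransformer (directly or through a wrapper struct) is not shown here, so the sketch only exercises the function on its own.

<cfscript>
  // Hypothetical pre-split transformer: one document struct in, one out.
  cleanDocument = function(required struct doc) {
    // Collapse runs of spaces and tabs, but keep newlines so that
    // paragraph-based splitting still has separators to work with.
    var cleaned = trim(reReplace(doc.text, "[ " & chr(9) & "]+", " ", "all"));
    if (!len(cleaned)) {
      return {};   // assumed "empty" result that drops the document
    }
    var meta = structKeyExists(doc, "metadata") ? duplicate(doc.metadata) : {};
    meta["category"] = "support";   // illustrative tag, not a built-in key
    return { text: cleaned, metadata: meta };
  };

  // Stand-alone check of the UDF outside any pipeline.
  sample = {
    text: "  Renew   your plan" & chr(10) & "from the Plans page.  ",
    metadata: { source: "demo" }
  };
  writeDump(cleanDocument(sample));
</cfscript>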

Document splitter configuration

Configure ingestion.documentSplitter as a struct with chunkSize, chunkOverlap, splitterType, separators, and regexPattern. An empty struct {} uses defaults (recursive splitting, chunkSize 1000, chunkOverlap 100).

// Minimal — uses defaults

documentSplitter: {}

// Explicit chunk size and overlap

documentSplitter: { chunkSize: 500, chunkOverlap: 50 }

// Sentence splitter

documentSplitter: { splitterType: "sentence", chunkSize: 500, chunkOverlap: 50 }

// Recursive with custom separators

documentSplitter: {

  splitterType: "recursive",

  chunkSize: 200,

  chunkOverlap: 50,

  separators: [ chr(10) & chr(10), chr(10), " ", "" ]

}

// Regex splitter

documentSplitter: { splitterType: "regex", regexPattern: "\\n" }
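
To build intuition for how chunkSize and chunkOverlap interact, the sketch below cuts a string into fixed-size character windows in plain CFML. It is only an illustration: chunkText is a hypothetical helper, the built-in splitters break on separators or sentences rather than at arbitrary offsets, and whether the product measures sizes in characters or tokens is not stated here.

<cfscript>
  // Illustrative only: naive character-window chunking, not the built-in splitter.
  function chunkText(required string text, required numeric chunkSize, required numeric chunkOverlap) {
    if (chunkOverlap >= chunkSize) {
      throw(message = "chunkOverlap must be smaller than chunkSize");
    }
    var chunks = [];
    var step = chunkSize - chunkOverlap;   // how far the window advances each time
    var start = 1;                         // CFML strings are 1-indexed
    while (start <= len(text)) {
      arrayAppend(chunks, mid(text, start, chunkSize));
      if (start + chunkSize > len(text)) {
        break;   // this window already reached the end of the text
      }
      start += step;
    }
    return chunks;
  }

  // 26 characters, windows of 10 with 4 characters shared between neighbours:
  // "abcdefghij", "ghijklmnop", "mnopqrstuv", "stuvwxyz"
  writeDump(chunkText("abcdefghijklmnopqrstuvwxyz", 10, 4));
</cfscript>

A larger overlap repeats more text across neighbouring chunks, which helps when an answer straddles a boundary but increases the number of segments to embed.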

Segment transformer (post-split enrichment)

An optional ingestion.segmentTransformer UDF runs against each segment produced by the splitter. Use it to enrich segment metadata, filter out low-quality chunks, or normalize text after splitting but before embedding.

Note: The segment transformer receives one segment struct at a time (with text and metadata keys) and must return a segment struct or null to drop the segment. It runs after splitting and before the vector store ingestor.
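
A segment transformer along these lines might look like the following sketch. The function name, the 40-character threshold, and the char_count metadata key are illustrative assumptions; only the text and metadata keys and the "return null to drop" rule come from the note above. The sketch calls the UDF directly instead of wiring it into agent().

<cfscript>
  // Hypothetical post-split transformer: one segment struct in, one out.
  minChars = 40;   // illustrative threshold, tune for your corpus

  enrichSegment = function(required struct segment) {
    var body = trim(segment.text);
    if (len(body) < minChars) {
      return;   // no return value, which CFML treats as null: segment is dropped
    }
    var meta = structKeyExists(segment, "metadata") ? duplicate(segment.metadata) : {};
    meta["char_count"] = len(body);   // illustrative enrichment
    return { text: body, metadata: meta };
  };

  // Stand-alone check: one segment long enough to keep, one too short.
  kept = enrichSegment({
    text: "To renew, open Plans and select Manage plan, then choose Renew.",
    metadata: { file_name: "renew.pdf" }
  });
  dropped = enrichSegment({ text: "Page 3", metadata: {} });

  writeDump(kept);
  writeOutput(isNull(dropped) ? "Short segment dropped" : "Short segment kept");
</cfscript>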

Vector store ingestor (batchSize, continueOnError)

Configure ingestion.vectorStoreIngestor with a vectorStore object and optional batchSize and continueOnError.

Option	Description
vectorStore	Required. A configured VectorStore object. Must use the same embedding model as the retriever.
batchSize	How many segments to embed and write per internal batch (e.g. 100). Higher values improve throughput but increase memory use.
continueOnError	When true, ingestion skips or logs failed segments and continues. When false, the job stops on the first error — better for strict validation.

vectorStoreIngestor: {

  vectorStore: vectorStore,

  batchSize: 100,

  continueOnError: true

}
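
The interaction between batchSize and continueOnError can be pictured with the plain-CFML sketch below. It is not the product's ingestor: writeBatch is a hypothetical stand-in for "embed and store", and the sketch skips a whole failed batch, whereas the real ingestor may skip or log at the level of individual segments.

<cfscript>
  // Hypothetical stand-in for embedding and storing one batch; fails on empty text.
  function writeBatch(required array batch) {
    for (var seg in batch) {
      if (!len(trim(seg.text))) {
        throw(message = "Cannot embed an empty segment");
      }
    }
    return arrayLen(batch);
  }

  // Illustrative only: walk the segments in batches of batchSize.
  function ingestInBatches(required array segments, required numeric batchSize, required boolean continueOnError) {
    var stats = { written: 0, failedBatches: 0 };
    for (var start = 1; start <= arrayLen(segments); start += batchSize) {
      var count = min(batchSize, arrayLen(segments) - start + 1);
      var batch = arraySlice(segments, start, count);
      try {
        stats.written += writeBatch(batch);
      } catch (any e) {
        if (!continueOnError) {
          rethrow;   // strict mode: stop on the first error
        }
        stats.failedBatches++;   // lenient mode: record the failure and move on
      }
    }
    return stats;
  }

  segments = [ { text: "a" }, { text: "b" }, { text: "" }, { text: "d" }, { text: "e" } ];
  // With batchSize 2 the second batch fails: written = 3, failedBatches = 1.
  writeDump(ingestInBatches(segments, 2, true));
</cfscript>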

Batching and Async Ingestion

Synchronous ingestion (ingest()) keeps the request busy until indexing finishes. Asynchronous ingestion (ingestAsync()) starts that work and returns a Future object right away.

When to use async ingestion

Ingestion may take noticeable time (large folders, many files).
You want a clear place in code to wait (get()) or to check progress.
You are building flows that must not assume ingestion has finished until the Future has been resolved.

Basic pattern

Create your chat model, vector store, and RAG service (simpleRAG() or agent()), same as for synchronous RAG.
Call ingestAsync() on that service. It returns a Future.
When you need the result, call get() on the Future. That waits until ingestion completes and returns the ingest result struct.
If you need a fully indexed store, call ask() or chat() only after get() has returned.

Example: ingest asynchronously, then query

<cfscript>

  chatModel = ChatModel({

    provider: "openai",

    modelName: "gpt-4o-mini",

    apiKey: application.apiKey,

    temperature: 0.2

  });

  docsDir = expandPath("./docs/");

  vectorStore = VectorStore({

    provider: "INMEMORY",

    embeddingModel: {

      provider: "openai",

      modelName: "text-embedding-3-small",

      apiKey: application.apiKey

    }

  });

  ragBot = simpleRAG(docsDir, chatModel, { vectorStore: vectorStore });

  // Non-blocking: returns immediately with a Future

  future = ragBot.ingestAsync();

  // Option A: block until indexing finishes

  result = future.get();

  // Option B: poll while doing other work

  /*

  while (!future.isDone()) {

    // small unit of other work, or sleep

  }

  result = future.get();

  */

  // Optional: inspect statistics after ingest

  stats = ragBot.getStatistics();

  writeDump(stats);

  // Safe to query after get() returns successfully

  answer = ragBot.ask("How to upgrade Adobe plan?");

  writeOutput(answer.message);

</cfscript>

Understanding the Future

ingestAsync() returns quickly with a Future handle.
future.get() blocks the current request until ingestion finishes. Your page or request thread still waits at that line; it does not return HTML to the browser before that unless you structure the page differently.
For true background jobs that survive the HTTP response, you typically need application design beyond this pattern (scheduled tasks, message queues, or long-lived workers). The Future pattern is ideal when you want non-blocking composition in code or clear completion before the next step in the same request.
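
Option B in the example (polling) can be wrapped in a small helper that gives up after a time limit. The sketch relies only on the isDone() and get() calls shown above; the helper name, the polling interval, and the time limit are illustrative assumptions.

<cfscript>
  // Hypothetical helper: poll a Future until it completes or a time limit passes.
  function waitForIngest(required any future, numeric pollMs = 250, numeric maxWaitMs = 60000) {
    var waited = 0;
    while (!future.isDone() && waited < maxWaitMs) {
      sleep(pollMs);       // pause briefly instead of spinning
      waited += pollMs;
    }
    if (!future.isDone()) {
      throw(message = "Ingestion did not finish within " & maxWaitMs & " ms");
    }
    return future.get();   // returns at once because the Future is already done
  }

  // Usage, assuming ragBot was created as in the example above:
  // result = waitForIngest(ragBot.ingestAsync(), 500, 120000);
</cfscript>

The request thread still waits inside the helper, so this bounds the wait rather than moving the work into the background.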

Retrieval pipeline overview

Retrieval is the part of RAG that selects a small amount of text from your knowledge base so the language model can ground its answer in your documents instead of relying only on parametric memory. The retrievalAugmentor struct on an agent() wires optional stages:

Stage	Config key	Role
Rewrite user question for retrieval	queryTransformer	e.g. "compressing" so follow-ups use chat context.
Choose which retriever(s) to use	queryRouter	contentRetrievers list, optional type.
Merge text from multiple retrievers	contentAggregator	type, separator.
Format context and question for the LLM	contentInjector	promptTemplate, metadataKeys.

Typical order in config: declare queryTransformer (if any), contentAggregator (if any), contentInjector (if any), and queryRouter (with contentRetrievers). At runtime the product applies transformation, routes to one or more retrievers, aggregates chunks, then injects context for the language model.

Query transformation (compressing transformer for multi-turn)

Query transformation adjusts the text used for embedding search and context assembly. The compressing transformer rewrites the current user message using chat history so vague follow-ups (for example "Tell me more about that.") still retrieve relevant chunks.

queryTransformer: {

  type: "compressing"

}

Use queryTransformer with CHATMEMORY (for example messageWindowChatMemory) so prior turns exist to resolve references.

<cfscript>

  ragService = agent({

    CHATMODEL: chatModel,

    ingestion: {

      source: docsDir,

      documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

      vectorStoreIngestor: { vectorStore: vectorStore }

    },

    retrievalAugmentor: {

      queryTransformer: { type: "compressing" },

      queryRouter: {

        contentRetrievers: [{

          vectorStore: vectorStore,

          maxResults: 3,

          minScore: 0.3,

          description: "Adobe knowledge base"

        }]

      }

    },

    CHATMEMORY: { type: "messageWindowChatMemory", maxMessages: 10 }

  });

  ragService.ingest();

  answer1 = ragService.chat("How to reactivate Adobe subscription?");

  answer2 = ragService.chat("Tell me more about that.");

  writeOutput(answer1.message);

  writeOutput(answer2.message);

</cfscript>

The compressing transformer also composes with contentAggregator when multiple retrievers are used or chunks are merged:

retrievalAugmentor: {

  queryTransformer: { type: "compressing" },

  contentAggregator: {

    type: "default",

    separator: chr(10) & "---" & chr(10)

  },

  queryRouter: {

    contentRetrievers: [{

      vectorStore: vectorStore,

      maxResults: 3,

      minScore: 0.3,

      description: "Knowledge base"

    }]

  }

},

CHATMEMORY: { type: "messageWindowChatMemory", maxMessages: 10 }

If queryTransformer is omitted, retrieval uses the raw user message. That is usually sufficient when the question is explicit and self-contained.

Query routing (default vs intelligent routing)

Query routing decides which retrieval path to use. queryRouter holds an array of contentRetrievers, each pointing at a vectorStore and carrying a description that characterises what that retriever covers.

A single-store, single-retriever setup is the minimal case:

retrievalAugmentor: {

  queryRouter: {

    type: "default",

    contentRetrievers: [{

      vectorStore: vectorStore,

      maxResults: 3,

      minScore: 0.3

    }]

  }

}

You can register two or more retrievers pointing at the same vectorStore with different description, maxResults, or minScore:

retrievalAugmentor: {

  queryRouter: {

    contentRetrievers: [

      {

        vectorStore: vectorStore,

        maxResults: 3,

        minScore: 0.3,

        description: "Manage auto renewal settings"

      },

      {

        vectorStore: vectorStore,

        maxResults: 2,

        minScore: 0.3,

        description: "Reactivate Adobe subscription"

      }

    ]

  }

}

Content retrieval (maxResults, minScore)

Each retriever is a struct that must include a vectorStore reference for augmented retrieval.

Field	Description
vectorStore	Required. The vector store to search.
maxResults	Cap on how many chunks to pull (top-K after similarity search). Smaller values retrieve fewer chunks (tighter context); larger values retrieve more overlapping evidence.
minScore	Minimum similarity score, in the range [0, 1]. Chunks that score below it are filtered out; values outside [0, 1] are rejected. A high threshold (e.g. 0.9) keeps only very similar chunks and may shrink the context or leave it empty.
description	Human-readable scope; used by the router, especially with multiple retrievers.

Re-ranking and ordering results:

Ranking: chunks are ordered by similarity score to the query embedding (implementation-specific).
Top-K: maxResults keeps only the first K of that ordering.
Thresholding: minScore drops low-scoring chunks, which changes which evidence reaches the LLM.

Together, these control precision versus recall of retrieved text.
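
The three steps can be mimicked on a hand-made list of scored chunks, as in the sketch below. It is an illustration in plain CFML, not the retriever's implementation: selectChunks is a hypothetical helper and the scores are made up.

<cfscript>
  // Illustrative only: threshold, rank, then keep the top K.
  function selectChunks(required array scored, required numeric maxResults, required numeric minScore) {
    // Thresholding: drop chunks below minScore.
    var kept = arrayFilter(scored, function(item) {
      return item.score >= minScore;
    });
    // Ranking: highest similarity first.
    arraySort(kept, function(a, b) {
      if (a.score == b.score) {
        return 0;
      }
      return a.score > b.score ? -1 : 1;
    });
    // Top-K: keep at most maxResults chunks.
    if (arrayLen(kept) <= maxResults) {
      return kept;
    }
    return arraySlice(kept, 1, maxResults);
  }

  candidates = [
    { text: "Renew from the Plans page.",   score: 0.82 },
    { text: "Update your billing address.", score: 0.41 },
    { text: "Contact support by chat.",     score: 0.27 },
    { text: "Turn on auto renewal.",        score: 0.64 }
  ];

  // Keeps the 0.82 and 0.64 chunks: 0.27 is below minScore and 0.41 falls outside the top 2.
  writeDump(selectChunks(candidates, 2, 0.3));
</cfscript>

Raising minScore trades recall for precision; raising maxResults does the opposite.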

Content aggregation (merging multiple retrievers)

When queryRouter lists several contentRetrievers, each may return its own set of chunks. contentAggregator (for example type: "default" with a separator) merges those strings for the generator.

retrievalAugmentor: {

  contentAggregator: {

    type: "default",

    separator: chr(10) & "====" & chr(10)

  },

  queryRouter: {

    contentRetrievers: [

      {

        vectorStore: vectorStore,

        maxResults: 3,

        minScore: 0.3,

        description: "Economic indicators and inflation data"

      },

      {

        vectorStore: vectorStore,

        maxResults: 3,

        minScore: 0.3,

        description: "Technology and ColdFusion knowledge"

      }

    ]

  }

}
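
Conceptually, this kind of aggregation is a join with a separator. The sketch below shows that idea in plain CFML with made-up chunk strings; it is an assumption-level picture of the merged text, not the aggregator's actual code, and it does not cover ordering or de-duplication.

<cfscript>
  // Illustrative only: merge chunks from two retrievers with a separator.
  separator = chr(10) & "====" & chr(10);
  fromRetrieverA = ["Chunk A1 about inflation data.", "Chunk A2 about economic indicators."];
  fromRetrieverB = ["Chunk B1 about ColdFusion."];

  merged = [];
  arrayAppend(merged, fromRetrieverA, true);   // true = append the elements, not the array itself
  arrayAppend(merged, fromRetrieverB, true);

  context = "";
  for (i = 1; i <= arrayLen(merged); i++) {
    // Put the separator between chunks, not before the first one.
    context &= (i > 1 ? separator : "") & merged[i];
  }

  writeOutput("<pre>" & encodeForHTML(context) & "</pre>");
</cfscript>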

Content Injection and Prompt Templates

The Content Injector is the component that controls how retrieved content is merged with the user's question before the LLM sees it. It sits between the retriever and the LLM, and determines the exact shape of the prompt the model receives.

contentInjector accepts two optional properties. When omitted entirely, ColdFusion uses default injection behavior (retrieved content is prepended before the user message).

Property	Type	Default	Description
promptTemplate	String	Built-in default	Mustache-style template that controls how retrieved content and the user question are assembled into the final prompt sent to the LLM.
metadataKeys	Array of strings	[] (none)	Document metadata fields to append to each retrieved chunk. For example, ["file_name"] causes the source filename to appear next to each chunk.

promptTemplate placeholders

Your template string can contain two Mustache placeholders:

Placeholder	Replaced with
{{contents}}	The retrieved document chunks (one block of text).
{{userMessage}}	The user's original question.

Both placeholders are optional. If {{contents}} is absent, retrieved content is not injected. If {{userMessage}} is absent, the user's question is not included. If neither placeholder is present, ColdFusion logs a warning and falls back to default injection; the application does not crash.
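
To see what a filled-in template looks like, the substitution can be approximated with plain string replacement. The sketch below is not ColdFusion's internal renderer: renderPrompt is a hypothetical helper, and the chunk text is made up.

<cfscript>
  // Illustrative only: approximate placeholder substitution with replace().
  function renderPrompt(required string template, required string contents, required string userMessage) {
    var out = replace(template, "{{contents}}", contents, "all");
    return replace(out, "{{userMessage}}", userMessage, "all");
  }

  template = "Use this context to answer:" & chr(10) &
             "{{contents}}" & chr(10) & chr(10) &
             "Question: {{userMessage}}";

  prompt = renderPrompt(
    template,
    "Chunk one." & chr(10) & "Chunk two.",
    "How to view support cases?"
  );

  writeOutput("<pre>" & encodeForHTML(prompt) & "</pre>");
</cfscript>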

Examples

The examples below use an agent() configured with an in-memory vector store and an OpenAI chat model, with the API key read from application.apiKey. Your application will substitute its own API keys and model choices. Application-wide setup such as API keys belongs in Application.cfc:

// Application.cfc

component {

    this.name = hash(getCurrentTemplatePath());

    public void function onApplicationStart() {

        // Key read by the OpenAI examples below

        application.apiKey = "YOUR_OPENAI_API_KEY";

        application.mistralKey  = "YOUR_MISTRAL_API_KEY";

        application.ollamaBaseUrl = "http://your-ollama-host:11434";

    }

}

Example 1: Default Injection (No contentInjector Configured)

When you omit contentInjector, ColdFusion automatically prepends retrieved content before the user's message. This is the simplest starting point.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.6,

                    description: "Product knowledge base"

                }]

            }

            // No contentInjector — default injection is used

        }

    });

    ragService.ingest();

    answer = ragService.chat("What is Klarna?");

    writeOutput(answer.message);

</cfscript>

Output

Klarna is a payment service that allows you to shop online and pay for your purchases later through various payment options, including Buy Now, Pay Later (BNPL). It may perform a soft credit check upon signup, which does not affect your credit score. You can manage your Klarna purchases and outstanding balances via the Klarna app. If you need more information or have specific questions, you can visit Klarna's FAQ page or contact their Customer Service.

Example 2: Custom Prompt Template

Use promptTemplate when you need precise control over how the LLM sees the context and the question.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                promptTemplate: "Use this context to answer:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Question: {{userMessage}}"

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.6,

                    description: "Knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("How to view support cases?");

    writeOutput(answer.message);

</cfscript>

Output

To view your Adobe support cases, sign in to your Adobe account and navigate to the Support history page. There, you will find all your cases listed under the Support cases section. You can review both open and closed cases. If you want to specifically view your closed cases, select the 'Closed' option. Alternatively, you can see all cases, including both open and closed, by selecting 'All'. Please note that you cannot modify or reopen a closed case.

Example 3: Adding Source Metadata to Retrieved Chunks

The metadataKeys property appends document metadata (such as the source filename) to each retrieved chunk before injection. This allows the LLM to reference or cite sources.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                metadataKeys: ["file_name"]

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.6,

                    description: "Knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("How to resolve payment issues?");

    writeOutput(answer.message);

</cfscript>

Output

To resolve payment issues, check for a Billing issue alert displayed in your plan card. If it appears, your recent payment may have failed. You can resolve this by selecting 'Edit billing and payment' and then adding or updating your payment information. Make sure that your credit card details, including expiration date and billing address, are current. If you have multiple profiles associated with the same email, ensure you are managing the right account.

Example 4: Multiple Metadata Keys

You can specify multiple metadata keys. ColdFusion appends each available field to the chunk text.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                metadataKeys: ["file_name", "absolute_directory_path"]

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.3,

                    description: "Knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("What are the features of your product?");

    writeOutput(answer.message);

</cfscript>

Common metadata keys available on ingested documents:

Key	Description
file_name	The filename of the source document
absolute_directory_path	Full directory path containing the file
absoluteFilePath	Full file path including the filename
fileSize	File size in bytes

Output

Our product features include native support for Retrieval-Augmented Generation (RAG), which integrates information retrieval with text generation to enhance response accuracy and context. Additionally, ColdFusion 2025 includes advanced document processing capabilities, embedding, and retrieval features, and supports high-dimensional vector storage for semantic similarity search, making it ideal for AI applications.

Example 5: Prompt Template with Metadata (Source Citation)

Combine a custom template with metadataKeys to instruct the LLM to cite its sources.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                promptTemplate: "Sources:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Based on the sources above, answer: {{userMessage}}" & chr(10) &

                                "Cite the source file names.",

                metadataKeys: ["file_name"]

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.3,

                    description: "Knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("How to update TIN?");

    writeOutput(answer.message);

</cfscript>

Output

To update your tax identification number (TIN), follow these steps based on your account type: 1. **For Individual Accounts**: Look for the section on how to update your business tax identification number, such as VAT, GST, or NIT. Instructions may be found on Adobe's support page. 2. **For Tax-Exempt Customers in North America**: Follow the step-by-step instructions provided by Adobe to place a tax-exempt order. 3. **For PayPal Users**: Ensure accurate tax information appears on future invoices by updating your VAT or tax details directly in your PayPal account settings.

Example 6: Multi-Turn Conversation (Chat Memory)

Add CHATMEMORY alongside contentInjector to support follow-up questions. The injector formats context correctly on every turn; chat memory preserves conversational state.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 300, chunkOverlap: 50 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                promptTemplate: "Reference:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Q: {{userMessage}}"

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 5,

                    minScore: 0.3,

                    description: "Knowledge base"

                }]

            }

        },

        CHATMEMORY: { type: "messageWindowChatMemory", maxMessages: 20 }

    });

    ragService.ingest();

    // Turn 1

    a1 = ragService.chat("What are the system requirements?");

    writeOutput("Turn 1: " & a1.message);

    // Turn 2 - follow-up using prior context

    a2 = ragService.chat("How to auto renew?");

    writeOutput("Turn 2: " & a2.message);

    // Turn 3 - completely different topic

    a3 = ragService.chat("How to convert trial to paid subscription?");

    writeOutput("Turn 3: " & a3.message);

</cfscript>

Output

Turn 1: The system requirements for Adobe products vary depending on the specific product you are using. Generally, you need to have the latest version of your operating system and browser for optimal functionality and security. Make sure your system meets the minimum specifications provided on Adobe's official website for the product you are interested in.

Turn 2: To set up auto-renewal for your Adobe subscription, follow these steps: Sign in to your Adobe account. Navigate to the 'Billing and payment' section, where you will find an option to toggle auto-renewal on or off. If you enable auto-renewal, your subscription will automatically renew at the end of each annual term, and you will receive an annual bill until you decide to cancel or opt out.

Turn 3: To convert your Creative Cloud trial to a paid subscription, follow these steps: 1. Make sure you have access to the Adobe ID used for your trial, a stable internet connection, and a valid payment method (credit card, debit card, or PayPal). 2. Go to the Creative Cloud website. 3. Sign in to your account. 4. Follow the prompts to convert your trial to a paid membership. This will allow you to continue using Adobe apps and services without interruption.

Example 7: Shorthand Content Retriever

When you only need one retriever, you can use contentRetriever directly on retrievalAugmentor instead of nesting it under queryRouter. This is equivalent but more concise.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentRetriever: {

                vectorStore: vectorStore,

                maxResults: 3,

                minScore: 0.3

            },

            contentInjector: {

                promptTemplate: "Context:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Question: {{userMessage}}"

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("What payment methods do you support?");

    writeOutput(answer.message);

</cfscript>

Output

We support payment methods including credit cards (Mastercard and Visa) and PayPal. You can switch between these methods as needed.

Example 8: Content Injector with Aggregator

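When a pipeline uses multiple retrievers, a contentAggregator merges their results with a custom separator before the injector formats the combined text. The example below configures a single retriever to keep the listing short; the aggregator settings are the same when more retrievers are listed.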

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentAggregator: {

                type: "default",

                separator: chr(10) & "---" & chr(10)

            },

            contentInjector: {

                promptTemplate: "Evidence:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Answer this: {{userMessage}}"

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.3,

                    description: "Knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("How to manage subscriptions?");

    writeOutput(answer.message);

</cfscript>

Output

To manage your subscriptions, you can go to your Adobe account settings. Here, you can update your credit card information which will affect all subscriptions linked to that card. Additionally, you can manage your marketing communication preferences separately from account-related messages. This gives you control over the updates and communications you receive from Adobe regarding events, app updates, news, and more.
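
To make the aggregator's effect concrete, the following sketch assembles the same prompt text by hand using only built-in CFML functions. It illustrates the resulting string and is not the internal implementation of agent(); the segments array and its contents are hypothetical stand-ins for retrieved chunks.

```cfml
<cfscript>
    // Hypothetical retrieved segments (stand-ins for what the retrievers return)
    segments = [
        "You can update payment details in your account settings.",
        "Marketing preferences are managed separately from billing."
    ];

    // Same separator as the contentAggregator configuration above
    separator = chr(10) & "---" & chr(10);
    contents = arrayToList(segments, separator);

    // Same template as the contentInjector configuration above
    template = "Evidence:" & chr(10) &
               "{{contents}}" & chr(10) & chr(10) &
               "Answer this: {{userMessage}}";

    // Manual substitution of the two placeholders
    prompt = replace(template, "{{contents}}", contents, "all");
    prompt = replace(prompt, "{{userMessage}}", "How to manage subscriptions?", "all");

    writeOutput("<pre>" & encodeForHTML(prompt) & "</pre>");
</cfscript>
```

Printing the assembled text this way can help when tuning the separator, because it shows where one chunk ends and the next begins.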

Example 9: Multiple Content Retrievers

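Use multiple retrievers to search different semantic areas or apply different scoring thresholds. The injector formats the combined results as a single block. In this example both retrievers read from the same in-memory store and differ only in their description; in practice each retriever would typically point at its own store or use its own maxResults and minScore.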

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 300, chunkOverlap: 50 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                promptTemplate: "Retrieved documents:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "User question: {{userMessage}}"

            },

            queryRouter: {

                contentRetrievers: [

                    {

                        vectorStore: vectorStore,

                        maxResults: 3,

                        minScore: 0.3,

                        description: "Technical documentation"

                    },

                    {

                        vectorStore: vectorStore,

                        maxResults: 3,

                        minScore: 0.3,

                        description: "FAQ content"

                    }

                ]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("How to integrate with third-party services?");

    writeOutput(answer.message);

</cfscript>

Output

To integrate with third-party services using Facebook, you need to enable Facebook's integration with these services first. You can do this by going to your Facebook settings, navigating to the 'Apps and Websites' section, and turning on the option for apps, websites, and games. Once this is enabled, you should be able to proceed with the integration process. Make sure to check the permissions that the third-party service requests to ensure your privacy and data security.
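
As a sketch of the "different scoring thresholds" case, the fragment below gives each retriever its own store and its own limits. The names techStore and faqStore are hypothetical and assumed to be VectorStore instances that were ingested separately; the threshold values are illustrative, not recommendations.

```cfml
queryRouter: {
    contentRetrievers: [
        {
            vectorStore: techStore,     // hypothetical store of technical docs
            maxResults: 5,
            minScore: 0.5,              // stricter: keep only close matches
            description: "Technical documentation"
        },
        {
            vectorStore: faqStore,      // hypothetical store of FAQ entries
            maxResults: 2,
            minScore: 0.3,              // looser: accept weaker matches
            description: "FAQ content"
        }
    ]
}
```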

Example 10: Query Transformer + Content Injector

A queryTransformer with type "compressing" rewrites multi-turn chat history into a single self-contained query before retrieval. This improves retrieval accuracy in long conversations.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            queryTransformer: { type: "compressing" },

            contentInjector: {

                promptTemplate: "Context:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Question: {{userMessage}}"

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 3,

                    minScore: 0.3,

                    description: "Knowledge base"

                }]

            }

        },

        CHATMEMORY: { type: "messageWindowChatMemory", maxMessages: 10 }

    });

    ragService.ingest();

    answer = ragService.chat("How to reactivate subscription?");

    writeOutput(answer.message);

</cfscript>

Output

To reactivate your Adobe subscription, follow these steps: 1. **Check Payment Methods**: Ensure that the payment method on your account is valid and has sufficient funds. You can update your payment information by logging into your Adobe account. 2. **Visit Account Management**: Go to the Adobe account management page and navigate to the billing section. Here you can see if your account is inactive and follow prompts to reactivate it. 3. **Attempt Payment Again**: If your account is inactive due to failed payment attempts, you may need to attempt the payment again. This can usually be done directly from the account management section. 4. **Contact Support**: ....

Example 11: Full Pipeline

This example combines every component: query transformer, multiple retrievers, aggregator, injector with metadata, and chat memory.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,

            documentSplitter: { chunkSize: 300, chunkOverlap: 50 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            queryTransformer: { type: "compressing" },

            queryRouter: {

                contentRetrievers: [

                    {

                        vectorStore: vectorStore,

                        maxResults: 3,

                        minScore: 0.3,

                        description: "Technical documentation"

                    },

                    {

                        vectorStore: vectorStore,

                        maxResults: 2,

                        minScore: 0.3,

                        description: "Release notes"

                    }

                ]

            },

            contentAggregator: {

                type: "default",

                separator: chr(10) & "---" & chr(10)

            },

            contentInjector: {

                promptTemplate: "Sources:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Answer: {{userMessage}}",

                metadataKeys: ["file_name"]

            }

        },

        CHATMEMORY: { type: "messageWindowChatMemory", maxMessages: 10 }

    });

    ragService.ingest();

    answer = ragService.chat("I can't sign in?");

    writeOutput(answer.message);

</cfscript>

Output

It seems you're having trouble signing in with your social account. Please check if your social account is connected properly. If you're receiving an error message stating 'We couldn't connect your social account,' make sure you choose the correct option when prompted. For example, if logging in with Facebook, ensure you select 'Continue as [your name]' instead of 'Not now.' If you still can't sign in, please provide more details.

Example 12: External Vector Store (Qdrant)

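The content injector works identically with all supported vector store providers. To use a persistent, production-grade store, swap provider: "INMEMORY" for provider: "qdrant" and supply the connection settings (url, apiKey, collectionName, metricType, and dimension). The dimension value must match the embedding model's output size; all-minilm produces 384-dimensional vectors.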

<cfscript>

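    // Chat model used by agent() below (same settings as the earlier examples)

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    qdrantStore = VectorStore({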

        provider: "qdrant",

        url: "http://your-qdrant-host:6334",

        apiKey: "YOUR_QDRANT_API_KEY",

        collectionName: "product_docs_" & dateFormat(now(), "yyyymmdd"),

        metricType: "COSINE",

        dimension: 384,

        embeddingModel: {

            provider: "ollama",

            modelName: "all-minilm",

            baseUrl: application.ollamaBaseUrl

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: expandPath("./Documents/product-docs.txt"),

            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

            vectorStoreIngestor: { vectorStore: qdrantStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                promptTemplate: "Context:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Question: {{userMessage}}"

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: qdrantStore,

                    maxResults: 3,

                    minScore: 0.3,

                    description: "Product knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("What are the system requirements?");

    writeOutput(answer.message);

</cfscript>

Example 13: Ingesting an Entire Document Directory

Point source at a directory path instead of a single file to ingest all documents at once. At query time, the retriever returns the most relevant chunks from any of those files, and the injector formats them into the prompt.

<cfscript>

    chatModel = ChatModel({

        provider: "openai",

        modelName: "gpt-4o-mini",

        apiKey: application.apiKey,

        temperature: 0.7

    });

    docsDir = expandPath("./docs/");

    vectorStore = VectorStore({

        provider: "INMEMORY",

        embeddingModel: {

            provider: "openai",

            modelName: "text-embedding-3-small",

            apiKey: application.apiKey

        }

    });

    ragService = agent({

        CHATMODEL: chatModel,

        ingestion: {

            source: docsDir,          // entire directory

            documentSplitter: { chunkSize: 300, chunkOverlap: 50 },

            vectorStoreIngestor: { vectorStore: vectorStore }

        },

        retrievalAugmentor: {

            contentInjector: {

                promptTemplate: "Documents:" & chr(10) &

                                "{{contents}}" & chr(10) & chr(10) &

                                "Answer: {{userMessage}}",

                metadataKeys: ["file_name"]

            },

            queryRouter: {

                contentRetrievers: [{

                    vectorStore: vectorStore,

                    maxResults: 5,

                    minScore: 0.2,

                    description: "Multi-document knowledge base"

                }]

            }

        }

    });

    ragService.ingest();

    answer = ragService.chat("I am not able to subscribe?");

    writeOutput(answer.message);

</cfscript>

Output

It seems you're having trouble with your Adobe subscription. There could be multiple reasons for this, such as having multiple accounts, not authenticating your card details, or other issues. Please check if you are logged in with the correct Adobe ID associated with your subscription. Additionally, verify your payment details to ensure they are correctly authenticated.

Edge cases and behavior

No documents retrieved (high minScore): When minScore is set very high (e.g., 0.99) and no chunks meet the threshold, {{contents}} resolves to an empty string. The LLM receives the template with an empty context section and falls back to its own knowledge. The application does not throw an error.

// minScore: 0.99 — nothing will match; {{contents}} will be empty

contentInjector: {

  promptTemplate: "Context:" & chr(10) &

                  "{{contents}}" & chr(10) & chr(10) &

                  "Answer: {{userMessage}}"

},

queryRouter: {

  contentRetrievers: [{

    vectorStore: vectorStore,

    maxResults: 3,

    minScore: 0.99,    // extremely high — no matches expected

    description: "KB"

  }]

}
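
If falling back to the model's own knowledge is not acceptable, one option is to state the desired behavior in the template itself. The fragment below is a minimal sketch of that idea; the instruction wording is an assumption, and how reliably a model follows it depends on the model.

```cfml
contentInjector: {
  promptTemplate: "Answer using only the context below. " &
                  "If the context is empty, reply that no relevant documents were found." & chr(10) &
                  "Context:" & chr(10) &
                  "{{contents}}" & chr(10) & chr(10) &
                  "Question: {{userMessage}}"
}
```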

Template missing one placeholder: A template with only {{contents}} (no {{userMessage}}) delivers context to the LLM but omits the user's actual question. A template with only {{userMessage}} delivers the question but no retrieved context, effectively bypassing RAG. Both are accepted without errors.

// Only {{contents}} — user question is not passed to the LLM

contentInjector: {

  promptTemplate: "Here is the context: {{contents}}"

}

// Only {{userMessage}} — retrieved content is not injected

contentInjector: {

  promptTemplate: "Please answer: {{userMessage}}"

}
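
Because both incomplete templates are accepted silently, it can help to check a template before passing it to agect configuration. The helper below is a hypothetical sketch and not part of the agent() API; it uses only built-in CFML functions and matches the two placeholders by exact text, so variants with inner spaces such as {{ contents }} would be reported as missing.

```cfml
<cfscript>
    // Hypothetical helper: returns the required placeholders absent from a template
    function missingPlaceholders(required string template) {
        var expected = ["{{contents}}", "{{userMessage}}"];
        var missing = [];
        for (var p in expected) {
            // find() returns 0 when the substring does not occur
            if (find(p, arguments.template) == 0) {
                arrayAppend(missing, p);
            }
        }
        return missing;
    }

    missing = missingPlaceholders("Here is the context: {{contents}}");
    if (arrayLen(missing)) {
        // For this template: "Template is missing: {{userMessage}}"
        writeOutput("Template is missing: " & arrayToList(missing, ", "));
    }
</cfscript>
```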

Template with no placeholders: A template string with no recognized placeholders causes ColdFusion to log a warning and fall back to default injection. The application continues to run.

// No placeholders — ColdFusion logs a warning, default injection is used as fallback

contentInjector: {

  promptTemplate: "No placeholders here"

}


Best practices

Always include {{contents}} in your template. Without it, retrieved documents never reach the LLM and the RAG pipeline provides no benefit.
Keep minScore between 0.3 and 0.7 for most use cases. Start at 0.5 and adjust based on answer quality.
Use metadataKeys: ["file_name"] whenever your knowledge base contains multiple files. This lets the LLM reference or cite the specific source document.
Use queryTransformer: { type: "compressing" } with chat memory. In multi-turn conversations, earlier messages contain important context for retrieval, and the compressing transformer folds that context into a single self-contained query.
Set maxMessages on CHATMEMORY to a finite number. A window of 10–20 messages covers typical conversational depth.
Limit chunkSize based on your embedding model's context window. The all-minilm model handles chunks up to ~512 tokens. A chunkOverlap of 10–20% of chunkSize is a reasonable default.
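
The overlap guideline above can be expressed as a small calculation. This is a minimal sketch: the 15% ratio is simply a value inside the suggested 10–20% range, and splitterConfig is a hypothetical variable name.

```cfml
<cfscript>
    chunkSize = 500;
    overlapRatio = 0.15;                              // inside the suggested 10-20% range
    chunkOverlap = round(chunkSize * overlapRatio);   // 75

    // Struct with the same keys the examples pass as ingestion.documentSplitter
    splitterConfig = { chunkSize: chunkSize, chunkOverlap: chunkOverlap };
</cfscript>
```

Deriving the overlap from chunkSize keeps the two values in proportion when you later tune the chunk size.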

Multi-store topic routing

Multi-domain scenarios use separate VectorStore instances, each populated from different documents. Each retriever's description identifies its domain (for example, economics, technology, or astronomy). A query-only service then uses queryRouter with multiple contentRetrievers so the router can send each question to the store that matches the user's intent.

The pattern has three steps:

Create a separate VectorStore and agent() for each domain and call ingest() on each.
Create a query-only agent() (no ingestion block) with a retrievalAugmentor that lists a contentRetrievers entry per store, each with a description scoping that retriever's domain.
Call chat() on the query service. The router steers each question to the matching store.

<cfscript>

  try {

    chatModel = ChatModel({

      PROVIDER: "mistral",

      APIKEY: application.mistralKey,

      MODELNAME: "mistral-small-latest",

      TEMPERATURE: 0.3

    });

    econStore = VectorStore({

      provider: "INMEMORY",

      embeddingModel: {

        provider: "ollama",

        modelName: "all-minilm",

        baseUrl: application.ollamaBaseUrl

      }

    });

    techStore = VectorStore({

      provider: "INMEMORY",

      embeddingModel: {

        provider: "ollama",

        modelName: "all-minilm",

        baseUrl: application.ollamaBaseUrl

      }

    });

    // Step 1: Ingest domain-specific documents into separate stores

    svcEcon = agent({

      CHATMODEL: chatModel,

      ingestion: {

        source: expandPath("./Documents/test.txt"),

        documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

        vectorStoreIngestor: { vectorStore: econStore }

      }

    });

    svcEcon.ingest();

    svcTech = agent({

      CHATMODEL: chatModel,

      ingestion: {

        source: expandPath("./Documents/coldfusion.txt"),

        documentSplitter: { chunkSize: 500, chunkOverlap: 100 },

        vectorStoreIngestor: { vectorStore: techStore }

      }

    });

    svcTech.ingest();

    // Step 2: Query-only agent routes to the correct store

    queryService = agent({

      CHATMODEL: chatModel,

      retrievalAugmentor: {

        queryRouter: {

          contentRetrievers: [

            {

              vectorStore: econStore,

              maxResults: 3,

              minScore: 0.3,

              description: "Economic data including inflation rates and consumer price indices by year"

            },

            {

              vectorStore: techStore,

              maxResults: 3,

              minScore: 0.3,

              description: "Technology platform documentation for ColdFusion programming language"

            }

          ]

        }

      }

    });

    // Step 3: Router sends question to the matching store

    answer = queryService.chat("What is the inflation of year 1999?");

    // Sanity check: assumes the sample economics document reports 6% for 1999

    routedToEcon = findNoCase("6%", answer.message) > 0;

    writeOutput(answer.message & " [routed_to_econ=" & routedToEcon & "]");

  } catch (any e) {

    writeOutput("ERROR: " & e.message);

  }

</cfscript>

With one retriever, routing is trivial: every query uses that store. With multiple retrievers, the implementation uses the description values (and the optional type on the router) to steer each question, for example a finance, support, or internal-docs question, to the appropriate index.

Note: The query-only agent() has no ingestion block. Ingestion ran separately on svcEcon and svcTech. This pattern lets you update one domain's index without rebuilding the others, and it lets different teams own different corpora independently.
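
A sketch of updating a single domain under this pattern might look like the following. It assumes chatModel, techStore, and queryService already exist as in the example above, and the file name is hypothetical. This article does not state whether ingest() appends to or replaces existing entries in a store, so verify that behavior before relying on it.

```cfml
<cfscript>
    // Assumes chatModel, techStore, and queryService from the example above.
    // The source file name is hypothetical.
    svcTechUpdate = agent({
        CHATMODEL: chatModel,
        ingestion: {
            source: expandPath("./Documents/coldfusion-updated.txt"),
            documentSplitter: { chunkSize: 500, chunkOverlap: 100 },
            vectorStoreIngestor: { vectorStore: techStore }
        }
    });

    // Only techStore is written to; econStore is not touched
    svcTechUpdate.ingest();

    // The query-only service is reused as-is
    answer = queryService.chat("Which ColdFusion features are documented?");
    writeOutput(answer.message);
</cfscript>
```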
