Prerequisites and setup

Last update:
May 18, 2026
Set up RAG in ColdFusion in minutes. Learn how to configure your AI model provider via ColdFusion Administrator or Application.cfc, allow-list document paths, and connect to OpenAI, Anthropic, Azure, and more.
Version requirements
| Requirement | Minimum | Notes |
| --- | --- | --- |
| ColdFusion version | CF2025.0.08 | RAG is a new feature in CF2025.0.08; not available in earlier releases |
| Java / JVM | JDK 11+ | Libraries require Java 11 or later |
| Operating System | Windows, Linux, macOS | All CF-supported platforms are supported |
| AI provider API key | Required | At minimum one chat model key (OpenAI, Anthropic, Azure OpenAI, etc.) |

Configure AI model provider

ColdFusion RAG requires a configured chat model. You set this up once in the CF Administrator or via Application.cfc, and then pass the model object to any RAG function.
Option A: Configure in CF Administrator
Navigate to AI Services in the CF Administrator. Add your AI provider (OpenAI, Azure OpenAI, Anthropic, etc.) and give the configuration a name. Then select a vector store and a document source, and set the configuration options described below.
| Field | Description |
| --- | --- |
| Vector store | Required. The target vector store profile where ingested chunks (and embeddings) are stored. Choose a store that matches your RAG query path and embedding dimension. If the dropdown is empty, create and save a vector store configuration first. |
Document Source
| Field | Description |
| --- | --- |
| Source type | Single file: ingest one document by path. Directory: ingest all supported files under a folder (respecting your product’s recursion and filter rules). URL: fetch content from a web address when your product supports it. |
| File path (or path / URL field) | Path or address for the selected source type. For a single file or directory, use an absolute path the server can read. Use Browse server when available to reduce path typos. For a URL, enter a full URL per your integration’s requirements. |
| Supported formats | Typical support includes PDF, Word, Excel, PowerPoint, HTML, CSV, JSON, XML, plain text, Markdown, and related formats. Unsupported files may be skipped or may fail the job, depending on the Continue on error setting. |
Configuration Options (advanced)
| Field | Description |
| --- | --- |
| Parser type | Format-specific parser (for example PDF, HTML, plain text). Choose the parser that matches the dominant file type in this run, or the type your product uses when a single parser is selected for a batch. |
| Character encoding | Text encoding for parsers that read byte streams (for example UTF-8). Use the encoding that matches your files to avoid mojibake or parse failures. |
| Max file size (bytes) | Upper bound on file size for ingestion. A value of 0 often means no limit or the product default. With a non-zero value, oversized files are rejected or skipped early. |
Actions
| Control | Description |
| --- | --- |
| Run ingestion | Starts the ingestion job with the current settings. Ensure vector store and paths are correct before running; large directories can take a long time. |
Chunking Configuration (advanced)
| Field | Description |
| --- | --- |
| Splitter type | How text is split into chunks before embedding. Recursive (when labeled recommended) usually splits on paragraphs and headings first, then sentences, for more coherent chunks. Other types may split on fixed characters or delimiters only. |
| Chunk size (characters) | Target maximum size of each chunk in characters (not tokens). Larger chunks preserve more context but can reduce retrieval precision; smaller chunks improve granularity but increase vector count and cost. Default 1000 is a common starting point. |
| Chunk overlap (characters) | Number of characters shared between adjacent chunks. Overlap helps avoid cutting sentences or facts in half at boundaries. Default 200 is typical with 1000-character chunks; adjust if answers miss context at edges. |
| Custom separators (optional) | Extra delimiter strings (if your product supports them) that force splits, for example specific headings or markers. Leave empty to use the splitter’s built-in rules. |
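
To make the chunk size and chunk overlap settings concrete, here is a minimal Python sketch of fixed-size character chunking with overlap. It is not ColdFusion's splitter (the Recursive splitter also takes paragraph and sentence boundaries into account), and the function name `chunk_text` and its parameters are hypothetical. Only the defaults of 1000 and 200 characters come from the table above.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into character chunks; adjacent chunks share `overlap` characters.

    Illustrative only: a plain fixed-size splitter, not ColdFusion's implementation.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances for each new chunk
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


# A 2500-character document with the default settings.
sample = "x" * 2500
print([len(piece) for piece in chunk_text(sample)])  # prints [1000, 1000, 900]
```

With the defaults, each new chunk starts 800 characters after the previous one, so a smaller overlap yields fewer vectors while a larger overlap repeats more text across chunk boundaries.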
Ingestion options (advanced)
| Field | Description |
| --- | --- |
| Batch size | How many chunks or documents to process per internal batch (for example 100). Higher values can improve throughput but increase memory use. |
| Continue on error | When enabled, ingestion skips or logs failed files or chunks and continues with the rest. When disabled, the job may stop on the first error, which is better for strict validation but worse for large mixed folders. |
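
The interaction between batch size and Continue on error can be sketched in a few lines of Python. This is an assumption-laden illustration rather than ColdFusion's ingestion engine: `run_ingestion`, `fake_parse`, and the shape of the returned report are all hypothetical names chosen for the example.

```python
from typing import Callable, Iterable


def run_ingestion(items: Iterable[str],
                  process_item: Callable[[str], None],
                  batch_size: int = 100,
                  continue_on_error: bool = True) -> dict:
    """Process items batch by batch and report what succeeded and what failed."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    pending = list(items)
    report = {"processed": 0, "failed": [], "batches": 0}

    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]  # the slice handled as one internal batch
        report["batches"] += 1
        for item in batch:
            try:
                process_item(item)
                report["processed"] += 1
            except Exception as exc:
                if not continue_on_error:
                    raise  # strict mode: the first error stops the whole job
                report["failed"].append((item, str(exc)))  # lenient mode: record and move on
    return report


def fake_parse(path: str) -> None:
    """Stand-in parser (hypothetical) that rejects files ending in .bad."""
    if path.endswith(".bad"):
        raise ValueError(f"cannot parse {path}")


print(run_ingestion(["a.txt", "b.bad", "c.txt"], fake_parse, batch_size=2))
```

In the lenient run above, the failing file is recorded and the other two are still processed; passing `continue_on_error=False` instead re-raises the first error and stops the job.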
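Option B: Configure in Application.cfc
Define the provider keys as application settings in Application.cfc and copy them into the application scope when the application starts, as in this example: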
component {

    this.name = "chatmodelapp";
    // Placeholder values: replace these with your own provider keys.
    this.apikey = "api-key";
    this.pineconeApiKey = "api-key-pinecone";
    this.anthropicKey = "api-key-anthropic";


    this.mappings["/tool"] = expandPath("./tool");

    boolean function onApplicationStart() {
        application.apiKey = this.apikey;
        application.anthropicKey = this.anthropicKey;
        writeLog(
            text = "Application started. API key initialized.",
            file = "application"
        );
        return true;
    }

    void function onRequestStart(string targetPage) {

        /* ---- Application Re-initialization ---- */
        if (
            structKeyExists(url, "reinit")
            && url.reinit eq 1
            /* add protection as needed */
        ) {
            writeLog(
                text = "Application reinitialization triggered.",
                file = "application"
            );
            applicationStop();
        }
    }
}

Allow-list document path

Files and directories must be allow-listed in the pathfilter.json configuration file, located in cfusion/lib, before they can be used as sources for RAG ingestion or document loading.
What this means for developers
Before using a file or directory path in simpleRAG(), agent(), or documentService, add the path to the pathfilter.json file in the ColdFusion installation directory. If you use a path that is not allow-listed, you will receive a path filter violation error.
Add the required file or directory paths to pathfilter.json before running any RAG ingestion. For example:
{
    "comments": "paths should be semi-colon seperated. To Allow a file: {path-of-file}; To Allow a directory & files in it: {path-to-directory}/*; To Allow a directory & sub-directories: {path-to-directory}/**; To Block a file: !{path-of-file}; To Block a directory & sub-directories: !{path-to-directory}/**; Precedence decreases from left to right. Suppose directory A has directory B & C inside it.To Allow B & Block C: !A/C/*;A/**;",
        
    "bytecodeexecutionpaths": "",
    
    "documentaccesspaths": "C:/**;E:/**;",
        
    "schedulerexecutionpaths": "",

    "car": {
        "deploypath": "",
        "associatedfiles": ""
    }
}
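
The rule syntax described in the "comments" field can be sketched as a small first-match-wins evaluator. The Python below is an illustration built only from that description (`/*` for a directory and its files, `/**` for a directory and its sub-directories, `!` to block, precedence decreasing from left to right). The function names are hypothetical, and ColdFusion's actual matcher may differ, for example in case sensitivity or path normalization.

```python
def _matches(pattern: str, path: str) -> bool:
    """Return True if `path` is covered by a single allow/block pattern."""
    if pattern.endswith("/**"):
        base = pattern[:-3]
        # Directory itself plus everything beneath it, at any depth.
        return path == base or path.startswith(base + "/")
    if pattern.endswith("/*"):
        base = pattern[:-2]
        if path == base:
            return True
        if not path.startswith(base + "/"):
            return False
        # Only direct children: no further "/" after the directory prefix.
        return "/" not in path[len(base) + 1:]
    return path == pattern  # plain file rule


def is_allowed(rules: str, path: str) -> bool:
    """Evaluate semicolon-separated rules left to right; the first match decides."""
    path = path.replace("\\", "/")  # assumption: treat both separators alike
    for rule in rules.split(";"):
        rule = rule.strip()
        if not rule:
            continue
        blocked = rule.startswith("!")
        pattern = rule[1:] if blocked else rule
        if _matches(pattern, path):
            return not blocked
    return False  # assumption: paths matching no rule are not allowed


# The example from the comments field: allow A/B, block files in A/C.
rules = "!A/C/*;A/**;"
print(is_allowed(rules, "A/B/report.pdf"))  # prints True
print(is_allowed(rules, "A/C/secret.pdf"))  # prints False
```

Because the first matching rule wins, block rules need to appear before the broader allow rule that would otherwise cover the same path.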

Supported document formats

ColdFusion RAG includes built-in parsers for a wide range of document types. The correct parser is selected automatically based on file extension.
| Extensions | Parser used | Notes |
| --- | --- | --- |
| .txt, .text | Text Parser | Plain UTF-8 text |
| .md, .markdown | Markdown Parser | Strips Markdown syntax; headings become metadata |
| .pdf | PDF Parser | Extracts text layer; scanned-only PDFs may return empty content |
| .doc, .docx, .xls, .xlsx, .ppt, .pptx | Apache POI Parser | Full Office format support including embedded text in tables |
| .odt, .ods, .odp, .rtf, .html, .htm, .xml, .eml, .msg, .epub | Apache Tika Parser | Broad format coverage via Tika |
| .csv | CSV Parser | Each row becomes a document chunk |
| .json | JSON Parser | Extracts string values; nested objects are flattened |
| .xml | XML Parser | Text content of elements; attributes are included as metadata |
| .atom, .rss | Feed Parser | Each feed item becomes a document |
| .properties, .props | Properties Parser | Key-value pairs |
| .log | Log Parser | Each log line or entry becomes a chunk |
| .zip, .jar, .war, .tar, .gz and variants | ZIP Document Parser | Recursively unpacks and parses contained files |
| Any other type | Custom Parser (UDF) | Provide your own parsing logic via a UDF; see section 5.3 |
Note: ColdFusion selects the parser based on file extension, not MIME type. If a file has an incorrect extension, the wrong parser may be used. You can override the parser explicitly.
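
Extension-based selection amounts to a lookup keyed on the file suffix, which is why a mislabeled file ends up with the wrong parser. The Python sketch below illustrates that idea with a few unambiguous rows from the table. The names `PARSER_BY_EXTENSION` and `pick_parser`, the lowercase normalization, and the `override` parameter are assumptions made for the example and are not ColdFusion APIs.

```python
from pathlib import Path
from typing import Optional

# A subset of the table above (hypothetical structure, parser names from the table).
PARSER_BY_EXTENSION = {
    ".txt": "Text Parser",
    ".text": "Text Parser",
    ".md": "Markdown Parser",
    ".markdown": "Markdown Parser",
    ".pdf": "PDF Parser",
    ".docx": "Apache POI Parser",
    ".csv": "CSV Parser",
    ".json": "JSON Parser",
    ".log": "Log Parser",
}


def pick_parser(filename: str, override: Optional[str] = None) -> str:
    """Choose a parser from the file extension unless an explicit override is given."""
    if override:
        return override  # an explicit choice wins over the extension
    extension = Path(filename).suffix.lower()  # assumption: matching ignores case
    # Types not in the table fall through to a custom, user-supplied parser.
    return PARSER_BY_EXTENSION.get(extension, "Custom Parser (UDF)")


print(pick_parser("notes.md"))                              # prints Markdown Parser
print(pick_parser("plain_text.pdf"))                        # prints PDF Parser
print(pick_parser("plain_text.pdf", override="Text Parser"))  # prints Text Parser
```

The second call shows the failure mode from the note: a plain-text file saved with a .pdf extension is routed to the PDF parser unless the parser is overridden.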
