LLMModel

Last update: May 18, 2026

LLMModel is the fundamental interface for interacting with large language models (LLMs). It acts as a stateless engine: it accepts a prompt and returns a model-generated response. Unlike the Agent, it does not maintain conversation history or manage tools automatically.

Use LLMModel when you want a lightweight call to the underlying provider with no built-in persona, memory, or tool orchestration. If you need structured prompts (system and user messages), personas, memory, or tools, use `Agent()` instead.

ChatModel() function

Description

ChatModel() creates a configured LLMModel instance in ColdFusion, connected to the provider of your choice.

Returns

An LLMModel instance. The returned object exposes .chat(), which sends a prompt to the model and returns a response struct containing the message and metadata.

Syntax

chatConfig = {
    PROVIDER    : "openAi",
    APIKEY      : "#application.apiKey#",
    MODELNAME   : "gpt-4o-mini",
    TEMPERATURE : 0.7
};

chatModel = ChatModel(chatConfig);

Parameters

Legend

🔴 Mandatory
🔄 Conditional
⭕ Optional
✅ Supported
— Not applicable

Param Name	Type	Description	Required	OpenAI	Anthropic	Mistral	Gemini	Azure OpenAI	Ollama	Default / Example
`provider`	String	Provider of the model	🔴	✅	✅	✅	✅	✅	✅	`"openai"`
`baseUrl`	String	The base URL for the API endpoint of the model	🔄	✅	✅	✅	✅	✅	✅	`"https://api.openai.com/v1/"`
`apiKey`	String	The authentication key required to access the model's API	🔄	✅	✅	✅	✅	✅	—	`"sk-..."`
`modelName`	String	The specific identifier or name of the model to be used	🔴	✅	✅	✅	✅	✅	✅	`"gpt-4o"`
`temperature`	Number	Controls the randomness of the output (higher = more creative)	⭕	✅	✅	✅	✅	✅	✅	`0.7`
`maxTokens`	Number	Maximum number of tokens to generate in the response	⭕	✅	✅	✅	✅	✅	✅	`2048`
`stop`	Array of String	Strings that, if generated, cause the model to stop generating further tokens	⭕	✅	✅	✅	✅	✅	✅	`["\nUser:", "###"]`
`timeout`	Number	Maximum time in seconds to wait for a model response	⭕	✅	✅	✅	✅	✅	✅	`10`
`responseFormat`	String	Specifies the desired format for the model's output	⭕	✅	—	—	—	✅	✅	`"json"`
`httpClientBuilder`	Struct	Configuration for the underlying HTTP client, including proxy and executor pool settings	⭕	✅	—	✅	—	✅	✅	—
`topP`	Number	Filters token selection by cumulative probability threshold	⭕	✅	✅	✅	✅	✅	✅	`0.95`
`topK`	Number	Filters token selection by choosing only the top K most likely next tokens	⭕	—	✅	✅	✅	—	✅	`40`
`maxRetries`	Number	Maximum number of times to retry a failed API request	⭕	✅	✅	✅	—	✅	—	`2`
`logRequests`	Boolean	Whether to log the requests sent to the model	⭕	✅	✅	✅	✅	✅	✅	`true`
`logResponses`	Boolean	Whether to log the responses received from the model	⭕	✅	✅	✅	✅	✅	✅	`true`
`thinkingType`	String	Specifies the strategy the model uses for internal reasoning before generating a response	⭕	—	✅	✅	—	—	—	`"enabled"`
`thinkingBudgetTokens`	Number	Maximum number of tokens the model can use for its internal reasoning process	⭕	—	✅	✅	—	—	—	`512`
`cacheSystemMessages`	Boolean	Whether to cache system messages to optimize repeated interactions	⭕	—	✅	—	—	—	—	—
`cacheTools`	Boolean	Whether to cache tool definitions for efficiency	⭕	—	—	—	—	—	—	—
`repeatPenalty`	Number	Penalty applied to tokens that have already appeared, discouraging repetition	⭕	✅	—	—	—	—	✅	`0.5`
`seed`	Number	When set, makes the model output deterministic for a given input	⭕	✅	—	—	—	✅	✅	`1337`
`numPredict`	Number	Maximum number of tokens to predict when generating the response	⭕	—	—	—	—	—	✅	`2000`
`presencePenalty`	Number	Penalty applied to tokens based on whether they are present in the text	⭕	✅	—	—	—	✅	—	`0.0`
`logitBias`	Struct	Biases the probability of specific tokens appearing in the output	⭕	✅	—	—	—	✅	—	`{1504: 100}`
`maxCompletionTokens`	Number	Maximum total number of tokens expected in the entire output	⭕	✅	—	—	—	✅	—	`2048`
`metadata`	Map<String, String>	Additional data passed with the request, for logging or tracking	⭕	—	—	—	—	—	—	—
`maxOutputTokens`	Number	Maximum number of output tokens (equivalent to maxTokens for some providers)	⭕	—	—	—	✅	—	—	`2048`
`candidateCount`	Number	Number of alternative response candidates the model should generate	⭕	—	—	—	✅	—	—	`1`
`allowCodeExecution`	Boolean	Whether the model is allowed to execute code as part of its response generation	⭕	—	—	—	✅	—	—	`false`
`includeCodeExecution`	Boolean	Whether the generated response should include details of any code execution performed	⭕	—	—	—	✅	—	—	`false`
`safetySettings`	Array of String	Configuration for content safety filters to prevent generation of harmful content	⭕	—	—	—	✅	—	—	`"HARM_CATEGORY_HATE_SPEECH"`
`version`	String	The specific API version of the model to be used	⭕	—	✅	✅	—	—	—	`"2023-06-01"`
`beta`	Boolean	Whether to use a beta or experimental version of the model or feature	⭕	—	✅	—	—	—	—	`false`
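
As an illustration of how the provider columns affect a configuration, the sketch below sets up a locally hosted Ollama model, which per the table takes no apiKey. The provider string "ollama", the model name "llama3", and the base URL (Ollama's usual local default) are assumptions, not values confirmed by this page; substitute whatever your installation uses.

```cfml
// Hypothetical configuration for a local Ollama model (values are assumptions).
ollamaConfig = {
    PROVIDER    : "ollama",                  // assumed provider identifier
    BASEURL     : "http://localhost:11434",  // Ollama's default local port; adjust as needed
    MODELNAME   : "llama3",                  // assumed: any model already pulled locally
    TEMPERATURE : 0.2,
    TOPK        : 40,                        // the table lists topK as supported for Ollama
    TIMEOUT     : 30                         // seconds
};

ollamaModel = ChatModel(ollamaConfig);
```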

IMPORTANT: When using ColdFusion’s LLM integration with the OpenAI provider, the topP parameter is not supported for the following models:

gpt-5
gpt-5.1
gpt-5-mini
gpt-5-nano
o1
o3
o4

Calling ChatModel() with topP set in the configuration for any of the models above results in the following error response from the OpenAI API:

        {
          "error": {
            "message": "Unsupported parameter: 'top_p' is not supported with this model.",
            "type": "invalid_request_error",
            "param": "top_p",
            "code": "unsupported_parameter"
          }
        }
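
One way to avoid this error is to drop topP from the configuration before calling ChatModel() when the target model is on the list above. The helper below is a sketch only: the function name withoutUnsupportedTopP is hypothetical, the model list is copied from this page, and exact matching on the model name is an assumption (dated or suffixed variants may need their own entries).

```cfml
// Hypothetical helper: returns a copy of the config without topP
// when the model is one of those listed above as rejecting it.
function withoutUnsupportedTopP(required struct config) {
    var noTopPModels = ["gpt-5", "gpt-5.1", "gpt-5-mini", "gpt-5-nano", "o1", "o3", "o4"];
    var cleaned = duplicate(arguments.config);

    // Struct keys are case-insensitive in CFML, so "TOPP" and "topP" are the same key.
    if (structKeyExists(cleaned, "modelName")
        && arrayFindNoCase(noTopPModels, cleaned.modelName)
        && structKeyExists(cleaned, "topP")) {
        structDelete(cleaned, "topP");
    }
    return cleaned;
}

chatConfig = {
    PROVIDER  : "openAi",
    APIKEY    : "#application.apiKey#",
    MODELNAME : "o3",
    TOPP      : 0.95
};

// topP is removed here because "o3" is on the list.
chatModel = ChatModel(withoutUnsupportedTopP(chatConfig));
```

Working on a copy leaves the original struct untouched, so the same base configuration can be reused for models that do accept topP.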

.chat()

Once configured, send prompts using .chat() with the LLMModel instance returned from ChatModel().

.chat() accepts a plain string only. It does not support structured chat request objects (a system message plus a user message). For structured prompts, use Agent() instead.

Example — plain string:

      response = chatModel.chat("Explain quantum computing in one sentence.");

      writeOutput(response.message);
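
Because a provider call can fail (for example, on an invalid key, a timeout, or an unsupported parameter), it may be worth wrapping .chat() in a try/catch. The sketch below assumes a failed call surfaces as a ColdFusion exception, which this page does not confirm. writeDump() is used only to inspect the returned struct, whose metadata keys are not documented here.

```cfml
try {
    response = chatModel.chat("Explain quantum computing in one sentence.");

    // The generated text, encoded before being written into an HTML page.
    writeOutput(encodeForHTML(response.message));

    // Inspect the full struct (message plus metadata) during development.
    writeDump(var = response, label = "LLM response");
} catch (any e) {
    // Assumption: provider errors are raised as exceptions.
    writeLog(text = "LLM call failed: #e.message#", type = "error");
    writeOutput("The model could not be reached. Please try again later.");
}
```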
