
LLMModel

Last update: May 18, 2026
The LLMModel is the fundamental interface for interacting with large language models (LLMs). It acts as a stateless engine that accepts a sequence of messages and returns a model-generated response. Unlike the Agent, the LLMModel does not maintain conversation history or automatically manage tools. Use the LLMModel when you want a lightweight, stateless call to the underlying provider with no built-in persona, memory, or tool orchestration. If you need structured prompts (system/user messages), personas, memory, or tools, use `Agent()` instead.

ChatModel() function

Description
ChatModel() creates a configured LLMModel instance in ColdFusion, connected to the provider of your choice.
Returns
An LLMModel instance. The returned object exposes .chat(), which executes LLM calls and returns a response struct containing message and metadata.
Syntax

chatConfig = {
    PROVIDER  : "openAi",
    APIKEY    : "#application.apiKey#",
    MODELNAME : "gpt-4o-mini",
    TEMPERATURE : 0.7
};
chatModel = ChatModel(chatConfig);
    
Parameters
Legend
  • 🔴 Mandatory
  • 🔄 Conditional
  • ⭕ Optional
  • ✅ Supported
  • — Not applicable
Provider columns in the original table: OpenAI, Anthropic, Mistral, Gemini, Azure OpenAI, Ollama. Their per-parameter support markers are not recoverable in this text, so only the remaining columns are shown.

| Param Name | Type | Description | Required | Default / Example |
|---|---|---|---|---|
| provider | String | Provider of the model | 🔴 | "openai" |
| baseUrl | String | The base URL for the API endpoint of the model | 🔄 | "https://api.openai.com/v1/" |
| apiKey | String | The authentication key required to access the model's API | 🔄 | "sk-..." |
| modelName | String | The specific identifier or name of the model to be used | 🔴 | "gpt-4o" |
| temperature | Number | Controls the randomness of the output (higher = more creative) | | 0.7 |
| maxTokens | Number | Maximum number of tokens to generate in the response | | 2048 |
| stop | Array of String | Strings that, if generated, cause the model to stop generating further tokens | | ["\nUser:", "###"] |
| timeout | Number | Maximum time in seconds to wait for a model response | | 10 |
| responseFormat | String | Specifies the desired format for the model's output | | "json" |
| httpClientBuilder | Struct | Configuration for the underlying HTTP client, including proxy and executor pool settings | | |
| topP | Number | Filters token selection by cumulative probability threshold | | 0.95 |
| topK | Number | Filters token selection by choosing only the top K most likely next tokens | | 40 |
| maxRetries | Number | Maximum number of times to retry a failed API request | | 2 |
| logRequests | Boolean | Whether to log the requests sent to the model | | true |
| logResponses | Boolean | Whether to log the responses received from the model | | true |
| thinkingType | String | Specifies the strategy the model uses for internal reasoning before generating a response | | "enabled" |
| thinkingBudgetTokens | Number | Maximum number of tokens the model can use for its internal reasoning process | | 512 |
| cacheSystemMessages | Boolean | Whether to cache system messages to optimize repeated interactions | | |
| cacheTools | Boolean | Whether to cache tool definitions for efficiency | | |
| repeatPenalty | Number | Penalty applied to tokens that have already appeared, discouraging repetition | | 0.5 |
| seed | Number | When set, makes the model output deterministic for a given input | | 1337 |
| numPredict | Number | The number of predictions or completions to generate | | 2000 |
| presencePenalty | Number | Penalty applied to tokens based on whether they are present in the text | | 0.0 |
| logitBias | Struct | Biases the probability of specific tokens appearing in the output | | {1504: 100} |
| maxCompletionTokens | Number | Maximum total number of tokens expected in the entire output | | 2048 |
| metadata | Map<String, String> | Additional data passed with the request, for logging or tracking | | |
| maxOutputTokens | Number | Maximum number of output tokens (equivalent to maxTokens for some providers) | | 2048 |
| candidateCount | Number | Number of alternative response candidates the model should generate | | 1 |
| allowCodeExecution | Boolean | Whether the model is allowed to execute code as part of its response generation | | false |
| includeCodeExecution | Boolean | Whether the generated response should include details of any code execution performed | | false |
| safetySettings | Array of String | Configuration for content safety filters to prevent generation of harmful content | | "HARM_CATEGORY_HATE_SPEECH" |
| version | String | The specific API version of the model to be used | | "2023-06-01" |
| beta | Boolean | Whether to use a beta or experimental version of the model or feature | | false |
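As an illustration of how several of the parameters listed above might be combined, here is a minimal sketch of a fuller configuration struct. The values are taken from the Default / Example entries above. Whether a given provider accepts each parameter is an assumption here, so check the provider support information before relying on any of them.

```cfml
// Sketch only: parameter names come from the table above; the choice of
// which ones to combine for the OpenAI provider is an assumption.
chatConfig = {
    PROVIDER    : "openAi",
    APIKEY      : "#application.apiKey#",
    MODELNAME   : "gpt-4o-mini",
    TEMPERATURE : 0.7,
    MAXTOKENS   : 2048,   // cap on generated tokens
    TIMEOUT     : 10,     // seconds to wait for a response
    MAXRETRIES  : 2,      // retry attempts for a failed API request
    LOGREQUESTS : true    // log outgoing requests
};
chatModel = ChatModel(chatConfig);
```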
IMPORTANT: When using ColdFusion’s LLM integration with the OpenAI provider, the topP parameter is not supported for the following models:
  • gpt-5
  • gpt-5.1
  • gpt-5-mini
  • gpt-5-nano
  • o1
  • o3
  • o4
Calling ChatModel with topP set in the configuration for any of the models above results in an error from the OpenAI API.
You'll see the following response:

        {
          "error": {
            "message": "Unsupported parameter: 'top_p' is not supported with this model.",
            "type": "invalid_request_error",
            "param": "top_p",
            "code": "unsupported_parameter"
          }
        }
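
One way to avoid this error is to drop topP from the configuration before calling ChatModel() when the target model is on the list above. The following is a minimal sketch of that idea, not a built-in feature. The noTopPModels array is a hypothetical helper that simply mirrors the list on this page, and the match is an exact, case-insensitive comparison of the model name.

```cfml
// Hypothetical helper list: the OpenAI models this page names as rejecting topP.
noTopPModels = ["gpt-5", "gpt-5.1", "gpt-5-mini", "gpt-5-nano", "o1", "o3", "o4"];

chatConfig = {
    PROVIDER  : "openAi",
    APIKEY    : "#application.apiKey#",
    MODELNAME : "gpt-5-mini",
    TOPP      : 0.95
};

// Remove topP when the configured model is on the list, so the
// parameter is never included in the configuration passed to ChatModel().
if (arrayFindNoCase(noTopPModels, chatConfig.MODELNAME) > 0) {
    structDelete(chatConfig, "TOPP");
}

chatModel = ChatModel(chatConfig);
```

Dated or suffixed variants of these model names would not match an exact comparison, so adjust the check if you use such identifiers.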
      

.chat()

Once configured, send prompts using .chat() with the LLMModel instance returned from ChatModel().
ChatModel.chat() accepts a plain string only. It does not support structured chat request objects (system message + user message). For structured prompts, use Agent() instead.
Example — plain string:

      response = chatModel.chat("Explain quantum computing in one sentence.");
      writeOutput(response.message);
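
Because the LLMModel is stateless, each .chat() call is independent of the ones before it. The sketch below illustrates what that implies in practice. It assumes chatModel was created with ChatModel(chatConfig) as in the Syntax section, and the prompts are illustrative only.

```cfml
// First call: the model sees only this string.
first = chatModel.chat("My favorite color is teal. Reply with OK.");

// Second call: no history is carried over, so the model has no
// record of the previous prompt or its answer.
second = chatModel.chat("What is my favorite color?");
writeOutput(second.message);

// To carry context forward, include it in the prompt string yourself.
third = chatModel.chat("Earlier I said my favorite color is teal. What is my favorite color?");
writeOutput(third.message);
```

If you find yourself stitching earlier turns into the prompt regularly, that is the case where Agent() is the better fit, since it manages conversation history for you.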
    
