Skip to main content
This documentation was AI-generated. If you find any errors or have suggestions for improvement, please feel free to contribute! Edit on GitHub
The GeminiImage node generates text and image responses from Google’s Gemini AI models. It allows you to provide multimodal inputs including text prompts, images, and files to create coherent text and image outputs. The node handles all API communication and response parsing with the latest Gemini models.

Inputs

ParameterData TypeInput TypeDefaultRangeDescription
promptSTRINGrequired""-Text prompt for generation
modelCOMBOrequiredgemini_2_5_flash_image_previewAvailable Gemini models
Options extracted from GeminiImageModel enum
The Gemini model to use for generating responses.
seedINTrequired420 to 18446744073709551615When seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn’t guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used.
imagesIMAGEoptionalNone-Optional image(s) to use as context for the model. To include multiple images, you can use the Batch Images node.
filesGEMINI_INPUT_FILESoptionalNone-Optional file(s) to use as context for the model. Accepts inputs from the Gemini Generate Content Input Files node.
Note: The node includes hidden parameters (auth_token, comfy_api_key, unique_id) that are automatically handled by the system and do not require user input.

Outputs

Output NameData TypeDescription
IMAGEIMAGEThe generated image response from the Gemini model
STRINGSTRINGThe generated text response from the Gemini model