Google Labs recently introduced Whisk, an experimental tool that allows you to visualize your ideas and tell your story using images as prompts. To put it simply, Whisk lets you upload three images: a subject image, a scene image, and a style image. You can also add a short text prompt to blend these elements into a new, cohesive image.
ImageFX is another experimental tool from Google Labs. Like Midjourney (MJ), it generates four images at once from a text prompt.
Both tools are now available in ComfyUI thanks to the developer "ainewsto," who created a free plugin for them.
Plugin Repository: ComfyUI-Labs-Google Plugin (https://github.com/ainewsto/comfyui-labs-google)
Now you can explore the power of Whisk and ImageFX directly in ComfyUI for free, leveraging advanced image blending and text-to-image capabilities to create stunning visuals in no time!
Using Google Whisk and ImageFX Together
If you can't find the plugin in the ComfyUI Manager list yet, install it through the Manager using its Git address.
When prompted, enter the plugin address: https://github.com/ainewsto/comfyui-labs-google.git
Once installed, restart ComfyUI. Alternatively, you can opt for a fully manual installation:
cd comfyui/custom_nodes
git clone https://github.com/ainewsto/comfyui-labs-google.git
Choose either of these methods.
After installing the plugin, configuration is necessary. Navigate to the plugin directory: comfyui/custom_nodes/comfyui-labs-google. Inside, you'll find a configuration file named googel.json. The default configuration is as follows, and you need to replace all fields with your own details:
{
  "user": {
    "name": "",
    "email": "",
    "image": ""
  },
  "expires": "",
  "access_token": "",
  "cookies": [
    { "name": "_ga", "value": "GA1.1.206839949.1734415920" },
    { "name": "EMAIL", "value": "" },
    { "name": "__Secure-next-auth.session-token", "value": "" },
    { "name": "__Host-next-auth.csrf-token", "value": "" }
  ]
}
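Since every field has to be replaced, it can help to list which ones are still blank before moving on. The following is a minimal sketch, not part of the plugin: `find_blank_fields` is a hypothetical helper, and the JSON string is simply the default configuration shown above.

```python
import json
from typing import Any, List

# The default configuration from the article, used here as sample input.
DEFAULT_CONFIG = """
{
  "user": {"name": "", "email": "", "image": ""},
  "expires": "",
  "access_token": "",
  "cookies": [
    {"name": "_ga", "value": "GA1.1.206839949.1734415920"},
    {"name": "EMAIL", "value": ""},
    {"name": "__Secure-next-auth.session-token", "value": ""},
    {"name": "__Host-next-auth.csrf-token", "value": ""}
  ]
}
"""


def find_blank_fields(node: Any, path: str = "") -> List[str]:
    """Return a readable path for every empty-string value in a parsed config."""
    blanks: List[str] = []
    if isinstance(node, dict):
        # Cookie entries are reported by cookie name instead of list position.
        if set(node) == {"name", "value"}:
            if node["value"] == "":
                blanks.append(f"{path}[{node['name']}]")
            return blanks
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            blanks.extend(find_blank_fields(value, child))
    elif isinstance(node, list):
        for item in node:
            blanks.extend(find_blank_fields(item, path))
    elif node == "":
        blanks.append(path)
    return blanks


if __name__ == "__main__":
    for field in find_blank_fields(json.loads(DEFAULT_CONFIG)):
        print("still blank:", field)
```

Note that this only catches empty values. The `_ga` cookie ships with a placeholder value that also needs to be replaced, and the sketch will not flag it.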
While it may seem a bit complex, it's straightforward for frontend or backend developers. If you're not familiar with browser developer tools, follow the step-by-step instructions below to obtain the necessary information:
Log in to your Google account.
The parameters in the cookies section appear only after you upload an image in the Whisk interface. On the Whisk page, press F12 to open the developer tools, select the Network tab, and filter by Fetch/XHR. Then click the Upload button (the red box in the screenshot) and upload an image so that the requests carrying the cookies are recorded.
Still in the developer tools, locate the session entry in the request list. Click on it and open the Preview tab to find the remaining parameters (user, expires, and access_token).
Add these parameters into the google.json configuration file. Here's an example configuration (with ......... representing omitted sections):
{
  "user": {
    "name": "Hk Wang",
    "email": "knner.wang@gmail.com",
    "image": "https://lh3.googleusercontent.com/a/A..........................................Th1tURSXb-ZbCJXKkZrY=s96-c"
  },
  "expires": "2025-02-13T00:21:13.645Z",
  "access_token": "ya29.a0A....................................................................f_ApeVpDQJR-g0434",
  "cookies": [
    { "name": "_ga", "value": "GA1.1.58..............3.1736812846" },
    { "name": "EMAIL", "value": "%22knner.wang%40gmail.com%22" },
    { "name": "__Secure-next-auth.session-token", "value": "eyJh....................................................................48RPQpiriSIZrtZsk3Y.b2eYmQTazsKIkQG9Ltq0FQ" },
    { "name": "__Host-next-auth.csrf-token", "value": "c08....................................................................a4d7edaee19c8b" }
  ]
}
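Two details of the example are easy to get wrong when copying by hand: the EMAIL cookie is the address wrapped in double quotes and percent-encoded, and `expires` is an ISO 8601 UTC timestamp after which the values presumably need to be refreshed. The sketch below illustrates both with the standard library only; the helper names are hypothetical and are not part of the plugin.

```python
from datetime import datetime, timezone
from urllib.parse import quote


def encode_email_cookie(email: str) -> str:
    """Wrap the address in double quotes and percent-encode it, as in the example value."""
    return quote(f'"{email}"', safe="")


def is_expired(expires: str, now: datetime) -> bool:
    """Compare the config's `expires` timestamp with `now` (must be timezone-aware)."""
    # Older Python versions reject a trailing "Z", so rewrite it as an explicit UTC offset.
    expiry = datetime.fromisoformat(expires.replace("Z", "+00:00"))
    return now >= expiry


if __name__ == "__main__":
    print(encode_email_cookie("knner.wang@gmail.com"))  # %22knner.wang%40gmail.com%22
    print(is_expired("2025-02-13T00:21:13.645Z", datetime.now(timezone.utc)))
```

If `is_expired` returns `True` for your own configuration, repeating the browser steps above to copy fresh values is the likely fix.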
Once the configuration is complete, restart ComfyUI to apply the changes.
Avoid excessive usage: Overusing Google Whisk may trigger security measures, potentially leading to account suspension.
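If you drive the nodes from a script or a batch queue, one simple way to stay conservative is to enforce a minimum pause between generations. The class below is a generic sketch and not a feature of the plugin. The 30-second interval is an arbitrary assumption, since Google does not publish a safe request rate for Whisk.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Block until at least `min_interval` seconds have passed since the previous call."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time at which the previous call was released

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay


if __name__ == "__main__":
    throttle = MinIntervalThrottle(min_interval=30.0)  # assumed value, tune to taste
    for prompt in ["first prompt", "second prompt"]:
        throttle.wait()
        print("would queue generation for:", prompt)
```

The clock and sleep functions are injectable so the pacing logic can be checked without actually waiting.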
Inputs:
You can upload three images as inputs, or choose to upload only one or two of them:
Subject: The main object or subject image.
Scene: The background or scene image.
Style: The style reference image.
Additionally, you need to provide a prompt (text description).
Parameters:
Select the number of images to generate at once. The default is 3 images.
Optionally, you can set a fixed seed or use a random seed for generation.
Outputs:
generated_images: All generated images.
subject_prompt: The inferred prompt based on the subject image.
scene_prompt: The inferred prompt based on the scene image.
style_prompt: The inferred prompt based on the style image.
Workflow:
The workflow is simple and consists of the following steps:
Upload the input images (subject, scene, style) as needed.
Add a descriptive prompt to guide the generation.
Adjust parameters (number of images, seed settings) if necessary.
Run the process to generate images.
This streamlined process ensures ease of use while delivering high-quality results.
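The inputs and parameters described above can be summarised as a small data model. This is only an illustrative sketch: `WhiskInputs` and its field names are hypothetical and do not mirror the plugin's actual node code, and the 32-bit seed range is an assumption.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WhiskInputs:
    """Hypothetical container for the Whisk node's inputs and parameters."""

    prompt: str                    # required text description
    subject: Optional[str] = None  # path to the subject image, if provided
    scene: Optional[str] = None    # path to the scene image, if provided
    style: Optional[str] = None    # path to the style image, if provided
    num_images: int = 3            # default stated in the article
    seed: Optional[int] = None     # None means "use a random seed"

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("a text prompt is required")
        if self.num_images < 1:
            raise ValueError("num_images must be at least 1")

    def provided_images(self) -> List[str]:
        """Names of the image slots that were actually filled in."""
        slots = {"subject": self.subject, "scene": self.scene, "style": self.style}
        return [name for name, path in slots.items() if path]

    def resolved_seed(self) -> int:
        """Return the fixed seed, or draw a random one when none was set."""
        if self.seed is not None:
            return self.seed
        return random.randrange(2**32)  # assumed seed range


if __name__ == "__main__":
    request = WhiskInputs(
        prompt="Deadpool holding cosmetics in the forest",
        subject="deadpool.png",  # hypothetical file name
    )
    print(request.provided_images(), request.num_images, request.resolved_seed())
```

Fixing the seed makes a run repeatable in principle, while leaving it unset gives a different result each time.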
Example:
Prompt:
"Deadpool holding cosmetics in the forest"
Generated Images:
Reverse Prompt
Subject:
A stylized illustration of Deadpool, a Marvel Comics character, is presented against a backdrop of a stylized night sky. The character is depicted in a dynamic pose, his body angled slightly forward, fists clenched. His costume is predominantly red and black, with the characteristic Deadpool mask covering his face. The mask is white with black eye holes and a red mouth. The suit features black accents on the chest, shoulders, and legs, along with a brown belt with a circular emblem. Two katana swords are sheathed on his back. The costume has a textured appearance, suggesting a material like spandex or leather. The background features a large, bright yellow-orange circle resembling a stylized moon or sun, with smaller, glowing yellow-orange dots scattered across the dark teal sky. The sky also has areas of darker shading, suggesting clouds. At the top of the image, the text "Hello, friends of the Atomic Community" is written in a stylized, retro font with a yellow-orange color scheme. The text has a slightly distressed or vintage look.
Scene:
A jar of Memoire Bee Restory Plumping Nectar cream sits on a bed of moss amongst several rocks. The jar is light teal and white, with the brand name prominently displayed. Several small flowering plants are visible around the jar, including purple, white, and orange flowers. The background is a softly blurred forest scene, with many green trees and dappled light. The moss is a vibrant green, and the rocks are various shades of gray and brown. The overall lighting is soft and diffused, creating a serene and natural atmosphere.
Style:
A relief sculpture crafted from polymer clay, exhibiting a whimsical, fairytale-like aesthetic. The color palette is vibrant yet soft, featuring warm oranges and yellows contrasted with cool blues and greens. The lighting is soft and diffused, creating a gentle, even illumination across the piece. The style is characterized by a smooth, almost polished texture, with meticulous detailing and a three-dimensional quality that gives the piece depth and texture. The overall rendering is highly realistic in terms of form and texture, despite the fantastical subject matter. The piece has a slightly glossy finish.
You can input any number of images to generate different effects, and it's not necessary to upload all three at once. For example:
Subject Image: Upload just the main image.
Scene Image: Upload only the background or environmental image.
Style Image: Upload a style reference, like an artistic image.
Feel free to test with any combinations to get creative results. The flexibility allows for various unique outputs depending on the combination of images and prompts!
Main steps for ImageFX:
Input a prompt.
Choose whether to fix the seed or leave it random.
Select the number of images to generate.
The workflow is simple, as shown below:
Prompt:
"Clay art, a playful three-dimensional sculpture of a character with purple hair, glasses, and a colorful shirt. The character is grinning, sticking out his tongue, and making a 'peace' sign with his hands. The background is bright red and reads 2025. Colorful fireworks are blooming. The art style combines humor and creativity, capturing a lighthearted mood."
You can use the prompt output from Whisk in ImageFX to generate even more creative results.
The workflow for combining these tools is as follows:
Use the prompt from Whisk.
Input it into ImageFX to generate the output.
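In ComfyUI this hand-off is done by wiring the Whisk node's prompt outputs into the ImageFX node. For readers who prefer to see the idea as code, here is a rough sketch of merging the three inferred prompts into one text prompt. The function name and the simple space-joined concatenation are assumptions for illustration; the plugin may combine the prompts differently.

```python
from typing import Optional


def build_imagefx_prompt(
    subject_prompt: Optional[str] = None,
    scene_prompt: Optional[str] = None,
    style_prompt: Optional[str] = None,
) -> str:
    """Join whichever Whisk prompt outputs are present into a single text prompt."""
    parts = [subject_prompt, scene_prompt, style_prompt]
    # Skip outputs for images that were not uploaded (None or blank strings).
    cleaned = [part.strip() for part in parts if part and part.strip()]
    return " ".join(cleaned)


if __name__ == "__main__":
    prompt = build_imagefx_prompt(
        subject_prompt="A polymer clay relief sculpture of Mario.",
        style_prompt="The piece has a slightly glossy finish.",
    )
    print(prompt)
```

Because empty outputs are skipped, the same helper works whether one, two, or all three reference images were supplied to Whisk.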
Image Prompt:
"mario"
Whisk Output Prompt:
A polymer clay relief sculpture of Mario. The sculpture exhibits a whimsical, fairytale aesthetic. Mario is depicted in mid-stride, arms raised in celebration. His large head is disproportionate to his body. He wears a red cap with a large white "M," a red short-sleeved shirt, blue overalls with yellow buttons, brown shoes, and white gloves. His skin is a light peach tone. His brown hair is neatly styled under the cap, and he has a prominent brown mustache. His eyes are large and blue. The color palette is vibrant yet soft, with warm yellows and oranges in his clothing contrasting with cool blues and greens in the background. The lighting is soft and diffused, creating even illumination. The style is smooth, almost polished, with meticulous detailing and a three-dimensional quality. The overall rendering is highly realistic in form and texture, despite the fantastical subject matter. The piece has a slightly glossy finish. The background is a simple, cool-toned blue-green.
Feel free to experiment with combining Whisk's prompt output with ImageFX for different artistic effects!