Short direct answers first:
- There is no prompt trick that turns num_outputs=4 into “4 different objects, one per output” with Replicate’s FLUX-Schnell. The model gets one text condition and simply samples it multiple times.
- For “four different objects, one per icon, consistent style”, the correct approach is four separate prompts. You can make it fast by running the four Replicate calls in parallel instead of sequentially.
Now context and detail.
1. What num_outputs really does on Replicate FLUX-Schnell
Check the FLUX-Schnell API schema on Replicate:
prompt: string – “Prompt for generated image”
num_outputs: integer 1–4 – “Number of outputs to generate” (Replicate)
There is no prompts[] field. No list. No per-output prompt.
Internally, each output is just another random sample conditioned on the same prompt text. So the system behavior is:
“Generate 4 different attempts at the same image description.”
It is not:
“Prompt for image 1, prompt for image 2, prompt for image 3, prompt for image 4.”
If your prompt says:
“A set of 4 flat icons: a cat, a dog, a car and a tree, each icon with exactly one object.”
then every sample sees that entire sentence. There is no way for the model to know “I am sample 2, I should draw the dog only.”
That is why you see:
- Icons mixing multiple objects in one image.
- The “one object per icon” instruction being violated when it conflicts with the “set of 4 objects” wording.
This is baked into the API shape.
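As a mental model, this API shape can be sketched in a few lines. This is a conceptual sketch of the schema’s behavior, not Replicate’s actual implementation:

```javascript
// Conceptual sketch only: models the API shape, not Replicate's internals.
// Every output is sampled from the SAME prompt; there is no per-output prompt.
function conceptualFluxSchnell({ prompt, num_outputs = 1 }) {
  return Array.from({ length: num_outputs }, (_, i) => ({
    sampleIndex: i,
    conditioning: prompt, // identical text condition for every sample
  }));
}

const outputs = conceptualFluxSchnell({
  prompt: "a set of 4 flat icons: a cat, a dog, a car and a tree",
  num_outputs: 4,
});
// All four samples see the full sentence; none of them "knows" it is the dog icon.
```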
2. Why prompt engineering cannot reliably fix object routing
You are asking the model to do two things at once:
- Global description across images:
“There are four icons, with four different objects.”
- Per-image constraint:
“Each output corresponds to a specific object, and only that object.”
But FLUX-Schnell on Replicate only has a global conditioning. There is no per-image conditioning channel.
Prompt engineering can:
- Encourage “one isolated object, clean background”.
- Encourage a consistent style.
Prompt engineering cannot:
- Encode “output index → object ID” mapping.
- Decide that output #1 must be “cat” and output #2 must be “dog”.
So:
- You can get better single-object behavior by narrowing the prompt.
- You cannot get guaranteed “one distinct object per output” with a single prompt and num_outputs=4.
Any hack that “seems to work” (e.g., listing objects in a special order, using numbering) will break unpredictably and is not suitable for a production icon generator.
3. When a single prompt is fine: same object, multiple variants
There is exactly one scenario where num_outputs=4 is appropriate for icons:
Four variations of the same object in the same style.
Example prompt:
flat minimal 2D icon of a single cat,
isolated and centered,
plain light background,
no other objects, no text, bold outline, vector style
Call:
{
  "prompt": "flat minimal 2D icon of a single cat, ...",
  "num_outputs": 4,
  "aspect_ratio": "1:1"
}
You will get 4 different “single cat icon” samples. Still not mathematically guaranteed to be perfect, but generally close.
Key differences:
- The prompt describes one subject, not an icon set.
- All images should show that subject; variation is only in pose/details.
The moment you describe multiple different objects in one prompt, you lose control over which output shows which object.
4. Correct design for your use case: one prompt per icon
Your goal:
- 4 images.
- Each image = one specific object.
- Shared style.
- Plain background.
For this, the clean design is:
4.1. Fix a shared style template
Write a short style block that you reuse:
flat minimal 2D icon, single centered object,
simple geometric shapes, bold outline,
plain light solid background,
no text, no frame, no extra objects, vector style
This is your style template.
4.2. Compose one prompt per object
For each icon:
- Icon 1:
icon of a cat, [STYLE TEMPLATE]
- Icon 2:
icon of a dog, [STYLE TEMPLATE]
- Icon 3:
icon of a car, [STYLE TEMPLATE]
- Icon 4:
icon of a tree, [STYLE TEMPLATE]
So each Replicate call has:
- Only one object name in the prompt.
- The same style text as all other icons.
This gives:
- Strong per-image semantic control (no object mixing).
- Very consistent style across icons.
This is exactly how you are using GPT now. The only change I would make is to keep GPT limited to object phrasing and keep the style block fixed in your own code, to avoid style drift.
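The composition step above can be sketched as a small pure function. STYLE_TEMPLATE and the helper name are illustrative choices here, not part of any API:

```javascript
// Sketch: keep the style block fixed in code and let only the object name vary.
const STYLE_TEMPLATE =
  "flat minimal 2D icon, single centered object, simple geometric shapes, " +
  "bold outline, plain light solid background, no text, no frame, " +
  "no extra objects, vector style";

// One prompt per object: exactly one object name, identical style text.
function buildIconPrompts(objects, styleTemplate = STYLE_TEMPLATE) {
  return objects.map((obj) => `icon of a ${obj}, ${styleTemplate}`);
}

const prompts = buildIconPrompts(["cat", "dog", "car", "tree"]);
```

Keeping the template as a constant in your own code (rather than letting GPT rewrite it per request) is what prevents style drift across the four icons.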
5. Making it fast: parallel calls vs sequential calls
Right now you do:
- Call FLUX for icon 1 → wait.
- Call FLUX for icon 2 → wait.
- Call FLUX for icon 3 → wait.
- Call FLUX for icon 4 → wait.
Total latency ≈ 4 × single-call latency.
You do not need to serialize them. Replicate’s own examples show simple async calls where you can fire multiple predictions and then consume all results.
Better pattern:
- Start 4 predictions in parallel.
- Wait for all 4 to finish.
In Node / TypeScript this is typically:
import Replicate from "replicate";

// Reads REPLICATE_API_TOKEN from the environment by default.
const replicate = new Replicate();

// `objects` and STYLE_TEMPLATE come from the composition step above.
const inputs = objects.map((obj) => ({
  prompt: `icon of a ${obj}, ${STYLE_TEMPLATE}`,
  aspect_ratio: "1:1",
  num_outputs: 1,
}));

// Fire all four predictions at once and wait for them together.
const predictions = await Promise.all(
  inputs.map((input) =>
    replicate.run("black-forest-labs/flux-schnell", { input })
  )
);

// predictions[0]..predictions[3] are your 4 icon images
Effect:
- Compute cost is still roughly 4 images.
- Wall-clock time is about “one FLUX call” rather than four.
This keeps the “correct” modeling (one prompt per icon) and solves your performance complaint at the engineering layer.
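One refinement worth considering: Promise.all rejects as soon as any single prediction fails, discarding the icons that succeeded. If you would rather keep the good results and retry only the failure, Promise.allSettled is a better fit. In this sketch, runPrediction is a hypothetical stand-in for your actual replicate.run call:

```javascript
// Sketch: parallel generation that tolerates individual failures.
// `runPrediction` is an assumed stand-in for a real replicate.run call.
async function generateIcons(prompts, runPrediction) {
  const settled = await Promise.allSettled(
    prompts.map((prompt) => runPrediction(prompt))
  );
  // Pair each outcome back with its prompt so failures are identifiable.
  return settled.map((result, i) =>
    result.status === "fulfilled"
      ? { prompt: prompts[i], image: result.value }
      : { prompt: prompts[i], error: result.reason }
  );
}
```

Callers can then retry only the entries carrying an error, instead of regenerating the whole set of four.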
6. What if you later move off Replicate?
If you ever run FLUX yourself (Hugging Face Diffusers, Triton, etc.), the batching story becomes better.
Diffusers’ FluxPipeline supports:
prompt: str | List[str]
num_images_per_prompt: int (Hugging Face)
So you can do:
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

prompts = [
    f"icon of a cat, {STYLE_TEMPLATE}",
    f"icon of a dog, {STYLE_TEMPLATE}",
    f"icon of a car, {STYLE_TEMPLATE}",
    f"icon of a tree, {STYLE_TEMPLATE}",
]

images = pipe(
    prompt=prompts,
    num_images_per_prompt=1,
    num_inference_steps=4,  # Schnell is distilled for few-step sampling
    guidance_scale=0.0,     # Schnell requires guidance_scale=0
).images
Now you truly get:
- 4 prompts in.
- 4 images out.
- One model forward pass.
This is exactly the multi-prompt routing that Replicate’s num_outputs does not provide.
7. Direct answers to your two questions
1. “Is there a prompt technique to force one distinct object per output?”
For Replicate’s FLUX-Schnell:
- For different objects in each of the 4 outputs: No. There is no reliable prompt-only method. The API exposes only a single prompt, and num_outputs is “number of random samples of that same prompt”, as the schema shows. (Replicate)
- For multiple variants of the same object:
Yes, you can usually get “one object per icon” by:
- Describing only one object in the prompt.
- Being explicit: “single [OBJECT], isolated, centered, plain solid background, no other objects, no text”.
But this does not solve “cat + dog + car + tree” in one call.
2. “Or is generating icons individually simply the correct approach for this use case?”
Yes. You should:
- Run the four prompts (one per object) as separate predictions.
- Parallelize them to keep latency low.
If you later control the model yourself (Diffusers), switch to:
- Batched prompts (prompt: List[str], num_images_per_prompt=1) instead of multiple API calls.
Quick recap
- Replicate FLUX-Schnell: one text prompt → N samples. num_outputs=4 does not understand “icon #1..#4”. (Replicate)
- You cannot reliably map “4 different object names” to 4 outputs with one prompt. Prompt engineering cannot create routing that the API does not expose.
- For 4 different icons:
- Use 4 prompts: “icon of a [object], [fixed style]”.
- Call the model 4 times, but in parallel.
- If you later run FLUX via Diffusers, use batched prompts with prompt: List[str] and num_images_per_prompt=1 to get real multi-prompt, single-pass behavior.