Prompt Spray — Frequently Asked Questions

Common questions about multi-prompt strategy — when to spray, how many variations to test, and managing costs.

Your Questions Answered ❓


Isn't spraying just a waste of time and money?

No, it is the opposite. Iterating on a mediocre single-model output through 8 rounds of refinement almost always takes longer than the 5 minutes needed to spray across 3 models and pick the best result. On subscription plans, additional prompts cost nothing extra as long as you stay within the plan's usage limits. On API plans, a 3-model spray costs roughly $0.03-$0.15, which is cheaper than spending 30 minutes fixing a suboptimal single-shot output.
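To sanity-check the API cost for your own setup, the arithmetic can be sketched as below. This is a rough estimate only: the model names and per-token prices are made-up placeholders, not real provider rates, and the function name `spray_cost` is hypothetical. Substitute the current prices from each provider's pricing page.

```python
def spray_cost(prices, input_tokens, output_tokens):
    """Estimate the USD cost of sending one prompt to every model in `prices`.

    prices maps a model name to (USD per 1M input tokens, USD per 1M output tokens).
    """
    total = 0.0
    for in_price, out_price in prices.values():
        total += input_tokens * in_price / 1_000_000
        total += output_tokens * out_price / 1_000_000
    return total


# PLACEHOLDER prices for illustration only; these are not real provider rates.
PLACEHOLDER_PRICES = {
    "model-a": (2.0, 10.0),
    "model-b": (3.0, 15.0),
    "model-c": (1.0, 5.0),
}

cost = spray_cost(PLACEHOLDER_PRICES, input_tokens=500, output_tokens=1000)
print(f"${cost:.3f}")  # $0.033 with these placeholder numbers
```

Output tokens usually dominate the bill, so long generations push a spray toward the upper end of whatever range applies to your models.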


How many variations should I test?

For most tasks, 3 is the sweet spot: either 3 prompt variations on one model, or 1 prompt across 3 models. For high-stakes content (product launches, client deliverables, published articles), the full 3x3 grid of 9 variations (3 prompts x 3 models) is worth the extra 5 minutes. Beyond 9 total variations, non-research tasks hit diminishing returns.
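The 3x3 grid is simply every prompt paired with every model. A minimal sketch, assuming placeholder model names and example prompts that are not from any real project:

```python
from itertools import product


def build_spray_matrix(prompts, models):
    """Return every (model, prompt) pair that a full spray would run."""
    return [(model, prompt) for prompt, model in product(prompts, models)]


# Illustrative prompt variations and placeholder model names.
prompts = [
    "Write a launch announcement for our new product.",
    "Write a launch announcement for our new product in a playful tone.",
    "Write a launch announcement for our new product as three short bullets.",
]
models = ["model-a", "model-b", "model-c"]

matrix = build_spray_matrix(prompts, models)
print(len(matrix))  # 9
```

Passing a single prompt or a single model gives the smaller 3-variation sprays described above.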


Which AI models should I include in my spray?

At minimum, include one model from OpenAI (ChatGPT/GPT-4o), one from Anthropic (Claude), and one from Google (Gemini). These three are built and trained differently enough that their outputs meaningfully differ. Adding Perplexity is worthwhile when factual accuracy matters, since it grounds responses in search results. Adding an open-source model (Llama, Mistral) is useful for privacy-sensitive tasks, because you can run it on your own hardware and keep prompts off third-party servers.


Does spray strategy work for code generation?

Extremely well. Different models tend to have different coding strengths: ChatGPT tends to produce more verbose but well-commented code, Claude tends to produce cleaner architecture, and Gemini tends to handle data processing well. Spraying a coding task across all three and picking the best approach (or combining ideas from multiple outputs) usually beats single-model coding.


When is single-shot actually fine?

For simple, low-stakes tasks: quick questions, basic formatting, casual email drafts, simple calculations. If the cost of a suboptimal output is minimal, spraying is overkill. Save spray strategy for work that matters — client deliverables, published content, production code, strategic decisions.


Can I automate the spray process?

Yes. Developers use LiteLLM or OpenRouter to send the same prompt to multiple models through one unified API, instead of integrating each provider separately. Non-developers can use ChatHub to do the same thing in a browser without code. The most advanced teams use PromptFoo to automate spray testing and evaluation as part of their development pipeline.
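For developers, the core of an automated spray is a fan-out: one prompt goes in, one output per model comes back. The sketch below uses only the standard library, with stand-in functions in place of real model calls so it runs without API keys. The `spray` helper and the model names are hypothetical. In practice each callable would wrap whichever client you use (for example a LiteLLM or OpenRouter request), which is an assumption about your setup and not something shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical type: a function that takes a prompt and returns the model's text.
ModelFn = Callable[[str], str]


def spray(prompt: str, models: Dict[str, ModelFn]) -> Dict[str, str]:
    """Send one prompt to every model in parallel and collect outputs by name."""

    def safe_call(fn: ModelFn) -> str:
        # One failing provider should not sink the whole spray.
        try:
            return fn(prompt)
        except Exception as exc:
            return f"[error: {exc}]"

    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {name: pool.submit(safe_call, fn) for name, fn in models.items()}
        return {name: future.result() for name, future in futures.items()}


# Stand-ins for real API calls, used only to make the sketch runnable.
def fake_model(label: str) -> ModelFn:
    return lambda prompt: f"{label}: {prompt.upper()}"


def broken_model(prompt: str) -> str:
    raise RuntimeError("rate limited")


results = spray(
    "hello",
    {"model-a": fake_model("A"), "model-b": fake_model("B"), "model-c": broken_model},
)
for name, text in results.items():
    print(name, "->", text)
```

Running the calls in threads keeps total wait time close to the slowest single model, since each call spends most of its time waiting on the network.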


How do I evaluate which output wins?

Score each output from 1 to 5 on four dimensions (accuracy, relevance, quality, and usefulness) for a total out of 20. The winner is usually obvious within 30 seconds. For subjective tasks (creative writing, tone), gut feeling after reading all outputs is surprisingly reliable: your brain is a good comparator even when you cannot articulate why one output is better.
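If you want to keep a record of the scoring, the rubric above can be sketched as a small scorecard. The scores and model names below are invented for illustration, and `total_score` and `pick_winner` are hypothetical helper names.

```python
DIMENSIONS = ("accuracy", "relevance", "quality", "usefulness")


def total_score(scores):
    """Sum the four 1-5 ratings into a total out of 20."""
    for dim in DIMENSIONS:
        if not 1 <= scores[dim] <= 5:
            raise ValueError(f"{dim} must be between 1 and 5, got {scores[dim]}")
    return sum(scores[dim] for dim in DIMENSIONS)


def pick_winner(scorecards):
    """Return the name of the output with the highest total score."""
    return max(scorecards, key=lambda name: total_score(scorecards[name]))


# Illustrative ratings only, not measured results.
scorecards = {
    "model-a": {"accuracy": 4, "relevance": 5, "quality": 3, "usefulness": 4},
    "model-b": {"accuracy": 5, "relevance": 4, "quality": 5, "usefulness": 4},
    "model-c": {"accuracy": 3, "relevance": 4, "quality": 4, "usefulness": 3},
}

winner = pick_winner(scorecards)
print(winner, total_score(scorecards[winner]))  # model-b 18
```

On a tie, `max` returns the first entry with the top score, so a human read-through is still the tiebreaker.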