If you want frontier-level AI without paying frontier prices, Hermes Mixture of Agents is the update to know about. It only just launched, but I’ve run this same panel-of-models pattern for weeks via Fusion and Sakana, and it genuinely delivers.
It runs several models together and fuses them into one stronger answer — a panel of experts beating any single model, with one command to switch on.
Key takeaways
- Hermes Mixture of Agents runs several models in parallel and merges them into one stronger answer.
- It’s a clever way around gated, preview-only models like Fable 5 and GPT-5.6.
- A two-model panel (Opus 4.8 + GPT-5.5) beats either model alone on Hermes Bench — and it’s one command to enable.
How Mixture Of Agents Works
MoA is a virtual model provider. Reference models each answer privately, then an aggregator combines them into the final response your agent uses.
It’s the panel-versus-genius idea: several good models, combined by a sharp chair, tend to beat one model working alone on hard tasks.
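Here is a rough sketch of that flow in Python, to make the pattern concrete. It is not Hermes' actual implementation or API: the `Model` type, the `mixture_of_agents` function, and the prompt wording are all names I made up for illustration, and each "model" is just a function you would swap for a real API call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical stand-in: a "model" is any function from prompt to answer.
Model = Callable[[str], str]


def mixture_of_agents(prompt: str, references: List[Model], aggregator: Model) -> str:
    """Ask every reference model in parallel, then let the aggregator fuse the drafts."""
    if not references:
        raise ValueError("need at least one reference model")

    # Step 1: each reference model answers independently, without seeing the others.
    with ThreadPoolExecutor(max_workers=len(references)) as pool:
        drafts = list(pool.map(lambda model: model(prompt), references))

    # Step 2: the aggregator gets the question plus every draft and writes one answer.
    numbered = "\n".join(f"[{i + 1}] {draft}" for i, draft in enumerate(drafts))
    fused_prompt = (
        f"Question: {prompt}\n\n"
        f"Candidate answers:\n{numbered}\n\n"
        "Write the single best answer."
    )
    return aggregator(fused_prompt)


if __name__ == "__main__":
    # Toy models so the sketch runs without any API keys.
    def cautious(prompt: str) -> str:
        return "Probably yes, with caveats."

    def blunt(prompt: str) -> str:
        return "Yes."

    def chair(fused_prompt: str) -> str:
        # A real aggregator would be a strong model; this one just echoes its input.
        return "Aggregator saw:\n" + fused_prompt

    print(mixture_of_agents("Does a panel help?", [cautious, blunt], chair))
```

The design point is that the agent only ever sees the aggregator's output, which is why the whole thing can be presented as a single provider.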
Why It’s So Useful Now
Frontier models keep getting gated — Fable 5 is partner-only, GPT-5.6 is preview. Getting top-tier access is increasingly hard.
Rather than wait, MoA lets you fuse the models you already have into something stronger. The aim is near-frontier quality without paying for gated frontier access.
Panel Beats Genius: The Numbers
Does it work? On Hermes Bench, an Opus 4.8 aggregator over a GPT-5.5 reference beats either model alone:
- Opus + GPT-5.5 panel (MoA): 0.8202
- Opus 4.8 alone: 0.7607
- GPT-5.5 alone: 0.7412
Combining perspectives genuinely lifts quality on hard tasks: roughly 8% above Opus and 11% above GPT in relative terms (about 6 and 8 points on the 0 to 1 scale), per Hermes’ own benchmark.
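If you want to check those percentages yourself, the arithmetic is a relative lift over each solo score. A minimal sketch using only the three scores listed above (the function name `relative_lift` is mine):

```python
# Scores as reported on Hermes Bench in this post.
PANEL = 0.8202
OPUS_ALONE = 0.7607
GPT_ALONE = 0.7412


def relative_lift(new: float, base: float) -> float:
    """Percentage improvement of `new` over `base`."""
    return (new - base) / base * 100


if __name__ == "__main__":
    print(f"vs Opus 4.8: {relative_lift(PANEL, OPUS_ALONE):.1f}%")  # prints 7.8%
    print(f"vs GPT-5.5:  {relative_lift(PANEL, GPT_ALONE):.1f}%")   # prints 10.7%
```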
How To Turn It On
Setup is genuinely simple:
- Run `hermes update` first
- Run `hermes model` and choose the Mixture of Agents provider
- Pick a preset (or configure your own in `config.yaml`)
- Switch anytime with `/model default --provider moa` or the `/moa` shortcut
It’s provider-agnostic, so you can plug in any models you like.
Stop Chasing The Model, Build The System
Everyone’s waiting on the next model to change everything. But a mix of today’s models already beats each of those models running alone, with no gated access required.
The model is the part you swap; the system is what you own. Build the system instead — that’s the lesson MoA hands you for free.
Where I Run It
I run Mixture of Agents inside my Agent OS, alongside Fusion and Sakana Fugu — three systems on the same panel-of-models idea, all in one dashboard, one click apart.
Which one I use depends on the task. Want the whole stack done for you, with live coaching where I build model panels with you? It’s inside my AI Profit Boardroom (3,800+ operators). New to Hermes? Start free with my AI Money Lab.
The Strongest Preset To Begin With
The best performer on Hermes Bench is an Opus 4.8 aggregator over a GPT-5.5 reference — it beats either model running solo.
And you can mix cheaper models together and still outperform one pricey model alone, which keeps costs down while quality goes up.
How It Compares To Fusion And Sakana Fugu
Mixture of Agents shares its core idea with Fusion and Sakana Fugu: a panel of models reaching near-frontier results together.
Rather than pick one, I run all three inside my Agent OS and switch between them in a click, matching the system to the job.
Is It Worth Setting Up?
If you use AI heavily and keep bumping into a single model’s ceiling, yes. The setup is a few commands and the payoff is consistently better output on hard tasks.
It’s one of those rare updates where a small change in setup produces a real, measurable jump in quality.
Why This Is A Pattern, Not A Trick
Hermes MoA only just dropped, but after weeks of running the same idea via Fusion and Sakana, what I keep noticing is that it’s a repeatable pattern, not a gimmick. In my use, combining a panel of models into one answer reliably beats picking a single model and hoping.
It’s the same reason I’ve built Fusion and Sakana Fugu into my system too. When output quality matters, a panel wins.
The Cost Trade-Off
Running multiple models does use more tokens than one. But you can pair cheaper models and still beat a single premium model on its own.
So the real story isn’t higher cost — it’s better output per pound, with no need for gated, expensive frontier access.
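To reason about that trade-off with your own numbers, you can add up the token cost of every call in the panel and divide the benchmark score by it. This is a back-of-the-envelope sketch: the `Call` class and both helper functions are hypothetical, and the token counts and prices in the demo are placeholders, not real pricing for any model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Call:
    """One model call. Prices are per million tokens, in whatever currency you use."""
    input_tokens: int
    output_tokens: int
    price_in: float
    price_out: float

    def cost(self) -> float:
        return (self.input_tokens * self.price_in
                + self.output_tokens * self.price_out) / 1_000_000


def panel_cost(references: List[Call], aggregator: Call) -> float:
    """Cost of one panel answer: every reference call plus the aggregator call."""
    return sum(call.cost() for call in references) + aggregator.cost()


def score_per_cost(score: float, cost: float) -> float:
    """Benchmark score bought per unit of spend; higher is better."""
    return score / cost


if __name__ == "__main__":
    # Placeholder numbers only. The aggregator's input is larger because it
    # reads the original prompt plus every reference draft.
    cheap_reference = Call(input_tokens=1_000, output_tokens=500, price_in=1.0, price_out=2.0)
    aggregator = Call(input_tokens=2_000, output_tokens=500, price_in=1.0, price_out=2.0)
    total = panel_cost([cheap_reference, cheap_reference], aggregator)
    print(f"panel cost per answer: {total:.4f}")
```

Plug in your measured scores and your providers' real prices to see whether a cheap panel actually wins for your workload.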
FAQ
What is Hermes Mixture of Agents?
A feature that runs several AI models in parallel and merges their answers into one stronger response.
Why use it over one model?
Frontier models are getting gated; a panel of models you already have can beat any single one with no special access.
Does it really beat a single model?
Yes — on Hermes Bench a panel scored 0.82 vs 0.76 for Opus alone.
How do I enable it?
Run hermes update, then hermes model, and pick the Mixture of Agents provider.
Does it cost more?
More tokens for the extra calls, but you can mix cheaper models and still beat one expensive model alone.
The Bottom Line
Hermes Mixture of Agents gives you frontier-level quality from models you already have, with one command to switch it on.
Stop waiting for the next gated model. Build a panel, own the system, and let it outperform the genius working alone.