🚀 From Dense to MoE: Next-Gen n8n Workflow Generator
I'm excited to share our second-generation n8n workflow generator model!
After releasing the Qwen2.5-Coder-14B model three days ago, we've taken a massive leap forward with the Qwen3-Coder-30B-A3B-n8n-Workflow-Generator - a Mixture-of-Experts (MoE) model that delivers both higher quality and much faster inference.
💡 What Makes This Special?
• 30B total parameters with only ~3.3B active per token
• MoE architecture for smarter expert routing
• 75-80 tokens/second on Mac M4 (MLX Q4)
• Complete workflows generated in ~15 seconds
• Better quality than dense models, faster than you'd expect
Why did we move from Dense (14B) to MoE (30B)? Simple - instead of running every parameter on every token, a router activates only the specialized "expert" networks each token needs. Think of it as having 30B parameters' worth of knowledge while computing with only ~3.3B at a time. This means:
✅ 30B model quality
✅ ~3.3B model speed
✅ Best of both worlds
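The routing idea above can be sketched in a few lines. This is a toy illustration (expert count, top-k value, and the bias-vector "experts" are all simplifications I chose, not the model's real architecture): a router scores every expert per token, but only the top-k actually compute.

```python
import math
import random

random.seed(0)

N_EXPERTS, TOP_K, DIM = 8, 2, 4

# Toy experts: each is just a bias vector (real experts are feed-forward nets).
experts = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_EXPERTS)]
# Router: one scoring vector per expert.
router = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_EXPERTS)]

def moe_forward(x):
    """Score all experts, run only the top-k, softmax-mix their outputs."""
    scores = [sum(w * xi for w, xi in zip(r, x)) for r in router]
    topk = sorted(range(N_EXPERTS), key=scores.__getitem__)[-TOP_K:]
    z = [math.exp(scores[i]) for i in topk]
    total = sum(z)
    weights = [v / total for v in z]
    out = [0.0] * DIM
    for w, i in zip(weights, topk):        # only TOP_K experts do any compute
        for d in range(DIM):
            out[d] += w * (x[d] + experts[i][d])
    return out, topk

out, used = moe_forward([0.5, -1.0, 0.2, 0.8])
active_fraction = TOP_K / N_EXPERTS  # 0.25 here; ~3.3/30 ≈ 0.11 in the model
```

The per-token cost scales with the experts that fire, not the full parameter count - that's the whole trick.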
🛠️ Technical Details:
• Base: Qwen3-Coder-30B-A3B-Instruct
• Fine-tuned with QLoRA on 2,308+ workflow templates
• 8192 token context window
• Available in Transformers, MLX Q4, and LoRA formats
📊 Performance on Mac M4 Pro (64GB):
• Inference: 75-80 tok/s

🎯 What It Handles:
• Complex multi-node workflows
• AI agent integrations
• Structured data extraction
• API workflow generation
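A quick back-of-envelope check on the numbers above (the ~1,200-token workflow length is my assumption, not a measured figure):

```python
TOK_PER_SEC = (75 + 80) / 2     # measured range from the post -> 77.5
WORKFLOW_TOKENS = 1200          # assumed size of one generated workflow

seconds = WORKFLOW_TOKENS / TOK_PER_SEC
print(f"~{seconds:.0f} s per workflow")  # prints "~15 s per workflow"
```

At that throughput, a workflow of roughly a thousand tokens lands right in the ~15-second range claimed above.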
The model handles everything from simple RSS monitoring to complex AI agent workflows with multiple decision points.
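For readers new to n8n: a generated workflow is a JSON document of nodes plus the connections between them. Here's a minimal sketch of that shape - the node names, types, and parameters below are illustrative assumptions, not actual model output.

```python
import json

# Hypothetical RSS-monitoring workflow; node types/parameters are assumptions.
workflow = {
    "name": "RSS monitor demo",
    "nodes": [
        {
            "name": "RSS Read",
            "type": "n8n-nodes-base.rssFeedRead",
            "position": [250, 300],
            "parameters": {"url": "https://example.com/feed"},
        },
        {
            "name": "Route Item",
            "type": "n8n-nodes-base.if",
            "position": [500, 300],
            "parameters": {},
        },
    ],
    # connections map a source node to the inputs of downstream nodes
    "connections": {
        "RSS Read": {"main": [[{"node": "Route Item", "type": "main", "index": 0}]]}
    },
}

serialized = json.dumps(workflow, indent=2)
```

The model's job is to emit valid JSON in this shape, with the right node graph for the requested automation.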
💬 Available Now:
Would love to hear what workflows you build with this! The jump from dense to MoE has been a game-changer for us.
Mehmet Akgün