Replaced reasoning with rules: automotive triage went from 11 seconds to 280 milliseconds.

Reviewed an automotive incident triage prompt that was costing eleven seconds per decision on average. The prompt was asking the model to both decide and explain - and the deciding was where the cost lived.

Pulled the reasoning out of the model and into a twelve-rule classifier in front of it. The classifier decides; the model still writes the explanation. Latency landed at 280ms average, p99 at 410ms. Accuracy held within the original confidence interval across the four-thousand-case regression set.
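
A minimal sketch of that split, to make the shape concrete. Everything named here is an assumption for illustration: the `Incident` fields, the three placeholder rules, the decision labels, and the `explain` callable standing in for the model call. The actual twelve rules are not reproduced.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Incident:
    # Hypothetical fields; the real triage inputs are not given in this post.
    injuries_reported: bool
    airbag_deployed: bool
    vehicle_drivable: bool


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Incident], bool]
    decision: str


# Ordered rules, first match wins. Three placeholders stand in for the twelve.
RULES: List[Rule] = [
    Rule("injury", lambda i: i.injuries_reported, "escalate"),
    Rule("airbag", lambda i: i.airbag_deployed, "escalate"),
    Rule("not_drivable", lambda i: not i.vehicle_drivable, "tow"),
]
DEFAULT_DECISION = "self_service"  # hypothetical fallback label


def classify(incident: Incident) -> Tuple[str, str]:
    """Return (decision, name of the rule that fired) without calling a model."""
    for rule in RULES:
        if rule.matches(incident):
            return rule.decision, rule.name
    return DEFAULT_DECISION, "default"


def triage(incident: Incident, explain: Callable[[str], str]) -> Tuple[str, str]:
    """Decide with rules, then ask the model only to explain the fixed decision."""
    decision, rule_name = classify(incident)
    prompt = (
        f"The triage decision is '{decision}' because rule '{rule_name}' fired. "
        f"Explain this decision for the incident: {incident}"
    )
    return decision, explain(prompt)
```

Passing `explain` in as a plain callable keeps the decision path testable without a model in the loop, and the explanation can no longer change the decision because the decision is fixed before the model is called.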

The lesson isn't anti-model. It's that "model decides" and "model explains" are two different jobs, and conflating them was costing roughly thirty-nine times the latency (11 seconds against 280 milliseconds). The model is excellent at the second job. It was being asked to do the first one as a side effect.

Not every problem wants a model. Some want a rule. The hard part is knowing which is which on first read.