2026-04-22

Scorafy eval pipeline stable across two billing months.

Build / Scorafy · scorafy, ai-eval, reliability

The per-respondent Claude pass on Scorafy has now run two full billing months without a regression on the eval suite. Throughput sits at around twelve hundred submissions per minute under peak load on a single Vercel region, with cold starts under one second.

The non-obvious change was running content moderation before the rubric pass rather than after. Moderating first catches the spam-text submissions that would otherwise have consumed a full Claude call. The cost saving from that single reorder is the difference between Scorafy's gross margin and a margin that doesn't work.
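
A minimal Python sketch of that ordering, to make the cost argument concrete. The names `evaluate`, `moderate`, and `score_with_rubric` are hypothetical stand-ins, not Scorafy's actual functions, and the keyword check is only a stub for whatever moderation really runs. The point it illustrates is that a rejected submission returns before the expensive model call is made.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Result:
    rejected: bool
    score: Optional[float] = None


def evaluate(
    text: str,
    moderate: Callable[[str], bool],            # hypothetical: True if the text is acceptable
    score_with_rubric: Callable[[str], float],  # hypothetical: the expensive per-respondent call
) -> Result:
    # Cheap check first: rejected submissions never reach the expensive call.
    if not moderate(text):
        return Result(rejected=True)
    return Result(rejected=False, score=score_with_rubric(text))


# Stubs for illustration only; they record how often the "expensive" step runs.
rubric_calls: List[str] = []


def fake_moderate(text: str) -> bool:
    return "buy now" not in text.lower()


def fake_rubric(text: str) -> float:
    rubric_calls.append(text)
    return 0.5  # placeholder score, not real data


results = [
    evaluate(t, fake_moderate, fake_rubric)
    for t in ["A real answer", "BUY NOW cheap pills"]
]
```

With these stubs, two submissions go in but `rubric_calls` ends up with a single entry, because the spam text is rejected before scoring. Run moderation after the rubric pass instead and both submissions would pay for the call.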

Eval suite covers seventeen question types and four pricing tiers. Regression budget: zero. Two months in, the budget is intact.
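
One way a zero regression budget could be enforced is sketched below in Python. This post does not describe how the suite scores or compares runs, so everything here is an assumption: that every question type is tested at every pricing tier (which would give 17 × 4 = 68 cells), that each cell yields a single numeric score, and that a regression means a score below a stored baseline. The type and tier names and the 0.9 scores are placeholders.

```python
from itertools import product
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (question_type, pricing_tier)


def find_regressions(
    baseline: Dict[Cell, float],
    current: Dict[Cell, float],
) -> List[Cell]:
    """Return every baseline cell whose current score is lower or missing."""
    return [
        cell
        for cell, base in baseline.items()
        if current.get(cell, float("-inf")) < base
    ]


# Placeholder names; the real question types and tiers are not listed in this post.
question_types = [f"qtype_{i}" for i in range(17)]
tiers = [f"tier_{i}" for i in range(4)]
cells = list(product(question_types, tiers))  # 68 cells if every combination is tested

baseline = {cell: 0.9 for cell in cells}  # placeholder scores, not real results
current = dict(baseline)                  # a run that matches the baseline exactly

REGRESSION_BUDGET = 0
regressions = find_regressions(baseline, current)
gate_passes = len(regressions) <= REGRESSION_BUDGET
```

A budget of zero means a single cell dropping below its baseline fails the gate, so there is no averaging that lets one tier's gain hide another tier's loss.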