Evaluations before prompts: the model selection process we run with clients
How we shortlist models for a regulated agent, the evaluation harness we hand the customer on day ninety, and why we ditched the 'try GPT-4 first' default.
Long-form essays from the engineer, designer, or lead who did the work. No vendor slop, no roundups, no sponsored placements.
The nine new success criteria, translated out of spec-speak. What you will be asked to show on your next bid, and the three tests that catch eighty percent of the issues. With a side-by-side mapping to AODA and Section 508.
How we shortlist models for a regulated agent, the evaluation harness we hand the customer on day ninety, and why we ditched the 'try GPT-4 first' default.
The reservation, compute-right-sizing, and storage-tiering moves that quietly halved our biggest client's warehouse spend, and the one Terraform module we open-sourced along the way.
After forty-plus municipal and provincial submissions, the same five scoring questions trip vendors up. Here is the pattern, and a page-one rewrite that pulled an eighty-three from a sixty-six.
Three tokens, two governance principles, one naming convention. The 'boring' choices that let a design system outlast the VP who approved it.
Compensation is necessary, not sufficient. The staffing, project rotation, and learning-budget practices that keep senior engineers engaged past year three.
When RAG is the right answer, when it is not, and the hybrid architecture we ship to clients who want both a grounded tone and a learned one.
A linter that catches keyboard traps and color-contrast failures before a designer hits export. How it plugs into Figma, Storybook, and our CI.
Expand-contract, dual-write, backfill, cutover, cleanup. The canonical five-step recipe we use, with the exact commit sequence that worked on a fifty-million-row table.
Residency, encryption, identity, and audit. We walked our auditor through the architecture; here is what they pushed back on, and what they signed.
We send one well-shaped essay every two weeks. No tracking pixels, no marketing copy, easy to leave.
Wouessi is a Canadian Aboriginal and Minority Supplier Council (CAMSC) certified business. Our certification supports the supplier diversity programs of Canadian banks, government departments, and enterprise buyers in North America.