Capability / 01

Foundation models

Fine-tuning, LoRA adapters, RLHF.

We adapt open-weights and frontier models to specific domains without spending a pretraining-scale budget. The point is rarely a bigger model; it's the right model, post-trained on the right data and evaluated against the right benchmark.

§ 01

What we use, and when

Closed frontier models (Anthropic, OpenAI, Google) when reasoning depth matters more than latency and the use case is general. Open-weights models (Llama, Qwen, Mistral, DeepSeek) when we need full control of inference cost, data residency, or post-training. The decision is rarely religious: it follows the eval and the unit economics.
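
As a rough sketch of what following the eval and the unit economics can look like, the snippet below picks the cheapest candidate that clears a task-specific eval bar. The `Candidate` fields, the `pick_model` helper, the model names, and every number are hypothetical placeholders, not measured results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str
    eval_score: float            # score on the task-specific benchmark, 0..1
    cost_per_1k_requests: float  # all-in serving cost in USD (placeholder values)


def pick_model(candidates: list[Candidate], min_eval_score: float) -> Optional[Candidate]:
    """Return the cheapest candidate that clears the eval bar, or None if none does."""
    passing = [c for c in candidates if c.eval_score >= min_eval_score]
    return min(passing, key=lambda c: c.cost_per_1k_requests, default=None)


# Hypothetical numbers, for illustration only.
candidates = [
    Candidate("frontier-api", eval_score=0.92, cost_per_1k_requests=12.0),
    Candidate("open-weights-base", eval_score=0.78, cost_per_1k_requests=1.5),
    Candidate("open-weights-lora", eval_score=0.90, cost_per_1k_requests=2.0),
]

choice = pick_model(candidates, min_eval_score=0.88)
print(choice.name if choice else "no candidate clears the bar")  # open-weights-lora
```

The eval bar comes first and cost second: a cheaper model that misses the bar is never selected, and if nothing passes the function returns `None` instead of quietly lowering the threshold.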

§ 02

Techniques in the toolbox

01

Full fine-tune

When the domain shift is real and the data volume justifies the GPU bill.

02

LoRA / QLoRA

Parameter-efficient adapters. Cheap to train, cheap to swap, cheap to ship.

03

RLHF / DPO

Preference optimization for style, safety, and judgment-call decisions.

04

Distillation

Compress a frontier model's behavior into a smaller, faster, deployable model.

05

Constrained decoding

JSON mode, regex grammars, and schema-constrained generation when structure matters.

06

Tool-augmented prompting

Sometimes the right answer is a smaller model with better tools, not a bigger model.
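
Two of the techniques above reduce to short formulas, sketched here in plain Python. The first counts the trainable parameters a rank-r LoRA adapter adds to a single weight matrix, r × (d_out + d_in), against the d_out × d_in weights a full fine-tune would update. The second is the standard DPO loss for one preference pair, computed from sequence log-probabilities. The function names, the 4096-wide layer, rank 8, and beta = 0.1 are illustrative assumptions, not settings from any particular project.

```python
import math


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single (chosen, rejected) pair: -log(sigmoid(beta * margin))."""
    # How much more the policy prefers the chosen answer than the reference does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), split by sign to avoid overflow in exp().
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# One hypothetical 4096 x 4096 projection with a rank-8 adapter.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, rank=8)
print(full // lora)  # 256: the adapter trains 1/256 of this layer's weights

# When the policy equals the reference, the margin is 0 and the loss is log(2).
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # 0.6931
```

The loss falls below log(2) once the policy favors the chosen response more strongly than the reference model does, which is the signal preference optimization trains on.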

§ 03

What this gets you

5–20×
Cost reduction vs. frontier defaults
+
Eval gains documented per technique
Open
Weights you actually own
