The fractional AI systems lead you hire before you hire one.

I embed with ops-heavy SaaS teams and build the production AI systems your engineers will actually run. A decade at Disney, Apple, and a long list of startups before that. One engagement at a time.

Currently embedded with MixShift.

See how I work →

FIELD NOTES · ISSUE 014

When AI evals replace QA, and when they don't

Three weeks in, the eval harness caught what QA missed: cases users care about that look fine on the surface.

Berkeley, CA · 06:14

~ timurtek/mixshift on main

$ claude run intake-evals

› 247 conversations parsed

› guardrails: passed

› flagged for review: 9

CURRENTLY SHIPPING · APR 2026

Onboarding ops pipeline · 3 of 5 evals shipped

Intake parsing · guardrails · review queue. Two more before the handoff doc.

MixShift · embedded since Jan 2025

Active engagement

Currently embedded with MixShift.

Jan 2025 to present

The AI layer on top of an analytics platform agencies actually use.

Building AI decision-support across MixShift’s Amazon analytics platform: onboarding ops, eval infrastructure, and the operator-grade interfaces that turn the model output into something seller and advertising teams open every day. The case study lands when the engagement wraps.

See the rest of the work

How I work

Three commitments that shape every engagement.

Embed

Embedded, not extracted.

Long engagements, inside your team, on your tools. Not a vendor who emails PDFs. I join your Slack, your standups, your codebase, and your on-call rotation if it matters. I stay long enough to make the systems ship, and long enough to hand them off cleanly.

Build

Operator-grade interfaces.

Every AI system your team uses has to be operable by humans. A decade of design engineering at Disney, Apple, and a long list of startups before that taught me that information architecture matters as much as model selection. Dashboards, trace viewers, eval tools your team will actually open.

Hand off

Ship, then hand off.

Six-month engagements end with your team owning the system. Code, prompts, infrastructure, runbooks. All yours. No lock-in, no retainer-for-life. If the engagement extends, it extends because it's delivering. If it doesn't, you're not stuck.

Thinking

Recent writing.

All essays

Tools & Stack

The repos that turn Claude Code into an agent stack

A verified, leveled list of the repos that turn Claude Code into an agent stack: skills, browser hands, planning discipline, and the ones worth skipping.

4 min read · Jul 24, 2026

Tools & Stack

The one gate I will not automate

Full automation is not the finish line. Why one deliberately placed human gate beats an unattended agent pipeline, and where that gate belongs in yours.

4 min read · Jul 20, 2026

Founder Lessons

Build a swipe file that does not rot

How to build a swipe file that stays usable: clip posts into a vault, download the image the day you save it, and let a routine write structured craft notes.

4 min read · Jul 17, 2026

The Production Layer.

Field notes from a rebuild: what I learned redesigning my own site with Claude in the loop.