cat about.txt

I'm EL. I work at the intersection of intelligence, simulation, and silicon.

Professionally, I consult on LLM applications and AI agent architectures — helping teams cut through the hype and ship systems that actually work at scale. I've developed strong opinions about what the field consistently gets wrong.

When I'm not in AI-land: building atmospheric worlds as an indie UE5 developer, and doing things with microcontrollers and embedded systems that probably don't need doing.

This site is where I think out loud.

ls -la lab/

Active experiments. Updated irregularly.

AI ● active

Multi-Agent Memory Patterns

Testing episodic vs semantic memory in long-running agents. Episodic recall degrades predictably; semantic compression has more interesting failure modes.
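A minimal sketch of the distinction under test — class names and structure are illustrative, not the actual experiment code:

```python
from collections import deque, Counter

class EpisodicMemory:
    """Append-only event log with a fixed window; recall degrades
    predictably as older episodes fall off the end."""
    def __init__(self, capacity=100):
        self.events = deque(maxlen=capacity)

    def store(self, event):
        self.events.append(event)

    def recall(self, keyword):
        # Exact-match search over raw episodes, newest first.
        return [e for e in reversed(self.events) if keyword in e]

class SemanticMemory:
    """Lossy compression: keep aggregate facts, drop the episodes."""
    def __init__(self):
        self.facts = Counter()

    def store(self, event):
        for token in event.split():
            self.facts[token] += 1

    def recall(self, keyword):
        # Returns a strength score, not an episode -- the interesting
        # failure modes live in what the compression step discards.
        return self.facts[keyword]

epi = EpisodicMemory(capacity=2)
sem = SemanticMemory()
for e in ["user likes rust", "user likes go", "user dislikes java"]:
    epi.store(e)
    sem.store(e)

print(epi.recall("rust"))   # oldest episode evicted: []
print(sem.recall("likes"))  # aggregate survives: 2
```

The asymmetry at the end is the whole experiment in miniature: the episodic store fails loudly (the memory is simply gone), while the semantic store fails quietly (a count survives with no record of which episodes produced it).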

AI ● active

LLM Eval Without Benchmarks

Building task-specific evaluation that doesn't rely on contaminated standard benchmarks. Real deployments need different metrics than academic papers measure.

UE5 ● active

PCG + Houdini Pipeline

Procedural level generation via UE5's PCG framework with Houdini as pre-process. Getting authorial control without manual placement at scale.

HW ● in progress

RP2040 Environmental Monitor

Custom sensor array on RP2040 — temp, humidity, CO₂, particulates. Writing drivers from scratch. More interesting than using a prebuilt HAT.
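The shape of a from-scratch driver, using an SHT3x-class temp/humidity part as the example. The I2C bus is mocked so the logic runs anywhere; on the RP2040 it would be `machine.I2C`, and the framing here is simplified (CRC bytes omitted). The conversion formulas are from the SHT3x datasheet:

```python
class MockI2C:
    """Stands in for machine.I2C; returns a canned measurement frame."""
    def writeto(self, addr, buf):
        pass  # would trigger a single-shot measurement on real hardware

    def readfrom(self, addr, nbytes):
        # raw temp 0x6666, raw humidity 0x8000 (CRC bytes omitted)
        return bytes([0x66, 0x66, 0x80, 0x00])

class SHT3x:
    ADDR = 0x44

    def __init__(self, i2c):
        self.i2c = i2c

    def read(self):
        # 0x2400: single-shot measurement, high repeatability
        self.i2c.writeto(self.ADDR, b"\x24\x00")
        frame = self.i2c.readfrom(self.ADDR, 4)
        raw_t = frame[0] << 8 | frame[1]
        raw_h = frame[2] << 8 | frame[3]
        # Conversion formulas from the SHT3x datasheet.
        temp_c = -45 + 175 * raw_t / 65535
        rh = 100 * raw_h / 65535
        return temp_c, rh

sensor = SHT3x(MockI2C())
t, h = sensor.read()
print(f"{t:.1f} C, {h:.1f} %RH")  # 25.0 C, 50.0 %RH
```

Swapping the mock for the real bus is a one-line change, which is the payoff of writing the driver yourself: the register protocol and the unit conversion are yours to test, not hidden inside a HAT library.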

HW ● active

Local LLM Stack

Running quantized models on consumer GPUs. 4-bit/8-bit tradeoffs are more nuanced than the papers suggest. Tracking what's actually usable at each scale.
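One reason the tradeoff isn't a clean halving, as a back-of-envelope sketch. The group size and per-group scale cost below are illustrative assumptions, not measured values:

```python
def weight_bytes(n_params, bits, group_size=128, scale_bytes=2):
    """Quantized weight footprint: packed weights plus one fp16 scale
    per quantization group (zero-points and outliers ignored)."""
    packed = n_params * bits // 8
    scales = n_params // group_size * scale_bytes
    return packed + scales

n = 7_000_000_000        # a 7B-parameter model
fp16 = n * 2             # unquantized fp16 baseline
q8 = weight_bytes(n, 8)
q4 = weight_bytes(n, 4)
print(f"fp16 ~{fp16/1e9:.1f} GB, 8-bit ~{q8/1e9:.1f} GB, 4-bit ~{q4/1e9:.1f} GB")
```

Per-group metadata is a fixed cost, so 4-bit lands above half the 8-bit footprint — and the KV cache and activations don't shrink at all, which is where the VRAM budget actually goes at long context.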

AI ● paused

Tool-Use Reliability

Systematic analysis of tool-calling reliability across frontier models. Structured output helps, but schema complexity is the hidden failure mode nobody talks about.
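A rough proxy for the metric in question — nesting depth and field count of a tool's parameter schema. The schemas below are made-up examples, and the scoring rule is an illustrative assumption:

```python
def schema_complexity(schema, depth=1):
    """Returns (max nesting depth, total field count) for a JSON-Schema
    style object definition."""
    props = schema.get("properties", {})
    if not props:
        return depth, 0
    depths, fields = [depth], len(props)
    for sub in props.values():
        d, f = schema_complexity(sub, depth + 1)
        depths.append(d)
        fields += f
    return max(depths), fields

flat = {"type": "object", "properties": {"city": {"type": "string"}}}
nested = {
    "type": "object",
    "properties": {
        "query": {
            "type": "object",
            "properties": {
                "filters": {
                    "type": "object",
                    "properties": {"date": {"type": "string"}},
                }
            },
        }
    },
}
print(schema_complexity(flat))    # (2, 1)
print(schema_complexity(nested))  # (4, 3)
```

Both schemas validate fine on paper; the claim is that failure rates diverge sharply between shapes like `flat` and shapes like `nested`, which a pass/fail validator alone won't surface.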

ls projects/ --all

AI public

agent-kit

Personal toolkit for building LLM agents. Opinionated abstractions for tool use, memory, and multi-agent coordination. Built because existing frameworks make too many wrong choices.

AI private

eval-bench

Task-specific LLM evaluation harness. Generates synthetic test cases from production logs, measures what matters in real deployments rather than academic metrics.
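The core loop, reduced to a sketch — the field names and the exact-match scorer are assumptions for illustration, not eval-bench's actual format:

```python
production_logs = [
    {"input": "refund order 123", "accepted_output": "refund_issued"},
    {"input": "cancel order 456", "accepted_output": "order_cancelled"},
]

def make_cases(logs):
    """Each log line becomes a test case: same input, with the
    historically accepted output as the reference answer."""
    return [{"prompt": l["input"], "reference": l["accepted_output"]}
            for l in logs]

def run_eval(cases, model_fn):
    """Score a model function against the derived cases. Exact match
    here; real deployments need task-specific scorers."""
    hits = sum(model_fn(c["prompt"]) == c["reference"] for c in cases)
    return hits / len(cases)

# A stub standing in for the model under test.
stub = lambda p: "refund_issued" if "refund" in p else "unknown"
print(run_eval(make_cases(production_logs), stub))  # 0.5
```

The point of deriving cases from logs rather than benchmarks: the distribution is your actual traffic, so a score of 0.5 means something about production, not about a leaderboard.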

AI public

prompt-surgeon

CLI tool for analyzing and optimizing LLM prompts. Detects failure patterns, suggests structural fixes, runs A/B tests against your target model.

UE5 WIP

[CLASSIFIED]

Atmospheric exploration game in UE5. Heavy procedural generation, volumetric fog, Lumen GI. Not ready to show publicly yet.

HW done

headless-cluster

4-node Raspberry Pi cluster for distributed workloads and local inference. Custom 3D-printed rack, managed switch, shared NFS, k3s orchestration.

HW done

kb-custom

Custom 65% mechanical keyboard. QMK firmware, layout tuned for code. Hand-wired matrix, milled aluminum case, lubed and filmed tactile switches.

lshw --summary

// main rig

  • CPU      AMD Ryzen 9 7950X
  • GPU      NVIDIA RTX 4090 24GB
  • RAM      128GB DDR5-6000
  • Storage  2× 2TB NVMe + 8TB HDD
  • OS       Arch Linux / Windows 11

// microcontrollers

  • RP2040    primary embedded target
  • ESP32-S3  wireless & BLE projects
  • STM32     real-time performance
  • Arduino   rapid prototyping

// dev stack

  • Editor    Neovim / Rider (UE5)
  • Terminal  kitty + zsh + starship
  • VCS       git (obviously)
  • AI        Claude, local Qwen

// peripherals

  • Monitor   LG 27" 4K OLED
  • Keyboard  Custom 65% (hand-wired)
  • Mouse     G Pro X Superlight 2
  • Audio     Beyerdynamic DT 990 Pro

cat contact.txt

Open to consulting engagements and interesting conversations. Particularly interested in teams working on agent systems, AI infrastructure, or anything at the hardware-software boundary.

Response time: usually within 48h. No agencies, please.