---
date: '2026-05-23'
description: Aaron Pham's simulacrum
id: consult
layout: letter
modified: 2026-06-06 23:54:26 GMT-04:00
tags:
  - evergreen
title: consulting.
created: '2026-05-23'
published: '2026-05-23'
pageLayout: letter
slug: consult
permalink: https://aarnphm.xyz/consult.md
generator:
  quartz: v4.6.0
  hostedProvider: Cloudflare
  baseUrl: aarnphm.xyz
full: https://aarnphm.xyz/llms-full.txt
---
My consulting [[thoughts/craft|work]] spans [[thoughts/Machine learning|machine learning]] systems, efficient inference engines, and [[research|model behaviour]]. This [[index|website]] represents the non-physical me, my notes, and my interests. I mainly work on open source and plan to keep doing so. As of 06/14/2026, the site consists of 737275 [[thoughts/|words]] ([[thoughts/scripts/calculate_tokens.py|calculated from this script]]).

I’m mostly focused on kernel optimization for AI-specific workloads and metrics-oriented deployments. I offer the following services:

- custom prefill/decode kernels for latency-sensitive or high-throughput LLM deployments
- distributed serving and NCCL

If you would like to work with me, please contact me at <services@aarnphm.xyz>.

![[thoughts/craft#open source|open-source work]]

