millfolio
A private vault for the documents that are yours. Index your PDFs, Word docs, CSVs, and notes, then ask open-ended questions. Answers are computed locally, and your data never leaves the machine. A frontier model writes the code that runs on your data; it never sees the data itself.
How it works
The inference server →
The from-scratch, pure-Mojo GPU engine (socket to logits, every kernel hand-written), the models it serves, and the performance numbers.
The tools →
The from-scratch Mojo libraries underneath — networking, JSON, PDF extraction, vector search.
The privacy box →
How a frontier model can help with your data without ever seeing it.
A question becomes a program →
The deep dive on the codegen flow: a frontier model writes code that runs on your data, without ever seeing the data itself.
Measuring quality →
How we use perplexity to quantify what int4 quantization costs.