The Skill System That Writes Itself

Let me get the obvious out of the way: we’re late.

Every serious AI assistant has had skills for months now — little saved bundles of know-how the assistant can reach for when a task comes up. We didn’t ship them. For a while, that nagged at me.

Then I spent some real time with the systems that had them, and the nagging stopped. Because once you look closely, most skill systems quietly hand three jobs to you, the human:

You write the skill.
You store the skill.
You tell the assistant which skill to use, and when.

That’s not an assistant with skills. That’s a filing cabinet with an assistant standing next to it, waiting for you to pull the right folder. The boring part is still being done by hand, and the hand is yours.

So we waited, and we asked a different question. Not “how do we let people save instructions for the AI?” but “what would it look like if the assistant did the filing itself?”

This is the answer we shipped.

First, what a “skill” even is

Strip away the jargon and a skill is just a saved recipe for a task you’d otherwise figure out from scratch every time.

“Plan a weekend trip” isn’t one action — it’s a little sequence: check the weather, search for things to do, look at your calendar, draft an itinerary, set a couple of reminders. A skill is that sequence, written down once as a clean set of steps, so the next time you ask, the assistant already knows the shape of the job.

Useful. The interesting part is everything around the recipe — who writes it, who cleans it up, and who decides when to use it. That’s where we did the work.

The one rule: the human shouldn’t be the filing clerk

Everything below follows from a single principle. If a skill system is any good, the person using it should never have to think about the skill system. It should fill itself, tidy itself, find itself, and fit itself to you — in the background, without being asked.

Here’s how Chalie does each of those.

It notices when something was hard

Most assistants only ever learn a skill because you sat down and taught them one. Chalie does that too — but it doesn’t wait for you.

When Chalie finishes one of your requests, it glances back at how much work that request actually took. If the job needed four or more rounds of reaching for tools — search, then read, then check the calendar, then draft, each step feeding the next — that’s Chalie’s signal that the task was non-trivial. A quiet event fires on its own.

Nobody asked it to. You won’t see a thing. In the background, Chalie re-reads what it just did and asks itself one question:

Was that worth remembering?

This is the part I’m proudest of, and it’s a small thing with a big consequence. The trigger isn’t “the user said the magic word.” The trigger is that the task itself was complex, measured by how hard Chalie had to work to finish it. Difficulty is the teacher. The harder a job was the first time, the more valuable it is to never have to figure it out from scratch again.
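For the curious, here is a rough sketch of that trigger in Python. The threshold of four rounds comes from the description above; everything else (the `CompletedRequest` record, the `review_queue` list, the function names) is made up for illustration and is not Chalie's actual code.

```python
from dataclasses import dataclass, field

# From the text: four or more tool-calling rounds counts as non-trivial.
COMPLEXITY_THRESHOLD = 4


@dataclass
class CompletedRequest:
    """Hypothetical record of one finished request."""
    prompt: str
    tool_rounds: list[str] = field(default_factory=list)  # one entry per round


def should_review_for_skill(request: CompletedRequest) -> bool:
    """True when the request took enough rounds to merit a background review."""
    return len(request.tool_rounds) >= COMPLEXITY_THRESHOLD


def on_request_finished(request: CompletedRequest, review_queue: list) -> None:
    """Queue hard requests for later review; the user-facing reply is untouched."""
    if should_review_for_skill(request):
        review_queue.append(request)


queue: list[CompletedRequest] = []
on_request_finished(
    CompletedRequest("plan a weekend trip", ["search", "read", "calendar", "draft"]),
    queue,
)
on_request_finished(CompletedRequest("what time is it", ["clock"]), queue)
print([r.prompt for r in queue])
```

Only the first request is queued. Note that queuing is not saving: the review that follows can still decide the task was not worth remembering.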

And most of the time, the answer to “was that worth remembering?” is no — on purpose. Which brings me to the part I like even more.

It keeps the recipe, not the mess

Here’s the trap every “let the AI write its own skills” system falls into: the AI just transcribes whatever it did. Including the wrong turns.

Real work is messy. Chalie tries a search that returns junk, backs up, tries a different angle, repeats a lookup it didn’t need, does step three before step two and pays for it. If you save that as a skill, you’ve saved the confusion. Next time, the assistant faithfully reproduces every one of those dead ends.

So before Chalie saves anything, it does something we literally named dead-end elimination. It reviews the whole messy trail and strips out:

Dead ends — the tool calls that failed or got abandoned for a better approach.
Redundant steps — the lookups that repeated work and produced nothing new.
Bad ordering — steps that clearly belong in a different sequence.

The instruction it follows is one sentence, and it’s the heart of the whole system:

The skill must encode the optimal path — not the discovery journey.

Think of a recipe card. A good one tells you to whisk the eggs and fold in the flour. It doesn’t mention the egg you dropped on the floor, or the time you reached for salt instead of sugar. Nobody wants the bloopers — they want the version that works. Chalie saves the version that works.
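To make the idea concrete, here is a minimal sketch of the first two filters: dropping dead ends and dropping redundant steps. The `ToolCall` shape and its `ok` and `abandoned` flags are assumptions for the example. Reordering is left out on purpose, because deciding which step "clearly belongs" earlier takes judgment that a simple filter can't supply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one step in the messy trail."""
    tool: str
    args: str
    ok: bool = True          # False if the call failed
    abandoned: bool = False  # True if its result was dropped for a better approach


def eliminate_dead_ends(trace: list[ToolCall]) -> list[ToolCall]:
    """Keep only successful, non-repeated calls, in their original order."""
    seen: set[tuple[str, str]] = set()
    cleaned: list[ToolCall] = []
    for call in trace:
        if not call.ok or call.abandoned:
            continue  # dead end
        key = (call.tool, call.args)
        if key in seen:
            continue  # redundant: same lookup, nothing new
        seen.add(key)
        cleaned.append(call)
    return cleaned


trace = [
    ToolCall("search", "weekend trips", ok=False),  # returned junk
    ToolCall("search", "things to do nearby"),
    ToolCall("weather", "saturday"),
    ToolCall("weather", "saturday"),                # repeated lookup
    ToolCall("calendar", "this weekend"),
]
for call in eliminate_dead_ends(trace):
    print(call.tool, call.args)
```

Five steps go in and three come out: one search, one weather check, one calendar check.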

It’s also genuinely strict about it. Its default verdict on any task is “not worth saving.” To clear the bar, a workflow has to use several tools together, have a clear beginning and end, and — crucially — be general. A skill that only works for “book a table at this exact restaurant for this one birthday” is useless tomorrow. Chalie throws those away and keeps only the patterns a different person, on a different day, would genuinely follow again.

The result is a skill library that stays small, clean, and trustworthy — because it’s mostly made of “no.”
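If you like to see rules as code, the bar looks roughly like this. It is a sketch under loud assumptions: the `Candidate` fields are hypothetical, "several tools" is read as at least two, and generality is shown as a plain flag even though in practice it is a judgment call, not a lookup.

```python
from dataclasses import dataclass

MIN_TOOLS = 2  # assumption: the text only says "several tools"


@dataclass
class Candidate:
    """Hypothetical summary of a workflow being considered for saving."""
    tools_used: set[str]
    has_clear_start_and_end: bool
    is_general: bool  # would a different person, on a different day, reuse it?


def worth_saving(candidate: Candidate) -> bool:
    """Default to 'no'; flip to 'yes' only if every condition holds."""
    verdict = False
    if (
        len(candidate.tools_used) >= MIN_TOOLS
        and candidate.has_clear_start_and_end
        and candidate.is_general
    ):
        verdict = True
    return verdict


trip = Candidate({"search", "weather", "calendar"}, True, True)
birthday = Candidate({"search", "booking"}, True, False)  # one restaurant, one day
print(worth_saving(trip), worth_saving(birthday))
```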

It finds the right one by itself

You will never have to tell Chalie which skill to use. You won’t even know it has one.

This is the bit that most surprised people I’ve shown it to. In a lot of assistants, a skill only helps if you remember it exists and steer the conversation toward it. The skill is real, but it’s inert until a human points at it.

Chalie points at its own. The ability to search its skill library is always switched on, sitting in the assistant’s hands on every single turn. So when you say “help me plan a trip,” Chalie — entirely on its own initiative — asks its own library “do I already know how to do this?” before it starts improvising. If there’s a matching recipe, it uses it. If there isn’t, it just gets on with the task (and, if the task turns out to be hard, you already know what happens next).

You direct what you want. Chalie handles how — including remembering that it already knows how.
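In code terms, the order of operations is the whole trick: look in the library first, improvise second. A minimal sketch follows. The `handle_turn`, `find_skill`, and `improvise` names are invented for the example, and the substring match is a stand-in for the real lookup.

```python
from typing import Callable, Optional


def handle_turn(
    request: str,
    find_skill: Callable[[str], Optional[list[str]]],
    improvise: Callable[[str], list[str]],
) -> tuple[str, list[str]]:
    """Check the skill library on every turn before falling back to improvising."""
    steps = find_skill(request)
    if steps is not None:
        return ("skill", steps)
    return ("improvised", improvise(request))


# Toy library and lookup, purely for illustration.
library = {
    "plan a trip": ["check weather", "search activities", "check calendar",
                    "draft itinerary", "set reminders"],
}


def find_skill(request: str) -> Optional[list[str]]:
    for name, steps in library.items():
        if name in request.lower():
            return steps
    return None


def improvise(request: str) -> list[str]:
    return [f"work out how to: {request}"]


print(handle_turn("Help me plan a trip", find_skill, improvise))
print(handle_turn("Summarise this article", find_skill, improvise))
```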

…without drowning in its own notes

There’s a hidden cost to skills that nobody talks about, so let me talk about it.

An AI has a limited amount of attention — a working memory it has to spend on every reply. Every word you make it read is a word it can’t spend on actually thinking. So the lazy way to “give an AI skills” is to dump the whole folder of instructions into its lap every time and let it skim, or to have it rummage through files looking for matches. Both quietly burn that working memory, and it gets worse with every skill you add. The more it knows, the dumber each individual answer gets. That’s a bad trade.

Chalie doesn’t read its skills. It asks them.

The skills live in a small, fast index — think of it as a librarian rather than a pile of books. When Chalie needs a skill, it poses a question to the librarian and gets back only the two or three recipes that actually match. The other seventy stay on the shelf, costing nothing.

And it asks in two ways at once: by meaning (so “organise a getaway” finds the trip-planning recipe even though the words are different) and by exact wording (so specific terms land precisely). The two results are blended into one short, sharp answer. The whole thing runs privately, on the machine, with no folder-rummaging and no working memory wasted. Chalie can know hundreds of skills and stay exactly as sharp on every reply.
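Here is one way a two-way lookup like that can be blended, as a sketch and not a description of Chalie's internals. The skill names, the hand-written three-number "meaning" vectors (standing in for a real embedding model), and the choice of reciprocal rank fusion as the blending rule are all assumptions made for the example.

```python
import math

# Hypothetical skill descriptions, used for the exact-wording match.
SKILLS = {
    "plan_trip": "plan weekend trip weather activities itinerary",
    "weekly_report": "compile weekly status report from notes",
    "meal_plan": "plan meals build grocery list",
}

# Toy "meaning" vectors; a real system would get these from an embedding model.
VECTORS = {
    "plan_trip": (0.9, 0.1, 0.1),
    "weekly_report": (0.1, 0.9, 0.1),
    "meal_plan": (0.2, 0.1, 0.9),
}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def rank_by_meaning(query_vec):
    """Order skills by similarity of meaning, best first."""
    return sorted(VECTORS, key=lambda s: cosine(query_vec, VECTORS[s]), reverse=True)


def rank_by_wording(query):
    """Order skills by how many exact query words they share, best first."""
    words = set(query.lower().split())

    def overlap(skill):
        return len(words & set(SKILLS[skill].split()))

    return sorted(SKILLS, key=overlap, reverse=True)


def blend(rankings, k=60, top_n=2):
    """Reciprocal rank fusion: each list votes 1 / (k + position) per skill."""
    scores = {}
    for ranking in rankings:
        for position, skill in enumerate(ranking, start=1):
            scores[skill] = scores.get(skill, 0.0) + 1.0 / (k + position)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


def search(query, query_vec, top_n=2):
    return blend([rank_by_meaning(query_vec), rank_by_wording(query)], top_n=top_n)


# "getaway" shares no words with any skill; the meaning vector carries the match.
print(search("organise a getaway", (0.8, 0.2, 0.2)))
```

Only the top few names come back, so the cost of a lookup stays the same whether the library holds three skills or three hundred.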

It quietly shapes each recipe to you

This is the piece that wasn’t even in my original list of things to write about — and it might be my favourite.

A shared recipe is generic by definition. But you are not generic. Over time, Chalie notices patterns in how you operate — the rhythms, the preferences, the way you like things done. In the background, it maps those patterns onto its skills and writes little personalisation notes: for this person, this skill should be run slightly differently.

So the same built-in “plan a trip” skill isn’t quite the same skill for you as it is for me. Yours has quietly absorbed how you actually travel. You never set this up. It accumulates on its own, the same way a good human assistant gradually learns your habits without you ever sitting them down for a briefing.
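One way to picture it: the shared recipe stays untouched, and the personal notes sit on top as a thin layer per person. The sketch below assumes that layout. The `Skill` class, the user ids, and the sample notes are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical shared recipe plus per-user personalisation notes."""
    name: str
    steps: list[str]
    notes: dict[str, list[str]] = field(default_factory=dict)  # user id -> notes

    def add_note(self, user: str, note: str) -> None:
        self.notes.setdefault(user, []).append(note)

    def render_for(self, user: str) -> str:
        """Shared steps first, then whatever has been learned about this user."""
        lines = [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        lines += [f"Note: {note}" for note in self.notes.get(user, [])]
        return "\n".join(lines)


trip = Skill("plan a trip", ["check weather", "search activities", "draft itinerary"])
trip.add_note("you", "prefers trains over flights")  # made-up example preference

print(trip.render_for("you"))
print("---")
print(trip.render_for("me"))
```

The same three steps render for both people; only one of them gets the extra line.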

And you’re never locked out

For all this autonomy, you stay fully in charge.

You can teach Chalie a skill directly — just tell it, in plain conversation, “learn how to do X,” and it will write the recipe.
You can author your own from the dashboard, edit them, delete them, or take one of the built-in ones and copy-and-customise it into your own version.
And it ships with more than seventy ready-made skills out of the box — each one hand-curated and run through our nightly test suite before it ever reaches you, so day one isn’t a blank library.

Autonomy and control aren’t a trade-off here. Chalie fills the library on its own and hands you the pen.

Why late was worth it

So, yes — we were last to the party. But I don’t think we showed up with the same thing everyone else brought.

The common version of skills is a cabinet you fill, you tidy, and you open. Chalie’s version fills itself when a task proves hard, cleans itself down to only the path that worked, finds the right recipe without being asked, does it all without getting slower, and quietly tailors each one to the person it’s working for.

That’s not a feature we bolted on. It’s the assistant doing the boring part for you — which, when I started building this thing, was the entire point.

A few terms, in plain English

Skill — A saved recipe for a repeatable task: a clean, ordered set of steps the assistant can reuse instead of figuring the job out from scratch each time.

Agent — An AI that doesn’t just chat back, but can take actions on your behalf — searching, drafting, scheduling — to actually get a task done.

Harness — The machinery built around the AI model that decides what it sees, what it remembers, and what it’s allowed to do. Most of what makes an assistant feel smart lives here, not in the model itself.

Tool — A single capability the assistant can use, like “search the web,” “check the calendar,” or “draft a document.” Skills are recipes made of tools.

Tool-calling round (an “iteration”) — One step in a multi-step task, where the assistant uses a tool, looks at the result, and decides what to do next. A hard task takes many rounds; that count is how Chalie measures difficulty.

Working memory / context — The limited amount the AI can pay attention to in a single reply. Anything you force it to read eats into the attention it has left for actually thinking — which is why bloating it with every skill at once makes answers worse.

Search by meaning vs. by wording — Two ways to look something up. By meaning finds matches even when the words differ (“getaway” → “trip”). By wording matches the exact terms. Chalie does both at once and blends the results.

Subconscious — Chalie’s background activity: the thinking and tidying it does while you’re not actively talking to it. Learning skills and personalising them both happen here, out of sight.