Product POV

Why we built voice-first

By William Simpson · May 18, 2026 · 6 min read

Almost everyone has tried to keep a journal. Almost no one has kept one for long. The Moleskine lives in the drawer. The notes app fills up for two weeks, then sits cold for a year. The pattern is so universal it’s funny, except for the part where you started journaling for a reason that hasn’t gone away.

The usual diagnosis is discipline. If you just had more of it, you’d write every day. I don’t buy that. The problem is mechanical. Typing is a bad fit for the kind of thinking a journal is supposed to capture, and the worst time to ask someone to type is exactly when they most need to journal.

Typing is too slow for the thought

The average person speaks at 130 to 150 words per minute. The average person types at around 40. That gap is the whole problem. When you’re trying to capture a feeling, an argument with your partner, or a hard moment from the day, you’re forced to either truncate the thought or lose it. Most people do both without noticing, then conclude they have nothing to say. They have plenty to say. The keyboard just can’t keep up.

This isn’t a failure of effort. It’s a failure of medium. If you’ve ever tried to take meeting notes and missed what someone said because you were still typing the last sentence, you’ve felt the same bottleneck. Now apply it to something more emotionally complicated than meeting notes.

Typing makes you edit while you’re feeling

The harder problem is what typing does to the thought before it lands. Typing is a writerly act. You see the words appear and you fix them. You delete the sentence that sounded too whiny. You smooth the half-formed thought into a complete one. By the time you’ve written a paragraph you’ve performed a small editorial cleanup on the version of yourself that’s allowed onto the page.

This is fine if you’re writing an essay. It is exactly the wrong thing if the journal is for processing what’s actually going on. The editing is what you don’t want.

Voice doesn’t let you do that, or at least not as easily. You start talking and you find yourself saying something you didn’t know you were going to say. The thought arrives in the speaking, not before. That’s most of the value.

Talking is what we already do for the hard stuff

Think about what you do when something heavy is on your mind. You call a friend. You go for a walk and think out loud. You vent at your partner over dinner. The voice channel is what humans use when the thing is hard to put down on paper, when it doesn’t have a tidy ending or a clear thesis yet. It’s been the right channel for that job since long before anyone wrote anything down.

Therapy is a structured version of the same mechanic. An hour a week, sitting in a room, saying things out loud. Nobody walks into a session and asks the therapist if they can type at her for fifty minutes instead. The voice channel works for the messy stuff because the messy stuff doesn’t survive the typing process intact.

A voice journal is the same mechanic, in your pocket, with nobody else on the line. You can capture the argument the night it happened, not on Tuesday at 4pm when you’ve already half-rewritten it in your head. You can describe the dream while it’s still wet. You can leave a one-minute note about something that bothered you at lunch and forget about it until you’re ready to look at it.

This is the difference between a journaling app and a journal for the actually hard stuff. Gratitude lists and bullet-point CBT worksheets are fine, but they’re a different exercise. They want structure. The hard stuff wants the opposite. It wants whatever is closest to the surface, before you’ve cleaned it up for an audience.

Friction is the whole problem

Habits live or die by friction. If the cost of capturing a thought is “open the app, find a quiet surface, type for five minutes,” most thoughts won’t get captured. If the cost is “open the app, hit one button, talk for ninety seconds,” most of them will. The math is not subtle, and it’s the reason every diet, every gym membership, and every journaling app fails the same way. The first version is too much work for the user you’ll actually be next Tuesday.

So we made the talking version the only version. Voice in. Transcribed text out, on the device, so you have something to read back later. Optional AI to surface patterns over weeks if you want it. None of it requires you to type while you’re upset, which is when you need the journal to be easiest.

The privacy story has to hold or none of this matters

A reasonable objection: audio recordings of someone talking through their own hard stuff sound like the worst possible thing to upload to a server. Agreed. That’s why we don’t.

This is where the architectural choice comes in. Apple ships an on-device speech recognition engine. When an app asks for it, on devices and languages that support it, recognition runs locally on your phone, with no audio leaving the device, no cloud round-trip, no transcript sitting on someone else’s machine. We use that engine. The audio is processed on your phone and discarded. The transcript stays on your phone too, unless you opt into AI insights, and even then it leaves your phone under your own API key, never through our servers. The BYOK explainer covers that half of the privacy story in detail.
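For the on-device half, here is a minimal sketch of what requesting local-only transcription can look like with Apple’s Speech framework. It is an illustration under stated assumptions, not our production code: the function name `transcribeOnDevice`, the `TranscriptionError` type, and the default `en-US` locale are hypothetical. It also assumes the app has already been granted speech recognition permission (via `SFSpeechRecognizer.requestAuthorization` and the `NSSpeechRecognitionUsageDescription` entry in Info.plist).

```swift
import Foundation
import Speech

/// Hypothetical error type for this sketch.
enum TranscriptionError: Error {
    case onDeviceUnavailable
}

/// Transcribes a recorded audio file without sending audio off the device.
/// The function name and parameters are illustrative, not a published API.
func transcribeOnDevice(
    fileURL: URL,
    locale: Locale = Locale(identifier: "en-US"),
    completion: @escaping (Result<String, Error>) -> Void
) {
    // Refuse to continue rather than fall back to server-based recognition.
    guard let recognizer = SFSpeechRecognizer(locale: locale),
          recognizer.isAvailable,
          recognizer.supportsOnDeviceRecognition else {
        completion(.failure(TranscriptionError.onDeviceUnavailable))
        return
    }

    let request = SFSpeechURLRecognitionRequest(url: fileURL)
    request.requiresOnDeviceRecognition = true   // keep the audio on the phone
    request.shouldReportPartialResults = false   // deliver only the final transcript

    // The returned task is ignored here; a real app would likely keep it
    // so the recognition can be cancelled.
    _ = recognizer.recognitionTask(with: request) { result, error in
        if let error = error {
            completion(.failure(error))
        } else if let result = result, result.isFinal {
            completion(.success(result.bestTranscription.formattedString))
        }
    }
}
```

The design choice that matters is the `guard`: if the device or language can’t do on-device recognition, the sketch fails loudly instead of quietly using a server. Deleting the audio file once the completion handler delivers a transcript would match the “processed and discarded” behavior described above.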

Without on-device transcription, voice-first means uploading audio to a server, and the whole privacy model collapses. You can’t claim “your journal is yours” if you’re shipping recordings of someone’s private reflections to a transcription vendor in someone else’s data center. The on-device engine is what makes the rest of the design honest.

This is why we’re iOS-only at launch. Android has on-device speech recognition too, but the quality and consistency aren’t there yet across devices. We’d rather do one platform well than two platforms in a way that breaks the privacy promise.

What voice doesn’t fix

Voice journaling isn’t magic. It doesn’t make people who don’t want to journal start journaling. If the underlying reason you stopped journaling was that you didn’t actually want to look at what’s going on, no app is going to fix that. You probably don’t need an app; you need a therapist.

What voice does is remove one barrier. The mechanical one. The “it’s too much effort to type at 11pm when I’m wrecked” one. For the people who want to journal and have been losing the fight with their notes app, that’s the difference between a habit that lasts a month and one that lasts a year.

If you’ve tried journaling and it didn’t stick, it’s probably not you. It’s probably the medium. Try voice for a couple of weeks and see if the practice gets easier. If it does, you’ll know.