Friday, July 27, 2018

GUEST POST: Overcoming Writer's Block with Automatic Transcription

by Jason Kincaid


This article was originally published by Descript, and an expanded version of the article can be found here.



If you’re a writer—of books, essays, scripts, blog posts, whatever—you’re familiar with the phenomenon: the blank screen, a looming deadline, and a sinking feeling in your gut that pairs poorly with the jug of coffee you drank earlier.

If you know that rumble all too well, this post is for you. Maybe it’ll help you get out of a rut; at the very least, it’s good for a few minutes of procrastination.

Here’s the core idea: thinking out loud is often less arduous than writing. And it’s now easier than ever to combine the two, thanks to recent advances in speech recognition technology.

Of course, dictation is nothing new, and plenty of writers have taken advantage of it. Carl Sagan’s voluminous output was facilitated by his process of speaking into an audio recorder, to be transcribed later by an assistant. (You can listen to some of his dictations in the Library of Congress!) And software like Dragon NaturallySpeaking has offered automated transcription for people with the patience and budget to pursue it.

But it’s only in the last couple of years that automated transcription has reached a sweet spot—of convenience, affordability, and accuracy—that makes it practical to use more casually. And I’ve found it increasingly useful for generating a sort of proto-first draft: an alternative approach to the painful process of converting the nebulous wisps inside your head into something you can actually work with.

I call this process idea extraction (though these ideas may be more accurately dubbed brain droppings).

Part I: Extraction

Here’s how my process works. Borrow what works for you and forget the rest—and let me know how it goes!
  • Pick a voice recorder. Start talking. Try it with a topic you’ve been chewing on for weeks—or when an idea flits into your head. Don’t overthink it. Just start blabbing.
  • The goal is to tug on as many threads as you come across and to follow them as far as they go. These threads may lead to meandering tangents—and you may discover new ideas along the way.
  • A lot of those new ideas will probably be embarrassingly bad. That’s fine. You’re already talking about the next thing! And unlike with text, your bad ideas aren’t staring you in the face.
  • Consider leaving comments to yourself as you go—e.g. “Maybe that’d work for the intro.” These will come in handy later.
  • For me, these recordings run anywhere from 20 to 80 minutes. Sometimes I make several much shorter ones in quick succession. Whatever works.

Part II: Transcription

Once I’ve finished recording, it’s time to harness ⚡️The Power of Technology⚡️

A little background: over the last couple of years, there’s been an explosion of tools related to automatic speech recognition (ASR) thanks to huge steps forward in the underlying technologies.

Here’s how ASR works: you import your audio into the software, then the software uses state-of-the-art machine learning to spit back a text transcript a few minutes later. That transcript won’t be perfect—the robots are currently in the ‘write drunk’ phase of their careers. But for our purposes, that’s fine. You just need it to be accurate enough that you can recognize your ideas.

Once you have your text transcript, your next step is up to you. Maybe you’re exporting your transcript as a Word doc and revising from there. Maybe you’re firing up your voice recorder again to dictate a more polished take. Maybe only a few words in your audio journey are worth keeping—but that’s fine, too. It probably didn’t cost you much. (And good news: the price for this tech will continue to fall in the years ahead.)
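If you left spoken comments to yourself while recording, a few lines of code can fish them back out of the transcript. Here's a minimal sketch in Python, assuming the transcript is plain text and that your notes tend to include a recognizable cue phrase. The cue list and the `find_notes` helper are hypothetical, made up for illustration; they aren't part of any transcription tool.

```python
import re

# Hypothetical cue phrases; swap in whatever you actually say to yourself.
CUES = ("maybe", "note to self", "come back to")

def find_notes(transcript, cues=CUES):
    """Return the sentences in the transcript that contain a cue phrase."""
    # Split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [s for s in sentences if any(c in s.lower() for c in cues)]

sample = (
    "So the robots are getting better at this. "
    "Maybe that'd work for the intro. "
    "Anyway, walking helps me think."
)

for note in find_notes(sample):
    print(note)
```

Automatic transcripts often garble punctuation, so the sentence splitting here is rough at best. Treat the matches as a list of places to look, not a finished outline.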

A few more tips:

  • Use a recorder/app that you trust. Losing a recording is painful—and the anxiety of losing another can derail your most exciting creative moments. (“I hope this recorder is working. Good, it is... @#*! where was I?”)
  • Audio quality matters when it comes to automatic transcription. If your recording has a lot of background noise or you’re speaking far away from the mic, the accuracy is going to drop. Consider using earbuds (better yet: AirPods), so you can worry less about where you’re holding the recorder.
  • Find a comfortable space. Eventually you may get used to having people overhear your musings, but it’s a lot easier to let your mind “go for a walk” when you’re comfortable in your environment.
  • Speaking of walking: why not go for a stroll? The pains of writing can have just as much to do with being stationary and hunched over. Walking gets your blood flowing—and your ideas, too.
  • I have a lot of ideas, good and bad, while I’m thinking out loud and playing music at the same time (in my case, guitar—but I suspect it applies more broadly). There’s something about playing the same four-chord song on autopilot for the thousandth time that keeps my hands busy and leaves my mind free to wander.

The old ways of doing things, whether it’s with a keyboard or pen, still have their advantages. Putting words to a page can force a sort of linear thinking that is otherwise difficult to maintain. And when it comes to editing, it’s no contest: QWERTY or bust.

But for getting those first crucial paragraphs down (and maybe a few keystone ideas to build towards)? Consider talking to yourself. Even if you wind up with a transcript full of nothing but profanity—well, have you ever seen a transcript full of profanity? You could do a lot worse.
