Thursday, February 23, 2006

only in dreams

Cory Doctorow has a prop in his latest opus, Themepunks. Now that I've been thinking about it, I really want one, and I wonder whether it is actually possible given the state of the art in speech technology. Let's see if I can dig up the precise description of this wonderful yet ephemeral tool:

"This is a new artifact designed and executed by five previously out-of-work engineers in Athens, Georgia. They've mated a tiny Linux box with some speaker-independent continuous speech recognition software, a free software translation engine that can translate between any of twelve languages, and an extremely high-resolution LCD that blocks out words in the path of the laser-pointer.

"Turn this on, point it at a wall, and start talking. Everything said shows up on the wall, in the language of your choosing, regardless of what language the speaker was speaking."

All the while, Kettlewell's words were scrolling by in black block caps on that distant wall in crisp, laser-edged letters.

I get the feeling this is a bit like what the Japanese call gomi, or, in the story convention schwag, finally ending its projected life within the story in blister packs on a strip at Best Buy. Still, I was surprised to find laser pointers on sale at the local Cumberland Farm in 2000. Just 40 years earlier, a laser was a massive, complex piece of scientific equipment, and here it was, a novelty item in between smileyface keychains and oversized gaudy incense. Cory puts his laser widget to good story use, including this one faux pas for dramatic effect:

Rat-Toothed Freddy leaned over her shoulder, blowing shit-breath in her ear. "Translation: you're ass-fucked, the lot of you."


Andrea yelped as the words appeared on the wall and reflexively swung the pointer around, painting them on the ceiling, the opposite wall, and then, finally, in miniature, at her PowerBook's lid. She twisted the pointer off.

Still, I wonder whether this speaker-independent, continuous speech recognizer would be free, to say nothing of the machine translation, which is one of the more proprietary areas. I guess that's on my mind because I have recently taken it upon myself professionally to find out how much good a free system can do. In particular, my professor and I were discussing modifying an open-source speech synthesis engine (like a version of Festival, maybe FreeTTS) so that rhythm and durational elements can be adjusted, perhaps even in sync with the recognizer. Researchers in Japan have developed a method based on speech recognition technology to record new TTS voices with minimal data (30 sentences). Perhaps the result will be not the talking laser billboard but your own voice, lovingly recorded and hashed into a simulacrum by some artifact so that it may speak to you like you. Science fiction has spurred some innovations which no one would have considered before.






All original material of whatever nature
created by Nicholas Winslow and included in
this weblog and any related pages, including archives,
is licensed under a Creative Commons
Attribution-Noncommercial-Sharealike license
unless otherwise expressly stated (2006)