Wednesday, February 11, 2009

Auto Audiobooks

So, the new Kindle has text-to-speech capability, and the Authors Guild is all upset about it infringing their rights. Cory Doctorow has the neatest summary. The Guild's complaint is being pretty roundly dismissed, but, I feel, for the wrong reasons. Nearly everyone points to the robotic speech of modern TTS, and I think that misses the point.

I talked about this a bit in a comment on Whatever (John Scalzi's excellent blog). The upshot is, while text-to-speech sucks today, it won't always. In the future it is not ridiculous to imagine that a computer reading a text will not just speak fluidly with reasonable intonation, but could use different voices for different characters, even the voices of famous actors. (On a side note, I wonder whether public figures, who do not need to be paid for likenesses, will also not need to be paid for voice-likenesses. Could I do an audiobook of, say, "A Confederacy of Dunces" using only voices from, say, the 110th Congress?)

For that matter, it would not be terribly difficult to do something like this already, if one were willing to put some work into it. An audiobook markup language could do the job admirably. It would need to have several things:
  • The ability to mark particular characters, and a table of voices to match characters to voice "actors"
  • Markup for pauses, emphasis, volume, and speed (like music)
  • Either markup for or a glossary for pronunciation
This would not be difficult to do, actually.
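
To make that a bit more concrete, here is a rough sketch of what the three pieces might look like in Python. Everything in it is my own invention for illustration: the Segment class, the render function, the voice names, and the sample sentences are all hypothetical. The tag names are loosely borrowed from the W3C's Speech Synthesis Markup Language (SSML), but I have not validated the output against the spec or fed it to any real speech engine, so treat it as a thought experiment rather than a working audiobook format.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Segment:
    """One run of text spoken by a single character (hypothetical structure)."""
    text: str
    character: str = "narrator"   # key into the voice table
    pause_before_ms: int = 0      # silence before this segment
    emphasis: bool = False
    rate: str = "medium"          # e.g. "slow", "medium", "fast"
    volume: str = "medium"        # e.g. "soft", "medium", "loud"


def render(segments, voices, glossary=None):
    """Turn segments into SSML-style markup.

    voices maps character names to voice names and must contain "narrator",
    which is used as the fallback. glossary maps a written word to the way
    it should be spoken (naive whole-string replacement, for illustration).
    """
    glossary = glossary or {}
    out = ["<speak>"]
    for seg in segments:
        body = escape(seg.text)
        # Pronunciation glossary: wrap each listed word in a <sub> alias.
        for word, spoken in glossary.items():
            tagged = f"<sub alias={quoteattr(spoken)}>{escape(word)}</sub>"
            body = body.replace(escape(word), tagged)
        if seg.emphasis:
            body = f"<emphasis>{body}</emphasis>"
        # Speed and volume, much like tempo and dynamics in music.
        body = (f"<prosody rate={quoteattr(seg.rate)} "
                f"volume={quoteattr(seg.volume)}>{body}</prosody>")
        if seg.pause_before_ms:
            out.append(f'<break time="{seg.pause_before_ms}ms"/>')
        # Table of voices: match the character to a voice "actor".
        voice = voices.get(seg.character, voices["narrator"])
        out.append(f"<voice name={quoteattr(voice)}>{body}</voice>")
    out.append("</speak>")
    return "\n".join(out)


# Made-up example: two characters, one pause, one glossary entry.
voices = {"narrator": "voice-a", "guard": "voice-b"}
segments = [
    Segment("Siobhan opened the door."),
    Segment("Who goes there?", character="guard",
            pause_before_ms=400, emphasis=True, volume="loud"),
]
ssml = render(segments, voices, glossary={"Siobhan": "shi-vawn"})
```

The point of keeping the voice table separate from the text is that the same marked-up book could be "recast" just by swapping the table, without touching the markup itself.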
