The Curious Complexity of Being Turned On
In software, the simplest things can turn into a nightmare, especially at a large company.

"expertsexchange.com" is a domain name that can be read in multiple, unintended ways. Howshouldatexttospeechsystemresolvethisambiguity?
Recently, I was contracted to run a list of domain names through the custom-built pronunciation engine that powers my rhyming web site. On the first attempt, I found that the results were embarrassingly bad. A quick inspection revealed the problem: most domain names are severalwordsstucktogether.
When a pronunciation-by-analogy system encounters an unknown word, it searches its knowledge base for words that look similar and tries to stitch together their pronunciations. In this case, it was doing just what it was supposed to do. For example, lots of words end with an 'e', and that final 'e' is usually silent. But stick another word on the end, and the system would try to pronounce the 'e', just like a six-year-old learning to read by sounding out each letter. Most people, on the other hand, would recognize the two words and say each one individually.
Try these domains in the AT&T text-to-speech system, which many consider to be the best in the world, at http://www.research.att.com/~ttsweb/tts/demo.php.
Time for a bit of dynamic programming. After finding an appropriate scoring function, we can break up text the same way a human reader would. We also use some simple heuristics to say numbers properly.
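To make the idea concrete, here is a minimal sketch of dynamic-programming word segmentation in Python. It is not the engine's actual code: the tiny WORD_COUNTS table, the unknown-word penalty, and the names word_score and segment are all assumptions made for illustration. The scoring function assumed here is the sum of log word frequencies, so a split into common words beats a split into rare or unknown ones. Number handling is left out.

```python
import math

# Hypothetical toy frequency table. A real system would load counts
# for many thousands of words from a corpus.
WORD_COUNTS = {
    "experts": 40, "expert": 60, "exchange": 50, "sex": 30, "change": 80,
    "ex": 5, "several": 70, "words": 90, "word": 120, "stuck": 20,
    "together": 100, "to": 900, "get": 300, "her": 400,
}
TOTAL = sum(WORD_COUNTS.values())


def word_score(word):
    """Log-probability of a known word; unknown strings get a steep,
    length-dependent penalty (an assumed heuristic)."""
    count = WORD_COUNTS.get(word)
    if count is not None:
        return math.log(count / TOTAL)
    return math.log(1.0 / TOTAL) - 5.0 * len(word)


def segment(text, max_word_len=20):
    """Split text into the highest-scoring sequence of words."""
    n = len(text)
    # best[i] = (best score for text[:i], start index of its last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_word_len), end):
            score = best[start][0] + word_score(text[start:end])
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(segment("expertsexchange"))            # ['experts', 'exchange']
print(segment("severalwordsstucktogether"))  # ['several', 'words', 'stuck', 'together']
```

Because every word adds a negative log-probability, this scoring tends to prefer fewer, more common words, which is why "experts exchange" wins over the three-word reading with these toy counts. Different counts would shift the result, so the choice of frequency data matters as much as the algorithm.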
Although I don't have a speech synthesizer, you can check the raw pronunciation output using this form. The phonemes correspond to the ones in the CMU pronouncing dictionary.
It's German, and it means "shower light" or "you bitch" ;-)
I have no experience with dynamic programming, and unlike your phonenum-spelling post, it's hard for me to understand how exactly the problem was broken down. I assume you did this with your 'scoring function'. Would you mind quickly jotting down some pseudocode for how exactly this function works?
Cheers!
The day arrived when my project was ready to be unleashed upon the world. I waited until the teacher was hovering nearby and then I started my application, running the FORMAT command on the network drive. Some classmates were watching the screen and she hurried over to see what all the fuss was about.
Now it's a commercial product, but Zwibbler was once a fun side project, and here are some details on its implementation.
If you have a web site with a search function, you will rapidly realize that most mortals are terrible typists. Many searches contain misspelled words, and users will expect these searches to magically work. This magic is often done using Levenshtein distance. In this article, I'll compare two ways of finding the closest matching word in a large dictionary. I'll describe how I use it on rhymebrain.com.