Building a better rhyming dictionary
Back in 2007, I created a rhyming engine based on the public domain Moby pronouncing dictionary. It simply reads the dictionary and looks for rhyming words by comparing the suffix of the words' pronunciations. Since that time, I have made some improvements.
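To make the suffix comparison concrete, here is a minimal Python sketch. The tiny pronunciation table, the ARPAbet-style phoneme symbols, and the function name `rhymes_by_suffix` are all made up for illustration; this is not the engine's actual code or the Moby file format.

```python
# Toy pronunciation table (hypothetical data, ARPAbet-like symbols).
PRONUNCIATIONS = {
    "cat": ["K", "AE", "T"],
    "hat": ["HH", "AE", "T"],
    "bat": ["B", "AE", "T"],
    "dog": ["D", "AO", "G"],
    "log": ["L", "AO", "G"],
    "combat": ["K", "AA", "M", "B", "AE", "T"],
}


def rhymes_by_suffix(word, table, suffix_len=2):
    """Return words whose last `suffix_len` phonemes match those of `word`."""
    target = table[word][-suffix_len:]
    return sorted(
        other
        for other, phones in table.items()
        if other != word and phones[-suffix_len:] == target
    )


print(rhymes_by_suffix("cat", PRONUNCIATIONS))
```

A fixed suffix length is the simplest possible choice. A more careful version would probably start the comparison at the last stressed vowel instead.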
Using a combination of techniques from artificial intelligence, math, and linguistics, the rhyming engine can now figure out how to say any word that you enter. That means if you enter a word that is not in the dictionary, it will still be able to find some rhymes.
Rather than looking for technically perfect rhymes, it suggests words that would sound good together in song or poetry. For example, we sometimes ignore consonants, as suggested by this 1985 paper. That way, fervently will rhyme with urgently despite the v/g mismatch.
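One way to picture this kind of loose matching is sketched below in Python. It is only my guess at the idea, not the method from the paper: vowels in the tail must match exactly, while a consonant may stand opposite any other consonant. The phoneme lists are hand-written approximations, and `loose_rhyme` and `tail_len` are hypothetical names.

```python
# Vowel symbols in an ARPAbet-like alphabet (assumed for this sketch).
VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def loose_rhyme(phones_a, phones_b, tail_len):
    """Compare the last `tail_len` phonemes; only vowels must agree."""
    tail_a, tail_b = phones_a[-tail_len:], phones_b[-tail_len:]
    if len(tail_a) != len(tail_b):
        return False
    for pa, pb in zip(tail_a, tail_b):
        # If either side is a vowel, both must be the same vowel.
        if (pa in VOWELS or pb in VOWELS) and pa != pb:
            return False
    return True


# Hand-written approximate pronunciations (illustrative only).
fervently = ["F", "ER", "V", "AH", "N", "T", "L", "IY"]
urgently = ["ER", "JH", "AH", "N", "T", "L", "IY"]
gallantly = ["G", "AE", "L", "AH", "N", "T", "L", "IY"]

print(loose_rhyme(fervently, urgently, tail_len=7))
print(loose_rhyme(fervently, gallantly, tail_len=7))
```

Here the V in fervently lines up with the JH in urgently and is simply skipped, while gallantly fails because its stressed vowel differs.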
There is a legal advantage to this technique as well. Many of the standard word lists used by natural language processing researchers include words from an old edition of the Oxford dictionary, and so cannot be used for "commercial purposes". That's why both Rhymezone and Write Express have a relatively limited dictionary size. My rhyming engine can sidestep this issue, since it only needs to be seeded with a small number of words from unrestricted sources, and it can then import words in bulk, and guess the pronunciations without using any restricted content.
I couldn't resist doing some premature optimization. The engine uses one of my favourite data structures -- the trie. The program starts, reads the entire 260,000-word database, and completes in 60 ms on my netbook web server. It takes about 8 MB of memory. I guess that equates to about 0.48 megabyte-seconds per request (8 MB × 0.06 s).
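For readers who have not met a trie before, here is a minimal Python sketch of how one might be used for rhymes. Keying the trie on reversed phonemes, so that words sharing an ending share a path from the root, is my assumption for this example; the class and method names are hypothetical and do not describe the engine's real layout.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # phoneme -> TrieNode
        self.words = []     # words whose pronunciation ends exactly here


class SuffixTrie:
    """Trie keyed on reversed phoneme sequences."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, phones):
        node = self.root
        for phone in reversed(phones):
            node = node.children.setdefault(phone, TrieNode())
        node.words.append(word)

    def with_suffix(self, suffix):
        """Return all words whose pronunciation ends with `suffix`."""
        node = self.root
        for phone in reversed(suffix):
            node = node.children.get(phone)
            if node is None:
                return []
        # Collect every word stored at or below this node.
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            found.extend(current.words)
            stack.extend(current.children.values())
        return sorted(found)


trie = SuffixTrie()
trie.insert("cat", ["K", "AE", "T"])
trie.insert("hat", ["HH", "AE", "T"])
trie.insert("combat", ["K", "AA", "M", "B", "AE", "T"])
trie.insert("dog", ["D", "AO", "G"])

print(trie.with_suffix(["AE", "T"]))
```

The lookup cost depends on the length of the suffix and the number of matches, not on the size of the whole dictionary.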
Why is this hard?
Text to speech for English is still a hard problem to solve, and it is an active area of research. Consider the words rough, through, bough, thought, dough, and cough; or photOgraph and photOgraphy; or physics, lymphatic, and loophole. In the 1980s, and still today in many cases, text to speech was done by hiring specially trained linguists to develop the thousands of rules necessary to create pronunciations. It is only in the last 10 years or so that this task has been automated. My system has over 200,000 hints on how to interpret each part of a word given its context. With further refinements, this could probably be reduced to tens of thousands, which is still a lot.
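To give a feel for what a context-dependent hint might look like, here is a toy Python sketch. The hint format (left context, letters, right context, phonemes), the "most specific match wins" rule, and the handful of hints themselves are all assumptions made for this example; the real system's hints are learned automatically and are not shown here.

```python
# Each hint: (left_context, letters, right_context, phonemes). Hypothetical data.
HINTS = [
    ("", "th", "", ["TH"]),
    ("", "t", "", ["T"]),
    ("", "r", "", ["R"]),
    ("", "d", "", ["D"]),
    ("", "ough", "", ["OW"]),         # default, as in "dough"
    ("r", "ough", "", ["AH", "F"]),   # "rough"
    ("thr", "ough", "", ["UW"]),      # "through"
    ("", "ough", "t", ["AO"]),        # "thought"
]


def guess_pronunciation(word):
    """Greedy left-to-right scan, picking the most specific matching hint."""
    phones, i = [], 0
    while i < len(word):
        best = None
        for left, letters, right, output in HINTS:
            end = i + len(letters)
            if (
                word.startswith(letters, i)
                and word[:i].endswith(left)
                and word.startswith(right, end)
            ):
                score = len(left) + len(letters) + len(right)
                if best is None or score > best[0]:
                    best = (score, letters, output)
        if best is None:
            raise ValueError(f"no hint covers {word[i]!r} in {word!r}")
        phones.extend(best[2])
        i += len(best[1])
    return phones


for w in ["dough", "rough", "through", "thought"]:
    print(w, guess_pronunciation(w))
```

Even this toy needs four separate hints for "ough", and it still cannot handle bough or cough, which hints at why the real table is so large.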