A Rhyming Engine
Here's a rhyming engine, written in 1000 lines of C++ code. It uses the freely available Moby dictionary, and full source code is provided. Give it a try. Read on for technical information.
Please try the updated rhyming web site.
Relationship to Rhymebrain.com
I originally wrote this post in 2007. For the past five years I have been researching linguistics and machine learning techniques and created rhymebrain.com. Rhymebrain, at its core, still uses the same techniques, but it does it much faster and supports any human language. Rhymebrain is propreitary. However, the source code presented with this article is released to the public domain.
How it works
Programs involving the english language are quite interesting, because there really are no rules. English is very complicated. A great resource is the Moby project, which is a public domain dictionary. It includes a text file providing word lists, including pronunciation and parts of speech.Although I demonstrate only the rhyming part, my WordDatabase object uses both the part of speech and pronunciation information. This is for future expansion. Download source code here
Overview
- As part of compilation, a perl script combines the Parts Of Speech and Pronunciation dictionary into a single file, dict.txt.
- The rhyming engine reads dict.txt and, for every word, creates a Word object.
- The Word object has the part of speech. It also contains a representation of the pronunciation of the word.
- To figure out if two words rhyme, we compare the last part of their pronunciation. The more syllables that rhyme, the better the rhyme is.
- The output is sorted so that better rhymes are listed first.
Similar projects
I'm not aware of many similar projects. Somebody named "tuffy" made a rhyming dictionary on sourceforge: rhyme.sourceforge.net. However, for the life of me, I can't figure out why it needs to use an external database package. My technique does not need to pre-compute the rhymes, and it is half as many lines of code.Personally, I use the rhymezone.com rhyming dictionary for my rhyming needs.
The Rhyming API
class WordDatabase { public: WordDatabase(); ~WordDatabase(); bool load(const char* filename); bool findRhymes( DynamicArray& results, const char* word, WordFilter* filter, WordArray* wordList = 0); Word* lookup( const char* whichword ); void filter( WordArray& results, WordFilter* filter ); bool makeWords( WordArray& results, const char* text ); bool loadThesaurus( const char* thesaurus ); void addSynonym( Word* base, Word* synonym ); unsigned getSynonyms( WordArray& results, const char* wordtext ); private: StringMap _wordMap; DynamicArray _wordArray; };
In this introduction, I use the word phoneme to mean a part of a word, as pronounced. Using a sequence of phonemes, and whether each phoneme is emphasized, you can completely describe how to pronounce a word, and hence derive its rhymes and syllables.
Preprocessing
The Moby project has separate files for parts of speech and pronunciation. So I wrote a perl script to combine the two files. The resulting word list only includes words that have both part of speech and pronunciation information.From the readme file, I believe that the Moby pronunciation was derived by a British person. Fortunately, for the purposes of rhyming, it doesn't really matter. "Potato" will rhyme with "Tomato" no matter if you say "Po-tay-to" or "po-tah-toh".
A standard set of phonemes
For this project, I also leave it open to include the CMU dictionary, another free online dictinary. The problem is that these two dictionaries use a different set of phonemes. I had to figure out a mapping so that I could possible combine both dictionaries later on. The mapping is below. The fields are:- An enumeration for the phoneme
- How the phoneme is displayed. This is for debugging purposes only, and is different from what is read in from the dictionary file.
- Whether it represents a syllable.
PhoneSet_t PhoneSet[] = { { a_in_dab, "ae", 1 }, { a_in_air, "ey", 1 }, { a_in_far, "ao", 1 }, { a_in_day, "ay", 1 }, { a_in_ado, "ah", 1 }, { ir_in_tire, "ire", 1 }, { b_in_nab, "b", 0 }, { ch_in_ouch, "ch", 0 }, { d_in_pod, "d", 0 }, { e_in_red, "e", 1 }, { e_in_see, "ee", 1 }, { f_in_elf, "f", 0 }, { g_in_fig, "g", 0 }, { h_in_had, "h", 0 }, { w_in_white, "wh", 0 }, { i_in_hid, "i", 1 }, { i_in_ice, "eye", 1 }, { g_in_vegetably, "g", 0 }, { c_in_act, "k", 0 }, { l_in_ail, "l", 0 }, { m_in_aim, "m", 0 }, { ng_in_bang, "ng", 0 }, { n_in_and, "n", 0 }, { oi_in_oil, "oy", 1 }, { o_in_bob, "aa", 1 }, { ow_in_how, "ow", 1 }, { o_in_dog, "ah", 1 }, { o_in_boat, "oh", 1 }, { oo_in_too, "oo", 1 }, { oo_in_book, "ooh", 1 }, { p_in_imp, "p", 0 }, { r_in_ire, "er", 0 }, { sh_in_she, "sh", 0 }, { s_in_sip, "s", 0 }, { th_in_bath, "dth", 0 }, { th_in_the, "th", 0 }, { t_in_tap, "t", 0 }, { u_in_cup, "uh", 1 }, { u_in_burn, "u", 1 }, { v_in_average, "v", 0 }, { w_in_win, "w", 0 }, { y_in_you, "y", 0 }, { s_in_vision, "zh", 0 }, { z_in_zoo, "z", 0 }, { a_in_ami, "a", 1 }, { n_in_francoise, "n", 0 }, { r_in_der, "r", 0 }, { ch_in_bach, "chh", 0 }, { eu_in_bleu, "eu", 1 }, { u_in_duboise, "u", 1 }, { wa_in_noire, "WA", 1 } };
The combined dictionary
Here's the first few lines of the combined dictionary, which includes 110000 words. Each line has three parts. The first is the word, the second is the part of speech (N is noun, etc) and the third is the pronunciation. Right now, I use only the moby dictionary, so the pronunciation keys are weird characters as described in the moby readme file. Each phoneme is separated by a slash character.A N /eI/ AWOL A '/eI/w/A/l Aachen N '/A/k/@/n Aalborg N '/O/lb/O/rg Aalesund N '/O/l/@/,s/U/n Aalst N /A/lst Aalto N '/A/lt/O/ Aar N /A/r
Syllables
It took a bit of thinking to figure out how to count the syllables in the word. Believe it or not, it can be derived from the phoneme alone. The first approach I tried is to split the word into syllables based on consonents. This didn't work. In fact, you get the best results if you split based on vowels. (Linguists reading this are now shouting at the monitor, "Of course, you fool!") In the PhoneSet table above, I mark a phoneme with a 1 if it represents a new syllable. The number of syllables in a words can be derived by adding up the value of the phoneme.Representing words
On startup, the program takes a second to read the dictionary file. It parses the phoneme (separated by slash characters) into a word structure, where each phoneme is represented by a 16 bit number. If the phoneme begins a new syllable, the upper 2 bits are:- 0 if the phoneme doesn't begin a new syllable
- 1 if the phoneme is a secondary stress
- 2 if the phoneme is of primary stress
Rhyming
Rhyming is not very complex once the words are in the phoneme format. To determine whether two words rhyme, you simple compare the suffixes of their pronunciation. Stresses are included, as well. Amateur song writers often try to rhyme things like "hello" and "yellow" and they come up with horrible lyrics, since the words are stressed differently. Since the stresses are stored with the phonemes, a simple string comparison will take care of this automatically, and these words will not be found to rhyme.Other features
Although I didn't include it here, the word database can optionally load in the moby thesaurus, "mobythes.aur". Thus you can figure out if any word has the same meaning as any other.
You can use a "WordFilter" object to filter the results to have a certain number of syllables, or match a set of stresses. For example, you can request a noun phrase of 5 syllables to match the stresses "0101010101", which is iambic pentameter.
Future work
Lots of stuff can be done. I have noticed in hip hop and rap music, the rhyming rules are relaxed greatly. For example, eminem might consider time and spine to be good rhymes. By redefining more phonemes to be identical, you can emulate this type of rhyming.
Hello Steve,
I came across your website while I was searching for android API calls for RhymeZone.
I am currently using MIT android app inventor to make API calls to other apps on my cell
phone. I made some progress, I can call up RhymeZone android app, but I cannot
send text to retrieve results from search. I was hoping you might be able and willing
to assist me. Here is what I have so far...
Action: android.intent.action.MAIN
ActivityClass: com.rhymezone.rzapp.RhymeZoneActivity
ActivityPackage : com.rhymezone.rzapp
DataType: (text???)
DataUri
ExtraKey: (????)
ExtraValue: (word to pass to rhymezone to search on?)
ResultName: (should return selected word?)
pac_northwest1@yahoo.com
British English: Po-tay-to & Tom-ah-to
It only rhymes if you use the American Tom-ay-to!!!
Who says po-tay-to???? The British certainly don't!!
I'm a student of Computer Science from Indonesia. I'm trying to make poem generator project.
I have a difficluties in implementing rhyming algorithm.
What are you using for this rhyme machine?
Do you have any paper about this project so I can learn it and implemented so I can put your name to my bibliography? :)
Because your project seems so good and have a good algorithm too.
Thx
if ( pRhymeWord != NULL ) // precaution
delete pRhymeWord;
}
Thanx
The code works well. However you need to add the followig code in RhymeWordDatabase's destructor to avoid memory leaks.
-----------------------------
// Clear String Map
m_wordMap.clear(); // clear container
// Clear word array
// This is where we can delete the actual allocations
for ( unsigned int i=0 ; i<m_wordArray.size() ; i++ )
{
RhymeWord* pRhymeWord = m_wordArray[i];
if ( pRhymeWord != NULL ) // precaution
delete pRhymeWord;
}
m_wordArray.clear(); // clear container
------------------------
Also, I've removed the RhymeWordArray class and replaced it with a simple vector. I'm not sure if I won't replace the list as well...
This is a real good way to get introduced to this fild.
Cool.
bye
There is no excuse for my using my own DynamicArray class. I should use STL for this, since the std::vector class is great.
As for StringMap, for a word dictionary containing hundreds of thousands of elements, a hashtable is essential instead of the O(logn) stl::map. Hash tables are non-standard in STL (some C++ libraries do not have std::hash_map). I believe my implementation of StringMap is much faster than std::hash_map, because it is based on a list class that only handles pointers and doesn't need to worry about constructors and destructors for the elements.
I used to be upset at the code bloat that STL caused, although modern compilers have gotten better. My list class will result in smaller code than STL, because it uses a technique that I borrowed from Microsoft's MFC library. Instead of duplicating the entire list class for each type, I make the List base class once for void*. Then, the template List<class PTR> does nothing but cast to the appropriate pointer and call the base class functions. This results in smaller code.
Reading through your code quickly, I see that you define your own list classes instead of STL. Any reason for this? Just curious...
Keep up the good work.
I was wondering how to implement this code through a simple solution through php that does not require any binary files on the server, c++, python, or C.