Exploring sound with Wavelets

Here's a rhyming engine, written in 1000 lines of C++ code. It uses the freely available Moby dictionary, and full source code is provided. Give it a try. Read on for technical information.
Please try the updated rhyming web site.
Relationship to Rhymebrain.com
I originally wrote this post in 2007. For the past five years I have been researching linguistics and machine learning techniques and created rhymebrain.com. Rhymebrain, at its core, still uses the same techniques, but it does it much faster and supports any human language. Rhymebrain is propreitary. However, the source code presented with this article is released to the public domain.
Although I demonstrate only the rhyming part, my WordDatabase object uses both the part of speech and pronunciation information. This is for future expansion. Download source code here
Personally, I use the rhymezone.com rhyming dictionary for my rhyming needs.
class WordDatabase { public: WordDatabase(); ~WordDatabase(); bool load(const char* filename); bool findRhymes( DynamicArray& results, const char* word, WordFilter* filter, WordArray* wordList = 0); Word* lookup( const char* whichword ); void filter( WordArray& results, WordFilter* filter ); bool makeWords( WordArray& results, const char* text ); bool loadThesaurus( const char* thesaurus ); void addSynonym( Word* base, Word* synonym ); unsigned getSynonyms( WordArray& results, const char* wordtext ); private: StringMap _wordMap; DynamicArray _wordArray; };
In this introduction, I use the word phoneme to mean a part of a word, as pronounced. Using a sequence of phonemes, and whether each phoneme is emphasized, you can completely describe how to pronounce a word, and hence derive its rhymes and syllables.
From the readme file, I believe that the Moby pronunciation was derived by a British person. Fortunately, for the purposes of rhyming, it doesn't really matter. "Potato" will rhyme with "Tomato" no matter if you say "Po-tay-to" or "po-tah-toh".
PhoneSet_t PhoneSet[] = { { a_in_dab, "ae", 1 }, { a_in_air, "ey", 1 }, { a_in_far, "ao", 1 }, { a_in_day, "ay", 1 }, { a_in_ado, "ah", 1 }, { ir_in_tire, "ire", 1 }, { b_in_nab, "b", 0 }, { ch_in_ouch, "ch", 0 }, { d_in_pod, "d", 0 }, { e_in_red, "e", 1 }, { e_in_see, "ee", 1 }, { f_in_elf, "f", 0 }, { g_in_fig, "g", 0 }, { h_in_had, "h", 0 }, { w_in_white, "wh", 0 }, { i_in_hid, "i", 1 }, { i_in_ice, "eye", 1 }, { g_in_vegetably, "g", 0 }, { c_in_act, "k", 0 }, { l_in_ail, "l", 0 }, { m_in_aim, "m", 0 }, { ng_in_bang, "ng", 0 }, { n_in_and, "n", 0 }, { oi_in_oil, "oy", 1 }, { o_in_bob, "aa", 1 }, { ow_in_how, "ow", 1 }, { o_in_dog, "ah", 1 }, { o_in_boat, "oh", 1 }, { oo_in_too, "oo", 1 }, { oo_in_book, "ooh", 1 }, { p_in_imp, "p", 0 }, { r_in_ire, "er", 0 }, { sh_in_she, "sh", 0 }, { s_in_sip, "s", 0 }, { th_in_bath, "dth", 0 }, { th_in_the, "th", 0 }, { t_in_tap, "t", 0 }, { u_in_cup, "uh", 1 }, { u_in_burn, "u", 1 }, { v_in_average, "v", 0 }, { w_in_win, "w", 0 }, { y_in_you, "y", 0 }, { s_in_vision, "zh", 0 }, { z_in_zoo, "z", 0 }, { a_in_ami, "a", 1 }, { n_in_francoise, "n", 0 }, { r_in_der, "r", 0 }, { ch_in_bach, "chh", 0 }, { eu_in_bleu, "eu", 1 }, { u_in_duboise, "u", 1 }, { wa_in_noire, "WA", 1 } };
A N /eI/ AWOL A '/eI/w/A/l Aachen N '/A/k/@/n Aalborg N '/O/lb/O/rg Aalesund N '/O/l/@/,s/U/n Aalst N /A/lst Aalto N '/A/lt/O/ Aar N /A/r
Although I didn't include it here, the word database can optionally load in the moby thesaurus, "mobythes.aur". Thus you can figure out if any word has the same meaning as any other.
You can use a "WordFilter" object to filter the results to have a certain number of syllables, or match a set of stresses. For example, you can request a noun phrase of 5 syllables to match the stresses "0101010101", which is iambic pentameter.
Hello Steve,
I came across your website while I was searching for android API calls for RhymeZone.
I am currently using MIT android app inventor to make API calls to other apps on my cell
phone. I made some progress, I can call up RhymeZone android app, but I cannot
send text to retrieve results from search. I was hoping you might be able and willing
to assist me. Here is what I have so far...
Action: android.intent.action.MAIN
ActivityClass: com.rhymezone.rzapp.RhymeZoneActivity
ActivityPackage : com.rhymezone.rzapp
DataType: (text???)
DataUri
ExtraKey: (????)
ExtraValue: (word to pass to rhymezone to search on?)
ResultName: (should return selected word?)
pac_northwest1@yahoo.com
British English: Po-tay-to & Tom-ah-to
It only rhymes if you use the American Tom-ay-to!!!
Who says po-tay-to???? The British certainly don't!!
I'm a student of Computer Science from Indonesia. I'm trying to make poem generator project.
I have a difficluties in implementing rhyming algorithm.
What are you using for this rhyme machine?
Do you have any paper about this project so I can learn it and implemented so I can put your name to my bibliography? :)
Because your project seems so good and have a good algorithm too.
Thx
if ( pRhymeWord != NULL ) // precaution
delete pRhymeWord;
}
Thanx
The code works well. However you need to add the followig code in RhymeWordDatabase's destructor to avoid memory leaks.
-----------------------------
// Clear String Map
m_wordMap.clear(); // clear container
// Clear word array
// This is where we can delete the actual allocations
for ( unsigned int i=0 ; i<m_wordArray.size() ; i++ )
{
RhymeWord* pRhymeWord = m_wordArray[i];
if ( pRhymeWord != NULL ) // precaution
delete pRhymeWord;
}
m_wordArray.clear(); // clear container
------------------------
Also, I've removed the RhymeWordArray class and replaced it with a simple vector. I'm not sure if I won't replace the list as well...
This is a real good way to get introduced to this fild.
Cool.
bye
There is no excuse for my using my own DynamicArray class. I should use STL for this, since the std::vector class is great.
As for StringMap, for a word dictionary containing hundreds of thousands of elements, a hashtable is essential instead of the O(logn) stl::map. Hash tables are non-standard in STL (some C++ libraries do not have std::hash_map). I believe my implementation of StringMap is much faster than std::hash_map, because it is based on a list class that only handles pointers and doesn't need to worry about constructors and destructors for the elements.
I used to be upset at the code bloat that STL caused, although modern compilers have gotten better. My list class will result in smaller code than STL, because it uses a technique that I borrowed from Microsoft's MFC library. Instead of duplicating the entire list class for each type, I make the List base class once for void*. Then, the template List<class PTR> does nothing but cast to the appropriate pointer and call the base class functions. This results in smaller code.
Reading through your code quickly, I see that you define your own list classes instead of STL. Any reason for this? Just curious...
Keep up the good work.
I was wondering how to implement this code through a simple solution through php that does not require any binary files on the server, c++, python, or C.