Fighting Blog Comment Spam with Qwen3 and Ollama

I've tried various ways of fighting comment spam over the years. For a while, having written my own blog software in PHP meant the problem didn't exist. The bots would relentlessly try submitting comments via WordPress URLs, which my system didn't understand.
When that stopped working, I introduced a hidden email field that only scripts could see. If it was filled out, I could ignore the comment right away. Somehow, that stopped working too.
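For anyone who hasn't seen the hidden-field trick, here is a minimal sketch of the check in Python. The blog itself is PHP, so this is only an illustration: the `form` dictionary and the function name are hypothetical, and the field name "email" is taken from the description above.

```python
def is_probable_bot(form, honeypot_field="email"):
    """Return True when the hidden honeypot field was filled in.

    `form` is a hypothetical dict of submitted form values. Humans never
    see the hidden field, so any non-blank value suggests a script.
    """
    return bool(form.get(honeypot_field, "").strip())
```

A submission that trips the check can be dropped before it ever reaches the database.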
I built a user interface where I could click a delete button on each comment and it would implode on my screen. It was satisfying, but I still had to log in once in a while and do it. So the comments built up, and I kind of gave up.

Luckily, I happen to have an AI rig in my bedroom, with Ollama and several models installed. I decided to put it to work. I won't go into the details here. There is no need! Here is the prompt I mashed into Visual Studio Copilot, with Claude Sonnet 3.7 agent mode selected. Note that I had lost the database schema of my blog, so I didn't even know how the comments were stored.
- Carefully examine the BlogDb class and deduce the schema of the Comments table.
- Get the details of the database from the index.php file, including the connection details.
- Write a python script. On startup, it will connect to the database. If not present, it will add the "classification" string-type column to the comments table, with default value "".
- Devise a prompt that will take blog article contents, and a comment, and decide if the comment is spam or not.
- The python script will then iterate through all comments that have the empty-string classification field. It will forward each one and the spam classification prompt to an ollama server at https://example.com. Use the model "qwen3:32b".
- The python script will, based on the returned classification, update the classification field of the comment in the database to "spam" or "not-spam"
Ten minutes later, the script was written. It then spent the next few hours classifying all the comments (about eight seconds each) and writing a detailed analysis of its reasoning for each one. The analyses were fascinating to read.
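For a rough idea of what a script like this can look like, here is a minimal sketch. It is not the script Copilot generated, and several things in it are assumptions: `sqlite3` stands in for whatever database the blog really uses, the `articles` and `comments` tables and their columns are hypothetical (the real schema had to be deduced from the BlogDb class), and the prompt wording and "VERDICT:" convention are my own. The `/api/generate` path with `"stream": false` is Ollama's standard non-streaming generate endpoint.

```python
import json
import re
import sqlite3
import urllib.request

# Host taken from the prompt above; /api/generate is Ollama's generate endpoint.
OLLAMA_URL = "https://example.com/api/generate"
MODEL = "qwen3:32b"

# Hypothetical prompt wording; the original classification prompt is not shown.
PROMPT_TEMPLATE = """You are moderating comments on a blog.

Article:
{article}

Comment:
{comment}

Explain your reasoning briefly, then end with a final line that is exactly
"VERDICT: spam" or "VERDICT: not-spam"."""


def ensure_classification_column(db):
    """Add the classification column if the comments table lacks it."""
    columns = [row[1] for row in db.execute("PRAGMA table_info(comments)")]
    if "classification" not in columns:
        db.execute(
            "ALTER TABLE comments ADD COLUMN classification TEXT NOT NULL DEFAULT ''"
        )


def parse_verdict(reply):
    """Return 'spam', 'not-spam', or '' when no verdict line is found."""
    # Qwen3 may wrap its reasoning in <think> tags; ignore that part.
    visible = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL)
    matches = re.findall(r"VERDICT:\s*(not-spam|spam)", visible, flags=re.IGNORECASE)
    return matches[-1].lower() if matches else ""


def ask_ollama(prompt):
    """Send one non-streaming request to Ollama and return the reply text."""
    body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def classify_pending(db, ask=ask_ollama):
    """Classify every comment whose classification is still the empty string."""
    ensure_classification_column(db)
    rows = db.execute(
        "SELECT c.id, a.body, c.body FROM comments c "
        "JOIN articles a ON a.id = c.article_id "
        "WHERE c.classification = ''"
    ).fetchall()
    for comment_id, article, comment in rows:
        reply = ask(PROMPT_TEMPLATE.format(article=article, comment=comment))
        verdict = parse_verdict(reply)
        if verdict:  # leave unparseable replies empty so a later run retries them
            db.execute(
                "UPDATE comments SET classification = ? WHERE id = ?",
                (verdict, comment_id),
            )
    db.commit()
```

Running it would be a matter of opening a connection and calling `classify_pending(db)`. Passing the `ask` function as a parameter makes it easy to swap in a fake model when trying the database logic without the server.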

