Fighting Blog Comment Spam with Qwen3 and Ollama

Posted one month ago

I'll admit it. I've been neglecting my blog. I went back recently and found a wall of comment spam.

I've tried various ways of fighting it over the years. For a while, having written my own blog software in PHP meant the problem didn't exist. The bots would relentlessly try submitting comments via WordPress URLs, which my system didn't understand.

When that stopped working, I introduced a hidden email field that only scripts could see. If it was filled out, I could ignore the comment right away. Eventually, that stopped working too.
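The hidden-field trick comes down to one check on the submitted form. Here is a minimal sketch in Python (my blog is PHP, so this is an illustration rather than the real code); the function name and the form-as-dictionary shape are made up for the example.

```python
def is_bot_submission(form, honeypot_field="email"):
    """Return True when the hidden honeypot field was filled in.

    `form` is assumed to be a dict of submitted field names to values.
    Real visitors never see the field, so they leave it empty.
    """
    return bool(form.get(honeypot_field, "").strip())
```

A comment handler would call this first and silently drop the submission when it returns True.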

I built a user interface where I could click a delete button on each comment and watch it implode on my screen. It was satisfying, but it still required me to log in once in a while and do it. So the comments built up, and I kind of gave up.

Luckily, I happen to have an AI rig in my bedroom, with Ollama and several models installed. I decided to set it to work. I won't go into the details here. There is no need! Here is the prompt I mashed into Visual Studio Copilot, with Claude Sonnet 3.7 agent mode selected. Note that I had lost the database schema of my blog, so I didn't even know how the comments were stored.

Carefully examine the BlogDb class and deduce the schema of the Comments table.
Get the details of the database from the index.php file, including the connection details.
Write a python script. On startup, it will connect to the database. If not present, it will add the "classification" string-type column to the comments type, with default value "".
Devise a prompt that will take blog article contents, and a comment, and decide if the comment is spam or not.
The python script will then iterate through all comments that have the empty-string classification field. It will forward each one and the spam classification prompt to an ollama server at https://example.com. Use the model "qwen3:32b".
The python script will, based on the returned classification, update the classification field of the comment in the database to "spam" or "not-spam"

Ten minutes later, the script was done. It then spent the next few hours classifying all the comments (about eight seconds each) and writing a detailed analysis of its reasoning for each one. The analyses were fascinating to read.
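For the curious, here is a rough sketch of the kind of script that prompt describes. It is not the generated script. It assumes SQLite and a made-up schema (articles and comments tables with id, body and article_id columns), and the prompt wording, function names and reply parsing are all illustrative. The only parts taken from the prompt above are the server address, the model name and the classification column. The request uses Ollama's /api/generate endpoint with streaming turned off.

```python
import json
import re
import sqlite3
import urllib.request

OLLAMA_URL = "https://example.com/api/generate"  # server from the prompt above
MODEL = "qwen3:32b"

# Illustrative wording only; not the prompt the real script used.
PROMPT_TEMPLATE = """You are moderating comments on a blog.

Article:
{article}

Comment:
{comment}

Is the comment spam? Explain briefly, then finish with a final line
containing exactly one word: spam or not-spam."""


def ask_ollama(prompt):
    """Send one prompt to the Ollama server and return the generated text."""
    body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())["response"]


def parse_verdict(text):
    """Reduce the model's reply to "spam", "not-spam", or "" if unclear."""
    # Qwen3 may wrap its reasoning in <think> tags; drop that part first.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    lines = [line.strip().lower() for line in text.splitlines() if line.strip()]
    last = lines[-1] if lines else ""
    if "not-spam" in last or "not spam" in last:
        return "not-spam"
    if "spam" in last:
        return "spam"
    return ""  # leave the comment unclassified so a later run retries it


def ensure_column(db):
    """Add the classification column if the comments table lacks it."""
    columns = [row[1] for row in db.execute("PRAGMA table_info(comments)")]
    if "classification" not in columns:
        db.execute(
            "ALTER TABLE comments ADD COLUMN classification TEXT NOT NULL DEFAULT ''"
        )


def classify_pending(db, ask=ask_ollama):
    """Classify every comment whose classification is still empty."""
    ensure_column(db)
    pending = db.execute(
        "SELECT c.id, a.body, c.body FROM comments c "
        "JOIN articles a ON a.id = c.article_id "
        "WHERE c.classification = ''"
    ).fetchall()
    for comment_id, article, comment in pending:
        prompt = PROMPT_TEMPLATE.format(article=article, comment=comment)
        verdict = parse_verdict(ask(prompt))
        if verdict:
            db.execute(
                "UPDATE comments SET classification = ? WHERE id = ?",
                (verdict, comment_id),
            )
    db.commit()
```

Passing the model call in as the `ask` argument means the loop can be tried against a stub before pointing it at a real server. Because only empty classifications are selected, the script can be interrupted and rerun without redoing finished comments.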

Steve Hanov makes a living working on Rhymebrain.com, rapt.ink, www.websequencediagrams.com, and Zwibbler.com. He lives in Waterloo, Canada.


Test Driven Development without Tears

Every company that I have worked for has had its own method of testing, and I've gained a lot of experience in what works and what doesn't. At last, that stack of conflicting confidentiality agreements that I got as a co-op student has expired, so I can talk about it. (I never signed them anyway.)

Pitching to VCs #2 (comic)

How a programmer reads your resume (comic)

People thought it was a comic, so I never corrected them.

Throw away the keys: Easy, Minimal Perfect Hashing

Perfect hashing is a technique for building a hash table with no collisions in the minimum possible space. Perfect hash tables are easy to build with this simple Python function.

Compressing dictionaries with a DAWG

A practical, memory efficient way to store and search large sets of words.

When a reporter mangles your elevator pitch

If a reporter asks you about your new startup company, be careful what you say.

Four ways of handling asynchronous operations in node.js

JavaScript was not designed to do asynchronous operations easily. If it were, then writing asynchronous code would be as easy as writing blocking code. Instead, developers in node.js need to manage many levels of callbacks. Today, we will examine four different methods of performing the same task asynchronously in node.js.

Spoke.com scam

Rant: Why do companies think they can make money by posting false information about you on the Internet?

I didn't know you could mix and match (comic)

My favourite Google Cardboard Apps

I have never been a gamer. The most I've played was Super Mario Bros (the original). I then took a break for a decade or two and spent a few weeks with SimCity 4. All that changed when I got Google Cardboard.