Fighting Blog Comment Spam with Qwen3 and Ollama

Make a web page screenshot service

I'll walk you step by step through making a service that takes screenshots of web pages and returns them as images.

Automatically remove wordiness from your writing

Shorten your writing with this tool, made well before AI was popular.

I found Security Vulnerability in your web application

How to detect if an object has been garbage collected in Javascript

If you are writing an application in Javascript, soon you will have to worry about memory leaks. But it is difficult to even know if a memory leak exists. This handy method can help.

My favourite Google Cardboard Apps

I have never been a gamer. The most I've played was Super Mario Bros (the original). I then took a break for a decade or two and spent a few weeks with Simcity 4. All that changed when I got Google Cardboard.

O(n) Delta Compression With a Suffix Array

The difference between two sequences A and B can be compactly stored using COPY/INSERT operations. The greedy algorithm for finding these operations relies on an efficient way of finding the longest matching part of A for any given position in B. This article describes how to use a suffix array to find the optimal sequence of operations in time proportional to the length of the input sequences. As a preprocessing step, we find and store the longest match in A for every position in B in two passes over the suffix array.
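To make the COPY/INSERT idea concrete, here is a minimal sketch of the greedy encoder. It is not the article's method: it swaps the suffix array for a naive quadratic search, so it only illustrates the shape of the output. The names `longest_match`, `make_delta`, `apply_delta` and the `min_copy` threshold are assumptions made up for this example.

```python
def longest_match(a, b, pos):
    """Naive stand-in for the suffix array: find the longest run in a
    that matches b starting at pos. Returns (offset_in_a, length)."""
    best_offset, best_length = 0, 0
    for start in range(len(a)):
        length = 0
        while (start + length < len(a) and pos + length < len(b)
               and a[start + length] == b[pos + length]):
            length += 1
        if length > best_length:
            best_offset, best_length = start, length
    return best_offset, best_length


def make_delta(a, b, min_copy=4):
    """Greedily describe b as COPY (from a) and INSERT (literal) operations.
    min_copy is a hypothetical threshold: shorter matches are stored as text."""
    ops = []
    literal = []
    pos = 0
    while pos < len(b):
        offset, length = longest_match(a, b, pos)
        if length >= min_copy:
            if literal:
                ops.append(("INSERT", "".join(literal)))
                literal = []
            ops.append(("COPY", offset, length))
            pos += length
        else:
            literal.append(b[pos])
            pos += 1
    if literal:
        ops.append(("INSERT", "".join(literal)))
    return ops


def apply_delta(a, ops):
    """Rebuild b from a and the list of operations."""
    out = []
    for op in ops:
        if op[0] == "COPY":
            _, offset, length = op
            out.append(a[offset:offset + length])
        else:
            out.append(op[1])
    return "".join(out)


old = "the quick brown fox"
new = "the quick red fox jumps"
delta = make_delta(old, new)
assert apply_delta(old, delta) == new
```

The naive search costs time proportional to the product of the two lengths, which is exactly the cost the suffix array is meant to remove.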

Finding Bieber: On removing duplicates from a set of documents

Using a locality sensitive hash, you can mark duplicates in millions of items in no time.

Let's read a Truetype font file from scratch

Walkthrough of reading and interpreting a TrueType font file in a few lines of Javascript.

A Quick Measure of Sortedness

How do you measure the "sortedness" of a list? There are several ways. In the literature this measure is called the "distance to monotonicity" or the "measure of disorder" depending on who you read. Here, I propose another measure for sortedness.

My thoughts on various programming languages

Some ill-informed remarks on various programming languages.

A little VIM hacking

The strange man reading a novel in the meeting room

Why is a visitor reading a novel all week in the meeting room?

You can cheat so your web site seems faster than it is

You can make your web site seem faster without actually being faster.

Yes, You Absolutely Might Possibly Need an EIN to Sell Software to the US

After many months, your software sale is complete! You've got a purchase order, sent the invoice, delivered the software. You're already handling some support issues from users at BigCorp. Then BANG! Martha from Procurement emails back, as a favour, just to let you know that BigCorp has not received your W8 form with a valid tax id, and therefore will be withholding 30% of the purchase price of your multi-thousand dollar product for taxes.

Asana's shocking pricing practices, and how you can get away with it too

If one apple costs $1, how much would five apples cost? How about 500? In everyday life, when you buy more of something, you get more bananas for your buck. But software companies are bucking the trend.

5 Ways PowToon Made Me Want to Buy Their Software

Even though I saw through their tricks at every step along the way, I am now a customer and proud of it. It is worthwhile to look at what they did, because these are simple things that you can do to improve your software business.

How I run my business selling software to Americans

Here's what you can do to get the most out of your business in Canada if all of your revenue comes in US dollars.

0, 1, Many, a Zillion

It's common wisdom that there should only be three numbers in source code. But there are actually four. Here's why.

Give your Commodore 64 new life with an SD card reader

Dust off your old Commodore 64, and you could be the coolest kid on the block by plugging SD cards into it instead of floppies.

20 lines of code that will beat A/B testing every time

A/B testing is used far too often, for something that performs so badly. It is defective by design: Segment users into two groups. Show the A group the old, tried and true stuff. Show the B group the new whiz-bang design with the bigger buttons and slightly different copy. After a while, take a look at the stats and figure out which group presses the button more often. Sounds good, right? The problem is staring you in the face. It is the same dilemma faced by researchers administering drug studies. During drug trials, you can only give half the patients the life saving treatment. The others get sugar water. If the treatment works, group B lost out. This sacrifice is made to get good data. But it doesn't have to be this way.

[comic] Appreciation of xkcd comics vs. technical ability

VP trees: A data structure for finding stuff fast

Let's say you have millions of pictures of faces tagged with names. Given a new photo, how do you find the name of the person that the photo most resembles?

In the cases I mentioned, each record has hundreds or thousands of elements: the pixels in a photo, or patterns in a sound snippet, or web usage data. These records can be regarded as points in high dimensional space. When you look at points in space, they tend to form clusters, and you can infer a lot by looking at ones nearby.

Why you should go to the Business of Software Conference Next Year

Most people, having already paid $2000.00 of their hard earned money, and then having flown, driven, or otherwise travelled to Boston to attend a conference, and then having paid an additional $250/night plus $33/night parking and "tourism taxes" to the Seaport Hotel -- most people, after all this, are unlikely to say that it was a waste of time and they should have stayed home watching the remaining salvaged episodes of Doctor Who on Netflix.

In fact, I found it quite useful.

Four ways of handling asynchronous operations in node.js

Javascript was not designed to do asynchronous operations easily. If it were, then writing asynchronous code would be as easy as writing blocking code. Instead, developers in node.js need to manage many levels of callbacks. Today, we will examine four different methods of performing the same task asynchronously, in node.js.

Zero load time file formats

When your app needs to be fast, you can't afford to load things from disk. In this toy example, an on-disk data structure helps you instantly look up lists of related words.

Finding the top K items in a list efficiently

Do you use sort() to find the top results? Here's a simple trick that will make your software run much faster.
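The blurb does not say which trick the article uses, so treat the following as one common approach rather than the author's: keep a small heap of the best k items seen so far instead of sorting everything. Python's standard `heapq.nlargest` does this; the wrapper name `top_k` is made up for the example.

```python
import heapq


def top_k(items, k, key=None):
    """Return the k largest items, biggest first, without sorting the whole list.
    Runs in roughly O(n log k) time instead of O(n log n) for a full sort."""
    return heapq.nlargest(k, items, key=key)


scores = [17, 3, 42, 8, 99, 23, 4]
print(top_k(scores, 3))  # [99, 42, 23]
```

The saving matters most when k is tiny compared with the list, such as showing ten search results out of millions of candidates.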

An instant rhyming dictionary for any web site

Sometimes your API has to be simple enough for non-technical people to use it. Find out how to include a rhyming dictionary on your web page just by copying and pasting.

Succinct Data Structures: Cramming 80,000 words into a Javascript file.

jQuery creator John Resig needs a little help storing lists of words in his side project. Let's go overkill and explore a little known branch of computer science called Succinct Data Structures.

Throw away the keys: Easy, Minimal Perfect Hashing

Perfect hashing is a technique for building a hash table with no collisions in the minimum possible space. Such tables are easy to build with this simple Python function.

Why don't web browsers do this?

Why don't web pages start as fast as this computer from 1984?

Fun with Colour Difference

Are you looking for a nifty way to choose colours that stand out? Are you the type of person who is not satisfied until you have mathematically proven that your choice is optimal?

Compressing dictionaries with a DAWG

A practical, memory efficient way to store and search large sets of words.

Fast and Easy Levenshtein distance using a Trie

If you have a web site with a search function, you will rapidly realize that most mortals are terrible typists. Many searches contain misspelled words, and users will expect these searches to magically work. This magic is often done using Levenshtein distance. In this article, I'll compare two ways of finding the closest matching word in a large dictionary. I'll describe how I use it on rhymebrain.com.
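For readers who have not met Levenshtein distance, here is a minimal sketch of the textbook row-by-row computation, plus a brute-force scan for the closest word. The trie-based speedup from the title is not shown, and the function names are assumptions for this example.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions
    needed to turn a into b, computed one table row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            insert_cost = current[j - 1] + 1
            delete_cost = previous[j] + 1
            replace_cost = previous[j - 1] + (ca != cb)
            current.append(min(insert_cost, delete_cost, replace_cost))
        previous = current
    return previous[-1]


def closest_word(query, words):
    """Brute force: compare the query against every word in the dictionary."""
    return min(words, key=lambda word: levenshtein(query, word))


print(levenshtein("kitten", "sitting"))  # 3
print(closest_word("mispelled", ["spelling", "misspelled", "smelled"]))  # misspelled
```

The brute-force scan recomputes the whole table for every dictionary word, which is the waste that sharing work across common prefixes is meant to avoid.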

The Curious Complexity of Being Turned On

In software, the simplest things can turn into a nightmare, especially at a large company.

Cross-domain communication the HTML5 way

Making a web application mashable -- useable in another web page -- has some challenges in the area of cross-domain communications. Here is how I solved those problems for Zwibbler.com, using HTML5 cross domain communication.

Five essential steps to prepare for your next programming interview

They put you in a room, give you a problem, and stare at you while you fumble around with markers on a whiteboard for 45 minutes. With a little preparation, you'll look like a pro.

Finding awesome developers in programming interviews

In a job interview, I once asked a very experienced embedded software developer to write a program that reverses a string and prints it on the screen. He struggled with this basic task. This man was awesome. Give him a bucket of spare parts, and he could build a robot and program it to navigate around the room. He had worked on satellites that are now in actual orbit. He could have coded circles around me. But the one thing that he had never, ever needed to do was: display something on the screen.

Compress your JSON with automatic type extraction

JSON is a horribly inefficient data format for data exchange between a web server and a browser. Here's how you can fix it.

"Your program is stupid. It doesn't work," my wife told me

The simple and obvious way to walk through a graph

At some point in your programming career you may have to go through a graph of items and process them all exactly once. If you keep following neighbours, the path might loop back on itself, so you need to keep track of which ones have been processed already.
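The bookkeeping described above can be sketched in a few lines. This is a generic breadth-first walk with a visited set, not necessarily the article's code; the adjacency-dictionary representation and the name `walk` are assumptions for the example.

```python
from collections import deque


def walk(graph, start):
    """Visit every node reachable from start exactly once, even if the graph
    has cycles. graph maps each node to a list of its neighbours."""
    visited = {start}          # nodes already queued, so they are never queued twice
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)     # "process" the node here
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return order


# A small graph with a loop: a-b-d-c-a
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(walk(graph, "a"))  # ['a', 'b', 'c', 'd']
```

Marking a node as visited when it is queued, rather than when it is processed, keeps the same node from being added to the queue twice.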

Asking users for steps to reproduce bugs, and other dumb ideas

You can fix impossible bugs, if you really try.

Creating portable binaries on Linux

Distributing applications on Linux is hard. Sure, with modern package management, installing software is easy. But if you are distributing an application, you probably need one Windows version, plus umpteen different versions for Linux. In this article, we'll create a dummy application that targets the following operating systems, which are commonly used in business environments...

Bending over: How to sell your software to large companies

For a micro-ISV, selling to businesses can be more lucrative than selling to consumers. Instead of making a few dollars per sale and hoping for thousands of sales, you sell to only a few customers, and charge much higher rates. But the rates are high for a reason. It takes more time and money to sell to businesses.

Regular Expression Matching can be Ugly and Slow

If you open the first few pages of O'Reilly's Beautiful Code, you will find a well written chapter by Brian Kernighan (Personal motto: "No, I didn't invent C. Who told you that?"). The non-C inventing professor describes how a limited form of regular expressions can be implemented elegantly in only a few lines of C code.

C++: A language for next generation web apps

On Monday, I was pleased to be an uninvited speaker at Waterloo Devhouse, hosted in Postrank's magnificent office. After making some surreptitious alterations to their agile development wall, I gave a tongue-in-cheek talk on how C++ can fit in to a web application.

qb.js: An implementation of QBASIC in Javascript

Play NIBBLES.BAS in your browser. I re-implemented a small part of QBASIC as a compiler in Javascript, so it runs in a webpage.

Zwibbler: A simple drawing program using Javascript and Canvas

Now it's a commercial product, but Zwibbler was once a fun side-project, and here are some details on its implementation.

You don't need a project/solution to use the VC++ debugger

You learn a lot of things on the job as a programmer. Years ago, at my first coop position, I was a little confused when my boss went to Visual C++, and tried to open the .EXE file as a project. What a dolt! I thought. That's not going to work.

Boring Date (comic)

barcamp (comic)

How IE <canvas> tag emulation works

At the time of this writing, Internet Explorer at version 8.0 still lacks the <canvas> tag. But you can easily add the capability by including a short javascript file in your page. At first glance, that's astounding. How do you implement an entire vector graphics API in a few lines of Javascript?

I didn't know you could mix and match (comic)

Sign here (comic)

It's a dirty job... (comic)

The PenIsland Problem: Text-to-speech for domain names

Recently, I was contracted to run a list of domain names through the custom-built pronunciation engine that powers my rhyming web site. On the first attempt, I found that the results were embarrassingly bad. A quick inspection revealed the problem: most domain names are severalwordsstucktogether.

Pitching to VCs #2 (comic)

Building a better rhyming dictionary

Back in 2007, I created a rhyming engine based on the public domain Moby pronouncing dictionary. It simply reads the dictionary and looks for rhyming words by comparing the suffix of the words' pronunciations. Since that time, I have made some improvements.

Does Android team with eccentric geeks? (comic)

Comment spam defeated at last

For years when running this blog, I would have to log in each day and delete a dozen comments due to spam. This was a chore, and I tried many ways to stem the tide.

Pitching to VCs (comic)

How QBASIC almost got me killed

The day arrived when my project was ready to be unleashed upon the world. I waited until the teacher was hovering nearby and then I started my application, running the FORMAT command on the network drive. Some classmates were watching the screen and she hurried over to see what all the fuss was about.

Blame the extensions (comic)

How to run a linux based home web server

Sometimes you need complete control over the server, and don't want to pay $20 to $40 a month for a VPS. In this article, I'll describe step by step how to set up a home web server using Ubuntu, capable of handling modest spikes in traffic.

Microsoft's generosity knows no end for a year (comic)

Using the Acer Aspire One as a web server

A netbook can be ideal for a home web server. They are cheap, and use less power than a CFL light bulb.

When programmers design web sites (comic)

Finding great ideas for your startup

"I just don't have any ideas." This is the #1 stumbling block for budding entrepreneurs. Here are a few techniques to get the creative juices flowing.

Game Theory, Salary Negotiation, and Programmers

When you get a new job, you can breathe a sigh of relief, but not for long. You have an offer letter in your hand, and it is easy to miss one of the most important opportunities of your life: the starting salary. Here's what to do to increase your chances.

Coding tips they don't teach you in school

Some time-saving shortcuts for C code that will make your coworkers scream. In Awe.

When a reporter mangles your elevator pitch

If a reporter asks you about your new startup company, be careful what you say.

Test Driven Development without Tears

Every company that I worked for has its own method of testing, and I've gained a lot of experience in what works and what doesn't. At last, that stack of conflicting confidentiality agreements that I got as a coop student have now all expired, so I can talk about it. (I never signed them anyway.)

Drawing Graphs with Physics

To my surprise, I found that there is a very simple way to arrange graphs that can be expressed in only a few lines of code, using force-directed placement...

Keeping Abreast of Pornographic Research in Computer Science

Burgeoning numbers of Ph.D's and grad students are choosing to study pornography. Techniques for the analysis of "objectionable images" are gaining increased attention (and grant money) from governments and research institutions around the world, as well as Google. But what, exactly, does computer science have to do with porn? In the name of academic pursuit, let's roll up our sleeves and plunge deeply into this often hidden area that lies between the covers of top-shelf research journals.

Exploiting perceptual colour difference for edge detection

Think colour isn't important in image processing algorithms? Let's try it both ways, and see for yourself.

Experiment: Deleting a post from the Internet

Once you post something on the Internet, it is hard to get rid of it. As an experiment, I deleted one of my past posts, and I tried to remove all traces of it.

Is 2009 the year of Linux malware?

Is 2009 the year of the linux desktop malware? How long until we see headlines like, "Researchers find massive botnet based on linux 2.30"?

Email Etiquette

If you begin your emails with "Hi, <name>!" then they will seem less rude.

How a programmer reads your resume (comic)

People thought it was a comic, so I never corrected them.

How wide should you make your web page?

Based on 22500 unique IP addresses over the past week.

Usability Nightmare: Xfce Settings Manager

Rant: Why can't anyone make a good settings screen?

cairo blur image surface

This really should have been included in cairo. Instead, everyone that wants to have shadows has to roll their own blur function. Here's my take on it. I'll even release this into the public domain.

Why Perforce is more scalable than Git

Branching on Perforce is kind of like performing open heart surgery. But here's why git can't hope to compete with it.

Optimizing Ubuntu to run from a USB key or SD card

Fortunately, by following the tips below, you can make your USB or SD card based linux system fly!

UMA Questions Answered

A bunch of questions answered about UMA wireless technology.

See sound without drugs

I have created an application that just turns on the microphone and continually plots the FFT magnitude of what it records. It allows control over the window size and sampling rate.

Stock Picking using Python

Python can tell you which stocks to buy. It's a sure thing!

Spoke.com scam

Rant: Why do companies think they can make money by posting false information about you on the Internet?

Copy a cairo surface to the windows clipboard

I just spent several hours debugging clipboard copy of a DIB image. I could copy from my application, and paste into Paint. I could paste into Word. But if I pasted into WordPad, nothing showed up. If I pasted into GIMP, it crashed.

Simulating freehand drawing with Cairo

Free, Raw Stock Data

Scraping financial information is easy with my friend, python.

Why are all my lines fuzzy in cairo?

Make sure your lines are sharp using this simple trick.

A simple command line calculator

A textbook example of recursive descent parsing.
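As a rough illustration of what recursive descent looks like (not the article's code), here is a minimal calculator with one method per grammar rule. The grammar, the class name `Parser`, and the helper names are assumptions chosen for the sketch.

```python
import re

# Grammar (assumed for this sketch):
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')' | '-' factor
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    def factor(self):
        token = self.take()
        if token == "-":
            return -self.factor()
        if token == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is None or token in ("+", "*", "/", ")"):
            raise SyntaxError("expected a number")
        return float(token)


def calculate(text):
    """Evaluate an arithmetic expression given as a string."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return value


print(calculate("2 + 3 * (4 - 1)"))  # 11.0
```

Operator precedence falls out of the call structure: `expr` calls `term`, which calls `factor`, so multiplication binds tighter than addition without any precedence table.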

Tool for Creating UML Sequence Diagrams

If you have to draw something called "UML Sequence Diagrams" for work or school, you already know that it can take hours to get a diagram to look right. Here's a web site that will save you some time.

Exploring sound with Wavelets

Here's a program to create scalograms of sound files.

UMA and free long distance

What's to stop me from travelling to another continent, and then making free long distance calls to local numbers back home? Technically, nothing.

UMA's dirty secrets

Recently, many carriers have started offering UMA, or WiFi phones. These are cell phones with WiFi capabilities. Don't be fooled -- you won't be able to get free calls and run skype on them. The UMA technology is meant to extend the carrier's cellular network into your home using your broadband internet connection.

Installing the Latest Debian on an Ancient Laptop

The challenge: Install Linux on a really old laptop. The catch: It has only 32 MB of RAM, no network ports, no CD-ROM, and the floppy drive makes creaking noises. Is it possible? Yes. Is it easy? No. Is it useful? Maybe...

Experiments in making money online

Is it possible to make money on the internet, if you try really hard? I want to find out. I have always been interested in getting money for doing nothing.

Draw waveforms and hear them

A while back I thought it would be interesting to be able to draw arbitrary waveforms and then listen to how they sound. I had an audio engine just lying around, so I whipped up a quick application to do that.

Cell Phones on Airplanes

Much ink has been spilled about the use of cell phones on airplanes. Here's the truth, which will be disappointing to conspiracy theorists: Cell phone signals most definitely have an effect on other electronic equipment. Read on for more.

Detecting C++ memory leaks

It's fairly simple to redefine malloc() and free() to your own functions, to track the file and line number of memory leaks.

What does your phone number spell?

Here, I explain a technique for figuring out which words are in which phone numbers. Full C source code is included.
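The article's C code is not reproduced here. As a small sketch of one way to approach the problem, the example below converts each dictionary word to its keypad digits and looks for those digits inside the phone number. The function names are made up, and the word list is assumed to contain only the letters a to z.

```python
# Standard telephone keypad layout.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def word_to_digits(word):
    """Translate a word into the digits you would press to spell it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def words_in_number(number, words):
    """Return (position, word) pairs for every word whose digits appear in
    the phone number. Punctuation in the number is ignored."""
    digits = "".join(ch for ch in number if ch.isdigit())
    found = []
    for word in words:
        index = digits.find(word_to_digits(word))
        if index != -1:
            found.append((index, word))
    return sorted(found)


print(word_to_digits("flowers"))  # 3569377
print(words_in_number("1-800-356-9377", ["flowers", "low", "cat", "we"]))
# [(4, 'flowers'), (5, 'low'), (7, 'we')]
```

Going from words to digits avoids enumerating every letter combination for a number, which grows exponentially with its length.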

A Rhyming Engine

Here's a rhyming engine, written in 1000 lines of C++ code. It uses the freely available Moby dictionary, and full source code is provided.

Rules for Effective C++

The rules for safe C++ code are surprisingly controversial.

Cell Phone Secrets

How to choose a cell phone in 2006, if you want the best possible radio.