As far as I know I made it up, but I stand ready to be surprised!
Deebster
DistroWatch rankings are just a count of which distro's page gets the most hits. Get a bunch of different IPs to load LemmyLinux and it'll be number one (and then actual people will click on it to see what it is and why it's number one).
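For the curious, here's a rough sketch of how a hit-based ranking like that could work. It assumes each IP counts at most once per page per day, which is why you'd need lots of different IPs. The `rank_pages` function and the sample data are made up for illustration; this isn't DistroWatch's actual code.

```python
from collections import Counter

def rank_pages(hits):
    """Rank pages by hits, counting each (page, ip, day) at most once."""
    unique = set(hits)  # drop repeat loads from the same IP on the same day
    counts = Counter(page for page, _ip, _day in unique)
    # Sort by hit count descending, then by page name for a stable order
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample log: (page, ip, day)
hits = [
    ("ubuntu", "10.0.0.1", "2024-01-01"),
    ("ubuntu", "10.0.0.1", "2024-01-01"),  # same IP, same day: not counted again
    ("lemmylinux", "10.0.0.2", "2024-01-01"),
    ("lemmylinux", "10.0.0.3", "2024-01-01"),
    ("lemmylinux", "10.0.0.4", "2024-01-01"),
]

ranking = rank_pages(hits)
print(ranking)
```

Three distinct IPs beat one IP reloading the page, so under this assumption hammering refresh from a single machine gets you nowhere.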
Thinking there must be another way, I switched to HAProxy.
Hang on, weren't you on HAProxy already? Or do you mean you switched your attention to HAProxy? (If not, what were you on before?)
As others have said, blocking incoming stuff as high up as possible is definitely the right way, and Cloudflare is the right place for you. It's interesting that this bot wasn't caught by Cloudflare; I wonder who runs it.
I feel a company that big would write a more competent bot, but I also wouldn't be too astonished.
I was kinda hoping for another story about some clever compression bomb or similar to slow up the bot - after all, if it's hammering this little site it's surely doing the same to others, even if they haven't noticed yet. After the robots.txt was ignored I was sure, but I guess this mature, restrained response is probably the correct one *discontentedly kicks can down sepia street*
That was given in the original question along with Pythonistas.
Back in the noughties PCLinuxOS was at #1 and people suspected them of cheating. I'm sure some people do try to game it, but there's plenty of organic and bot traffic to compete with.
Besides, I think the popularity thing's kinda backwards - I'd never visit Ubuntu or Fedora because I know what they are, but I'll be clicking on something novel out of curiosity.