Stress Test (Warning: Blog Tech Geekery Follows)
Posted on November 14, 2007 Posted by John Scalzi 20 Comments
I was wondering what would happen to WordPress when I had a day where I pulled in a lot of folks, and yesterday was that day; Whatever pulled in 72k folks, which made it the second busiest day on the site (The Bacon Cat incident still holds top honors, with 76k in one day). And what happened to WordPress, when presented with an outsized bolus of traffic? Nothing at all, which I find encouraging; never once did the site hiccup, all day long. Since one of my main concerns switching over to WordPress was that dynamically generating pages would present a problem, it’s nice to see that I was wrong. I’m sure it helps that I have wp-cache running on the site, so that the database isn’t hit every single time.
So: I’m officially sold on WordPress. I’m sure that makes someone happy somewhere.
That said, a question for you WordPress geeks: Would turning on wp-cache cause underreporting on unique visitors in any way? I have a suspicion, given the size of my raw log file relative to its size on an average day around here, that 72k unique visitors on the day is actually low (my log file was about 10 times normal size, whereas unique visitors was only about twice recent traffic). Any thoughts on this?
It shouldn’t. The cache still doesn’t remove all db activity; instead, it makes a db entry for the ‘pre-fetched’ content. What it does, however, is reduce the per-page db SELECTs to 1.
Oh, don’t forget that new visitors are more likely to browse around and fill up your log without increasing the uniques.
I’m gonna go with Job 41:11 on that one. It pretty much sums it up.
Actually, I couldn’t get on your site for most of yesterday – I just got an error message. It was the only site I couldn’t reach, so I figured you must have written something popular. And I was right!
Your new visitors are probably a result of your being mentioned in one of the more popular threads in the SomethingAwful forums, which tends to obliterate most bandwidth.
I, for one, am glad yours held up, as I’d forgotten my feedreader didn’t have your blog in it anymore (changed programs). I could not for the life of me find an “RSS”/”XML”/”Feed”/”Subscribe” link anywhere on your blog page, I had to look in the comments of the “About” page to find one (doh).
I wonder if wp-cache is the reason I’ve been able to see your site lately whenever everything else seems to be down for me. And it’s definitely not locally cached — I was spending time going through the archives looking for stuff I’d not read the other day. (Because, well, it was the only thing available to me. Not to diminish the quality of the site, of course.)
I could NOT access Whateveresque or Webhostinggeeks.com, so I assume it wasn’t the provider…
“Your new visitors are probably a result of your being mentioned in one of the more popular threads in the SomethingAwful forums, which tends to obliterate most bandwidth.”
Well, that and BoingBoing and Metafilter and Bloglines. It got around.
You had more images than usual in that post, and every call to an image gets written to the log. Alex is also right about people browsing around once they’re here.
… and never mind; I’ve just noticed that all the images were hosted on Flickr.
I had problems posting comments that day. Once I clicked “submit” it went away for a long snooze. I get enough of that in normal conversation without a machine getting bored by me as well.
According to what I’ve read, it shouldn’t change your stats reporting at all:
“The first time a page is requested, the generated HTML is saved to a cache file with a timestamp. When the same page is requested again, WP-Cache checks if it’s been cached, and if it is fresh (below a configurable amount of time, defaults to 1 hour) it instantaneously feeds the cached page to the client, and completely bypasses all WordPress engine calls.”
From the public side, the page is being requested and served just like any other. The only difference is on the back end – WP is serving up a static page, rather than digging around through the database to put it all together.
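The timestamp-and-freshness check described above can be sketched roughly like this. This is a minimal illustration in Python, not WP-Cache’s actual PHP; the function names and in-memory dict are assumptions, though the one-hour default window matches the description quoted above:

```python
import time

CACHE_MAX_AGE = 3600  # default freshness window: 1 hour
cache = {}  # maps request URL -> (timestamp, generated HTML)

def render_page(url):
    # Stand-in for the expensive WordPress engine call
    # (database SELECTs, template assembly, etc.)
    return f"<html><body>content for {url}</body></html>"

def serve(url):
    entry = cache.get(url)
    if entry is not None:
        saved_at, html = entry
        if time.time() - saved_at < CACHE_MAX_AGE:
            # Fresh hit: feed the cached page straight to the client,
            # bypassing the WordPress engine entirely
            return html
    # Miss (or stale): generate the page, then cache it with a timestamp
    html = render_page(url)
    cache[url] = (time.time(), html)
    return html
```

The key point for the stats question is visible here: the cache changes nothing about which requests arrive or get logged, only whether the backend regenerates the page.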
Personally I see a HUGE discrepancy between my server log files and what Google Analytics has to say. I chalk this up to spambots and various worms looking for a security flaw. Is it possible you were just being crawled a lot, and that your stats program ignores spiders?
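For what it’s worth, the spider effect is easy to check yourself by filtering the raw access log on user agent before counting uniques. A rough Python sketch, assuming a combined-format log; the bot substrings are illustrative, not exhaustive:

```python
# Count unique visitor IPs in a combined-format access log,
# skipping hits whose user-agent string looks like a crawler.
# The marker list below is an illustrative assumption, not a standard.
BOT_MARKERS = ("googlebot", "slurp", "msnbot", "spider", "crawler")

def unique_visitors(log_lines):
    uniques = set()
    for line in log_lines:
        ip = line.split(" ", 1)[0]       # first field: client IP
        agent = line.rsplit('"', 2)[-2]  # last quoted field: user agent
        if any(marker in agent.lower() for marker in BOT_MARKERS):
            continue  # skip spiders and other automated traffic
        uniques.add(ip)
    return len(uniques)
```

Comparing the filtered count against the raw line count would show exactly how much of that 10x log-file bloat was crawlers rather than people.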
If you are getting more new users who click through to other content on the same visit, you should see the average time-on-site figures increase, as well as your single-view visit ratio drop.
Clearly I can’t type. Hopefully you can decipher.
“Is it possible you were just being crawled a lot, and that your stats program ignores spiders?”
I’d have to check with my host, as I’m using their stats program. I suspect something has changed recently (as in the last couple of weeks) because the stats numbers have been acting strangely in that time. I’ll have to check it out.
I might be partly to blame – as I submitted the “Creation Museum” link to FARK. The mods didn’t greenlight it, but it still had a very high number of reads.
This IS a puzzle! There is only one effective way to move forward on this question. John, baby, SHOW US YOUR STATS!
(Everyone quick, be prepared to throw beads and holler “Wooooo!”)
I’ve never used MT, so my angle here is different, but why is it that it doesn’t seem to scale well enough to do dynamic pages? I’d expect any decent CMS to be 100% dynamic, and to behave under a moderate load. I’ve written 1MM+ hit-per-day custom systems that built every page on the fly without sweating, and I’ve seen it done with COTS and FOSS tools as well. What’s special, in the short-bus sense, about MT’s dynamic page building?
Two more drum beats! And it’s super easy to install, just take the bit of code and stick it in your template. And it’s free!
(I previewed this comment before posting just because I could.)