Leaving Identi.ca

When I saw the message that Identi.ca was switching to the pumpio platform, I started wondering when I had last actually used it. Turns out, it was the end of 2012. Around that time I also discovered post formats for WordPress and saw that the status post format seems to be a replacement for tweeting.

So I have decided that all my short posts can happen here. This was a driving force to find a new theme, because my old one did not support status posts. Import my old tweets (or whatever they were called on identi.ca) and I am up and running!

The reason I am posting this here is that if people are following my main RSS, they will get the occasional “tweet” from me. If you want to avoid that, the “tweet” category can be excluded using this link.

Hello DigitalOcean!

While my old hosting with Laughing Squid did its job, I was very limited in what I could do with it. Access was only via sftp, I had 1GB of storage and I was breaking my monthly 25GB traffic limit fairly frequently… So it was time to look for something new.

For me, the primary consideration when choosing a provider is price. My website is not really worth any investment… A thread on reddit pointed me at DigitalOcean (which may or may not be spelled with a space…). It was cheap at $5 per month – in fact cheaper than what I was paying – and 20GB storage on SSD with 1TB transfer is more than enough for me. I also figure that one core and 512MB RAM is enough for my requirements. As a bonus, they provide Arch Linux images, which I am mildly familiar with administrating.

Time to create a “droplet” as they are called. I chose the 32bit Arch install given the low RAM available to me and selected the San Francisco data center given where I live. The DigitalOcean front page tells me this will take 55 seconds. It took 69 seconds but I think I will forgive that! I get an IP address and can ssh in. All good so far.

Checking out what packages are installed, it appears to be all of the base and base-devel groups and ssh. The base-devel group probably is not needed, so I start by removing a whole bunch of those packages including autotools, binutils and gcc. Also, the system is installed on a single ext4 partition, so I can get rid of all packages to deal with other file system types. There are probably other unnecessary packages installed too, but the install really does not take up that much space. And it is not as if Arch is designed to be super slim anyway.

Time to update. The image is from 2013.03, so it should be straightforward. Hrmm…

warning: linux: ignoring package upgrade (3.8.4-1 => 3.9.3-1)
warning: linux-api-headers: ignoring package upgrade (3.7.3-1 => 3.8.4-1)

Let's peek in pacman.conf:

# Please note: if you update the linux kernel via pacman and reboot, you will
# lose access to your droplet! Please don't remove 'linux linux-api-headers'
# from IgnorePkg.
IgnorePkg = linux linux-api-headers

Well, that is not right, but quite a common mistake. The package linux-api-headers provides userspace headers for the toolchain, so its update is linked to updates in glibc, gcc etc and not that of the kernel. The linux-headers package provides kernel headers, but is not needed on a VPS. So I open a support ticket suggesting this is fixed. This is where I was impressed. I got an initial acknowledgment response within minutes (from what is potentially an actual person) and further responses before the end of the day from people high up the food chain.
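To make the IgnorePkg issue concrete, here is a minimal Python sketch that lists which packages a pacman.conf holds back. The helper name ignored_packages and the SAMPLE_CONF string are hypothetical and only mirror the snippet quoted above; this is not pacman's own parser and it ignores details such as sections and Include lines.

```python
# Hypothetical helper: list the packages a pacman.conf tells pacman to skip.
# SAMPLE_CONF copies the snippet quoted in the post; it is not a full config.
SAMPLE_CONF = """\
# Please note: if you update the linux kernel via pacman and reboot, you will
# lose access to your droplet! Please don't remove 'linux linux-api-headers'
# from IgnorePkg.
IgnorePkg = linux linux-api-headers
"""


def ignored_packages(conf_text):
    """Return every package named on an uncommented IgnorePkg line."""
    packages = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "IgnorePkg":
            packages.extend(value.split())
    return packages


if __name__ == "__main__":
    print(ignored_packages(SAMPLE_CONF))
```

With the fix suggested in the support ticket, only linux would remain in that list, leaving linux-api-headers free to update alongside the toolchain.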

Looking further into the install, it is using netcfg to connect to the network. I guess this will need to be upgraded to netctl one day, but given I have never used either of those, I think I will save that for later. The netcfg configuration file that appears to be automatically set-up during droplet creation looks a bit of a mess, but works.

That is all I could see that is changed from a vanilla Arch install, so overall everything is exactly as I would expect.

I suppose the only other thing to look at is the Digital Ocean control panel. It provides everything I would expect – console access, power cycling, backup facilities. Not sure about having a “root password reset” function though… From there you can update your kernel to a recent Arch one (3.9.2-1). This is something I don’t quite understand. It is the Arch packaged kernel (as demonstrated by the pacman -Qi output), so something is happening in the background on update that is not happening when doing it via the package manager. I wonder if they could do whatever that is via our packaging system and provide a repository allowing a direct update of the kernel.

I cannot really conclude much from a couple of days of testing, but all seems good and speeds are fine… I think I have my website completely transferred (with only a small redirection “https” + “www prefix” combination issue to fix), so let me know if anything is out of order.

Using New Spam Control

A couple of people have emailed me saying that their comment was not posted to my blog. I have WordPress set to require a person's first comment to be approved, although no further approval is needed for that person to leave subsequent comments. I do not censor your comments at all. In fact, I let all genuine comments through, troll or not.

So what is happening to the comments that never appear? I have been using two measures to counter spam. Firstly, all posts are closed for commenting after 30 days. Spam does not start arriving on most posts for about a week, and really becomes abundant after about 20 days, so this gets rid of most spam. Secondly, I have been using the Akismet plug-in. This was doing a very good job at sorting spam from non-spam. However, the set-up required at my hosting provider's end for enabling SSL for my WordPress administration resulted in all my comments being recorded as coming from the one IP address. That really screws Akismet over, so all comments were being marked as spam. This requires me to manually check my spam comments before I delete them and it seems I was being over-zealous…

So attempt two at preventing spam! Enter the Quiz plugin. Anybody leaving a comment will be required to answer an “extremely tricky” question to prove they are not a spammer. I will still have the moderation turned on for the time being to ensure this catches the spam, but I will turn that off too if successful.

Secure WordPress Administration For Free

Many months ago I noticed that I logged into my blog over plain HTTP and thought to myself that I really must do something about that one day. And that day is… well… a couple of days ago! I honestly was never really too concerned about logging in insecurely, as the chances of anyone actually wanting to gain access to this blog and being in a position to exploit the insecure login are minimal. My guess would be that the majority of self-managed WordPress installs are administered over plain HTTP.

So apart from general apathy, what kept me from fixing this? Cost was probably the main issue… Any cost for an SSL certificate would not be particularly justified in my case. I also did not want to use a self-signed certificate as I find the security warnings that all web browsers give about untrusted certificates annoying enough to not want them on my site. That also rules out the free SSL certificates from CAcert, as the CAcert root certificate is not included by most browsers by default.

Then I saw a post somewhere about the free certificates given out by StartSSL. The price is right and the root certificate is commonly included so all seems good. There is not much actual validation that goes on to get one of these – my email and domain name were “verified” by sending emails… – so they would not be good for any site where trust is actually needed (such as anything where any personal and financial data are being collected).

Once validated, all I had to do was provide a CSR and they provided me the certificate. My webhost then uploaded that certificate and broke everything! The HTTPS version of my site was giving the error “ssl_error_rx_record_too_long”, which is actually quite uninformative as it covers a wide range of actual issues, and the HTTP version for some reason lost all access to files even though they were clearly still there when I checked. This took me a few hours to notice as I had to wait for the DNS entries to propagate, so the issue was reported at 5pm on Friday the 30th of December… I really thought my website would be down until the 3rd of January when the support desk reopened, but everything was fixed a few hours later. So good service given what I pay, but the whole issue could have been avoided with a simple check at their end once the SSL certificate was installed.

Once you have your SSL certificate installed and ready to go, making WordPress enforce SSL usage for all administration tasks is simple. Simply add the following to your wp-config.php file:

define('FORCE_SSL_ADMIN', true);

Now all your blog administration is secure(ish). The final thing to do was to check whether browsing my website using HTTPS worked… No, it did not! I was getting messages about the site only being partially encrypted. A quick search showed I serve all my images using the full URL rather than a relative one. I did this because a certain Linux distribution’s Planet feed did not show images otherwise (or at least that was the case a long time ago – I have not tested lately). I could go through and adjust all my image links to use HTTPS, or just disable HTTPS access to my website. I chose the latter as nothing on my site is that important that it needs to be encrypted and I thought it would be the quicker option… Several hours later and this is the rule you need to add to your .htaccess file to achieve this:

RewriteCond %{ENV:HTTPS} on [NC]
RewriteRule !^wp-(admin/|login.php|includes/|content/)(.*)$ http://allanmcrae.com%{REQUEST_URI} [R,L]

The only real trick there is that the WordPress login and administration interface uses files from the wp-includes and wp-content directories, so they need to be excluded from the RewriteRule.
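For anyone wanting to sanity-check which requests that rule leaves on HTTPS, the decision can be mimicked in a few lines of Python. This is only a rough model under stated assumptions: the function name redirect_target is hypothetical, the pattern is copied as-is from the RewriteRule, the path is matched without its leading slash (as in .htaccess context), and query strings and the [NC] flag are ignored. It is not how Apache evaluates the rule internally.

```python
import re

# Pattern copied verbatim from the RewriteRule above (without the leading "!").
ADMIN_PATTERN = re.compile(r"^wp-(admin/|login.php|includes/|content/)(.*)$")


def redirect_target(request_path, https_on):
    """Return the plain-HTTP URL to redirect to, or None to leave the request alone.

    request_path is assumed to be a path such as "/wp-admin/index.php"
    with no query string.
    """
    if not https_on:
        return None  # the RewriteCond only fires for HTTPS requests
    if ADMIN_PATTERN.match(request_path.lstrip("/")):
        return None  # negated pattern: admin, login and their assets stay on HTTPS
    return "http://allanmcrae.com" + request_path


if __name__ == "__main__":
    # The example paths below are made up for illustration.
    for path in ("/wp-login.php", "/wp-content/style.css", "/some-post/"):
        print(path, "->", redirect_target(path, https_on=True))
```

Ordinary pages requested over HTTPS get bounced to HTTP, while anything under wp-admin, wp-includes, wp-content or the login page is left untouched.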

So… remember how I said self-signed certificates were annoying as all visitors to the site would get a warning? Well, now that I force HTTP usage, that whole argument is irrelevant as only I would see the SSL certificate when I access the administration interface. But I at least have the option of serving parts of the site over HTTPS using a recognized certificate if I ever feel the need.

Where Did My Bandwidth Go?

Here is what happens when you make a post with around 2MB of images in it…

Bandwidth Usage

That was a spike from my usual 100MB of bandwidth use a day to over 2GB! I usually only use about 2 or 3GB for the whole month, so that was a bit of a surprise. Also, I only pay for 25GB a month, so if usage stayed at over 800MB a day I was going to be in trouble… (well, it would only be $2 more for an additional 50GB, so not too much trouble!)
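The 800MB figure is just the monthly allowance spread over a month. A quick sketch of the arithmetic, assuming a 30-day month and 1GB = 1000MB (both are my assumptions, and the function names are made up for illustration):

```python
MONTHLY_ALLOWANCE_GB = 25  # from the hosting plan mentioned above
DAYS_IN_MONTH = 30         # assumption: a 30-day month
MB_PER_GB = 1000           # assumption: decimal units


def daily_budget_mb(allowance_gb=MONTHLY_ALLOWANCE_GB, days=DAYS_IN_MONTH):
    """Average daily traffic (MB) that exactly uses up the monthly allowance."""
    return allowance_gb * MB_PER_GB / days


def projected_monthly_gb(daily_mb, days=DAYS_IN_MONTH):
    """Monthly traffic (GB) if every day looked like the given day."""
    return daily_mb * days / MB_PER_GB


if __name__ == "__main__":
    print(round(daily_budget_mb()))    # daily budget in MB
    print(projected_monthly_gb(100))   # a month of usual 100MB days
    print(projected_monthly_gb(2000))  # a month of 2GB spike days
```

So the usual 100MB a day fits comfortably, while a full month at the spike level would be well over double the allowance.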

So where did all that bandwidth go? Looking at my blog access stats, only about 20% of it is from people actually visiting my site. So the rest seems to come from people looking at my RSS feed, either directly or through sites like Planet Arch Linux that syndicate the feed.

That means I could drastically reduce my bandwidth usage by posting only a summary to my feed. But given I really dislike seeing only article summaries in my feed reader, it is not something I would really want to do. It is not as if my site has any advertising, so there is little point driving people here. Also, I would probably need to spend a few hours getting WordPress to actually provide summaries in the feeds the way I would like them (because WordPress never does anything quite “right”…).

Site Fully Restored

After some interesting attempts at importing my old posts and comments, followed by some manual post recovery and editing of the MySQL database, it appears my site is completely restored and running on my new host. All files should hopefully have migrated too… but let me know if you spot anything missing.

While I was restoring everything, I took the time to update my theme and make my modifications the proper way using a child theme. I’m still not 100% satisfied with the adjustments; the menu at the top could be reduced in height by a few pixels and the line under the header should always span the page. I am entirely stuck on how to achieve those, so I would be very appreciative if any CSS experts out there want to post fixes for those.

Now, on to posting the insightful blog posts I am so well known for!

Moving Hosting Providers

After struggling with my current provider and their unstable MySQL server for the past couple of months, the final straw came when the posts table from my WordPress database disappeared. So it is goodbye to 000webhost and your free hosting (hence not too much complaining from me…).

Given my total website requirements are modest – WordPress (PHP-4.3 and MySQL-4.1.2) and some file hosting – there is little point in me getting a VPS (and having to figure out how to set all that up!). So I am giving Laughing Squid a go. I figure you can not go too far wrong at $6 a month.

So now I just have to restore everything… These things always happen when you have critical deadlines for work, so this will take a few weeks. I have backups to restore from (although a couple of my recent blog posts are missing and require rescuing from the Google cache), so everything will be back eventually.

Edit: comments have been temporarily disabled to make my restore easier.

Posted in WebSite on by Allan Comments Off

Spam, Spam, Spam

I had turned off the need to moderate comments before their appearance on this blog as an experiment to see how long it took for spammers to start posting. Turns out, it was not very long… but taking 25 days is still slightly longer than I had expected. So comment moderation is turned back on.

While most spam is obvious posting of links to websites, I just do not understand some of the spam that I have received. One IP address (which is well known for its spam) posted messages like “The best information i have found exactly here. Keep going Thank you” and “Hi, very nice post. I have been wonder’n bout this issue,so thanks for posting”. Do a Google search for those phrases and note how frequent those exact comments are. What is strange is that the “people” posting these comments seem to have nothing to gain, at least initially. They listed their website as google.com and their email address is not shown, so no-one can reply to them. I suppose they want to get through that initial moderation phase so that they can post unhindered crap in the future. You have got to admire their determination…

New Site!

The death of Google Page Creator (and the inability to do anything decent with Google Sites) has finally pushed me to get my own domain and make a “proper” website.  Now all I have to do is figure out how to make my WordPress install look semi-decent.  This could take a while…

Edit: decided to go with a slightly modified simpleX theme for the time being. There are a few things I still do not like about it but it is better than the default WordPress theme.