Tag Archives: fail

Do your backups!

I have a QNAP TS-209 NAS device. It’s a Linux-based appliance with 2 hot-swap drives.

By the looks of it, it has now died. QNAP support has been utterly useless, to say the least, and I have pretty much resolved to replace the unit even if they are able to resurrect it. The problem with the 1xx and 2xx range of QNAP is that it uses some weird CPU architecture, and to enable huge files on these devices they had to patch the ext3 file system.

The end result is that while the devices are advertised as being ext3, they in fact use a patched ext3 and you cannot just mount the drives in a Linux machine. They have also now stopped selling this series of machine, so should yours ever die you are plainly out of luck. QNAP have made a live CD available that’s similarly patched, so you should have some hope if you are really in trouble.

In my case the device also seems to have totally corrupted the drives when it died, so even in the live CD scenario both are dead. It seems the SATA interface has gone rather than the disks: the moment I put a disk in, the CPU is kept totally busy dealing with blocking I/O requests. Out of 1000 pings about 20 will get replies, and those will have 30 second response times.

This brings me to several points:

  • Everyone knows this (right?) but RAID is not a form of backup. It’s most probable that if one drive in a RAID array gets its data corrupted, the others will suffer too. RAID simply protects you against hardware failure on a single drive.
  • You should make backups regularly. As it turns out I made a backup just 12 days before the device died, so every file that was on the NAS is safe.

I’ve now spent the last 2 or so days duplicating my backups so I am redundantly covered while I look for a replacement. I’d have liked to not buy another QNAP, but unfortunately they do seem to have the best range of products in this space. All the vendors seem to have stopped selling 2 drive units, so I am down to getting a 4 drive QNAP TS-439 now. This will set me back almost GBP900 but will give me 2TB of mirrored space, Apple Time Machine backups for all my Macs, etc. Pricey, but important given that this is all my photos and music.

In the same week it seems my 24 inch Apple iMac has had similar problems. It isn’t booting, and I am not getting the usual I/O errors I saw on other Macs when their drives died; instead I see other I/O timeouts that suggest the controller rather than the drives. Thankfully I have AppleCare so it’s in for a free fix. I do not have backups of this machine, and that’s by design: I keep all my data on servers on the internet and those are backed up off-site nightly. My desktops tend to be disposable and are simply terminals to online data; even browser bookmarks are stored online. The only things lost on this machine would be chat logs and browser history, nothing else. I need to make some kind of plan for chat logs as those tend to be more and more important these days.

So to sum up, even if you have multiple redundancy in the drives in your NAS you must still do your backups. With QNAPs it’s easy to even do it off-site, as you can rsync to a remote location or sync to Amazon S3. Of course they also have USB ports, so you can place files on an external drive.
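
To make the off-site rsync option a bit more concrete, here is a minimal Python sketch that builds and runs an rsync command. The share path, the remote host and the function names are made up for illustration, and it assumes rsync and key-based SSH access are already in place; it is not QNAP's own backup tool. It deliberately leaves out --delete so that files removed (or lost) on the NAS are not removed from the backup as well.

```python
import subprocess


def build_rsync_command(source_dir, remote_target, dry_run=False):
    """Return the rsync argument list for copying source_dir to remote_target."""
    # -a preserves permissions and timestamps, -z compresses data in transit.
    command = ["rsync", "-az"]
    if dry_run:
        # --dry-run only reports what would be transferred.
        command.append("--dry-run")
    # A trailing slash tells rsync to copy the directory's contents.
    command.append(source_dir.rstrip("/") + "/")
    command.append(remote_target)
    return command


def run_backup(source_dir, remote_target):
    """Run the backup and raise CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(source_dir, remote_target), check=True)


# Hypothetical usage (paths and host are placeholders):
# run_backup("/share/Photos", "backup@example.org:/backups/photos/")
```

Running it from cron every night would give you the regular backups argued for above, without relying on the RAID mirror.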

hetzner.de hardware policies

So I use Hetzner a lot for my machines. I’ve about 10 to 15 of their machines now across various clients and am mostly quite happy with them. They provide a service that matches the price, i.e. good enough.

One area of their service really grates on me, though: they give you old machines, and when those machines fail they replace them with other old machines. The same goes for drives etc.

On more than one occasion now I have had hard drives fail only to see them replaced with other shitty drives. Each time they claim the drives are well tested, and each time they pull the old ‘it could be the cable’ trick and then replace the machine and the drive.

Since this has happened to me every single time I’ve changed a disk so far, I have to wonder if this is everyone’s experience?

From where I sit it’s simple. They made a choice: they take out drives reported broken by someone, test them, and put them back into service when their tests fail to find any problem. They do this to save money, knowing full well that drives will fail, and all they’re doing is shifting the risk onto their clients while the clients keep subsidizing their expansion.

So given this is the quality of service they’re aiming at, surely once this policy bites a good long-standing user, offering some kind of payback for the inconvenience would be good business practice? Apparently not.

This is pretty poor. Even after I complained to them, they swapped my chassis and again put a disk with > 6000 hours under its belt in my machine.

So I guess you need to be pretty sure your softraids are set up properly when you want to use this company. Their support stance is clear:

I’m sorry but we don’t promise anywhere that we built always new hardware into our servers. I can only ensure you that all hardware is always well tested and without any problem before we build it into a server.

I.e., screw you, we don’t care about any evidence or repeated failures, and we take zero responsibility for our equipment, we’ll just keep taking your money.
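
Since the practical advice here is to keep a close eye on your software RAID, this is a small Python sketch of one way to spot a degraded Linux md array by reading /proc/mdstat. It assumes the usual layout, where an "mdN : ..." line is followed by a line ending in a status map such as [UU] (all members up) or [U_] (one member missing). The function name is mine, and arrays with non-numeric names are not handled.

```python
import re

# Status map at the end of an array's detail line, e.g. "[UU]" or "[U_]".
STATUS_MAP = re.compile(r"\[([U_]+)\]")
# Start of an array entry, e.g. "md0 : active raid1 sdb1[1] sda1[0]".
ARRAY_NAME = re.compile(r"^(md\d+)\s*:")


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status map shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = ARRAY_NAME.match(line)
        if name:
            current = name.group(1)
            continue
        status = STATUS_MAP.search(line)
        if current and status:
            # An underscore marks a member that is failed or absent.
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


# Typical use on a Linux box:
# with open("/proc/mdstat") as handle:
#     print(degraded_arrays(handle.read()))
```

Hooking something like this into your monitoring means a dead drive gets noticed on the day it dies rather than when the second one follows it.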

Cogent

I’ve not posted here for a while as I’ve been insanely busy, but today those lovely people at Cogent kicked me out of my blogging slumber with a shocking display that I simply had to share.

Cogent, for those who do not know, is a very large Tier 1 ISP, known mostly for its many disputes with other ISPs about peering. It has become so bad that in the UK at least they are basically a complete no-go zone for anyone.

I’ve previously dealt with Cogent when a client signed up for a few Mbit of Cogent bandwidth on the basis of a £5/Mbit pricing structure; they soon realised that you get what you pay for. Even racks in the same data center could not reach each other without first hopping over to Europe. I attempted to resolve this at the time with Cogent and the other ISP, and both confirmed that it was essentially a waste of time. Cogent said they couldn’t speak to the ISP in question since most of the UK ISP industry can’t stand them. The other ISP basically laughed out loud about being ‘suckered’ into buying Cogent bandwidth.

This is confirmed elsewhere: search the Renesys Blog for Cogent and you’ll find a lot of information about them, mostly bad news. An article on their blog about Cogent in the UK has this to say:

Firstly, Cogent has a fairly serious Europe problem right now. They
have been aggressively attacking the European market for a few years
now and making some solid headway. They bought a couple of carriers
(Lambdanet Spain and France, Carrier1 in Germany among them), ruthlessly
integrated them and then proceeded to undersell the market by a factor
of 50-80%. This has made them many enemies.

As a result of this approach to business, Cogent has much less
effective peering in Europe than do many of its larger competitors.
Most of the European PTTs refuse to peer with Cogent anywhere on the
European continent. Recently, some large US carriers (among them
Level (3) ) seem to have adopted a similar approach. This means that
when Cogent sells capacity in Europe, it is forced to drag that
traffic back to the US to hand it off to its peers here. Of course
that means that if the ultimate destination is European, the traffic
has to travel back. This is a burden on both Cogent and the European
carrier and, of course, the customers on both sides. But it’s
unlikely to change because of just how much hate there is for Cogent
among European networkers.

This basically confirms my experience with Cogent and that of many I have spoken to. As such, if you choose to support Cogent you will basically be forced to accept the following:

  • You will have to buy a lot of other bandwidth, since if you’re hoping to serve UK customers Cogent is a terrible sole carrier to have.
  • You will need to invest in extra hardware, extra admin time, extra complex routing infrastructure and additional overhead on your teams.
  • You will forever be at the mercy of everyone who hates Cogent. You will find yourself randomly falling off the internet, randomly de-peered from vast swathes of the internet, and basically the whole thing will be a pain in the behind.

For these reasons, everyone I know with Cogent bandwidth uses them as a last-resort backup carrier. They are cheap and basically shit, but OK enough to use as a backup when everything else has failed.

Over the last few years Cogent has contacted me directly via email to attempt to sell their wares. The threads always end with me saying something along these lines:

Furthermore we’d prefer to use companies who do not directly
contact us with marketing material, please remove us from your lists
for future contact.

Today one of my clients got a mention on TechCrunch, which resulted in more spam from Cogent, again to an email address totally unrelated to my business activities and not listed in whois records for the client or anything like that. The sales person even had the nerve to include the email that the above quote is from in his mail to me asking if I could have a conference with him.

My response was the usual: no, we don’t deal with spammers, you were told to leave us alone, now please stop bothering us. This resulted in an amazingly pushy email from the sales person, quoted below:

No doubt that writing when being asked not to is, well, borderline. That
said, it is both of our responsibilities to make sure that all options
are explored. You need to confirm that you are aware of all vendors
information, and mine includes getting it out there. 

.

.

Admittedly, this is difficult to resolve via email. However, if I didn’t
think that we could compliment your service, I wouldn’t persist.

This is just amazing. This person really thinks he can presume to tell me what my responsibilities are and what I need to do, and that I have to indulge his blatant b/s.

I again pointed out that they had been asked to stop mailing me, and that they were using a private email address held by a UK citizen, so under the data protection laws they need to stop contacting me when asked. They once again mailed me, demanding further information about my customers. They really are on par with simple Viagra spammers.

Does anyone really think this kind of heavy-handed tactic gets them business?

The worst part of it is that the ISP who currently provides a large part of our b/w is a Cogent customer. Cogent sales people do not think twice about approaching clients of their clients and trying to undercut them, effectively trying to steal their customers’ business away from them.

Why would any business support such a company? I would not. I would effectively be negligent in my duties to my clients if I ever recommended these clowns for anything, since they are just a nightmare waiting to happen.

Layeredtech’s thanks to old customers

I have been a customer of Layeredtech for years. At present I have only 2 machines there but at times I’ve had 7 or 8. One of my machines is pretty old; I think I got it circa 2002 or so and it’s been doing well on the same hardware ever since.

Yesterday I received the following email from them:

Layered Tech is committed to being the leader of the Hosted
Infrastructure market by providing our customers with the best products
backed by the best service.  In an effort to improve our customer
experience, we have determined that a small number of existing servers
will need to be relocated from their current data center.  As you are
receiving this message, we have identified that you have one or more
servers in the in area of the Savvis facility that will need to be
moved.  It is our intention to minimize any interruption in service and
we will do our best to work within predetermined time frames that are
convenient to you.

Due to the form factor (chassis type) of
this server, we will need to migrate your data to a new server. We will
work with you so that the impact is as minimal as possible.  

Below
are the servers that are affected by this migration.  Please respond to
this message acknowledging the need to relocate your server(s).  At
that point, we will move this ticket to our Operations Department where
we will work with you on a migration schedule.

From reading this you might assume they will assist you with the migration and that this is a notice of an impending change, perhaps a month or two from now?

In reality the situation is that no, they will not help you migrate your data.  They want you to take out a contract for a new machine and then migrate your data yourself – something which even at best will take 5 to 10 hours on oldish machines like this.

They do not offer any compensation, and when pressed on that point only offer 1 month. The cherry on the cake is that all this has to be done within 18 days from now. In effect they are terminating your old machine, forcing you to take a new one, and doing it with less than the agreed 30 days notice. Like it or not.

The sales person who has been coordinating this from their side is incredibly unhelpful and frankly useless. Only after much pushing back by me did I even get a hint that anything other than do-it-yourself migration is an option, and at this point I am still waiting for details.

This kind of disregard for customers is typical of large hosting centres. They have thousands of customers, and heavy-handed treatment of those customers is acceptable to them because at worst they’ll lose a fraction of a percent of them. Being unhelpful really does pay off for them, since most people will probably just take this crap.

This is shockingly poor service. If you value your data, avoid Layeredtech.

Nasty PHP Authentication Handling

Sometimes you come across things that just make you wonder what is going on in people’s minds.

For years everyone who wrote applications compatible with the standard HTTP Authentication method has used the REMOTE_USER server variable, as set by Apache, to check the username that was logged in by the webserver. This has worked well for everyone: CGIs and all would just grab it there and everyone would be happy.

Along comes PHP and they make a great big mess of it. PHP suggests that we use $_SERVER['PHP_AUTH_USER'] instead, and they give some good reasons for this too, except they have severely crippled this for all but Basic and Digest authentication. Take the following code from main/main.c:


        if (auth && auth[0] != '\0' && strncmp(auth, "Basic ", 6) == 0) {
                char *pass;
                char *user;

                user = php_base64_decode(auth + 6, strlen(auth) - 6, NULL);
                if (user) {
                        pass = strchr(user, ':');
                        if (pass) {
                                *pass++ = '\0';
                                SG(request_info).auth_user = user;
                                SG(request_info).auth_password = estrdup(pass);
                                ret = 0;
                        } else {
                                efree(user);
                        }
                }
        }

As you can see above, they only import the user and pass from Apache if the AuthType is Basic, which makes no sense at all. Why not just check with Apache and, if it set the username, import it? Surely Apache knows if a user has authenticated? Ditto for the password. It is so broken in fact that PHP in CGI mode also doesn’t work, since those headers don’t get set for that either. Countless comments and nasty hacks can be found in the PHP user contributed notes about this, but it is all just silliness.
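
To show what I mean by checking with the webserver first, here is a rough sketch of the lookup order I would expect. It is written in Python purely for illustration and is not PHP's or Apache's code. It assumes a CGI/WSGI style environment dictionary, and the function name and the HTTP_AUTHORIZATION key are my assumptions: trust REMOTE_USER whenever the server set it, whatever the AuthType, and only fall back to decoding a Basic header when it did not.

```python
import base64
import binascii


def authenticated_user(environ):
    """Return the username for a request, or None if nobody is authenticated."""
    # If the webserver authenticated someone, believe it regardless of AuthType.
    remote_user = environ.get("REMOTE_USER")
    if remote_user:
        return remote_user

    # Otherwise fall back to decoding a Basic Authorization header ourselves.
    header = environ.get("HTTP_AUTHORIZATION", "")
    if not header.startswith("Basic "):
        return None
    try:
        decoded = base64.b64decode(header[6:], validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    user, separator, _password = decoded.partition(":")
    # A Basic credential without a colon is malformed.
    return user if separator else None
```

With this order a custom AuthType, such as a cookie or hardware token based module, works without the application knowing anything about it.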

The reason this is annoying me is that I have written a Single Sign-On system in PHP. You can host an identity server on any domain and hook any site in any other domain into the SSO system; it’s a bit like TypeKey.

Of course it’s nice to have an easy to use SSO system in PHP, but what is the point if you can’t make legacy apps like Nagios, Cacti, RT etc. play along with the SSO? So to solve this I extended Apache::AuthCookie with a new mod_perl module that plugs into Apache and does authentication using my SSO and a small bit of glue that you put on your RT/Cacti/Nagios box.

All’s great: I have SSO to Nagios, RT and countless other things working flawlessly. The exception, of course, is Cacti. Because it’s written along the lines of the PHP manual, it uses PHP_AUTH_USER instead of REMOTE_USER, and so my fancy new AuthType in Apache does not work with it. As it turns out it’s a quick two-line fix in the Cacti code, but you would think PHP would be a bit more generic in this regard, since as it stands now I think a lot of people who want to do SSO using hardware tokens and such have issues with PHP being silly.

Some thoughts on Ubuntu

I’ve been giving Ubuntu another go on a desktop. Last time I tried it my intention was to see if it really was a viable switching target from OS X, and it turns out that for that specific task it sux. This time I was looking more for a desktop fit for a Unix-knowledgeable person, so I have had more tolerance for stupid things like not being able to play popular content out of the box.
Here is a random list of things that were worth noting to me:

  • I got Beryl and XGL going on my little Acer 12″ laptop. It’s really nice but unfortunately I can’t get it going on my desktops because they have ATI cards. I will be buying some Intel based cards for these.
  • Laptop support in general is really good: it hibernates and everything works. Getting the Acer buttons for things like enabling Wifi to work was a right pain; I had to tinker with some kernel modules etc.
  • Wifi support has a long way to go. The UI for configuring keys, choosing networks and types is totally useless and I hope it’s something on their list to fix real soon.
  • GAIM has taken a turn for the worse; you should try Kopete for a good IM client.
  • Changing the behavior of something as simple as the Backspace key in Firefox can really spoil your day.
  • They push out a silly amount of updates and, like with OS X, they’re pretty big. I’m really glad I am not on a modem in the 3rd world using Ubuntu or OS X.
  • Easyubuntu is pretty good. Ubuntu needs to take this and put together a paid-for package; I’d pay for it.
  • Crossover Office is really good. I’ve been running MS Office 2003 using it and been very happy with the results: MS Word is fast, stable and full featured, and blows OpenOffice away. I’m currently on the demo but will buy soon.
  • Wine isn’t too bad either. It runs Digiguide without a problem, which was one of the big things that kept me on Windows.
  • I do sometimes boot into Windows. I do this via VMWare for simple stuff, booting the physical NTFS partition inside VMWare, and it’s OK, but my AMD 2600 is pretty crap for this. Soon I’ll buy a nice Core 2 Duo machine and it will solve my speed problems.

So it seems I’ll be moving off a Windows desktop now. I might still get a 2nd Mac instead of the Ubuntu machine, as 2 x iMac would be the perfect setup for me, but till then this will do well enough.

Cachefly Content Delivery Network

Further to my previous post about webservers for data delivery I’ve been investigating some content delivery networks.
One that’s been getting a bit of blogger love recently is Cachefly. On paper and in their pretty Flash movies it seems pretty awesome. However, when it comes time to deliver it falls pretty short.
Here are a few bits from the blurbs on their site:

While traditional (or ‘first-generation’) CDN’s have used ‘DNS tricks’ to circumvent BGP (the internet’s core protocol), BestHop™ instead harnesses the power of BGP to determine how content can be delivered in the most efficient manner over CacheFly’s global CDN footprint. BestHop™ uses Anycast to instruct carriers routers to make connections to the best available point-of-presence for the end user. By combining anycast with our unprecidented international footprint, CacheFly has built the next-generation in Content Delivery Network.

So by using anycast they route you to a static IP, always the same one, and your ISP’s routing tables figure out the nearest POP where your files live. This sounds great. I had some doubts about anycast but I’ve been informed that it does in fact work well even for TCP traffic.
Their network seems pretty extensive at the moment. They have a link on their site called “Our Network” that opens a popup showing an impressive world view of servers in San Jose, Phoenix, Chicago, Toronto, Ashburn, London, Amsterdam, Stockholm and finally Tokyo. These locations are a good footprint for my needs.
Some more blurbs from their site:

Speed counts on the web, and by using CacheFly for your website you can deliver your website up to 10x faster than traditional hosting.

CacheFly makes it faster:

  • CacheFly’s BestHop™ traffic management system delivers your content at blazing speeds by delivering your content from the ‘edge’ of the internet, placing it closer to your visitors.
  • Your content is always available. Guaranteed. So there is no need to worry about losing customers because of an ineffective server or network issues.
  • CacheFly provides you with instant scalability – Our technology means you’re website’s effectiveness will not be impaired by increased traffic. *Embrace* flash crowds. They mean more $$ now, not more headaches.

The load-time of your website affects your bottom line, period.

This sounds great, so I want to host my site furniture on it: buttons, logos and also user profile photos etc. I want the furnishings to load as fast as possible so that there is no perceived delay in the page drawing due to photos loading slowly one by one, and this seems to match my needs perfectly.
Finally some other claims from their site:

  • No changes to your existing hosting solution
  • Easy to implement, be live less than 10 minutes

So what’s the problem then? I signed up for their biggest account since I have about 1GB of JPEGs to host. The account lists 24 hour support and SCP, SFTP and rsync uploads as features, along with some other stuff.
Once I opened the account and logged into the control panel I was immediately presented with only FTP upload details, no rsync or SSH based system. It turns out that for this you have to open a support ticket, which the auto reply claims will be actioned within a day. That’s great, 10 minutes indeed.
As for delivery from nearest to your clients, well that’s just bullshit to say the least. The way they do it is by hosting all your files in the States by default. If a file gets a lot of hits from a specific region they’ll wait for some threshold to be reached and then propagate your files internationally. In the words of their support people:

“in order for files to cached internationally, they need to be considered ‘hot’; which will take more than several requests”

I’ve now hit a specific file over 150 times from the UK and it is still being served from the States. So how do they serve it from a specific location given it’s anycast? Well, I hit a webserver in the Netherlands, and this server then responds with a 302 and redirects me to another server, on a non-anycast IP, in the States. So in effect every hit I take on my servers gets translated into 2 hits until they decide a file has hit some magic threshold. And it’s a varying threshold: a large file (512Kb) hit it early on after only 20 or so hits, while a much smaller test file is still being served from the States after 300 UK based hits.
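
If you want to check this sort of behaviour yourself, a rough Python sketch along these lines will do. It makes a single plain HTTP GET without following redirects and reports whether the answer came directly or was a redirect to another host. The function names and the example hostname are made up, and this is not the exact method I used for the numbers above.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def describe_response(status, location):
    """Summarise one response: a redirect to some host, or a direct answer."""
    if status in REDIRECT_CODES and location:
        return "redirected to " + urlsplit(location).netloc
    return "served directly"


def probe(url):
    """Issue one GET over plain HTTP; http.client never follows redirects."""
    parts = urlsplit(url)
    connection = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        connection.request("GET", parts.path or "/")
        response = connection.getresponse()
        return describe_response(response.status, response.getheader("Location"))
    finally:
        connection.close()


# Hypothetical usage (the hostname is a placeholder):
# print(probe("http://cdn.example.com/images/logo.png"))
```

Running probe() in a loop and counting the answers shows how many requests it takes before a file stops being redirected.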
As for their network map showing servers in the UK? Not true at all. From their sales people:

Also, you should note that we are turning up our London POP next week, this will significantly increase the already superb performance of our product.

And another response after I queried some more:

We have routers in london now, however, they deliver traffic to our amsterdam location. Next week we’ll be adding delivery infrastructure, as well as deploying our 10G port to the LINX peering fabric.

So not even close to as advertised. I’ve had hits to files from South Africa, China, Korea, UK, US and AU, and only 5 out of 351 were served from Amsterdam, though UK traffic accounted for 337 of the 351 hits.
Cachefly looks good on paper. This is only day 2 of my trial with them, and even if they get my rsync access sorted out I can’t see how I can possibly go to phase 2, actually sending traffic their way, if they can’t get my files actually served globally. I’ve asked them about this and am waiting on a response, so it might still turn around and be a winner. For now though I think Cachefly is OK for a reasonably popular podcast or a big RSS feed, but not for serving 6 million files per day as quickly as possible, which is what I need it for.
UPDATE: After waiting 2.5 days for their support to even respond to my request to enable rsync, I’m now closing my account. My recommendation is to avoid Cachefly as far as possible.

I am sick of Librarything.

I’ve previously blogged about Librarything and said I quite like it.
Feature wise it’s fine. I would have liked to also include my DVDs etc., but mostly I just want the book feature to work.
The authors have been introducing all kinds of wonderful new features like groups etc., but the site is incredibly unstable. Each time I want to go add a new book, I am subjected to 2 hours of downtime. Maybe I just have bad luck, but tonight I wanted to add 2 books again and the site was down for an hour again.


There are no planned notices on the blog (ever) and no explanations afterwards; it’s just down indefinitely. This time it lasted an hour or so. It’s like the image above is their actual home page and the working one is the exception to the rule.
Anyway, on to other things that annoy me. They have these blog widgets for putting in your site, but they don’t have an API with enough features, so I parse the blog widgets to build my current reading list. Of course they can’t stop messing with the HTML in these widgets, so things keep breaking. I know this isn’t really something that will affect someone who just uses the widgets on their website, but still it’s irritating.
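
For the curious, the widget scraping amounts to something like the following Python sketch. The widget markup here is invented (I am assuming each book title is the text of a link), and the class and function names are mine. Keying on as little of the HTML structure as possible is about the only defence when the markup keeps changing.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect the text of every <a> element, ignoring any nested markup."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside_link = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._inside_link = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._inside_link:
            title = "".join(self._buffer).strip()
            if title:
                self.titles.append(title)
            self._inside_link = False


def reading_list(widget_html):
    """Return the link texts found in a widget's HTML, in document order."""
    collector = LinkTextCollector()
    collector.feed(widget_html)
    collector.close()
    return collector.titles
```

Anything that depends on specific class names or table layouts in the widget will break much sooner than this does.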
In that hour I used the backup of the data that I made a while ago (Librarything have previously had a disaster where their machines failed, their backups failed and they lost data!) and just imported it all into Delicious Library. It cost me more, but it indexes books, DVDs, games, CDs etc. and it actually works. I guess you get what you pay for, and it seems the $10 or whatever I paid for Librarything just isn’t enough to actually get a working site.

Why I don’t use Ruby On Rails.

It’s simple: the community around it are a bunch of bigots. See this post on macslash.
The poor guy asks for something simple and gets flamed to death by a bunch of RoR idiots who are acting like they need to justify their choices. No thank you. You’d swear it’s the new Third Reich.
There are countless reasons why someone would want to stick to PHP, why does he need to face up to this kind of abuse just because he made a choice?

Ubuntu is great.

So world and dog is nagging on about Ubuntu, how great it is and how they are switching from <insert anything on the planet> to Ubuntu.
I happened to have a spare 300GB drive lying around so I gave 6.06 a go. My machine is over 2 years old, it’s practically from the ark, so you’d expect things to Just Work.
After install the screen resolution is absolutely dismal, with a slow refresh rate and random crashes while trying to set a better resolution. Already here you’ve lost a large chunk of users.
Anyway, so I go off looking on Google using Firefox. It opens up with the familiar look of Firefox complete with Mycroft search box, except the search box does nothing by default: you can type into it and hit enter but nothing happens, and you have to go fiddle with it to get it working.
I came across a post that points to another post that points to a wiki page for getting ATI cards going. I basically had to do this in a terminal:

sudo apt-get update
sudo apt-get install linux-restricted-modules-$(uname -r)
sudo apt-get install xorg-driver-fglrx
sudo depmod -a
sudo aticonfig --initial
sudo aticonfig --overlay-type=Xv

and then reboot.
Yes, this distro is really going places if it can’t even support a crap old ATI Radeon card out of the box and requires new users to do stuff in terminals just to get rid of a headache-inducing low refresh rate.
Get real, your grandmother is not going to do this. Give her a Mac and the thing just works.
