Recovering from VMDKs on NetApp NFS Datastores

Ok, so the last post covered the scenario of recovering entire VMs – but what if you just want one file? As I mentioned, we used to recover the whole VM to another location, copy the file out, then delete our copy. That was far from elegant, and again, a pain if the file they wanted was in a snapvaulted location.

How much do you trust the filesystem to be consistent? Well, we take a “crash-consistent” snapshot every morning, where the NetApp system effectively spools off a version of the underlying VMDK file without telling the virtual machine using it. Our recovery rate, over the last two years and 1050 VMs, has been 100%. It’s not a solution for everyone and everything – for the VMs running high-transaction-load DBs, like Oracle (yup, we went there!) and Exchange, we use NFS or iSCSI, and use NetApp’s SnapManager products to quiesce the applications and snapshot their storage at the instant they are flushed.

So, our crash-consistent snapshots – how do we get files back out of them? Remember the secured recovery console VM from the previous post? And the Inception reference? Add a few more layers to that.

The basic premise is that we mount the NTFS filesystems in the VM using NTFS-3G, and use e2tools to copy files out of the ext3 partitions.

But to get to that point, you have a few problems to solve. The first is to turn your read-only VMDK (NetApp snapshots are read-only) into a block device: losetup -r /dev/loop0 /path/to/VMDK will do that. Then, find the partitions inside this device: kpartx -a -v /dev/loop0. At this point, you can just mount the NTFS partitions from the Windows VMs, but the Linux systems have a few more tricks up their sleeves.
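Concretely, the attach step might look like the sketch below. Everything here is illustrative – the path, the mount point, and the run/DRY_RUN wrapper are mine, and the loop device needs to point at the -flat.vmdk extent (the big file), not the small descriptor:

```shell
#!/bin/sh
# Dry-run by default, since this needs root and real devices;
# set DRY_RUN=0 on the actual recovery VM.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Illustrative path into a snapshot directory.
VMDK=${VMDK:-/mnt/snapvault/.snapshot/nightly.0/myvm/myvm-flat.vmdk}

run losetup -r /dev/loop0 "$VMDK"   # -r: read-only loop device
run kpartx -a -v /dev/loop0         # partitions show up as /dev/mapper/loop0pN

# Windows guests can be mounted straight away, read-only, via ntfs-3g:
run mkdir -p /recover/win
run mount -t ntfs-3g -o ro /dev/mapper/loop0p1 /recover/win
```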

We use LVM for flexible volume management. It’s burnt into our template, which means all of our VMs have the same VG and LV names. So the first thing we did to prepare this recovery VM was to rename its own Volume Groups to avoid conflicts: a simple vgrename, edit /etc/fstab, then mkinitrd – in that order. If you run mkinitrd before the /etc/fstab edit, the initrd will try to load root from a non-existent location.
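A sketch of that one-time prep (the VG names are hypothetical; renaming a volume group is done with vgrename). The run/DRY_RUN wrapper just prints the commands by default:

```shell
#!/bin/sh
# Dry-run by default; set DRY_RUN=0 to really modify the recovery VM.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# 1. Give the recovery VM's own VG a name no guest will ever use.
run vgrename VolGroup00 recovery00
# 2. THEN point /etc/fstab at the new name...
run sed -i 's|/dev/VolGroup00|/dev/recovery00|g' /etc/fstab
# 3. ...and only then rebuild the initrd, so it looks for root
#    in the right place on the next boot.
run mkinitrd -f "/boot/initrd-$(uname -r).img" "$(uname -r)"
```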

Having prepared our recovery VM in advance, we scan for volume groups inside the /dev/loop0 partitions using vgscan, then bring them online with vgchange -ay VGname.
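Assuming the loop device and partition mappings from earlier are already in place, that step is just this (VG name illustrative; same print-only wrapper):

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}   # dry-run wrapper: prints instead of executing
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run vgscan                    # discover VGs on the mapped loop0 partitions
run vgchange -ay VolGroup00   # activate: LVs appear under /dev/VolGroup00/
```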

At this point, you’d think we could just mount the LVs, wouldn’t you?

A quick primer on the ext3 filesystem: it’s ext2, plus a journal to enable easy recovery after crashes. In these crash-consistent VMDK snapshots, there’s an unflushed journal, and the filesystem is flagged as in use and as having one. Linux’s ext3 implementation will attempt to replay the journal of an ext3 filesystem, if present, when it is mounted. Even if you tell it not to load the journal (noload), it will still attempt to make your read-only filesystem read-write to mark the filesystem as clean. And if you try to mount it as ext2, it will also complain, since there’s a journal there. ext3 journals can be removed, but guess what? That’s a read-write operation too. All of these behaviours are perfectly reasonable, and there for very good reasons – just not what I’m after, since this is a 100% read-only situation, and I can’t make it read-write even if I wanted to.
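For illustration, these are the sorts of mounts that fail against a dirty ext3 on a genuinely read-only device (device and mount point are illustrative; the wrapper prints by default):

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}   # prints the commands rather than running them
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# ext3 with journal replay suppressed: still tries to write the
# "clean" flag, so it fails on a read-only device.
run mount -t ext3 -o ro,noload /dev/VolGroup00/root /recover/linux
# The ext2 driver refuses outright, because a journal is present.
run mount -t ext2 -o ro /dev/VolGroup00/root /recover/linux
# Removing the journal would let ext2 mount it -- but that's a write too.
run tune2fs -O '^has_journal' /dev/VolGroup00/root
```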

So we looked at a couple of options: union filesystems (rejected – any change would have meant copying the whole VMDK), guestfish (actually works OK, but is very resource-heavy – it essentially boots the VM inside itself), and eventually we were pointed at e2tools. It’s in early beta and hasn’t been updated in seven years, but it seems perfectly functional.

At this point, we’ve copied our files out with just cp or e2cp – so how do we get them back to the VM? We’re still working on that, but the current plan is to use mkisofs to turn them into an .iso, and mount that on the VM for the end-admin to copy them out of.
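The extraction and packaging steps might be sketched like this (paths illustrative; e2cp uses an mtools-style device:path syntax, and the wrapper prints by default):

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run mkdir -p /recover/out
# Pull a file straight out of the never-mounted ext3 LV with e2tools:
run e2cp /dev/VolGroup00/root:/home/user/file.txt /recover/out/
# NTFS files come out with plain cp from the ntfs-3g mount:
run cp /recover/win/Users/user/file.txt /recover/out/
# Wrap it all up as an ISO the end-admin can mount on their VM:
run mkisofs -r -J -o /recover/restore.iso /recover/out/
```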

Then, once all the copies are done, you tear down the LVM with vgchange -an, delete the partitions from the kernel with kpartx -d, then remove the loop device with losetup -d, and you’re done! We will be automating a lot of this with some shell scripts (think startrecover and stoprecover to take care of the loop/LVM setup and teardown), but even now it’s a lot quicker than what we had.
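A stoprecover along those lines might look like this sketch (names hypothetical; the wrapper prints by default). Teardown is simply the setup in reverse:

```shell
#!/bin/sh
# Hypothetical stoprecover: undo everything in reverse order.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run umount /recover/win        # any NTFS mounts first
run vgchange -an VolGroup00    # deactivate the guest's LVM
run kpartx -d /dev/loop0       # drop the partition mappings
run losetup -d /dev/loop0      # finally release the loop device
```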

Pretty neat, huh?

Recovering VMDKs on NetApp NFS Datastores

In my day job, I look after the day-to-day server operations of a university that makes extensive use of VMware and NetApp storage. When I started there and saw they were using NFS for their datastores, I reserved judgment on whether they were crazy-smart or just crazy. Thankfully it was the former – crazy-smart.

Using NetApp NFS for VMDK storage allows us to do all sorts of cool stuff, especially with regards to backups, recovery and migration. But it had been tedious, especially if someone wanted a single file restored from their VM: we had to copy the entire VMDK out of the snapshot directory, mount it on another VM somewhere, find the file, and get it back to the customer somehow. And if it was on our secondary filer, we had to do a FlexClone, mount that onto one of the 96 ESX hosts we had, copy the file out… etc.

Wheels spin sometimes, and an idea comes to you. Remember Inception? And all the layers? Going deeper, etc.? It’s like that.

/home/user/file.txt -> ext3 -> LVM LV -> LVM VG -> LVM PV -> /dev/sda1 -> ESX -> VMDK -> NFS Datastore -> NetApp Data OnTap -> WAFL -> Disks ..

Over the last week, my co-workers and I have been building up a system to make this easier and less disruptive to the infrastructure (which is good for everyone – the fewer changes you have to make to production, the better). The gist is this:

We have a secured VM, with a couple of NICs – one standard access port, one a VLAN trunk carrying our NAS networks, including the one that the VM Blades use to mount their storage.

Inside this VM, we do magic…

So, 96 blades – that’s a fairly large VM infrastructure. We have two separate environments, in 6 clusters, two routing domains, etc., running a total of 1050+ VMs at last count, each cluster with its own datastores, diverse physical locations, and so on. One of the service improvement projects I got our great team to implement was a set of datastores mounted onto all the clusters, routed where needed. Performance didn’t have to be great, just good enough – and on 10Gb NFS, yeah, it’s pretty good. We have an ISOs datastore, a Templates datastore and a Transfer datastore. The Transfer one was new – the others we’d had for a while.

On our secured VM, we have the Transfer datastore mounted read-write over NFS, as well as the snapvault repository versions of our datastores (mounted read-only for safety, though the files are read-only anyway). This now means that if we have to do a full VM recovery, we have a simple process –

  • Shut down the VM
  • Edit the settings to remove the hard drives you want to recover (I know, it sounds wrong to me too, but trust me..)
  • Storage vMotion the VM onto the Transfer datastore (which, since it doesn’t have any disks, is quick)
  • Locate the version of the VMDK you want in the .snapshot directory of the snapvault location (We have a simple shell script to list all versions)
  • Copy the VMDK files (remember the -flat.vmdk) from the snapvault location into the appropriate directory on the Transfer datastore, using cp backgrounded with &, then running watch ls -l on the destination if you want a progress indicator
  • Re-add the storage from the vmware settings, finding it in the place you just copied it
  • Power On VM, check it works, then hand back control to customer, and start a storage vMotion to relocate storage back into the correct primary datastore
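The listing and copy steps above might be sketched like this (paths and names are illustrative, and the run/DRY_RUN wrapper is a hypothetical helper that prints the commands by default):

```shell
#!/bin/sh
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

SNAPDIR=${SNAPDIR:-/mnt/snapvault/datastore1/.snapshot}  # illustrative
VM=${VM:-myvm}

# List every snapshot that holds a copy of this VM's directory:
run ls -d "$SNAPDIR"/*/"$VM"

# Copy the descriptor and the -flat extent into the Transfer datastore
# (background with a trailing '&' in real life, then watch the
# destination grow as a crude progress indicator):
run cp "$SNAPDIR/nightly.3/$VM/$VM.vmdk" \
       "$SNAPDIR/nightly.3/$VM/$VM-flat.vmdk" "/mnt/transfer/$VM/"
run watch ls -l "/mnt/transfer/$VM/"
```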

All done! No messing around on the NetApp making flexclones and mounting them, cleaning them up etc. Depending on your level of risk tolerance, you could copy the VMDK back to the primary location also mounted via NFS, but we consider the small delay of the storage vMotion to be a price worth paying for peace of mind.

Your site gets compromised, what do you do?

.. make people unable to use authentication methods that don’t involve giving you a password, that’s what!

Following on from the Gawker account hack, I have gone and changed a bunch of accounts – even though I may not have actually used the password I generated for Gawker anywhere else, it seemed prudent.

Lifehacker have a page up here which details the response..

Including this bit:

2) What if I logged in using Facebook Connect? Was my password compromised?
No. We never stored passwords of users who logged in using Facebook Connect. We have, however, disabled Facebook Connect logins temporarily.


So what you’re saying is: not only are you incompetent and let people steal your user/password database, you’ve now turned off the only way of logging in that stops it from happening again??

Nothing pisses me off more than websites that require you to register or log in to look at attachments on forums, for example. Facebook Connect (or, ideally, OpenID) is an awesome solution to the problem of having to create/maintain/worry about accounts on every site on the internet. I mean sure, there are idiots in marketing who love the idea of “rich user engagement” from tying people to your site with an account, but I think they severely overestimate their own importance.

.. but don’t get me started on Janrain/RPX’s recent change that suggests you put your PayPal username/password into HTML hosted on an insecure site so you can join the social engagement “story”. That’s just stupid.

so BC is getting a new Premier

The big news in BC yesterday was that Gordon Campbell stepped down as Premier. Some were loudly proclaiming victory, or expressing happiness at his departure.

As he put it in his statement: “When public debate becomes focused on one person, instead of what is in the best interest of British Columbians, we have lost sight of what is important. When that happens, it’s time for a change.”

Because let’s look at the mess he left BC in:

  • One of the lowest unemployment rates in Canada
  • Third-highest average hourly wage in Canada
  • Lowest tax rate for low-income (0%) and middle-income families in Canada
  • Up to a 70% tax reduction for low-income families
  • Opened 80 new schools, increased education funding every year, more seats in universities, highest per-pupil funding in Canada
  • Balanced budgets for 9 years, until the biggest recession in half a century
  • 42% reduction in provincial budgets before service cuts
  • A provincial credit rating upgraded 7 times in a row to AAA (the highest possible)
  • Biggest real GDP growth in Canada
  • $195 million in new arts grants
  • $80 million in new permanent sport grants and funding
  • 20% increase in the amount paid per person by income assistance
  • Low-income support program spending up by more than 4x
  • Reduced carbon and greenhouse gas emissions – the most aggressive targets set in Canada, with legal enforcement in place
  • (ganked from voice_of_experience on reddit)

Oh wait, that’s the good stuff.

And yet, there’s a downside.. apparently some people don’t like the HST (which, when you look at what else the province gives, is actually a reasonable measure..) or they didn’t like the Olympics (what are you going to do about that now? It worked out fine. Sure, it cost a lot of money..) or that the Canada Line doesn’t have enough capacity (it grew faster than expected – that’s success, isn’t it?), or that he once got arrested for drink driving (let me tell you about Ralph…).

I’m confident of history’s view of this period in politics. Also, has anyone seen Idiocracy? No? Never mind, seems like it’s playing out in politics right now, what with this and the Tea Party..

Australian food

We went to a place called Moose’s Downunder for lunch on Sunday; they bill themselves as providing a little bit of home and a unique Australian experience in Vancouver.

Well, it’s certainly as described on the box. It seems to be staffed entirely by Australians, many of whom are from Perth like the owner. I had an Aussie Burger, with beetroot, fried egg and pineapple. It did indeed remind me of home. Also, the chairs were EXACTLY the same as the ones that KK’s/The Last Drop in Crawley used to have before it turned upmarket – down to the varnish on the arms turning gooey and coming off.

On the downside, just like home, they charge for drink refills and extra sauces. So, just like home, you don’t have to tip, right? :P I kid, I kid. I did tip, as is the local custom.

Boeing Aviation Geek Fest 2010

Today was the 2010 Boeing Aviation Geek Fest.

Let me begin by saying that going on the Boeing tour at the best of times is pretty geeky. This, on the other hand, is a once-a-year tour they don’t promote heavily, but the aviation geeks find out about it one way or another. It’s slightly more expensive than the regular tour, but it’s really for the hardcore fans.

We started off the day.. well, first, getting here from Canada. We left home and drove to Sumas. It took about 1.75 hours to get across the border: first a 60-minute lineup to get to the border, then another 45 minutes with the good people of Immigration getting our I-94 waiver forms (mostly waiting in lines – despite it not being the usual “tourist” border, they were still very nice), then zooming down the highway to the Future of Flight and “checking in” for 1330 hours.

The AGF day started with a session from Boeing’s professional aviation geek, Michael Lombardi, who is employed as an aviation historian. He went through the last 40 years of Boeing and gave some fun insights and back stories, then a bit of Q+A, then some chatting with each other over free candy (yay, Halloween), then the tour.

Let me step back: the regular Boeing tour is pretty cool – you walk on high-level platforms and look out over a sight similar to the construction of the USS Enterprise in the most recent Star Trek movie. This tour, on the other hand, is at ground level, walking on the actual factory floor, and through, around and on planes in various stages of production. Sweeet. You have to wear eye protection, just in case, and watch your step through and around cables. It’s an amazing facility up close.

Inside the factory we saw 777 LN903 for Turkish Airlines up close and personal, getting to kick the tires almost literally, in addition to actually walking in and around the pieces that would make up LN908 for Air Egypt. As well as that, we saw the first 747-8I in final body join, a bunch of 787s (including the first 3 for Air India) and the 787 static test article.

Then they dragged us out of the factory, with some difficulty, and back onto the bus, which did a tour of the KPAE flightline parking lot. I believe a record for the loudest cheer for doing a left-hand turn was set this day when this was announced. We went up and around all the planes waiting for final fit-out and delivery (this site has pictures of them from afar). We saw 777s for V Australia and Air New Zealand, as well as all the 787s for ANA, and a bunch of 747-8Fs for Cargolux, Korean Airlines and Cathay Pacific Cargo.

Then it was back to the Future of Flight center for pizza and networking with other geeks before heading off to our hotel.

Everyone knows planes are big – even “small” planes like the 737 – but the size of the 747s and 777s is pretty amazing. I gush on the regular factory tour, and it’s probably more interesting for most people than the one we did, but the fact is that almost every international airliner in service today was made in either this factory or Airbus’s in Toulouse.

What Boeing makes here is pretty much the pinnacle of humankind’s knowledge of technology and ability to build machines, and it’s an amazing privilege to get up close and personal on the factory floor. Future of Flight is an amazing center at the best of times, and I have to say, today was an amazing day. I feel so lucky to have been able to attend. Very few members of the public get to do factory floor tours, and between this year’s and last year’s there was some overlap, so probably under 75 people have done this one.

So thank you very much to Future of Flight, Boeing Commercial Aircraft and Airline Reporter for organising the day! Looking forward to next year’s!

See also: Photos from the Stratodeck

Dear Cisco, wtf are you thinking?

As an expatriated person, I find myself thinking of home sometimes. Video conferencing with people from the old country is fun, so I thought I’d have a look at the details on Cisco’s new Umi video conferencing unit.

Let me say, I have no idea what they’re thinking here. It’s for home use. It costs $599. Then, you have to pay $24/month for a plan to use it. To call other people who have a Umi.

Because it doesn’t work with Skype, or FaceTime. Or anything other than Google video chat (which is itself free for non-PSTN calls).

So basically, you’re charging as much as a computer + webcam (which you could hook up to a TV), you can’t connect to Skype, and you’re charging a monthly fee for something everyone else is giving away for free.

Let me know how that works out for you…