Multi-head virtualized workstation: the end.

Four years ago I've started a journey in workstation virtualization. My goal at the time was to try and escape Apple's ecosystem as it was moving steadily toward closedness (and iOS-ness). I also though back then that it would allow me to pause planned obsolescence by isolating the hardware from the software.
I've been very wrong.

My ESXi workstation was built with power, scalability and silence in mind. And it had all this for a long time. But about 1.5 year ago I've started to notice the hum of one of its graphics card. Recently this hum turned into an unpleasant high pitched sound under load. The fans were aging and I needed a solution. Problem is, one just can't choose any graphics card off the shelf and put it into an ESXi server. It requires a compatibility study: card vs motherboard, vs PSU, vs ESXi, vs VM Operating system. If you happened to need an ESXi upgrade (from 5.x to 6.x for example) in order to use a new graphics card then you need to study the compatibility of this new ESXi with your other graphics cards, your other VM OSes, etc.
And this is where I was stuck. My main workstation was a macOS VM using an old Mac Pro Radeon that would not work on ESXi 6.x. All things considered, every single upgrade path was doomed to failure unless I could find a current graphics card, silent, that would work on ESXi 5.x and get accepted by the Windows 10 guest via PCI passthrough. I've found one: the Sapphire Radeon RX 590 Nitro+. Worked great at first. Very nice benchmark and remarquable silence. But after less than an hour I noticed that HDDs inside the ESXi were missing, gone. In fact, under GPU load the motherboard would lose its HDDs. I don't know for sure but it could have been a power problem, even though the high quality PSU was rated for 1000W. Anyway, guess what: ESXi does not like losing its boot HDD or a datastore. So I've sent the graphics card back and got a refund.
Second problem: I was stuck with a decent but old macOS release (10.11, aka El Capitan). No more updates, no more security patches. Upgrading the OS was also a complex operation with compatibility problems with the old ESXi release and with the older Mac Pro Radeon. I've tried a few things but it always ended with a no-go.
Later this year, I've given a try to another Radeon GPU, less power-hungry but it yielded to other passthrough and VM malfunctions. This time I choose to keep the new GPU as an incentive to deal with the whole ESXi mess.

So basically, the situation was: very nice multi-head setup, powerful, scalable (room for more storage, more RAM, more PCI) but stuck in the past with a 5 years old macOS using a 10 years old Mac Pro graphics card in passthrough on top of a 5 years old ESXi release, the Windows 10 GPU becoming noisy, and nowhere to go from there.

I went through the 5 stages of grief and accepted that this path was a dead-end. No more workstation virtualization, no more complex PCI passthrough, I've had enough. Few weeks ago I've started to plan my escape: I need a silent Mac with decent power and storage (photo editing), I need a silent and relatively powerful Windows 10 gaming PC, I need an always on, tiny virtualization box for everything else (splunk server, linux and FreeBSD experiments, etc.). It was supposed to be a slow migration process, maintaining both infrastructures in parallel for some weeks and allowing perfect testing and switching.
Full disclosure: it was not.

I've created the Mac first, mostly because the PC case ordered was not delivered yet. Using a NUC10i7 I've followed online instructions and installed my very first Hackintosh. It worked almost immediately. Quite happy about the result, I've launched the migration assistant on my macOS VM and on my Hackintosh and injected about 430 Go of digital life into the little black box. Good enough for a test, I was quite sure I would wipe everything and rebuild a clean system later.

Few days later I started to build the PC. I was supposed to reclaim a not-so-useful SSD from the ESXi workstation to use as the main bare metal PC storage. I've made sure nothing was on the SSD, I've shutdown ESXi and removed the SSD and it's SATA cable. I've also removed another SSD+cable that was not used (failed migration attempt to ESXi 6.x and test for Proxmox). I've restarted ESXi just to find out a third SSD has disappeared: a very useful datastore is missing, 7 or 8 VM are impacted, partially or totally. The macOS VM is dead, main VMDK is missing (everything else is present, even its Time Machine VMDK), the Splunk VM is gone with +60 Go of logs, Ubuntu server is gone, some FreeBSD are gone too, etc.
Few reboots later, I extract the faulty SSD and start testing: different cable, different port, different PC. Nothing works and the SSD is not even detected by the BIOS (on both PCs).
This is a good incentive for a fast migration to bare metal PCs.
Fortunately:
- a spare macOS 10.11 VM, blank but fully functional, is waiting for me on an NFS datastore (backed by FreeBSD and ZFS).
- the Time machine VMDK of my macOS VM workstation is OK
- my Hackintosh is ready even though its data is about a week old
- the Windows 10 VM workstation is fully functional

So I've plugged the Time machine disk into the spare macOS VM, booted it, and launched Disk Utility to create a compressed image of the Time machine disk. Then I've copied this 350 Go dmg file on the Hackintosh SSD, after what I've mounted this image and copied the week worth of out-of-sync data to my new macOS bare metal workstation (mostly Lightroom related files and pictures).
I've plugged the reclaimed SSD into the new PC and installed Windows 10, configured everything I need, started Steam and downloaded my usual games.
Last but not least, I've shutdown the ESXi workstation, for good this time, unplugged everything (a real mess), cleaned up a bit, installed the new, way smaller, gaming PC, plugged everything.

Unfortunately, the Hackintosh uses macOS Catalina. This version won't run many of paid and free software I'm using. Say good bye to my Adobe CS 5 suite, bought years ago, good bye to BBEdit (I'll buy the latest release ASAP), etc. My Dock is a graveyard of incompatible applications. Only sparkle of luck here: LightRoom 3 that seems to be pretty happy on macOS 10.15.6.

In less than one day and a half I've moved from a broken multi-head virtualized workstation to bare metal PCs running up-to-date OSes on top of up-to-date hardware. Still MIA, the virtualization hardware to re-create my lab.

What saved me:
- backups
- preparedness and contingency plan
- backups again

Things to do:
- put the Hackintosh into a fanless case
- add an SSD for Time machine
- add second drive in Windows 10 PC for backups
- buy another NUC for virtualization lab
- buy missing software or find alternatives

Recherche Administratrice/teur Systèmes en alternance

Au sein du Service Opérations de la DSI de l'Université Lyon 2, nous cherchons un ou une Administrateur/trice Systèmes en alternance pour nous aider à relever des défis au quotidien.
Lieu de travail : campus de Bron (arrêt de tram T2 Europe Université).

  • Vous habitez la région Lyonnaise ;
  • Vous êtes motivée par les enjeux de la gestion d’un parc de plus de 400 serveurs Linux RedHat/CentOS (70%), Windows, FreeBSD ;
  • Les problématiques d’une ferme de virtualisation multi-site avec balance de charge, PRA, sauvegardes croisées ne vous font pas peur ;
  • Les infrastructures à fort enjeu de disponibilité, les outils d’automatisation, de monitoring et les SIEM vous intéressent ;
  • Vous êtes passionnée par les problématiques système et souhaitez évoluer dans un environnement riche et varié ;
  • Vous êtes curieuse, très rigoureuse et vous avez le sens du service.

Si vous vous reconnaissez dans ce profil, contactez-moi !

terminal - édition d'un script shell

Édition d'un script shell dans le terminal

Escaping the Apple ecosystem: a view of the setup

Here is a quick & dirty view of the physical and logical setup of my new workstation. The linux part is not finished yet (no drivers for Radeon GPU, thank you Ubuntu), it's a work in progress.

esx
Not depicted: each USB controller sports 4 USB ports (yellow) or 2 USB ports (pink and blue). It allows me to plug few devices that won't be "managed" by the USB switch.
USB devices plugged-in on the switch are made available to only one VM at a time. When I press the switch button, they disappear for the current VM and are presented to the next one.

Escaping the Apple ecosystem: part 3

In part 2, I was able to create and use a Windows 7 VM with the Radeon R9 270x in passthrough. It works really great. But OSX and Linux where more difficult to play with.

List of virtual machines

List of virtual machines


Since then, I've made tremendous progress: I've managed to run an OSX 10.11.6 VM properly, but more importantly, I've managed to run my native Mac OS X 10.6.8 system as a VM, with the Mac's Radeon in passthrough.
I've removed my Mac OS X SSD and the Mac's graphics card from the Mac Pro tower, and installed them into the PC tower. Then I've created the VM for the 10.6.8 system, configured ESXi to use Mac's Radeon with VT-d, etc.
The only real problem here is that adding a PCI card into the PC tower makes PCI device numbers change: it breaks almost every passthrough already configured. I had to remake VT-d config for the Windows VM. Apart from that, it went smoothly.
Currently, I'm working on my native 10.6.8 system, that runs as a VM, and the Windows VM is playing my music (because the Realtek HD audio controller is dedicated to the Windows VM).
Moving from a Mac Pro with 4-core 2.8 GHz Xeon to a 6-core 3.5 GHz Core i7 really gives a boost to my old 10.6.8 system.

Running both OSes, the box is almost as silent as the Mac Pro while packing almost twice as more raw CPU power and 2.7x more GPU power.

The Mac Pro is now empty: no disks, no graphics card, and will probably go on sale soon.

to-do list:

  • secure the whole infrastructure ;
  • install 2nd-hand MSI R9 270x when it's delivered ;
  • properly setup Linux to use AMD graphics card.

I might also add few SSDs and a DVD burner before year's end.

Escaping the Apple ecosystem: part 2

In part 1, I've written about the BoM of my project and the associated to-do list.
First item on this list was: build the box. That did not go as smoothly as expected. The motherboard was not fully operational: after few minutes of run time (between 5 and 30), it would trigger a CPU overheat alarm, even when the CPU was idle and cool. Supermicro's Support made me tried a new BIOS, with no effect, so I've finally send the board for exchange. The new board arrived but I've had to delay the rebuilt for few weeks.
Now the PC is up and running. The new motherboard seems to work great, but I've not tested IPMI yet. IPMI was the very first feature I've used on the first board, and there is a slight probability that the CPU overheat problem comes from a probe malfunction related to the BMC. Let's keep that for later.

I've chosen to run this box on VMware ESXi 5.5, because it's quite common (more than the latest 6.x), because it sports features I need like passthrough, and because most VMware based multi-sit PC projects like mine are using ESXi 5.x.

ESXi is quite easy to install, I won't give details. Main hdd in the box is installed with ESXi, which is configured thanks to a USB keyboard and a display plugged in the VGA port of the motherboard. After basic configuration (network, user password...), I've switched to remote configuration through VSphere Client, installed on a Windows PC (really a VM running on the Mac).

General view

General view of ESX's configuration


Configuring passthrough for GPUs is pretty straightforward, because I've started with only one GPU, and because these are discrete PCI cards. On the other hand, passthrough of USB controller can be tricky: many controllers, nothing to identify them except trial & error (unless you have the blueprint for your motherboard telling what physical USB ports belong to what controller).
Go to configuration tab, then "Advanced", and finally click "edit" on the right

Go to configuration tab, then "Advanced", and finally click "edit" on the right


When you click "Edit…" a window opens that lists interesting devices you can try to passthrough.
Choose some. Then you have to reboot the ESXi and add some of these devices to a VM.
passthrough
I've created a Windows 7 pro 64bits VM, with raw device mapping pointing to an SSD. I've added every available PCI devices to this VM (USB, sound, GPU) and installed Windows plus updates.
It's important to remember that initially, most PCI devices might not work at all because of missing drivers on guest OS (here it's Windows). Hence, after installing Windows, the Radeon was detected but not used, and only Intel USB controllers where working. I've installed AMD drivers, and ASMedia drivers (courtesy of Supermicro). I've also installed VMware Tools.
After all this, the Windows VM properly uses my Dell display hocked-up on the MSI R9 270x Radeon, and I can interact with the system thanks to a real keyboard and mouse. Passing through the whole USB controller allows me to use any USB device I want. I've successfully plugged-in and used a thumbdrive, a USB gaming headset and a USB hub.
I've made some GPU/CPU benchmarks and everything looks perfect. I've tested Left 4 Dead 2 game play, and it looked great too (I'll probably have to tweak anti aliasing settings to make it perfect).

The Windows part was quite fast to setup, and is almost done now. I've started to fight with OSX and Ubuntu, but things are not easy with both of them. It looks like my 3 years old graphics card it so new that OSX does not support it until 10.11.x, and Ubuntu won't allow me to install Radeon drivers on 16.x LTS because they wait for some software to stabilize before packaging it…

To-do list:

  • fix problems with OSX and Ubuntu virtualization
  • find another MSI Radeon R9 270X GAMING 2G (of course it's no longer in stock…)
  • fully test Mac OS X 10.6.8 with Mac's graphics card instead of MSI Radeon

Escaping the Apple ecosystem: part 1

X10SRA-F_specBack in late 2012, I've started to think about my post-Apple days. I knew already that I would not endorse the full cloud crap, and the oversimplification of OS X that follows the iOS convergence. My hardware, my OS, my data.

I'm still running a Mac Pro 2010, with Mac OS X 10.6.8, the last great OS from Apple. Unfortunately (in)security is not what it used to be, and browsers, ssl libraries, etc. are updated frequently and older OSes are no longer supported. So it was time to switch away from 10.6.8 for my online activities. Meaning, it's time to create my multi-headed workstation running various OSes with dedicated GPU.

After about a year spent looking for the right pieces of hardware, I've came up with this BOM:

Hopefully it will yield to great results with virtualization, passthrough of GPUs and USB.
For now, only one GPU, just in case it's not working accordingly. That will allow me to test every OSes I want to use with GPU passthrough. I'll buy 2 more GPUs when it's fully tested and satisfactory.

To do list:

  • Assemble the workstation
  • Install and patch VMWare (to allow the virtualization of Mac OS X on non-Apple hardware)
  • Test each VM with GPU+USB+Sound passthrough

Due to the exotic nature of some of these pieces of hardware, I've had to order them from four retailers in two countries: Amazon (France), LDLC (France), PC-Cooling (Germany) and MindFactory (Germany).

I've got only one delivery problem, for the PC case: it was delivered through GLS Group, and it took them more than a week to drive 10 km from their warehouse to my home. When it finally went through my door, the box was trashed and the PC case had one foot slightly hammered into the case. It can't stand still on its 4 feet.
Whatever you buy abroad, just make sure it won't be delivered through GLS. The driver was a pain in the ass, calling me to delay the delivery.

Le PC dans mon téléphone

Précédemment, j'ai évoqué l'ordinateur de demain. J'ai fait un test grandeur nature façon corporate. Comprendre : vous aurez du mal à faire pareil chez vous. J'utilise en effet quelques artifices comme un accès VPN au réseau de mon employeur, et au travers de cet accès une machine virtuelle VMware View.
Au final, j'ai le bureau d'une machine Windows 7 affiché sur la TV du salon, j'ai un clavier et une souris bluetooth, le tout servi par mon téléphone malin, via un tunnel VPN dans une connexion WIFI. La connexion avec la machine virtuelle se fait via le client officiel View, de VMware (donc en PCoIP).

Windows dans ma TV sur mon téléphone

Ci dessus une photo de la TV affichant le bureau de la VM dans laquelle est lancé le client VSphere (outil d'administration de datacenter VMware).

À 2m50, le texte dans l'interface n'est pas hyper lisible sur l"écran 80cm, donc pour le PC dans le canapé, on repassera, ou alors on fera des réglages spéciaux sur la machine virtuelle.

Le client View se comporte relativement bien, néanmoins, il y a comme un petit souci avec le clavier. Ce dernier fonctionne parfaitement en azerty partout, sauf à l'intérieur du Windows, où il fonctionne en qwerty. Le client View dispose d'un champs de saisie qui permet de taper du texte à l'extérieur de la VM et de l'envoyer ensuite vers cette dernière. À cet endroit, c'est bien de l'azerty. Mais le même texte tapé directement dans la VM passe en qwerty. C'est peut être du à mon téléphone, qui est paramétré tout en anglais.

Aussi, suite à un blocage logiciel sur ma VM, j'ai tenté un ctrl-alt-suppr via mon clavier bluetooth. Le programme n'a pas rendu la main, la VM n'a pas rebooté. Non. C'est le téléphone qui a rebooté. Ça m'a fait rire.

Benchmark: virtualized OS X vs Windows

Lately I've discussed the performance drop between a virtualized Mac OS X and the same system running natively on a Mac Pro. My virtualization project is not limited to Mac OS X of course. Windows, Linux, FreeBSD are also part of the deal. In order to further test my virtualized workstation setup, I've created a Windows Server 2008 R2 VM.
Every VM runs on top of ESXi, only one VM at a time so no interference is possible. Each VM uses the ATI Radeon HD 5770 PCIe card directly thanks to VMware passthrough mode. ESXi is running on a Mac Pro, and the native OS X system runs on the same Mac Pro so I have a consistent hardware platform.

I've given Cinebench a ride on this Windows VM, and I must admit, results are appalling… for Mac OS X:

Cinebench OS X 10.8.2 native OS X 10.8.2 VM Windows Server 2008 R2 VM
CORES 4 4 4
LOGICALCORES 2 1 1
MHZ 2800 2663 2800
CBCPUX 5.038354 3.797552 3.962436
CBOPENGL 32.284100 27.319487 53.606468

I'm afraid a virtualized Windows system achieves better results than a native OS X. And not just a little bit better, but 66% better. We knew for ages that Apple ships crappy graphics card drivers and almost obsolete OpenGL. This is one more evidence.

After further research, I've finally succeeded in launching some Valve games on this windows VM: Half Life Lost Coast and Portal. They both run quite nicely. The HL Lost Coast integrated benchmark scores a very nice 229,82 FPS and the portal frame rate displayed by the command cl_showfps 1 was around 200 and 300.
On Team Fortress 2 I've been able to make a proper benchmark. That's not as detailed as my L4D2 bench, but that's enough.
I've recorded a game on TF2, Mac OS X 10.6.8, played it back with the timedemo command on the same system, and on the Windows VM.
It's a short demo (4099 frames) featuring a control point map with 12 players (11 bots, and me). Video settings were the same on both sides, of course.

Mac OS X 10.6.8 Native Windows VM
average 59.04 fps 59.83 fps
variability 2.764 fps 3.270 fps

It looks like something is capping the fps at 60. I don't know if it comes from my settings, or if it comes from outside the game. Both scores are very similar. Mac OS X's only bonus is the smaller variability, meaning its frame rate is more consistent throughout the demo. If only I had sound in my VMs…

Next step: try to configure a Ubuntu VM so it can use the ATI Radeon HD 5770 PCIe card, and make good use of my Steam On Linux beta test account.