Escaping the Apple ecosystem: part 1

X10SRA-F_specBack in late 2012, I've started to think about my post-Apple days. I knew already that I would not endorse the full cloud crap, and the oversimplification of OS X that follows the iOS convergence. My hardware, my OS, my data.

I'm still running a Mac Pro 2010, with Mac OS X 10.6.8, the last great OS from Apple. Unfortunately (in)security is not what it used to be, and browsers, ssl libraries, etc. are updated frequently and older OSes are no longer supported. So it was time to switch away from 10.6.8 for my online activities. Meaning, it's time to create my multi-headed workstation running various OSes with dedicated GPU.

After about a year spent looking for the right pieces of hardware, I've came up with this BOM:

Hopefully it will yield to great results with virtualization, passthrough of GPUs and USB.
For now, only one GPU, just in case it's not working accordingly. That will allow me to test every OSes I want to use with GPU passthrough. I'll buy 2 more GPUs when it's fully tested and satisfactory.

To do list:

  • Assemble the workstation
  • Install and patch VMWare (to allow the virtualization of Mac OS X on non-Apple hardware)
  • Test each VM with GPU+USB+Sound passthrough

Due to the exotic nature of some of these pieces of hardware, I've had to order them from four retailers in two countries: Amazon (France), LDLC (France), PC-Cooling (Germany) and MindFactory (Germany).

I've got only one delivery problem, for the PC case: it was delivered through GLS Group, and it took them more than a week to drive 10 km from their warehouse to my home. When it finally went through my door, the box was trashed and the PC case had one foot slightly hammered into the case. It can't stand still on its 4 feet.
Whatever you buy abroad, just make sure it won't be delivered through GLS. The driver was a pain in the ass, calling me to delay the delivery.

Data leak at Elinchrom

As I'm paranoid, I create a brand new email address each time I've got to register on a web site. It's easy enough and allows me to partition usages and detect abuse. If one of those dedicated addresses ends up in the wild (spammers for example), I just destroy it.

CCBY ©Yuri Samoilov via Flickr

CCBY ©Yuri Samoilov via Flickr


So, in 2010 I registered a support account on Elinchrom's web site with a tailored address : elinchrom (@patpro.net of course). Later, during September 2014, I've received a big spam at that address, neither from Elinchrom nor from one of their partner. After a quick search I've found many other spams blocked by my server. The earlier arrived in 2013.
I've contacted the company, explained the problem, and sent extracts from my mail server logs. After few days, they could find no evidence of a compromise or a data leak on their side. Fair enough, that's not the kind of things you can detect easily, especially if it's years old. On a side note: it's not impossible that I've used the same email address for a contest or event registration affiliated with Elinchrom but not run by them. My bad. At least, this time they deserved the benefit of the doubt.
Following this leak, I've changed the address email to elinchrom2014 (still @patpro.net) and changed the associated password.
The 31th of December 2015 I've received my very first email at this new address: a Paypal phishing attempt, out of a hacked web server somewhere…
Fine.

As I'm super-vigilant since 2014 not to use this address anywhere, the only possible scenario is a data leak at Elinchrom. Going back into my logs, I've found the earliest spam blocked dated from the 24th of October:

Oct 24 17:17:50 postfix/smtpd[84170]: NOQUEUE: reject: RCPT from unknown[202.71.131.54]: 550 5.7.1 Client host rejected: cannot find your hostname, [202.71.131.54]; from=<apache@corp17.net4india.com> to=<elinchrom2014@...> proto=ESMTP helo=<smtp.net4india.com>

I've immediately destroyed this email address, created a new one (longer, with random characters), changed the password again.

Elinchrom: seriously, that's ridiculous, do something. Even the user authentication does not use HTTPS. WAKE UP, it's 2016.

Log aggregation and analysis: logstash

Logstash is free software, as in beer and speech. It can use many different backends, filters, etc. It comes packaged with Elasticsearch as a backend, and Kibana as user interface, by default. It makes a pleasant package to start with, as it's readily available for the user to start feeding logs. For your personal use, demo, or testing, the package is enough. But if you want to seriously use LS+ES you must have at least a dedicated Elasticsearch cluster.

apache-log-logstash-kibana

Starting with Logstash 1.4.0, the release is no longer a single jar file. It's now a fully browsable directory tree allowing you to manipulate files more easily.
ELK (Elasticsearch+Logstash+Kibana) is quite easy to deploy, but unlike Splunk, you'll have to install prerequisites yourself (Java, for example). No big deal. But the learning curve of ELK is harder. It took me almost a week to get some interesting results. I can blame the 1.4.0 release that is a bit buggy and won't start-up agent and web as advertised, the documentation that is light years away from what Splunk provides, the modularity of the solution that makes you wonder where to find support (is this an Elasticsearch question? a Kibana problem? some kind of grok issue?), etc.

Before going further with functionalities lets take a look at how ELK works. Logstash is the log aggregator tool. It's the piece of software in the middle of the mess, taking logs, filtering them, and sending them to any output you choose. Logstash takes logs through about 40 different "inputs" advertised in the documentation. You can think of file and syslog, of course, stdin, snmptrap, and so on. You'll also find some exotic inputs like twitter. That's in Logstash that you will spend the more time initially, tuning inputs, and tuning filters.
Elasticsearch is your storage backend. It's where Logstash outputs its filtered data. Elasticsearch can be very complex and needs a bit of work if you want to use it for production. It's more or less a clustered database system.
Kibana is the user interface to Elasticsearch. Kibana does not talk to your Logstash install. It will only talk to your Elasticsearch cluster. The thing I love the most about Kibana, is that it does not require any server-side processing. Kibana is entirely HTML and Javascript. You can even use a local copy of Kibana on your workstation to send request to a remote Elasticsearch cluster. This is important. Because Javascript is accessing your Elasticsearch server directly, it means that your Elasticsearch server has to be accessible from where you stand. This is not a good idea to let the world browse your indexed logs, or worse, write into your Elasticsearch cluster.

To avoid security complications the best move is to hide your ELK install behind an HTTP proxy. I'm using Apache, but anything else is fine (Nginx for example).
Knowing that 127.0.0.1:9292 is served by "logstash web" command, and 127.0.0.1:9200 is default Elasticsearch socket, your can use those Apache directives to get remote access based on IP addresses. Feel free to use any other access control policy.

ProxyPass /KI http://127.0.0.1:9292 
ProxyPassReverse /KI http://127.0.0.1:9292 
ProxyPass /ES http://127.0.0.1:9200 
ProxyPassReverse /ES http://127.0.0.1:9200 
<Location /KI>
	Order Allow,Deny
	Allow from YOUR-IP 127.0.0.1
</Location>
<Location /ES>
	Order Allow,Deny
	Allow from YOUR-IP 127.0.0.1
</Location>

original data in µs, result in µs. Impossible to convert in hours (17h09)

original data in µs, result in µs. Impossible to convert in hours (17h09)

On the user side, ELK looks a lot like Splunk. Using queries to search through indexed logs is working the same, even if syntax is different. But Splunk allows you to pipe results into operators and math/stats/presentation functions… ELK is not really built for complex searches and the user cannot transform data with functions. The philosophy around Kibana is all about dashboards, with a very limited set of functions. You can build histograms, geoip maps, counters, compute some basic stats. You cannot make something as simple as rounding a number, or dynamically get a geolocation for an IP address. Everything has to be computed through Logstash filters, before reaching the Elasticsearch backend. So everything has to be computed before you know you need it.
Working with Logstash requires a lot of planing: breakdown of data with filters, process the result (geoip, calculation, normalization…), inject into Elasticsearch, taylor your request in Kibana, create the appropriate dashboard. And in the end, it won't allow you to mine your data as deep as I would want.
Kibana makes it very easy to save, store, share your dashboards/searches but is not very friendly with clear analysis needs.

Elasticsearch+Logstash+Kibana is an interesting product, for sure. It's also very badly documented. It looks like a free Splunk, but its only on the surface. I've been testing both for more than a month now, and I can testify they don't have a lot in common when it comes to use them on the field.

If you want pretty dashboards, and a nice web-based grep, go for ELK. It can also help a lot your command-line-illeterate colleagues. You know, those who don't know how to compute human-readable stats with a grep/awk one-liner and who gratefully rely on a dashboard printing a 61 billions microseconds figure.
If you want more than that, if you need some analytics, or even forensic, then odds are that ELK will let you down, and it makes me sad.

Metasploit hearbleed scanner reliably crashes (some) HP ILO

Disclaimer: HP ILO Firmware involved here is NOT the latest available. ILO 3 cards were not affected.

Using Metasploit to scan my network for OpenSSL "Heartbleed" vulnerability I've been quite shocked to get a handful of alerts in my mailbox. Our HP C7000 was no longer able to talk to some of our HP Proliant blades' ILO.

oh-crap

Production was all good, and service was still delivered, but blade management was impossible: Metasploit's auxiliary/scanner/ssl/openssl_heartbleed.rb probe has just crashed six HP ILO 2 cards. That's funny, because the ILO firmware is not vulnerable to heartbleed, it's only vulnerable to the scanner...

heartbleed

I've made some tests to repeat the problem. It happens with a 100% reliability. Quite impressive.

Software versions:

Metasploit
Framework: 4.8.2-2013121101
Console  : 4.8.2-2013121101.15168
openssl_heartbleed.rb : downloaded on April the 22th.

HP ProLiant BL490c G6 (product ID 509316-B21), ILO 2 firmware undisclosed for security reasons.

Log aggregation and analysis: splunk

As soon as you have one server, you might be tempted to do some log analysis. That can be to get some metrics from your Apache logs, your spam filter, or whatever time-stamped data your server collects. You can easily find small tools, or even create a home-made solution to extract info from these files.
Now imagine you have 100, 200, or even thousands of servers. This home-made solution you've created no longer suits your needs.
Different powerful products exist, but I'll focus on two of them: Splunk on one side, Logstash+Elasticsearch+Kibana on the other side. This post is dedicated to Splunk. Logstash will come later.

Both softwares are tools. These are not all-in-one solution. Exactly like a spreadsheet software which is unable to calculate your taxes unless you design a specific table to do so, you must use the software to create value from your logs. Installing and feeding logs into the software is not the end of your work, it's the very beginning.

Splunk is a commercial product. It's incredibly powerful out of the box, and its documentation is very good. Every aspect of the software is covered in depth with numerous examples. It also has an official support. Unfortunately Splunk is very (very) expensive, and no official rates are available online. Hiding the cost of a software is often the promise you won't be able to afford it.
Splunk is well packaged and will run effortlessly on many common systems. At least for testing. Scaling up requires some work. I've been told that apparently, scaling up to more than few TB of daily logs can be difficult, but I don't have enough technical details to make a definitive statement about this.
Rest assured that Splunk is a very nice piece of software. It took me only an hour from the time I've installed it on a FreeBSD server, to the time I've produced a world map showing spam filter action breakdown by location:

amavis-geoloc

One hour. This is very short, almost insane. The map is fully interactive, and you can click any pie chart to display the table of values and the search request allowing you to create this table:

splunk-table

The query syntax is quite pleasant and almost natural. The search box is very helpful and suggests "Common next commands", or "Command history" alongside with documentation and example:

splunk-search

Splunk has some other killing features like users management and access control, assisted (almost automated) regex design for field extraction, or its plugin system. The Field extraction "wizard" is quite impressive, as it allows you to extract new fields out of already indexed logs, without writing any regex nor re-indexing your logs. You just browse your logs, paste samples of data you want to extract, and it builds the filter for you.
Transactions are also a pretty damn great feature: they make correlation of different events possible (login and logout, for example), so that you can track complex behaviors.

More importantly, Splunk appears to be simple enough so any sysadmin wants to use it and does not get rebutted by it's complexity. It's a matter of minute(s) to get, for example, the total CPU time involved in spam filtering last month (~573 hours here). Or if you want, the total CPU time your antispam wasted analyzing incoming emails from facebook (~14.5 hours). But it's definitively a very complex software, and you have to invest a great deal of time in order to get value (analytics designed for you) from what you paid for (the license fees).

I love Splunk, but it's way too expensive for me (and for the tax payers whose I use the money). So I'm currently giving Logstash a try and I'm quite happy about it.

ZFS primarycache: all versus metadata

In my previous post, I wrote about tuning a ZFS storage for MySQL. For InnoDB storage engine, I've tuned the primarycache property so that only metadata would get cached by ZFS. It makes sense for this particular use, but in most cases you'll want to keep the default primarycache setting (all).
Here is a real world example showing how a non-MySQL workload is affected by this setting.

On a virtual server, 2 vCPU, 8 GB RAM, running FreeBSD 9.1-RELEASE-p7, I have a huge zpool of about 4 TB. It uses gzip compression, and stores 1.8TB of emails (compressratio 1.61x) and 1TB documents (compressratio 1.15x). Documents and emails have their own dataset, on the same zpool.
As it's a secondary backup server, isolated from production, I can easily make some tests.

I've launched clamscan (the binary part of ClamAV virus scanner) against a small branch of the email storage tree (about 1/1000 of total emails) and measured the zpool IOs, CPU usage and total runtime of the scan.
Before each run, I've rebooted the server to clear cache.
clamscan is set up so that every temporary files are written into a UFS2 filesystem (/tmp).

One run was made with property primarycache set to all, the other run was made with primarycache set to metadata.

Total runtime with default settings primarycache=all is less than 15 minutes, for 20518 files:

Scanned directories: 1
Scanned files: 20518
Infected files: 2
Data scanned: 7229.92 MB
Data read: 2664.59 MB (ratio 2.71:1)
Time: 892.431 sec (14 m 52 s)

Total runtime with default settings primarycache=metadata is more than 33 minutes:

Scanned directories: 1
Scanned files: 20518
Infected files: 2
Data scanned: 7229.92 MB
Data read: 2664.59 MB (ratio 2.71:1)
Time: 2029.921 sec (33 m 49 s)

zpool iostat every 5 seconds, with different primarycache settings, ~10 minutes range.

zpool IO stats

CPU usage for clamscan process, and for kernel{zio_read_intr_0} kernel thread. 5 seconds sampling, with different primarycache settings, ~10 minutes range.

CPU stats

In both tests, the server is freshly rebooted, cache is empty. Nevertheless, when primarycache=all the kernel{zio_read_intr_0} thread consumes very few CPU cycles, and the clamscan process run's more than twice as fast as the same process with primarycache=metadata.
More importantly, clamscan manages to read the exact same amount of data in both tests, using 10 times less IO throughput when primarycache is set to all.

There is something weird. Let's make another test:

I create 2 brand new datasets, both with primarycache=none and compression=lz4, and I copy in each one a 4.8GB file (2.05x compressratio). Then I set primarycache=all on the first one, and primarycache=metadata on the second one.
I cat the first file into /dev/null with zpool iostat running in another terminal. And finally, I cat the second file the same way.

The sum of read bandwidth column is (almost) exactly the physical size of the file on the disk (du output) for the dataset with primarycache=all: 2.44GB.
For the other dataset, with primarycache=metadata, the sum of the read bandwidth column is ...wait for it... 77.95GB.

There is some sort of voodoo under the hood that I can't explain. Feel free to comment if you have any idea on the subject.

A FreeBSD user has posted an interesting explanation to this puzzling behavior in FreeBSD forums:

clamscan reads a file, gets 4k (pagesize?) of data and processes it, then it reads the next 4k, etc.

ZFS, however, cannot read just 4k. It reads 128k (recordsize) by default. Since there is no cache (you've turned it off) the rest of the data is thrown away.

128k / 4k = 32
32 x 2.44GB = 78.08GB

Damnit.

MySQL on ZFS (on FreeBSD)

logofreebsdLots on FreeBSD users are coming to ZFS since the release of FreeBSD 10, as it's so easy to install the system on top of that powerful filesystem. ZFS default behavior and settings are perfect for a wide range of workloads and uses, but it's not exactly what you need for databases hosting.
I'm running a FreeBSD 9.x server hosting http, php, mysql, mail, and many other things, so a databases-only optimization would be counterproductive. After a good dive into experts do's & don't's here are few steps I've taken to tune my server.

Datasets

If like me you've started hosting MySQL databases a long time ago (say ~15 years ago) you might have some DB using the old MyISAM engine, and some newer ones using InnoDB. Guess what. They use different block sizes, and they use different caching mechanisms.
On top of that, the block size for InnoDB is not the same when you deal with data files, and when you deal with log files.
In order to get the best block size and cache tuning possible, I've created 3 datasets: one for InnoDB data files, one for innodb logs, and one for everything else, including MyISAM databases.

Block size

The block size is engine dependent. MyISAM uses a 8k block size, InnoDB uses a 16k block size for data and 128k for logs. It's easy to setup datasets for a particular block size at creation with -o option, or after creation with zfs set. You must set the proper block size before putting any data on the dataset, otherwise pre-existing data won't use the desired block size.

Caching

MyISAM engine relies on the underlying filesystem caching mechanism, so you must ensure ZFS will cache both data and metadata (that's the default behavior). On the other hand, InnoDB uses an internal cache, so it would be a waste on memory to cache the same data into ZFS and InnoDB. On a dedicated MySQL server, the proper tuning would require to limit ARC size, and to disable data caching in ZFS. On a general-purpose server, ARC size should not be tweaked, but on InnoDB dedicated datasets it's easy to disable ZFS cache for data by setting the property primarycache to "metadata".

On MySQL's side

You must of course tell mysqld where to find data files and logs. I've set those values in my /var/db/mysql/my.cnf file:

innodb_data_home_dir = /var/db/mysql-innodb
innodb_log_group_home_dir = /var/db/mysql-innodb-logs

where /var/db/mysql-innodb and /var/db/mysql-innodb-logs are mount points for the dataset dedicated to InnoDB data files, and the dataset dedicated to InnoDB log files.

As my zpool is a mirror, I've added this setting too:

skip-innodb_doublewrite

Step by step

I've followed this course of action:

1st step: mailing to users, "MySQL will be unavailable for about 5 minutes, hang on."
2nd step: backup /var/db/mysql and edit your my.cnf
3rd step: shutdown your mysql server

sudo service mysql-server stop

4th step: move /var/db/mysql to /var/db/mysql-origin
5th step: create appropriate datasets:

zfs create -o recordsize=16k -o primarycache=metadata zmirror/var/db/mysql-innodb
zfs create -o recordsize=128k -o primarycache=metadata zmirror/var/db/mysql-innodb-logs
zfs create -o recordsize=8k zmirror/var/db/mysql

6th step: move data from /var/db/mysql-origin to your new datasets

cd /var/db/
sudo mv mysql-origin/ib_logfile* mysql-innodb-logs/
sudo mv mysql-origin/ibdata1 mysql-innodb/
sudo mv mysql-origin/* mysql/

and set proper rights:

sudo chown mysql:mysql mysql-innodb-logs mysql-innodb mysql
sudo chmod o= mysql mysql-innodb-logs mysql-innodb

7th step: restart your mysql server

sudo service mysql-server start & tail -f mysql/${HOSTNAME}.err

8th step: mailing to users, "MySQL's back online maintenance duration 5 min 14 sec."

Further reading & references

MySQL Innodb ZFS Best Practices
A look at MySQL on ZFS
Optimizing MySQL performance with ZFS
ZFS for Databases

Dtrace on FreeBSD 9.2

FreeBSD logoAs of FreeBSD 9.2-RELEASE, Dtrace is enabled by default in the GENERIC kernel. We where already able to activate Dtrace, on earlier releases, but we would have to make a custom kernel. Not always what you want because it breaks binary updates.

Now that Dtrace is enabled (and updated) in FreeBSD, everyone can use it without the hassle of a custom kernel. You still have to load Dtrace's modules. Let's see how it works.

first you go root and try a Dtrace related command:

$ sudo -Es
Password:
# dtruss ls
dtrace: failed to initialize dtrace: DTrace device not available on system

Looks like Dtrace is not loaded:

# kldstat 
Id Refs Address            Size     Name
 1   23 0xffffffff80200000 15b93c0  kernel
 2    1 0xffffffff817ba000 23d078   zfs.ko
 3    2 0xffffffff819f8000 84e0     opensolaris.ko
 4    1 0xffffffff81a01000 4828     coretemp.ko
 5    1 0xffffffff81c12000 2bce     pflog.ko
 6    1 0xffffffff81c15000 3078d    pf.ko
 7    1 0xffffffff81c46000 57bcf    linux.ko
 8    1 0xffffffff81c9e000 3c3d     wlan_xauth.ko

lets find Dtrace related modules:

# ls /boot/kernel/dtrace*
/boot/kernel/dtrace.ko			/boot/kernel/dtrace_test.ko.symbols
/boot/kernel/dtrace.ko.symbols		/boot/kernel/dtraceall.ko
/boot/kernel/dtrace_test.ko		/boot/kernel/dtraceall.ko.symbols

/boot/kernel/dtraceall.ko looks like a winner.

# kldload dtraceall
# kldstat 
Id Refs Address            Size     Name
 1   59 0xffffffff80200000 15b93c0  kernel
 2    1 0xffffffff817ba000 23d078   zfs.ko
 3   16 0xffffffff819f8000 84e0     opensolaris.ko
 4    1 0xffffffff81a01000 4828     coretemp.ko
 5    1 0xffffffff81c12000 2bce     pflog.ko
 6    1 0xffffffff81c15000 3078d    pf.ko
 7    1 0xffffffff81c46000 57bcf    linux.ko
 8    1 0xffffffff81c9e000 3c3d     wlan_xauth.ko
 9    1 0xffffffff81ca2000 ba2      dtraceall.ko
10    1 0xffffffff81ca3000 4ed3     profile.ko
11    3 0xffffffff81ca8000 402c     cyclic.ko
12   12 0xffffffff81cad000 23dbaa   dtrace.ko
13    1 0xffffffff81eeb000 fb3d     systrace_freebsd32.ko
14    1 0xffffffff81efb000 109bd    systrace.ko
15    1 0xffffffff81f0c000 45bb     sdt.ko
16    1 0xffffffff81f11000 4926     lockstat.ko
17    1 0xffffffff81f16000 bf0b     fasttrap.ko
18    1 0xffffffff81f22000 6673     fbt.ko
19    1 0xffffffff81f29000 4eeb     dtnfsclient.ko
20    1 0xffffffff81f2e000 1dd92    nfsclient.ko
21    1 0xffffffff81f4c000 47c0     nfs_common.ko
22    1 0xffffffff81f51000 55f2     dtnfscl.ko
23    1 0xffffffff81f57000 45cd     dtmalloc.ko
24    1 0xffffffff81f5c000 44fc     dtio.ko

tadaaaam:

# dtruss ls
SYSCALL(args) 		 = return
mmap(0x0, 0x8000, 0x3)		 = 6418432 0
issetugid(0x0, 0x0, 0x0)		 = 0 0
lstat("/etc\0", 0x7FFFFFFFC3C0, 0x0)		 = 0 0
lstat("/etc/libmap.conf\0", 0x7FFFFFFFC3C0, 0x0)		 = 0 0
open("/etc/libmap.conf\0", 0x0, 0x81FA20)		 = 3 0
...

Happy debugging/tracing/auditing.