Self-hosted password manager: installing Passbolt on FreeBSD

Arthur Duarte CC-BY-SA-4.0

Password managers, or password safes, are essential these days. With the constant pressure we (IT people) put on our users to set up a different password for every single registration/application/web site, they are the best, if not the only, way to keep track of all these secrets. On one hand, isolated client-side software can be really powerful and/or well integrated with the user's OS or software ecosystem, but it lacks the modern touch of "cloud" that makes your data available anywhere, anytime. On the other hand, a full commercial package will come with a client for every device you own, and a monthly fee for cloud synchronization, but you have absolutely no control over your data (just imagine that tomorrow the company you rely on goes bankrupt).
Better safe than sorry: I don't rely on cloud services. It comes at a cost, but it's quite rewarding to show the world another way exists.
Disclaimer: I don't give a sh*t about smartphones, so my needs are computer-centric.

In order to store passwords, and more generally "secrets", in such a way that I can access them anywhere/anytime, I've tried Passbolt. Passbolt is an open-source self-hosted password manager, written in PHP/JavaScript with a database back end. Hence, installation and configuration are not for the average Joe. The user side, though, is quite clean and surprisingly stable for alpha software. So once a LAMP admin has finished installing the server part, any non-technical user can register and start storing passwords.

Enough chit-chat, let's install.

My initial setup was a vanilla FreeBSD 10.3 install, so I had to build everything from scratch. I won't replay every single step here, especially on the configuration side.

Prerequisites:

pkg install apache24 mod_php56 php56-gd php56-openssl php56-ctype php56-filter php56-pdo_mysql pecl-memcached pecl-gnupg mysql57-server git sudo
Everything else should come as a dependency.

Tuning:

Apache must allow .htaccess, so you'll have to put an AllowOverride All somewhere in your configuration. You must also load the rewrite module. And go for SSL right away (Let's Encrypt is free and supported): non-SSL installs of Passbolt are for demo purposes only.
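For example, here's a minimal sketch of the relevant httpd.conf bits, assuming Passbolt will live in /usr/local/www/passbolt as shown below:

LoadModule rewrite_module libexec/apache24/mod_rewrite.so

<Directory "/usr/local/www/passbolt">
    AllowOverride All
    Require all granted
</Directory>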
Apache will also need to execute gnupg commands, meaning the www user needs an extended $PATH. The Apache startup script provided on FreeBSD sources Apache environment variables from /usr/local/sbin/envvars, and this very file sources every /usr/local/etc/apache24/envvars.d/*.env file, so I've created mine:

$ cat /usr/local/etc/apache24/envvars.d/path.env
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin

You also need to tune your MySQL server. If you chose version 5.7, you must edit its configuration. Just add the following line to the [mysqld] section of /usr/local/etc/mysql/my.cnf:

sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'

This works around a bug in Passbolt, and could become unnecessary in the not too distant future.
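Then restart MySQL and check that the new mode is active (the rc script name comes with the mysql57-server package):

service mysql-server restart
mysql -u root -p -e 'SELECT @@GLOBAL.sql_mode;'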

Install recipe:

You can now follow the install recipe at https://www.passbolt.com/help/tech/install.
Generating the GPG key is quite straightforward, but you have to keep in mind that Apache's user (www) will need access to the keyring. So if you create this key and keyring with a different user, you'll have to mv and chown -R www the full .gnupg directory to somewhere www can read it (outside the DocumentRoot is perfectly fine).
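For example, assuming you generated the key as root and want the keyring in /usr/local/www/.gnupg (outside the DocumentRoot):

mv /root/.gnupg /usr/local/www/.gnupg
chown -R www:www /usr/local/www/.gnupg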

Use git to retrieve the application code into appropriate path (according to your Apache config):

cd /usr/local/www
git clone https://github.com/passbolt/passbolt.git

Edit php files as per the documentation.

Beware of the install script: make sure you chown -R www the whole passbolt directory before running cake install.
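With the path used above, that means:

chown -R www:www /usr/local/www/passbolt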
On FreeBSD you won't be able to use su to run the install script, because www's account is locked. You can use sudo instead:

sudo -u www app/Console/cake install --no-admin

Same for the admin account creation:

sudo -u www app/Console/cake passbolt register_user -u patpro@example.com -f Pat -l Pro -r admin

Follow the end of the install doc, and you should be OK. Install the Passbolt Firefox extension into your browser, and point it to your server.

I'm pretty happy with Passbolt so far. I still have to set up a proper production server, with SSL and all, but the features are very appealing, the Passbolt team is nice and responsive, and the roadmap is loaded with killer features. Yeah, BRING ME 2FA \o/.

Escaping the Apple ecosystem: a view of the setup

Here is a quick & dirty view of the physical and logical setup of my new workstation. The Linux part is not finished yet (no drivers for the Radeon GPU, thank you Ubuntu); it's a work in progress.

Not depicted: each USB controller sports 4 USB ports (yellow) or 2 USB ports (pink and blue). That lets me plug in a few devices that won't be "managed" by the USB switch.
USB devices plugged into the switch are made available to only one VM at a time. When I press the switch button, they disappear from the current VM and are presented to the next one.

Escaping the Apple ecosystem: part 3

In part 2, I was able to create and use a Windows 7 VM with the Radeon R9 270x in passthrough. It works really great. But OSX and Linux were more difficult to play with.

List of virtual machines

Since then, I've made tremendous progress: I've managed to run an OSX 10.11.6 VM properly, but more importantly, I've managed to run my native Mac OS X 10.6.8 system as a VM, with the Mac's Radeon in passthrough.
I've removed my Mac OS X SSD and the Mac's graphics card from the Mac Pro tower, and installed them into the PC tower. Then I've created the VM for the 10.6.8 system, configured ESXi to use Mac's Radeon with VT-d, etc.
The only real problem here is that adding a PCI card to the PC tower makes the PCI device numbers change, which breaks almost every passthrough already configured. I had to redo the VT-d config for the Windows VM. Apart from that, it went smoothly.
Currently, I'm working on my native 10.6.8 system running as a VM, while the Windows VM plays my music (because the Realtek HD audio controller is dedicated to the Windows VM).
Moving from a Mac Pro with 4-core 2.8 GHz Xeon to a 6-core 3.5 GHz Core i7 really gives a boost to my old 10.6.8 system.

Running both OSes, the box is almost as silent as the Mac Pro while packing almost twice the raw CPU power and 2.7x the GPU power.

The Mac Pro is now empty: no disks, no graphics card, and will probably go on sale soon.

to-do list:

  • secure the whole infrastructure;
  • install the 2nd-hand MSI R9 270x when it's delivered;
  • properly setup Linux to use AMD graphics card.

I might also add a few SSDs and a DVD burner before year's end.

Escaping the Apple ecosystem: part 2

In part 1, I wrote about the BOM of my project and the associated to-do list.
First item on this list was: build the box. That did not go as smoothly as expected. The motherboard was not fully operational: after a few minutes of run time (between 5 and 30), it would trigger a CPU overheat alarm, even when the CPU was idle and cool. Supermicro's support had me try a new BIOS, with no effect, so I finally sent the board back for exchange. The new board arrived, but I had to delay the rebuild for a few weeks.
Now the PC is up and running. The new motherboard seems to work great, but I've not tested IPMI yet. IPMI was the very first feature I used on the first board, and there is a slight chance that the CPU overheat problem came from a probe malfunction related to the BMC. Let's keep that for later.

I've chosen to run this box on VMware ESXi 5.5, because it's quite common (more so than the latest 6.x), because it sports features I need like passthrough, and because most VMware-based multi-seat PC projects like mine use ESXi 5.x.

ESXi is quite easy to install, so I won't give details. The main HDD in the box hosts ESXi, which I configured with a USB keyboard and a display plugged into the VGA port of the motherboard. After basic configuration (network, user password...), I switched to remote configuration through the vSphere Client, installed on a Windows PC (really a VM running on the Mac).

General view of ESX's configuration

Configuring passthrough for GPUs is pretty straightforward, because I've started with only one GPU, and because these are discrete PCI cards. On the other hand, passthrough of USB controllers can be tricky: many controllers, and nothing to identify them except trial & error (unless you have the blueprint of your motherboard telling you which physical USB ports belong to which controller).
Go to configuration tab, then "Advanced", and finally click "edit" on the right

When you click "Edit…", a window opens listing the devices you can try to pass through.
Choose some, then reboot the ESXi host and add some of these devices to a VM.
I've created a Windows 7 Pro 64-bit VM, with a raw device mapping pointing to an SSD. I've added every available PCI device to this VM (USB, sound, GPU) and installed Windows plus updates.
It's important to remember that initially, most PCI devices might not work at all because of missing drivers on the guest OS (here, Windows). Hence, after installing Windows, the Radeon was detected but not used, and only the Intel USB controllers were working. I've installed AMD drivers, and ASMedia drivers (courtesy of Supermicro). I've also installed VMware Tools.
After all this, the Windows VM properly uses my Dell display hooked up to the MSI R9 270x Radeon, and I can interact with the system with a real keyboard and mouse. Passing through the whole USB controller allows me to use any USB device I want. I've successfully plugged in and used a thumbdrive, a USB gaming headset, and a USB hub.
I've run some GPU/CPU benchmarks and everything looks perfect. I've tested Left 4 Dead 2 gameplay, and it looked great too (I'll probably have to tweak anti-aliasing settings to make it perfect).

The Windows part was quite fast to set up, and is almost done now. I've started to fight with OSX and Ubuntu, but things are not easy with either of them. It looks like my 3-year-old graphics card is so new that OSX does not support it until 10.11.x, and Ubuntu won't let me install Radeon drivers on 16.x LTS because they are waiting for some software to stabilize before packaging it…

To-do list:

  • fix problems with OSX and Ubuntu virtualization
  • find another MSI Radeon R9 270X GAMING 2G (of course it's no longer in stock…)
  • fully test Mac OS X 10.6.8 with Mac's graphics card instead of MSI Radeon

Cracking passwords: testing PCFG password guess generator

Cracking passwords is a kind of e-sport, really. There's competition among amateur and professional "players", tools, gear. There are secrets, home-made recipes, software helpers, etc.
One of these tools is the PCFG password guess generator, where PCFG stands for "Probabilistic Context-Free Grammar". I won't explain the concept of PCFG here; scientific literature exists that you can read to discover all the math inside.
The PCFG password guess generator comes as two main Python programs: pcfg_trainer.py and pcfg_manager.py. The basic mechanism is the following:
- you feed pcfg_trainer.py with enough known passwords to generate comprehensive rules describing the grammar of the known passwords, and supposedly of unknown passwords too;
- you run pcfg_manager.py, using the previously created grammar, to create millions of password candidates to feed into your favorite password cracker (John the Ripper, Hashcat…).
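In shell terms, the workflow looks roughly like this; the file names are mine and the exact flags may differ between versions, so treat them as placeholders and check each script's --help first:

# train a grammar on a corpus of known passwords (hypothetical names/flags)
python3 pcfg_trainer.py -t known_passwords.txt -r myrules
# generate candidates from that grammar, keeping the first 10 M
python3 pcfg_manager.py -r myrules | head -n 10000000 > pcfg_crckr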

In order to measure the PCFG password guess generator's efficiency, I've made a few tests. Here is my setup:

  • Huge password dump, 117205873 accounts with 61829207 unique Raw-SHA1 hashes;
  • John the Ripper, Bleeding Jumbo, downloaded 20160728, compiled on FreeBSD 10.x;
  • PCFG password guess generator, downloaded 20160801, launched with Python 3.x;

Here's my methodology:

Of these 61829207 hashes, about 35 million were already cracked. I've extracted a random sample of 2 million known passwords to feed the trainer. Then I've used pcfg_manager.py to create a 10-million-line word list. I've also trimmed the famous Rockyou list to its first 10 million lines, to provide a known reference.
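Here is a sketch of that preparation, with hypothetical file names (FreeBSD's sort supports -R for random ordering):

# random sample of 2 M known passwords for the trainer
sort -R cracked_passwords.txt | head -n 2000000 > training_set.txt
# first 10 M lines of Rockyou as a reference word list
head -n 10000000 rockyou_full.txt > ry10m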

Finally, I've launched this shell script:

#!/bin/sh
# run each word list against the dump three times:
# no mangling rules, default JtR wordlist rules, then jumbo rules
for i in none wordlist jumbo; do
  ./john --wordlist=pcfg_crckr --rules=$i --session=pcfg_cracker-$i --pot=pcfg_cracker-$i.pot HugeDump
  ./john --wordlist=ry10m --rules=$i --session=ry10m-$i --pot=ry10m-$i.pot HugeDump
done

No forking here, I'm running on a single CPU core. Each word list is tested three times: with no word-mangling rules, with the default JtR wordlist rules, and finally with the jumbo mangling rules.

Some results (number of cracked passwords):

Rules     PCFG      Rockyou
none      4409362   2774971
wordlist  5705502   5005889
Jumbo     21146209  22781889

I can translate that into efficiency, where efficiency is Cracked/WordlistLength expressed as a percentage:

Rules     PCFG    Rockyou
none      44.1%   27.7%
wordlist  57.1%   50.1%
Jumbo     211.5%  227.8%

It's quite interesting to see that the PCFG-generated word list has a very good efficiency, compared to the Rockyou list, when no rules are involved. That's to be expected, as the PCFG password guess generator has been trained with a quite large sample of known passwords from the same dump I am attacking.
Also, the PCFG password guess generator creates candidates that are not very well suited for mangling: only the jumbo set of rules achieves good results with this source. Rockyou, on the other hand, starts quite low at only 27.7% but jumps to 50.1% with common rules, and finally defeats PCFG when used with jumbo rules.

On the word list side, Rockyou is known and limited: it will never grow. But the PCFG password guess generator looks like it can create a virtually infinite list of candidates. Let's see what happens when I create a list of 110+ M candidates and feed it to JtR.

Rules     PCFG      Efficiency
none      9703571   8.8%
wordlist  10815243  9.8%

Efficiency plummets: only 9.7 M hashes cracked with a list of 110398024 candidates, and only 1.1 M more when the "wordlist" set of rules is applied. That's even less beneficial than with the 10 M candidate list (+1.3 M with "wordlist" rules, compared to "none").

On the results side, both word lists with jumbo rules yield 21+ M cracked passwords each. But are those passwords identical, or different?

Rules     Total unique cracked  Yield
none      6013896               83.7%
wordlist  8184166               76.4%
Jumbo     26841735              61.1%

Yield = UniqueCracked / (PcfgCracked + RockyouCracked)

A high yield basically says that you should run both word lists through John, as they mostly crack different passwords. A yield of 50% would mean that every password cracked thanks to PCFG is identical to one cracked with the Rockyou list. For example, with no rules: 6013896 / (4409362 + 2774971) ≈ 83.7%, so the two lists crack mostly different passwords.

As a conclusion, I would say that the PCFG password guess generator is a very interesting tool, as it provides a way to generate valid candidates pretty easily. You probably still need a proper corpus of known passwords to train it.
It's also very efficient with no rules at all, compared to the Rockyou list. That might make it a good tool for very slow hashes, when you can't afford to try thousands of mangling rules on each candidate.

Some graphs to illustrate this post:

every john session on the same graph

every session, zoomed on the first 2 minutes

Rules "wordlist" on both lists of candidates

Rules "none", both lists of candidates

Escaping the Apple ecosystem: part 1

Back in late 2012, I started to think about my post-Apple days. I knew already that I would not endorse the full cloud crap, and the oversimplification of OS X that follows from the iOS convergence. My hardware, my OS, my data.

I'm still running a Mac Pro 2010, with Mac OS X 10.6.8, the last great OS from Apple. Unfortunately, (in)security is not what it used to be: browsers, SSL libraries, etc. are updated frequently, and older OSes are no longer supported. So it was time to move away from 10.6.8 for my online activities. Meaning: it's time to create my multi-headed workstation running various OSes with dedicated GPUs.

After about a year spent looking for the right pieces of hardware, I came up with this BOM:

Hopefully it will yield great results with virtualization and passthrough of GPUs and USB.
For now, only one GPU, just in case it doesn't work as intended. That will allow me to test every OS I want to use with GPU passthrough. I'll buy 2 more GPUs when everything is fully tested and satisfactory.

To do list:

  • Assemble the workstation
  • Install and patch VMware (to allow the virtualization of Mac OS X on non-Apple hardware)
  • Test each VM with GPU+USB+Sound passthrough

Due to the exotic nature of some of these pieces of hardware, I've had to order them from four retailers in two countries: Amazon (France), LDLC (France), PC-Cooling (Germany) and MindFactory (Germany).

I've had only one delivery problem, for the PC case: it was delivered through GLS Group, and it took them more than a week to drive the 10 km from their warehouse to my home. When it finally came through my door, the box was trashed and the PC case had one foot slightly hammered into the case. It can't stand still on its 4 feet.
Whatever you buy abroad, just make sure it won't be delivered through GLS. The driver was a pain in the ass, calling me to delay the delivery.

Data leak at Elinchrom

As I'm paranoid, I create a brand new email address each time I've got to register on a web site. It's easy enough, and allows me to partition usages and detect abuse. If one of those dedicated addresses ends up in the wild (spammers, for example), I just destroy it.

CCBY ©Yuri Samoilov via Flickr

So, in 2010, I registered a support account on Elinchrom's web site with a tailored address: elinchrom (@patpro.net of course). Later, during September 2014, I received a big spam at that address, neither from Elinchrom nor from one of their partners. After a quick search, I found many other spams blocked by my server. The earliest arrived in 2013.
I've contacted the company, explained the problem, and sent extracts from my mail server logs. After a few days, they could find no evidence of a compromise or a data leak on their side. Fair enough, that's not the kind of thing you can detect easily, especially if it's years old. On a side note: it's not impossible that I used the same email address for a contest or event registration affiliated with Elinchrom but not run by them. My bad. At least, this time they deserved the benefit of the doubt.
Following this leak, I changed the email address to elinchrom2014 (still @patpro.net) and changed the associated password.
On the 31st of December 2015, I received my very first email at this new address: a PayPal phishing attempt, sent from a hacked web server somewhere…
Fine.

As I've been super-vigilant since 2014 not to use this address anywhere, the only possible scenario is a data leak at Elinchrom. Going back through my logs, I found that the earliest spam blocked dated from the 24th of October:

Oct 24 17:17:50 postfix/smtpd[84170]: NOQUEUE: reject: RCPT from unknown[202.71.131.54]: 550 5.7.1 Client host rejected: cannot find your hostname, [202.71.131.54]; from=<apache@corp17.net4india.com> to=<elinchrom2014@...> proto=ESMTP helo=<smtp.net4india.com>

I've immediately destroyed this email address, created a new one (longer, with random characters), and changed the password again.

Elinchrom: seriously, that's ridiculous, do something. Even the user authentication does not use HTTPS. WAKE UP, it's 2016.

Log aggregation and analysis: logstash

Logstash is free software, as in beer and as in speech. It can use many different backends, filters, etc. By default, it comes packaged with Elasticsearch as the backend and Kibana as the user interface. That makes a pleasant package to start with, as it's readily usable for feeding in logs. For personal use, demo, or testing, the package is enough. But if you want to seriously use LS+ES, you must have at least a dedicated Elasticsearch cluster.

Starting with Logstash 1.4.0, the release is no longer a single jar file. It's now a fully browsable directory tree, allowing you to manipulate files more easily.
ELK (Elasticsearch+Logstash+Kibana) is quite easy to deploy, but unlike Splunk, you'll have to install prerequisites yourself (Java, for example). No big deal. But the learning curve of ELK is steeper. It took me almost a week to get some interesting results. I can blame the 1.4.0 release, which is a bit buggy and won't start up the agent and web as advertised; the documentation, which is light years away from what Splunk provides; the modularity of the solution, which makes you wonder where to find support (is this an Elasticsearch question? a Kibana problem? some kind of grok issue?); etc.

Before going further with functionality, let's take a look at how ELK works. Logstash is the log aggregator tool. It's the piece of software in the middle of the mess, taking logs in, filtering them, and sending them to any output you choose. Logstash takes logs through about 40 different "inputs" advertised in the documentation. You can think of file and syslog, of course, stdin, snmptrap, and so on. You'll also find some exotic inputs like twitter. Logstash is where you will spend the most time initially, tuning inputs and tuning filters.
Elasticsearch is your storage backend. It's where Logstash outputs its filtered data. Elasticsearch can be very complex, and needs a bit of work if you want to use it in production. It's more or less a clustered database system.
Kibana is the user interface to Elasticsearch. Kibana does not talk to your Logstash install; it only talks to your Elasticsearch cluster. The thing I love most about Kibana is that it does not require any server-side processing. Kibana is entirely HTML and JavaScript. You can even use a local copy of Kibana on your workstation to send requests to a remote Elasticsearch cluster. This is important: because the JavaScript accesses your Elasticsearch server directly, your Elasticsearch server has to be reachable from where you stand. It is not a good idea to let the world browse your indexed logs, or worse, write into your Elasticsearch cluster.
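To make the pipeline concrete, here is a minimal sketch of a Logstash 1.4-era configuration, assuming an Apache access log and a local Elasticsearch node (paths and field names are illustrative):

input {
  # read an Apache access log as it grows
  file {
    path => "/var/log/httpd-access.log"
    type => "apache"
  }
}
filter {
  if [type] == "apache" {
    # break each raw line into named fields
    grok {
      match => [ "message", "%{COMBINEDAPACHELOG}" ]
    }
    # compute geolocation at ingest time: Kibana can't do it at query time
    geoip {
      source => "clientip"
    }
  }
}
output {
  # ship filtered events to the local Elasticsearch node
  elasticsearch {
    host => "127.0.0.1"
  }
}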

To avoid security complications, the best move is to hide your ELK install behind an HTTP proxy. I'm using Apache, but anything else is fine (Nginx, for example).
Knowing that 127.0.0.1:9292 is served by the "logstash web" command, and that 127.0.0.1:9200 is the default Elasticsearch socket, you can use these Apache directives to grant remote access based on IP addresses. Feel free to use any other access control policy.

ProxyPass /KI http://127.0.0.1:9292 
ProxyPassReverse /KI http://127.0.0.1:9292 
ProxyPass /ES http://127.0.0.1:9200 
ProxyPassReverse /ES http://127.0.0.1:9200 
<Location /KI>
	Order Allow,Deny
	Allow from YOUR-IP 127.0.0.1
</Location>
<Location /ES>
	Order Allow,Deny
	Allow from YOUR-IP 127.0.0.1
</Location>

original data in µs, result in µs. Impossible to convert in hours (17h09)

On the user side, ELK looks a lot like Splunk. Querying the indexed logs works the same way, even if the syntax is different. But Splunk allows you to pipe results into operators and math/stats/presentation functions… ELK is not really built for complex searches, and the user cannot transform data with functions. The philosophy around Kibana is all about dashboards, with a very limited set of functions. You can build histograms, geoip maps, counters, and compute some basic stats. You cannot do something as simple as rounding a number, or dynamically getting a geolocation for an IP address. Everything has to be computed through Logstash filters before reaching the Elasticsearch backend. So everything has to be computed before you know you need it.
Working with Logstash requires a lot of planning: break down the data with filters, process the result (geoip, calculation, normalization…), inject it into Elasticsearch, tailor your request in Kibana, create the appropriate dashboard. And in the end, it won't let you mine your data as deeply as I would want.
Kibana makes it very easy to save, store, and share your dashboards/searches, but it is not very friendly with real analysis needs.

Elasticsearch+Logstash+Kibana is an interesting product, for sure. It's also very badly documented. It looks like a free Splunk, but only on the surface. I've been testing both for more than a month now, and I can testify they don't have a lot in common when it comes to using them in the field.

If you want pretty dashboards, and a nice web-based grep, go for ELK. It can also help your command-line-illiterate colleagues a lot. You know, those who don't know how to compute human-readable stats with a grep/awk one-liner and who gratefully rely on a dashboard printing a 61-billion-microsecond figure.
If you want more than that, if you need some analytics, or even forensics, then odds are that ELK will let you down, and that makes me sad.