Short review of the KeyGrabber USB keylogger

KeyGrabber © keelog.com

A few days ago I bought a USB keylogger to use on my own computers (explanation in French here). Since then, and as I sit in front of a computer more than 12 hours a day, I've had plenty of time to test it.
The exact model I've tested is the KeyGrabber USB MPC 8GB. I had to choose the MPC model because both my current keyboards are Apple aluminum keyboards. They act as USB hubs, hence requiring some sort of filtering so that the keylogger won't log everything passing through the hub (mouse, USB headset, whatever…) and will only record what you type.
My setup is close to factory settings: I've just set LogSpecialKeys to "full" instead of "medium" and added a French layout to the keylogger, so that typing "a" will record "a", and not "q".
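For the record, these settings live in a plain-text configuration file at the root of the device's flash disk, roughly like this (file and parameter names quoted from the vendor's manual as I remember them, so double-check before trusting me):

$ cat CONFIG.TXT
LogSpecialKeys=Full

The French layout ships as a separate layout file copied to the same flash disk.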

First of all, using the device on a Mac with a French Apple keyboard is a little bit frustrating: the French layout is for a PC keyboard, so typing alt-shift-( to get a [ will log [Alt][Sh][Up]. "[Up]"? Seriously? The only Macintosh layout available is for a US keyboard, so it's unusable here.

The KeyGrabber has a nice feature, especially in its MPC version, that allows the user to turn the device into a USB thumbdrive with a key combination. By default, pressing k-b-s activates the USB key, which mounts on your system desktop. The MPC version allows you to keep using your keyboard without having to plug it into another USB port after activating thumbdrive mode, which is great. You can then retrieve the log file, edit the config file, etc.
Going back to regular mode requires that you unplug and plug back the KeyGrabber.
Applying the "kbs" key combo needs some patience: press all three keys for about 5 seconds, wait another 15-20 seconds, and the thumbdrive should show up. If it does not, try again. I've not tested it on Windows, but I'm not very optimistic; see below.

I'm using quite a special physical and logical setup on my home workstation. Basically, it's an ESXi hypervisor hosting a bunch of virtual machines. Two of these VMs make extensive use of PCI passthrough: dedicated GPU, audio controller, USB host controller. The different USB controllers are plugged into a USB switch, so I can share my mouse, keyboard, Yubikey, USB headset, etc. between VMs. The KeyGrabber, being plugged between the keyboard and the USB switch, is therefore shared between VMs too.
Unfortunately, for an unidentified reason, the Windows 10 VM will completely lose its USB controller a few seconds after I've switched USB devices from OS X to Windows. So from now on, I have to unplug the keylogger when I want to use the Windows VM, and that's a bummer. Being able to use a single device on my many systems was one of the reasons I opted for a physical keylogger instead of a piece of software.
Worse: rebooting the VM will not restore access to the USB controller; I have to reboot the ESXi host. A real pain.

But in the end, it's the log file that matters, right? Well, it's a bit difficult here too, I'm afraid. I've contacted Keelog's support, because way too often what I see in the log file does not match what I typed. I'm not a fast typist, say about 50 to 55 words per minute, but it looks like that's too fast for the KeyGrabber, which will happily drop letters, up to 4 letters in a 6-letter word (typed "jambon", logged "jb").
Here is a made-up phrase I've typed as a test:

c'est assez marrant parce qu'il ne me faut pas de modèle pour taper

And here is the result as logged by the device:

cesae assez ma[Alt][Ent][Alt]
rrant parc qu'l ne e faut pa de modèl pour taper

This can't be good. Maybe it's a matter of settings; some are not clearly described in the documentation, so I'm waiting for the vendor's support to reply.

Overall, I'm not thrilled by this device. It's a 75€ gadget that doesn't work properly out of the box, and that crashes my Windows 10 VM (and probably part of the underlying ESXi). I'll update this post if support helps me achieve proper keylogging.


Install Fast Incident Response (FIR) on FreeBSD

FIR login screen

FIR is a web application designed by CERT Société Générale. It's coded in Python and uses Django.

Installation as documented uses Nginx and pip, two tools I don't use. I already run several Apache servers, and I prefer relying on pkg for software installation. For me, pip is more of a developer tool: very convenient with virtualenv, but not what I would use in production.
If you want to use pip, just go with the official documentation :)

So on a FreeBSD 11.2-RELEASE server, you need to install all the required packages. Be careful: py27-MySQLdb depends on mysql56-server, not mysql57-server, and FIR works with Python 2.7, not 3.x (as far as I can tell).

$ sudo pkg install gettext mysql56-server py27-pip py27-virtualenv git apache24 uwsgi py27-MySQLdb py27-django py27-cssselect py27-django-filter py27-djangorestframework py27-flup6 py27-gunicorn py27-lxml py27-markdown py27-pymongo py27-pyquery py27-dateutil py27-pytz py27-six py27-django-treebeard py27-markdown2 py27-bleach py27-python-gettext

Add those lines to /etc/rc.conf:

mysql_enable="yes"
uwsgi_enable="yes"
apache24_enable="yes"
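The same can be done in one command with sysrc:

$ sudo sysrc mysql_enable=yes uwsgi_enable=yes apache24_enable=yes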

The requirement list includes dj-database-url and whitenoise, but I was not able to find them in FreeBSD's packages. I've just ignored them and everything seems to work fine.
If needed, sudo pip install… should do the trick.
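With the two package names from the requirement list above, that would be something like:

$ sudo pip install dj-database-url whitenoise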

Follow the documentation (a rough sketch of the matching commands follows this list):
- configure MySQL
- install FIR without virtualenv (in /usr/local/www for example)
- create and tune production.py
- create installed_apps.txt (you do want the plugins)
- create tables in the db
- create the admin user
- populate the db
- collect static files
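For reference, those steps map onto Django's usual manage.py commands. A rough sketch, with paths and fixture name from memory of FIR's docs (verify against the official documentation):

$ cd /usr/local/www/FIR
$ python2.7 manage.py migrate
$ python2.7 manage.py createsuperuser
$ python2.7 manage.py loaddata incidents/fixtures/seed_data.json   # fixture name is an assumption
$ python2.7 manage.py collectstatic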

On FreeBSD we run uwsgi as a service, under the uwsgi UID, so the chown must reflect that:

$ sudo chown uwsgi logs/errors.log uploads
$ sudo chmod 750 logs/errors.log uploads

Skip the uWSGI section of the documentation. Instead, create a config file for uwsgi with this content:

$ cat /usr/local/etc/uwsgi/uwsgi.ini
[uwsgi]
chdir = /usr/local/www/FIR
module = fir.wsgi

You can now start the service:

$ sudo service uwsgi start

Then, skip the nginx part, and go configure Apache:

- load proxy_module and proxy_uwsgi_module in httpd.conf (see the sketch after the block below)
- add the following to the relevant part of your Apache configuration:

# FIR
ProxyPass /FIR unix:/tmp/uwsgi.sock|uwsgi://usr/local/www/FIR
Alias /files/ /usr/local/www/FIR/files/
Alias /static/ /usr/local/www/FIR/static/
<Directory /usr/local/www/FIR/static>
	Require all granted
</Directory>
<Directory /usr/local/www/FIR/files>
	Require all granted
</Directory>
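For the first bullet, the LoadModule lines look like this on FreeBSD's Apache 2.4 (module paths assumed from the standard apache24 layout):

LoadModule proxy_module libexec/apache24/mod_proxy.so
LoadModule proxy_uwsgi_module libexec/apache24/mod_proxy_uwsgi.so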

Restart Apache: you're all set; point your browser to https://your-server/FIR/


Inclusive writing and security risks

I'm not going to enter the debate for or against inclusive writing. My point here is to study the impact of one form of spelling, used in French inclusive writing, on the security of information systems.

There are different forms and manners of inclusive writing, using various typographic devices. Unfortunately, one of the most elementary forms - the one that uses a period to separate the blocks - sometimes creates domain names, for example: enseignant.es, doctorant.es, etc.

The author of a text written with this form of inclusive writing can therefore quickly find their prose peppered with Spanish domain names. On a fully controlled medium like paper or an image, the impact is low, because the text is not immediately accessible to a machine: one would have to run optical character recognition to get usable electronic text back.

Unfortunately for this form of writing, digital media prevail more and more, and control over the medium vanishes almost entirely in favor of intermediaries that may decide to improve the user experience without asking the original author. Whether this intermediary is an office suite, a CMS, a social network platform, an SMS application, an instant messenger, etc., many of them take the liberty of activating the URLs they detect in submitted content. So when an author writes that « les étudiant.es peuvent se porter candidat.es au concours d'infirmier.es », the publishing medium may well tell the readers that « les étudiant.es peuvent se porter candidat.es au concours d'infirmier.es » - the same sentence, except that the words suffixed in .es have become active URLs. The difference is significant.

Active URLs also mean: preview downloads, the possibility of a click (voluntary or not) on one of these URLs, and, in rare cases of software vulnerability, the possibility of the URL being fetched without any click from the reader.

As a demonstration and an experiment, I personally registered the names étudiant.es and enseignant.es, then pointed them at a web page under my control to measure how users behave in the presence of these domain names. Over a 31-day period, I counted 5265 GET requests, all visitors combined (so with a majority of robots). Over the same period, I counted about 350 real visitors, more than 10 per day on average, with a peak of 54 visitors on one particular day.

Nombre de "GET" par jour sur étudiant.es et enseignant.es

Nombre de "GET" par jour sur étudiant.es et enseignant.es


Estimation du nombre de "vrais visiteurs" par jour sur étudiant.es et enseignant.es


The web server logs also show that a large share of the robot requests come from social network platforms, at the moment these platforms analyze a link posted by a user (preview creation, URL shortening, malware detection, analysis of the content promoted by the user, etc.). For example: Twitterbot, Facebookbot, EveryoneSocialBot, and others.

A URL activated this way will generally end up on widely distributed media: an institution's CMS of course, but also PDF documents, emails, messages posted on social network platforms like Facebook, Twitter, LinkedIn. In a large French university, a single author can directly reach 20000 to 60000 individuals, and indirectly many more.

In such a context, it is therefore reasonable to consider what would happen if, say, someone decided to register some or all of these Spanish domain names in order to:

  • Make a profit (ad display, sale of counterfeit goods, pornography, etc.)
  • Attack the readers of certain content by distributing malware
  • Launch a phishing campaign against the readers of that content
  • Damage the author's image (incitement to hatred, defamation, etc.)

The attack itself is extremely simple and cheap. Renting a domain name costs a handful of euros per year, and most of these French words are available under .es. Moreover, the attack can happen at any time, and the later the better: the more widespread the domain name, the more broadly it is distributed across documents and media, and the more attractive its exploitation becomes.

As mentioned above, the attacker's motivations may vary, but the effects of the attack on the author or on their company/institution are always the same. Image damage is inevitable: nobody would find it normal, after clicking on the word candidat.es in an official corporate communication, to land on a pornographic website (unless of course that company runs the site in question). Likewise if the reader sees their antivirus turn red after a click on infirmier.es.

Then comes a loss for the company or the author: lost time and money, because the documents, or the URLs inside published documents, must be neutralized quickly. This curative operation is essential, yet impossible to carry out completely in practice: you cannot go and modify emails already sent, PDF or office documents already downloaded by the public, SMS messages, etc.

If the attacker has chosen a very offensive posture and pointed the URLs at malware, the author of the content enters a new circle of Hell: they may get to discover what happens when Google decides to remove their website from its search results, or when every antispam system on the planet decides to block the emails they send. The blocking won't be immediate, of course, leaving the attacker a window during which innocent readers can be infected. Two birds with one stone, so to speak. I have no certainty about how social networks behave in this scenario, but it's possible that some would delete messages containing links to malware, and others would suspend the author's account. Suffice it to say that things suddenly get very complicated for the author.

In my opinion, there is a risk here worth taking into account. Even if the number of clicks, voluntary or not, on these automatic links is relatively low, it remains far higher than the usual "conversion rate" of an email phishing campaign. And we all protect our users against phishing, don't we?
Here the attack starts from a legitimate medium, genuinely written by its author: this is not a fake message from your user support desk written in approximate French. No, this is about turning legitimate content into an attack vector without even needing to modify that content.

Food for thought.


My take on the MySpace dump

About a year ago, the full MySpace data breach dump surfaced on the average-Joe Internet. This huge dump (15 GiB compressed) is very interesting because many user accounts come with two different password hashes. The first hash is unsalted and represents a lower-cased version of the user's original password, stripped to 10 characters. The second hash, not always present, is salted and represents the full original user password.
Hence, the dump content can be summarized like this:

id : email : id/username : sha1(strtolower(substr($pass, 0, 10))) : sha1($id . $pass)

It contains about 116.8 million unique unsalted sha1 hashes, and about 68.5 million salted sha1 hashes.
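To make the two hash flavors concrete, here is a quick sketch with a made-up account (id 123456, password iLoveCats1992!), using FreeBSD's sha1(1) (use sha1sum on Linux):

$ printf '%s' 'ilovecats1' | sha1              # lower-cased, truncated to 10 chars: the unsalted hash
$ printf '%s' '123456iLoveCats1992!' | sha1    # user id prepended to the full password: the salted hash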

Of course, people who crack passwords will tell you that the unsalted hashes have no value, because they don't represent real user passwords. They are right. But once you crack those hashes, you have a very interesting password candidate for attacking the salted hashes.

After you've cracked most of the unsalted hashes, the question is: how do you proceed to crack their salted counterparts? Spoiler alert: hashcat on an Nvidia GTX 1080 is more than 200 times slower than John the Ripper on a single CPU core at this very particular job.

I'm a long-time John the Ripper user (on CPU), and I'm quite a fan of its intelligent design. Working on CPU requires wits and planning, and the more versatile your software is, the more efficient you can be. Hashcat sits on the other end of the spectrum: huge raw power thanks to GPU optimization, but it lacks the most sensible attack mode: "single".

Single mode works by computing password candidates from GECOS data like login, user name, email address, etc., so it makes sense to provide a full password file to JtR instead of just naked hashes. This password metadata is very efficient when you want to create contextual password candidates.
The password retrieved from the unsalted hash is more than a clue for retrieving its salted counterpart: in many cases it's also the real user password. And when it's not, simple variations handled by mangling rules will do the trick.
You've probably guessed by now: I've created a file where each password cracked from an unsalted hash is paired with its corresponding salted hash. The known password impersonates the user login, so that with proper tuning John the Ripper will try only this particular candidate (plus its mangled variations) against the corresponding salted hash.
Because of a bug in JtR, I was not able to run this attack on one huge file; I had to split it into small chunks. Nevertheless, I was able to retrieve 36708130 passwords in just 87 minutes. On a single CPU core.
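To give an idea of the pair file, here is a minimal sketch (the hash value is a placeholder; sha1($id.$pass) should map to JtR's dynamic_25 format, i.e. sha1($s.$p), but verify on your build):

$ head -1 pairs.txt
ilovecats1:$dynamic_25$<sha1hex>$123456
$ ./john --single --format=dynamic_25 pairs.txt

The password cracked from the unsalted hash sits in the login field, so single mode tries it (and its mangled variants) against the matching salted hash only.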
To find those passwords with hashcat, I had to rely on a wordlist attack on a GTX 1080. It took about 14 days to complete. No matter how fast your GPU is (about 1000 MH/s in this particular case), it will brainlessly try every single candidate on every single hash. Remember, the hashes are salted, so each one requires its own computation. If your file is 60M hashes long, your GPU will only try 16.6 candidates per second (1000/60). It's very slow and inefficient.

Hashcat performance on a file containing 50% of total hashes.

Sometimes, brain beats raw power. Thank you John ;)

More on this topic:
https://hashes.org/forum/viewtopic.php?t=1715
http://cynosureprime.blogspot.fr/2016/07/myspace-hashes-length-10-and-beyond.html


Self-hosted password manager: installing Passbolt on FreeBSD

Arthur Duarte CC-BY-SA-4.0

Password managers, or password safes, are an important thing these days. With the constant pressure we (IT people) put our users under to set up a different password for every single registration/application/web site, they are the best, if not the only, way to keep track of these secrets. On one hand, isolated client-side software can be really powerful and/or well integrated with the OS or the user's software ecosystem, but it lacks the modern touch of "cloud" that makes your data available anywhere, anytime. On the other hand, a full commercial package will come with clients for every device you own and a monthly fee for cloud synchronization, but you have absolutely no control over your data (just imagine that tomorrow the company you rely on goes bankrupt).
Better safe than sorry: I don't rely on cloud services. It comes at a cost, but it's quite rewarding to show the world another way exists.
Disclaimer: I don't give a sh*t about smartphones, so my needs are computer-centric.

In order to store passwords - and more generally speaking "secrets" - in such a way that I can access them anywhere, anytime, I've tried Passbolt. Passbolt is an open-source self-hosted password manager, written in PHP/JavaScript with a database back end. Hence, installation and configuration are not for the average Joe. On the user side it's quite clean and surprisingly stable for alpha software. So once a LAMP admin has finished installing the server part, any non-skilled user can register and start storing passwords.

Enough chit-chat, let's install.

My initial setup was a vanilla FreeBSD 10.3 install, so I had to build everything. I won't replay every single step here, especially on the configuration side.

Prerequisites:

pkg install apache24
pkg install mod_php56
pkg install php56-gd
pkg install pecl-memcached
pkg install mysql57-server
pkg install pecl-gnupg
pkg install git
pkg install php56-pdo_mysql
pkg install sudo
pkg install php56-openssl
pkg install php56-ctype
pkg install php56-filter

Everything else should come as a dependency.

Tuning:

Apache must allow .htaccess, so you'll have to put an AllowOverride All somewhere in your configuration. You must also load the Rewrite module. And go for SSL right away (letsencrypt is free and supported): non-SSL installs of Passbolt are for demo purposes only.
Apache will also need to execute gnupg commands, meaning the www user needs an extended $PATH. The Apache startup script provided on FreeBSD sources Apache environment variables from /usr/local/sbin/envvars, and this very file sources every /usr/local/etc/apache24/envvars.d/*.env file, so I've created mine:

$ cat /usr/local/etc/apache24/envvars.d/path.env
PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin

You also need to tune your MySQL server. If you choose 5.7, you must edit its configuration. Just add the following line to the [mysqld] section of /usr/local/etc/mysql/my.cnf:

sql_mode = 'STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION'

This is due to a bug in Passbolt and could become unnecessary in the not too distant future.

Install recipe:

You can now follow the install recipe at https://www.passbolt.com/help/tech/install.
Generating the GPG key is quite straightforward, but keep in mind that Apache's user (www) will need access to the keyring. So if you create this key and keyring as a different user, you'll have to mv and chown -R www the full .gnupg directory somewhere www can read it (outside the DocumentRoot is perfectly fine).
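For example, if you generated the key and keyring as root, relocating them for www could look like this (the target path is only an illustration; any place readable by www outside the DocumentRoot will do):

gpg --gen-key
mv /root/.gnupg /usr/local/www/.gnupg
chown -R www /usr/local/www/.gnupg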

Use git to retrieve the application code into the appropriate path (according to your Apache config):

cd /usr/local/www
git clone https://github.com/passbolt/passbolt.git

Edit php files as per the documentation.

Beware of the install script: make sure you chown -R www the whole passbolt directory before running cake install.
On FreeBSD you won't be able to use su to run the install script, because the www account is locked. Use sudo instead:

sudo -u www app/Console/cake install --no-admin

Same for the admin account creation:

sudo -u www app/Console/cake passbolt register_user -u patpro@example.com -f Pat -l Pro -r admin

Follow the end of the install doc and you should be OK. Install the passbolt Firefox extension into your browser, and point it at your server.

I'm pretty happy with Passbolt so far. I'll have to install a proper production server, with SSL and all, but the features are very appealing, the Passbolt team is nice and responsive, and the roadmap is loaded with killer features. Yeah, BRING ME 2FA \o/.


Cracking passwords: testing PCFG password guess generator

Cracking passwords is a kind of e-sport, really. There's competition among amateur and professional "players", tools, gear. There are secrets, home-made recipes, software helpers, etc.
One of these tools is the PCFG password guess generator, for "Probabilistic Context-Free Grammar". I won't explain the concept of PCFG here; scientific literature exists that you can read to discover all the math inside.
The PCFG password guess generator comes as two main Python programs, pcfg_trainer.py and pcfg_manager.py. The basic mechanism is the following:
- you feed pcfg_trainer.py enough known passwords to generate comprehensive rules describing the grammar of the known passwords, and supposedly of unknown passwords too;
- you run pcfg_manager.py, using the previously created grammar, to create millions of password candidates to feed into your favorite password cracker (John the Ripper, Hashcat…).
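In practice, the two steps look roughly like this (option names quoted from memory of the project's README; check each script's --help before trusting them):

# train a grammar from a corpus of known passwords
python3 pcfg_trainer.py -t known_passwords.txt -r mygrammar
# generate candidates from that grammar and pipe them into a cracker
python3 pcfg_manager.py -r mygrammar | ./john --stdin HugeDump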

In order to measure the PCFG password guess generator's efficiency, I've run a few tests. Here is my setup:

  • Huge password dump, 117205873 accounts with 61829207 unique Raw-SHA1 hashes;
  • John the Ripper, Bleeding Jumbo, downloaded 20160728, compiled on FreeBSD 10.x;
  • PCFG password guess generator, downloaded 20160801, launched with Python 3.x;

Here's my methodology:

Of these 61829207 hashes, about 35 million are already cracked. I've extracted a random sample of 2 million known passwords to feed the trainer. Then I've used pcfg_manager.py to create a 10-million-line word list. I've also trimmed the famous Rockyou list to its first 10 million lines, to provide a known reference.
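Trimming Rockyou is a one-liner; the file names match the script below:

$ head -n 10000000 rockyou.txt > ry10m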

Finally, I've launched this shell script:

#!/bin/sh
for i in none wordlist jumbo; do
  ./john --wordlist=pcfg_crckr --rules=$i --session=pcfg_cracker-$i --pot=pcfg_cracker-$i.pot HugeDump
  ./john --wordlist=ry10m --rules=$i --session=ry10m-$i --pot=ry10m-$i.pot HugeDump
done

No forking here; I'm running on a single CPU core. Each word list is tested three times: with no word-mangling rules, with the default JtR rules, and finally with the Jumbo mangling rules.

Some results (number of cracked passwords):

Rules      PCFG       Rockyou
none       4409362    2774971
wordlist   5705502    5005889
Jumbo      21146209   22781889

That I can translate into efficiency, where efficiency is Cracked/WordlistLength as a percentage (e.g. for PCFG with no rules: 4409362 / 10000000 = 44.1%):

Rules      PCFG     Rockyou
none       44.1%    27.7%
wordlist   57.1%    50.1%
Jumbo      211.5%   227.8%

It's quite interesting to see that the PCFG-generated word list has very good efficiency compared to the Rockyou list when no rules are involved. That's to be expected, as the PCFG password guess generator has been trained with a quite large sample of known passwords from the very dump I'm attacking.
On the other hand, the PCFG password guess generator creates candidates that are not very well suited for mangling, and only the Jumbo set of rules achieves good results with this source. Rockyou starts quite low at only 27.7%, but jumps to 50.1% with common rules, and finally defeats PCFG when used with the Jumbo rules.

On the word list side, Rockyou is known and limited: it will never grow. But the PCFG password guess generator looks like it can create a virtually infinite list of candidates. Let's see what happens when I create a list of 110+ M candidates and feed it to JtR.

Rules      PCFG cracked   Efficiency
none       9703571        8.8%
wordlist   10815243       9.8%

Efficiency plummets: only 9.7 M hashes cracked with a list of 110398024 candidates, and only 1.1 M more when the "wordlist" rule set is applied. It's even less beneficial than with the 10 M candidate list (+1.3 M with "wordlist" rules, compared to "none").

On the results side, both word lists with Jumbo rules yield 21+ M cracked passwords each. But are those passwords identical, or different?

Rules      Total unique cracked   Yield
none       6013896                83.7%
wordlist   8184166                76.4%
Jumbo      26841735               61.1%

Yield = UniqueCracked / (PcfgCracked + RockyouCracked)

A high yield basically says that you should run both word lists through John; a yield of 50% would mean that every password cracked thanks to PCFG is identical to one cracked with the Rockyou list. For the Jumbo run, for instance: 26841735 / (21146209 + 22781889) ≈ 61.1%.

As a conclusion, I would say that the PCFG password guess generator is a very interesting tool, as it provides a way to generate valid candidates pretty easily. You still need a proper corpus of known passwords to train it, though.
It's also very efficient with no rules at all, compared to the Rockyou list. That might make it a good tool for very slow hashes, when you can't afford to try thousands of mangling rules on each candidate.

Some graphs to illustrate this post:

every john session on the same graph

every session, zoomed on the first 2 minutes

Rules "wordlist" on both lists of candidates

Rules "none", both lists of candidates
