Detecting Malware in Encrypted HTTPS Traffic

Published in Impossible Dream

1 min read, Jun 21, 2017

A fun paper from Cisco. Using a sandbox, the authors generate a large data set of HTTPS connections from malware to the internet. They compare this with HTTPS traffic captured on an enterprise network (I assume Cisco’s!) and train an ML classifier to detect malware, and to identify the malware family, in encrypted traffic using:

Differences in TLS setup
Netflow data
Sequence of packet lengths and times
Byte distribution
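
To make the last two feature types concrete, here is a minimal Python sketch of how they might be computed from a single flow. This is my own illustration, not the paper's code: the function names, the 150-byte bin size, and the 10-bin limit are all assumptions chosen for the example.

```python
from __future__ import annotations

from collections import Counter


def byte_distribution(payload: bytes) -> list[float]:
    """Return a 256-entry normalized histogram of byte values in the payload."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)  # iterating over bytes yields ints 0..255
    total = len(payload)
    return [counts.get(value, 0) / total for value in range(256)]


def length_transitions(
    lengths: list[int], bin_size: int = 150, n_bins: int = 10
) -> list[list[float]]:
    """Return a row-normalized transition matrix between binned packet lengths.

    bin_size and n_bins are illustrative assumptions, not values from the post.
    """
    # Map each packet length to a bin; oversized packets share the last bin.
    bins = [min(length // bin_size, n_bins - 1) for length in lengths]
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    # Count how often a packet in bin a is followed by a packet in bin b.
    for a, b in zip(bins, bins[1:]):
        matrix[a][b] += 1.0
    # Turn counts into per-row probabilities, leaving empty rows as zeros.
    for row in matrix:
        row_sum = sum(row)
        if row_sum:
            row[:] = [value / row_sum for value in row]
    return matrix
```

Flattening the matrix and appending the histogram would give one fixed-length vector per flow, which is the kind of input a standard classifier expects. Packet times could be binned the same way.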

This gives 98%+ accuracy, which is pretty impressive!

The major limitation of the work is that the sandbox is limited to Windows XP hosts. The authors control for this as best they can by excluding from the training set any malware connections that use the XP TLS stack defaults, but clearly “further work is needed”.

arXiv paper (PDF)

Written by Arthur Clune