Detecting malware in encrypted HTTPS traffic

Arthur Clune
Published in Impossible Dream
1 min read, Jun 21, 2017

A fun paper from Cisco. Using a sandbox, the authors generate a large data set of HTTPS connections from malware to the internet. They compare this with HTTPS traffic captured on an enterprise network (I assume Cisco’s!) and train an ML classifier to detect malware, and to identify the malware family, in encrypted traffic using:

  • Differences in TLS setup
  • NetFlow data
  • Sequence of packet lengths and times
  • Byte distribution

This gives 98%+ accuracy, which is pretty impressive!
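To make two of those feature types concrete, here is a minimal sketch of how a byte distribution and a binned packet-length sequence might be computed for one flow. It is an illustration only, not the paper's pipeline: the function names, the 150-byte bin size and the 10-bin cap are all assumptions of mine.

```python
from collections import Counter


def byte_distribution(payload: bytes) -> list[float]:
    """Relative frequency of each of the 256 possible byte values."""
    if not payload:
        return [0.0] * 256
    counts = Counter(payload)  # iterating bytes yields ints 0..255
    total = len(payload)
    return [counts.get(value, 0) / total for value in range(256)]


def packet_length_bins(lengths: list[int], bin_size: int = 150, n_bins: int = 10) -> list[int]:
    """Map each packet length to a coarse bin index (hypothetical bin sizes).

    Lengths beyond the last bin are capped at n_bins - 1.
    """
    return [min(length // bin_size, n_bins - 1) for length in lengths]


if __name__ == "__main__":
    dist = byte_distribution(b"aab")
    print(dist[ord("a")], dist[ord("b")])
    print(packet_length_bins([0, 149, 150, 1500, 9000]))
```

Vectors like these would be concatenated with the TLS and flow metadata before being handed to a classifier. Encrypted payloads tend towards a flat byte distribution, so on its own this feature is most useful for the unencrypted parts of a connection.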

The major limitation of the work is that the sandbox is limited to Windows XP hosts. The authors control for this as best they can by excluding from the training set any malware connections that use the XP TLS stack defaults, but clearly “further work is needed”.
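The exclusion step described above could look roughly like the sketch below. Everything in it is hypothetical: the `Flow` record, the `drop_xp_default_flows` helper, and above all the cipher suite tuple, which is a placeholder and not the real Windows XP default list.

```python
from dataclasses import dataclass

# Placeholder values for illustration only; NOT the actual XP TLS defaults.
HYPOTHETICAL_XP_DEFAULT_SUITES = (0x0004, 0x0005, 0x000A)


@dataclass
class Flow:
    label: str  # "malware" or "benign"
    offered_cipher_suites: tuple[int, ...]  # as offered in the ClientHello


def drop_xp_default_flows(flows: list[Flow]) -> list[Flow]:
    """Remove malware flows whose ClientHello matches the OS default list.

    Such flows reveal the sandbox's operating system rather than the
    malware's own TLS behaviour, so they would bias the classifier.
    """
    return [
        flow
        for flow in flows
        if not (
            flow.label == "malware"
            and flow.offered_cipher_suites == HYPOTHETICAL_XP_DEFAULT_SUITES
        )
    ]
```

Benign flows are left untouched here because the text only describes filtering the malware side of the training set.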

arXiv paper (PDF)
