LightBulb is a utility that helps find automated traffic in web proxy logs. For now, LightBulb only handles BlueCoat proxy log files. However, it can easily be modified to read other log formats.
Requires Ruby 1.8.7 or later.
LightBulb reads its input from standard input, so traffic can be filtered before it is loaded.
cat logfile.txt | ruby lightbulb.rb
This will create an output file in the same directory (lightbulb_report.txt).
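Because the input arrives on standard input, a filter can sit in front of lightbulb.rb in the pipeline. The following is a minimal sketch of such a filter, not part of LightBulb itself. The function names, the file name filter.rb, and the exclusion list are all hypothetical and only illustrate the idea of dropping lines you already consider uninteresting.

```ruby
# Hypothetical pre-filter: passes through only the log lines that contain
# none of the excluded substrings. The list below is purely illustrative.
EXCLUDED = ["windowsupdate.com", "example.org"]

# True when the line mentions none of the excluded substrings.
def keep_line?(line, excluded = EXCLUDED)
  excluded.none? { |needle| line.include?(needle) }
end

# Copies the kept lines from any IO-like input to any IO-like output.
def filter_lines(input, output, excluded = EXCLUDED)
  input.each_line do |line|
    output.print(line) if keep_line?(line, excluded)
  end
end

# If saved as filter.rb (hypothetical name) with this final line added:
#   filter_lines($stdin, $stdout)
# it could be used as:
#   cat logfile.txt | ruby filter.rb | ruby lightbulb.rb
```

A plain substring match is used here because it makes no assumptions about the log's field layout. A stricter filter would need to parse the individual fields.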
The format of the output file is:
Entropy, HostIP : Destination => time intervals in seconds
The final report has two sections. The top section lists the statistical outliers with the beacon intervals displayed. The second section lists traffic across all hosts without the interval information displayed.
Below is an excerpt from a sample report.
-0.0, 192.168.1.28 : msdn.microsoft.com => 3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0,3600.0
7.60017315297745, 192.168.2.230 : mail.google.com => 5.0,4.0,208.0,88.0,128.0,167.0,10.0,36.0,217.0,37.0,167.0,133.0,11.0,64.0,225.0,17.0,236.0,47.0,12.0,160.0,128.0,88.0,212.0,13.0,13.0,234.0,40.0,177.0,123.0,14.0,82.0
The first line shows that traffic from 192.168.1.28 to msdn.microsoft.com is perfectly regular, beaconing every 3600 seconds. The second line shows traffic to mail.google.com with a much higher entropy, which is evident from the irregular intervals between requests.
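For post-processing, report lines like the ones in the excerpt can be split back into fields. The sketch below assumes the layout seen in the sample (entropy, host IP, destination, then comma-separated intervals). The pattern and the function name parse_report_line are hypothetical, and lines from the second report section, which omit the intervals, may need different handling.

```ruby
# Matches: entropy, host_ip : destination => interval,interval,...
# This pattern is an assumption based on the sample excerpt only.
REPORT_LINE = /\A\s*([^,\s]+),\s*(\S+)\s*:\s*(\S+)\s*=>\s*(.*)\z/

# Returns a hash of the parsed fields, or nil if the line does not match.
def parse_report_line(line)
  match = REPORT_LINE.match(line.strip)
  return nil unless match
  {
    :entropy     => match[1].to_f,
    :host        => match[2],
    :destination => match[3],
    :intervals   => match[4].split(",").map { |v| v.to_f }
  }
end

row = parse_report_line("-0.0, 192.168.1.28 : msdn.microsoft.com => 3600.0,3600.0,3600.0")
puts row[:host]              # prints 192.168.1.28
puts row[:intervals].length  # prints 3
```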
Notes on Entropy.
Lower entropy = Higher regularity
Entropy close to zero implies beacon-like behavior, whereas at the other extreme, high entropy implies random behavior. This can be useful in hunting malware that uses beacons seeded by a PRNG.
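To make the relationship between entropy and regularity concrete, here is a minimal sketch of Shannon entropy (in bits) over the distribution of interval values. This is one common definition and is an assumption: lightbulb.rb may compute entropy differently, so the numbers will not necessarily match those in the sample report. The function name shannon_entropy is hypothetical.

```ruby
# Shannon entropy, in bits, of the distribution of values in a list.
# Identical values give zero; many distinct values give a higher score.
def shannon_entropy(values)
  return 0.0 if values.empty?
  counts = Hash.new(0)
  values.each { |v| counts[v] += 1 }
  total = values.length.to_f
  sum = 0.0
  counts.each_value do |count|
    prob = count / total
    # Math.log(x) / Math.log(2) is log base 2 and also works on Ruby 1.8.7.
    sum += prob * (Math.log(prob) / Math.log(2))
  end
  -sum
end

regular   = [3600.0] * 12
irregular = [5.0, 4.0, 208.0, 88.0, 128.0, 167.0, 10.0, 36.0, 217.0, 37.0, 133.0, 11.0]

puts shannon_entropy(regular)    # zero: every interval is the same
puts shannon_entropy(irregular)  # higher: every interval is different
```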
One other note: LightBulb (in its current form) needs at least 10 values to calculate the entropy of a list. The reason for this is confidence. I've found that smaller sets yield a much higher false positive rate. This threshold can be modified in the script.
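The minimum-sample rule can be sketched as follows. This assumes the "values" are the intervals between consecutive requests, measured in seconds. The constant MIN_INTERVALS and both function names are hypothetical and are not taken from lightbulb.rb.

```ruby
# Hypothetical constant mirroring the 10-value minimum described above.
MIN_INTERVALS = 10

# Converts request times (in seconds) into the gaps between consecutive requests.
def intervals(timestamps)
  timestamps.sort.each_cons(2).map { |a, b| (b - a).to_f }
end

# True when there are enough intervals to score the host/destination pair.
def enough_data?(timestamps, minimum = MIN_INTERVALS)
  intervals(timestamps).length >= minimum
end

hourly = (0..10).map { |i| i * 3600 }  # 11 requests, so 10 intervals
puts enough_data?(hourly)              # prints true
puts enough_data?(hourly.first(5))     # prints false
```

Raising the minimum trades coverage for confidence: fewer host pairs are scored, but each score rests on more observations.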