For those unfamiliar with TheBrain: it is highly interactive mind-mapping software, which includes a free edition called PersonalBrain. I have used this software on and off for a few years, and really like it for organizing and interacting with my thoughts when I start a new project. I recently took a look at using the software to organize and visualize web transaction logs for analysis (specifically, to extract suspicious or malicious transactions). Below are my experiences; I'm curious to know what others have found to work.
After I exchanged a few emails with TheBrain support (they are very responsive), they shared a 5-page document on their supported XML formats. A DTD file is available, and it is extremely easy to convert your data into "Thoughts" and "Links" (just make sure to properly handle any XML entities within your data). Thoughts are basically nodes within your mind-map, and links can be parent/child or "jump" relationships (I think of these as bi-directional "lookup" relationships or cross-references). In my case, I wrote a Perl script to extract and convert portions of my data into the supported XML format and then imported it into PersonalBrain.
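As a rough illustration of that conversion step (in Python, not the Perl script mentioned above), the sketch below writes Thoughts and Links to XML and lets the library handle entity escaping. The element names (`BrainData`, `Thought`, `Link`, `idA`, `idB`, `kind`) and the function name are placeholders made up for this example; the real names have to come from the DTD that TheBrain support provides.

```python
import xml.etree.ElementTree as ET


def build_brain_xml(thoughts, links):
    """Serialize thoughts and links to an XML string.

    thoughts: iterable of (thought_id, name)
    links:    iterable of (id_a, id_b, kind), kind being e.g. "child" or "jump"

    All element names are hypothetical stand-ins for the names in the real DTD.
    """
    root = ET.Element("BrainData")

    thoughts_el = ET.SubElement(root, "Thoughts")
    for thought_id, name in thoughts:
        thought_el = ET.SubElement(thoughts_el, "Thought")
        ET.SubElement(thought_el, "id").text = str(thought_id)
        # Assigning raw text is safe: ElementTree escapes &, < and >
        # when the tree is serialized.
        ET.SubElement(thought_el, "name").text = name

    links_el = ET.SubElement(root, "Links")
    for id_a, id_b, kind in links:
        link_el = ET.SubElement(links_el, "Link")
        ET.SubElement(link_el, "idA").text = str(id_a)
        ET.SubElement(link_el, "idB").text = str(id_b)
        ET.SubElement(link_el, "kind").text = kind

    return ET.tostring(root, encoding="unicode")
```

URL paths are the most likely place to hit this, since query strings routinely contain `&`. A path such as `/search?q=a&b=<x>` comes out as `/search?q=a&amp;b=&lt;x&gt;` in the generated file and parses back to the original string.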
The above is data from 5 transactions imported into the PersonalBrain with these relationships:
- ServerIP, Domain, URLPath, RequestType, Country, ASN, Score, AnalyticCheck, and Transactions are all children of Data, and each of these Thoughts holds its related values as children. For example, 18.104.22.168 falls under ServerIP and China falls under Country.
- ServerIP data has jump links to related ASN, Country, and Domain data and vice versa (the links are bi-directional)
- Domain data has jump links to related URLPaths and vice versa
- Transaction data has related jump links to ServerIP, Domain, URLPath, RequestType, Country, ASN, Score, and AnalyticCheck and vice versa
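The relationships listed above can be sketched as a small graph builder. This is only one way to model them, written in Python under assumptions: the record layout (one dict per transaction, keyed by category name), the `tx-N` naming, and the function name are all invented for illustration. Jump links are stored as unordered pairs because they are bi-directional.

```python
from itertools import count

CATEGORIES = ["ServerIP", "Domain", "URLPath", "RequestType",
              "Country", "ASN", "Score", "AnalyticCheck"]


def build_graph(transactions):
    """Return (thoughts, child_links, jump_links) for a list of dicts."""
    ids = count(1)
    thoughts = {}        # (category, value) -> numeric thought id
    child_links = set()  # (parent_id, child_id)
    jump_links = set()   # frozenset({id_a, id_b}); order does not matter

    def thought(category, value):
        # Reuse the existing thought if this value was seen before.
        key = (category, value)
        if key not in thoughts:
            thoughts[key] = next(ids)
        return thoughts[key]

    def jump(a, b):
        if a != b:
            jump_links.add(frozenset((a, b)))

    # Data is the root; every category is a child of it.
    data = thought("Root", "Data")
    for cat in CATEGORIES + ["Transactions"]:
        child_links.add((data, thought("Category", cat)))

    for n, tx in enumerate(transactions, 1):
        tx_id = thought("Transactions", f"tx-{n}")
        child_links.add((thought("Category", "Transactions"), tx_id))

        value_ids = {}
        for cat in CATEGORIES:
            if cat in tx:
                vid = thought(cat, str(tx[cat]))
                child_links.add((thought("Category", cat), vid))
                jump(tx_id, vid)  # transaction <-> each of its values
                value_ids[cat] = vid

        # ServerIP <-> ASN, Country, Domain
        for other in ("ASN", "Country", "Domain"):
            if "ServerIP" in value_ids and other in value_ids:
                jump(value_ids["ServerIP"], value_ids[other])
        # Domain <-> URLPath
        if "Domain" in value_ids and "URLPath" in value_ids:
            jump(value_ids["Domain"], value_ids["URLPath"])

    return thoughts, child_links, jump_links
```

Because values are de-duplicated, two transactions that hit the same server IP end up jump-linked to the same ServerIP thought, which is what makes the cross-referencing work once the data is imported.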
With the data in this visual format, it is easy to quickly drill down to transactions with higher "suspicious" scores and then cross-reference their details with other transactions. You can view both primary and secondary relationships, similar to what is displayed in the above graph, or a more concise view with just the primary relationships:
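The same drill-down can be approximated outside the GUI. The Python sketch below is a hypothetical equivalent, not something TheBrain provides: it picks transactions at or above a score threshold and then lists other transactions that share a field value with them. The field names, the numeric `Score`, and both function names are assumptions made for the example.

```python
def suspicious(transactions, threshold):
    """Return transactions whose Score is at or above the threshold."""
    return [tx for tx in transactions if tx.get("Score", 0) >= threshold]


def related(transactions, tx, fields=("ServerIP", "Domain", "ASN")):
    """Return (other_tx, shared_fields) for transactions sharing a value with tx."""
    matches = []
    for other in transactions:
        if other is tx:
            continue
        shared = [f for f in fields if f in tx and tx[f] == other.get(f)]
        if shared:
            matches.append((other, shared))
    return matches
```

For example, `related(all_tx, suspicious(all_tx, 75)[0])` returns every other transaction that shares a server IP, domain, or ASN with the first high-scoring one, roughly what following jump links from a single Thought shows on screen.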
Zeus/SpyEye and other bots are often configured to use an IP lookup service (a possible example above) and provide their resolvable IP to the C&C. These types of inter-related transactions can become apparent when correlating botnet web transactions.
I found that TheBrain's format is very easy to convert and import data into, and that its interface for working with imported data is user-friendly, interactive, and fun. TheBrain can also "forget" and "remember" thoughts, i.e., remove and restore thoughts and links, so while you are conducting your analysis you can clear your brain of any data that is in the way.
Where I found TheBrain woefully inadequate is with large data-sets. When I exported all of the data that I wanted to review for a single day, the XML file came to 1.66GB for all of the Thoughts and Links. I tried importing it into PersonalBrain on my MacBook Pro (4GB, 2.53GHz Core 2 Duo) and let it run overnight. After about 20 hours of waiting for the import to finish, I force quit the application (though it never reported "not responding").
If your data-set is of relatively small size, TheBrain may be a good free tool to add to your arsenal. Feel free to share other good, interactive, free visualization tools for analyzing web transaction logs, particularly if they scale to larger data-sets.