Second Encounters With A Web Comment Spammer

December 01, 2008 - 4 min read

Earlier I wrote my First Encounters with a Web Comment Spammer piece. In that piece I devised a plan to lay a trap of sorts for the web comment spamming application, in order to test the depth of the application's functionality. Well, it's been a few weeks, and now I have some data to share.

The most interesting thing to note is that a few more comment spam applications/crawlers have made their way to my comment form. These new ones exhibit different behavior than the original one I reported on, thus I believe they are entirely different applications. For now, I’m going to stick to the original application I previously discussed; I'll compare my results to these newer spam apps in a future blog post.

One thing I noticed is that many of these comment spam attempts were coming from systems located on the 94.102.60.0/24 network. A large number of them were also using the User-Agent string "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)". Both of these factors turned out to be good indicators of whether the request was coming from a spam bot.
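As a rough illustration, the two indicators above could be checked with something like the following. This is only a sketch: the function name and the returned labels are made up for this example, and only the network range and User-Agent string come from the observations above.

```python
import ipaddress
from typing import List

# Both values are taken from the observations in this post.
SPAM_NETWORK = ipaddress.ip_network("94.102.60.0/24")
SPAM_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"


def spam_indicators(remote_addr: str, user_agent: str) -> List[str]:
    """Return the names of the indicators that match this request.

    The function name and labels are hypothetical, for illustration only.
    """
    hits = []
    # Membership test against the /24 network seen in the logs.
    if ipaddress.ip_address(remote_addr) in SPAM_NETWORK:
        hits.append("network")
    # Exact match on the User-Agent string seen in the logs.
    if user_agent.strip() == SPAM_USER_AGENT:
        hits.append("user-agent")
    return hits
```

Neither indicator is proof on its own (plenty of real browsers sent that User-Agent string in 2008), so a check like this is probably best treated as one input to a score rather than a hard block.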

Here is an example of one submission I received. The name at the beginning of each line is the form field name; all of the fields are text input fields (as in, <input type="text">), except for 'other' and 'comment', which are textarea fields.

eml: YkfxeDeZjHR
email: [email protected]
name: fvkijvn
phone: CdGpbMFbxGDygCwy
address: ouUMHxpoxwn
url: http://vfokgivkywst.com/
link: geEKiJfvkyRC

other: TyhYgb xnhqpiemubkx, [url=http://fiukrdabbaut.com/]fiukrdabbaut[/url], [link=http://zrywxdmvlfzv.com/]zrywxdmvlfzv[/link], http://klhtciqjlkxr.com/

comment: TyhYgb xnhqpiemubkx, [url=http://fiukrdabbaut.com/]fiukrdabbaut[/url], [link=http://zrywxdmvlfzv.com/]zrywxdmvlfzv[/link], http://klhtciqjlkxr.com/

The most obvious thing visible from this data is that the application filled in every field with random garbage. It managed to put something resembling an email address into the 'email' field, but not into the 'eml' field (which is the actual email address field shown to the user for data entry). The application also managed to put a URL into the 'url' field, but not into the 'link' field. This makes me believe the application is pre-programmed with a few specific field names for which it submits data in a specific format. Also notable is that the application submitted the same blob of link garbage to both textarea fields ('other' and 'comment'), but not to any of the text input fields.
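The patterns in that submission can be turned into simple server-side checks. The sketch below is not the trap code I actually ran; it assumes that 'email' is a decoy field hidden from human visitors (the post only states that 'eml' is the field shown to the user), and the function name and signal strings are invented for the example.

```python
import re

# Assumption: 'email' is a decoy that humans never see, since 'eml' is the
# field actually shown to the user. Adjust this set to match the real form.
DECOY_FIELDS = {"email"}

# Matches the [url=http://...] and [link=http://...] markup from the sample.
LINK_MARKUP = re.compile(r"\[(url|link)=https?://[^\]]+\]", re.IGNORECASE)


def trap_signals(form: dict) -> list:
    """Return a list of bot-like signals found in a submitted form."""
    signals = []
    # A filled-in decoy field means the client did not render the page
    # the way a human visitor would see it.
    for field in sorted(DECOY_FIELDS):
        if form.get(field, "").strip():
            signals.append(f"decoy field '{field}' filled")
    # The bot pasted the same blob into both textarea fields.
    other = form.get("other", "").strip()
    comment = form.get("comment", "").strip()
    if other and other == comment:
        signals.append("identical textarea content")
    # Forum-style link markup is a strong hint of link spam.
    if LINK_MARKUP.search(comment):
        signals.append("bbcode-style link markup")
    return signals
```

Run against a submission shaped like the sample above (with any address in the 'email' field), all three signals fire; an ordinary comment with only 'eml', 'name', and 'comment' filled in produces none.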

In addition to the form fields that were submitted, I collected some other pieces of information to gauge the depth of the spamming application. I discovered that cookies were indeed supported; at least, I could set a cookie on the form display page and the bot would carry that cookie over with the form submission. Hidden form fields were left unaltered and were properly submitted with the rest of the form data. I also found that JavaScript is not supported by the application, which is no surprise.
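One way to run this kind of capability probe is to hand out three random tokens when the form is displayed (one in a cookie, one in a hidden field, and one that only client-side script copies into a field) and then see which ones come back. The following is a minimal sketch of that idea, not my actual code; the cookie and field names ('probe', 'probe_hidden', 'probe_js') are hypothetical.

```python
import secrets


def issue_probe() -> dict:
    """Generate the tokens to embed when the form page is displayed."""
    return {
        "cookie": secrets.token_hex(8),    # sent via Set-Cookie
        "hidden": secrets.token_hex(8),    # placed in a hidden form field
        "js_nonce": secrets.token_hex(8),  # copied into a field by script only
    }


def probe_results(issued: dict, cookies: dict, form: dict) -> dict:
    """Compare what came back with the submission against what was issued.

    The key names used here are hypothetical, chosen for this example.
    """
    return {
        "cookies": cookies.get("probe") == issued["cookie"],
        "hidden_field_intact": form.get("probe_hidden") == issued["hidden"],
        "javascript": form.get("probe_js") == issued["js_nonce"],
    }
```

For the bot described here, the result would be cookies and hidden fields true and JavaScript false. In practice the issued tokens need to be stored server-side (or signed) so they can be looked up when the submission arrives.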

Another thing I failed to notice before is that the application actually has the ability to handle multi-step submissions. I recognized the behavior in my logs: whenever the form was submitted, the same user-agent would then request every link on the page, in exact order of appearance, no less. I assume this behavior is meant to deal with web applications that return a "thank you for your submission" page along with a link taking you back to the forum/comment area where the new submission will appear.
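That link-walking pattern is easy to look for in access logs once you know the order of the links on the response page. Here is a rough sketch, assuming the log has already been parsed into (user_agent, method, path) tuples in time order; the function name and the tuple layout are assumptions made for the example.

```python
def follows_links_in_order(entries, form_path, page_links):
    """Detect a POST to form_path followed by GETs for every page link.

    entries:    list of (user_agent, method, path) tuples in time order
    form_path:  path the comment form posts to
    page_links: link paths in their order of appearance on the page
    """
    if not page_links:
        return False
    for i, (agent, method, path) in enumerate(entries):
        if method != "POST" or path != form_path:
            continue
        # Paths requested via GET by the same user-agent after this POST;
        # requests from other clients in between are ignored.
        followed = [p for a, m, p in entries[i + 1:] if a == agent and m == "GET"]
        if followed[:len(page_links)] == list(page_links):
            return True
    return False
```

Matching on user-agent alone mirrors what I saw in my logs; pairing it with the source address would make the check stricter.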

Interesting info, perhaps. But I've grown bored with this particular application and its lack of intelligence; the newer bots I've been seeing do far more interesting things. I will take a deeper look at these new bots, and how they differ, in my next blog post. After that, I'll share a few effective tricks I've been using to tell these spam bots apart from the humans (without CAPTCHAs!).

Until then!
- Jeff

Disclaimer: This blog post has been created by Zscaler for informational purposes only and is provided "as is" without any guarantees of accuracy, completeness or reliability. Zscaler assumes no responsibility for any errors or omissions or for any actions taken based on the information provided. Any third-party websites or resources linked in this blog post are provided for convenience only, and Zscaler is not responsible for their content or practices. All content is subject to change without notice. By accessing this blog, you agree to these terms and acknowledge your sole responsibility to verify and use the information as appropriate for your needs.