IDN domain filtering in ORF 5.1

Now that the world did not end after all (not in our timezone anyway!), I thought I’d share two pieces of news with you. Let’s start with the bad one:

  • You will still receive spam in 2013.
  • On the bright side, you will be better equipped against it.

We are working on a massive overhaul of the URL Harvester engine in ORF, the component responsible for detecting, decoding, prioritizing and sorting URL payloads in spam, such as web links and email address domains. The domain names harvested from URLs are checked against URIBLs such as Spamhaus DBL to see if they are associated with spam.
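To make the harvest-then-check idea concrete, here is a minimal Python sketch. It is not ORF's actual harvester: the helper names `harvest_domains` and `dbl_query_name` and the simple URL regex are assumptions made for illustration. It only pulls host names out of http(s) links and builds the DNS name that a Spamhaus DBL lookup would query; the DNS query itself is left out.

```python
import re
from urllib.parse import urlsplit

# Illustrative pattern only; a real harvester also handles obfuscated
# links, email address domains and many other payload types.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# Public DNS zone of the Spamhaus Domain Block List.
DBL_ZONE = "dbl.spamhaus.org"


def harvest_domains(text):
    """Return the unique host names found in http(s) URLs, in order of appearance."""
    found = []
    for match in URL_RE.finditer(text):
        # Drop punctuation that commonly trails a URL in running text.
        url = match.group(0).rstrip(".,;:!?)")
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host and host not in found:
            found.append(host)
    return found


def dbl_query_name(domain):
    """Build the DNS name to query for a domain blocklist lookup."""
    return f"{domain.rstrip('.')}.{DBL_ZONE}"


if __name__ == "__main__":
    body = "Cheap pills at http://Spam.Example.com/offer?id=1, unsubscribe: https://example.org/x."
    for domain in harvest_domains(body):
        print(domain, "->", dbl_query_name(domain))
```

A listed domain answers the resulting query with an A record, while an unlisted one does not resolve.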

Among other things, the ORF URL Harvester is receiving Internationalized Domain Name (IDN) support in ORF 5.1. IDN(A) is a set of standards that enables writing domain names in alphabets and scripts like Cyrillic, Japanese or Hebrew, so with the new IDN support, ORF will be able to discover and look up links like http://русский. The significance of this new feature is explained by the statistics below. It should also be noted that IDNs can be abused in phishing emails for IDN homograph attacks, so this feature also adds an extra bit of security to your email systems.
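For readers who have not met IDNs before, the sketch below shows the two ideas in Python. It is an illustration, not ORF code: the function names are made up, the sample domain `пример.рф` is hypothetical, and Python's built-in `idna` codec follows the older IDNA 2003 rules. The first part converts a Unicode domain to the ASCII ("xn--") form that DNS and blocklists actually see. The second part is a deliberately crude homograph heuristic that flags a label mixing letters from more than one script.

```python
import unicodedata


def to_ascii(domain):
    """Convert a Unicode domain name to its ASCII-compatible (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Convert an ASCII-compatible domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


def scripts_used(label):
    """Approximate the scripts in a label by the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_mixed_script(label):
    """Crude heuristic: letters from more than one script in a single label."""
    return len(scripts_used(label)) > 1


if __name__ == "__main__":
    ascii_form = to_ascii("пример.рф")  # hypothetical example domain
    print(ascii_form)
    print(to_unicode(ascii_form))
    # Latin letters with a Cyrillic "а" (U+0430) in second position.
    print(looks_mixed_script("p\u0430ypal"))
```

The script check is only a starting point. Languages that legitimately combine scripts, such as Japanese, would be flagged too, so a real filter needs a more careful policy.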

Lab tests with the new URL harvester resulted in a number of interesting statistics.

  • Total spam emails tested: 16232 (spam age between 2 and 48 hours)
  • Emails with URL payload: 88.55%
  • Emails with IDN URL payload: 4.37%
  • Total IDNs discovered: 26 (all from the .рф ccTLD)
  • Percentage of IDNs blacklisted in Spamhaus DBL: 84.62%
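The percentages above can be turned back into approximate absolute counts. The short Python calculation below does only that. The counts are derived by rounding the published percentages, so treat them as estimates rather than figures taken from the lab data.

```python
# Figures as published in the list above.
total_emails = 16232
pct_with_url = 88.55
pct_with_idn_url = 4.37
total_idns = 26
pct_idns_listed = 84.62

# Approximate counts implied by the percentages (rounded to whole items).
emails_with_url = round(total_emails * pct_with_url / 100)
emails_with_idn_url = round(total_emails * pct_with_idn_url / 100)
idns_listed = round(total_idns * pct_idns_listed / 100)

print(f"Emails with URL payload:     ~{emails_with_url}")
print(f"Emails with IDN URL payload: ~{emails_with_idn_url}")
print(f"IDNs listed in Spamhaus DBL: ~{idns_listed} of {total_idns}")
```

In other words, roughly 709 of the tested emails carried an IDN link, and about 22 of the 26 distinct IDNs were already listed.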

While the sample is too small to draw conclusions of scientific quality, the 4.37% figure and Spamhaus’ high detection rate show that IDN support in ORF 5.1 can contribute significantly to better spam detection.
