One of the cooler tools in the webappsec hacker’s handbook is Hackvertor. It’s a smart encoding tool written by Gareth Heyes that helps you craft XSS vectors that pass whatever filters you’re trying to evade. Rather than wasting 3 paragraphs describing it, you should just go try out this example that Gareth showed me for obfuscating a simple alert(document.cookie). Check out the Hackvertlet and HVURL features!

Anyway, for obfuscating a single payload to bypass a filter, Hackvertor is excellent. However, my recent research into next-generation XSS worms calls for more in the way of polymorphic capability. For example, although a single obfuscated payload (à la Samy’s worm code) can propagate all over an infected application, that application is extremely easy to clean with a simple search-and-destroy on the payload code.

I call this type of worm a “Teflon worm”. Get it? Teflon’s easy to clean. If you don’t think that’s funny, you should read it again and reconsider. If you still don’t think it’s funny, listen to this mp3 and let me take credit for being funny by association.

To avoid that easy-to-clean Teflon type of payload we need a polymorphic worm that has different payloads. However, they can’t just be different. They have to be difficult to signature. So, a payload that goes from stealCookie24(document.cookie) to stealCookie25(document.cookie) is an improvement, but it’s still not great because the payloads generated will be easy to signature. Really, any deterministic algorithm for shifting payloads will be easy to signature after some analysis.
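To see why a counter-style variation buys so little, here is a minimal sketch of what a cleaner's signature might look like. The regular expression and the third sample string are assumptions made up for illustration; stealCookie24 and stealCookie25 are just the example names from the paragraph above.

```javascript
// Hypothetical signature a cleaner might write after seeing two or three samples.
const signature = /stealCookie\d+\(document\.cookie\)/;

// Variants produced by simply incrementing a counter.
const samples = [
  "stealCookie24(document.cookie)",
  "stealCookie25(document.cookie)",
  "stealCookie9001(document.cookie)",
];

// Every counter-based variant is caught by the same single pattern.
const allMatched = samples.every((s) => signature.test(s));
console.log(allMatched); // true
```

One short pattern covers every variant the generator will ever emit, which is the sense in which a deterministic shift is easy to signature.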

Another problem with Hackvertor-generated payloads is that they still contain strings that would commonly be blacklisted, such as document.cookie. These few issues shouldn’t be considered “weaknesses” in Hackvertor. The polymorphing code a good worm will need from infection to infection is one contextual level up from Hackvertor, and bypassing a blacklist is not in and of itself a challenge you need Hackvertor for. However, the tool would be much more useful if it automatically fragmented and re-assembled these keywords (perhaps any literal string in the payload).

The goal is not just to obfuscate, but to actively camouflage our data among other users’ data. To do that, we’re going to need a non-deterministic method of expressing a JavaScript “idea”. Assuming Hackvertor is the tool to help us make this happen, here are 3 key ways we can move Hackvertor towards that destination:

Combining and Layering

A real polymorphic algorithm will randomly combine and layer multiple encoding transformations on the payload. This doesn’t mean you just take 5 algorithms and run the payload through 3 of them. This means looping through a random number of iterations, taking randomly sized substrings of the payload and running them through multiple layers of encoding.

Random Expressiveness

Encoding is one way of performing transformations. Another way of performing transformations is altering the expressions of full or partial JavaScript statements. An example transformation would turn “alert(” into “alert (”. Another transformation would take that statement and turn it into “/*ZXC*/alert (”. Because of the expressiveness of JavaScript, you could create hundreds of these types of transformations.
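Looked at from the filter's side, each of these transformations has to be undone before a signature can match. The sketch below is a deliberately naive normalizer, assuming only the two transformations named above (inserted whitespace and block comments); the function name normalize is hypothetical, and a real one would need a proper JavaScript tokenizer so that it does not mangle string literals.

```javascript
// Strip /* ... */ comments and all whitespace so cosmetic variants compare equal.
// Naive on purpose: it does not understand string literals or regex literals.
function normalize(source) {
  return source
    .replace(/\/\*[\s\S]*?\*\//g, "") // remove block comments
    .replace(/\s+/g, "");             // remove whitespace
}

// The three spellings from the paragraph above.
const variants = ["alert(", "alert (", "/*ZXC*/alert ("];
const canonical = variants.map(normalize);

// Every entry collapses to the same canonical form: "alert("
console.log(canonical.every((c) => c === "alert(")); // true
```

Every new kind of transformation needs its own rule in the normalizer, which is why a large, randomly chosen set of them is costly to keep up with.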

Out-of-order String Fragmenting

One of the failures of many of the encoders I see is that when they fragment keywords to avoid blacklists, they fragment them in a way that is easy for a computer to see through. For example, take these two pieces of Samy’s exploit code:

  1. eval('document.body.inne'+'rHTML')
  2. eval('J.onr'+'eadystatechange=BI')

Can you use this fragmenting technique to beat any blacklist? Not a 3-D blacklist. Let’s run these things through a filter that removes everything that’s not alphanumeric. The results: evaldocumentbodyinnerHTML and evalJonreadystatechangeBI – and now our blacklist works again. I call this technique a 3-D blacklist because it gives depth to data. It’s an absolutely terrible mechanism and easily beatable, but it would work today because nobody’s trying to beat it. The way to beat it? Don’t fragment your bad words in order. Reverse the order, make it random, etc.
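The filter described above can be sketched in a few lines. The blacklist contents, the function names flatten and isBlocked, and the case-insensitive comparison are all assumptions for illustration; the two inputs are the pieces of Samy’s code quoted earlier.

```javascript
// Hypothetical blacklist; keywords are stored lowercased (case-insensitive match is an assumption).
const blacklist = ["innerhtml", "onreadystatechange"];

// Drop everything that is not alphanumeric, as described above.
function flatten(input) {
  return input.replace(/[^A-Za-z0-9]/g, "");
}

// Flatten first, then look for any blacklisted keyword in the result.
function isBlocked(input) {
  const flat = flatten(input).toLowerCase();
  return blacklist.some((word) => flat.includes(word));
}

console.log(flatten("eval('document.body.inne'+'rHTML')"));    // evaldocumentbodyinnerHTML
console.log(isBlocked("eval('document.body.inne'+'rHTML')"));  // true
console.log(isBlocked("eval('J.onr'+'eadystatechange=BI')"));  // true

// Fragments that appear out of order no longer line up after flattening.
console.log(isBlocked("['rHTML','inne'].reverse().join('')")); // false
```

The last line shows the limit of the approach: flattening only restores keywords whose pieces appear in their original order.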

Whenever I teach a class on webappsec, I mention Samy’s worm. Of the whole thing, his fragmenting technique is the piece that seems to impress the students most. It’s simultaneously very simple and very clever. Once I’m done breaking down the worm, however, I let them know how ugly Samy is to quickly prevent his already out-of-control fan club from growing any more. Okay, I don’t tell them he’s ugly, but only because everyone who saw us at OWASP San Jose kept saying we looked alike. Jeff Williams found this picture showing that he looks much more like Vladimir Lenin, which, by the transitive nature of comparison, means I look like Lenin. You be the judge:
[Photos: Samy Kamkar, Vladimir Lenin]

Samy, Lenin, then me (and my mail order bride Jen).

Anyway, Gareth has told me he will be implementing these features in the future and that we should keep an eye on his blog. The next great XSS worm can’t be easy to signature, so this is important research.

In closing, go Liverpool!