Stay ahead of Web 2.0 worms

XSS marks the spot

Mon 7 Jan 2008 // 21:49 UTC

Canonicalize: Attackers will use tricks to attempt to bypass your validation. By using various kinds of encoding, either alone or in combination, they hide code inside data. You have to decode this data to its simplest form - "canonicalize" it - before validating, or you may allow code to reach some backend system that decodes it and enables the attack. But canonicalizing is easier said than done. Take a string that's been doubly encoded using multiple different (possibly unknown) encoding schemes, add any number of backend parsers, consider the browser's permissive parsing, and you have a real mess - as seen here:

?input=test%2522%2Bonblur%253D%26quotalert&amp%23x28document.cookie%2529
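One way to approach canonicalization is to decode repeatedly until the value stops changing, and to treat anything that needed more than one pass as suspicious. The Python sketch below assumes only two schemes are in play, URL percent-encoding and HTML entities. The function name `canonicalize` and the `max_rounds` limit are illustrative choices, and a real application would need to cover whatever encodings its own backends understand.

```python
import html
from urllib.parse import unquote


def canonicalize(value: str, max_rounds: int = 10) -> tuple[str, int]:
    """Decode until a pass leaves the value unchanged.

    Returns the stable string and the number of passes that changed it.
    Raises ValueError if stability is not confirmed within max_rounds passes.
    Assumption: only percent-encoding and HTML entities are handled here.
    """
    current = value
    rounds = 0
    for _ in range(max_rounds):
        # Undo one layer of percent-encoding, then one layer of HTML entities.
        decoded = html.unescape(unquote(current))
        if decoded == current:
            return current, rounds
        current = decoded
        rounds += 1
    raise ValueError("input not confirmed stable within max_rounds passes")


if __name__ == "__main__":
    canonical, rounds = canonicalize("%253Cscript%253E")
    # More than one pass means the input was encoded more than once.
    print(repr(canonical), rounds)
```

Validation then runs on the returned string rather than on the raw input, and the pass count gives a cheap signal: a value such as `%253Cscript%253E` takes two passes to settle, which legitimate input rarely does.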

Encode output: Encoding isn't all bad. You can use encoding to tell the browser that data shouldn't be run as code. "Whitelist output encoding" simply means replacing everything except a small list of safe characters (such as alphanumerics) with HTML entities before sending it to the browser. The browser will render these entities instead of interpreting them, which defuses XSS attacks. Except that it doesn't always work. First, many frameworks and libraries use "blacklist" encoding, where only a very small set of characters is encoded. Second, there are some places in HTML, such as href and src attributes, where the browser decodes HTML entities before acting on the value, so entity-encoded script still runs. An insane-looking URL written entirely in HTML entities is actually executed by browsers when it lands in one of those attributes.
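As a rough illustration of whitelist output encoding and of its limit in URL attributes, here is a minimal Python sketch. The helper name `encode_for_html` and the choice of ASCII alphanumerics as the safe list are assumptions made for the example, and `html.unescape` merely stands in for the entity decoding a browser performs on attribute values. The `javascript:` URL is a made-up illustration.

```python
import html
import string

# Assumption: only ASCII letters and digits are treated as safe.
SAFE_CHARS = frozenset(string.ascii_letters + string.digits)


def encode_for_html(text: str) -> str:
    """Replace every character outside SAFE_CHARS with a hex HTML entity."""
    return "".join(
        ch if ch in SAFE_CHARS else f"&#x{ord(ch):X};"
        for ch in text
    )


if __name__ == "__main__":
    # In element content the encoded markup is displayed, not parsed as tags.
    print(encode_for_html('<script>alert("xss")</script>'))

    # In an href or src value the entities are decoded before the URL is
    # used, so the original javascript: URL comes straight back.
    url = "javascript:alert(1)"
    encoded = encode_for_html(url)
    print(encoded)
    print(html.unescape(encoded) == url)
```

The second half is the point of the caveat: encoding changes how the data looks on the wire, but in an attribute that takes a URL the decoded value is what the browser acts on.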

Assign character set: Even if you're doing great input validation and output encoding, attackers may try to trick the browser into using the wrong character set to interpret the data. An innocuous string in one character set may be interpreted as a dangerous string in a different character set. Many browsers will attempt to guess the character set if one is not specified. So, the harmless-looking string +ADw- will be interpreted as a < character if the browser decides to use UTF-7. This is why you should always be careful to set a sane character set like UTF-8, as seen here:

Content-Type: text/html; charset=UTF-8
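The effect of a wrong guess can be reproduced outside the browser. In this small Python sketch the same bytes are decoded under two character sets; the names `PAYLOAD`, `interpret` and `content_type` are hypothetical, and the codecs from Python's standard library stand in for the browser's decoder.

```python
# The article's +ADw- sequence, extended to a full tag for illustration.
PAYLOAD = b"+ADw-script+AD4-"


def interpret(raw: bytes, charset: str) -> str:
    """Decode the same bytes under the given character set."""
    return raw.decode(charset)


def content_type(charset: str = "UTF-8") -> str:
    """Build a Content-Type header value that names the charset explicitly."""
    return f"text/html; charset={charset}"


if __name__ == "__main__":
    print(interpret(PAYLOAD, "utf-8"))  # stays as harmless-looking text
    print(interpret(PAYLOAD, "utf-7"))  # becomes markup
    print(content_type())
```

Decoded as UTF-8 the payload is plain text, while decoded as UTF-7 it becomes `<script>`, which is why the header should name the character set instead of leaving the browser to guess.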
