An attacker may provide a Unicode string to a system component that is not
Unicode-aware and use that to circumvent the filter or cause the classifying
mechanism to fail to properly understand the request. That may allow the
attacker to slip malicious data past the content filter and/or cause the
application to route the request incorrectly.
Attack Execution Flow
Explore
Survey the application for
user-controllable inputs:
Using a browser or an automated tool, an attacker
follows all public links and actions on a web site,
recording all the links, forms, resources accessed,
and other potential entry points for the web
application.
Attack Step Techniques
ID | Attack Step Technique Description | Environments
1 | Use a spidering tool to follow and record all links and analyze the web pages to find entry points. Make special note of any links that include parameters in the URL. | env-Web
2 | Use a proxy tool to record all user input entry points visited during a manual traversal of the web application. | env-Web
3 | Use a browser to manually explore the website and analyze how it is constructed. Many browser plugins are available to facilitate the analysis or automate the discovery. | env-Web
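The first technique can be pictured with a minimal sketch, useful for auditing which entry points of your own site a spider would record. It assumes a single HTML page held in a string; the class name `EntryPointParser`, the helper `parameterized_links`, and the host `example.test` are illustrative names, not part of any tool mentioned above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit, parse_qs


class EntryPointParser(HTMLParser):
    """Collect link targets and form field names from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []        # absolute URLs found in <a href="...">
        self.form_fields = []  # names of <input>, <select>, <textarea>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag in ("input", "select", "textarea") and attrs.get("name"):
            self.form_fields.append(attrs["name"])


def parameterized_links(links):
    """Return {url: [parameter names]} for links that carry a query string."""
    result = {}
    for url in links:
        params = parse_qs(urlsplit(url).query, keep_blank_values=True)
        if params:
            result[url] = sorted(params)
    return result


# Hypothetical page content used only for illustration.
page = ('<a href="/item?id=7&amp;page=2">item</a>'
        '<a href="/about">about</a>'
        '<form><input name="q"></form>')
parser = EntryPointParser("http://example.test/")
parser.feed(page)
found = parameterized_links(parser.links)
```

Here `found` holds only the links that include URL parameters, which are the ones the technique says to note specially, and `parser.form_fields` lists the form inputs.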
Indicators
ID | Type | Indicator Description | Environments
1 | Positive | Inputs are used by the application or the browser (DOM). | env-Web
2 | Inconclusive | Using URL rewriting, parameters may be part of the URL path. | env-Web
3 | Inconclusive | No parameters appear to be used on the current page. Even though none appear, the web application may still use them if they are provided. | env-Web
4 | Negative | Applications that have only static pages or that simply present information without accepting input are unlikely to be susceptible. | env-Web
Outcomes
ID | Type | Outcome Description
1 | Success | A list of URLs, with their corresponding parameters (POST, GET, COOKIE, etc.), is created by the attacker.
2 | Success | A list of application user interface entry fields is created by the attacker.
3 | Success | A list of resources accessed by the application is created by the attacker.
Security Controls
ID | Type | Security Control Description
1 | Detective | Monitor the velocity of page fetching in web logs. Humans who view a page and select a link from it click far more slowly and far less regularly than tools. Tools make requests very quickly, and the requests are typically spaced at regular intervals (e.g. 0.8 seconds apart).
2 | Detective | Create links on some pages that are not visible to human visitors. Using IFRAMES, images, or other HTML techniques, the links can be hidden from humans browsing the site but remain visible to spiders and programs. A request for the hidden page then becomes a good predictor of an automated tool probing the application.
3 | Preventative | Use CAPTCHA to prevent the use of the application by an automated tool.
4 | Preventative | Actively monitor the application and either deny or redirect requests from origins that appear to be automated.
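The first detective control can be sketched as a check on the gaps between one client's requests. This is a rough illustration, not a tuned detector: the function name `looks_automated` and the thresholds `max_rate`, `max_jitter`, and `min_requests` are assumptions chosen for the example, and the timestamps are assumed to have been extracted from the web log already.

```python
from statistics import mean, pstdev


def looks_automated(timestamps, max_rate=1.0, max_jitter=0.1, min_requests=5):
    """Flag a client whose requests are both fast and evenly spaced.

    timestamps: request times in seconds for one client, ascending order.
    max_rate, max_jitter and min_requests are illustrative thresholds only.
    """
    if len(timestamps) < min_requests:
        return False
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    avg = mean(gaps)
    if avg <= 0:
        return True  # many requests logged with the same timestamp
    rate = 1.0 / avg             # requests per second
    jitter = pstdev(gaps) / avg  # coefficient of variation of the gaps
    return rate >= max_rate and jitter <= max_jitter


tool = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]       # one request every 0.8 s
human = [0.0, 4.1, 19.5, 23.0, 61.2, 70.4]  # slow and irregular
```

With these sample series, `looks_automated(tool)` is true and `looks_automated(human)` is false. A real deployment would need thresholds derived from the site's own traffic.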
Experiment
Probe entry points to locate
vulnerabilities:
The attacker uses the entry points gathered in the
"Explore" phase as a target list and injects various
Unicode-encoded payloads to determine whether an entry
point actually represents a vulnerability caused by
insufficient validation logic and to characterize
the extent to which the vulnerability can be
exploited.
Attack Step Techniques
ID | Attack Step Technique Description | Environments
1 | Try to use Unicode encoding of content in scripts in order to bypass validation routines. | env-Web
2 | Try to use Unicode encoding of content in HTML in order to bypass validation routines. | env-Web
3 | Try to use Unicode encoding of content in CSS in order to bypass validation routines. | env-Web
Indicators
ID | Type | Indicator Description | Environments
1 | Positive | The application accepts user-controllable input. | env-Web
Outcomes
ID | Type | Outcome Description
1 | Success | The attacker's Unicode-encoded payload is processed and acted on by the application without filtering or transcoding.
2 | Failure | The application decodes the charset and filters the inputs.
Security Controls
ID | Type | Security Control Description
1 | Preventative | Implement input validation routines that filter or transcode Unicode content.
2 | Preventative | Specify the charset of the HTTP transaction/content.
3 | Detective | Monitor inputs to web servers. Alert on unusual charsets and/or characters.
4 | Preventative | Actively monitor the application and either deny or redirect requests from origins that appear to be attack attempts.
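The charset-related controls might be approximated by checking the charset declared in each request's Content-Type header. This is a sketch under assumptions: the set `ALLOWED_CHARSETS` stands in for a site policy that the source does not define, and the function names are illustrative. It relies on the standard library's `email.message.Message.get_content_charset()` to parse the header value.

```python
from email.message import Message

ALLOWED_CHARSETS = {"utf-8", "us-ascii", "iso-8859-1"}  # assumed site policy


def declared_charset(content_type):
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def charset_alert(content_type):
    """Return an alert string for a missing or unexpected charset, else None."""
    charset = declared_charset(content_type)
    if charset is None:
        return "no charset declared"
    if charset not in ALLOWED_CHARSETS:
        return "unexpected charset: " + charset
    return None
```

For example, `charset_alert("text/html; charset=utf-7")` returns an alert, while a UTF-8 declaration returns `None`.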
Attack Prerequisites
Filtering is performed on data that has not been properly
canonicalized.
Typical Likelihood of Exploit
Likelihood: Medium
Methods of Attack
Modification of Resources
API Abuse
Injection
Examples-Instances
Description
A very common technique for a Unicode attack involves traversing
directories looking for interesting files. An example of this idea
applied to the Web is
In this case, the attacker is attempting to traverse to a directory
that is not supposed to be part of standard Web services. The trick is
fairly obvious, so many Web servers and scripts prevent it. However,
using alternate encoding tricks, an attacker may be able to get around
badly implemented request filters.
In October 2000, a hacker publicly revealed that Microsoft's IIS
server suffered from a variation of this problem. In the case of IIS,
all the attacker had to do was provide alternate encodings for the dots
and/or slashes found in a classic attack. The Unicode translations
(overlong, non-shortest-form UTF-8 byte sequences, in hexadecimal) are
. yields C0 AE
/ yields C0 AF
\ yields C1 9C
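The byte pairs listed above are overlong two-byte UTF-8 forms of ASCII characters. The sketch below shows why they matter: a decoder that skips the shortest-form check maps them back to the original characters, while a conforming decoder rejects them. The function `lenient_two_byte_decode` is a deliberately flawed illustration written for this example; it is not the code of any particular server.

```python
OVERLONG = {".": b"\xc0\xae", "/": b"\xc0\xaf", "\\": b"\xc1\x9c"}


def lenient_two_byte_decode(pair):
    """Decode a 2-byte UTF-8 sequence WITHOUT the shortest-form check.

    This mimics a non-conforming decoder and exists only to show the flaw.
    """
    lead, cont = pair
    return chr(((lead & 0x1F) << 6) | (cont & 0x3F))


def strict_decode(data):
    """Decode with Python's conforming UTF-8 codec; return None if invalid."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


for char, pair in OVERLONG.items():
    assert lenient_two_byte_decode(pair) == char  # lenient decoder is fooled
    assert strict_decode(pair) is None            # strict decoder rejects it
```

A filter that inspects the raw bytes never sees a dot or slash in these sequences, so the check passes and a lenient decoder later restores the dangerous characters.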
Using this conversion, the previously displayed URL can be encoded
as
An attacker needs to understand Unicode encodings and have an idea of (or
be able to find out) which system components may not be Unicode-aware.
Indicators-Warnings of Attack
Unicode-encoded data is passed to APIs where it is not expected.
Solutions and Mitigations
Ensure that the system is Unicode-aware and can properly process Unicode
data. Do not assume that data will be in ASCII.
Ensure that filtering or input validation is applied to canonical
data.
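One way to apply filtering to canonical data is to decode the input fully before any check runs. The following sketch assumes the input is a percent-encoded URL path that should contain UTF-8 and that a single round of percent-decoding is the intended canonical form. The function names `canonicalize` and `is_safe_path` are illustrative, and the traversal check is only an example of a filter.

```python
import unicodedata
from urllib.parse import unquote_to_bytes


def canonicalize(raw_path):
    """Percent-decode once, strictly decode UTF-8, then apply NFKC.

    Raises ValueError for byte sequences that are not valid UTF-8, which
    includes overlong forms such as C0 AF.
    """
    data = unquote_to_bytes(raw_path)
    try:
        text = data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError("input is not valid UTF-8") from exc
    return unicodedata.normalize("NFKC", text)


def is_safe_path(raw_path):
    """Apply the filter to the canonical form, never to the raw request."""
    try:
        path = canonicalize(raw_path)
    except ValueError:
        return False
    segments = path.replace("\\", "/").split("/")
    return ".." not in segments
```

Rejecting invalid byte sequences outright, rather than repairing them, keeps the filter and every later component working on the same interpretation of the data. Doubly encoded input is outside the scope of this sketch.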
Assume all input is malicious. Create a white list that defines all valid
input to the software system based on the requirements specifications. Input
that does not match against the white list should not be permitted to enter
into the system.
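A white list can be as simple as a pattern that the whole input must match. The sketch below assumes a hypothetical requirement for file names; the pattern `FILENAME_ALLOWLIST` and the function `is_allowed_filename` are invented for illustration, and a real white list would come from the system's own requirements specification.

```python
import re

# Hypothetical requirement: 1 to 64 characters drawn from ASCII letters,
# digits, underscore, hyphen and dot, and the name must not start with a dot.
FILENAME_ALLOWLIST = re.compile(r"[A-Za-z0-9_-][A-Za-z0-9_.-]{0,63}")


def is_allowed_filename(name):
    """Accept the name only if the entire string matches the white list."""
    return isinstance(name, str) and FILENAME_ALLOWLIST.fullmatch(name) is not None
```

Because the pattern names the permitted characters explicitly, non-ASCII look-alikes and path separators are rejected without having to be enumerated.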
Copyright 2009, The MITRE Corporation. CAPEC and the CAPEC logo are trademarks of The MITRE Corporation.