An attacker supplies the target software with input data that contains
sequences of special characters designed to bypass input validation logic.
This exploit relies on the target making multiple passes over the input
data and processing a "layer" of special characters with each pass. In this
manner, the attacker can disguise input that would otherwise be rejected as
invalid by concealing it with layers of special/escape characters that are
stripped off by subsequent processing steps.
The goal is to first discover cases where the input validation layer
executes before one or more parsing layers. That is, user input may go
through the following logic in an application: <parser1> -->
<input validator> --> <parser2>. In such cases, the attacker
will need to provide input that will pass through the input validator, but
after passing through parser2, will be converted into something that the
input validator was supposed to stop.
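The ordering problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (validator, handle) are hypothetical, both parsing layers are assumed to be URL decoders, and the validator is assumed to reject any input containing "..".

```python
from urllib.parse import unquote

def validator(path: str) -> bool:
    """Hypothetical input validator: accept only input without '..'."""
    return ".." not in path

def handle(raw: str) -> str:
    # parser1: first URL-decoding pass (for example, done by a web framework)
    stage1 = unquote(raw)
    # input validator: runs on data that still carries one encoding layer
    if not validator(stage1):
        raise ValueError("rejected by input validator")
    # parser2: second decoding pass, performed after validation
    return unquote(stage1)

# The doubly encoded payload passes validation, yet parser2 yields '..'
print(handle("%252e%252e/etc/passwd"))
```

A singly encoded payload such as "%2e%2e/etc/passwd" is rejected, because parser1 already reveals the ".." before the validator runs. Only the extra layer that parser2 removes gets through.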
Attack Execution Flow
Explore
Determine application/system inputs where
bypassing input validation is
desired:
The attacker first needs to determine all of the
application's/system's inputs where input validation
is being performed and where he/she wants to bypass
it.
Attack Step Techniques
ID
Attack Step Technique Description
Environments
1
While using an application/system, the
attacker discovers an input where validation is
stopping him/her from performing some malicious or
unauthorized actions.
env-All
Indicators
ID
Type
Indicator Description
Environments
1
Positive
When provided with unexpected input, the
application provides an error message stating that
the input was invalid or that access was
denied.
env-All
Experiment
Determine which character encodings are
accepted by the application/system:
The attacker then needs to provide various
character encodings to the application/system and
determine which ones are accepted. The attacker will
need to observe the application's/system's response
to the encoded data to determine whether the data
was interpreted properly.
Attack Step Techniques
ID
Attack Step Technique Description
Environments
-1
Determine which escape characters are
accepted by the application/system. A common
escape character is the backslash character,
'\'
env-All
-1
Determine whether URL encoding is accepted
by the application/system.
env-All
-1
Determine whether UTF-8 encoding is accepted
by the application/system.
env-All
-1
Determine whether UTF-16 encoding is
accepted by the application/system.
env-All
-1
Determine if any other encodings are
accepted by the application/system.
env-All
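When testing one's own application against the techniques listed above, it can help to generate the candidate representations of a probe character systematically. The following is a minimal sketch. The helper name encodings_of is hypothetical, and the HTML numeric reference is included only as one example of "other encodings".

```python
def encodings_of(ch: str) -> dict:
    """Return several encoded forms of a single character (illustrative only)."""
    return {
        # backslash escape, as in the first technique
        "backslash": "\\" + ch,
        # URL (percent) encoding of the character's UTF-8 bytes
        "url": "".join(f"%{b:02X}" for b in ch.encode("utf-8")),
        # HTML numeric character reference (an assumed "other" encoding)
        "html_numeric": f"&#{ord(ch)};",
        # raw UTF-16 little-endian bytes, shown as hex
        "utf16_le_hex": ch.encode("utf-16-le").hex(),
    }

for name, value in encodings_of(".").items():
    print(name, value)
```

Each form can then be submitted to the input under test and the response compared against the response to the unencoded character.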
Indicators
ID
Type
Indicator Description
Environments
-1
Positive
The system provides an error message similar to the
one it provided when a positive indicator was
received for the first step.
env-All
Outcomes
ID
Type
Outcome Description
0
Success
Application/system accepts at
least one high-level character encoding in which a
character can be represented by multiple ASCII
characters.
0
Failure
Application/system interprets
each character separately.
Security Controls
ID
Type
Security Control Description
0
Detective
Detect and alert on the
appearance of encodings in log messages (e.g.
"Unsuccessful login by
&lt;joe")
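The detective control above could be approximated with a simple pattern match over log lines. This is a sketch under assumptions: the pattern set and the function name suspicious are illustrative, not a complete list of encodings, and a real deployment would need tuning to avoid false positives.

```python
import re

# Illustrative patterns for encoded data appearing in log messages
ENCODED = re.compile(
    r"%[0-9A-Fa-f]{2}"             # URL percent-encoding
    r"|&#x?[0-9A-Fa-f]+;"          # numeric HTML character references
    r"|&(?:lt|gt|amp|quot|apos);"  # common named HTML entities
    r"|\\{2,}"                     # runs of two or more backslashes
)

def suspicious(line: str) -> bool:
    """Return True if the log line contains something that looks encoded."""
    return ENCODED.search(line) is not None

for line in ["Unsuccessful login by joe", "Unsuccessful login by &lt;joe"]:
    print(suspicious(line), line)
```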
Combine multiple encodings accepted by
the application:
The attacker now combines encodings accepted by
the application. The attacker may combine different
encodings or apply the same encoding multiple
times.
Attack Step Techniques
ID
Attack Step Technique Description
Environments
-1
Combine the same encoding multiple times and
observe its effects. For example, if special
characters are encoded with a leading backslash,
then the following encoding may be accepted by the
application/system: "\\\.". With two parsing
layers, this may get converted to "\." after the
first parsing layer, and then, to "." after the
second. If the input validation layer is between
the two parsing layers, then "\\\.\\\." might pass
a test for ".." but still get converted to ".."
afterwards. This may enable directory traversal
attacks.
env-All
-1
Combine multiple encodings and observe the
effects. For example, the attacker might encode
"." as "\.", then encode "\." as HTML character
references, "&#92;&#46;", and then encode that using
URL encoding to "%26%2392%3B%26%2346%3B".
env-All
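Both techniques above can be traced step by step. The sketch below assumes a hypothetical parsing layer (strip_escapes) that removes one level of backslash escaping per pass and a hypothetical validator (rejects) that looks for "..". For the mixed case it relies on the fact that URL-decoding "%26%2392%3B%26%2346%3B" yields the HTML character references for a backslash and a dot.

```python
import html
import re
from urllib.parse import unquote

def strip_escapes(s: str) -> str:
    """Hypothetical parsing layer: remove one level of backslash escaping."""
    return re.sub(r"\\(.)", r"\1", s)

def rejects(s: str) -> bool:
    """Hypothetical validator: flag any input containing '..'."""
    return ".." in s

# Same encoding applied repeatedly
payload = r"\\\.\\\."
after_parser1 = strip_escapes(payload)          # first layer removed
passed_validation = not rejects(after_parser1)  # validator sits between the layers
after_parser2 = strip_escapes(after_parser1)    # second layer removed

# Different encodings stacked: URL encoding over HTML references over a backslash escape
mixed = "%26%2392%3B%26%2346%3B"
step1 = unquote(mixed)         # URL decoding
step2 = html.unescape(step1)   # HTML character references decoded
step3 = strip_escapes(step2)   # backslash escape removed

print(after_parser1, passed_validation, after_parser2)
print(step1, step2, step3)
```

In the first trace the validator sees a string with no adjacent dots, so the check for ".." passes, and the second parsing layer then produces "..".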
Indicators
ID
Type
Indicator Description
Environments
-1
Positive
Application/System interprets the multiple
encodings properly.
env-All
Outcomes
ID
Type
Outcome Description
0
Success
Attacker bypasses input
validation layer(s) and passes data to application
that it does not expect.
Security Controls
ID
Type
Security Control Description
0
Preventative
Ensure that the input
validation layer is executed after as many parsing
layers as possible.
0
Preventative
Determine the details
of any parsing layers that get executed after the
input validation layer (this may be necessary in
the case of filesystem access, for example, where
the operating system also includes a parsing
layer), and ensure that the input validator
accounts for the various encodings of illegal
characters and character sequences in those
layers.
Exploit
Leverage ability to bypass input
validation:
The attacker leverages the ability to bypass input
validation to gain unauthorized access to the system.
There are many attacks possible, and a few examples
are mentioned here.
Attack Step Techniques
ID
Attack Step Technique Description
Environments
-1
Gain access to sensitive files.
env-All
-1
Perform command injection.
env-All
-1
Perform SQL injection.
env-All
-1
Perform XSS attacks.
env-All
Indicators
ID
Type
Indicator Description
Environments
-1
Positive
Success outcome in previous step
env-All
-1
Negative
Failure outcome in previous step
env-All
Outcomes
ID
Type
Outcome Description
0
Success
Gaining unauthorized access to
system functionality.
Attack Prerequisites
User input is used to construct a command to be executed on the target
system or is used as part of a file name.
Multiple parser passes are performed on the data supplied by the
user.
Typical Likelihood of Exploit
Likelihood: Medium
Methods of Attack
Injection
Modification of Resources
Examples-Instances
Description
The backslash character provides a good example of the multiple-parser
issue. A backslash is used to escape characters in strings, but is also
used to delimit directories on the NT file system. When performing a
command injection that includes NT paths, there is usually a need to
"double escape" the backslash. In some cases, a quadruple escape is
necessary.
Original String: C:\\\\winnt\\\\system32\\\\cmd.exe /c
<parsing layer>
Interim String: C:\\winnt\\system32\\cmd.exe /c
<parsing layer>
Final String: C:\winnt\system32\cmd.exe /c
This diagram shows each successive layer of parsing translating the
backslash character. A double backslash becomes a single as it is
parsed. By using quadruple backslashes, the attacker is able to control
the result in the final string.
From G. Hoglund and G. McGraw. Exploiting Software: How to Break Code.
Addison-Wesley, February 2004.
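The diagram can be reproduced with a short sketch. It assumes each parsing layer does nothing more than collapse every doubled backslash into a single one. Real parsers handle other escape sequences as well, and the function name parsing_layer is hypothetical.

```python
def parsing_layer(s: str) -> str:
    """Assumed parsing layer: collapse each doubled backslash into one."""
    return s.replace("\\\\", "\\")

original = r"C:\\\\winnt\\\\system32\\\\cmd.exe /c"
interim = parsing_layer(original)  # quadruple backslashes become double
final = parsing_layer(interim)     # double backslashes become single

print("Original String:", original)
print("Interim String: ", interim)
print("Final String:   ", final)
```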
Attacker Skills or Knowledge Required
Skill or Knowledge Level: Medium
Probing Techniques
Description
Initially, a fuzzer can be used to see what the application is
successfully escaping and what causes problems. This may be a good
starting point.
Description
Manually try to introduce multiple layers of control characters and
see how many layers the application can escape.
Indicators-Warnings of Attack
Description
Control characters are being detected by the filters
repeatedly.
Solutions and Mitigations
An iterative approach to input validation may be required to ensure that
no dangerous characters are present. It may be necessary to implement
redundant checking across different input validation layers. Ensure that
invalid data is rejected as soon as possible and do not continue to work
with it.
Make sure to perform input validation on canonicalized data (i.e. data
in its most standard form). This will help prevent tricky
encodings from getting past the filters.
Assume all input is malicious. Create a white list that defines all valid
input to the software system based on the requirements specifications. Input
that does not match against the white list should not be permitted to enter
into the system.
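One way to combine the mitigations above is to decode the input repeatedly until it stops changing, reject it if that takes too many passes, and only then match the canonical form against a white list. The sketch below is illustrative: the names canonicalize and validate, the limit MAX_PASSES, the choice of URL and HTML decoding, and the file-name pattern ALLOWED are all assumptions rather than requirements from the text.

```python
import html
import re
from urllib.parse import unquote

MAX_PASSES = 5  # assumed bound on tolerated encoding layers
# Assumed white list: a simple file name with an optional extension
ALLOWED = re.compile(r"[A-Za-z0-9_\-]+(\.[A-Za-z0-9]+)?")

def canonicalize(value: str) -> str:
    """Decode until a fixed point is reached; fail if it is not reached in time."""
    for _ in range(MAX_PASSES):
        decoded = html.unescape(unquote(value))
        if decoded == value:
            return value
        value = decoded
    raise ValueError("too many encoding layers")

def validate(raw: str) -> str:
    """Return the canonical form if it matches the white list, else raise."""
    canonical = canonicalize(raw)
    if not ALLOWED.fullmatch(canonical):
        raise ValueError("input does not match the white list")
    return canonical

print(validate("report%2Etxt"))
```

Downstream code should use the value returned by validate and should not decode it again. Any parsing layer that still runs after validation has to be accounted for separately.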