CAPEC-3: Using Leading 'Ghost' Character Sequences to Bypass Input Filters
Attack Pattern ID: 3
Some APIs strip certain leading characters from a string of parameters. An adversary can intentionally introduce leading "ghost" characters (extra characters that do not affect the validity of the request at the API layer) that allow the input to pass the filters, so that the API goes on to process the adversary's input. This occurs when the targeted API accepts input data in several syntactic forms and interprets them as semantically equivalent, while the filter does not account for the full spectrum of syntactic forms acceptable to the targeted API.
Likelihood Of Attack
The table below shows the other attack patterns and high level categories that are related to this attack pattern. These relationships are defined as ChildOf and ParentOf, and give insight to similar items that may exist at higher and lower levels of abstraction. In addition, relationships such as CanFollow, PeerOf, and CanAlsoBe are defined to show similar attack patterns that the user may want to explore.
Standard Attack Pattern - A standard level attack pattern in CAPEC is focused on a specific methodology or technique used in an attack. It is often seen as a singular piece of a fully executed attack. A standard attack pattern is meant to provide sufficient details to understand the specific technique and how it attempts to accomplish a desired goal. A standard level attack pattern is a specific type of a more abstract meta level attack pattern.
Determine if the source code is available and if so, examine the filter logic.
If the source code is not available, write a small program that loops through various possible inputs to a given API call and tries a variety of alternate (but equivalent) encodings of strings with leading ghost characters. Knowledge of the frameworks and libraries used, and of the filters they apply, will help to make this search more structured.
Observe the effects. See if the probes are getting past the filters. Identify a string that is semantically equivalent to the one the adversary wants to pass to the targeted API, but syntactically structured in such a way that it gets past the input filter. That encoding will contain ghost characters that help it get past the filters and that are then ignored by the targeted API.
Once the "winning" alternate encoding using (typically leading) ghost characters is identified, an adversary can launch attacks against the targeted API (e.g., directory traversal, arbitrary shell command execution, or corruption of files).
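From a defender's point of view, the explore-and-experiment steps above can be sketched as a small in-process test harness for auditing one's own filter. The example below is a minimal illustration, not part of the CAPEC entry: `find_bypasses`, `toy_filter`, and `toy_api` are hypothetical names, and the two toy functions are stand-ins for a real filter and a real target API whose normalization behavior would have to be known or measured.

```python
from itertools import product
from typing import Callable, List, Sequence


def find_bypasses(
    target: str,
    ghost_chars: Sequence[str],
    filter_allows: Callable[[str], bool],
    api_interpret: Callable[[str], str],
    max_len: int = 2,
) -> List[str]:
    """Return candidates that pass the filter yet are interpreted as `target`.

    All parameters are assumptions of this sketch: `filter_allows` models the
    input filter and `api_interpret` models how the target API normalizes input.
    """
    hits = []
    for n in range(1, max_len + 1):
        for combo in product(ghost_chars, repeat=n):
            candidate = "".join(combo) + target
            # A bypass exists when the filter accepts a string that the API
            # treats as the forbidden target.
            if filter_allows(candidate) and api_interpret(candidate) == target:
                hits.append(candidate)
    return hits


def toy_filter(path: str) -> bool:
    """Hypothetical filter: rejects only an exact '..' path component."""
    return ".." not in path.split("/")


def toy_api(path: str) -> str:
    """Hypothetical API: drops leading spaces and collapses 3+ leading dots to '..'."""
    stripped = path.lstrip(" ")
    dots = len(stripped) - len(stripped.lstrip("."))
    if dots > 2:
        stripped = ".." + stripped[dots:]
    return stripped


if __name__ == "__main__":
    for hit in find_bypasses("../test.txt", [".", " "], toy_filter, toy_api):
        print(repr(hit))
```

Any candidate the harness reports indicates a mismatch between what the filter rejects and what the API accepts, which is the condition this attack pattern depends on.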
The targeted API must ignore the leading ghost characters that are used to get past the filters for the semantics to be the same.
The ability to make an API request, and knowledge of "ghost" characters that will not be filtered by any input validation. These "ghost" characters must be known to not affect the way in which the request will be interpreted.
The table below specifies different individual consequences associated with the attack pattern. The Scope identifies the security property that is violated, while the Impact describes the negative technical impact that arises if an adversary succeeds in their attack. The Likelihood provides information about how likely the specific consequence is expected to be seen relative to the other consequences in the list. For example, there may be high likelihood that a pattern will be used to achieve a certain impact, but a low likelihood that it will be exploited to achieve a different impact.
Perform white list rather than black list input validation.
Canonicalize all data prior to validation.
Take an iterative approach to input validation (defense in depth).
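The three mitigations above can be combined in one validation routine. The following is a minimal sketch under stated assumptions: `BASE_DIR`, `ALLOWED_NAME`, and `safe_resolve` are hypothetical names, paths are POSIX-style, and `posixpath.normpath` is assumed to be a faithful model of how the consuming API canonicalizes paths (it does not resolve symbolic links). The path is canonicalized first, the canonical form is validated, and the canonical form is what gets returned for downstream use.

```python
import posixpath
import re

BASE_DIR = "/srv/files"  # hypothetical root that all requests must stay within

# Allowlist for a single path component: letters, digits, '_' and '-',
# optionally followed by one extension. Anything else is rejected.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def safe_resolve(user_path: str) -> str:
    """Canonicalize first, then validate the canonical form (hypothetical helper)."""
    # Step 1: canonicalize before any validation takes place.
    canonical = posixpath.normpath(posixpath.join(BASE_DIR, user_path))

    # Step 2: the canonical path must remain inside BASE_DIR.
    if posixpath.commonpath([BASE_DIR, canonical]) != BASE_DIR:
        raise ValueError("path escapes the base directory")

    # Step 3 (defense in depth): allowlist every remaining component, so
    # odd leftovers such as '...' are rejected rather than passed along.
    relative = posixpath.relpath(canonical, BASE_DIR)
    for part in relative.split("/"):
        if not ALLOWED_NAME.fullmatch(part):
            raise ValueError(f"disallowed path component: {part!r}")

    # Return the canonical form so later stages never see the raw input.
    return canonical
```

Returning the canonical path, rather than the original string, matters here: if later stages re-read the raw input, they may normalize it differently from the validator and reopen the gap.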
Alternate Encoding with Ghost Characters in FTP and Web Servers
Some web and FTP servers fail to detect prohibited upward directory traversals if the user-supplied pathname contains extra characters such as an extra leading dot. For example, a program that disallows access to the pathname "../test.txt" may erroneously allow access to that file if the pathname is specified as ".../test.txt". This attack succeeds because 1) the input validation logic fails to detect the triple-dot as a directory traversal attempt (since it is not dot-dot), and 2) some later part of the input processing strips off the "extra" dot, leaving the dot-dot behind.
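The two conditions in this example can be reproduced in a few lines. This is an illustrative sketch only: `naive_filter` and `strip_extra_dot` are hypothetical stand-ins for the validation logic and for the later processing stage, not code from any particular server.

```python
def naive_filter(path: str) -> bool:
    """Hypothetical validator: allow the path unless a component is exactly '..'."""
    return ".." not in path.split("/")


def strip_extra_dot(path: str) -> str:
    """Hypothetical later stage that drops one 'extra' leading dot from '...'."""
    if path.startswith("..."):
        return path[1:]
    return path


if __name__ == "__main__":
    for requested in ("../test.txt", ".../test.txt"):
        if naive_filter(requested):
            # The filter saw no '..' component, but the next stage rewrites the path.
            print(f"{requested!r} allowed, processed as {strip_extra_dot(requested)!r}")
        else:
            print(f"{requested!r} rejected by filter")
```

The filter and the later stage disagree about what ".../test.txt" means, so the check runs on one string and the file access happens on another.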
Using the file system API as the target, the following strings are all treated as equivalent by many programs:
As you can see, there are many ways to make a semantically equivalent request. All these strings ultimately result in a request for the file ../test.txt.
A Related Weakness relationship associates a weakness with this attack pattern. Each association implies a weakness that must exist for a given attack to be successful. If multiple weaknesses are associated with the attack pattern, then any of the weaknesses (but not necessarily all) may be present for the attack to be successful. Each related weakness is identified by a CWE identifier.