CAPEC-3: Using Leading 'Ghost' Character Sequences to Bypass Input Filters
Attack Pattern ID: 3
Some APIs will strip certain leading characters from a string of parameters. An adversary can intentionally introduce leading "ghost" characters (extra characters that don't affect the validity of the request at the API layer) that enable the input to pass the filters and therefore process the adversary's input. This occurs when the targeted API will accept input data in several syntactic forms and interpret it in the equivalent semantic way, while the filter does not take into account the full spectrum of the syntactic forms acceptable to the targeted API.
Likelihood Of Attack
This table shows the other attack patterns and high level categories that are related to this attack pattern. These relationships are defined as ChildOf and ParentOf, and give insight to similar items that may exist at higher and lower levels of abstraction. In addition, relationships such as CanFollow, PeerOf, and CanAlsoBe are defined to show similar attack patterns that the user may want to explore.
Standard Attack Pattern - A standard level attack pattern in CAPEC is focused on a specific methodology or technique used in an attack. It is often seen as a singular piece of a fully executed attack. A standard attack pattern is meant to provide sufficient details to understand the specific technique and how it attempts to accomplish a desired goal. A standard level attack pattern is a specific type of a more abstract meta level attack pattern.
Survey the application for user-controllable inputs: Using a browser, an automated tool or by inspecting the application, an adversary records all entry points to the application.
Use a spidering tool to follow and record all links and analyze the web pages to find entry points. Make special note of any links that include parameters in the URL.
Use a proxy tool to record all user input entry points visited during a manual traversal of the web application.
Use a browser to manually explore the website and analyze how it is constructed. Many browsers' plugins are available to facilitate the analysis or automate the discovery.
Manually inspect the application to find entry points.
Probe entry points to locate vulnerabilities: The adversary uses the entry points gathered in the "Explore" phase as a target list and injects various leading 'Ghost' character sequences to determine how to application filters them.
Add additional characters to common sequences such as "../" to see how the application will filter them.
Try repeating special characters (?, @, #, *, etc.) at the beginning of user input to see how the application filters these out.
Bypass input filtering: Using what the adversary learned about how the application filters input data, they craft specific input data that bypasses the filter. This can lead to directory traversal attacks, arbitrary shell command execution, corruption of files, etc.
The targeted API must ignore the leading ghost characters that are used to get past the filters for the semantics to be the same.
The ability to make an API request, and knowledge of "ghost" characters that will not be filtered by any input validation. These "ghost" characters must be known to not affect the way in which the request will be interpreted.
This table specifies different individual consequences associated with the attack pattern. The Scope identifies the security property that is violated, while the Impact describes the negative technical impact that arises if an adversary succeeds in their attack. The Likelihood provides information about how likely the specific consequence is expected to be seen relative to the other consequences in the list. For example, there may be high likelihood that a pattern will be used to achieve a certain impact, but a low likelihood that it will be exploited to achieve a different impact.
Use an allowlist rather than a denylist input validation.
Canonicalize all data prior to validation.
Take an iterative approach to input validation (defense in depth).
Alternate Encoding with Ghost Characters in FTP and Web Servers
Some web and FTP servers fail to detect prohibited upward directory traversals if the user-supplied pathname contains extra characters such as an extra leading dot. For example, a program that will disallow access to the pathname "../test.txt" may erroneously allow access to that file if the pathname is specified as ".../test.txt". This attack succeeds because 1) the input validation logic fails to detect the triple-dot as a directory traversal attempt (since it isn't dot-dot), 2) some part of the input processing decided to strip off the "extra" dot, leaving the dot-dot behind.
Using the file system API as the target, the following strings are all equivalent to many programs:
As you can see, there are many ways to make a semantically equivalent request. All these strings ultimately result in a request for the file ../test.txt.
A Related Weakness relationship associates a weakness with this attack pattern. Each association implies a weakness that must exist for a given attack to be successful. If multiple weaknesses are associated with the attack pattern, then any of the weaknesses (but not necessarily all) may be present for the attack to be successful. Each related weakness is identified by a CWE identifier.