Individual CAPEC Dictionary Definition (Release 1.1)
| Attack Pattern ID | Pattern Abstraction: Detailed 3 | | Typical Severity | Medium | | Description | Summary An attacker intentionally introduces leading characters that let the input get past the filters. The targeted API ignores the leading "ghost" characters and therefore processes the attacker's input. This occurs when the targeted API accepts input data in several syntactic forms and interprets them as semantically equivalent, while the filter does not account for the full spectrum of syntactic forms acceptable to the targeted API.
Some APIs will strip certain leading characters from a string of parameters. Perhaps these characters are considered redundant, and for this reason they are removed. Another possibility is the parser logic at the beginning of analysis is specialized in some way that causes some characters to be removed. The attacker can specify multiple types of alternative encodings at the beginning of a string as a set of probes.
One commonly used possibility involves adding ghost characters—extra characters that don’t affect the validity of the request at the API layer. If the attacker has access to the API libraries being targeted, certain attack ideas can be tested directly in advance. Once alternative ghost encodings emerge through testing, the attacker can move from lab-based API testing to testing real-world service implementations.
Attack Execution Flow
Determine if the source code is available and if so, examine the filter logic.
If the source code is not available, write a small program that loops through various possible inputs to a given API call and tries a variety of alternate (but equivalent) encodings of strings with leading ghost characters. Knowledge of the frameworks and libraries in use, and of the filters they apply, helps make this search more structured.
Observe the effects. See if the probes are getting past the filters. Identify a string that is semantically equivalent to that which an attacker wants to pass to the targeted API, but syntactically structured in a way as to get past the input filter. That encoding will contain certain ghost characters that will help it get past the filters. These ghost characters will be ignored by the targeted API.
Once the "winning" alternate encoding using (typically leading) ghost characters is identified, an attacker can launch attacks against the targeted API (e.g. directory traversal, arbitrary shell command execution, corruption of files).
| | Attack Prerequisites |
The targeted API must ignore the leading ghost characters that are used to get past the filters, so that the semantics of the request stay the same.
| | Typical Likelihood of Exploit |
Medium
| | Methods of Attack | | | Examples-Instances | Description Alternate Encoding with Ghost Characters in FTP and Web Servers
Some web and FTP servers fail to detect prohibited upward directory traversals if the user-supplied pathname contains extra characters such as an extra leading dot. For example, a program that disallows access to the pathname “../test.txt” may erroneously allow access to that file if the pathname is specified as “.../test.txt”. This attack succeeds because 1) the input validation logic fails to detect the triple-dot as a directory traversal attempt (since it isn’t dot-dot), and 2) some part of the input processing strips off the “extra” dot, leaving the dot-dot behind.
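The mismatch between the two layers can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the behavior of any particular server: `naive_filter_allows` is a hypothetical blacklist that rejects only components that are exactly dot-dot, and `lenient_api_normalize` is a hypothetical API layer that treats any run of three or more dots in a component as dot-dot.

```cpp
#include <cstddef>
#include <string>

// Hypothetical blacklist filter (an assumption for illustration):
// rejects the path only if some component is exactly "..".
bool naive_filter_allows(const std::string& path) {
    std::size_t start = 0;
    while (start <= path.size()) {
        std::size_t end = path.find('/', start);
        if (end == std::string::npos) end = path.size();
        if (path.compare(start, end - start, "..") == 0) return false;
        start = end + 1;
    }
    return true;
}

// Hypothetical lenient API layer (an assumption for illustration):
// a component made of three or more dots is treated as "..".
std::string lenient_api_normalize(const std::string& path) {
    std::string out;
    std::size_t start = 0;
    while (start <= path.size()) {
        std::size_t end = path.find('/', start);
        if (end == std::string::npos) end = path.size();
        const std::string part = path.substr(start, end - start);
        const bool extra_dots =
            part.size() > 2 && part.find_first_not_of('.') == std::string::npos;
        out += extra_dots ? std::string("..") : part;
        if (end < path.size()) out += '/';
        start = end + 1;
    }
    return out;
}
```

With these two functions, the dot-dot form is rejected by the filter, while the triple-dot form is accepted and then rewritten by the API layer into the very path the filter was meant to block.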
Using the file system API as the target, the following strings are all equivalent to many programs:
.../../../test.txt
............/../../test.txt
..?/../../test.txt
..????????/../../test.txt
../test.txt
As you can see, there are many ways to make a semantically equivalent request. All these strings ultimately result in a request for the file ../test.txt. Related Vulnerability | | Attacker Skill or Knowledge Required |
Medium
| | Resources Required | | | Solutions and Mitigations |
Perform white list rather than black list input validation.
Canonicalize all data prior to validation.
Take an iterative approach to input validation (defense in depth).
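The first two mitigations can be combined as "canonicalize, then whitelist". The sketch below is one possible shape for that check, assuming '/'-separated relative paths and a deliberately strict character whitelist; the function names and the whitelist itself are illustrative choices, and a real deployment would need a canonical form that matches what its own target API accepts.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Step 1: canonicalize. Split on '/', drop empty and "." components, and
// resolve "name/..". Returns false for absolute paths, empty results, and
// paths that would climb above their starting directory.
bool canonicalize(const std::string& input, std::vector<std::string>& parts) {
    parts.clear();
    if (input.empty() || input[0] == '/') return false;
    std::size_t start = 0;
    while (start <= input.size()) {
        std::size_t end = input.find('/', start);
        if (end == std::string::npos) end = input.size();
        const std::string part = input.substr(start, end - start);
        if (part == "..") {
            if (parts.empty()) return false;  // would escape the base directory
            parts.pop_back();
        } else if (!part.empty() && part != ".") {
            parts.push_back(part);
        }
        start = end + 1;
    }
    return !parts.empty();
}

// Step 2: whitelist. Only letters, digits, '_', '-' and '.' are allowed, and
// a component may not start with a dot (this also rejects "..." and "..?").
bool component_is_whitelisted(const std::string& name) {
    if (name.empty() || name[0] == '.') return false;
    for (unsigned char ch : name) {
        if (!std::isalnum(ch) && ch != '_' && ch != '-' && ch != '.') return false;
    }
    return true;
}

// Validation runs only on the canonical form, never on the raw input.
bool is_safe_relative_path(const std::string& user_input) {
    std::vector<std::string> parts;
    if (!canonicalize(user_input, parts)) return false;
    for (const std::string& part : parts) {
        if (!component_is_whitelisted(part)) return false;
    }
    return true;
}
```

Because the whitelist names what is allowed instead of what is forbidden, ghost-character variants such as the triple-dot or the dot-dot-question-mark forms are rejected without having to be anticipated one by one.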
| | Attack Motivation-Consequences | - Privilege Escalation
- Data Modification
| | Context Description | Building “Equivalent” Requests
A large number of commands are subject to parsing or filtering. In many cases a filter only considers one particular way to format a command. The fact is that the same command can usually be encoded in thousands of different ways. In many cases, an alternative encoding for the command will produce exactly the same results as the original command. Thus, two commands that look different from the logical perspective of a filter end up producing the same semantic result. In many cases, an alternatively encoded command can be used to attack a software system, because the alternative command allows an attacker to perform an operation that would otherwise be blocked.
Mapping the API Layer
A good approach to help identify and map possible alternate encodings involves writing a small program that loops through all possible inputs to a given API call. This program can, for example, attempt to encode filenames in a variety of ways. For each iteration of the loop, the “mungified” filename can be passed to the API call and the result noted.
The following code snippet loops through many possible values that can be used as a prefix to the string \test.txt. Results of running a program like this can help us to determine which characters can be used to perform a ../../ (dots and slashes) relative traversal attack.
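#include <cstdint>
#include <cstdio>

int main(int argc, char* argv[])
{
    // Try each 4-byte prefix in front of \test.txt and report every path that opens.
    for (std::uint32_t c = 0x01010101; c != 0xFFFFFFFF; c++) {
        char _filepath[255];
        std::snprintf(_filepath, sizeof(_filepath), "%c%c%c%c\\test.txt",
                      (int)((c >> 24) & 0xFF), (int)((c >> 16) & 0xFF),
                      (int)((c >> 8) & 0xFF), (int)(c & 0xFF));

        try {
            FILE *in_file = fopen(_filepath, "r");
            if (in_file) {
                printf("checking path %s\n", _filepath);
                puts("file opened!");
                getchar();  // pause so the hit can be inspected
                fclose(in_file);
            }
        } catch (...) {
            // ignore any failure and keep probing
        }
    }
    return 0;
}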
Slight (but still automatic) modifications can be made to the string in creative ways. Ultimately, the modified string boils down to an attempt to use different tricks to obtain the same file. For example, one resulting attempt might try a command like this:
sprintf(_filepath, "..%c\\..%c\\..%c\\..%c\\scans2.txt", c, c, c, c);
A good way to think about this problem is to think of layers. The API call layer is what the examples shown here are mapping. If an engineer has placed any filters in front of the API call, then these filters can be considered additional layers, wrapping the original set of possibilities. By pondering all the possible inputs that can be provided at the API layer, we can begin uncovering and exercising any filters that the software has in place. If we know that the software definitely uses file API calls, we can try all kinds of filename encoding tricks that we know about. If we get lucky, eventually one set of encoding tricks will work, and we can get our data successfully through the filters and into the API call.
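The layered view can be modeled with two small functions and a probe loop. Everything here is a hypothetical stand-in chosen for illustration: `filter_layer_allows` blocks only paths that begin with dot-dot-slash, `api_layer_resolve` silently drops every '?' character, and `find_ghost_characters` tries each byte value in the position right after the leading dot-dot, in the spirit of the `sprintf` pattern shown earlier.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical filter layer: blocks only paths that begin with "../".
bool filter_layer_allows(const std::string& path) {
    return path.compare(0, 3, "../") != 0;
}

// Hypothetical API layer: silently drops every '?' before using the path.
std::string api_layer_resolve(std::string path) {
    path.erase(std::remove(path.begin(), path.end(), '?'), path.end());
    return path;
}

// Probe loop: insert each byte value after the leading ".." and keep the
// values that pass the filter yet still resolve to the blocked request.
std::vector<char> find_ghost_characters(const std::string& file_name) {
    const std::string blocked = "../" + file_name;
    std::vector<char> ghosts;
    for (int value = 1; value < 256; ++value) {
        const char c = static_cast<char>(value);
        const std::string candidate = std::string("..") + c + "/" + file_name;
        if (filter_layer_allows(candidate) && api_layer_resolve(candidate) == blocked) {
            ghosts.push_back(c);
        }
    }
    return ghosts;
}
```

Under these assumptions the only byte that survives both layers is '?'. Against a real target the set of ghost characters depends entirely on what its filter and API actually do, which is why the probing has to be done empirically.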
Drawing on the techniques described in Chapter 5 of "Exploiting Software: How to Break Code" (see the reference to G. Hoglund and G. McGraw), we can list a number of possible escape codes that can be injected into API calls (many of which help with the filter avoidance problem). If the data are eventually being piped into a shell, for example, we might be able to get control codes to take effect. A particular call may write data to a file or a stream that are eventually meant to be viewed on a terminal or in a client program. As a simple example, the following string contains two backspace characters that are very likely to take effect when the terminal displays it:
write("echo hey!\x08\x08");
When the terminal interprets the data we have passed in, the output will be missing the last two characters of the original string. This kind of trick has been used for ages to corrupt data in log files. Log files capture all kinds of data about a transaction. It may be possible to insert NULL characters (for example, %00 or '\0') or to add so many extra characters to the string that the request is truncated in the log. Imagine a request that has more than a thousand extra characters tacked on at the end. Ultimately, the string may be trimmed in the log file, and the important telltale data that expose an attack will be lost.
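The two log-truncation effects mentioned above can be modeled as follows. This is a sketch under assumptions: `logged_via_c_string` stands in for a logger that passes the request through a C-string API, and `logged_with_limit` stands in for a logger with a fixed-width field, where `max_len` is a hypothetical limit.

```cpp
#include <cstddef>
#include <string>

// Models a logger that copies the request through a C-string API:
// everything after the first NUL byte is silently dropped.
std::string logged_via_c_string(const std::string& request) {
    return std::string(request.c_str());
}

// Models a logger with a fixed-width field; max_len is a hypothetical limit.
std::string logged_with_limit(const std::string& request, std::size_t max_len) {
    return request.substr(0, max_len);
}
```

For example, a request built as `std::string("GET /index.html") + '\0' + "/../../secret.txt"` would be recorded by the first function as only the part before the NUL byte, and a request padded with a thousand filler characters would lose everything beyond `max_len` in the second.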
Ghost Characters
Ghost characters are extra characters that can be added to a request. The extra characters are designed not to affect the validity of the request. One easy example involves adding extra slashes to a filename. In many cases, the strings
/some/directory/test.txt
and
/////////////////some/////////////directory//////////////test.txt
are equivalent requests.
From G. Hoglund and G. McGraw. Exploiting Software: How to Break Code. Addison-Wesley, February 2004.
| | Injection Vector |
Web Form, URL, Network Socket, File
| | Payload |
The payload is the parameter that an attacker supplies to the targeted API and that allows the attacker to elevate privilege and subvert the authorization service.
| | Activation Zone |
The targeted API is the activation zone. These attacks often target the file system or the shell to execute commands.
| | Payload Activation Impact |
Failure in authorization service may lead to compromises in data confidentiality and integrity.
| | Related Weaknesses |
| CWE-ID | Weakness Name | Weakness Relationship Type |
|---|---|---|
| 173 | Failure to Handle Alternate Encoding | Targeted |
| 41 | Failure to Resolve Path Equivalence | Targeted |
| 172 | Encoding Error | Targeted |
| 171 | Cleansing, Canonicalization, and Comparison Errors | Targeted |
| 179 | Incorrect Behavior Order: Early Validation | Targeted |
| 180 | Incorrect Behavior Order: Validate Before Canonicalize | Targeted |
| 181 | Incorrect Behavior Order: Validate Before Filter | Secondary |
| 183 | Permissive Whitelist | Secondary |
| 184 | Incomplete Blacklist | Secondary |
| 20 | Insufficient Input Validation | Targeted |
| 74 | Failure to Sanitize Data into a Different Plane (aka 'Injection') | Targeted |
| | Related Security Principles | - Defense in Depth
- Reluctance to Trust
- Least Privilege
| | Related Guidelines | - Perform input validation and filtering on data in its canonical form.
- Understand the APIs to which user input will be passed and know how permissive they are. Perform appropriate input validation given that information.
| | Purpose | Exploitation | | CIA Impact |
| Confidentiality Impact | Integrity Impact | Availability Impact |
|---|---|---|
| Low | Low | High |
| | Technical Context |
| Architectural Paradigm | Framework | Platform | Language |
|---|---|---|---|
| All | All | All | All |
| | References | G. Hoglund and G. McGraw. Exploiting Software: How to Break Code. Addison-Wesley, February 2004.
| | Source |
| Submission(s) |
| Submitter | Organization | Date | Comment |
|---|---|---|---|
| G. Hoglund and G. McGraw. Exploiting Software: How to Break Code. Addison-Wesley, February 2004. | Cigital, Inc | 2007-03-01 | |
| Modification(s) |
| Modifier | Organization | Date | Comment |
|---|---|---|---|
| Eugene Lebanidze | Cigital, Inc | 2007-02-26 | Fleshed out content to CAPEC schema from the original descriptions in "Exploiting Software" |
| Sean Barnum | Cigital, Inc | 2007-03-05 | Review and revise |
| Richard Struse | VOXEM, Inc | 2007-03-26 | Review and feedback leading to changes in Name, Attack Execution Flow and Examples |
| Sean Barnum | Cigital, Inc | 2007-04-13 | Modified pattern content according to review and feedback |
|
|