CAPEC

Common Attack Pattern Enumeration and Classification

A Community Knowledge Resource for Building Secure Software

Individual CAPEC Dictionary Definition (Release 1.1)

XPath Injection

Attack Pattern ID: 83
Pattern Abstraction: Detailed

Typical Severity

High

Description

Summary

An attacker can craft special user-controllable input consisting of XPath expressions, inject it into queries against an XML database, and bypass authentication or glean information that would not normally be accessible. XPath Injection enables an attacker to talk directly to the XML database, thus bypassing the application completely. XPath Injection results from the failure of an application to properly sanitize input used as part of dynamic XPath expressions that query an XML database. To successfully inject XPath and retrieve information from the database, an attacker:

Attack Execution Flow

  1. Determines the user-controllable input that is used without proper validation as part of XPath queries

  2. Determines the structure of queries that accept such input

  3. Crafts malicious content containing XPath expressions that is not validated by the application and is executed as part of the XPath queries

Attack Prerequisites

XPath queries used to retrieve information stored in XML documents

User-controllable input not properly sanitized before being used as part of XPath queries

Typical Likelihood of Exploit

High

Methods of Attack
  • Injection
Examples-Instances

Description

Consider an application that uses an XML database to authenticate its users. The application retrieves the user name and password from a request and forms an XPath expression to query the database. An attacker can bypass authentication and log in without valid credentials through XPath Injection, by supplying input containing XPath syntax that makes the authentication predicate evaluate to true regardless of the credentials supplied. Improper validation of user-controllable input and use of a non-parameterized XPath expression enable the attacker to inject an XPath expression that causes authentication bypass.

Attacker Skill or Knowledge Required

Low - XPath Injection shares the same basic premises with SQL Injection. An attacker must have knowledge of XPath syntax and constructs in order to successfully leverage XPath Injection.

Resources Required

None

Probing Techniques

The attacker tries to inject characters that can cause an XPath error, such as single-quote ('), or content that may cause a malformed XPath expression. If the injection of such content into the input causes an XPath error and the resulting error is displayed unfiltered, the attacker can begin to determine the nature of input validation and structure of XPath expressions used in queries.
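
The probing step can be illustrated with a minimal Java sketch. It assumes the application builds its query by string concatenation; the class and method names (XPathProbeDemo, buildQuery, causesXPathError) are hypothetical and exist only for this illustration. A lone single quote leaves the string literal unterminated, so compiling the expression raises an XPathExpressionException, which is the kind of error an attacker looks for.

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

final class XPathProbeDemo {
    // Builds the query the way a vulnerable application would: by concatenation.
    // The query shape is an assumption made for this illustration.
    static String buildQuery(String userName) {
        return "//users/user[login/text()='" + userName + "']/home_dir/text()";
    }

    // Returns true when the concatenated query is no longer well-formed XPath.
    static boolean causesXPathError(String userName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            xpath.compile(buildQuery(userName));
            return false;
        } catch (XPathExpressionException e) {
            // An application that echoes this exception to the client
            // tells the attacker that the input reaches an XPath query.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(causesXPathError("john")); // ordinary input
        System.out.println(causesXPathError("'"));    // probe character
    }
}
```
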

Indicators-Warnings of Attack

Too many exceptions generated by the application as a result of malformed XPath queries

Solutions and Mitigations

Strong input validation - All user-controllable input must be validated and filtered for illegal characters as well as content that can be interpreted in the context of an XPath expression. Characters such as a single quote (') and operators such as "or", "and", and the union operator (|) should be filtered if the application does not expect them in the context in which they appear. If such content cannot be filtered, it must at least be properly escaped so that it is not interpreted as part of an XPath expression.
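
One way to approach such validation is an allow-list check, sketched below in Java. The accepted character set and the 32-character limit are assumptions chosen for illustration, not a recommended policy, and the class name XPathInputValidator is hypothetical. A rule this strict suits user names; passwords usually need a wider character set and are better protected by parameterization.

```java
import java.util.regex.Pattern;

final class XPathInputValidator {
    // Hypothetical policy: 1 to 32 letters, digits, underscores, dots or hyphens.
    // Quotes, brackets, slashes and whitespace are rejected outright.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_.-]{1,32}");

    // Returns true only when the whole input matches the allow-list.
    static boolean isSafe(String input) {
        return input != null && SAFE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSafe("john"));
        System.out.println(isSafe("' or ''='"));
    }
}
```

Rejecting the request when isSafe returns false, rather than trying to repair the input, keeps the check simple to reason about.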

Use of parameterized XPath queries - Parameterization keeps the structure of the query fixed and binds user input to variables that are restricted to certain domains, such as strings or integers. Input bound this way is treated purely as a value and is never interpreted as XPath syntax.

Use of custom error pages - Attackers can glean information about the nature of queries from descriptive error messages. Input validation must be coupled with customized error pages that inform about an error without disclosing information about the database or application.
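
A minimal sketch of this idea in Java follows. The class name SafeXPathErrors, the method name and the wording of the generic message are hypothetical. The detailed exception is written to a server-side log, which also supports monitoring for unusual numbers of malformed queries, while the caller only ever sees a fixed message.

```java
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

final class SafeXPathErrors {
    private static final Logger LOG = Logger.getLogger(SafeXPathErrors.class.getName());

    // Fixed text shown to the client; it reveals nothing about the query.
    static final String GENERIC_MESSAGE = "The request could not be processed.";

    // Compiles the expression and reports the outcome without leaking details.
    static String compileOrGenericMessage(String expression) {
        try {
            XPathFactory.newInstance().newXPath().compile(expression);
            return "OK";
        } catch (XPathExpressionException e) {
            // Full details stay on the server for diagnosis and alerting.
            LOG.log(Level.WARNING, "Malformed XPath expression", e);
            return GENERIC_MESSAGE;
        }
    }
}
```
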

Attack Motivation-Consequences
  • Privilege Escalation
  • Information Leakage
Context Description

The primary cause of XPath Injection is use of improperly validated input. In the absence of such validation, it becomes possible to inject content that can be interpreted as part of XPath expressions used in querying the XML database. The second most important reason is use of XPath expressions created dynamically to query the database. Another factor, albeit a minor one, is the use of default error pages that reveal information about the structure of XPath queries.

It is important to realize that, wherever possible, it is easier to leverage XPath injection than SQL Injection since an XML document usually has no access control associated with it. The attacker can extract the document structure since the contents of the XML document are not bound by privilege considerations in the same manner that tables in a relational database are. Also, in case of SQL injection, the application is limited in querying the database by the privilege of the database account used by the application.

Consider the following simple XML document that stores authentication information, and a snippet of Java code that uses an XPath query to retrieve authentication information:
<?xml version="1.0"?>
<users>
  <user>
    <login>john</login>
    <password>abracadabra</password>
    <home_dir>/home/john</home_dir>
  </user>
  <user>
    <login>cbc</login>
    <password>1mgr8</password>
    <home_dir>/home/cbc</home_dir>
  </user>
</users>

The Java code used to retrieve the home directory based on the provided credentials is:

XPath xpath = XPathFactory.newInstance().newXPath();
// Vulnerable: user input is concatenated directly into the XPath expression.
XPathExpression xlogin = xpath.compile("//users/user[login/text()='" + login.getUserName()
        + "' and password/text() = '" + login.getPassword() + "']/home_dir/text()");
Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("db.xml"));
String homedir = xlogin.evaluate(d);

Assume that user "john" wishes to leverage XPath Injection and log in without a valid password. By providing the username "john" and the password "' or ''='", the XPath expression becomes

//users/user[login/text()='john' and password/text() = '' or ''='']/home_dir/text()

Because "and" binds more tightly than "or" in XPath, the predicate reads as (login is 'john' and password is '') or (''=''). The second operand is always true, so the predicate matches every user, which lets "john" log in without a valid password, thus bypassing authentication.
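
The bypass can be reproduced with a small self-contained Java sketch. It keeps the example document in memory instead of reading db.xml, and the class and method names (XPathInjectionDemo, vulnerableLookup) are hypothetical. The query is built by concatenation exactly as in the vulnerable snippet, and the injected password is ' or ''='.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

final class XPathInjectionDemo {
    // In-memory copy of the example document (db.xml in the text).
    static final String USERS_XML =
        "<users>"
        + "<user><login>john</login><password>abracadabra</password>"
        + "<home_dir>/home/john</home_dir></user>"
        + "<user><login>cbc</login><password>1mgr8</password>"
        + "<home_dir>/home/cbc</home_dir></user>"
        + "</users>";

    // Vulnerable lookup: both inputs are concatenated into the expression.
    static String vulnerableLookup(String userName, String password) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(USERS_XML.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            String query = "//users/user[login/text()='" + userName
                + "' and password/text() = '" + password + "']/home_dir/text()";
            // String value of the first matching node, or "" when nothing matches.
            return xpath.evaluate(query, d);
        } catch (Exception e) {
            throw new IllegalStateException("lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + vulnerableLookup("john", "wrong") + "]");
        System.out.println("[" + vulnerableLookup("john", "' or ''='") + "]");
    }
}
```

With a wrong password the lookup returns an empty string, while the injected password makes it return the home directory of the first user in the document. Since the predicate becomes true for every user, the same payload works even with a user name that does not exist.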

This situation occurred due to the use of improperly filtered input and the use of a dynamic XPath query. Parameterizing the XPath query provides a second line of defense, should input validation fail. The approach to parameterizing the query in Java is to use a resolver to resolve the bound parameters:

import javax.xml.namespace.QName;
import javax.xml.xpath.XPathVariableResolver;

// Resolves the $username and $password variables from the supplied Login object.
public class LoginResolver implements XPathVariableResolver {
    private final Login login;

    public LoginResolver(Login login) {
        this.login = login;
    }

    public Object resolveVariable(QName variableName) {
        if (variableName == null)
            throw new NullPointerException("The variable name cannot be null");

        if (variableName.equals(new QName("username")))
            return login.getUserName();
        else if (variableName.equals(new QName("password")))
            return login.getPassword();
        else
            return null; // unknown variable
    }
}

The corresponding XPath expression and query are:

xpath.setXPathVariableResolver(new LoginResolver(login));
XPathExpression xlogin = xpath.compile("//users/user[login/text()=$username and password/text() = $password]/home_dir/text()");
Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("db.xml"));
String homedir = xlogin.evaluate(d);
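
The effect of parameterization can be checked with a self-contained Java sketch. It uses an in-memory copy of the example document and, for brevity, a map-backed lambda in place of the LoginResolver class; the names ParameterizedXPathDemo and parameterizedLookup are hypothetical. The payload ' or ''=' is bound to $password as a plain string, so it is compared against the stored password instead of being parsed as XPath.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

final class ParameterizedXPathDemo {
    // In-memory copy of the example document (db.xml in the text).
    static final String USERS_XML =
        "<users>"
        + "<user><login>john</login><password>abracadabra</password>"
        + "<home_dir>/home/john</home_dir></user>"
        + "<user><login>cbc</login><password>1mgr8</password>"
        + "<home_dir>/home/cbc</home_dir></user>"
        + "</users>";

    // Fixed query text; user input never becomes part of it.
    static final String QUERY =
        "//users/user[login/text()=$username and password/text()=$password]/home_dir/text()";

    static String parameterizedLookup(String userName, String password) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(USERS_XML.getBytes(StandardCharsets.UTF_8)));
            Map<String, Object> vars = new HashMap<>();
            vars.put("username", userName);
            vars.put("password", password);
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The resolver must be set before the expression is evaluated.
            xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));
            return xpath.evaluate(QUERY, d);
        } catch (Exception e) {
            throw new IllegalStateException("lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("[" + parameterizedLookup("john", "abracadabra") + "]");
        System.out.println("[" + parameterizedLookup("john", "' or ''='") + "]");
    }
}
```

The valid credentials return /home/john, while the injection payload returns an empty string because no stored password equals the literal text of the payload.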

A similar attack pattern that seeks to extract information, including the XML document structure, is known as Blind XPath Injection and is based on the lack of proper input validation and non-parameterized XPath queries. The difference lies in the fact that bypassing authentication does not require knowledge of the rest of the document and the corresponding query can be quite easily discerned. With Blind XPath Injection, the attacker asks the database a number of Boolean questions by formulating appropriate XPath expressions.
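
The Boolean-question technique can be sketched in Java against the same vulnerable query. This is an illustration under stated assumptions, not a description of any particular tool: the attacker is assumed to already know the element names user and password (in practice these would be discovered with further Boolean questions), the only observable signal is whether the login succeeds, and the names BlindXPathDemo, loginSucceeds, ask and firstPasswordLength are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

final class BlindXPathDemo {
    // In-memory copy of the example document (db.xml in the text).
    static final String USERS_XML =
        "<users>"
        + "<user><login>john</login><password>abracadabra</password>"
        + "<home_dir>/home/john</home_dir></user>"
        + "<user><login>cbc</login><password>1mgr8</password>"
        + "<home_dir>/home/cbc</home_dir></user>"
        + "</users>";

    // The vulnerable application: reports only whether the login succeeded.
    static boolean loginSucceeds(String userName, String password) {
        try {
            Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(USERS_XML.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            String query = "//users/user[login/text()='" + userName
                + "' and password/text() = '" + password + "']/home_dir/text()";
            return !xpath.evaluate(query, d).isEmpty();
        } catch (Exception e) {
            return false; // a malformed query looks like a failed login
        }
    }

    // Attacker side: wraps a Boolean XPath condition in the user name field.
    // The trailing "or 'a'='b" absorbs the rest of the original predicate.
    static boolean ask(String condition) {
        return loginSucceeds("x' or " + condition + " or 'a'='b", "");
    }

    // Recovers the length of the first user's password, one question per guess.
    static int firstPasswordLength(int maxLength) {
        for (int n = 1; n <= maxLength; n++) {
            if (ask("string-length(//user[1]/password/text())=" + n)) {
                return n;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstPasswordLength(64));
        System.out.println(ask("substring(//user[1]/password/text(),1,1)='a'"));
    }
}
```

Each call to ask leaks one bit, so recovering a value character by character takes many requests, which is one reason an unusual volume of failed or malformed queries is a useful warning sign.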

Injection Vector

User-controllable input used as part of dynamic XPath queries

Payload

XPath expressions intended to defeat checks run by XPath queries

Activation Zone

XML database

Payload Activation Impact

The impact of payload activation is that it is interpreted as part of the XPath expression used in the query, thus enabling an attacker to modify the expression used by the query.

Related Weaknesses
CWE-ID  Weakness Name                                                      Weakness Relationship Type
91      XML Injection (aka Blind XPath Injection)                          Targeted
74      Failure to Sanitize Data into a Different Plane (aka 'Injection')  Secondary
20      Insufficient Input Validation                                      Secondary
390     Detection of Error Condition Without Action                        Secondary
Relevant Security Requirements

Special characters in user-controllable input must be escaped before use by the application.

Only use parameterized XPath expressions to query the XML database.

Custom error pages must be used to handle exceptions such that they do not reveal any information about the architecture of the application or the database.

Related Security Principles
  • Reluctance to Trust
  • Failing Securely
  • Defense in Depth
Related Guidelines
  • Never Use Input as Part of a Directive to any Internal Component
  • Handle All Errors Safely
Purpose

Penetration

Exploitation

CIA Impact
Confidentiality Impact  Integrity Impact  Availability Impact
High                    High              Medium
Technical Context
Architectural Paradigm  Framework  Platform  Language
Client-Server           All        All       All
References

CWE - XML Injection

CWE - Input Validation

CWE - Improper Error Handling

Source
Submission(s)
Submitter           Organization  Date        Comment
Chiradeep B Chhaya                2007-01-30  Second Draft
Modification(s)
Modifier     Organization  Date        Comment
Malik Hamro  Cigital, Inc  2007-02-27  Reformat to new schema and review
Sean Barnum  Cigital, Inc  2007-03-05  Review and revise
 
Page Last Updated: April 18, 2008