INJ04-G: Prevent XML External Entity Attacks
Introduction
XML External Entity (XXE) attacks occur when an XML parser is tricked into processing external content within an XML document. An attacker can manipulate the parser to access local files, execute arbitrary code, or initiate denial-of-service attacks. XXE attacks can lead to severe confidentiality, integrity, and availability breaches, making them a critical concern for any application that processes XML input.
Entity declarations in XML define shortcuts for commonly used text or special characters. These declarations can specify either internal or external entities. Internal entities include their content directly within the declaration, while external entities reference content via a uniform resource identifier (URI). Entities can be parsed or unparsed. Parsed entities contain replacement text, whereas unparsed entities are resources that may or may not be text and, if text, might not be XML. Parsed entities are invoked by name using an entity reference, while unparsed entities are invoked by name in the value of ENTITY or ENTITIES attributes.
According to the W3C XML Recommendation, section 4.4.3, "Included If Validating," when an XML processor recognizes a reference to a parsed entity, it must include the replacement text in order to validate the document. However, if the entity is external and the processor is not validating the XML document, including the replacement text is optional. Because inclusion is optional in that case, not every non-validating XML processor is susceptible to external entity attacks, whereas a validating processor must always include the replacement text.
An attacker might exploit XXE vulnerabilities to access sensitive information by manipulating the entity's URI. This manipulation could reference local files containing sensitive data such as passwords or private personal information, named pipes, or Windows UNC paths to read or probe the network. Additionally, an attacker might initiate a denial-of-service attack by specifying input URIs like /dev/random or /dev/tty to crash or indefinitely block a program.
The following figure demonstrates an application that receives an XMLRPC-like request from an attacker to update their document.
The attacker includes a short DTD into the document to define the "file" external entity, which references a configuration file local to the vulnerable application. After evaluating the XML document, the configuration file's contents are included inline as a field. Because the evaluation of entities occurs within the XML parser, the application processing this request cannot determine that the attacker provided the content as a literal string in the field. As a result, the attacker can retrieve their previously submitted user profile information containing the extracted file data.
Non-compliant Code Example
This non-compliant code example parses an XML file without disabling or validating external entities:
// Noncompliant Code Example
static function receiveXMLStream(inStream : InputStream,
                                 defaultHandler : DefaultHandler) : void {
  var factory : SAXParserFactory = SAXParserFactory.newInstance()
  var saxParser : SAXParser = factory.newSAXParser()
  saxParser.parse(inStream, defaultHandler)
}
This application might use external entities to include a copyright statement in multiple documents:
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY copyright SYSTEM "file:///c:/xmldata/copyright" >]>
<foo>&copyright;</foo>
Using external entities might improve the system's maintainability by not requiring the copyright statement to be updated in multiple files each year. However, this same program may be susceptible to an XXE attack if an attacker can specify the contents of an external entity, either by injection or by supplying the entire XML file.
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<foo>&xxe;</foo>
A SAX (Simple API for XML) or a DOM (Document Object Model) parser will attempt to access the URI given as the system identifier after the SYSTEM keyword. Accessing the URI may allow the attacker to exfiltrate the contents of a sensitive file, probe a network, or execute a denial-of-service attack.
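The behavior described above can be reproduced safely against a throwaway file. The following is a minimal sketch in Java, the language of the JAXP classes that the Gosu examples call. The class and method names (XxeDemo, parseWithDefaults, leaksFileContents) and the "TOP-SECRET" marker are hypothetical, and the temporary file stands in for a sensitive file. It assumes the JDK's built-in parser with its default JAXP settings, under which the entity is expected to be resolved and leaksFileContents() to return true.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class XxeDemo {
    /** Parses xml with a default-configured SAX parser and returns the character data seen. */
    static String parseWithDefaults(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        // No features are set here, mirroring the non-compliant example.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return text.toString();
    }

    /** Builds a document whose external entity "xxe" points at the given URI. */
    static String documentReferencing(String uri) {
        return "<!DOCTYPE foo [<!ELEMENT foo ANY><!ENTITY xxe SYSTEM \"" + uri + "\">]>"
             + "<foo>&xxe;</foo>";
    }

    /** Writes a stand-in "sensitive" file and reports whether its contents reach the handler. */
    static boolean leaksFileContents() {
        try {
            Path secret = Files.createTempFile("xxe-demo", ".txt");
            try {
                Files.write(secret, "TOP-SECRET".getBytes(StandardCharsets.UTF_8));
                String xml = documentReferencing(secret.toUri().toString());
                return parseWithDefaults(xml).contains("TOP-SECRET");
            } finally {
                Files.deleteIfExists(secret);
            }
        } catch (Exception e) {
            return false; // the parser refused or failed, so nothing leaked
        }
    }
}
```

The handler never sees the DOCTYPE, only the expanded text, which is why the application cannot tell injected file contents apart from a literal string.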
Compliant Solution (EntityResolver)
This compliant solution defines a CustomResolver class that implements the interface org.xml.sax.EntityResolver. This interface enables a SAX application to customize the handling of external entities. The customized handler uses a simple whitelist for external entities. The resolveEntity method returns an empty InputSource when the requested entity does not match any of the specified safe-entity source paths.
// Compliant Solution
static function receiveXMLStream(inStream : InputStream,
                                 defaultHandler : DefaultHandler) : void {
  var factory : SAXParserFactory = SAXParserFactory.newInstance()
  var saxParser : SAXParser = factory.newSAXParser()
  var reader : XMLReader = saxParser.getXMLReader()
  reader.setEntityResolver(new CustomResolver())
  reader.setContentHandler(defaultHandler)
  var is = new InputSource(inStream)
  reader.parse(is)
}
The setEntityResolver method registers the instance with the corresponding SAX driver. When parsing malicious input, the empty InputSource returned by the custom resolver causes a java.net.MalformedURLException to be thrown. Note that you must create an XMLReader object on which to set the custom entity resolver.
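The guideline does not show the CustomResolver class itself. One possible shape is sketched below in Java. The constructor taking a set of allowed system identifiers, the exact-match comparison, and the ResolverDemo helper are assumptions made for illustration, not the guideline's implementation. The helper reports "rejected" for any parser exception rather than relying on a specific exception type.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical resolver: permits only system identifiers listed verbatim in a whitelist. */
class CustomResolver implements EntityResolver {
    private final Set<String> allowedSystemIds;

    CustomResolver(Set<String> allowedSystemIds) {
        this.allowedSystemIds = allowedSystemIds;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId != null && allowedSystemIds.contains(systemId)) {
            return new InputSource(systemId); // trusted: the parser may open this URI
        }
        return new InputSource(); // untrusted: empty source with nothing to read
    }
}

class ResolverDemo {
    /** Returns the parsed character data, or "rejected" if the parser threw. */
    static String parse(String xml, Set<String> allowedSystemIds) {
        StringBuilder text = new StringBuilder();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(new CustomResolver(allowedSystemIds));
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
            reader.parse(new InputSource(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
            return text.toString();
        } catch (Exception e) {
            return "rejected";
        }
    }
}
```

Exact string matching is deliberately strict: a system identifier that differs from a whitelisted one only in case or by a relative path segment is treated as untrusted.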
Compliant Solution (Disable External Entities)
Java web applications using XML libraries are particularly vulnerable to XXE injection attacks because most Java XML SAX parsers resolve external entities by default. To use these parsers safely, explicitly disable referencing of external entities in the SAX parser implementation.
The XML processor can be configured to use only a locally defined DTD or disallow any inline DTD specified within the user-supplied XML document(s).
Many XML parsing engines are available for Java, and each has a mechanism for disabling inline DTD to prevent XXE. Search your XML parser's documentation for how to "disable inline DTD" specifically.
For a Java XMLInputFactory, the following setting disables DTD support:
// Compliant Solution
xmlInputFactory.setProperty(
XMLInputFactory.SUPPORT_DTD, false
)
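In context, the setting might be applied as in the following Java sketch. The StaxHardening class and its methods are hypothetical names. Setting IS_SUPPORTING_EXTERNAL_ENTITIES to false is an extra precaution beyond what the guideline lists, and returning "rejected" on an XMLStreamException is an illustrative choice.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxHardening {
    /** Creates a StAX factory with DTD processing and external entities turned off. */
    static XMLInputFactory newHardenedFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        // Assumption: additional hardening not shown in the guideline.
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return factory;
    }

    /** Returns the concatenated character data, or "rejected" if the reader threw. */
    static String readText(String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader reader =
                    newHardenedFactory().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (XMLStreamException e) {
            return "rejected";
        }
    }
}
```

Documents without a DTD are read normally, so readText("<foo>hello</foo>") returns "hello".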
The primary defense against XXE attacks is to configure the application's XML parser to disallow DOCTYPE declarations (DTDs) entirely:
// Compliant Solution
documentBuilderFactory.setFeature(
"http://apache.org/xml/features/disallow-doctype-decl",
true
)
With this feature set, the parser throws an exception and stops parsing if the XML document contains a DOCTYPE declaration.
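A short Java sketch of this behavior follows. The DoctypeBlocking class and its accepts method are hypothetical names, and the sketch assumes a parser that recognizes the Apache feature URI, as the JDK's built-in implementation does. The default error handler may also print a "[Fatal Error]" line to standard error when a document is rejected.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class DoctypeBlocking {
    /** Returns true if the document parsed, false if the parser rejected it. */
    static boolean accepts(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Any DOCTYPE declaration now causes a fatal parse error.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (ParserConfigurationException e) {
            // The parser does not support the feature; fail loudly instead of parsing unprotected.
            throw new IllegalStateException(e);
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

Failing on ParserConfigurationException is a deliberate choice in this sketch: silently continuing would leave the parser in its default, permissive configuration.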
If the application requires DOCTYPE declarations, the parser can instead be configured to ignore external general entities:
// Compliant Solution
documentBuilderFactory.setFeature(
"http://xml.org/sax/features/external-general-entities",
false
)
External parameter entities can also be disabled:
// Compliant Solution
documentBuilderFactory.setFeature(
"http://xml.org/sax/features/external-parameter-entities",
false
)
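Both features can be set on one factory so that DOCTYPE declarations and internal entities keep working while external entities are not fetched. The Java sketch below uses hypothetical names (ExternalEntityBlocking, newHardenedFactory, rootText) and assumes the JDK's built-in DOM parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class ExternalEntityBlocking {
    /** Creates a DOM factory that keeps DTD support but ignores external entities. */
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return factory;
    }

    /** Returns the text content of the root element, or "rejected" on a parse error. */
    static String rootText(String xml) {
        try {
            Document doc = newHardenedFactory().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e); // features unsupported: do not parse unprotected
        } catch (SAXException | IOException e) {
            return "rejected";
        }
    }
}
```

With this configuration an internal entity such as <!ENTITY greeting "hello"> still expands, so a document body of &greeting; yields "hello".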
Risk Assessment
Failure to sanitize user input before processing or storing it can result in injection attacks.
Rule | Severity | Likelihood | Remediation Cost | Priority | Level |
---|---|---|---|---|---|
INJ04-G | Medium | Likely | Medium | L8 | L2 |