Pro PHP XML and Web Services, part 6


Element Node

An element node is processed through the creation of a start tag using the QName of the element, the processing of any namespace and attribute nodes, the processing of any child nodes, and the creation of an end tag for the element. I will not explain the semantics of this generation because the PHP extensions handle the generation of the serialized form.

■Tip In canonical form, element nodes must always have a start tag and an end tag. Empty-element tags are not allowed. You must write <element /> as <element></element>.

One point I would like to mention, though, concerns empty namespace declarations. Within a document, you can use them to indicate that an element is not in any namespace. Typically, however, this is used only within the scope of a default namespace, to indicate that nodes within the current scope are not in that default namespace. For example:

    <element1 xmlns="http://www.example.com">
      <element2>
        <element3 xmlns=""></element3>
      </element2>
    </element1>

The element node element1 sets the default namespace to http://www.example.com. This namespace is automatically inherited by element2. The element3 node removes the default namespace by setting xmlns="", so any element among the children of element3 would not be in any namespace unless one of those nodes sets one.

With respect to canonical form, only elements that would otherwise be in a default namespace can set an empty namespace. So, based on this rule, the following is invalid:

    <!-- The following is invalid -->
    <element1 xmlns="">
      <element2></element2>
    </element1>

The node element1 is not within a default namespace, so it cannot define an empty default namespace in canonical form. In canonical form, it looks like this:

    <element1>
      <element2></element2>
    </element1>

Namespace Node

Namespace nodes are processed only if they are not already in scope, meaning that no ancestor element also within the node set declares the same prefix with the same namespaceURI.
Consider the following document:

    <element1 xmlns:a="http://www.example.com/a" xmlns:b="http://www.example.com/b">
      <element2 xmlns:a="http://www.example.com/a" xmlns:b="http://www.example.com/Z">
        <element3></element3>
      </element2>
    </element1>

The canonical form of this document looks like this:

    <element1 xmlns:a="http://www.example.com/a" xmlns:b="http://www.example.com/b">
      <element2 xmlns:b="http://www.example.com/Z">
        <element3></element3>
      </element2>
    </element1>

You can see that the namespace with the prefix a was removed from element2. Its namespaceURI and prefix are both in scope from the parent, so they are not serialized. The namespace with prefix b on element2 was included because the namespaceURI for that prefix changed and is no longer the same namespace.

Attribute Node

The PHP extensions already handle the processing of attribute nodes during serialization. The only possible issue concerns attribute values. The serialized value of the node is modified by replacing the characters &, <, and " with their entity references and the whitespace characters #x9 (tab), #xA (line feed), and #xD (carriage return) with their character references. Notice that the > character is not modified. In short, attribute values that contain any of these special characters must be modified during serialization. If this is the case, XSL may be helpful because you can use a template to match attributes, which in turn calls a PHP function to process the attribute value and return a modified string. For the sake of this chapter, and because all of this has to be built manually, I will use simplified attribute values that need no special handling.

Text Node

Text nodes are processed by converting the characters &, <, and > to their entity references. The whitespace character #xD is also replaced by &#xD;.
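The attribute and text replacement rules above can be sketched as two small helpers. This is a minimal sketch rather than the chapter's own code: the function names c14nEscapeAttribute and c14nEscapeText are hypothetical, and the sketch assumes the input strings are already the parsed (unescaped) node values.

```php
<?php
// Hypothetical helper: escape an attribute value the way canonical form requires.
// Replaces &, <, " and the whitespace characters #x9, #xA, #xD; leaves > untouched.
function c14nEscapeAttribute(string $value): string
{
    // strtr() with an array makes a single pass, so the & in "&lt;" is not escaped again.
    return strtr($value, [
        '&'  => '&amp;',
        '<'  => '&lt;',
        '"'  => '&quot;',
        "\t" => '&#x9;',
        "\n" => '&#xA;',
        "\r" => '&#xD;',
    ]);
}

// Hypothetical helper: escape a text node value.
// Replaces &, <, > and the carriage return #xD; tabs and line feeds stay as they are.
function c14nEscapeText(string $value): string
{
    return strtr($value, [
        '&'  => '&amp;',
        '<'  => '&lt;',
        '>'  => '&gt;',
        "\r" => '&#xD;',
    ]);
}

echo c14nEscapeAttribute("a<b & \"c\">\t"), "\n";
echo c14nEscapeText("1 < 2 > 0 & x\r"), "\n";
```

A helper like the first one is the kind of PHP function an XSL template could call when matching attributes, as suggested above.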
Processing Instruction Node

PI nodes will already have been taken care of for you during serialization, unless the value is empty. They consist of the <? characters followed by the target, a space, the value, and the closing ?> characters. An empty value does not place a space after the target. Do not confuse an empty value with a value consisting of whitespace. Consider the following:

    <?php?>
    <?php ?>
    <?php   ?>

The canonical forms of these are as follows:

    <?php?>
    <?php?>
    <?php?>

Each of these has no value, so no additional space was added.

■Caution PIs that have empty values need to be handled like attributes when creating the canonical form. The suggested method is to use XSL to create the values properly. This, of course, is needed only if the document can have PI nodes with empty values. In all other cases, serialization using the PHP extensions will work correctly.

Comment Node

Comment nodes are a little special. Canonical form can be generated with or without comment nodes. When it is generated without them, comments have no bearing on serialization, and you need to remove all comments from the document. Again, you could do this using XSL or using the DOM API. This is an example of doing this with DOM in combination with XPath:

    $xPath = new DOMXPath($dom);
    /* Select every comment node in the document */
    $cnodes = $xPath->query('//comment()');
    foreach ($cnodes as $cnode) {
        $cnode->parentNode->removeChild($cnode);
    }

Introducing Exclusive XML Canonicalization

An issue faced earlier in the chapter when working with canonical XML dealt with extracting a document subset and inserting it into a different context. This caused many problems because canonical XML includes the document subset's ancestor namespace declarations and attributes within the XML namespace. For instance, a wrapper node encapsulates a subset and might be used for something like transport. If you are familiar with SOAP, you know this would be equivalent to its envelope.
The following document is in canonical form:

    <subdoc>
      <element>content</element>
    </subdoc>

The document subset is then encapsulated within an envelope:

    <envelope xmlns="http://www.example.com" xml:lang="en">
      <subdoc>
        <element>content</element>
      </subdoc>
    </envelope>

When canonical XML is applied to the subset in this case, the serialized version is much different:

    <subdoc xmlns="http://www.example.com" xml:lang="en">
      <element>content</element>
    </subdoc>

Dealing with something like digital signatures becomes a nightmare. The original document no longer has the same canonical form as the latter one even though it is the same document subset. Trying to extract the subset and place it within a different context, such as within another document, becomes impossible. This is why you might also hear canonical XML referred to as inclusive canonical XML. It includes the context of a subset's ancestors.

To deal with this issue, exclusive XML canonicalization was devised. It excludes, rather than includes, the context of a subset's ancestors. This means namespace declarations and attributes in the XML namespace from a subset's ancestors are not part of the canonicalization process when performing exclusive XML canonicalization. Taking the enveloped subdoc and using exclusive XML canonicalization, the results are probably closer to what you originally expected:

    <subdoc>
      <element>content</element>
    </subdoc>

The document subset remains in the same form as it originally was. This area is where canonical XML and exclusive XML canonicalization differ.

Data Model

The data model for exclusive XML canonicalization is the same as that for canonical XML with a few exceptions. These exceptions, as previously noted, fall into the area of namespace declaration handling.
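The difference between inclusive and exclusive handling of a subset's ancestor context can be observed directly, assuming a PHP build whose DOM extension provides DOMNode::C14N() (its first argument selects exclusive mode). This is a minimal sketch and not the manual approach the chapter builds. The envelope here is a hypothetical variation that binds its namespace to the prefix env, so that the unprefixed subset does not itself make use of it.

```php
<?php
// Hypothetical envelope: its namespace is bound to the prefix "env",
// which the wrapped subset never uses.
$xml = '<env:envelope xmlns:env="http://www.example.com/env" xml:lang="en">'
     . '<subdoc><element>content</element></subdoc>'
     . '</env:envelope>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$subdoc = $dom->getElementsByTagName('subdoc')->item(0);

// Inclusive canonical XML: ancestor context is pulled onto the subset's apex element.
$inclusive = $subdoc->C14N(false);

// Exclusive canonicalization: declarations the subset does not visibly use are left out.
$exclusive = $subdoc->C14N(true);

echo $inclusive, "\n";
echo $exclusive, "\n"; // <subdoc><element>content</element></subdoc>
```

The exclusive result is the same string no matter which envelope wraps the subset, which is the property a signature over the subset relies on.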
You have already seen that exclusive XML canonicalization does not search ancestor nodes outside the node set for namespace declarations and attributes from the XML namespace. Serialization of namespace declarations themselves also differs and depends upon a few factors.

You can use an InclusiveNamespaces PrefixList parameter with exclusive XML canonicalization. It is a list containing prefixes and/or a token that indicates a default namespace. This parameter plays a role in how namespaced nodes are rendered in canonical form.

■Note For the sections dealing with prefixes not in the InclusiveNamespaces PrefixList, assume the list is NULL, meaning it does not contain any prefixes or tokens. This will help you understand the process.

Prefixed Namespace Nodes

Namespaced nodes with a prefix not in the InclusiveNamespaces PrefixList, if used, are rendered if they meet the following criteria:

• The parent element is in the node set.
• The namespace is visibly utilized by the element, which includes its attributes.
• The prefix has not already been rendered by an ancestor within the output, or the prefix has been rendered by an ancestor yet refers to a different namespace.

The term visibly utilized means that either the element or one of its attributes uses the prefix of the namespace within its qualified name. The following document will be serialized using exclusive XML canonicalization. It is assumed that all nodes are within the node set.
    <n1:element1 xmlns:n1="http://www.example.com/ns1" xmlns:n2="http://www.example.com/ns2">
      <n2:element2 n1:att1="value" xmlns:n3="http://www.example.com/ns3">
        some content
      </n2:element2>
    </n1:element1>

Based on the rules for namespace serialization, the canonical form ends up like the following:

    <n1:element1 xmlns:n1="http://www.example.com/ns1">
      <n2:element2 xmlns:n2="http://www.example.com/ns2" n1:att1="value">
        some content
      </n2:element2>
    </n1:element1>

As you can see, the n2 namespace was not serialized on the n1:element1 element. It is not visibly utilized there. Moving to the n2:element2 element, the n2 namespace declaration is added because it meets all the criteria. Its parent element, n2:element2, is in the node set, it is visibly utilized by the element (notice the n2 prefix for the element name), and the prefix has not yet been rendered. The n3 namespace was not rendered because it is not visibly utilized. The n2:element2 element is not in the n3 namespace and does not contain any attributes within the n3 namespace.

Default Namespace Nodes

The rules for processing tokens that represent default namespace nodes not in the InclusiveNamespaces PrefixList are different from those for canonical XML for empty namespaces, xmlns="". The empty namespace is output only if the element visibly utilizes the default namespace, the element does not define a default namespace that is in the node set, and the nearest ancestor that is output and that visibly utilizes the default namespace has a default namespace in the node set.
This may sound a little confusing, so take a look at the following document:

    <element1 xmlns="">
      <element2 xmlns="http://www.example.com/default">
        <element3 xmlns="">
          <element4 xmlns="">
            Some Content
          </element4>
        </element3>
      </element2>
    </element1>

The canonical form using exclusive XML canonicalization is as follows:

    <element1>
      <element2 xmlns="http://www.example.com/default">
        <element3 xmlns="">
          <element4>
            Some Content
          </element4>
        </element3>
      </element2>
    </element1>

In the canonical form, the only element that declares an empty default namespace is element3.

InclusiveNamespaces PrefixList

The InclusiveNamespaces PrefixList adds a twist to the rules already defined for handling namespace nodes. A namespace node matching a prefix or token in the list is rendered according to the rules of canonical XML rather than those of exclusive XML canonicalization. Namespace nodes in the node set that match a prefix or token in the list, unlike those not in the list, do not need to have parent elements in the node set. This can make your output look a little strange because it can result in non-well-formed XML, which is perfectly acceptable when generating a canonical form for a document subset. For the sake of sanity (because this leads to much greater complexity than you are already dealing with), the discussion of namespace nodes without an element in the node set is out of the scope of this chapter. Documents and document subsets used within this chapter will conform to those described in the next section.

Constrained Implementation (Non-Normative)

Section 3.1 of the Exclusive XML Canonicalization specification deals with a non-normative way to implement exclusive XML canonicalization. It assumes that subsets are well-formed and that when an element is in a node set, so is its namespace axis. When an element is not in a node set, neither is its namespace axis.
These are the types of documents and document subsets that will be used within this chapter when working with the XML extensions in PHP. The following steps come directly from section 3.1 of the specification:

1. Recursively process the entire tree (from which the XPath node set was selected) in document order starting with the root. (The operation of copying ancestor xml: namespace attributes into output apex element nodes is not done.)

2. If the node is not in the XPath subset, continue to process its children element nodes recursively.

3. If the element node is in the XPath subset, then output the node in accordance with canonical XML except for namespace nodes, which are rendered as follows:

   a. ns_rendered is a copy of a dictionary, off the top of the state stack, of prefixes and their values that have already been rendered by an output ancestor of the namespace node's parent element.

   b. Render each namespace node if and only if it is visibly utilized by the immediate parent element or one of its attributes or if it is present in InclusiveNamespaces PrefixList and if its prefix and value do not appear in ns_rendered.

   c. Render xmlns="" if and only if the default namespace is visibly utilized by the immediate parent element node or the default prefix token is present in InclusiveNamespaces PrefixList and the element does not have a namespace node in the node set declaring a value for the default namespace and the default namespace prefix is present in the dictionary ns_rendered.

4. Insert all the rendered namespace nodes (including xmlns="") into the ns_rendered dictionary, replacing any existing entries. Push ns_rendered onto the state stack, and recurse.

5. After the recursion returns, pop the state stack.

This list contains generalized instructions on how exclusive XML canonicalization could be implemented.
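The outcome of these steps can be checked against an existing implementation. The sketch below does not implement the ns_rendered stack itself. It assumes a PHP build whose DOM extension provides DOMNode::C14N(), where the first argument selects exclusive mode and the fourth takes a list of prefixes that plays the role of the InclusiveNamespaces PrefixList. The input is the prefixed example from earlier in this section, written without layout whitespace so the output is easy to compare.

```php
<?php
$xml = '<n1:element1 xmlns:n1="http://www.example.com/ns1" xmlns:n2="http://www.example.com/ns2">'
     . '<n2:element2 n1:att1="value" xmlns:n3="http://www.example.com/ns3">some content</n2:element2>'
     . '</n1:element1>';

$dom = new DOMDocument();
$dom->loadXML($xml);
$root = $dom->documentElement;

// Exclusive mode, empty prefix list: only visibly utilized namespaces are rendered,
// and each one appears on the first output element that uses it.
$exclusive = $root->C14N(true);

// Exclusive mode with "n3" in the prefix list: n3 is then handled by the
// inclusive rules and is rendered even though nothing uses it.
$withN3 = $root->C14N(true, false, null, ['n3']);

echo $exclusive, "\n";
echo $withN3, "\n";
```

In the first result, n2 is declared on n2:element2 rather than on the root, and n3 does not appear at all, matching steps 3a and 3b. Note that namespace declarations are written before attributes in the serialized start tag.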
As you get into the “Introducing XML Signatures” and “Introducing XML Encryption” sections, you will see examples using PHP that demonstrate this generalization.

■Note The canonical forms used with digital signatures and encryption are generated using exclusive XML canonicalization.

Introducing XML Signatures

XML signatures can verify the integrity and source of data, confirming that the data has not been altered from its original state. They do this by using keys. One of the most commonly used methods involves public and private keys. An author of a document would use a private key to sign the data. This would create a digital signature, which is then added to an XML document. The receiver, who must have a copy of the author's public key, would then use that key to verify the signed data. Upon a successful verification, the receiver knows three things:

• The author is the genuine originator of the document, which is known as signer authentication.
• The data has not been altered from its original form, which is called integrity.
• Neither the data nor the checksum has been tampered with, which may occur if someone is trying to alter data while keeping the integrity of the data in order to deceive the receiver of the data. This is commonly known as message authentication.

The XML-Signature Syntax and Processing specification (http://www.w3.org/TR/xmldsig-core/) specifies the syntax and processing rules for creating and representing digital signatures. It is named such because it uses XML syntax for the signature. You can apply XML signatures to virtually any type of digital data, including data within an XML document as well as remote resources accessible from a URI.

Understanding the Types of Signatures

Three types of XML signatures exist: enveloped signatures, enveloping signatures, and detached signatures.
Enveloped Signatures

Enveloped signatures are signatures that are contained within the XML content that is being signed. In simple terms, an enveloped signature is an XML signature structure that is a child of a signed document. For example:

    <mydocument>
      <mydata1>some data</mydata1>
      <mydata2>more data</mydata2>
      <Signature>
        <!-- Signature Data -->
      </Signature>
    </mydocument>

The XML signature, denoted by the Signature element and its contents, is placed within the document being signed. In this case, the signed data would include the mydocument element and all of its content but exclude the actual XML signature structure, which begins with the Signature element.

Enveloping Signatures

XML signatures can also be enveloping. This means the data being signed lives within the XML signature structure:

    <mydocument>
      <mydata2>more data</mydata2>
      <Signature>
        <!-- Signature Data including the reference to the Object element -->
        <Object Id="mydata">
          <mydata1>some data</mydata1>
        </Object>
      </Signature>
    </mydocument>

The XML signature structure does not need to be embedded within an XML document, since its structure is already in XML format. I have shown it this way because you have not yet been introduced to the structure and because it illustrates how an enveloping signature can be encapsulated within another document, where the encapsulating document has no bearing on the signature. In this case, the signature would include a reference to the Object element. The data within the Object element is the data being signed. I will explain how to reference data later in the “Introducing the XML Signature Structure” section.

Detached Signatures

An enveloped signature means the signature is encapsulated within the document being signed. It is not necessary that the entire document be signed, and it is quite possible you have only a single element in the document that is signed.
In fact, it is also quite possible that the data being signed does not even live within the document but resides remotely and is accessible through a URI. Detached signatures are used for just these purposes:

    <mydocument>
      <mydata2>more data</mydata2>
      <Signature>
        <!-- Signature Data including the reference to the Object element -->
      </Signature>
      <Object Id="mydata">
        <mydata1>some data</mydata1>
      </Object>
    </mydocument>

With this example, the data within the Object element is again being signed. This time, however, the element lives outside the signature, and the signature is being applied only to that particular element and not the entire document:

    <Signature>
      <!-- Signature Data including the reference to remote data -->
    </Signature>

This example refers to data being signed that lives outside the document entirely. Rather than referencing the Object element from the previous examples, the XML signature, in this instance, references remote data using a URI. Again, I will explain this in detail in the “Introducing the XML Signature Structure” section when I break down the structure. Just as in the previous example, it would also be valid to encapsulate the XML signature within a document, and the document has no bearing on the signature or the referenced data.

Introducing the XML Signature Structure

The structure of XML signatures can get quite complex. An entire book could be written on this subject alone. For this reason, I will keep things simple and cover only the core syntax. This chapter will introduce how to create and verify basic XML signatures using PHP. After you understand this, you should be able to implement more advanced signatures based on the specifications. The document in Listing 12-2 illustrates a valid enveloping signature.
■Note The XML signature in Listing 12-2, as well as all other examples in this chapter, uses the string "secret" for the HMAC key. Attributes named Id are ID attributes. No DTDs are being used in this section, although you could use a DTD to automatically define these as IDs within the document. Refer to the specifications for the schemas for each element and attribute list.

Listing 12-2. Example of Enveloping Signature

    <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
        <CanonicalizationMethod
            Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
        <SignatureMethod
            Algorithm="http://www.w3.org/2000/09/xmldsig#hmac-sha1" />
        <Reference URI="#object">
          <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
          <DigestValue>nTZuluErIxkl4DgMsBO/E5TiLRA=</DigestValue>
        </Reference>
      </SignedInfo>
      <SignatureValue>OUubDO2l6XUIODuLSjKAtjYlaTk=</SignatureValue>
      <Object Id="object">Hello World!</Object>
    </Signature>

The data being signed in this case is the Object element. It lives within the Signature element, thus creating an enveloping signature. Using this example, you'll now see how the XML signature is composed and structured.

Signature Element

The Signature element is the root of an XML signature and is bound to the http://www.w3.org/2000/09/xmldsig# namespace. This element contains all the information needed to verify an XML signature.

SignatureValue Element

The SignatureValue element contains the Base64-encoded value of the actual digital signature, which in Listing 12-2 is the value OUubDO2l6XUIODuLSjKAtjYlaTk=. I'll explain how to compute this value later in the “Generating a Signature” section as well as when you get into actually generating an XML signature.
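As a rough preview of how Base64-encoded values such as DigestValue and SignatureValue come about with the SHA1 and HMAC-SHA1 algorithms named in Listing 12-2, here is a minimal sketch using PHP's built-in sha1(), hash_hmac(), and hash_equals(). The helper names are hypothetical, and the input string is an illustrative stand-in for canonical octets. The sketch assumes, per the XML-Signature specification, that the HMAC is computed over the canonical form of SignedInfo, and it makes no claim to reproduce the exact values in the listing, since those depend on the exact canonical bytes.

```php
<?php
// Hypothetical helper: Base64-encoded SHA1 digest of already-canonicalized data.
function digestValue(string $canonicalData): string
{
    // The second argument makes sha1() return the raw 20-byte digest.
    return base64_encode(sha1($canonicalData, true));
}

// Hypothetical helper: Base64-encoded HMAC-SHA1 over canonical SignedInfo octets.
function signatureValue(string $canonicalSignedInfo, string $key): string
{
    return base64_encode(hash_hmac('sha1', $canonicalSignedInfo, $key, true));
}

// Hypothetical helper: recompute the value and compare it in constant time.
function verifySignatureValue(string $canonicalSignedInfo, string $key, string $expected): bool
{
    return hash_equals(signatureValue($canonicalSignedInfo, $key), $expected);
}

// Illustrative stand-in only; real input would be the canonical form of SignedInfo.
$signedInfo = '<SignedInfo>illustrative canonical octets</SignedInfo>';
$sig = signatureValue($signedInfo, 'secret');

var_dump(verifySignatureValue($signedInfo, 'secret', $sig));       // bool(true)
var_dump(verifySignatureValue($signedInfo . ' ', 'secret', $sig)); // bool(false)
```

The second check shows why canonicalization matters: a single extra space in the signed octets is enough to make verification fail.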
SignedInfo Element

The SignedInfo element is a container element that provides information regarding how a signature is processed, the location of the data that is signed, and the value for data integrity. This element also accepts an optional Id attribute. Using this attribute allows the element to be referenced by other signatures or objects.

[...]

... The value defines the encoding used within the object. The MimeType and Encoding attributes are purely informational. It is completely up to the application whether they need to be used.

Creating a Signature

Creating a signature involves generating the Reference and SignatureValue elements. The rest of the information within a...

[...]

... TR/xmlenc-core/) specifies a process for encrypting data and representing the result in XML. The XML Encryption Requirements specification (http://www.w3.org/TR/xml-encryption-req/) specifies the requirements for implementing XML encryption.

Encryption Granularity

You can use XML encryption to encrypt virtually any type of data. This includes both XML and non-XML-based data. Just like XML signatures, the data...

[...]

... examine the XML document and know the type of payment being made without needing access to any of the encrypted information. For example: 1001 Joe Smith ...

[...]

... does need to be an XML document or have anything to do with XML. XML encryption is just a process and standard structure for encrypting some data, packing it up as a standard structure, and possibly providing some information about the type of encryption and data used.

Super Encryption

Data that is encrypted more than once is called super encryption. It's possible you are a security zealot and are trying...
... encrypted and possibly what type of data was encrypted. Using the loaded document, the first step is to locate the EncryptedData element and determine the algorithm, KeyInfo element, and parameters to use:

    $xpath = new DOMXPath($encdom);
    $query = "//*[local-name()='EncryptedData' and " .
             "namespace-uri()='http://www.w3.org/2001/04/xmlenc#']";
    ...

[...]

    ... an XML declaration output */
    $canonical = $dom->saveXML($copyInfo, LIBXML_NOEMPTYTAG);

    /* Calculate HMAC SHA1 */
    $hmac = hmac($key, $canonical);
    print $hmac."\n";
    $bhmac = base64_encode(pack("H*", $hmac));

    /* Handle whitespaces for presentation layout */
    $addPrev = NULL;
    $addPost = NULL;
    if ($Object->previousSibling->nodeType == XML_TEXT_NODE)...

[...]

... The following steps are taken from the XML Encryption Syntax and Processing specification (http://www.w3.org/TR/xmlenc-core/) for those needing to perform more advanced encryption than that shown in this chapter:

1. Select the algorithm (and parameters) to be used in encrypting this data.

2. Obtain and (optionally) represent the key a...

[...]

... cleaned up and formatted for presentational purposes.) 1001 Joe Smith ...

[...]

... in handy.

Introducing XML Encryption

XML signatures only get you so far. They provide the mechanisms to verify the integrity and authenticity of the data, but the data is still, in most cases, in plain text. The W3C has defined some specifications in order for systems to implement a common format to perform XML encryption. The XML Encryption Syntax and Processing specification (http://www.w3.org/TR/xmlenc-core/)...
... algorithms, key systems, and more complex structures using the specifications as a reference.

The next chapter is a break in the examination of XML technologies and provides an introduction to PEAR and some of the XML packages it offers. The packages provide functionality written in PHP that can be used rather than having to write your own custom code for many of the common XML needs and technologies.

[...]

... steps here are to generate the DigestValue and generate the actual signature.

Generating the Reference

According to the...

Date posted: 12/08/2014, 13:21
