XML in 60 Minutes a Day, Part 6

CHAPTER 9

XML Transformations

This chapter focuses on transforming XML documents for output, but not in the same way that Chapter 7, “XML and Cascading Style Sheets,” did. While cascading style sheets add visual style to an XML document for its eventual display, the transformations in Chapter 9 prepare XML data for further processing. These transformations use the XSL Transformations (XSLT) language, which is one component of the Extensible Stylesheet Language (XSL) family: XSL, XSLT, and XPath.

In a single introductory-level chapter like this, we can only scratch the surface of XML transformations. But we’ll show you the basics by discussing why transformations are necessary and explaining the operational model of a transformation: how a transformation parser operates on a source XML document, according to instructions in a specific style sheet, to create a target document. We take you step by step through a simple transformation to introduce you to some of the considerations, concepts, components, and syntax involved. In the lab exercises, you will install and configure TIBCO Software Inc.’s transformation application, XMLTransform, and then use it to do similar transformations.

Why Transform XML Data?

More and more XML vocabularies and documents are being developed by organizations within common industries, by individual organizations, and by individuals themselves. They are drawn to XML by its capability to represent data with unique and arbitrary element type names, its structuring capabilities, and its human-readable nature.
But because several data standards were already in existence when XML came along, and several XML-related data standards have been developed since XML appeared, two general data compatibility problems have arisen: how to get XML to fit in with the existing non-XML standards, and how to develop some level of compatibility among the XML-related vocabularies and data.

Although XML may present an effective format for structuring data, by itself it isn’t a data-related panacea. It still has to get along with various databases, provide data for publishing tools, and cooperate with voice and video applications. At times, its documents must be expanded, reduced, reordered, and otherwise modified to meet many data challenges. Thus, wherever it comes from and whatever standards it meets when it’s created, XML data can’t always be used in its original form. It has to be transformed into another XML or non-XML format first.

This is especially true as XML strives to meet the demands of the world of commerce and e-commerce. As more businesses link to their customers, clients, and other partners (or as departments within individual organizations are linked), the need arises to exchange information and conduct transactions online. These businesses create even more demands for data conversion. Take, for example, invoices. Invoices can be presented on a screen or printed, but they can also be used to “feed” applications pertaining to inventory, shipping, accounting, and even tax preparation. All or part of the data from a single XML invoice document might wind up as comma-delimited values in a database file, as part of an SQL script or HTTP message, or combined with a sequence of calls on a particular programming interface.
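To make the invoice scenario concrete, here is a minimal sketch of one such conversion: turning invoice items into comma-delimited lines. It uses Python's standard library rather than XSLT, and the invoice vocabulary (invoice, item, sku, quantity, price, and the number attribute) is invented for illustration; a real invoice format would differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical invoice document: the element and attribute names below are
# invented for this sketch and do not come from any real invoice vocabulary.
INVOICE_XML = """\
<invoice number="1001">
  <item><sku>GEM-041</sku><quantity>1</quantity><price>82000</price></item>
  <item><sku>GEM-105</sku><quantity>2</quantity><price>126000</price></item>
</invoice>
"""


def invoice_to_csv(xml_text):
    """Return a header line followed by one comma-delimited line per <item>."""
    invoice = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["invoice", "sku", "quantity", "price"])
    for item in invoice.findall("item"):
        writer.writerow([
            invoice.get("number"),      # attribute on the root element
            item.findtext("sku"),       # text of each child element
            item.findtext("quantity"),
            item.findtext("price"),
        ])
    return buffer.getvalue()


CSV_TEXT = invoice_to_csv(INVOICE_XML)
print(CSV_TEXT, end="")
```

The same extract-and-reshape idea is what an XSLT style sheet expresses declaratively; the sketch only shows the shape of the problem, not how the chapter's tools solve it.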
Converting the data involves several related and important activities: finding the raw data, extracting what is needed, converting it to a form that is useful to another party, and transmitting it to that party so that they can further manipulate it (add to it, subtract from it, add it to databases, distribute it further, or display it).

Just as increasing pressures exist to easily share and transmit information among organizations, pressures also arise to do so without having to create or purchase proprietary or otherwise customized software. The XML development community has responded to some extent by creating the Extensible Stylesheet Language family of languages, and especially the XSL Transformations language, the primary subject of this chapter. These languages provide mechanisms for other XML developers to mine their XML data and modify it so that it can be used to the benefit of the rest of the connected world.

Converting XML to HTML for display is very common. At present, it may be the most common application of XML transformation. Consequently, this aspect of XML transformation is discussed in the text and lab exercises here. We show you how to perform transformations and how to display the transformed data with your browser.

The W3C and Transformations

We will focus primarily on XML document transformations using XSLT, which is one component of a trio of XML-related languages:

■■ Extensible Stylesheet Language (XSL)
■■ XSL Transformations (XSLT)
■■ XML Path Language (XPath)

Let’s briefly discuss the development history of XSL, XSLT, and XPath.

The Extensible Stylesheet Language (XSL)

Because of its influence on the languages we will be using, it’s important to know something of the origins of the Extensible Stylesheet Language. XSL hasn’t always been the XSL it started out to be.
The original XSL proposal was drafted and submitted to the W3C in 1997, and a W3C XSL Working Group was formed just prior to the February 1998 endorsement of the first edition of the XML Recommendation.

XSL’s developers originally thought XSL would be a platform- and media-independent formatting language composed of two parts: a formatting language and a transformation language. The formatting language would be a set of descriptive XML elements called formatting objects that would describe the various parts of page media, such as tables, headers, footnotes, and so on. The transformation language, in turn, would convert the structure and components (elements, attributes, and so on) found in one source XML document into a new structure (a result tree) consisting of those formatting objects, perhaps even in new and different target documents.

However, during XSL’s development, the original XSL concept evolved, and three separate XML-related programming languages developed:

XSL Formatting Objects (XSL or XSL-FO). The XML vocabulary for specifying formatting semantics.
XSL Transformations (XSLT). The language for transforming XML documents.
XML Path Language (XPath). An expression language used to access or refer to parts of an XML document.

XSLT 1.0 and XPath 1.0 became W3C Recommendations in November 1999, but for technical and nontechnical reasons, the (modern) Extensible Stylesheet Language (XSL) 1.0 specification wasn’t fully developed and endorsed as a W3C Recommendation until October 2001. XSL shares functionality and is compatible with the latest versions of CSS, although it uses a different syntax. But XSL also adds advanced styling features in the areas of pagination and scrolling, result tree construction, page layout, display areas, internationalization, and linking. The XSL-FO vocabulary was designed so that data could be displayed with a wide variety of media: on-screen, hard copy, or voice.
For further information on XSL, start at the W3C’s XSL Web site at www.w3.org/Style/XSL/WhatIsXSL.html

XSL Parsers

An XSL/XSLT parser (the two functions are often combined in a single application) takes an XML document and an XSL style sheet and produces a rendering of the document. XSL and XSLT processors are readily available. Some processors are standalone; others can be integrated with integrated development environments. You can find several by checking these Web sites:

■■ The W3C Web site at www.w3.org/Style/XSL/
■■ The XSL Implementations page at the Open Directory Project Web site at http://dmoz.org/Computers/Data_Formats/Markup_Languages/XML/Style_Sheets/XSL/Implementations/
■■ The software library Web page of “The XML Cover Pages–Extensible Stylesheet Language (XSL)” at http://xml.coverpages.org/xslSoftware.html

In the lab exercises, you will download and install TIBCO Software Inc.’s application named XMLTransform, which is part of their TIBCO Extensibility platform. It is a standalone XSL processor that doesn’t require an integrated development environment.

The XSL Transformation Language (XSLT)

XSLT is the language we’ll use for the actual XML document transformations in this chapter. XSLT is designed for transforming one XML document into another (or into HTML), and it uses its own kind of style sheet to do so. But don’t confuse XSLT style sheets with cascading style sheets. Cascading style sheets concentrate on how data is displayed. XSLT style sheets actually change the structure and type of XML data. They can add, subtract, duplicate, and sort nodes (elements, attributes, text, processing instructions, namespaces, comments, and other components). XSLT style sheets, therefore, have a vocabulary and structure different from CSS. XSL and XSLT use XML notation, whereas, as we saw in Chapter 7, CSS uses its own vocabulary.
XSLT style sheets can transform one XML document into another XML document, one using an XML vocabulary different from the original. XSLT is often used as a general-purpose XML processing language, independent of XSL, to create HTML Web pages, other text formats, audio and video presentations, and database input from XML data.

Although they are quite different, and CSS is more appropriate for some tasks, XSL, XSLT, and CSS can also be used together. For example, XSL/XSLT can be used to transform XML data from a source document to a target document, and then CSS can be used to style the resulting target document data.

Like CSS, XSLT is different from conventional programming languages because it is a declarative language that uses template rules to specify how XML documents should be processed. Unlike conventional programming languages, which are sequential, these declarative template rules can occur in any order. Like XPath, XSLT considers documents to be composed of nodes in a treelike structure. Its style sheets declare what output should be produced when the parser matches a pattern in a given source XML document.

At this writing, the XSL Working Group has generated Working Draft documents for the XSLT 2.0 and XPath 2.0 Recommendations. For information regarding the new proposals, check them out at www.w3.org/TR/xslt20/ and www.w3.org/TR/xpath20/, respectively.

XML Path Language (XPath)

As we discussed in Chapter 8, “XLinks,” the XML Path Language (XPath) is used to find information in an XML document. XPath considers documents to be composed of nodes of various types in a treelike logical structure and so allows us to address parts of an XML document.

In Chapter 8, we used XPath to create links. But XPath is an important component of XML style sheet transformations because it enables us to specify the parts of a document that we want to transform.
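As a rough illustration of addressing parts of a document by path, the sketch below runs three path expressions against a trimmed-down copy of the diamonds list that serves as this chapter's sample document. It is only a sketch under stated assumptions: it uses Python's xml.etree.ElementTree, which implements a limited subset of XPath 1.0 rather than the full language, and the variable names are ours.

```python
import xml.etree.ElementTree as ET

# A shortened version of the chapter's gems list (only three child
# elements per <gem> are kept to save space).
GEMS_XML = """\
<diamonds>
  <gem><name>Cullinan</name><carat>3106</carat><cut>Rough</cut></gem>
  <gem><name>Dark</name><carat>500</carat><cut>Rough</cut></gem>
  <gem><name>Sparkler</name><carat>105</carat><cut>Super Ideal</cut></gem>
  <gem><name>Merlin</name><carat>41</carat><cut>Ideal</cut></gem>
</diamonds>
"""

root = ET.fromstring(GEMS_XML)  # root is the <diamonds> element

# Every <name> that is a child of a <gem> child of the root element.
ALL_NAMES = [n.text for n in root.findall("./gem/name")]

# A predicate narrows the selection: only gems whose <cut> text is "Rough".
ROUGH_NAMES = [n.text for n in root.findall("./gem[cut='Rough']/name")]

# A positional predicate: the <name> of the last <gem>.
LAST_NAME = root.find("./gem[last()]/name").text

print(ALL_NAMES)
print(ROUGH_NAMES)
print(LAST_NAME)
```

In an XSLT style sheet, expressions of this kind appear inside attributes such as match and select, which is one reason the boundary between the two languages can feel blurry.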
Using XPath, we can specify the locations of structures or data in an XML document and then process the information in them with XSLT. In practice, we’ll see that (just as when we applied XPath with XPointer and XLink) it can be difficult to determine where XSLT stops and XPath starts. But with practice, using the two together will become almost second nature.

Sample XML Transformation: Tabulating a List of Diamonds

The best way to discuss XML transformations at an introductory level is to actually do a sample transformation. Throughout the remainder of this chapter, a sample transformation will be examined to illustrate some XSLT transformation concepts, syntax, and structure. The transformation extracts a portion of a list of diamonds currently stored in an XML document called gems1_source.xml. It then displays the extracted portion in a browser in HTML format. You will do the same transformation exercise in Lab 9.3.

Our approach here is to briefly describe the overall process and then examine the source document. After that, we’ll examine the XSLT style sheet in some detail, since that is where the transformation is defined and shaped.

There are two basic phases to a transformation:

Structural transformation. The data is converted from the structure of the incoming source XML document to the structure of the target output.
Formatting. The new structure is output in the required format (examples: markup appropriate for HTML, PDF, DB2, Oracle, or other formats).

Figure 9.1 illustrates a basic XSLT transformation process inside an XSL/XSLT-compatible application. The tornado in the lower portion of the figure represents the two-phase transformation, as defined in the XSLT style sheet. The documents and other terms in the figure will be clarified as the chapter progresses.

Figure 9.1 Basic XSLT transformation process.
[Figure 9.1 labels: Our Application, XML Parser, XSL Parser, Their Application, gems1_source.xml, gems1_xform.xsl, transformation results]

Briefly summarized, that overall process is as follows:

1. An application activates an XML parser and passes it the name of a source XML document, which contains the source nodes in a treelike structure. The application could be an integrated development environment, an industry- or organization-specific application, or some commercial application. In Figure 9.1, the application is called “our application,” and the source document is represented by gems1_source.xml.
2. From references within the source document, the XML parser locates a validating DTD or schema and an XSLT style sheet (represented in the figure by gems1_xform.xsl).
3. The XML parser validates the various documents and passes control to an XSL parser. The XSL parser, using the XSLT style sheet as its guide, performs the specified transformation according to the style template rules in the style sheet and generates the appropriate structure containing the transformation results (also called the result tree). The results may, depending on “our application,” become an actual target file. Regardless, the results will, in turn, be used as a data source by another application (represented by “their application”). It is likely that any subsequent formatting of the data for display, if applicable, will be done by “their application” using cascading style sheets.

The XML Source Document

Have a look at the gems1_source.xml source document in Figure 9.2. It is a well-formed and, we presume, valid XML document. In the figure, we have numbered all the nodes. Attributes and pseudo-attributes have been numbered according to their corresponding prolog statements or element nodes. The root node contains three prolog statement nodes and the root element node named <diamonds>.
The root element, in turn, contains several more nested element nodes, some of which have attribute nodes. By now, you should recognize most of the statements, element types, and attributes in the gems1_source.xml document.

The third node (second line) of the XML document contains a style sheet processing instruction with a type="text/xsl" pseudo-attribute. Here, the parser is told to find and apply an XSL type of style sheet (if the value of type had been specified as text/css, then the processor would have to apply a cascading style sheet). The href="gems1_xform.xsl" pseudo-attribute tells the processor where to look for the style sheet file. The interpretation of this instruction is “look in the same directory in which you found this XML document for an XSLT style sheet document named gems1_xform.xsl.”

Table 9.1 lists six pseudo-attributes that may appear in style sheet processing instructions.

    <?xml version = "1.0" encoding = "UTF-8"?>
    <?xml-stylesheet href = 'gems1_xform.xsl' type = 'text/xsl'?>
    <!DOCTYPE diamonds SYSTEM "gems1.dtd">
    <diamonds>
      <info>To go back to the Home Page,
        <link type = "simple"
              href = "http://localhost/SpaceGems"
              OnClick = "location.href='http://localhost/SpaceGems' ">
          click here.
        </link>
      </info>
      <info>To see our Magical Gems,
        <link xmlns:xlink = "http://www.w3.org/1999/xlink"
              xlink:type = "simple"
              xlink:href = "magicgems.xml"
              xlink:actuate = "onRequest"
              xlink:show = "new"
              xlink:title = "To Magical Gems and Spells">
          Magic Gems</link>
      </info>
      <gem>
        <name>Cullinan</name>
        <carat>3106</carat>
        <color>H</color>
        <clarity>VS1,VS2-Very Slightly Imperfect</clarity>
        <cut>Rough</cut>
        <cost>2174200</cost>
      </gem>
      <gem>
        <name>Dark</name>
        <carat>500</carat>
        <color>J</color>
        <clarity>SL1,SL2-Slightly Imperfect</clarity>
        <cut>Rough</cut>
        <cost>450000</cost>
      </gem>
      <gem>
        <name>Sparkler</name>
        <carat>105</carat>
        <color>F</color>
        <clarity>IF-Internally Flawless</clarity>
        <cut>Super Ideal</cut>
        <cost>126000</cost>
      </gem>
      <gem>
        <name>Merlin</name>
        <carat>41</carat>
        <color>D</color>
        <clarity>FL-Flawless</clarity>
        <cut>Ideal</cut>
        <cost>82000</cost>
      </gem>
    </diamonds>

Figure 9.2 The XML source document.

[The node-number callouts from the printed figure (1 through 37, plus 2A-2B, 3A-3B, 7A-7C, and 9A-9F for pseudo-attributes and attributes) are not reproduced here.]

Table 9.1 Pseudo-Attributes Used in <?xml-stylesheet ?> Processing Instructions

PSEUDO-ATTRIBUTE    EXPLANATION
alternate           “Yes” or “no”; default is “no”
charset             Optional; the character set pertaining to the style sheet
href                Required; indicates the location of the style sheet; format is URI
media               Optional; indicates the type of target medium/media
title               Optional; names the style sheet
type                Required; indicates the kind of style sheet (for example, text/xsl indicates an XSL style sheet; text/css indicates a cascading style sheet)

An XSLT style sheet can also be embedded in an XML source document.
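Because pseudo-attributes such as href and type are not true XML attributes, a generic parser hands the whole processing instruction back as one string, and the application has to split it. The following is a minimal sketch of that step, assuming Python's xml.dom.minidom and a regular expression of our own devising; the function name stylesheet_links is hypothetical, and the sample omits the DOCTYPE line and most elements to stay short.

```python
import re
from xml.dom import minidom

# Shortened copy of the sample document's prolog and root element.
SOURCE = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href = 'gems1_xform.xsl' type = 'text/xsl'?>
<diamonds><gem><name>Merlin</name></gem></diamonds>
"""

# name = "value" or name = 'value', with optional spaces around "=".
PSEUDO_ATTR = re.compile(r"""([\w:.-]+)\s*=\s*(["'])(.*?)\2""")


def stylesheet_links(xml_bytes):
    """Return one dict of pseudo-attributes per xml-stylesheet instruction."""
    doc = minidom.parseString(xml_bytes)
    links = []
    for node in doc.childNodes:  # prolog nodes are children of the document
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            pairs = PSEUDO_ATTR.findall(node.data)
            links.append({name: value for name, _quote, value in pairs})
    return links


LINKS = stylesheet_links(SOURCE)
print(LINKS)
```

An application could then check the type value to decide between an XSL processor and a CSS renderer, and resolve the href value relative to the location of the source document, as described above.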
If the style sheet is embedded, then the style sheet declaration in the source document is similar to:

    <?xml-stylesheet type="text/xml" href="#stylesheetIdName" ?>

The following element should appear later in the document, and the style sheet components would follow:

    <xsl:stylesheet id="stylesheetIdName" >

Figure 9.3 depicts the nodal structure of the gems1_source.xml document. The source tree is presented here so that, once we’ve reviewed the transformation, you can compare the source tree with the result tree. Source and target trees are valuable design tools for planning transformations and valuable result-checking tools. This source tree is not just an element tree; it shows not only the elements in gems1_source.xml but other types of nodes as well, such as attributes and declarations. We suggest creating nodal structure diagrams for all documents you want to transform. You can make them as simple or as complex as you want. For example, empty elements (there aren’t any in this case) might be in different types of containers, or we could have indicated which elements contain text and which have other entities (to keep things simple here, we stayed with text only). In Figure 9.3, we included node numbers in the diagram that correspond to the numbers in the source document; attribute and pseudo-attribute numbers have been grayed.

[...]

... declares the default decimal-format

xsl:namespace-alias    Declares that a namespace URI is an alias for another namespace URI.
xsl:attribute-set      Defines a named set of attributes. A following name attribute specifies the name of the attribute set.

Table 9.2 (continued)

ELEMENT NAME           EXPLANATION
xsl:variable           One of two elements used to bind variables (the other is xsl:param). Adds...
Adds a name attribute and specifies a parsed character data-related name as a value for it. That specified value becomes a variable name that can thereafter be combined with other specifications (for example, element names) to search for data or to create display specifications. For more details and examples, refer to www.w3.org/TR/xslt. For the difference, see xsl:param, below.

xsl:param              Binds variables...

Figure 9.6 Splash screen for XMLTransform

Lab 9.2: XML-to-XML Transformation

Using TIBCO’s XMLTransform software, you will transform data from one XML format to another. This lab simulates a very typical scenario, where an XML data instance needs to be transformed into a different format. This is common when systems or vendors have to exchange information, as when Vendor A has the necessary...

... that this file has some data inside its elements.
4. Without making any changes, exit the file.
5. Using Notepad again, open the vendorB.xml file. This file has no data inside the elements. This is the target file for the reformatted data.
6. Without making any changes, exit this file, too.
7. Start XMLTransform. Click Start, Programs, XMLTransform 1.1.0.
8. Click Continue Trial, if necessary...

... end tag, will appear in the output. This is a pretty far-reaching rule, in this case, because what falls between those node 5 tags is an entire HTML document!

Classroom Q & A

Q: In the W3C XSLT 1.0 Recommendation and in other XML books, I’ve seen that template rule start tags can contain name attributes, but the explanations aren’t very good. When do we use name="value"...
software.
2. Retrieve the TIBCO email message and follow the software downloading instructions in it.
3. Install the software; accept all the suggested defaults.
4. Start the TIBCO XMLTransform tool:
   a. Click Start, Programs, XMLTransform 1.1.0.
   b. To initialize the product, enter the information TIBCO sent you in the email message.
   c. If the image shown in Figure 9.6 appears, you are ready to move on to Lab 9.2...

... vendorB.xml file and call the resulting file vendorB_T2.xml.

Lab 9.3: Simple XML-to-HTML Transformation

In this lab, you will take information from inside the elements of the XML instance file, arrange it inside an HTML table, and display it in a browser. Unlike the previous lab, you will be required to do some manual coding. Perform these steps:

1. Open XMLTransform: Start, All Programs, XMLTransform...

... Click Create Transform. A workspace similar to Figure 9.7 should appear.
10. At this point, prior to opening any files, review the XMLTransform workspace. Familiarize yourself with the workspace by observing the following objects within it:
   a. Note that there are three panes in the top frame called Input, Graph, and Output. Most of the work takes place here.
   b. Note that both the Input and Output panes allow...

... right-hand side of the Web page is a column with a heading that says “Free Trial Downloads.” Click the XMLTransform XML Mapping and Transformation Solution link.
   e. Click Try on the top of the page.
   f. Fill out the required information on the form and click Submit. After you click the Submit button, TIBCO will send you an email with all the necessary information for you to install and initialize their XMLTransform...

... name="value" in the start tag?
A: Template rules themselves can be given names of their own and then later be invoked by those names. For those template rules, the respective elements are given name attributes that specify the name of the template. The value specified for the name attribute is a qualified name. If such a template rule’s tag contains a name attribute, ...

Posted: 14/08/2014, 12:20
