Hands-On Microsoft SQL Server 2008 Integration Services, part 19


Chapter 5: Integration Services Control Flow Tasks

Exercise (Configure File System Task)

In this final part, you will configure the File System task to move the downloaded zipped files from the C:\SSIS\downloads folder to the C:\SSIS\downloads\Archive folder. The task moves the files one at a time, one per iteration of the Foreach Loop Container. It uses the variable User::fname, populated by the Foreach Loop Container, to determine the source filename.

[Figure 5-6: Configuring the Foreach Loop Container to enumerate fully qualified filenames]

8. Drag the File System task from the Toolbox and drop it inside the Enumerating Files Container.
9. Double-click the File System Task icon to open the File System Task Editor. Select Move File in the Operation field. In the General area in the right pane, fill in the following details:
   Name: Archive downloaded files
   Description: This task copies downloaded files from the ‘downloads’ folder to the ‘Archive’ folder.
10. In the Source Connection section, set IsSourcePathVariable to True.
11. Click in the SourceVariable field and then click the down arrow to see the drop-down list. Choose User::fname as shown in Figure 5-7.
12. In the Destination Connection section, verify that IsDestinationPathVariable is set to False.
13. Click in the DestinationConnection field and then click the down arrow to see the drop-down list. Choose <New Connection…> to open the File Connection Manager Editor. In the Usage type field, select Existing Folder and type C:\SSIS\downloads\Archive in the Folder field. Note that a File Connection Manager named Archive has been created.
14. The OverwriteDestination field lets the task overwrite files that have the same name in the destination folder. Be careful when configuring this option in a production environment. Leave it set at the default value of False. Click OK to close the File System Task Editor.
15. Now that your package is ready to run, press F5 to run it. Notice how the Enumerating Files Container turns yellow, followed by the Archive Downloaded Files task changing from yellow to green. This cycle is repeated twice before both objects stop processing and turn green to indicate that the operation succeeded. Each time the Archive Downloaded Files task changes from yellow to green, one file has been moved. Stop debugging the package by pressing Shift-F5.
16. Open Windows Explorer and check that the files have been moved from the C:\SSIS\downloads folder to its Archive subfolder.
17. Press Ctrl-Shift-S to save all the files in this solution and then choose File | Close Project.

[Figure 5-7: Configuring the File System task for moving files]

Review

You have configured the Foreach Loop Container to enumerate the files in a folder and pass the filenames via a variable to the File System task. The variable passed by the Foreach Loop Container was used to set the source filename in the File System task, which was configured to move files from a dynamic source to the hard-coded destination Archive folder. In this exercise, you have seen how SSIS components can run in synchronization: one component read the files one by one and passed each filename to the other component, which moved the file to a different folder as it received the filename from the parent container.

Web Service Task

You can read data from a Web Service method and write that data to a variable or a file using the Web Service task. For example, you can obtain a list of postal codes from the local postal company, write it to a flat file using the Web Service task, and then do lookups against this postal codes file to clean or standardize your data at loading time. The Web Service task uses the HTTP Connection Manager to connect to the web service.
The HTTP Connection Manager specifies the server URL, user credentials, optional client certificate details, time-out length, proxy settings, and so on.

The Web Services Description Language (WSDL) is an XML-based language used for defining web services in a WSDL file, which lists the methods that the web service offers, the input parameters that the methods require, the responses that the methods return, and how to communicate with the web service. Thus, the Web Service task requires a WSDL file to get the details it needs to communicate with a web service. In the Server URL field, the HTTP Connection Manager can specify either a web site URL or a WSDL file URL. If you specify the WSDL file URL, the computer can download the WSDL file automatically. However, if you specify the web site URL, you must copy the WSDL file to the local computer.

XML Task

Whenever you work with XML data, you will most likely use the XML task to perform operations on the XML documents. This task is designed to work with XML documents from the workflow point of view. If instead you want to bring XML data (that is, the content of an XML document) into the data flow to apply transformations, you will use the XML Source adapter while configuring your Data Flow task. The XML Source adapter is available in the Data Flow Sources section of the Toolbox when you are working with the Data Flow task on the Data Flow panel. Using the XML task, you can perform the following operations on XML documents:

1. Retrieve XML documents and dynamically modify those documents at run time.
2. Select a segment of the data from the XML document using XPath expressions, similar to how you select data using an SQL query against database tables.
3. Transform an XML document using XSLT (extensible stylesheet language transformations) style sheets and make it compatible with your application database.
4. Merge multiple documents to make one comprehensive document at run time and use it to create reports.
5. Validate an XML document against the specified schema definition.
6. Compare an XML document against another XML document.

The XML task can automatically retrieve a source XML document from a specified location. To perform this operation, the XML task can use a File Connection Manager, though you can also enter XML data directly in the task or specify a variable to access the XML file. If the XML task is configured to use a File Connection Manager, the connection string specified inside the File Connection Manager provides the path of the XML file; if the XML task is configured to use a variable, the specified variable contains the path to the XML document. At run time, other processes or tasks in the package can dynamically populate this variable.

As with the retrieval of XML documents, the XML task can save the result of the defined operation to a variable or a file. To write to a file, the XML task uses a File Connection Manager.

The XML Task Editor has a dynamic configuration interface that changes depending on the type of operation you choose to apply to the XML documents. The configuration areas are described next.

Input Section

As mentioned, the XML task can retrieve the source document that is specified under the Input section in the XML Task Editor. You can choose from three available SourceType options:

• Direct input: allows you to type XML data directly in the Source field.
• File connection: allows you to specify a File Connection Manager in the Source field.
• Variable: allows you to specify a variable name in the Source field.

Second Operand Section

This section defines the second document required for the operation to be performed. The type of second document depends on the type of operation. For example, the second document will be an XML document if you are merging two documents, while it will be an XSD (XML Schema Definition) document if you are validating an XML document against an XSD schema. As in the Input section, you can choose among three types (Direct input, File connection, and Variable) in the SecondOperandType field and, based on your choice, specify the document details in the SecondOperand field.

Output Section

In this section, you specify whether you want to save the results of the operation performed by the XML task. You can save the results to a variable or to a file, using a File Connection Manager to specify the destination file. You can also choose to overwrite the destination.

Operation Options Section

This section is dynamic and changes with the operation selected. For example, for a Diff operation, this section changes to the Diff Options section (see Figure 5-8), and for a Merge operation, it becomes the Merge Options section with the fields relevant to that operation. The XSLT and Patch operations do not have this section at all.

The XML task has six predefined operations for you to use. The configuration layout of the options changes as soon as you select a different operation.

Validate

You can validate the XML document against a Document Type Definition (DTD) or XML Schema Definition (XSD) schema. The XML document you want to validate is specified in the Input section of the Editor, and the schema document is specified in the Second Operand section. The type of schema document depends on what you specify for ValidationType: XSD or DTD. With either ValidationType, you can choose to fail the operation on a validation failure by using the FailOnValidationFail field.
[Figure 5-8: The XML Task Editor]

XSLT

You can perform XSL transformations on XML documents using XSLT style sheets. The Second Operand should contain the reference to the XSLT document, which you can type directly into the field or specify by using either the File Connection Manager or a variable.

XPATH

Using this operation, you can perform XPath queries and evaluations on the XML document. The Second Operand should contain a reference to the second XML document, which you can type directly into the field or specify by using either the File Connection Manager or a variable. You select the type of XPath operation in the XPathOperation field, which provides three options:

• Evaluation: Return the results of an XPath function such as sum().
• Node list: Return the selected nodes as an XML fragment.
• Values: Return the text values of all the selected nodes as a concatenated string.

Merge

Using this operation, you can merge two XML documents. This operation adds the contents of the document specified in the Second Operand section into the source document. The operation can specify a merge location within the base document. Note that the XML task merges only documents that have Unicode encoding. To determine whether your documents use Unicode encoding, open the XML document in any editor, such as Notepad, and look at the beginning of the document for encoding="UTF-8" in the declaration statement. UTF-8 indicates the 8-bit Unicode encoding.

Diff

Using this operation, you can compare the source XML document to the document specified in the Second Operand section and write the differences to an XML document called a Diffgram document. The Diff operation provides a number of options to customize this comparison:

• DiffAlgorithm: Provides three choices: Auto, Fast, and Precise. You can choose whether the comparison algorithm is fast or precise. The Auto option lets the Diff operation decide whether to use a fast or a precise comparison based on the size of the documents being compared.
• IgnoreComments: Specifies whether comment nodes are compared.
• IgnoreNamespaces: Indicates whether the namespace URI (uniform resource identifier) of an element and its attribute names are compared.
• IgnorePrefixes: Specifies whether prefixes of element and attribute names are compared.
• IgnoreXMLDeclaration: Specifies whether the XML declarations are compared.
• IgnoreOrderOfChildElements: XML documents have a hierarchical structure, and this option specifies whether the order of child elements is compared.
• IgnoreWhiteSpaces: Specifies whether white spaces are compared.
• IgnoreProcessingInstructions: Specifies whether the processing instructions are compared.
• IgnoreDTD: Specifies whether the DTD is ignored.
• FailOnDifference: Specifies whether the task fails if the Diff operation fails, e.g., an XML document fails to validate according to the validation schema.
• SaveDiffGram: Choose to save the comparison result in a Diffgram document.

Patch

Using this operation, you can apply the Diffgram document you saved earlier in the package during the Diff operation to an XML document. By doing this, you create a new XML document that includes the contents of the Diffgram document created earlier by the Diff operation.

Execute SQL Task

The Execute SQL task is the main workhorse task for running SQL statements or stored procedures, and it uses the power of the underlying relational database. If you have used DTS, you may have used this task. Typically in DTS, once you have loaded data into a database, you apply transformations using the Execute SQL task. These transformations vary from generating salutations to lookup transformations, deriving columns, or applying business rules using the SQL Server relational engine.
The design philosophy used in SQL Server 2008 Integration Services allows you to perform many of these tasks during the loading phase while data is still in memory, thereby increasing performance by reducing the repeated and inefficient transformations that require data to be staged or that involve read/write operations on hard disks. The power of the Execute SQL task is still available in SSIS, in a more usable form: you can use variables, create expressions over the properties of the task, and return a result set to the control flow that can be used to populate a variable.

Using the Execute SQL task, you can perform workflow tasks such as creating, altering, dropping, or truncating tables or views. You can run a stored procedure and store the result in a variable to be used later in the package. You can use the task to run a single SQL statement, multiple SQL statements, or parameterized SQL statements, and to save the rowset returned from the query into a variable. You have already used this task in the “Using System Variables to Create Custom Logs” Hands-On exercise in Chapter 3, where you used a parameterized SQL statement, and in the “Contacting Opportunities” Hands-On exercise in Chapter 4, where you saved the resulting rowset to a variable, which was then enumerated by a Foreach Loop Container.

If you scroll to the Maintenance Plan Tasks in the Control Flow Toolbox, you will see a similar task, the Execute T-SQL Statement task. The Execute T-SQL Statement task has a simpler interface than the Execute SQL task and is focused on performing maintenance tasks on SQL Server databases using T-SQL. It doesn’t give you any facility to run parameterized queries or to direct the result set to the workflow. The Execute SQL task, by contrast, has a more complex interface and is designed for use in a relatively complex workflow where you need to run SQL statements against not only SQL Server but a variety of sources, deal with variables, run parameterized queries, or direct the result set to the workflow. Keep this task open in front of you and try various selections as we go through each option, because the task contains dynamic fields that change depending on the choices you make in certain fields. The Execute SQL Task Editor includes the General, Parameter Mapping, Result Set, and Expressions pages.

General Page

In this page, you define a Name and Description for the task under the General section. In the Options section, you can specify a TimeOut value, in seconds, for the query to run before timing out. By default, the TimeOut value is 0 (zero), indicating an infinite time. The CodePage field allows you to specify the code page value.

In the Result Set section, you choose one of four options based on the result set returned by the SQL statement you specify in this task. Depending on the type of SQL statement (that is, whether it is a SELECT statement or an INSERT/UPDATE/DELETE statement), a result set may or may not be returned. Also, the result set may contain zero rows, one row, or many rows. The following four options in the ResultSet field of the Execute SQL Task Editor allow you to configure these cases:

• None: Use this value when you run an INSERT, UPDATE, or DELETE statement, which returns a result set containing zero rows.
• Single row: Use when the SQL statement or stored procedure returns a single row in the result set.
• Full result set: Use when the query returns more than one row.
• XML: Use when the SQL statement returns a result set in XML format.
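The result set shapes described above can be tried outside SSIS. The following is only a rough sketch, assuming Python's built-in sqlite3 module as a stand-in database; the contacts table, its rows, and the variable names are hypothetical and are not part of the book's exercises. It shows a statement that returns no rows, a parameterized query that returns a single row, and a query that returns a full result set kept in a variable for later enumeration.

```python
import sqlite3

# In-memory database used purely for illustration; in SSIS the connection
# would come from a connection manager instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# ResultSet = None: INSERT/UPDATE/DELETE statements return no rows.
conn.executemany(
    "INSERT INTO contacts (name, city) VALUES (?, ?)",
    [("Ann", "Leeds"), ("Bob", "York"), ("Cy", "Leeds")],
)

# ResultSet = Single row: a parameterized query whose result is exactly one row.
single_row = conn.execute(
    "SELECT COUNT(*) FROM contacts WHERE city = ?", ("Leeds",)
).fetchone()
leeds_count = single_row[0]

# ResultSet = Full result set: several rows, stored in a variable so that a
# later step (similar to a Foreach Loop Container) can enumerate them.
full_result = conn.execute("SELECT name, city FROM contacts ORDER BY id").fetchall()
conn.close()

for name, city in full_result:
    print(name, city)
print("Contacts in Leeds:", leeds_count)
```

The `?` placeholders play a role comparable to parameter mapping in the task: the statement text stays fixed while the values are supplied separately.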

In the SQL Statement section, the ConnectionType field offers the EXCEL, OLE DB, ODBC, ADO, ADO.NET, and SQLMOBILE connection manager types for connecting to a data source. Depending on the type of connection manager you have chosen, the Connection field provides a drop-down list of already configured connection managers of the same type, or a <New Connection…> option that lets you add a connection manager of the appropriate type. The interface provided by the <New Connection…> option changes to match the connection manager type specified in the ConnectionType field. Depending on the data source type, you can write a query using an SQL statement in the dialect that the specified data source can parse.

Further, in the SQLSourceType field, you can specify the source from which the SQL statement is read for execution. The selection in the SQLSourceType field dynamically changes the next field, which is coupled to it, to match the SQLSourceType choice. The options available in the SQLSourceType field, and how each affects the coupled field, are explained here:

• Direct input: Allows you to type an SQL statement directly in the task. This changes the coupled field to SQLStatement, which provides an interface in which to type your query.
• File connection: If you have multiple SQL statements written in a file, you can choose this option to enable the Execute SQL task to get the SQL statements from the file. Selecting this option changes the coupled field to FileConnection, which allows you to specify a File Connection Manager to connect to an existing file containing SQL statements.
• Variable: Enables the Execute SQL task to read the SQL statement from a variable. This option changes the coupled field to the SourceVariable field, which provides a drop-down list of all the system and user variables.
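
To make the three SQLSourceType choices concrete, here is a minimal sketch in Python using the built-in sqlite3 module in place of an SSIS connection manager. The helper load_sql, the file name setup.sql, the log table, and the variable name User::sql are all hypothetical names chosen for illustration; none of them is an SSIS API.

```python
import sqlite3
import tempfile
from pathlib import Path


def load_sql(source_type, source, variables=None):
    """Return SQL text for a source type (hypothetical helper, not an SSIS API)."""
    if source_type == "Direct input":
        return source  # the statement itself was typed in
    if source_type == "File connection":
        return Path(source).read_text(encoding="utf-8")  # path to a file of SQL
    if source_type == "Variable":
        return variables[source]  # name of a variable holding the statement
    raise ValueError(f"unknown source type: {source_type}")


with tempfile.TemporaryDirectory() as tmp:
    # A file containing more than one SQL statement.
    script = Path(tmp) / "setup.sql"
    script.write_text(
        "CREATE TABLE log (msg TEXT);\nINSERT INTO log VALUES ('from file');\n",
        encoding="utf-8",
    )
    variables = {"User::sql": "INSERT INTO log VALUES ('from variable')"}

    conn = sqlite3.connect(":memory:")
    # The file may hold several statements, so it is run as a script.
    conn.executescript(load_sql("File connection", str(script)))
    conn.execute(load_sql("Direct input", "INSERT INTO log VALUES ('typed directly')"))
    conn.execute(load_sql("Variable", "User::sql", variables))
    messages = [row[0] for row in conn.execute("SELECT msg FROM log ORDER BY rowid")]
    conn.close()

print(messages)
```

Keeping the statement in a file or a variable means the same task definition can run different SQL without being edited, which is the practical reason to prefer those two options over direct input when the query changes between runs.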

Date posted: 04/07/2014, 15:21
