Database development for the world wide web

Introduction

The Internet is a well-established fact of life, and while a lot of the Internet’s content comes in the form of ‘static’ web pages composed of text files in HTML format and images, our interest is in the methods that we can use to publish data from database tables on web pages. Web sites that derive their content from database files are often referred to as ‘dynamic’ web sites because if any of the data in the source database is changed (new data added, existing data deleted or updated), the next person to access the web site will see these changes immediately.

Dynamic web sites are created by producing programs or scripts that generate the web pages on demand. This may initially seem like a very wasteful use of computer processing power, since there may be long periods during which the data does not change and yet the web server continues to slavishly re-generate web pages that it has generated many times before. In practice, publishing this ‘live’ data on the web is only marginally less efficient than copying existing files of HTML text stored on a disk.

In these tutorials, you will discover the basic mechanisms for creating dynamic web sites, and by following the included practical sessions you will gain practice in the techniques necessary to implement dynamic, data-driven web sites.

Practical Issues

If you intend to follow the practical sessions throughout these tutorials, you will need only a few tools to allow you to do so. The first and most important tool is a Web Server: don’t panic, this does not mean you need to go out and purchase a big, industrial strength computer and start negotiating with an Internet Service Provider to provide you with a T2 connection to the Internet. In fact, web servers tend to come free with many computer operating systems, and operate perfectly well for executing web pages with active content for delivery to a browser on the local machine.
If you have a computer with a copy of Windows 98, Millennium, 2000 or XP, you will almost certainly have the software necessary to set up a web server. For the practical exercises described here, you won’t even need an internet connection. If you have Windows 95, you can download a copy of Personal Web Server from the Microsoft web site (although of course you do need a connection to the web to do this). If you are running a computer installed with Linux, or an Apple Macintosh, you will also have web server software either already installed or easily available, although it is most likely that the practical sessions described here will not be compatible with your specific system – you will have to access one of the web sites specific to your own computer, operating system and web server to find similar support material.

Web Sites and Web Pages

The Request-Response structure

A Website is a collection of files stored on a computer that is connected to the Internet and organised in a way that allows them to be accessed by any other computer that is connected to the Internet and running a Web Browser. The World-Wide-Web was invented by Tim Berners-Lee, a physics graduate working at CERN, the European particle physics laboratory located near Geneva in Switzerland. It was devised as a way for computers connected to the Internet to share documents easily, and is based on the idea of Hypertext – textual documents which are interlinked by a system of document references within them. Berners-Lee devised the Hypertext Transfer Protocol (HTTP) and the Hypertext Markup Language (HTML) that became the core format for every website.

HTTP is a simple system which is understood by web servers. When a request for a document arrives at a web server, the server responds by sending the document back to the address that issued the request – a very simple client-server arrangement not unlike the way database servers respond to SQL.
From the client end, the request incorporates the address of the web server and the name and location (actually, a virtual location) of the document on the server. When this request is issued, a large database distributed across the Internet (the Domain Name System or DNS) is used to work out the actual Internet address of the server (a numeric address that locates it uniquely among all of the networked computers in the world) and the request is directed there.

Figure 1: Request-Response. The Internet Address of a web page or media file is issued at a browser and received by the web server. The HTML text of the web page is returned to the browser.

Typically, the request comes from a Web Browser running on some computer, and is for a specific web page or resource, for example, www.BigBusiness.com/Documents/SomeDocument.html. In this case, the DNS would look up the Internet address for www.BigBusiness.com and the request for Documents/SomeDocument.html would be passed to the server at that address. The very clever part (alluded to by the word Hypertext in the acronym HTML) is that the document can contain text marked up in a special format understood by web browsers, and in particular, is likely to contain the addresses of other web pages. In this way, web pages contain links to other web pages, which may also contain links to other web pages and so on. The World-Wide-Web is a hugely interconnected network of web pages that refer to each other.

This request-response organisation makes it trivially easy to create a web site by simply taking a bunch of documents and embedding inter-links within them at strategic points – a simple example is where an index document contains a list of hyperlinks to other pages on the site, and these pages contain links back to the index. An easy extension to this is that most web browsers can now display graphical content, and a web page can contain hyperlinks to graphics files using the same request-response mechanism.
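As a rough illustration of how a request address divides into the part that DNS resolves and the part handed to the server, here is a minimal sketch. It is written in Python rather than the VBScript used in the practical sessions, simply so that it can be run without a web server; the function name `split_request` is made up for this example.

```python
from urllib.parse import urlsplit


def split_request(address: str) -> tuple[str, str]:
    """Return (server name, document path) for a web address."""
    # urlsplit only recognises the server name when a scheme is present,
    # so add one if the address was typed without it.
    if "://" not in address:
        address = "http://" + address
    parts = urlsplit(address)
    return parts.netloc, parts.path


server, document = split_request("www.BigBusiness.com/Documents/SomeDocument.html")
print("Name looked up through DNS:", server)
print("Document requested from that server:", document)
```

The server name is what DNS turns into a numeric address; the path is only meaningful to the web server that receives the request.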
In fact, a current web browser can cope with a large range of different types of content including audio, animations and video, all retrieved by the same request-response method.

Static Pages, Server-Side Scripting and Client-Side Scripting

Using simply HTML text files and a smattering of other types of file containing graphics or other information, it is easy to build a ‘static’ website that will provide access to information about you, your hobby or your business. The down side of this is that as the details you wish to display change (obvious fundamental things like your age, which changes annually, or the products your company sells, which may change on a daily or weekly basis), you need to edit the files that make up the website to keep the information up to date.

The way around this is to create your website as a dynamic entity: one in which the information published is updated automatically, not by you altering the text that makes up the web pages, but by changes made to the source of the material. For example, if you were to display your age on a web page by simply putting a number in the HTML text, this will never change and so you will need to manually change the HTML text every year to keep it accurate. If, however, you published your age by having it automatically worked out from your date of birth and the current date (something that is made available by just about every computer operating system), then it will always be correct and you will never need to update the page. If you published your company’s list of products not by simply listing them in an HTML file, but by extracting the product data from your company’s inventory database and sending this data back with some formatting to the browser, again the page will always be up to date (or, at least, as up to date as the source database).
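The age example above can be sketched in a few lines. This is an illustrative Python version, not the VBScript used later in the practical sessions; the function names and the sample date of birth are assumptions made for the example.

```python
from datetime import date
from typing import Optional


def age_on(date_of_birth: date, today: date) -> int:
    """Whole years between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Take one year off if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def render_age_page(date_of_birth: date, today: Optional[date] = None) -> str:
    """Generate the page text on demand instead of storing a fixed number."""
    if today is None:
        today = date.today()  # the current date supplied by the operating system
    age = age_on(date_of_birth, today)
    return f"<html><body><p>I am {age} years old.</p></body></html>"


# Hypothetical date of birth, used only to show the call.
print(render_age_page(date(1970, 6, 15)))
```

Because the number is computed each time the page is requested, the stored source never needs editing when a birthday passes.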
This type of publishing, known as server-side scripting, is done by embedding instructions into the web pages stored on the server to calculate or extract the necessary information and format it as necessary before sending it on to the client browser that issued the request. There is a growing variety of ways in which this can be done. Programs can be written to generate HTML from data extracted from a database and connected to the web server, usually by placing them in a special directory on the server that is used to host these CGI (Common Gateway Interface) programs. Alternatively, special ‘script’ statements can be embedded in web pages and executed by the web server as it processes the request for the page. Microsoft’s name for the second method is Active Server Pages (ASP), and it describes the general mechanism where the web server is configured to interpret scripted instructions embedded in web pages. Active Server Pages can be written in a variety of languages that can be ‘plugged-in’ to a web server; e.g. JavaScript, PHP, Python and VBScript. We will use VBScript in the practical sessions, since this language is similar to the Visual Basic for Applications language used in Microsoft Access, and is the standard scripting language for Microsoft web servers.

Figure 2: Active Server pages can generate a ‘live’ response to a web page request

Script can also be created to execute in the client’s browser. This client-side script is an efficient way of doing some of the work, but has limitations that make it not particularly useful for our purposes.
For a start, you cannot publish data from a database that you host on the server by executing a script on the client’s computer; apart from it being an absurd idea (the database is at your end of an internet connection – not the client’s), the browser would not allow such a thing on the basis of security, your web server would need to have been set up to trust scripts run from an anonymous source (a security nightmare), and there would need to be software on either side of the connection capable of maintaining the direct access the client has to your database. For these and other reasons, client-side script is never used to do much more than alter the way that information is presented in the browser, ‘jazz-up’ interactions between the user and the page on the browser or validate information that the user is required to send to the server. Even then, we can never be sure that the browser accessing a web page is capable of allowing script to run, or that if the browser is capable of executing scripts, that the facility has not been disabled. All in all, client-side scripting is a non-starter for publishing data.

Generating a page from a web server

If we have a computer with a web server running on it, for example Microsoft’s Internet Information Server (IIS), we can create a website by simply adding files that contain some HTML text to a directory that the web server is set up to ‘publish’. For example, the following simple file, placed in a folder that was configured to be published by the web server, would be enough to display a static message on a client’s browser:

    <html>
    <head>
    <title>Static Web Publishing Test Page</title>
    </head>
    <body>
    <h1>Hello HTML</h1>
    </body>
    </html>

Listing 1: Hello.html – this is a static html file, which would simply be sent from the server unchanged.

The text in listing 1 is HTML.
Everything enclosed in angle brackets <> is treated by the web browser as ‘mark-up’, special code to indicate something to the browser. The only pieces of non-mark-up text in this listing are the page title (enclosed in <title> </title> tags), and some text to be displayed in a heading style (<h1>… </h1>). This would result in the following display on a web browser (Internet Explorer, or IE) running on any computer connected to the internet:

Figure 3: The Output from Hello.html

Note that the web address (shown above the page in IE’s address bar) is given as http://paddington/DreamHome/Hello.html. The Web Server is hosted on a PC that I call Paddington (it is a laptop and I carry it around in a small suitcase – you figure it out), and I have set up a website on it called DreamHome (after the example database in the book). The file was saved to the folder that this website is hosted in with the name Hello.html.

Of course, this is a static web page. To make a dynamic page, we need to change two things. First, the page needs to be named with an ASP file extension instead of HTML, so that IIS recognises it as a file containing script that is to be executed. Second, the page needs to have some script embedded in it.

    <%@ LANGUAGE = VBScript %>
    <html>
    <head>
    <title>Dynamic Web Publishing Test Page</title>
    </head>
    <body>
    <h1>Hello HTML: the time is <%=Time%></h1>
    </body>
    </html>

Listing 2: Hello.asp – this page contains script code embedded in the html to generate ‘live’ results

Figure 4: The output from Listing 2 – note the display of the time.

The significant difference between listings 1 and 2 is that the code in listing 1 will always display the same information, while the code in listing 2 will display information that changes as the time changes. The script ‘tag’, <%=Time%>, indicates a function that will be executed by the server when the page is sent to the client – in this case, the function returns the current time-of-day.
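To make the idea of a server filling in a script tag more concrete, the following toy sketch replaces each `<%=Name%>` tag in a page with the result of calling a function registered under that name. It is a Python illustration of the general mechanism only, and is not how IIS or its script engine is actually implemented; `render`, `EXPRESSION_TAG` and the `functions` table are names invented for this example.

```python
import re
from datetime import datetime

# Matches tags of the form <%=Name%>, allowing spaces around the name.
EXPRESSION_TAG = re.compile(r"<%=\s*(\w+)\s*%>")


def render(template: str, functions: dict) -> str:
    """Replace every <%=Name%> tag with the result of calling functions[Name]."""
    def substitute(match):
        name = match.group(1)
        return str(functions[name]())  # evaluated at the moment of the request
    return EXPRESSION_TAG.sub(substitute, template)


PAGE = "<h1>Hello HTML: the time is <%=Time%></h1>"

# 'Time' stands in for the function of the same name used in Listing 2.
functions = {"Time": lambda: datetime.now().strftime("%H:%M:%S")}
print(render(PAGE, functions))
```

Calling `render` twice a minute apart gives two different pages from the same stored text, which is the whole difference between Listing 1 and Listing 2.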
Any text in an ASP file that appears within the ‘script tags’, <% and %>, will be directed to the script engine of the server. Provided this text is valid ASP code (i.e. provided it is syntactically correct for the language named at the top of the page – VBScript in this case), the script engine will execute it.

Pages, sessions and applications

A file with an ASP extension will generate a single page of a web site. For a web site to display multiple hyperlinked pages there will need to be a collection of HTML and/or ASP files on the server (it is normal to mix static HTML pages and dynamically generated ASP pages on the same site). The full set of ASP pages is referred to as an ASP Application; all of these pages are designed to operate together to fulfil some specific business purpose.

Within an ASP application, there can be some specially named pages – typically Global.asa and index.htm – which have a specific purpose for a web application. With a normally configured installation of IIS, index.htm is the page that someone who points their web browser to the site without giving a specific page name (e.g. http://paddington/DreamHome) would get to. Typically, this page is the home page of a site and should be the ‘welcome’ page, providing links to key pages in the site and acting as a top-level menu.

Global.asa performs a different function. This page contains a number of blocks of script (subroutines) that will execute automatically when a website is in use, providing code that initializes (sets up) the site when it is first accessed, closes down database connections or does other housekeeping if it has not been accessed for a while, and sets up and manages sessions – sequences of page accesses from a user of the site.

Figure 5: A small active web site, with a mixture of static (htm) pages, active (asp) pages and application pages (index.htm and Global.asa).
The pages link (arrows) to each other to provide navigation through the site and to organize necessary services (e.g. login and credit checking). The pages shown are Index.htm, Global.asa, HomePage.htm, LogIn.htm, Login.asp, Catalogue.asp, ViewOrders.asp, CheckCredit.asp and PlaceOrder.asp.

An ASP-based site typically interacts with many users, effectively doing so concurrently, so that at any one time there may be many different users all interacting with any page within the site. The sequence of pages accessed by each user is regarded as a session, and it is useful to be able to track any one user through all of the pages that they visit, either to gather statistics on the use of the site, or simply to save users from having to repeatedly send the same information to the site from page to page.

Session state

Tracking users through a site brings up one problem that is central to the way that Active Server Page websites are designed. Each request for an ASP page will execute some script. In sequence these ASP pages may, for example, allow a user to log in to the site (to provide them with personalized information, or simply to prevent unauthorized access to restricted facilities), allow them to browse through several pages, perhaps selecting products to add to a shopping cart, and then to continue to a check-out page where the sale can be completed by the user entering credit card and delivery details (see figure 5). The problem is simply that the web application has no ‘memory’ that goes beyond a single page, so there is no simple way to know what stage in a sequence a client is on. Good website design demands this lack of memory, because there is no compulsion for a user who has, say, added some items to a shopping cart to complete the process by going through the checkout page, or even to stay at the same website.
You might think that a way around this would be to keep track of the user as they browse from page to page of the site, gathering information as they go and keeping this information ready for when the next page request arrives. However, as designers of a website, we have no knowledge of, or way of controlling, the number of simultaneous users of the site. A site could at any time have zero, 100 or 1,000,000 current visitors, and if the web server was required to actively track each of these, it could take up a lot of time and memory (or none). For the site to remain efficient regardless of how many current visitors there are, it is necessary for the server NOT to be aware of the sequence of page requests for each and every visitor. However, to keep track of the current stage that each visitor to the site is at, the website must have some way of retaining their current status.

Well-designed web sites are built to be ‘stateless’ – that is, there is no specific storage of data between sequential visits to active server pages. This makes it difficult to track where in a sequence a user of a site is – have they just logged in or are they proceeding to the checkout? ASP allows a number of techniques to be used to track a user through a site, from the deployment of special ‘Session’ objects which are embedded in each page returned to a client so that they can be retrieved on the next request, to the use of ‘Cookies’ – strings that can be stored in the client’s computer from page to page (or even session to session). We can also use a couple of web programmers’ tricks, such as storing session data in variables hidden in web forms to be passed back and forth between server and client.

All of these techniques involve embedding session data in the response to a page request, with the page organised so that the same data will be returned if a link on that page is used.
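The hidden-form-variable trick mentioned above can be sketched as a round trip: the server writes the session data into the page, and reads it back from the next request. This is a minimal Python illustration under stated assumptions; the page name NextPage.asp, the field names `user` and `stage`, and both function names are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs, urlencode


def page_with_state(state: dict[str, str]) -> str:
    """Build a form that carries the session data in hidden fields."""
    hidden_fields = "\n".join(
        f'  <input type="hidden" name="{escape(name)}" value="{escape(value)}">'
        for name, value in state.items()
    )
    return (
        '<form method="post" action="NextPage.asp">\n'
        f"{hidden_fields}\n"
        '  <input type="submit" value="Continue">\n'
        "</form>"
    )


def state_from_request(body: str) -> dict[str, str]:
    """Recover the session data from the name/value pairs sent back."""
    # parse_qs returns a list of values per name; one value per field is assumed.
    return {name: values[0] for name, values in parse_qs(body).items()}


state = {"user": "alice", "stage": "checkout"}
print(page_with_state(state))

# A browser submitting the form would send the fields back in this encoding.
returned_body = urlencode(state)
print(state_from_request(returned_body))
```

Nothing is kept on the server between the two calls; the data travels out inside the page and back inside the next request, which is what lets the server stay stateless.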
Figure 6: Session state is maintained by sending data back and forth between the server and the browser embedded in the request-response data. The envelope here depicts this session data.

All of the methods used to store data between ASP pages are there to save having to store the data at the server, where they could take up valuable space that would limit the number of simultaneous clients that could be serviced. The result is that well-designed web sites are inherently ‘scalable’, meaning that the software imposes no limit on the number of clients that the site can deal with at any one time. Of course there are limits imposed by the hardware the site is running on, the speed of the internet connection that the server has and other factors, but not by the ASP website design.

Application state

ASP applications place demands on a web server, and it would be unwise to create a site that subsequently hogged space on the server whether it was servicing any clients or not. Managing session state goes a long way to reducing the load on a server, but a server could easily fill up with inactive ASP sites because each site must maintain a certain amount of information simply to function. A website typically keeps track of such things as the number of current users, page statistics (how many times each page has been visited) and database connections that can be used by a number of [...]

... value="NONE">

Listing 5: HTML code to produce the form shown in figure 9

All of the controls on the above form are encoded in the web form as , or tags. At the receiving ASP page, the information can be extracted using Request.Form(""), where represents whatever tag name has been given to each of the controls on the form. For example, Request.Form("cboRegion")...
... existing database

The objects contained in ADO are interrelated so that they collaborate on performing data access tasks for us. A typical sequence of operations we might perform to retrieve some data for display from a database might be something like:

1. Connect to a database
2. Open the database (making the data in it accessible)
3. Send a SQL query to the database connection defining the set of information...
4. Using the Recordset that is returned from the connection as the result of the query, step through the current record in it, displaying the contents of each field in the record
5. Try to move to the next row in the Recordset
6. If the next row is not empty, return to step 4
7. Close the Recordset
8. Close the database connection

This sequence of steps indicates something about the structure that relates the...

... used to enhance the way they use your site. For example, you can personalize a page so that it is able to greet the client by name, remind them of their current status or guide them to new information that might have been added since the last time they visited the site. However, the Internet is no longer a medium that you can trust (whether you are a client browsing web sites or a host of a web site); some...

... on the RS.MoveNext statement at the end of the loop – without this, the same order’s details will be written out to the client’s browser forever (or until either the web response times out or the client gets bored and moves away from the site).

An action query does not return a Recordset. However, we can get some indication of whether the action was successful or not by checking for any errors after the...
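The retrieval sequence listed earlier in this section (connect, send a query, step through the rows one at a time, close everything) can be sketched as follows. This is a Python illustration that uses the standard sqlite3 module in place of ADO, so the object names differ from the Connection and Recordset objects the tutorial uses; the `orders` table, its columns and the sample rows are invented for the example.

```python
import sqlite3


def list_orders(connection: sqlite3.Connection) -> list[str]:
    """Run a query and build one line of text per row returned."""
    lines = []
    # Send the SQL query through the open connection.
    cursor = connection.execute(
        "SELECT order_no, item FROM orders ORDER BY order_no"
    )
    row = cursor.fetchone()
    while row is not None:
        # Write out the contents of each field in the current row.
        lines.append(" | ".join(str(field) for field in row))
        # Move to the next row; leaving this line out would repeat the
        # same row forever, like omitting RS.MoveNext in the ASP loop.
        row = cursor.fetchone()
    cursor.close()  # close the set of results
    return lines


# Connect to (and open) a throwaway in-memory database.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (order_no INTEGER, item TEXT)")
connection.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, "Lamp"), (2, "Desk")]
)
for line in list_orders(connection):
    print(line)
connection.close()  # close the database connection
```

The loop has the same shape as steps 4 to 6: process the current row, try to advance, and stop when there is no next row.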
... of these server objects

Web Page Development Issues and Strategies

Now that we’ve been introduced to the components that are used to make up an active website, we ought to look at some of the problems inherent in creating a website that derives most of the information it publishes from a database.

Database Connections

The ADO connection object described earlier is our conduit to the source of information...

... certain ‘load’ on the database server (or on the database ‘driver’ if the database is on the local computer, as an Access DB would be). The former strategy can work well if it is not taken to extremes; the latter will always work but may involve a lot of creating and destroying of connections, which can add significantly to the work the processor (and database server) must do. Connections into a database that...

... checked out. When customers 3 and 2 next visit the site, these orders will be waiting for them to make up their mind on, and we can arrange the site to give them the chance to either cancel or complete the orders. If you have ever gone to a web site to make a purchase and, reaching the checkout stage, discovered items that you had selected on a previous visit and then forgotten about sitting in your shopping...

... that the user can enter data into. In a POST operation, data is again sent in Name/Value pairs, but now these are sent as a separate stream of binary information, retrievable from the Form property of the Request object. For example, a web form can be used to pass registration details to the website:

Figure 9: An HTML form – each control on this will provide data that can be collected from the Request.Form...
... this database structure as unrealistic as the original for the purposes of running a property agency, but the structure is perfectly adequate for illustrating how a database can be managed by a web site, and easy to understand.

The database tables

For practical purposes, we would normally group database tables into two distinct groups – those that would be managed locally by the staff who run the company, ...

Contents

Database Development for the World-Wide-Web

Introduction

Practical Issues

Web Sites and Web Pages

The Request-Response structure

Static Pages, Server-Side Scripting and Client-Side Scripting

Generating a page from a web server

Pages, sessions and applications

Session state

Application state

Development Choices

Tools

IIS – Installation Issues

ADO

Object Model

Database Neutrality

IIS Server Objects

Server

Global.asa

Application

Session

Request

Response

Web Page Development Issues and Strategies

Database Connections

Client ‘state’

Cookies and client tracking
