Project 2 : Part II - CGI

CS219 - Programming for the World Wide Web

Due : Friday, February 29, 1999. Midnight.

Introduction

In the first part of this project, you implemented a simple web-server capable of delivering static pages and running CGI programs. Now you will use your server to build a simple shopping cart application. In the process, you will write some CGI scripts, work with persistent cookies, and interact with a database.

Please read this entire handout before you begin work.


Overview

You will be building an online shopping cart interface for the CS219 Music Shack--an online store with a very poor selection of music and so-so prices (but this is beside the point).

The store is built around a relational database system that contains the current catalog of CDs, customer information, and individual orders. Unfortunately, this database doesn't have any sort of web-interface. Your job is to provide a web-interface to the database that allows customers to do the following :

To do this, you will be writing a collection of CGI scripts that talk to the database and manage a shopping cart.

The Database

The music database lives on a remote machine called 'rustler.cs.uchicago.edu.' You don't have any direct access to that machine, but the database, also being a networked application, can be accessed remotely with an appropriate client. For this project, you will be working with two tables of information as follows:
Table : Items
-------------
     item_number     int
     item_title      char(50)
     item_artist     char(50)
     item_price      money
     item_category   char(15)  

Table : OrderItem
-----------------
     order_id        char(50)
     order_item      int
     order_quantity  int
The Items table contains information about the CDs available in the catalog. The OrderItem table contains information about a particular item that has been ordered.

The database also contains tables about customers and billing information, but you will not have to worry about that until the next project (security and authentication).

Accessing the Database

The database is a freely available database package called mSQL (available at http://www.hughes.com.au). A client has been installed on classes in the directory

/usr/local/classes/current/CS219/packages/bin/

For debugging and to generally see what is in the database, you can use the program 'msql' as follows (things you type are in bold):

% > msql -h rustler project2

Welcome to the miniSQL monitor.  Type \h for help.


mSQL > select item_title from Items
    -> \g

Query OK.  32 row(s) modified or retrieved.

 +----------------------------------------------------+
   item_title                                          
 +----------------------------------------------------+
   Zoot Suit Riot                                      
   Supposed Former Infatuation Junkie                  
   The Dirty Boogie                                    
   The Book of Secrets                                 
   Before These Crowded Streets                        
   Left of the Middle                                  
   Version 2.0                                         
   Adore                                               
   Whitechocolatespaceegg                              
   From The Choirgirl Hotel                            
   Kind of Blue                                        
   ...
 +----------------------------------------------------+
This interface to the database is really only used for debugging. For your CGI scripts, you will need to access the database in a way that makes it easy to extract information and produce HTML pages. mSQL has both a C API and its own scripting language that are described in:

/usr/local/classes/current/CS219/packages/doc/manual-html/manual.html

I will describe a Python interface to mSQL that has been installed on classes. You are free to use any interface to mSQL you can find, but I recommend sticking with the Python interface described here.

Accessing the database from Python

To connect to the database from Python, you execute the following code:
# Load the database module
from msql import *          

# Connect to the database
h = msqlConnect("rustler.cs.uchicago.edu")
if h < 0:
     print "Couldn't connect"

# Select the database
if msqlSelectDB(h,"project2") < 0:
     print "Couldn't find the database"

# Do a bunch of stuff
...

# Close the database (when you are done)
msqlClose(h)
To extract information from the database, you have to send it a SQL query. This is done as follows:
nrows = msqlQuery(h,"select item_title from Items")
Upon completion, the query function returns the number of rows returned from the database. To fetch the returned data, you have to do the following:
r = msqlStoreResult()            # Store the query results
for i in range(0,nrows):         # Loop over rows fetching the results
    row = msqlFetchRow(r)
    print row
This code simply loops over the number of rows and fetches each row, one by one. The msqlFetchRow function returns the row as a Python tuple.

An Example

Here is an example of fetching all of the items from the database in the "Rock" music category.
from msql import *
h = msqlConnect("rustler")
msqlSelectDB(h,"project2")

nrows = msqlQuery(h,"""select item_artist, item_title, item_price from Items 
                     where item_category='Rock'""")
r = msqlStoreResult()
for i in range(0,nrows):
      print """
Artist: %s
Title : %s
Price : $%s""" % msqlFetchRow(r)

msqlClose(h)

The output of this program is as follows:
Artist: Cherry Poppin' Daddies
Title : Zoot Suit Riot
Price : $11.88

Artist: Alanis Morissette
Title : Supposed Former Infatuation Junkie
Price : $12.58

Artist: Brian Setzer Orchestra
Title : The Dirty Boogie
Price : $11.88

Artist: Loreena McKennitt
Title : The Book of Secrets
Price : $11.88

Artist: Dave Matthews Band
Title : Before These Crowded Streets
Price : $11.88

Artist: Natalie Imbruglia
Title : Left of the Middle
Price : $11.88

Artist: Garbage
Title : Version 2.0
Price : $11.88

Artist: Smashing Pumpkins
Title : Adore
Price : $14.99

Artist: Liz Phair
Title : Whitechocolatespaceegg
Price : $11.88

Artist: Tori Amos
Title : From The Choirgirl Hotel
Price : $12.99

Database Queries

All of your interaction with the database will be in the form of SQL queries issued with the msqlQuery() function. Here are all the queries you will need for this project. Given that this isn't a database course, you shouldn't need to do much more than this.

  1. Getting a list of music categories
    msqlQuery(h,"select distinct item_category from Items")
    
    This will return a distinct list of all of the available music categories. Currently, the database has five categories : 'Rock','Jazz','Country','Blues', and 'Offensive'.

  2. Getting a list of titles in a given category
    msqlQuery(h,"select * from Items where item_category='Rock'")
    
    This will return a list of all items available in a given category. The '*' means that all five fields in the Items table will be returned. You will use this produce a listing of titles.

  3. Retrieving a single item
    msqlQuery(h,"select * from Items where item_number=13")
    
    This returns the item with a specific item number. You will use this to display information about a single item.

  4. Retrieving a group of items
    msqlQuery(h,"select * from Items where item_number=13 or item_number=7 
                        or item_number=23 or item_number=31")
    
    This would return items 13,7,23, and 31 from the database. You might use a query like this to display the contents of a shopping cart.

CGI in Python

All of your Python CGI scripts will look roughly like this:
#!/usr/local/classes/current/CS219/packages/bin/python

import cgi
from msql import *
import string
import sys
import os

# Any other modules you think are necesssary

# See what kind of method is being used (GET or POST)
method = os.environ["REQUEST_METHOD"]
if method == "POST":
    clen = string.atoi(os.environ["CONTENT_LENGTH"])
    query_string = sys.stdin.read(clen)      
elif method == "GET":
    try:
        query_string = os.environ["QUERY_STRING"]
    except:
        query_string = ""	
else:
    print "\nCGI error"
    sys.exit(1)

# Now parse the query string
query_data = cgi.parse_qs(query_string)

# Individual query parameters can be accessed like this
val = query_data["name"]

# Print the HTTP header
print "Content-type: text/html"
print ""

# Open up the database
h = msqlConnect("rustler.cs.uchicago.edu")
msqlSelectDB(h,"project2")

# Print a bunch of content
...

# Close the database and exit
msqlClose(h)

Browsing the Catalog

Your first step is to make a CGI interface that allows visitors to browse through the catalog. To do this, you need to write some scripts that extract information about the database and produce a nice HTML display. The browsing interface should have at least three different screens: Each of these is now described.

The category listing

The first display should produce a listing of available music categories. This listing might look something like this :

Available Categories

Each category would be a link to a page showing the available titles in that category.

Here is a CGI script that produces the above listing:

#!/usr/local/classes/current/CS219/packages/bin/python

import sys
import cgi
from msql import *
import string

# Print a list of available categories
print "Content-type: text/html"
print ""

print """
<html>
<head>
<title>Available Categories</title>
</head>
<body bgcolor="#ffffff">
<h1>Available Categories</h1>
"""
print "<ul>"
# Open up the database
h = msqlConnect("rustler.cs.uchicago.edu")
msqlSelectDB(h,"project2")

nrows = msqlQuery(h,"select distinct item_category from Items")
r = msqlStoreResult()

for i in range(0,nrows):
	category = msqlFetchRow(r)[0]	
	print """<li><a href="/cgi-bin/titles.cgi?category=%s">%s</a>""" % 
                                                                         (category,category)

print "</ul>"

print """
</body>
</html>
"""
# Close the database and exit
msqlClose(h)
The output of this script produces a list of categories with links to another CGI program 'titles.cgi' like this:
<ul>
<li><a href="/cgi-bin/titles.cgi?category=Rock">Rock</a>
<li><a href="/cgi-bin/titles.cgi?category=Jazz">Jazz</a>
<li><a href="/cgi-bin/titles.cgi?category=Country">Country</a>
<li><a href="/cgi-bin/titles.cgi?category=Blues">Blues</a>
<li><a href="/cgi-bin/titles.cgi?category=Offensive">Offensive</a>
</ul>

The Title Listing

When a user clicks on one of the music category links, a new display containing all of the titles in that category should be produced. This might look like the following:

Jazz

Kind of Blue
Miles Davis
Our price: $8.38
Gershwin's World
Herbie Hancock
Our price: $11.38
A Love Supreme
John Coltrane and Johnny Hartman
Our price: $12.99
Combustication
Medeski Martin & Wood
Our price: $12.99
This listing will be produced in a manner very similar to the last CGI script--only the database query will be different.

Item Display

When a user clicks on one of the items in the title listing, another CGI script should be invoked to produce a display containing only that item. For example:

Kind of Blue


Miles Davis
Our Price: $8.38
Again, this display is produced with a database query that asks for a specific item number (note : displaying a picture is optional).

Shopping Cart

After you figure out how to browse the catalog, your next step is to let users select various items and put them in a shopping cart. This is a little more tricky since HTTP is a stateless connection. You will have to maintain a collection of shopping carts on the server and use cookies to figure out which connections correspond to each shopping cart.

Modify the item display

First, modify the item display CGI program so that it places a link "Add to cart" on the screen. When a user clicks on this link, another CGI program should be invoked to add the item to a cart.

Obtaining a session identifier

When a user first puts something in their cart, you need to produce a session identifier that uniquely identifies the cart. For this project, a session identifier server is running on 'rustler.cs.uchicago.edu' at port 11000. To get a session ID, you can connect with this server as follows:
from socket import *
s = socket(AF_INET,SOCK_STREAM,0)
s.connect("rustler.cs.uchicago.edu",11000)
session_id = s.recv(64)
s.close()
If you print out the session ID you receive, it will look something like this:
19991211226173210-18
The motivation for using a session-id server is to provide a centralized service for producing unique identifiers. Since CGI programs only run for brief periods of time (and often in parallel with other CGI requests on a heavily loaded server), it can be difficult to produce a unique session identifier (also consider the possibility of everyone in class trying to produce unique identifiers). By using a centralized server, it can keep track of what id's have been previously assigned and always come up with a unique value that is halfway sensible (plus, it makes the project just a tad bit more evil--which is, of course, a good thing).

Baking up a persistent cookie

The session identifier generated in the previous section will be used to identify shopping carts. However, to map shopping carts to each user, you need to send the session id back to the web-browser as a persistent cookie. This can be done by putting the following item in the response header:
Set-Cookie: sessionid=19991211226173210-18; path=/; 
   domain=classes.cs.uchicago.edu
...
Content-type: text/html

...
(Note : The Set-Cookie line does not have a line-break as shown!)

Assuming that the browser allows cookies, the cookie value will be saved on the user's machine. On subsequent connections, the browser will check the path and domain attributes of the cookie. If they match, the cookie value will be sent to the web-server on subsequent requests.

Getting a cookie

After a cookie has been sent to the browser, it will be presented to the server on subsequent requests. The server receives the cookie as an extra request header field like this:
GET /cgi-bin/details.cgi HTTP/1.0
...
Cookie: sessionid=19991211226173210-18
...
To pass the cookie on to a CGI program, the server reads the cookie out of the request header and places it in an environment variable HTTP_COOKIE. The CGI program can receive the cookie as follows:
import os
...
http_cookie = os.environ["HTTP_COOKIE"]

Expiring a cookie

If the server wants to expire a cookie, it sends the cookie to the browser with an expiration date and time. For example:
Set-Cookie: sessionid=19991211226173210-18; path=/;
    domain=classes.cs.uchicago.edu; expires=Thur, 21-Jan-99 15:45:30 GMT
(Again, there is no line break in the real Set-Cookie line)

When expiration date and time is reached, the browser will delete the cookie. To immediately remove a cookie from the browser, simply send a cookie with an expiration date in the past.

If no cookie expiration date is given, the cookie will expire when the user shuts down their browser.

The shopping cart

On the web-server, the session identifier is used to identify a particular shopping cart. In each shopping cart, you would keep a list of the items that have been selected so far. Since each item has a unique number, the easiest way to represent the shopping cart is as a list of item numbers.

The most tricky part of the shopping cart is making sure it persists between connections. Perhaps the simplest way to do this (but not necessarily the best way) is to write each shopping cart to a file on the server. For example, you could maintain a directory of "carts" that might look like this:

/carts/
       19991211226173210-18
       19991211226182109-19
       19991211226191100-22
       19991211226201011-30
       19991211226218776-31
The data in each cart might be as simple as
1 13 27 10 31
Where each number is an item number. When an item is added to the cart, its item number is simply appended to the list. When an item is removed from the cart, the item number is removed from the list.

The checkout procedure

To finalize the cart, the user has to go through a checkout procedure. The first part of the checkout procedure should produce a list of the cart contents such as the following:

Here is your shopping cart

Title                         Artist                   Price   
-----                         ------                   -----   
Version 2.0                   Garbage                  11.88   
Suicidal Tendencies           Suicidal Tendencies      13.47   
Kind of Blue                  Miles Davis              8.38    

                                                       Total   
                                                       -----   
                                                       33.73   
To produce this list, you will have to take the item numbers in the cart and perform a database query to extract the title and artist information. The user should then be able to click a "Confirm" button to finalize the order.

When the order has been confirmed, the order information should be placed into the database. This can be done as follows:

items = [ ... ]                # List of item numbers
session_id = "..."             # Session identifier

# Open up the database
h = msqlConnect("rustler.cs.uchicago.edu")
msqlSelectDB(h,"project2")

for i in items:
     msqlQuery("insert into OrderItem values('%s',%s,1)" % (session_id,i))

# Close the database and exit
msqlClose(h)
This is really just inserting order information into the database that contains the session identifier, item number, and quantity. For instance:
19991211226191100-22     1     1
19991211226191100-22     13    1
19991211226191100-22     27    1
19991211226191100-22     10    1
19991211226191100-22     31    1
To later retrieve the contents of an order, you can issue a query like this:
select * from OrderItem where order_id = '19991211226191100-22'
After the checkout procedure has been completed, the session identifier should be expired and the shopping cart deleted from the server. A user should not be able to add new items or remove things from their shopping cart after confirmation.

How the cart is supposed to work

Now that you've seen the pieces, here is how the cart is supposed to work.
  1. Initially, a visitor to the site has no shopping cart. Of course, they can browse around all they want.
  2. When a user first decides to put something in their cart:
    1. The server contacts the session-id server to get a new session identifier.
    2. The session-id is used to create a new shopping cart (perhaps as a file).
    3. The selected item is placed in the newly created cart.
    4. The session-id is sent back to the browser as a persistent cookie.
  3. When a user puts something else into their cart
    1. The server passes the persistent cookie value (set in Step 2) to the CGI program.
    2. The CGI program reads the session id from the cookie and tries to load a shopping cart.
    3. If the cart doesn't exist or has expired, a new cart is created.
    4. If the cart does exist, the item number is appended to the contents of the cart.
  4. When a user removes something from their cart.
    1. The CGI program reads the session id from the cookie and tries to load a shopping cart (as before).
    2. If the cart doesn't exist or has expired, an error message is generated.
    3. If the cart does exist, the item number is removed from the cart.
  5. The user checks out
    1. The server loads the shopping cart (as before).
    2. User is given a list of their shopping cart and asked to confirm.
  6. Confirmation
    1. The server loads the shopping cart again.
    2. The items in the cart are placed into the database as an order.
    3. The cookie and session id are expired (it must not be possible to add or remove items from the cart).
    4. The shopping cart is removed from the server.

Abandoned Shopping Carts

In some cases, a user might add things to a shopping cart but never check out. As a result, the server will eventually become cluttered with abandoned carts. To fix this, shopping carts should periodically be cleaned up. One approach is to simply write a program that checks the modification date of all active shopping carts and deletes those that have not been recently used. This program can be run as part of the CGI process, but could also be set up as a periodically running program on the server (a Unix cron job).

What to do

Step 1: Implement CGI

Make sure your server can run CGI programs. You will also have to modify your server to correctly pass cookies from the browser to CGI programs by placing the cookie value in the HTTP_COOKIE environment variable.

Step 2: Category List

Create a CGI program the lists the available categories in the database. Clicking on a category should produce a listing of titles. This is easy (especially since the code was already given above)

Step 3: Title listing

Create a CGI program that takes a category name and produces a list of titles in that category. Clicking on a title should take the user to a details screen.

Step 4: Details page

Create a CGI program that takes a specific item number and lists the details about that item. A button should placed on the page to allow a user to add the item to their shopping cart.

Step 5: Add item to cart

Make a CGI program that takes an item number and adds it to a shopping cart. This program should check to see if a session id has been passed as a cookie. If so, an existing shopping cart should be opened. If not, a new session identifier should be created and passed back to the browser as a cookie.

Step 6: View cart

Create a CGI program that allows a user to view the contents of their shopping cart. This page should read in the current shopping cart and display its contents. This page should also allow users to remove items from their cart.

Step 7: Checkout

Create a CGI program that displays the current contents of the shopping cart and asks a user to confirm the contents.

Step 8: Confirmation

Create a CGI program that takes the contents of the cart and places them into the database as an order. The session identifier, cookie, and shopping cart should then all be expired. A confirmation message should then be presented to the user (with a link to go back to the top of the music store).

Extra Credit