Gavilan

CSIS054: Perl Programming

Homework set 4

 

 

1. Download the web server log file (150 MB). (You can also practice with this smaller log file.) Use Perl to parse the file and generate statistics on:

  • What is the most popular file accessed? The top 20? The top 20 that end in one of .php, .html, or .htm?

  • Who (which IP address) was the most prolific visitor? (That is, which IP address generated the most lines in the log?) Who were the top 10 visitors?
  • Rank the popularity of the various browsers/search bots that visited this web page. (Note that some similar browsers report different ID strings. Use your best judgement to combine these to present the most useful information.)
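
One possible way to combine similar ID strings is to map each user-agent string to a coarse family name before counting. The sketch below is only an illustration: the sub name browser_family, the family names, the patterns, and the sample strings are all assumptions, not part of the assignment. The order of the tests matters, since a Chrome ID string also contains the word "Safari".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map a raw user-agent string to a coarse family name.
# Patterns and family names are illustrative assumptions; extend as needed.
sub browser_family {
    my ($ua) = @_;
    return 'Unknown'           if !defined $ua || $ua eq '' || $ua eq '-';
    return 'Googlebot'         if $ua =~ /Googlebot/i;
    return 'Other bot'         if $ua =~ /bot|crawler|spider|slurp/i;
    return 'Opera'             if $ua =~ /Opera/;
    return 'Internet Explorer' if $ua =~ /MSIE/;
    return 'Chrome'            if $ua =~ /Chrome/;     # must be tested before Safari
    return 'Safari'            if $ua =~ /Safari/;
    return 'Firefox'           if $ua =~ /Firefox/;
    return 'Other';
}

# Made-up sample strings standing in for the user-agent field of the log.
my @samples = (
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)',
    'Mozilla/5.0 (Windows; U; Windows NT 6.1) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16',
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15',
    'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)',
);

# Count one hit per family.
my %count;
$count{ browser_family($_) }++ for @samples;

# Sort by descending count, then by name so ties print in a stable order.
for my $family (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    printf "%-20s %d\n", $family, $count{$family};
}
```

How finely to split the families (for example, by browser version) is a judgement call; the point is only that the grouping happens before the counting.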

You might be interested in reading about the standard Apache log format for web server logs.
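
As a starting point, here is a minimal sketch of parsing a log line and counting with hashes. It assumes the file uses the Apache "combined" format; the regular expression, the sub names parse_line and top_n, and the sample lines are made up for illustration. Lines that do not match the pattern are simply skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern for Apache "combined" log lines (an assumption about the file's format).
# Captures: client IP, timestamp, requested path, status, bytes, referer, user agent.
my $LOG_RE = qr/^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*" (\d{3}) (\S+)(?: "([^"]*)" "([^"]*)")?/;

# Return a hash reference of fields, or nothing if the line does not match.
sub parse_line {
    my ($line) = @_;
    if ($line =~ $LOG_RE) {
        return { ip => $1, time => $2, path => $3, status => $4,
                 bytes => $5, referer => $6, agent => $7 };
    }
    return;
}

# Return the $n keys with the highest counts (ties broken alphabetically).
sub top_n {
    my ($counts, $n) = @_;
    my @keys = sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b } keys %$counts;
    $n = @keys if $n > @keys;
    return @keys[0 .. $n - 1];
}

# Made-up sample lines; a real script would read the log file line by line.
my @sample = split /\n/, <<'END';
192.0.2.1 - - [23/Mar/2011:10:00:00 -0700] "GET /index.html HTTP/1.1" 200 1043 "-" "Mozilla/5.0"
192.0.2.2 - - [23/Mar/2011:10:00:05 -0700] "GET /about.php?id=3 HTTP/1.1" 200 512 "-" "Mozilla/5.0"
192.0.2.1 - - [23/Mar/2011:10:00:09 -0700] "GET /logo.png HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
192.0.2.1 - - [23/Mar/2011:10:00:12 -0700] "GET /index.html HTTP/1.1" 304 - "-" "Mozilla/5.0"
END

my (%hits, %visitors);
for my $line (@sample) {
    my $rec = parse_line($line) or next;
    (my $file = $rec->{path}) =~ s/\?.*//;    # drop any query string
    $hits{$file}++;
    $visitors{ $rec->{ip} }++;
}

# Restrict the file counts to names ending in .php, .html, or .htm.
my %pages = map { ($_ => $hits{$_}) } grep { /\.(?:php|html?)$/ } keys %hits;

print "Top files:    ", join(', ', top_n(\%hits, 20)), "\n";
print "Top pages:    ", join(', ', top_n(\%pages, 20)), "\n";
print "Top visitors: ", join(', ', top_n(\%visitors, 10)), "\n";
```

For the 150 MB file, reading one line at a time with while (my $line = <$fh>) keeps memory use low, since only the hashes grow.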

Explore the "logs with hash tables" code for ideas.

 

 

2. Download the CNC GCODE file. It is used to control a CNC mill. Unfortunately, a mistake has been made. To correct it, we need to add 1.55 (inches) to every "Y" value in the file.

So for example, a line like the following:

N100 X0.474 Y0.287

should be changed to:

N100 X0.474 Y1.837

This change should be made anywhere in the file where there is a Y coordinate listed, except if there is a G90 code on that line.
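
The rule above could be sketched with a substitution that evaluates its replacement. This is only one possible approach: the sub name shift_y and the sample lines are hypothetical, and the code assumes that coordinates use an uppercase "Y" and that output should be formatted to three decimal places, as in the example line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Add $offset to every Y coordinate on a line, unless the line has a G90 code.
sub shift_y {
    my ($line, $offset) = @_;

    # Leave the line alone if it contains G90 (but not G900 or G90.1).
    return $line if $line =~ /G90(?![\d.])/;

    # Replace each Y value with the shifted value, using three decimals
    # (an assumption based on the example line in the assignment).
    $line =~ s/Y([-+]?(?:\d+\.?\d*|\.\d+))/sprintf('Y%.3f', $1 + $offset)/ge;
    return $line;
}

# Made-up sample lines; a real script would read the GCODE file instead.
my @program = split /\n/, <<'END';
N100 X0.474 Y0.287
N110 G90 X1.000 Y2.000
N120 G01 Z-0.100
END

print shift_y($_, 1.55), "\n" for @program;
# First line printed: N100 X0.474 Y1.837
```

The /e modifier makes Perl run the replacement as code, so the arithmetic happens inside the substitution, and /g applies it to every Y value on the line.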

 

3. (If you didn't do this in the previous assignment) Save the file words.txt. Write a script that prints each word that occurred in the file, one per line, in alphabetical order. If a word occurs in words.txt more than once, only report it once.

(A word is a sequence of 1 or more alphabetical characters and the hyphen character. Words are separated by white-space and/or punctuation. Do not include punctuation, except for hyphenated words, in your report.)
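
The definition of a word above translates fairly directly into a regular expression. The following is a minimal sketch, not a full solution: the sub name words_in and the sample text are made up, and it assumes that words should be compared without regard to case (so "The" and "the" count as the same word).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every word in $text, lowercased.
# A word is a run of letters and hyphens that contains at least one letter,
# so a stray "--" used as punctuation is not reported.
sub words_in {
    my ($text) = @_;
    return grep { /[a-z]/ } map { lc $_ } ($text =~ /[A-Za-z-]+/g);
}

# Made-up sample text; a real script would read words.txt instead.
my $text = 'The well-known fox -- the quick, brown fox -- jumped.';

# Hash keys are unique, so each word is kept only once.
my %seen;
$seen{$_} = 1 for words_in($text);

print "$_\n" for sort keys %seen;
```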

(If you've already completed the above) Generate a report of:

  • the top 20 most frequently used words in the file, which are also
  • 5 characters in length or greater.
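
A frequency report like this can be built by counting words in a hash and sorting the keys by count. The sketch below is one way to do it; the sub name top_words, its parameters, and the sample text are assumptions for illustration, and words are again compared without regard to case.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return up to $n [word, count] pairs for words of at least $min_len
# characters, most frequent first (ties broken alphabetically).
sub top_words {
    my ($text, $n, $min_len) = @_;
    my %freq;
    for my $word (map { lc $_ } ($text =~ /[A-Za-z-]+/g)) {
        next unless $word =~ /[a-z]/;           # skip runs of hyphens only
        next if length($word) < $min_len;       # skip short words
        $freq{$word}++;
    }
    my @ranked = sort { $freq{$b} <=> $freq{$a} || $a cmp $b } keys %freq;
    splice @ranked, $n if @ranked > $n;         # keep only the first $n
    return map { [ $_, $freq{$_} ] } @ranked;
}

# Made-up sample text; a real script would read words.txt instead.
my $text = 'apple banana apple cherry banana apple kiwi kiwi kiwi kiwi';

for my $pair (top_words($text, 20, 5)) {
    printf "%-15s %d\n", @$pair;
}
```

Note that "kiwi" is the most frequent word in the sample but is left out of the report because it has only 4 characters.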

 

4. Explore the Content Management script. Choose two of the following enhancements, and add them to the script:

  • Implement next and previous links on each page. (With the right behavior for the first and last page.)
  • Extend the data fields to something more useful. Instead of age, you could add title, category, and/or other features. (Remember our model at barefooters.org)
  • Implement a template for the table of contents page. Use a category data field to separate the pages into groups.
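
For the next and previous links, the core logic might look like the sketch below. It does not use the actual Content Management script: the sub name nav_links, the page file names, and the HTML it produces are all hypothetical, and it assumes the pages are available as an ordered list. The first page gets no "Previous" link and the last page gets no "Next" link.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the navigation HTML for the page at index $i of the ordered list @$pages.
sub nav_links {
    my ($pages, $i) = @_;
    my @links;
    push @links, qq{<a href="$pages->[$i - 1]">Previous</a>} if $i > 0;
    push @links, qq{<a href="$pages->[$i + 1]">Next</a>}     if $i < $#{$pages};
    return join ' | ', @links;
}

# Hypothetical page list; the real script would supply its own.
my @pages = ('page1.html', 'page2.html', 'page3.html');

for my $i (0 .. $#pages) {
    print "$pages[$i]: ", nav_links(\@pages, $i), "\n";
}
```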

 

 

 



Address of this page is http://hhh.gavilan.edu/phowell/csis054/04problems.html
Please contact Peter Howell at phowell@gavilan.edu for questions or comments.
Last updated March 23, 2011.