Program Design

concordance.py: 34 points


The design of your program is up to you, but you should divide the work among several functions. You may want to use the following functions:

standardize_word(word):

This function takes a string word containing a single word, and returns a new string that has the letters of word converted to lowercase with punctuation removed from the beginning and end of the word.

ReadMe

Python provides a variable called string.punctuation that contains common punctuation characters. You can access it by importing the string module.

import string

The string.punctuation variable can then be passed to the string method strip() to remove the leading and trailing punctuation from a given string:

word = word.strip(string.punctuation)

We do not expect you to remove punctuation from within a word. For example, leaving won’t as it is (with its apostrophe) is fine.
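As a minimal sketch of how the pieces above could fit together (one possible approach, not the required solution), the function below lowercases the word and then strips punctuation from both ends. The sample words are made up for illustration.

```python
import string


def standardize_word(word):
    """Return word in lowercase with leading and trailing punctuation removed."""
    # lower() handles the case conversion; strip() only touches the two ends,
    # so punctuation inside the word (such as an apostrophe) is left alone.
    return word.lower().strip(string.punctuation)


print(standardize_word('"Hello,'))  # hello
print(standardize_word("won't!"))   # won't
```

Note that a string made up entirely of punctuation, such as "...", comes back as an empty string, so the caller may want to check for that case.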

Reminder

After you complete each function, be sure to commit and push your changes!

add_word(word, line_number, concordance):

This function records in the dictionary concordance that the given word was found on the given line_number.
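One way this could look is sketched below. It assumes each dictionary value is a list of line numbers and that a word appearing twice on the same line is recorded twice. The handout does not say how repeats should be handled, so treat that as an assumption to check against the expected output.

```python
def add_word(word, line_number, concordance):
    """Record that word appears on line_number in the concordance dictionary."""
    # The first time a word is seen, give it an empty list to collect into.
    if word not in concordance:
        concordance[word] = []
    concordance[word].append(line_number)


# Example with made-up words and line numbers.
concordance = {}
add_word("the", 1, concordance)
add_word("cat", 1, concordance)
add_word("the", 3, concordance)
print(concordance)  # {'the': [1, 3], 'cat': [1]}
```

Because dictionaries are mutable, the function changes concordance in place and does not need to return anything.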

Reminder

After you complete each function, be sure to commit and push your changes!

print_entry(word, concordance):

This handles printing one word of the output and its line numbers. concordance is the dictionary storing the concordance, so concordance[word] should be the list of line numbers on which word occurs.
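The sketch below shows one possible shape for this function. The output format ("word: 1, 3") is an assumption, since the required format is not given here, and format_entry is a hypothetical helper introduced only to keep the string-building separate from the printing.

```python
def format_entry(word, concordance):
    """Build the output line for one word (hypothetical helper, assumed format)."""
    # Line numbers are integers, so convert each to a string before joining.
    numbers = ", ".join(str(n) for n in concordance[word])
    return f"{word}: {numbers}"


def print_entry(word, concordance):
    """Print one word of the concordance followed by its line numbers."""
    print(format_entry(word, concordance))


print_entry("the", {"the": [1, 3], "cat": [1]})  # the: 1, 3
```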

Running the code:

  1. Get the file name from the user.
  2. Open the file.
  3. Read the file by looping through one line at a time.
  4. Split the line into words.
  5. Call standardize_word() to prepare each word for adding to the concordance.
  6. Call add_word() to add the word and its current line_number to the concordance.
  7. Loop over the keys of the concordance and call print_entry() on each word to handle the output. The loop should print the words from the text in alphabetical order.
  8. Make sure to close the file when you are done using it.

Before you submit your lab, try some filenames that do not match any of the test files we provide in your GitHub repository or the Testing section. This will exercise your try/except statement.
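The steps above can be sketched roughly as follows. This is one possible arrangement under several assumptions, not the required design: build_concordance and main are hypothetical names, line numbers are assumed to start at 1, words that become empty after standardizing are skipped, the prompt and error messages are made up, and the output format is a guess. A with statement is used here so that the file is closed automatically, which covers the last step.

```python
import string


def standardize_word(word):
    """Lowercase the word and strip punctuation from both ends."""
    return word.lower().strip(string.punctuation)


def add_word(word, line_number, concordance):
    """Record that word appears on line_number."""
    if word not in concordance:
        concordance[word] = []
    concordance[word].append(line_number)


def print_entry(word, concordance):
    """Print one word and its line numbers (output format is an assumption)."""
    print(word + ":", ", ".join(str(n) for n in concordance[word]))


def build_concordance(filename):
    """Hypothetical helper: read filename and return its concordance dictionary."""
    concordance = {}
    # The with statement closes the file even if an error occurs while reading.
    with open(filename) as infile:
        # Assumption: the first line of the file is line 1.
        for line_number, line in enumerate(infile, start=1):
            for raw_word in line.split():
                word = standardize_word(raw_word)
                if word:  # skip tokens that were only punctuation
                    add_word(word, line_number, concordance)
    return concordance


def main():
    filename = input("Enter a file name: ")
    try:
        concordance = build_concordance(filename)
    except FileNotFoundError:
        print("Could not open", filename)
        return
    for word in sorted(concordance.keys()):
        print_entry(word, concordance)

# Call main() to run the program interactively.
```

Catching FileNotFoundError specifically, instead of using a bare except, keeps other bugs from being hidden behind the "could not open" message.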

ReadMe

You need to print all of the words in your concordance in alphabetical order, followed by their line numbers. For this, we can use Python’s built-in sorted() function. sorted() takes any iterable (i.e., anything that can be iterated over) and returns a new sorted list of its items. So, if we want to get a sorted version of our concordance keys, we can write

sorted(concordance.keys())

which will conveniently return a list of our keys in alphabetical order.
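As a small illustration, the loop below walks a made-up concordance in sorted key order. The dictionary contents are invented for the example.

```python
# A tiny made-up concordance, inserted in non-alphabetical order.
concordance = {"the": [1, 3], "cat": [1], "dog": [3]}

# sorted() returns a new list; the dictionary itself is not changed.
for word in sorted(concordance.keys()):
    print(word, concordance[word])
# cat [1]
# dog [3]
# the [1, 3]
```

By default, sorted() orders strings so that all uppercase letters come before lowercase ones. Since standardize_word() converts every word to lowercase, the result here is plain alphabetical order.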