Program Design
concordance.py: 34 points
The design of your program is up to you, but you should certainly divide the work to be done among several functions. For the other functionality, you may want to use the following functions:
standardize_word(word): This function takes a string word containing a single word, and returns a new string that has the letters of word converted to lowercase with punctuation removed from the beginning and end of the word.
README
Python provides a variable called string.punctuation that contains common punctuation characters. You can access it by importing the string module:
import string
The string.punctuation variable can then be passed to the strip() method to remove the leading and trailing punctuation from a given string:
word = word.strip(string.punctuation)
We do not expect you to remove punctuation from within a word. For example, leaving won’t unchanged is fine.
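One possible shape for this function is sketched below. It simply combines lower() with the strip() call shown above; the sample words are made up for illustration, and your own version may be organized differently.

```python
import string

def standardize_word(word):
    """Return word in lowercase with leading/trailing punctuation removed."""
    # lower() handles the case; strip() only touches the two ends of the
    # string, so punctuation inside the word is left alone.
    return word.lower().strip(string.punctuation)

# Made-up sample words, for illustration only.
print(standardize_word("Hello,"))   # hello
print(standardize_word('"Wait!"'))  # wait
print(standardize_word("won't"))    # won't
```

Note that string.punctuation contains only ASCII punctuation, so a curly apostrophe or curly quote at the edge of a word would not be stripped by this sketch.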
Reminder
After you complete each function be sure to commit and push your changes!
add_word(word, line_number, concordance): This handles the work of recording that a given word was found on the given line_number in the dictionary concordance.
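A minimal sketch of one way to do this follows. It assumes the dictionary maps each word to a list of line numbers and that every occurrence is recorded, so a word appearing twice on one line would have that line number listed twice. If the lab expects each line number only once per word, an extra check would be needed.

```python
def add_word(word, line_number, concordance):
    """Record that word was found on line_number in the concordance dict."""
    # The first time a word is seen, start an empty list for it.
    if word not in concordance:
        concordance[word] = []
    # Assumption: every occurrence is recorded, including repeats on one line.
    concordance[word].append(line_number)

# Made-up sample data, for illustration only.
concordance = {}
add_word("the", 1, concordance)
add_word("cat", 1, concordance)
add_word("the", 3, concordance)
print(concordance)  # {'the': [1, 3], 'cat': [1]}
```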
Reminder
After you complete each function be sure to commit and push your changes!
print_entry(word, concordance): This handles printing one word of the output and its line numbers. concordance is the dictionary storing the concordance, so concordance[word] should be the list of line numbers on which word occurs.
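As a rough sketch, the function could look like the following. The output layout used here (the word, a colon, then space-separated line numbers) is only a guess for illustration; match whatever format the Testing section of the lab specifies.

```python
def print_entry(word, concordance):
    """Print one word followed by the line numbers on which it occurs."""
    # Line numbers are integers, so convert each to a string before joining.
    line_numbers = " ".join(str(number) for number in concordance[word])
    # Assumed layout: "word: 1 3". The lab's required format may differ.
    print(f"{word}: {line_numbers}")

# Made-up sample data, for illustration only.
print_entry("the", {"the": [1, 3], "cat": [1]})  # the: 1 3
```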
Running the code:
- Get the file name from the user
- Open the file
- Read the file by looping through one line at a time
  - Split the line into words
  - Call standardize_word() to prepare each word for adding to the concordance
  - Call add_word() to add the word and its current line_number to the concordance
- Loop over the keys of the concordance and call print_entry() on each word to handle the output. The loop should print the words from the text in alphabetical order.
- Make sure to close the file when you are done using it
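The reading and recording steps above can be sketched as follows. To keep the example runnable without a file, it loops over a small made-up list of lines instead of an open file. The helper name build_concordance, the choice to number lines starting at 1, and the decision to skip words that become empty after stripping are all assumptions, not requirements stated in the lab.

```python
import string

def standardize_word(word):
    return word.lower().strip(string.punctuation)

def add_word(word, line_number, concordance):
    if word not in concordance:
        concordance[word] = []
    concordance[word].append(line_number)

def build_concordance(lines):
    """Hypothetical helper: build a concordance from an iterable of lines."""
    concordance = {}
    # Assumption: the first line of the text is line number 1.
    for line_number, line in enumerate(lines, start=1):
        for raw_word in line.split():
            word = standardize_word(raw_word)
            # Assumption: skip tokens that were nothing but punctuation.
            if word:
                add_word(word, line_number, concordance)
    return concordance

# Made-up stand-in for the lines of a file.
sample_lines = ["The cat sat.\n", "A cat, the cat!\n"]
concordance = build_concordance(sample_lines)
for word in sorted(concordance.keys()):
    print(word, concordance[word])
# a [2]
# cat [1, 2, 2]
# sat [1]
# the [1, 2]
```

An open file can be looped over line by line in the same way as the list used here, so the same loop body should carry over once the file is opened.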
Before you submit your lab, make sure you try out some filenames that do not match the names of the test files we provide in your GitHub repository or the Testing section. This will help you exercise your try/except statement.
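A minimal sketch of how a missing file might be handled is shown below. The helper name read_lines, the error message, and the filename are made up for illustration, and the sketch assumes that catching FileNotFoundError is the behavior you want; your program may be structured differently (for example, reading the name with input() and looping over the open file directly).

```python
def read_lines(filename):
    """Hypothetical helper: return the file's lines, or None if it is missing."""
    try:
        # The with statement closes the file automatically when the block ends.
        with open(filename) as infile:
            return infile.readlines()
    except FileNotFoundError:
        # Assumed message; use whatever wording the lab asks for.
        print(f"Could not open file: {filename}")
        return None

# A filename that should not exist, to exercise the except branch.
result = read_lines("no_such_file_for_this_example.txt")
print(result)  # None
```

If you open the file without a with statement, remember to call close() on it yourself.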
README
You need to print all of the words in your concordance in alphabetical order, followed by their line numbers. For this, we can use Python’s built-in sorted() function. sorted() takes any iterable (i.e., anything that can be iterated over) and returns a new sorted list. So, if we want to get a sorted version of our concordance keys, we can write
sorted(concordance.keys())
which will conveniently return a list of our keys in alphabetical order.
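As a small illustration with made-up data, looping over the sorted keys visits the words in alphabetical order regardless of the order in which they were added to the dictionary:

```python
# Made-up concordance; the words were added in non-alphabetical order.
concordance = {"the": [1, 3], "cat": [1], "apple": [2]}

# sorted() returns a new list and leaves the dictionary itself unchanged.
for word in sorted(concordance.keys()):
    print(word, concordance[word])
# apple [2]
# cat [1]
# the [1, 3]
```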