# Sets
A set is another built-in data structure supported by Python, modeling the mathematical notion of a set, i.e. a collection of elements. Unlike a dictionary, the elements in a set don't have values associated with them. You could simulate a set using a dictionary by adding a key for each element and setting that key's value to something arbitrary, like 0, an empty string, or None. That said, if you don't have data associated with each element and simply want to keep track of a collection of items, using a set is the way to go.

Unlike lists, sets are not ordered, but, as with dictionaries, testing membership and adding or removing elements is very fast. Sets do not store duplicate elements: adding an element to a set that already contains that element has no effect.
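
The points above can be seen in a short sketch. The variable names here (team_dict, team_set, and so on) are made up for illustration; the sketch only compares a dictionary used as a stand-in for a set with a real set, and shows that a duplicate add changes nothing.

```python
# Simulating a set with a dictionary: the values are arbitrary placeholders.
team_dict = {}
team_dict["adam"] = None
team_dict["roberto"] = None

# The same collection as a real set: no placeholder values needed.
team_set = {"adam", "roberto"}

# Adding an element that is already present leaves the set unchanged.
team_set.add("adam")
size_after_duplicate = len(team_set)
print(size_after_duplicate)    # 2

# Membership tests read the same way for both structures.
both_contain_adam = "adam" in team_dict and "adam" in team_set
print(both_contain_adam)       # True
```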

Here are some examples of syntax involving sets.


    team = set()                       # makes a set with 0 elements
    team = {"adam", "roberto"}         # makes a set with 2 elements
    len(team)                          # 2
    team.add("bob")                    # adds "bob" to team
    team.remove("adam")                # removes "adam" from team
    for p in team:                     # iterates through elements of team
        print(p)                       # prints each element, in no particular order
    "bob" in team                      # True
    "jackie" in team                   # False
    "sam" not in team                  # True
                  

# Anagrams
**anagrams.py: 20 points**

An anagram is just a rearrangement of the letters in a word or phrase to form another word or words. For example, if you rearrange the letters in

    oberlin student 
                
you can get 

    let none disturb 
                
or
   
    intends trouble
                
and many, many more.

For this part of the assignment, you'll be writing a program called anagrams.py to generate your own anagrams. To decide which anagrams are at least plausibly interesting, your program will have to decide which strings are legitimate words. Your program should prompt the user for a file containing a word list, and for a word for anagrammating.

# Program Outline

Broadly speaking, the steps you'll follow will include the following. 

* Read in a text document containing a word list. Here are two: [words1.txt](http://www.cs.oberlin.edu/~ctaylor/classes/150S20/Labs/Lab08/words1.txt), [words2.txt](http://www.cs.oberlin.edu/~ctaylor/classes/150S20/Labs/Lab08/words2.txt). The first is very small, just for testing purposes. The second contains about 4000 common words. Even for relatively short strings, we'll need to use some optimizations if we want to generate anagrams using that word list. 

* Build a set words containing each word from the text file. Since we have a lot of words, using a set rather than a list will save us a lot of time when testing membership (which is basically all we'll be using it for). 

* Create a function called contains(s, word) which returns two values. The first value should be a boolean indicating whether the string s contains the letters necessary to spell word. If the answer is True, the second value should be what remains of s after the letters in word have been removed. If the answer is False, the second value returned should just be an empty string. For example, 


  	contains("zombiepig", "bozo")        # returns False, ""
  	contains("zombiepig", "biz")         # returns True, "omepig"
                	
* Create a recursive function called grams(s, words, sofar) that takes in a string s, a set of words words, and a list of words sofar. This function finds all anagrams of s using elements found in words. Each of these anagrams is printed, along with the words in sofar, on its own line. 
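
For the first two steps of the outline, here is a minimal sketch of reading a word list into a set. It assumes the file holds one word per line, which you should check against the actual word lists. The function name load_words is hypothetical, and the temporary file only stands in for a real file such as words1.txt so that the sketch runs on its own.

```python
import os
import tempfile


def load_words(filename):
    """Return the set of words in filename (assumes one word per line)."""
    words = set()
    with open(filename) as f:
        for line in f:
            word = line.strip()    # drop the newline and surrounding spaces
            if word:               # skip blank lines
                words.add(word)
    return words


# A small temporary file standing in for a real word list.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("air\nbro\npoet\n\nair\n")
    sample_path = tmp.name

sample_words = load_words(sample_path)
os.remove(sample_path)

print(sorted(sample_words))    # ['air', 'bro', 'poet']
```

Note that the repeated "air" in the file ends up in the set only once.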

You might be wondering why we're passing around the variable sofar. Indeed, when we want to find the anagrams of a string given by the user, we'll pass in an empty list. However, that list will be critical for making use of recursion. Let's look at an example to see why. Suppose we want to find anagrams of

   
      robopirate 
                	
We'll look through our wordlist for words whose letters are all contained in this string. The letters of "cat" aren't all found in "robopirate" (there is no "c"), but the letters of "air" are. So one thing our function call will do is begin looking through the remainder of "robopirate" with "air" removed, looking for further anagrams. That is, it'll continue to look for strings contained in 

    	robopte
            	    
Our list includes "bro", which is contained in "robopte", so another recursive call will be made on the remains, namely "opte". Our wordlist contains "poet", leaving us with an empty string. At this point we've used up all the letters in the string, so we have an anagram, namely 

      air bro poet 
                  
Unfortunately, if we want to print our anagram, we're in trouble, since we haven't kept a record of the previous words we found. That's where sofar comes in. This list will track the words we've found so far in this particular branch of the recursion. That is, 


      grams("robopirate", words, [])
                  
will call (among other things) 


  	  grams("robopte", words, ["air"]) 
                	
which in turn calls 


  	  grams("opte", words, ["air", "bro"]) 
                	
which in turn calls 


  	  grams("", words, ["air", "bro", "poet"]) 
                	
which can now print the complete anagram.

With this in mind, we're ready to describe the overall structure of this function. We loop over every word w in our wordlist. For each word w that's found in our string s, we make a recursive call on the remainder of s, and with a new list, equal to the current list with w added on. If we make a call where s is the empty string, we can just print the contents of sofar.
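
The phrase "a new list, equal to the current list with w added on" matters more than it may seem. The sketch below is only an illustration of that one idea, not of the whole function: it contrasts building a new list with + against changing a list in place with append. The names extended and shared are made up for the example.

```python
sofar = ["air"]

# sofar + [w] builds a brand-new list; the original is left untouched.
extended = sofar + ["bro"]
print(sofar)       # ['air']
print(extended)    # ['air', 'bro']

# By contrast, append changes the list in place, so anything else that
# refers to the same list sees the extra word too.
shared = sofar
shared.append("poet")
print(sofar)       # ['air', 'poet']
```

Passing a fresh list to each recursive call keeps the branches of the recursion from interfering with one another.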

**Suggestions and Tips**

When trying to determine whether one string s contains a word w, the string replace function is very useful. In particular, 


  	  s.replace(ch,'',1) 
                	
creates a new string that is identical to s except that the first occurrence of ch (parameter 1) is replaced by the empty string (parameter 2). The 1 (parameter 3) limits the replacement to a single occurrence. 
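
Here is a small sketch of how replace behaves, using a string from the earlier example. The variable names are only for illustration.

```python
s = "zombiepig"

# Only the first "i" is removed because the count argument is 1.
without_first_i = s.replace("i", "", 1)
print(without_first_i)    # zombepig

# Strings are immutable, so s itself is unchanged.
print(s)                  # zombiepig

# If the character is not in s, replace returns an equal string.
unchanged = s.replace("x", "", 1)
print(unchanged == s)     # True
```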

When printing a list of strings, the string join function is pretty handy: 


    	" ".join(L) 
                	
returns a new string containing every element of the list L, glued together with a space. Of course, you could join the elements with any string, but a space makes the most sense here. 
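
A quick sketch of join on a list like the ones sofar will hold:

```python
L = ["air", "bro", "poet"]

# Glue the elements together with a single space between them.
line = " ".join(L)
print(line)              # air bro poet

# Any separator string works the same way.
dashed = "-".join(L)
print(dashed)            # air-bro-poet

# Joining an empty list gives the empty string.
empty = " ".join([])
print(empty == "")       # True
```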

**Test output**

If you print all strings from words1.txt that are contained in "robopirate", you should get the following: 

      or bro rat bat air ape poet poor ripe taboo orbit
                
Note that this doesn't contain "rabbit" ("robopirate" only has one "b"). If you run your program using words1.txt for your word list on the string "robopirate", you should get (but maybe not in the same order): 

     or ape orbit
     or orbit ape
     bro air poet
     bro poet air
     air bro poet
     air poet bro
     ape or orbit
     ape orbit or
     poet bro air
     poet air bro
     orbit or ape
     orbit ape or
                
It doesn't matter if your output is in another order. 

**Improvements (Extra Credit)**

Once that's working, add in the following optimization and improvements: 

*3 pts.* Things slow down a lot when we have longer word lists. One nice optimization makes use of the following observation: if the user's string s doesn't contain a particular word w, then no remainder of s will contain w either. So instead of iterating through the set of all words at each step of the recursion, we need only iterate through the "plausible" words.
Add a "preprocessing" step: instead of adding every string found in the word file to the set words, only add those that are contained in the user's input string. Fortunately, you have already created a function that can help you out here!

*3 pts.* Let the user specify a minimum length of words allowable in their anagrams. 

*3 pts.* Let the user specify a maximum number of words allowable in their anagrams. 

