Due by 11:59.59pm Friday, October 30th 2015
Note: You may work with a partner on this project. Let me know who you are working with (along with a team name and desired repository name), and I'll set up a shared private GitHub repository for this assignment. As there are a number of components to this assignment, I'd encourage you to get started before the last minute.
Also go through the interactive GitHub tutorial if you have not done so.
In this class, you'll be re-creating a few common Unix tools. As we have discussed in class, the general Unix tool philosophy is that each program does one thing (hopefully well) and that programs can be composed together to perform more powerful operations.
The first tool you will be creating will allow you to reformat text. This will give you some experience working with C strings and command line arguments.
I'm interested in seeing how much time students estimate an assignment will take versus how much time they actually spend on the assignment. What I'd like you to first do is read through this assignment and create a README with your estimated time to complete it. (Feel free to list time for individual components as well if you'd like.) After you get done, I'd like you to add in the actual amount of time you spent.
The program you'll be creating is called format. Its job will be to read in input and "neaten" it up. It reads in paragraphs of words and rearranges them such that they fit nicely onto a line of specified width inserting line breaks as needed. A paragraph is separated from other paragraphs by one or more empty lines (which might contain whitespace).
This program is based on the Unix fmt program.
INPUT: Atop a large, ice-covered plateau, a struggle for survival is occurring. Two groups of mechanical creatures -- Chompers and Lobbers -- are battling to control this square, slippery field for reasons that are beyond human comprehension. However, the Lobbers believe that you, The Programmer knows what you are doing and have decided that you will instruct them in their activities during this fateful day.
OUTPUT: Atop a large, ice-covered plateau, a struggle for survival is occurring. Two groups of mechanical creatures -- Chompers and Lobbers -- are battling to control this square, slippery field for reasons that are beyond human comprehension. However, the Lobbers believe that you, The Programmer knows what you are doing and have decided that you will instruct them in their activities during this fateful day.
format supports 4 command line arguments which change the behavior slightly.
The -w flag allows you to change the width of the output lines. The default width is 72 characters per line.
Using the same input as above, we can specify our program to run as ./format -w 40 and get the following:
Atop a large, ice-covered plateau, a struggle for survival is occurring. Two groups of mechanical creatures -- Chompers and Lobbers -- are battling to control this square, slippery field for reasons that are beyond human comprehension. However, the Lobbers believe that you, The Programmer knows what you are doing and have decided that you will instruct them in their activities during this fateful day.
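As a rough illustration of the basic fill behavior, here is a minimal sketch that works on a string already in memory. The name wrap_text and its parameters are my own invention for this example and are not part of the assignment. It assumes a line may hold at most width characters and that a word longer than width simply gets a line to itself. It also ignores paragraph breaks entirely.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the words of `text` into `out`, starting a new
 * line whenever the next word would push the current line past `width`.
 * Returns 0 on success, -1 if `out` (of `out_size` bytes) is too small. */
int wrap_text(const char *text, int width, char *out, size_t out_size)
{
    size_t used = 0;   /* bytes written to out so far */
    int line_len = 0;  /* characters on the current output line */

    if (out_size == 0)
        return -1;

    while (*text != '\0') {
        size_t word_len = 0;

        /* Skip whitespace between words. */
        while (*text != '\0' && isspace((unsigned char)*text))
            text++;
        if (*text == '\0')
            break;

        /* Measure the next word. */
        while (text[word_len] != '\0' &&
               !isspace((unsigned char)text[word_len]))
            word_len++;

        /* Worst case: a separator, the word, a final newline, and the NUL. */
        if (used + word_len + 3 > out_size)
            return -1;

        if (line_len > 0) {
            if (line_len + 1 + (int)word_len > width) {
                out[used++] = '\n';   /* word does not fit: break the line */
                line_len = 0;
            } else {
                out[used++] = ' ';    /* word fits: separate with one space */
                line_len++;
            }
        }
        memcpy(out + used, text, word_len);
        used += word_len;
        line_len += (int)word_len;
        text += word_len;
    }
    if (line_len > 0)
        out[used++] = '\n';
    out[used] = '\0';
    return 0;
}
```

Calling wrap_text("the quick brown fox", 10, buf, sizeof buf) leaves two lines in buf: "the quick" and "brown fox". Your own program will most likely read from standard input instead, so treat this only as a picture of the line-breaking decision.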
The -r flag allows you to specify that you want all the text aligned on the right side, not the left. Using the same input as above, we can specify our program to run as ./format -r and get the following:
Atop a large, ice-covered plateau, a struggle for survival is occurring. Two groups of mechanical creatures -- Chompers and Lobbers -- are battling to control this square, slippery field for reasons that are beyond human comprehension. However, the Lobbers believe that you, The Programmer knows what you are doing and have decided that you will instruct them in their activities during this fateful day.
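One possible way to get the padding for right alignment is the field-width feature of the printf family: "%*s" takes the width as an argument and pads on the left with spaces. The sketch below is only an illustration. The name right_align is hypothetical, and it assumes you have already assembled the line as a C string.

```c
#include <stdio.h>

/* Hypothetical helper: write `line` into `out`, padded on the left with
 * spaces so that it ends at column `width`. A line longer than `width`
 * is copied unchanged. Returns 0 on success, -1 if `out` is too small. */
int right_align(const char *line, int width, char *out, size_t out_size)
{
    /* "%*s" right-justifies the string in a field of `width` characters. */
    int n = snprintf(out, out_size, "%*s", width, line);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For example, right_align("hello", 8, buf, sizeof buf) leaves three spaces followed by "hello" in buf. The same format string works with printf directly if you do not need the intermediate buffer.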
The -j flag allows you to fully justify the text. That is, each line extends from the left side all the way to the maximum width of the line. To simplify things, do this even for the last line of a paragraph.
Using the same input as above, we can specify our program to run as ./format -j and get the following:
Atop a large, ice-covered plateau, a struggle for survival is occurring. Two groups of mechanical creatures -- Chompers and Lobbers -- are battling to control this square, slippery field for reasons that are beyond human comprehension. However, the Lobbers believe that you, The Programmer knows what you are doing and have decided that you will instruct them in their activities during this fateful day.
Professor Kevin Woods came up with a nifty way to calculate the number of spaces that go between words in the fully justified case. If you know the number of words on a line, then you can figure out the number of gaps that need to be filled (one fewer than the number of words).
Similarly, if you know the total length of the words on that line, you can determine the number of spaces that should be inserted (the line width minus that total length).
You then can apply integer division to calculate the total number of spaces that should have been seen by the time you finish a gap.
# of spaces seen so far = (# of gaps so far) * (total # of spaces) / (total # of gaps)
As an example, imagine we have 4 words on a line, and 8 extra spaces. There are 3 gaps to fill. At the first one we put 2 spaces:
1 * 8 / 3 = 2 spaces
At the second, 3 spaces because the total needs to be:
2 * 8 / 3 = 5 spaces
And finally, 3 more spaces to finish off the set:
3 * 8 / 3 = 8 spaces
While you aren't required to use this formula to determine spacing, it seems to work quite well. Certainly better than just pushing the excess into one of the edge positions. (That would have been 2-2-4 instead of 2-3-3.)
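The formula above can be sketched in C roughly as follows. This is just one way to apply it, not a required design. The name justify_line and the idea of passing the words as an array are assumptions made for this example. A line with a single word has no gaps, so the sketch copies it unchanged to avoid dividing by zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a fully justified line of exactly `width`
 * characters from `nwords` words. Returns 0 on success, -1 if the words
 * cannot fit or `out` (of `out_size` bytes) is too small. */
int justify_line(const char *words[], int nwords, int width,
                 char *out, size_t out_size)
{
    int letters = 0;   /* total length of all the words */
    int gaps, spaces, placed = 0;
    size_t pos = 0;

    if (nwords <= 0)
        return -1;
    for (int i = 0; i < nwords; i++)
        letters += (int)strlen(words[i]);

    gaps = nwords - 1;          /* total # of gaps */
    spaces = width - letters;   /* total # of spaces */
    if (gaps > 0 && spaces < gaps)
        return -1;              /* not even one space per gap */
    if (out_size < (size_t)(gaps > 0 ? width : letters) + 1)
        return -1;

    for (int i = 0; i < nwords; i++) {
        size_t len = strlen(words[i]);

        memcpy(out + pos, words[i], len);
        pos += len;
        if (i < gaps) {
            /* Spaces that should have been seen once this gap is done,
             * using integer division as in the formula. */
            int target = (i + 1) * spaces / gaps;

            while (placed < target) {
                out[pos++] = ' ';
                placed++;
            }
        }
    }
    out[pos] = '\0';
    return 0;
}
```

With the four words "aa", "bb", "cc", "dd" and a width of 16 there are 8 spaces to place, and the gaps come out as 2, 3, and 3 spaces, matching the worked example.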
Finally, you should also support the -s flag to indicate that you want multiple blank lines to be compressed into just one. (Normally, you'd have a bunch of empty paragraphs.)
INPUT: I like cheese. How about you?
OUTPUT: I like cheese. How about you?
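To illustrate only the blank-line handling, here is a small sketch that copies lines through unchanged while collapsing runs of blank lines into one. It does no reformatting of the paragraphs themselves. The names is_blank_line, copy_squeezed, and LINE_BUF are made up for this example, and it assumes every input line fits in the fixed-size buffer.

```c
#include <ctype.h>
#include <stdio.h>

#define LINE_BUF 1024   /* assumed upper bound on input line length */

/* Returns 1 if `line` contains only whitespace (or nothing), else 0. */
int is_blank_line(const char *line)
{
    while (*line != '\0') {
        if (!isspace((unsigned char)*line))
            return 0;
        line++;
    }
    return 1;
}

/* Copies `in` to `out`, writing a single empty line for each run of
 * consecutive blank lines. */
void copy_squeezed(FILE *in, FILE *out)
{
    char line[LINE_BUF];
    int prev_blank = 0;   /* was the previous input line blank? */

    while (fgets(line, sizeof line, in) != NULL) {
        int blank = is_blank_line(line);

        if (blank && prev_blank)
            continue;     /* already emitted one blank line for this run */
        fputs(blank ? "\n" : line, out);
        prev_blank = blank;
    }
}
```

The same is_blank_line test is also a reasonable way to spot paragraph boundaries, since a separator line may contain whitespace.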
These options should be cumulative -- you can specify width, alignment, and skipping of blank lines together. The -r and -j flags do not make sense being used together. If they are, just use whichever was specified last.
Some guidelines you should follow when working on your solution:
You might also want to think about how you can break this down into the basic functionality and how each flag modifies the behavior. There are many ways to correctly implement these specifications -- read in and assemble a line by reading a word at a time and then processing the line OR read in a bunch of words, then assemble the lines from that.
You should be doing a process of stepwise refinement. Add in one new component and test to be sure it works before moving on to the next. Trying to do everything all at once will likely only lead to much confusion. Also, sketching out your design on paper beforehand is invaluable.
For example, suppose that while sketching things out you decide that if you could just implement some of the functionality of Java's Scanner.next(), you'd have a design for handling the basic case. You should then implement that function and test it to see if it behaves as it should on a variety of input. Once it is working, you can move on to the next phase of your design.
Note: There is no need to dynamically allocate space in this assignment. You should be able to use fixed size buffers for all of the operations requested.
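As an example of the Scanner.next() idea with a fixed-size buffer, here is one possible sketch. The name next_word and its signature are hypothetical. It assumes that a word too long for the buffer is simply truncated, and it deliberately ignores blank lines, so you would still need a separate way to notice paragraph breaks.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read the next whitespace-delimited word from `in`
 * into `buf`, storing at most `size - 1` characters plus a terminating NUL.
 * Extra characters of an over-long word are discarded.
 * Returns the number of characters stored, or 0 when no words remain. */
size_t next_word(FILE *in, char *buf, size_t size)
{
    int c;
    size_t len = 0;

    if (size == 0)
        return 0;

    /* Skip leading whitespace, including newlines. */
    while ((c = getc(in)) != EOF && isspace(c))
        ;

    /* Collect characters until whitespace or end of input. */
    while (c != EOF && !isspace(c)) {
        if (len + 1 < size)
            buf[len++] = (char)c;
        c = getc(in);
    }

    /* Put back the delimiter so the caller can still inspect it. */
    if (c != EOF)
        ungetc(c, in);

    buf[len] = '\0';
    return len;
}
```

A loop such as while (next_word(stdin, word, sizeof word) > 0) then hands you one word at a time, which is easy to test on its own before any line assembly exists.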
In addition to the flags listed above, I'd like you to implement -h and -? flags that print out a brief usage message and then exit the program with a non-zero value. Do the same if an unknown flag is passed to your program.
You should have your program's main function return 0 upon successful completion of the assigned task.
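Here is a minimal sketch of how the flags might be collected before any formatting happens. Everything named here (format_opts, parse_args, MAX_WIDTH, the enum values) is hypothetical and chosen only for illustration. It assumes the width is given as a separate argument, as in ./format -w 40, and it caps the width at an arbitrary limit so fixed-size buffers stay safe.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_WIDTH 1024   /* arbitrary cap, an assumption for this sketch */

enum alignment { ALIGN_LEFT, ALIGN_RIGHT, ALIGN_JUSTIFY };

struct format_opts {
    int width;             /* -w N, default 72 */
    enum alignment align;  /* -r or -j, whichever came last */
    int squeeze;           /* -s */
};

/* Fill in `opts` from the command line. Returns 0 on success, or -1 if
 * the caller should print the usage message and exit with a non-zero
 * value (-h, -?, an unknown flag, or a missing or invalid width). */
int parse_args(int argc, char *argv[], struct format_opts *opts)
{
    opts->width = 72;
    opts->align = ALIGN_LEFT;
    opts->squeeze = 0;

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-w") == 0) {
            char *end;
            long w;

            if (i + 1 >= argc)
                return -1;                 /* -w with no number */
            w = strtol(argv[++i], &end, 10);
            if (*end != '\0' || w <= 0 || w > MAX_WIDTH)
                return -1;                 /* not a usable width */
            opts->width = (int)w;
        } else if (strcmp(argv[i], "-r") == 0) {
            opts->align = ALIGN_RIGHT;     /* a later -j overrides this */
        } else if (strcmp(argv[i], "-j") == 0) {
            opts->align = ALIGN_JUSTIFY;   /* a later -r overrides this */
        } else if (strcmp(argv[i], "-s") == 0) {
            opts->squeeze = 1;
        } else {
            return -1;                     /* -h, -?, or anything unknown */
        }
    }
    return 0;
}
```

In main you would call parse_args first, print the usage message and return a non-zero value if it reports -1, and otherwise do the formatting and return 0. The POSIX getopt function is another reasonable option if you would rather not walk argv by hand.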
You'll also be creating a man page to accompany your tool. Man pages are simply text files that have some additional annotations with formatting instructions -- somewhat similar to LaTeX if you've encountered that as well.
Traditionally, the tool nroff was used to typeset man pages. Many systems now have groff and just have nroff as an alias to groff. You can get more info on the various options from the man page groff_man(7).
At a minimum, you will need to know the following macros (all go at the start of the line).
.\" Sample man page for CSCI 241
.\" Benjamin Kuperman - Fall 2011
.TH sample_man 1 "06 October 2011" "CSCI 241" "Oberlin College"
.SH NAME
.B sample_man
\- an example of a sample man page
.SH SYNOPSIS
.B sample_man
[ -o outputfile ] <filename>
.SH DESCRIPTION
Does everything a sample man page should do.
.SH OPTIONS
.IP "-o outputfile"
Do things to be written to an output file.
.SH AUTHOR
Benjamin Kuperman (Fall 2011)
.SH BUGS
None!
You can also see/download a longer version.
You should name your file based on the standard convention <name_of_program>.<manual_section>. So for the above page, it would be sample_man.1. (Recall that section 1 is user commands.)
To view the processed man page use either of the following commands:
Here is the HTML version of the above:
sample_man(1)                   Oberlin College                  sample_man(1)

NAME
       sample_man - an example of a sample man page

SYNOPSIS
       sample_man [ -o outputfile ] <filenames>

DESCRIPTION
       Does everything a sample man page should do.

OPTIONS
       -o outputfile
              Do things to be written to an output file.

AUTHOR
       Benjamin Kuperman (Fall 2011)

BUGS
       None!

CSCI 241                        06 October 2011                  sample_man(1)
Create a file called README that contains
Now you should run make clean to get rid of your executables, and then handin your folder containing your source files, Makefile, and README.
% cd ~/cs241
% handin -c 241 -a 5 hw5
% lshand
Here is what I am looking for in this assignment: