CS-151 Labs > Lab 1. The First Cup of Java


Part 3. Summer School

For this part of the lab, we will be using a data set of Teaching Assistant evaluation scores collected by the Statistics Department of the University of Wisconsin-Madison. Our research question for this dataset is “Do classes offered during the summer have higher or lower average evaluation scores than those offered during the regular year?” You will create a class named EvalAnalysis that answers this question.

The data is in tae.prn, which you should download and place in your lab1 directory.

Each line of the file contains 5 numbers separated by spaces. These numbers are the following for each class:

  1. Course instructor (categorical, 25 categories);
  2. Course (categorical, 26 categories);
  3. Summer or regular semester (binary) 1=Summer, 2=Regular;
  4. Class size (numerical); and
  5. Course Evaluation Scores (categorical) 1=Low, 2=Medium, 3=High

For our purpose, we will be looking at summer versus regular semester, and course evaluation score – we will only need the values from columns 3 and 5.

The program should take the name of a file from the command line, loop through every line of the file, and calculate the average course evaluation score for summer courses and for regular courses.

Use one Scanner to read the file line-by-line, and another Scanner to parse each line. You have seen an example of this in the warmup.
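As a reminder of the two-Scanner pattern, here is a minimal sketch. It only prints columns 3 and 5 of each line and does not compute any averages. The class name TwoScannerSketch, the helper describe, and the sample line used to check it are made up for illustration; they are not part of the lab's starter code.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class TwoScannerSketch {
    // Parses one line of five space-separated integers and reports
    // column 3 (semester) and column 5 (evaluation score).
    static String describe(String line) {
        Scanner lineScanner = new Scanner(line);
        lineScanner.nextInt();                 // column 1: instructor (ignored)
        lineScanner.nextInt();                 // column 2: course (ignored)
        int semester = lineScanner.nextInt();  // column 3: 1=Summer, 2=Regular
        lineScanner.nextInt();                 // column 4: class size (ignored)
        int score = lineScanner.nextInt();     // column 5: 1=Low, 2=Medium, 3=High
        lineScanner.close();
        return "semester=" + semester + " score=" + score;
    }

    public static void main(String[] args) throws FileNotFoundException {
        if (args.length != 1) {
            System.out.println("Usage: java TwoScannerSketch <file>");
            return;
        }
        // Outer Scanner: walks the file one line at a time.
        Scanner fileScanner = new Scanner(new File(args[0]));
        while (fileScanner.hasNextLine()) {
            String line = fileScanner.nextLine();
            if (!line.trim().isEmpty()) {      // skip blank lines, if any
                // Inner Scanner (inside describe): parses the numbers on this line.
                System.out.println(describe(line));
            }
        }
        fileScanner.close();
    }
}
```

For a line such as "23 3 1 19 3" (an invented example, not taken from tae.prn), describe returns "semester=1 score=3".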

To calculate the average scores, keep two running totals: one where you add together all the course evaluation scores for summer courses, and one where you add together all the course evaluation scores for regular courses. You should also keep track of the total number of summer courses and the number of regular courses. Then after you have gone through the file, you can calculate the average by dividing the course eval total by the number of courses.
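The bookkeeping described above might look like the following sketch. To keep it separate from the file-reading code, it works on two small hard-coded arrays rather than on tae.prn; the arrays, the class name RunningTotalsSketch, and the method tally are all assumptions made for this example. In your program the same updates would happen inside the loop over the lines of the file.

```java
class RunningTotalsSketch {
    // Returns {summerTotal, summerCount, regularTotal, regularCount}.
    static int[] tally(int[] semesters, int[] scores) {
        int summerTotal = 0, summerCount = 0;
        int regularTotal = 0, regularCount = 0;
        for (int i = 0; i < semesters.length; i++) {
            if (semesters[i] == 1) {        // 1 = Summer
                summerTotal += scores[i];
                summerCount++;
            } else {                        // 2 = Regular
                regularTotal += scores[i];
                regularCount++;
            }
        }
        return new int[] {summerTotal, summerCount, regularTotal, regularCount};
    }

    public static void main(String[] args) {
        // Made-up data: five classes, two of them in the summer.
        int[] semesters = {1, 2, 2, 1, 2};
        int[] scores    = {3, 1, 2, 2, 3};
        int[] t = tally(semesters, scores);
        System.out.println("summer: total=" + t[0] + " count=" + t[1]);
        System.out.println("regular: total=" + t[2] + " count=" + t[3]);
    }
}
```

With this made-up data the summer total is 5 over 2 classes and the regular total is 6 over 3 classes. The averages are computed only once, after the loop has finished.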

To skip the number in a particular column, call scanner.nextInt() without saving the result; the number is read and then discarded. For example, to skip the first 10 numbers in a line but save the 11th one, the code would look like this:

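// Read and discard the first 10 numbers on the line.
for (int i = 0; i < 10; i++) {
    scanner.nextInt();
}
// The next call returns the 11th number, which we keep.
int eleventh = scanner.nextInt();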

After running your program you should see something like this:

The average review of a summer class is: x.xxxxxxx
The average review of a regular class is: x.xxxxxxx

where the xs will be replaced with the actual numbers you calculate. Recall that, unlike Python, Java does not have separate operators for integer and floating-point division. Instead, the / operator performs integer division (discarding the remainder) when both operands are integers, and floating-point division when either operand is a float. Since both your total and your count are integers, first multiply one of them by 1.0f to turn it into a float, and then perform the division.
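A small sketch of the difference, using made-up numbers (a total of 7 over 2 courses). The class name DivisionSketch and the helper average are hypothetical names chosen for this example.

```java
class DivisionSketch {
    // Multiplying by 1.0f first turns the total into a float,
    // so the division that follows is floating-point division.
    static float average(int total, int count) {
        return (total * 1.0f) / count;
    }

    public static void main(String[] args) {
        int total = 7;
        int count = 2;
        System.out.println(total / count);          // integer division: prints 3
        System.out.println(average(total, count));  // float division: prints 3.5
    }
}
```

Note that the multiplication must happen before the division. Writing (total / count) * 1.0f would perform the integer division first and give 3.0 instead of 3.5.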

We have provided you with two files, sample.prn and tae.prn. File sample.prn contains artificial data that you can use to test the correctness of your program. If your program is correct, it should give you the following results:

The average review of a summer class is: 1.5714285
The average review of a regular class is: 2.2173913

The actual data is in file tae.prn (tae stands for Teaching Assistant Evaluations). For your code to “see” the input files, you need to place them in your lab1 folder, the one that holds your bin and src directories.