Bible Words due Mon 02 Oct 14:30

This assignment is intended to impart several skills and concept...
...ity of a solution
\item convert a tree to an array


Your program will read the words of the Bible and count how many times each word appears. It will then display the 20 most frequently used words, in descending order of frequency.

To start work, get the latest version of the ds_homework repository from BitBucket. The input file is named biblewords.txt and can be found in the data directory. It contains the text of the entire KJV Bible with one word per line, no punctuation, and all words capitalized. You're welcome.

Your program should work in the following way:

  1. read a word from the data file
  2. search for the word in a binary search tree (keyed by word)
  3. if the word is found then you will increment the count
  4. if the word is not found then you will insert it into the tree
  5. if not at the end of the file then go back to step 1

After the word counts have been established, create an array that references the tree elements and then sort the array by word count. Display the 20 most frequently appearing words along with their counts in a formatted table.
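
One possible shape for the tree-to-array step is sketched below, again assuming C++. The Node layout and the function names are hypothetical, and std::stable_sort is used only as a stand-in; the course may expect you to write the sort yourself.

```cpp
#include <algorithm>
#include <cstddef>
#include <iomanip>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical minimal node: a word, its count, and two child links.
struct Node {
    std::string word;
    int count;
    Node* left;
    Node* right;
};

// In-order traversal: appends a pointer to every node, so the array
// refers to the tree elements instead of copying them.
void collect(const Node* node, std::vector<const Node*>& out) {
    if (!node) return;
    collect(node->left, out);
    out.push_back(node);
    collect(node->right, out);
}

// Builds the array of references and orders it by descending count.
// The stable sort keeps tied words in their in-order (alphabetical) order.
std::vector<const Node*> sortedByCount(const Node* root) {
    std::vector<const Node*> refs;
    collect(root, refs);
    std::stable_sort(refs.begin(), refs.end(),
                     [](const Node* a, const Node* b) { return a->count > b->count; });
    return refs;
}

// Prints at most `limit` rows as a two-column table.
void printTop(const std::vector<const Node*>& refs, std::size_t limit, std::ostream& os) {
    os << std::left << std::setw(25) << "WORD"
       << std::right << std::setw(8) << "COUNT" << '\n';
    for (std::size_t i = 0; i < refs.size() && i < limit; ++i) {
        os << std::left << std::setw(25) << refs[i]->word
           << std::right << std::setw(8) << refs[i]->count << '\n';
    }
}
```

Calling printTop(sortedByCount(root), 20, std::cout) would produce the required table. The column width of 25 comes from the stated maximum word length.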

In addition, you will display the amount of time taken by each task (processing, sorting, and total). You may assume that no word is longer than 25 characters and that there will be no more than 15,000 unique words (there are over 700,000 words in total).
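
Timing each task could look something like the sketch below, which assumes C++ and the standard std::chrono::steady_clock. The helper name secondsTaken is hypothetical.

```cpp
#include <chrono>

// Runs `task` once and returns the elapsed wall-clock time in seconds.
// steady_clock is monotonic, so the result is never negative.
template <typename Task>
double secondsTaken(Task task) {
    const auto start = std::chrono::steady_clock::now();
    task();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Possible use (the lambda bodies are placeholders):
//   double processing = secondsTaken([&] { /* read the file, build the tree */ });
//   double sorting    = secondsTaken([&] { /* build and sort the array */ });
//   double total      = processing + sorting;
```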

Create a Binary Search Tree Container Class

Create a container class that implements a binary search tree. Your container class should implement methods for insert and search in accordance with the properties of a binary search tree.
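
While testing insert, it can help to check that the tree still satisfies the binary search tree property: every key in a left subtree is smaller than its parent, and every key in a right subtree is larger. The checker below is a sketch assuming C++ and unique keys; Node and isSearchTree are hypothetical names, not something the assignment requires.

```cpp
#include <string>

// Hypothetical minimal node: a word, its count, and two child links.
struct Node {
    std::string word;
    int count;
    Node* left;
    Node* right;
};

// True when every key in the subtree lies strictly between *lo and *hi.
// A null bound means the subtree is unbounded on that side.
bool isSearchTree(const Node* node,
                  const std::string* lo = nullptr,
                  const std::string* hi = nullptr) {
    if (!node) return true;
    if (lo && !(*lo < node->word)) return false;
    if (hi && !(node->word < *hi)) return false;
    // Keys on the left must stay below this key; keys on the right above it.
    return isSearchTree(node->left, lo, &node->word) &&
           isSearchTree(node->right, &node->word, hi);
}
```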

Your class will also need an unusual sort method that creates and sorts an array of references to the tree. You'll also need an unconventional display() method that displays the first 20 elements of the sorted array.

Grading and Submission

You will turn in your work by committing it to the git repository you shared with the instructor in the first homework assignment.

Programs will be graded according to the following criteria:

Correctness/Completeness   24 pts
Documentation               2 pts
Conventions                 2 pts
Design                      4 pts
Version Control             4 pts
Total                      36 pts
