Computer Science 61C Project 1:
philspel, a simple, silly spelling checker

David Patterson

This project is designed to serve as an introduction to the C language. To complete it, you will have to use the C file input/output library, do memory allocation, manipulate strings, and coerce strings to void pointers and vice versa. Although there is conceptually a lot to learn to complete this project, the actual code you need to write is short.

philspel is a very simple and silly spelling checker. It accepts a single command line argument, the name of a dictionary to use. This dictionary consists of a list of valid words to use in checking the input. Words are separated by newlines. The first word is not preceded by anything, and the last word is followed by a newline.

philspel processes standard input and copies it to standard output. A word is a sequence of one or more consecutive alphabetical letters, uninterrupted by any non-letter character. For each word in the input, philspel looks up three variations: the word as written, the word converted entirely to lowercase letters, and the word with all but the first letter converted to lowercase. If any of the three variations is found in the dictionary, the word is copied to standard output unchanged. Otherwise, the word is copied to standard output with the string " [sic]" (without the quotation marks but with the leading space) appended. All other input is copied to standard output unchanged, character for character.
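
The three-variation lookup described above can be sketched as follows. This is only an illustration under stated assumptions: wordIsKnown, demoLookup, and MAX_WORD are hypothetical names that are not part of the provided files, and demoLookup is a tiny linear search standing in for the real dictionary lookup.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD 60  /* word-length bound the assignment lets you assume */

/* Stand-in for the real dictionary lookup (hypothetical): a linear
 * search over a tiny fixed word list, just to make the sketch usable. */
int demoLookup(const char *word) {
    static const char *words[] = { "hello", "Berkeley", "DNA" };
    size_t i;
    for (i = 0; i < sizeof words / sizeof words[0]; i++) {
        if (strcmp(word, words[i]) == 0) return 1;
    }
    return 0;
}

/* Returns 1 if the word as written, its all-lowercase form, or its form
 * with all but the first letter lowercased passes `lookup`; otherwise 0. */
int wordIsKnown(const char *word, int (*lookup)(const char *)) {
    char lower[MAX_WORD + 1];
    char capital[MAX_WORD + 1];
    size_t len = strlen(word);
    size_t i;

    if (len == 0 || len > MAX_WORD) return 0;  /* outside this sketch's scope */
    for (i = 0; i < len; i++) {
        lower[i] = (char)tolower((unsigned char)word[i]);
        capital[i] = (i == 0) ? word[i] : lower[i];
    }
    lower[len] = '\0';
    capital[len] = '\0';
    return lookup(word) || lookup(lower) || lookup(capital);
}
```

With the demo list above, wordIsKnown("BERKELEY", demoLookup) succeeds through the third variation ("Berkeley"), while wordIsKnown("berkeley", demoLookup) fails because none of its variations restores the capital letter.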

You should make a directory in your home directory named proj1, and copy all the files from
~cs61c/proj1/
into your proj1 directory. This copies over a makefile for the project and several code files. hashtable.c and hashtable.h are the code and header files that define a simple generic hashtable, which you may use. You should not need to modify these files at all. philspel.h declares the functions in philspel.c. You will need to implement four functions in philspel.c: stringHash(void *s), stringEquals(void *s1, void *s2), readDictionary(char *filename), and processInput(). You may modify philspel.h if you wish to declare additional helper functions that you implement in philspel.c.
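
Because the hashtable is generic, its hash and equality callbacks receive keys as void pointers and must coerce them back to strings. A minimal sketch of that coercion follows. The return types (unsigned int and int) are assumptions, since the prototypes above omit them; check hashtable.h for the types the table actually expects. The hash shown is the well-known djb2 string hash, chosen here only for illustration; the assignment does not prescribe one.

```c
#include <string.h>

/* djb2 string hash: start at 5381, then hash = hash * 33 + next byte.
 * The void pointer is assumed to point at a NUL-terminated string. */
unsigned int stringHash(void *s) {
    const unsigned char *p = (const unsigned char *)s;
    unsigned int hash = 5381u;
    while (*p != '\0') {
        hash = hash * 33u + *p;
        p++;
    }
    return hash;
}

/* Returns nonzero if both keys hold the same characters.
 * Comparing the pointers themselves would test identity, not content. */
int stringEquals(void *s1, void *s2) {
    return strcmp((const char *)s1, (const char *)s2) == 0;
}
```

Two strings that stringEquals reports as equal always hash to the same value, which is the property a hashtable needs from this pair of functions.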

Also included are a sample dictionary, input, and output. Your output should EXACTLY match ours, since we will be using automated scripts to grade your program. Another useful dictionary for testing is contained in /usr/dict/words. You can type
gmake test
in your proj1 directory to compile and test your program against a sample set of inputs. Although your standard output must match ours exactly, you can safely write all sorts of debugging information to stderr, as this will be ignored by our scripts and by the test routine provided in the Makefile.
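
As a small illustration of the stderr advice above, a debugging helper might look like the sketch below. The name debugLookup and its message format are hypothetical; the only point is that the output goes to stderr rather than stdout.

```c
#include <stdio.h>

/* Hypothetical debugging helper: reports one dictionary lookup on stderr,
 * leaving standard output untouched.
 * Returns the number of characters written, or a negative value on error. */
int debugLookup(const char *word, int found) {
    return fprintf(stderr, "debug: \"%s\" -> %s\n",
                   word, found ? "found" : "missing");
}
```

Since the messages never reach standard output, they cannot change the text that is compared against the expected output.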

You can assume that neither the dictionary nor the input contains words longer than 60 characters. However, for extra credit, you should ensure that your program fails gracefully (doesn't core dump or exit with a nonzero value) if it encounters words longer than 60 characters. You may not assume anything except what is explicitly stated in this assignment.
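
One way to avoid depending on the 60-character bound at all is to grow the buffer as characters arrive. The sketch below reads one dictionary line of any length. The name readDictLine and the starting capacity of 64 are assumptions made for illustration, and this is only one of several reasonable approaches.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line of any length from fp.
 * Returns a malloc'd string without the trailing newline (caller frees),
 * or NULL at end of input or if memory runs out. */
char *readDictLine(FILE *fp) {
    size_t cap = 64;   /* current buffer capacity, doubled when full */
    size_t len = 0;    /* characters stored so far */
    char *buf;
    int c;

    c = fgetc(fp);
    if (c == EOF) return NULL;
    buf = malloc(cap);
    if (buf == NULL) return NULL;
    while (c != EOF && c != '\n') {
        if (len + 1 >= cap) {              /* keep room for the '\0' */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
        c = fgetc(fp);
    }
    buf[len] = '\0';
    return buf;
}
```

Checking the result of malloc and realloc, and returning NULL instead of writing past the end of a fixed array, is what keeps an overlong word from causing a core dump.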

To submit your project, use the command
submit proj1
in your proj1 directory. Make sure that you test things first by running
gmake test
in your proj1 directory! This project is due on Friday, September 8, 2000 by 11:59pm (23:59).

Computer systems tend to crash under higher than normal load, not to mention at other random times like the first day of lab, so it is likely that computers will crash shortly before deadlines. Assume that will happen, and plan ahead. If you aim to turn in all assignments 24 hours before they are due this semester, life will be more enjoyable.

REMEMBER: Unlike in Java, dynamic data must be allocated before a pointer to it can make sense, and it is proper etiquette to free dynamic data once you are done with it. It may prove helpful to read through the "C survival notes" prepared by the Computer Science Undergraduate Association. One example is located at www.CSUA.Berkeley.EDU/~dans/HelpSessions/CSurvival/CSurvivalNotes_1999Feb01
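
The allocate-before-use and free-when-done rule can be sketched with a helper that makes a private heap copy of a word. The name copyWord is hypothetical. Such a copy matters whenever words are read into a buffer that gets reused, and a data structure keeps only the pointer it is given (whether the provided hashtable does so is an assumption to verify in hashtable.c).

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of `word`, or NULL if malloc fails.
 * The caller owns the copy and must free() it when done. */
char *copyWord(const char *word) {
    size_t n = strlen(word) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(n);
    if (copy == NULL) return NULL;
    memcpy(copy, word, n);
    return copy;
}
```

The copy stays valid after the original buffer is overwritten, and it remains allocated until free() is called on it.
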
Your output must exactly match the specified format, which makes correctness the primary goal of this project. Also, you are to do this work individually; there will be no partners for this project. You are encouraged to assist your classmates with their projects, but don't copy code.

You may work from a non-EECS instructional machine, but your programs will be tested on machines in the labs on the 2nd floor of Soda, or machines similar to them. You should make sure that your programs compile and run correctly on any machine in the 2nd floor labs. 61C is about "machine structures," which includes learning that not all machines function identically. We may, at our discretion, help students using other programming environments, but neither we nor root is under any obligation to provide tech support for working from a non-EECS instructional machine or for using any non-EECS instructional software.

proj1.html 1.6 Fri, 08 Sep 2000 19:00:21 -0700 cs61c