Homework 3: Strings

Due Monday, July 8, 2024 at 11:59pm

In this assignment, you will write two functions in C. That may not seem like much, but these functions require quite a bit of thought and understanding.

  • Read the entire assignment first before you start
  • Start early and do not do all of the assignment in one sitting; coding is fun but fighting for hours with broken code is not
  • Do not hesitate to seek help if you are stuck

Synopsis

readline: reads a single line from a file. A file is separated into lines by the newline character ('\n'); this function reads the next unread line, whatever its length, into memory.

concat: allocates a new string that is the concatenation of all the input strings.

Written: You will answer some simple questions at the end.

Learning Objectives:

  • Allocating, accessing, and manipulating C strings of arbitrary length
  • Get comfortable around pointers
  • Practice heap allocation and manual memory management

Getting started

See homework 1.

What is provided?

  • A main function that parses command-line arguments and tests your implementation of readline and concat.
  • A test_readline function that processes a text file line by line, printing each line along with its corresponding line number and the length of the line.
  • A test_concat function that processes a text file, concatenating lines into groups of ten and printing each group with a header and footer.
  • Some text files for testing.

Please read through the entire source file before starting your implementation.

Specification

Forbidden functions

Usually in this class, you may use any standard library function. In this assignment, however, some functions are prohibited because using them defeats the point of the exercise.

Here is the list of prohibited functions. Any standard library function that is not included in this list is still available.

  • readline from readline/readline.h
  • getline, fgets, gets, scanf, fscanf from stdio.h
  • strcat, strncat, strlcat from string.h
  • sprintf, snprintf from stdio.h

readline

int readline(FILE *file, char **line_p);

A file is separated into lines by the newline character '\n'. A line contains at least one character. New lines begin at the start of the file and after each newline character; the newline character is included in the line that it terminates.
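The definition above can be checked against a small in-memory example. The sketch below is only an illustration of the definition: count_lines is a hypothetical helper that is not part of the assignment, and it works on a string rather than a FILE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the assignment): counts the lines in an
 * in-memory string using the definition of a line given above. */
size_t count_lines(const char *text) {
    assert(text != NULL);
    size_t lines = 0;
    int in_line = 0; /* 1 while the current line already has a character */
    for (const char *p = text; *p != '\0'; p++) {
        if (!in_line) {
            lines++;     /* first character of a new line */
            in_line = 1;
        }
        if (*p == '\n') {
            in_line = 0; /* the newline belongs to the line it terminates */
        }
    }
    return lines;
}
```

Under this definition, "ab\ncd" has two lines ("ab\n" and "cd"), "ab\n" has one line, and an empty input has none, because a trailing newline does not start a new line unless at least one character follows it.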

readline reads the first unread line of the input stream from file, such that repeated invocations of readline eventually consume the entire content of file.

The characters comprising the line are made into a C string (i.e. a NUL-terminated array). The C string is allocated on the heap and the pointer to the data is assigned to *line_p. It is the caller's responsibility to free the allocated data. The length of the string, which is the number of characters in the array except the terminating '\0' but including '\n' if any, is returned to the caller.

When readline is called with no more data to read, it sets *line_p to NULL and returns 0.

When readline is called with a NULL file or a NULL line_p, it may abort. readline must not leak memory.

You will implement this function in string.c. You may use any function from the standard library except those listed under Forbidden functions.
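One common way to hold a line of unknown length is a heap buffer that doubles in size whenever it fills up. The sketch below illustrates only that growth technique. The struct char_buf type, the char_buf_push function, and the starting capacity of 16 are all assumptions made for illustration; they are not part of the provided code, and your readline does not need to be organized this way.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable character buffer; all names are illustrative. */
struct char_buf {
    char *data;      /* heap storage, NULL until the first append */
    size_t length;   /* characters stored, excluding the terminator */
    size_t capacity; /* bytes currently allocated for data */
};

/* Appends c and keeps data NUL-terminated. Returns 0 on success, or -1 if
 * allocation fails, in which case the buffer is left unchanged. */
int char_buf_push(struct char_buf *buf, char c) {
    assert(buf != NULL);
    if (buf->length + 2 > buf->capacity) { /* need room for c and '\0' */
        size_t new_capacity = buf->capacity == 0 ? 16 : buf->capacity * 2;
        char *grown = realloc(buf->data, new_capacity);
        if (grown == NULL) {
            return -1; /* the old block is still valid and owned by buf */
        }
        buf->data = grown;
        buf->capacity = new_capacity;
    }
    buf->data[buf->length++] = c;
    buf->data[buf->length] = '\0';
    return 0;
}
```

A caller would start from a zeroed struct (data NULL, length 0, capacity 0), push characters one at a time, and eventually pass data to free. Assigning the result of realloc to a temporary first matters: overwriting data directly would leak the old block if realloc returned NULL.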

concat

char* concat(int n_strings, char** strings);

This function takes an array of strings and concatenates them into a single string. The resulting string is heap-allocated and must be freed by the caller. n_strings is the number of strings in the array and strings is an array of C strings, each of which is null-terminated.

concat may abort if n_strings is not positive or strings is NULL.
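With strcat unavailable, bytes can be copied to a chosen offset with memcpy instead. The sketch below joins exactly two strings to show the idea. join_pair is a hypothetical helper and not the required concat, which must handle any number of strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a new heap string holding a followed by b,
 * or NULL if allocation fails. The caller must free the result. */
char *join_pair(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t a_len = strlen(a);
    size_t b_len = strlen(b);
    char *out = malloc(a_len + b_len + 1); /* +1 for the terminator */
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, a, a_len);             /* a without its '\0' */
    memcpy(out + a_len, b, b_len + 1); /* b including its '\0' */
    return out;
}
```

Measuring every input before allocating means the result is allocated once at exactly the right size, and tracking the write offset avoids rescanning the output for its end on each append.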

Crashing vs Aborting

Crashing: An unexpected and abrupt termination of your C program due to errors like segmentation faults or division by zero. It's an indication of a bug and should be avoided at all costs, as it may result in data loss or system instability.

Aborting: A deliberate stoppage of your program when an unrecoverable error or violation of assumptions is detected. It is preferable to crashing, as it's a controlled exit and allows for post-mortem analysis. This is usually done with the assert(condition) macro from <assert.h>. assert will abort the program if the condition is not met.
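As a minimal sketch of the difference, the function below checks its assumptions with assert before touching the pointer. last_char is a hypothetical function invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: returns the last character of a non-empty string. */
char last_char(const char *s) {
    assert(s != NULL);    /* aborts here instead of crashing on s[0] below */
    assert(s[0] != '\0'); /* an empty string has no last character */
    size_t i = 0;
    while (s[i + 1] != '\0') {
        i++;
    }
    return s[i];
}
```

Calling last_char(NULL) stops the program with a message naming the failed condition, the source file, and the line number, whereas dereferencing the NULL pointer without the check would typically end in a segmentation fault. Note that assert checks are compiled out when NDEBUG is defined.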

In this course, unless otherwise specified, your program must not crash, and crashing in testing will incur a higher penalty than aborting or producing wrong results.

Testing

You should run the unit tests frequently throughout implementation. ./string --test-readline <file> tests your implementation of readline, and ./string --test-concat <file> tests concat. The tests for concat use readline, so you should be confident in your readline implementation before moving on to concat.

Make sure to check for memory leaks and errors by running

valgrind --leak-check=full ./string --test-readline file.txt
or
valgrind --leak-check=full ./string --test-concat file.txt

You should see valgrind report:

==XXXXXX== All heap blocks were freed -- no leaks are possible
==XXXXXX==
==XXXXXX== For lists of detected and suppressed errors, rerun with: -s
==XXXXXX== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Written

You need to answer some questions in WRITTEN.txt.

Submission checklist

Everything below is inside the hw3 directory of your coursework repository.

  • string.c contains your implementation of the readline function and the concat function.
  • The following command produces no errors or warnings and produces an executable called string.
    clang -std=c11 -Wall -Wextra -pedantic -o string string.c
    
  • WRITTEN.md is finished.
  • All changes are committed and pushed to your GitHub repository.

Submit your program to Gradescope by selecting your coursework directory and the correct branch.

Grading

Component      Percentage
Correctness    70%
Style          20%
Written        10%

Warning: If your program cannot be compiled using the command above without errors or warnings, you will receive 0 points for correctness, since there will be no executable for us to run.