Homework 6: Hash Table and Tagged Union
Due Monday, July 29, 2024 at 11:59pm
In this assignment, you will implement some functions in a hash table and implement a module that fuses two implementations of dictionaries together.
- Compile and test frequently
- Read the entire assignment first before you start
- Start early and do not do all of the assignment in one sitting; coding is fun but fighting for hours with broken code is not
- Do not hesitate to seek help if you are stuck
Synopsis
In this homework, you will work in the hw6
directory.
table.c
: You will implement table_free
, table_get
and table_insert
functions. This file also contain other provided functions implementing the
header file at table.h
.
dict.c
: You will implement an abstract interface in dict.h
that allows
the user to choose either implementation of a dictionary dynamically, using
tagged unions.
groups.c
: My implementation of groups
from Homework 5 that
uses the abstract interface dict.h
. This file will be released after the late
deadline for Homework 5.
Written: You will answer some simple questions at the end.
Getting started
See homework 1.
What is provided?
The hw6
directory contains the following files:
Makefile
is a Makefile that specifies how to build this homework.table.h
is the header file for a map implemented as a hash table.table.c
is where you write your implementation of the three core functions in hash tables.table-test.c
contains unit tests for your implementation.groups.c
is my implementation ofgroups
, released later.hash.{h,c}
contains functions that hash and compare strings.tests/
is a directory containing sample inputs forgroups
.WRITTEN.txt
The directory also contains the following supporting files:
array-list.{h,c}
is a completed module implementing a dynamic-array list.linked-list.{h,c}
is a completed module implementing a linked list.tests.{c,h}
is a unit test framework that I use for this class. You can read them if you are curious.- Please ignore
memcheck.supp
.
Compiling
A Makefile is provided, containing the following rules.
make table-test
makes the table-test
, the unit test for your table
implementation.
make groups
makes the groups
program.
make all
, or simply make
, makes both of the above.
MacOS users who wish to run Address Sanitizer can also use the following two
commands to check for memory leaks in groups
and table-test
, respectively.
Specification
This homework is divided into two parts. You should implement table
first,
ensuring that all test cases pass before moving on to dict.c
.
Part I: Hash Table
You will implement a hash table that resolves collisions by chaining.
A table
is represented by the following struct
.
struct table {
int size; /* the number of buckets in this table */
int length; /* the number of key-value pairs */
int (*cmp)(void *, void *); /* comparison between two keys */
uint64_t (*hash)(void *); /* hash a key */
struct bucket *buckets[]; /* a variable array member for buckets */
};
A table contains two function pointers, cmp
and hash
, and two attributes
size
, which is the number of buckets, and length
, which is the number of
key-value pairs in the table.
The table struct contains an array of buckets in the same memory block as the
other fields as a variable array member.
A bucket is a linked list of key-value pairs.
Given a key void *key;
and table struct table *t;
, the index of a bucket can
be obtained by the following:
To look up the value associated to a key, you should search through the list of
buckets at index idx
.
To insert a key-value, you should first check if the key is already in the list
of buckets.
If so, the old value is replaced with the new value, and the old value is
returned.
Otherwise, you should prepend the new key-value pair to the list of buckets.
Note that NULL
signifies an empty linked-list.
Don't forget to update the length
when inserting.
The functions table_free
, table_get
, and table_insert
implement the same
interface as map_free
, map_get
, and map_insert
, respectively.
Variable Array Member
Since the buckets
are not allocated separately from the table
struct
itself, you do not need to free t->buckets
array separately. (However, you
do need to free each linked-list in the buckets
.)
To access a bucket at index idx
, you can just use the conventional array
subscripts, i.e. t->buckets[idx]
.
Running unit tests
See homework 4.
Part II. Tagged Union
In this part, you will implement a generic dictionary interface using tagged
unions, which handles multiple underlying data structures.
This interface creates a unified structure, struct dict
, which can either be a
hash table or a map, based on which "constructor" creates it.
enum dict_type {
TABLE,
MAP,
};
union table_or_map {
struct table *tbl;
struct map *map;
};
struct dict {
enum dict_type type;
union table_or_map data;
};
To review, union
allows a programmer to choose how to interpret the bits at run
time, and enum
informs the programmer which one of the variants in the union
is the "correct" interpretation.
The dict_walk
function is provided as an example of using a tagged union.
void dict_walk(struct dict *dict,
void (*visit)(void *key, void *value, void *data),
void *data)
{
if (dict->type == TABLE) {
table_walk(dict->data.tbl, visit, data);
} else {
map_walk(dict->data.map, visit, data);
}
}
dict
has, and dispatches the
correct, concrete function that implements walk
with the correct
interpretation in the union.
To construct a tagged union, you should set the appropriate tag and call the
constructor of the requested implementation (map_create
or table_create
).
You should implement all functions in dict.h
by invoking the correct concrete
implementations of a dictionary.
After finishing dict.c
, you should read through groups.c
and check if
groups
, which uses the abstract dictionary, compiles and works as expected.
All variants in a union occupy the same memory space. dict->data.map
and
dict->data.tbl
is the same piece of memory, but the .
selector tells
the compiler how that piece of memory should be interpreted.
Written
You need to answer some questions in hw6/WRITTEN.txt
.
Submission checklist
In hw6/
:
make
produce no errors and produce executables,groups
andtable-test
.
All changes are committed and pushed to your github repository. Submit your program to Gradescope by selecting your coursework directory and the correct branch.
Grading
Percentage | |
---|---|
Correctness | 70% |
Style | 20% |
Written | 10% |
Warning: If your program cannot be compiled using the commands above without error, you will receive 0 points in correctness since there is no executables for us to run.