Homework #5¶

Due: Friday, November 10th at 11:59pm

In this assignment, we will begin implementing our shell program. This assignment focuses on understanding the repository structure and getting the initial parsing of the shell commands correct.

CS Linux Machine¶

You will need access to a Linux-based machine when working on your homework assignments. You should not test your programs on macOS or Windows Linux because these operating systems do not provide all of the utility commands necessary for completing this and possibly future assignments. Additionally, when they do provide a command, it may not support all of the options that a Unix-like system provides. We will run and grade all assignments on the CS Linux machines, and all programming assignments must work correctly on these machines. You may work locally on a Unix or Unix-like machine, but make sure that you test your final solutions on a CS Linux machine.

Please follow the instructions provided here

Using Visual Studio Code and SSH

Creating Your Private Repository¶

For each assignment, a Git repository will be created for you on GitHub. However, before that repository can be created for you, you need to have a GitHub account. If you do not yet have one, you can get an account here: https://github.com/join.

To actually get your private repository, you will need this invitation URL:

HW5 invitation (Please check the post “Homework #5 is ready” on Ed)

When you click on an invitation URL, you will have to complete the following steps:

You will need to select your CNetID from a list. This will allow us to know what student is associated with each GitHub account. This step is only done for the very first invitation you accept.

Note

If you are on the waiting list for this course you will not have a repository made for you until you are admitted into the course. I will post the starter code on Ed so you can work on the assignment until you are admitted into the course.

You must click “Accept this assignment” or your repository will not actually be created.
After accepting the assignment, GitHub will take a few minutes to create your repository. You should receive an email from GitHub when your repository is ready. Normally, it’s ready within seconds and you can just refresh the page.
You now need to clone your repository (i.e., download it to your machine).
- Make sure you’ve set up SSH access on your GitHub account.
- For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per project). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs51082-aut23/msh-GITHUB-USERNAME.git.
- If you do not know how to use git clone to clone your repository, then follow this guide that GitHub provides: Cloning a Repository

If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.

`msh`: A Unix-like shell¶

For the last three homework assignments, you will apply what you have learned in the course to build your own Unix-like shell called msh (i.e., MPCS shell). This shell will not include everything that the bash shell provides; however, many of the topics covered in the course will be implemented within msh. Each homework assignment will be a continuation of the prior one and will add components to the shell. We will provide you the opportunity to resubmit the programming portions of each assignment for a better grade. However, a resubmission will only earn back part of the lost points, not all of them. Although we are allowing resubmissions, we recommend that you make progress each week to ensure you’ll be able to complete it all by the last assignment.

The first task in implementing msh is understanding the repository structure of the shell, which is described in the next section.

Task 0: Understanding the Repository Structure¶

Since building the shell spans multiple assignments, all remaining homework assignments will work out of the msh-GITHUB-USERNAME repository you created above. The initial contents of your repository have the following structure:

├── bin
├── data
├── include
│   └── shell.h
├── README.md
├── Makefile
├── saqs
├── scripts
│   └── build.sh
├── src
│   ├── msh.c
│   └── shell.c
└── tests

The repository structure is similar to most small C projects in industry. The contents of each directory are explained below:

bin directory - will include the msh executable that is generated by the gcc compiler when compiling all source files together.
data directory - includes any auxiliary data files needed by the testcases. You are also allowed to place test data files in this directory for testing purposes.
include directory - contains all the include files for the modules associated with building the msh executable. Only header files (i.e., `.h` files) are allowed in this directory.
README.md file - contains documentation about the repository and instructions for building the shell. We will revisit this file in a later assignment.
Makefile file - This is an optional file that makes it easy to compile multiple C files together. We will not cover Makefiles in this course; however, if you are familiar with them then you can implement it in this file. Otherwise, you can ignore this file.
saqs directory - homework #6 and #7 will include short answer questions where your answers will go in this directory. You have no short answer questions for this assignment.
scripts directory - You will still be writing bash scripts along with building the shell. All script files for msh must be placed in this directory. For this assignment you will work with build.sh, which is explained in a later section.
src directory - includes all the source files for the modules you will implement. Only source files (i.e., `.c` files) are allowed in this directory.
tests directory - contains the testcases for testing the various modules created for the shell.

As we move along, we will be adding more files and data to the repository each week.

Task 1: Shell Module (`shell.h` and `shell.c`)¶

The actual logic for the shell and its state will be implemented in a separate module defined in shell.c and shell.h. To get started, copy the following code into the shell.h file:

// Represents the state of the shell
typedef struct msh {
   /** TODO: IMPLEMENT **/
}msh_t;

The state of the shell is contained within the msh_t definition. You will need to add fields to this definition as you implement the shell module.

`alloc_shell` function¶

Inside your shell.h, copy the following function prototype:

/*
* alloc_shell: allocates and initializes the state of the shell
*
* max_jobs: The maximum number of jobs that can be in existence at any point in time.
*
* max_line: The maximum number of characters that can be entered for any specific command line.
*
* max_history: The maximum number of saved history commands for the shell.
*
* Returns: a msh_t pointer that is allocated and initialized
*/
msh_t *alloc_shell(int max_jobs, int max_line, int max_history);

This function allocates (i.e., with malloc), initializes, and returns a pointer to a msh_t value. The shell limits the number of jobs in existence, the maximum number of characters that can be entered on the command line, and the number of previous commands stored in its history. These limits are represented by the parameters max_jobs, max_line, and max_history, respectively. Make sure that you save these values inside your state definition (i.e., msh_t) because you will need to reference them in later functions. If any of these parameters is equal to 0, then use the following default value for the corresponding limit:

const int MAX_LINE = 1024;
const int MAX_JOBS = 16;
const int MAX_HISTORY = 10;

You will call this function later in the msh.c file to allocate a new shell state.

Place the implementation of this function inside src/shell.c.
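The default-value rule can be sketched as follows. This is only one possible shape, not the required implementation: the field names inside msh_t are hypothetical (the assignment leaves the layout of the struct to you), and returning NULL when malloc fails is an assumption.

```c
#include <stdlib.h>

/* Default limits from the assignment text. */
const int MAX_LINE = 1024;
const int MAX_JOBS = 16;
const int MAX_HISTORY = 10;

/* Hypothetical field names; the layout of msh_t is up to you. */
typedef struct msh {
    int max_jobs;
    int max_line;
    int max_history;
} msh_t;

msh_t *alloc_shell(int max_jobs, int max_line, int max_history) {
    msh_t *shell = malloc(sizeof(msh_t));
    if (shell == NULL) {
        return NULL; /* assumption: report allocation failure to the caller */
    }
    /* A value of 0 means "use the default for this limit". */
    shell->max_jobs = (max_jobs == 0) ? MAX_JOBS : max_jobs;
    shell->max_line = (max_line == 0) ? MAX_LINE : max_line;
    shell->max_history = (max_history == 0) ? MAX_HISTORY : max_history;
    return shell;
}
```

With this sketch, alloc_shell(0, 5, 0) would produce a state with a job limit of 16, a line limit of 5, and a history limit of 10.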

`parse_tok` function¶

Inside your shell.h, copy the following function prototype:

/**
* parse_tok: Continuously retrieves separate commands from the provided command line until all commands are parsed
*
* line:  the command line to parse, which may include multiple commands. If line is NULL then parse_tok continues parsing the previous command line.
*
* job_type: Specifies whether the parsed command is a background (sets the value of 0 at the address of job_type) or foreground job (sets the value of 1 at the address of job_type). If no job is returned then assign the value at the address to -1
*
* Returns: NULL if no other commands can be parsed; otherwise, it returns a parsed command from the command line.
*
* Please note this function does modify the ``line`` parameter.
*/
char *parse_tok(char *line, int *job_type);

The msh shell has two special characters that separate multiple commands (i.e., jobs) on the same command line: ; (foreground job) and & (background job). A msh command line could contain a single job, such as

ls -la

or multiple jobs on a single command line, such as

ls -la & cd .. ; cat file.txt

where the above command line contains three separate jobs: ls -la, cd .., and cat file.txt

This function is similar to strtok in that it continuously retrieves and returns the jobs on a command line that are separated by & and ;. For each returned job, it places a 1 at the address job_type points to if the job is a foreground job; otherwise, it places a 0 at that address to indicate that the job is a background job. If the line argument is NULL then, as with strtok, it continues to parse the previous command line. For example, here is sample code showing how parse_tok should work:

char cmd_line[] = "ls -la & cd .. ; cat file.txt"; // a writable array, because parse_tok modifies the line
int type;
char *job;
job = parse_tok(cmd_line,&type);
printf("job=%s, type=%d\n",job,type); // prints job=ls -la, type=0
job = parse_tok(NULL,&type);
printf("job=%s, type=%d\n",job,type); // prints job=cd .., type=1
job = parse_tok(NULL,&type);
printf("job=%s, type=%d\n",job,type); // prints job=cat file.txt, type=1
job = parse_tok(NULL,&type); // job = NULL  type = -1
job = parse_tok(NULL,&type); // Still job = NULL  type = -1

We recommend that you keep an internal pointer into the line parameter in a static variable and reuse it to return portions of the string. One way to implement this function is to place null characters (\0) at the locations where you find a & or ; and return the address of the element that represents the beginning of the next job to return. You can return the address of an array element as follows: &array[NUM], where NUM is a valid index. For example, &array[2] returns the address of the third element in the array (assuming 0 indexing).

Place the implementation of this function inside src/shell.c.
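The in-place splitting idea can be illustrated in isolation. The helper below, split_at_first, is hypothetical and is not part of the required API; it only shows how overwriting a separator with \0 and returning &buf[i + 1] cuts one string into two without copying. It is a sketch of the technique, not a parse_tok implementation (it keeps no static state and does not report a job type).

```c
#include <stddef.h>
#include <string.h>

/*
 * split_at_first (hypothetical helper, not part of the required API):
 * overwrites the first '&' or ';' in buf with '\0' and returns the address
 * of the character right after it, or NULL if buf has no separator.
 * The separator that was found is stored in *sep.
 */
char *split_at_first(char *buf, char *sep) {
    size_t i = strcspn(buf, "&;"); /* index of the first separator, or of the final '\0' */
    if (buf[i] == '\0') {
        return NULL;
    }
    *sep = buf[i];
    buf[i] = '\0';      /* terminate the first job in place */
    return &buf[i + 1]; /* address of the element where the rest begins */
}
```

Because the buffer is modified, buf must be a writable array (for example, char buf[] = "ls -la & cd .."), not a pointer to a string literal.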

`separate_args` function¶

Inside your shell.h, copy the following function prototype:

/**
* separate_args: Separates the arguments of command and places them in an allocated array returned by this function
*
* line: the command line to separate. This function assumes only a single command that takes in zero or more arguments.
*
* argc: Stores the number of arguments produced at the memory location of the argc pointer.
*
* is_builtin: true if the command is a built-in command; otherwise false.
*
* Returns: NULL if line contains no arguments; otherwise, a newly allocated array of strings that represents the arguments of the command (similar to argv). Make sure the array includes a NULL value in its last location.
* Note: The user is responsible for freeing the memory returned by this function!
*/
char **separate_args(char *line, int *argc, bool *is_builtin);

This function separates out each word in the provided line and places them in a newly allocated array of strings, with the command name always at index 0. msh assumes words on the command line are separated by one or more whitespace characters. This function assumes line is a single job (i.e., ls -la and not ls -la; cd ..). It places the number of arguments it found (including the program name) at the address of argc. The last element in the returned array must contain a NULL value. For now, ignore is_builtin; we will come back to it in a future assignment. Here is sample code showing how separate_args should work:

char cmd_line[] = "ls -la /mpcs51082-aut23";
char **argv;
int argc;
bool is_builtin; // ignored for now, but the prototype requires it
argv = separate_args(cmd_line, &argc, &is_builtin);
printf("%s",argv[0]); // prints ls
printf("%s",argv[1]); // prints -la
printf("%s",argv[2]); // prints /mpcs51082-aut23
//Please note there is an argv[3] that should contain NULL!
printf("%d", argc); // prints 3
char empty[] = "";
argv = separate_args(empty, &argc, &is_builtin);  // argv = NULL

Place the implementation of this function inside src/shell.c.
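Since the returned array needs one slot per word plus one for the final NULL, it can help to count the words before allocating. The helper below, count_words, is a hypothetical name used only for illustration; it is a minimal sketch of whitespace-based word counting, not a required part of the module.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * count_words (hypothetical helper): returns the number of
 * whitespace-separated words in line.
 */
int count_words(const char *line) {
    int count = 0;
    int in_word = 0; /* 1 while the scan is inside a word */
    for (size_t i = 0; line[i] != '\0'; i++) {
        if (isspace((unsigned char)line[i])) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1; /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

For the line "ls -la /mpcs51082-aut23" this returns 3, so an array of 4 pointers would leave room for the terminating NULL.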

`evaluate` function¶

Inside your shell.h, copy the following function prototype:

/*
* evaluate - executes the provided command line string
*
* shell - the current shell state value
*
* line - the command line string to evaluate
*
* Returns: non-zero if the command executed wants the shell program to close. Otherwise, a 0 is returned.
*/
int evaluate(msh_t *shell, char *line);

This function executes the job(s) provided in the line parameter. We have not discussed executing commands within C yet. For now, just print out the separate jobs in the command line, ending each job with its argc value. Please see the Sample Runs section for the correct output. We will continue to work on this function in future assignments. If line exceeds the maximum number of characters allowed on a command line, then print: error: reached the maximum line limit.

Place the implementation of this function inside src/shell.c.

`exit_shell` function¶

Inside your shell.h, copy the following function prototype:

/*
* exit_shell - Closes down the shell by deallocating the shell state.
*
* shell - the current shell state value
*
*/
void exit_shell(msh_t *shell);

This function simply deallocates any memory allocated within the shell state. For now, make sure to free any state data you allocated with malloc and make sure to free the shell variable. For this assignment, you may not have allocated any memory other than the shell variable itself, which is fine.

Place the implementation of this function inside src/shell.c.

Task 2: The Executable Program (`msh.c`)¶

The msh program works similarly to bash: it acts as an interactive REPL (Read-Eval-Print Loop) that allows the user to enter a command line (i.e., input), evaluates the command line, and prints the result of running the command line back to the console. The msh program continuously performs this REPL mechanism until the user enters exit. Most shell prompts, such as the one for bash, look something like lamonts@linux2:~$, which might show the machine you are logged in to along with the current directory and a $. For msh, the prompt is always msh>.

Inside src/msh.c, you will implement a main function, along with any helper functions you wish, that must do the following:

Handle the following optional command line arguments: -s NUMBER, -j NUMBER, -l NUMBER. -s NUMBER represents the maximum number of command lines to store in the shell history, -j NUMBER represents the maximum number of jobs that can be in existence at any point in time, and -l NUMBER represents the maximum number of characters that can be entered for a single command line. These can be entered in any order, and you must ensure that NUMBER is a positive integer (i.e., an integer greater than zero). Please note we are not implementing the functionality of -j and -s in this assignment but will in future assignments. However, you still need to handle these arguments in this assignment. I recommend looking at getopt() to make it easy to process the options and sscanf to easily determine if the NUMBER is an actual number. If any additional command line arguments are specified or an optional command line argument is not correctly formatted, then the program must return 1 and print the usage statement: usage: msh [-s NUMBER] [-j NUMBER] [-l NUMBER]. Do not print anything else out.
Allocate and initialize a msh_t state (i.e., call alloc_shell). If the user did not supply an optional command line argument, then a 0 is passed in for that specific limit parameter (i.e., the shell will use the default value for that specific limit).
Using the allocated msh_t state, implement the REPL mechanism: prompt the user to enter a command line, then call the evaluate function of the shell module to execute it. The program continuously allows the user to enter command lines until they enter exit, which should close down the shell and exit.
Ensure msh works with redirecting standard-in and standard-out to files.
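The option handling described in the first requirement could be sketched with getopt() and sscanf as shown below. Both helper names, parse_positive and parse_limits, are hypothetical and chosen only for this illustration. The sketch also assumes that the caller initializes each limit to 0 before the call and prints the usage statement itself when 1 is returned, and it does not guard against numbers too large to fit in an int.

```c
#define _POSIX_C_SOURCE 200809L /* exposes getopt when compiling with a strict -std=c11 */
#include <stdio.h>
#include <unistd.h>

/*
 * parse_positive (hypothetical helper): stores the value of s in *out and
 * returns 1 if s is a positive integer with nothing after it; otherwise
 * returns 0 and leaves *out unchanged.
 */
int parse_positive(const char *s, int *out) {
    int value;
    char extra;
    /* sscanf returns 1 only when a number is read and no character follows it. */
    if (sscanf(s, "%d%c", &value, &extra) != 1 || value <= 0) {
        return 0;
    }
    *out = value;
    return 1;
}

/*
 * parse_limits (hypothetical helper): fills in the limits given on the
 * command line. Limits that are not given keep the caller's value.
 * Returns 0 on success and 1 if the usage statement should be printed.
 */
int parse_limits(int argc, char *argv[], int *max_jobs, int *max_line, int *max_history) {
    int opt;
    opterr = 0; /* keep getopt from printing its own error messages */
    optind = 1; /* start at argv[1], so the function can be called more than once */
    while ((opt = getopt(argc, argv, "s:j:l:")) != -1) {
        int *target;
        switch (opt) {
            case 's': target = max_history; break;
            case 'j': target = max_jobs; break;
            case 'l': target = max_line; break;
            default: return 1; /* unknown option or missing NUMBER */
        }
        if (!parse_positive(optarg, target)) {
            return 1;
        }
    }
    /* Anything left over after the options is an extra argument. */
    return (optind < argc) ? 1 : 0;
}
```

In main, the three limits would start at 0, so any limit the user does not supply is passed to alloc_shell as 0 and falls back to its default.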

`getline()`¶

The C getline(char **lineptr, size_t *restrict n, FILE *restrict stream) function reads an entire line from stream. getline() stores the contents of a single line in the buffer that lineptr points to. The *lineptr string is null-terminated and includes the newline character, if one was found. If *lineptr is set to NULL before the call, then getline() will allocate the string for storing the line. This string should be freed by the user program even if getline() failed. For example, take a look at the following code:

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

int main(void) {

    char *line = NULL;
    size_t len = 0;   // size of the buffer getline allocates, not the line length
    ssize_t nRead = getline(&line, &len, stdin);
    while (nRead != -1) {
        printf("Retrieved line of length %zd with contents:%s\n", nRead, line);
        // Done using line so I must free it!!
        free(line);
        //Make sure to reset line back to null for the next line
        line = NULL;
        nRead = getline(&line, &len, stdin);
    }
    // getline may allocate a buffer even when it fails, so free it here too
    free(line);
    return 0;
}

For the purposes of this assignment, we can read the entire contents of standard input by calling getline within a while loop. getline returns -1 once it has reached the end of the stream. Otherwise, it returns the number of characters on the line (nRead). Notice that line is always set to NULL before we call getline. This is a requirement for this assignment so that a new string is allocated for each line from the stream. Make sure to always free the line after you are done using it. Also notice that when calling getline we pass in &line and &len. This is required so that getline can store the contents of the line. Note that len is not the number of characters on the line but rather the length of an internal buffer that holds the characters. You can ignore that value for this assignment; however, you still need to pass it in to call getline. Make sure to include #define _GNU_SOURCE because it is required to use getline.
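Because the string returned by getline still includes the newline character, a direct comparison such as strcmp(line, "exit") will not match a line the user typed as exit followed by Enter. One common way to handle this, shown here as a sketch with a hypothetical helper name (strip_newline), is to overwrite the newline with a null character before inspecting the line.

```c
#include <string.h>

/*
 * strip_newline (hypothetical helper): removes the trailing '\n' that
 * getline leaves at the end of the line, if there is one.
 */
void strip_newline(char *line) {
    /* strcspn returns the index of the first '\n', or of the final '\0' if none exists. */
    line[strcspn(line, "\n")] = '\0';
}
```

After calling strip_newline(line), the string "exit\n" becomes "exit", while a line without a newline (for example, the last line of a redirected file) is left unchanged.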

Building the `msh` executable¶

Since the source and header files are in different directories you need to tell the compiler (i.e., gcc) where they are located. Here’s a simple way to do it:

$ gcc -I./include/ -o ./bin/msh src/*.c

This assumes you are in the root directory of your repository! The -I flag tells the compiler where the user-defined header files are located. Running this command will then place the msh executable in the bin directory.

Sample Runs¶

Here are a few sample runs of the msh shell at this point in its development. Please note that all sample runs assume you are building and running the program from the root directory of your repository:

$ gcc -I./include/ -o ./bin/msh src/*.c
$ alias msh=./bin/msh
$ msh
msh> ls -la .
argv[0]=ls
argv[1]=-la
argv[2]=.
argc=3
msh>   # I entered in a new line
msh> ls -la /mpcs/; cd .. & echo hello
argv[0]=ls
argv[1]=-la
argv[2]=/mpcs/
argc=3
argv[0]=cd
argv[1]=..
argc=2
argv[0]=echo
argv[1]=hello
argc=2
msh> exit
$ msh -l lamont -s 45
usage: msh [-s NUMBER] [-j NUMBER] [-l NUMBER]
$ msh -s 45 -j 0
usage: msh [-s NUMBER] [-j NUMBER] [-l NUMBER]
$ msh -l -j
usage: msh [-s NUMBER] [-j NUMBER] [-l NUMBER]
$ msh -l 5
msh> ls -la /mpcs;
error: reached the maximum line limit
msh> ls
argv[0]=ls
argc=1
msh> exit

As a reminder, evaluate only prints the command line arguments for a msh line. In the next assignment, we will start running the actual command lines.

Task 3: `build.sh`¶

Inside scripts/build.sh, write a bash script that builds the msh executable and places it inside the bin directory. Use the information from the Building the `msh` executable section above. This should only be one to three lines of code. If you are using a Makefile, then you can call make from within this script.

Testing¶

You should do your own local testing of the msh program. Use the Sample Runs as a starting point.

Professor Samuels will provide additional test cases on Wednesday November 8th; however, we want you to first do your own testing to learn how to become better testers before providing you with more test cases.

Grading¶

Programming assignments will be graded according to a general rubric. Specifically, we will assign points for completeness, correctness, design, and style. (For more details on the categories, see our Assignment Rubric page.)

The exact weights for each category will vary from one assignment to another. For this assignment, the weights will be:

Completeness: 70%
Correctness: 20%
Design/Style: 10%

Submission¶

Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) in the “Homework #5” assignment page in one of two ways:

Uploading from GitHub directly (recommended way): You can link your GitHub account to your Gradescope account and upload the correct repository based on the homework assignment. When you submit your homework, a pop-up window will appear. Click on “Github” and then “Connect to Github” to connect your GitHub account to Gradescope. Once you connect (you will only need to do this once), you can select the repository you wish to upload and the branch (which should always be “main” or “master”) for this course.
Uploading via a Zip file: You can also upload a zip file of the homework directory. Please make sure you upload the entire directory and keep the initial structure the same as the starter code; otherwise, you run the risk of not passing the automated tests.

Note

For either option, you must upload the entire directory structure; otherwise, the automated tests will not run correctly and you will be penalized if we have to manually run the tests. Going with the first option will do this automatically for you. You can always add additional directories and files (and even files/directories inside the starter directories) but the default directory/file structure must not change.

Depending on the assignment, once you submit your work, an “autograder” will run. This autograder should produce the same test results as when you run the code yourself; if it doesn’t, please let us know so we can look into it. A few other notes:

You are allowed to make as many submissions as you want before the deadline.
Please make sure you have read and understood our Late Submission Policy.
Your completeness score is determined solely based on the automated tests, but we may adjust your score if you attempt to pass tests by rote (e.g., by writing code that hard-codes the expected output for each possible test input).
Gradescope will report the test score it obtains when running your code. If there is a discrepancy between the score you get when running our grader script, and the score reported by Gradescope, please let us know so we can take a look at it.
