Homework #3

Due: Friday, October 20th at 11:59pm

This homework is intended to give practice filtering out information from the process status command.

CS Linux Machine

You will need access to a Linux-based machine when working on your homework assignments. You should not test your programs on macOS or Windows Linux because these operating systems do not provide all the utility commands necessary for completing this and possibly future assignments. Additionally, even when they do provide a command, it may not support all the options that a Unix-like system provides. We will run and grade all assignments on the CS Linux machines, and all programming assignments must work correctly on these machines. You may work locally on a Unix or Unix-like machine, but make sure that you test your final solutions on a CS Linux machine.

Please follow the instructions provided here

Creating Your Private Repository

For each assignment, a Git repository will be created for you on GitHub. However, before that repository can be created for you, you need to have a GitHub account. If you do not yet have one, you can get an account here: https://github.com/join.

To actually get your private repository, you will need this invitation URL:

  • HW3 invitation (Please check the post “Homework #3 is ready” on Ed)

When you click on an invitation URL, you will have to complete the following steps:

  1. You will need to select your CNetID from a list. This allows us to know which student is associated with each GitHub account. This step is only done for the very first invitation you accept.

Note

If you are on the waiting list for this course you will not have a repository made for you until you are admitted into the course. I will post the starter code on Ed so you can work on the assignment until you are admitted into the course.

  2. You must click “Accept this assignment” or your repository will not actually be created.

  3. After accepting the assignment, GitHub may take a few minutes to create your repository. You should receive an email from GitHub when your repository is ready. Normally, it’s ready within seconds and you can just refresh the page.

  4. You now need to clone your repository (i.e., download it to your machine).
    • Make sure you’ve set up SSH access on your GitHub account.

    • For each repository, you will need to get the SSH URL of the repository. To get this URL, log into GitHub and navigate to your project repository (take into account that you will have a different repository per project). Then, click on the green “Code” button, and make sure the “SSH” tab is selected. Your repository URL should look something like this: git@github.com:mpcs51082-aut23/hw3-GITHUB-USERNAME.git.

    • If you do not know how to use git clone to clone your repository, follow this guide that GitHub provides: Cloning a Repository

If you run into any issues, or need us to make any manual adjustments to your registration, please let us know via Ed Discussion.

Empty Repository?

One of the main goals for this course is to understand the Unix filesystem and how to create directories and files within it. Moving forward in the course, it will be your responsibility to create the directories and files for the programming problems. You must also correctly commit them to your repository.

Make sure you create the directory structure exactly as stated in each problem. For this assignment, define the following structure (hw3 and p1 are directories; p1.sh is a file):

  • hw3
    • p1
      • p1.sh

hw3 should be the only directory at the top-level of your repository. This is the same structure from your previous homework assignments.
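One way to create this layout from the command line is sketched below. The repo_root variable is a hypothetical stand-in for the top level of your cloned repository; here it points at a temporary directory so the commands can be tried safely.

```shell
# repo_root is a hypothetical stand-in for the top level of your cloned
# repository; a temporary directory is used so this can be tried safely.
repo_root="$(mktemp -d)"

mkdir -p "$repo_root/hw3/p1"       # -p creates hw3 and p1 in one step
touch "$repo_root/hw3/p1/p1.sh"    # empty script file to fill in later

# List what was created, relative to the repository root.
( cd "$repo_root" && find hw3 | sort )
# prints:
# hw3
# hw3/p1
# hw3/p1/p1.sh
```

In your real repository you would still need to add, commit, and push these files.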

Programming Problem

Note: For this assignment, you are not allowed to use the awk command. Many of you are overusing this command to solve problems that can be done with simple Unix commands. However, you are allowed to use any other Unix command (except for awk) for this assignment, regardless of whether it was discussed in class.

For this problem, you will be writing a script that filters out information coming from the ps aux command (discussed in the pre-recorded lectures). One sample output from running this command is the following:

USER     PID     %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
lamonts  3058393  0.0  0.0  19416  9468 ?        Ss   06:51   0:00 /lib/systemd/systemd --user
lamonts  3058396  0.0  0.0 202588  7632 ?        S    06:51   0:00 vim test.c
root     3080732  0.0  0.0  40724 10044 ?        Ss   09:54   0:00 sshd: lamonts [priv]
maxg     3813528  0.0  0.0  35468  9716 pts/570  S+   Oct05   0:00 vim library.c
maxg     3813981  0.0  0.0  19192  5484 pts/598  Ss   Oct05   0:00 -bash
maxg     3814003  0.0  0.0  34840  9124 pts/598  S+   Oct05   0:00 vim test.c
lamonts  3080750  0.0  0.0  40724  6984 ?        S    09:54   0:01 vim test.c
sallyf+  3292195  0.0  0.0    544     0 pts/2661 TN   May10   0:00 grep -o -E [0-9]+
sallyf+  3292196  0.0  0.0    392     0 pts/2661 TN   May10   0:00 head -1

Task: Inside the p1/p1.sh file, implement a bash script that displays the top 10 users with the most processes listed in the ps aux output. The output has the following format:

USER,TOTAL_PROCESSES

where the lines are sorted by TOTAL_PROCESSES in descending order. If a USER is tied with another USER then sort them by username lexicographically. Your script must be able to accept the input from either stdin or from a file supplied on the command line. For all sample runs, we will assume ps aux always produces the output provided above. However, this will not be the case when you are actually testing your script since ps aux will change depending on the system.

Here are two sample runs:

$ ps aux > ps.txt; bash p1.sh ps.txt
lamonts,3
maxg,3
sallyf+,2
root,1
$ ps aux | bash p1.sh
lamonts,3
maxg,3
sallyf+,2
root,1

As a reminder, the script by default will print only the top 10 users; therefore, if there were 20 unique users, only 10 would appear in the output.
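One possible way to do the counting, ordering, and truncation without awk is sketched below. The function name top_users is hypothetical, and it assumes its input is already one username per line (extracting that column is left to you). Setting LC_ALL=C is also an assumption: it makes the tie-break a plain byte-wise comparison, so check that this matches the ordering the tests expect.

```shell
# top_users is a hypothetical helper, not part of the starter code.
# stdin:  one username per line
# stdout: USER,TOTAL_PROCESSES lines, highest count first, ties by username
# $1:     how many lines to print (defaults to 10)
top_users() {
  local limit="${1:-10}"
  LC_ALL=C sort |                            # group identical usernames
    uniq -c |                                # e.g. "      3 lamonts"
    sed -E 's/^ *([0-9]+) (.*)$/\2,\1/' |    # e.g. "lamonts,3"
    LC_ALL=C sort -t, -k2,2nr -k1,1 |        # count descending, then name
    head -n "$limit"
}

printf '%s\n' root maxg lamonts maxg | top_users 2
# prints:
# maxg,2
# lamonts,1
```

The second sort uses two keys: -k2,2nr orders by the count numerically in reverse, and -k1,1 breaks ties on the username.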

To be able to read from either stdin or a file, use this code as a basis for your initial implementation:

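CONTENTS=""
# Read each line from the file named by $1, or from stdin when $1 is not given.
# IFS= and -r keep leading spaces and backslashes in the line intact.
while IFS= read -r line; do
  # Append the line followed by a real newline character.
  CONTENTS+="$line"$'\n'
done < "${1:-/dev/stdin}"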

You should be able to understand what’s happening here based on the lectures. One clarification is that ${1:-/dev/stdin} states that if the positional argument $1 is not set (or is empty), then the script reads from stdin. Note that you will have to modify this code once you start implementing the command line arguments!
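To see the default-value expansion on its own, here is a minimal sketch. The function name input_source is hypothetical and exists only for illustration; it prints the path that the redirection in the loop would read from.

```shell
# input_source is a hypothetical helper used only for this illustration.
# It prints the path that "${1:-/dev/stdin}" expands to.
input_source() {
  printf '%s\n' "${1:-/dev/stdin}"
}

input_source ps.txt   # prints: ps.txt
input_source          # prints: /dev/stdin
input_source ""       # prints: /dev/stdin (":-" also covers an empty value)
```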

ps aux produces a lot of consecutive spaces to align the columns of its output. Use the tr -s ' ' command to squeeze each run of spaces into a single space, so that the columns are separated by exactly one space.

It is required that your script file begins with the normal skeleton code:
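As a small illustration of the squeeze, the sketch below runs one row of the sample output through tr -s ' ' and then uses cut to pick columns by position. The helper name first_two_columns is hypothetical.

```shell
# first_two_columns is a hypothetical helper: it squeezes runs of spaces
# and keeps the USER and PID columns of ps aux-style input.
first_two_columns() {
  tr -s ' ' | cut -d' ' -f1,2
}

line='lamonts  3058396  0.0  0.0 202588  7632 ?        S    06:51   0:00 vim test.c'
printf '%s\n' "$line" | first_two_columns
# prints: lamonts 3058396
```

Without the squeeze, cut would treat every extra space as an empty field and the column numbers would not line up.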

#! /usr/bin/env bash

set -o errexit
set -o nounset
set -o pipefail

Options

The script will have the following command line options that change the output of the script as follows:

  • -n NUMBER: Changes the number of lines shown in the output. By default the command shows the top ten users; however, similar to the head/tail commands, -n will display only NUMBER lines on the console. You can assume NUMBER will always be a positive integer (i.e., an integer greater than zero).

  • -e USER_1,USER_2,...,USER_N: Excludes the usernames defined in the USER_1,USER_2,...,USER_N comma-separated list from the output. If a USER is not seen in the ps aux output then ignore that user.

  • -p 'COMMAND': For every user in the ps aux output, only count a process toward that user’s total processes if its command matches the exact literal 'COMMAND'. Assume the COMMAND will always be given as a literal string.
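For the -p option, one possible approach to isolating the COMMAND column and comparing it literally is sketched below. The function name users_running is hypothetical, and the sketch assumes that “exact literal” means the whole COMMAND column equals the given string. Note that tr -s ' ' also squeezes repeated spaces inside the command itself, which this sketch does not try to handle.

```shell
# users_running is a hypothetical helper, not part of the starter code.
# stdin:  ps aux-style output, including the header row
# $1:     literal command string to match against the COMMAND column
# stdout: the USER of every process whose COMMAND equals $1
users_running() {
  local pattern="$1" user command
  tail -n +2 |                 # skip the header row
    tr -s ' ' |                # one space between columns
    cut -d' ' -f1,11- |        # USER plus everything from COMMAND onward
    while IFS=' ' read -r user command; do
      # Quoting the right-hand side forces a literal, non-glob comparison.
      if [[ "$command" == "$pattern" ]]; then
        printf '%s\n' "$user"
      fi
    done
}
```

The output is one username per matching process, so it can be fed into whatever counting step you already use for the default case.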

You can assume that our tests will always supply each option in the correct format. This means you will never see something like -n lamont. However, the options can come in any order and multiple options can be supplied on the same command line. If given a file as input then you can assume the file will always be the last argument to p1.sh.

Hint: I recommend you use a dictionary (an associative array in bash) and research the shift command to help process command line options/arguments.
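A minimal sketch of how shift and an associative array can work together is shown below. Every name in it (parse_args, LIMIT, PATTERN, INPUT, EXCLUDED) is hypothetical, and it assumes bash 4.2 or newer for declare -gA. It only records the options; it does not produce any of the required output.

```shell
# All names here are hypothetical; this only records the options.
parse_args() {
  LIMIT=10                   # default number of lines to show
  PATTERN=""                 # empty means "count every process"
  INPUT="/dev/stdin"         # used when no file argument is given
  declare -gA EXCLUDED=()    # dictionary of usernames to skip (bash 4.2+)

  while [[ $# -gt 0 ]]; do
    case "$1" in
      -n) LIMIT="$2"; shift 2 ;;       # consume the flag and its value
      -p) PATTERN="$2"; shift 2 ;;
      -e)
        local user
        local -a users
        IFS=',' read -r -a users <<< "$2"   # split the list on commas
        for user in "${users[@]}"; do
          EXCLUDED["$user"]=1
        done
        shift 2 ;;
      *) INPUT="$1"; shift ;;          # anything else is the input file
    esac
  done
}

parse_args -n 2 -e lamonts,root ps.txt
echo "limit=$LIMIT input=$INPUT excluded_count=${#EXCLUDED[@]}"
# prints: limit=2 input=ps.txt excluded_count=2
```

Because each branch shifts past what it consumed, the loop handles the options in any order, and a lookup such as ${EXCLUDED[$user]:-} can later tell you whether a username should be skipped.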

Sample Runs

Here are a few sample runs of the p1.sh file. Again, assume that for the purposes of these sample runs, ps aux always produces, and ps.txt always contains, the output shown above.

$ bash p1.sh -n 1 ps.txt
lamonts,3
$ ps aux | bash p1.sh -n 2 -e lamonts,root
maxg,3
sallyf+,2
$ ps aux | bash p1.sh -e lamonts,root -n 2
maxg,3
sallyf+,2
$ bash p1.sh -n 2 -e lamonts,root,maxg ps.txt
sallyf+,2
$ bash p1.sh -p 'vim test.c' -e maxg ps.txt
lamonts,2
$ ps aux | bash p1.sh -e root -p 'vim test.c'
lamonts,2
maxg,1

As a reminder, you can assume we will always give the options before supplying the file (if the script is not using stdin). You do not need to validate the options since we will always run the script correctly.

Grading

Programming assignments will be graded according to a general rubric. Specifically, we will assign points for completeness, correctness, design, and style. (For more details on the categories, see our Assignment Rubric page.)

The exact weights for each category will vary from one assignment to another. For this assignment, the weights will be:

  • Completeness: 70%

  • Correctness: 20%

  • Design/Style: 10%

Professor Samuels provided automated test cases on Ed under the “Homework #3 Tests” post.

Submission

Before submitting, make sure you’ve added, committed, and pushed all your code to GitHub. You must submit your final work through Gradescope (linked from our Canvas site) on the “Homework #3” assignment page in one of two ways:

  1. Uploading from GitHub directly (recommended way): You can link your GitHub account to your Gradescope account and upload the correct repository based on the homework assignment. When you submit your homework, a pop-up window will appear. Click on “Github” and then “Connect to Github” to connect your GitHub account to Gradescope. Once you connect (you will only need to do this once), you can select the repository you wish to upload and the branch (which should always be “main” or “master”) for this course.

  2. Uploading via a Zip file: You can also upload a zip file of the homework directory. Please make sure you upload the entire directory and keep the initial structure the same as the starter code; otherwise, you run the risk of not passing the automated tests.

Note

For either option, you must upload the entire directory structure; otherwise, the automated tests will not run correctly and you will be penalized if we have to run the tests manually. Going with the first option will do this automatically for you. You can always add additional directories and files (even inside the starter directories), but the default directory/file structure must not change.

Depending on the assignment, once you submit your work, an “autograder” will run. This autograder should produce the same test results as when you run the code yourself; if it doesn’t, please let us know so we can look into it. A few other notes:

  • You are allowed to make as many submissions as you want before the deadline.

  • Please make sure you have read and understood our Late Submission Policy.

  • Your completeness score is determined solely based on the automated tests, but we may adjust your score if you attempt to pass tests by rote (e.g., by writing code that hard-codes the expected output for each possible test input).

  • Gradescope will report the test score it obtains when running your code. If there is a discrepancy between the score you get when running our grader script, and the score reported by Gradescope, please let us know so we can take a look at it.