Introduction to Shell Scripting for the Life Sciences

This syllabus is subject to change!

(Slide/homework contents and dates may not be updated yet)

Schedule

Lectures will be recorded (location TBD).

Week (Date) Topic Assignment
1 (1/27) Syllabus and introduction to Bash (pdf) HW 1: Exploring the command line (pdf)
2 (2/3) Filesystem, permissions, links (pdf) HW 1 due, HW 2: Managing files (pdf)
3 (2/10) Unix concepts, file descriptors, and pipes (pdf) HW 2 due, HW 3: Text files (pdf)
4 (2/17) Regular expressions (pdf) HW 3 due, HW 4: Regex practice and grep (pdf)
5 (2/24) Sed, Awk (Sed pdf) (Awk pdf) Nothing due
6 (3/3) Regex, Sed, Awk review HW 4 due, HW 5: Sed/Awk (pdf)
7 (3/10) Globs, booleans (pdf) HW 5 due, HW 6: Practice midterm
8 (3/17) Midterm! HW 6 due, take exam
9 (3/24) Spring break! Enjoy your break!
10 (3/31) Exam review + final project Nothing due
11 (4/7) Variables, order of operations (pdf) HW 7: Programming (pdf)
12 (4/14) Tests, loops (pdf) HW 7 due and project proposal due, HW 8: Control Structures (pdf)
13 (4/21) Here strings, arrays, functions (pdf) HW 8 due, HW 9: Functions (pdf)
14 (4/28) String slicing, Find/Xargs HW 9 due, Project milestone due
15 (5/5) Extras (pdf) Finish final project!
16 (5/18) Final project due May 18 Enjoy your summer!

Extras = Bash best practices, beyond Bash and Coreutils, Bioinformatics tools, compressed files, package management

Course Description

Life scientists and laboratory biologists are often faced with large datasets to store, retrieve, and analyze. Graphical and web interfaces, while convenient, may not always be efficient or available for high-throughput analysis. The command-line interface, also known as the shell, is available in most computing environments. A shell lets users interactively manage files and run programs, and also build their own pipelines as shell programs, or scripts, which can run dozens or hundreds of programs at once. Scripting automates repetitive tasks and makes pipelines faster to build, reproducible, and portable. This course will teach life science majors the basics of the most widely used shell, Bash. Students will learn about navigating the shell, running commands, composing pipelines, and scripting. Students will also be introduced to biologically relevant databases and applications. No prior programming experience is assumed!
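To give a flavor of what "composing pipelines" means, here is a minimal sketch in Bash. The file contents and the function name count_species are made up for illustration and are not taken from the course materials; the commands themselves (printf, grep, cut, sort, uniq) are standard tools.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the sequence data and function name are hypothetical.

# Write a tiny FASTA file (three records) to a temporary location.
fasta_file=$(mktemp)
printf '%s\n' \
    '>seq1 Homo sapiens' 'ATGCGT' \
    '>seq2 Mus musculus' 'GGCTTA' \
    '>seq3 Homo sapiens' 'TTAGGC' > "$fasta_file"

# Count records: each record starts with a header line beginning with '>'.
grep -c '^>' "$fasta_file"

# Tally records per species by chaining small tools with pipes:
# keep header lines, drop the ID field, then sort and count duplicates.
count_species() {
    grep '^>' "$1" | cut -d ' ' -f 2- | sort | uniq -c | sort -rn
}

count_species "$fasta_file"
```

Each tool does one small job, and the pipe (|) passes the output of one tool to the next. Wrapping the pipeline in a function is what makes it reusable on any FASTA-like file.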

Course Details

This course is BSCI238G, a 1-credit course. It is taught in PLS1129 on Fridays from 12:00 pm to 12:50 pm during Spring 2023.

Instructors

The course facilitator (Skylar) should be contacted first about any questions or concerns before reaching out to the faculty advisor (Dr. Pierce).

Please wear a mask!

Although there is no mask mandate, the pandemic is not over. You are strongly encouraged to wear a mask.

Getting help

Piazza

We will use Piazza this semester as a question/answer forum. The Piazza link is on our ELMS page. Please send a message in Piazza before sending an email, as many students may have the same question as you.

Office Hours

By appointment. Please email me to set up a time.

Resources

There is no textbook for this course. The following free resources may be useful.

Command examples

Grades

Grades will be maintained on ELMS. You are responsible for all material discussed in lecture and for anything sent through other standard means of communication (email and ELMS announcements), including but not limited to deadlines, policies, and assignment changes. Any regrade request must be submitted within one week of when the grade is returned. No regrades will be considered afterwards.

Your final course grade will be determined according to the following percentages.

Percentage Title Description
10% Participation surveys (1% each) Complete surveys after class. A maximum of 10% of the course grade can be earned from surveys.
40% Homework (5% each) Practice scripting and review course materials, graded for completion. The lowest 2 homeworks will be dropped.
20% Midterm The midterm will be on topics from weeks 1-7, and will consist of multiple choice, short answer, and coding questions.
30% Final Project (pdf) For the final project, you will write and document a substantial, biologically relevant script that incorporates material from several classes.
>16% Extra Credit (pdf) Homeworks that cover some special topics.

Final grades will be decided from the following cutoffs, which may be adjusted at the discretion of the instructors. Plus/minus grades will be used.

Sign A B C D
+ 100-97 90-87 80-77 70-67
(none) 97-93 87-83 77-73 67-63
- 93-90 83-80 73-70 63-60

Course policies

Homework

Homework assignments will be released after class and are due at the start of the next class (12 pm on Fridays). Homeworks are graded for completion. Late submissions are accepted until 11:59 pm on the Monday after the due date, with a 20% deduction. To earn full points on a homework assignment, you must attempt every question. Answer keys will be released after homework deadlines. Students are expected to review these keys and make sure they understand the content.

Midterm Exam

There will be one midterm for this class. It is designed to take approximately one class period (50 minutes) to complete. It will be available for 24 hours, from 12 pm March 16 to 12 pm March 17. It is open note and open internet, but not open people or open AI (no ChatGPT or friends allowed, see below). The exam must be completed independently, and collaboration with other students during this time will be considered a violation of academic integrity. In other words, no cheating.

Academic Integrity

It is the responsibility, under the honor policy, of anyone who suspects an incident of academic dishonesty has occurred to report it to their instructor, or directly to the Honor Council. Cases of academic dishonesty will be pursued to the fullest extent possible as stipulated by the Office of Student Conduct. It is very important for you to be aware of the consequences of cheating, fabrication, facilitation, and plagiarism. For more information on the Code of Academic Integrity or the Student Honor Council, please visit https://www.shc.umd.edu.

In other words, the following is ok:

  • Working with other students on the homework
  • Asking the teaching staff for assignment help (e.g., through Piazza)
  • Using ideas or short fragments of code from publicly available information. You must cite the specific source in a comment in the relevant section of the program.
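
As an illustration of the last point, a citation comment might look like the sketch below. The function name reverse_complement and the bracketed source line are placeholders invented for this example, not a real reference; replace the placeholder with the actual page you used.

```shell
#!/usr/bin/env bash

# Print the reverse complement of a DNA sequence.
# Source: [URL of the page this idea was adapted from] (placeholder, not a real citation)
# What was borrowed: the idea of combining `rev` and `tr`.
reverse_complement() {
    # Reverse the string, then swap each base for its complement.
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

reverse_complement "ATGC"
```

The comment sits directly above the code it applies to and says both where the idea came from and what was borrowed, so a grader can tell which part of the script is your own work.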

And the following is not:

  • Getting someone else to do your coursework for you
  • Working with others on the midterm
  • Copying or sharing code with anyone
  • Using ideas from another person's project
  • Posting solutions publicly online (public Git repository, Chegg, etc)

ChatGPT

Use of ChatGPT and other generative AIs on homeworks is discouraged due to ongoing legal and ethical issues. However, it is not banned in this course. You should know the following:

  • AIs may output incorrect information
  • AIs may be unable to cite their sources or explain how they determined their answers
  • As AI continues to advance, one should understand how to use it responsibly

Therefore, if you use ChatGPT or other generative AIs on the homework, you must do the following:

  1. Cite it as a source.
  2. Quote all your questions and all its responses, in full (verbatim), including questions when you ask it to modify its responses.
  3. Explain why its answer is correct or not, and what modifications you would make and why. Use relevant sources to support your explanations.

Failure to do this will count as an academic integrity violation.

ChatGPT will not be allowed on the exam.

Excused Absence and Academic Accommodations

See the section titled "Attendance, Absences, or Missed Assignments" available at Course Related Policies.

Disability Support Accommodations

See the section titled "Accessibility" available at Course Related Policies.

Course Evaluations

If you have a suggestion for improving this class, don't hesitate to tell the instructor or student facilitator (Skylar Chan) during the semester. At the end of the semester, please don't forget to provide your feedback using the campus-wide CourseEvalUM system. Your comments will help make the STIC better.

Acknowledgements

I would like to thank Michelle Fang and Ethan Cheng for sharing their syllabi for their STICs with me ("Introduction to R/RStudio for Life Science Majors" and "Introduction to Python Programming for the Life Sciences"). I took inspiration from their general syllabus format and design.