
Distributed Computing Systems

CS 4513
D-Term 2008

This is a 4000-level undergraduate course during which you will study the concepts, design, and implementation of distributed computing systems. The course will:

  1. Complete the content of CS-3013, Operating Systems, specifically with respect to file systems.
  2. Introduce the concepts of distributed computing and the protocols needed to implement them.
  3. Reinforce the concepts of concurrency in computing environments.

The course includes a substantial practical component during which students will write programs that exercise distributed computing features.



Prerequisites

The official course description on the WPI Computer Science web site says that this course “extends the study of the design and implementation of operating systems begun in CS 3013 to distributed and advanced computer systems. Topics include principles and theories of resource allocation, file systems, protection schemes, and performance evaluation as they relate to distributed and advanced computer systems.”

You must have a solid background in CS-3013 or an equivalent Operating Systems course, and you must also be proficient in programming in a low-level language such as C, including the use of pointers, casts, malloc(), and free().
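As a rough self-check of the C proficiency described above, the following minimal sketch exercises pointers, casts, malloc(), and free(). The function names make_squares and sum_bytes are hypothetical and chosen only for illustration; they are not part of any course assignment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate an array of n ints holding 0, 1, 4, 9, ...
   The caller owns the result and must free() it.
   Returns NULL if n is 0 or the allocation fails. */
int *make_squares(size_t n)
{
    if (n == 0)
        return NULL;

    int *squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        squares[i] = (int)(i * i);   /* explicit cast from size_t to int */
    return squares;
}

/* Sum the raw bytes of any object by viewing it through
   a pointer to unsigned char. */
unsigned long sum_bytes(const void *buf, size_t len)
{
    const unsigned char *p = (const unsigned char *)buf;
    unsigned long total = 0;

    while (len-- > 0)
        total += *p++;
    return total;
}
```

If every line of this sketch reads naturally to you, including who is responsible for calling free(), you probably have the programming background this course assumes.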

You will need to know something about computer networks, protocols, and the 7-layer OSI stack. This material is covered in CS-4514, Computer Networks. For those who have not taken CS-4514, a brief tutorial will be offered from 11:00 AM to 1:00 PM on March 19 in Fuller Labs 320. The slides for the tutorial are available here (.ppt, html). A tutorial on the use of sockets can be found here:– http://beej.us/guide/bgnet.


Course Information

Time and Place: Tuesdays and Fridays, 8:00 — 9:50 AM, Goddard Hall 227
March 11 — April 29, 2008
No class on Tuesday, April 15 (campus-wide project day)
Fossil Lab will not be available from approximately 6:00 PM Friday, March 21 to approximately 6:00 PM Saturday, March 22

Professor: Hugh C. Lauer

Email: <professor’s last name>@cs.wpi.edu

Office hours: by appointment, or (normally) 1 hour after each class

Office: Fuller Labs, room 239

 

Teaching Assistants:

Rick Skowyra

Email: rskowyra **at** cs.wpi.edu

Office hours:

2:00-4:00 PM, Mondays and Tuesdays

Office: Fossil Lab

Issac Chanin

Email: chanin **at** wpi.edu

Office hours:

5:30-7:30 PM, Tuesdays and Thursdays

Office: Fossil Lab

 

 

Textbooks: Andrew S. Tanenbaum and Maarten van Steen, Distributed Systems: Principles and Paradigms, 2nd edition, Prentice Hall, 2007. Note:– Be sure that you have the 2nd edition of this book. Copies of the 1st edition are in circulation, including on the web, and the chapters of the two editions are in different orders.

You also need one of the following from your Operating System course:–

  • Silberschatz, Galvin, and Gagne, Operating System Concepts, Seventh Edition, John Wiley and Sons, 2005.
  • Andrew S. Tanenbaum, Modern Operating Systems, 2nd edition, Prentice Hall, 2001.

 

Class e-mail lists: The following two lists are in the domain cs.wpi.edu:–

o       cs4513-all  — to reach all students, TAs, and the professor

o       cs4513-staff — to reach just the TAs and the professor

Course web site: http://web.cs.wpi.edu/~cs4513/d08/

Students needing to be absent from class should notify the professor by e-mail or in person as soon as possible, especially if an exam is scheduled for the day of absence!



General Overview

CS-4513 meets for two 2-hour classes per week for a seven-week undergraduate term (28 hours). There will be no class on April 15, the campus-wide Project Day.

This course will be a combination of lecture and class discussion, programming projects, and quizzes and tests. There may be one or more unannounced quizzes during the term, and there will be a scheduled mid-term exam and a scheduled final exam.

There will be two programming projects during the term: a simple project to be done individually and a larger one to be done in teams.

Class participation is an essential part of the grade for the course. If you attend lectures but never say anything or engage in discussions, it will likely reduce your grade by one full letter or more. Students will be expected to be familiar with, and to discuss, the relevant sections of the textbook and other reading assigned during the term.



Grading Policy

Final grades will be computed as follows:

  • Exams: 40%
  • Programming Projects: 40%
  • Class participation, written homework, and quizzes: 20%
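The weighting above amounts to a simple weighted average. The sketch below assumes that each component is scored on a 0 to 100 scale, which the syllabus does not state; the function name weighted_score is hypothetical. It does not map the result to a letter grade, because letter grades are described in terms of objectives met rather than numeric cutoffs.

```c
/* Weights from the grading policy, in percent. */
enum { EXAM_WEIGHT = 40, PROJECT_WEIGHT = 40, PARTICIPATION_WEIGHT = 20 };

/* Combine three component scores into one weighted score.
   Assumption: each argument is on a 0-100 scale. */
double weighted_score(double exams, double projects, double participation)
{
    return (EXAM_WEIGHT * exams
            + PROJECT_WEIGHT * projects
            + PARTICIPATION_WEIGHT * participation) / 100.0;
}
```

For example, component scores of 90, 80, and 70 combine to 0.4 × 90 + 0.4 × 80 + 0.2 × 70 = 82.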

Final grades will reflect the extent to which you have demonstrated understanding of the material, and completed the assigned projects. The base level grade will be a "B" which indicates that the basic objectives on assignments and exams have been met. A grade of "A" will indicate significant achievement beyond the basic objectives and a grade of "C" will indicate not all basic objectives were met, but work was satisfactory for credit.

If there are any circumstances that limit or restrict your participation in the class or the completion of assignments, please contact the professor as soon as possible in order to work something out.

Testing

  • There will be an in-class mid-term exam on Tuesday, April 1, of approximately one hour duration.
  • There will be an in-class final exam on Tuesday, April 29, of approximately one hour duration.
  • One or more unannounced quizzes may be held during the term.

The mid-term and final exams will be closed book, but you may bring one 8½ × 11 inch sheet of prepared notes (double-sided). Unannounced quizzes are closed book and closed notes. Each student should have a calculator available for quizzes and exams.

There is no make-up for missed quizzes. It is not in your interest to miss an exam, but if extraordinary circumstances apply, please contact the professor and your advisor beforehand so that we can work something out.

Academic Honesty

Unless explicitly noted, all work is to be done on an individual basis. Any violation of the WPI guidelines for academic honesty will result in no credit for the course and referral to the Student Affairs Office. More information can be found at

http://www.wpi.edu/Pubs/Policies/Honesty/policy.html

That being said, you are strongly encouraged to discuss ideas, distributed system concepts, and course material with each other, and especially the challenges you encounter in working on projects and/or in the Fossil Lab.

Late Policy

Unless you have arranged otherwise with the professor at least one day prior to the due date, late submissions will be penalized 10% of total assignment value per day or partial day (with the weekend counting as one day), and no assignments will be accepted after seven days beyond the due date. All assignments are due at the start of class on the due date. Projects must be submitted as directed in class. Exceptions to these rules can be made only beforehand.
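The penalty arithmetic can be sketched as follows. This is one reading of the policy, not an official calculator: the function names are hypothetical, the rule that a weekend counts as one day is not modeled (the caller is assumed to have accounted for it already), and the seven-day cutoff is applied to the same day count as the penalty.

```c
/* Round hours late up to whole days, since a partial day
   counts as a full day. Weekends are NOT handled here. */
int hours_to_penalty_days(int hours_late)
{
    if (hours_late <= 0)
        return 0;
    return (hours_late + 23) / 24;
}

/* Percent of total assignment value deducted for a submission
   that is penalty_days late, or -1 if it is past the seven-day
   limit and would not be accepted. */
int late_penalty_percent(int penalty_days)
{
    if (penalty_days <= 0)
        return 0;               /* on time */
    if (penalty_days > 7)
        return -1;              /* no longer accepted */
    return 10 * penalty_days;   /* 10% per day or partial day */
}
```

Under these assumptions, a submission that arrives 25 hours late counts as two days late and loses 20% of the assignment's total value.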



BS/MS Project

Students wishing to receive both BS and MS credit for this course must complete an additional research project. The project for this term is specified here (.doc, html).



Lecture Slides

Copies of the slides and notes used for lectures will be posted here just before or just after each class. The textbook chapters refer to Tanenbaum & van Steen, Distributed Systems: Principles and Paradigms, 2nd edition.

Date    Topics                                                        Textbook Chapters  Lecture Notes
Week 1  Course Introduction & What Is a Distributed Computing System  1                  .ppt, html
        Remote Procedure Call                                         4.2                .ppt, html
Week 2  Naming                                                        5                  .ppt, html
        Introduction to File Systems                                  OS text            .ppt, html
        File System Implementations                                   OS text            .ppt, html
Week 3  Distributed File Systems (and related topics)                 11                 .ppt, html
        Learning about MapReduce                                                         .ppt, html
        Atomic Transactions                                                              .ppt, html
Week 4  The Grapevine Distributed System                                                 .ppt, html
        Synchronization in Distributed Systems                        6                  .ppt, html
        Election algorithms                                           6                  .ppt, html
        Replication and Consistency                                   7                  .ppt, html
Week 5  More on Replication and Consistency                           7                  .ppt, html
        Security and Authentication                                   9                  .ppt, html
        MapReduce: algorithms                                         Not in text        .ppt, html
        MapReduce: distributed functions                              Not in text        .ppt, html
Week 6  MapReduce: applications                                       Not in text        .ppt, html
        Google File System                                            Not in text        .ppt, html
Week 7  Security and Authentication (continued)                       9                  .ppt, html
        More on Authentication                                        9                  .ppt, html


Programming Projects

Fossil Lab

Project assignments this term will involve the development of programs that spawn multiple processes and that communicate across the network. Since the normal mistakes that occur during learning and development of such programs can wreak havoc on the daily operation of the campus computing facilities and networks, we will use the newly refurbished and re-equipped Fossil Lab. This enables students to work in a realistic but protected and isolated environment without impacting other computational activities.

If you elect to develop a programming assignment outside the Fossil Lab, please be advised that you are entirely responsible for any errors or chaos it might cause.
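To give a sense of what "programs that spawn multiple processes" look like in C, here is a minimal sketch using the POSIX calls fork() and waitpid(). The function name spawn_and_collect and the MAX_CHILDREN cap are hypothetical, introduced only for this illustration, and the sketch does no network communication. It also shows the kind of mistake the isolated lab protects against: if the child branch forgot to call _exit(), each child would continue around the loop and fork children of its own.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16   /* hypothetical safety cap for this sketch */

/* Fork n children; child i exits immediately with status i.
   The parent waits for every child it started and returns the
   sum of their exit statuses, or -1 on any error. */
int spawn_and_collect(int n)
{
    pid_t pids[MAX_CHILDREN];
    int started = 0;

    if (n < 0 || n > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;              /* fork failed: stop creating children */
        if (pid == 0)
            _exit(i);           /* child: leave at once, never loop */
        pids[started++] = pid;  /* parent: remember the child */
    }

    int sum = 0;
    int failed = (started != n);
    for (int i = 0; i < started; i++) {
        int status = 0;
        if (waitpid(pids[i], &status, 0) < 0 || !WIFEXITED(status))
            failed = 1;
        else
            sum += WEXITSTATUS(status);
    }
    return failed ? -1 : sum;
}
```

Calling spawn_and_collect(4) starts four children that exit with statuses 0 through 3, so the parent collects a total of 6.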

In the new Fossil Lab, students work on virtual machines. General instructions for setting up your virtual machine can be found here:– (.doc, .html). The specific virtual machine for this course can be found in the following folder, which is accessible from the desktop of your Fossil Lab workstation:–

P:\ Clonable-SUSE-Linux-10.3

This virtual machine requires about 8-9 gigabytes of storage on the Fossil server. You may make a full clone, but it will not fit within your Fossil server quota. A linked clone, however, will generally be less than 2 gigabytes and will fit within your quota, but it can only be used in the Fossil Lab. If you wish to use the virtual machine on your own PC, make a full clone.

This virtual machine has a single user registered, namely “student”. The root password and also the password for “student” is “Fossil-B17”. It is strongly suggested that you create a user identity for yourself and delete the student identity. It is also strongly suggested that you change the root password of your virtual operating system to something else.


General Requirements for Programming Assignments

Each programming project submitted by a student will be compiled and tested by the graders. In order to make the grading process reasonable, a certain uniformity of project submissions and reasonable standard of programming practice are expected.

Programming projects this term will be compiled and tested on virtual machines in the Fossil Lab, using the versions of Linux and of the C, C++, or Java tools provided with this course. All student programs must compile without warnings. Using the “-Wno-deprecated” switch without prior permission from the instructor is strongly discouraged.

When grading any project, the grader or professor will do the following to compile the components of the project submission:–

download from Turnin to student’s directory
cd {student’s directory}
make

and the following to erase all compiled components

make clean

If this does not work, we will not try to figure out how to compile and run your program and the assignment will not be graded any further.

To test your program, we will execute

 {name of program} {arguments}

For example, if you were asked to write a program called life with three arguments, we will type the following to a shell to test your program:–

life argument1 argument2 argument3

We will not look through your submission to try to figure out what you decided to call your program.
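Because graders invoke the program exactly as specified, it helps to validate the argument count up front and print a usage message when it is wrong. The sketch below illustrates this for the life example above. The function name life_main is hypothetical; a real submission's main() would simply return life_main(argc, argv). What the three arguments mean is left open, since this example does not specify them.

```c
#include <stdio.h>

/* Entry logic for a program invoked as:
       life argument1 argument2 argument3
   Returns 0 on success and 1 if the argument count is wrong. */
int life_main(int argc, char *argv[])
{
    if (argc != 4) {   /* program name plus exactly three arguments */
        fprintf(stderr, "usage: %s argument1 argument2 argument3\n",
                argc > 0 ? argv[0] : "life");
        return 1;
    }

    /* Placeholder for the real work: echo the arguments received. */
    for (int i = 1; i < argc; i++)
        printf("argument %d: %s\n", i, argv[i]);
    return 0;
}
```

Returning a nonzero status on bad input also lets a grader's script detect the failure without reading the output.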


Submission of Programming Projects

Projects must be submitted using the web-based version of the Turnin system. The instructor will provide accounts to students at the start of term. When accounts are set up, each student will receive an e-mail message with a password for the Turnin system. Students should log into Turnin and change their passwords to something they can remember. In the event that a password is forgotten, the instructor or grader can reset it. The web-based Turnin system can be accessed from within a virtual machine or from anywhere on the Internet by visiting

            http://web.cs.wpi.edu/~kfisler/turnin.html

            http://turnin.cs.wpi.edu:8088/servlets/turnin.ss

Project components should not be zipped together or be partitioned into folders for different parts of the project. The reason is that the Turnin system already zips them together.

When you submit an assignment, your submission must include not only the solution, but also the test code or test cases and it must be well commented and easy to read by others. Similarly, any output must be cleanly formatted and easy to read by others.

Be sure to put your name at the top of every file submitted as part of a programming project.  You would be surprised at how often students forget this!

“Getting the correct answer” on a programming project is not sufficient to meet the objectives of the assignment. Successfully meeting the objectives includes testing your program and paying attention to its robustness. On the team project, it also means multiple “deliverables,” each of which must be completed on time.


Programming Project Assignments:

 

Project #1 – Remote execution of a command (.doc, .html)
            Slides for assignment of project (.ppt, .html)

 

Project #2 – Distributed system project (.doc, .html)

 


Reference Material                                  

The following papers are relevant to the material presented in class:–

Allman, Eric, “E-mail Authentication: What? Why? How?,” ACM Queue, November 2006, pp 30-34. (.pdf)

Anderson, Thomas E., Dahlin, Michael D., Neefe, Jeanna M., Patterson, David A., Roselli, Drew S., and Wang, Randolph Y., “Serverless Network File Systems,” Proceedings of the 1995 Symposium on Operating System Principles, Copper Mountain, Colorado, December 1995. (.pdf)

Birrell, Andrew D., Levin, Roy, Needham, Roger M., and Schroeder, Michael D., “Grapevine: An Exercise in Distributed Computing,” Communications of the ACM, vol 25, #4, April 1982, pp. 260-274. (.pdf)

Birrell, Andrew D., and Nelson, Bruce Jay, “Implementing Remote Procedure Calls,” ACM Transactions on Computer Systems, vol. 2, #1, February 1984, pp 39-59. (.pdf)

Dean, Jeffrey, and Ghemawat, Sanjay, “MapReduce: Simplified Data Processing on Large Clusters,” Communications of the ACM, vol 51, #1, January 2008, pp. 107-113. (.pdf)

Dean, J. and Ghemawat, S. “MapReduce: Simplified data processing on large clusters,” In Proceedings of Operating Systems Design and Implementation (OSDI). San Francisco, CA, 2004. pp. 137-150. (.pdf). Note: This paper is an earlier version of the CACM paper above, but it contains some details not included in the CACM paper.

Ghemawat, Sanjay, Gobioff, Howard, and Leung, Shun-Tak, “The Google File System,” Proceedings of the 2003 Symposium on Operating System Principles, Bolton Landing (Lake George), NY, October 2003. (.pdf)

Lämmel, Ralf, “Google’s MapReduce Programming Model — Revisited,” Microsoft Corp., Redmond, WA. (.pdf)

McDougall, Richard, “Extreme Software Scaling,” ACM Queue, September 2005, pp 38-48. (.pdf)

Patterson, David A., “The Data Center is the Computer,” Communications of the ACM, vol 51, #1, January 2008, p. 105. (.pdf) Note: This paper is a technical perspective on the CACM paper by Dean and Ghemawat, listed above.

Rosenblum, M, and Ousterhout, J. K., “The Design and Implementation of a Log-Structured File System,” Proceedings of 13th ACM Symposium on Operating Systems Principles, Pacific Grove, California, October 1991, pp. 1-15. (.pdf)

Schroeder, Michael D., Birrell, Andrew D., and Needham, Roger M., “Experience with Grapevine: Growth of a Distributed System,” ACM Transactions on Computer Systems, vol. 2, #1, February 1984, pp. 3-23. (.pdf)

Smed, J., Kaukoranta, T., and Hakonen, H., “Aspects of Networking in Multiplayer Computer Games,” The Electronic Library, vol. 20, #2, 2002, pp. 87-97. (.pdf)

Sutter, Herb, and Larus, James, “Software and the Concurrency Revolution,” ACM Queue, September 2005, pp 54-62. (.pdf)
