Distributed Systems Course (fall 2016/2017)

Lecturer:	Konrad Iwanicki
Assistants:	none
Lectures:	Wednesday, 2:15 PM - 3:45 PM, Room 3230
Lab classes:	Wednesday, 4:15 PM - 5:45 PM, Room 3045
Final exam:	Wednesday, January 25, 2:15 PM - 3:45 PM (may be shorter), Room 3230

This (seventh) edition of the course consists of two components: lectures and lab classes. The lectures cover the principles, advanced concepts, and technologies of distributed systems, including communication, replication, fault tolerance, and security. The objective of the lab, in turn, is to give every student a chance to design, implement, and evaluate his own distributed system in the area of cloud computing, as well as to broaden the students' knowledge of the state of the art in distributed systems. The course is recommended for graduate students attending the distributed systems seminar and following the DOS Master's track, as well as for other students interested in computer systems. The course may be given in English.

Contents
1. Passing Rules
1.1. Lab Rules
1.2. Exam Rules
2. Lecture Topics and Schedule
3. Lab Topics and Schedule
4. Student Presentation Topics and Schedule
5. Past Exams

Passing Rules

To pass the course, a student has to score at least 60 out of a total of 100 points and pass the lab (see below). The points can be scored for:

lab assignments: up to 50 points
a written exam at the end of the semester: up to 50 points

The final grade is calculated as follows:

Points	0-51	52-59	60-67	68-75	76-83	84-91	92-...
Grade	2 (fail)	2+ (fail)	3	3+	4	4.5	5
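The grade table above can be read as a lookup on the lower bound of each point band. The following is only a minimal sketch: the names `grade_for_points` and `passes_course` are hypothetical, and it assumes that fractional point totals fall into the band whose lower bound they have reached, which the table itself does not specify.

```python
from bisect import bisect_right

# Lower bounds of the bands 52-59, 60-67, 68-75, 76-83, 84-91, 92-...
# (anything below 52 falls into the first band, 0-51).
LOWER_BOUNDS = [52, 60, 68, 76, 84, 92]
GRADES = ["2 (fail)", "2+ (fail)", "3", "3+", "4", "4.5", "5"]


def grade_for_points(points: float) -> str:
    """Map a point total to a grade label from the table.

    Assumption: a fractional total belongs to the band whose lower
    bound it has reached (e.g. 59.9 is still in the 52-59 band).
    """
    return GRADES[bisect_right(LOWER_BOUNDS, points)]


def passes_course(points: float, lab_passed: bool) -> bool:
    """Passing requires at least 60 points and a passed lab."""
    return points >= 60 and lab_passed
```

With these definitions, `grade_for_points(60)` gives the label "3", while `passes_course(60, False)` is false because the lab requirement is checked separately.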

Lab Rules

The goals of the lab are twofold. First, the lab allows each student to build her own simple distributed system. The building process will consist of two assignments and one colloquium. Second, the lab creates an opportunity for each student to present a piece of recent work in the area of distributed systems to the other students. This will take the form of an oral presentation.

To pass the lab, each student has to score a total of at least 30 points and at least the minimum required number of points for each component. The detailed breakdown of the scores and deadlines is as follows:

What	When	How many points	Min. required points
Colloquium	October 12, 2016, 16:00 CEST	5	0
Assignment 1	October 28, 2016, 23:59 CEST	10	6
Assignment 2	December 30, 2016, 23:59 CET	30	16
Oral Presentation	Individually set date (schedule)	15	8
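The two lab conditions (a total of at least 30 points and a minimum score per component) can be checked mechanically. The sketch below is illustrative only: the dictionary keys and the function name `lab_passed` are hypothetical, while the minimums are copied from the table above.

```python
# Minimum required points per lab component, taken from the table.
LAB_MINIMUMS = {
    "colloquium": 0,
    "assignment_1": 6,
    "assignment_2": 16,
    "presentation": 8,
}
LAB_TOTAL_MINIMUM = 30


def lab_passed(scores: dict) -> bool:
    """Return True if both lab conditions hold.

    `scores` maps component names (hypothetical keys, see LAB_MINIMUMS)
    to the points obtained; a missing component counts as 0 points.
    """
    total = sum(scores.get(name, 0) for name in LAB_MINIMUMS)
    meets_minimums = all(
        scores.get(name, 0) >= minimum
        for name, minimum in LAB_MINIMUMS.items()
    )
    return total >= LAB_TOTAL_MINIMUM and meets_minimums
```

Note that a high total does not compensate for a component below its minimum: full marks everywhere else with 15 points for Assignment 2 still fail the lab under this check.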

At the beginning of the course, students may decide if they want to work on the assignments individually or in pairs. No larger groups will be allowed. The decision cannot be changed during the semester (after the colloquium). The lecturer will not regard any conflicts within pairs as circumstances affecting grades. In other words, if you work in a pair, choose your partner well.

Assignment solutions have to be handed in on time by e-mailing them to the lecturer with the subject “[DS2016] Solution X” (where X is 1 or 2). Since the lecturer receives an excessive number of e-mails, e-mails with other subjects may be ignored. Moreover, each day of delay in submitting a solution multiplies the score received for the solution by 0.9. Normally, the delay must not exceed 7 days, after which an assignment is considered failed (the student receives 0 points). However, each day on which a student participates in both a lecture and a lab gives the student one extra day of delay (for that day, the points are not multiplied by 0.9). Future days on which the student merely intends to participate do not count toward the reduction. For students working in pairs, the reduction is computed as the average of the number of lectures attended by each of the participants (rounded down if necessary).
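The late-submission rule can be illustrated with a small calculation. This is a sketch under stated assumptions, not an official formula: the function names are hypothetical, and it assumes that the extra days earned through attendance both remove the 0.9 multiplier for those days and extend the 7-day limit by the same number of days.

```python
def pair_free_days(days_a: int, days_b: int) -> int:
    """Extra days for a pair: the average of both counts, rounded down."""
    return (days_a + days_b) // 2


def late_score(raw_score: float, days_late: int, free_days: int = 0,
               max_penalized_days: int = 7) -> float:
    """Apply the 0.9-per-day late penalty to a raw score.

    Assumption: `free_days` (earned by attending both the lecture and
    the lab on the same day) are subtracted from the delay first, and
    the 7-day limit applies only to the remaining, penalized days.
    """
    penalized_days = max(0, days_late - free_days)
    if penalized_days > max_penalized_days:
        return 0.0  # too late: the assignment is considered failed
    return raw_score * 0.9 ** penalized_days
```

For example, a solution worth 30 points handed in two days late with no extra days would be scaled by 0.9 twice, that is, to 30 × 0.81 = 24.3 points; with two extra days earned beforehand, the same delay would carry no penalty.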

You are allowed to discuss your ideas for solving the assignments with your colleagues. You are NOT allowed to show, share, or exchange code (in any form) without prior permission from the lecturer.

A presentation, in turn, is prepared individually and normally given in Polish with slides in English. However, if foreign students enroll in the course, all presentations will have to be given in English. The strict time limit for a single talk is 60 minutes if there is one presentation per class, or 45 minutes if there are two presentations during a single class. The presenting student will be interrupted after this period. During the talk, other students are discouraged from asking questions. After the talk, there is a question-and-answer session, during which the presenter answers questions posed by the lecturer and other students. The objective of the questions could be, for instance, to clarify some aspects of the paper or to learn the presenter's opinion on a problem related to the paper.

During her presentation of a paper, a student is obliged to display PowerPoint/PDF slides for the paper. As a reminder, they have to be in English. The student has to prepare the slides on her own. If some slides for the paper already exist on the Internet, the contents of those slides can be reused by the student preparing her own slides only if reusing the contents does not violate any copyrights, especially when the student's presentation is made available online. Moreover, the student has to acknowledge using somebody else's slides.

Tips:

Read your paper well in advance to understand it and to later be able to answer other students' questions.
Practice your talk to fit in the time limit.
Try to briefly go over the related work cited in the paper as this can give you some valuable input on the problem the paper is solving.
Try to find any follow-ups on the paper because this can be rewarding as well. Skimming through follow-up papers will help you better understand the topic.
Ask the presenter questions that, rather than proving the presenter doesn't know something, lead to interesting discussions. You are not awarded points for mean or stupid questions.
If you have read and understood the presented paper, and if you have practiced your talk, relax during your presentation: you will surely be able to answer all questions.

Exam Rules

The exam covers the lecture topics as well as the students' presentations. It consists of a series of questions. Each question has three subquestions with binary (TRUE/FALSE) answers. A student scores a point for a question only if the answers to all subquestions of the question are correct. Conversely, if the answer to any subquestion of the question is incorrect, no point is given for the entire question. Note that these scoring rules are really demanding (cf. the scores for 2015/2016).
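The all-or-nothing scoring of a question can be sketched as follows. The function name `exam_points` and the representation of answers as triples of booleans are assumptions made for illustration; how the resulting count is scaled to the course's 50 exam points is not covered here.

```python
from typing import Sequence, Tuple

# One question = three TRUE/FALSE subquestion answers.
Triple = Tuple[bool, bool, bool]


def exam_points(given: Sequence[Triple], key: Sequence[Triple]) -> int:
    """Count the questions whose three answers all match the key.

    A question with even one wrong subquestion answer earns nothing.
    """
    if len(given) != len(key):
        raise ValueError("one answer triple is needed per question")
    return sum(1 for g, k in zip(given, key) if tuple(g) == tuple(k))
```

One way to see why the rules are demanding: guessing all three subquestions of a question at random yields the point with probability (1/2)³ = 1/8.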

Lecture Topics and Schedule

Since this is still a developing course, this year's lectures will be based mostly on a book by my PhD adviser and the head of my former research group: Maarten van Steen and Andrew S. Tanenbaum, “Distributed Systems: Principles and Paradigms,” Second Edition, Prentice Hall, 2007, 702 pages, ISBN 9780132392273. Purchasing the book is not mandatory as the lecture slides will be available here. There will be a few lectures with extra material, though.

Date	Topics	Slides
October 5, 2016	Introduction: goals of distributed systems, common types of distributed systems	lecture 01
October 12, 2016	Architectures: architectural styles, system architectures, self-management	lecture 02
October 19, 2016	Processes: threads, virtualization, clients & servers, server clusters, code migration	lecture 03
October 26, 2016	Communication: fundamentals, remote procedure call, message-oriented communication, stream-oriented communication, multicast communication	lecture 04-05
November 2, 2016		lecture 04-05
November 9, 2016	Naming: basic terms and definitions, flat naming, structured naming, attribute-based naming	lecture 06 and supplement
November 16, 2016	Synchronization: clock synchronization, logical clocks, totally-ordered multicast, causally-ordered multicast, mutual exclusion, global positioning of nodes, leader election	lecture 07-08 and supplement
November 23, 2016		lecture 07-08 and supplement
November 30, 2016	Replication and Consistency (Part I): replica management, continuous consistency, data-centric consistency models, consistency protocols	lecture 09 (selected slides)
December 7, 2016	Fault Tolerance (Part I): failure models, failure masking, failure detection, reliable client-server communication, atomic multicast, two-phase commit, three-phase commit, checkpointing, logging, recovery, agreement in faulty systems	lecture 10-11
December 14, 2016		lecture 10-11
December 21, 2016	Fault Tolerance (Part II): agreement in faulty systems (continued), Paxos	lecture 12
January 11, 2017	Replication and Consistency (Part II): CAP theorem, PACELC, eventual consistency, conflict-free replicated data types, client-centric consistency models	lecture 13
January 18, 2017	student presentations moved from the lab to the lecture
January 25, 2017	FINAL EXAM

Lab Topics and Schedule

The schedule of the lab classes with material relevant to building the distributed system is as follows:

Date	Materials
October 5, 2016	Scenario 01
October 12, 2016	Scenario 02
October 19, 2016	Scenario 03
October 26, 2016	Individual work, assignment grading in spare time
November 2, 2016	Individual work, assignment grading in spare time
November 9, 2016	Scenario 04
November 16, 2016	Scenario 05
November 23, 2016	Scenario 06
November 30, 2016	Scenario 07
December 7, 2016	Scenario 08
December 14, 2016	Individual work, assignment grading in spare time
December 21, 2016
January 11, 2017
January 18, 2017	Entire lab dedicated to assignment grading
January 25, 2017	Entire lab dedicated to assignment grading

Student Presentation Topics and Schedule

The schedule of the students' presentations is as follows:

Date	Presenter	Topic
October 5, 2016	Konrad Iwanicki	Lab organization and rules. Assignment presentation.
October 12, 2016	Juliusz Straszynski	Trishul Chilimbi, Yutaka Suzue, Johnson Apacible, Karthik Kalyanaraman: “Project Adam: Building an Efficient and Scalable Deep Learning Training System”
October 19, 2016	Hubert Tarasiuk	Maxime Colmant, Mascha Kurpicz, Pascal Felber, Loic Huertas, Romain Rouvoy, Anita Sobe: “Process-level Power Estimation in VM-based Systems”
October 26, 2016	Mateusz Piotrowski	Sebastian Angel, Hitesh Ballani, Thomas Karagiannis, Greg O'Shea, Eno Thereska: “End-to-end Performance Isolation Through Virtual Datacenters”
October 26, 2016	Janusz Marcinkiewicz	Eric Boutin, Jaliya Ekanayake, Wei Lin, Bing Shi, Jingren Zhou, Zhengping Qian, Ming Wu, Lidong Zhou: “Apollo: Scalable and Coordinated Scheduling for Cloud-Scale Computing”
November 2, 2016	Maciej Kisiel	Sangjin Han, Scott Marshall, Byung-Gon Chun, Sylvia Ratnasamy: “MegaPipe: A New Programming Interface for Scalable Network I/O”
November 2, 2016	Andrzej Jackowski	Sangman Kim, Seonggu Huh, Yige Hu, Xinya Zhang, Emmett Witchel, Amir Wated, Mark Silberstein: “GPUnet: Networking Abstractions for GPU Programs”
November 9, 2016	Pawel Janus	Pawan Prakash, Advait Dixit, Y. Charlie Hu, Ramana Kompella: “The TCP Outcast Problem: Exposing Unfairness in Data Center Networks”
November 16, 2016	Ewa Pawlowska	Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkateshwaran Venkataramani: “Scaling Memcache at Facebook”
November 23, 2016	Piotr Rymarz	Michael Chow, David Meisner, Jason Flinn, Daniel Peek, Thomas F. Wenisch: “The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services”
November 30, 2016	Przemyslaw Przybyszewski	Shobana Balakrishnan, Richard Black, Austin Donnelly, Paul England, Adam Glass, Dave Harper, Sergey Legtchenko, Aaron Ogus, Eric Peterson, Antony Rowstron: “Pelican: A Building Block for Exascale Cold Data Storage”
December 7, 2016	~~Mateusz Chololowicz~~	Subramanian Muralidhar, Wyatt Lloyd, Sabyasachi Roy, Cory Hill, Ernest Lin, Weiwen Liu, Satadru Pan, Shiva Shankar, Viswanath Sivakumar, Linpeng Tang, Sanjeev Kumar: “f4: Facebook's Warm BLOB Storage System”
December 14, 2016	~~Olivier Czuper~~	~~Jake Wires, Stephen Ingram, Zachary Drudi, Nicholas J. A. Harvey, Andrew Warfield: “Characterizing Storage Workloads with Counter Stacks”~~
December 14, 2016	Pawel Krawczyk	Mai Zheng, Joseph Tucek, Dachuan Huang, Feng Qin, Mark Lillibridge, Elizabeth S. Yang, Bill W Zhao, Shashank Singh: “Torturing Databases for Fun and Profit”
December 21, 2016	Adam Paszke	Wenting Zheng, Stephen Tu, Eddie Kohler, Barbara Liskov: “Fast Databases with Fast Durability and Recovery Through Multicore Parallelism”
December 21, 2016	Cezary Siluszyk	Sriram Subramanian, Swaminathan Sundararaman, Nisha Talagala, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau: “Snapshots in a Flash with ioSnap”
January 11, 2017	Krzysztof Pszeniczny	James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford: “Spanner: Google's Globally-Distributed Database”
January 11, 2017	~~Jan Kopanski~~	~~Yang Wang, Manos Kapritsos, Zuocheng Ren, Prince Mahajan, Jeevitha Kirubanandam, Lorenzo Alvisi, Mike Dahlin: “Robustness in the Salus scalable block store”~~
January 18, 2017 (lecture)	Michal Rybak	Masoud Saeida Ardekani, Douglas B. Terry: “A Self-Configurable Geo-Replicated Cloud Storage System”
January 18, 2017 (lecture)	~~Klaudia Algiz~~	~~Chao Xie, Chunzhi Su, Manos Kapritsos, Yang Wang, Navid Yaghmazadeh, Lorenzo Alvisi, Prince Mahajan: “Salt: Combining ACID and BASE in a Distributed Database”~~
January 18, 2017 (lab)	No presentations: grading Assignment 2
January 25, 2017	No presentations: grading Assignment 2

Past Exams

Below, you can find the questions from past exams:

Year	Exam Set	Participants (course)	Participants (exam)	Participants (%)	Points (available)	Points (min)	Points (avg)	Points (med)	Points (max)
2015/2016	Final (test)	16	13	81.3	25	4	10.08	10	22
2014/2015	Final (test)	17	17	100	25	5	12.76	13	20
2013/2014	Final (test)	16	16	100	25	11	14.69	13	21
2012/2013	Final (test)	34	34	100	25	3	10.33	10	22
2011/2012	Final	36	34	94.4	50	10	29.85	30.5	49
2010/2011	Part II	26	21	80.8	25	3.75	16.27	13.5	24.25
2010/2011	Late Part I	26	11	42.3	25	13.75	21.6	21.25	24.75
2010/2011	Early Part I	26	17	65.4	25	9.25	14.9	13.5	22

Last updated: January 23, 2017, 15:39:24 GMT+0000 (Coordinated Universal Time).