Randy Chow is a professor of Computer and Information Science and Engineering at the University of Florida.
His research interests include computer networks, distributed systems, computer security, and system performance
evaluation.
Johnson, Theodore : AT&T Labs-Research
Theodore Johnson is a member of the technical staff at the Database Research department of AT&T Labs-Research.
Previously, he was a professor of Computer and Information Science and Engineering at the University of Florida.
His research interests include distributed systems, databases, and performance modeling.
Preface
This book has developed from a set of lecture notes we used to teach a graduate two-course sequence on operating
systems. We had difficulties finding a textbook that had balanced coverage of concepts and theories and sufficient
detail for a graduate-level class. After using research papers and selected book chapters for several semesters
(which students hate), we developed lecture notes for the students and then developed the notes into this book.
The key idea behind the book is to integrate the theory and practice of distributed operating systems. Many
texts provide an overview of distributed systems with a vague description of the algorithms. It is difficult to
teach an in-depth course from these books. Other texts focus on algorithms. However, it is hard to motivate students
to study algorithms if they do not have a picture of what is needed in practice. For example, understanding the
need for naming and directory services clarifies the need for causal message passing and update propagation.
The first part of the book (Chapters 1 through 8) is intended to be used for a core graduate
course on operating systems. It provides a broad discussion of modern operating systems. Since we assume that the
student readers already have taken an undergraduate course on operating systems, we do not include many topics,
such as memory management, deadlocks, and device drivers, that are typically covered by undergraduate courses and
that are well covered by existing texts. Instead, we concentrate on issues and implementations of modern operating
systems for parallel systems, distributed systems, real-time systems, and computer networks. The programming projects
in this part concentrate on using tools, such as sockets, RPC, and threads.
The second part of the book (Chapters 9 through 13) is intended for an advanced course on distributed
operating systems, focusing on algorithms for asynchronous distributed systems. The material provides a framework
for thinking about distributed systems, precise algorithm descriptions, and an understanding of why the algorithms
were developed and why they are correct. The theme behind much of the material presented in this section is providing
the framework for implementing a replicated server-- a prototype for implementing a fault-tolerant and highly available
distributed system. The projects in this part use the tools presented in the first part to implement the algorithms.
The different focuses of the two parts of the book complement each other, with the first part supplying background
and the second part supplying theory. While the book is intended for a two-course sequence, it also works well
for a graduate operating system or an advanced distributed systems class alone. Although each of the two parts
is self-contained, extensive cross-referencing allows the reader to emphasize either theory or implementation or
to cover both elements of selected topics. With the widespread exposure of the Internet, we envision that the book
could also be of interest to undergraduate students and can be tailored to undergraduate classes. Furthermore,
the combined book makes an excellent reference for the practitioner who has some working experience on operating
systems.
The book is structured to present the important concepts of distributed operating systems. We make many references
to commercial and experimental systems to illustrate the concepts and the implementation issues. In addition, each
chapter provides an extensive annotated bibliography that points the interested reader to recent developments in
distributed computing theory, technology, and practice. In the following we give a brief description of the content
of each chapter.
Part I of the textbook is a pragmatic presentation of the basic concepts and implementation
issues of a distributed operating system. It is organized in three major components: fundamental concepts, distributed
processes, and distributed resources.
Chapter 1 introduces a classification of centralized operating system, network operating system,
distributed operating system, and cooperative autonomous systems, using the key characteristics of virtuality,
interoperability, transparency and autonomicity, respectively, for each system. It illustrates the evolution that
led to the development of modern distributed operating systems and explains the emerging need for distributed software
and the importance of distributed coordination algorithms.
Chapter 2 begins the discussion of distributed operating systems. It presents the concepts
of transparency and services. Distributed systems and their underlying communication architectures are introduced.
The chapter concludes with a list of major system design issues that establishes an order for the presentation
of the subsequent chapters.
Chapter 3 describes concurrent processes and programming. It defines processes and threads
and shows how their interaction can be modeled by using some fundamental concepts such as a graph, a logical clock,
and the client and server model. Both shared memory and message passing for synchronization and communication are
addressed. They are presented along with the development of concurrent language constructs. A taxonomy of these
language mechanisms and their implementation is given. This chapter presents an integrated view of synchronization
and communication.
Chapter 4 extends the discussion of process interaction from synchronization to communication
and to distributed process coordination using message passing communication. Three communication models, message
passing (socket), request/reply (RPC), and transaction communication, are presented. A special emphasis is placed
on group communication and coordination. Two classical distributed coordination problems, mutual exclusion and
leader election using message passing interprocess communication, are introduced. These problems are further studied
in Chapters 10 and 11 in Part II of the textbook. The chapter also includes a presentation of name service, an
essential facility for communication in distributed systems.
Chapter 5 turns to the third process management issue, that of process scheduling. The effect
of communication on both static and dynamic process scheduling is emphasized. The chapter describes distributed
computation through dynamic redistribution of processes by using remote execution and process migration techniques.
It also addresses several unique issues in real-time scheduling.
Chapter 6 discusses the distributed implementation of file systems, the first of the two important
distributed resources: files and memory. It demonstrates the use of the concept of transparency and service in
the design of distributed file systems. Two major implementation issues, data caching and file replication, are
discussed in this chapter. The chapter also covers distributed transactions as part of the file service. Since
management of replicated data touches upon both data and communication, two central issues in distributed systems,
it is further detailed in Chapter 12.
Chapter 7 covers distributed shared memory systems that simulate a logical shared memory on
a physically distributed memory system. The issues studied are coherence and consistency of data due to memory
sharing. The chapter describes implementation strategies for different memory consistency requirements. It also
shows the significance of the object-based data sharing models.
Chapter 8 addresses unique security issues in network and distributed environments. These
issues are divided into two areas: authorization and authentication. Authorization includes the study of distributed
access and flow control models. Authentication covers cryptography and its applications for mutual authentication
and key distribution protocols. Implementations of some security features in modern systems are illustrated.
Part II of the textbook discusses distributed algorithms. The discussion is pragmatic and is intended to give
the reader a solid understanding of common problems and solution techniques. The topics are organized in five chapters.
Chapter 9: introduces the concepts of time and global states in a distributed system. The
fundamental problem of distributed algorithms is a lack of a global clock and a global state. Recent research on
vector time and distributed predicates has developed unified models for thinking about distributed time and the
distributed state. This chapter presents the concepts of causality, vector timestamps, and global states. The algorithms
for implementing these concepts are presented. The connections between the different models are explored. Finally,
a model for proving the correctness of distributed algorithms is presented.
Chapter 10: covers distributed synchronization and distributed election. While the distributed
synchronization algorithms are not considered pragmatic, they illustrate important algorithm design techniques.
For example, voting algorithms for replicated data management are foreshadowed in Maekawa¹s algorithm, and
the Chang-Singhal-Liu algorithm illustrates the ideas behind distributed shared memory (and distributed object)
algorithms. The chapter concludes with algorithms for electing a computation leader. Election is a critical component
of many systems. The invitation algorithmvii in particular is a prototype for handling failures in an asynchronous
system and foreshadows the group view maintenance algorithms of Chapter 12.
Chapter 11 discusses the abstract distributed agreement problem. First, Byzantine agreement
is discussed. Next, the FischerLynchPaterson (FLP) result that no algorithm solves distributed agreement
problems in an asynchronous system is covered in detail. This is the appropriate point to introduce the FLP result,
because the next chapter covers replicated data management and must solve distributed agreement in asynchronous
systems. The FLP result leaves open three ways to achieve distributed agreement in an asynchronous system: hope
that it happens, use relative agreement, or use a randomized algorithm. The chapter discusses these implications
of the FLP result and concludes with some randomized agreement protocols.
Chapter 12 covers replicated data management. Since providing replicated servers reduces to
replicating the state of the servers, this section also discusses the problems and concepts of replication. We
cover three main approaches: the transaction approach, the reliable multicast approach, and the log propagation
approach. The transaction approach includes discussion of two-phase commit, three-phase commit, one-copy serializability,
voting, and dynamic voting protocols. The reliable multicast approach includes discussion of virtual synchrony,
algorithms for implementing reliable and causal multicast, algorithms for totally ordered multicast, and consistent
multicast group maintenance algorithms. The log propagation approach covers naive log propagation, epidemics, and
causal log propagation. This chapter is the culmination of Part II of the text and draws together the results presented
in previous chapters.
Chapter 13 covers distributed rollback and recovery. These techniques are critical for implementing
fault-tolerant systems and are complimentary to the replicated data management techniques of the previous chapter.
By using the theory developed in the previous chapters (especially Chapter 9), different rollback and recovery
algorithms are presented in a unified manner and are related to algorithms discussed previously.
The book contains sufficient materials for a two-semester operating system (or distributed system) course sequence.
To use the book for an one-semester course, we recommend the following two options for coverage with different
orientations.
Distributed Operating Systems: This option covers the entire Part I with supplemental Sections
9.1, 9.2, 10.1.2, 10.1.3, 10.1.4, 10.2, 12.1, and 12.1.3 from Part II. The course will focus on the implementation
issues of the major components in a distributed operating system. The supplemental sections in Part II extend the
discussion of distributed coordination algorithms.
Distributed Algorithms: This option uses the entire Part II with Sections 4.1-4.4, 6.1-6.3,
and 7.1-7.4 from Part I. The course will emphasize the design of distributed algorithms in distributed systems.
The selected sections in Part I serve as the background and motivations for the discussion of distributed algorithms
in Part II.
The authors wish to acknowledge the COP5615 (Operating System Principles) and the COP6635 (Distributed Operating
Systems) classes for allowing them to test preliminary versions of the book. Special thanks go to Anthony Bridgewater,
Raja Chatterjee, William Chow, Roger Collins, Darin Davis, Carlos Guerra, Steve Greenwald, Vanja Josifovski, Eric
Shaffer, and Shiby Thomas for finding errors in the examples and algorithms.
Several topics of discussion use the research work from theses and dissertations by students at the University
of Florida. Frank Anger and J. J. Hwang contributed the multiprocessor scheduling models and algorithms. The LAM
distributed shared memory system was given by Roger Denton. I-Lung Kao shared the discussion of the complex security
policies. Lionel Maugis contributed the start of the bibliography and the total ordering algorithm from his thesis.
The comments and feedback from the reviewers were enormously valuable. We wish to show our appreciation to them
for providing their expertise to improve the quality of the book. They are: Warren R. Carithers of Rochester Institute
of Technology, Prasun Dewan of University of North Carolina, Peter G. Drexel of Plymouth State College, Gary Harkin
of Montana State University, Eric H. Herrin, II of University of Kentucky, Mark A. Holliday of Western Carolina
University, Michael A. Keenan of Columbus State University, William F. Klostermeyer of West Virginia University,
Bruce Maggs of Carnegie-Mellon University, James Purtilo of University of Maryland, Gurdip Singh of Kansa State
University, and Salih Yurttas of Texas A&M University.
The book cover design was inspired by the "turtles all the way down" anecdote which was believed to
originate from WilliamJames and retold by many philosophers including Bertrand Russell and Steve Hawking. We added
the stack of "gators" to show another layered-structured autonomous system. The art work was contributed
by Sheng-Wan Hu. Special thanks go to Michael Downes and S. Y. Cheng who helped to solve many Latex technical problems.
We started writing the book when our wives complained that they did not have any book dedicated to them as many
others do. We would thank Johna and Taiying for giving us such great motivation to write the book. This book is
dedicated to them.
Summary
Distributed Operating Systems and Algorithms integrates into one text both the theory and implementation
aspects of distributed operating systems for the first time. This innovative book provides the reader with knowledge
of the important algorithms necessary for an in-depth understanding of distributed systems; at the same time it
motivates the study of these algorithms by presenting a systems framework for their practical application.
The first part of the book is intended for use in an advanced course on operating systems and concentrates on
parallel systems, distributed systems, real-time systems, and computer networks. The second part of the text is
written for a course on distributed algorithms with a focus on algorithms for asynchronous distributed systems.
While each of the two parts is self-contained, extensive cross-referencing allows the reader to emphasize either
theory or implementation or to cover both elements of selected topics.
Features
Integrates and balances coverage of the advanced aspects of operating systems with the distributed algorithms
used by these systems.
Includes extensive references to commercial and experimental systems to illustrate the concepts and implementation
issues.
Provides precise algorithm description and explanation of why these algorithms were developed.
Structures the coverage of algorithms around the creation of a framework for implementing a replicated serverÐa
prototype for implementing a fault-tolerant and highly available distributed system.
Contains programming projects on such topics as sockets, RPC, threads, and implementation of distributed algorithms
using these tools.
Includes an extensive annotated bibliography for each chapter, pointing the reader to recent developments.
Solutions to selected exercises, templates to programming problems, a simulator for algorithms for distributed
synchronization, and teaching tips for selected topics are available to qualified instructors from Addison Wesley.
Table of Contents
UNIT 1. OPERATING SYSTEM FUNDAMENTALS
Evolution of Modern Operating Systems
Centralized Operating System Overview
Network Operating Systems
Distributed Operating Systems
Cooperative Autonomous Systems
Distributed Algorithms
Summary and Bibliography
UNIT 2. SYSTEMS: CONCEPTS AND ARCHITECTURE'S
Goals
Transparency
Services
Architecture Models
Network Communication Protocols Major Design Issues
Distributed Computing Environment (DCE)
Summary and Bibliography
Exercises
UNIT 3. CONCURRENT PROCESSES AND PROGRAMMING
Processes and Threads
Graph Models for Process Representation
The Client/Server Model Time Services
Language Mechanisms for Synchronization
Object Model Resource Servers Concurrent Programming Languages
Distributed and Network Programming Languages
Summary and Bibliography
Exercises
UNIT 4. INTERPROCESS COMMUNICATION AND COORDINATION
Message Passing Communication
Request/Reply Communication
Transaction Communication
Name and Directory Services
Distributed Mutual Exclusion
Leader Election
Summary and Bibliography
Exercises
UNIT 5. DISTRIBUTED PROCESS SCHEDULING
A System Performance Model
Static Process Scheduling with Communication
Dynamic Load Sharing and Balancing
Distributed process Implementation
Real-time Scheduling
Summary and Bibliography
Exercises
UNIT 6. DISTRIBUTED FILE SYSTEMS
Transparencies and Characteristics of DFS
DFS Design and Implementation
Transaction Service and Concurrency Control
Data and File Replication
Summary and Bibliography
Exercises
UNIT 7. DISTRIBUTED SHARED MEMORY
Non-Uniform Memory Access Architecture's
Memory Consistency Models
Multiprocessor Cache Systems
Distributed Shared Memory
Implementation of DSM systems
Summary and Bibliography
Exercises
UNIT 8. DISTRIBUTED COMPUTER SECURITY
Fundamentals of Computer Security
Discretionary Access Control Models
Mandatory Flow Control Models
Cryptography
Distributed Authentication and Key Distribution Issues Relevant to Distributed Security
Summary and Bibliography
Exercises
UNIT 9. DISTRIBUTED ALGORITHMS
MODELS OF DISTRIBUTED COMPUTATION
Preliminaries
Causality
Distributed Snapshots
Modeling a Distributed Computation Failures in a Distributed System
Summary and Bibliography
Exercises
UNIT 10. SYNCHRONIZATION AND ELECTION
Distributed Mutual Exclusion
Election
Summary and Bibliography
Exercises
UNIT 11. DISTRIBUTED AGREEMENT
Adversaries
Byzantine Agreement
Impossibility of Consensus
Randomized Distributed Agreement