Welcome to STUDYtactics.com    
  BOOKS eCONTENT SPECIALTY STORES MY STUDYaides MY ACCOUNT  
New & Used Books
 
Product Detail
Product Information   |  Other Product Information

Product Information
Distributed Operating Systems and Algorithms
Distributed Operating Systems and Algorithms
Author: Chow, Randy / Johnson, Theodore
Edition/Copyright: 1997
ISBN: 0-201-49838-3
Publisher: Addison-Wesley Longman, Inc.
Type: Paperback
Used Print:  $119.75
Other Product Information
Author Bio
Preface
Summary
Table of Contents
 
  Author Bio

Chow, Randy : University of Florida

Randy Chow is a professor of Computer and Information Science and Engineering at the University of Florida. His research interests include computer networks, distributed systems, computer security, and system performance evaluation.


Johnson, Theodore : AT&T Labs-Research

Theodore Johnson is a member of the technical staff at the Database Research department of AT&T Labs-Research. Previously, he was a professor of Computer and Information Science and Engineering at the University of Florida. His research interests include distributed systems, databases, and performance modeling.

 
  Preface

This book has developed from a set of lecture notes we used to teach a graduate two-course sequence on operating systems. We had difficulties finding a textbook that had balanced coverage of concepts and theories and sufficient detail for a graduate-level class. After using research papers and selected book chapters for several semesters (which students hate), we developed lecture notes for the students and then developed the notes into this book.

The key idea behind the book is to integrate the theory and practice of distributed operating systems. Many texts provide an overview of distributed systems with a vague description of the algorithms. It is difficult to teach an in-depth course from these books. Other texts focus on algorithms. However, it is hard to motivate students to study algorithms if they do not have a picture of what is needed in practice. For example, understanding the need for naming and directory services clarifies the need for causal message passing and update propagation.

The first part of the book (Chapters 1 through 8) is intended to be used for a core graduate course on operating systems. It provides a broad discussion of modern operating systems. Since we assume that the student readers already have taken an undergraduate course on operating systems, we do not include many topics, such as memory management, deadlocks, and device drivers, that are typically covered by undergraduate courses and that are well covered by existing texts. Instead, we concentrate on issues and implementations of modern operating systems for parallel systems, distributed systems, real-time systems, and computer networks. The programming projects in this part concentrate on using tools, such as sockets, RPC, and threads.

The second part of the book (Chapters 9 through 13) is intended for an advanced course on distributed operating systems, focusing on algorithms for asynchronous distributed systems. The material provides a framework for thinking about distributed systems, precise algorithm descriptions, and an understanding of why the algorithms were developed and why they are correct. The theme behind much of the material presented in this section is providing the framework for implementing a replicated server-- a prototype for implementing a fault-tolerant and highly available distributed system. The projects in this part use the tools presented in the first part to implement the algorithms.

The different focuses of the two parts of the book complement each other, with the first part supplying background and the second part supplying theory. While the book is intended for a two-course sequence, it also works well for a graduate operating system or an advanced distributed systems class alone. Although each of the two parts is self-contained, extensive cross-referencing allows the reader to emphasize either theory or implementation or to cover both elements of selected topics. With the widespread exposure of the Internet, we envision that the book could also be of interest to undergraduate students and can be tailored to undergraduate classes. Furthermore, the combined book makes an excellent reference for the practitioner who has some working experience on operating systems.

The book is structured to present the important concepts of distributed operating systems. We make many references to commercial and experimental systems to illustrate the concepts and the implementation issues. In addition, each chapter provides an extensive annotated bibliography that points the interested reader to recent developments in distributed computing theory, technology, and practice. In the following we give a brief description of the content of each chapter.

Part I of the textbook is a pragmatic presentation of the basic concepts and implementation issues of a distributed operating system. It is organized in three major components: fundamental concepts, distributed processes, and distributed resources.

  • Chapter 1 introduces a classification of centralized operating system, network operating system, distributed operating system, and cooperative autonomous systems, using the key characteristics of virtuality, interoperability, transparency and autonomicity, respectively, for each system. It illustrates the evolution that led to the development of modern distributed operating systems and explains the emerging need for distributed software and the importance of distributed coordination algorithms.
  • Chapter 2 begins the discussion of distributed operating systems. It presents the concepts of transparency and services. Distributed systems and their underlying communication architectures are introduced. The chapter concludes with a list of major system design issues that establishes an order for the presentation of the subsequent chapters.
  • Chapter 3 describes concurrent processes and programming. It defines processes and threads and shows how their interaction can be modeled by using some fundamental concepts such as a graph, a logical clock, and the client and server model. Both shared memory and message passing for synchronization and communication are addressed. They are presented along with the development of concurrent language constructs. A taxonomy of these language mechanisms and their implementation is given. This chapter presents an integrated view of synchronization and communication.
  • Chapter 4 extends the discussion of process interaction from synchronization to communication and to distributed process coordination using message passing communication. Three communication models, message passing (socket), request/reply (RPC), and transaction communication, are presented. A special emphasis is placed on group communication and coordination. Two classical distributed coordination problems, mutual exclusion and leader election using message passing interprocess communication, are introduced. These problems are further studied in Chapters 10 and 11 in Part II of the textbook. The chapter also includes a presentation of name service, an essential facility for communication in distributed systems.
  • Chapter 5 turns to the third process management issue, that of process scheduling. The effect of communication on both static and dynamic process scheduling is emphasized. The chapter describes distributed computation through dynamic redistribution of processes by using remote execution and process migration techniques. It also addresses several unique issues in real-time scheduling.

  • Chapter 6 discusses the distributed implementation of file systems, the first of the two important distributed resources: files and memory. It demonstrates the use of the concept of transparency and service in the design of distributed file systems. Two major implementation issues, data caching and file replication, are discussed in this chapter. The chapter also covers distributed transactions as part of the file service. Since management of replicated data touches upon both data and communication, two central issues in distributed systems, it is further detailed in Chapter 12.
  • Chapter 7 covers distributed shared memory systems that simulate a logical shared memory on a physically distributed memory system. The issues studied are coherence and consistency of data due to memory sharing. The chapter describes implementation strategies for different memory consistency requirements. It also shows the significance of the object-based data sharing models.
  • Chapter 8 addresses unique security issues in network and distributed environments. These issues are divided into two areas: authorization and authentication. Authorization includes the study of distributed access and flow control models. Authentication covers cryptography and its applications for mutual authentication and key distribution protocols. Implementations of some security features in modern systems are illustrated.

    Part II of the textbook discusses distributed algorithms. The discussion is pragmatic and is intended to give the reader a solid understanding of common problems and solution techniques. The topics are organized in five chapters.

  • Chapter 9: introduces the concepts of time and global states in a distributed system. The fundamental problem of distributed algorithms is a lack of a global clock and a global state. Recent research on vector time and distributed predicates has developed unified models for thinking about distributed time and the distributed state. This chapter presents the concepts of causality, vector timestamps, and global states. The algorithms for implementing these concepts are presented. The connections between the different models are explored. Finally, a model for proving the correctness of distributed algorithms is presented.
  • Chapter 10: covers distributed synchronization and distributed election. While the distributed synchronization algorithms are not considered pragmatic, they illustrate important algorithm design techniques. For example, voting algorithms for replicated data management are foreshadowed in Maekawa¹s algorithm, and the Chang-Singhal-Liu algorithm illustrates the ideas behind distributed shared memory (and distributed object) algorithms. The chapter concludes with algorithms for electing a computation leader. Election is a critical component of many systems. The invitation algorithmvii in particular is a prototype for handling failures in an asynchronous system and foreshadows the group view maintenance algorithms of Chapter 12.
  • Chapter 11 discusses the abstract distributed agreement problem. First, Byzantine agreement is discussed. Next, the Fischer­Lynch­Paterson (FLP) result that no algorithm solves distributed agreement problems in an asynchronous system is covered in detail. This is the appropriate point to introduce the FLP result, because the next chapter covers replicated data management and must solve distributed agreement in asynchronous systems. The FLP result leaves open three ways to achieve distributed agreement in an asynchronous system: hope that it happens, use relative agreement, or use a randomized algorithm. The chapter discusses these implications of the FLP result and concludes with some randomized agreement protocols.
  • Chapter 12 covers replicated data management. Since providing replicated servers reduces to replicating the state of the servers, this section also discusses the problems and concepts of replication. We cover three main approaches: the transaction approach, the reliable multicast approach, and the log propagation approach. The transaction approach includes discussion of two-phase commit, three-phase commit, one-copy serializability, voting, and dynamic voting protocols. The reliable multicast approach includes discussion of virtual synchrony, algorithms for implementing reliable and causal multicast, algorithms for totally ordered multicast, and consistent multicast group maintenance algorithms. The log propagation approach covers naive log propagation, epidemics, and causal log propagation. This chapter is the culmination of Part II of the text and draws together the results presented in previous chapters.
  • Chapter 13 covers distributed rollback and recovery. These techniques are critical for implementing fault-tolerant systems and are complimentary to the replicated data management techniques of the previous chapter. By using the theory developed in the previous chapters (especially Chapter 9), different rollback and recovery algorithms are presented in a unified manner and are related to algorithms discussed previously.

The book contains sufficient materials for a two-semester operating system (or distributed system) course sequence. To use the book for an one-semester course, we recommend the following two options for coverage with different orientations.

  • Distributed Operating Systems: This option covers the entire Part I with supplemental Sections 9.1, 9.2, 10.1.2, 10.1.3, 10.1.4, 10.2, 12.1, and 12.1.3 from Part II. The course will focus on the implementation issues of the major components in a distributed operating system. The supplemental sections in Part II extend the discussion of distributed coordination algorithms.
  • Distributed Algorithms: This option uses the entire Part II with Sections 4.1-4.4, 6.1-6.3, and 7.1-7.4 from Part I. The course will emphasize the design of distributed algorithms in distributed systems. The selected sections in Part I serve as the background and motivations for the discussion of distributed algorithms in Part II.

The authors wish to acknowledge the COP5615 (Operating System Principles) and the COP6635 (Distributed Operating Systems) classes for allowing them to test preliminary versions of the book. Special thanks go to Anthony Bridgewater, Raja Chatterjee, William Chow, Roger Collins, Darin Davis, Carlos Guerra, Steve Greenwald, Vanja Josifovski, Eric Shaffer, and Shiby Thomas for finding errors in the examples and algorithms.

Several topics of discussion use the research work from theses and dissertations by students at the University of Florida. Frank Anger and J. J. Hwang contributed the multiprocessor scheduling models and algorithms. The LAM distributed shared memory system was given by Roger Denton. I-Lung Kao shared the discussion of the complex security policies. Lionel Maugis contributed the start of the bibliography and the total ordering algorithm from his thesis.

The comments and feedback from the reviewers were enormously valuable. We wish to show our appreciation to them for providing their expertise to improve the quality of the book. They are: Warren R. Carithers of Rochester Institute of Technology, Prasun Dewan of University of North Carolina, Peter G. Drexel of Plymouth State College, Gary Harkin of Montana State University, Eric H. Herrin, II of University of Kentucky, Mark A. Holliday of Western Carolina University, Michael A. Keenan of Columbus State University, William F. Klostermeyer of West Virginia University, Bruce Maggs of Carnegie-Mellon University, James Purtilo of University of Maryland, Gurdip Singh of Kansa State University, and Salih Yurttas of Texas A&M University.

The book cover design was inspired by the "turtles all the way down" anecdote which was believed to originate from WilliamJames and retold by many philosophers including Bertrand Russell and Steve Hawking. We added the stack of "gators" to show another layered-structured autonomous system. The art work was contributed by Sheng-Wan Hu. Special thanks go to Michael Downes and S. Y. Cheng who helped to solve many Latex technical problems.

We started writing the book when our wives complained that they did not have any book dedicated to them as many others do. We would thank Johna and Taiying for giving us such great motivation to write the book. This book is dedicated to them.

 
  Summary

Distributed Operating Systems and Algorithms integrates into one text both the theory and implementation aspects of distributed operating systems for the first time. This innovative book provides the reader with knowledge of the important algorithms necessary for an in-depth understanding of distributed systems; at the same time it motivates the study of these algorithms by presenting a systems framework for their practical application.

The first part of the book is intended for use in an advanced course on operating systems and concentrates on parallel systems, distributed systems, real-time systems, and computer networks. The second part of the text is written for a course on distributed algorithms with a focus on algorithms for asynchronous distributed systems. While each of the two parts is self-contained, extensive cross-referencing allows the reader to emphasize either theory or implementation or to cover both elements of selected topics.

Features

  • Integrates and balances coverage of the advanced aspects of operating systems with the distributed algorithms used by these systems.
  • Includes extensive references to commercial and experimental systems to illustrate the concepts and implementation issues.
  • Provides precise algorithm description and explanation of why these algorithms were developed.
  • Structures the coverage of algorithms around the creation of a framework for implementing a replicated serverÐa prototype for implementing a fault-tolerant and highly available distributed system.
  • Contains programming projects on such topics as sockets, RPC, threads, and implementation of distributed algorithms using these tools.
  • Includes an extensive annotated bibliography for each chapter, pointing the reader to recent developments.
  • Solutions to selected exercises, templates to programming problems, a simulator for algorithms for distributed synchronization, and teaching tips for selected topics are available to qualified instructors from Addison Wesley.

 
  Table of Contents

UNIT 1. OPERATING SYSTEM FUNDAMENTALS

Evolution of Modern Operating Systems
Centralized Operating System Overview
Network Operating Systems
Distributed Operating Systems
Cooperative Autonomous Systems
Distributed Algorithms

Summary and Bibliography

UNIT 2. SYSTEMS: CONCEPTS AND ARCHITECTURE'S

Goals
Transparency
Services
Architecture Models
Network Communication Protocols Major Design Issues
Distributed Computing Environment (DCE)

Summary and Bibliography
Exercises

UNIT 3. CONCURRENT PROCESSES AND PROGRAMMING

Processes and Threads
Graph Models for Process Representation
The Client/Server Model Time Services
Language Mechanisms for Synchronization
Object Model Resource Servers Concurrent Programming Languages
Distributed and Network Programming Languages

Summary and Bibliography
Exercises

UNIT 4. INTERPROCESS COMMUNICATION AND COORDINATION

Message Passing Communication
Request/Reply Communication
Transaction Communication
Name and Directory Services
Distributed Mutual Exclusion
Leader Election

Summary and Bibliography
Exercises

UNIT 5. DISTRIBUTED PROCESS SCHEDULING

A System Performance Model
Static Process Scheduling with Communication
Dynamic Load Sharing and Balancing
Distributed process Implementation
Real-time Scheduling

Summary and Bibliography
Exercises

UNIT 6. DISTRIBUTED FILE SYSTEMS

Transparencies and Characteristics of DFS
DFS Design and Implementation
Transaction Service and Concurrency Control
Data and File Replication

Summary and Bibliography
Exercises

UNIT 7. DISTRIBUTED SHARED MEMORY

Non-Uniform Memory Access Architecture's
Memory Consistency Models
Multiprocessor Cache Systems
Distributed Shared Memory
Implementation of DSM systems

Summary and Bibliography
Exercises

UNIT 8. DISTRIBUTED COMPUTER SECURITY

Fundamentals of Computer Security
Discretionary Access Control Models
Mandatory Flow Control Models
Cryptography
Distributed Authentication and Key Distribution Issues Relevant to Distributed Security

Summary and Bibliography
Exercises

UNIT 9. DISTRIBUTED ALGORITHMS

MODELS OF DISTRIBUTED COMPUTATION

Preliminaries
Causality
Distributed Snapshots
Modeling a Distributed Computation Failures in a Distributed System

Summary and Bibliography
Exercises

UNIT 10. SYNCHRONIZATION AND ELECTION

Distributed Mutual Exclusion
Election

Summary and Bibliography
Exercises

UNIT 11. DISTRIBUTED AGREEMENT

Adversaries
Byzantine Agreement
Impossibility of Consensus
Randomized Distributed Agreement

Summary and Bibliography
Exercise

UNIT 12. REPLICATED DATA MANAGEMENT

Database Techniques
Atomic Multicast
Update Propagation

Summary and Bibliography
Exercises

UNIT 13. CHECKPOINTING AND RECOVERY

Problems in Rollback
Incarnation Numbers
Taxonomy of Solution Techniques Uncoordinated Checkpointing
Coordinated Checkpointing
Synchronous Logging Asynchronous Logging
Adaptive Logging

Summary and Bibliography
Exercises

 

New & Used Books -  eContent -  Specialty Stores -  My STUDYaides -  My Account

Terms of Service & Privacy PolicyContact UsHelp © 1995-2024 STUDYtactics, All Rights Reserved