Interconnection Network Reliability Evaluation
Multistage Layouts
Part of the Performability Engineering series
This book presents novel and efficient tools, techniques and approaches for reliability evaluation, reliability analysis, and design of reliable communication networks using graph theoretic concepts.
In recent years, human beings have become largely dependent on communication networks, such as computer communication networks, telecommunication networks, mobile switching networks etc., for their day-to-day activities. In today's world, humans and critical machines depend on these communication networks to work properly. Failure of these communication networks can result in situations where people may find themselves isolated, helpless and exposed to hazards. It is a fact that every component or system can fail and its failure probability increases with size and complexity.
The main objective of this book is to devise approaches for reliability modeling and evaluation of such complex networks. Such evaluation helps to identify which network designs offer better reliability. New designs of fault-tolerant interconnection network layouts are proposed, which are capable of providing high reliability through path redundancy and fault tolerance through reduction of common elements in paths. This book covers the reliability evaluation of various network topologies considering multiple reliability performance parameters (two terminal reliability, broadcast reliability, all terminal reliability, and multiple sources to multiple destinations reliability).
Quantitative Assessments of Distributed Systems
Methodologies and Techniques
Part of the Performability Engineering series
Distributed systems employed in critical infrastructures must fulfill dependability, timeliness, and performance specifications. Since these systems most often operate in an unpredictable environment, their design and maintenance require quantitative evaluation of deterministic and probabilistic timed models. This need gave birth to an abundant literature devoted to formal modeling languages combined with analytical and simulative solution techniques.
The aim of the book is to provide an overview of techniques and methodologies dealing with such specific issues in the context of distributed systems and covering aspects such as performance evaluation, reliability/availability, energy efficiency, scalability, and sustainability. Specifically, techniques for checking and verifying if and how a distributed system satisfies the requirements, as well as how to properly evaluate non-functional aspects, or how to optimize the overall behavior of the system, are all discussed in the book. The scope has been selected to provide thorough coverage of issues, models, and techniques relating to validation, evaluation and optimization of distributed systems. The key objective of this book is to help bridge the gaps between modeling theory and practice in distributed systems through specific examples.
Network Reliability
Measures and Evaluation
by Sanjay Kumar Chaturvedi
Part of the Performability Engineering series
In engineering theory and applications, we think and operate in terms of logics and models with some acceptable and reasonable assumptions. The present text is aimed at providing modelling and analysis techniques for the evaluation of reliability measures (2-terminal, all-terminal, k-terminal reliability) for systems whose structure can be described in the form of a probabilistic graph. Among the several approaches to network reliability evaluation, the multiple-variable-inversion sum-of-disjoint product approach finds a well-deserved niche as it provides the reliability or unreliability expression in a most efficient and compact manner. However, it does require efficiently enumerated minimal inputs (minimal paths, spanning trees, minimal k-trees, minimal cuts, minimal global-cuts, minimal k-cuts), depending on the desired reliability measure. The present book covers these two aspects in detail, describing several algorithms devised by the "reliability fraternity" and explaining them through solved examples that obtain and evaluate 2-terminal, k-terminal and all-terminal network reliability/unreliability measures; this coverage could be regarded as its unique selling point. The accompanying web-based supplementary information containing modifiable Matlab® source code for the algorithms is another feature of this book.
A concerted effort has been made to keep the book suitable for a first course, or even for a novice stepping into the area of network reliability. The mathematical treatment is kept as minimal as possible, with the assumption that readers have basic knowledge of graph theory, probability laws, Boolean laws and set theory.
Machine Tool Reliability
Part of the Performability Engineering series
This book explores the domain of reliability engineering in the context of machine tools. Failures of machine tools not only jeopardize users' ability to meet their due date commitments but also lead to poor product quality, slower production, downtime losses, etc.
Poor reliability and improper maintenance of a machine tool greatly increase the life cycle cost to the user. Thus, the application area of the present book, i.e. machine tools, will be equally appealing to machine tool designers, production engineers and maintenance managers. The book will serve as a consolidated volume on various dimensions of machine tool reliability and its implications from the manufacturers' and users' points of view.
From the manufacturers' point of view, it discusses various approaches for reliability- and maintenance-based design of machine tools. Specifically, it discusses simultaneous selection of optimal reliability configuration and maintenance schedules, maintenance optimization under various maintenance scenarios and cost-based FMEA.
From the users' point of view, it explores the role of machine tool reliability in shop floor level decision-making. Specifically, it shows how to model the interactions of machine tool reliability with production scheduling, maintenance scheduling and process quality control.
Probabilistic Physics of Failure Approach to Reliability
Modeling, Accelerated Testing, Prognosis and Reliability Assessment
Part of the Performability Engineering series
The book presents highly technical approaches to probabilistic physics of failure analysis and its applications to accelerated life and degradation testing, reliability prediction, and reliability assessment. Besides reviewing a select set of important failure mechanisms, the book covers basic and advanced methods of performing accelerated life tests and accelerated degradation tests and analyzing the test data. The book includes a large number of examples to help readers understand the complicated methods described. Finally, MATLAB, R and OpenBUGS computer scripts are provided and discussed to support the complex computational probabilistic analyses introduced.
Artificial Neural Network Applications for Software Reliability Prediction
Part of the Performability Engineering series
This book provides a starting point for software professionals to apply artificial neural networks for software reliability prediction without needing an analyst's capability or expertise in various ANN architectures and their optimization.
Artificial neural networks (ANNs) have been proven to be universal approximators, able to approximate any non-linear continuous function with arbitrary accuracy. This book presents how to apply ANNs to measure various software reliability indicators: number of failures in a given time, time between successive failures, fault-prone modules and development efforts. The application of artificial neural networks, a machine learning technique, to software reliability prediction during the testing phase as well as the early phases of the software development process is presented. Applications of artificial neural networks for the above purposes are discussed with experimental results, so that practitioners can easily use ANN models for predicting software reliability indicators.
Fundamentals of Reliability Engineering
Applications in Multistage Interconnection Networks
Part of the Performability Engineering series
Provides fundamentals of reliability engineering and illustrates practical applications in the area of parallel/distributed systems (Multistage Interconnection Networks).
The first part of the book (chapters 1–5) introduces the concept of reliability engineering, elements of probability theory, probability distributions, availability, and data analysis. The second part of the book (chapters 6–11) provides an overview of parallel/distributed computing, network design considerations, classification of multistage interconnection networks, network reliability evaluation methods, and reliability analysis of multistage interconnection networks, including reliability prediction of distributed systems using the Monte Carlo method.
Fundamentals of Reliability Engineering meets the increasing demand for knowledge tools that practicing reliability professionals can use to optimize their reliability decisions. Reliability prediction is important as it determines the usability and efficiency of the network to provide services. The reliability evaluation methods discussed in this book can also be applied to analyze the reliability of other systems. As an example, the book presents a reliability analysis of distributed systems that consist of layers of switching elements connected together in a predefined topology, providing connectivity between the set of processors and the set of memory modules.
Repairable Systems Reliability Analysis
A Comprehensive Framework
Part of the Performability Engineering series
This book provides an application-oriented framework for reliability modeling and analysis of repairable systems in conjunction with the procurement process of weapon systems and throughput analysis for industries.
Most of the reliability literature is directed towards non-repairable systems, that is, systems that are discarded or replaced when they fail. This book is mainly dedicated to the reliability modeling and analysis of repairable systems that undergo failure-repair cycles.
This book provides a comprehensive framework for the modeling and analysis of repairable systems, considering both non-parametric and parametric approaches to deal with their failure data. The book presents the MCF-based non-parametric approach with several illustrative examples, and the generalized renewal process (GRP) based arithmetic reduction of age (ARA) models along with their applications to system failure data from the aviation industry. A complete chapter is devoted to an integrated framework for the procurement process that utilizes multi-criteria decision-making (MCDM) techniques, which will be of great assistance to readers in enhancing the potential of their respective organizations. This book also presents FMEA methods tailored for GRP-based repairs.
This text has primarily emerged from the industrial experience and research work of the authors. A number of illustrations have been included to make the subject lucid and vivid even to the readers who are relatively new to this area. Besides, various examples have been provided to display the applicability of presented models and methodologies to assist the readers in applying the concepts presented in this book.
Building Dependable Distributed Systems
Part of the Performability Engineering series
A one-volume guide to the most essential techniques for designing and building dependable distributed systems
Instead of covering a broad range of research works for each dependability strategy, this useful reference focuses on only a selected few (usually the most seminal works, the most practical approaches, or the first publication of each approach), explaining each in depth, usually with a comprehensive set of examples. Each technique is dissected thoroughly enough so that readers who are not familiar with dependable distributed computing can actually grasp the technique after studying the book.
Building Dependable Distributed Systems consists of eight chapters. The first introduces the basic concepts and terminology of dependable distributed computing, and also provides an overview of the primary means of achieving dependability. Checkpointing and logging mechanisms, which are the most commonly used means of achieving a limited degree of fault tolerance, are described in the second chapter. Works on recovery-oriented computing, focusing on the practical techniques that reduce the fault detection and recovery times for Internet-based applications, are covered in chapter three. Chapter four outlines the replication techniques for data and service fault tolerance. This chapter also pays particular attention to optimistic replication and the CAP theorem. Chapter five explains a few seminal works on group communication systems. Chapter six introduces the distributed consensus problem and covers a number of Paxos family algorithms in depth. The Byzantine generals problem and its latest solutions, including the seminal Practical Byzantine Fault Tolerance (PBFT) algorithm and a number of its derivatives, are introduced in chapter seven. The final chapter details the latest research results surrounding application-aware Byzantine fault tolerance, which represents an important step forward in the practical use of Byzantine fault tolerance techniques.
Binary Decision Diagrams and Extensions for System Reliability Analysis
Part of the Performability Engineering series
Recent advances in science and technology have made modern computing and engineering systems more powerful and sophisticated than ever. The increasing complexity and scale imply that system reliability problems not only continue to be a challenge but also require more efficient models and solutions. This is the first book systematically covering the state-of-the-art binary decision diagrams and their extended models, which can provide efficient and exact solutions to reliability analysis of large and complex systems. The book provides both basic concepts and detailed algorithms for modelling and evaluating the reliability of a wide range of complex systems, such as multi-state systems, phased-mission systems, fault-tolerant systems with imperfect fault coverage, systems with common-cause failures, systems with disjoint failures, and systems with functionally dependent failures. These types of systems abound in safety-critical or mission-critical applications such as aerospace, circuits, power systems, medical systems, telecommunication systems, transmission systems, traffic light systems, data storage systems, etc.
The book provides both small-scale illustrative examples and large-scale benchmark examples to demonstrate the broad applications and advantages of different decision-diagram-based methods for complex system reliability analysis. Other measures, including component importance and failure frequency, are also covered. A rich set of references is cited in the book, providing helpful resources for readers to pursue further research and study of the topics. The target audience of the book is reliability and safety engineers and researchers.
The book can serve as a textbook on system reliability analysis. It can also serve as a tutorial and reference book on decision diagrams, multi-state systems, phased-mission systems, and imperfect fault coverage models.