National Institute of Technology Rourkela

राष्ट्रीय प्रौद्योगिकी संस्थान राउरकेला

ଜାତୀୟ ପ୍ରଯୁକ୍ତି ପ୍ରତିଷ୍ଠାନ ରାଉରକେଲା

An Institute of National Importance

Syllabus

Course Details

Subject {L-T-P / C} : CS6118 : Fault Tolerant Computing {3-0-0 / 3}

Subject Nature : Theory

Coordinator : Prof. Pabitra Mohan Khilar

Syllabus

Module 1: Introduction to Fault Tolerant Computing. Basic concepts and overview of the course, Faults and their manifestations, Fault/error modeling, Reliability, availability and maintainability analysis, System evaluation, Performance-reliability trade-offs, System level fault diagnosis, Hardware and software redundancy techniques.

Module 2: Fault tolerant system design methods, Mobile computing and Mobile communication environment, Fault injection methods.

Module 3: Software fault tolerance, Design and test of defect free integrated circuits, Fault modeling, Built-in self-test, Data compression, Error correcting codes, Simulation software/hardware, Fault tolerant system design, CAD tools for design for testability. Information Redundancy and Error Correcting Codes, Software Problem. Software Reliability Models and Robust Coding Techniques, Reliability in Computer Networks, Time redundancy. Re-execution in SMT, CMP Architectures.

Module 4: Fault Tolerant Distributed Systems, Data replication, Fault Isolation, Fault Recovery, Metrics for evaluation of fault tolerance algorithms, Case Studies in FTC: ROC, HP NonStop Server. Case studies of fault tolerant systems and current research issues.

Course Objectives

  • To understand the fault tolerant design principles
  • To identify the requirements of fault tolerant systems
  • To understand fault tolerant distributed systems and their requirements
  • To design algorithms for fault tolerant systems

Course Outcomes

  • To design and implement fault tolerant systems for different applications
  • To identify the policies and mechanisms to achieve fault tolerance

Essential Reading

  • D. K. Pradhan, Fault Tolerant Computer System Design, Prentice Hall, 1996.
  • I. Koren, Fault Tolerant Systems, Morgan Kaufmann, 2007.

Supplementary Reading

  • L. L. Pullum, Software Fault Tolerance Techniques and Implementation, Artech House Computer Security Series, 2001.
  • M. L. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design, Wiley, 2002.

Journal and Conferences

  • P. M. Khilar and S. Mahapatra, “Time-Constrained Fault Tolerant X-by-wire Systems,” International Journal of Computer and Applications, Vol. 31, No. 4, Oct-Dec 2009, pp. 231-238.
  • A. Mahapatra and P. M. Khilar, “Fault Diagnosis in Wireless Sensor Networks: A Survey,” IEEE Communications Surveys and Tutorials, Issue 99, April 2013, pp. 1-27.