Healing and Self-Repair in Large Scale Distributed Computing Systems
Summary
The project will focus on developing fault-tolerance mechanisms that allow distributed systems to keep operating under varying conditions.
Supervisor(s)
Research Location
Program Type
Masters/PhD
Synopsis
As the complexity of distributed systems increases over time, such systems will need capabilities that let them keep operating in disaster scenarios. What makes this problem so complex is the heterogeneous nature of today’s distributed computing environments, which can be made up of hundreds or thousands of components (computers, databases, etc.). In addition, a user in one location might not have control over other parts of the system. There is therefore a need for “smart” algorithms (protocols) that can achieve an acceptable level of fault tolerance and account for a variety of disaster recovery scenarios.
Want to find out more?
Contact us to find out what’s involved in applying for a PhD (domestic students and international students).
Contact the Research Expert to find out more about participating in this opportunity.
Browse for other opportunities within Computer Science.
Keywords
parallel systems, distributed systems, internet-scale computing systems, distributed computing, complex systems, optimization, ICT
Opportunity ID
The opportunity ID for this research opportunity is: 978
Other opportunities with Professor Albert Y. Zomaya
- Parallel Stochastic Optimization Algorithms
- Self-Assembly and Self-Organization in Complex Systems
- Cellular Automata Based Cryptography
- The Mapping of Optimization Algorithms on Different Families of Computer Architectures
- Scheduling and Load Balancing in Large Scale Distributed Computing Environments
- Quality of Service in Distributed Computing Systems
- Application Isolation Techniques in Cloud Computing Platforms
- Accountability in Distributed Systems for Bioinformatics Data Management
- Application-Specific Service Level Agreement and Energy-Efficiency Improvement in Cloud Computing Platforms
- Autonomic Communications in Parallel and Distributed Computing Systems
- Detection of Anomalous Variations in Dynamic Networks
- The Choice of Appropriate Difference Measures
- Distributed Coalition Planning and Decision Making
- Federating Autonomous Sensor Networks
- Self-Assembly and Self-Organization in Complex Distributed Systems
- MicroRNAs as Regulators of Cellular Programs
- Resilience and distributed systems for a healthy society
- Biological metaphors and resilience
- Complex Networks and Performance