HAVmS: Highly Available Virtual Machine Computer System Fault Tolerant with Automatic Failback and Close to Zero Downtime

Memmo Federici, Carlo Gaibisso, Bruno L. Martino

Abstract


In scientic computing, systems often manage computations that require continuous acquisition of of satellite data and the management of large databases, as well as the execution of analysis software and simulation models (e.g. Monte Carlo or molecular dynamics cell simulations) which may require several weeks of continuous run. These systems, consequently, should ensure the continuity of operation even in case of serious faults. HAVmS (High Availability Virtual machine System) is a highly available, "fault tolerant" system with zero downtime in case of fault. It is based on the use of Virtual Machines and implemented by two servers with similar characteristics. HAVmS, thanks to the developed software solutions, is unique in its kind since it automatically failbacks once faults have been fixed. The system has been designed to be used both with professional or inexpensive hardware and supports the simultaneous execution of multiple services such as: web, mail, computing and administrative services, uninterrupted computing, data base management. Finally the system is cost effective adopting exclusively open source solutions, is easily manageable and for general use.

Refbacks

  • There are currently no refbacks.


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN 2336-5382 (Online)
Published by the Czech Technical University in Prague