Summary: | The Czech National Grid Infrastructure went through a complex transition in the last year. The production environment was switched from the commercial batch system PBSPro to an open source alternative, the Torque batch system. This paper concentrates on two aspects of this transition. First, we present our practical experience with Torque as a production-ready batch system. Our modified version of Torque, with all the necessary PBSPro-exclusive features re-implemented and further extended with new features such as cloud-like behaviour, has been deployed across the entire production environment, which covers the whole Czech Republic, for almost a full year. In the second part, we present our work on meta-scheduling. This involves our work on distributed architecture and cloud-grid convergence. The distributed architecture was designed to overcome the limitations of the central server setup that was used originally and that suffered from stability and performance issues. While this paper does not discuss the inclusion of cloud interfaces into grids, it does present the dynamic infrastructure, which is a requirement for sharing the grid infrastructure between a batch system and a cloud gateway. We also invite everyone to try out our fork of the Torque batch system, which is now publicly available.
|