In Cloud computing, both the public and private sectors already offer Cloud resources as IaaS (Infrastructure as a Service). However, there are numerous areas of interest to scientific communities where Cloud computing uptake is still lacking, especially at the PaaS (Platform as a Service) and SaaS (Software as a Service) levels.
In this context, INDIGO-DataCloud (INtegrating Distributed data Infrastructures for Global ExplOitation), a project funded under the Horizon 2020 framework programme of the European Union and led by the National Institute for Nuclear Physics (INFN), aims to develop a data and computing platform targeting scientific communities, deployable on multiple hardware platforms and provisioned over hybrid (private or public) e-infrastructures. By filling existing gaps at the PaaS and SaaS levels, INDIGO-DataCloud will help developers, resource providers, e-infrastructures and scientific communities to overcome current challenges in the Cloud computing, storage and network areas.
The challenges of the Big Data era
- ... orchestrate and federate Cloud, Grid and HPC (public or private) resources?
- ... avoid software and vendor lock-in?
- ... overcome performance issues limiting massive adoption of virtualized Cloud resources in large data centers?
- ... exploit specialized hardware, such as GPUs or low-latency interconnections?
- ... manage dynamic and complex workflows for scientific data analysis?
- ... combine data that come from multiple sources and are stored in multiple locations using incompatible technologies?
- ... support federated identities and provide privacy and distributed authorization in open Cloud platforms?
- ... provide APIs to exploit the above and write applications, customizable portals and mobile views?
- ... move beyond static allocation and partitioning of both storage and computing resources in data centers?
- ... distribute and deploy applications in a flexible way?
- ... exploit distributed computing and storage resources through transparent network interconnections?
INDIGO-DataCloud will develop and deliver software to simplify the execution of applications on Cloud- and Grid-based infrastructures, as well as on HTC and HPC clusters.
The project will extend existing PaaS (Platform as a Service) solutions, allowing public and private e-infrastructures, including those provided by EGI, EUDAT, PRACE and Helix Nebula, to integrate their existing services and make them available through AAI services compliant with GEANT’s interfederation policies, thus guaranteeing transparency and trust in the provisioning of such services.
INDIGO will also provide a flexible and modular presentation layer connected to the PaaS and SaaS frameworks developed within the project, enabling innovative user experiences and dynamic workflows, including from mobile devices.
- It is based on Open Source solutions, and the software it develops will also be Open Source.
- It is rooted in use cases and supported by multidisciplinary scientific communities, big and small.
- It exploits available, general solutions rather than custom, home-made, specific tools or services.
- The frameworks and services offered to end users and to developers will have a low learning curve: popular existing software suites will be supported and exploitable by INDIGO software in a transparent way.
- It will allow software to be run in a hybrid, distributed Cloud environment.