Cloud computing for big data: 4 key considerations

The leap to cloud computing is likely to happen in many companies, if it has not already, given the flexibility it offers to improve IT efficiency. Cloud computing is simply computing resources available on demand. It enables companies to acquire large-scale IT infrastructure and resources as needed, without massive upfront investment. This is a significant shift in the way IT executives can manage their IT environment.

Neelie Kroes, the former Vice-President of the European Commission responsible for the Digital Agenda, stated a few years ago that “[…] when it comes to cloud computing […] the potential for a fundamental change in business computing and beyond has been widely recognised.” (1)

Cloud computing is deservedly gaining attention, in particular for big data workloads, thanks to its pay-as-you-go approach. The following considerations need to be factored into your decision model before migrating to the cloud:


Which service model?

  • IAAS – Infrastructure As A Service
    • Lowest level: raw computing resources
    • Ex: Amazon EC2 (Elastic Compute Cloud); Rackspace private cloud; Microsoft Azure…
  • PAAS – Platform As A Service
    • Next level: developer-facing services
    • Ex: Amazon RDS (Relational Database Service); DynamoDB; Google App Engine; Microsoft Azure…
  • SAAS – Software As A Service
    • Highest level: user-facing applications
    • Ex: Google; Facebook; Twitter; Salesforce; Dataiku; Tableau…

Which deployment model?

  • Public cloud – resources are shared across multiple organizations or tenants
  • Private cloud – resources allocated solely to a single organization
  • Hybrid cloud – combination of the two previous models

Benefits for big data workloads

  • Reliable Distributed Storage: benefit from the cloud infrastructure.
  • Elasticity: ability to run large scale computation on demand.
  • Data sharing across tenants: typically, with a public or hybrid model, you can benefit from public data sets available on the cloud.

Challenges to anticipate

By using the cloud as part of your IT strategy, you will outsource computation or storage to a third party. You need to make sure that the following concerns are managed effectively:

  • Security and legal compliance: To a certain extent, you will hand over security management to your cloud provider. This is deservedly one of the most challenging aspects of cloud migration, and one of its main obstacles. Typically, sensitive data (health data, financial data, etc.) requires strict control over data governance. In addition, you might need to follow specific laws and data security standards (PCI DSS, HIPAA, Directive 95/46/EC, GDPR, etc.) for the storage and processing of this information. Cloud providers understand the importance of these concerns and are giving customers more control over security properties, such as encryption of stored data (including remote data encryption, key rotation, etc.) and access control (user role and profile management, user authentication including two-factor authentication). They are also working to provide solutions that comply with data security standards and policies. The physical location of the servers is another important aspect of cloud security, as the servers fall under the jurisdiction and regulations of the local authorities.
  • Service level agreement: The agreement covers the availability and reliability of the service provided. How does your provider ensure business continuity in case of disaster, for example if the power goes down or lightning strikes your provider's data center? What happens to your data or computation? Location diversity is one possible solution: your computation is geographically distributed and redundant, which reduces the risk of all the servers going down at the same time. Multi-cloud architecture is another possible solution that is gaining attention, not only to improve fault tolerance but also to address security and vendor lock-in concerns.
  • Data import/export: Moving big data over the internet can be very slow, because transfer time grows with data volume while your bandwidth stays fixed. An alternative to internet transfer is to ship physical disks.
  • Vendor lock-in: Depending on your service model (SAAS, PAAS or IAAS), it might be difficult to move from one provider to another if they do not share the interfaces or services on which you have developed your applications. In addition, you might face the data transfer issue described above when you decide to switch providers.
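
To get a feel for the data import/export concern, a rough transfer time estimate can help. The sketch below is only a back-of-the-envelope calculation: the function name `transfer_time_hours`, the `efficiency` factor (the share of nominal bandwidth you actually achieve) and its default of 0.8 are assumptions made for illustration, not figures from any provider.

```python
def transfer_time_hours(data_terabytes: float,
                        bandwidth_mbps: float,
                        efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move a data set over a network link.

    data_terabytes: data volume in decimal terabytes (1 TB = 10**12 bytes).
    bandwidth_mbps: nominal link speed in megabits per second.
    efficiency:     assumed fraction of the nominal bandwidth actually
                    achieved (hypothetical default, tune to your link).
    """
    if data_terabytes < 0:
        raise ValueError("data_terabytes must not be negative")
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")

    data_bits = data_terabytes * 1e12 * 8           # bytes -> bits
    effective_bits_per_second = bandwidth_mbps * 1e6 * efficiency
    seconds = data_bits / effective_bits_per_second
    return seconds / 3600


if __name__ == "__main__":
    hours = transfer_time_hours(data_terabytes=100, bandwidth_mbps=1000)
    print(f"{hours:.1f} hours, or about {hours / 24:.1f} days")
```

Under these assumptions, moving 100 TB over a 1 Gbps link takes roughly 278 hours, or about 11.6 days of continuous transfer. If shipping disks takes only a few days door to door, the physical option may well be faster.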


If you foresee cloud computing as part of your IT strategy, you need to choose wisely which components to migrate to the cloud and carefully manage the challenges to ensure the effectiveness of your investment.


(1) Reference – European Commission press releases database

© Copyright Certosa Consulting – All Rights Reserved. Unauthorized use and/or duplication of this material without express and written permission from this site’s author and/or owner is strictly prohibited. Excerpts and links may be used, provided that full and clear credit is given to Pastel Gbetoho and Certosa Consulting with appropriate and specific direction to the original content.


Pastel Gbetoho

Senior Data Architect, passionate about data integration, data management, data visualization and understanding how to leverage the data for better business decisions.
