Creating data lakes is often the first step towards maximizing value from data by generating insights for the business.
Hadoop data lakes are today the most common kind found on-premises, and many of Google’s customers are moving them to the Google Cloud Platform. And the trend is accelerating.
In this video, Google Cloud Strategic Cloud Engineer, Roderick Yao, will teach you about the growing challenges in managing on-prem data lakes and what is driving the growth of open source implementations on the cloud.
He will walk you through how to architect, migrate, and secure your own open source data lake on Google Cloud using a mix of managed services and open source tools.
During the course of this presentation, Yao will go over:
- Why run data lakes on Google Cloud
- Designing and migrating data lakes
- Security and governance