The Rise of GCP in Data

Cloud Event Review

Earlier this year, Host gathered some of the best GCP experts in the market for a panel discussion about the rising popularity of Google Cloud Platform in data.

We were thrilled to host this event alongside our panel speakers Mike Fowler, CTO at Synalogic, Anna Maria Wykes, Senior Data Engineering Consultant at Advancing Analytics, and Matt Penton, Head of Data and Analytics at Appsbroker.

For those who couldn’t attend the event, we’ve put together some key takeaways from the discussion. We’d like to thank the panellists for their time and insight, and look forward to hosting more events in the future!

The cloud market is bigger than ever.

Event host Joe Davies opened the conversation by highlighting the magnitude of the cloud market and how it has become a popular subject in recent years. Research shows that the market was valued at USD 150bn in 2019, and it has only continued to grow over the last two years.

The first question put to the panel was how data architecture is transforming the use of GCP and other cloud options at the moment.

Mike answered this question, highlighting that cloud computing has completely shifted the way we work with our data. Before cloud systems, it was data warehouses and data centres, which, although still widely used today, are becoming the less favoured choice for a number of reasons. Something that is abundantly clear is that the cloud gives us the ability to model on demand.

We can put our data into different formats and storage systems, each with its own strengths, depending on the type of data we're working with. Another interesting point was made around unstructured data, and how the cloud allows us to extract and iterate a lot faster. It also allows more room for error, as throwing work away and starting from scratch isn’t detrimental.

Matt agreed with this, stating that unstructured data is the “untapped wealth of many organisations”. Although unstructured data and the cloud seem to be a match made in heaven, Anna expressed concerns about how data is used by cloud providers, highlighting that a lot of providers don’t realise how much work is needed to get the data into a state where it can be reported on and used. These solutions should still be seen as a work in progress, rather than a “magic wand” to solve all of your problems.

This raised the question of regulation, and how this can be monitored, either by a regulatory board or internally within the organisation. The consensus from our panellists was as follows:

  • It’s down to the business to organise individuals and spend time engineering the data

  • Cloud providers should have tools readily available to make this easier

  • For sensitive industries, such as finance and healthcare, information security and data management should be front of mind, with extra precautions put in place

Dataflow vs Cloud Functions: could one be replaced with the other?

Another area that was discussed was replacing Dataflow with Cloud Functions, which Matt said was a nuanced question in his opinion. Cloud Functions are limited by design to lightweight, serverless execution, so they are less heavyweight and less controlled than Dataflow. Mike agreed, and added a rule of thumb: if what you’re doing could be executed row by row, you could potentially use a cloud function instead.
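As a rough illustration of one way to read the row-by-row rule of thumb, the sketch below contrasts a transformation that needs only one record at a time with one that needs the whole dataset. It is plain Python with no GCP libraries, and the record fields and function names (`clean_row`, `average_spend_by_region`) are hypothetical; none of them came from the panel discussion.

```python
from collections import defaultdict


def clean_row(row: dict) -> dict:
    """Row-by-row work: the output depends only on this one record."""
    return {
        "customer": row["customer"].strip().title(),
        "region": row["region"].upper(),
        "spend": round(float(row["spend"]), 2),
    }


def average_spend_by_region(rows: list[dict]) -> dict:
    """Cross-row work: all records for a region are needed to compute the answer."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["spend"]
        counts[row["region"]] += 1
    return {region: totals[region] / counts[region] for region in totals}


# Hypothetical sample records, invented for this sketch.
raw = [
    {"customer": " alice ", "region": "emea", "spend": "10.0"},
    {"customer": "bob", "region": "emea", "spend": "30.0"},
    {"customer": "carol", "region": "apac", "spend": "5.0"},
]

cleaned = [clean_row(r) for r in raw]        # each call is independent of the others
averages = average_spend_by_region(cleaned)  # needs the full set of rows
```

Under this reading, logic like `clean_row` is the kind of work that could sit in a lightweight function triggered per record, while grouping and aggregation like `average_spend_by_region` is closer to what a pipeline service is built for.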

Different cloud offerings: how does it benefit your industry?

There are more than five mainstream cloud providers, and each comes with its pros and cons. For example, Azure is naturally linked to Microsoft, and GCP to Google, but this in turn creates a competitive market. It pushes cloud providers to watch what their competitors are doing and to ensure that they stay ahead of the curve. AWS is a clear leader in the market, with over 30% of organisations using it.

Mike explained that he feels there will be ebbing and flowing across the various cloud providers as time progresses, and that it’s important to recognise that a provider like Google sees GCP as a “side business”, especially as Google’s main revenue generator is advertising. For some providers, cloud isn’t necessarily their main priority, so there will always be providers (like Azure) that continue to dominate the market.

What data tool/service are you most excited about?

With multiple tools emerging in the market, there are many to be excited about. Anna highlighted that the new engine database is one she’s interested in: because it is written in C++, she’s keen to see whether it resolves performance issues.

Matt explained that he’s most excited about BigQuery. Its general performance has improved over the years, and he’s interested to see how it continues to evolve over the next few months and years.

Do you think data lakes have finally replaced data centres post pandemic?

Matt highlighted that Covid-19 has accelerated this, and the short answer is “yes”. Anna added that throughout her career, everything was on-prem in server rooms, and now there’s been a complete switch to working with the cloud. She explained that it’s difficult to get companies on board with this, especially as they feel like they’re losing control of their data because it’s not physically in the building.

Pushing for data lakes is something that she focuses on as much as possible, but it’s also important to be mindful of the fact that there will be resistance from decision makers. Mike explained that he has come across resistance a number of times, and although some of it is justifiable, some of it can just act as a bottleneck for a necessary process. It’s ultimately about relinquishing control and trusting that this is the future of data storage.

Watch the event recording

Join us at the next event… To be announced soon.

Keep an eye on our events page and follow us on LinkedIn for updates.
