Google BigQuery: 5 benefits for cloud data warehousing

BigQuery is the enterprise data warehouse service of the Google Cloud Platform (GCP) - and it's making big waves for low costs and flexibility.

As a cloud-based warehousing solution, BigQuery is entirely serverless, with high availability and petabyte scalability. Unlike other options, there are no nodes to plan, configure, or scale.

BigQuery implements key capabilities of Dremel, Google’s own distributed system for querying large datasets, and stores data in a columnar format. It uses a tree architecture to parallelise queries across a large number of machines. Rather than relying on indexes, each SQL query scans the columns it references in full, yet still returns results in seconds, even for tables with millions of rows.
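To make the columnar storage and tree-style execution a little more concrete, here is a toy Python sketch. It is not how Dremel or BigQuery is actually implemented: the table, the shard size, the fan-in and the function names (leaf_scan, merge, tree_query) are all made up for illustration. It only shows the general idea of reading just the columns a query needs, aggregating shards in parallel at the leaves, and merging partial results up a tree.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy columnar table: each column is stored as its own list (hypothetical data).
TABLE = {
    "country": ["AU", "NZ", "AU", "US", "AU", "NZ"],
    "revenue": [120, 80, 200, 50, 30, 70],
    "notes": ["n/a"] * 6,  # never read by the query below
}


def leaf_scan(shard):
    """Leaf node: partial SUM(revenue) GROUP BY country over one shard."""
    countries, revenues = shard
    partial = {}
    for country, revenue in zip(countries, revenues):
        partial[country] = partial.get(country, 0) + revenue
    return partial


def merge(partials):
    """Intermediate or root node: combine partial aggregates from children."""
    total = {}
    for partial in partials:
        for key, value in partial.items():
            total[key] = total.get(key, 0) + value
    return total


def tree_query(table, shard_size=2, fan_in=2):
    """Roughly: SELECT country, SUM(revenue) FROM table GROUP BY country."""
    # Columnar layout: only the two referenced columns are touched.
    countries, revenues = table["country"], table["revenue"]
    shards = [
        (countries[i:i + shard_size], revenues[i:i + shard_size])
        for i in range(0, len(countries), shard_size)
    ]
    # Leaves scan their shards in parallel.
    with ThreadPoolExecutor() as pool:
        level = list(pool.map(leaf_scan, shards))
    # Merge partial results level by level until a single root result remains.
    while len(level) > 1:
        level = [merge(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0] if level else {}


if __name__ == "__main__":
    print(tree_query(TABLE))
```

In this sketch the "notes" column is never read, which is the main reason a columnar layout keeps full scans affordable.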

With queries no longer taking hours or days, as they can with traditional on-premises tools, results arrive within moments. Analysts across the enterprise can use that speed to act on insights faster, build visualised reports on aggregated data, and pinpoint answers to questions from historical data more accurately.

In short, it’s fast, fully managed, and cost-effective - and you can’t afford to ignore it any longer.

 

Top 5 benefits of Google Cloud Platform's BigQuery data warehouse service

BigQuery is effectively GCP’s counterpart to Microsoft’s Azure SQL Data Warehouse (now Azure Synapse Analytics) and Amazon Athena, but with its own advantages owing to Google’s broader cloud strengths.

What they have in common is an emphasis on helping enterprises focus on the benefits of analytics, rather than worrying about the infrastructure behind it.

GCP itself is tailored towards rich big-data environments, and offers a wide range of options for assets and data storage to build or expand enterprise infrastructure.

BigQuery is built for storing and analysing large amounts of information efficiently, with minimal drop in performance. After speaking to our customers, we’ve broken down the top 5 benefits of the service and why you should consider BigQuery for your big data needs.

 

#1 - Cost-effective and highly scalable ‘NoOps’ warehousing solution

BigQuery is a fully managed service with a pay-as-you-go cost model for both storage and querying.

Unlike cloud-based data warehouse solutions that charge a fixed rate, BigQuery bases its default costs on usage, meaning your bill reflects how much you use each month. Its on-demand nature means you can lower your total cost of ownership by up to 88%.

Storage with BigQuery is charged based on the amount of data you have stored in the service, either at an ‘Active’ monthly rate or at a lower ‘Long-Term’ monthly rate, which applies to tables and partitions that have not been modified for 90 consecutive days. Query costs, meanwhile, are on-demand, based on the amount of data processed by each query you run.

Enterprises with fixed budgets can instead opt for a flat-rate model and buy dedicated resources based on their specific business needs.

Currently, active storage costs $0.023 per GB per month, and queries cost $9.55 AUD per TB of data processed. However, the first 10 GB of storage and the first 1 TB of query data processed each month are free, and a number of operations do not incur any charge, such as:

  • Copying and exporting data
  • Deleting datasets, tables, partitions and functions
  • Loading data into BigQuery from cloud storage
  • Metadata operations
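As a rough worked example of how an on-demand bill comes together, the Python sketch below applies the rates quoted above to a month's usage. Everything in it is an assumption for illustration: the function name estimate_monthly_bill is ours, the rates are simply the figures from this article (kept in the currencies as quoted, so the two amounts are reported separately rather than summed), and the free allowances are set to 10 GB of storage and 1 TB of queries per month. Check the current pricing page for your region before relying on any of these numbers.

```python
# Rates as quoted in this article at the time of writing (assumptions).
ACTIVE_STORAGE_PER_GB = 0.023  # per GB per month, currency as quoted
QUERY_PER_TB = 9.55            # AUD per TB of data processed

# Assumed monthly free allowances; adjust if the published free tier differs.
FREE_STORAGE_GB = 10
FREE_QUERY_TB = 1


def estimate_monthly_bill(stored_gb, queried_tb):
    """Estimate on-demand charges for one month of active storage and queries."""
    billable_gb = max(0.0, stored_gb - FREE_STORAGE_GB)
    billable_tb = max(0.0, queried_tb - FREE_QUERY_TB)
    return {
        "billable_gb": billable_gb,
        "billable_tb": billable_tb,
        "storage": round(billable_gb * ACTIVE_STORAGE_PER_GB, 2),
        "queries": round(billable_tb * QUERY_PER_TB, 2),
    }


if __name__ == "__main__":
    # Hypothetical workload: 500 GB stored, 3 TB scanned by queries.
    print(estimate_monthly_bill(stored_gb=500, queried_tb=3))
```

With the hypothetical workload above, 490 GB of storage and 2 TB of queries are billable once the assumed free allowances are subtracted. Long-term storage and flat-rate pricing are deliberately left out to keep the sketch short.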

Meanwhile, BigQuery’s elastic scaling is automatic, and streaming ingestion lets you load data from your cloud data lake or on-premises data sources seamlessly, while optimising performance for real-time analytics. It’s fast, flexible and efficient, offering better storage and querying than you may be getting with older, on-premises SQL warehousing solutions.

 

#2 - Streamlined analytics with ETL compatibility and BI connectors

BigQuery’s main appeal is in how fast it lets an enterprise set up a data warehouse, query data and run at scale. 

The service has built-in connectors to some of the most popular and heavily used ETL tools, such as Informatica, which can enrich data as it is loaded. The built-in BigQuery Data Transfer Service (DTS) also helps migrate data from other data storage types, such as relational databases, NoSQL stores, and OLAP systems.

Meanwhile, BigQuery natively supports the most heavily used business intelligence platforms, such as Looker and Tableau, without additional tinkering. Enterprises that have already invested in these powerful toolsets can harness their organised data in BigQuery and begin building dashboards and reports with rich visualisations much faster.

 

#3 - You can create and test machine learning models using SQL queries

BigQuery includes a feature called BigQuery ML which allows advanced users to create, run and test machine learning models using standard SQL queries - directly within the solution.

The common scenario for building machine learning solutions with enterprise data is to export the relevant data out of our warehousing solutions and then build the model.

With BigQuery ML, you bring the model - and the entire build, run and test process - to the data instead, reducing complexity and the steps required to get things started, and improving speed.

Because it uses SQL queries, BigQuery ML lets users already familiar with SQL apply their existing skills and processes to get started with machine learning faster, without having to program a solution in Java or Python.
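To give a feel for what "machine learning in SQL" looks like, the sketch below uses Python only to assemble two BigQuery ML statements as strings: a CREATE MODEL statement for a logistic regression, and an ML.PREDICT query against the trained model. The dataset, table and column names (my_dataset.churn_model, my_dataset.customers, churned and so on) and the helper names are hypothetical, and the statements are a minimal example under those assumptions rather than a recommended modelling setup.

```python
def create_model_sql(model, source_table, label_col, feature_cols,
                     model_type="logistic_reg"):
    """Build a BigQuery ML CREATE MODEL statement as a plain string.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    features = ", ".join(feature_cols)
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label_col}']) AS\n"
        f"SELECT {features}, {label_col}\n"
        f"FROM `{source_table}`"
    )


def predict_sql(model, source_table, feature_cols):
    """Build a query that scores rows with ML.PREDICT."""
    features = ", ".join(feature_cols)
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT {features} FROM `{source_table}`))"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    cols = ["tenure_months", "monthly_spend"]
    print(create_model_sql("my_dataset.churn_model", "my_dataset.customers",
                           "churned", cols))
    print(predict_sql("my_dataset.churn_model", "my_dataset.new_customers", cols))
```

The generated statements could be pasted into the BigQuery console as they are, or submitted through a client library. Either way, the training data never leaves the warehouse.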

Currently, you can access BigQuery ML via the BigQuery user interface, BigQuery REST API, and external solutions such as your BI platform (Looker or Tableau) or notebook (Jupyter).

 

#4 - Managed service that lets you focus on innovation, not maintenance

All updates for BigQuery are delivered to your systems automatically, with no infrastructure to manage on your end. 

Because Google handles all maintenance, patching and feature upgrades, your team can focus on other key areas, such as deriving more value from your data. This makes BigQuery especially attractive and cost-effective for businesses moving from complex on-premises data islands.

 

#5 - Built-in DR and data protection for stronger data security

In addition to a 99.9% service level agreement (SLA) that guarantees uptime, BigQuery has some of the strongest governance and security of any cloud-based data warehouse service.
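To put an SLA percentage in perspective, a quick back-of-the-envelope calculation helps. The sketch below assumes a 30-day month and treats the percentage simply as the share of minutes the service must be available. The actual SLA terms define uptime in their own way, so read this as rough arithmetic only; the function name is ours.

```python
def downtime_budget_minutes(sla_percent, days=30):
    """Minutes of downtime per period that an uptime percentage would allow."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


if __name__ == "__main__":
    print(f"{downtime_budget_minutes(99.9):.1f} minutes per 30-day month")
```

For 99.9% over a 30-day month, that works out to roughly 43 minutes of allowable downtime.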

Protecting the integrity of our enterprise data is of the utmost importance to ensure accuracy and reliability of the data we store and query, and the insights we derive once analysed.

BigQuery includes automatic data replication for built-in disaster recovery (DR) and high availability in the case of unexpected downtime or an unintentional (or malicious) breach, at no extra cost. Not having to use other tools to achieve this level of data protection frees up your team further to focus on producing insights, rather than tinkering with multiple tools for security.