Constantin Lungudatawise.dev·Apr 15, 2024Sharded tables in BigQueryHave you ever worked with sharded tables in BigQuery? I've encountered them in a project long time ago and haven't seen them much around since. Does the name not ring a bell? Well, think of it as pseudo-partitioning, a way to store data split bet...Discuss·46 readsPractical BigQuerybigquery
Constantin Lungudatawise.dev·Apr 11, 2024Comparing tables with FULL OUTER JOINDoes your Data Engineering project use a data-diffing tool? Say you're preparing to deploy a change to a prod table. You've changed the way some metrics are calculated and twisted some filters. How do you find out what's different between two table...DiscussPractical BigQuerySQL
Constantin Lungudatawise.dev·Apr 11, 2024Splitting a STRING in BigQuerySplitting a string in BigQuery works pretty much the same as in Excel. SPLIT works in a similar way as its Excel cousin TEXTSPLIT - taking a string to be split and a delimiter (can be multiple characters), and returns an array of elements. You ca...Discuss·27 readsPractical BigQuerybigquery
Constantin Lungudatawise.dev·Apr 6, 2024Search Indexes in BigQueryHere's something that might be interesting if you analyze large volumes of STRING or JSON data in BigQuery. Let's look at SEARCH indexes and what you need to know to get started. So, what are they used for? You have a big table (>10 GB) with STRING...Discuss·34 readsPractical BigQuerybigquery
Constantin Lungudatawise.dev·Apr 4, 2024Extract all pattern occurrences in BigQueryIf you ever need to extract information based on a pattern in a BigQuery string, check out the REGEXP_EXTRACT_ALL function. This will return an array of all the occurrences matching the specified regular expression. With regards to the pattern it...DiscussPractical BigQuerySQL
Constantin Lungudatawise.dev·Apr 2, 2024Why you should use UNION DISTINCT sparinglyLet's help BigQuery do less unneeded work! If you're UNIONING two sources known to have distinct values (and they don't have duplicates), go for UNION ALL instead of UNION DISTINCT (UNION for some other sql dialects) to avoid redundant de-duplicati...Discuss·56 readsPractical BigQuerySQL
Constantin Lungudatawise.dev·Apr 1, 2024ORDER BY expressions in SQLFriendly reminder: when you ORDER BY something in SQL, that something does not necessarily need to be a column, but could be an expression, the output of which can be ordered. In the example below, we'd like to ORDER by sales decreasingly, but show...Discuss·47 readsPractical BigQuerySQL
Constantin Lungudatawise.dev·Mar 31, 2024Accessing ARRAY elements in BigQuerySo here's 3 ways we can access elements in a BigQuery array. - by index: array[index], starting at 0- using OFFSET(index): array[OFFSET(index)], also starting at 0- using ORDINAL(1-based index)), starting at 1 The above will return an "index out ...Discuss·1 like·74 readsPractical BigQuerybigquery
Constantin Lungudatawise.dev·Mar 31, 2024Enumerating ARRAY elements in BigQuery using WITH OFFSETIn a previous post we've covered what ARRAYS are in BigQuery, their use cases and how to flatten them with UNNEST. Quite important to mention, ARRAYS are ordered collections (like lists in Python) - you set up that order when creating it. By UNNEST...Discuss·50 readsPractical BigQuerybigquery
Constantin Lungudatawise.dev·Mar 30, 2024UNNESTING ARRAYS in BigQueryHere's perhaps my favorite feature in BigQuery and another one I discovered when switching from SQL Server. It's one of its most powerful features - the support for ARRAYS. Although a bit intimidating when seeing them for the first time, they allow...Discuss·32 readsPractical BigQuerySQL