Tableau Data Source Connections

Tableau data source connections are one of the most critical aspects of data analysis. They determine the overall design of the analytical report and also influence its performance. Depending on the technology landscape, there are a variety of design options in Tableau that we can rely on.

Efficient Data Connections for the Tableau Data Analyst Certification

Data extracts vs. live connections

Extracts are one of the most powerful but overlooked tools in Tableau’s arsenal. A Tableau data extract takes a snapshot of data, optimized for aggregation, and loads it into system memory to be quickly recalled for visualization. Extracts tend to be much faster than live connections in Tableau, especially in complex visualizations with large data sets, filters, calculations, and so on.

What is a Tableau Data Extract (TDE)? 

A Tableau data extract is a compressed snapshot of data stored on disk and loaded into memory to render a Tableau viz. That’s fair for a working definition. However, the full story is much more interesting and powerful.

There are two aspects of TDE design that make them ideal for supporting analytics and data discovery. The first is that a TDE is a columnar store. Let’s at least establish the common understanding that columnar databases store column values together rather than row values. As a result, they dramatically reduce the input/output required to access and aggregate the values in a column.
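The difference between row and column layouts can be sketched in a few lines of Python. This is only an illustration of the general idea, not Tableau’s actual storage format; the sample data and the names `SAMPLE_ROWS` and `to_columns` are made up for the example.

```python
from typing import Any, Dict, List

# Hypothetical sample data, stored row by row (one dict per record).
SAMPLE_ROWS: List[Dict[str, Any]] = [
    {"region": "East", "product": "Chair", "sales": 120.0},
    {"region": "West", "product": "Desk", "sales": 340.0},
    {"region": "East", "product": "Lamp", "sales": 45.5},
]


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row records into one list of values per column."""
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


if __name__ == "__main__":
    columns = to_columns(SAMPLE_ROWS)

    # Row layout: every record has to be visited to pick out one field.
    row_total = sum(row["sales"] for row in SAMPLE_ROWS)

    # Column layout: the aggregate only touches the "sales" values.
    column_total = sum(columns["sales"])

    print(row_total, column_total)  # both are 505.5
```

Both totals are the same, but the column layout gets there by reading one contiguous list instead of every full record, which is where the input/output savings come from.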

The second key aspect of TDE design is how they’re structured, which affects how they’re loaded into memory and used by Tableau. This is an important part of how TDEs are “architecture aware”. Basically, architecture-awareness means that TDEs use all parts of your computer’s memory, from RAM to disk, and put each part to work as best fits its characteristics.

To better understand this aspect of TDEs, we’ll walk through how a TDE is created and then used as the data source for one or more visualizations. When Tableau creates a data extract, it first defines the structure for the TDE and creates separate files for each column in the underlying source. (This is why it’s beneficial to minimize the number of data source columns selected for extract.)

The sorting and compression of column values occur earlier in the process than in previous versions, which accelerates the operation and reduces the amount of temporary space used for extract creation. People often ask if a TDE is decompressed as it is being loaded into memory. The answer is no.

The compression used to reduce the storage requirements of a TDE and make it more efficient isn’t file compression. However, good old file compression can still be used to further reduce the size of a TDE.

To complete the creation of a TDE, individual column files are combined with metadata to form a memory-mapped file or, to be more accurate, a single file containing as many individual memory-mapped files as there are columns in the underlying data source. This is a key enabler of its carefully engineered architecture-awareness.

Tableau doesn’t have to open, process, or decompress the TDE to start using it. If necessary, the operating system continues to move data in and out of RAM to ensure that all of the requested data is made available to Tableau.
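The memory-mapping behavior described above can be tried out with Python’s standard `mmap` module. The sketch below is a generic illustration, not the TDE format: it assumes a hypothetical column file of little-endian 64-bit floats, and the names `write_column_file`, `read_slice`, and `demo` are invented for the example.

```python
import mmap
import os
import struct
import tempfile
from typing import List

VALUE_SIZE = struct.calcsize("<d")  # 8 bytes per little-endian float64


def write_column_file(path: str, values: List[float]) -> None:
    """Write one column as a flat binary file of float64 values."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}d", *values))


def read_slice(path: str, start: int, count: int) -> List[float]:
    """Read `count` values beginning at index `start` through a memory map.

    The file is mapped rather than read in full; the operating system
    pages in only the regions that are actually touched.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            offset = start * VALUE_SIZE
            chunk = mapped[offset:offset + count * VALUE_SIZE]
    return list(struct.unpack(f"<{count}d", chunk))


def demo() -> List[float]:
    """Create a throwaway column file and read three values from the middle."""
    values = [float(i) for i in range(100_000)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sales.col")
        write_column_file(path, values)
        return read_slice(path, 50_000, 3)


if __name__ == "__main__":
    print(demo())  # [50000.0, 50001.0, 50002.0]
```

Reading from the middle of the file does not require loading the first 50,000 values, which is the same property that lets a memory-mapped extract serve requests without being read in whole.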

Memory mapping is a key point: it means that Tableau can query data that’s bigger than the available RAM on a machine! Only data for the columns that are requested is loaded into RAM. There are other, subtler optimizations as well. For example, a typical OS-level optimization is to recognize when access to data in a memory-mapped file is contiguous and, as a result, read ahead in order to speed up access. Memory-mapped files are also loaded only once by an OS, no matter how many users or visualizations access them.

Lastly, architecture-awareness doesn’t stop with memory. TDEs support the Mac OS X and Linux OS in addition to Windows and are 32- and 64-bit cross-compatible.

When you create an extract from a local file (such as a .csv or an Excel workbook) or an on-premise database, you’re speeding up the workbook through optimization.

As a result, Tableau doesn’t need the database to build the visualization. Instead, Tableau’s in-memory data engine queries the extract directly. However, because an extract is a snapshot of the data, the extract will need to be refreshed to receive updates from the original data source, whether it’s a flat file or an on-premise database.

What is a Live Connection?

Live connections offer the convenience of real-time updates, with any changes in the data source reflected in Tableau. But live connections also rely on the database for all queries, and unlike extracts, databases aren’t always optimized for fast performance. With live connections, your data queries are only as fast as the database itself.

There are more variables at play when using a live connection. Workbook speeds are affected by a variety of factors, including your network speed, traffic on that network, and any custom SQL.

A live connection refers to a data source that contains a direct connection to the underlying data, which provides real-time or near real-time data. With a live connection, Tableau makes queries directly against the database or other source and returns the results of the query for use in a workbook.
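The contrast between a live connection and an extract can be mimicked with Python’s built-in `sqlite3` module. This is a toy model under stated assumptions, not how Tableau is implemented: the `admissions` table, the `live_count` function, and the `Extract` class are all hypothetical names chosen for the illustration.

```python
import sqlite3
from typing import List, Tuple


def make_source() -> sqlite3.Connection:
    """Create a small in-memory database standing in for the real source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE admissions (patient TEXT, ward TEXT)")
    conn.executemany(
        "INSERT INTO admissions VALUES (?, ?)",
        [("A", "ER"), ("B", "ICU")],
    )
    conn.commit()
    return conn


def live_count(conn: sqlite3.Connection) -> int:
    """'Live' access: every call sends a query to the source database."""
    return conn.execute("SELECT COUNT(*) FROM admissions").fetchone()[0]


class Extract:
    """'Extract' access: a local snapshot that is stale until refreshed."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self.rows: List[Tuple[str, str]] = []
        self.refresh()

    def refresh(self) -> None:
        """Re-copy the source rows into the local snapshot."""
        self.rows = self._conn.execute(
            "SELECT patient, ward FROM admissions"
        ).fetchall()

    def count(self) -> int:
        """Answer from the snapshot without touching the source."""
        return len(self.rows)


if __name__ == "__main__":
    source = make_source()
    extract = Extract(source)

    # A new row arrives in the source after the snapshot was taken.
    source.execute("INSERT INTO admissions VALUES ('C', 'ER')")

    print(live_count(source))  # 3: the live query sees the new row
    print(extract.count())     # 2: the snapshot is stale
    extract.refresh()
    print(extract.count())     # 3: up to date after the refresh
```

The live path always reflects the source but pays for a query each time, while the snapshot answers locally and only catches up when it is refreshed.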

Users can create live connections and then share them on Tableau Server so that other Tableau users can use the same data with the same connection and filtering settings. As the Tableau Server administrator, you can manage credentials and the permissions associated with the data source to control what data users can access.

An extract or a live connection—which to use? 

Both types of connections have their place. Hospitals that monitor incoming patient data must make real-time decisions. These situations necessitate a live database connection. But within the same hospital, there may also be visualizations that monitor daily or weekly trends. For these analytics, using an extract of the data source helps build a faster workbook.
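One way to frame the choice is to compare how stale the data is allowed to be with how often an extract would be refreshed. The helper below is a rule-of-thumb sketch, not a Tableau feature; `recommend_connection` and its parameters are hypothetical names, and real decisions would also weigh database load and query speed.

```python
from datetime import timedelta


def recommend_connection(max_data_age: timedelta,
                         refresh_interval: timedelta) -> str:
    """Suggest 'live' or 'extract' from a simple freshness comparison.

    max_data_age:     how old the data may be before it is no longer useful.
    refresh_interval: how often a scheduled extract refresh would run.
    """
    # If the data must be fresher than the refresh schedule can deliver,
    # an extract would be stale too often, so query the source directly.
    if max_data_age < refresh_interval:
        return "live"
    return "extract"


if __name__ == "__main__":
    # Incoming patient monitoring: data older than 30 seconds is too old.
    print(recommend_connection(timedelta(seconds=30), timedelta(hours=1)))  # live

    # Weekly trend dashboard with a nightly refresh.
    print(recommend_connection(timedelta(days=7), timedelta(days=1)))  # extract
```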

Tableau Online currently supports live connections to the following cloud-hosted data sources:

  • Amazon Redshift 
  • Amazon Aurora 
  • Google BigQuery 
  • Google Cloud SQL 
  • Hive and Impala on Amazon Elastic MapReduce 
  • HP Vertica 
  • Microsoft SQL Server 
  • Microsoft Azure SQL Data Warehouse 
  • Microsoft Azure Database (Marketplace DataMarket) 
  • MySQL 
  • PostgreSQL 
  • SAP HANA 
  • Spark SQL 
  • Snowflake. 

With scheduled extract refreshes, workbooks can connect to data from the following cloud applications:

  • Salesforce 
  • Google Analytics 
  • Google Sheets 
  • QuickBooks

Why use published data sources?

Now that you know what kinds of data connections are in your arsenal, let’s discuss how to manage those sources. With published data sources, people are no longer required to establish connections to databases themselves. Instead, publishing the data source connection provides simple and secure access through a user’s Tableau Online account. Publishing a data source to Tableau Online also captures any metadata you’ve built in Tableau Desktop. If you created new calculated fields, groups, sets, or hierarchies in the data pane of your workbook, all of these modifications will be reflected in the data source published to Tableau Online. We’ve found this useful in curating easy-to-use data sources for organizations.

Published Data Source 

How to publish a data source

Say you have a cloud-hosted Amazon Redshift database with one main account, but you want all your users to have access to the database for use in Tableau.

You’ll want to publish the data source to Tableau Online with your Redshift login credentials embedded.

  1. First, create a new connection to the data source in Tableau Desktop.
  2. Choose the data you want to bring into Tableau.
  3. Sign in to Tableau Online from the Server menu, using the address online.tableau.com.
  4. In the same menu, publish the data source.
  5. Choose the project in which you want the data source to live. You can also add a name, tags, permissions, and authentication.
  6. In this case, I’m going to choose embedded credentials, so my users won’t have to enter credentials every time they use the connection.

Now we have our data source hosted on Tableau Online!

How to connect to a published data source

On the user end, connecting to the published data source is very simple. 

  1. In Tableau Desktop, choose “Tableau Server” as the database and enter “online.tableau.com” as the server URL.
  2. Choose the respective published data source.
  3. You’re ready to create a viz! If you often use more than one database, published data sources provide easy organization and connection to any number of databases.

Advantages

  • Tableau’s data server is a server component that lets you centrally manage and store Tableau Server data sources. A data source is a reusable connection to data.
  • The data can be located either in Tableau’s data engine, as an extract, or in a live database.
  • For relational database connections, the information stored in the data source is used for a pass-through connection.
  • The data source can also include customizations you’ve made at the field level in Tableau Desktop, such as calculations, dimension aliases, groups, or sets.
  • For administrators, there are many advantages to using Tableau Server data sources. Because one data source extract can be used by many workbooks, server space and processing time can be minimized.
  • Another advantage of having published data sources on Tableau Server is that your extracts are no longer dependent on the database. If your database crashes or goes down, your extracts will still have data up to the last refresh, so your dashboards stay alive.
  • Extract refreshes can be scheduled per extract rather than per workbook, and when a workbook using a Tableau Server data source is downloaded, the data extract stays on the server, resulting in less network traffic.
  • Finally, if a database driver is required for a connection, you only need to install the driver once, on Tableau Server, rather than multiple times on all your users’ desktops.
