Which databases already contain available data

Data sources for the Power BI service

Data is at the core of Power BI. Let's say you are exploring data. You can create charts and dashboards, or ask questions with the Q&A feature. The visualizations and answers you see get their underlying data from a dataset. But where does the dataset come from? It comes from a data source.

This article describes the types of data sources you can connect to from the Power BI service. Keep in mind that you can get data from many other types of data sources as well. If you choose one of those, you may first need to use the advanced query and data modeling features of Power BI Desktop or Excel. Those options are covered later in this article. For now, let's look at the types of data sources available directly from the Power BI service site.

To get data from any of the data sources in Power BI, select Get data in the lower-left corner of the page.

After you select Get data, you can choose the data you want to access.

Discover content

The Discover content section contains data and reports that are already prepared for you. Power BI has two types of content packs: organizational and services.

Organizational: If you and other users in your organization have a Power BI Pro or Premium Per User account, you can create, share, and use content packs. For more information, see Introduction to organizational content packs in Power BI.

Services: There are dozens of services with content packs for Power BI, and more are being added all the time. Most services require an account. For more information, see Connect to the services you use with Power BI.

Create new content

The Create new content section contains options for creating and importing content yourself. There are two ways to create or import your own content in Power BI: files and databases.

Files

Excel (.xlsx, .xlsm): An Excel workbook can contain different types of data. This includes data that you have entered into worksheets yourself, as well as data that you have queried and loaded from external data sources by using Power Query. Power Query is available through Get & Transform in Excel 2016 or through Power Pivot. You can import data from tables in worksheets or from a data model. For more information, see Get data from files for Power BI.

Power BI Desktop (.pbix): You can use Power BI Desktop to query and load data from external data sources and to create reports. You can also extend your data model with measures and relationships, or import your Power BI Desktop file into your Power BI site. Power BI Desktop is best suited for more advanced users. These users typically have a good understanding of their data sources and are familiar with querying and transforming data and with data modeling concepts. For more information, see Connect to data in Power BI Desktop.

Delimited files (.csv): These are simple text files with rows of data. Each row can contain one or more values, each separated by a comma. For example, a CSV file that contains name and address data can have many rows, where each row has values for first name, last name, street name, city, and state. You can't import data into a CSV file, but many applications, such as Excel, can save simple table data as a CSV file.
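As a small illustration of the row-and-comma layout described above, the following Python sketch parses a CSV text with the standard library. The sample names, addresses, and column headers are invented for this example and are not taken from any Power BI sample.

```python
import csv
import io

# Hypothetical sample data; the columns follow the name-and-address example
# in the text (first name, last name, street, city, state).
CSV_TEXT = (
    "first_name,last_name,street,city,state\n"
    "Ana,Silva,12 Elm Street,Portland,OR\n"
    "Ben,Okafor,98 Lake Road,Austin,TX\n"
)


def read_rows(text: str) -> list[dict[str, str]]:
    """Parse CSV text into one dict per data row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(CSV_TEXT)
for row in rows:
    print(row["first_name"], row["last_name"], "-", row["city"], row["state"])
```

The first line of the text acts as the header row, and every following line becomes one record with one value per comma-separated position.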

For other file types, such as XML Table (.xml) or text (.txt) files, you can first use Get & Transform to query, transform, and load the data into an Excel or Power BI Desktop file. You can then import the Excel or Power BI Desktop file into Power BI.

Where you save your files also makes a big difference. OneDrive for Business offers the greatest flexibility and integration with Power BI. You can also keep your files on a local drive, but extra steps are then required whenever you need to refresh your data. See the related articles for more information.

Databases

Databases in the cloud: From the Power BI service, you can connect live to the following services:

  • Azure SQL Database
  • Azure Synapse Analytics (formerly SQL Data Warehouse)
  • Spark on Azure HDInsight

Connections from Power BI to these databases are live. Suppose you connect to Azure SQL Database. You then start exploring its data by creating reports in Power BI. Whenever you slice the data or add a field to a visualization, Power BI queries the database directly. For more information, see Azure and Power BI.

For other types of databases in your organization, you first need to use Power BI Desktop or Excel to connect to the data, query it, and load it into a data model. You can then import the file into Power BI, where a dataset is created. When you configure a scheduled refresh, Power BI uses the configuration and connection information from the file to connect directly to the data source and query for updates. Power BI then loads these updates into the dataset. For more information, see Connect to data in Power BI Desktop.

What if my data comes from a different source?

There are hundreds of data sources you can use with Power BI. Regardless of where the data comes from, it must be in a format that the Power BI service can use to create reports and dashboards and to answer questions with Q&A.

Some data sources already have their data in a format that is ready for the Power BI service. These sources include content packs from service providers such as Google Analytics and Twilio. SQL Server Analysis Services tabular model databases are ready to use as well. You can also connect live to databases in the cloud, such as Azure SQL Database and Spark on HDInsight.

In other cases, you may need to query the data you want and load it into a file. Suppose your organization stores logistics data in a data warehouse database on a server. In the Power BI service, you can't connect to that database and start exploring its data (unless it's a tabular model database). However, you can use Power BI Desktop or Excel to query the logistics data and load it into a data model that you then save as a file. You can then import this file into Power BI, where a dataset is created.

You may now be wondering how to keep your Power BI dataset current, given that the logistics data in the database changes every day. When you import the data into the dataset, the connection information from the Power BI Desktop or Excel file is imported as well.

Suppose you set up a scheduled refresh or refresh the dataset manually. Power BI uses the connection information from the dataset, along with a few other settings, to connect directly to the database. It then queries for updates and loads them into the dataset. You will probably need a Power BI gateway to secure the data transfer between your on-premises server and Power BI. When the transfer is complete, the visualizations in reports and dashboards are refreshed automatically.

So even though you can't connect directly from the Power BI service to the data source, you can still transfer the data into Power BI. It just takes a few extra steps and possibly a little help from the IT department. For more information, see Data sources in Power BI Desktop.

Some more information

You often come across the terms "dataset" and "data source" in Power BI. They are often used synonymously, but they are two different things, even though they are related.

A dataset is automatically created in Power BI when you use Get data. With Get data, you connect to a content pack or file and import data from it, or you connect to a live data source. A dataset contains information about the data source and the associated credentials. Often, it also contains a subset of the data that was copied from the data source. When you create visualizations in reports and dashboards, you are often looking at data in the dataset.

A data source is where the data in a dataset comes from. For example, the data can come from:

  • an online service like Google Analytics or QuickBooks
  • a database in the cloud like Azure SQL Database
  • a database or file on a local computer or server in your organization
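The relationship between the two terms can be sketched as a small data model. This is an illustration only: the class names, fields, and sample values below are hypothetical and do not correspond to any Power BI API. The `is_live` check is a simplification that treats a dataset without a cached copy as a live connection.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """Where the data originates (hypothetical model, not a Power BI API)."""
    kind: str      # for example "online service", "cloud database", "local file"
    location: str  # illustrative identifier only


@dataclass
class Dataset:
    """Connection details plus, often, a copied subset of the source data."""
    name: str
    source: DataSource
    credential_ref: str  # placeholder label, never a real secret
    cached_rows: list = field(default_factory=list)

    def is_live(self) -> bool:
        # Simplification for this sketch: with no cached copy, every
        # visualization would have to query the data source directly.
        return len(self.cached_rows) == 0


imported = Dataset(
    name="Logistics",
    source=DataSource("local file", "logistics.pbix"),
    credential_ref="stored-with-dataset",
    cached_rows=[{"shipment": 1, "status": "delivered"}],
)
live = Dataset(
    name="Sales",
    source=DataSource("cloud database", "Azure SQL Database"),
    credential_ref="stored-with-dataset",
)

print(imported.is_live(), live.is_live())
```

The point of the sketch is that a dataset always refers to a data source, while the copied data is optional.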

Data refresh

If you store your files on a local drive or on a drive in your organization, you will likely need a Power BI gateway to refresh the dataset in Power BI. The computer on which the files are stored must be switched on when the refresh runs. You can also import the file again or use Publish from Excel or Power BI Desktop, but those options are not automated.

When you save your files to OneDrive for Business or to SharePoint team sites, you can connect them to Power BI or import them into Power BI, and your datasets, reports, and dashboards stay up to date. Because OneDrive and Power BI are both in the cloud, Power BI can connect directly to the saved file. It connects about once an hour to check for updates. If it finds any, the dataset and all visualizations are refreshed automatically.

Content packs from services are refreshed automatically, in most cases once a day. You can refresh them manually, but whether you see updated data depends on the service provider. Whether and how content packs from people in your organization are refreshed depends on the data sources used and on how the creator of the content pack set up the refresh.

Azure SQL Database, Azure Synapse Analytics (formerly SQL Data Warehouse), and Spark on Azure HDInsight are data sources in the cloud. Because the Power BI service is also in the cloud, Power BI can connect to these data sources live by using DirectQuery. What you see in Power BI is always in sync, and you don't have to set up a scheduled refresh.

When you connect to SQL Server Analysis Services from Power BI, the connection is live, just as it is for an Azure database in the cloud. The difference is that the database itself is on a server in your organization. This type of connection requires a Power BI gateway, which the IT department configures.

Data refresh is an extremely important aspect of Power BI and is too complex to cover in depth here. If you want a thorough understanding of this aspect, see Refreshing data in Power BI.

Considerations and Limitations

The following restrictions apply to all data sources used in the Power BI service. There are other restrictions that apply to certain features, but the following list applies to the entire Power BI service:

  • Dataset size limit: Datasets stored in shared capacity in the Power BI service are limited to 1 GB. If you need larger datasets, you can use Power BI Premium.

  • Distinct values in a column: When data is cached in a Power BI dataset (sometimes called "import mode"), a column can store at most 1,999,999,997 distinct values.

  • Row limit: When you use DirectQuery, Power BI places an upper limit on the query results sent to your underlying data source. If the query sent to the data source returns more than one million rows, an error is displayed and the query fails. Your underlying data can still contain more than one million rows. You are unlikely to hit this limit, because most reports aggregate the data into smaller result sets.

  • Column limit: No more than 16,000 columns are allowed across all tables in the dataset. This limit applies to the Power BI service and to datasets used in Power BI Desktop. Because of the way Power BI tracks the columns and tables in the dataset, the effective maximum is 16,000 columns minus one for each table in the dataset.
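The arithmetic behind these limits can be sketched in a few lines of Python. The constants come from the list above. The function names and the idea of checking a dataset's figures up front are assumptions made for this illustration, and Power BI does not expose such a check.

```python
# Limits as stated in the list above.
MAX_DATASET_GB_SHARED = 1.0          # shared capacity
MAX_DISTINCT_VALUES = 1_999_999_997  # per column, import mode
MAX_DIRECTQUERY_ROWS = 1_000_000     # rows returned by a single query
MAX_COLUMNS = 16_000                 # across all tables in the dataset


def max_columns_for(table_count: int) -> int:
    """Effective column maximum: 16,000 minus one for each table."""
    if table_count < 0:
        raise ValueError("table_count must be non-negative")
    return MAX_COLUMNS - table_count


def dataset_violations(size_gb: float, table_count: int,
                       column_count: int, largest_distinct_count: int) -> list[str]:
    """Return a description of every dataset-level limit that is exceeded."""
    problems = []
    if size_gb > MAX_DATASET_GB_SHARED:
        problems.append("dataset is larger than 1 GB (shared capacity)")
    if largest_distinct_count > MAX_DISTINCT_VALUES:
        problems.append("a column has too many distinct values")
    if column_count > max_columns_for(table_count):
        problems.append("too many columns for this number of tables")
    return problems


def directquery_result_allowed(row_count: int) -> bool:
    """A DirectQuery result fails only when it exceeds one million rows."""
    return row_count <= MAX_DIRECTQUERY_ROWS


print(max_columns_for(10))
print(dataset_violations(1.5, 10, 15_991, 1_000))
```

For example, a dataset with 10 tables can hold at most 15,990 columns under this reading, so the second call reports both a size problem and a column problem.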