How to Configure Incremental Refresh in Power BI?
In today’s digital landscape, online systems generate a large amount of data. When working in a team, it is common for the data to be updated frequently, say weekly, or daily, or even hourly. Loading the entire dataset from such data sources into a business intelligence tool like Power BI every day (or at regular intervals) to meet reporting requirements is time-consuming and inefficient.
What is Incremental Refresh?
Incremental refresh refers to the process of updating data only by loading new or changed data. This means that whenever new data is added to the data source, the incremental refresh feature fetches only the newly added data without reloading the entire dataset from the data source.
The Incremental refresh feature saves time and makes the process more efficient by only loading new data rather than the entire dataset. This ensures that the data remains up-to-date while minimizing the use of resources and making the data update process seamless.
Importance of incremental refresh
Incremental refresh is an essential component of any data system that handles large amounts of data, as it helps to improve efficiency and performance. Following are some key reasons why incremental refresh is important:
Refreshes are quicker
Incremental refresh allows you to update only the new or modified data in a database, significantly reducing the time required to load new data. This improves the overall refresh time and ensures that the data is up-to-date as quickly as possible.
Less resource utilization
Incremental refresh ensures that fewer resources are used to process new data which can help reduce the costs associated with data management.
Incremental refresh enables the system to handle increasing amounts of data more efficiently, making it highly scalable and suitable for long-term use. With the ability to only process new or updated data, the system can continue to function smoothly even as the volume of data grows, making it a smart and sustainable solution.
Before configuring incremental refresh in Power BI, it’s crucial to have a good understanding of query folding concept.
Query folding in Power BI refers to the process of converting a Power Query query into a single query statement for both retrieving and transforming the data. Through query folding, Power BI shifts the process of loading and transforming data from the Power Query to the underlying data source, which optimizes its performance.
Data sources that support query folding can be configured for incremental refresh in Power BI, while for those that do not, a scheduled refresh is the only way to go.
For example, most databases such as Access and SQL databases support query folding and can be configured for incremental refresh in Power BI, while Excel, which does not support query folding, is best suited for a scheduled refresh.
Configuring Incremental Refresh in Power BI
PowerBI allows for the integration of various data sources, and while the configuration process for incremental refresh may vary slightly depending on the source, the overall steps remain consistent.
As an example, we will demonstrate how to set up incremental refresh using a Microsoft Access database, however, the process can be applied to any data source that supports incremental refresh. The following are the steps to configure incremental refresh in Power BI:
Connect to a data source
Before setting up incremental refresh in Power BI, it is important to first establish a connection with a data source. In this tutorial, we will use a Microsoft Access database as an example. To follow along, please click here to download the sample database.
Once the database is downloaded, open Power BI, navigate to the Get Data button, and select Access Database as the source.
To connect to the database, navigate to the directory where the database (sample_superstore_data) is located, select the database and click Open.
Select Orders table, and then click Load. This will load the Orders table into PowerBI.
Define a unique key column
To configure incremental refresh, the dataset must have a unique key column that Power BI can use to identify new or updated rows of data. In most scenarios, a Date or DateTime column is used as the key column, as it makes it easy to identify if new data has been added to a table after a certain date range.
In the provided sample database, we have a unique column named Order Date that will be used to set up incremental refresh in Power BI.
Setting up Parameters:
Before setting up incremental refresh on the Orders table, it is necessary to establish two parameters for Power BI to recognize the configuration of the incremental refresh. To do this, go to the Home tab and click on Transform Data to open the Power Query Editor. Click on Manage Parameters to create the first parameter.
A pop-up window will appear when you click on Manage Parameters. In this window, click on New, give your parameter the name RangeStart, set the data type to Date/Time, and set the current value to the earliest date/time value in your data.
To create the second parameter, repeat the process of clicking on Manage Parameters and creating a new parameter. Give it the name RangeEnd, set the data type to “Date/Time” and set the current value to the date/time on which you are configuring the incremental refresh. In most cases, this is set to the latest date/time value.
Applying filter on the unique key column
With the parameters created, it is now time to use them to filter the unique key column, Order Date in this case.
To apply the filter on the Order Date column, click on the dropdown arrow next to the Order Date column, hover over Date/Time Filters, and select Custom Filter at the bottom of the list.
When you select Custom Filter, a pop-up window will appear. In this window, select the options similar to the ones shown in the provided image. This will filter the rows to include only those rows that fall within the specified date range, as defined by the parameters created earlier.
Once done with creating the filter, click Close & Apply at the top-right of the screen to apply all the changes made in the power query.
Setting up Incremental Refresh
With the ranges defined and the filter applied, it is now time to set up incremental refresh. To do this, go to the Data pane in Power BI, right-click on the Orders table, and select Incremental Refresh.
When you select Incremental Refresh, a pop-up window will appear. In this window, select the table that you want to set up for incremental refresh and click the button next to Incrementally refresh this table.
The Archive data option allows you to specify the amount of data that you want to retain in the dataset before it is removed. For example, if you have 10 years of data and you are only interested in the previous 5 years from the latest date, you can enter 5 in the blank next to Archive data.
The next option Incrementally refresh data starting allows you to specify the range of data that will be incrementally refreshed each time a refresh is run. For example, if you select 3 Months, each time you refresh the data, the reports and visualizations will include data only from the previous three months from the current date/time.
Refer to the provided image and select the same configuration as shown. The configuration options can be adjusted according to your specific requirements. Click Apply when done entering the configurations.
With this configuration, you have successfully set up incremental refresh in Power BI. Next, publish the data source to your online Power BI service. From now on, each time you run a refresh, it will fetch only the newly added data from the database, without importing all the data in the database.
Common Issues and Fixes
At times, incremental refresh can develop issues, therefore, it is good to have an understanding of the common issues and fixes in incremental refresh in Power BI. Following are some of the common issues that may arise in incremental refresh:
Conflict in data updates
Conflicts in the data source being updated from different sources is a common issues that can arise when setting incremental refresh up in Power BI. Conflicts can occur when incremental refresh is set up, and the data source is updated simultaneously from multiple sources. To resolve this issue, schedule the incremental refresh and other updating processes at different times to avoid conflicts.
Incorrect configuration of incremental refresh
Most of the issues with incremental refresh arises at the time of configuration, especially if you are new to incremental refresh in Power BI. These issues can occur when the ranges for the parameters have not been defined properly or when an incorrect column is used to perform an incremental refresh.
To make sure that incremental refresh is configured correctly, double-check all the important steps and options before running the refresh.
Issues with the data model
For an incremental refresh, the data model must remain consistent throughout the refresh cycle. If there is a change in the data model, the incremental refresh may no longer work properly. To avoid this issue, make sure that incremental refresh is implemented on a dataset that is not subject to change.