03e. Basket Analysis Wizard
Basket Analysis is a premium licence option that groups, ranks and transforms transactional data into a format that can be easily used to analyse spending and usage patterns across large data sets.
Items relating to a single transaction or event are grouped together and stored as a ‘basket’. Multiple baskets can be analysed and compared to identify frequently recurring items.
Without Basket Analysis enabled, a single basket is comprised of multiple items or instances which require individual rows. Using Basket Analysis, multiple instances appear on a single row and individual customer spending can be tracked by grouping multiple items in a single basket or transaction. This provides insight into what products are being purchased, when, in what quantity and in what combination.
To create a Basket Analysis Index, click Wizards, then Basket Analysis.
Click the New button to begin the process.
Creation Process
Details
Enter the name that will be given to the Data Source, Data Source Group and Index in the Name text box.
Use the Tags text box to add associated search terms to the Index structure. This allows the components to be searched using alternative strings when using the search bar.
To automatically restrict access to the Index once built, enable the Sensitive option. Only users with explicit permission will be able to access the Index.
The Index Method drop-down list specifies the build method to be used when building the Index.
The Complete option will result in all items in a data source being indexed when a refresh takes place. Once refreshed, a new version number is applied to the Index. Each run completely refreshes the Index.
Building an Incremental Index enables an optimised refresh process when rebuilding Indexes to account for any changes. Rather than rebuilding the entire Index, only data that has been modified or added will be processed. New and changed records are added as new rows. This maintains the history of a record that may have changed over time, and no de-duplication is applied. When selected, the Incremental Identifier drop-down list is revealed. This is required to identify changes, using the selected field to detect when new data is greater than the previous highest value.
The Cumulative option works the same as an Incremental build, but replaces duplicates as well as adding new data. Duplicate records are removed from the result set based on the Increment Identifier and Primary Key. When cumulatively building from CSV files, only a Primary Key is required. For other database types, an Incremental Identifier and a Primary Key are required.
Use the Index Size drop-down list to specify the expected number of records the resulting Index will contain. This is an indicative value and does not need to be exact.
Using this drop-down list allows the system to work out how many folders to split the Index into. The higher the specified Index size, the less folders created. A lower number of folders results in less threads used to query the Index, as a single thread is allocated per folder. This decreases the speed of individual queries, but reduces the performance impact of multiple concurrent users querying the system simultaneously.
Use the Add to Search Engine drop-down list to select the relevant search engine the Index will be added to upon completion and the Add to Collection drop-down list to select an existing Collection to which the Index will be added.
Enable the Build Now option to build the Index as soon as the creation process is complete. If disabled, the settings are saved for the Index to be built at a later time.
General
Select the source Index from the Index dropdown list. If only a selection of the Index is required, use the Saved Queries dropdown list to select a previously saved query.
Product Details
The Product Details tab allows the details to be entered that will define how the basket is compiled.
The dropdown lists are ordered by level of granularity, with the bottom option allowing the selection of multiple items and the above options available to detail the various categories they are a part of.
Chronological Details
The Transaction Time Stamp is the time stamp associated with the Index, such as a time of purchase. This should typically be set to the date field in the Index.
Specify how many months the analysis will reference back to using the Months textbox.
Transaction Details
Set the Basket Identified dropdown list to a value that is unique to a transaction, allowing the subsequent data for the specific transaction to be assigned.
The Loyalty Card option assigns the Basket Identifier to a particular customer. Typically, a loyalty card number provides the unique identifier that allows the purchases made by individual customers to be tracked.
The Sale Price field should be assigned to the sale price field in the Index. Combined with the Loyalty Card information, the basket analysis will provide detailed information regarding purchases that can then be grouped by customer.
Assign the ‘quantity’ or ‘stock’ equivalent field to the Quantity field, and the cost of the stock before it is sold to the Cost Price field. This enables direct comparisons to the Sale Price to discover overall profit margins.
The Discount Amount field allows the discount data to be mapped to a field, which can then be applied to the final build.
If present in the data source, transactions containing vouchers can also be included by mapping the voucher field to the Voucher Price option.
If the profit is also included in the data source, it can be mapped using the Profit option.
Aggregation Details
This tab contains a number of Aggregation Types that can be applied to the source data. The relevant field is automatically detected based on the aggregation chosen and four new fields are created for each aggregation applied: Rank, Banding, Percentile and Percentile Value.
The Average function adds all of the values together and divides the total by the number of values, while the Count function counts the amount of sales or transactions that have been made in a single basket, excluding all NULL values. The Count All function performs the same action, but includes all NULL values.
The Maximum function displays the highest value within the sales or transactional field and the Minimum function displays the lowest value in the sales or transactional field.
Using the Standard Deviation function analyses the spread of data to discover the values that are closely weighted towards the average. If a high frequency of values are close to the average, the standard deviation will be small. If the values are spread across a relatively wide range of values, the standard deviation will be large.
The Sum function totals the sales and transactional values within a single basket.
By setting a frequency from the available Aggregation Levels, the configuration of how the data is analysed from one point in time to another can be established.
These options build two additional rows for each field, with the selection and a summary of that selection. For example, if the Basket Analysis is built using the Months option, there will be Months and Months Summary fields for each line of data.
The Summary row will contain a Pattern field, where the start and end dates are analysed for a pattern in which users can pinpoint the point in time when the data was entered.
Select a field from the Custom Aggregation Level drop-down list to add an additional field containing the records, allowing customised aggregation fields to be added into the Index.
The Rolling Period in Weeks option, when a value is entered, will create an additional Index titled Banding and Ranking Rolling being created alongside the Basket Analysis Index, allowing a customisable period to be analysed.
Optional Details
The Basket Analysis Extra Fields drop-down list allows additional fields to be added that are not included in the standard setup.
Enable the Run Banding and Ranking option to allow analysis of customer shopping frequency and select additional fields to be analysed from the Banding and Ranking Extra Fields drop-down list. These options will result in an additional Index titled Banding and Ranking being created alongside the Basket Analysis Index.
The Run from Start option, when enabled, will result in a complete Index build when refreshed. This accounts for any amendments made to historical data, ensuring they are reflected in created totals. If not enabled, an incremental Index build is completed when refreshed, only amending new data to the Index.
To complete the setup process, click Create. To discard any changes made, click Cancel.