Model Sanitization Guide

andy_tarr

Introduction

The Anaplan Security team’s policy mandates that all models undergoing model concurrency testing must be sanitized. This provides additional protection for customers on top of the secure performance test environment. Model sanitization means changing the data in a model to values that do not allow the identification of a company, any persons, any precise locations, any company plans, or any other sensitive data.

We advise customers to make a copy of their model, sanitize it, and provide the model concurrency team with access to the model. We can then import this sanitized copy of the model into our performance test environment. Alternatively, if there is insufficient workspace capacity to enable a model copy, L3 Support can assist by providing an isolated workspace to carry out sanitization.

Purpose

This document informs model builders and customers of the model sanitization requirements and describes some approaches to meeting them. Script development does not start until a model has been imported into the performance test environment.

Sanitization priority list

We understand that customers may not be able to sanitize all data because of time or effort constraints. We recommend that all data be sanitized, but the table below lists which data can remain unchanged.

[Image: Model sanitization.png]

 

Sanitization guide

1. Company name(s)—Mandatory

The company names can often be found in the name of the model. When you make a copy of your model, please rename it to an easily identifiable name of "Performance Model." This also ensures that our team imports the correct model.

 

2. Other company name(s)—Mandatory

This type of data is usually within the model itself. Check the general lists for organizations, key suppliers, clients, or distributors, etc. For this data, we recommend that these values are either converted into a modified version of “Numbered List” (refer to Sanitization Techniques section), scrambled, or replaced with random alpha-numeric strings. Using numbers instead is inadvisable because it reduces legibility of your model. Any instances of names on dashboards should also be amended appropriately.
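As a rough illustration of the "random alpha-numeric strings" option, the sketch below builds a replacement for each distinct name so that the same original name always maps to the same token. It runs outside Anaplan on an exported list; the supplier names, the token length, and the function names are all hypothetical.

```python
import secrets
import string

# Characters used for replacement tokens (an assumption; any alphanumeric set works).
ALPHABET = string.ascii_uppercase + string.digits


def random_token(length=8):
    """Return a random alphanumeric string of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_replacements(names, length=8):
    """Map each distinct original name to its own unique random token."""
    mapping = {}
    used = set()
    for name in names:
        if name in mapping:
            continue  # repeated names reuse the token already assigned
        token = random_token(length)
        while token in used:  # regenerate on the unlikely event of a collision
            token = random_token(length)
        used.add(token)
        mapping[name] = token
    return mapping


# Hypothetical supplier names as they might appear in an exported list.
suppliers = ["Acme Ltd", "Globex", "Acme Ltd"]
mapping = build_replacements(suppliers)
sanitized = [mapping[name] for name in suppliers]
```

Keeping the mapping consistent means that references to the same organization still line up after sanitization, which helps keep the model legible.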

 

3. Financial data—Mandatory

Financial data would usually be calculated values from several modules. Check the blueprint view of Revenue and Expenses to get a list of all originating modules. Scrambling the related number values in these modules will prevent people from guessing the size of your company and market performance. It is sufficient to make these number values all the same. However, it is inadvisable to just replace these with 0s. Remember that the goal of model concurrency testing is to provide information on performance during concurrent usage—realistic cell calculations should be exercised. Please refer to the Sanitization Techniques section for best practices.
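One possible way to prepare scrambled figures outside Anaplan is sketched below: each value is replaced with a random non-zero number that does not depend on the original. The value range, the function name, and the sample figures are assumptions chosen for illustration only.

```python
import random


def scramble_values(values, low=1000.0, high=9999.0, seed=None):
    """Replace each number with a random non-zero value.

    The originals are only counted, never read, so nothing about the
    real figures can be inferred from the output.
    """
    rng = random.Random(seed)
    return [round(rng.uniform(low, high), 2) for _ in values]


# Hypothetical revenue figures from an exported module.
revenue = [1250000.0, 980000.0, 1430000.0]
scrambled = scramble_values(revenue)
```

Because the replacements are never zero, the dependent calculations in the model are still exercised during the concurrency test.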

 

4. Real person name(s)—Mandatory

People names can often be used in employee lists, teams lists, and similar. Changing the names to a modified version of the “Numbered List” will help replace the actual names (refer to Sanitization Techniques section). Another option would be to just replace the surname column values with random strings in the list. The surname need not be unique depending on usage within the model. In such cases, you can copy/paste one value for all surnames instead. If you would like a list of randomly generated names, please contact Model.Concurrency@anaplan.com in advance and we can provide this as a CSV or spreadsheet document.

 

5. Locations—Optional

This is most likely to be found in the same lists as the above "other company name(s)." In addition, lists such as sales offices, assets, and expansion plans should also be checked. Addresses are usually contained in simple text fields. Replacing them with any random string of approximately 100 characters should be sufficient. Each address need not be unique and it would be sufficient to copy/paste the same random string.
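A minimal sketch of generating one such placeholder and reusing it for every address follows. The character set and the sample addresses are hypothetical.

```python
import secrets
import string


def random_address(length=100):
    """Return a random string of letters, digits, and spaces."""
    alphabet = string.ascii_letters + string.digits + " "
    return "".join(secrets.choice(alphabet) for _ in range(length))


# One placeholder is generated and copied to every address field.
placeholder = random_address()
addresses = ["1 Example Street", "2 Sample Road"]  # hypothetical values
sanitized_addresses = [placeholder for _ in addresses]
```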

 

6. Products—Optional

Product brands and product names can make it easy to identify the industry your company belongs to, and even to guess your company if the product range is very limited. These names can be replaced with numbered lists. Another quick way of sanitizing this data is to export the list and run a find-and-replace on the common brand names. Save the document as a CSV file and run an import, taking care to select the correct unique identifier setting. If the products in question are generic, there is no need to sanitize them.
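The find-and-replace step on an exported list could be scripted along the following lines. This is a sketch only: the column layout, the brand name "Contoso", and the function name are invented for the example, and a spreadsheet application's find-and-replace does the same job.

```python
import csv
import io


def replace_brands(csv_text, replacements):
    """Replace each brand name wherever it occurs in any cell of a CSV export."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        new_row = []
        for cell in row:
            for brand, substitute in replacements.items():
                cell = cell.replace(brand, substitute)
            new_row.append(cell)
        writer.writerow(new_row)
    return out.getvalue()


# Hypothetical export: the Code column is left untouched so it can still
# serve as the unique identifier when the file is imported again.
exported = "Code,Name\nP1,Contoso Cola 330ml\nP2,Contoso Lemonade 1L\n"
sanitized_csv = replace_brands(exported, {"Contoso": "Brand A"})
```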

 

7. Services—Optional

This type of data is similar to the above "Products" and the approach to sanitization is also the same. This has a low sanitization priority as this data tends to not contain any identifiable branding.

 

Sanitization techniques

A workspace admin can use the following steps to sanitize list items:

Modified numbered lists—Recommended

This is the quickest way to sanitize list items representing sensitive data such as employee lists, accounts, etc. The following steps will convert proper names into an alpha-numeric string format.

  1. Go to Settings > General Lists and identify the list you want to sanitize.
  2. Scroll right until you’ve reached the “Numbered” column.
  3. Tick then un-tick the box. This should change the names of the items on that list to the default IDs that the model associates with the items.
  4. Now go to Settings > General Lists > [the list you’re sanitizing] > Properties tab.
  5. Insert a new property and name it "Sanitized Display Name."
  6. Change the format to “Text”.
  7. Enter the following formula in the "Formula" column: "ListName" & NAME(ITEM('[list name]')), where ListName is the prefix you want to give to the list item names and [list name] is the actual name of the list you’re modifying.
  8. Go back to Settings > General Lists > [the list you’re sanitizing] and scroll right to the "Numbered" column.
  9. Tick the box to activate numbered lists.
  10. Then select "Sanitized Display Name" under “Display Name Property” column.

 

Temporary hardcoding values—Recommended

Modules used for collecting user data input, such as actual sales figures, forecasted sales values, salaries, etc. can be quickly sanitized using the following steps:

  1. From the Settings tab, open the module used for data input.
  2. Determine which line item is used for inputting the data that needs to be sanitized. Select any cell under that line item.
  3. Go to its formula bar and enter a random number. This should change the values of all the cells associated with that line item.
  4. Delete the random number from the formula bar.

 

Direct copy and paste

If you have only one column of data in a module to be sanitized, you should be able to copy and paste directly from a spreadsheet or text application. There may be times when this is not possible because cells require values from a "List Selection." To overcome this obstacle, you can use the Export and Import functionality.

 

Bulk sanitization

The Export and Import functionality is also a convenient way of sanitizing several columns of data in one action. This method would also typically reduce human error.
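As an illustration of sanitizing several columns of an exported file in one pass before importing it again, the sketch below applies one rule per column. The column names, the rules, and the sample rows are hypothetical, and the script works on the CSV file only; it does not call any Anaplan API.

```python
import csv
import io
import random


def sanitize_columns(csv_text, column_rules):
    """Apply a rule to each listed column; all other columns pass through unchanged.

    Each rule receives the 1-based row number and the original cell value.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for index, row in enumerate(reader, start=1):
        for column, rule in column_rules.items():
            row[column] = rule(index, row[column])
        writer.writerow(row)
    return out.getvalue()


rng = random.Random()
rules = {
    # Surnames become a prefix plus the row number.
    "Surname": lambda index, old: f"Surname{index}",
    # Salaries become random non-zero integers unrelated to the originals.
    "Salary": lambda index, old: str(rng.randint(20000, 90000)),
}

# Hypothetical export of an employee module.
exported = "Code,Surname,Salary\nE1,Smith,51000\nE2,Jones,64000\n"
sanitized_csv = sanitize_columns(exported, rules)
```

Defining the rules once and applying them to every row is what reduces the scope for manual mistakes compared with editing cells by hand.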

 

Tip: Mistakes and errors

If you make a mistake while sanitizing your data, you can use the Model Restore functionality, found in the Settings tab > History, to return the model to an earlier state.