How To Find And Fix PII In Google Analytics 4 Data

Find PII in Google Analytics 4 with a custom Explorations report, then remove it in Google Tag Manager using a Custom JavaScript variable and tag configuration.

By: Mussarat Nosheen | 3 mins read
Published: Nov 7, 2023 2:26:27 AM | Updated: Apr 26, 2024 03:56:20 AM

What is PII?

Personally Identifiable Information (PII) is any information that can be used to identify a specific individual.

It includes information such as:

  • Name
  • Email
  • Location
  • Social Security number

How Do You Pass on PII to GA4?

When visitors land on your website and run a search or submit a form, their PII risks being shared with GA4.

This happens because forms typically contain PII such as first and last names and contact details.

When a form is submitted with the GET method, the field values are sent to the server appended to the request URL as query parameters.

This creates user data privacy issues, so you need to take steps to prevent it.

But what is a query parameter?

A query parameter is a key=value pair that comes after the question mark in a URL.

In the examples here,

example.com?email=hello@world.com

or,

example.com?firstname=john&lastname=doe

everything after the question mark is the query string, made up of one or more query parameters.
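To make the structure concrete, here is a minimal sketch that lists the query parameters of a URL with the standard URL API. The function name listQueryParameters is made up for illustration, and the https:// prefix is added to the example URLs only because the URL constructor needs a scheme.

```javascript
// Hypothetical helper: returns each query parameter of a URL as a { key, value } pair.
function listQueryParameters(rawUrl) {
  const url = new URL(rawUrl);
  // url.searchParams covers everything after the "?" (and before any "#fragment").
  return Array.from(url.searchParams.entries()).map(([key, value]) => ({ key, value }));
}

console.log(listQueryParameters('https://example.com?email=hello@world.com'));
// → [ { key: 'email', value: 'hello@world.com' } ]

console.log(listQueryParameters('https://example.com?firstname=john&lastname=doe'));
// → [ { key: 'firstname', value: 'john' }, { key: 'lastname', value: 'doe' } ]
```

Each of these key/value pairs travels with the page URL, which is why it can end up in GA4.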

So, unless the form submission redirects to a confirmation page whose URL does not carry the parameters, all the user information in the form passes on to GA4.

Similarly, if you are using GA4 to analyze your website data without any precautionary measures, PII does get shared with Google.

Collecting and passing on visitor PII to analytics tools or search engines has both moral and legal implications.

So, in this blog, we will show you how to find the PII your website is collecting and ways to remove it.

How to Find PII in Google Analytics 4?

Before you embark on the journey to clean your data, you need to identify the PII collected by your website.

Custom Explorations Report 

Create a custom report in Explorations to find out which PII you are currently collecting.

You can do so by following the steps below. 

  • Go to Explorations and click Blank to start a new report
  • In the Dimensions section, import Page path + query string or Page location
  • In the Metrics section, import Views
  • Then, double-click the dimension and the metric to add them to Rows and Values
  • Go to Filters, add Page path + query string in the Conditions tab, choose contains, and type "?"
  • Go to Show rows and select 250 to see more rows
  • The resulting rows list the URLs that contain a query parameter
  • Adjust the filter condition on the same dimension (for example, contains "firstname" or "email") to surface PII such as first and last names
  • Note down the parameters containing PII
  • After you implement the fix described later in this post, test the setup by loading a website URL that contains the personally identifiable information you wish to remove
    • If correctly implemented, the Page URL Without PII variable returns the URL without that data
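If you export the rows from the exploration, the manual filtering above can be approximated with a short script. This is a rough sketch under stated assumptions: the exportedRows values are made-up sample data, SUSPECT_KEYS is a hypothetical starting list, and the email pattern is deliberately loose rather than a full validator.

```javascript
// Hypothetical list of key names that often carry PII; adjust it to your own forms.
const SUSPECT_KEYS = ['email', 'firstname', 'lastname', 'first_name', 'last_name', 'phone'];
// Loose "looks like an email address" pattern, not a complete validator.
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/;

// Counts how often each suspicious query parameter key appears in a list of page paths.
function findSuspectParameters(pagePaths) {
  const counts = {};
  for (const path of pagePaths) {
    // The base URL is only a placeholder so that relative page paths can be parsed.
    const url = new URL(path, 'https://example.com');
    for (const [key, value] of url.searchParams) {
      const suspicious = SUSPECT_KEYS.includes(key.toLowerCase()) || EMAIL_PATTERN.test(value);
      if (suspicious) counts[key] = (counts[key] || 0) + 1;
    }
  }
  return counts;
}

// Made-up sample of "Page path + query string" values.
const exportedRows = [
  '/thank-you?email=jane%40example.com&plan=pro',
  '/search?q=shoes',
  '/signup?firstname=john&lastname=doe',
  '/contact?reply_to=sam%40example.org',
];

console.log(findSuspectParameters(exportedRows));
// → { email: 1, firstname: 1, lastname: 1, reply_to: 1 }
```

The keys it reports are candidates for the list of parameters to remove; review them by hand, since a pattern match alone can miss PII or flag harmless values.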

How to Remove PII in Google Analytics 4?

Now that you know which PII needs to be removed, it is time to clean the data you track and share with GA4.

Custom JavaScript Code

You can remove personally identifiable information via a Custom JavaScript variable. Follow the steps below to do so.

  • Go to Google Tag Manager > Variables, click New > Variable Configuration > Custom JavaScript
  • When the Variable Configuration window opens, rename the variable to Page URL Without PII, paste the code below in the code block, then hit Save

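function() {
  // Query parameter keys to strip from the URL; edit this list to match your site.
  var blocklist = 'email,address_line_1,address_line_2,city,state,zip_code,full_name,first_name,last_name,phone_number,postcode'.split(',');
  // Leave empty to drop the parameter entirely, or set a placeholder to keep the key and mask the value.
  var replaceWith = '';

  var url = location.href;
  // Match every ?key=value or &key=value pair and drop (or mask) the blocklisted keys.
  var sanitizedUrl = url.replace(/((\?)|&)([^#&=]+)(?:=([^#&]*))?/g,
    function(input, delim, qmark, key, val) {
      if (-1 === blocklist.indexOf(key))
        return input;
      else
        return replaceWith ? delim + key + '=' + replaceWith : qmark || '';
    }).replace(/\?&*$|(\?)&+/, '$1'); // Tidy up a dangling "?" or "?&" left after removal.

  return sanitizedUrl;
}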

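  • Edit the blocklist so that it matches the query parameter keys you noted down earlier. Keys are matched exactly, so add every variant your site uses (for example, both firstname and first_name).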
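Before publishing, it can help to try the same logic outside GTM, for example in the browser console or Node. The sketch below wraps the regular expressions from the variable above in a named function that takes the URL as an argument; the name sanitizeUrl, the shortened BLOCKLIST, and the sample URLs are assumptions made for this illustration.

```javascript
// Shortened, hypothetical blocklist for the demo; use your own keys in practice.
const BLOCKLIST = ['email', 'first_name', 'last_name'];

// Same replacement logic as the GTM variable, but with the URL passed in
// so it can be exercised without a live page.
function sanitizeUrl(url, blocklist, replaceWith) {
  return url
    .replace(/((\?)|&)([^#&=]+)(?:=([^#&]*))?/g, function (input, delim, qmark, key) {
      if (blocklist.indexOf(key) === -1) return input; // keep parameters that are not blocklisted
      return replaceWith ? delim + key + '=' + replaceWith : qmark || '';
    })
    .replace(/\?&*$|(\?)&+/, '$1'); // tidy a dangling "?" or "?&"
}

console.log(sanitizeUrl('https://example.com/thanks?email=hello@world.com', BLOCKLIST));
// → https://example.com/thanks

console.log(sanitizeUrl('https://example.com/thanks?first_name=john&plan=pro&last_name=doe', BLOCKLIST));
// → https://example.com/thanks?plan=pro

console.log(sanitizeUrl('https://example.com/thanks?email=a@b.co&plan=pro', BLOCKLIST, '[REDACTED]'));
// → https://example.com/thanks?email=[REDACTED]&plan=pro
```

Note that matching is exact and case-sensitive: a URL with firstname=john passes through untouched unless firstname is in the blocklist.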

Google Tag Manager

Follow the steps below to send the sanitized URL to GA4 instead of the raw one.

  • Go to Google Analytics 4 > Admin > Data Collection > Data Streams, click Web, open Web stream details, and copy the Measurement ID.
  • Go to Google Tag Manager > Workspace > Tags, click New > Tag Configuration > Choose tag type > Google Analytics: GA4 Configuration, and paste the Measurement ID from GA4 (newer GTM containers list this tag type as Google tag, where the equivalent section is Configuration settings).
  • Click Triggering, select All Pages in Choose a trigger, and name the tag GA4 Config.
  • In Tag Configuration, scroll down to Fields to Set, click Add Row, enter the Field Name “page_location” and the Value “{{Page URL Without PII}}”, then hit Save and Preview.
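For readers who want to see what the page_location field amounts to in code, here is a rough gtag.js-style sketch of the same override. It is an illustration only, not part of the GTM setup: the measurement ID is a placeholder, configWithCleanLocation is a hypothetical helper, and the dataLayer and gtag definitions mirror the standard snippet so the example can also run outside a browser.

```javascript
// Use window in a browser and globalThis elsewhere, so the sketch also runs in Node.
const root = typeof window !== 'undefined' ? window : globalThis;
root.dataLayer = root.dataLayer || [];

// Standard gtag.js pattern: each call is queued on the dataLayer array.
function gtag() {
  root.dataLayer.push(arguments);
}

// Hypothetical helper: configures GA4 with an already sanitized page URL.
function configWithCleanLocation(measurementId, cleanUrl) {
  // page_location overrides the URL that GA4 would otherwise read from the page.
  gtag('config', measurementId, { page_location: cleanUrl });
}

// 'G-XXXXXXXXXX' is a placeholder; the URL stands in for the sanitized value.
configWithCleanLocation('G-XXXXXXXXXX', 'https://example.com/thanks?plan=pro');
```

In the GTM setup, the tag's page_location field plays the role of the third argument here, with the Page URL Without PII variable supplying the cleaned URL.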

Data Redaction

Google Analytics 4 recently introduced another feature to address customer privacy issues, called Data redaction.

It prevents PII from being collected and passed on to GA4 at the client-side data collection stage. 

However, redaction is no replacement for excluding the URL query parameters in GTM as described above; it only serves as an add-on to help keep customer data safe.

Implement the data redaction by following the steps below. 

  • Go to Google Analytics 4 > Admin > Data Collection > Data Streams, click Web, open Web stream details, and scroll down to Redact data.
  • Toggle on Email and URL query parameters, and enter the query parameter keys you want redacted.
  • Finally, test your implementation under Test data redaction: type a URL with the query parameter in the test box and check the redacted version in the box on the right.

Conclusion

Personally identifiable information needs to be protected to ensure customer data privacy and legal compliance.

To remove PII, first find out what you are already collecting via the custom Explorations report.

Then remove it by implementing a Custom JavaScript variable and tag configuration in Google Tag Manager.

For additional security, enable Data Redaction for client-side redaction at the data collection stage.