Google Analytics: Site Search and Capital Letters
The site search functionality on a website is a powerful source of information about what users are looking for. It’s the place where visitors actually use their keyboard to tell you what they’re trying to do. I’ve addressed site search and analytics previously, in “Integrating Google Analytics with Site Search to Drive Conversions.” But I recently came across a scenario that prompted me to make an additional configuration change to a client’s Google Analytics setup to improve the usefulness of this reporting.
Users Capitalize Searches in Unpredictable Ways
As an example, let’s say we have a website that sells, among other things, some sort of bags. We might look at the “Search Terms” report, found under Content > Site Search > Search Terms, to find the most popular searches on the site.
“Bags” shows up seventh on the list, with 114 unique searches. So, is that how many times people searched for “bags”? Actually, it’s not. It’s simply the number of times people searched for “Bags” — the capitalized form of the word.
If we filter the report for “bags” — or, actually, for a regular expression of “^bags$” — we see a surprising list.
The filter we applied to the report returns all combinations of the word, regardless of whether letters were uppercase or lowercase. But they still appear on different rows, because Google Analytics treats them as different terms: case matters in Google Analytics.
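To make the distinction concrete, here is a minimal Python sketch of the same idea. It assumes the report filter matches case-insensitively, as the result described above suggests. The list of rows is hypothetical and only stands in for what a Search Terms report might contain.

```python
import re

# Hypothetical search terms, each of which would be its own row in the report.
rows = ["Bags", "bags", "BAGS", "BaGs", "handbags", "bags sale"]

# Anchored, case-insensitive match, similar in spirit to filtering on ^bags$.
pattern = re.compile(r"^bags$", re.IGNORECASE)

matching_rows = [term for term in rows if pattern.search(term)]
print(matching_rows)  # ['Bags', 'bags', 'BAGS', 'BaGs']

# All four rows match the pattern, but they are still four distinct strings,
# which is why they are reported on separate rows.
print(len(set(matching_rows)))  # 4
```

The anchors keep “handbags” and “bags sale” out of the result, while the case-insensitive flag lets every capitalization of “bags” through.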
We now see that there were actually 201 unique searches for “bags” across all capitalization combinations. The last four variations account for a small percentage of the total (this example uses real data, and some visitors did search with these oddball capitalizations), but the 80 lowercase “bags” searches matter a great deal. Because our initial view of the report counted only the capitalized “Bags” row, it ranked the term as the seventh most popular, when in reality it was the most popular search.
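The arithmetic behind that total can be sketched in a few lines of Python. The counts of 114 and 80 come from the report described above. How the remaining 7 searches are split across the four other variants is an assumption made purely for illustration.

```python
from collections import Counter

# Unique searches per report row. 114 and 80 are from the report;
# the split of the remaining 7 across four variants is assumed.
report_rows = {
    "Bags": 114,
    "bags": 80,
    "BAGS": 4,
    "BAgs": 1,
    "BaGs": 1,
    "baGs": 1,
}

# Merge rows that differ only by capitalization.
merged = Counter()
for term, unique_searches in report_rows.items():
    merged[term.lower()] += unique_searches

print(merged["bags"])  # 201
```

Lowercasing the term before counting is the whole trick: six rows collapse into one, and the combined count is what should be compared against the other search terms.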
A Relatively Simple Fix
In most cases, we don’t really care about the exact capitalization of searches on our sites, and most site search engines themselves actually aren’t case-sensitive. So, to shorten the list of “real” site search terms used on our site, we can tell Google Analytics to make all search terms lowercase before processing them.
This is a change that happens at the Google Analytics profile level through the use of a profile filter. Implementing this filter is straightforward, but it does require administrative access to the profile.
Start by clicking the Admin link at the top of any page inside Google Analytics. Then, in the first column of the Admin screen, select “All Filters.”
At the top of the list that appears, select “New Filter.”
Give the filter a descriptive name, such as “Convert Site Search Terms to Lowercase.” Then select “Custom filter” and “Lowercase,” and choose “Search Term” from the “Filter Field” dropdown.
In the left box at the bottom of the screen, select each profile to which you want to apply this filter and click “Add” to move those profiles into the box at the right.
Click “Save” and you’re done. From this point forward, on-site searches for “Bags,” “bags,” “BAgs,” “BaGs,” “baGs,” and “BAGS” will all get grouped into a single row in the Search Terms report: “bags.”
This Filter Is Not Retroactive
Profile filters are applied to the data almost as soon as Google Analytics receives it, while the hit is still in fairly raw form. A filter transforms the data before it is fully loaded into Google Analytics for analysis. This means that, unlike an advanced segment, a filter only affects data collected after it was created and applied to the profile. So, once you’ve applied the filter, you will still see mixed-case results when looking at data captured before that point.
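The following Python sketch is a toy model of that behavior, not a description of how Google Analytics is implemented. The names `process_hit`, `stored_terms`, and `active_filters` are hypothetical. The point it illustrates is that a filter runs when a hit is processed, so anything stored earlier is left as it was.

```python
from collections import Counter

def process_hit(search_term, filters):
    """Apply each active filter to a raw search term before it is stored."""
    for apply_filter in filters:
        search_term = apply_filter(search_term)
    return search_term

stored_terms = []    # stands in for already-processed report data
active_filters = []  # no filters on the profile yet

# Hits received before the filter exists are stored exactly as typed.
for raw in ["Bags", "bags", "BAGS"]:
    stored_terms.append(process_hit(raw, active_filters))

# The lowercase filter is added. Nothing already stored is rewritten.
active_filters.append(str.lower)

# Hits received afterwards are lowercased before being stored.
for raw in ["Bags", "BaGs", "bags"]:
    stored_terms.append(process_hit(raw, active_filters))

report = Counter(stored_terms)
print(report["bags"])  # 4
print(report["Bags"])  # 1
print(report["BAGS"])  # 1
```

In this model, a report covering both periods still shows the older “Bags” and “BAGS” rows alongside the merged “bags” row, which matches what you should expect to see for date ranges that straddle the filter’s creation.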
Going forward, though, the list of unique search terms in your site search reports will be shorter, and you will be less likely to miss a popular term that is commonly capitalized in different ways.