Search Event Data Collection – Progress So Far


More than two years ago, we open-sourced our search-collector, a lightweight JavaScript SDK that lets you collect search KPIs from your e-commerce website. This post illustrates our progress with search event data collection to date. Since launch, our search event collector has gathered close to a billion events while preserving user privacy: the SDK does not track any personally identifiable information and uses no fingerprinting or related techniques. Its sole focus is to record search events and pass them to an endpoint.

Why did we create the search collector?

One may argue that Google Analytics provides everything you need. However, once you dive deeper into site search analytics, its deficiencies become apparent.

  • Google Analytics runs on sampled data. As a result, it does not paint an accurate picture.
  • It’s not possible to implement certain KPIs, for example product click tracking per keyword.
  • Google Analytics is often not configured optimally within the web-shop, and fixing it requires engineering resources that are rarely available.

These types of scenarios led to the birth of the search event collector. As we do not want to impose a particular type of configuration, we structured the collector as an SDK. This strategy gives every team the flexibility to assemble a unique search metric collection solution fit for a particular purpose.

How does search-collector work?

Search-collector has two key concepts:

Collectors

Collectors are simple JavaScript classes that attach to a DOM element [1], e.g. the search box. When an event of interest happens at that DOM element, the collector reacts, packages the relevant data, and passes it on to a Writer (see below).

We provide many out-of-the-box collectors that help track events like typing, searches, refinements, add-to-basket actions and more.
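
To make the Collector idea concrete, here is a minimal sketch. It does not use the search-collector API: the class names (`ArrayWriter`, `SearchSubmitCollector`), the `attach` method and the record shape are assumptions made up for illustration, and a plain `EventTarget` stands in for the search box so the snippet also runs outside a browser.

```javascript
// Hypothetical names throughout; this is NOT the search-collector API.

// A writer is anything with a write(data) method. This one just keeps records in memory.
class ArrayWriter {
  constructor() {
    this.records = [];
  }
  write(data) {
    this.records.push(data);
  }
}

// A collector listens on one event target and forwards packaged data to a writer.
class SearchSubmitCollector {
  constructor(target, eventName, readQuery) {
    this.target = target;       // in a browser: e.g. the search form element
    this.eventName = eventName; // e.g. "submit"
    this.readQuery = readQuery; // callback that returns the current query text
  }

  attach(writer) {
    this.target.addEventListener(this.eventName, () => {
      writer.write({
        type: "search",
        query: this.readQuery().trim(),
        timestamp: Date.now(),
      });
    });
  }
}

// Stand-in for a DOM element so the sketch runs outside a browser.
const searchBox = new EventTarget();
const currentQuery = "  red shoes ";

const writer = new ArrayWriter();
new SearchSubmitCollector(searchBox, "submit", () => currentQuery).attach(writer);

// Simulate the shopper submitting the search.
searchBox.dispatchEvent(new Event("submit"));
console.log(writer.records[0].query); // "red shoes"
```

In a real page, the target would be an actual element and `readQuery` would read the input's value; everything else stays the same.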

Writers

Writers receive data from the collectors and deliver it to a storage location. Chaining Writers together provides separation of concerns (SoC) [2] and makes them reusable. For example, we offer a BufferingWriter whose only role is to buffer the incoming data for a certain amount of time before sending the package on to the endpoint. This prevents an HTTP request from firing on each keypress within the search box.
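
The chaining idea can be sketched as follows. This is an illustration under assumptions, not the SDK's own BufferingWriter: the class names, the `maxSize` and `delayMs` options, and the `flush` method are hypothetical. The buffering writer holds events and hands them to the next writer in the chain as one batch, either when the buffer fills up or when a timer fires.

```javascript
// Hypothetical sketch; the SDK's own BufferingWriter may work differently.

// End of the chain: stands in for a writer that would send a batch to an endpoint.
class CollectingWriter {
  constructor() {
    this.batches = [];
  }
  write(batch) {
    this.batches.push(batch);
  }
}

// Buffers single events and forwards them to the delegate writer in batches.
class BufferingWriterSketch {
  constructor(delegate, { maxSize = 10, delayMs = 1000 } = {}) {
    this.delegate = delegate;
    this.maxSize = maxSize;
    this.delayMs = delayMs;
    this.buffer = [];
    this.timer = null;
  }

  write(data) {
    this.buffer.push(data);
    if (this.buffer.length >= this.maxSize) {
      this.flush(); // buffer is full: send right away
    } else if (this.timer === null) {
      // first event of a new batch: schedule a time-based flush
      this.timer = setTimeout(() => this.flush(), this.delayMs);
    }
  }

  flush() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.delegate.write(batch);
  }
}

const endpoint = new CollectingWriter();
const buffered = new BufferingWriterSketch(endpoint, { maxSize: 3, delayMs: 500 });

// Four keypresses result in two batches instead of four requests.
for (const value of ["s", "sh", "sho", "shoe"]) {
  buffered.write({ type: "keypress", value });
}
buffered.flush(); // e.g. on page unload, send whatever is left
console.log(endpoint.batches.map((batch) => batch.length)); // batch sizes: 3, then 1
```

Because the buffering writer only knows that its delegate has a `write` method, the same class can sit in front of a REST writer, an SQS writer, or a test double like the one above.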

Two key writers of interest to the readers of this post are the RestEventWriter and the SQSEventWriter, sending data either to a specified REST endpoint or to Amazon’s SQS. In production, we mostly use the SQS writer, due to its low cost and reliability.

Search-Collector: Progress vs. Room For Improvement

The Progress

  • The search-collector has reliably collected close to a billion events on both desktop and mobile. We have not encountered any client issues, and the appeal of precise search tracking immediately captures the interest of web-shop and e-commerce owners. The resulting data is easy to digest and manage.
  • We package the collector as a single script bundle, so a single line adds the search-collector to the web-shop. This streamlined initial setup leaves room for flexible updates to the event collection setup later.
  • The SQS mechanism [3] proved to be a cheap and reliable option for search event storage.
  • The composable Collectors and Writers are flexible enough to capture almost any case we’ve encountered to date.

Room For Improvement

  • The tight coupling of the collector code to the DOM structure of the web-shop sometimes creates issues, for example when DOM changes are made without notice.
    • We’re working on a best-practice document and a new code version that encourages the use of custom client-side events. For example, we will soon recommend that web-shops send a custom SearchEvent when a search is triggered, while the collector code registers as a page-level listener for these events.
  • Impression tracking on mobile is difficult. Events are fired in a different way, and detecting whether a product was within the visible screen area does not work consistently across devices. Although impressions are rarely used, we’re working on improving this area.
  • Combining Google Analytics data (web-shops usually have it and use it) with Search-Collector data is not trivial. We’re close to launching our Search Insights product, which does just that. If you need to combine these data sources manually in the meantime, mind the bot traffic.
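
The custom-event approach from the first item above could look roughly like the sketch below. The event name `shop:search`, the `detail` payload and all function names are assumptions for illustration; the upcoming version may define them differently. The point is that the shop announces a search explicitly, so the collector no longer depends on the page's DOM structure.

```javascript
// Hypothetical event name and payload; the planned SDK version may differ.
const SEARCH_EVENT_NAME = "shop:search";

// Event subclass carrying the search details.
// (In a browser, CustomEvent with a `detail` option would serve the same purpose.)
class ShopSearchEvent extends Event {
  constructor(detail) {
    super(SEARCH_EVENT_NAME);
    this.detail = detail;
  }
}

// Shop side: announce a search without exposing any DOM structure.
function announceSearch(page, query, resultCount) {
  page.dispatchEvent(new ShopSearchEvent({ query, resultCount }));
}

// Collector side: listen at page level instead of querying for specific elements.
function listenForSearches(page, writer) {
  page.addEventListener(SEARCH_EVENT_NAME, (event) => {
    writer.write({ type: "search", ...event.detail });
  });
}

// A plain EventTarget stands in for `document` so the sketch runs outside a browser.
const page = new EventTarget();
const received = [];
listenForSearches(page, { write: (record) => received.push(record) });

announceSearch(page, "red shoes", 42);
console.log(received[0].query, received[0].resultCount); // "red shoes" 42
```

With this contract in place, the shop can restructure its markup freely as long as it keeps dispatching the event.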

Summary – Making Search More Measurable and Profitable

Two years in, we’ve learned much from our Search-Collector SDK project. On the one hand, we are collecting more data than ever before, with seamless web-shop integration. This ultimately allows for a broader understanding of things like findability. On the other hand, the more information we gather, the more maintenance the collection pipelines need. It’s clear, however, that the value we add to our customers’ e-commerce shops far outweighs any limitations we may have encountered.

As a result, we continue on this journey and look forward to the next version of our search-collector. This new version will offer streamlined integration and added transparency into Google Analytics site-search data, while maintaining the integration flexibility needed to ensure continuity of the collected data even after sudden, unforeseen changes to web-shop code.

We’ll be launching soon, so please watch this space.

Footnotes

  1. The Document Object Model (DOM) defines the logical structure of documents and the way a document is accessed and manipulated.
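  2. Separation of concerns (SoC) is a design principle for separating a computer program into distinct sections such that each section addresses a separate concern.
  3. SQS (Amazon Simple Queue Service) is a queue from which your services pull data. Standard queues provide at-least-once delivery, so a message may occasionally arrive more than once; FIFO queues add exactly-once processing.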
