The traditional approach to identifying unique visitors relies on cookies or IP addresses, which often conflict with privacy laws because they can identify individual users. A newer alternative, employed by more privacy-oriented solutions, involves hashing IP addresses for pseudonymization. However, this can also fall into a privacy grey area and, according to some academic papers, can be compromised.

To adhere strictly to privacy regulations, we use a browser cache-based approach to track unique visitors without personal identifiers. In simpler terms, if a visitor has been to the website before, their browser will have cached the tracking script’s request. If they have not, the request will not have been cached yet. This allows us to differentiate between new and returning visitors without using cookies or IP addresses.

This approach does not allow us to identify individual users and therefore abides by existing privacy regulations.

Technical Explanation

Here’s a detailed explanation of how this method works:

  • Initial Request: When a visitor accesses the site, the tracking script sends a request to the API server.
    • Unique Visitor: If the user has not visited the site before, their browser has nothing cached for this request, so it pings the server without an If-Modified-Since header. The server then responds with a Last-Modified header set to the current date.
    • Returning Visitor: If the user has visited the site before, their browser includes an If-Modified-Since header, carrying the date from the most recent Last-Modified response, when pinging the server. The server can then recognise that the visitor has been to the site before.
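
The decision described in the list above can be sketched on the server side roughly as follows. This is a minimal illustration in Python, not the actual implementation: the names handle_ping and PingResult are hypothetical, the request headers are assumed to arrive as a plain dictionary with canonical header names, and answering returning visitors with a 304 status is an assumption about how the cached entry is kept alive.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict, Mapping


@dataclass
class PingResult:
    is_unique: bool          # True when no cached timestamp was sent
    status: int              # HTTP status the server would answer with (assumed)
    headers: Dict[str, str]  # response headers to send back


def handle_ping(request_headers: Mapping[str, str], now: datetime) -> PingResult:
    """Classify one tracking request as a new or a returning visit."""
    cached = request_headers.get("If-Modified-Since")
    if cached is None:
        # Nothing cached: count a unique visitor and hand the browser a
        # Last-Modified value it will echo back on later requests.
        stamp = format_datetime(now.astimezone(timezone.utc), usegmt=True)
        return PingResult(True, 200, {"Last-Modified": stamp})
    # The browser sent the timestamp it cached earlier: returning visitor.
    # Assumption: reply 304 and repeat the same timestamp.
    return PingResult(False, 304, {"Last-Modified": cached})
```

Note that the only input is a timestamp the server itself issued, so the check involves no cookie, IP address, or other identifier.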

The Last-Modified timestamp is reset every day, so a returning visitor is counted as unique again on each new day.
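
One way to picture the daily reset is to pin the Last-Modified value to the start of the current day and to treat any older cached timestamp as expired. The sketch below is written under stated assumptions: the day boundary is taken as midnight UTC, the helper names (start_of_day, last_modified_for, is_unique_today) are hypothetical, and an unparseable header is treated as if no timestamp had been cached.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def start_of_day(now: datetime) -> datetime:
    """Midnight UTC of the day containing `now` (assumed day boundary)."""
    utc_now = now.astimezone(timezone.utc)
    return utc_now.replace(hour=0, minute=0, second=0, microsecond=0)


def last_modified_for(now: datetime) -> str:
    """Last-Modified value handed out during the current day."""
    return format_datetime(start_of_day(now), usegmt=True)


def is_unique_today(if_modified_since: Optional[str], now: datetime) -> bool:
    """True when the visitor has not yet been counted today."""
    if if_modified_since is None:
        return True
    try:
        seen = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return True  # assumption: a malformed header counts as "not cached"
    if seen.tzinfo is None:
        seen = seen.replace(tzinfo=timezone.utc)
    # A timestamp issued before today's boundary has been reset.
    return seen < start_of_day(now)
```

With this rule, a visitor who received a timestamp on one day is reported as returning for the rest of that day and as unique again on the first request of the next.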

By using the Last-Modified and If-Modified-Since cache headers, we can determine if the visitor has visited the website before. This approach is simple, efficient, and preserves the anonymity of the visitor.