Web tracking is widely used as a means to track user's behavior on websites. While web tracking provides new opportunities of e-commerce, it also includes certain risks such as privacy infringement. Therefore, analyzing such risks in the wild Internet is meaningful to make the user's privacy transparent. This work aims to understand how the web tracking has been adopted to prominent websites. We also aim to understand their resilience to the ad-blocking techniques. Web tracking-enabled websites collect the information called the web browser fingerprints, which can be used to identify users. We develop a scalable system that can detect fingerprinting by using both dynamic and static analyses. If a tracking site makes use of many and strong fingerprints, the site is likely resilient to the ad-blocking techniques. We also analyze the connectivity of the third-party tracking sites, which are linked from multiple websites. The link analysis allows us to extract the group of associated tracking sites and understand how influential these sites are. Based on the analyses of 100,000 websites, we quantify the potential risks of the web tracking-enabled websites. We reveal that there are 226 websites that adopt fingerprints that cannot be detected with the most of off-the-shelf anti-tracking tools. We also reveal that a major, resilient third-party tracking site is linked to 50.0 % of the top-100,000 popular websites.
Yumehisa HAGA
Waseda University
Yuta TAKATA
NTT Secure Platform Laboratories
Mitsuaki AKIYAMA
NTT Secure Platform Laboratories
Tatsuya MORI
Waseda University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Yumehisa HAGA, Yuta TAKATA, Mitsuaki AKIYAMA, Tatsuya MORI, "Building a Scalable Web Tracking Detection System: Implementation and the Empirical Study" in IEICE TRANSACTIONS on Information,
vol. E100-D, no. 8, pp. 1663-1670, August 2017, doi: 10.1587/transinf.2016ICP0020.
Abstract: Web tracking is widely used as a means to track user's behavior on websites. While web tracking provides new opportunities of e-commerce, it also includes certain risks such as privacy infringement. Therefore, analyzing such risks in the wild Internet is meaningful to make the user's privacy transparent. This work aims to understand how the web tracking has been adopted to prominent websites. We also aim to understand their resilience to the ad-blocking techniques. Web tracking-enabled websites collect the information called the web browser fingerprints, which can be used to identify users. We develop a scalable system that can detect fingerprinting by using both dynamic and static analyses. If a tracking site makes use of many and strong fingerprints, the site is likely resilient to the ad-blocking techniques. We also analyze the connectivity of the third-party tracking sites, which are linked from multiple websites. The link analysis allows us to extract the group of associated tracking sites and understand how influential these sites are. Based on the analyses of 100,000 websites, we quantify the potential risks of the web tracking-enabled websites. We reveal that there are 226 websites that adopt fingerprints that cannot be detected with the most of off-the-shelf anti-tracking tools. We also reveal that a major, resilient third-party tracking site is linked to 50.0 % of the top-100,000 popular websites.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2016ICP0020/_p
Copy
@ARTICLE{e100-d_8_1663,
author={Yumehisa HAGA, Yuta TAKATA, Mitsuaki AKIYAMA, Tatsuya MORI, },
journal={IEICE TRANSACTIONS on Information},
title={Building a Scalable Web Tracking Detection System: Implementation and the Empirical Study},
year={2017},
volume={E100-D},
number={8},
pages={1663-1670},
abstract={Web tracking is widely used as a means to track user's behavior on websites. While web tracking provides new opportunities of e-commerce, it also includes certain risks such as privacy infringement. Therefore, analyzing such risks in the wild Internet is meaningful to make the user's privacy transparent. This work aims to understand how the web tracking has been adopted to prominent websites. We also aim to understand their resilience to the ad-blocking techniques. Web tracking-enabled websites collect the information called the web browser fingerprints, which can be used to identify users. We develop a scalable system that can detect fingerprinting by using both dynamic and static analyses. If a tracking site makes use of many and strong fingerprints, the site is likely resilient to the ad-blocking techniques. We also analyze the connectivity of the third-party tracking sites, which are linked from multiple websites. The link analysis allows us to extract the group of associated tracking sites and understand how influential these sites are. Based on the analyses of 100,000 websites, we quantify the potential risks of the web tracking-enabled websites. We reveal that there are 226 websites that adopt fingerprints that cannot be detected with the most of off-the-shelf anti-tracking tools. We also reveal that a major, resilient third-party tracking site is linked to 50.0 % of the top-100,000 popular websites.},
keywords={},
doi={10.1587/transinf.2016ICP0020},
ISSN={1745-1361},
month={August},}
Copy
TY - JOUR
TI - Building a Scalable Web Tracking Detection System: Implementation and the Empirical Study
T2 - IEICE TRANSACTIONS on Information
SP - 1663
EP - 1670
AU - Yumehisa HAGA
AU - Yuta TAKATA
AU - Mitsuaki AKIYAMA
AU - Tatsuya MORI
PY - 2017
DO - 10.1587/transinf.2016ICP0020
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E100-D
IS - 8
JA - IEICE TRANSACTIONS on Information
Y1 - August 2017
AB - Web tracking is widely used as a means to track user's behavior on websites. While web tracking provides new opportunities of e-commerce, it also includes certain risks such as privacy infringement. Therefore, analyzing such risks in the wild Internet is meaningful to make the user's privacy transparent. This work aims to understand how the web tracking has been adopted to prominent websites. We also aim to understand their resilience to the ad-blocking techniques. Web tracking-enabled websites collect the information called the web browser fingerprints, which can be used to identify users. We develop a scalable system that can detect fingerprinting by using both dynamic and static analyses. If a tracking site makes use of many and strong fingerprints, the site is likely resilient to the ad-blocking techniques. We also analyze the connectivity of the third-party tracking sites, which are linked from multiple websites. The link analysis allows us to extract the group of associated tracking sites and understand how influential these sites are. Based on the analyses of 100,000 websites, we quantify the potential risks of the web tracking-enabled websites. We reveal that there are 226 websites that adopt fingerprints that cannot be detected with the most of off-the-shelf anti-tracking tools. We also reveal that a major, resilient third-party tracking site is linked to 50.0 % of the top-100,000 popular websites.
ER -