Modern web users may encounter a browser security threat called drive-by-download attacks when surfing on the Internet. Drive-by-download attacks make use of exploit codes to take control of user's web browser. Many web users do not take such underlying threats into account while clicking URLs. URL Blacklist is one of the practical approaches to thwarting browser-targeted attacks. However, URL Blacklist cannot cope with previously unseen malicious URLs. Therefore, to make a URL blacklist effective, it is crucial to keep the URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs by starting from a given existing URL blacklist. The primary mechanism of AutoBLG is expanding the search space of web pages while reducing the amount of URLs to be analyzed by applying several pre-filters such as similarity search to accelerate the process of generating blacklists. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.
Bo SUN
Waseda University
Mitsuaki AKIYAMA
NTT Corporation
Takeshi YAGI
NTT Corporation
Mitsuhiro HATADA
Waseda University
Tatsuya MORI
Waseda University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Copy
Bo SUN, Mitsuaki AKIYAMA, Takeshi YAGI, Mitsuhiro HATADA, Tatsuya MORI, "Automating URL Blacklist Generation with Similarity Search Approach" in IEICE TRANSACTIONS on Information,
vol. E99-D, no. 4, pp. 873-882, April 2016, doi: 10.1587/transinf.2015ICP0027.
Abstract: Modern web users may encounter a browser security threat called drive-by-download attacks when surfing on the Internet. Drive-by-download attacks make use of exploit codes to take control of user's web browser. Many web users do not take such underlying threats into account while clicking URLs. URL Blacklist is one of the practical approaches to thwarting browser-targeted attacks. However, URL Blacklist cannot cope with previously unseen malicious URLs. Therefore, to make a URL blacklist effective, it is crucial to keep the URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs by starting from a given existing URL blacklist. The primary mechanism of AutoBLG is expanding the search space of web pages while reducing the amount of URLs to be analyzed by applying several pre-filters such as similarity search to accelerate the process of generating blacklists. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.
URL: https://global.ieice.org/en_transactions/information/10.1587/transinf.2015ICP0027/_p
Copy
@ARTICLE{e99-d_4_873,
author={Bo SUN, Mitsuaki AKIYAMA, Takeshi YAGI, Mitsuhiro HATADA, Tatsuya MORI, },
journal={IEICE TRANSACTIONS on Information},
title={Automating URL Blacklist Generation with Similarity Search Approach},
year={2016},
volume={E99-D},
number={4},
pages={873-882},
abstract={Modern web users may encounter a browser security threat called drive-by-download attacks when surfing on the Internet. Drive-by-download attacks make use of exploit codes to take control of user's web browser. Many web users do not take such underlying threats into account while clicking URLs. URL Blacklist is one of the practical approaches to thwarting browser-targeted attacks. However, URL Blacklist cannot cope with previously unseen malicious URLs. Therefore, to make a URL blacklist effective, it is crucial to keep the URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs by starting from a given existing URL blacklist. The primary mechanism of AutoBLG is expanding the search space of web pages while reducing the amount of URLs to be analyzed by applying several pre-filters such as similarity search to accelerate the process of generating blacklists. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.},
keywords={},
doi={10.1587/transinf.2015ICP0027},
ISSN={1745-1361},
month={April},}
Copy
TY - JOUR
TI - Automating URL Blacklist Generation with Similarity Search Approach
T2 - IEICE TRANSACTIONS on Information
SP - 873
EP - 882
AU - Bo SUN
AU - Mitsuaki AKIYAMA
AU - Takeshi YAGI
AU - Mitsuhiro HATADA
AU - Tatsuya MORI
PY - 2016
DO - 10.1587/transinf.2015ICP0027
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E99-D
IS - 4
JA - IEICE TRANSACTIONS on Information
Y1 - April 2016
AB - Modern web users may encounter a browser security threat called drive-by-download attacks when surfing on the Internet. Drive-by-download attacks make use of exploit codes to take control of user's web browser. Many web users do not take such underlying threats into account while clicking URLs. URL Blacklist is one of the practical approaches to thwarting browser-targeted attacks. However, URL Blacklist cannot cope with previously unseen malicious URLs. Therefore, to make a URL blacklist effective, it is crucial to keep the URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs by starting from a given existing URL blacklist. The primary mechanism of AutoBLG is expanding the search space of web pages while reducing the amount of URLs to be analyzed by applying several pre-filters such as similarity search to accelerate the process of generating blacklists. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.
ER -