Toshiki SHIBAHARA Yuta TAKATA Mitsuaki AKIYAMA Takeshi YAGI Kunio HATO Masayuki MURATA
Many users are exposed to threats of drive-by download attacks through the Web. Attackers compromise vulnerable websites discovered by search engines and redirect clients to malicious websites created with exploit kits. Security researchers and vendors have tried to prevent these attacks by detecting malicious data, i.e., malicious URLs, web content, and redirections. However, attackers conceal parts of the malicious data with evasion techniques to circumvent detection systems. In this paper, we propose a system for detecting malicious websites without collecting all malicious data. Even if we cannot observe parts of the malicious data, we can always observe compromised websites. Since attackers discover vulnerable websites with search engines, compromised websites share similar traits. Therefore, we built a classifier that leverages not only malicious but also compromised websites. More precisely, we convert all websites observed at the time of access into a redirection graph and classify it by integrating similarities between its subgraphs and redirection subgraphs shared across malicious, benign, and compromised websites. In an evaluation with crawling data from 455,860 websites, our system achieved a 91.7% true positive rate for malicious websites containing exploit URLs at a low false positive rate of 0.1%. Moreover, it detected 143 more evasive malicious websites than a conventional content-based system.
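As a rough, hypothetical illustration of this graph-based idea (not the authors' implementation), the sketch below builds a redirection graph with networkx, scores its subgraphs against labeled template subgraphs, and classifies the aggregated similarity scores with scikit-learn. The similarity measure, templates, and training data are toy placeholders.

```python
# Hypothetical sketch: classify a redirection graph by aggregating
# similarities between its subgraphs and labeled template subgraphs.
import networkx as nx
from sklearn.linear_model import LogisticRegression

def ego_subgraphs(graph, radius=1):
    """Enumerate small subgraphs (ego networks) around every node."""
    return [nx.ego_graph(graph, n, radius=radius) for n in graph.nodes]

def similarity(g1, g2):
    """Toy structural similarity: Jaccard overlap of degree values."""
    d1, d2 = set(dict(g1.degree).values()), set(dict(g2.degree).values())
    return len(d1 & d2) / max(len(d1 | d2), 1)

def feature_vector(graph, templates):
    """One feature per template: best similarity among all subgraphs."""
    subs = ego_subgraphs(graph)
    return [max(similarity(s, t) for s in subs) for t in templates]

# Templates would be redirection subgraphs mined from malicious, benign,
# and compromised websites; here they are toy stand-ins.
templates = [nx.path_graph(3), nx.star_graph(3)]
train_graphs = [nx.path_graph(4), nx.star_graph(4)]
labels = [1, 0]  # 1 = malicious, 0 = benign

clf = LogisticRegression().fit(
    [feature_vector(g, templates) for g in train_graphs], labels)
print(clf.predict([feature_vector(nx.path_graph(5), templates)]))
```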
Toshiki SHIBAHARA Kohei YAMANISHI Yuta TAKATA Daiki CHIBA Taiga HOKAGUCHI Mitsuaki AKIYAMA Takeshi YAGI Yuichi OHSITA Masayuki MURATA
The number of infected hosts on enterprise networks has been increased by drive-by download attacks. In these attacks, users of compromised popular websites are redirected to websites that exploit vulnerabilities in a browser and its plugins. To prevent damage, researchers have begun studying the detection of infected hosts on the basis of proxy logs rather than blacklist-based filtering, because blacklists have become difficult to create owing to the short lifetime of malicious domains and the concealment of exploit code. To detect accesses to malicious websites from proxy logs, we propose a system for detecting malicious URL sequences on the basis of three key ideas: focusing on sequences of URLs that include artifacts of malicious redirections, designing new features related to software other than browsers, and generating new training data with data augmentation. To find an effective approach for classifying URL sequences, we compared three approaches: an individual-based approach, a convolutional neural network (CNN), and our new event de-noising CNN (EDCNN). The EDCNN reduces the negative effects of benign URLs that are redirected from compromised websites and included in malicious URL sequences. Evaluation results show that only the EDCNN with the proposed features and data augmentation achieved practical classification performance: a true positive rate of 99.1% and a false positive rate of 3.4%.
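For intuition only, here is a minimal PyTorch sketch of classifying a URL sequence with a plain 1-D CNN; it does not reproduce the EDCNN architecture, and the per-URL feature encoding (lexical statistics plus flags for non-browser software, as the abstract suggests) is a placeholder.

```python
# Hypothetical sketch of classifying a URL sequence with a 1-D CNN
# (a plain CNN, not the paper's EDCNN; features are placeholders).
import torch
import torch.nn as nn

class UrlSequenceCNN(nn.Module):
    def __init__(self, n_features=16, n_filters=32):
        super().__init__()
        self.conv = nn.Conv1d(n_features, n_filters, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveMaxPool1d(1)   # pool over the URL axis
        self.fc = nn.Linear(n_filters, 2)     # benign vs. malicious

    def forward(self, x):                     # x: (batch, features, urls)
        h = torch.relu(self.conv(x))
        return self.fc(self.pool(h).squeeze(-1))

# Each URL in a proxy-log sequence is encoded as a feature vector.
batch = torch.randn(8, 16, 20)  # 8 sequences of 20 URLs, 16 features each
logits = UrlSequenceCNN()(batch)
print(logits.shape)  # torch.Size([8, 2])
```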
Takahito KITO Iori OTOMO Takuya FUJIHASHI Yusuke HIROTA Takashi WATANABE
In conventional multiview video systems using progressive download, a user downloads the videos of all viewpoints of one content to realize smooth view switching. This, however, increases the video traffic, and if the available download rate is low, the video quality suffers. Downloading only the desired viewpoint is one approach to reducing the traffic; in this case, however, playback stalls occur after view switching, degrading the user's satisfaction with the application. In this paper, we aim at two objectives: 1) reducing video traffic and 2) minimizing the number of playback stalls. To this end, we propose a new multiview video delivery scheme for progressive download. The main idea of the proposed scheme is that the user downloads only a subset of viewpoints, namely those likely to be played back, to realize both traffic reduction and smooth view switching. In addition, we propose two download-scheduling algorithms to prevent playback stalls even at low download rates. The first algorithm prevents stalls in cases with frequent view switching, such as zapping, while the second prevents stalls in gazing cases. Evaluations using a Joint Multiview Video Coding (JMVC) encoder and multiview video sequences show that our scheme achieves not only reduced video traffic but also a decreased number of playback stalls, regardless of the user's view-switching model or download rate. In addition, we demonstrate that the proposed method causes no playback stalls for both high- and low-motion video content.
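A minimal sketch of the core selection step, under assumed inputs: given estimated probabilities that each viewpoint is played next, download only the most likely subset. The probability model, budget, and values are invented for illustration.

```python
# Hypothetical sketch: download only the viewpoints a user is likely
# to watch next, instead of all viewpoints of the content.
def select_viewpoints(current_view, switch_prob, budget=3):
    """Pick the `budget` viewpoints with the highest playback probability.
    switch_prob[v] = estimated probability that viewpoint v is played next.
    The current viewpoint is always kept to avoid stalling playback."""
    ranked = sorted(switch_prob, key=switch_prob.get, reverse=True)
    chosen = [current_view] + [v for v in ranked if v != current_view]
    return chosen[:budget]

# Toy probabilities for 5-view content while the user gazes at view 2:
# neighboring viewpoints are likelier switch targets than distant ones.
probs = {0: 0.05, 1: 0.25, 2: 0.40, 3: 0.25, 4: 0.05}
print(select_viewpoints(current_view=2, switch_prob=probs))  # [2, 1, 3]
```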
Yoshiaki SHIRAISHI Masaki KAMIZONO Masanori HIROTOMO Masami MOHRI
In drive-by download attacks, most malicious websites identify the software environment of the client and change their behavior accordingly. Consequently, automatic dynamic analysis offered by open services cannot always obtain information appropriate to a given client organization. Organizations must therefore prepare for incidents expected when another client in the organization re-accesses the same malicious websites. To the authors' knowledge, no prior study has utilized the analysis results of malicious websites for digital forensics on such incidents and for hedging the risk of expected incidents in the organization. In this paper, we propose a system for evaluating the impact of accessing malicious websites by using the results of multi-environment analysis. Furthermore, we report the results of evaluating malicious websites with the multi-environment analysis system and show how to utilize the analysis results for forensic analysis and risk hedging, based on actual cases of analyzing malicious websites.
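A tiny hypothetical sketch of how multi-environment analysis results might feed risk assessment: intersect the client profiles the site exploited with the profiles deployed in the organization. Both data structures are invented for illustration.

```python
# Hypothetical sketch: use multi-environment analysis results to judge
# which client configurations in an organization are exposed to a site.
def exposed_clients(analysis, org_clients):
    """analysis: {(browser, plugin): exploited?} from multi-env analysis.
    org_clients: list of (browser, plugin) pairs deployed in the org."""
    risky = {env for env, exploited in analysis.items() if exploited}
    return [c for c in org_clients if c in risky]

analysis = {
    ("IE8", "Flash 10"): True,    # site served an exploit to this profile
    ("IE8", "Flash 11"): False,   # same site stayed benign here
}
org = [("IE8", "Flash 10"), ("Firefox 3", "Flash 10")]
print(exposed_clients(analysis, org))  # [('IE8', 'Flash 10')]
```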
Yuta TAKATA Mitsuaki AKIYAMA Takeshi YAGI Takeshi YADA Shigeki GOTO
An incident response organization such as a CSIRT contributes to preventing the spread of malware infection by analyzing compromised websites and sending abuse reports with detected URLs to webmasters. However, abuse reports that contain only URLs are not sufficient for cleaning up the websites. In addition, it is difficult to analyze malicious websites across different client environments because these websites change behavior depending on the client environment. To expedite compromised-website clean-up, it is important to provide fine-grained information such as the relations among malicious URLs, the precise position of compromised web content, and the targeted range of client environments. In this paper, we propose a new method of constructing a redirection graph with context, such as which web content redirects to malicious websites. The proposed method analyzes a website in a multi-client environment to identify which client environments are exposed to threats. We evaluated our system using crawling datasets of approximately 2,000 compromised websites. The results show that our system successfully identified malicious URL relations and compromised web content, reducing the number of URLs and the amount of web content that incident responders must analyze to 15.0% and 0.8%, respectively. Furthermore, it identified the targeted range of client environments for 30.4% of the websites and, by leveraging the target information, a vulnerability that had been used in malicious websites. This fine-grained analysis by our system would contribute to improving the daily work of incident responders.
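As a hypothetical sketch of the graph-with-context idea, the code below merges redirect chains crawled under different client environments into one graph whose edges record which environments traversed them, exposing environment-dependent redirections; URLs and environment labels are invented.

```python
# Hypothetical sketch: merge redirect chains crawled under different
# client environments into one graph; each edge records the set of
# environments that was redirected along it.
import networkx as nx

def build_context_graph(chains):
    """chains: {environment: [url0, url1, ...]} redirect chains."""
    g = nx.DiGraph()
    for env, chain in chains.items():
        for src, dst in zip(chain, chain[1:]):
            g.add_edge(src, dst)
            g[src][dst].setdefault("envs", set()).add(env)
    return g

chains = {
    "IE8+Flash10": ["landing", "compromised.js", "exploit"],
    "Firefox": ["landing", "compromised.js"],  # not redirected further
}
g = build_context_graph(chains)
for src, dst, data in g.edges(data=True):
    print(src, "->", dst, sorted(data["envs"]))
```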
Jie REN Ling GAO Hai WANG QuanLi GAO ZheWen ZHANG
Mobile traffic is experiencing tremendous growth, and this growth is increasing the use of the radio component of mobile devices, resulting in shorter battery lifetime. In this paper, we present an Energy-Aware Download Method (EDM) based on a Markov decision process (MDP) to optimize the data download energy of mobile applications. Unlike previous download schemes in the literature, which pursue energy efficiency by simply delaying download requests and often degrade the user experience, our MDP model learns offline from a set of training download workloads for different user patterns. The model is then integrated into the mobile application to handle download requests at runtime; it takes the current battery level, LTE reference signal received power (RSRP), reference signal signal-to-noise ratio (RSSNR), and task size as inputs of the decision process, and maximizes a reward that reflects the expected battery life and user experience. We evaluate how the EDM can be used in the context of a real file-downloading application over an LTE network. Compared with the Android default download policy (Minimum Delay), we obtain average improvements of 20.3% in energy consumption, 15% in latency, and 45% in energy-delay trade-off performance.
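To make the MDP framing concrete, here is a toy value-iteration sketch that decides between downloading now and delaying. The states, rewards, and transition probabilities are invented stand-ins for the paper's battery, RSRP/RSSNR, and task-size inputs.

```python
# Hypothetical sketch: value iteration on a tiny MDP that decides
# whether to download now or delay, given battery and signal state.
import itertools

states = list(itertools.product(["low_batt", "high_batt"],
                                ["weak_sig", "strong_sig"]))
actions = ["download", "delay"]

def reward(state, action):
    batt, sig = state
    if action == "download":
        # Downloading on a weak signal costs extra radio energy.
        return 1.0 if sig == "strong_sig" else -1.0
    return -0.1  # delaying slightly hurts user experience

def next_states(state):
    # Toy dynamics: the signal may change; the battery state persists.
    batt, _ = state
    return [(batt, "weak_sig"), (batt, "strong_sig")]

V = {s: 0.0 for s in states}
for _ in range(50):  # value iteration with discount factor 0.9
    V = {s: max(reward(s, a) + 0.9 * sum(0.5 * V[n] for n in next_states(s))
                for a in actions) for s in states}

policy = {s: max(actions, key=lambda a: reward(s, a)
                 + 0.9 * sum(0.5 * V[n] for n in next_states(s)))
          for s in states}
print(policy)  # download on strong signal, delay on weak signal
```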
Yuta TAKATA Mitsuaki AKIYAMA Takeshi YAGI Takeo HARIU Shigeki GOTO
Drive-by download attacks force users to automatically download and install malware by redirecting them to malicious URLs that exploit vulnerabilities in the user's web browser. In addition, several evasion techniques, such as code obfuscation and environment-dependent redirection, are combined with drive-by download attacks to prevent detection. In environment-dependent redirection, attackers profile information on the user's environment, such as the name and version of the browser and browser plugins, and launch a drive-by download attack on only certain targets by changing the destination URL. Detection and collection techniques such as honeyclients cannot detect these attacks when they do not match the specific environment of the attack target, because they are never redirected. Therefore, it is necessary to improve analysis coverage while countering these adversarial evasion techniques. We propose a method for exhaustively analyzing JavaScript code relevant to redirections and extracting the destination URLs in the code. Our method facilitates attack detection by extracting a large number of URLs while controlling the analysis overhead by excluding code irrelevant to redirections. We implemented our method in a browser emulator called MINESPIDER that automatically extracts potential URLs from websites. We validated it using communication data with malicious websites captured over a three-year period. The experimental results demonstrate that MINESPIDER extracted, within a few seconds, 30,000 new URLs from malicious websites that conventional methods missed.
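The sketch below is a drastic simplification for intuition only: MINESPIDER analyzes redirection code inside a browser emulator, whereas this toy version merely folds simple string concatenations and pattern-matches URL literals, recovering destinations from both branches of an environment-dependent redirection. All script content is invented.

```python
# Hypothetical sketch: statically extract candidate redirect URLs from
# JavaScript by folding naive string concatenations and matching URL
# patterns -- not MINESPIDER's actual emulator-based analysis.
import re

def extract_urls(script):
    # Fold naive concatenations like 'http://' + 'evil.example/x'.
    folded = re.sub(r'"\s*\+\s*"', "", re.sub(r"'\s*\+\s*'", "", script))
    # Collect string literals that look like URLs.
    literals = re.findall(r"""["']([^"']+)["']""", folded)
    return [s for s in literals if re.match(r"https?://", s)]

script = """
if (navigator.userAgent.indexOf('MSIE 8') >= 0) {
    location.href = 'http://' + 'malicious.example' + '/exploit';
} else {
    location.href = 'http://benign.example/';
}
"""
print(extract_urls(script))  # URLs from both branches are recovered
```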
Bo SUN Mitsuaki AKIYAMA Takeshi YAGI Mitsuhiro HATADA Tatsuya MORI
Modern web users may encounter a browser security threat called a drive-by-download attack when surfing the Internet. Drive-by-download attacks use exploit code to take control of a user's web browser, and many web users do not take such underlying threats into account when clicking URLs. URL blacklisting is one practical approach to thwarting browser-targeted attacks. However, a URL blacklist cannot cope with previously unseen malicious URLs; to keep a blacklist effective, it is crucial to keep its URLs updated. Given these observations, we propose a framework called automatic blacklist generator (AutoBLG) that automates the collection of new malicious URLs, starting from a given existing URL blacklist. The primary mechanism of AutoBLG is to expand the search space of web pages while reducing the number of URLs to be analyzed by applying several pre-filters, such as similarity search, to accelerate blacklist generation. AutoBLG consists of three primary components: URL expansion, URL filtration, and URL verification. Through extensive analysis using a high-performance web client honeypot, we demonstrate that AutoBLG can successfully discover new and previously unknown drive-by-download URLs from the vast web space.
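A hypothetical sketch of the three-stage flow named above (URL expansion, filtration, verification); every function body here is a toy placeholder for the components the abstract describes, with the honeypot check reduced to a string test.

```python
# Hypothetical sketch of AutoBLG's three-stage pipeline.
from urllib.parse import urlparse

def expand(seed_urls):
    """Stage 1: widen the search space around blacklisted URLs,
    e.g., toward neighboring paths on the same hosts."""
    return {u.rstrip("/") + p for u in seed_urls for p in ("/a", "/b")}

def pre_filter(candidates, known_bad):
    """Stage 2: cheap filter -- keep candidates whose host already
    appears in the blacklist (a stand-in for similarity search)."""
    bad_hosts = {urlparse(u).netloc for u in known_bad}
    return {c for c in candidates if urlparse(c).netloc in bad_hosts}

def verify(candidates):
    """Stage 3: expensive ground truth; stands in for checking each
    candidate with a high-performance web client honeypot."""
    return {c for c in candidates if "exploit" in c}

seeds = {"http://exploit-kit.example/gate"}
print(verify(pre_filter(expand(seeds), seeds)))
```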
Mitsuaki AKIYAMA Takeshi YAGI Youki KADOBAYASHI Takeo HARIU Suguru YAMAGUCHI
We investigated client honeypots for detecting and circumstantially analyzing drive-by download attacks. A client honeypot requires both improved inspection performance and in-depth analysis for inspecting and discovering malicious websites. However, the OS overhead of recent client honeypot operation cannot be ignored when improving honeypot multiplication performance. We propose a client honeypot system that combines multi-OS and multi-process honeypot approaches, and we implemented this system to evaluate its performance. The process sandbox mechanism, a security measure for our multi-process approach, provides a virtually isolated environment for each web browser. It prevents system alteration by a compromised browser process through I/O redirection of file/registry access. To solve the inconsistent file/registry views caused by I/O redirection, our process sandbox mechanism enables the web browser and the corresponding plug-ins to share a virtual system view. It thus enables multiple processes to run simultaneously on a single OS without interfering with one another. In a field trial, we confirmed that our multi-process approach was three or more times faster than a single process, and that our multi-OS approach improved system performance linearly with the number of honeypot instances. In addition, our long-term investigation indicated that 72.3% of exploitations target browser-helper processes. A honeypot that restricts all process creation events cannot identify an exploitation targeting a browser-helper process. In contrast, our process sandbox mechanism permits the creation of browser-helper processes, so it can identify these types of exploitations without false negatives. Thus, our proposed system with these multiplication approaches improves performance efficiency and enables in-depth analysis on high-interaction systems.
Jie REN Ling GAO Hai WANG Yan CHEN
Smartphone battery life still suffers from limited battery capacity, and apps increase the burden on the battery when they download large amounts of data over slow networks, so managing download tasks is an important problem. To this end, we propose a low-energy smartphone download strategy called CLSA (Concentrated Download and Low Power and Stable Link Selection Algorithm). CLSA is intended to reduce the overhead of large data downloads by delaying them appropriately, based on three major factors: the current network conditions, the length of the download request queue, and the local state of the smartphone. We evaluate CLSA using a music player implemented on a ZTE V880 smartphone running the Android operating system and compare it with two other common download strategies, Minimum Delay and WiFi Only. Experiments show that our download algorithm achieves a better trade-off between energy and delay than the other two.
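A minimal sketch of a CLSA-style decision rule, assuming the three factors above are available as simple numbers; all thresholds are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch of a CLSA-style decision: batch (delay) a
# download unless conditions favor sending it now.
def decide(link_quality, queue_len, battery_pct, task_mb):
    """Return 'download' to serve the queue now, 'delay' to batch."""
    if battery_pct < 10:                 # nearly empty: avoid radio use
        return "delay"
    if link_quality > 0.7:               # fast, stable link: cheap to send
        return "download"
    if queue_len >= 5 or task_mb > 50:   # enough work queued: amortize
        return "download"                # the radio's tail-energy cost
    return "delay"

print(decide(link_quality=0.4, queue_len=2, battery_pct=60, task_mb=3))
# -> 'delay': wait for a better link or a longer queue
```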
Khamphao SISAAT Hiroaki KIKUCHI Shunji MATSUO Masato TERADA Masashi FUJIWARA Surin KITTITORNKUN
A botnet attacks victim hosts via multiple command and control (C&C) servers controlled by a botmaster. This makes botnet attacks more difficult to detect and the botmaster's source country harder to trace, owing to the lack of logged data about the attacks. To locate the C&C servers during the malware/bot download phase, we analyzed the source IP addresses of downloads to more than 90 independent honeypots in Japan in the CCC (Cyber Clean Center) dataset 2010, which comprises over 1 million records and almost 1 thousand malware names. Based on GeoIP services, we propose a time zone correlation model to determine the correlation coefficient between bot downloads from Japan and those from other source countries. We found a strong correlation between active malware/bot downloads and the time zone of the C&C servers. As a result, our model confirms that malware/bot downloads are synchronized with the time zone (country) of the corresponding C&C servers, so the botmaster can possibly be traced.
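The correlation idea can be illustrated with a small numeric sketch: shift one country's hourly download counts by the time zone offset and compute the Pearson correlation. The diurnal profiles below are synthetic.

```python
# Hypothetical sketch: correlate hourly download counts observed by
# honeypots in Japan with another country's counts after shifting by
# the time zone offset, as in a time zone correlation model.
import numpy as np

def tz_correlation(counts_jp, counts_cc, utc_offset_diff):
    """Pearson correlation after aligning the 24-hour profiles."""
    shifted = np.roll(counts_cc, utc_offset_diff)
    return np.corrcoef(counts_jp, shifted)[0, 1]

hours = np.arange(24)
# Toy diurnal activity peaking at 15:00 JST ...
counts_jp = np.exp(-((hours - 15) ** 2) / 8.0)
# ... and the same activity observed 6 hours earlier in a UTC+3 country.
counts_cc = np.roll(counts_jp, -6)

print(round(tz_correlation(counts_jp, counts_cc, 6), 3))  # ~1.0
```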
Megumi SHIBUYA Tomohiko OGISHI Shu YAMAMOTO
P2P (peer-to-peer) file sharing architectures are scalable and cost-effective. Hence, applying P2P architectures to media streaming is attractive and is expected to be an alternative to current video streaming based on IP multicast or content delivery systems, which require expensive network infrastructure and large-scale centralized cache storage. In this paper, we investigate P2P progressive download for Internet video streaming services. We demonstrated the capability of P2P progressive download both in a laboratory test network and on the Internet. Through the experiments, we clarified the contribution of FTTH links to P2P progressive download in heterogeneous access networks consisting of FTTH and ADSL links. We analyzed the causes of the download performance degradation observed in the experiments and discuss effective methods for providing video streaming services using P2P progressive download in current heterogeneous networks.
Keuntae PARK Jaesub KIM Yongjin CHOI Daeyeon PARK
Transmission schemes that obtain content from multiple servers concurrently have attracted attention owing to their bandwidth aggregation, stability under dynamic server departure, and load balancing. Previous approaches employ parallel downloading in the transport layer to minimize the receiver buffer size and maximize bandwidth utilization. However, they focus only on receiver operations and induce considerable overhead at the senders, contradicting the main goal of a multi-provider environment: offloading popular servers through replication. In this work, we propose MTCP, a novel transport layer protocol that reduces sender overhead by eliminating unnecessary disk I/Os and using the buffer cache efficiently. MTCP also balances the trade-off between minimizing buffering at receivers and maximizing request locality at senders.
Junichi FUNASAKA Atsushi KAWANO Kenji ISHIDA
Parallel downloading retrieves different pieces of a file from different servers simultaneously and is therefore expected to greatly shorten file fetch times. A key requirement is that the different servers hold the same file; we have already proposed a proxy system that ensures file freshness and concordance. In this paper, we combine parallel downloading with this proxy server technology to download a file quickly while ensuring that it is the latest version. Our previous paper on parallel downloading took neither the downloading order of file fragments nor the buffer space requirements into account; this paper corrects those omissions. To provide the user with the required file in correct order as a byte stream, the proxy server must reorder the pieces fetched from multiple servers and merge in delayed blocks as soon as possible. We therefore introduce "substitution download," which requests delayed blocks from other servers so that downloading completes earlier. Experiments on substitution download across the Internet clarify the trade-off between buffering time and the redundant traffic generated by duplicate requests to multiple servers. We identify a pseudo-optimal balance and show that our method does not increase downloading time while limiting the buffer space. This network software can be applied to download files smoothly, absorbing the differences in performance characteristics among heterogeneous networks.
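A hypothetical sketch of the substitution idea: when a block assigned to one server lags past a deadline, duplicate the request to another server and keep whichever assignment promises the earlier completion. Server names, timings, and the deadline are simulated.

```python
# Hypothetical sketch of "substitution download": re-request delayed
# blocks from other servers so the whole download finishes earlier.
def substitution_download(blocks, eta, deadline=1.0):
    """blocks: block -> primary server; eta: (server, block) -> seconds."""
    schedule = dict(blocks)
    for block, server in blocks.items():
        if eta[(server, block)] > deadline:          # straggler detected:
            backup = min(                            # duplicate the request
                (s for s in {s for s, _ in eta} if s != server),
                key=lambda s: eta[(s, block)])
            if eta[(backup, block)] < eta[(server, block)]:
                schedule[block] = backup             # substitution wins
    return schedule

blocks = {0: "srv-A", 1: "srv-B"}
eta = {("srv-A", 0): 0.3, ("srv-B", 0): 0.5,
       ("srv-A", 1): 0.4, ("srv-B", 1): 2.0}        # block 1 is delayed
print(substitution_download(blocks, eta))  # {0: 'srv-A', 1: 'srv-A'}
```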
P2P-style flash downloading of bulk files, in which many peers download different pieces of shared files from the source in parallel (a "perpendicular" pattern), has recently become popular. The peers try to reconstruct the complete files by exchanging needed pieces with other downloading peers. The throughput of the entire downloading community, as well as the downloading rate perceived by each peer, depends greatly on the uploading bandwidth contributed by every individual peer. Unfortunately, without a proper built-in incentive mechanism, peers inherently tend to download relentlessly while intentionally limiting their uploading bandwidth. In this paper, we propose an effective and efficient incentive approach, Reciprocity, based only on end-to-end measurement and reaction: a peer caps its uploading rate to each of its peers in proportion to its downloading rate from that peer. It requires no centralized control, electronic monetary payment, or certification. Preliminary experimental results reveal that this approach offers favorable performance for cooperative peers while effectively punishing defecting ones.
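The proportional capping rule is simple enough to state directly in code. Below is a minimal sketch; the scaling factor and the small bootstrap floor for newcomers are assumptions, not parameters from the paper.

```python
# Hypothetical sketch of the Reciprocity rule: cap the upload rate to
# each neighbor in proportion to the download rate measured from it,
# using only local end-to-end measurements.
def upload_caps(measured_download_kbps, alpha=1.0, floor_kbps=8):
    """alpha scales reciprocity; a small floor lets newcomers bootstrap."""
    return {peer: max(alpha * rate, floor_kbps)
            for peer, rate in measured_download_kbps.items()}

downloads = {"peer-1": 400, "peer-2": 120, "free-rider": 0}
print(upload_caps(downloads))
# {'peer-1': 400.0, 'peer-2': 120.0, 'free-rider': 8}
```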
Jumpei TAKETSUGU Shinsuke HARA
Many reports have investigated TCP performance over wireless links, assuming a high and time-invariant frame error rate for cellular systems. In practice, however, the frame error rate varies temporally and geographically owing to fading and interference. Moreover, SINR-based transmission power control, which is employed to randomize frame errors in DS-CDMA cellular systems, cannot always work properly, depending on the control parameters and channel characteristics. In this paper, we investigate TCP performance over the wireless links of a DS-CDMA cellular system by computer simulation. The simulation results show that the random frame error assumption is valid for only part of the TCP performance, even in a system with an SINR-based transmission power control scheme.
Shigeyuki SAKAZAWA Yasuhiro TAKISHIMA Yoshinori KITATSUJI Yasuyuki NAKAJIMA Masahiro WADA Kazuo HASHIMOTO
This paper presents a novel data transmission protocol "SVFTP," which enables high-speed and error-free video data transmission over IP networks. A video transmission system based on SVFTP is also presented. While conventional protocols are designed for file transmission, SVFTP focuses on video data as a continuous media. In order to fit a flexible video transmission system, SVFTP achieves higher throughput on the long distance link as well as transmission interruption/resumption and progressive download and play back. In addition, a rate shaping mechanism for SVFTP is introduced in order to control greediness and burst traffic of multiple-TCP sessions. Laboratory and field transmission experiments show that SVFTP achieves high performance and functionality.
For efficient software download in cellular CDMA systems, location-dependent session admission control (LDSAC) is presented. In the LDSAC scheme, a mobile located near the cell center can request a software download session, whereas a mobile located far from the cell center can request a session only after approaching the cell center. Performance is analyzed in terms of handoff rate, mean channel holding time, session blocking probability, and handoff forced-termination probability. Numerical results show that the inter-cell handoff rate of the proposed scheme is reduced by 30-250% compared with the conventional scheme, depending on traffic characteristics such as terminal speed, session duration time, and the size of the allowable zone in a cell for initiating sessions. The new-session blocking probability decreases slightly, while the handoff session forced-termination probability decreases drastically.
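A minimal sketch of the admission rule, assuming the mobile's distance from the cell center is known; the zone ratio is an invented parameter standing in for the allowable zone size analyzed in the paper.

```python
# Hypothetical sketch of LDSAC admission: a software download session
# is admitted only if the mobile lies inside the allowable zone around
# the cell center; otherwise it must retry after moving inward.
def admit_session(distance_m, cell_radius_m, zone_ratio=0.5):
    """zone_ratio sets the allowable zone as a fraction of the radius."""
    return distance_m <= zone_ratio * cell_radius_m

print(admit_session(distance_m=300, cell_radius_m=1000))  # True: admit
print(admit_session(distance_m=800, cell_radius_m=1000))  # False: defer
```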
Hiroshi HARADA Masayuki FUJISE
We have proposed two types of software download methods for software radio (SR) based intelligent transport systems (ITS): (1) a broadcasting-type software download method and (2) a communication-type software download method. In this paper, we study the feasibility of employing them in a newly developed prototype. We give tangible examples of method (1) using the Vehicle Information and Communication System (VICS) and of method (2) using the dedicated short range communication (DSRC) system. We describe the download formats and procedures for both methods and use the experimental prototype to evaluate the basic software download time and configuration time. Moreover, we propose an architecture for an SR-based multimode terminal that can reduce download time and utilize over-the-air software download services over VICS and DSRC links.
Junichi FUNASAKA Nozomi NAKAWAKI Kenji ISHIDA Kitsutaro AMANO
Many programs and content files, such as movies, are delivered via the Internet, and copies are often stored on distributed servers to reduce the load on the original servers, ease network congestion, and decrease response time. To retrieve an object file, existing methods simply select one or more servers and divide the file into equal pieces whose size is determined a priori. This approach is not practical for networks with variable bandwidth. To better utilize variable bandwidth, we propose an adaptive downloading method and evaluate it through experiments conducted on the Internet. The results show that the new method is effective and that it can become an important network control technology for service assurance.
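One plausible way to adapt piece sizes to variable bandwidth, sketched hypothetically (the paper's actual adaptation rule is not reproduced here): scale each server's next piece to the bandwidth it achieved in the previous round, so faster servers fetch larger pieces.

```python
# Hypothetical sketch of adaptive downloading: instead of fixed equal
# pieces, each server's next piece size is proportional to its most
# recently measured bandwidth.
def next_piece_sizes(measured_kbps, remaining_kb, min_kb=64):
    total = sum(measured_kbps.values())
    return {srv: max(min_kb, round(remaining_kb * bw / total))
            for srv, bw in measured_kbps.items()}

bandwidth = {"srv-A": 4000, "srv-B": 1000, "srv-C": 250}  # measured rates
print(next_piece_sizes(bandwidth, remaining_kb=10_240))
# e.g. {'srv-A': 7802, 'srv-B': 1950, 'srv-C': 488}
```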