1-3hit |
Ha DAO Quoc-Huy VO Tien-Huy PHAM Kensuke FUKUDA
Universities collect and process a massive amount of Personal Identifiable Information (PII) at registration and throughout interactions with individuals. However, student PII can be exposed to the public by uploading documents along with university notice without consent and awareness, which could put individuals at risk of a variety of different scams, such as identity theft, fraud, or phishing. In this paper, we perform an in-depth analysis of student PII leakage at Vietnamese universities. To the best of our knowledge, we are the first to conduct a comprehensive study on student PII leakage in higher educational institutions. We find that 52.8% of Vietnamese universities leak student PII, including one or more types of personal data, in documents on their websites. It is important to note that the compromised PII includes sensitive types of data, student medical record and religion. Also, student PII leakage is not a new phenomenon and it has happened year after year since 2005. Finally, we present a study with 23 Vietnamese university employees who have worked on student PII to get a deeper understanding of this situation and envisage concrete solutions. The results are entirely surprising: the employees are highly aware of the concept of student PII. However, student PII leakage still happens due to their working habits or the lack of a management system and regulation. Therefore, the Vietnamese university should take a more active stand to protect student data in this situation.
Increased demand for DNS privacy has driven the creation of several encrypted DNS protocols, such as DNS over HTTPS (DoH), DNS over TLS (DoT), and DNS over QUIC (DoQ). Recently, DoT and DoH have been deployed by some vendors like Google and Cloudflare. This paper addresses privacy leakage in these three encrypted DNS protocols (especially DoQ) with different DNS recursive resolvers (Google, NextDNS, and Bind) and DNS proxy (AdGuard). More particularly, we investigate encrypted DNS traffic to determine whether the adversary can infer the category of websites users visit for this purpose. Through analyzing packet traces of three encrypted DNS protocols, we show that the classification performance of the websites (i.e., user's privacy leakage) is very high in terms of identifying 42 categories of the websites both in public (Google and NextDNS) and local (Bind) resolvers. By comparing the case with cache and without cache at the local resolver, we confirm that the caching effect is negligible as regards identification. We also show that discriminative features are mainly related to the inter-arrival time of packets for DNS resolving. Indeed, we confirm that the F1 score decreases largely by removing these features. We further investigate two possible countermeasures that could affect the inter-arrival time analysis in the local resolver: AdBlocker and DNS prefetch. However, there is no significant improvement in results with these countermeasures. These findings highlight that information leakage is still possible even in encrypted DNS traffic regardless of underlying protocols (i.e., HTTPS, TLS, QUIC).
Takuya WATANABE Mitsuaki AKIYAMA Tetsuya SAKAI Hironori WASHIZAKI Tatsuya MORI
Permission warnings and privacy policy enforcement are widely used to inform mobile app users of privacy threats. These mechanisms disclose information about use of privacy-sensitive resources such as user location or contact list. However, it has been reported that very few users pay attention to these mechanisms during installation. Instead, a user may focus on a more user-friendly source of information: text description, which is written by a developer who has an incentive to attract user attention. When a user searches for an app in a marketplace, his/her query keywords are generally searched on text descriptions of mobile apps. Then, users review the search results, often by reading the text descriptions; i.e., text descriptions are associated with user expectation. Given these observations, this paper aims to address the following research question: What are the primary reasons that text descriptions of mobile apps fail to refer to the use of privacy-sensitive resources? To answer the research question, we performed empirical large-scale study using a huge volume of apps with our ACODE (Analyzing COde and DEscription) framework, which combines static code analysis and text analysis. We developed light-weight techniques so that we can handle hundred of thousands of distinct text descriptions. We note that our text analysis technique does not require manually labeled descriptions; hence, it enables us to conduct a large-scale measurement study without requiring expensive labeling tasks. Our analysis of 210,000 apps, including free and paid, and multilingual text descriptions collected from official and third-party Android marketplaces revealed four primary factors that are associated with the inconsistencies between text descriptions and the use of privacy-sensitive resources: (1) existence of app building services/frameworks that tend to add API permissions/code unnecessarily, (2) existence of prolific developers who publish many applications that unnecessarily install permissions and code, (3) existence of secondary functions that tend to be unmentioned, and (4) existence of third-party libraries that access to the privacy-sensitive resources. We believe that these findings will be useful for improving users' awareness of privacy on mobile software distribution platforms.