Internet access traffic follows hourly patterns that depend on various factors, such as the periods users stay on-line at the access point (e.g. at home or in the office) or their preferences for applications. The clustering of Internet users may provide important information for traffic engineering and billing. For example, it can be used to set up service differentiation according to hourly behavior, resource optimization based on multi-hour routing and definition of tariffs that promote Internet access in low busy hours. In this work, we propose a methodology for clustering Internet users with similar patterns of Internet utilization, according to their hourly traffic utilization. The methodology resorts to three statistical multivariate analysis techniques: cluster analysis, principal component analysis and discriminant analysis. The methodology is illustrated through measured data from two distinct ISPs, one using a CATV access network and the other an ADSL one, offering distinct traffic contracts. Principal component analysis is used as an exploratory tool. Cluster analysis is used to identify the relevant Internet usage profiles, with the partitioning around medoids and Ward's method being the preferred clustering methods. For the two data sets, these methods lead to the choice of 3 clusters with different hourly traffic utilization profiles. The cluster structure is validated through discriminant analysis. It is also evaluated in terms of several characteristics of the user traffic not used in the cluster analysis, such as the type of applications, the amount of downloaded traffic, the activity duration and the transfer rate, resulting in coherent outcomes.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Maria Rosario de OLIVEIRA, Rui VALADAS, Antonio PACHECO, Paulo SALVADOR, "Cluster Analysis of Internet Users Based on Hourly Traffic Utilization" in IEICE TRANSACTIONS on Communications,
vol. E90-B, no. 7, pp. 1594-1607, July 2007, doi: 10.1093/ietcom/e90-b.7.1594.
URL: https://global.ieice.org/en_transactions/communications/10.1093/ietcom/e90-b.7.1594/_p
@ARTICLE{e90-b_7_1594,
author={Maria Rosario de OLIVEIRA and Rui VALADAS and Antonio PACHECO and Paulo SALVADOR},
journal={IEICE TRANSACTIONS on Communications},
title={Cluster Analysis of Internet Users Based on Hourly Traffic Utilization},
year={2007},
volume={E90-B},
number={7},
pages={1594--1607},
keywords={},
doi={10.1093/ietcom/e90-b.7.1594},
ISSN={1745-1345},
month={July},
}
TY  - JOUR
TI  - Cluster Analysis of Internet Users Based on Hourly Traffic Utilization
T2  - IEICE TRANSACTIONS on Communications
SP  - 1594
EP  - 1607
AU  - Maria Rosario de OLIVEIRA
AU  - Rui VALADAS
AU  - Antonio PACHECO
AU  - Paulo SALVADOR
PY  - 2007
DO  - 10.1093/ietcom/e90-b.7.1594
JO  - IEICE TRANSACTIONS on Communications
SN  - 1745-1345
VL  - E90-B
IS  - 7
JA  - IEICE TRANSACTIONS on Communications
Y1  - 2007/07//
ER  - 