The search functionality is under construction.
The search functionality is under construction.

Open Access
Stop-Probability-Based Network Topology Discovery Method

Yuguang ZHANG, Zhiyong ZHANG, Wei ZHANG, Deming MAO, Zhihong RAO

  • Full Text Views

    306

  • Cite this
  • Free PDF (4.6MB)

Summary :

Using a limited number of probes has always been a focus in interface-level network topology probing to discover complete network topologies. Stop-set-based network topology probing methods significantly reduce the number of probes sent but suffer from the side effect of incomplete topology information discovery. This study proposes an optimized probing method based on stop probabilities (SPs) that builds on existing stop-set-based network topology discovery methods to address the issue of incomplete topology information owing to multipath routing. The statistics of repeat nodes (RNs) and multipath routing on the Internet are analyzed and combined with the principles of stop-set-based probing methods, highlighting that stopping probing at the first RN compromises the completeness of topology discovery. To address this issue, SPs are introduced to adjust the stopping strategy upon encountering RNs during probing. A method is designed for generating SPs that achieves high completeness and low cost based on the distribution of the number of RNs. Simulation experiments demonstrate that the proposed stop-probability-based probing method almost completely discovers network nodes and links across different regions and times over a two-year period, while significantly reducing probing redundancy. In addition, the proposed approach balances and optimizes the trade-off between complete topology discovery and reduced probing costs compared with existing topology probing methods. Building on this, the factors influencing the probing cost of the proposed method and methods to further reduce the number of probes while ensuring completeness are analyzed. The proposed method yields universally applicable SPs in the current Internet environment.

Publication
IEICE TRANSACTIONS on Communications Vol.E107-B No.9 pp.583-594
Publication Date
2024/09/01
Publicized
Online ISSN
1745-1345
DOI
10.23919/transcom.2024EBP3001
Type of Manuscript
PAPER
Category
Network

1.  Introduction

Accurate and comprehensive understanding of network topology is crucial for ensuring the reliability, efficiency, and security of networks [1]-[3]. Network topologies based on Internet protocol (IP) interfaces provide a fundamental view of network structures and connections [4], [5], serving as an essential framework and starting point for advanced network analysis and management [6]-[8]. However, the vast expanse of the IP address space presents a huge target domain for topology discovery, where large-scale probing can consume substantial network resources and affect regular network communications [9], [10]. Consequently, obtaining a complete network topology using a limited number of probes remains a focal area of research.

Currently, network topology discovery methods at the IP level can primarily be categorized into two types: those based on traceroute and those using tomographic imaging techniques [11]-[13]. Traceroute-based methods are known for their simplicity and adaptability and are widely used in various scenarios, including path analysis, fault diagnosis, and performance testing [14], [15]. These methods involve configuring specific time to live (TTL) values in the headers of IP packets that trigger hop-by-hop devices along the path to send timeout response packets back to the source, thereby revealing the complete routing path [16], [17]. Based on this core principle, researchers have continually proposed optimizations to enhance detection capabilities and improve efficiency, thereby achieving significant advancements in this field.

Initially, trace variants based on user datagram [5] and transmission control protocols [18], [19] significantly enhanced the penetration ability of probes, enabling more effective traversal through network firewalls and address translation devices. The Paris-Traceroute employs flow identification techniques and has been instrumental in identifying multipath routing [20], [21]. Subsequently, large-scale high-concurrency probing technologies have markedly increased the efficiency and scope of network discovery. The TraceNet architecture proposed by Tozal et al. collected all IP addresses in each subnet along the probing path, overcoming the traditional limitation of a traceroute targeting a single endpoint [22]. Scamper was developed by Luckie et al. and utilized multi-threading techniques to enhance probing efficiency [23]. This method was employed in the Center for Applied Internet Data Analysis (CAIDA) Archipelago (ARK) project [24]. The ARK probing platform, with its extensive deployment of probing sources in over 40 countries, has significantly expanded the coverage of network topology discovery, enhanced data quality and magnitude, and provided valuable resources for researching global Internet structures. To further accelerate network-wide probing, Beverly [25] adopted and improved the stateless asynchronous packet-transmission mechanism introduced by Durumeric et al. [26]. This approach established independent and parallel probing and response threads, thereby substantially increasing the scanning efficiency. However, the improvements in probing efficiency using these methods were essentially achieved at the cost of proportionally increased probes. This additional load on the network affects the effective utilization of bandwidth and may interfere with normal real-time responses, thereby adversely affecting the accuracy of the discovery results. In response to this issue, the academic community has researched methods to reduce the number of probes sent during the discovery process, primarily by minimizing probe redundancy [9].

Donnet et al. introduced the DoubleTree algorithm, which optimizes the probing process by constructing two types of tree structures: a ‘source tree’ rooted at the probing source and a ‘target tree’ rooted at the probing target [27]. The algorithm sets local and global stop sets for both trees to record discovered IP addresses. Probing starts from a midpoint in the path and proceeds towards the roots of the source and target trees. Probing in that direction ceases once an address is detected in the stop set, thereby avoiding redundancy and effectively reducing the number of required probes. The two primary challenges of this method are determining the midpoint and cost of synchronizing information between probing sources in the target tree. The more recently developed FlashRoute probing method utilizes the DoubleTree principle of redundancy elimination and achieves higher parallelism by separating the sending of probes from the receiving responses [28]. FlashRoute uses a local stop set during the backward probing process to save the probes sent during discovery. Although these methods effectively reduce probing redundancy using stop sets, they face the problem of incomplete topology information collection in multipath routing. These methods immediately stop probing upon encountering a repeat node (RN), potentially missing multipath routes beyond that node.

This study proposes an optimized probing method based on stop probabilities (SPs) that builds on existing stop-set-based network topology discovery methods to address the issue of incomplete topology information owing to multipath routing. The core principle of this method is not to immediately cease probing when an RN is discovered, but rather to stop with a probability that increases the chances of discovering multipath routes. Extensive probing results were collected from sources located in different global regions to determine reasonable SPs, and the probability distribution of the number of RNs in the routing paths was calculated. The SP at a RN was set as the cumulative probability corresponding to all RNs encountered up to that point in the probing path. It was determined that probing should only cease with 100% certainty upon reaching the last RN. This was based on another statistical finding of this study that indicated a high likelihood of multipath routing beyond a RN.

Simulations were conducted using the CAIDA 2021 and 2022 ARK datasets. The results demonstrate the effectiveness of the determined SPs for probing sources in different regions and times. The proposed approach significantly complemented the missing topology nodes and links compared to stop-set-based methods, with increases of approximately 7% and 30%, respectively. In addition, the experimental results verified node and link discovery rates of over 99.85% and 99.45%, respectively, while maintaining a reduction in probes of approximately 25% to 55% compared with that of traditional methods. Furthermore, the number of probes was drastically reduced while obtaining almost identical topology information compared with traditional probing methods and those utilizing stateless asynchronous mechanisms. Additionally, the deployment cost of this method was low, requiring only a small amount of decision logic and limited additional storage space to record the RNs, which could be processed by any modern server.

The remainder of this paper is organized as follows. The probing scenarios are introduced in Sect. 2, including RNs, multipath routing, and statistical demonstrations of their impact on probing completeness. The design and generation of the proposed network topology discovery method based on SPs is described in Sect. 3. The performances in different regions, times, and scales are evaluated in Sect. 4, including comparisons with three other mainstream probing schemes. The practical application aspects of the proposed method are outlined in Sect. 5. Finally, the main findings of this study and a discussion of future research directions are summarized in Sect. 6.

2.  Scenario

This study focuses on large-scale topology probing, starting from a single source and targeting IPs dispersed across the Internet. This represents the most fundamental scenario in cyber-topology detection, which forms the basis of synthesizing independent results from various sources for distributed large-scale platforms, such as ARK. Probes tend to disperse in areas far from the source owing to the divergence of probing paths and power-law distribution of routing degrees observable from a single point perspective within the Internet [29]-[31]. However, probes closer to the source often converge along a limited number of repetitive routing paths, leading to network congestion and probe wastage [32]. The objective of this study is to eliminate this redundancy while ensuring the complete discovery of topology nodes and links.

Current stop-set-based probing methods rely on a “tree structure” of topology nodes. The DoubleTree method uses two tree structures: a ‘source tree’ that connects multiple targets to a single probing source, and a ‘target tree’ that connects multiple sources to a single target. In this study, the target tree was not considered for two reasons. First, this scenario involves only one probing source. Second, utilizing a target tree for redundancy elimination requires maintaining a shared global stop set among multiple sources, which poses synchronization challenges in practical operations. Although the FlashRoute method adopts the DoubleTree concept, starting from a ‘split point’ and conducting backward and forward probing, it only utilizes a source tree to eliminate redundant probes. Using a source tree for stop-set-based probing only requires maintaining a local stop set at the probing source to store previously detected nodes, making it easily achievable.

This study employs a backward probing method, which is a key technique for eliminating redundancy using a source tree. This method consists of two main steps: estimating the hop-count distance from the probing target to the source, and initiating backward probing from this estimated distance. The hop count can be approximately determined by sending Internet control message protocol (ICMP) echo requests to the target and reading the remaining TTL in the ICMP echo reply. FlashRoute estimates hop-count distances in its preliminary probing phase by sending probes to the traceroute-reserved port 33434 and analyzing the remaining TTL in the ICMP port unreachable responses from the host, achieving an accuracy of approximately 89.7% [28]. However, this study does not focus on the estimation of hop-count distances to the target because the proposed optimizations are unrelated to this aspect. Instead, the focus was on addressing the issue of missed detections during the process of eliminating redundancy using a stop set. Furthermore, backward probing reverses the approach of traceroute, which incrementally increases the TTL values. Instead, the proposed method decrementally decreases TTL values and progressively identifies path nodes as they move from remote locations towards the source. This technique was initially proposed by Govindan et al. and has become a widely accepted approach adopted by methodologies such as DoubleTree, FlashRoute, and Scamper [33]. The subsequent sections delve into backward probing, multipath routing, and the quantitative relationship between multipath routing and the occurrence of RNs.

2.1  Multipath Routing and Repeat Nodes

The probing scenario used in this study is depicted in Fig. 1. The monitor positioned at the center of the image acts as the probing source and launches topology probes towards the scattered targets in cyberspace. The six servers encircling the monitor represent numerous targets dispersed throughout the network space during large-scale probing. The circles between the monitor and each target signify the routing devices in the routing paths, including routers, switches, and servers. These routing devices, together with the monitor and probing targets, are collectively termed as ‘nodes’. The ‘links’ refer to the logical connections between nodes established via routing protocols, depicted as black lines with arrows between two nodes. The direction of the arrows corresponds to the probe direction in the traditional traceroute, which is inverted in this approach and employs backward probing.

Node 1 is the monitor’s gateway and the initial hop for outward probing. The traditional traceroute starts with a TTL value of 1 that incrementally increases, thereby receiving ICMP timeout responses in a sequence from hop-by-hop devices starting from Node 1. In this method, a probing path is constructed from the gateway to the target. This process is referred to as a single round of probing. For example, the monitor traces the path [1, 3, 4, 5, 6, Target1] when it initiates a round of probing towards Target 1, and so on for other targets.

Fig. 1  Probing scenario.

During backward probing, the TTL values of the probes gradually decrease instead of increase. The initial value is equal to the hop-count distance from the target to the monitor. Estimation of the target hop count is a separate research topic. For this study, it is assumed that the targets in the scenario are the nodes at the estimated hop counts. The target hop count is denoted by \(h\). The monitor sequentially sends probes with TTL values of \(h\), \(h\)-1, \(h\)-2, \(h\)-3…and back down to 3, 2, and 1. This process enables the monitor to receive feedback signals from hop-by-hop devices starting from the target, reconstructing the routing path in the reverse order from distant to near. In this example, the probing paths for backward probing towards the six targets are labeled as Path1 to Path6.

Multipath routing is a routing strategy that employs multiple paths for packet scheduling to balance the network load and enhance data transmission performance [34]. This mechanism results in the possibility of multiple routing paths between two nodes. For example, there are two routing paths from Node 1 to Node 3, \([1,3]\) and \([1,2,3]\). A similar situation exists from Node 2 to Node 8, as indicated by the dashed arrows in Fig. 1. Additionally, RNs within the probing paths are identified using filled node markers. In this study, RNs refer to those appearing on at least two different paths, such as Node 3 in Fig. 1 that is present in Path1 and Path2. The other RNs include Nodes 1, 2, 8, and 17.

In the stop-set-based topology probing process, probing ceases when the first RN is encountered. Thus, other nodes and links on alternate paths may be missed if there is multipath routing between the RN and probing source. Unfortunately, multipath routing is prevalent beyond most RNs, as discussed in Sect. 2.2.

All the statistical and simulation experiments in this study were based on the CAIDA ARK probing dataset [35]. Each probing source in the dataset probed 24 different prefixes. During the probing of each of the 24 prefixes, only one target was randomly selected.

2.2  Statistics

There were 15 probing sources selected from the results dated September 2, 2022. Three sources were selected from each continent: Africa, America, Asia, Europe, and Oceania. These sources collectively probed 1,204,753 targets, with each target probed only once. The dispersed nature of these targets align with the probing scenario used in this study. The variability of targets across different sources illustrated general trends. Backward probing was simulated across all paths without employing redundancy-elimination techniques. The number of paths, nodes, and RNs were tallied for each source, followed by multipath routing.

The statistical outcomes were split into two sub-figures sharing the same x-axis, as shown in Fig. 2. The x-axis represents the location of the probing sources, formatted as “Continent - City Abbreviation - Country/Region Abbreviation”. The lower sub-figure shows the number of probing paths per source. In the upper sub-figure, each source is represented by three bars. The bars on the left indicate the number of unique nodes in the probe. The middle bars represent the number of RNs, with the percentages labeled above. The bars on the right represent the number of RNs followed by multipath routing, also denoted with a percentage. For convenience, the nodes indicated by the right bars are referred to as “Multipath RNs” (MPRNs).

Fig. 2  Statistics of paths, nodes, RNs, and RNs followed by multipath routing.

In the lower sub-figure, the number of probing paths per source ranged from 66,240 to 96,439, reflecting the scale of large-scale probing. In the original result files, asterisks (*) that indicated hops without response messages were excluded as RNs and the paths were deduplicated.

In the upper sub-figure, the focus is on the quantitative relationship between the nodes, RNs, and MPRNs. First, the number of RNs generally account for one-third of the total node count across all sources, suggesting a similar likelihood of encountering RNs worldwide. Second, the median percentage of MPRNs to RNs was 95% among the 15 sources, with 10 sources exceeding 90%, including four reaching 99%. This finding indicates that it is highly likely that multipath routing exists beyond an RN if it appears on a probing path. Therefore, there is a high probability of missing nodes and links on the multipath routes if probing ceases at the first RN. Hence, there is a clear need to explore methods that allow for a flexible determination of when to stop probing.

3.  Method

This section describes the stop-probability-based discovery method that optimizes the completeness of topology discovery in the stop-set approach by modifying the stop-decision rules at the RN. Subsequently, a method is introduced for generating SPs based on the distribution of RN occurrences.

3.1  Description

The concept of SP is introduced to regulate the timing of the probing cessation, denoted as \(SP_n\) and expressed as:

\[\begin{equation*} SP_n=P_n (\textit{Stop at } RN_n), \tag{1} \end{equation*}\]

where \(n\) represents the sequence number of the RN encountered during probing. The right-hand side of the equation signifies the probability of stopping probing upon encountering the \(n\)th RN.

For a probing source, the SP is a sequence tailored to RNs with varying sequence numbers, which is expressed as:

\[\begin{equation*} SP_{\textit{Source}}=[SP_{1},SP_{2},SP_{3},\ldots ,SP_{n}], \tag{2} \end{equation*}\]

Different probing sources may have a distinct \(SP_{\textit{Source}}\).

At \(RN_n\), the process of making a stop decision is represented by the function \(F(SP_n)\). The outcome of this computation follows a Bernoulli distribution, which is formally expressed as:

\[\begin{eqnarray*} &&\!\!\!\!\! P(F(SP_n)=\textit{true})=SP_n \tag{3} \\ &&\!\!\!\!\! P(F(SP_n)=\textit{false})=1-SP_n \tag{4} \end{eqnarray*}\]

The probing process based on SP, taking Path1 as an example, is illustrated in Fig. 3. Here, the backward probing did not cease until an RN was detected. Upon reaching Node 3, it was identified as \(RN_1\). Thus, \(SP_1\) was used for the calculation. To avoid confusion with other calculations, this process is denoted as \(F_3^1(SP_1)\), where superscript 1 indicates Path1, and subscript 3 denotes Node 3. The probing stops or continues if \(F_3^1(SP_1)=\textit{true}\) or false, respectively. If probing does not stop at Node 3, the monitor then encounters Node 1, identified as \(RN_2\), and subsequently calculates \(F_1^1(SP_2)\), and so on. The decision-making scenarios for stopping across all six probing paths are shown in Fig. 3.

Algorithm 1 formally defines the topology probing method based on SP. The function \(\textit{estimateInitialTTL}(t)\) estimates the initial TTL value for backward probing based on target \(t\). This estimation is contingent on a stop-set-based probing algorithm that employs the proposed optimization technique. This was implemented differently in the DoubleTree and FlashRoute algorithms, which are not discussed in this paper. \(Halt()\) denotes a primitive, indicating a scenario in which the probing process must be ceased for reasons other than those already specified, such as the detection of routing loops.

Fig. 3  Process of probing based on SPs.

SP is the only configurable parameter within the algorithm. The configuration of the SP value is further explained in Sect. 3.2.

3.2  Stop Probability

Different SP values were assigned to RNs of varying orders within the probing paths. This section introduces a method for configuring the SP based on the distribution of the RN numbers and calculating their values.

3.2.1  Algorithm

The primary objective of the proposed method was to ensure the completeness of information, followed by the reduction of redundant probes. Section 2.2 highlighted that there was a high likelihood of multipath routing beyond RNs. The detection of multipath routes beyond other RNs is maximized if the SP can be adjusted such that the probing stops precisely at the last RN. The last RN in a backward probing path is the probing source itself, thus multipath routes beyond this node do not need to be considered.

Building on this concept, if \(j\) paths contain \(n\) RNs in \(i\) probing paths, then probing should stop at a ratio of \(j/i\) for an individual \(RN_n\). This is expressed as:

\[\begin{equation*} P_{\textit{single}\_n}\left(\textit{Stop at } RN_n\right)= \frac{\left|\textit{Path}_n\right|}{|\textit{Path}|}\times 100\%, \tag{5} \end{equation*}\]

where Path denotes the set of all paths, \(\textit{Path}_n\) denotes the set of paths containing \(n\) RNs, and \(P_{\textit{single}\_n}(\textit{Stop at } RN_{n})\) represents the probability of stopping at that node considering only \(RN_n\).

Furthermore, if there are other RNs before \(RN_n\) in a single round of probing, the probability of stopping at \(RN_n\) should be the cumulative sum of its individual SPs and all previous RNs. This is expressed as:

\[\begin{equation*} SP_n=\sum_{k=1}^n P_{\textit{single}\_n}(\textit{Stop at } RN_n), \tag{6} \end{equation*}\]

which defines the SP for \(RN_n\).

There will be no reduction in probes if each probing instance reaches the probing source. However, the simulation results have shown that the process often stops midway with longer probing paths, which effectively eliminates redundancy near the probing source. Nodes near the probing source that are prone to redundant probing can be covered by shorter probing paths, thereby ensuring high completeness. Next, the values for SP will be calculated.

3.2.2  Values of SPs

The distribution of \(P_{\textit{single}\_n}\) across the probing sources in 15 global regions was analyzed using probing data from January 2, 2022. The results are shown in Fig. 4, where the x-axis represents the number of RNs in the path. The graph revealed a common pattern across different probing sources, whereby the probability initially increased and then decreased as the number of RNs increased. For most probing sources, the peak number of RNs was concentrated between 10 and 15, with the highest probability of approximately 14%. Paths containing more than 25 RNs were rare. These results demonstrated that the distribution of resources and routing strategies in general global networks did not vary significantly from the observational perspective of probing sources in different regions. This stability was utilized to predict the number of RNs that were likely to be encountered in a wide range of scenarios, allowing for the design of an optimized probing method with universal applicability.

Fig. 4  Distribution of RN numbers.

The SP values calculated based on Fig. 4 are shown in Fig. 5. Probing easily progressed from \(RN_1\) to \(RN_5\) owing to the lower SPs, whereas, probing was highly likely to stop beyond \(RN_{25}\) because the possibility of encountering a greater number of RNs in the path was low. It is noteworthy that \(SP_{\textit{oceania$\text{-}$per$\text{-}$au}}\) and \(SP_{\textit{oceania$\text{-}$wlg$\text{-}$nz}}\) exhibited the most significant variance, indicating that the probing source oceania-per-au encountered the fewest RNs, whereas oceania-wlg-nz encountered the highest number of RNs. This suggested that the probing paths from oceania-wlg-nz were more complex and generally longer. The effectiveness of the proposed probing method was verified through an experimental assessment, as outlined in Sect. 4.

Fig. 5  SPs.

4.  Experiment

This section presents an evaluation of the SP-based probing method through simulation experiments. The three metrics used to compare probing effectiveness are presented in Sect. 4.1. The effectiveness of the SPs across different regions and over different times is presented from Sect. 4.2 to Sect. 4.4, including a comparison with existing methods to highlight the advantages and characteristics of the proposed method. Finally, the impact of the scale of probing on effectiveness is analyzed.

4.1  Evaluation Metrics

The following three metrics were employed to experimentally evaluate the proposed method.

  • Node Discovery Rate: The proportion of detected nodes to the total nodes in the original probing paths, post-deduplication.

  • Link Discovery Rate: The proportion of detected links to the total links in the original probing paths, post-deduplication.

  • Probe Reduction Rate: The ratio of the number of probes saved to the total number of probes sent in the original probing, post-redundancy optimization.

In these metrics, the original probing paths refer to the paths used from the CAIDA dataset. Scamper’s probing paths were considered as the “true paths” for this assessment. The backward probing simulation started from the targets of these true paths and scanned backward until the probing stopped because of the stop decision or upon reaching the probing source.

In our evaluation experiments, we removed the “*” symbols from the CAIDA dataset, which represent “non-responsive nodes”, and deduplicated the paths after their removal. By excluding these nodes, we can reduce noise and errors in the analysis process, thereby improving the accuracy of the final results. Additionally, removing the unresolved nodes helps avoid confusion in interpreting and understanding the research findings, thereby enhancing the interpretability of the research discoveries.

4.2  Effectiveness of Stop Probability across Different Regions

The experiment used probing data collected on September 2, 2022. The range of path counts in datasets from different regions is 66,240–96,439, with an average of 80,317 paths. Three different sets of SPs were selected for evaluation based on the results from Sect. 3, including two extreme cases, \(SP_{\textit{oceania$\text{-}$per$\text{-}$au}}\) and \(SP_{\textit{oceania$\text{-}$wlg$\text{-}$nz}}\). The arithmetic mean of all SPs was also included, denoted as \(SP_{\textit{average}}\) and expressed as:

\[\begin{eqnarray*} &&\!\!\!\!\! SP_{\textit{average}}[j]=\frac{1}{|S|}\sum_{SP\in S}SP[j],\nonumber \\ &&\!\!\!\!\! S=\left\{SP_{\textit{africa$\text{-}$dar$\text{-}$tz}},SP_{\textit{africa$\text{-}$hla$\text{-}$za}},\ldots, SP_{\textit{source}}\right\} \tag{7} \end{eqnarray*}\]

The node discovery, link discovery, and probe reduction rates for different probing sources under various SPs are shown in Figs. 6, 7, and 8, respectively. The node and link discovery rates in all scenarios were above 99.85% and 99.45%, respectively, demonstrating exceptionally high probing completeness. The effect of SP on the probe reduction rate was significant. \(SP_{\textit{oceania$\text{-}$per$\text{-}$au}}\) exhibited the fastest increase in SP and resulted in the highest reduction rate at approximately 45%. Conversely, \(SP_{\textit{oceania$\text{-}$wlg$\text{-}$nz}}\) exhibited the smallest reduction rate of approximately 15%.

These figures collectively illustrate a trade-off between topology discovery completeness and probe reduction rate under the influence of SP. Lower SPs enhanced completeness but decreased the reduction rate, whereas higher SPs increased the reduction rate but also reduced completeness.

Fig. 6  Node discovery rates of probing sources from different regions under various SPs.

Fig. 7  Link discovery rates of probing sources from different regions under various SPs.

Fig. 8  Probe reduction rates of probing sources from different regions under various SPs.

It is noteworthy that the discovery rates for nodes and links were nearly 100% even under the least favorable conditions, indicating the effectiveness of SP in probing sources across different regions. For the remaining experiments in this study, \(SP_{\textit{oceania$\text{-}$per$\text{-}$au}}\) was used to evaluate subsequent metrics. \(SP_{\textit{oceania$\text{-}$per$\text{-}$au}}\) was selected as the baseline because of its ability to significantly reduce the number of probes while maintaining a minimum acceptable rate of discovery for topological elements. This approach is beneficial for evaluating the effectiveness of our method in identifying topological elements under the least favorable conditions, and it also highlights the potential of the method to reduce probing costs.

4.3  Effectiveness of Stop Probability over Different Times

This experiment used probing data collected from January 2021 to October 2022 that consisted of eight sets of results at three-month intervals. Each set comprised one probing source from each of the five continents. The range of path counts in datasets from different regions and times is 55,669–128,740, with an average of 80,936 paths. The node discovery, link discovery, and probe reduction rates for the different probing sources over time are shown in Figs. 9, 10, and 11, respectively.

Fig. 9  Node discovery rates of probing sources from different regions over different times.

Fig. 10  Link discovery rates of probing sources from different regions over different times.

Fig. 11  Probe reduction rates of probing sources from different regions over different times.

The node and link discovery rates in all cases were above 99.93% and 99.65%, respectively, indicating excellent probing completeness. The intertwined lines of different sources reflected random fluctuations when the values approached their maxima owing to computational errors, as shown in Figs. 9 and 10. The probe reduction rates exhibited two characteristics. Vertically, the reduction rates across the various sources were relatively stable, showing little dynamic variation. High completeness combined with stable reduction rates indicated the long-term effectiveness of SP. Horizontally, there was a variance in reduction rates among different sources, suggesting that the attributes of the probing sources affected the magnitude of the reduction rates. This topic will be explored further.

The significant disparities in the distribution of probe reduction rates across different sources ranging from 26.2% to 54.6% are shown in Fig. 11. Research has indicated a strong correlation between the reduction rate and average path length. The average path length refers to the mean number of hops in the original path of each probe source. Longer average paths correspond to higher reduction rates. For example, the reduction rate for the america-bwi-us source peaked at 54.6% on April 1, 2021, which coincided with the highest average path length of 18.47 hops. The average path lengths from Fig. 11 which show consistent patterns are listed in Table 1. This phenomenon occurred because shorter paths generally have fewer RNs and utilize smaller SPs in stop decisions, resulting in deeper probing. Conversely, longer paths with more RNs may stop probing midway because of increased SPs, leaving a larger proportion of unexplored paths. This suggests the need to configure different SPs for probing sources with varying average path lengths. It is strongly believed that configuring a higher SP for oceania-per-au could increase its reduction rate by more than 50% compared with that of the america-bwi-us case. The task of individually configuring the SPs for different probing sources should be pursued in future studies.

Table 1  Average path length (unit: Hops).

4.4  Comparison of Different Probing Methods

This experiment used data collected on October 2, 2022, and one probing source was selected from each of the five continents. The range of path counts in datasets from different regions is from 67,858–98,121, with an average of 80,887 paths. Existing probing methods, including the proposed method, were categorized into four groups. The first category included traditional probing methods, such as traceroute and Scamper, which served as the baseline in the dataset. The second category was the proposed method, and the third included stateless asynchronous packet transmission and reception probing methods, such as Yarrp. For Yarrp probing, it was necessary to set a maximum TTL value to determine the stop condition [23]. In addition, the default configuration \(TTL_{max}=32\) was used in this category. The fourth category consisted of stop-set-based probing methods, such as DoubleTree and FlashRoute. It is noteworthy that the initial hop-count estimation step is omitted in these methods, and backward probing starts directly from the identified target. The node discovery, link discovery, and probe reduction rates for different methods across various probing sources are shown in Figs. 12, 13, and 14, respectively.

Fig. 12  Node discovery rates of probing sources from different regions using different methods.

Fig. 13  Link discovery rates of probing sources from different regions using different methods.

Fig. 14  Probe reduction rates of probing sources from different regions using different methods.

The proposed and stateless methods achieved complete probing with negligible loss using traditional methods as a baseline, as shown in Figs. 12 and 13. The proposed method demonstrated an approximately 7% higher node discovery rate and approximately 30% higher link discovery rate than those of the stop-set-based methods.

The probe reduction rate of the baseline method was 0, as shown in Fig. 14. The proposed method reduced the number of probes by 26.55% to 49.59% compared with that of traditional methods. In contrast to stop-set-based methods, the proposed approach required sending 35% to 50% more probes, which was a trade-off for completeness in probing. Stateless probing methods needed to send more than twice as many probes to the network, significantly lagging behind other methods in terms of redundancy reduction.

The average path lengths for the five probing sources were 16.74, 15.60, 13.06, 12.42, and 10.83 hops. As shown in Fig. 14, the variations in length led to changes in the metrics. For example, the oceania-per-au probing source needed to send \((32-10.83)=21.17\) additional probes on average per path in the Yarrp method, which was twice the baseline amount. The probe reduction rate could be enhanced by configuring specific SPs for different probing sources based on their average path lengths, thereby narrowing the gap with the stop-set methods. It is believed that this approach could increase the reduction rates of other sources to match the current level of africa-dar-tz. This concept will be applied in future studies.

4.5  Impact of Probing Scale

This experiment used data collected on October 2, 2022. The range of path counts in datasets from different regions is from 67,858–98,121, with an average of 80,887 paths. The impact of the probing scale on the node discovery, link discovery, and probe reduction rates are shown in Figs. 15, 16, and 17, respectively. The number of probing paths exponentially increased along the x-axis, and the paths were randomly selected from the original data according to the required quantity.

Fig. 15  Impact of probing scale on node discovery rate.

Fig. 16  Impact of probing scale on link discovery rate.

Fig. 17  Impact of probing scale on probe reduction rate.

There was a fluctuation in the completeness of topology discovery when the scale of the paths ranged from 100 to 1000, as shown in Figs. 15 and 16. This was due to the randomness of path selection. Beyond this range, the effect of randomness diminished as the number of paths increased, and the rates of node and link discovery converged to above 99.95% and 99.7%, respectively.

The number of probes saved also increased as the probing scale expanded, benefiting from the presence of more RNs, as shown in Fig. 17. It should be noted that this experiment was limited by the number of paths in the original dataset. However, it can be inferred that the continued expansion of the probing scale would yield a higher reduction rate. This also demonstrates the advantage of intensive probing of regional targets.

Based on the above observations, the proposed method ensured the completeness of topology discovery in all scenarios covered by the simulations while significantly reducing the cost of probing. This represents an optimized approach that addresses the shortcomings of stopset-based probing methods. The process to employ this method in network environments is presented in Sect. 5.

5.  Deployment and Application

The proposed method can easily be deployed in addition to stop-set-based probing techniques. \(SP_{\textit{source}}\) can be configured as a constant or optional parameter and stored in the monitor to make stop decisions upon detecting an RN, posing no burden to the current servers.

For all probing sources, the proposed approach is particularly suitable for large-scale network topology probing or the intensive probing of regional targets. In simulation experiments, the proposed method demonstrated excellent completeness in topology discovery, although the probe reduction rate was influenced by various factors. The factors affecting the probe reduction rate and their impact on probe completeness are summarized in Table 2. The upward and downward arrows indicate increases and decreases, respectively. The horizontal line denotes no impact. In practice, the probing strategy can be adjusted according to the different application conditions.

Table 2  Factors affecting probe reduction rate. The upward and downward arrows indicate increases and decreases, respectively, and horizontal line denotes no impact.

Having identified the characteristics that influenced the probe reduction rate, SP configuration strategies were further investigated under varying probing source attributes, target regions, network infrastructures, and probing requirements. The probing task was divided into two phases: pre-probing and probing. Pre-probing gathers essential information to select the \(SP_{\textit{source}}\) based on best practices, which is then employed in the probing phase. This approach enhances the adaptability of the method to various network conditions and topological structures.

The proposed SP essentially reflects the overall multipath routing strategy in cyberspace, which does not change frequently, as discussed in Sect. 4.3. However, the effectiveness of the proposed SP generation method based on the distribution of RNs would need to be reassessed if network operators alter this strategy and cause significant transformative changes in the Internet topology, such as changes in RN and multipath routing patterns owing to software-defined networking. In such cases, the SP generation algorithm would need to be updated to align with the specific conditions.

To address this challenge, we recommend using periodic traditional comprehensive scanning to detect such changes. Although this method may increase the cost of probing, it can provide the most accurate information on the network state. Based on the scanning data, it is possible to further establish a feedback-based stop probability adjustment mechanism by implementing an adaptive system. This system evaluates the current discovery rate based on the most recent traditional in-depth scanning data, and dynamically adjusts the stop probability to maintain the required topology discovery rate. Additionally, continuous monitoring of open-source intelligence on network hardware and software within the area of interest can also provide significant support for detecting substantial changes in Internet topology.

6.  Conclusion

This study presented an optimization method that enhanced the completeness of topology discovery in advanced stop-set-based network probing techniques using SPs.

Through a statistical analysis of global probing data, it was found that there was a high likelihood of subsequent multipath routing once an RN appeared in a routing path. This was a key reason for the incompleteness of stop-set-based network topology probing methods. Therefore, SPs were introduced to adjust the stopping strategy when encountering RNs, addressing high completeness and low-cost requirements. A stop-probability generation method was devised based on the distribution of the number of RNs that yielded universally applicable SPs.

Simulation experiments demonstrated the effectiveness of the proposed SP method across different regions and times. The proposed method achieved node and link discovery rates above 99.85% and 99.45%, respectively, using 15 probing sources across five continents and targeting randomly selected addresses in various /24 prefixes. Over the two-year period from 2021 to 2022, the node discovery rate never fell below 99.93%, the link discovery rate was always above 99.65%, and the probe reduction rate ranged from 26.2% to 54.6%.

The proposed approach effectively eliminated instances of missed detections compared with stop-set-based methods. Consequently, the node and link discovery rates increased by 7% and 30%, respectively, although additional probing was required. Comparatively, traditional and stateless methods incurred higher costs to discover the same range of topology information.

The fluctuations in the probe reduction rate were also studied. The greater the number of probing attempts and the longer the average path length, the higher the reduction rate. It is believed that moderately increasing the SP could further save probes for probing sources with shorter average path lengths. A small-scale pre-probing step was recommended to gather key information before proceeding with official probing, such as the average path length and SP based on best practices. Adjusting the SPs under different conditions will be the focus of future research.

The main contributions of this work are summarized as follows.

  1. The pattern of likely multipath routing after RNs was revealed through a statistical analysis of global topology probing results, which is a phenomenon observed worldwide.

  2. A network topology discovery method was proposed based on SPs.

  3. An SP configuration method was developed based on the distribution of RN numbers using global probing data to obtain universally applicable SPs.

  4. The effectiveness of the proposed method was assessed in comparison with state-of-the-art probing methods and its performance was validated through simulations in different regions, times, and scales.

Given the ever-increasing scale, complexity, and flexibility of the Internet, the current SP generation scheme may not be applicable to future networks in which the distribution of RNs or multipath routing strategies undergo significant changes. Therefore, devising new stop-decision strategies that balance topology discovery completeness while reducing redundancy remains an avenue for future research.

Acknowledgments

This work was financially supported by National Defense Basic Research Program of China (JCKY2021210B071), and National Key Research and Development Program of China (2022YFB3105000) fund.

References

[1] G. Mao and B.D. Anderson, “Towards a better understanding of large-scale network models,” IEEE/ACM Trans. Netw., vol.20, no.2, pp.408-421, 2011. DOI: 10.1109/TNET.2011.2160650
CrossRef

[2] P. Richter and A. Berger, “Scanning scanners: Sensing the Internet using a massively distributed network telescope,” Proc. Internet Measurement Conference pp.144-157, Oct. 2019. DOI: 10.1145/3355369.3355595
CrossRef

[3] R. Fontugne, C. Pelsser, E. Aben, and R. Bush, “Pinpointing delays and forwarding anomalies using large-scale traceroute measurements,” Proc. 2017 Internet Measurement Conference, pp.15-28, Nov. 2017. DOI: 10.1145/3131365.3131384
CrossRef

[4] H. Yang, C.L. Li, and Z. Sun, “A survey of Internet topology data-acquisition methods,” J. Commun, vol.52, no.8, pp.1811-1817, 2019. DOI: 10.3969/j.issn.1002-0802.2019.08.001
CrossRef

[5] R. Motamedi, R. Rejaie, and W. Willinger, “A survey of techniques for internet topology discovery,” IEEE Commun. Surveys Tuts, vol.17, no.2, pp.1044-1065, 2014. DOI: 10.1109/COMST.2014.2376520
CrossRef

[6] Z. Wang, Z. Li, G. Liu, Y. Chen, Q. Wu, and G. Cheng, “Examination of WAN traffic characteristics in large-scale datacenter network,” Proc. 21st ACM Internet Measurement Conference, pp.1-14, Nov. 2021. DOI: 10.1145/3487552.3487860
CrossRef

[7] R.B. Basat, G. Einziger, J. Gong, J. Moraney, and D. Raz, “q-MAX: A unified scheme for improving network measurement throughput,” Proc. Internet Measurement Conference, pp.322-336, Oct. 2019. DOI: 10.1145/3355369.3355569
CrossRef

[8] R. Li, K. Makhijani, and L. Dong, “New IP: A data-packet framework to evolve the Internet,” Proc. 2020 IEEE 21st International Conference on High-Performance Switching and Routing (HPSR), pp.1-8, May 2020. DOI: 10.1109/HPSR48589.2020.9098996
CrossRef

[9] Z. Luo, J. Liu, G. Yang, Y. Zhang, and Z. Hang, “High-speed path probing method for large-scale network,” Sensors, vol.22, no.15, p.5650, 2022. DOI: 10.3390/s22155650
CrossRef

[10] B. Donnet, P. Raoult, T. Friedman, and M. Crovella, “Efficient algorithms for large-scale topology discovery,” Proc. 2005 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, pp.327-338, June 2005. DOI: 10.1145/1064212.1064256
CrossRef

[11] Y. Li, X. Tian, K. Pham, and G. Chen, “A review of network topology inference methods,” Sensors and Systems for Space Applications XIV, vol.11755, pp.57-64, 2021. DOI: 10.1117/12.2588330
CrossRef

[12] S. Pan, P. Li, C. Yi, D. Zeng, Y.-C. Liang, and G. Hu, “Edge intelligence empowered urban traffic monitoring: A network tomography perspective,” IEEE Trans. Intell. Transp. Syst., vol.22, no.4, pp.2198-2211, April 2021. DOI: 10.1109/TITS.2020.3024824
CrossRef

[13] S. Pan, Z. Zhang, F. Yu, and G. Hu, “End-to-end measurements for network tomography under multipath routing,” IEEE Commun. Lett., vol.18, no.5, pp.881-884, May 2014. DOI: 10.1109/LCOMM.2014.040214.132838
CrossRef

[14] Y. Zhang, R. Oliveira, Y. Wang, S. Su, B. Zhang, J. Bi, and L. Zhang, “A framework to quantify the pitfalls of using traceroute in AS-level topology measurement,” IEEE J. Sel. Areas Commun., vol.29, no.9, pp.1822-1836, 2011. DOI: 10.1109/JSAC.2011.111007
CrossRef

[15] J.F. Grailet and B. Donnet, “Travelling without moving: Discovering neighborhood adjacencies,” Proc. Network Traffic Measurement and Analysis Conference, 2021.
URL

[16] M. Di Bartolomeo, V. Di Donato, M. Pizzonia, C. Squarcella, and M. Rimondini, “Extracting routing events from traceroutes: A matter of empathy,” IEEE/ACM Trans. Netw., vol.27, no.3, pp.1000-1012, 2019. DOI: 10.1109/TNET.2019.2911330
CrossRef

[17] V. Giotsas, T. Koch, E. Fazzion, Í. Cunha, M. Calder, H.V. Madhyastha, and E. Katz-Bassett, “Reduce, reuse, recycle: Repurposing existing measurements to identify stale traceroutes,” Proc. ACM Internet Measurement Conference, pp.247-265, Oct. 2020. DOI: 10.1145/3419394.3423654
CrossRef

[18] S. Savage, “Sting: TCP-based network measurement tool,” Proc. USENIX Symposium on Internet Technologies and Systems, vol.2, p.7, Oct. 1999.

[19] M. Luckie, Y. Hyun, and B. Huffaker, “Traceroute probe method and forward IP path inference,” Proc. 8th ACM SIGCOMM Conference on Internet Measurement, pp.311-324, Oct. 2008. DOI: 10.1145/1452520.1452557
CrossRef

[20] B. Augustin, X. Cuvellier, B. Orgogozo, F. Viger, T. Friedman, M. Latapy, and R. Teixeira, “Avoiding traceroute anomalies with Paris traceroute,” Proc. ACM SIGCOMM Conference on Internet Measurement, pp.153-158, Oct. 2006. DOI: 10.1145/1177080.1177100
CrossRef

[21] K. Vermeulen, S.D. Strowes, O. Fourmaux, and T. Friedman, “Multilevel MDA-lite Paris traceroute,” Proc. Internet Measurement Conference, pp.29-42, Oct. 2018. DOI: 10.1145/3278532.3278536
CrossRef

[22] M.E. Tozal and K. Sarac, “Tracenet: An internet topology data collector,” Proc. 10th ACM SIGCOMM Conference on Internet Measurement, pp.356-368, Nov. 2010. DOI: 10.1145/1879141.1879188
CrossRef

[23] M. Luckie, “Scamper: A scalable and extensible packet prober for active measurement of the Internet,” Proc. 10th ACM SIGCOMM Conference on Internet Measurement, pp.239-245, Nov. 2010. DOI: 10.1145/1879141.1879171
CrossRef

[24] Y. Hyun and C. Claffy, “Archipelago measurement infrastructure,” 2021. Available online: http://www.caida.org/projects/ark/, accessed on 24 Nov. 2023.
URL

[25] R. Beverly, “Yarrp’ing the Internet: Randomized high-speed active topology discovery,” Proc. 2016 Internet Measurement Conference, pp.413-420, Nov. 2016. DOI: 10.1145/2987443.2987479
CrossRef

[26] Z. Durumeric, E. Wustrow, and J.A. Halderman, “ZMap: Fast Internet-wide scanning and its security applications,” Proc. 22nd USENIX Security Symposium (USENIX Security 13), pp.605-620, 2013.

[27] B. Donnet, P. Raoult, T. Friedman, and M. Crovella, “Deployment of an algorithm for large-scale topology discovery,” J. Sel. Areas Commun., vol.24, no.12, pp.2210-2220, 2006. DOI: 10.1109/JSAC.2006.884019
CrossRef

[28] Y. Huang, M. Rabinovich, and R. Al-Dalky, “FlashRoute: Efficient traceroute on a massive scale,” Proc. ACM Internet Measurement Conference, pp.443-455, Oct. 2020. DOI: 10.1145/3419394.3423619
CrossRef

[29] M. Faloutsos, P. Faloutsos, and C. Faloutsos, “On power-law relationships of the Internet topology,” ACM SIGCOMM Comput. Commun. Rev., vol.29, no.4, pp.251-262, 1999. DOI: 10.1145/316194.316229
CrossRef

[30] A. Lakhina, J.W. Byers, M. Crovella, and P. Xie, “Sampling biases in IP topology measurements,” Proc. Twenty-second Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM) (IEEE Cat. no.03CH37428), vol.1, pp.332-341, March 2003. DOI: 10.1109/INFCOM.2003.1208685
CrossRef

[31] A. Clauset and C. Moore, “Traceroute sampling makes random graphs appear to have power law degree distributions,” arXiv preprint cond-mat/0312674.
URL

[32] S. Pan, P. Li, D. Zeng, S. Guo, and G. Hu, “A Q-learning based framework for congested link identification,” IEEE Internet Things J., vol.6, no.6, pp.9668-9678, Dec. 2019. DOI: 10.1109/JIOT.2019.2930459
CrossRef

[33] R. Govindan, H. Tangmunarunkit, “Heuristics for Internet map discovery,” Proc. Nineteenth Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM) (Cat. no.00CH37064), vol.3, pp.1371-1380, March 2020. DOI: 10.1109/INFCOM.2000.832534
CrossRef

[34] K. Liu, W. Quan, D. Gao, C. Yu, M. Liu, and Y. Zhang, “Distributed asynchronous learning for multipath data transmission based on P-DDQN,” China Commun., vol.18, no.8, pp.62-74, 2021. DOI: 10.23919/JCC.2021.08.005
CrossRef

[35] CAIDA Data Server: Index of /datasets/topology/ark/ipv4/probe-data/team-1.Available online: https://publicdata.caida.org/datasets/topology/ark/ipv4/probe-data/team-1/, accessed on 24 Nov. 2023.
URL

Authors

Yuguang ZHANG
  Northwestern Polytechnical University,China Electronics Technology Cyber Security Co., Ltd.

is currently pursuing an EngD. degree from the School of Cybersecurity, Northwestern Polytechnical University (NPU) in Xi’an, China. His research interests include cyberspace reconnaissance and confrontations.

Zhiyong ZHANG
  China Electronics Technology Cyber Security Co., Ltd.,No.30 Institute of CETC

received his B.S. and Ph.D. degrees from UESTC in 2007 and 2014, respectively. He was a postdoctoral researcher at EPFL, Switzerland from 2014 to 2017. He is currently a senior engineer at the No.30 Institute of CETC. His current research interests include cyberspace cartography.

Wei ZHANG
  China Electronics Technology Cyber Security Co., Ltd.

received his Ph.D. in Optical Engineering from the School of Communication & Information, University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2017. He is currently a senior engineer at the No.30 Institute of CETC and China Electronic Technology Cyber Security Company, Ltd. His research interests include network concealment and confrontation.

Deming MAO
  China Electronics Technology Cyber Security Co., Ltd.

received B.S., M.S., and Ph.D. degrees in electronic and communication engineering from the School of Cybersecurity, Northwestern Polytechnical University (NPU), Xi’an, China in 2006, 2009, and 2021, respectively. He is the director of the Science and Technology Management Department, and is currently a professor with research projects at the No.30 Institute of CETC and China Electronic Technology Cyber Security Company, Ltd. His research interests include cyberspace security.

Zhihong RAO
  Northwestern Polytechnical University

is the chief scientist of the China Electronics Technology Group Corporation, a leading talent in the national ‘Ten Thousand Talents Program, and the project leader of a key special “Unk Software and System Vulnerability Analysis and Discovery Technology” project of the National Key Research and Development Plan “Unk Cyberspace Security”. He is a model worker at a central enterprise and an expert with a special government allowance from The State Council. His research interests include cyberspace security.

Keyword