Executable Name | Description |
sensorRouter, sensorPort, and sensorTime |
These bash shell scripts run
flow-tools and extract the desired data when sensors are either
routers, ports, or time. The measurements can be flows,
octets, or packets, broken down in any way that flow-stat supports.
For example, traffic can be divided by source or destination
port, IP address, or autonomous system (AS). An arbitrary filter
using flow-filter or flow-nfilter can also be applied to restrict the
input traffic, for example to particular ports or IP addresses.
(Flow-tools was created by Mark
Fullmer; information is available online.)
The output is the sparse data vectors, in a two-column text format. |
spl2dist |
This C-code executable inputs
the two-column sparse data vectors and outputs the distance between
each pair of vectors. When N sparse data vectors are input, an N
by N matrix is output.
The data vectors can optionally be normalized, so that distances use
percent of total rather than absolute traffic numbers. Distance is
calculated as the L2 (Euclidean) distance. |
wmds |
This C-code executable inputs
the N by N distance matrix and outputs
low-dimensional coordinates. The number of dimensions defaults to 2,
but can be set to any positive integer. The dimension reduction
is done using the weighted multi-dimensional scaling
(wMDS) method, as described in the paper. Arbitrary prior
coordinates can be set, along with the weights and weighting
scheme. Neighbors can be selected via K-nearest-neighbors, with
an arbitrary integer for K. |
coords2eps |
This C-code executable inputs N 2-dimensional coordinates and residuals e_i, and produces an EPS file that plots the sensor map. The axis limits can be chosen automatically or set on the command line. |
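The distance step described for spl2dist can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the C implementation: the exact file layout is assumed to be one `key value` pair per line, and the names `parse_sparse`, `normalize`, `l2_distance`, and `distance_matrix` are hypothetical helpers introduced here.

```python
import math

def parse_sparse(text):
    """Parse two-column text (assumed: 'key value' per line) into a dict."""
    vec = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        vec[key] = vec.get(key, 0.0) + float(value)
    return vec

def normalize(vec):
    """Convert absolute counts to fractions of the vector's total."""
    total = sum(vec.values())
    if total == 0:
        return dict(vec)
    return {key: value / total for key, value in vec.items()}

def l2_distance(a, b):
    """Euclidean distance; a key missing from one vector counts as zero."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def distance_matrix(vectors, normalized=True):
    """Return the symmetric N by N matrix of pairwise L2 distances."""
    if normalized:
        vectors = [normalize(v) for v in vectors]
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = l2_distance(vectors[i], vectors[j])
    return dist

# Two made-up sensors measuring flows per source address.
a = parse_sparse("10.0.0.1 3\n10.0.0.2 4\n")
b = parse_sparse("10.0.0.1 3\n")
matrix = distance_matrix([a, b], normalized=False)
```

Storing each vector as a dictionary keeps the computation proportional to the number of non-zero entries, which is presumably the point of the sparse two-column format.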
ATLA | Atlanta |
CHIN | Chicago North |
DNVR | Denver |
HSTN | Houston |
IPLS | Indianapolis |
KSCY | Kansas City |
LOSA | Los Angeles |
NYCM | New York City |
SNVA | Sunnyvale |
STTL | Seattle |
WASH | Washington |
Map-Tools
Visualization | Description |
Summary of the four weeks from 02-Jan
to 29-Jan: Mean and 1-standard-deviation
uncertainty ellipse of the router
maps over this period. Most location estimates fall within the
ellipse. Router maps are calculated every 5 minutes, with routers
as the sensors, each measuring the number of flows per source IP
address. Solid lines show actual connections in the Abilene backbone
network. |
|
Sunday, 02-Jan-2005 at 2:40 UTD:
This is a 'typical' map. Although there is some deviation from
the mean (e.g., WASH and DNVR), the routers are placed very close to their
4-week mean. Immediately after this time (at 2:45), a large port
scan dramatically changes the router map. Compare this map to the
following map. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Sunday, 02-Jan-2005 at 3:00 UTD:
There is a port scan occurring between 2:45 and 3:30 which involves two
source IP addresses sending a total of about 61,000 flows per 5
minutes. The traffic is measured only at CHIN, IPLS, DNVR, and
KSCY. The flows are coming from source IPs 198.59.80.0 (unknown)
and 140.113.200.0 (nctu.edu.tw) from port 48775 to destination IP
140.113.200.0 (du.se, Högskolan Dalarna, Sweden). The source
AS number is zero. Almost all of the flows are single, 29-byte
UDP packets to a wide range of destination ports. There are a
few larger flows (100-300 kB) to ports 22, 53, 6667, and
6669. Because of the low traffic level (it is a Sunday and
the day after New Year's Day), this traffic corresponds to 40% of the
total number of flows, so the map is dramatically changed: the
affected routers are pushed North, while all other routers are pushed
far South. CHIN, IPLS, KSCY, and DNVR are equally affected by
traffic from source IP 198.59.80.0, but only CHIN is affected by
traffic from 140.113.200.0 (nctu.edu.tw). This is why CHIN is not
located at exactly the same place as IPLS, KSCY, and DNVR. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Sunday, 02-Jan-2005 at 8:10 UTD:
There are about 13,000 flows going between two IP addresses:
129.171.184.0 (University of Miami, FL) and 64.4.16.0
(hotmail.com). There are about 6,000 flows originating from the U.
Miami address, from a wide variety of source ports to destination port
80 (TCP) of the hotmail.com address. Each flow contains 1-6 (an
average of 2) 40-byte packets. The hotmail.com address replies
with 1500-byte packets, from source port 80 to a wide range of
destination ports. While there are normally many flows from the
hotmail.com address, this traffic accounts for about 80% of the total
flows coming from the hotmail.com address. The map shows
the source and destination routers, ATLA and STTL, mapped very far from
their mean locations. Routers DNVR, KSCY, and IPLS are also
affected by the anomalous traffic and are grouped very close
together. HSTN traffic is usually very similar to ATLA traffic, but at
this time it is very different, so HSTN is placed very far away. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Wednesday, 5-Jan-2005 at 08:55 UTD:
At this time, there is scheduled
maintenance on the CHIN-IPLS link. Usually, IPLS and CHIN
traffic are very similar, but during the downtime much of the traffic
on Abilene reroutes through different links, such as a more Southern
route through WASH and ATLA. As a result, the router map shows a
much larger distance between IPLS and CHIN and a much flatter map,
since traffic on the Southern routers is, temporarily, highly correlated
with Northern traffic. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Thursday, 6-Jan-2005 at 17:55 UTD:
There is an anomaly that totals 90,000 flows at the CHIN router.
These are single, 40-byte packet flows from two source IP addresses in
Taiwan to a small range of destination IP addresses in Hungary.
This volume corresponds to about 25% of the typical flow volume on
CHIN. The traffic from the two Taiwanese source IP addresses was
observed on CHIN and no other router, thus distances between sensor
data recorded at CHIN and other routers are unusually high, and the 2-D
coordinates for CHIN must be kept very distant from all other
sensors. Also, because normalized distances are used to keep the
map size reasonably constant, the rest of the distances between routers
have shrunk to compensate. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Friday, 7-Jan-2005 at 15:45 UTD:
There is an anomaly that totals 20,000 flows at the CHIN, NYCM, and
WASH routers.
These are single, 40-byte TCP packet flows from source IP address
140.123.64.0 (ccu.edu.tw) in
Taiwan to destination IP address 128.112.128.0 (princeton.edu).
There is a range of source ports (between 1024 and 2048) and a range
of low destination ports (between 1 and 139).
Since the traffic was
observed on CHIN, NYCM, and WASH but on no other router, these three
routers are moved East in the router map, while IPLS and ATLA are moved
West, keeping the two groups far apart from each other. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Wednesday, 12-Jan-2005 at 20:15 UTD:
There is a large anomaly of 71,000 flows at the STTL, LOSA, and SNVA
routers.
These flows are single, 29-byte UDP packet flows from source IP address
163.30.88.0 (possibly tyc.edu.tw) to destination IP address 134.71.24.0
(csupomona.edu, California Poly in Pomona). The packets are from
source port 40150 to random destination ports.
Since the traffic was
observed on LOSA, SNVA, and STTL but no other router, these routers are
placed far away to the West, while the rest of the routers, due to the
constraint on total distances, are placed very close together. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Thursday, 20-Jan-2005 at 01:00 UTD:
There is a large number (14,000) of 29-byte packets from a 129.25.0.0
(Drexel U.) source IP address sent to a 131.252.120.0 (Portland State
U.) destination. The packets are UDP with source port 3095 or
3096 to a wide range of random destination ports >1024. These
packets travel through the WASH, NYCM, CHIN, IPLS, KSCY, DNVR, and STTL
backbone routers. Other routers (SNVA, LOSA, HSTN, and ATLA) do not see
any flows from this source address at this time. Distances between the
listed Northern routers and the other Southern routers are unusually
high. In the router map, there is a clear split in the map
between the two sets of routers. Legend: The 4-week mean location is connected to the current estimate by a dashed red line. The shading of the circle is proportional to the residual value e_i: dark indicates high residual and white indicates low residual. |
|
Destination port map for
01-Jan-2005 at 3:35 UTD (O dpo#), along with the map history for the
past hour (dotted-line circles). Sensors are attached to the top 30
destination ports (by total flows) and measure the number of flows per
source IP address. |
|
Other examples will
be added to this table in the future. |
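Several of the descriptions above rely on the same reasoning: traffic seen at only some routers inflates their distances to every other router, and a constraint on total distance then shrinks the remaining distances. The sketch below illustrates that effect with synthetic flow counts (not Abilene measurements). The router names R1 to R4, the helper names, and the choice to rescale distances so that they sum to a constant are all assumptions made for illustration; the normalization actually used by the tools may differ.

```python
import math

def shares(vec):
    """Convert flow counts to fractions of the router's total flows."""
    total = sum(vec.values())
    return {key: count / total for key, count in vec.items()}

def l2(a, b):
    """Euclidean distance between two sparse vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def pairwise(vectors):
    """L2 distance between the normalized vectors of every router pair."""
    names = sorted(vectors)
    unit = {name: shares(vectors[name]) for name in names}
    return {
        (a, b): l2(unit[a], unit[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

def rescale(dist, total=1.0):
    """Assumed map-size constraint: make all distances sum to `total`."""
    s = sum(dist.values())
    return {pair: d * total / s for pair, d in dist.items()}

# Synthetic flows per source address at four hypothetical routers.
quiet = {
    "R1": {"a": 50, "b": 30, "c": 20},
    "R2": {"a": 45, "b": 35, "c": 20},
    "R3": {"a": 20, "b": 30, "c": 50},
    "R4": {"a": 25, "b": 25, "c": 50},
}

# Add one scanning source, seen only at R1 and R2 (about 40% of their flows).
scan = {name: dict(vec) for name, vec in quiet.items()}
for name in ("R1", "R2"):
    scan[name]["scanner"] = 67

raw_before, raw_after = pairwise(quiet), pairwise(scan)
map_before, map_after = rescale(raw_before), rescale(raw_after)

for pair in sorted(map_before):
    print(pair, round(map_before[pair], 4), round(map_after[pair], 4))
```

In this toy case the raw distance between an affected and an unaffected router (R1 and R3) grows, while the rescaled distance between the two unaffected routers (R3 and R4) shrinks even though their own traffic is unchanged.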