Introduction to the NLANR AMP Project for HPC sites

NLANR and NSF invite and encourage HPC Awardee sites to join the ongoing NLANR intersite Active Measurement Program. In this project, HPC sites and other collaborators participating in high performance networking activities are encouraged to host an active measurement monitor that makes connectivity, loss, and round-trip-time measurements, as well as on-demand throughput measurements, to other HPC sites.

This NLANR project is intended to improve the understanding of how high performance networks perform as seen by participating sites and users, and to help in problem diagnosis for both the network's users and its providers. The project will also provide a research platform on which to develop more useful network performance statistics, especially a throughput related measure that can be measured without causing a significant load on the network. Collaboration by the sites on goals, objectives, methodologies, and result presentation is encouraged. Tony McGregor (tonym@nlanr.net) is assuming the lead role for this NLANR AMP activity.

Host Site Requirements

Measurements will be taken from a rack-mountable [1] FreeBSD machine [2] with a 10/100 Mbps Ethernet interface [3] that will be provided and administered by NLANR. Traffic to the machine is normally expected to be on the order of 10-20 MB per day, but may be higher if event-initiated throughput tests are run. Apart from physical deployment of the machine, normal administration will be managed by NLANR, with the option of an additional local administrator at the site's request. This should minimize the burden on the deployment sites. Because the focus is on active measurements, user data will not be captured.
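As a rough check on the load figure above, the quoted daily volume can be converted into an average bit rate and compared with the link speed. The sketch below is illustrative only: the function names are hypothetical, 1 MB is assumed to be 10^6 bytes, and the traffic is assumed to be spread evenly over the day, which will not hold during event-initiated throughput tests.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_bps(megabytes_per_day: float) -> float:
    """Average bit rate for a daily volume, assuming 1 MB = 10**6 bytes."""
    return megabytes_per_day * 1_000_000 * 8 / SECONDS_PER_DAY


def link_share(megabytes_per_day: float, link_mbps: float) -> float:
    """Fraction of a link of link_mbps (megabits per second) used on average."""
    return average_rate_bps(megabytes_per_day) / (link_mbps * 1_000_000)


if __name__ == "__main__":
    # 10 and 20 MB per day are the bounds quoted in the announcement.
    for mb in (10, 20):
        rate = average_rate_bps(mb)
        share = link_share(mb, 10)  # compare against the slower 10 Mbps case
        print(f"{mb} MB/day is about {rate:.0f} bit/s, "
              f"or {share:.4%} of a 10 Mbps link")
```

Under these assumptions the routine measurements average roughly 1 to 2 kbit/s, well under a tenth of a percent of even a 10 Mbps connection. Throughput tests are bursty by nature, so this average says nothing about their peak load.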

The measurement machine must be located so that it can serve as a "representative machine" for the local campus. That means that if a campus exhibits a performance problem, e.g. low throughput across the HPC network, the measurement machine should see this problem. Or, from an HPC network point of view, if the support team becomes aware of performance issues, it should be possible to use the data from the measurement machine to investigate whether the problem lies in the HPC network or at the site. If the campus connection works and is usable, the measurement machine should be reachable at good performance levels as well. A 100 Mbps Ethernet connection and a 100 Mbps path to the HPC connection are desirable for throughput tests. It is important that the machine not be behind a firewall, so that it can be reached from other HPC sites.

A number of monitors have already been deployed. Data from these monitors can be viewed at http://moat.nlanr.net/Active. The AMP project, with NSF funding, tries to purchase and install the measurement machines as quickly as possible once sites have expressed interest.

Performance can be determined at a minimum of three different perimeters:

For the latter, we encourage sites to integrate departmental "user machines" into this NLANR AMP environment, by using comparable measurement methodologies between departmental hosts and the local AMP machine. Conceptually, this would create branches emanating from the site monitor, thus creating a measurement hierarchy at the perimeter to the meshed AMP machines.

We do need some questions answered by the sites prior to shipping a machine. A short version of the announcement is available as well.

If you have questions or issues, or if you want to discuss things further, please email ampstaff@nlanr.net. If you are willing to house a machine at your HPC campus, it would help us greatly if you could respond quickly, so we can get a better understanding of how many machines we have to purchase.


Footnotes

1 Pictures of the monitor are at http://moat.nlanr.net/AMP/Announcement/pictures.html

2 The specifications of the hardware of the active monitor are at http://moat.nlanr.net/AMP/Announcement/hardware.html

3 Currently the monitor can only be connected via a 10/100 Mbps Ethernet interface. Other interfaces may be made available if needed. Contact ampstaff@nlanr.net for more information.