Friday, October 31, 2003

Peer to Peer and the effects on an ISP

An interesting problem reared its ugly head the other week.

I used to manage the network and infrastructure for many clients. A couple of those clients were experiencing sub-optimal routing on their network, resulting in latency (sometimes as bad as 2 seconds!) and even dropped packets. I conducted all the usual tests and nothing seemed to point to an issue. My last-ditch attempt to identify the cause was to have a look at the routing equipment provided by our upstream providers.

I ran some tests on the Cisco 3640 transit router and found the CPU usage to be abnormally high (averaging 98%)... but why? The router is certainly capable of handling the upstream connectivity, so it had to be something else.

I investigated further and sent my findings up to the transit provider, who confirmed that the Cisco was at fault. We pulled the logs from the router to find out why this was occurring, and this led to some interesting results.

All routing equipment has a rating known as PPS (packets per second) - the number of packets it can process each second. The 3640 router has a threshold of 30,000pps, and it seemed from our logs that the overuse of the CPU was down to the router having to deal with more than its pps threshold.

But what could do such a thing? The nature of most Internet traffic is that a small request is sent outward and the data is then streamed back to the client - so what could be using constant processing and slowing up the entire network? The cause of the issue is actually down to a nice guy called Bram Cohen.

Bram is the author and inventor of BitTorrent, the Peer to Peer application credited with the slow demise of the Music/Movie Industry. The way that BitTorrent works is quite revolutionary. Instead of downloading a file from a single server (thus putting extreme load on one central system), the BitTorrent software allows a single file to be broken up into segments, downloaded from multiple sources and then 'restitched' together again on your computer.

Let me try to explain it without geekspeak: let's say that 100 people are downloading a file. Each one of those people is downloading segments from the other 99 people at the same time. Therefore, the file can be distributed faster and with no load on a central web server or system.

Now, the problem with this is that you may be downloading one file from 100 other people at the same time, but you could also be sharing (uploading) that file with another 100 people. One single file is therefore contributing to 200 processes (connections, each with its own stream of packets for the router to handle).

So, that's 200 processes for a single file. Let's say you are downloading 20 files: 200 x 20 equals 4,000 processes. Now multiply that by just 100 customers: 4,000 x 100 equals 400,000 processes...
...you get the idea.
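
The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: the function name is made up for illustration, and the figures (100 peers each way, 20 files, 100 customers) are simply the round numbers used above, not measurements.

```python
# Rough estimate of concurrent Peer to Peer processes, using the
# round figures from the post (assumptions, not measured values).
def concurrent_processes(peers_down, peers_up, files, customers):
    """Return (per file, per customer, total) process counts."""
    per_file = peers_down + peers_up      # downloads plus uploads for one file
    per_customer = per_file * files       # one customer sharing several files
    total = per_customer * customers      # every customer behind the router
    return per_file, per_customer, total


per_file, per_customer, total = concurrent_processes(
    peers_down=100, peers_up=100, files=20, customers=100
)
print(per_file, per_customer, total)  # 200 4000 400000
```

Changing any one of the four inputs shows how quickly the total climbs, which is the point of the exercise.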

We sent our results to Cisco, who confirmed that Peer to Peer software was the cause of this problem and asked me to carry out some tests - namely, blocking some of the ports known to be used by Peer to Peer software and checking to see how the router handled normal traffic.
This in itself posed an interesting problem: finding the ports responsible for Peer to Peer traffic.

I found all the ports that were obvious and compiled an Access List - the ports affected were as follows:

  • Kazaa and FastTrack clones: TCP and UDP port 1214
  • eDonkey and clones: TCP and UDP ports 4661 to 4672; TCP ports 5555, 4242, 3306, 2323, 6667, 7778
  • WinMX and Napster: TCP and UDP ports 6257, 6699
  • BitTorrent: TCP and UDP ports 6881 to 6889
  • Gnutella: TCP and UDP port 6346
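
To make the list concrete, here is a minimal Python sketch of the same port table with a lookup function. It is not the access list that went on the router (that was Cisco configuration, which isn't reproduced here); the names `BLOCKED` and `is_blocked` are hypothetical, and the port numbers are transcribed from the bullets above.

```python
# Port table transcribed from the list above. Each entry is an
# inclusive (low, high) range; single ports use low == high.
# BLOCKED and is_blocked are illustrative names, not router config.
BLOCKED = {
    "tcp": [
        (1214, 1214),                                    # Kazaa / FastTrack
        (4661, 4672), (5555, 5555), (4242, 4242),        # eDonkey
        (3306, 3306), (2323, 2323), (6667, 6667), (7778, 7778),
        (6257, 6257), (6699, 6699),                      # WinMX / Napster
        (6881, 6889),                                    # BitTorrent
        (6346, 6346),                                    # Gnutella
    ],
    "udp": [
        (1214, 1214),
        (4661, 4672),
        (6257, 6257), (6699, 6699),
        (6881, 6889),
        (6346, 6346),
    ],
}


def is_blocked(protocol, port):
    """True if the protocol/port pair falls inside a listed range."""
    ranges = BLOCKED.get(protocol.lower(), [])
    return any(low <= port <= high for low, high in ranges)


print(is_blocked("tcp", 6881))   # True: inside the BitTorrent range
print(is_blocked("udp", 5555))   # False: 5555 is listed for TCP only
print(is_blocked("tcp", 40000))  # False: not on the list at all
```

A fixed table like this only catches traffic that stays on the listed ports; anything on another port passes straight through.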

I put my access list in place and waited for the fallout over the next 24 hours. Surprisingly, there were no complaints and nobody seemed to notice any difference.

The router had been rebooted, which reset its CPU and memory usage, so it wasn't overloaded this time - the reboot apparently masked any effect of the supposed 'fix'. However, we saw no drop in our bandwidth usage, and a few selective checks gave me cause for concern.

I visited http://torrents.gentoo.org and started a download of the Gentoo ISO and, sure enough, the file would not download. However, I then visited a 'less legitimate' site and successfully started a download of a well-known movie currently in the cinema. But why? Surely the ports that I had blocked should stop this? Apparently not.

Upon further investigation, I found I had missed some key information that made this entire 'test' pointless. Legitimate Peer to Peer software applications which use standard 'trackers' to initiate requests were indeed blocked (meaning that all legitimate use of Peer to Peer was being blocked) - however, illegal use of BitTorrent (software/movies/music/pornography etc) has evolved and become a lot smarter.

The newer Peer to Peer hybrids no longer use the 'standard' ports above. Because so many corporate providers block these ports on firewalls and routers, software writers have gotten smarter and are now using any port between 29100 and 65535.

Blocking this number of ports is simply impossible - not only because the load put on any device to process this number of rules would result in exactly the same problem (latency and speed issues), but because it would also affect normal surfing, online gaming and, in fact, pretty much all Internet access.
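
A quick Python calculation gives a feel for the size of that range. The 29100 to 65535 bounds come from the text above; the comparison with the IANA dynamic/private port range (49152 to 65535) is my own added assumption about where ordinary client connections may pick their temporary ports, and real operating systems vary on this.

```python
# How much of the port space would a 29100-65535 block cover?
LOW, HIGH = 29100, 65535          # range quoted in the post
TOTAL_PORTS = 65536               # ports 0 through 65535

# Assumption: IANA dynamic/private range, which many ordinary client
# connections may draw their temporary ports from (OS-dependent).
DYN_LOW, DYN_HIGH = 49152, 65535

ports_in_range = HIGH - LOW + 1
share = ports_in_range / TOTAL_PORTS

# Size of the overlap between the blocked range and the dynamic range.
overlap = max(0, min(HIGH, DYN_HIGH) - max(LOW, DYN_LOW) + 1)
dynamic_size = DYN_HIGH - DYN_LOW + 1

print(ports_in_range)             # 36436
print(round(share * 100, 1))      # 55.6 (percent of all ports)
print(overlap == dynamic_size)    # True: the dynamic range is fully covered
```

Under that assumption, a blanket block on the range would take in more than half of all port numbers, including every port in the dynamic range, which is one way to see why ordinary traffic would suffer.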

So, what do you do? In an online world where illegal downloading of software/music/movies seems to be the norm, how do you monitor/track/stop it? Is it possible?

If and when ISPs start to block this kind of traffic, new providers will simply pop up charging a little more for the service, but with no port restrictions. It's happened before with newsgroups and IRC.

Can you force your customers into using a service which does not allow the use of those ports? If Microsoft have their way (http://www.theregister.com/2005/06/16/filesharing_microsoft/), it'll be impossible. But, as always, only time will tell.

