Monday, November 17, 2008

An architecture for internet data transfer

In this paper, the authors propose an abstraction for data transfer. Control logic still stays with applications, and in particular, communications for control logic is separated from the data transfer. On the other hand, data transfer can be initiated as a separate service, where the application do not have to worry about the actual means of transfer, just that the receiver will eventually get the data transmitted by the sender. By providing this abstraction, the authors hope that various types of transfer techniques can be used and adopted widely.

To me, this is one of those papers that sounds like a good idea at first, but a little further thinking makes overturns that initial impression. I don't think all applications would want any kind of data transfer service; each application would have its own specific need, and would probably need to choose an appropriate type of transfer technique. Web surfing, downloading or video streaming would each require different types of services. One possibility would be for the application to specify its preference, but that defeats the purpose of the abstraction. Besides, applications can simply link against libraries of the desired transfer.

I'm also not sure how this would incorporate data streaming. Right now, the entire piece of data needs to be provided by the application before a hash can be computed for its ID. Obviously this is not possible for live streaming, where the application would not even know what the data is ahead of time, and the receiver wants to get the data when its available. One possibility would be to consider each chunk a separate piece of data and use the construct here, but that's clunky.

However, benefits such as load-balancing over multiple paths and caching of duplicate data does seem tempting. Perhaps a different abstraction that provides these benefits while letting the application choose the type of transfer would work better?

No comments: