Tuesday, November 25, 2008

A policy aware switching layer for Data Centers

This paper also talks about the mess of middleboxes used for traffic management, especially in data centers where massive amounts of data are sorted, stored, and looked up. Firewalls, load balancers, and the like are all spliced into the network, which makes changing policies and managing the network difficult because of the complicated structure. Several problems arise in current practice, including configuration difficulties, having to manipulate link costs, and having to create separate VLANs. The current approach is neither flexible nor scalable to manage, so this paper introduces a new switch to connect the components of the data center into a more manageable and flexible architecture.

The proposed switch is called a pswitch, which introduces a new layer between layer 2 and layer 3. It allows middleboxes to be taken off the main network path and attached to the switches instead. Inside the switches are policy rules and tables that network managers can easily manipulate and change. The pswitch routes traffic through the desired middleboxes before passing it on to the next hop, which gives a clean alternative to the tangle of middleboxes currently used to manage traffic in data centers.
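To make the rule-table idea concrete, here is a tiny sketch of the kind of policy table a pswitch-like device might consult; the traffic classes, middlebox names, and matching logic are my own illustrative placeholders, not the paper's actual format.

```python
# Hypothetical sketch of a pswitch-style policy table: each rule maps a
# traffic class to an ordered list of middleboxes the frame must traverse
# before it is forwarded toward its final destination.

POLICY_TABLE = [
    # (predicate, middlebox sequence) -- names are made up for illustration
    (lambda f: f["dst_port"] == 80,  ["firewall", "load_balancer"]),
    (lambda f: f["dst_port"] == 443, ["firewall"]),
    (lambda f: True,                 []),  # default: no middleboxes
]

def next_hop(frame, visited):
    """Return the next middlebox the frame still has to visit,
    or 'destination' once the policy sequence is exhausted."""
    for predicate, sequence in POLICY_TABLE:
        if predicate(frame):
            for box in sequence:
                if box not in visited:
                    return box
            return "destination"
    return "destination"

frame = {"dst_port": 80}
print(next_hop(frame, visited=set()))          # firewall
print(next_hop(frame, visited={"firewall"}))   # load_balancer
```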

Improving MapReduce Performance in Heterogeneous Environments

This paper is about scheduling MapReduce tasks in a multi-machine environment. MapReduce is the technique of splitting a job into many smaller tasks and mapping them out so that thousands of tasks can execute simultaneously. A popular open source implementation, Hadoop, developed largely at Yahoo, is commonly used to run MapReduce jobs on clusters. However, Hadoop makes several inherent assumptions that cause it to perform poorly in certain environments, chief among them the assumption that nodes are roughly homogeneous. As a result, when progress is monitored in order to speculatively execute tasks on idle nodes, the progress estimates can be poor, and node computing power gets wasted.

This paper proposes a new scheduler for MapReduce called LATE. The LATE scheduler focuses on the slowest tasks, which are the ones that will affect response time, and only reschedules the tasks that are farthest from finishing for duplicate execution on a faster node. This decreases the overall response time of the MapReduce job, and in a heterogeneous environment it gives much better estimates for the scheduler to work with. Performance can be up to 2 times faster than the original scheduler.
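Here is a rough sketch of the LATE heuristic as I understand it: estimate each task's progress rate, estimate its time left, and speculate only on the slow task that is farthest from finishing. The field names and the 25% slow-task cutoff are placeholders of mine, not the paper's exact constants.

```python
import time

def pick_speculative_task(tasks, slow_task_threshold=0.25):
    """Pick the running task with the longest estimated time left,
    provided its progress rate is among the slowest fraction of tasks.
    Each task is a dict with 'progress' (0..1) and 'start_time'."""
    now = time.time()
    candidates = []
    for t in tasks:
        elapsed = now - t["start_time"]
        if elapsed <= 0 or t["progress"] <= 0:
            continue
        rate = t["progress"] / elapsed              # progress per second
        time_left = (1.0 - t["progress"]) / rate    # estimated remaining time
        candidates.append((time_left, rate, t))
    if not candidates:
        return None
    # Only consider tasks whose rate is in the slowest fraction.
    candidates.sort(key=lambda c: c[1])
    slow_count = max(1, int(len(candidates) * slow_task_threshold))
    slow = candidates[:slow_count]
    # Among the slow tasks, duplicate the one farthest from finishing.
    return max(slow, key=lambda c: c[0])[2]

tasks = [{"progress": 0.9, "start_time": time.time() - 100},
         {"progress": 0.2, "start_time": time.time() - 100}]
print(pick_speculative_task(tasks))   # the straggler (20% done) is chosen
```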

Wednesday, November 19, 2008

Scalable Application Layer Multicast

This paper is again on multicast at the application layer. It focuses on a protocol for maintaining an overlay topology for efficient delivery of multicast data. The paper introduces the NICE tree, which is the topology of the overlay network. The idea is to use the locations of the nodes to form clusters of nodes that are close to each other. The center node of each cluster is the cluster leader, which communicates with other cluster leaders. The whole NICE tree is hierarchical, with multiple layers: all nodes are in the lowest layer, and cluster leaders are also members of the next layer up along with the other leaders. Communication works by passing data up to the highest layer possible and then back down the tree. Since nodes are clustered by how close they are to each other, communication within a cluster is fast, and across clusters the cluster leader talks to another cluster leader on the same layer. It's a bit like routing within an AS versus between ASes.

With a protocol like this, the most important thing to look at is how nodes join and leave and how the hierarchy is continuously maintained. The NICE tree has a Rendezvous Point that is in charge of telling a new node where to look; the new node then traverses down the hierarchy to find its spot. Another important situation is when a cluster leader leaves the network or goes down, in which case a new leader must be selected. That process is summarized in another paper and not really covered in this one. The topology definitely seems workable, and the simulation results weren't bad either. Seemed like a great paper.

A reliable multicast framework for light-weight sessions and application level framing

This week our focus is on multicast in the network. Both papers are on application-level approaches to multicast. This paper takes the approach of building a generic framework protocol for reliable multicast; the analogy they use is TCP for unicast, except the framework is best effort with eventual delivery of the message rather than strictly ordered delivery. It starts by pointing out several unicast mechanisms that are poorly suited to multicast. One is the ACK used for reliable delivery: a bad idea for multicast because of ACK implosion, and because the sender doesn't necessarily know all the receivers and participants involved. Another is per-channel communication state such as sequence numbers, which doesn't work well because participants can leave and join at any time. So multicast is better suited to application data units. The paper implements its multicast framework in a whiteboard program that lets users hold online conferences and see the same whiteboard or presentation.

From what I read, the protocol relies heavily on random backoff timers to ensure that only one repair request (and one retransmission) is actually sent, with all other receivers suppressing their requests for the same data. Data is also synced by page (for the whiteboard program); the paper mentions that you can request old pages, but doesn't give specifics. The protocol/framework was designed to work over different network topologies, and the paper compares a few of them, but it still seems to make some assumptions about the underlying topology. The next paper is more focused on maintaining an overlay topology for efficient delivery of multicast data.
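A toy sketch of the timer-based suppression idea, under a very simplified model where each receiver schedules its repair request after a random delay and cancels it if it overhears an equivalent request first (the timer range and classes are made up):

```python
import random

class Receiver:
    def __init__(self, name):
        self.name = name
        self.pending_request = None   # (fire_time, data_id)

    def detect_loss(self, data_id, now):
        # Schedule a request after a random backoff instead of sending at once.
        delay = random.uniform(0.0, 1.0)
        self.pending_request = (now + delay, data_id)

    def overhear_request(self, data_id):
        # Someone else already asked for the same data: suppress our request.
        if self.pending_request and self.pending_request[1] == data_id:
            self.pending_request = None

def simulate(receivers, data_id):
    for r in receivers:
        r.detect_loss(data_id, now=0.0)
    # The receiver whose timer fires first multicasts the request;
    # everyone else overhears it and suppresses their own copy.
    first = min(receivers, key=lambda r: r.pending_request[0])
    for r in receivers:
        if r is not first:
            r.overhear_request(data_id)
    return first.name

print(simulate([Receiver("a"), Receiver("b"), Receiver("c")], data_id=42))
```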

Monday, November 17, 2008

An Architecture for Internet Data Transfer

When I started reading this paper, having read papers on network architecture all semester, I eventually realized that this is not a paper on the architecture of the internet but more of an overlay service, where functionality is implemented on top of the existing network for convenience. Several of the authors' points are very appealing, including the fact that different applications that send or receive the same data could share a cache because they use the same transfer service, saving time when transmitting or fetching data.

The idea of separating data transfer from the negotiation of data is certainly not new. FTP is the perfect example of a protocol that uses one connection for negotiation and another for data transfer. The difference here is that this provides a general interface for all applications to use, rather than just FTP. The implementation is in C++, hooks into network transfer libraries, and creates callbacks when transferring data. The interface approach is interesting, but I'm left wondering how applicable this is to most applications on the internet, since there are already a lot of existing services used for data transfer.

A Delay Tolerant Network Architecture for Challenged Internets

This paper addresses links with extremely long delays, unreliable connectivity, and otherwise poor operating characteristics; the paper calls these challenged networks. It argues that the current internet makes certain assumptions about the underlying links and their basic reliability and performance. I think this is only true in some cases; the basic internet architecture itself doesn't really make that assumption, which is why this research is even possible. The main idea, as I understood it, is to insert nodes (DTN gateways) into the network to split it into regions, each with its own defined network characteristics. The DTN nodes are then responsible for retransmission and reliable transfer. The rest of the paper describes the routing and naming schemes of this overlay network. It reminds me of the split-TCP mechanism, where a midpoint in the middle of the TCP connection takes on a similar responsibility.

The idea presented isn't completely new, but one of the main points of the original internet architecture was to keep state out of the network. Now, with much wider use of the network, it seems there are real benefits to keeping state in some nodes. With challenged networks, that level of unreliability seems to require some kind of in-network retransmission unit. The counterargument is that this is what the different layers are for: we could implement it at the link layer or use a stronger retransmission mechanism there. Nonetheless, with the internet evolving and more and more networks wanting to attach to it, some state in the network seems unavoidable.

Thursday, November 13, 2008

X-Trace: A Pervasive Network Tracing Framework

This paper takes a different approach to measuring and tracing internet performance than the previous paper. Instead of collecting data and analyzing packets and traces, it proposes a more active method of tracing the network: a framework that inserts tracing metadata into packets to help trace them through the network. The framework lets us trace across layers and protocols and analyze causal relationships within the network. This concept is really effective, because a lot of network traffic really is caused by some other flow. For example, a DNS query is often triggered by an HTTP request or an email, and a single website visit might lead to queries to ad servers, image servers, and so on. An effective way of understanding network traffic is to take these causal relationships into account.

The framework introduces metadata that is carried in packets, along with two propagation primitives, pushDown and pushNext, to propagate the metadata through the network. pushDown copies the metadata from one layer to the layer below, and pushNext carries it to the next hop. Based on the collected metadata, you can then reconstruct the task tree and analyze it. The paper also gives several use cases and applications.
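A simplified sketch of how the two primitives could propagate metadata, assuming a toy record of (task id, layer, operation id, parent); this is only to illustrate the pushDown/pushNext causality links, not the real X-Trace metadata format.

```python
import itertools

_op_counter = itertools.count()

def new_metadata(task_id, layer):
    return {"task_id": task_id, "layer": layer,
            "op_id": next(_op_counter), "parent": None}

def push_next(md):
    """Carry metadata to the next hop at the same layer: same task and layer,
    new operation id, current operation recorded as the parent."""
    return {"task_id": md["task_id"], "layer": md["layer"],
            "op_id": next(_op_counter), "parent": md["op_id"]}

def push_down(md):
    """Copy metadata into the layer below, again recording the causing
    operation as the parent so the task tree can be rebuilt."""
    return {"task_id": md["task_id"], "layer": md["layer"] - 1,
            "op_id": next(_op_counter), "parent": md["op_id"]}

# A request that triggers work at the layer below and also travels one hop;
# all three records share a task id, so the task tree can be reconstructed
# offline from the (op_id, parent) edges.
http_md  = new_metadata(task_id="req-1", layer=7)
lower_md = push_down(http_md)
hop_md   = push_next(http_md)
print(http_md, lower_md, hop_md)
```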

This method of course involves changing the network and introducing extra traffic in order to trace traffic, compared to the other approach of just being a bystander watching packets go by. Setting aside the security implications of the trace, I wonder whether the trace data and mechanisms could themselves affect the network traffic, skewing the measurements or making the analysis inaccurate.

End to End Internet Packet Dynamics

This paper is a report based on measurements of the internet taken between 1994 and 1995. It tries to evaluate the performance of the internet and look for anomalies or odd dynamic behavior, specifically around TCP. The paper evaluates several things, including out-of-order delivery, packet corruption, bottleneck bandwidth, packet duplication, and packet delay. It takes traces of real internet traffic and analyzes each effect.

A couple of things in the paper surprised me. I didn't even know packet duplication existed in the internet, so reading about it was quite interesting; the part where they received 9 copies of the same packet was kind of funny. The paper also studied out-of-order delivery and observed that it isn't a rare event, though out-of-order delivery doesn't necessarily hurt performance.

Overall I think these kinds of studies can tell us a lot about specific internet behavior, such as TCP, in a specific region, but the whole internet is so complex and large that it's hard to factor in everything when working with such huge amounts of data. In particular, without information about other traffic, the observed packet loss and delay could be caused by many different factors. The next paper introduces another way of gathering data and analyzing the network, one that takes a more active role than purely analyzing packets.

Wednesday, November 5, 2008

Middleboxes no longer considered harmful

I had a hard time following this paper, since I didn't truly understand its main motivation. From what I read, instead of trying to prevent NAT boxes and other middleboxes from existing in the network, they set up a framework that allows them to coexist. One of the main techniques is to give every internet host a globally unique identifier. However, beyond the fact that they break the network layer's end-to-end reachability, I don't really see how firewalls and NAT boxes are harmful; in fact, since I use a router at home, they seem more beneficial than harmful to me, because they shield a user from being directly exposed to the network and can filter out harmful traffic. As a result, I didn't really get the point behind this paper.

Internet Indirection Infrastructure

This paper is quite interesting: it introduces an overlay on top of the current network architecture to create an abstraction that makes it easier to implement services such as multicast or mobility. The overlay uses rendezvous-based communication, which decouples senders from receivers so that a packet a sender sends can be received by different receivers, or processed and routed differently depending on what the receivers request. This is done with identifiers: receivers register triggers on an id, senders send to that id, and the infrastructure matches them up to deliver the data. The overlay itself is implemented on top of Chord lookup, like the previous paper.
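Here's a toy sketch of the rendezvous idea: receivers insert triggers (id, address), senders just send (id, data), and the infrastructure matches on the id. The class and method names are my own simplification of i3, and a second trigger on the same id naturally gives multicast.

```python
from collections import defaultdict

class RendezvousNode:
    """Toy i3-style rendezvous point: matches packet ids to triggers."""
    def __init__(self):
        self.triggers = defaultdict(list)   # id -> list of receiver addresses

    def insert_trigger(self, ident, receiver_addr):
        self.triggers[ident].append(receiver_addr)

    def send(self, ident, data):
        # The sender never names the receivers; the id does the matching.
        # Multiple triggers on one id naturally give multicast delivery.
        return [(addr, data) for addr in self.triggers.get(ident, [])]

node = RendezvousNode()
node.insert_trigger("video-stream-7", "10.0.0.5")
node.insert_trigger("video-stream-7", "10.0.0.9")   # second receiver joins
print(node.send("video-stream-7", b"frame-data"))
```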

I think the idea is definitely interesting, because it provides an abstraction on top of the existing network on which several different services can easily be implemented. However, despite the security section in the paper, using the id as the only authorization for receiving messages still seems quite susceptible to eavesdropping, though it's true the current internet architecture is no better. One question I had is how the Chord lookup affects normal packet routing, since it seems like extra work to go through a lookup just to send a packet. But overall I liked the paper, and it seems like a good idea. Since this paper is from 2002, are there any active deployments of it?

Monday, November 3, 2008

DNS Performance and the Effectiveness of Caching

This paper has a lot of data in it; most of it is about how the data was collected and what it shows. The traces were collected at links connecting MIT and KAIST in Korea to the rest of the internet. One interesting point is that the traces don't include queries answered from local caches, so even though they report that 23% of DNS lookups went unanswered, that figure excludes cached entries, which might account for a large share of the queries.

After the analysis, the paper looks at the effects of caching and of adjusting TTLs. Intuitively it seems like the longer the TTL the better, since there will be fewer requests and more of them will be caught by caches. The paper concludes that decreasing the TTL didn't affect performance much, but since the collected traces didn't include cached requests, it seems odd that they can make this claim. Caching also helps load balancing, because requests don't all have to keep going to the same servers over the wide area network.

Development of the Domain Name System

This paper is a good introduction to the domain name system: its purpose, and what the designers had in mind when they were developing it. On the DARPA internet, centralized management of anything seemed like a bad idea, since the goal was fault tolerance. Thus DNS was designed without central management, as a distributed system for mapping names to hosts. It works by establishing resource records; when a host needs to do a lookup, it can check locally first, and the request can then move up toward the root servers. The namespace is also split into zones to spread the distribution out even more.
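As a small illustration of the lookup path, here is a toy resolver that checks a local cache first and then walks down the zone hierarchy following referrals; the zone data is made up and a real resolver is of course far more involved.

```python
# Toy iterative resolution: answer from the local cache if possible, then
# walk down from the root following referrals zone by zone.
ZONES = {
    ".":             {"edu.": "ns.edu-servers.example"},
    "edu.":          {"berkeley.edu.": "ns.berkeley.example"},
    "berkeley.edu.": {"www.berkeley.edu.": "128.32.0.1"},
}

def resolve(name, cache):
    if name in cache:
        return cache[name]                      # answered locally
    zone = "."
    while True:
        records = ZONES[zone]
        # Most specific entry in this zone that covers the name.
        match = max((k for k in records if name.endswith(k)), key=len)
        answer = records[match]
        if match == name:                       # authoritative answer
            cache[name] = answer
            return answer
        zone = match                            # referral: descend a zone

cache = {}
print(resolve("www.berkeley.edu.", cache))
print(cache)   # subsequent lookups are served from the cache
```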

The paper mentions the implementation status at the time, and how applications had to be recompiled to convert from the old HOSTS.TXT to DNS. Now that DNS is the standard, and maybe the only practical way to do name lookup, I wonder whether all applications converted successfully, or whether HOSTS.TXT-style files still exist and are used by a small portion of users on the internet.

Thursday, October 30, 2008

Chord: A scalable Peer to Peer lookup service for Internet Applications

This paper goes into more depth on the Chord lookup method specifically, which the previous article characterized as skiplist-like routing. The paper lists Chord's goals: load balancing, decentralized control, scalability, availability, and flexible naming. Consistent hashing is what provides the distribution and load balancing, since it gives any joining node a decentralized way of locating every key in the network. Chord offers a simple way of searching for data in a distributed P2P system, and it also provides mechanisms for remaining correct even when a node's routing information is only partially correct.
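A small sketch of the consistent-hashing idea underneath Chord: nodes and keys are hashed onto the same circular id space, and a key lives at the first node whose id follows it. This linear-scan version ignores the finger tables that give Chord its O(log N) lookups, so treat it purely as an illustration.

```python
import hashlib
from bisect import bisect_right

M = 2 ** 16   # small identifier space, just for illustration

def chord_id(name):
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % M

class Ring:
    def __init__(self, node_names):
        # Sorted node ids form the ring; each key maps to its successor node.
        self.nodes = sorted((chord_id(n), n) for n in node_names)

    def successor(self, key):
        ids = [i for i, _ in self.nodes]
        idx = bisect_right(ids, chord_id(key)) % len(self.nodes)
        return self.nodes[idx][1]

ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.successor("some-file.mp3"))   # node responsible for this key
# Adding or removing one node only moves the keys adjacent to it on the
# ring, which is what gives the load balancing and graceful churn behavior.
```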

I don't know much about P2P applications or their underlying machinery, but from the previous paper and this one it seems there are a lot of challenges to solve. The distributed nature of P2P applications is a very compelling motivation, being able to spread data quickly from distributed sources. Chord addresses the lookup function of P2P applications, but I imagine other issues could also be looked into, including how to download data efficiently and algorithms for distributing it.

Looking up Data in P2P Systems

This article gives a brief introduction to some of the difficulties of P2P computing. It first lists the attractive properties of P2P systems, including fault tolerance thanks to their decentralized nature; the challenge, of course, is how to use the distributed system efficiently. The article focuses on the lookup problem of finding data in a huge distributed system. A broad range of techniques are briefly explained and compared, all based on distributed hash tables: Chord's skiplist-like routing, tree-like routing, and multi-dimensional routing.

It's hard to determine which routing method is best, as the article's summary lists several open questions about fault tolerance, routing, and security against malicious nodes. The article does acknowledge that most people directly associate P2P applications with illegal file sharing, such as Napster or KaZaA. It mentions that there are other applications, but I'm not sure I can think of any off the top of my head...

Tuesday, October 28, 2008

Resilient Overlay Networks

This paper presents an overlay network designed to detect and route around faults in the network, including outages of links and nodes. Overlay networks add an application-layer network on top of the current one, which allows control over forwarding and routing between the specific nodes that participate in the overlay. The authors use this idea to build a fault-detecting and fault-correcting network called Resilient Overlay Networks (RON).

Each RON node detects problems by aggressively probing and monitoring the paths to the other nodes, and RON nodes forward and route through one another depending on path conditions. RON has three design goals. First, fast failure detection and recovery, which means the links must be constantly monitored. Second, tighter integration with the application, allowing applications themselves to specify what level of fault tolerance they need. Third, more expressive policy routing, which the application layer makes possible by controlling routing between RON nodes. Several interesting research questions arise, including what performance metric to use when selecting routes and paths, and how to categorize a route as "good."
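A sketch of the core per-packet decision a RON node makes, assuming probe results are already summarized as pairwise latencies (the data layout and the latency-only metric are my simplifications; RON also considers loss and throughput):

```python
def best_path(src, dst, nodes, latency):
    """latency[(a, b)] is the currently probed latency between RON nodes a, b
    (float('inf') means the probes say the path is down).  Return either
    ('direct',) or ('via', intermediate) for the lowest-latency choice."""
    best = ("direct",)
    best_cost = latency.get((src, dst), float("inf"))
    for mid in nodes:
        if mid in (src, dst):
            continue
        cost = latency.get((src, mid), float("inf")) + \
               latency.get((mid, dst), float("inf"))
        if cost < best_cost:
            best, best_cost = ("via", mid), cost
    return best, best_cost

latency = {("A", "B"): float("inf"),      # direct path has failed
           ("A", "C"): 30.0, ("C", "B"): 25.0}
print(best_path("A", "B", ["A", "B", "C"], latency))   # detour via C
```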

The evaluation results came out pretty well: it took on average 18 seconds to detect and recover from a fault, which is much faster than BGP-4. This paper presents a very interesting overlay network, and it even characterizes the latencies of each path. It seems like this would be a great platform for testing out different network services and protocols.

Monday, October 27, 2008

Active Network Vision and Reality: Lessons from a capsule based system

This paper, written by David Wetherall, is a reflection on the experience of building the ANTS active network toolkit. The idea of active networking was first described in a 1996 paper, also co-authored by Wetherall. Active networking allows customized programs to be executed inside the network, which makes it easy to create new internet services, including custom forwarding and routing at each node. But active networking has also been controversial because of performance and security concerns. In this paper the author reflects on what was learned from actually implementing an active network.

He first explains the essentials of an active network and how it works. Basically, there are active nodes in the network that act as programmable routers. Users are given an interface to program "capsules" that are passed around the network; as a capsule travels through active nodes, the services the user specified are executed. One advantage of this approach is that it can be deployed gradually, since not all nodes need to be active nodes: only the nodes in the active set process the capsules.

The implementation also imposes some inherent requirements. The author lists four: services must be expressible, compact, fast, and incrementally deployable. Expressible means the service a user wishes to create must be expressible through the API and interface provided. Compact means capsule code is limited (to 16KB) so the network isn't flooded with service code. Fast means the routine must run to completion at each node. And incrementally deployable means the service must not fail when some nodes aren't active. For now, performance and the constraints above still seem to limit the usefulness of active networking, but with more research and time it is a good platform for experimenting with new services inside the network.

Thursday, October 16, 2008

Directed Diffusion: A Scalable and Robust Communication Paradigm for Sensor Networks

This paper also describes coordinated communication among sensor nodes in a low-power sensor network. The idea is similar to the TAG paper in that no centralized server should receive all the raw data once it makes a request, because that consumes too much power. They instead call their approach data-centric: a requester asks for data by named attributes, the request is propagated through the network with nodes forwarding it to their neighbors, and the resulting data is cached at intermediate nodes on the way back. I found the terminology in this paper very confusing, which made it hard to read. Nodes send "interests," which are basically requests, through the network, and "gradients" are set up toward the requester; as far as I could tell, the gradient state also tracks the freshness of locally cached data and determines when to evict it.

The paper's evaluation didn't take congestion into account. Another question I had along those lines was about the caches: if a lot of requests pass through certain nodes, their caches will fill up quickly and they will have to evict data that is still being requested, especially since data is cached along the path from source to sink and some nodes sit on many more paths than others. But overall the idea is similar to TAG: keep communication localized and let nodes process data before propagating it through the network, which saves messages and in turn saves power.

Wednesday, October 15, 2008

TAG: a Tiny AGgregation Service for Ad-Hoc Sensor Networks

This paper targets sensor networks made up of low-power devices in a distributed wireless environment. The sensor nodes collectively gather data from the environment and pass it around through the wireless network. The paper proposes a language and a communication mechanism for these networks of motes. Because of their low-power nature, communication is best done with multiple short hops instead of one long hop, which uses more power. Also, instead of having every node send its data to a centralized server, local interactions are used first to aggregate the gathered data before passing it along. A routing tree is used so data flows from the leaves up to the root, but each subtree root combines the data from its children, so less data needs to be passed up, which saves power. The proposed language is a declarative, SQL-like query language that lets users query sensor data and filter and group it. Several aggregates can be specified in the language, all computed locally and then passed up the tree to the root.
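A tiny sketch of the in-network aggregation idea, assuming AVERAGE is carried up the tree as a (sum, count) partial state record so each parent forwards one small record instead of all of its children's readings (the classes here are made up, not TinyOS/TAG code):

```python
# Toy TAG-style aggregation tree.  Each mote holds a sensor reading and a
# list of children; AVERAGE is propagated as a (sum, count) partial record
# so a parent merges its children instead of forwarding raw readings.

class Mote:
    def __init__(self, reading, children=()):
        self.reading = reading
        self.children = list(children)

    def partial_average(self):
        s, c = float(self.reading), 1
        for child in self.children:
            cs, cc = child.partial_average()   # one small record per child
            s, c = s + cs, c + cc
        return s, c                            # merged partial state

leaf1, leaf2 = Mote(20.5), Mote(22.0)
subroot = Mote(21.0, [leaf1, leaf2])
root = Mote(19.5, [subroot])

total, count = root.partial_average()
print(total / count)    # network-wide average computed in-network
```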

I enjoyed reading this paper a lot; it was very clear in expressing its ideas, and the communication scheme is clearly power efficient because aggregation means fewer messages are sent. Several optimizations are also mentioned, including mechanisms to maintain the topology and recover from errors and node losses, and adding a cache for data values also proves to be a valuable optimization. Overall, for sensor networks that compute a collective result and need low power, the proposed methods seem promising.

Thursday, October 9, 2008

ExOR: Opportunistic Multi-Hop routing for wireless Networks

ExOR proposes an integrated routing and MAC protocol. The basic mechanism is to use the inherent broadcast property of wireless links to broadcast a packet to multiple receivers; once we know which receivers actually got the packet, the one closest to the destination is chosen to forward it. This effectively defers the forwarding decision until after the packet has been transmitted. The paper mentions four key challenges. First, only a subset of nodes should receive the packet: since the nodes must communicate to agree on the best forwarder, the more nodes involved, the more coordination overhead, so the broadcast set needs to be small yet effective. Second, the metric used to decide which node forwards the packet needs to be well designed, and we need to make sure that one and only one node forwards it. Third, similar to the first point, only a subset of useful nodes should take part in forwarding. Finally, ExOR must avoid simultaneous transmissions to minimize collisions. The results looked decent, giving on average a 2 to 4 times gain in throughput.

Both of the papers we read seem to rely on the fact that all nodes can overhear the packets being transmitted. Conceptually it's almost like a bus, where only one sender can transmit at a time and every receiver can see whether the packet is destined for it, so broadcast seems to be one of the essential tools for improving throughput now. With ExOR, one major concern is compatibility with TCP. Their results were not from layering TCP directly over ExOR; they used split proxies, with only the connection between the two proxy points running over ExOR, so deployment seems to need more work on TCP compatibility.

Wednesday, October 8, 2008

XORs in the Air: Practical Wireless Network Coding

This paper puts a lot of theory into practice. The main idea is to use packet coding to improve throughput: by XORing different packets together and broadcasting the result, a receiver can XOR the received packet with packets it already has to recover the data it needs. This way multiple packets are effectively transmitted at once, saving transmissions. The approach leans on the natural broadcast of wireless networks: receivers overhear each other's packets all the time, which this method leverages.
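The basic coding trick is easy to show: if the relay knows each next hop already holds the other packet, it can broadcast one XORed packet and both receivers decode their own. A toy sketch (deciding when it is safe to code is the hard part the paper spends its time on):

```python
def xor_bytes(a, b):
    """XOR two equal-length packets together."""
    return bytes(x ^ y for x, y in zip(a, b))

# Alice -> router -> Bob and Bob -> router -> Alice at the same time.
pkt_from_alice = b"hello bob!"
pkt_from_bob   = b"hi alice!!"

# The router broadcasts a single coded packet instead of two native ones.
coded = xor_bytes(pkt_from_alice, pkt_from_bob)

# Each receiver XORs with the packet it already has (its own transmission)
# to recover the packet destined to it.
print(xor_bytes(coded, pkt_from_bob))     # b'hello bob!'  (Bob decodes)
print(xor_bytes(coded, pkt_from_alice))   # b'hi alice!!'  (Alice decodes)
```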

Several issues arise, such as knowing which packets to XOR together to get the maximum gain. While reading the paper it was still not clear to me how they decide what to XOR a packet with; they use simple examples of two nodes transmitting, but the method has to assume the router knows (or can guess) which packets each receiving node already holds, or else how would the router usefully XOR two packets together? The results are more promising for UDP: without congestion control the network gets congested more easily, queues build up, and more packets can be XORed together. This improves performance compared to plain congested UDP, but does that mean it can actually prevent network congestion?

All in all, the implementation seems rather complicated for a modest gain. Many mechanisms need to be added to the nodes, extra packet headers need to be appended to data packets in flight, and the receiver now has to do more work after reception; there is also a chance the router guessed wrong about what the receiver already has, in which case the receiver can't decode the native packet at all. For this protocol to really become mainstream, it seems the throughput gain would need to be through the roof to be convincing, but the results weren't spectacular, especially in the TCP cases, where congestion control backs off anyway and limits the gain.

Tuesday, October 7, 2008

A performance comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols

This paper compares several ad hoc routing protocols: Destination-Sequenced Distance Vector (DSDV), the Temporally-Ordered Routing Algorithm (TORA), Dynamic Source Routing (DSR), and Ad Hoc On-Demand Distance Vector (AODV). The paper gives a brief summary of each protocol and its implementation, which provides good background. The simulation also takes lower-layer characteristics into account, which the authors claim had not previously been done.

The protocols are tested by having the nodes move around. The movement model causes each node to remain stationary for a pause time, then pick a destination within a 1500m x 300m space and move to it at a speed uniformly distributed between 0 and a maximum speed, then pause again (for another random time) and repeat. Different packet loads and numbers of sources were used to observe the effects of each variable. The metrics used were packet delivery ratio, routing overhead, and path optimality.

The results showed that DSDV delivers essentially all packets when mobility and movement speed are low but does not perform well as mobility increases. TORA did a little better than DSDV but introduced high routing overhead. DSR performed well at all mobility rates, though the source must obtain a complete route before sending, and carrying source routes in packets adds overhead at the source. AODV performs about as well as DSR at all mobility rates but incurs more routing overhead. The paper doesn't really consider the effects of congestion on the protocols, though. In fact, it seems that most wireless protocols focus on link loss and link quality at the physical layer, while wired protocols focus on congestion control. That makes sense, but I think wireless protocols still need to consider congestion, and there probably needs to be a mechanism for separating congestion loss from quality loss.

A high throughput path metric for multi hop wireless routing

This paper describes a new metric for routing protocols to use when choosing paths in a wireless environment. The previously used metric is minimum hop count, which favors the path with the fewest hops. This has several disadvantages: with fewer hops, each individual hop tends to cover a longer physical distance, and the longer the link, the more the signal is attenuated, which degrades link quality and increases the loss ratio.

The new metric introduced in this paper is the expected transmission count (ETX). It accounts for retransmissions and estimates the number of transmissions actually needed to deliver a packet, which lets the metric find high-throughput paths even in the presence of losses. The ETX of a link is computed from two delivery ratios, the forward ratio (the probability that a data packet arrives at the recipient) and the reverse ratio (the probability that the ACK makes it back); combining the two gives the expected transmission count for the link, ETX = 1 / (df x dr). The two ratios are measured with dedicated link probe packets that each node broadcasts at a fixed rate: by counting how many of a neighbor's probes arrived over a window, a node can calculate the loss ratio of the link.
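A small sketch of the calculation as described above, with the probe counting reduced to a simple ratio; the numbers are illustrative only.

```python
def delivery_ratio(probes_received, probes_expected):
    """Fraction of a neighbor's periodic broadcast probes we actually heard
    during the measurement window."""
    return probes_received / probes_expected if probes_expected else 0.0

def etx(df, dr):
    """Expected transmissions for one link: the data packet must get across
    (probability df) and its link-layer ACK must come back (probability dr)."""
    if df == 0 or dr == 0:
        return float("inf")
    return 1.0 / (df * dr)

def route_etx(links):
    """A route's metric is the sum of its links' ETX values."""
    return sum(etx(df, dr) for df, dr in links)

# Delivery ratios estimated by counting received probes over a window.
df = delivery_ratio(9, 10)    # 90% of the neighbor's probes were heard
dr = delivery_ratio(9, 10)
print(etx(df, dr))            # ~1.23 transmissions for one decent link

# A two-hop route over decent links beats one long, very lossy hop.
print(route_etx([(0.9, 0.9), (0.9, 0.9)]))   # ~2.47
print(route_etx([(0.5, 0.5)]))               # 4.0
```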

This metric takes link quality into account along with hop count, which is essential for wireless links where link-layer losses are common and the minimum-hop-count path may not be ideal. A few questions came to mind while reading. The design hinges on the probe packets being sent at a fixed rate to measure link quality, but there seem to be ways the sending of a probe could be delayed, for example while a large data packet is being transmitted. That would skew the measured link quality and could lead to false information.

Thursday, October 2, 2008

Modeling Wireless Links for Transport Protocols

This paper lists the intrinsic characteristics of wireless links that matter to transport protocols and shows how to model them in different environments. It specifically targets the design of transport-layer protocols, so the proposed methods are meant for high-level simulation. It splits wireless links into three main categories: cellular, WLAN, and satellite. Previous designs for TCP and other transport protocols assumed that dropped packets are all due to congestion, because the underlying links were wired and link-layer corruption was rare. For wireless links, several other factors come into play, including interference, data losses, delay variation, packet reordering, and bandwidth variation.

Reading this paper did help reinforce the different things that affect a wireless network, and it covers how those effects differ across the specific link types (cellular, WLAN, and satellite), but the modeling techniques themselves seem fairly intuitive. There are also several things pure modeling can't capture, such as network topology. Still, it provides a basic high-level simulation method for designing higher-level protocols, which is exactly what it set out to do.

Architecture and Evaluation of an Unplanned 802.11b Mesh Network

This paper evaluates the architecture of a wireless mesh network, mainly comparing a network with unplanned node placement and multi-hop routing against a planned single-hop network. The network studied is the Roofnet 802.11b mesh. The paper first explains Roofnet's architecture and routing protocol, which was deployed around MIT; the nodes were distributed in an unplanned but not completely random fashion. The routing protocol is Srcr, which tries to find the highest-throughput route between any pair of Roofnet nodes.

The evaluation measures several parameters to observe their effects on network throughput. From the results we can see that fewer hops mean higher throughput. They also look at the effects of link quality and distance: as distance increases, throughput decreases, which makes complete sense since signal strength and link quality degrade with distance. As density increases, throughput also increases; this is a combined effect of lower hop counts and a wider choice of high-quality links, even though interference also grows.

The paper also compares multi-hop against single-hop operation and shows that multi-hop performs better, though in the comparison the gateways are fixed and every node is forced to connect to a gateway in a single hop. One could imagine a combination of planned gateway placement and unplanned multi-hop access to get the best of both.

Tuesday, September 30, 2008

A comparison of Mechanisms for Improving TCP performance over wireless Links

This paper compares different techniques for improving TCP performance over wireless links. The three main categories evaluated are end-to-end protocols, link-layer protocols, and split-connection protocols. The results show that link-layer protocols are the most beneficial, especially when they are TCP-aware.

Thinking back on the various TCP papers we read, the major issues at the time TCP was designed were mostly contention and congestion. Several switch design papers proposed mechanisms to deal with congestion, and combined with TCP's retransmission mechanism that kept things working. Now, however, we're running TCP over very lossy links; if the reliability mechanism lives only at the end points, we may have to retransmit end to end over and over, because the probability of a packet being lost somewhere in the middle is high. Intuitively, then, it's more beneficial to also implement recovery at the link layer, because of the lossy nature of each wireless link (back to the end-to-end argument). The comparison of mechanisms for TCP over wireless links confirms that intuition.

MACAW: a media access protocol for wireless LAN

This paper proposes a different MAC protocol for the wireless medium, built around a request-to-send / clear-to-send / DATA exchange and exponential backoff. Several insights in the paper led to this protocol:
1. The relevant congestion is at the receiver, not the sender. The paper gives a simple example with three nodes (A, B, C) where A and C cannot hear each other: both A and C can think the channel is clear and send, but contention happens at B, which is receiving.
2. Congestion is not a homogeneous phenomenon. The authors argue that a single backoff parameter is not valid; there need to be separate backoff parameters for each stream.
3. Learning about congestion should be a collective enterprise. This felt related to the end-to-end argument, in that you have to evaluate carefully where functionality gets implemented: the authors argue that individual congestion detection leads to asymmetric views of the network, so nodes should collectively build a shared view.
4. The protocol should propagate synchronization information about contention periods so that all devices can contend equally.

While reading this paper, one thought related to our project came to mind: guaranteed service for wireless LANs. I've read several papers on real-time Ethernet and real-time LAN protocols that use token-based mechanisms. How well would those work in wireless environments, and how much synchronization machinery would need to be added to guarantee service over a physically unreliable channel?

Tuesday, September 23, 2008

Scaling Internet Routers using Optics

Reading the other paper puts this one into perspective. It continues the theme of switch and router design and argues that an electronic crossbar alone is not enough; optics should be used to scale capacity and reduce power. The paper goes into a good amount of detail about the design, including multi-rack routers, and it lists the challenges and possible solutions involved in actually building this optics-based architecture.

As mentioned in the previous post, I don't have a lot of background in this subject matter, so it was a hard paper for me to read, but it does go into a fair amount of design detail. One thing I started thinking about is how to characterize a router's delay: since our research project involves estimating network delay, it seems that as routers scale up and their internal structure gets more complicated, estimating that delay will only get harder.

It's also interesting that while other electronics keep getting smaller, routers seem to keep getting bigger. This paper suggests multiple racks instead of a single one because of power consumption and heat dissipation.

A fast switched backplane for a gigabit switched router

This paper motivates using switched backplanes in router designs. I don't have much background in this specific area of router design, but the backplane they describe is basically the interconnect inside the router that connects the inputs to the outputs. Before this work, that interconnect was typically a shared bus. Buses have several disadvantages, including limited scalability and contention, which the paper also mentions. Switching to a crossbar has the obvious advantage that multiple packets and requests can be forwarded to the outputs at the same time, increasing the bandwidth of the backplane. The paper then walks through the different design aspects of a switched backplane.

Since I don't have much background in router design, it's hard for me to comment on this paper, but ever since I started studying routers, what I learned was that the interconnect between inputs and outputs is a crossbar, so I was quite surprised to hear that it used to be a single bus connecting everything. Multicast does seem easier on a single bus, since all the outputs can see the broadcast at the same time, but as we move toward higher-speed networks, routers need to be able to keep up.

Thursday, September 18, 2008

Supporting Real-Time Applications in an Integrated Services Packet Network: Architecture and Mechanism

This paper is also about real-time guarantees on the internet. It explains the architecture and mechanisms needed to support real-time services. However, the authors' view of real-time applications seems somewhat narrow, covering mostly multimedia streams and apps that can be characterized as having a "playback point." They do distinguish between guaranteed service and predicted service, which is probably their way of saying hard and soft real-time.

Both papers recognize that a network needs to support both datagram traffic and real-time traffic. There are pre-existing virtual circuit networks (ATM) that guarantee delivery, and there are obviously packet-switched networks that make no guarantees. But as the previous paper (Fundamental Design Issues for the Future Internet) argues, running two separate networks does not maximize efficacy; a link that can satisfy different service models is the one that uses its bandwidth most efficiently. So this paper is about the different scheduling algorithms required and how to unify them.

It seems like the effort toward a real-time internet boils down to a few general points:
- Need to reserve bandwidth for real-time connections
- Need admission control, because real-time guarantees can't be affected by congestion
- Need some way to prevent abuse, namely pricing
- Need to stay compatible with the current internet, so leftover bandwidth is used for non-real-time packets

But neither really talks about security; they just use pricing as a mechanism to prevent abuse. Intuitively, if there is a real-time delivery guarantee, there can't be a denial of service against it, because that bandwidth is isolated from the rest of the traffic. But could there be a denial of service against admission control itself, where real-time requests are refused because admitting them would disrupt the existing schedule? This seems like an interesting research topic and a good choice for the semester project.

Fundamental design issues for the future internet

This was a very interesting paper to read. Now that the internet is so well established and the number of users grows by the day, it's hard to imagine any change in its fundamental design. Most of us are so used to the way things are, the TCP protocols and the IP and Ethernet designs, that we're often blind to the so-called "issues" of the internet. It was also interesting for me because our class project is going to be something related to real-time or synchronous Ethernet, so this was definitely a good intro paper to read.

The paper surveys design issues for the internet, defining a quantity V, the efficacy of the architecture, to use in evaluating different designs. It evaluates the efficacy of offering separate services on the internet to address different needs. The current architecture obviously gives no guarantees of service, delivery, ordering, or delay when a packet is sent. In the future, though, with bandwidth and speeds improving, more and more applications will need better service, so the paper evaluates the design issues of incorporating different service models into one network.

One of the interesting discussions in this paper is pricing, which I also like to think of as a security issue. If there are high-priority packets that receive better bandwidth, why doesn't everyone just use (or abuse) them? Pricing could be an effective way of keeping ordinary users off the high-priority bandwidth, combined with some sort of admission control to limit and guarantee the services of this more reliable class. However, more and more security issues then come into play: if a hacker compromises the admission control system, he could flood the network with high-priority packets and keep any other real-time service from getting access to the link. This seems like a good topic to do some research on.

Monday, September 15, 2008

Congestion Control for High Bandwidth-Delay Product Networks

This paper presents a new transport-layer protocol called the Explicit Control Protocol (XCP). XCP addresses the problem that TCP underutilizes bandwidth by controlling congestion conservatively through slow start, with fairness enforced indirectly at the gateways. The proposal is to use control theory to model the protocol and to decouple the fairness and efficiency control mechanisms for more flexibility and control. Unlike the previous paper we read, where routers drop packets to control congestion, XCP attaches a congestion header to every data packet. Routers along the path fill in feedback in that header, and the bottleneck router ends up determining the feedback that survives; the receiver echoes it back, and the sender uses it to adjust its congestion window.

The XCP router splits efficiency and fairness into two separate controllers, whereas TCP uses packet drops to handle both. This lets XCP quickly reclaim unused bandwidth when recovering from a congestion slowdown: because the efficiency controller is decoupled from fairness, it doesn't have to worry about which flows get the bandwidth, only about keeping total utilization high, while the fairness controller is what apportions bandwidth among the flows.
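A very rough sketch of the header flow, heavily simplified: the sender states its window and RTT, each router computes its own feedback and the most restrictive value survives, and the receiver echoes it back so the sender applies it directly. The actual controller math inside an XCP router is much more involved; the field names here are placeholders.

```python
class CongestionHeader:
    def __init__(self, cwnd, rtt, feedback):
        self.cwnd = cwnd          # sender's current window (bytes)
        self.rtt = rtt            # sender's RTT estimate (seconds)
        self.feedback = feedback  # desired window change, set by the sender,
                                  # possibly reduced by routers on the path

def router_update(header, router_feedback):
    """Each router computes its own allowed feedback; the most congested
    (most restrictive) router's value survives in the header."""
    header.feedback = min(header.feedback, router_feedback)

def sender_adjust(cwnd, echoed_feedback):
    """The receiver echoes the final feedback; the sender applies it."""
    return max(1, cwnd + echoed_feedback)

hdr = CongestionHeader(cwnd=64_000, rtt=0.1, feedback=+8_000)
router_update(hdr, router_feedback=+4_000)    # lightly loaded router
router_update(hdr, router_feedback=-2_000)    # bottleneck router
print(sender_adjust(64_000, hdr.feedback))    # 62000: window shrinks
```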

The results seemed almost unrealistically good to me. There were almost no dropped packets, and the protocol performs well whether the network is congested or not. Intuitively, because the gateways don't (intentionally) drop packets and the protocol was built specifically to overcome TCP's flaws in recovering from congestion, its performance should indeed be better than TCP's, with the overhead being an extra header on each packet and the new control algorithms. Still, it's hard to believe that alone produces such promising results.

Random Early Detection Gateways for Congestion Avoidance

This paper presents the Random Early Detection (RED) gateway algorithm for congestion avoidance in packet-switched networks. The basic idea is that, to avoid congestion at the gateway, an average queue size should be maintained that keeps throughput steady. To maintain that average, the gateway drops arriving packets randomly with a probability that grows with the average queue size, which reflects the congestion level of the network; if the average queue size passes a maximum threshold, all incoming packets are dropped.

Of course, this technique relies on the end-to-end protocol to guarantee delivery, because the gateway doesn't contact the sender directly to tell it to shrink its sending window. It uses the passive method of simply dropping a packet and waiting for the sender to notice the loss and slow down. The paper gives detailed algorithms for each decision, including the packet drop probability. It also sets minimum and maximum thresholds on the average queue size: below the minimum threshold the network is considered underutilized and no packets are dropped or marked, and above the maximum threshold all incoming packets are dropped.
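A compact sketch of that drop decision: keep a moving average of the queue size, never drop below the minimum threshold, always drop above the maximum, and in between drop with a probability that grows with the average. The constants are illustrative, not the paper's recommended settings.

```python
import random

class REDQueue:
    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002):
        self.min_th, self.max_th = min_th, max_th
        self.max_p = max_p          # drop probability at the max threshold
        self.weight = weight        # EWMA weight for the average queue size
        self.avg = 0.0

    def on_packet_arrival(self, current_queue_len):
        """Return True if this arriving packet should be dropped."""
        # Exponentially weighted moving average of the queue size.
        self.avg = (1 - self.weight) * self.avg + self.weight * current_queue_len
        if self.avg < self.min_th:
            return False                     # underutilized: never drop
        if self.avg >= self.max_th:
            return True                      # overloaded: drop everything
        # In between: drop probability grows linearly with the average.
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < p
```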

It was also interesting how they define fairness in this paper. Instead of defining fairness as giving every node the same amount of bandwidth, fairness here means every flow faces the same drop probability per packet; the gateway doesn't discriminate against particular nodes. This is different from the papers we read before: it certainly doesn't try to guarantee equal bandwidth, it is merely used for congestion control.

I can see that, working together with TCP or any higher-level protocol with reliable delivery, this would control congestion, because the end nodes can detect a dropped packet and shrink their windows. Intuitively, though, randomly dropping packets doesn't seem like the most efficient signal, and it really depends on the protocols above providing reliable delivery. The alternative is to have the gateways notify the senders by actually sending them messages, which would only create more traffic and use up more bandwidth. There is always a tradeoff there.

Thursday, September 11, 2008

Core-Stateless Fair Queueing

This paper addresses some of the issues from the previous paper regarding the scalability and complexity of the fair queueing algorithm. The authors propose an island of routers in which the edge routers maintain per-flow state, while the routers in the middle (the core routers) keep no per-flow state and simply use a FIFO queue with a probabilistic packet-drop algorithm.

The evaluation shows that they achieve a significant degree of fairness. Although it can't match a true fair queueing algorithm, they argue that it's "good enough" while being cost effective and scaling well, and it outperforms several other queueing algorithms in terms of fairness. It reminds me somewhat of landmark routing: instead of enforcing fairness at every node, dedicate some nodes to tracking the flow state while the core nodes save computation and power with a simpler FIFO queue.

It's always interesting to see a proposed algorithm that doesn't perform as well as an earlier one but is "good enough" and cost effective, which is exactly what the authors claim here. The two papers agree that queueing at the gateways is now a big part of flow control, but the proposals focus on different tradeoffs: FQ on fairness of bandwidth allocation, and this paper on scalability and implementation complexity.

Analysis and Simulation of a Fair Queueing Algorithm

It is interesting to me how most of the topics interrelate one way or another. To me, the end-to-end argument applies here as well: the author first describes the importance of flow control, and how it requires not just end-to-end mechanisms but also support in the network, down at the gateway level in the form of packet queueing.

The author does a good job defining fairness and fair scheduling. The goal is to describe a new fair queueing algorithm, understand its performance, and evaluate it through simulation. The paper also briefly mentions the kinds of abuse a poorly designed system invites, such as users spamming packets to clog a receiver's queue or spawning extra processes to grab more bandwidth.

All in all, fair queueing plays a major role in handling network congestion, but this paper doesn't say anything about implementation cost, and from the next paper we can infer that it doesn't scale well. It also isn't tested under a real network load; the paper itself admits the tests were limited, so real-load testing is definitely needed.

Monday, September 8, 2008

Analysis of Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks

This paper was published after the previous congestion control paper, and it evaluates different techniques for congestion control. The conclusion, of course, is that additive increase with multiplicative decrease is the best policy. The candidates evaluated are the combinations of additive or multiplicative increase with additive or multiplicative decrease. The paper proposes several metrics for comparison, such as efficiency, fairness, convergence time, and size of oscillation.

The paper defines efficiency as the closeness of the total load on a resource to its knee, basically maximizing bandwidth use without going over. Fairness is about how the allowed bandwidth is divided among the users of the network, not letting any one user dominate. Distributedness is an interesting requirement, because they want decentralized control: instead of a central unit telling a particular user to back off, they assume binary feedback, where all users get the same one-bit signal, congested or not congested. Finally, convergence is the speed with which the network reaches its equilibrium state.

It's interesting to read about the ways of proving results for this set of metrics. I think a good way to think about it is that with additive increase and additive decrease (or multiplicative for both), the network tries to relieve congestion by having users shed load, but users then add load back as fast as they shed it, so congestion returns very quickly. Multiplicative increase with additive decrease is obviously even worse. Only by decreasing load faster than you increase it can the network back away from the knee and then creep back up slowly, increasing the time it takes to become congested again.
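
A tiny sketch of the update rules being compared, just to make the intuition concrete (the names and constants are mine, not the paper's notation):

    def aimd(load, congested, a=1.0, b=0.5):
        # additive increase, multiplicative decrease: the combination the paper
        # argues converges to the fair and efficient operating point
        return load * b if congested else load + a

    def aiad(load, congested, a=1.0, d=1.0):
        # additive increase, additive decrease: users give back load exactly as
        # fast as they add it, so the gap between users never shrinks
        return max(load - d, 0.0) if congested else load + a

Running two users with different starting loads through aimd repeatedly pulls them toward equal shares; running them through aiad keeps their difference constant, which matches the fairness argument above.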

Congestion Avoidance and Control

This paper presents techniques to avoid congestion in the network, which stemmed from the data throughput between LBL and Berkeley dropping from 32 Kbps to 40 bps. It states that congestion is not caused by the protocol itself but by the implementation of the protocol. I think although this might be true in the TCP case, not all protocols are designed to avoid congestion, and a poorly designed protocol would still run into the problem even with a good implementation; thus the congestion avoidance mechanism must also be built into the protocol. The author also generalizes all congestion avoidance algorithms as obeying a conservation-of-packets principle.

It then lists several techniques (mostly simple but effective) for avoiding congestion. The first is slow start, which forces a connection to begin at a low rate and ramp up only if there is bandwidth. The cost of starting too slow is overhead and wasted bandwidth that could have been used for the transfer, but starting too fast across links of very different speeds causes congestion. (It's interesting that some of these methods exist to maintain compatibility with older, slower links, which brings us back to the previous week's point that the internet needs to support a wide range of networks.)
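
A minimal sketch of the slow start idea, assuming the window is counted in segments and updated once per round trip (this is my simplification, not the paper's exact algorithm):

    def slow_start_step(cwnd, ssthresh):
        # one round trip: the window doubles until it reaches the threshold,
        # after which growth becomes linear (congestion avoidance)
        if cwnd < ssthresh:
            return cwnd * 2     # exponential growth phase
        return cwnd + 1         # additive increase once past the threshold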

The second technique is round-trip timing, which is used to decide when a packet should be considered dropped by the network. This is important because if the estimate is not conservative enough, we start retransmitting packets before they are actually lost, wasting bandwidth. The third technique is exponential backoff, where the sender progressively backs off its retransmit timer to adapt to the link; more on why exponential backoff is required in the next paper. The future work section is interesting in that it relates to the end-to-end argument: not just the end points should implement avoidance, but maybe the gateways as well, to improve congestion control.
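
Here is a rough Python sketch of the round-trip timing and backoff ideas: keep an exponentially weighted average of the RTT plus its variation, derive the retransmit timeout from both, and double the timeout on each successive retransmission. The smoothing constants are the usual textbook ones and may not match the paper exactly.

    def update_rto(srtt, rttvar, sample, alpha=0.125, beta=0.25):
        # track both the smoothed RTT and its variation so the retransmit
        # timer isn't fooled by normal RTT fluctuation
        rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
        srtt = (1 - alpha) * srtt + alpha * sample
        rto = srtt + 4 * rttvar
        return srtt, rttvar, rto

    def backoff(rto, max_rto=60.0):
        # exponential backoff: double the timeout on each retransmission
        return min(rto * 2, max_rto)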

Congestion control is not limited to networks; a traffic jam is the perfect example. Freeways employ techniques like metering lights at on-ramps to slow the incoming rate (a form of backoff). Without congestion control we are very vulnerable to the congestion collapses mentioned in the paper, which can easily turn a high-bandwidth link into a low one, just like the LBL and UCB link.

Wednesday, September 3, 2008

On Inferring Autonomous System Relationships in the Internet

I read this paper after the previous lecture paper on interdomain routing, which made this paper a lot easier to read. The authors are trying to build a model of the current internet interdomain structure, using the assumptions of the BGP protocol to infer the various relationships between ASes.

The first half of the paper explains the BGP protocol again and the different relationships that exist in it. It then describes the algorithms used to infer those relationships and model the routing tables. The decisions each router makes about which routes to export and import are based on the relationships (provider-customer, peer-to-peer, etc.), and the inference algorithms were written into a Perl program and run on collected routing tables. The results came out very well: 99.1% of the inferences were confirmed by AT&T.
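
My loose understanding of the basic heuristic, sketched in Python rather than the paper's Perl (a simplification under my own assumptions, not the paper's exact procedure): in each AS path, treat the highest-degree AS as the top provider, and orient the links on either side of it accordingly.

    def infer_relationships(as_paths, degree):
        # as_paths: list of AS paths (lists of AS numbers) from routing tables
        # degree: dict mapping each AS to how many neighbors it has
        rel = {}  # (a, b) -> 'customer->provider' or 'provider->customer'
        for path in as_paths:
            # the highest-degree AS in the path is assumed to be the top provider
            top = max(range(len(path)), key=lambda i: degree[path[i]])
            for i in range(len(path) - 1):
                a, b = path[i], path[i + 1]
                # links before the top go uphill, links after it go downhill
                rel[(a, b)] = 'customer->provider' if i < top else 'provider->customer'
        return rel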

Without any background in networking research, I found this paper a very good read. First, it confirms the previous lecture in saying that most internet routing is driven by profit, not speed or quality. Looking at the results, over 90% of the relationships inferred on the internet are provider-customer relationships; even though there are ASes that share peer-to-peer relationships, they make up less than 10%. In a management of technology class that I just took, the professor defined successful engineering as coming up with something useful, and then defined usefulness to mean profitability: basically, anything that is profitable enough to sustain itself is useful. Looks like the internet is a good example of that too.

Interdomain Internet Routing

This paper is from a lecture series explaining the workings of interdomain routing, which is the backbone of the internet. The interdomain routing protocol is the Border Gateway Protocol (BGP), which is used to route between domains. Since the main goal of the internet was to connect different networks together into one large interconnected communication channel, there needed to be a way to route packets across the various domains and networks.

The internet is formed from different Autonomous Systems, including ISPs and campuses. Routing within an AS is handled by the AS itself, but routing between ASes is governed by policies and financial motivations. Thus a scalable protocol, easily customizable per domain, was created to let ASes configure which routes to advertise. Ultimately it costs money to transmit packets, so carrying the least amount of other people's traffic results in the most savings. The paper describes the different AS relationships and the route advertisement mechanisms that result from them.
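
A sketch of the export rule this implies, as I understand it (plain Python standing in for real router policy configuration; the function and parameter names are mine): routes learned from customers get advertised to everyone, because customers pay for the traffic, while routes learned from peers or providers are advertised only to customers, since carrying that traffic for other peers or providers earns nothing.

    def should_export(route_learned_from, neighbor):
        # route_learned_from, neighbor: 'customer', 'peer', or 'provider'
        if route_learned_from == 'customer':
            return True                 # customer routes go to everyone
        return neighbor == 'customer'   # peer/provider routes only to customers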

It was quite interesting reading about this. In all the previous networking classes, routing lectures, and research papers, the main focus always seems to be shortest paths, congestion control, or flow control; but in reality, a lot of the effort goes into minimizing cost and maximizing profit.

Monday, September 1, 2008

End to End Arguments in System Design

This paper summarizes one of the key concepts and tradeoffs in designing the internet, or any type of layered network. The argument is basically about how much functionality should be duplicated in each of the layers to guarantee that the functionality is provided robustly end to end. Most of the examples in the paper concern robustness and reliability, but the argument really applies to any functionality that is to be implemented in the network.

The paper gives some detailed examples, starting with careful file transfer. It lists the failures that might occur during a simple file transfer and some of the methods to combat them. For example, if the application protocol itself does the error checking and resends the file when there is corruption, then the lower layers don't need to worry about error checking at all; however, if errors occur frequently, this results in a huge performance loss and possibly network congestion, because the whole file has to be resent over and over. On the other hand, implementing checks at a lower level still leaves the possibility of errors the lower level can't see, so it doesn't remove the need for the end-to-end check at the higher level. It's just a matter of how much to do at each level.
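
A minimal sketch of what the application-level end-to-end check might look like for the file transfer example, assuming hypothetical send_file and fetch_remote_checksum helpers (these are stand-ins I made up, not anything from the paper): verify a checksum over the whole file after the transfer and retry on mismatch, regardless of what the lower layers already checked.

    import hashlib

    def careful_transfer(send_file, fetch_remote_checksum, path, retries=3):
        # compute the end-to-end checksum over the whole local file
        with open(path, 'rb') as f:
            local = hashlib.sha256(f.read()).hexdigest()
        for _ in range(retries):
            send_file(path)
            if fetch_remote_checksum(path) == local:
                return True   # end-to-end check passed
        return False          # give up after a few whole-file retransmissions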

The paper then walks through different scenarios and application backgrounds to show that there is no single correct answer; everything is a tradeoff. In particular, even the same type of application with a slightly different background can lead to a different design. A real-time voice call will probably care less about dropping one or two packets and more about latency, since the users can easily re-ask for the information over the conversation. A voicemail, however, probably cares more about the validity and quality of the content than about latency, since we can't re-ask for the information.

The examples used in the paper were very clear and really conveyed the ideas behind them. Instead of giving an absolute opinion on the end-to-end argument itself, the paper serves more as a guideline for how to design a communication system and identify its needs in order to choose between the tradeoffs.

I actually read this paper before the previous blog post's paper, and I thought it was the perfect order to read the two papers of the week. This paper is a precursor to the design difficulties faced in designing the internet, especially the choice of the datagram as the building block rather than something more reliable. Ultimately the goal of the internet was to support a wide range of networks, and because of that, reliable error checking and retransmission mechanisms might not be needed by all services and applications. As the argument states, implementing these functionalities in the lower layers doesn't remove the need to implement them at the higher layers, so there was no need for a reliable building block when a datagram suffices. This was clearly one of the considerations taken into account when the internet was designed.

The Design Philosophy of the DARPA Internet Protocols

This paper, written by David D. Clark, discusses the internet and its protocols, reflecting on the design goals and evaluating the success and implementation of each. It first gives a brief history of the DARPA network and lists the goals the developers had in mind when designing it.

"The fundamental goal for the internet architecture was to develop an effective technique for multipexed utilization of existing interconnected networks. " Writes Clark. This meant combining all the preexisting commonly used networks (including the ARPA packet radio network, ARPANET etc) together so they can communicate and share information. And the technique selected for that was packet switching. Along with the fundamental goal, there are also second level goals, including fault tolerant, multiple types of services etc. The paper goes on to discuss in detail the second level goals, and how the implementation came to be why each implementation was decided.

The paper was definitely a good read. David Clark was the chief protocol architect during the development of the internet, so he obviously knew what he was talking about. He explained each goal and the reasoning behind it, giving us a good picture of what was going on in his mind at the time.

The internet is something we take for granted every day, and it's almost something we can't live without. Looking back at the design goals, the fundamental goal of interconnecting a wide range of networks is probably what gave the internet its popularity today. Being able to connect to people all over the world through these different sets of mediums is what makes the internet so attractive, and is in a sense the definition of the internet itself. And fault tolerance gives us a pretty reliable network to use.

We can also see the tradeoff between the flexibility of the internet and its performance. It was interesting to read Clark describe how the internet architecture can't really be pinned down, because it encapsulates such a wide range of services that it can only be defined by the endpoint services themselves. The performance (latency) of the internet also cannot be fully characterized, because the design goal is to support networks of any latency. But because of that, we are seeing the internet used in ways no one would have imagined when it was designed. Thanks to this flexibility in services, the internet is even moving onto mobile platforms and revolutionizing our experience of interacting with the network.

However, as internet technology continues to evolve, it will be interesting to see how backwards compatibility continues to play out. How much longer will apps and developers continue to support slow networks like 56k modems, given that the design goal of the internet was to incorporate all the different networks into one giant network?