CS 29 -- Mid-term Exam Solutions


Below are answers to the mid-term exam. A bit of discussion is included with some of the answers. This page is by no means intended to supplant questions that you may have, so don't hesitate to approach me with those. Be sure that you understand your mistakes and my comments on your graded exams. If you don't, then you should ask!

You will find that questions are in green, answers are in red, and discussion is in blue.

  1. Consider a noiseless channel of 10 kHz and devices with the ability to emit and detect 16 distinct signal levels. What is the maximum data rate of this channel?

    The maximum data rate is, as given by Nyquist, 2H lg V, where H is the bandwidth and V is the number of signaling levels. Therefore:

    H = 10 kHz = 10,000 Hz
    V = 16
    (2)(10,000)(lg 16) = 80,000 bits/sec
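    The arithmetic can be checked with a short script; the function name here is ours, not part of the course material:

```python
import math

def nyquist_max_rate(bandwidth_hz, levels):
    """Nyquist limit for a noiseless channel: 2 * H * log2(V) bits/sec."""
    return 2 * bandwidth_hz * math.log2(levels)

print(nyquist_max_rate(10_000, 16))  # 80000.0 bits/sec
```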

    As a straightforward problem, there were only two basic errors. In some cases, people either couldn't remember Nyquist's equation, or they mistakenly used Shannon's equation for noisy channels. In other cases, people cited the correct equation, but made mistakes either with the arithmetic or the units -- for example, some failed to notice that the bandwidth was reported in kHz, but the equation assumed Hz.

  2. Compare and contrast circuit-based and packet-switched networks. Discuss, at least, the properties of latency, throughput, and utilization.

    For a circuit-based network, the latency is high for the initial establishment of the circuit before sending the very first byte, but low for all subsequent transmissions where the circuit (and thus the route) has already been established. The throughput is stable, due to the guaranteed availability of bandwidth for the circuit, but limited, as transmissions will be carried no faster than the allocated bandwidth for the circuit provides. The utilization is not likely to be high, as any idle circuits occupy bandwidth that is not carrying data.

    A packet-switched network can have more varied latencies than a circuit-based network, as each packet must be routed individually, with some packets taking longer than others to reach their destination. The throughput will vary depending on the congestion of the network, but has the capacity to occupy all of the bandwidth between source and destination while those packets are being sent. The utilization can be much higher, as no capacity is reserved for hosts that are not transmitting, leaving all of the bandwidth available for hosts that are.

    There was a wide variety of mis-statements about the properties presented in this question. Some asserted that packet-switched latencies would be strictly higher or lower, when in fact the overriding characteristic is their variability. A large number of people asserted that throughput would be high for circuit-based networks, citing the dedication of bandwidth. This assertion fails to acknowledge that the dedicated bandwidth is exceedingly limited compared to the total available bandwidth being shared, thus keeping throughput consistent, but often well below its potential. Most, but not all, correctly recognized that the utilization of a circuit-based network could be low whenever some circuit was idle. However, people less consistently recognized the high capability of packet switching to utilize the available capacity of the network.

  3. A distributed, shared medium access protocol can have unlimited contention, limited contention, or be contention free. Explain these terms and provide examples for each. Describe the wait-time and utilization properties for each.

    An unlimited contention protocol does not distinguish between hosts in response to a collision. That is, when a collision occurs, all hosts follow the same protocol, and all have permission to attempt a retransmission. Ethernet (IEEE 802.3) is an example of such a protocol, where it is possible, following a collision, that any one of the hosts may randomly choose a shortest waiting time and retransmit. When such a network is lightly loaded, the wait time is short and the utilization high. When the network is heavily loaded, collisions and their management substantially increase the average wait time and lower the utilization.
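    The random waiting time mentioned above can be sketched as Ethernet's binary exponential backoff rule. This is a minimal illustration only, not the full 802.3 procedure (which, among other details, abandons a frame after 16 failed attempts):

```python
import random

def backoff_slots(collision_count):
    """Binary exponential backoff, as in 802.3: after the i-th collision
    on a frame, a host waits a random number of slot times chosen
    uniformly from [0, 2**i - 1], with the exponent capped at 10."""
    k = min(collision_count, 10)
    return random.randrange(2 ** k)

# Every host runs the same rule -- none is singled out or restricted
# after a collision, which is what makes the contention "unlimited."
print([backoff_slots(3) for _ in range(5)])  # five values, each in 0..7
```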

    A contention free protocol is one that guarantees that only one host may attempt transmission at a given time. By slicing time and specifying that a unique host may transmit, collisions become impossible. The token bus (802.4) protocol is an example of a contention free protocol, as only the host holding the token may transmit. A lightly loaded contention-free network is going to have higher wait times than necessary, and thus slightly reduced utilization. For heavy loads, however, contention-free protocols are ideal, as utilization becomes nearly optimal, and the wait time is lower since collision management does not pollute the traffic.

    A limited contention protocol may allow only a subset of the hosts to transmit at a given time. Given high network load or collisions, such a protocol may divide time such that certain hosts may only attempt transmissions during certain time slices. The adaptive tree walk protocol is a limited contention protocol, where the response to a collision requires that first one half of the hosts be prevented from attempting transmission, and then (perhaps) the other half. A limited contention protocol exhibits low wait times and high utilization when the load is low, and adapts its behavior to reduce collisions when the load is high, thus keeping wait times as low as possible and maintaining high utilization.

    The most common and major error was to believe that an unlimited contention protocol is one that does not resolve or manage collisions at all. It is important to notice that an unlimited contention protocol is one that never differentiates between (and thus restricts) hosts during collision resolution. Often, this mistake was coupled with the erroneous claim that Ethernet (802.3) was an example of a limited contention protocol.

    The analysis of wait time and utilization tended to be erratic for this question. Many people would comment on these properties for low loads or high loads, but not both, leaving an incomplete picture of the behavior of each type of protocol. In particular, it should be clear that collisions and their resolutions are the events that cause the network to spend time and traffic on management rather than actual data transmission, thereby reducing the utilization of the network for data communications.

  4. How can invalid physical layer codes be used to perform framing? Provide a simple example of such signaling under Manchester encoding.

    Given a valid signaling mechanism to transmit bits, we seek a method of marking the boundaries of a frame that contains bits. We can choose special bit patterns to represent those boundaries, but it may be more efficient to choose additional signaling patterns -- ones that are not normally valid (i.e., not used for the transmission of bits) -- to mark the frame boundaries.

    In Manchester encoding, there must be a transition of the signal from low to high or high to low in the middle of every time slice. We can mark the beginning and ending of a frame with a time slice where the signal does not change.
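    A minimal sketch of this idea, assuming the convention that a 1 bit is sent high-then-low and a 0 bit low-then-high (the opposite convention works equally well):

```python
def manchester_encode(bits):
    """Encode each bit as a (first-half, second-half) signal pair.
    Convention assumed here: 1 -> (high, low); 0 -> (low, high).
    Every data bit therefore has a mid-slice transition."""
    table = {1: ('H', 'L'), 0: ('L', 'H')}
    return [table[b] for b in bits]

# An invalid code: no mid-slice transition, so it can never be
# mistaken for a data bit.  Use it to delimit the frame.
FRAME_MARK = ('H', 'H')

signal = [FRAME_MARK] + manchester_encode([1, 0, 1, 1]) + [FRAME_MARK]
print(signal)
```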

    It was important first to establish what framing is, as some failed to do. A few confused escape sequences with invalid codes. It was then critical not only to show that you knew what Manchester encoding was, but what an invalid Manchester encoding could be. You then had to tie these ideas together, stating that the invalid codes could be used not to signal errors (as some claimed) but rather to indicate frame boundaries.

  5. For 32-bit messages, how many check bits are needed to provide 1-bit error correction? Justify your answer by deriving the number of check bits needed for an arbitrary m-bit message and 1-bit errors.

    With m = 32 there are 2^m possible m-bit messages. For each of those messages, there must be a unique n-bit codeword. It must also be possible that any one of the n bits of a valid codeword could be flipped and yet still be identifiable as originally having been the valid n-bit codeword. Therefore, we must reserve n+1 of the 2^n possible n-bit patterns for each m-bit message -- one for the valid codeword, and n at Hamming distance 1 from the valid codeword.

    If n = m+r, then we must be sure that r (and thus n) is large enough for there to be n+1 bit patterns reserved for each m-bit message. So, if there are 2^m possible messages, we need to be able to form at least 2^m (n+1) codewords. Therefore:

    2^m (n+1) <= 2^n
    2^m (m+r+1) <= 2^(m+r)
    m+r+1 <= 2^r

    Given m = 32, we find that 33 + r <= 2^r. The smallest integer r for which this inequality holds is r = 6.
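    The inequality can be solved for the smallest r by direct search; the function name is ours:

```python
def check_bits_needed(m):
    """Smallest r such that m + r + 1 <= 2**r, the condition for
    1-bit error correction of m-bit messages."""
    r = 1
    while m + r + 1 > 2 ** r:
        r += 1
    return r

print(check_bits_needed(32))  # 6
```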

    This question caused a good deal of trouble. Most commonly, people had memorized the inequality that allowed for the calculation of the number of check bits for 1-bit error correction, and then used it. This approach yielded the correct answer, but failed to provide the requested justification for that inequality.

    It may not have been completely clear that I was looking for a description, from first principles, of the reasoning behind that inequality. Any attempt to justify the inequality was given at least most of the credit.

    A handful of people tried to apply the actual Hamming code (which is not to be confused with Hamming distance), and use that to derive the number of check bits needed. This yielded incorrect results, and failed to address the question clearly, but earned some of the credit as it is a somewhat reasonable approach.

    A number of people failed to understand the concept of Hamming distance, believing that it referred to the number of check bits in a code.

  6. Consider a data link layer that uses sliding windows with selective repeat and frame numbers represented with 4 bits. What is the maximum window size that can be used? Why can the window be no larger?

    A 4-bit value for a frame number implies that there are 16 distinct frame numbers. Therefore, the largest possible window size is 8, or half of 16.

    Consider a larger window of 9 frames. The sender could transmit a window-full of frames, numbered 0 to 8. The receiver accepts these frames, acknowledges their receipt, and moves its window forward to expect frames numbered 9, 10, ..., 15, 0, and 1. Now consider that the acknowledgements are lost. The sender will time out and resend its original 9 frames, numbered 0 through 8. The receiver will then not be able to recognize that frames 0 and 1 are duplicates of the ones it had already acknowledged, rather than new frames carrying those numbers. It therefore should never be possible that two adjacent window configurations have frame numbers in common.
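    The overlap argument can be checked directly for any window size; this small simulation (names ours) compares two adjacent window configurations:

```python
def windows_overlap(seq_space, window):
    """The old window holds frames 0..window-1; after acknowledging them,
    the receiver expects the next `window` numbers (mod seq_space).
    Retransmissions of the old window are ambiguous iff the sets meet."""
    old = {i % seq_space for i in range(window)}
    new = {i % seq_space for i in range(window, 2 * window)}
    return bool(old & new)

print(windows_overlap(16, 9))  # True: frames 0 and 1 are ambiguous
print(windows_overlap(16, 8))  # False: the windows are disjoint
```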

    It was common, on this problem, to confuse the restrictions on window size for go-back-n with those for selective repeat. Commonly, the window size for one protocol was given, but accompanied by the reasoning for the other.

    Another, deeper problem was that of people who confused frames, frame numbers, bits, and window size. Be sure you know exactly what these terms mean, and that you're using them correctly.

  7. Consider a network whose routers use localized adaptive routing. How would such a network respond to the insertion of a new, high-speed link? How would such a network respond if later that high-speed link was eliminated?

    Adaptive routing that is localized learns only from information available within that router. Therefore, as new packets cross the new link, and the routers attached to that link collect information about the speed of that link and the distance it provides from other network locations, slowly the information will spread via backwards learning as packets that have passed through the link reach other routers.

    When the link is broken, there will be no new information indicating the unavailability of that link to routers other than the ones connected to the link. This weakness of local adaptivity can be addressed only if historical information in the routing table is deprecated, thus causing routers to "forget" about the now-missing link.
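    A hypothetical sketch of backward learning (the names and table layout are ours, for illustration only): a router improves its estimate of the route back to a source whenever a packet arrives having travelled fewer hops than the current table entry claims.

```python
# Each table entry maps a source host to (incoming line, hop count).
def backward_learn(table, source, hops_travelled, incoming_line):
    """Record the incoming line as the route back to `source` if this
    packet arrived over a shorter path than any seen before."""
    best = table.get(source)
    if best is None or hops_travelled < best[1]:
        table[source] = (incoming_line, hops_travelled)

table = {}
backward_learn(table, 'A', 5, 'line2')   # first sighting of host A
backward_learn(table, 'A', 3, 'line1')   # traffic over the new, faster link
print(table['A'])                        # ('line1', 3)
```

    Note that nothing in this rule ever removes an entry, which is exactly why deprecating (aging out) historical entries is needed once a link disappears.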

    I did not like this question. As a result, nobody lost more than half of the points, and the benefit of the doubt was often given.

    The essential element of this question is that you demonstrate that you knew the difference between localized routing and distributed routing. Specifically, many people claimed that neighboring routers would share routing information, thus allowing the slow-but-steady distribution of information about the insertion and deletion of the new link. Such a description is correct for distributed routing. Localized adaptivity does not allow for the sharing of information between routers. The only example we saw of localized routers adapting to new topological features was the use of backward learning, to which a few people hinted in their answers, but nobody mentioned by name.

  8. List two virtues of flooding as a routing solution. Explain your answers.

    1. Flooding always finds the shortest path. If each router sends each incoming packet out on every available outgoing channel (except the one on which the packet arrived), it must attempt every possible non-redundant path, including the shortest one.
    2. Flooding is robust. Since every possible routing is attempted, if there is any path from the source to the destination that persists during the travel of the packet, then the packet will be delivered.
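    A small simulation of the first claim (names ours): flooding propagates copies outward one hop at a time, so tracking only the first copy to reach each node -- a breadth-first expansion -- yields the hop count of the first arrival at the destination, which travelled a shortest path. (Real flooding generates many duplicate copies; this sketch ignores all but the first.)

```python
from collections import deque

def flood_hops(links, source, dest):
    """Return the hop count of the first flooded copy to reach dest.
    `links` maps each node to its list of neighbors."""
    frontier, seen = deque([(source, 0)]), {source}
    while frontier:
        node, hops = frontier.popleft()
        if node == dest:
            return hops
        for nbr in links[node]:
            if nbr not in seen:          # first copy to arrive here
                seen.add(nbr)
                frontier.append((nbr, hops + 1))
    return None                          # no surviving path

net = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'D'], 'D': ['B', 'C']}
print(flood_hops(net, 'A', 'D'))  # 2
```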

    Any two reasonable virtues, including the claim that flooding is simple to implement, were given credit. A few people claimed robustness and nearly-guaranteed delivery as two separate virtues, when they are the same one.

  9. Consider a sorting algorithm whose serial runtime is T_s(n) = O(n lg n). If a given parallel sorting algorithm uses n/2 processors and has a total runtime for p processors of T_p(n) = O(n), is that parallel sorting algorithm work efficient? Justify your answer.

    W_p(n) = p * T_p(n)
    W_p(n) = (n/2) * O(n)
    W_p(n) = O(n^2)
    W_p(n) != O(T_s(n))

    Therefore, this parallel algorithm is not work efficient.
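    The growing gap can also be seen numerically; this sketch (names ours) compares the total parallel work p * T_p(n) against the serial runtime n lg n:

```python
import math

def parallel_work(n):
    """p = n/2 processors, each running for T_p(n) = n steps."""
    return (n // 2) * n            # grows as n^2

def serial_work(n):
    """T_s(n) = n lg n."""
    return n * math.log2(n)

# The ratio grows without bound, so the parallel algorithm performs
# asymptotically more work than the serial one: not work efficient.
for n in (2 ** 10, 2 ** 16, 2 ** 20):
    print(n, parallel_work(n) / serial_work(n))
```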

    Errors on this problem were individualized -- no two people made quite the same mistake in introducing superfluous terms or in transforming the expressions from one step to the next. Most people did reach the correct basic conclusion.


Scott F. Kaplan
Last modified: Wed Dec 12 14:02:48 EST 2001