0b55f123 |
1 | |
2 | |
3 | Answers to selected exercises from |
4 | 'Design and Validation of Computer Protocols' |
5 | ============================================= |
6 | |
7 | 1.1 |
8 | Assuming that torches in the two groups would be |
9 | raised and lowered simultaneously, |
10 | the code for the first character in the first group |
11 | could have clashed with the pre-defined start of text code. |
12 | |
13 | If torches are not raised simultaneously it is conceivable |
14 | that group numbers could be paired with the wrong character numbers. |
15 | |
16 | A well-trained transmitter might also overestimate the receiver's |
17 | ability to translate the codes on the fly. |
18 | as is still true today: receiving is a more time consuming task |
19 | than transmitting. |
20 | |
21 | 1.2 |
22 | Assuming that a torch code is displayed for a minimum of 30 seconds, |
23 | the torch telegraph transmits a choice of 1 out of 25 (between 4 |
24 | and 5 bits of information), giving a speed of roughly 0.15 bits/sec. |
25 | Chappe's telegraph transmitted a choice of 1 out of 128 every 15 to 20 |
26 | seconds, giving a transmission speed of roughly 0.5 bits/sec. |
27 | On Polybius' telegraph the 15 characters of the message |
28 | ``protocol failure'' would take 15x30 seconds or 7.5 minutes to transmit... |
29 | (Note that a code for the space was not available.) |
30 | On Chappe's system the 16 characters (space included) would be |
31 | transmitted in 4 minutes, assuming that no predefined code |
32 | was assigned to either the word `protocol' or the word `failure.' |
33 | (As far as we know, there wasn't.) |
34 | |
35 | 1.3 |
36 | Removing the redundancy in messages increases the chance that a |
37 | single transmission error would make a large part of a message |
38 | inrecognizable. It could cause a lot of extra traffic from receiver |
39 | back to sender, asking for retransmissions, and additional transmissions |
40 | of messages. The same tradeoff is still valid on today's communication |
41 | channels (see Chapter 3). |
42 | |
43 | 1.4 |
44 | The signalman at A had to make sure that not one but two |
45 | trains would leave the tunnel, before he admitted the third. |
46 | The two trains could reach signalman B in approximately 2 minutes. |
47 | At 25 symbols per minute, that would allow the two signalmen |
48 | to exchange roughly 50 characters of text. |
49 | A could have signaled: "two trains now in tunnel - how many left?" |
50 | for a total of 42 characters. |
51 | Assuming that B would have answered eventually "one train left," |
52 | that would still leave A puzzling if B had really understood his |
53 | message, and if so, where the second train could possibly be. |
54 | Considering also that signalman A had been on duty for almost |
55 | 18 hours when the accident occured, it is not entirely certain |
56 | that he could have succesfully resolved the protocol problem. |
57 | Note that he still would have had to `invent' part of the protocol |
58 | for resolving the problem in real-time. |
59 | |
60 | 1.5 |
61 | Replace the message `train in tunnel' with `increment the number of |
62 | trains in the tunnel by one.' Similarly, replace the message `tunnel |
63 | is clear' by `decrement the number of trains in the tunnel by one.' |
64 | The message `is tunnel clear' becomes `how many trains are in the |
65 | tunnel?' with the possible responses spelled out with numerals 0 to 9. |
66 | Either signalman can increment or decrement the number. |
67 | The rule of the tunnel is invariantly that the number of trains in |
68 | the tunnel is either zero or one, and only one signalman may transmit |
69 | at a time. (To resolve conflicts on access to the transmission line, |
70 | one could simply give one of the signalmen a fixed priority.) |
71 | |
72 | 1.6 |
73 | A livelock would result. Assuming that the semaphore operators would |
74 | quickly enough recognize the livelock, it is still an open question |
75 | what they would (should) do to recover properly from such an occurence. |
76 | |
77 | 1.7 |
78 | One possible scenario, observed by Jon Bentley in real life, is that |
79 | two connections are made, and both parties are charged for the call. |
80 | Clearly, a dream come true for the telephone companies. |
81 | |
82 | 2.1 |
83 | Service - the simplex transfer of arbitrary messages from a designated |
84 | sender to a designated receiver. |
85 | |
86 | Assumptions about the environment - sufficient visibility and small enough |
87 | distance to make and accurate detection (and counting) of torches possible |
88 | for both sender and receiver. There seems to be an implicit assumption of |
89 | an error-free channel. There is also the implicit assumption that the |
90 | receiver will always be able to keep up with the sender and will not get |
91 | out of sync with the symbols that have to be decoded. |
92 | |
93 | Vocabulary - 24 characters from the Greek alphabet, plus two control messages |
94 | (the start of text message and its acknowledgement). |
95 | |
96 | Encoding - each character, and each control message, is encoded into two |
97 | numbers, both between 1 and 5. |
98 | Since there are 26 distinct messages and only 5x5=25 distinct codes, some |
99 | ambiguity is unavoidable. |
100 | |
101 | Procedure rules - minimal. Apparently there was only a single handshake |
102 | at the start of the transmission. All other interactions (end of transmission, |
103 | error recovery, flow control) were undefined and would have had to be |
104 | improvised in the field. There is also no resolution of a potential conflict |
105 | at the start of transmission (assuming that both parties could decide to |
106 | start sending at the same time). |
107 | |
108 | 2.2 |
109 | The procedure rules can include a poll message once per |
110 | complete page - interrogating the printer about it's status |
111 | (confirming that it is online and confirming that it |
112 | is not out of paper - both conditions that can change from |
113 | one page to the next). Note that the procedure rules must |
114 | also guarantee that no more than one user can use the printer |
115 | at a time. |
116 | |
117 | 2.3 |
118 | Service - establishing a voice connection between two subscribers |
119 | of a phone system. (Let's conveniently ignore multi-party connections, |
120 | or faxes and modems.) |
121 | |
122 | Assumptions about environment - the phone system is infinitely |
123 | available and error-free (sic). |
124 | |
125 | Vocabulary - off-hook, on-hook, dial 0 ... dial 9 (ignore star and sharp). |
126 | Dial-tone, busy-tone, ring-tone. All messages listed here are control |
127 | messages - the `data' of the protocol is encoded in the voice message. |
128 | (For completeness then we could list `voice' as a separate message in the |
129 | vocabulary, without attempting to formalize it's `encoding.') |
130 | |
131 | Encoding - lifting the receiver, lowering the receiver, |
132 | pushing one of 10 labeled buttons. |
133 | |
134 | Informal procedure rules - Go off-hook, if no dial-tone is returned |
135 | go on-hook and repeat a random amount of time later. |
136 | If there is a dial-tone, push the sequence of buttons that identify the |
137 | required destination to the phone system. |
138 | If a busy-tone is returned in this interval, go on-hook, wait a random |
139 | amount of time, and repeat from the start. |
140 | If a ring-tone is returned - the call has been established - wait a |
141 | random amount of time, go on-hook. |
142 | |
143 | Note that the random wait period after a busy signal makes it less likely |
144 | that a `Lovers' Paradox' can be created (cf. Exercise 1.7). |
145 | To be complete, the phone systems behavior should also be listed. |
146 | Be warned that the latter is not a trivial exercise.... |
147 | |
148 | 2.4 |
149 | The revised version is the famous alternating bit protocol, see Chapter 4. |
150 | |
151 | 2.5 |
152 | The receiver can then determine where a message (should) end by |
153 | counting the number of bytes it receives, following the header. |
154 | It does not have to scan for a pre-defined message terminator. |
155 | |
156 | 2.6 |
157 | For isntance, a character stuffed protocol always transmits an integral |
158 | number of bytes. A bit stuffed protocol carries slightly less overhead. |
159 | |
160 | 2.7 |
161 | See Bertsekas and Gallager, [1987, p. 78-79]. |
162 | |
163 | 2.8 |
164 | More detailed sample assignments for software projects such as |
165 | this one are available from the author (email to gerard@research.att.com). |
166 | |
167 | 3.1 |
168 | The code rate is 0.2. Protection against bursts is limited |
169 | to errors affecting maximally 2 out of the 5 bytes. |
170 | At 56 Kbits/sec that means bursts smaller than 0.28 msec. |
171 | |
172 | 3.3 |
173 | Does the crc need to be protected by a crc? |
174 | |
175 | 3.4 |
176 | In many cases English sentences are redundant enough that |
177 | forward error correction is possible. Any real conversation, |
178 | though, contains many examples of feedback error correction |
179 | to resolve ambiguities. |
180 | To stick with the example - if the sentence ``the dog run'' is |
181 | received, the original version (i.e., one or more dogs) cannot be |
182 | determined without feedback error control. |
183 | |
184 | 3.5 |
185 | (a) - the checksum is 6 bits wide. |
186 | (b) - the original data is equal to the first n-6 bits of the code word |
187 | (6 bits in this case). |
188 | (c) - there were no transmission errors other than possible multiples |
189 | of the generator polynomial itself. |
190 | |
191 | 3.6 |
192 | The standard example is that of the voyager space craft near |
193 | the planet Jupiter receiving a course adjustment from earth. |
194 | A retransmission of the message would mean hours delay and invalidate |
195 | the original commands. |
196 | If the return channel has a high probability of error (e.g., a low |
197 | power transmitter on board the spacecraft, sending out a very weak |
198 | signal back to earth), the chances that a retransmission request would |
199 | reach the earth intact may also be unacceptably low. |
200 | |
201 | 3.8 |
202 | It is impossible to reduce a non-zero error rate to zero. |
203 | The probability of error can be brought arbitrarily close to zero, |
204 | at the expense of transmission speed, but it cannot reach zero. |
205 | The scheme suggested would violate this principle and therefore |
206 | should be placed in the same category as perpetuum mobiles and time-travel. |
207 | |
208 | 3.9 |
209 | Fletcher's algorithm can be classified as a systematic block code. |
210 | |
211 | 4.2 |
212 | The alternating bit protocol does not protect against |
213 | duplication errors or reordering errors. |
214 | Duplication errors persist (duplicate messages do not dissapear |
215 | but generate duplicate acks etc, for the duration of the session.) |
216 | Reordering can cause erroneous data to be accepted. |
217 | |
218 | 4.5 |
219 | Too short a timeout creates duplicate messages. |
220 | The duplicates lower the throughput for the remainder of the session. |
221 | Too long a timeout increases idle-time and lowers the |
222 | throughput. |
223 | |
224 | 4.6 |
225 | Ignore duplicate acks. |
226 | |
227 | 4.8 |
228 | See Bertsekas and Gallager [1987, pp. 28-29]. |
229 | |
230 | 4.9 |
231 | In at least one case (when the receiver is one full window of |
232 | messages behind the sender) there is a confusion case where |
233 | the receiver cannot tell if an incoming message is a repeat |
234 | from W messages back, or a new message. |
235 | |
236 | 4.10 |
237 | Store the data in buffer[n%W] where W is window size. |
238 | |
239 | 4.11 |
240 | Use time-stamps and restrict the maximum life time of |
241 | a message. Note however that time-stamps are just another |
242 | flavor of sequence numbers and they would have to be selected |
243 | from a sufficiently large window. |
244 | For 32 bit sequence numbers one message transmission |
245 | per micro-second would recycle the number in 71 minutes. |
246 | For 64 bit sequence numbers, the number recycles |
247 | at the same transmission speed in 500,000 years. |
248 | |
249 | 4.12 |
250 | Alpha controls the rate of adaption to changes in |
251 | network performance. |
252 | Beta controls the allowed variance in response time. |
253 | (It estimates the load variance of the remote host.) |
254 | |
255 | 4.13 |
256 | Most importantly, all assumptions about the environment, |
257 | specifically of the tranmission channel used, are missing completely. |
258 | |
259 | 4.14 |
260 | The message could be received again and cause a duplicate acceptance. |
261 | |
262 | 5.1 |
263 | An assignment is always executable. The variable b would be set |
264 | to the value 0. |
265 | |
266 | 5.2 |
267 | If the receive is executable on the first attempt to execute |
268 | the statement, the message would be received, and the condition |
269 | would be false (since the `executability' value of the receive is |
270 | non-zero). The statement would block, and would be repeated. |
271 | If the receive is (finally) non-executable, the receive fails, |
272 | but the condition becomes true and executable. |
273 | For all clarity: this is not valid Promela syntax. In Promela |
274 | the rule is that the evaluation of a condition must always be |
275 | completely side-effect free. |
276 | |
277 | 5.3 |
278 | |
279 | /***** Factorial - without channels *****/ |
280 | |
281 | int par[16]; |
282 | int done[16]; |
283 | int depth; |
284 | |
285 | proctype fact() |
286 | { int r, n, m; |
287 | |
288 | m = depth; |
289 | n = par[m]; |
290 | if |
291 | :: (n <= 1) -> r = 1 |
292 | :: (n >= 2) -> |
293 | depth = m + 1; |
294 | par[m+1] = n-1; |
295 | run fact(); |
296 | done[m+1]; |
297 | r = par[m+1]; |
298 | depth = m; |
299 | r = r*n |
300 | fi; |
301 | par[m] = r; |
302 | printf("Value: %d\n", par[m]); |
303 | done[m] = 1 |
304 | } |
305 | |
306 | init |
307 | { depth = 0; |
308 | par[0] = 12; |
309 | run fact(); |
310 | done[0]; |
311 | printf("value: %d\n", par[0]) |
312 | /* factorial of 12: 12*11*10....*2*1 = 479001600 */ |
313 | } |
314 | |
315 | /***** Ackermann's function *****/ |
316 | |
317 | short ans[100]; |
318 | short done[100]; /* synchronization */ |
319 | |
320 | proctype ack(short a, b, c) |
321 | { |
322 | if |
323 | :: (a == 0) -> |
324 | ans[c] = b+1 |
325 | :: (a != 0) -> |
326 | done[c+1] = 0; |
327 | if |
328 | :: (b == 0) -> |
329 | run ack(a-1, 1, c+1) |
330 | :: (b != 0) -> |
331 | run ack(a, b-1, c+1); |
332 | done[c+1]; /* wait */ |
333 | done[c+1] = 0; |
334 | run ack(a-1, ans[c+1], c+1) |
335 | fi; |
336 | done[c+1]; /* wait */ |
337 | ans[c] = ans[c+1] |
338 | fi; |
339 | done[c] = 1 |
340 | } |
341 | init |
342 | { |
343 | run ack(3, 3, 0); |
344 | done[0]; /* wait */ |
345 | printf("ack(3,3) = %d\n", ans[0]) |
346 | } |
347 | |
348 | 5.10 |
349 | |
350 | Here, as an inspiration, is another sort program, performing |
351 | a tree-sort. |
352 | |
353 | /**** Tree sorter ****/ |
354 | |
355 | mtype = { data, eof }; |
356 | |
357 | proctype seed(chan in) |
358 | { byte num, nr; |
359 | |
360 | do |
361 | :: (num < 250) -> num = num + 5 |
362 | :: (num > 3) -> num = num - 3 |
363 | :: in!data,num |
364 | :: (num > 200) -> in!eof,0 -> break |
365 | od |
366 | } |
367 | |
368 | init { |
369 | chan in[1] of { byte, byte }; |
370 | chan rgt [1] of { byte, byte }; |
371 | byte n; |
372 | |
373 | run seed(in); |
374 | in?data,n; |
375 | run node(n, rgt, in); |
376 | do |
377 | :: rgt?eof,0 -> printf("\n"); break |
378 | :: rgt?data,n -> printf("%d, ", n) |
379 | od |
380 | |
381 | } |
382 | |
383 | proctype node(byte hold; chan up, down) |
384 | { chan lft [1] of { byte, byte }; |
385 | chan rgt [1] of { byte, byte }; |
386 | chan ret [1] of { byte, byte }; |
387 | byte n; bit havergt, havelft; |
388 | |
389 | do |
390 | :: down?data,n -> |
391 | if |
392 | :: (n > hold) -> |
393 | if |
394 | :: ( havelft) -> lft!data,n |
395 | :: (!havelft) -> havelft=1; |
396 | run node(n, ret, lft) |
397 | fi |
398 | :: (n <= hold) -> |
399 | if |
400 | :: ( havergt) -> rgt!data,n |
401 | :: (!havergt) -> havergt=1; |
402 | run node(n, ret, rgt) |
403 | fi |
404 | fi |
405 | :: down?eof,0 -> break |
406 | od; |
407 | if |
408 | :: (!havergt) -> skip |
409 | :: ( havergt) -> rgt!eof,0; |
410 | do |
411 | :: ret?data,n -> up!data,n |
412 | :: ret?eof,0 -> break |
413 | od |
414 | fi; |
415 | up!data,hold; |
416 | if |
417 | :: (!havelft) -> skip |
418 | :: ( havelft) -> lft!eof,0; |
419 | do |
420 | :: ret?data,n -> up!data,n |
421 | :: ret?eof,0 -> break |
422 | od |
423 | fi; |
424 | up!eof,0 |
425 | } |
426 | |
427 | 5.13 |
428 | Promela is a validation modeling language, not an implementation language. |
429 | Why does a civil engineer not use steel beams in wooden bridge models? |
430 | |
431 | 6.1 |
432 | The assertion can be placed inside the critical section. |
433 | The simplest way is as follows (rewritten with some features |
434 | from the more recent versions of Spin): |
435 | |
436 | mtype = { p, v } |
437 | |
438 | chan sema[0] of { mtype }; |
439 | byte cnt; |
440 | |
441 | active proctype dijkstra() /* 1 semaphore process */ |
442 | { do |
443 | :: sema!p -> sema?v |
444 | od |
445 | } |
446 | active [3] proctype user() /* 3 user processes */ |
447 | { |
448 | sema?p; |
449 | cnt++; |
450 | assert (cnt == 0 || cnt == 1); /* critical section */ |
451 | cnt--; |
452 | sem!v |
453 | skip /* non-critical section */ |
454 | } |
455 | |
456 | 6.2 |
457 | To check the truth of the invariant |
458 | for every reachable state, one can write simply: |
459 | |
460 | never { do :: assert(invariant) od } |
461 | |
462 | Or to match an invalid behavior by reaching the |
463 | end of the never claim, without assertions: |
464 | |
465 | never |
466 | { do |
467 | :: (invariant) /* things are fine, the invariant holds */ |
468 | :: !(invariant) -> break /* invariant fails - match */ |
469 | od |
470 | } |
471 | |
472 | Note that semi-colons (or arrows) in never claims match system transitions, |
473 | (i.e., each transition in the system must be matched by a move in the |
474 | never claim; the claim does not move independently). |
475 | |
476 | 6.5 |
477 | Using accept labels, for instance: |
478 | |
479 | proctype A() |
480 | { do |
481 | :: x = true; |
482 | t = Bturn; |
483 | (y == false || t == Aturn); |
484 | ain = true; |
485 | CS: skip; /* the critical section */ |
486 | ain = false; |
487 | x = false |
488 | od |
489 | } |
490 | ... and simularly for proctype B() |
491 | |
492 | never { |
493 | do |
494 | :: skip /* allow arbitrary initial execution */ |
495 | :: !A[1]@CS -> goto accept1 |
496 | :: !B[2]@CS -> goto accept2 |
497 | od; |
498 | accept1: |
499 | do :: !A[1]@CS od /* process 1 denied access forever */ |
500 | accept2: |
501 | do :: !B[2]@CS od /* process 2 denied access forever */ |
502 | } |
503 | |
504 | |
505 | 6.6.a |
506 | never { |
507 | do |
508 | :: skip /* after an arbitrary number of steps */ |
509 | :: p -> break |
510 | od; |
511 | accept: |
512 | do |
513 | :: p /* can only match if p remains true forever */ |
514 | od |
515 | } |
516 | |
517 | 6.6.b |
518 | For instance: |
519 | never { |
520 | do |
521 | :: assert(q || p) /* "!q implies p" */ |
522 | od |
523 | } |
524 | |
525 | 6.6.c |
526 | never { /* <> (p U q) */ |
527 | do |
528 | :: skip /* after an arbitrary number of steps */ |
529 | :: p -> break /* eventually */ |
530 | od; |
531 | do |
532 | :: p /* p remains true */ |
533 | :: q -> break /* at least until q becomes true */ |
534 | od |
535 | /* invalid behavior is matched if we get here */ |
536 | } |
537 | |
538 | The translation has been automated, and is standard in Spin version 2.7 |
539 | and later (Spin option -f). |
540 | |
541 | 6.7 |
542 | A research problem -- there are no easy answers. |
543 | |
544 | 6.8 |
545 | assert(0) |
546 | is an immediate violation in both finite or infinite execution sequences, |
547 | and is a safety property. |
548 | |
549 | accept: do :: skip od |
550 | |
551 | is only a violation for an infinite execution sequence, and is a liveness |
552 | property (i.e., requires a nested depth-first search for acceptance |
553 | cycles). The safety property can be proven more effieciently. |
554 | Other than this, the two properties are equivalent; |
555 | |
556 | 7.1 |
557 | Layers 3 to 5 and layer 7. |
558 | |
559 | 7.2 |
560 | At the sender: first checksumming, then byte stuffing and framing. |
561 | At the receiver: first unstuffing and de-framing, then checksum |
562 | verification. |
563 | |
564 | 7.3 |
565 | Rate control is placed at the layer that handles the units it |
566 | refers too (for bit rates, it is the physical layer - for packets |
567 | it would be the layer that produces packets, etc.). |
568 | Dynamic flow control belongs in the flow control module. |
569 | |
570 | 7.13 |
571 | The one-bit solution will no longer work. |
572 | |
573 | 8.1 |
574 | The dash is used as a don't care symbol - any valid symbol |
575 | could replace it without changing the validity of the specification. |
576 | The epsilon is a null-element, i.e. it represents the absence |
577 | of a symbol (the empty set). |
578 | |
579 | 8.7 |
580 | No, the run and chan operators are defined to be unexecutable |
581 | when an (implementation dependent) bound is reached. |
582 | |
583 | 9.2 |
584 | No. |
585 | |
586 | 9.5 |
587 | More states, up to a predefined bound only. Fewer states, yes. |
588 | |
589 | 9.6 |
590 | No, the IUT could be arbitrarily large. |
591 | |
592 | 9.8 |
593 | It can no longer detect transfer errors. |
594 | |
595 | 10.2 |
596 | The computational complexity will make an interactive |
597 | solution impossible for all but the simplest |
598 | applications. |
599 | |
600 | 11.3 |
601 | Missing from the program text is that variable j is |
602 | initialized to 1 minus the value of i. |
603 | |
604 | 12.1 |
605 | Note that the number of states to be searched by a validator on |
606 | such a protocol would multiply by the range of the time count... |
607 | |
608 | 13.1 |
609 | If done right, the changes will be minimal (say 10 to 20 lines |
610 | of source). |
611 | The memory requirements will very quickly make validations |
612 | effectively impossible. |