11.3 Selection Revisited--The CASE Statement

Sometimes when one has a number of selection alternatives, the multiple IF .. ELSIF .. ELSIF .. ELSE construction may be rather cumbersome, particularly when all the decisions are being made on the same variable and over a small range of values, as in the following outline:

(* Suppose that digit is known to be in the range 0 .. 9. *)
IF (digit > 1) AND (digit < 5)
  THEN
    Statement Sequence 1;
  ELSIF digit < 7 THEN
    Statement Sequence 2;
  ELSIF digit < 9 THEN
    Statement Sequence 3;
  ELSE
    Statement Sequence 4
  END;

This code takes one action when 1 < digit < 5, a second if 5 <= digit < 7, a third if 7 <= digit < 9, and a fourth otherwise. Because the same variable is used for each decision, it would be neater to write this in a way that mentions the variable only once.

In Modula-2 this is achieved with the CASE statement, an alternative selection construction. Here is the above code rewritten using CASE.

CASE digit OF
  2 .. 4 :
     Statement Sequence 1 |
  5, 6 :
     Statement Sequence 2 |
  7, 8 :
     Statement Sequence 3
  ELSE
     Statement Sequence 4
  END;

The syntax is further illustrated in figure 11.3:

NOTES: 1. The colon is not part of an assignment := in this context. Rather, it is a marker (or delimiter) between the list of values for an individual case and the statements associated with that case.

2. The colons are required after the individual ordinal values or ranges that determine the cases.

3. The vertical bar ( | ) is a new punctuation sign and is used to separate the cases. It does not need to appear before the ELSE or END. If it is included in such places, Modula-2 treats what follows as an empty case, just as it allows empty statements, so additional bars in these positions will not cause errors.

4. The range 2 .. 4 may be replaced by a list of single ordinals 2, 3, 4, but all possibilities listed must be expression compatible with the type of the variable name (here it is digit) after the CASE.

5. The ELSE clause is not required, but if it is left out and the value of digit does not match any of the listed possibilities, then an error will result at execution time. It is therefore better to include it even when it does not govern any statement sequence (no action to be taken). Some Modula-2 compilers even generate compiler errors when the ELSE clause is left out, though this action is not, strictly speaking, appropriate according to the ISO standard. Wirth's initial definition was silent on this point, and non-standard compilers might exhibit almost any behaviour.

6. No selector constant may be used twice in the list of selectors, either singly, or as part of an overlapping range of selectors.

Here is a sketch outline of a CASE statement. Assume that the various procedures have been defined and that DayName can take the indicated values.

CASE DayName OF
  Sunday :
    Goto (church);
    Eat (lunch);
    Eat (supper);
    Goto (church);
    Sleep |
  Monday .. Friday :
    REPEAT
      Goto (class);
    UNTIL Learned;
    Go (home);
    Sleep |
  Saturday :
    Hoe (garden);    (* no bar here *)
  ELSE
    Writeln ("Error in case lists, no selection made. ");
  END;

As can be seen, the selector variable may be of any ordinal type whether predefined or user defined and any kind of statement sequence is allowed, as long as each selector list after the first is preceded by a vertical bar.

The error in the ELSE part of the CASE statement above should never be able to be reported. This means that it probably will be.
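The same idea can be tried out in a small, complete module. What follows is only a sketch: the module name DayDemo, the declaration of the enumeration, and the messages are assumptions made for illustration, and the output procedures are taken from the ISO library module STextIO.

```modula-2
MODULE DayDemo;

FROM STextIO IMPORT
  WriteString, WriteLn;

TYPE
  (* assumed declaration of the user-defined ordinal type *)
  DayName = (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday);

VAR
  day : DayName;

BEGIN
  (* run through every value so that each case is selected at least once *)
  FOR day := Sunday TO Saturday
    DO
      CASE day OF
        Sunday :
          WriteString ("rest") |
        Monday .. Friday :
          WriteString ("class") |
        Saturday :
          WriteString ("garden")
        ELSE   (* cannot be reached while every value is listed above *)
          WriteString ("Error in case lists, no selection made.")
        END;    (* case *)
      WriteLn;
    END;    (* for *)
END DayDemo.
```

Removing one of the selector lists (say, Saturday) is a simple way to see the ELSE clause do its work.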

Scalars that are not ordinals cannot be used for the case selector expression. Neither can other types such as strings. Both of the following are illegal:

CASE theReal OF  (* illegal *)
  1.5 :
    statement sequence |
  2.7 :
    statement sequence |
  pi :
    statement sequence
  END;

CASE theString OF  (* illegal *)
  "Mon" :
    statement sequence |
  "Tue" :
    statement sequence |
  "Wed" :
    statement sequence
  END;

The individual cases may be singletons, lists, or ranges in any combination, as illustrated in the following: Suppose theResult has values in the range [0 .. 12].

CASE theResult OF
  1 :
    action1 |
  0, 2, 5 :
    action2 |
  3, 4, 6 :
    action3 |
  7 .. 9, 11 :
    action4
  END;

However, the following will produce a compiler error, because of the overlap of items in the lists.

CASE theResult OF
  2..5 :
    action1 |
  3, 4, 6 :    (* 3 and 4 already appear in the range 2..5 *)
    action2
  END;

Selection should be performed with the CASE statement instead of an IF .. THEN statement when:

1. the decision involves only the value of a single variable
2. there are several (but not very many) adjacent alternative values
3. the majority of the alternatives do NOT fall into the ELSE category.

There is little to be gained in writing:

CASE Done OF
  TRUE :
    WriteString ("all okay") |
  FALSE :
    WriteString ("Error in library module");
  END;

instead of using the more natural:

IF Done
  THEN
    WriteString ("all okay")
  ELSE
    WriteString ("Error in library module")
  END;

Assume the range of count is [1 .. 100]. Here is another bad example:

CASE count OF
   1 :
     statement sequence 1 |
   100 :
     statement sequence 2 |
   ELSE (* almost all cases end up here *)
     statement sequence 3
   END;

This one should have been formulated with an IF statement. Not only are the cases not adjacent in the range from which they are derived, but most of them are caught by the ELSE clause, which is therefore not the exception that its name implies, but the rule. At the very least, this code does not look very professional.
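A sketch of the IF formulation, using the same placeholder statement sequences as the CASE version and in the same outline form:

```modula-2
IF count = 1
  THEN
    statement sequence 1
  ELSIF count = 100 THEN
    statement sequence 2
  ELSE   (* the usual situation, now stated as such *)
    statement sequence 3
  END;
```

Here the two special values are tested explicitly, and the common action is plainly the default.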

There is another problem with CASE selector variables covering large ranges that is more than simply aesthetic. To understand this problem, it is necessary to know how a CASE statement is usually implemented when the compiler generates code. Each statement sequence governed by a case selection is compiled and stored. The locations of these sequences are recorded. A table is constructed with one entry for each possible case that can occur, and beside each entry the location of the sequence to be executed. Assuming that statement sequence 1 causes code1 to be executed, statement sequence 2 causes code2 to be executed, and statement sequence 3 causes code3 to be executed, the last example above generates a table that could be pictured as:

Value   Code To Execute
1       location of code1
2       location of code3
3       location of code3
4       location of code3
5       location of code3
...     ...
99      location of code3
100     location of code2

When the program is actually run, searching the table for a valid value of the selector variable, looking up the appropriate code, and executing it will happen very quickly. As the number of entries is increased to cover all the possible values for a variable with a larger range, the table that must be searched grows larger, but this does not add much, if anything, to the run time. However, the larger the possible range, the more table space must be reserved within the code that is generated. In an example such as this one, much of that space is wasted, and the final program is unnecessarily bloated. For the two considerations cited (logical and space), variables governing CASE statements should be of an ordinal type that has a modest range.
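The table idea can be imitated in Modula-2 itself with an array of procedure variables. This is only an analogy and not what any particular compiler actually emits; the module name TableDemo and the procedures Code1, Code2 and Code3 are hypothetical stand-ins for the three compiled statement sequences.

```modula-2
MODULE TableDemo;

FROM STextIO IMPORT
  WriteString, WriteLn;

CONST
  low = 1;
  high = 100;

VAR
  table : ARRAY [low .. high] OF PROC;   (* one entry per possible value *)
  action : PROC;
  count : CARDINAL;

PROCEDURE Code1;
BEGIN
  WriteString ("code1");
  WriteLn;
END Code1;

PROCEDURE Code2;
BEGIN
  WriteString ("code2");
  WriteLn;
END Code2;

PROCEDURE Code3;
BEGIN
  WriteString ("code3");
  WriteLn;
END Code3;

BEGIN
  (* build the table; nearly every entry refers to the ELSE code *)
  FOR count := low TO high
    DO
      table [count] := Code3;
    END;    (* for *)
  table [low] := Code1;
  table [high] := Code2;

  (* the selection itself is one indexed lookup followed by a call *)
  count := 100;
  action := table [count];
  action;
END TableDemo.
```

The lookup costs the same whatever the value of count, but all one hundred entries must be stored even though ninety-eight of them are identical.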

NOTES: 1. Some versions of Modula-2 therefore restrict the total number of case labels (i.e. the size of the table) in one CASE statement to some arbitrary limit (often as small as 255--the minimum restriction allowed by the ISO standard.) An attempt to use a selector variable with a range larger than this generates a compiler error.

2. Many implementations also impose some upper limit on the ordinal size of CASE labels. If the limit is, say, 32767, a variable whose type is a range [0 .. 10] would be allowed, but one whose type is a range [40000 .. 40010] would not. Even though the number of cases is the same for both, the values of the second range exceed the implementation-defined limit. The actual limit in a given implementation could be much smaller than this.

3. Contrary to the cautions here, some compilers work very hard to produce compact and optimal code even when the CASE selector is of type, say, CARDINAL, and only a few values are listed. That is, they turn bad planning into good code via optimization. This type of help should not be expected by the programmer.

Indeed, to emphasize the above points, many teachers will require students to explicitly list all possible values of the selector variable in the range. That is, if number is in the range [0 .. 8], then instead of including "don't care" values in an empty ELSE clause as:

CASE number OF
  0 :
    action 1 |
  2,3 :
    action 2 |
  5 .. 7 :
    action 3
  ELSE     (* 1, 4 and 8 : no action *)
  END;

some prefer:

CASE number OF
  0 :
    action 1 |
  1 :      (* no action *)
    |
  2, 3 :
    action 2 |
  4 :      (* no action *)
    |
  5 .. 7 :
    action 3 |
  8 :      (* no action *)
  END;

However, the latter construction is of little help when dealing with implementations of Modula-2 that over-zealously enforce good style and generate an error on a missing ELSE clause.

What follows is the example of Section 7.4 rewritten to use CASE instead of IF selection.

PROCEDURE MonthEnum (mon : ARRAY OF CHAR) : MonthName;

VAR
  ch : CHAR;

BEGIN
  (* check for unique characters in third position *)
  CASE CAP (mon [2]) OF
    "B" :
      RETURN Feb |
    "C" :
      RETURN Dec |
    "G" :
      RETURN Aug |
    "L" :
      RETURN Jul |
    "P" :
      RETURN Sep |
    "T" :
      RETURN Oct |
    "V" :
      RETURN Nov |
    "Y" :
      RETURN May |
    ELSE   (* any other third letter passes to next step. *)
    END;
  (* check for unique characters in second position *)
  CASE CAP (mon [1]) OF
    "P" :
      RETURN Apr |
    "U" :
      RETURN Jun | (* Jul and Aug are done already *)
    ELSE   (* any other second letter passes to next step. *)
    END;
  (* look at remaining first letters *)
  CASE CAP (mon [0]) OF
    "J" :
      RETURN Jan |  (* Jun and Jul are done already *)
    "M" :
      RETURN Mar | (* May is done already *)
    ELSE
      RETURN Err;  (* anything else is an error *)
    END;

END MonthEnum;

Here is an example of a little module that employs the CASE statement in the course of averaging three marks and then assigning a letter grade to a student. This one also shows an alternate prettyprint for case statements that may be preferred by some--putting the bar in front of the new cases at the beginning of the new line instead of at the end of the previous one. Since there is no difference as far as the compiler is concerned (because carriage returns are ignored), the choice of style is up to those in control of the working environment of the programmer.

MODULE Grader;

(* The import lists name the ISO standard library modules. *)
FROM STextIO IMPORT
  WriteString, WriteLn, WriteChar, ReadChar, SkipLine;
FROM SWholeIO IMPORT
  ReadCard, WriteCard;
FROM SIOResult IMPORT
  ReadResults, ReadResult;

CONST
  numMarks = 3;
TYPE
  MarkArrayType = ARRAY [1 .. numMarks] OF CARDINAL;

VAR
  count, total : CARDINAL;
  average : REAL;
  marks : MarkArrayType;
  letterGrade, ans : CHAR;
  res: ReadResults;

BEGIN      (* main program *)
  REPEAT
    WriteString ("Please give me the marks now");
    WriteLn;
    total := 0;
    FOR count := 1 TO numMarks
      DO
        REPEAT    (* until a valid percentage has been read *)
          WriteString ("Enter a whole number percentage, please ");
          WriteString ("for mark number ");
          WriteCard (count, 4);
          WriteString ("==> ");
          ReadCard (marks [count]);
          res := ReadResult ();
          SkipLine;
        UNTIL (res = allRight) AND (marks [count] <= 100);
        total := total + marks [count];
      END;    (* for *)
    average :=  FLOAT (total) / FLOAT (numMarks);
    IF average < 50.0
      THEN
        letterGrade := 'F'
      ELSIF average < 60.0 THEN
        letterGrade := 'D'
      ELSIF average < 70.0 THEN
        letterGrade := 'C'
      ELSIF average < 80.0 THEN
        letterGrade := 'B'
      ELSE
        letterGrade := 'A';
      END;    (* if *)
  CASE letterGrade OF
    'A' .. 'B':
      WriteString ("Congratulations, you got a ");
      WriteChar (letterGrade)
    | 'C':
      WriteString ("Well done, you earned a C.")
    | 'D':
      WriteString ("You got credit for the course with a D.")
    ELSE
      WriteString ("I regret to inform you that you ");
      WriteString ("only received a ");
      WriteChar (letterGrade);
      WriteString (" and will not obtain credit for the course.");
    END;    (* case *)
  WriteLn;
  WriteString ("Do Another? Y/N ");
  ReadChar (ans);
  SkipLine;
  UNTIL CAP (ans) # "Y";
END Grader.

Notice that it was inappropriate to use a CASE statement on the whole number values of the marks, because there were one hundred and one of them, and only five distinguishable cases.
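If a CASE were nevertheless wanted for assigning the letter grade, one possibility is to reduce the selector to a modest range first. The fragment below is a sketch only. It assumes that average lies between 0.0 and 100.0, as the input loop in Grader guarantees, and it relies on the standard function TRUNC to discard the fractional part.

```modula-2
(* selector is one of 0 .. 10, the tens digit of the truncated average *)
CASE TRUNC (average) DIV 10 OF
  0 .. 4 :
    letterGrade := 'F' |
  5 :
    letterGrade := 'D' |
  6 :
    letterGrade := 'C' |
  7 :
    letterGrade := 'B' |
  8 .. 10 :
    letterGrade := 'A'
  END;    (* case *)
```

This gives the same grades as the IF .. ELSIF chain in Grader, with eleven possible selector values instead of one hundred and one.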