8.7 Binary I/O

The text files used thus far are very versatile, for they are stored in much the same form by all applications in all environments. Thus a text editor can be used to create a data file for a program to read. Likewise, the text output files generated by programs can easily be read by any other program with text capability.

There are times, however, when it is more desirable to input or output data in a binary form. The channels employed are still streams, but not of text. Rather, they are streams of raw data copied directly from the computer's memory.

In non-standard versions, facilities for raw I/O typically reside directly in the Filer/Files/FileSystem module, usually in the form of the following procedures:

(* typical non-ISO procedures *)
PROCEDURE ReadWord (file : File; VAR input : WORD);
  (* Reads the next word from the file 'file' and stores it in 'input'. *)

PROCEDURE ReadBytes (file : File; buffer : ADDRESS; VAR length : CARDINAL);
  (* Reads 'length' bytes from the file 'file' and stores them at 'buffer'. The actual number of bytes read is returned in 'length'. If the buffer is too small, data beyond it will be overwritten. This is a low-level procedure, so don't use it unless you know what you are doing. *)

PROCEDURE WriteByte (file : File; output : BYTE);
  (* Writes the byte in 'output' to the file 'file'. *)

PROCEDURE WriteWord (file : File; output : WORD);
  (* Writes the word in 'output' to the file 'file'. *)

PROCEDURE WriteBytes (file : File; buffer : ADDRESS; VAR length : CARDINAL);
  (* Writes the 'length' bytes starting at 'buffer' to the file 'file'. The actual number of bytes written is returned in 'length'. If the buffer is too small, undefined bytes will be written, so don't use this unless you know what you are doing. *)

Because the term is used below, it is useful to define buffer.

A buffer is a temporary storage area that is used to store information being transmitted to or from an external location (including a physical file).

In the ISO suite of I/O library modules, on the other hand, channels can be opened in the same way as previously described using SeqFile or StreamFile and then I/O operations handled using facilities of RawIO. There is also an SRawIO, but unless the user is aware of the meaning of sending raw binary data to the standard output, there is little point in employing its routines. Here is the definition of RawIO.

DEFINITION MODULE RawIO; (* from ISO suite *)

  (* Reading and writing data over specified channels using raw operations, that is, with no conversion or interpretation. The read result is of the type IOConsts.ReadResults. *)
 
IMPORT IOChan, SYSTEM;
 
PROCEDURE Read (cid: IOChan.ChanId; VAR to: ARRAY OF SYSTEM.LOC);
  (* Reads storage units from cid, and assigns them to successive components of to. The read result is set to the value allRight, wrongFormat, or endOfInput. *)
 
PROCEDURE Write (cid: IOChan.ChanId; from: ARRAY OF SYSTEM.LOC);
  (* Writes storage units to cid from successive components of from. *)
 
END RawIO.

For instance, the integers collected in the first example GetNStash of section 8.5.1 could have been stored to a file in raw form rather than in text form by replacing the line

WholeIO.WriteInt (outfile, number, 0);

with the line

RawIO.Write (outfile, number);

as the variable number of the type INTEGER (and all other variables, of whatever type) is compatible with the parameter ARRAY OF SYSTEM.LOC.

Then, in the module ReadNAdd that followed, the line

ReadInt (infile, number);

would be replaced with the line

Read (infile, number);

where Read had been imported from RawIO.

In both cases, however, the file should be opened with the parameter raw as shown below:

Open (outfile, "numbers", write+raw, res);

and

Open (infile, "numbers", read+raw, res);
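Putting the two replacements and the raw flag together, a complete round trip might look like the following. This is only a minimal sketch, not the GetNStash and ReadNAdd modules themselves: the module name RawRoundTrip, the three sample values, and the reuse of one channel variable for both phases are assumptions made for illustration. The old flag is included to allow an existing "numbers" file to be reopened for writing, as in the other examples of this chapter.

```modula2
MODULE RawRoundTrip;

(* Illustrative sketch only: stores three integers in raw form,
   then reads them back and totals them. *)

FROM StreamFile IMPORT
  ChanId, Open, Close, read, write, old, raw, OpenResults;
FROM IOResult IMPORT
  ReadResult, ReadResults;
IMPORT RawIO;
FROM STextIO IMPORT
  WriteString, WriteLn;
FROM SWholeIO IMPORT
  WriteInt;

VAR
  cid : ChanId;
  res : OpenResults;
  i, number, sum : INTEGER;

BEGIN
  (* write phase: each value goes out as SIZE (INTEGER) locations *)
  Open (cid, "numbers", write+raw+old, res);
  IF res = opened
    THEN
      FOR i := 1 TO 3
        DO
          number := 10 * i;
          RawIO.Write (cid, number);
        END;
      Close (cid);
    END;

  (* read phase: continue until the read result is no longer allRight *)
  Open (cid, "numbers", read+raw, res);
  IF res = opened
    THEN
      sum := 0;
      RawIO.Read (cid, number);
      WHILE ReadResult (cid) = allRight
        DO
          sum := sum + number;
          RawIO.Read (cid, number);
        END;
      Close (cid);
      WriteString ("The total is ");
      WriteInt (sum, 0);
      WriteLn;
    END;
END RawRoundTrip.
```

When the file starts out empty or absent, the total reported should be 60. Note that each channel is closed only inside the branch in which it was successfully opened.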

When using either StreamFile or SeqFile without the raw flag, a flag of text is implied. Both raw and text forms of I/O could be done with a channel if both flags were given, but that combination is rather unlikely.

The disadvantage of raw I/O is that the resulting files are not text files, but images of the computer's memory. Thus, they cannot be read meaningfully with a text editor. On the other hand, the memory storage of an integer is likely to occupy only two or four locations, whereas one location is required for each character when it is written as text. For numeric data the raw file is therefore usually smaller, and it can be written to and read from more quickly than the corresponding text file, because no conversion between the internal and text forms is needed. As is often the case, speed is gained by taking a low-level view, but the benefit comes at the expense of convenience.
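The size argument can be checked on a particular system. The sketch below (the module name ShowSizes is hypothetical) prints the number of storage locations occupied by one INTEGER, which is also the number of locations that RawIO.Write transfers for each value. The count for the text form is simply the number of characters written, for example six for -12345, plus whatever separator is placed between values. The result depends on the implementation, so no particular output is claimed here.

```modula2
MODULE ShowSizes;

(* Illustrative sketch only: reports how many locations one INTEGER occupies. *)

FROM STextIO IMPORT
  WriteString, WriteLn;
FROM SWholeIO IMPORT
  WriteCard;

BEGIN
  WriteString ("Locations transferred by RawIO.Write for one INTEGER:");
  WriteCard (SIZE (INTEGER), 0);
  WriteLn;
END ShowSizes.
```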

To illustrate some of these ideas with a fuller example, the FileCopy module is here rewritten to employ the ISO modules rather than the low-level Filer as in section 8.6.3. Observe that although the file being copied happens to be a text file, this fact is not used by the program, and its code could copy any file. In addition, this program makes use of its own internal buffer to hold a portion of the file at a time.

MODULE FileCopyRaw;

(* Written by R.J. Sutcliffe *)
(* to illustrate the use of ISO module RawIO *)
(* last revision 1994 02 10 *)
(* This module reads a file called "numbers" and copies it to "numbers.bak". *)

FROM StreamFile IMPORT
  ChanId, Open, read, write, old, raw, Close, OpenResults;
FROM IOResult IMPORT
  ReadResult, ReadResults;
IMPORT RawIO; 
FROM SYSTEM IMPORT
  LOC;

CONST
  bMaxIndex = 20;
TYPE
  Buffer = ARRAY [0 .. bMaxIndex] OF LOC;
VAR
  infile, outfile : ChanId;
  buffer: Buffer;
  res : OpenResults;
  countR, countW : CARDINAL;

BEGIN
  Open (infile, "numbers", read+raw, res);
  IF res = opened
    THEN
      Open (outfile, "numbers.bak", write+raw+old, res);
      (* allow overwrite *)
      IF res = opened
        THEN
          REPEAT
            countR := 0;
            REPEAT (* fill the buffer as much as possible *)
              RawIO.Read (infile, buffer [countR]);
              INC (countR);
            UNTIL (ReadResult (infile) = endOfInput) OR (countR > bMaxIndex);

            IF ReadResult (infile) = endOfInput
              THEN  (* last one read should not be counted *)
                DEC (countR)
              END;
 
            countW := 0;
            WHILE countR > 0 (* write stuff read in last loop *)
              DO
                RawIO.Write (outfile, buffer [countW]);
                INC (countW);
                DEC (countR);
             END;
          UNTIL (ReadResult (infile) = endOfInput);
          Close (outfile);  (* close only a channel that was actually opened *)
        END;
      Close (infile);
    END;
END FileCopyRaw.
