Exact file sizes in CP/M Plus
DOS Plus, CP/M 3 and some CP/M 2 clones (specifically, DOS+2.5) support exact file sizes. The following system calls support them:
BDOS function 15 - open a file
Entered with DE=address of File Control Block, C=15 (0Fh)
If the byte at FCB+32 (FCB+20h) is 255 (0FFh) then on return from this function the byte will contain the Last Record Byte Count. Remember to reset this byte to zero before attempting sequential I/O.
BDOS function 17 - search for first
Entered with DE=address of File Control Block, C=17 (11h)
Returns a pointer to the file's directory entry. The byte at ENTRY+13 (ENTRY+0Dh) contains the Last Record Byte Count.
BDOS function 18 - search for next
Entered with C=18 (12h)
Returns a pointer to the file's directory entry. The byte at ENTRY+13 (ENTRY+0Dh) contains the Last Record Byte Count.
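As a rough illustration of reading the byte described for functions 17 and 18, the sketch below models the DMA buffer as a Python bytes object. It assumes the usual CP/M behaviour in which the search calls return a code of 0-3 in A that selects one of four 32-byte directory entries in the current DMA buffer, with 0FFh meaning no match. The function name and constants are hypothetical, chosen for this example only.

```python
# Sketch only: the DMA buffer is modelled as a 128-byte bytes object.
DIR_ENTRY_SIZE = 32   # assumed size of one CP/M directory entry
LRBC_OFFSET = 13      # ENTRY+0Dh, as described in the text


def lrbc_from_search(dma_buffer, return_code):
    """Return the Last Record Byte Count of the matched directory entry.

    return_code is assumed to be the value the BDOS leaves in A:
    0-3 selects an entry in the DMA buffer, 0xFF means "not found".
    """
    if return_code == 0xFF:
        raise FileNotFoundError("no matching directory entry")
    if not 0 <= return_code <= 3:
        raise ValueError("unexpected BDOS return code")
    if len(dma_buffer) < 4 * DIR_ENTRY_SIZE:
        raise ValueError("DMA buffer must hold four directory entries")
    entry = return_code * DIR_ENTRY_SIZE
    return dma_buffer[entry + LRBC_OFFSET]
```

The byte is returned uninterpreted, since its meaning depends on which convention wrote it.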
BDOS function 30 - set file attributes
Entered with DE=address of File Control Block, C=30 (1Eh)
To set the Last Record Byte Count, store the required value at FCB+32 (FCB+20h), and set bit 7 of FCB+6.
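The FCB handling described for functions 15 and 30 can be sketched as follows. This is only a model: the FCB is represented as a 36-byte bytearray, the BDOS call itself is not shown, and the helper names are hypothetical.

```python
# Sketch only: the FCB is modelled as a 36-byte bytearray; no BDOS call is made.
FCB_SIZE = 36
FCB_F6 = 6     # FCB+6: bit 7 is the F6' attribute bit
FCB_CR = 32    # FCB+20h


def prepare_open_for_byte_count(fcb):
    """Before function 15: ask for the Last Record Byte Count."""
    fcb[FCB_CR] = 0xFF


def take_byte_count_after_open(fcb):
    """After function 15: read the count, then zero the byte
    so that sequential I/O starts from a clean value."""
    count = fcb[FCB_CR]
    fcb[FCB_CR] = 0
    return count


def prepare_set_byte_count(fcb, count):
    """Before function 30: store the count and set bit 7 of FCB+6."""
    if not 0 <= count <= 255:
        raise ValueError("byte count must fit in one byte")
    fcb[FCB_CR] = count
    fcb[FCB_F6] |= 0x80
```

A caller would run prepare_open_for_byte_count, issue function 15, then call take_byte_count_after_open before doing any sequential reads or writes.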
What is the Last Record Byte Count?
From CP/M's point of view, it's just a number from 0-255 associated with the file which programs can use for any purpose whatever. The documentation defines no interpretation for it.
If we want programs to be able to share files with exact lengths, then there had better be some sort of agreement on what the numbers mean. They must satisfy:
- If the number is zero, the file uses every byte in its disc image (for compatibility with earlier versions).
- It must be possible to find the number of bytes in the last record exactly.
Unfortunately, this still leaves two plausible systems:
- The number is the number of unused bytes in the last record.
- The number is the number of bytes used in the last record; since this ranges from 1-128, 0 is not a valid value and we know it means 128.
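To make the difference between the two systems concrete, here is a minimal sketch that turns a record count and a byte count into an exact length under either interpretation. The function name and the "unused"/"used" labels are invented for this example. Rejecting counts that would give a last record outside 1-128 bytes is a choice made by the sketch, not something CP/M enforces.

```python
RECORD_SIZE = 128  # bytes in one CP/M record


def exact_size(records, byte_count, system):
    """Exact file length in bytes.

    system == "unused": byte_count is the number of unused bytes
                        in the last record (first system above).
    system == "used":   byte_count is the number of bytes used
                        in the last record (second system above).
    A byte_count of 0 always means the last record is full.
    """
    if records == 0:
        return 0
    if byte_count == 0:
        return records * RECORD_SIZE
    if system == "unused":
        used = RECORD_SIZE - byte_count
    elif system == "used":
        used = byte_count
    else:
        raise ValueError("unknown system: %r" % (system,))
    if not 1 <= used <= RECORD_SIZE:
        raise ValueError("byte count out of range for this system")
    return (records - 1) * RECORD_SIZE + used
```

For a 3-record file with a byte count of 100, the "used" system gives 356 bytes and the "unused" system gives 284, which is why the two cannot be mixed safely.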
Even more unfortunately, programs exist that use both systems:
- ISX, an ISIS-II emulator, uses the first interpretation.
- Digital Research's DOS Plus operating system uses the second (for example, when copying files from DOS to CP/M disks, or redirecting console output to a file). Since this is an official DRI product, I think it's the best example to follow. Here's a snippet of code in the DOS Plus BDOS handling I/O redirection:
loc_8_2268:
        mov     cx, 10h                 ;F_CLOSE
        mov     dx, 1
        call    BDOScallback            ;Close the file
        cmp     byte ptr ds:0, 0        ;File open for write?
        jnz     loc_8_2291              ;If not, don't set byte count
        mov     al, ds:25h              ;Low byte of file size
        test    al, 7Fh                 ;Is it a multiple of 128?
        jz      loc_8_2291              ;If so, don't need to set size
        mov     bx, 1
        mov     cx, 1Eh                 ;F_ATTRIB
        mov     dx, bx
        or      byte ptr [bx+6], 80h    ;Set F6'
        mov     [bx+20h], al            ;Store last record byte count
        call    BDOScallback
loc_8_2291:
The same BDOS is used in Personal CP/M-86 v2.0.
- DOS+2.5 (the CP/M 2 clone) does not define any meaning for the byte, but its documentation suggests that the ISX interpretation was the intended one.
- The BDOS source code for 8-bit CP/M Plus calls the byte 'ubytes ;unfilled bytes field' suggesting the ISX interpretation. But it does not enforce either convention itself.
- Most DOS transfer software has tended to follow the second (DOS Plus) convention, such as MSDOS by Tilmann Reh and MSODBALL by me. A few utilities (mainly those aware of ISX, such as cpmtools and LibDsk) allow either convention to be selected using a configuration file.
- I also used the DOS Plus convention in other CP/M and emulation software, including PIPEMGR and zxcc.
- The patched version of Hi-Tech C on GitHub originally used the ISX convention, including unofficial patches to PIPEMGR. As I read the documentation, this was because of a personal preference for the ISX convention by the late Jon Saxton; I don't agree (to put it mildly) with his choice. At the time of writing, the latest release defaults to the DOS Plus convention, with ISX behaviour selectable using a configuration file.
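For software that writes files, the choice between the two conventions comes down to a few lines. The sketch below is hypothetical and not taken from any of the programs named above. It converts a length in bytes into a record count and a Last Record Byte Count, defaulting to the DOS Plus convention with the ISX one selectable. Like the DOS Plus code quoted earlier, it leaves the count at zero when the length is a multiple of 128.

```python
RECORD_SIZE = 128  # bytes in one CP/M record


def encode_size(size, convention="dosplus"):
    """Return (records, byte_count) for a file of `size` bytes.

    convention == "dosplus": byte_count = bytes used in the last record.
    convention == "isx":     byte_count = bytes unused in the last record.
    """
    if size < 0:
        raise ValueError("size must not be negative")
    records = (size + RECORD_SIZE - 1) // RECORD_SIZE
    used = size % RECORD_SIZE
    if used == 0:
        # Last record is full (or the file is empty): no count needed.
        return records, 0
    if convention == "dosplus":
        return records, used
    if convention == "isx":
        return records, RECORD_SIZE - used
    raise ValueError("unknown convention: %r" % (convention,))
```

A 356-byte file therefore occupies 3 records with a count of 100 under the DOS Plus convention and 28 under the ISX one.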