User:Combuster/CDRom BS

From OSDev Wiki

My collection of random CD-ROM knowledge (or maybe BS). I may turn it into a full low-level CD-ROM info doc later. For now it stores background material for my work on a Hitachi SH4 platform, for which the tools seem to lack *nix support.

And I want that (***) Red Book. ECMA-119 and ECMA-130 don't cover *everything* and sometimes they are extremely cryptic if not incomplete.


Disk Layout

Each disk consists of a series of frames. Each frame is 33 bytes long: 24 bytes of data + 8 bytes of error correction + 1 subchannel byte (one bit for each channel). On disk, the data is encoded by 8-to-14 modulation, with three merging bits after each 14-bit symbol, so that each byte occupies 17 bits on the disk's surface. This is to make sure the CD reader can stay in sync with the data. A 27-bit synchronisation marker (a 24-bit pattern plus three merging bits) is added to mark the start of each frame, for a total of 588 bits per frame on the surface.

98 frames are grouped into one sector, giving each sector 2352 bytes of data (and 12¼ bytes for each subchannel).
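
The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch of the numbers quoted above; the constant names are made up for illustration.

```python
# Frame-level figures quoted in the text above.
DATA_BYTES_PER_FRAME = 24
PARITY_BYTES_PER_FRAME = 8      # frame-level error correction
SUBCHANNEL_BYTES_PER_FRAME = 1  # one bit for each of the 8 subchannels
FRAMES_PER_SECTOR = 98

BITS_PER_ENCODED_BYTE = 14 + 3  # 8-to-14 modulation plus 3 extra bits
SYNC_BITS_PER_FRAME = 27        # the 27-bit marker added to every frame

frame_bytes = (DATA_BYTES_PER_FRAME + PARITY_BYTES_PER_FRAME
               + SUBCHANNEL_BYTES_PER_FRAME)
surface_bits_per_frame = frame_bytes * BITS_PER_ENCODED_BYTE + SYNC_BITS_PER_FRAME
sector_data_bytes = FRAMES_PER_SECTOR * DATA_BYTES_PER_FRAME
# Each frame carries one bit per subchannel, so 98 bits per sector.
subchannel_bytes_per_sector = FRAMES_PER_SECTOR / 8

print(frame_bytes)                  # 33
print(surface_bits_per_frame)       # 588
print(sector_data_bytes)            # 2352
print(subchannel_bytes_per_sector)  # 12.25
```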

Audio CDs (following the CD-DA convention) use the full 2352 bytes. Data CDs (Mode 1 CD-ROM) use only 2048 bytes for actual data, while the remainder holds a header and a further layer of error detection and correction. This additional layer is meant to cover scratches, which normally render several consecutive frames unreadable but do not destroy most of a sector's contents, so that the data can be recovered. For audio, the extra error correction is not used, since bad frames can be concealed by interpolating the data. Data CDs that are less sensitive to data loss (Video CDs, for example) or that want more space (like PS games) do not store the error correction, only the synchronisation field and the header. The Mode 2 CD-ROM therefore has 2336 information bytes per sector.

In total, there are two error correction mechanisms on the sector level, and one on the frame level.

Ascii Art

  ... | sector | sector | sector | sector | sector | sector |
                    ____/        \_____
             ______/                   \_______
Sector:_____/                                  \______
  ___/                                                \___
  | 12 | 4 |                         | 4 | 8 | 172 | 104 |    field size
  |SYNC|HDR|       data x 2048       |EDC| - |ECC-P|ECC-Q|    Mode-1 data
  |SYNC|HDR|             data x 2336                     |    Mode-2 data
  |                     data x 2352                      |    CD-DA (Audio)
                ____/  \_____
           ____/             \_____
Frame:____/                        \____
    |     data x 24      |   CIRC   |sub|
    |        24          |     8    | 1 |

Somebody with access to the red book please confirm a frame's ordering
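
The Mode-1 sector layout from the diagram can be turned into a small slicing helper. This is a minimal sketch that only uses the field sizes shown in the diagram; `Mode1Sector` and `split_mode1_sector` are names invented here, not part of any existing tool.

```python
from typing import NamedTuple

SECTOR_SIZE = 2352
# Field sizes in the order shown in the diagram: SYNC, HDR, data, EDC,
# the unused gap, ECC-P, ECC-Q.
FIELD_SIZES = (12, 4, 2048, 4, 8, 172, 104)


class Mode1Sector(NamedTuple):
    sync: bytes
    header: bytes
    data: bytes
    edc: bytes
    gap: bytes
    ecc_p: bytes
    ecc_q: bytes


def split_mode1_sector(raw: bytes) -> Mode1Sector:
    """Cut one raw 2352-byte sector into its Mode-1 fields."""
    if len(raw) != SECTOR_SIZE:
        raise ValueError(f"expected {SECTOR_SIZE} bytes, got {len(raw)}")
    fields = []
    offset = 0
    for size in FIELD_SIZES:
        fields.append(raw[offset:offset + size])
        offset += size
    return Mode1Sector(*fields)


# Usage: a dummy sector whose byte values encode their own offset (mod 256).
dummy = bytes(i % 256 for i in range(SECTOR_SIZE))
sector = split_mode1_sector(dummy)
print(len(sector.data))  # 2048
```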

Synchronisation field

On data discs, this block holds a unique pattern (00, ten times ff, 00) which a CD drive can synchronise to. According to the standard, the rest of the sector is scrambled because of this.
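
As far as I recall from ECMA-130 (Annex B), the scrambling XORs everything after the 12 sync bytes with the output of a 15-bit shift register using the polynomial x^15 + x + 1, preset to 1, least significant bit first. The sketch below assumes exactly that; the polynomial, the preset and the name `scramble_sector` should be checked against the standard before relying on them.

```python
def scramble_sector(sector: bytes) -> bytes:
    """XOR bytes 12..2351 with the scrambler sequence; sync bytes are untouched.

    Assumption: 15-bit LFSR, polynomial x^15 + x + 1, preset 0x0001,
    bits taken least significant first. Applying it twice restores the input.
    """
    if len(sector) != 2352:
        raise ValueError("expected a raw 2352-byte sector")
    out = bytearray(sector)
    reg = 0x0001
    for i in range(12, 2352):
        key = 0
        for bit in range(8):
            key |= (reg & 1) << bit            # next output bit, LSB first
            feedback = (reg ^ (reg >> 1)) & 1  # taps for x^15 + x + 1
            reg = (reg >> 1) | (feedback << 14)
        out[i] ^= key
    return bytes(out)


# Scrambling an all-zero sector exposes the start of the key stream itself.
print(scramble_sector(bytes(2352))[12:16].hex())  # 01800060
```

Because the operation is a plain XOR, the same function both scrambles and descrambles.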


Header

This field holds the address of the current sector, and tells the reader whether it is dealing with a Mode-1 or a Mode-2 sector.

Byte Contents
1 Minute
2 Seconds
3 Fractions
4 Format Byte
  • 0x00 empty
  • 0x01 Mode-1 data
  • 0x02 Mode-2 data

The timing fields are ordered this way because of the Audio CD (which came before the Data CD). They are stored in binary coded decimal - i.e. the seconds go 0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x09 0x10 0x11 etc. The numbers 0xA - 0xF are skipped.
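
Converting between such a binary coded decimal byte and an ordinary integer could look like the sketch below. The helper names `bcd_to_int` and `int_to_bcd` are made up for this page.

```python
def bcd_to_int(value: int) -> int:
    """Decode one packed-BCD byte, e.g. 0x59 -> 59."""
    if not 0 <= value <= 0xFF:
        raise ValueError("not a byte")
    high, low = value >> 4, value & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a BCD byte: {value:#04x}")
    return high * 10 + low


def int_to_bcd(number: int) -> int:
    """Encode 0..99 as one packed-BCD byte, e.g. 10 -> 0x10."""
    if not 0 <= number <= 99:
        raise ValueError("BCD byte can only hold 0..99")
    return ((number // 10) << 4) | (number % 10)


print(bcd_to_int(0x10))      # 10
print(hex(int_to_bcd(59)))   # 0x59
```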

Fractions are numbered from 0x00 to 0x74 (75 fractions in a second). Seconds are numbered from 0x00 to 0x59. Minutes are numbered from 0x00 upwards to whatever the CD permits. (There's mention of numbers starting at A0 in the spec.)

Note that the actual data seems to start from 00:02:00, leaving 150 sectors (two seconds) for something else, most likely the pregap before the first track. (Red Book, anyone?)
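
Putting the header fields together, a sector's position relative to the start of the data at 00:02:00 can be computed as below. This is a sketch under the assumptions made in this section (BCD fields, 75 fractions per second, data starting 150 sectors in); `header_to_lba` is a hypothetical helper, and it simply rejects non-BCD values such as the A0 minutes mentioned above.

```python
from typing import Tuple

DATA_START = 150  # 00:02:00 expressed in sectors (2 seconds * 75)


def _bcd(value: int) -> int:
    """Decode one packed-BCD byte, e.g. 0x59 -> 59."""
    high, low = value >> 4, value & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a BCD byte: {value:#04x}")
    return high * 10 + low


def header_to_lba(header: bytes) -> Tuple[int, int]:
    """Return (sector index counted from 00:02:00, format byte)."""
    if len(header) != 4:
        raise ValueError("header must be 4 bytes")
    minute, second, fraction = (_bcd(b) for b in header[:3])
    if second > 59 or fraction > 74:
        raise ValueError("seconds or fractions out of range")
    absolute = (minute * 60 + second) * 75 + fraction
    return absolute - DATA_START, header[3]


print(header_to_lba(bytes([0x00, 0x02, 0x00, 0x01])))  # (0, 1)
```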

Error Detection Code

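The error detection mechanism on Mode-1 data CDs. Basically it is a textbook 32-bit CRC with the polynomial 0x18001801B (x^32 + x^31 + x^16 + x^15 + x^4 + x^3 + x + 1), least significant bit first. The standard is rather cryptic about this number: it only gives the polynomial in factored form, (x^16 + x^15 + x^2 + 1)(x^16 + x^2 + x + 1).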

The 8 bytes following the CRC are empty (zero), probably to detect error conditions in the CRC area.


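Error Correction Code

Todo. The ECC-P and ECC-Q fields look like a pair of CIRC-like codes. ECMA-130 calls the combination a Reed-Solomon Product Code.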

Cross-Interleaved Reed-Solomon Code

The error correction mechanism used at the frame level. It is sufficient to correct minor read errors. Wikipedia's article on it has more information.

Image formats

ISO image

Stores the important stuff. Looks basically like a concatenation of the 2048-byte data parts of a raw CD's sectors. Starts from sector 150 (00:02:00) onward.

CDI image

According to cdi2iso, it comes in 4 flavours (raw, normal, PQ, CD+G). Normal images are just the same as ISO images, but they have 150 more sectors at the start (which seem to store TOC stuff). The other formats start with a header reading 00 ff ff ff ff ff ff ff ff ff ff 00, and contain other parts besides the data content.
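
A quick way to tell the raw-style flavours apart from the plain 2048-byte ones is to look for that 12-byte pattern at the start of the file. The check below is only a heuristic sketch; `starts_with_raw_sector` is a made-up name and it says nothing about which of the raw flavours the image actually is.

```python
SYNC_PATTERN = b"\x00" + b"\xff" * 10 + b"\x00"


def starts_with_raw_sector(first_bytes: bytes) -> bool:
    """True if the buffer begins with the 00 ff*10 00 pattern of a raw sector."""
    return first_bytes[:12] == SYNC_PATTERN


print(starts_with_raw_sector(SYNC_PATTERN + bytes(4)))  # True
print(starts_with_raw_sector(bytes(16)))                # False
```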

BIN/CUE images

A hex dump revealed that BIN files hold raw sector contents: they include all the headers and error codes of each sector, at 2352 bytes per sector. For data sectors, the contents start with sector 00:02:00. The TOC is stored in a separate .CUE file, which is plain text. Example:

FILE "cd_contents.bin" BINARY
TRACK 01 MODE1/2352
INDEX 01 00:00:00
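
For a single-track MODE1/2352 image like the one in this example, turning the BIN into an ISO should come down to keeping the 2048 data bytes of every 2352-byte sector. The sketch below assumes exactly that layout (12 sync bytes and 4 header bytes before the data) and does no error checking or correction; `bin_to_iso` is a hypothetical helper, not an existing tool.

```python
import io

RAW_SECTOR = 2352
DATA_OFFSET = 16   # 12 bytes sync + 4 bytes header
DATA_SIZE = 2048


def bin_to_iso(bin_stream, iso_stream) -> int:
    """Copy the data part of each raw Mode-1 sector; returns sectors written."""
    count = 0
    while True:
        sector = bin_stream.read(RAW_SECTOR)
        if not sector:
            break
        if len(sector) != RAW_SECTOR:
            raise ValueError("image length is not a multiple of 2352 bytes")
        iso_stream.write(sector[DATA_OFFSET:DATA_OFFSET + DATA_SIZE])
        count += 1
    return count


# Usage with in-memory streams; real files would be opened in binary mode.
source = io.BytesIO(bytes(RAW_SECTOR * 2))
target = io.BytesIO()
print(bin_to_iso(source, target))  # 2
print(len(target.getvalue()))      # 4096
```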