A disk drive consists of a disk pack containing one or more platters stacked like phonograph records. Information is stored on both sides of the platter.
Each platter is divided into concentric rings called tracks, and each track is divided into sectors. All transfers to and from the disk are performed at the sector level.
For example, to modify a single byte, the system reads the sector containing the byte off the disk, modifies the byte, and rewrites the entire sector.
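The read-modify-write cycle described above can be sketched with a toy in-memory disk. The class `ToyDisk`, the helper `write_byte`, and the 512-byte sector size are assumptions made for illustration; they are not part of any real driver interface.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


class ToyDisk:
    """In-memory stand-in for a disk that only transfers whole sectors."""

    def __init__(self, num_sectors):
        self.data = bytearray(num_sectors * SECTOR_SIZE)
        self.reads = 0   # number of sector reads issued
        self.writes = 0  # number of sector writes issued

    def read_sector(self, n):
        self.reads += 1
        start = n * SECTOR_SIZE
        return bytearray(self.data[start:start + SECTOR_SIZE])

    def write_sector(self, n, buf):
        assert len(buf) == SECTOR_SIZE, "only whole sectors can be written"
        self.writes += 1
        start = n * SECTOR_SIZE
        self.data[start:start + SECTOR_SIZE] = buf


def write_byte(disk, offset, value):
    """Modify one byte by reading, patching, and rewriting its sector."""
    sector, index = divmod(offset, SECTOR_SIZE)
    buf = disk.read_sector(sector)   # read the whole sector
    buf[index] = value               # change the single byte in memory
    disk.write_sector(sector, buf)   # write the whole sector back


disk = ToyDisk(num_sectors=8)
write_byte(disk, offset=1000, value=0x41)  # byte 1000 lives in sector 1
```

Changing one byte costs one full sector read and one full sector write, which is why the sector is the natural unit of transfer.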
The disk uses a read/write head to transfer data to and from the platter. The operation of moving the head from one track to another is called seeking, and the heads generally move together as a unit.
Because the disk heads move together as a unit, it is useful to group the tracks at the same head position into logical units called cylinders. Sectors in the same cylinder can be read without an intervening seek.
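One way to see why cylinders matter is to number sectors so that all sectors of a cylinder are adjacent. The sketch below uses the classic cylinder/head/sector mapping, assuming a fixed geometry (`HEADS` and `SECTORS_PER_TRACK` are made-up values) and sectors numbered from 0. Real drives vary the number of sectors per track and hide the geometry, so this is only an illustration.

```python
HEADS = 4               # assumed: number of read/write heads (platter surfaces)
SECTORS_PER_TRACK = 16  # assumed: same on every track, for simplicity


def chs_to_lba(cylinder, head, sector):
    """Linear sector number for a (cylinder, head, sector) address."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + sector


def lba_to_chs(lba):
    """Inverse mapping: linear sector number back to (cylinder, head, sector)."""
    track, sector = divmod(lba, SECTORS_PER_TRACK)
    cylinder, head = divmod(track, HEADS)
    return cylinder, head, sector


def same_cylinder(lba_a, lba_b):
    """True if both sectors can be reached without moving the heads."""
    return lba_to_chs(lba_a)[0] == lba_to_chs(lba_b)[0]
```

With this geometry, linear sectors 0 through 63 all lie in cylinder 0, so `same_cylinder(0, 63)` holds, while moving on to sector 64 requires a seek to cylinder 1.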
One or more disk drives connect to a disk controller, which handles details such as moving the heads. The controller communicates with the CPU through a host interface. Moreover, through direct memory access (DMA), the controller can access main memory directly, transferring entire sectors without the CPU having to copy the data itself.
See the picture: Fig. 5-3 from Tanenbaum.
Three factors influence the delay, or latency, that occurs in transferring data to/from a disk:
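A minimal sketch of how such a delay adds up, assuming the three factors are the usual seek time, rotational latency, and transfer time. The function name `access_time_ms` and all parameter values are made up for illustration; they are not the specifications of any real drive.

```python
def access_time_ms(seek_ms, rpm, sectors_per_track, sectors_to_read):
    """Estimated time to read consecutive sectors from one track, in ms."""
    rotation_ms = 60_000 / rpm               # time for one full revolution
    rotational_latency_ms = rotation_ms / 2  # on average, wait half a turn
    transfer_ms = rotation_ms * sectors_to_read / sectors_per_track
    return seek_ms + rotational_latency_ms + transfer_ms


# Hypothetical drive: 9 ms seek, 6000 RPM, 100 sectors per track.
t = access_time_ms(seek_ms=9.0, rpm=6000, sectors_per_track=100, sectors_to_read=1)
```

For these made-up numbers the seek (9 ms) and the rotational latency (5 ms) dwarf the 0.1 ms needed to actually transfer one sector.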
Current technologies (Maxtor Diamond Max 1750 Drive)
An important (experimental) observation is that:
We want to handle both file types efficiently.
The operating system may choose to use a larger block size than the sector size of the physical disk. Each block consists of consecutive sectors. Motivation:
Note:
The convenient unit of transfer for the operating system is the disk block, which may be the same size as a sector or larger. The general trend is toward larger block sizes: NTFS uses a 4K block size for disks larger than 2GB, and FAT-32 uses 4K up to 8GB, 8K up to 16GB, 16K up to 32GB, and 32K above 32GB.
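The FAT-32 thresholds above, and the rule that a block consists of consecutive sectors, can be written down directly. This is a sketch: the function names and the 512-byte sector size are assumptions, and "up to" is read here as inclusive.

```python
KB = 1024
GB = 1024 ** 3
SECTOR_SIZE = 512  # assumed physical sector size in bytes


def fat32_block_size(volume_bytes):
    """Block size for a volume, following the FAT-32 thresholds quoted above."""
    if volume_bytes <= 8 * GB:
        return 4 * KB
    if volume_bytes <= 16 * GB:
        return 8 * KB
    if volume_bytes <= 32 * GB:
        return 16 * KB
    return 32 * KB


def block_to_sectors(block_number, block_size):
    """The consecutive sector numbers that make up one block."""
    per_block = block_size // SECTOR_SIZE
    first = block_number * per_block
    return range(first, first + per_block)
```

For example, with 4K blocks and 512-byte sectors each block covers 8 sectors, so block 2 maps to sectors 16 through 23.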
The system can also allocate blocks contiguously, but the total space a file needs may not be known at creation time. Contiguous allocation can also lead to fragmentation, with free space broken into too many small runs of blocks.
Policies for reducing the latency of retrieving blocks from disk were discussed in detail in the previous course.
Approaches for keeping track of free blocks:
This approach can be simplified by keeping track of one free block in each linked-list element.
Sets of consecutive free blocks can also be grouped using an address/count approach.
Each approach has its own space/time tradeoffs.
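A minimal sketch of the address/count idea: each run of consecutive free blocks is stored as a `(start, count)` pair instead of one entry per block. The helper names `to_runs` and `allocate`, and the first-fit policy used in `allocate`, are assumptions made for illustration.

```python
def to_runs(free_blocks):
    """Compress free block numbers into sorted (start, count) runs."""
    runs = []
    for b in sorted(set(free_blocks)):
        if runs and runs[-1][0] + runs[-1][1] == b:
            # b extends the previous run by one block
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def allocate(runs, n):
    """First fit: take n consecutive blocks from the first run big enough.

    Returns the first allocated block number, or None if no run fits.
    Modifies `runs` in place.
    """
    for i, (start, count) in enumerate(runs):
        if count >= n:
            if count == n:
                del runs[i]                        # run is used up entirely
            else:
                runs[i] = (start + n, count - n)   # shrink run from the front
            return start
    return None
```

The tradeoff is visible here: a mostly empty disk needs only a few pairs, but a badly fragmented one degenerates to one pair per free block, which takes more space than a plain list of block numbers.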
Caching is crucial to improve performance. Why?
Unfortunately, caching may cause data in memory to become out of step with data on disk. This is known as the cache coherency problem. The problem is most significant in the following contexts:
One solution is for the operating system to provide a "flush" system call that writes to disk the file blocks associated with a file descriptor. The system might also flush the cache when the file is closed.
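On POSIX-style systems, `fsync` plays roughly the role of the "flush" call described above. The sketch below uses Python's `os.fsync`; the helper name `write_durably` and the temporary file path are assumptions made for illustration.

```python
import os
import tempfile


def write_durably(path, data):
    """Write data and ask the OS to push the cached blocks to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # empty Python's user-space buffer into the OS
        os.fsync(f.fileno())   # ask the OS to write its cached blocks to disk


path = os.path.join(tempfile.mkdtemp(), "example.dat")
write_durably(path, b"hello")
```

Two layers of caching are involved: `f.flush()` only moves data from the process into the operating system's cache, and it is the `os.fsync` call on the file descriptor that requests the write to disk.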