Persistant Storage
On traditional hard-disk drives, storage was accessed on physical platters logically organized into tracks (concentric rings) and sectors (vertical cross-sections). The intersection of a track and a sector is called a block, and it is usually a pre-defined size set by the manufacturer, usually somewhere from 512B to 4KB.
The address of a single byte on the disk can be reached from the block address and a byte offset into the block.
To access data on disk, it must first be pulled into main memory.
Note: On memory, the efficiency of data arrangement is determined by data structures and cache utilization; on disk, organizing data into DBMSs determines its efficiency.
An index stores the key to the storage pointer into the disk block that holds the record. Indexes allow us to access many less blocks to access a particular record or sets of records. To further improve performance, we can use sparse, multilevel indexes stores indexes of indexes.
Multiway search trees, B trees and B+ trees can act as variations on multilevel indexes. B trees extend m-way search trees by enforcing certain invariants to keep them balanced.
B tree invariants:
- all nodes must have m/2 children
- root can have min of 2 children
- all leaf nodes must be at the same level
- created from the bottom up
B+ trees differ from B trees because all record pointers must exist as copies in the leaf nodes and not on parent nodes. All of the leaf nodes are strung together in a single linked list.