CPSC 2120 - DAY 10 MAY 27, 2016 ================================================================================ B-TREES ------- B Trees are designed with hard disks in mind databases are typically larger than avaliable memory (RAM) Differences between memory and disk speeds 25 million instructions per second Disk concentric tracks 3600 rev/sec 1/60 = 16.7ms access time 8.3ms 120 disk accesses/sec A sector is a unit of reading/writing from a disk The sector is read into memory all at once Disk can become fragmented, a particular file can be located in many different sectors Number of accesses Assume N=10M log2 N = log2 10M ≈ 24 BST (worst case) = O(N) access 10M records BST (average case) = 1.38 log N accesses 1.38 * 24 ≈ 33 accesses AVL (worst case) = 1.44 log N accesses 1.44 * 24 ≈ 34 accesses AVL (average case) ≈ log N accesses ≈ 24 accesses ^ Structures too slow for access time (3-6 sec is too slow...) 4-5 accesses (.64-.8 seconds ideal access time 4-5 accesses) using a B Tree A B-tree of order M is an M-ary tree The data items are stored in the leaves The interior nodes store up to M - 1 keys to guide the searching; key i represents the smallest key in the subtree