1.

Introduction

B-trees

were originally designed to work on direct access secondary storage (Magnetic

disk). In a computer system the memory capacity is consists of two parts: primary

memory (uses memory chips) and secondary storage (based on magnetic disk). When

dealing with large datasets, it is often impossible to maintain a whole dataset

in the primary memory. Instead, a small part of the dataset is maintained in

the primary memory and the remaining data is read from a secondary storage 1.

B-tree

is a data structure used for sorting huge datasets for fast retrieval from a

disk. A typical B-tree has a single-digit height, even with millions of entries

2. Which means

that few disk accesses are needed to find the place where data is stored in the

tree. At most two accesses are required to search for any node. “From a

practical point of view, B-trees, therefore, guarantee an access time of less

than 10 ms even for extremely large datasets.” —Dr. Rudolf Bayer,

inventor of the B-tree

2. Description

B-trees

save time by using nodes with many branches (children). Unlike binary tree, in

which each node has only two children. When a node has many children a record

can be found easily by passing through fewer nodes than if there are two children

per node. See the example in Figure 1.

The example

in Figure 1 shows that the red node in a binary tree has depth of three while

the corresponding node in the B-tree has depth of one. Clearly, B-trees locates

desired records faster, assuming that all system parameters are identical. The

difference in depth between binary tree and B-tree is greater in a practical

database than in the example in Figure 1. Real world B-trees height depends on

the number of records in the database.

The

tradeoff is that the decision process at each node in B-tree is more

complicated compared with a binary tree. A complex program is required to

execute B-tree’s operations, but this program runs fast because it is stored in

RAM.

3. Definition

A

B-tree of order m (the maximum number of children for each node) is a

self-balancing search tree.

3.1 Node Structure

See

a B-tree node’s structure in Figure 2.

Figure 2: Node structure

–

K : is a search key. Search operation depends on the key value.

According to its value we can find the record that we are searching for. All

keys in a node are in ascending order: k1 17? No, go to left sub-tree.

Next node (4,9), is key = 4? Is key > 4? Yes, move to the next key.

Is key = 9? Is key > 9? No, go to sub-tree left of 9.

Next node (7,8), is key = 7? Yes, found.

·

Example 2 : Search for key = 11.

By applying B-TREE-SEARCH:

Starts at the root (17), is key = 17? Is

key > 17? No, go to left sub-tree.

Next node (4,9), is key = 4? Is key > 4? Yes, move to the next key.

Is key = 9? Is key > 9? Yes, go to the right sub-tree (there is no

more keys in the node to check).

Next node (12), is key = 12? Is key >12? No, the left sub-tree is a

leaf. Which means that 11 is not in the tree.

4.2 Creating an empty B-tree

B-TREE-CREATE operation builds an empty B-tree by first creating a new empty root

node and then call B-TREE-INSERT to

add new keys. B-TREE-CREATE and

B-TREE-INSERT operations are

both using ALLOCATE-NODE which

allocates one disk page to be used as a new node in O(1) time. When a node is

created by ALLOCATE-NODE it

does not require DISK-READ,

because there is no useful information stored on the disk for that node yet.

4.2.1

B-TREE-CREATE pseudo-code

B-TREE-CREATE(T)

1 x ALLOCATE-NODE()

2 leafx TRUE

3 nx 0

4 DISK-WRITE(x)

5

rootT x

4.2.2

B-TREE-CREATE Running Time

The running

time is O(1).

4.3 Splitting a node in a B-tree

When

a node becomes full it is necessary to perform split operation. B-TREE-SPLIT-CHILD moves the

median key of node y up into its parent x (it has to be non-full node), where y is

the ith child of x. All keys in y right

of the median key are moved to the new node. The keys left of the median key

remain in the original node y. The

new node z becomes the right child of the median key that was moved to

its parent x. The original

node y becomes the left child of the median

key that was moved to its parent x 5.

4.3.1

B-TREE-SPLIT-CHILD pseudo-code

Inputs: is a non-full

internal node.

is an index.

is

the ith child of x and is the node being split.

Node y originally has 2t – 1 children but is

reduced to t – 1 children by this operation.

B-TREE-SPLIT-CHILD(x,i,y)

1 z ALLOCATE-NODE()

2 leafz leafy

3 nz t – 1

4 for j 1 to t – 1

5 do keyjz

keyj+ty

6 if not leaf y

7 then for j 1 to t

8 do cjz

cj+ty

9 ny t – 1

10 for j nx + 1 downto i + 1

11 do cj+1x

cjx

12 ci+1x

z

13 for j nx downto i

14 do keyj+1x

keyjx

15 keyix

keyty

16 nx nx + 1

17 DISK-WRITE(y)

18 DISK-WRITE(z)

19 DISK-WRITE(x)

Lines 1-8 : Create a new node z and give it the larger t –

1 keys and corresponding t children of y.

Line 9 : Adjusts the key count for y.

Lines 10-16 : Insert z as a child of x. To separate y from z,

move the median key from y up to x, and adjust the

key count for x.

Lines 17-19 : write out all modified disk page 6.

4.3.2

B-TREE-SPLIT-CHILD Running Time

The running time is ?(t).

4.3.3

B-TREE-SPLIT-CHILD Example

Insert key (j) to the following B-tree of order 5 in Figure 4.

j > F,

Insert key (j) to the right sub-tree. See Figure 5.

Figure 5: The B-tree after

inserting ‘j’ ( B-Tree-Split-Child Example )

It is a full node, By applying B-TREE-SPLIT-CHILD:

The median key which is (j) moves up to the parent node because the

parent not is non-full. All keys on the left of the median key (G,H) remain on

the original node. All keys on the right of the median key (K,M) move to the

new node. See Figure 6.

Figure 6: The B-tree after

Splitting a node ( B-Tree-Split-Child Example )

4.4

Inserting

into a B-tree

Inserting a key into a B-tree T of height h is done

in a single pass down the tree. The B-TREE-INSERT procedure uses

B-TREE-SPLIT-CHILD to guarantee

that the recursion never occurs to a full node 6.

4.4.1

B-TREE-INSERT pseudo-code

Input: is a key to

be inserted.

B-TREE-INSERT(T,k)

1 r rootT

2 if nr

= 2t – 1

3 then s ALLOCATE-NODE()

4 rootT

s

5 leafs

FALSE

6 ns

0

7 c1s

r

8

B-TREE-SPLIT-CHILD(s,1,r)

9 B-TREE-INSERT-NONFULL(s,k)

10 else

B-TREE-INSERT-NONFULL(r,k)

Lines 1: Root node r is created.

Lines 2-9: Handle the case where the root node r is full, the root is split and a new node s (having two children)

becomes the root. The height of the B-tree is increased at the top. The

procedure finishes by calling B-TREE-INSERT-NONFULL to

perform the insertion of key k in the tree rooted at the non-full root

node 6.

Lines 10: Recurses and

guarantees that the node to which it recurs is not full by calling B-TREE-SPLIT-CHILD.

The recursive procedure B-TREE-INSERT-NONFULL

inserts key k into node x, assumed to be non-full. The

operation of B-TREE-INSERT and the recursive operation of B-TREE-INSERT-NONFULL guarantee

that this assumption is true 6.

4.4.2

B-TREE-INSERT-NONFULL pseudo-code

B-TREE-INSERT-NONFULL(x,k)

1 i nx

2 if leafx

3 then while i

1 and k keyix

16 then

i i + 1

17

B-TREE-INSERT-NONFULL(cix,k)

Lines 3-8: Handle the case where x is a leaf node by inserting key k

into node x. If x is not a leaf node, then insert k into

the appropriate leaf node in the sub-tree rooted at internal node x,

in this case go to lines 9-11.

Lines 9-11: Determine the child of node x to which the

recursion descends.

Lines 13: Detects whether the recursion would descend to a full child, in

this case go to line 14.

Lines 14: Uses B-TREE-SPLIT-CHILD to

split that child into two non-full children.

Lines 15-16: Determines which one of the two children is the correct is one to

descend to.

Line 17: Then recurses

to insert k into the appropriate sub-tree 6.

4.4.3

B-TREE-INSERT Running Time

The running time is O(t1ogt n).

4.4.4

B-TREE-INSERT Example

Insert the following keys (70, 90, 25) to the B-tree

of order 6 in Figure 7.

Figure 7: a B-tree of order

6 ( B-Tree-Insert Example )

By applying B-TREE-INSERT :

First compare the key (70) with all key

values, it is bigger so it is inserted after key (50).

Then compare the key (90) with all key values, it is

bigger so it is inserted after key (70). See Figure 8.

Figure 8: The B-tree after

inserting ‘70,90’ ( B-Tree-Insert Example )

The node is full, to insert the key (25)

a split procedure must be applied first. Take the median key (50) and promote

it as the new root node and perform split operation on the original node. All

keys on the left of the median key which are (10) and (30) stay on the original

node and this node become the left child of the median key (50) which is the

new root node. All keys on the right of the median key which are (70) and (90)

move to new node and become the right child of the median key (50) which is in the

new root node.

Now insert the key (25) by first

comparing it with the root node key (50) which is bigger, so go to the left

sub-tree. The left sub-tree has two keys (10,30), first compare the key (25)

with the first key (30) which is bigger so the key (25) is inserted before the

key (30). See Figure 9.

Figure 9: The B-tree after

inserting ’25’ ( B-Tree-Insert Example )

4.5 Deleting from a B-tree

There are three possible cases for

deletion in a B-tree. Let k be the key to be deleted and x is the

node containing the key k 7.

The three cases are:

4.5.1

Case-?

If the key

is a leaf node then simply remove the key to be deleted.

Key k is in node x and x is a

leaf, simply delete k from x.

·

Case-? Example

Figure 10: a B-tree of order

4 ( B-Tree-Delete-key Case-? Example )

Delete the key (6) from the B-tree of

order 4 in Figure 10.

Figure 11: The B-tree after deleting ‘6’ (

B-Tree-Delete-key Case-? Example )

See the B-tree after deleting key (6) in Figure 11.

4.5.2

Case-??

If key k is in node x and x is

an internal node, there are three cases to consider:

4.5.2.1 Case-??-a

If the child y that precedes k

in node x has at least t keys (more than the minimum), then find

the predecessor key k’ in the sub-tree rooted at y. Recursively

delete k’ and replace k with k’ in x 7.

4.5.2.2 Case-??-b

If the child node z that follows k

in node x has at least t keys, find the successor k’ and

delete then replace as in Case-??-a. Note that finding k’ and deleting it can be

performed in a single downward pass 7.

· Case-??-b Example

Figure 12: a B-tree of order

4 ( B-Tree-Delete-key Case-?|-b Example )

Delete the

key (13) from the B-tree of order 4 in Figure 12.

See the

B-tree after deleting key (13) in Figure 13.

Figure 13: The B-tree after deleting ’13’ (

B-Tree-Delete-key Case-?|-b Example )

4.5.2.3 Case-??-c

If both y and z have only t?1

(minimum number) keys, merge k and all of z into y, so

that both k and the pointer to z are removed from x. y

now contains 2t ? 1 keys, and subsequently k is deleted 7.

· Case-??-c Example

Figure 14: a B-tree of order

5 ( B-Tree-Delete-key Case-?|-c Example )

Delete the

key (7) from the B-tree of order 5 in Figure 14.

See the

B-tree after deleting key (7) in Figure 15.

Figure 15: The B-tree after

deleting ‘7’ ( B-Tree-Delete-key Case-?|-c Example )

4.5.3

Case-???

If key k is not in an internal

node x, determine the root of the appropriate sub-tree that must contain

k. If the root has only t ? 1 keys, execute either of the

following two cases to descend to a node containing at least t keys. Then

recurs to the appropriate child of x 7.

4.5.3.1 Case-???-a

If the root has only t?1 keys and has a

sibling with t keys, give the root an extra key by moving a key from node x

to the root. Moving a key from the roots immediate left or right sibling up

into node x, and moving the appropriate child from the sibling to node x

7.

· Case-???-a Example

Figure 16: a B-tree of order 6 ( B-Tree-Delete-key

Case-?|?-a Example )

Delete the

key (2) from the B-tree of order 6 in Figure 16.

Figure 17: The B-tree after

deleting ‘2’ ( B-Tree-Delete-key Case-?|?-a Example )

See the

B-tree after deleting key (2) in Figure 17.

4.5.3.2 Case-???-b

If the root and all of its siblings have t?1

keys, merge the root with one sibling. Then move a key down from node x

into the new merged node to become the median key for that node.

· Case-???-b Example

Figure 18: a B-tree of order

6 ( B-Tree-Delete-key Case-?|?-b Example )

Delete the key (4) from the B-tree of

order 6 in Figure 18.

Figure 19: The B-tree after

deleting ‘4’ ( B-Tree-Delete-key Case-?|?-b Example )

See the

B-tree after deleting key (2) in Figure 19 and Figure 20.

Figure 20: The B-tree after

deleting ‘4’ ( B-Tree-Delete-key Case-?|?-b Example ) – 2

4.5.4

B-TREE-DELETE-KEY

pseudo-code

B-Tree-Delete-Key(x, k)

1 if not leafx then

2 y ? Preceding-Child(x)

3 z ? Successor-Child(x)

4 if ny > t

? 1 then

5 k’ ?

Find-Predecessor-Key(k, x)

6 Move-Key(k’, y, x)

7 Move-Key(k, x, z)

8 B-Tree-Delete-Key(k,

z)

9 else if nz

> t ? 1 then

10 k’ ?

Find-Successor-Key(k, x)

11 Move-Key(k’, z, x)

12 Move-Key(k, x, y)

13 B-Tree-Delete-Key(k,

y)

14 else

15 Move-Key(k, x, y)

16 Merge-Nodes(y, z)

17 B-Tree-Delete-Key(k,

y)

18 else (leaf node)

19 y ? Preceding-Child(x)

20 z ? Successor-Child(x)

21 w ? root(x)

22 v ? RootKey(x)

23 if nx > t

? 1 then Remove-Key(k, x)

24 else if ny

> t ? 1 then

25 k’ ?

Find-Predecessor-Key(w, v)

26 Move-Key(k’, y,w)

27 k’ ?

Find-Successor-Key(w, v)

28 Move-Key(k’,w, x)

29 B-Tree-Delete-Key(k,

x)

30 else if nw > t ? 1 then

31 k’ ? Find-Successor-Key(w,

v)

32 Move-Key(k’, z,w)

33 k’ ?

Find-Predecessor-Key(w, v)

34 Move-Key(k’,w, x)

35 B-Tree-Delete-Key(k,

x)

36 else

37 s ? Find-Sibling(w)

38 w’ ? root(w)

39 if nw’ = t

? 1 then

40 Merge-Nodes(w’,w)

41 Merge-Nodes(w,

s)

42 B-Tree-Delete-Key(k,

x)

43 else

44 Move-Key(v,w, x)

45 B-Tree-Delete-Key(k,

x)

Preceding-Child(x): Returns the left child of key x.

Move-Key(k, x, z): Moves key k from node x to

node z.

Merge-Nodes(y, z): Merges the keys of nodes y and z

into a new node.

Find-Predecessor-Key(k, x): Returns the key preceding key k

in the child of node x.

Remove-Key(k, x): Deletes key k from node x.

x must be a leaf node.

5.

Branching Factor

Given enough

memory, increasing the branching factor will reduce the height of the B-tree and

will make each level of the B-tree hold more nodes. Fewer levels means fewer

disk access (if any) and better performance 8.

But if the branching

factor in a B-tree is so big, it means that the first view levels cannot be

held in the memory. Which means accessing each level can cause disk swapping and

poor performance 8.

There are two major concerns:

1.

Limited

memory: a B-tree parameters are limited by the available memory 8. All search starts

from the top of a B-tree, so the first level of nodes need to be in memory. If the

number of nodes fits in main memory, it means that there is no need for disk

storage. But scaling physical memory is an expensive proposition and infeasible

beyond a point 8.

The

solution to this problem is sharding. Sharding a B-tree storage enables a

B-tree to hold a limited number of keys. Sharding the database along with its

b-tree indices into different physical machine is one of the most effective

scaling solutions for growing data volumes