We are going to serialize a binary tree to an array, A binary tree is a tree data structure. Each node can have 0-2 children(s).
visualization of a binary tree
2
/ \
/ \
/ \
1 3
/ \ / \
0 7 9 1
/ / \ / \
2 1 0 8 8
Representation of a Node
A node should contain the data(value) and reference to child nodes.
A simple class like Node
below will work for us.
class Node:
data = None
left = None
right = None
def __init__(self, data):
self.data = data
Height of a Binary Tree
Height is the max level of nodes in a tree. In the example above tree’s height is 4. Given a reference to a tree’s root node, we can find its height as below
def get_height(root):
if root is None:
return 0
stack = [(1, root)]
max_level = 1
while stack:
level, node = stack.pop()
if level > max_level:
max_level = level
if node.left:
stack.append((level + 1, node.left))
if node.right:
stack.append((level + 1, node.right))
return max_level
We use a stack to traverse the tree. Each entry in the stack contains the node’s level and the node itself. As we traverse the tree we keep track of the highest level seen so far. We add 1 to the current level while adding children of the current node to a stack. In the end we will have the tree’s height in max level.
An interesting thing to note here is we can also use Queue to traverse the tree, however, a sibling will be visited earlier than a child, hence making it a BFS (Breadth-First Search) instead of a DFS (Depth-First Search) We have to traverse the complete tree, hence, the time complexity of finding out the height of the tree will be O(n)
Serialization
We can represent a binary tree with an array, where left and right children of a node at index n
can be placed at (n * 2) + 1
and (n * 2) + 2
index respectively.
To represent a Binary Tree we would need an array to hold all the nodes.
Our array length is dependent on the tree’s height and since a binary tree might have some missing subtree,
but the array has to have space for all items as if the tree is a complete binary tree.
The length of this array will be 2**h - 1
, h
being the tree’s height.
Once we have size of array finalized, we can traverse the tree to fill our array.
def serialize(root):
h = get_height(root)
arr = [None] * ((2**h) - 1)
stack = [(0, root)]
while stack:
index, node = stack.pop()
arr[index] = node.data
if node.left:
stack.append((index*2 + 1, node.left))
if node.right:
stack.append((index*2 + 2, node.right))
return arr
Our stack will contain tuples of 2 items, a node with its index.
We start with a stack like [(0, root)]
. In one iteration, we pop one item from the stack to get node
and index
and assign node’s data to arr[index]
, after that we place children of the node to the top of stack.
De-Serialization
we should be able to create the tree back from the array we got above.
def deserialize(arr):
nodes = [None] * len(arr)
for index, data in enumerate(arr):
if data:
node = nodes[index] or Node(data)
left_index = index * 2 + 1
right_index = index * 2 + 2
if left_index < len(arr) and arr[left_index]:
node.left = Node(arr[left_index])
if right_index < len(arr) and arr[right_index]:
node.right = Node(arr[right_index])
nodes[index] = node
if right_index == len(arr):
break
return nodes[0] if nodes else None
we start by iterating through the array in each iteration we create a Node
object with the current value and check if child positions do not exceed the array length then initialize those child nodes and link them to parent.
and with that we were able to serialize and deserialize the tree by calculating its height if not given then traversing through the tree once.