Information about Balanced Tree

In computer science, a self-balancing binary search tree or height-balanced binary search tree is a binary search tree that attempts to keep its height, or the number of levels of nodes beneath the root, as small as possible at all times, automatically. It is one of the most efficient ways of implementing associative arrays, sets, and other data structures.

Overview

Most operations on a binary search tree take time directly proportional to the tree's height, so it is desirable to keep the height small. Ordinary binary search trees have the primary disadvantage that they can attain very large heights in rather ordinary situations, such as when the keys are inserted in order. The result is a data structure similar to a linked list, making all operations on the tree expensive. If we know all the data ahead of time, we can keep the height small on average by adding values in a random order, but we don't always have this luxury, particularly in online algorithms.

Self-balancing binary trees solve this problem by performing transformations on the tree (such as tree rotations) at key times, in order to reduce the height. Although a certain overhead is involved, it is justified in the long run by drastically decreasing the time of later operations.

The height must always be at least the ceiling of log n, since there are at most 2k nodes on the kth level; a complete or full binary tree has exactly this many levels. Balanced BSTs are not always so precisely balanced, since it can be expensive to keep a tree at minimum height at all times; instead, they keep the height within a constant factor of this lower bound.

Times for various operations in terms of number of nodes in the tree n:
OperationBig-O time
LookupO(log n)
InsertionO(log n)
RemovalO(log n)
In-order iteration over all elementsO(n)
For some implementations these times are worst-case, while for others they are amortized.

Implementations

Popular data structures implementing this type of tree include:

Applications

Self-balancing binary search trees can be used in a natural way to construct associative arrays; key-value pairs are simply inserted with an ordering based on the key alone. In this capacity, self-balancing BSTs have a number of advantages and disadvantages over their main competitor, hash tables. Lookup is somewhat complicated in the case where the same key can be used multiple times.

Many algorithms can exploit self-balancing BSTs to achieve good worst-case bounds with very little effort. For example, if binary tree sort is done with a BST, we have a very simple-to-describe yet asymptotically optimal O(n log n) sorting algorithm (although such an algorithm has practical disadvantages due to bad cache behavior). Similarly, many algorithms in computational geometry exploit variations on self-balancing BSTs to solve problems such as the line segment intersection problem and the point location problem efficiently.

Self-balancing BSTs are a flexible data structure, in that it's easy to extend them to efficiently record additional information or perform new operations. For example, one can record the number of nodes in each subtree having a certain property, allowing one to count the number of nodes in a certain key range with that property in O(log n) time. These extensions can be used, for example, to optimize database queries or other list-processing algorithms.

See also

External links

References

  • Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Section 6.2.3: Balanced Trees, pp.458–481.
Computer science, or computing science, is the study of the theoretical foundations of information and computation and their implementation and application in computer systems.
..... Click the link for more information.
binary search tree (BST) is a binary tree data structure which has the following properties:
  • Each node has a value.
  • A total order is defined on these values.
  • The left subtree of a node contains only values less than the node's value.

..... Click the link for more information.
An associative array (also map, mapping, hash, dictionary, finite map, lookup table, and in query-processing an index or index file
..... Click the link for more information.
In computer science, a set is a collection (container) of certain values, without any particular order, and no repeated values. It corresponds with a finite set in mathematics. Disregarding sequence, and the fact that there are no repeated values, it is the same as a list.
..... Click the link for more information.
binary search tree (BST) is a binary tree data structure which has the following properties:
  • Each node has a value.
  • A total order is defined on these values.
  • The left subtree of a node contains only values less than the node's value.

..... Click the link for more information.
linked list is one of the fundamental data structures, and can be used to implement other data structures. It consists of a sequence of nodes, each containing arbitrary data fields and one or two references ("links") pointing to the next and/or previous nodes.
..... Click the link for more information.
In computer science, an online algorithm is one that can process its input piece-by-piece, without having the entire input available from the start. In contrast, an offline algorithm
..... Click the link for more information.
A tree rotation is an operation on a binary search tree that changes the structure without interfering with the order of the elements. A tree rotation moves one node up in the tree and one node down.
..... Click the link for more information.
Overhead may be:
  • Overhead (business), the ongoing operating costs of running a business
  • Engineering overhead, ancillary design features required by a component of a device

..... Click the link for more information.
In computational complexity theory, big O notation is often used to describe how the size of the input data affects an algorithm's usage of computational resources (usually running time or memory).
..... Click the link for more information.
In computer science, tree-traversal refers to the process of visiting each node in a tree data structure, exactly once, in a systematic way. Such traversals are classified by the order in which the nodes are visited.
..... Click the link for more information.
amortized analysis refers to finding the average running time per operation over a worst-case sequence of operations. Amortized analysis differs from average-case performance in that probability is not involved; amortized analysis guarantees the time per operation over
..... Click the link for more information.
An Arne Andersson tree (AA tree) in computer science is a red-black tree with one additional rule. Unlike red-black trees, RED nodes on an AA tree can only be added as a right subchild. In other words, no RED node can be a left subchild.
..... Click the link for more information.
In computer science, an AVL tree is a self-balancing binary search tree, and the first such data structure to be invented. In an AVL tree the heights of the two child subtrees of any node differ by at most one, therefore it is also called height-balanced.
..... Click the link for more information.
A red-black tree is a type of self-balancing binary search tree, a data structure used in computer science, typically used to implement associative arrays. The original structure was invented in 1972 by Rudolf Bayer who called them "symmetric binary B-trees", but acquired its
..... Click the link for more information.
A splay tree is a self-balancing binary search tree with the additional unusual property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log(n)) amortized time.
..... Click the link for more information.
In computer science, a scapegoat tree is a self-balancing binary search tree, invented by Igal Galperin and Ronald L. Rivest. It provides worst-case O(log n) lookup time, and O(log n) amortized insertion and deletion time.
..... Click the link for more information.
An associative array (also map, mapping, hash, dictionary, finite map, lookup table, and in query-processing an index or index file
..... Click the link for more information.
In computer science, a hash table, or a hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person's name), find the corresponding value (e.g. that person's telephone number).
..... Click the link for more information.
A tree sort is a sort algorithm that builds a binary search tree from the keys to be sorted, and then traverses the tree so that the keys come out in sorted order.


..... Click the link for more information.
In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor (independent of the input size) worse than the best possible algorithm.
..... Click the link for more information.
In computer science, computational geometry is the study of algorithms to solve problems stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and the study of such problems is also considered to be part of
..... Click the link for more information.
In computational geometry, the line segment intersection problem supplies a list of line segments in the plane and asks us to determine whether any two of them intersect, or cross.
..... Click the link for more information.
The DSW algorithm, or in full Day/Stout/Warren algorithm, is a method for efficiently balancing binary search trees — that is, decreasing their height to O(log n) nodes, where n is the total number of nodes.
..... Click the link for more information.
Invented in 1990 by William Pugh, a skip list is a probabilistic data structure, based on parallel linked lists, with efficiency comparable to a binary search tree (order O(log n) average time for most operations).
..... Click the link for more information.
B-tree is a tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic amortized time. It is most commonly used in databases and filesystems.
..... Click the link for more information.
A red-black tree is a type of self-balancing binary search tree, a data structure used in computer science, typically used to implement associative arrays. The original structure was invented in 1972 by Rudolf Bayer who called them "symmetric binary B-trees", but acquired its
..... Click the link for more information.
Donald Ervin Knuth

Photographed by Jacob Appelbaum, 25 October 2005
Born January 10 1938 (1938--)
..... Click the link for more information.


This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
Herod_Archelaus


page counter