Information about Index (database)
A database index is a data structure that improves the speed of operations in a table. Indexes can be created using one or more columns, providing the basis for both rapid random lookups and efficient ordering of access to records. The disk space required to store the index is typically less than the storage of the table (since indexes usually contain only the key-fields according to which the table is to be arranged, and excludes all the other details in the table), yielding the possibility to store indexes into memory from tables that would not fit into it. In a relational database an index is a copy of part of a table. Some databases extend the power of indexing by allowing indexes to be created on functions or expressions. For example, an index could be created on
Indexes may be defined as unique or non-unique. A unique index acts as a constraint on the table by preventing identical rows in the index and thus, the original columns.
Clustering re-orders the data block in the same order as the index, hence it is also an operation on the data storage blocks as well as on the index. Exact operation of database systems vary, but because the row data can only be stored in one order physically only one clustered index may be created on a given database table. Clustered indexes can greatly increase access speed, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items are selected.
Since the physical records are in this sort order on disk the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them. Some databases separate the data and index blocks into separate files, while others intermix the two different types of data blocks within the same physical file(s). Databases that use the latter scheme may be said to store the actual data in the leaf node of the index, whereas, in fact there is still a distinction between the index and the data block, and the data blocks can be traversed without using the index by way of a link list that links the data blocks in order.
In Microsoft SQL Server, the leaf node of the clustered index corresponds to the actual data, not simply a pointer to data that resides elsewhere, as is the case with a non-clustered index. [2] Each relation can have a single clustered index and many unclustered indexes. [3].
Indexes can be implemented using a variety of data structures. Popular indices include balanced trees, B+ trees and hashes.[4]
For example, imagine a phone book that is organized by city first, then by last name, and then by first name. If given the city, you can easily extract the list of all phone numbers for that city. However, in this phone book it would be very tedious to find all the phone numbers for a given last name. You would have to look within each city's section for the entries with that last name. Some databases can do this, others just won’t use the index.
Consider this SQL statement:
A bitmap index is a special kind of index that stores the bulk of its data as bitmaps and answers most queries by performing bitwise logical operations on these bitmaps. The most commonly used index, such as B+trees, are most effective if the values it indexes do not repeat or repeat a relatively smaller number of times. In contrast, the bitmap index is designed for cases where the values of a variable repeats very frequently. For example, the gender field in a customer database usually contains two distinct values, male or female. For such variables, the bitmap index can have a significant performance advantage over the commonly used trees.
The term wildcard character has the following meanings:
..... Click the link for more information.
upper(last_name), which would only store the uppercase versions of the last_name field in the index. Another option sometimes supported is the use of "filtered" indexes, where index entries are created only for those records that satisfy some conditional expression. A further aspect of flexibility is to permit indexing on user-defined functions, as well as expressions formed from an assortment of built-in functions. All of these indexing refinements are supported in Visual FoxPro, for example.[1]
Indexes may be defined as unique or non-unique. A unique index acts as a constraint on the table by preventing identical rows in the index and thus, the original columns.
Architecture
Index architectures can be classified as clustered or non-clustered. A non-clustered index normally contains a reference to a block that contains the row data for which the particular index item has been constructed. This block will hold several other rows depending on the row size. For each index lookup on a non-clustered index a data block that houses the row sought after must also be retrieved.Clustering re-orders the data block in the same order as the index, hence it is also an operation on the data storage blocks as well as on the index. Exact operation of database systems vary, but because the row data can only be stored in one order physically only one clustered index may be created on a given database table. Clustered indexes can greatly increase access speed, but usually only where the data is accessed sequentially in the same or reverse order of the clustered index, or when a range of items are selected.
Since the physical records are in this sort order on disk the next row item in the sequence is immediately before or after the last one, and so fewer data block reads are required. The primary feature of a clustered index is therefore the ordering of the physical data rows in accordance with the index blocks that point to them. Some databases separate the data and index blocks into separate files, while others intermix the two different types of data blocks within the same physical file(s). Databases that use the latter scheme may be said to store the actual data in the leaf node of the index, whereas, in fact there is still a distinction between the index and the data block, and the data blocks can be traversed without using the index by way of a link list that links the data blocks in order.
In Microsoft SQL Server, the leaf node of the clustered index corresponds to the actual data, not simply a pointer to data that resides elsewhere, as is the case with a non-clustered index. [2] Each relation can have a single clustered index and many unclustered indexes. [3].
Indexes can be implemented using a variety of data structures. Popular indices include balanced trees, B+ trees and hashes.[4]
Column order
The order in which columns are listed in the index definition is important. It is possible to retrieve a set of row identifiers using only the first indexed column. However, it is not possible or efficient (on most databases) to retrieve the set of row identifiers using only the second or greater indexed column.For example, imagine a phone book that is organized by city first, then by last name, and then by first name. If given the city, you can easily extract the list of all phone numbers for that city. However, in this phone book it would be very tedious to find all the phone numbers for a given last name. You would have to look within each city's section for the entries with that last name. Some databases can do this, others just won’t use the index.
Applications and limitations
Indexes are useful for many applications but come with some limitations. Consider the following SQL statement:SELECT first_name FROM people WHERE last_name = 'Finkelstein';. To process this statement without an index the database software must look at the last_name column on every row in the table (this is known as a full table scan). With an index the database simply follows the b-tree data structure until the Finkelstein entry has been found; this is much less computationally expensive than a full table scan.
Consider this SQL statement:
SELECT email_address FROM customers WHERE email_address LIKE '%@yahoo.com';. This query would yield an email address for every customer whose email address ends with "@yahoo.com", but even if the email_address column has been indexed the database still must perform a full table scan. This is because the index is built with the assumption that words go from left to right. With a wildcard at the beginning of the search-term the database software is unable to use the underlying b-tree data structure. This problem can be solved through the addition of another index created on reverse(email_address) and a SQL query like this: SELECT email_address FROM customers WHERE reverse(email_address) LIKE reverse('%@yahoo.com');. This puts the wild-card at the right most part of the query (now moc.oohay@%) which the index on reverse(email_address) can satisfy.
Types
Bitmap index
A bitmap index is a special kind of index that stores the bulk of its data as bitmaps and answers most queries by performing bitwise logical operations on these bitmaps. The most commonly used index, such as B+trees, are most effective if the values it indexes do not repeat or repeat a relatively smaller number of times. In contrast, the bitmap index is designed for cases where the values of a variable repeats very frequently. For example, the gender field in a customer database usually contains two distinct values, male or female. For such variables, the bitmap index can have a significant performance advantage over the commonly used trees.
Dense index
A dense index in databases is a file with pairs of keys and pointers for every record in the data file. Every key in this file is associated with a particular pointer to a record in the sorted data file. In clustered indexes with duplicate keys the dense index points to the first record with that key.[5]Sparse index
A sparse index in databases is a file with pairs of keys and pointers for every record in the data file. Every key in this file is associated with a particular pointer to the block in the sorted data file. In clustered indexes with duplicate keys the sparse index points to the lowest search key in each block.See also
Please add coverage of Hashing in Link ListsReferences
1. ^ Visual FoxPro 9.0 SP1 - Working with Table Indexes. MSDN. Microsoft (2007). Retrieved on 2007-05-24.
2. ^ Clustered Index Structures. SQL Server 2005 Books Online (September 2007).
3. ^ Daren Bieniek, Randy Dess, Mike Hotek, Javier Loria, Adam Machanic, Antonio Soto, Adolfo Wiernik (2006-01). Chapter 4: Creating Indexes. SQL Server 2005 Implementation and Management. Microsoft Press.
4. ^ Gavin Powell (2005-12). Chapter 12: Building Fast-Performing Data Models. Beginning Database Design ISBN 978-0-7645-7490-0. Wrox Publishing.
5. ^ Database Systems: The Complete Book. Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer D. Widom
2. ^ Clustered Index Structures. SQL Server 2005 Books Online (September 2007).
3. ^ Daren Bieniek, Randy Dess, Mike Hotek, Javier Loria, Adam Machanic, Antonio Soto, Adolfo Wiernik (2006-01). Chapter 4: Creating Indexes. SQL Server 2005 Implementation and Management. Microsoft Press.
4. ^ Gavin Powell (2005-12). Chapter 12: Building Fast-Performing Data Models. Beginning Database Design ISBN 978-0-7645-7490-0. Wrox Publishing.
5. ^ Database Systems: The Complete Book. Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer D. Widom
Topics in database management systems (DBMS)
| |
|---|---|
|
Concepts Database Database models Database storage Relational model Distributed DBMS ACID Null Relational database Relational algebra Relational calculus Database normalization Referential integrity Relational DBMS Primary key, Foreign key, Surrogate key, Superkey, Candidate key | |
|
Objects Trigger View Table Cursor Log Transaction Index Stored procedure Partition |
Topics in SQL Select Insert Update Merge Delete Join Union Create Drop Begin work Commit Rollback Truncate Alter |
| Implementations of database management systems | |
|
Types of implementations Relational Flat file Deductive Dimensional Hierarchical Object oriented Object relational Temporal XML data stores | |
|
Database products Object-oriented (comparison) Relational (comparison) |
Components Query language Query optimizer Query plan ODBC JDBC |
database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries. The records retrieved in answer to queries are information that can be used to make decisions.
..... Click the link for more information.
..... Click the link for more information.
data structure is a way of storing data in a computer so that it can be used efficiently. Often a carefully chosen data structure will allow the most efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data type.
..... Click the link for more information.
..... Click the link for more information.
table is a set of data elements (values) that is organized using a model of horizontal rows and vertical columns. The columns are identified by name, and the rows are identified by the values appearing in a particular column subset which has been identified as a candidate key.
..... Click the link for more information.
..... Click the link for more information.
In the context of a relational database table, a column is a set of data values of a particular simple type, one for each row of the table.[1] The columns provide the structure according to which the rows are composed.
..... Click the link for more information.
..... Click the link for more information.
A relational database is a database that conforms to the relational model, and refers to a database's data and schema (the database's structure of how that data is arranged).
..... Click the link for more information.
..... Click the link for more information.
An expression in a programming language is a combination of values, variables, operators, and functions that are interpreted (evaluated) according to the particular rules of precedence and of association for a particular programming language, which computes and then
..... Click the link for more information.
..... Click the link for more information.
IDE: English, German, Spanish
Runtime: Above, French, Chinese, Russian, Czech, Korean
Status: Maintenance mode
Genre: Database Programming language
License: Microsoft EULA
Website: msdn.microsoft.
..... Click the link for more information.
Runtime: Above, French, Chinese, Russian, Czech, Korean
Status: Maintenance mode
Genre: Database Programming language
License: Microsoft EULA
Website: msdn.microsoft.
..... Click the link for more information.
Microsoft SQL Server is a relational database management system (RDBMS) produced by Microsoft. Its primary query language is Transact-SQL, an implementation of the ANSI/ISO standard Structured Query Language (SQL) used by both Microsoft and Sybase.
..... Click the link for more information.
..... Click the link for more information.
In computer science, a leaf node is a node of a tree data structure that has zero child nodes. Often, leaf nodes are the nodes farthest from the root node. In the graph theory tree, a leaf node is a vertex of degree 1 other than the root (except when the tree has only one vertex;
..... Click the link for more information.
..... Click the link for more information.
In computer science, a self-balancing binary search tree or height-balanced binary search tree is a binary search tree that attempts to keep its height, or the number of levels of nodes beneath the root, as small as possible at all times, automatically.
..... Click the link for more information.
..... Click the link for more information.
B+ tree (also known as a Quarternary Tree) is a type of tree, which represents sorted data in a way that allows for efficient insertion, retrieval and removal of records, each of which is identified by a key.
..... Click the link for more information.
..... Click the link for more information.
In computer science, a hash table, or a hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person's name), find the corresponding value (e.g. that person's telephone number).
..... Click the link for more information.
..... Click the link for more information.
SQL
Paradigm: multi-paradigm
Appeared in: 1974
Designed by: Donald D. Chamberlin and Raymond F. Boyce
Developer: IBM
Latest release: SQL:2003/ 2003
Typing discipline: static, strong
Major implementations: Many
SQL
..... Click the link for more information.
Paradigm: multi-paradigm
Appeared in: 1974
Designed by: Donald D. Chamberlin and Raymond F. Boyce
Developer: IBM
Latest release: SQL:2003/ 2003
Typing discipline: static, strong
Major implementations: Many
SQL
..... Click the link for more information.
B-tree is a tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic amortized time. It is most commonly used in databases and filesystems.
..... Click the link for more information.
..... Click the link for more information.
- For other meanings of 'wild card' see wild card.
The term wildcard character has the following meanings:
Telecommunication
In telecommunications, a wildcard character..... Click the link for more information.
A bitmap index is a special kind of index that stores the bulk of its data as bit arrays (commonly called "bitmaps") and answers most queries by performing bitwise logical operations on these bitmaps.
..... Click the link for more information.
..... Click the link for more information.
database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries. The records retrieved in answer to queries are information that can be used to make decisions.
..... Click the link for more information.
..... Click the link for more information.
computer file is a block of arbitrary information, or resource for storing information, which is available to a computer program and is usually based on some kind of durable storage.
..... Click the link for more information.
..... Click the link for more information.
pointer is a programming language data type whose value refers directly to (or “points to”) another value stored elsewhere in the computer memory using its address. Obtaining the value to which a pointer refers is called dereferencing the pointer.
..... Click the link for more information.
..... Click the link for more information.
In computer science, object composition (not to be confused with function composition) is a way and practice to combine simple objects or data types into more complex ones.
..... Click the link for more information.
..... Click the link for more information.
Microsoft Corporation
Public (NASDAQ: MSFT )
Founded Albuquerque, New Mexico, USA (April 4 1975)[1]
Headquarters Redmond, Washington, United States
Key people Bill Gates, Co-founder and Executive Chairman ;
Paul Allen, Co-founder ;
..... Click the link for more information.
Public (NASDAQ: MSFT )
Founded Albuquerque, New Mexico, USA (April 4 1975)[1]
Headquarters Redmond, Washington, United States
Key people Bill Gates, Co-founder and Executive Chairman ;
Paul Allen, Co-founder ;
..... Click the link for more information.
Wrox press (established in 1992) is a computer book publisher, based in the UK. Wrox has pioneered the philosophy of "Programmer to Programmer" books for technology professionals. All books published by Wrox are written by software developers.
..... Click the link for more information.
..... Click the link for more information.
A database management system (DBMS) is computer software designed for the purpose of managing databases. Typical examples of DBMSs include Oracle, DB2, Microsoft Access, Microsoft SQL Server, PostgreSQL, MySQL, FileMaker and Sybase Adaptive Server Enterprise.
..... Click the link for more information.
..... Click the link for more information.
database is a structured collection of records or data that is stored in a computer system so that a computer program or person using a query language can consult it to answer queries. The records retrieved in answer to queries are information that can be used to make decisions.
..... Click the link for more information.
..... Click the link for more information.
A data model is not just a way of structuring data: it also defines a set of operations that can be performed on the data. The relational model, for example, defines operations such as select, project, and join.
..... Click the link for more information.
..... Click the link for more information.
flat file database describes any of various means to encode a data model (most commonly a table) as a plain text file.
..... Click the link for more information.
Unordered
Unordered storage typically stores the records in the order they are inserted, while having..... Click the link for more information.
The relational model for database management is a database model based on predicate logic and set theory. It was first formulated and proposed in 1969 by Edgar Codd with aims that included avoiding, without loss of completeness, the need to write computer programs to
..... Click the link for more information.
..... Click the link for more information.
A distributed database management system is a software system that permits the management of a distributed database and makes the distribution transparent to the users. A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network.
..... Click the link for more information.
..... Click the link for more information.
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.
..... Click the link for more information.
..... Click the link for more information.
Null is a special marker used to indicate that a data value is unknown in the Structured Query Language (SQL). Introduced by the creator of the relational database model, Dr. E.F.
..... Click the link for more information.
..... Click the link for more information.
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
Herod_Archelaus