Information about Search Engines



A search engine is an information retrieval system designed to help find information stored on a computer system. Search engines help to minimize the time required to find information and the amount of information which must be consulted, akin to other techniques for managing information overload.

The most popular form of search engine is the Web search engine, which searches for information on the public World Wide Web. Other kinds include enterprise search engines, which search intranets, as well as desktop search engines and mobile search engines.

How search engines work

Querying

Search engines provide an interface to a group of items that enables users to specify criteria about an item of interest and have the engine find the matching items within the group.

In the most popular form of search, items are documents or web pages and the criteria are words or concepts that the documents may contain[1].

A search engine user can express a query in several kinds of syntax. Some are formalized and require a strict logical or algebraic syntax; others are looser and accept a less precisely defined query. One less-restricted form is natural language search, a term typically used for web search engines that apply some form of natural language processing. Instead of one or two words, a query can be an English sentence or paragraph. The engine parses the query into words and runs searches for those words, which spares the user from formulating a query in a restrictive, and sometimes difficult to learn, syntax.

A second definition of natural language search concerns how the engine performs indexing rather than the query syntax. This approach requires a semantic understanding of the query in order to disambiguate the text.
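The step of parsing a sentence-style query into words can be sketched as follows. This is a minimal illustration, not how any particular engine works: the function name `parse_query` and the small stop-word list are assumptions introduced here, and real natural language processing goes well beyond splitting on word boundaries.

```python
import re

# Hypothetical stop-word list for illustration only; production systems
# use much larger, language-specific lists (or none at all).
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "for", "to",
    "is", "are", "what", "which", "who", "how", "do", "does",
}


def parse_query(query: str) -> list[str]:
    """Split a free-text query into lowercase terms, dropping stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [word for word in words if word not in STOP_WORDS]


if __name__ == "__main__":
    print(parse_query("What is the population of the largest city in Canada?"))
```

Each remaining term could then be searched for individually, as the text describes.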

Traditional search engines tend to use a non-linguistic model of language. The hypothesis is that natural language search will provide better results, that is, results that more accurately and efficiently support a user's need.

Ranking

A Boolean search for an item within a group of items returns either the exactly matching item or nothing. This is a strict search method: the desired item and the actual item must be exactly equal. In practice, it is often more useful to apply a looser measure of similarity between the desired item or items and the items in the group being searched.

For example, instead of finding only the exact book in a library, a library search engine may return a list of 'similar' books, with the exact book listed first.
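The contrast between exact matching and a looser similarity measure might look like the sketch below. It uses the standard library's `difflib.SequenceMatcher` as one possible similarity measure; the function names, the threshold of 0.5, and the sample catalogue are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def exact_match(title: str, catalogue: list[str]) -> list[str]:
    """Strict search: return only titles identical to the query."""
    return [t for t in catalogue if t == title]


def similar_titles(title: str, catalogue: list[str], threshold: float = 0.5) -> list[str]:
    """Looser search: return titles whose similarity ratio meets the threshold,
    most similar first (an identical title scores 1.0 and so comes first)."""
    scored = []
    for candidate in catalogue:
        ratio = SequenceMatcher(None, title.lower(), candidate.lower()).ratio()
        if ratio >= threshold:
            scored.append((ratio, candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored]


if __name__ == "__main__":
    catalogue = ["Moby Dick", "Moby Dick Illustrated", "Gardening Basics"]
    print(exact_match("Moby Dik", catalogue))     # strict search finds nothing
    print(similar_titles("Moby Dik", catalogue))  # loose search still finds candidates
```

A misspelled query returns nothing under the strict method but still surfaces the intended book under the looser one.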

The list of items that meet the criteria specified by the query is typically sorted, or ranked, so that the most 'relevant' items come first. Placing the most relevant items first reduces the time users need to determine whether one or more of the results are sufficiently similar to the query. Users of Web search engines have come to expect that the further down the list of matching items they browse, the less relevant the items become.
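As a rough sketch of ranking, the example below scores each document by how often the query terms occur in it and sorts the matches by that score. Term counting is only one simple stand-in for 'relevance' and is an assumption of this example; the names `tokenize` and `rank` are likewise hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query: str, documents: dict[str, str]) -> list[tuple[str, int]]:
    """Return (document id, score) pairs, highest score first.

    The score is the total number of occurrences of the query terms.
    Documents with a score of zero are left out; ties are broken by id.
    """
    terms = tokenize(query)
    results = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in terms)
        if score > 0:
            results.append((doc_id, score))
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results


if __name__ == "__main__":
    docs = {
        "a": "search engines rank search results",
        "b": "engines of cars",
        "c": "cooking recipes",
    }
    print(rank("search engines", docs))
```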

Indexing

To provide a set of matching items quickly, a search engine will typically collect information, or metadata, about the group of items under consideration beforehand. For example, a library search engine may determine the author of each book automatically and add the author name to a description of each book. Users can then search for books by the author's name. Other metadata in this example might include the book title, the number of pages in the book, the date it was published, and so forth.
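The library example can be illustrated with a small metadata record per book. The `BookMetadata` type, the `find_by_author` helper, and the sample records are all hypothetical and serve only to show the idea of searching collected metadata rather than the books themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookMetadata:
    """Metadata collected ahead of time for one book."""
    title: str
    author: str
    pages: int
    year_published: int


def find_by_author(catalogue: list[BookMetadata], author: str) -> list[BookMetadata]:
    """Return every record whose author matches, ignoring letter case."""
    wanted = author.casefold()
    return [book for book in catalogue if book.author.casefold() == wanted]


if __name__ == "__main__":
    # Placeholder records, not real bibliographic data.
    catalogue = [
        BookMetadata("Example Book A", "Author One", 320, 1999),
        BookMetadata("Example Book B", "Author Two", 150, 2005),
        BookMetadata("Example Book C", "Author One", 410, 2003),
    ]
    for book in find_by_author(catalogue, "author one"):
        print(book.title)
```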

The metadata collected about each item is typically stored on a computer in the form of an index. The index typically requires less computer storage than the items it describes, and it gives the search engine a way to calculate the relevance, or similarity, between the query and the set of items.
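One common form of index for text is an inverted index, which maps each word to the documents that contain it. The sketch below builds such an index and answers a query by intersecting the document sets of its words. The function names and sample documents are assumptions of this example, and real indexes usually store additional information, such as word positions, to support relevance calculations.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of ids of the documents containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def lookup(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents containing every word of the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    # Copy the first set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    docs = {
        "d1": "Search engines build an index",
        "d2": "An index speeds up search",
        "d3": "Libraries catalogue books",
    }
    index = build_index(docs)
    print(sorted(lookup(index, "search index")))
```

Because the lookup consults only the index, the documents themselves do not need to be scanned at query time.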

Helpful features

  • Spell checker
  • Highlighter

A spell checker with an auto-correct option, which can be enabled or disabled, saves users the time of typing every word correctly. A highlighter, such as a yellow marker line, emphasizes particular search terms in the results; the highlighted text may also be useful when copying and editing.
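Both features can be approximated in a few lines. The sketch below is only illustrative: the vocabulary, the similarity cutoff of 0.7, the `**` marker, and the function names `suggest` and `highlight` are assumptions, and `difflib.get_close_matches` from the standard library stands in for whatever correction method a real engine would use.

```python
import difflib
import re

# Hypothetical vocabulary; a real spell checker would draw on a dictionary
# or on the words found in the engine's index.
VOCABULARY = ["search", "engine", "index", "query", "ranking"]


def suggest(word: str, vocabulary: list[str] = VOCABULARY) -> str:
    """Return the closest known word, or the word unchanged if none is close."""
    lowered = word.lower()
    if lowered in vocabulary:
        return lowered
    matches = difflib.get_close_matches(lowered, vocabulary, n=1, cutoff=0.7)
    return matches[0] if matches else word


def highlight(text: str, terms: list[str], marker: str = "**") -> str:
    """Wrap each whole-word occurrence of a search term in marker strings."""
    if not terms:
        return text
    alternatives = "|".join(re.escape(term) for term in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


if __name__ == "__main__":
    print(suggest("serch"))
    print(highlight("Search engines build an index", ["search", "index"]))
```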

Predictive search engines study consumer or end-user patterns when presenting results. Some variants of search engines present personalized data.

References

1. Voorhees, E.M. Natural Language Processing and Information Retrieval. National Institute of Standards and Technology. March 2000.

(The entries are in alphabetical order within each category.)

General search engines

  • Alexa Internet
  • Ask.com (formerly Ask Jeeves)
  • Exalead
  • Gigablast
  • Google
  • Live Search (formerly MSN Search)
  • MozDex
  • WiseNut
  • Yahoo! Search




This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.