Information about Distributed Computing
Distributed computing is a method of computer processing in which different parts of a program run simultaneously on two or more computers that are communicating with each other over a network. Distributed computing is a type of segmented or parallel computing, but the latter term is most commonly used to refer to processing in which different parts of a program run simultaneously on two or more processors that are part of the same computer. While both types of processing require that a program be segmented—divided into sections that can run simultaneously, distributed computing also requires that the division of the program take into account the different environments on which the different sections of the program will be running. For example, two computers are likely to have different file systems and different hardware components.
An example of distributed computing is BOINC, a framework in which large problems can be divided into many small problems which are distributed to many computers. Later, the small results are reassembled into a larger solution.
Distributed computing is a natural result of using networks to enable computers to communicate efficiently. But distributed computing is distinct from computer networking or fragmented computing. The latter refers to two or more computers interacting with each other, but not, typically, sharing the processing of a single program. The World Wide Web is an example of a network, but not an example of distributed computing.
There are numerous technologies and standards used to construct distributed computations, including some which are specially designed and optimized for that purpose, such as Remote Procedure Calls (RPC) or Remote Method Invocation (RMI) or .NET Remoting.

Another important factor is the ability to send software to another computer in a portable way so that it may execute and interact with the existing network. This may not always be possible or practical when using differing hardware and resources, in which case other methods must be used such as cross-compiling or manually porting this software.
Consequently, open distributed systems are required to meet the following challenges:
Troubleshooting and diagnosing problems in a distributed system can also become more difficult, because the analysis may require connecting to remote nodes or inspecting communication between nodes.
Many types of computation are not well suited for distributed environments, typically owing to the amount of network communication or synchronization that would be required between nodes. If bandwidth, latency, or communication requirements are too significant, then the benefits of distributed computing may be negated and the performance may be worse than a non-distributed environment.
Other projects suffer from lack of planning on behalf of their well-meaning originators. These poorly planned projects may not generate results that are palpable, or may not generate data that ultimately result in finished, innovative scientific papers. Sensing that a project may not be generating useful data, the project managers may decide to abruptly terminate the project without definitive results, resulting in wastage of the electricity and computing resources used in the project. Volunteers may feel disappointed and abused by such outcomes. There is an obvious opportunity cost of devoting time and energy to a project that ultimately is useless, when that computing power could have been devoted to a better planned distributed computing project generating useful, concrete results.
Another problem with distributed computing projects is that they may devote resources to problems that may not ultimately be soluble, or to problems that are best pursued later in the future, when desktop computing power becomes fast enough to make pursuit of such solutions practical. Some distributed computing projects may also attempt to use computers to find solutions by number-crunching mathematical or physical models. With such projects there is the risk that the model may not be designed well enough to efficiently generate concrete solutions. The effectiveness of a distributed computing project is therefore determined largely by the sophistication of the project creators.
Distributed programming typically falls into one of several basic architectures or categories: Client-server, 3-tier architecture, N-tier architecture, Distributed objects, loose coupling, or tight coupling.
Common suppliers include Mercury Computer Systems, CSPI, and SKY Computers.
Common uses include 3D medical imaging devices and mobile radar.
Languages specifically tailored for distributed programming are:
Distributed computing projects also often involve competition with other distributed systems. This competition may be for prestige, or it may be a matter of enticing users to donate processing power to a specific project. For example, stat races are a measure of the work a distributed computing project has been able to compute over the past day or week. This has been found to be so important in practice that virtually all distributed computing projects offer online statistical analyses of their performances, updated at least daily if not in real-time.
An example of distributed computing is BOINC, a framework in which large problems can be divided into many small problems which are distributed to many computers. Later, the small results are reassembled into a larger solution.
Distributed computing is a natural result of using networks to enable computers to communicate efficiently. But distributed computing is distinct from computer networking or fragmented computing. The latter refers to two or more computers interacting with each other, but not, typically, sharing the processing of a single program. The World Wide Web is an example of a network, but not an example of distributed computing.
There are numerous technologies and standards used to construct distributed computations, including some which are specially designed and optimized for that purpose, such as Remote Procedure Calls (RPC) or Remote Method Invocation (RMI) or .NET Remoting.
a group of distributed computers working together can outperform one large mainframe. [1]
Organization
Organizing the interaction between each computer is of prime importance. In order to be able to use the widest possible range and types of computers, the protocol or communication channel should not contain or use any information that may not be understood by certain machines. Special care must also be taken that messages are indeed delivered correctly and that invalid messages are rejected which would otherwise bring down the system and perhaps the rest of the network.Another important factor is the ability to send software to another computer in a portable way so that it may execute and interact with the existing network. This may not always be possible or practical when using differing hardware and resources, in which case other methods must be used such as cross-compiling or manually porting this software.
Goals and advantages
There are many different types of distributed computing systems and many challenges to overcome in successfully designing one. The main goal of a distributed computing system is to connect users and resources in a transparent, open, and scalable way. Ideally this arrangement is drastically more fault tolerant and more powerful than many combinations of stand-alone computer systems.Openness
Openness is the property of distributed systems such that each subsystem is continually open to interaction with other systems (see references). Web Services protocols are standards which enable distributed systems to be extended and scaled. In general, an open system that scales has an advantage over a perfectly closed and self-contained system.Consequently, open distributed systems are required to meet the following challenges:
- Monotonicity
- Once something is published in an open system, it cannot be taken back.
- Pluralism
- Different subsystems of an open distributed system include heterogeneous, overlapping and possibly conflicting information. There is no central arbiter of truth in open distributed systems.
- Unbounded nondeterminism
- Asynchronously, different subsystems can come up and go down and communication links can come in and go out between subsystems of an open distributed system. Therefore the time that it will take to complete an operation cannot be bounded in advance (see unbounded nondeterminism).
Drawbacks and disadvantages
- See also:
Technical issues
If not planned properly, a distributed system can decrease the overall reliability of computations if the unavailability of a node can cause disruption of the other nodes. Leslie Lamport famously quipped that: "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."[2]Troubleshooting and diagnosing problems in a distributed system can also become more difficult, because the analysis may require connecting to remote nodes or inspecting communication between nodes.
Many types of computation are not well suited for distributed environments, typically owing to the amount of network communication or synchronization that would be required between nodes. If bandwidth, latency, or communication requirements are too significant, then the benefits of distributed computing may be negated and the performance may be worse than a non-distributed environment.
Project-related problems
Distributed computing projects may generate data that is proprietary to private industry, even though the process of generating that data involves the resources of volunteers. This may result in controversy as private industry profits from the data which is generated with the aid of volunteers. In addition, some distributed computing projects, such as biology projects that aim to develop thousands or millions of "candidate molecules" for solving various medical problems, may create vast amounts of raw data. This raw data may be useless by itself without refinement of the raw data or testing of candidate results in real-world experiments. Such refinement and experimentation may be so expensive and time-consuming that it may literally take decades to sift through the data. Until the data is refined, no benefits can be acquired from the computing work.Other projects suffer from lack of planning on behalf of their well-meaning originators. These poorly planned projects may not generate results that are palpable, or may not generate data that ultimately result in finished, innovative scientific papers. Sensing that a project may not be generating useful data, the project managers may decide to abruptly terminate the project without definitive results, resulting in wastage of the electricity and computing resources used in the project. Volunteers may feel disappointed and abused by such outcomes. There is an obvious opportunity cost of devoting time and energy to a project that ultimately is useless, when that computing power could have been devoted to a better planned distributed computing project generating useful, concrete results.
Another problem with distributed computing projects is that they may devote resources to problems that may not ultimately be soluble, or to problems that are best pursued later in the future, when desktop computing power becomes fast enough to make pursuit of such solutions practical. Some distributed computing projects may also attempt to use computers to find solutions by number-crunching mathematical or physical models. With such projects there is the risk that the model may not be designed well enough to efficiently generate concrete solutions. The effectiveness of a distributed computing project is therefore determined largely by the sophistication of the project creators.
Architecture
Various hardware and software architectures are used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, regardless of whether that network is printed onto a circuit board or made up of loosely-coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system.Distributed programming typically falls into one of several basic architectures or categories: Client-server, 3-tier architecture, N-tier architecture, Distributed objects, loose coupling, or tight coupling.
- Client-server — Smart client code contacts the server for data, then formats and displays it to the user. Input at the client is committed back to the server when it represents a permanent change.
- 3-tier architecture — Three tier systems move the client intelligence to a middle tier so that stateless clients can be used. This simplifies application deployment. Most web applications are 3-Tier.
- N-tier architecture — N-Tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.
- Tightly coupled (clustered) — refers typically to a set of highly integrated machines that run the same process in parallel, subdividing the task in parts that are made individually by each one, and then put back together to make the final result.
- Peer-to-peer — an architecture where there is no special machine or machines that provide a service or manage the network resources. Instead all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and servers.
- Space based — refers to an infrastructure that creates the illusion (virtualization) of one single address-space. Data are transparently replicated according to application needs. Decoupling in time, space and reference is achieved.
Concurrency
Distributed computing implements a kind of concurrency. It interrelates tightly with concurrent programming so much that they are sometimes not taught as distinct subjects [4].Multiprocessor systems
A multiprocessor system is simply a computer that has more than one CPU on its motherboard. If the operating system is built to take advantage of this, it can run different processes (or different threads belonging to the same process) on different CPUs.Multicore systems
Intel CPUs from the late Pentium 4 era (Northwood and Prescott cores) employed a technology called Hyperthreading that allowed more than one thread (usually two) to run on the same CPU. The more recent Sun UltraSPARC T1, AMD Athlon 64 X2, AMD Athlon FX, AMD Opteron, Intel Pentium D, Intel Core, Intel Core 2 and Intel Xeon processors feature multiple processor cores to also increase the number of concurrent threads they can run.Multicomputer systems
A multicomputer may be considered to be either a loosely coupled NUMA computer or a tightly coupled cluster. Multicomputers are commonly used when strong compute power is required in an environment with restricted physical space or electrical power.Common suppliers include Mercury Computer Systems, CSPI, and SKY Computers.
Common uses include 3D medical imaging devices and mobile radar.
Computing taxonomies
The types of distributed systems are based on Flynn's taxonomy of systems; single instruction, single data (SISD), single instruction, multiple data (SIMD), multiple instruction, single data (MISD), and multiple instruction, multiple data (MIMD). Other taxonomies and architectures available at Computer architecture and in .Computer clusters
Grid computing
Languages
Nearly any programming language that has access to the full hardware of the system could handle distributed programming given enough time and code. Remote procedure calls distribute operating system commands over a network connection. Systems like CORBA, Microsoft DCOM, Java RMI and others, try to map object oriented design to the network. Loosely coupled systems communicate through intermediate documents that are typically human readable (e.g. XML, HTML, SGML, X.500, and EDI).Languages specifically tailored for distributed programming are:
- Ada programming language [5]
- Alef programming language
- E programming language
- Erlang programming language
- Limbo programming language
- Oz programming language
- ZPL (programming language)
Examples
Projects
Distributed computing projects also often involve competition with other distributed systems. This competition may be for prestige, or it may be a matter of enticing users to donate processing power to a specific project. For example, stat races are a measure of the work a distributed computing project has been able to compute over the past day or week. This has been found to be so important in practice that virtually all distributed computing projects offer online statistical analyses of their performances, updated at least daily if not in real-time.
See also
- Fallacies of Distributed Computing
- List of distributed computing publications
- Parallel computing
- Network Agility
- Application server
- Software componentry
- Distributed computing environment
- High-Throughput Computing
- List of distributed computing projects
- Basenet
References
1. ^ Fish Picture from the Distributed Computing Review
2. ^ Leslie Lamport. Subject: distribution (Email message sent to a DEC SRC bulletin board at 12:23:29 PDT on 28 May 87). Retrieved on 2007-04-28.
3. ^ A database-centric virtual chemistry system, J Chem Inf Model. 2006 May-Jun;46(3):1034-9
4. ^ CS236370 Concurrent and Distributed Programming 2002
5. ^ Ada Reference Manual, ISO/IEC 8652:2005(E) Ed. 3, Annex E Distributed Systems
6. ^ David P. Anderson (2005-05-23). "A Million Years of Computing". Retrieved on 2006-08-11.
2. ^ Leslie Lamport. Subject: distribution (Email message sent to a DEC SRC bulletin board at 12:23:29 PDT on 28 May 87). Retrieved on 2007-04-28.
3. ^ A database-centric virtual chemistry system, J Chem Inf Model. 2006 May-Jun;46(3):1034-9
4. ^ CS236370 Concurrent and Distributed Programming 2002
5. ^ Ada Reference Manual, ISO/IEC 8652:2005(E) Ed. 3, Annex E Distributed Systems
6. ^ David P. Anderson (2005-05-23). "A Million Years of Computing". Retrieved on 2006-08-11.
Further reading
- Attiya, Hagit and Welch, Jennifer (2004). Distributed Computing: Fundamentals, Simulations, and Advanced Topics. Wiley-Interscience. ISBN 0471453242.
- Lynch, Nancy A (1997). Distributed Algorithms. Morgan Kaufmann. ISBN 1558603484.
- Tel, Gerard (1994). Introduction to Distributed Algorithms. Cambridge University Press.
- Davies, Antony (June 2004). "Computational Intermediation and the Evolution of Computation as a Commodity". Applied Economics.
- Kornfeld, William (January 1981). "[https://dspace.mit.edu/handle/1721.1/5693 The Scientific Community Metaphor]". MIT AI (Memo 641).
- Hewitt, Carl (August 1983). "Analyzing the Roles of Descriptions and Actions in Open Systems". Proceedings of the National Conference on Artificial Intelligence.
- Hewitt, Carl (April 1985). "The Challenge of Open Systems". Byte Magazine.
- Hewitt, Carl (1999-10-23–1999-10-27). "Towards Open Information Systems Semantics". Proceedings of 10th International Workshop on Distributed Artificial Intelligence.
- Hewitt, Carl (January 1991). "Open Information Systems Semantics". Journal of Artificial Intelligence.
- Nadiminti, Dias de Assunção, Buyya (September 2006). "Distributed Systems and Recent Innovations: Challenges and Benefits". InfoNet Magazine, Volume 16, Issue 3, Melbourne, Australia.
External links
- A primer on distributed computing
- Distributed computing at the Open Directory Project
- Distributed computing journals at the Open Directory Project
- Distributed computing conferences at the Open Directory Project
- MIT's Open Course - Distributed Algorithms
- Melbourne's Masters Course in Distributed Computing
- Distributed Computing Frequently Asked Questions (FAQs)
Parallel computing is the simultaneous execution of some combination of multiple instances of programmed instructions and data on multiple processors in order to obtain results faster.
..... Click the link for more information.
..... Click the link for more information.
The Berkeley Open Infrastructure for Network Computing (BOINC) is a non-commercial middleware system for volunteer computing, originally developed to support the SETI@home project, but intended to be useful for other applications in areas as diverse as mathematics, medicine,
..... Click the link for more information.
..... Click the link for more information.
Computer networking is the engineering discipline concerned with communication between computer systems or devices. Networking, routers, routing protocols, and networking over the public Internet have their specifications defined in documents called RFCs.
..... Click the link for more information.
..... Click the link for more information.
Remote procedure call (RPC) is a technology that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote
..... Click the link for more information.
..... Click the link for more information.
Java Remote Method Invocation API, or Java RMI, is a Java application programming interface for performing the object equivalent of remote procedure calls.
There are two common implementations of the API.
..... Click the link for more information.
There are two common implementations of the API.
..... Click the link for more information.
.NET Remoting is a Microsoft application programming interface (API) for interprocess communication released in 2002 with the 1.0 version of .NET Framework. It is one in a series of Microsoft technologies that began in 1990 with the first version of Object Linking and Embedding
..... Click the link for more information.
..... Click the link for more information.
transparent if the system after change adheres to previous external interface as much as possible while changing its internal behaviour. The purpose is to shield from change all systems (or human users) on the other end of the interface.
..... Click the link for more information.
..... Click the link for more information.
In telecommunications and software engineering, scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged.
..... Click the link for more information.
..... Click the link for more information.
Fault-tolerant design refers to a method for designing a system so it will continue to operate, possibly at a reduced level (also known as graceful degradation), rather than failing completely, when some part of the system fails.
..... Click the link for more information.
..... Click the link for more information.
Stand-alone is a loosely defined term used to sort computer programs. It tries to draw some distinction between programs invoked by some computer event and those invoked by other programs.
..... Click the link for more information.
..... Click the link for more information.
The W3C defines a Web service (many sources also capitalize the second word, as in Web Services) as "a software system designed to support interoperable Machine to Machine interaction over a network.
..... Click the link for more information.
..... Click the link for more information.
A Protocol is a set of guidelines or rules that help in governing an operation on the internet and communications over it. They are of many types such as ftp, http, tcp/ip etc.
..... Click the link for more information.
..... Click the link for more information.
In computer science, unbounded nondeterminism or unbounded indeterminacy is a property of concurrency by which the amount of delay in servicing a request can become unbounded as a result of arbitration of contention for shared resources
..... Click the link for more information.
..... Click the link for more information.
In general, reliability (systemic def.) is the ability of a person or system to perform and maintain its functions in routine circumstances, as well as hostile or unexpected circumstances.
The IEEE defines it as ". . .
..... Click the link for more information.
The IEEE defines it as ". . .
..... Click the link for more information.
availability has the following meanings:
1. The degree to which a system, subsystem, or equipment is operable and in a committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time.
..... Click the link for more information.
1. The degree to which a system, subsystem, or equipment is operable and in a committable state at the start of a mission, when the mission is called for at an unknown, i.e., a random, time.
..... Click the link for more information.
Dr. Leslie Lamport (born February 7, 1941 in New York City) is an American computer scientist. A graduate of the Bronx High School of Science, he received a B.S. in mathematics from the Massachusetts Institute of Technology in 1960, and M.A. and Ph.D.
..... Click the link for more information.
..... Click the link for more information.
Troubleshooting is a form of problem solving. It is the systematic search for the source of a problem so that it can be solved. Troubleshooting is often a process of elimination - eliminating potential causes of a problem.
..... Click the link for more information.
..... Click the link for more information.
Computation is a general term for any type of information processing that can be represented mathematically. This includes phenomena ranging from simple calculations to human thinking.
..... Click the link for more information.
..... Click the link for more information.
Synchronization (or Sync) is a problem in timekeeping which requires the coordination of events to operate a system in unison. The familiar conductor of an orchestra serves to keep the orchestra in time.
..... Click the link for more information.
..... Click the link for more information.
A node is a device that is connected as part of a computer network. For example, a node may be a computer, personal digital assistant, cell phone, router, switch, or hub.
..... Click the link for more information.
..... Click the link for more information.
Bandwidth is the difference between the upper and lower cutoff frequencies of, for example, a filter, a communication channel, or a signal spectrum, and is typically measured in hertz.
..... Click the link for more information.
..... Click the link for more information.
Latent can mean "hidden", "presently inactive" or "potentially existing but not yet realized". In specific contexts, it can mean:
..... Click the link for more information.
- Latent heat, the amount of energy released or absorbed by a substance during a change of phase
- Latent geologic fault
..... Click the link for more information.
performance, in performing arts, generally comprises an event in which one group of people (the performer or performers) behave in a particular way for another group of people (the audience).
..... Click the link for more information.
..... Click the link for more information.
Client-server is a computing architecture which separates a client from a server, and is almost always implemented over a computer network. Each client or server connected to a network can also be referred to as a node.
..... Click the link for more information.
..... Click the link for more information.
In software engineering, multi-tier architecture (often referred to as n-tier architecture) is a client-server architecture, originally designed by Jonathon Bolster of Hematites Corp, in which an application is executed by more than one distinct software agent.
..... Click the link for more information.
..... Click the link for more information.
In software engineering, multi-tier architecture (often referred to as n-tier architecture) is a client-server architecture, originally designed by Jonathon Bolster of Hematites Corp, in which an application is executed by more than one distinct software agent.
..... Click the link for more information.
..... Click the link for more information.
Distributed objects are software modules that are designed to work together, but reside either in multiple computers connected via a network or in different processes inside the same computer.
..... Click the link for more information.
..... Click the link for more information.
Loose coupling describes a resilient relationship between two or more systems or organizations with some kind of exchange relationship. Each end of the transaction makes its requirements explicit and makes few assumptions about the other end.
..... Click the link for more information.
..... Click the link for more information.
A computer cluster is a group of tightly coupled computers that work together closely so that in many respects they can be viewed as though they are a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area
..... Click the link for more information.
..... Click the link for more information.
Client-server is a computing architecture which separates a client from a server, and is almost always implemented over a computer network. Each client or server connected to a network can also be referred to as a node.
..... Click the link for more information.
..... Click the link for more information.
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
Herod_Archelaus