Thursday, 14 November 2013

Distributed Computing

This concept involves the creation of a software system that runs programs and stores data across a number of different computers, an idea pervasive today. A simple form is the central computer (such as in a bank or credit card company) with which thousands of terminals communicate to submit transactions. While this system is in some sense distributed, it is not really decentralized. Most of the work is done by the central computer, which is not dependent on the terminals for its own functioning. However, responsibilities can be more evenly apportioned between computers (see client-server computing).

Today the World Wide Web is in a sense the world’s largest distributed computing system. Millions of documents stored on hundreds of thousands of servers can be accessed by millions of users’ Web browsers running on a variety of personal computers. While there are rules for specifying addresses and creating and routing data packets (see Internet and tcp/ip), no one agency or computer complex controls access to information or communication (such as e-mail).

Elements of a Distributed Computing System

The term distributed computer system today generally refers to a more specific and coherent system such as a database where data objects (such as records or views) can reside on any computer within the system. Distributed computer systems generally have the following characteristics:

•  The system consists of a number of computers (sometimes called nodes). The computers need not necessarily use the same type of hardware, though they generally use the same (or similar) operating systems.

•  Data consists of logical objects (such as database records) that can be stored on disks connected to any computer in the system. The ability to move data around allows the system to reduce bottlenecks in data flow or optimize speed by storing the most frequently used data in places from which it can be retrieved the most quickly.

•  A system of unique names specifies the location of each object. A familiar example is the DNS (Domain Name System), which maps the names used in Web addresses to the servers that hold the pages.

•  Typically, there are many processes running concurrently (at the same time). Like data objects, processes can be allocated to particular processors to balance the load. Processes can be further broken down into threads (see concurrent programming). Thus, the system can adjust to changing conditions (for example, processing larger numbers of incoming transactions during the day versus performing batches of “housekeeping” tasks at night).

•  A remote procedure call facility enables processes on one computer to communicate with processes running on a different computer.

•  Inter-process communication protocols specify the processing of “messages” that processes use to report status or ask for resources. Message-passing can be asynchronous (not time-dependent, and analogous to mailing letters) or synchronous (with interactive responses, as in a conversation).

•  The capabilities of each object (and thus the messages it can respond to or send) are defined in terms of an interface and an implementation. The interface is like the declaration in a conventional program: It defines the types of data that can be received and the types of data that will be returned to the calling process. The implementation is the code that specifies how the actual processing will be done. The hiding of implementation details within the object is characteristic of object-oriented programming (see class).

•  A distributed computing environment includes facilities for managing objects dynamically. This includes lower-level functions such as copying, deleting, or moving objects, and systemwide capabilities to distribute objects in such a way as to spread the load on the system’s processors more evenly, to make backup copies of objects (replication), and to reclaim and reorganize resources (such as memory or disk space) that are no longer allocated to objects.
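The remote procedure call facility described above can be sketched with Python’s standard `xmlrpc` library, which lets a client invoke a function on a server as if it were local. The host, port handling, and `add` procedure here are illustrative choices, not part of any particular distributed system:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# "Server" node: exposes an add() procedure to remote callers.
# Binding to port 0 lets the OS pick a free port for this sketch.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# "Client" node: calls add() as if it were a local function; the
# library marshals the arguments into a request message and
# unmarshals the reply, hiding the network from the caller.
proxy = ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.add(2, 3)
print(result)  # 5
server.shutdown()
```

In a real distributed system the client and server would run on different machines, but the programming model is the same: the network round-trip is hidden behind an ordinary-looking call.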
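Asynchronous message-passing, the “mailing letters” style noted in the list above, can be illustrated with a simple mailbox shared between two threads. The mailbox, worker, and message text are all invented for the sketch:

```python
import queue
import threading

mailbox = queue.Queue()   # the shared "mailbox" between processes
replies = []

def worker():
    # Receiver: picks up the message whenever it is ready.
    msg = mailbox.get()           # blocks until a message arrives
    replies.append(f"processed {msg}")
    mailbox.task_done()

t = threading.Thread(target=worker)
t.start()

# Sender: deposits a message and continues immediately, without
# waiting for a response -- the asynchronous style.
mailbox.put("status request")

mailbox.join()                    # optional synchronization point
t.join()
print(replies)  # ['processed status request']
```

A synchronous exchange would instead block the sender until the reply arrived, as in a conversation; the queue-based version decouples the two parties in time.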
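The interface/implementation split described above can be sketched with Python’s `abc` module: the abstract class declares what messages an object answers, while the concrete class hides how the processing is done. The `Account` example is hypothetical, chosen to echo the bank scenario earlier in the article:

```python
from abc import ABC, abstractmethod

class Account(ABC):
    """Interface: declares the messages an account object responds to."""
    @abstractmethod
    def deposit(self, amount: int) -> None: ...
    @abstractmethod
    def balance(self) -> int: ...

class LedgerAccount(Account):
    """Implementation: how balances are computed is hidden inside."""
    def __init__(self) -> None:
        self._entries = []        # private detail, invisible to callers

    def deposit(self, amount: int) -> None:
        self._entries.append(amount)

    def balance(self) -> int:
        return sum(self._entries)

acct = LedgerAccount()
acct.deposit(100)
acct.deposit(50)
print(acct.balance())  # 150
```

Because callers depend only on the interface, the implementation could be replaced (say, by one that forwards requests to a remote node) without changing any client code.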

Three widely used systems for distributed computing are Microsoft’s DCOM (Distributed Component Object Model), OMG’s Common Object Request Broker Architecture (see Microsoft .net and corba), and Sun’s Java/Remote Method Invocation (Java/RMI). While these implementations are quite different in details, they provide most of the elements and facilities summarized above.

Applications

Distributed computing is particularly suited to applications that require extensive computing resources and that may need to be scaled (smoothly enlarged) to accommodate increasing needs (see grid computing). Examples might include large databases, intensive scientific computing, and cryptography. A particularly interesting example is SETI@home, which invites computer users to install a special screen saver that runs a distributed process during the computer’s idle time. The process analyzes radio telescope data for correlations that might indicate receipt of signals from an extraterrestrial intelligence (see cooperative processing).

Besides being able to marshal very large amounts of computing power, distributed systems offer improved fault tolerance. Because the system is decentralized, if a particular computer fails, its processes can be replaced by ones running on other machines. Replication (copying) of data across a widely dispersed network can also provide improved data recovery in the event of a disaster.
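The fault-tolerance idea above can be sketched in miniature: if the same record is replicated on several nodes, a read can simply skip a failed node and be served by a surviving replica. The node names, record, and failure set here are all invented for illustration:

```python
# Hypothetical three-node cluster holding replicas of the same record.
replicas = {
    "node-a": {"record-42": "Alice"},
    "node-b": {"record-42": "Alice"},
    "node-c": {"record-42": "Alice"},
}
failed = {"node-a"}   # simulate one crashed machine

def read(key):
    """Try each replica in turn; a single node failure is masked."""
    for node, store in replicas.items():
        if node in failed:
            continue              # skip unreachable nodes
        if key in store:
            return node, store[key]
    raise KeyError(key)           # only if every replica is down

node, value = read("record-42")
print(node, value)  # node-b Alice
```

Real systems add the hard parts this sketch omits, such as detecting failures and keeping the replicas consistent as data changes, but the basic recovery path is the same.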


