
Tuesday, 29 October 2013

digital dashboard

The dashboard of a car is designed to present vital real-time information to the driver, such as speed, fuel supply, and engine status. Ideally this information should be easy to grasp at a glance, allowing for prompt action when necessary. Conversely, unnecessary and potentially distracting information should be avoided, or at least relegated to an unobtrusive secondary display.

A digital dashboard is a computer display that uses similar concepts. Its goal is to provide an executive or manager with the key information that allows him or her to monitor the health of the enterprise and to take action when necessary. (A digital dashboard can also be part of a larger set of management tools—see decision support system.)

The screen display for a digital dashboard can use a variety of objects (see graphical user interface). These can include traditional charts (line, bar, or pie), color-coded maps, depictions of gauges, and a variety of other interface elements sometimes known as “widgets.”

However information is depicted, the dashboard is designed to summarize the current status of business or other functions, identify trends, and warn the user when attention is required. For example, a dashboard might summarize production and shipping for each of a company’s factories. Bars on a chart might be green when levels are within normal parameters, but turn red if, for example, production has fallen more than 20 percent below target goals. Dashboard displays can also be useful for graphically showing the degree to which project objectives are being met.
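The color-coding logic described above can be sketched in a few lines of Python. This is a hypothetical illustration; the 20 percent threshold comes from the example in the text, but the function and color names are invented for this sketch:

```python
def bar_color(actual, target, warn_fraction=0.20):
    """Return a chart color: green while levels are within normal
    parameters, red once production falls more than 20 percent
    below the target goal."""
    if actual < target * (1 - warn_fraction):
        return "red"
    return "green"

print(bar_color(75, 100))   # 25 percent below target -> red
print(bar_color(95, 100))   # within normal parameters -> green
```

A real dashboard would apply a rule like this to each widget as new data arrives, rather than computing colors on demand.
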
Digital dashboards can be custom built or obtained in forms specialized for various types of business. Typically the dashboard is hosted on the corporate Web server and is accessible through Web browsers—perhaps with an abbreviated version that can be viewed on PDAs and smart phones.

Critique

Today dashboards are in widespread use in many top corporations, from Microsoft to Home Depot. An oft-cited advantage of dashboard technology is that it keeps managers focused and provides for quick response in situations where time may be crucial. No longer is it necessary for the manager to track down key individuals and try to make sense of their reports over the phone.

Some critics, however, worry that dashboards may make management too “data driven.” Those regular calls, after all, can form an important part of the relationship between an executive or manager and subordinates, as well as a way of getting a sense of morale and of possible personnel problems that may be affecting productivity. Overreliance on dashboards and “bottom line” numbers may also hurt the morale of salespeople and others who come to feel that they are being micromanaged. Further, the dashboard may omit important considerations, which in turn are likely to receive less attention and support.

digital convergence

Since the late 20th century, many forms of communication and information storage have been transformed from analog to digital representations (see analog and digital). For example, the phonograph record (an electromechanical analog format) gave way during the 1980s to a wholly digital format (see cd-rom). Video, too, is now increasingly being stored in digital form (such as DVD) rather than in the analog form of videotape. Voice telephony, which originally involved the conversion of sound to analogous electrical signals, is increasingly being digitized (as with many cell phones) and transmitted in packet form over the communications network.
The concept of digital convergence is an attempt to explore the implications of so many formerly disparate analog media now being available in digital form. All forms of digital media have key features in common. First, they are essentially pure information (computer data). This means that regardless of whether the data originally represented still images from a camera, video, or film, the sound of a human voice, music, or some other form of expression, that data can be stored, manipulated, and retrieved under the control of computer algorithms. This makes it easier to create seamless multimedia presentations (see multimedia and hypertext and hypermedia). Services or products previously considered to be separate can be combined in new ways. For example, many radio stations now provide their programming in the form of “streaming audio” that can be played by such utilities as RealPlayer or Microsoft Windows Media Player (see streaming). Similarly, television news services such as CNN can offer selected excerpts of their coverage in the form of streaming video files. As more users gain access to broadband Internet connections (such as cable or DSL), it is gradually becoming feasible to deliver TV programs and even full-length feature films in digital format. By the middle of the decade, media delivery began to proliferate on new platforms that represent a further convergence of function. Many “smart phones” can play audio and video (see smartphone). In July 2007 Apple’s iPhone entered the market, combining phone, media player, and Web browsing functions, and similar devices will no doubt follow (see also pda).

Emerging Issues

The merging of traditional media into a growing stream of digital content has created a number of difficult legal and social issues. Digital images or sounds from various sources can easily be combined, filtered, edited, or otherwise altered for a variety of purposes. As a result, the value of photographs as evidence may be gradually compromised.
The ownership and control of the intellectual property represented by music, video, and film has also been complicated by the combination of digitization and the pervasive Internet. For example, during 2000–2001 the legal battles involving Napster, a program that allowed users to share music files, pitted the rights of music producers and artists to control the distribution of their product against the technological capability of users to freely copy and distribute the material. While a variety of copy protection systems (both software- and hardware-based) have been developed in an attempt to prevent unauthorized copying, historically such measures have had only limited effectiveness (see copy protection, digital rights management, and intellectual property and computing).
Digital convergence also raises deeper philosophical issues. Musicians, artists, and scholars have frequently suggested that the process of digitization fails to capture subtleties of performance that might have been accessible in the original media. At the same time, the richness and immersive qualities of the new multimedia may be drawing people further away from the direct experience of the “real” analog world around them. Ultimately, the embodiment of digital convergence in the form of virtual reality, likely to emerge in the early 21st century, will pose questions as profound as those provoked by the invention of printing and the development of mass broadcast media (see virtual reality).

digital cash

Also called digital money or e-cash, digital cash represents the attempt to create a method of payment for online transactions that is as easy to use as the familiar bills and coins in daily commerce (see e-commerce). At present, credit cards are the principal means of making online payments. While using credit cards takes advantage of a well-established infrastructure, it has some disadvantages. From a security standpoint, each payment potentially exposes the payer to the possibility that the credit card number and possibly other identifying information will be diverted and used for fraudulent transactions and identity theft. While the use of secure (encrypted) online sites has reduced this risk, it cannot be eliminated entirely (see computer crime and security). Credit cards are also impracticable for very small payments, from cents to a few dollars (such as for access to magazine articles), because the fees charged by the credit card companies would be too high in relation to the value of the transaction.
One way to reduce security concerns is to make transactions anonymous (like cash) but guaranteed. Products such as DigiCash and CyberCash allow users to purchase increments of a cash equivalent using their credit cards or bank transfers, creating a “digital wallet.” The user can then go to any Web site that accepts the digital cash and make a payment, which is deducted from the wallet. The merchant can verify the authenticity of the cash through its issuer. Since no credit card information is exchanged between consumer and merchant, there is no possibility of compromising it. The lack of wide acceptance and standards has thus far limited the usefulness of digital cash.
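The issuer-verification step might be sketched as follows. This is a simplified illustration, not how DigiCash or CyberCash actually worked; the token format and the use of an HMAC tag are assumptions made for the sketch:

```python
import hashlib
import hmac
import secrets

ISSUER_KEY = secrets.token_bytes(32)   # known only to the (hypothetical) issuer

def issue_token(amount_cents):
    """Issuer creates a cash token: a random serial number plus a
    keyed authentication tag binding the serial to the amount."""
    serial = secrets.token_hex(8)
    tag = hmac.new(ISSUER_KEY, f"{serial}:{amount_cents}".encode(),
                   hashlib.sha256).hexdigest()
    return {"serial": serial, "amount": amount_cents, "tag": tag}

def issuer_verifies(token):
    """The merchant submits the token to the issuer, who recomputes
    the tag; no consumer credit card data is ever involved."""
    expected = hmac.new(ISSUER_KEY,
                        f"{token['serial']}:{token['amount']}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["tag"])

token = issue_token(500)        # $5.00 drawn from the "digital wallet"
assert issuer_verifies(token)   # merchant confirms authenticity
```

Note that this toy scheme is not anonymous (the issuer sees every serial); real digital cash proposals used blind signatures to hide the link between issuance and spending.
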

The need to pay for small transactions can be handled through micropayment systems. For example, users of a variety of online publications can establish accounts through a company called Qpass. When the user wants to read an article from the New York Times, for example, the fee for the article (typically $2–3) is charged against the user’s Qpass account. The user receives one monthly credit card billing from Qpass, which settles accounts with the publications. Qpass, eCharge, and similar companies have had modest success. A similar (and quite successful) service is offered by companies such as PayPal and Billpoint, which allow winning auction bidders to send money from their credit card or bank account to the seller, who would not otherwise be equipped to accept credit cards. True micropayments would extend down to just a few cents.
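The aggregation idea behind services like Qpass can be illustrated with a toy account object (the class and method names are invented for this sketch, not drawn from any real service's API):

```python
class MicropaymentAccount:
    """Accumulates many small charges and settles them as one
    periodic billing, amortizing the fixed card-processing fee."""

    def __init__(self):
        self.pending = []

    def charge(self, merchant, cents):
        self.pending.append((merchant, cents))

    def monthly_settlement(self):
        """Return the total owed and clear the ledger: one credit
        card transaction instead of dozens of tiny ones."""
        total = sum(cents for _, cents in self.pending)
        self.pending.clear()
        return total

acct = MicropaymentAccount()
acct.charge("nytimes", 250)   # a $2.50 article
acct.charge("nytimes", 300)
acct.charge("journal", 25)    # a true micropayment: 25 cents
assert acct.monthly_settlement() == 575
```
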

“True” digital cash, allowing for anonymous payments and micropayments, has been slow to catch on. However, a successful digital cash system is likely to have the following characteristics:

•  Protects the anonymity of the purchaser (no credit card information transmitted to the seller)

•  Can be verified by the seller, perhaps by using one-time encryption keys

•  Allows the purchaser to create digital cash freely from credit cards or bank accounts

•  Aggregates micropayments at a very low transaction cost

As use of digital cash becomes more widespread, it is likely that tax and law enforcement agencies will press for the inclusion of some way to penetrate the anonymity of transactions for audit or investigation purposes. They will be opposed by civil libertarians and privacy advocates. One likely compromise may be requiring that transaction information or encryption keys be deposited with some sort of escrow agency, subject to being divulged upon court order.


Diffie, Bailey Whitfield

(1944–  ) American
Mathematician, Computer Scientist


Bailey Whitfield Diffie, together with Martin Hellman, created the system of public key cryptography that many computer users depend on today to protect their sensitive information (see encryption).

Diffie was born on June 5, 1944, in the borough of Queens, New York City. As a youngster he read about secret codes and became fascinated. Although he was an indifferent high school student who barely qualified for graduation, Diffie scored so high on standardized tests that he won admission to the University of California, Berkeley, in 1962, where he studied mathematics for two years. However, in 1964 he transferred to the Massachusetts Institute of Technology (MIT) and obtained his B.S. in mathematics in 1965.

After graduation Diffie took a job at Mitre Corporation, a defense contractor, where he plunged into computer programming, helping create Mathlab, a program that allowed mathematicians to not merely calculate with a computer, but also to manipulate mathematical symbols to solve equations. (The program would eventually evolve into Macsyma, a software package used widely in the mathematical community—see mathematics software.)

By the early 1970s Diffie had moved to the West Coast, working at the Stanford Artificial Intelligence Laboratory (SAIL), where he met Lawrence Roberts, head of information processing research for ARPA, the Defense Department’s research agency. Roberts’s main project was the creation of the ARPAnet, the computer network that would later evolve into the Internet.

Roberts was interested in providing security for the new network, and (along with AI researcher John McCarthy) he helped revive Diffie’s dormant interest in cryptography. By 1974 Diffie had learned that IBM was developing a more secure cipher system, the DES (Data Encryption Standard), under government supervision. However, Diffie soon became frustrated with the way the National Security Agency (NSA) doled out or withheld information on cryptography, making independent research in the field very difficult. Seeking to learn the state of the art, Diffie traveled widely, seeking out people who might have fresh thoughts on the subject.
Diffie found one such person in Martin Hellman, a Stanford professor who had also been struggling on his own to develop a better system of encryption. They decided to pool their ideas and efforts, and Diffie and Hellman came up with a new approach, which would become known as public key cryptography. It combined two important ideas that had already been discovered to an extent by other researchers. The first idea was the “trap-door function”—a mathematical operation that can be performed easily “forward” but is very hard to work “backward.” Diffie realized, however, that a trap-door function could be devised that could be worked backward easily if the person had the appropriate key.

The second idea was that of key exchange. In classical cryptography, there is a single key used for both encryption and decryption. In such a case it is absolutely vital to keep the key secret from any third party, so arrangements have to be made in advance to transmit and protect the key.

Diffie, however, was able to work out the theory for a system that generates pairs of mathematically interrelated keys: a private key and a public key. Each participant publishes his or her public key, but keeps the corresponding private key secret. If one wants to send an encrypted message to someone, one uses that person’s public key (obtained from the electronic equivalent of a phone directory). The resulting message can only be decrypted by the intended recipient, who uses the corresponding secret, private key.
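The key-exchange mathematics Diffie and Hellman published can be demonstrated with deliberately tiny numbers (real systems use primes hundreds of digits long; the secrets chosen here are arbitrary):

```python
# Public, agreed-upon parameters: a prime modulus p and generator g.
p, g = 23, 5

a_private = 6                     # Alice's secret exponent
b_private = 15                    # Bob's secret exponent

a_public = pow(g, a_private, p)   # Alice publishes g^a mod p
b_public = pow(g, b_private, p)   # Bob publishes g^b mod p

# Each side combines its own secret with the other's public value;
# both arrive at the same shared key, g^(a*b) mod p, without the
# secret exponents ever being transmitted.
a_shared = pow(b_public, a_private, p)
b_shared = pow(a_public, b_private, p)
assert a_shared == b_shared
```

An eavesdropper sees only p, g, and the two public values; recovering the shared key from these is the discrete logarithm problem, the trap-door function of this scheme.
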
The public key system can also be used as a form of “digital signature” for verifying the authenticity of a message. Here a person creates a message encrypted with his or her private key. Since such a message can only be decrypted using the corresponding public key, any other person can use that key (together with a trusted third-party key service) to verify that the message really came from its purported author.
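The signature idea can be illustrated with textbook-sized RSA numbers (far too small to be secure; they serve only to show the private-key/public-key relationship):

```python
# Toy RSA parameters: n = 61 * 53 = 3233, public exponent e = 17,
# private exponent d = 2753 (chosen so that e*d ≡ 1 mod 3120).
n, e, d = 3233, 17, 2753

message = 1234                  # a short message encoded as a number
signature = pow(message, d, n)  # "encrypt" with the private key

# Anyone holding the public key (n, e) can check authorship:
# only the holder of d could have produced a signature that
# decrypts back to the message.
assert pow(signature, e, n) == message
```

In practice one signs a cryptographic hash of the message rather than the message itself, but the private-key/public-key symmetry is the same.
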
Diffie and Hellman’s 1976 paper in the IEEE Transactions on Information Theory began boldly with the statement that “we stand today on the brink of a revolution in cryptography.” This paper soon came to the attention of three researchers who would create a practical implementation called RSA (for Rivest, Shamir, and Adleman).

Through the 1980s Diffie, resisting urgent invitations from the NSA, served as manager of secure systems research for the phone company Northern Telecom, designing systems for managing security keys for packet-switched data communications systems (such as the Internet).

In 1991 Diffie was appointed Distinguished Engineer for Sun Microsystems, a position that has left him free to deal with cryptography-related public policy issues. The best known of these issues has been the Clipper Chip, a proposal that all new computers be fitted with a hardware encryption device that would include a “back door” that would allow the government to decrypt data. Along with many civil libertarians and privacy activists, Diffie did not believe users should have to trust largely unaccountable government agencies for the preservation of their privacy. Their opposition was strong enough to scuttle the Clipper Chip proposal by the end of the 1990s. Another proposal, using public key cryptography but having a third-party “key escrow” agency hold the keys for possible criminal investigation, also fared poorly. In 1998 Diffie and Susan Landau wrote Privacy on the Line, a book about the politics of surveillance and encryption. The book was revised and expanded in 2007.
Diffie has received a number of awards for both technical excellence and contributions to civil liberties. These include the IEEE Information Theory Society Best Paper Award (1979), the IEEE Donald Fink Award (1981), the Electronic Frontier Foundation Pioneer Award (1994), and even the National Computer Systems Security Award (1996), given by the NIST and NSA.

device driver

A fundamental problem in computer design is the control of devices such as disk drives and printers. Each device is designed to respond to a particular set of control commands sent as patterns of binary values through the port to which the device is connected. For example, a printer will respond to a “new page” command by skipping lines to the end of the current page and moving the print head to the start of the next page, taking margin settings into account. The problem is this: When an applications program such as a word processor needs to print a document, how should the necessary commands be provided to the printer? If every application program has to include the appropriate set of commands for each device that might be in use, programs will be bloated and much development effort will be required for supporting devices rather than extending the functionality of the product itself. Instead, the manufacturers of printers and other devices such as scanners and graphics tablets typically provide a program called a driver. (A version of the driver is created for each major operating system in use.) The driver serves as the intermediary between the application, the operating system, and the low-level device control system. It is sometimes useful to have drivers in the form of continually running programs that monitor the status of a device and wait for commands (see demon).
Modern operating systems such as Microsoft Windows typically take responsibility for services such as printing documents. When a printer is installed, its driver program is also installed in Windows. When the application program requests to print a document, Windows’s print system accesses the driver. The driver turns the operating system’s “generic” commands into the specific hardware control commands needed for the device.
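This translation step can be sketched in Python. Everything here is invented for illustration: the class names, the escape sequences, and the stand-in for the operating system's print routine; real drivers are compiled code talking to hardware ports:

```python
class PrinterDriver:
    """Generic interface the (imaginary) OS print system talks to."""
    def new_page(self):
        raise NotImplementedError

class AcmeLaserDriver(PrinterDriver):
    def new_page(self):
        # Made-up control code for this particular device.
        return b"\x1b\x0c"

class OtherBrandDriver(PrinterDriver):
    def new_page(self):
        # A different device expects entirely different bytes.
        return b"\x1bE PAGE\r\n"

def os_print_page(driver):
    """The OS issues one generic command; the installed driver
    supplies the device-specific bytes. The application never
    sees the hardware details."""
    return driver.new_page()

assert os_print_page(AcmeLaserDriver()) == b"\x1b\x0c"
```

The application's code is identical regardless of which driver is installed, which is exactly the decoupling the driver architecture provides.
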

While the use of drivers simplifies things for both program developers and users, there remains the need for users to occasionally update drivers because of an upgrade either in the operating system or in the support for device capabilities. Both Windows and the Macintosh operating system implement a feature called plug and play. This allows a newly installed device to be automatically detected by the system and the appropriate driver loaded into the operating system (see plug and play). Other device management components enable the OS to keep track of the driver version associated with each device. Some of the newest operating systems include auto-update features that can search the Web for the latest driver versions and download them.

The need to provide drivers for popular devices creates something of a barrier to the development of new operating systems. In a catch-22, device manufacturers are unlikely to support a new OS that lacks significant market share, while the lack of device support in turn will discourage users from adopting the new OS. (Users of the Linux operating system faced this problem. However, that system’s open source and cooperative development model made it easier for enthusiasts to write and distribute drivers without waiting for manufacturers to do so.)

developing nations and computing


Most writing about computer technology tends to focus on developments in technically advanced nations, such as the United States, European Union, and Japan. There is also growing coverage of the rapidly developing information economy in the world’s two most populous nations, India and China. But what about the poorest or least developed nations, particularly those in Africa?

Infrastructure

A common problem in developing countries is a lack of basic infrastructure to support electronic devices—phone lines, television cables, even a reliable power grid. (About two billion people on this planet still have no access to electricity!)

One way around this obstacle is to skip over the wired stage of development in favor of wireless connections, perhaps using battery or even solar power. The necessity for large government investments in infrastructure can then be avoided in favor of mobile, distributed, flexible access that can be gradually spread and scaled up. Already, in some of the poorest nations mobile phone use has been growing at an annual rate of 50 percent or more.

Once access to communications and data is provided, users can immediately start getting an economic return or otherwise improving their lives. Farmers, for example, can get weather reports and keep in touch with market prices. Of course online communications might also give farmers a tool for organizing themselves politically or economically (such as into co-ops). People start to get in touch with developments around the world that might affect them, and discover possible ways to a better life. However, authoritarian governments often resist such trends because they fear the development of well-connected democratic reform movements.

Closing the Gap

Much of the barrier to developing countries joining the networked world is human rather than technological. Before people can learn to use computers, they need to be able to read. They also need some idea of what science and technology are about and why they are important for their economic well-being.

Beyond people learning to use computers to communicate, or in agriculture or commerce, a developing country needs to have enough people with the advanced skills needed for a self-sustaining information economy. These include technicians, support staff, teachers, engineers, programmers, and computer scientists.

One reason for the rapid growth of computing in India and especially China is that these countries, while still having millions of people living at subsistence level, also have effective educational systems, including advanced training. Their growing pool of skilled but relatively inexpensive workers in turn attracts foreign investment capital. In addition to China and India, other nations with strong electronics manufacturing industries include Singapore, Korea, Malaysia, Mexico, and Brazil.

The United Nations has developed the Technology Achievement Index (TAI) to measure the ability of a country to innovate, to effectively use new and existing technology, and to build a base of technically skilled workers.

One Laptop per Child

While the conventional view of technological development stresses the importance of infrastructure and skills, some visionary educational activists are suggesting a way to “jump-start” the information economy in poor and developing countries. They note that despite the potential of wireless technology, adequate computing power for joining the world network has simply been too expensive for all but the elite in developing countries. (A $400 no-frills PC costs more than the annual per capita income of Haiti, for example.)

In response, MIT computer scientists (see MIT Media Lab and Negroponte, Nicholas) have started an initiative called One Laptop Per Child. Their machine (introduced as a prototype in 2005) includes the following features:

•  very low power consumption (2–3 watts)

•  lower and higher power modes (the latter, for example, can provide backlighting for the screen when an external power source is available)

•  ability to use a variety of batteries or an external power source, including a hand-powered generator

•  built-in wireless networking

•  tough construction, including a water-resistant membrane keyboard

•  flash memory instead of a hard drive or CD-ROM

•  built-in color camera, microphone, and stereo speakers

•  open-source Linux operating system and other software, including programming languages especially useful for learners

The computer is intended ultimately to cost no more than $100 per unit, and is to be distributed through participating governments. Countries that have made at least tentative commitments to the project as of 2007 include Argentina, Cambodia, Costa Rica, Dominican Republic, Egypt, Greece, Libya, Nigeria, Pakistan, Peru, Rwanda, Tunisia, Uruguay, and, in the United States, the states of Massachusetts and Maine.

The underlying philosophy of the project is based on “constructivist learning,” the idea that children can learn powerful ideas through using suitable interactive systems (see logo and Papert, Seymour). In a way it is intended to be a sort of lever to create a generation with the skills to function in the 21st-century information economy, without re-creating the cumbersome industrial-style educational systems of the previous 200 years.

Some critics are concerned about the environmental impact of producing (and eventually disposing of) millions more computers, while others (including some officials in developing countries) believe the money for providing computers to children should be used instead for more urgent needs such as clean water, public health, and basic school supplies.

Whether using top-down or bottom-up approaches, the web of connection, communication, and information continues its rapid though uneven spread around the world. However, as new technologies continue to emerge in the developed world, the position of technological “have-nots” may worsen if effective education and access programs are not developed.

design patterns

Design patterns are an attempt to abstract and generalize what is learned in solving one problem so that it can be applied to future similar problems. The idea was first applied to architecture by Christopher Alexander in his book A Pattern Language. Alexander described a pattern as a description of situations in which a particular problem occurs, with a solution that takes into account the factors that are “invariant” (not changed by context). Guidance for applying the solution is also provided.

For example, a bus stop, a waiting room, and a line at a theme park are all places where people wait. A “place to wait” pattern would specify the problem to be solved (how to make waiting as pleasant as possible) and suggest solutions. Patterns can have different levels of abstraction or scales on which they apply (for example, an intimate theater and a stadium are both places of entertainment, but one is much larger than the other).

Patterns in turn are linked into a network called a pattern language. Thus when working with one pattern, the designer is guided to consider related patterns. For example, a pattern for a room might relate to patterns for seating or grouping the occupants.

Patterns in Software

The concept of patterns and pattern languages carries over well into software design. As with architectural patterns, a software pattern describes a problem and solution, along with relevant structures (see class and object-oriented programming). Note that patterns are not executable code; they are at a higher level (one might say abstract enough to be generalizable, specific enough to be applicable).

Software patterns can specify how objects are created and ways in which they function and interface with other objects. Patterns are generally documented using a common format; one example is provided in the book Design Patterns. This scheme has the following sections:

•  name and classification

•  intent or purpose

•  alternative names

•  problem—the kind of problem the pattern addresses, and conditions under which it can be used

•  applicability—typical situations of use

•  structure description—such as class or interaction diagrams

•  participants—classes and objects involved in the pattern and the role each plays

•  collaboration—how the objects interact with one another

•  consequences—the expected results of using the pattern, and possible side effects or shortcomings

•  implementation—explains a way to implement the pattern to solve the problem
•  sample code—usually in a commonly used programming language

•  known uses—actual working applications of the pattern
•  related patterns—other patterns that are similar or related, with a description of how they differ

An example given in Design Patterns is the “publish-subscribe” pattern. This pattern describes how a number of objects (observers) can be dependent on a “subject.” All observers are “subscribed” to the subject, so they are notified whenever any data in the subject changes. This pattern could be used, for example, to set up a system where different reports, spreadsheets, etc., need to be updated whenever notified by a controlling object that has received new data.
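A minimal version of this pattern might look like the following Python sketch (the class names are illustrative, not taken from the book's sample code):

```python
class Subject:
    """Holds data and notifies subscribed observers of changes."""

    def __init__(self):
        self._observers = []
        self._data = None

    def subscribe(self, observer):
        self._observers.append(observer)

    def set_data(self, data):
        self._data = data
        for obs in self._observers:   # push the change to every subscriber
            obs.update(data)

class Report:
    """An observer: redraws itself (here, just records the data)
    whenever the subject changes."""

    def __init__(self):
        self.latest = None

    def update(self, data):
        self.latest = data

subject = Subject()
report_a, report_b = Report(), Report()
subject.subscribe(report_a)
subject.subscribe(report_b)

subject.set_data({"sales": 42})
assert report_a.latest == report_b.latest == {"sales": 42}
```

The subject knows nothing about what its observers do with the data, which is what lets reports, spreadsheets, and charts all stay synchronized through one notification mechanism.
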

Some critics consider the use of patterns to be too abstract and inefficient. Since a pattern has to be re-implemented for each use, it has been argued that well-documented, reusable classes or objects would be more useful.

Proponents, however, argue that “design reuse” is more powerful than mere “object reuse.” A pattern provides a whole “language” for talking about a problem and its proven solutions, and can help both the original designer and others understand and extend the design.


desktop publishing (DTP)

Traditionally documents such as advertisements, brochures, and reports were prepared by combining typed or printed text with pasted-in illustrations (such as photographs and diagrams). This painstaking layout process was necessary in order to produce “camera-ready copy” from which a printing company could produce the final product.

Starting in the late 1980s, desktop computers became powerful enough to run software that could be used to create page layouts. In addition, display hardware gained a high enough resolution to allow pages to be shown on the screen in much the same form as they would appear on the printed page. (This is known by the acronym WYSIWYG, or “what you see is what you get.”) The final ingredient for the creation of desktop publishing was the advent of affordable laser or inkjet printers that could produce near-print-quality text and high-resolution graphics (see printers).

This combination of technologies made it feasible for trained office personnel to create, design, and produce many documents in-house rather than having to send copy to a printing company. Adobe’s PageMaker program soon became a standard for the desktop publishing industry, appearing first on the Apple Macintosh and later on systems running Microsoft Windows. (The Macintosh’s support for fonts and WYSIWYG displays gave it a head start over the Windows PC in the DTP industry, and to this day many professionals prefer it.)

There is no hard-and-fast line between desktop publishing and the creation of text itself. Modern word processing software such as Microsoft Word includes a variety of features for selection and sizing of fonts, and the ability to define styles for creating headings, types of paragraphs, and so on (see word processing). Word and other programs also allow for the insertion and placement of graphics and tables, the division of text into columns, and other layout features. In general, however, word processing emphasizes the creation of text (often for long documents), while desktop publishing software emphasizes layout considerations and the fine-tuning of a document’s appearance. Thus, while a word processor might allow the selection of a font in a given point size, a desktop publishing program allows for the exact specification of leading (space between lines) and kerning (the adjustment of space between characters). Most desktop publishing programs can import text that was originally created in a word processor. This is helpful because using desktop publishing software to create the original text can be tedious.

Desktop publishing is generally used for short documents such as ads, brochures, and reports. Material to be published as a book or magazine article is normally submitted by the author as a word processing document. The publisher’s production staff then creates a print-ready version. Books and other long documents are generally produced using in-house computer typesetting facilities.

Today desktop publishing is part of a range of technologies used for the production of documents and presentations. Document designers also use drawing programs (such as Corel Draw) and photo manipulation programs (such as Adobe Photoshop) in preparing illustrations. Further, the growing use of the Web means that many documents must be displayable on Web pages as well as in print. Adobe’s Portable Document Format (PDF) is one popular way of creating files that exactly portray printed text (see PDF).

Dertouzos, Michael L.
(1936–2001) Greek-American
Computer Scientist, Futurist

Born in Athens, Greece, on November 5, 1936, Michael Dertouzos spent adventurous boyhood years accompanying his father (an admiral) in the Greek navy’s destroyers and submarines. He became interested in Morse code, shipboard machinery, and mathematics. At the age of 16 he read an article about Claude Shannon’s work in information theory and a project at the Massachusetts Institute of Technology that sought to build a mechanical robot “mouse.” He quickly decided that he wanted to come to America to study at MIT.
After the hardships of the World War II years, Dertouzos received a Fulbright scholarship that placed him at the University of Arkansas, where he earned his bachelor’s and master’s degrees while working on acoustic-mechanical devices for the Baldwin Piano Company. He was then able to fulfill his boyhood dream, receiving his Ph.D. from MIT and promptly joining the faculty. He was director of MIT’s Laboratory for Computer Science (LCS) starting in 1974. The lab has been a hotbed of new ideas in computing, including computer time-sharing, Ethernet networking, and public-key cryptography. Dertouzos also embraced the growing Internet and served as coordinator of the World Wide Web Consortium, a group that seeks to create standards and plans for the growth of the network.

Combining theoretical interest with an entrepreneur’s eye on market trends, Dertouzos started a small company called Computek in 1968. It made some of the first “smart terminals” that included their own processors.
In the 1980s, Dertouzos began to explore the relationship between developments and infrastructure in information processing and the emerging “information marketplace.” However, the spectacular growth of the information industry has taken place against a backdrop of the decline of American manufacturing. Dertouzos’s 1989 book, Made in America, suggested ways to revitalize American industry.
During the 1990s, Dertouzos brought MIT into closer relationship with the visionary designers who were creating and expanding the World Wide Web. When Tim Berners-Lee and other Web pioneers were struggling to create the World Wide Web Consortium to guide the future of the new technology, Dertouzos provided extensive guidance to help them set their agenda and structure. (See World Wide Web and Berners-Lee, Tim.)

Dertouzos was dissatisfied with operating systems such as Microsoft Windows and with popular applications programs. He believed that their designers made it unnecessarily difficult for users to perform tasks, and spent more time on adding fancy features than on improving the basic usability of their products. In 1999, Dertouzos and the MIT LCS announced a new project called Oxygen. Developed in collaboration with the MIT Artificial Intelligence Laboratory, Oxygen was intended to make computers “as natural a part of our environment as the air we breathe.”

As a futurist, Dertouzos tried to paint vivid pictures of possible future uses of computers in order to engage the general public in thinking about the potential of emerging technologies. His 1995 book, What Will Be, portrays a near-future, pervasively digital environment. His imaginative future is based on actual MIT research, such as the design of a “body net,” a kind of wearable computer and sensor system that would allow people not only to keep in touch with information but also to communicate detailed information with other, similarly equipped people. This digital world would also include “smart rooms” and a variety of robot assistants, particularly in the area of health care. However, neither this book nor his 2001 publication, The Unfinished Revolution, is an unalloyed celebration of technological wizardry. Dertouzos pointed out that there is a disconnect between technological visionaries who lack understanding of the daily realities of most people’s lives and humanists who do not understand the intricate interconnectedness (and thus social impact) of new technologies.

Dertouzos was made an IEEE Fellow and elected to membership in the National Academy of Engineering. He died on August 27, 2001, after a long bout with heart disease. He was buried in Athens near the finish line of the Olympic marathon.

demon

The unusual computing term demon (sometimes spelled daemon) refers to a process (program) that runs in the background, checking for and responding to certain events. The utility of this concept is that it allows for automation of information processing without requiring that an operator initiate or manage the process.

For example, a print spooler demon looks for jobs that are queued for printing, and deals with the negotiations necessary to maintain the flow of data to that device. Another demon (called cron in UNIX systems) reads a file describing processes that are designated to run at particular dates or times. For example, it may launch a backup utility every morning at 1:00 a.m. E-mail also depends on the periodic operation of “mailer demons.”

While the term demon originated in the UNIX culture, similar facilities exist in many operating systems. Even in the relatively primitive MS-DOS for IBM personal computers of the 1980s, the ability to load and retain small utility programs that could share the main memory with the currently running application allowed for a sort of demon that could spool output or await a special keypress. Microsoft Windows systems have many demon-like operating system components that can be glimpsed by pressing the Ctrl-Alt-Delete key combination.

The sense of autonomy implied in the term demon is in some ways similar to that found in bots or software agents that can automatically retrieve information on the Internet, or in the Web crawler, which relentlessly pursues, records, and indexes Web links for search engines. (See software agent and search engine.)

Dell, Inc.

Dell, Inc. (NASDAQ: DELL) is one of the world’s leading manufacturers and sellers of desktop and laptop computers (see personal computer). By 2008 Dell had more than 88,000 employees worldwide.

The company was founded by Michael Dell, a student at the University of Texas at Austin whose first company was PC’s Limited, founded in 1984. Even at this early stage Dell successfully employed several practices that would come to typify the Dell strategy: Sell directly to customers (not through stores), build each machine to suit the customer’s preferences, and be aggressive in competing on price.

In 1988 the growing company changed its name to Dell Computer Corporation. In the early 1990s Dell tried an alternative business model, selling through warehouse clubs and computer superstores. When that met with little success, Dell returned to the original formula. In 1999 Dell overtook Compaq to become the biggest computer retailer in America.

Generally, the Dell product line has aimed at two basic segments: business-oriented (OptiPlex desktops and Latitude laptops) and home/consumer (XPS desktops and Inspiron laptops, and in 2007, Inspiron desktops).

Challenges and Diversification

Around 2002, Dell, perhaps facing the growing commodity pricing of basic PCs, began to expand into computer peripherals (such as printers) and even home entertainment products (TVs and audio players). In 2003 the company changed its name to Dell, Inc. (dropping “Computer”). Dell also experienced an increase in international sales in 2005, while achieving a first-place ranking in Fortune magazine as “most admired company.” However, the company also made some missteps, losing $300 million because of faulty capacitors on some motherboards. Earnings continued to fall short of analysts’ expectations, and in January 2007 Michael Dell returned as CEO after the resignation of Kevin B. Rollins, who had held the post since 2004.

Meanwhile, Dell has made further attempts at diversifying the product line. In 2006 the company began, for the first time, to offer AMD (instead of Intel) processors in certain products, and that same year it acquired Alienware, maker of high-performance gaming machines. In 2007 Dell responded to customer suggestions by announcing that some models could be ordered with Linux rather than Microsoft Windows installed.

Dell has struggled to boost its sagging revenue as it lost ground to competitors, notably HP. Known primarily as a mail-order and online company, Dell has announced that it will also sell PCs through “big box” retailers such as Wal-Mart.

Dell continues to receive praise and criticism from various quarters. On the positive side, the company has been praised for its computer-recycling program by the National Recycling Coalition. Dell products also tend to score at or near the top in performance reviews by publications such as PC Magazine.

On the other hand, there have been complaints about Dell’s technical support operation. Technicians apparently follow “scripts” very closely, making customers take systems apart and follow troubleshooting directions regardless of what the customer might already know or have done. The increasing “offshoring” of support has also led to complaints about language and communication problems.

decision support system

A decision support system (DSS) is a computer application that focuses on providing access to or analysis of the key information needed to make decisions, particularly in business. (It can be thought of as a more narrowly focused approach to computer assistance to management—see management information system.)

The development of DSS has several roots reaching back to the 1950s. These include operational analysis, the theory of organizations, and the development of the first interactive (rather than batch-processing) computer systems. Indeed, the SAGE automated air defense system, developed starting in the 1950s, could be described as a military DSS. The system presented real-time information (radar plots) and enabled the operator to select and focus on particular elements using a light pen. By the 1960s more systematic research on DSS was under way, including the provocative idea of “human-computer symbiosis” for problem solving (see Licklider, J. C. R.).

The “back end” of a DSS is one or more large databases (see data warehouse) that might be compiled from transaction records, statistics, online news services, or other sources. The “middle” of the DSS process includes the ability to analyze the data (online analytical processing, or OLAP; see also data mining). Other elements that might be included in a DSS are rules-based systems (see expert system) and interactive models (see simulation). These elements can help the user explore alternatives and “what if” scenarios.

The structure of a DSS is sometimes described as model driven (generally using a small amount of selected data), data driven (based on a large collection of historical data), knowledge driven (perhaps using an expert system), or communications driven (focusing on the use of collaborative software—see groupware, as well as more recent developments such as wikis; see wikis and Wikipedia).

User Interface—The “Front End”

All the data and tools in the world are of little use if the user cannot work with them effectively (see user interface). Information and the results of queries or modeling must be displayed in a way that is easy to grasp and use. (A spreadsheet with nothing highlighted or marked would be a poor choice.) Graphical “widgets” such as dials, buttons, sliders, and so on can help the user see the results and decide what to look at next (see digital dashboard).

Another key principle is that decision making in the modern world is as much a social as an individual process. Therefore a DSS should facilitate communication and collaboration (or interface with software that does so).

A variety of specialized DSSs have been developed for various fields. Examples include PROMIS (for medical decision making) and Carnegie Mellon’s ZOG/KMS, which has been used in military and business settings.

data warehouse

Modern business organizations create and store a tremendous amount of data in the form of transactions that become database records. Increasingly, however, businesses are relying on their ability to use data that was collected for one purpose (such as sales, customer service, and inventory) for purposes of marketing research, planning, or decision support. For example, transaction data might be revisited with a view to identifying the common characteristics of the firm’s best customers or determining the best way to market a particular type of product. In order to conduct such research or analysis, the data collected in the course of business must be stored in such a way that it is both accurate and flexible in terms of the number of different ways in which it can be queried. The idea of the data warehouse is to provide such a repository for data.

When data is used for particular purposes such as sales or inventory control, it is usually structured in records where certain fields (such as stock number or quantity) are routinely processed. It is not so easy to ask a different question such as “which customers who bought this product from us also bought this other product within six months of their first purchase?” One way to make it easier to query data in new ways is to store the data not in records but in arrays where, for example, one dimension might be product numbers and another categories of customers. This approach, called Online Analytical Processing (OLAP), makes it possible to extract a large variety of relationships without being limited by the original record structure.

Implementation

The key in designing a data warehouse is to provide a way that researchers using analytical tools (such as statistics programs) can access the raw data in the underlying database. Software using query languages such as SQL can serve as such a link. Thus, the researcher can define a query using the many dimensions of the data array, and the OLAP software (also called middleware) translates this query into the appropriate combination of queries against the underlying relational database.

The data warehouse is closely related to the concept of data mining. In fact, data mining can be viewed as the exploitation of the collection of views, queries, and other elements that can be generated using the data warehouse as the infrastructure (see data mining).


data types

As far as the circuitry of a computer is concerned, there’s only one kind of data—a series of bits (binary digits) filling a series of memory locations. How those bits are to be interpreted by the people using the computer is entirely arbitrary. The purpose of data types is to define useful concepts such as integer, floating-point number, or character in terms of how they are stored in computer memory.

Thus, most computer languages have a data type called integer, which represents a whole number stored in a fixed amount of memory—16 bits (two bytes) in the examples that follow, though 32 bits is more common on today’s systems. When a programmer writes a declaration such as:

int Counter;

in the C language, the compiler will create machine instructions that set aside two bytes of memory (on a 16-bit system) to hold the contents of the variable Counter. If a later statement says:

Counter = Counter + 1;

(or its equivalent, Counter++), the program’s instructions are set up to fetch the variable’s two bytes into the processor’s accumulator, add 1, and store the result back into those two memory bytes.

Similarly, the data type long represents four bytes (32 bits) worth of binary digits, while the data type float stores a floating-point number that can have a whole part and a decimal fraction part (see numeric data). The char (character) type typically uses only a single byte (8 bits), which is enough to hold the standard ASCII character codes (0–127) as well as extended character codes up to 255 (see characters and strings).

The bool (Boolean) data type represents a simple true or false (usually 1 or 0) value (see Boolean operators).

Structured Data Types

The preceding data types all hold single values. However, most modern languages allow for the construction of data types that can hold more than one piece of data. The array is the most basic structured data type; it represents a series of memory locations that hold data of one of the basic types. Thus, in Pascal an array of integer holds integers, each taking up two bytes of memory.

Many languages have composite data types that can hold data of several different basic types. For example, the struct in C or the record in Pascal can hold data such as a person’s first and last name, three lines of address (all arrays of characters, or strings), an employee number (perhaps an integer or double), a Boolean field representing the presence or absence of some status, and so on. This kind of data type is also called a user-defined data type because programmers can define and use these types in almost the same ways as they use the language’s built-in basic types.

What is the difference between data types and data structures? There is no hard-and-fast distinction. Generally, data structures such as lists, stacks, queues, and trees are more complex than simple data types, because they include data relationships and special functions (such as pushing or popping data on a stack). However, a list is the fundamental data type in list-processing languages such as Lisp, and string operators are built into languages such as Snobol. (See list processing, stack, queue, and tree.) Further, in many modern languages fundamental and structured data types are combined seamlessly into classes that combine data structures with the relevant operations (see class and object-oriented programming).


Monday, 28 October 2013

data structures

A data structure is a way of organizing data for use in a computer program. There are three basic components to a data structure: a set of suitable basic data types, a way to organize or relate these data items to one another, and a set of operations, or ways to manipulate the data.

For example, the array is a data structure that can consist of just about any of the basic data types, although all the data must be of the same type. The data is organized by storing it in sequentially addressable locations. The operations include storing a data item (element) in the array and retrieving a data item from the array.

Types of Data Structures

The data structures commonly used in computer science include arrays (as discussed above) and various types of lists. The primary difference between an array and a list is that an array has no internal links between its elements, while a list has one or more pointers that link the elements. There are several types of specialized list. A tree is a list that has a root (an element with no predecessor), and each other element has a unique predecessor. The guarantee of a unique path to each tree node can make the operations of inserting or deleting an item faster. A stack is a list that is accessible only at the top (or front). Any new item is inserted (“pushed”) on top of the last item, and removing (“popping”) an item always removes the item that was last inserted. This order of access is called LIFO (last in, first out). A list can also be organized in a first in, first out (FIFO) order. This type of list is called a queue, and is useful in a situation where tasks must “wait their turn” for attention.

Implementation Issues

The implementation of any data structure depends on the syntax of the programming language to be used, the data types and features available in the language, and the algorithms chosen for the data operations that manipulate the structure. In traditional procedural languages such as C, the data storage part of a data structure is often specified in one part of the program, and the functions that operate on that structure are defined separately. (There is no mechanism in the language to link them.) In object-oriented languages such as C++, however, both the data storage declarations and the function declarations are part of the same entity, a class. This means that the designer of the data structure has complete control over its implementation and use.

Together with algorithms, data structures make up the heart of computer science. While there can be numerous variations on the fundamental data structures, understanding the basic forms and being able to decide which one to use to implement a given algorithm is the best way to assure effective program design.



data security

In most institutional computing environments, access to program and data files is restricted to authorized persons. There are several mechanisms for restricting file access in a multiuser or networked system.

User Status

Because of their differing responsibilities, users are often given differing restrictions on access. For example, there might be status levels ranging from root to administrator to “ordinary.” A user with root status on a UNIX system is able to access any file or resource. Any program run by such a user inherits that status, and thus can access any resource. Generally, only the user(s) with ultimate responsibility for the technical functioning of the system should be given such access, because commands used by root users have the potential to wipe out all data on the system. A person with administrator status may be able to access the files of other users and to access certain system files (in order to change configurations), but will not be able to access certain core system files. Ordinary users typically have access only to the files they create themselves and to files designated as “public” by other users.

File Permissions

Files themselves can have permission status. In UNIX, there are separate statuses for the user, any group to which the user belongs, and “others.” There are also three different activities that can be allowed or disallowed: reading, writing, and executing. For example, if a file’s permissions are rwxrw-r--, the user can read or write the file or (if it is a directory or program) execute it, members of the same group can read or write but not execute, and others can only read the file without being able to change it in any way. Operating systems such as Windows NT use a somewhat different structure and terminology, but also provide for varying user status and access to objects.

Record-level Security

Security on the basis of whole directories or even files may be too “coarse” for many applications. In a particular database file, different users may be given access to different data fields. For example, a clerk may have read-only access to an employee’s basic identification information, but not to the results of performance evaluations. An administrator may have both read and write access to the latter. Using some combination of database management and operating system level capabilities, the system will maintain lists of user accounts together with the objects (such as record types or fields) they can access, and the types of access (read only or read/write) that are permitted. Rather than assigning access capabilities separately for each user, they may be defined for a group of similar users, and then individual users can be assigned to the group.

Other Security Measures

Security is also important at the program level. Because a badly written (or malicious) program might destroy important data or system files, most modern operating systems restrict programs in a number of ways. Generally, each program is allowed to access only such memory as it allocates itself, and is not able to change data in memory belonging to other running programs. Access to hardware devices can also be restricted: an operating system component may have the ability to access the innermost core of the operating system (where drivers interact directly with devices), while an ordinary applications program may be able to access devices only through facilities provided by the operating system.

There are a number of techniques that unauthorized intruders can use to try to compromise operating systems (see computer crime and security). Access capabilities that are tied to user status are vulnerable if the intruder can get the login ID and password for the account. If the account has a high (administrator or root) status, then the intruder may be able to give viruses, Trojan horses, or other malicious programs the status they need in order to be able to penetrate the defenses of the operating system (see also computer virus).

Files that have intrinsically sensitive or valuable data are often further protected by encoding them (see encryption). Encryption means that even intruders who gain read access to the file will need either to crack the encryption (very difficult without considerable time and computer resources) or somehow obtain the key. Encryption does not prevent the deletion or copying of a file, however, just the understanding of its contents.

The dispersal of valuable or sensitive data (such as customers’ social security numbers) across expanding networks increases the risk of “data breaches” where the privacy, financial security, and even identity of thousands of people are compromised (see also identity theft). In recent years, for example, there have been numerous cases where laptop computers containing thousands of sensitive records have been stolen from universities, financial institutions, or government agencies—in such cases there is often no way to know whether the thief will actually access the data. (Often affected individuals are notified that they may be at risk, and such prophylactic measures as credit monitoring are provided.) In response to public anxiety there has been pressure for federal or state legislation that would make companies responsible for breaches of their data and specify compensation or other recourse for affected customers. (Opponents of such laws cite government reports that find that most data breaches do not lead to identity theft, and that the regulations would increase the cost of millions of daily transactions.)

There is a continuing tradeoff between security and ease of use. From the security standpoint, it might be assumed that the more barriers or checkpoints that can be set up for verifying authorization, the safer the system will be. However, as security systems become more complex, it becomes more difficult to ensure that authorized users are not unduly inconvenienced. If users are sufficiently frustrated, they will be tempted to try to bypass security, such as by sharing IDs and passwords or making files they create “public.”





data mining

The process of analyzing existing databases in order to find useful information is called data mining. Generally, a database, whether scientific or commercial, is designed for a particular purpose, such as recording scientific observations or keeping track of customers’ account histories. However, data often has potential applications beyond those conceived by its collector.

Conceptually, data mining involves a process of refining data to extract meaningful patterns—usually with some new purpose in mind. First, a promising set or subset of the data is selected or sampled. Particular fields (variables) of interest are identified. Patterns are found using techniques such as regression analysis to find variables that are highly correlated to (or predicted by) other variables, or through clustering (finding the data records that are the most similar along the selected dimensions). Once the “refined” data is extracted, a representation or visualization (such as a report or graph) is used to express newly discovered information in a usable form.

Similar (if simpler) techniques are being used to target or personalize marketing, particularly to online customers. For example, online bookstores such as Amazon.com can find what other books have been most commonly bought by people buying a particular title. (In other words, identify a sort of reader profile.) If a new customer searches for that title, the list of correlated titles can be displayed, with an increased likelihood of triggering additional purchases. Businesses can also create customer profiles based on their longer-term purchasing patterns, and then either use them for targeted mailings or sell them to other businesses (see e-commerce). In scientific applications, observations can be “mined” for clues to phenomena not directly related to the original observation. For example, changes in remote sensor data might be used to track the effects of climate or weather changes. Data-mining techniques can even be applied to the human genome (see bioinformatics).

Trends

Data mining of consumer-related information has emerged as an important application as the volume of e-commerce continues to grow, the amount of data generated by large systems (such as online bookstores and auction sites) increases, and the value of such information to marketers becomes established. However, the use of consumer data for purposes unrelated to the original purchase, often by companies that have no pre-existing business relationship to the consumer, can raise privacy issues. (Data is often rendered anonymous by removing personal identification information before it is mined, but regulations or other ways to assure privacy remain incomplete and uncertain.)

The most controversial applications of data mining are in the area of intelligence and homeland security. Because such applications are often shrouded in secrecy, the public and even lawmakers have difficulty in assessing their value and devising privacy safeguards. According to the Government Accountability Office, as of 2007 some 199 different data-mining programs were in use by at least 52 federal agencies. One of the most controversial is ADVISE (Analysis, Dissemination, Visualization, Insight and Semantic Enhancement), developed by the Department of Homeland Security since 2003. The program purportedly can match and create profiles using government records and users’ Web sites and blogs. Privacy advocates and civil libertarians have raised concerns, and legislation has been introduced that would require all federal agencies to report their data-mining activities to Congress (see also counterterrorism and computers and privacy in the digital age).

data dictionary

A modern enterprise database system can contain hundreds of separate data items, each with important characteristics such as field types and lengths, rules for validating the data, and links to various databases that use that item (see database management system). There can also be many different views or ways of organizing subsets of the data, and stored procedures (program code modules) used to perform various data processing functions. A developer who is creating or modifying applications that deal with such a vast database will often need to check on the relationships between data elements, views, procedures, and other aspects of the system.

One fortunate characteristic of computer science is that many tools can be applied to themselves, often because the contents of a program are themselves a collection of data. Thus, it is possible to create a database that keeps track of the elements of another database. Such a database is sometimes called a data dictionary. A data dictionary system can be developed in the same way as any other database, but many database development systems now contain built-in facilities for generating data dictionary entries as new data items are defined, and for updating definitions as items are linked together and new views or stored procedures are defined. (A similar approach can be seen in some software development systems that create a database of objects defined within programs, in order to preserve information that can be useful during debugging.)
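A minimal sketch of this idea in Python: SQLite maintains a built-in catalog describing the tables it stores, which can be queried like any other data and serves as a rudimentary data dictionary. The table and column names below are invented for illustration.

```python
import sqlite3

# Create an in-memory database with one (illustrative) table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, balance REAL)"
)

# SQLite's own catalog describes each column: name, type, constraints.
# This metadata-about-data is exactly what a data dictionary records.
for cid, name, ctype, notnull, default, pk in conn.execute(
    "PRAGMA table_info(customer)"
):
    print(f"{name}: type={ctype}, primary_key={bool(pk)}")
```

Running this lists one dictionary-style entry per column; a full data dictionary would extend the same idea to views, stored procedures, and cross-database links.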

Data dictionaries are particularly important for creating data warehouses (see data warehouse), which are large collections of data items that are stored together with the procedures for manipulating and analyzing them.

data conversion

The developer of each application program that writes data files must define a format for the data. The format must be able to preserve all the features that are supported by the program. For example, a word processing program will include special codes for font selection, typestyles (such as bold or italic), margin settings, and so on.

Most markets have more than one vendor, so users may encounter the need to convert files such as word processing documents from one vendor's format to another. For example, a Microsoft Word user may need to send a document to a user who has WordPerfect, or to another Microsoft Word user who is running a different version of the program.

There are some ways in which vendors can relieve some of their users' file conversion issues (and thus potential customer dissatisfaction). Vendors often include facilities to read files created by their major rivals' products, and to save files back into those formats. This enables users to exchange files. Sometimes the converted document will look exactly like the original, but in some cases there is no equivalent in one application for a feature (and thus a code) in the other. In that case the formatting or other feature may not carry over into the converted version, or may carry over only partially.

Vendors generally make a new version of an application downwardly compatible with previous versions (see also compatibility and portability). This means that the new version can read files created with the earlier versions. (After all, users would not be happy if none of their existing documents were accessible to their new software!) Similarly, there is usually a way to save a file from the later version in the format of an earlier version, though features added in the later version will not be available in the earlier format.

Another strategy for exchanging otherwise incompatible files is to find some third format that both applications can read. Thus Rich Text Format (RTF), a format that includes most generic document features, is supported by most modern word processors. A user can thus export a file as RTF, and the user of a different program will be able to read it (see rtf). Similarly, many database and other programs can export files as a series of data values separated by commas (comma-delimited files), which can then be read by a different program and converted to its “native” format.
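The comma-delimited exchange described above can be sketched with Python's standard csv module (the record values are invented):

```python
import csv
import io

# Records in one program's "native" form (here, Python tuples).
records = [("Smith", 42, 19.95), ("Jones", 7, 4.5)]

# Export as comma-delimited text...
buf = io.StringIO()
csv.writer(buf).writerows(records)

# ...which a different program can then import. Note that everything
# arrives as text; the importing program converts values to its own types.
buf.seek(0)
imported = [tuple(row) for row in csv.reader(buf)]
print(imported)  # [('Smith', '42', '19.95'), ('Jones', '7', '4.5')]
```

The loss of type information is the usual price of such a lowest-common-denominator format: the importing program must decide for itself which columns are numbers.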

A variety of format conversion utilities are available as either commercial software or shareware. There are also busi-


data compression

The process of removing redundant information from data so that it takes up less space is called data compression. Besides saving disk space, compressing data such as e-mail attachments can make data communications faster.

Compression methods generally begin with the realization that not all characters are found in equal numbers in text. For example, in English, letters such as e and s are found much more frequently than letters such as j or x. By assigning the shortest bit codes to the most common characters and longer codes to the least common characters, the number of bits needed to encode the text can be minimized.

Huffman coding, first developed in 1952, is an algorithm that builds a tree in which the pair of least probable (that is, least common) characters is linked first, then the next least probable pair, and so on until the tree is complete.
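A minimal Huffman-style sketch (the sample word is arbitrary): the two least-frequent nodes are repeatedly merged, and each merge prepends one more bit to the codes beneath it, so rare characters end up deeper in the tree with longer codes.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    # One heap entry per character: [frequency, tiebreaker, {char: code}].
    heap = [[freq, i, {ch: ""}]
            for i, (ch, freq) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)   # least probable subtree
        hi = heapq.heappop(heap)   # next least probable
        merged = {ch: "0" + code for ch, code in lo[2].items()}
        merged.update({ch: "1" + code for ch, code in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], count, merged])
        count += 1
    return heap[0][2]

codes = huffman_codes("mississippi")
# The most frequent letters ('i', 's') receive the shortest codes.
for ch in "misp":
    print(ch, codes[ch])
```

This toy version omits the practicalities of a real codec (serializing the tree alongside the data, handling one-character inputs), but the pairing-by-probability step is the heart of the algorithm.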

Another coding method, arithmetic coding, matches characters’ probabilities to bits in such a way that the same bit can represent parts of more than one encoded character. This is even more efficient than Huffman coding, but the necessary calculations make the method somewhat slower to use.

Another approach to compression is to look for words (or more generally, character strings) that match those found in a dictionary file. The matching strings are replaced by numbers. Since a number is much shorter than a whole word or phrase, this compression method can greatly reduce the size of most text files. (It would not be suitable for files that contain numerical rather than text data, since such data, when interpreted as characters, would look like a random jumble.)

The Lempel-Ziv (LZ) compression method does not use an external dictionary. Instead, it scans the file itself for text strings. Whenever it finds a string that occurred earlier in the text, it replaces the later occurrence with an offset, or count of the number of bytes separating the occurrences. This means that not only common words but common prefixes and suffixes can be replaced by numbers. A variant of this scheme does not use offsets into the file itself, but compiles repeated strings into a dictionary and replaces them in the text with an index to their position in the dictionary.
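A toy LZ-style compressor along these lines (a simplified sketch, not any production LZ variant): repeated strings become (offset, length) back-references to earlier text, and everything else is emitted as literal characters.

```python
def lz_compress(text, min_match=3):
    # Scan the already-seen text for the longest earlier occurrence of
    # the string starting at `pos`; emit a back-reference or a literal.
    out, pos = [], 0
    while pos < len(text):
        best_len, best_start = 0, 0
        for start in range(pos):
            length = 0
            while (pos + length < len(text) and start + length < pos
                   and text[start + length] == text[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_start = length, start
        if best_len >= min_match:
            out.append((pos - best_start, best_len))  # (offset back, run length)
            pos += best_len
        else:
            out.append(text[pos])                     # literal character
            pos += 1
    return out

def lz_decompress(tokens):
    s = ""
    for t in tokens:
        if isinstance(t, tuple):
            offset, length = t
            s += s[len(s) - offset : len(s) - offset + length]
        else:
            s += t
    return s

data = "the cat and the hat and the bat"
tokens = lz_compress(data)
assert lz_decompress(tokens) == data   # lossless round trip
```

Real LZ implementations bound the search window, pack the references into bits rather than Python tuples, and allow matches to overlap the current position; this sketch keeps only the central scan-and-replace idea.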

Graphics files can often be greatly compressed by replacing large areas of the same color (such as a blue sky) with a number indicating the count of pixels with that value. However, some graphics file formats such as GIF are already compressed, so further compression will not shrink them much.
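This run-length idea can be sketched as follows (pixel values are simplified to color names for illustration):

```python
def rle_encode(pixels):
    # Collapse runs of identical pixels into [count, value] pairs.
    runs = []
    for p in pixels:
        if runs and runs[-1][1] == p:
            runs[-1][0] += 1
        else:
            runs.append([1, p])
    return runs

# A row of sky: 2,000 pixels shrink to three [count, color] pairs.
sky = ["blue"] * 1000 + ["white"] * 5 + ["blue"] * 995
print(rle_encode(sky))  # [[1000, 'blue'], [5, 'white'], [995, 'blue']]
```

The same row applied to photographic data with little pixel-to-pixel repetition would compress poorly, which is why run-length encoding suits flat-color images best.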

More exotic compression schemes for graphics can use fractals or other iterative mathematical functions to encode patterns in the data. Most such schemes are “lossy” in that some of the information (and thus image texture) is lost, but the loss may be acceptable for a given application. Lossy compression schemes are not used for binary (numeric data or program code) files, because errors introduced in a program file are likely to affect the program's performance (if not “break” it completely). Though they may have less serious consequences, errors in text are also generally considered unacceptable.

Trends

There are a variety of compression programs used on unix systems, but variants of the Zip program are now the overwhelming favorite on Windows-based systems. Zip combines compression and archiving. Archiving, or the bundling together of many files into a single file, contributes a further reduction in size. This is because files in most file systems must occupy a whole number of disk sectors, even if that means wasting most of a sector. Combining many files into one means that, in total, at most slightly less than one sector is wasted.
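Python's standard zipfile module illustrates the combination of compression and archiving; the file names and contents below are invented.

```python
import io
import zipfile

# Bundle two (made-up) text files into one compressed archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("report.txt", "quarterly numbers " * 200)
    zf.writestr("notes.txt", "meeting notes " * 200)

# Highly repetitive text compresses dramatically.
with zipfile.ZipFile(buf) as zf:
    for info in zf.infolist():
        print(f"{info.filename}: {info.file_size} -> {info.compress_size} bytes")
```

Because both files live inside a single archive, only the archive itself pays the partial-sector cost on disk.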

data communications

Broadly speaking, data communications is the transfer of data between computers and their users. At its most abstract level, data communications requires two or more comput-ers, a device to turn data into electronic signals (and back again), and a transmission medium. Telephone lines, fiber optic cable, network (Ethernet) cable, video cable, radio (wireless), or other kinds of links can be used. Finally, there must be software that can manage the flow of data.

Until recently, the modem was the main device used to connect personal computers to information services or networks (see modem). In general, data being sent over a communications link must be sent one bit at a time (this is called serial transmission, and is why an external modem is connected to a computer's serial port). However, most phone cables and other links are multiplexed, meaning that they carry many channels (with many streams of data bits) at the same time.

To properly recognize data in a bit stream coming over a link, the transmission system must use some method of flow control and have some way to detect errors (see error correction). Typically, the data is sent as groups or “frames” of bits. The frame includes a checksum that is verified by the receiver. If the expected and actual sums don't match, the recipient sends a “negative acknowledgment” message to the sender, which will retransmit the data. In the original system, the sender waited until the recipient acknowledged each frame before sending the next, but modern protocols allow the sender to keep sending while frames already received are waiting to be checked.
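The frame-and-checksum scheme can be sketched with a toy one-byte checksum (real protocols use stronger checks, such as CRCs):

```python
def make_frame(payload: bytes) -> bytes:
    # Append a one-byte checksum: the sum of the payload bytes mod 256.
    return payload + bytes([sum(payload) % 256])

def receive(frame: bytes):
    payload, checksum = frame[:-1], frame[-1]
    if sum(payload) % 256 != checksum:
        return None   # mismatch: send a negative acknowledgment, await retransmission
    return payload

frame = make_frame(b"hello")
print(receive(frame))                             # b'hello'

corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # one bit flipped in transit
print(receive(corrupted))                         # None
```

A simple modular sum catches many single-bit errors but misses others (for example, two errors that cancel out), which is why practical links prefer CRCs.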

The actual transmission of data over a line can be considered the lowest level of the data communications scheme. Above that is the packaging of data as used and interpreted by software. Unless two computers are directly connected, the data is sent over a network, either a local area network (LAN) or a wide area network such as the global Internet. A network consists of interconnected nodes that include switches or routers that direct data to its destination (see network). Networks such as the Internet use packet switching: Data is sent as individual packets that contain a “chunk” of data, an address, and an indication of where the data fits within the message as a whole. The packets are routed at the routers using software that tries to find the fastest link to the destination. When the packets arrive at the destination, they are reassembled into the original message.
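The splitting and reassembly can be sketched as follows (the destination address is a documentation-reserved example, and real packet headers carry far more than these three fields):

```python
import random

def to_packets(message: str, size: int, dest: str):
    # Split a message into packets: (destination, sequence number, chunk).
    return [(dest, seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def reassemble(packets):
    # Packets may arrive in any order; sequence numbers restore the message.
    return "".join(chunk for _, _, chunk in sorted(packets, key=lambda p: p[1]))

packets = to_packets("Data is sent as individual packets.", 8, "192.0.2.7")
random.shuffle(packets)     # the network may deliver packets out of order
print(reassemble(packets))  # Data is sent as individual packets.
```

The sequence number is the “indication of where the data fits within the message as a whole” mentioned above; it is what lets independently routed packets be put back together.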

Applications

Data communications are the basis both for networks and for the proper functioning of servers that provide services such as World Wide Web pages, electronic mail, online databases, and multimedia content (such as audio and streaming video). While Web page design and e-commerce are the “bright lights” that give cyberspace its character, data communications are like the plumbing without which computers cannot work together. The growing demand for data communications, particularly broadband services such as DSL and cable modems, translates into a steady demand for engineers and technicians specializing in the maintenance and growth of this infrastructure (see broadband).

Besides keeping up with the exploding demand for more and faster data communications, the biggest challenge for data communications in the early 21st century is the integration of so many disparate methods of communication. A user may be using an ordinary phone line (19th-century technology) to connect to the Internet, while the phone company switches might be a mixture of 1970s or later technology. The same user might go to the workplace and use fast Ethernet cables over a local network, or connect to the Internet through DSL, an enhanced phone line. Traveling home, the user might use a personal digital assistant (PDA) with a wireless link to make a restaurant reservation (see wireless computing). The user wants all these services to be seamless and essentially interchangeable, but today data communications is more like roads in the early days of the automobile—a few fast paved roads here and there, but many bumpy dirt paths.

database administration

Database administration is the management of database systems (see database management system). Database administration can be divided into four broad areas: data security, data integrity, data accessibility, and system development.

Data Security

With regard to databases, ensuring data security includes the assignment and control of users' level of access to sensitive data and the use of monitoring tools to detect compromise, diversion, or unauthorized changes to database files (see data security). When data is proprietary, licensing agreements with both database vendors and content providers may also need to be enforced.

Data Integrity

Data integrity is related to data security, since the completeness and accuracy of data that has been compromised can no longer be guaranteed. However, data integrity also requires the development and testing of procedures for the entry and verification of data (input) as well as verifying the accuracy of reports (output). Database administrators may do some programming, but generally work with the programming staff in maintaining data integrity. Since most data in computers ultimately comes from human beings, the training of operators is also important.

Within the database structure itself, the links between data fields must be maintained (referential integrity) and a locking system must be employed to ensure that a new update is not processed while a pending one is incomplete (see transaction processing).

Internal procedures and external regulations may require that a database be periodically audited for accuracy. While this may be the province of a specially trained information processing auditor, it is often added to the duties of the database administrator. (See also auditing in data processing.)

Data Accessibility

Accessibility has two aspects. First, the system must be reliable. Data must be available whenever needed by the organization, and in many applications such as e-commerce, this means 24 hours a day, 7 days a week (24/7). Reliability requires making the system as robust as possible, such as by “mirroring” the database on multiple servers (which in turn requires making sure updates are stored concurrently). Failure must also be planned for, which means the implementation of onsite and offsite backups and procedures for restoring data (see backup and archive systems).


data acquisition

There are a variety of ways in which data (facts or measurements about the world) can be turned into a digital representation suitable for manipulation by a computer. For example, pressing a key on the keyboard sends a signal that is stored in a memory buffer using a value that represents the ASCII character code for the key pressed. Moving the mouse sends a stream of signals proportional to the rotation of the ball, which in turn is calibrated into a series of coordinates and ultimately to a position on the screen where the cursor is to be moved. Digital cameras and scanners convert the varying light levels of what they “see” into a digital image.

Besides the devices that are familiar to most computer users, there are many specialized data acquisition devices (DAQs). Indeed, most instruments used in science and engineering to measure physical characteristics are now designed to convert their readings into digital form. (Sometimes the instrument includes a processor that provides a representation of the data, such as a waveform or graph. In other cases, the data is sent to a computer for processing and display.)

Components of a Data Acquisition System

The data acquisition system begins with a transducer, which is a device that converts a physical phenomenon (such as heat) into a proportional electrical signal. Transducers include devices such as thermistors, thermocouples, and pressure or strain gauges. The output of the transducer is then fed into a signal conditioning circuit. The purpose of signal conditioning is to make sure the signal fits into the range needed by the data processing device. Thus the signal may be amplified, or its voltage may be adjusted or scaled to the required level. Another function of signal conditioning is to isolate the incoming signal from the computer to which the acquisition device is connected. This is necessary both to protect the delicate computer circuits from possible “spikes” in the incoming signal and to prevent “noise” (extraneous electromagnetic signals created by the computer itself) from distorting the signal, and thus the ultimate measurements. Various sorts of filters can be added for this purpose.

The conditioned signal is fed as an analog input into the data acquisition device, which is often a board inserted into a personal computer. The purpose of the board is to sample the signal and turn it into a stream of digital data. The digital data is stored in a buffer (either on the board or in the computer's main memory). Software then takes over, analyzing the data and creating appropriate displays (such as digital readings, graphs, or warning signals) as configured by the user. If the data is being displayed in real time, the speed of the software, the operating system, and the computer's clock speed may become significant (see clock speed).

Performance Considerations

The sampling rate, or the number of times the signal is measured per second, is of fundamental importance. A higher sampling rate usually means a more accurate representation of the physical data (thus audio sampled at higher rates sounds more “natural”). The faster the sampling rate, the larger the amount of data to be processed and the greater the amount of computer resources needed. Thus, picking a sampling rate usually involves a tradeoff between accuracy and speed (for a real-time application, data must be processed fast enough that whoever is using it can respond to it as it comes in).

Three internal factors determine the performance of a DAQ. The resolution is the number of bits available to quantify each measurement. Clearly the ability to measure thousands of voltage levels is useless if the resolution of a system is only 8 bits (256 possible values). The range is the distance between the minimum and maximum voltage levels the DAQ can recognize. If a signal must be “squeezed” into too narrow a range, a corresponding amount of resolution will be lost. Finally, there is the gain, or the ratio between changes in the measured quantity and changes in the signal strength.
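The interplay of resolution and range can be illustrated with an idealized quantization sketch (the voltage figures are invented):

```python
def quantize(voltage, v_min, v_max, bits):
    # Map a voltage onto one of 2**bits discrete levels across the range.
    levels = 2 ** bits
    step = (v_max - v_min) / levels   # smallest distinguishable change
    code = min(int((voltage - v_min) / step), levels - 1)
    return code, step

# An 8-bit DAQ over a 0-10 V range resolves steps of about 39 mV;
# the same 256 codes over a 0-1 V range resolve ~3.9 mV steps instead.
print(quantize(5.0, 0.0, 10.0, 8))   # (128, 0.0390625)
print(quantize(0.5, 0.0, 1.0, 8))    # (128, 0.00390625)
```

This is why matching the signal to the range matters: a 1 V signal fed into the 10 V configuration uses only a tenth of the available codes, effectively throwing away resolution.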

Applications

Data acquisition systems are essential to gathering and processing the detailed data required by scientific and engineering applications. The automated control of chemical or biochemical processes requires the ability of the control software to assess real-time physical data in order to make timely adjustments to such factors as temperature, pressure, and the presence of catalysts, inhibitors, or other components of the process. The highly automated systems used in modern aviation and, increasingly, even in ground vehicles depend on real-time data acquisition. It is not surprising, then, that data acquisition is one of the fastest-growing fields in computing.