Tuesday, 29 October 2013

data warehouse

Modern business organizations create and store a tremendous amount of data in the form of transactions that become database records. Increasingly, however, businesses are relying on their ability to use data that was collected for one purpose (such as sales, customer service, and inventory) for purposes of marketing research, planning, or decision support. For example, transaction data might be revisited with a view to identifying the common characteristics of the firm’s best customers or determining the best way to market a particular type of product. In order to conduct such research or analysis, the data collected in the course of business must be stored in such a way that it is both accurate and flexible in terms of the number of different ways in which it can be queried. The idea of the data warehouse is to provide such a repository for data.

When data is used for particular purposes such as sales or inventory control, it is usually structured in records where certain fields (such as stock number or quantity) are routinely processed. It is not so easy to ask a different question such as “which customers who bought this product from us also bought this other product within six months of their first purchase?” One way to make it easier to query data in new ways is to store the data not in records but in multidimensional arrays where, for example, one dimension might be product numbers and another categories of customers. This approach, called Online Analytical Processing (OLAP), makes it possible to extract a large variety of relationships without being limited by the original record structure.
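The array idea can be illustrated with a minimal Python sketch. It assumes a handful of made-up transaction records and hypothetical names (`records`, `build_cube`, `slice_by_product`); a real OLAP system would use far more dimensions and dedicated storage, so this only shows the shape of the approach.

```python
from collections import defaultdict

# Hypothetical transaction records: (customer_category, product_number, quantity).
records = [
    ("retail", "P100", 2),
    ("retail", "P200", 1),
    ("wholesale", "P100", 40),
    ("wholesale", "P100", 10),
    ("online", "P200", 3),
]


def build_cube(rows):
    """Aggregate quantities into a two-dimensional array.

    One dimension is the product number, the other the customer category.
    The array is stored sparsely as a dict keyed by (product, category).
    """
    cube = defaultdict(int)
    for category, product, quantity in rows:
        cube[(product, category)] += quantity
    return cube


def slice_by_product(cube, product):
    """Fix the product dimension and return totals per customer category."""
    return {
        category: quantity
        for (prod, category), quantity in cube.items()
        if prod == product
    }


cube = build_cube(records)
p100_by_category = slice_by_product(cube, "P100")
```

Once the data is arranged this way, a question such as "how is product P100 distributed across customer categories?" becomes a lookup along one dimension rather than a scan over the original record layout.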

Implementation

The key in designing a data warehouse is to provide a way for researchers using analytical tools (such as statistics programs) to access the raw data in the underlying database. Software using query languages such as SQL can serve as such a link. Thus, the researcher can define a query using the many dimensions of the data array, and the OLAP software (also called middleware) translates this query into the appropriate combination of queries against the underlying relational database.
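As a rough illustration of that translation step, the sketch below turns a list of requested dimensions into a single SQL `GROUP BY` query and runs it against an in-memory SQLite database. The table `sales`, its columns, the sample rows, and the function `dimensional_query` are all assumptions made for this example; actual OLAP middleware is considerably more elaborate.

```python
import sqlite3

# Hypothetical dimensions available in the assumed "sales" table.
ALLOWED_DIMENSIONS = ("product", "category", "region")


def dimensional_query(conn, dimensions):
    """Translate a list of dimensions into SQL and return (sql, rows).

    Dimension names are checked against a fixed list before being placed
    in the SQL text, since column names cannot be bound as parameters.
    """
    if not dimensions:
        raise ValueError("at least one dimension is required")
    unknown = [d for d in dimensions if d not in ALLOWED_DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    cols = ", ".join(dimensions)
    sql = (
        f"SELECT {cols}, SUM(quantity) FROM sales "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return sql, conn.execute(sql).fetchall()


# Build a small relational table to stand in for the underlying database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (product TEXT, category TEXT, region TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("P100", "retail", "east", 2),
        ("P100", "wholesale", "east", 40),
        ("P100", "wholesale", "west", 10),
        ("P200", "retail", "west", 1),
    ],
)

sql, rows = dimensional_query(conn, ["product", "category"])
```

The researcher names only the dimensions of interest; the generated SQL text and the relational table layout stay hidden behind the function, which is the role the middleware plays in the description above.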

The data warehouse is closely related to the concept of data mining. In fact, data mining can be viewed as the exploitation of the collection of views, queries, and other elements that can be generated using the data warehouse as the infrastructure (see data mining).

