
Saturday, 22 November 2014

ABILITIES

Abilities represent an individual’s capacity to perform a wide range of tasks. They are believed to be somewhat stable traits or attributes but can be developed or refined over time. There are numerous types of abilities, which can be classified into four categories. Reasoning, judging, reading, writing, mathematical reasoning, and related capabilities reflect mental capacity or cognitive ability. Abilities related to muscular activities and bodily movement are labeled psychomotor abilities. Examples of psychomotor abilities are reaction time, reaction speed, precision, coordination, and dexterity. Sensory or perceptual abilities, such as visual and auditory abilities, relate to detecting and recognizing stimuli. Physical abilities refer to muscular strength, cardiovascular endurance, and movement quality. The abilities that are most useful in explaining career-related outcomes are cognitive abilities, which are thus the focus of this entry.

Interest in the human intellect and cognitive ability has existed since these concepts were first introduced by Alfred Binet and Charles Spearman around 1904. Some of the first applications of this knowledge took place in the U.S. Army, through tests that assessed cognitive ability. The Army Alpha and Army Beta tests of general cognitive ability were used during World War I to select army recruits. Cognitive ability tests evolved over the years to assess different dimensions of cognitive ability, such as verbal, spatial, and quantitative ability. These tests were used during World War II to select and classify aircrew. Subsequent years saw the application of cognitive ability tests to civilian employment settings and guidance counseling in school settings.

One hundred years of interest in cognitive abilities have seen numerous issues arise related to the definition and measurement of cognitive ability. One issue that has received widespread attention is the way the different cognitive abilities are structured. There are competing viewpoints as to whether cognitive abilities are best represented by flat or hierarchical models. There is a general consensus that abilities have a hierarchical structure, although there is still some disagreement on the number of levels in the hierarchy. However, in all hierarchical models, general intelligence or general mental ability is the highest-order factor, often labeled the general factor. Different tests designed to assess cognitive ability share some common variance. The general factor forms the largest component of most tests of cognitive ability and explains the greatest common variance among these tests. Hence, most tests of cognitive ability assess general mental ability or the general factor. Second-order factors in the hierarchical structure of cognitive abilities have been labeled quantitative or numerical abilities, spatial or mechanical abilities, and verbal or linguistic abilities. A number of standardized tests used in educational settings assess these second-order ability factors. Third-order factors are defined by more specialized abilities. Examples of specialized abilities that reflect the verbal factor are oral and written comprehension and expression, deductive and inductive reasoning, and information gathering. Specialized abilities in the quantitative factor are mathematical reasoning and number facility. Specialized abilities of the spatial factor include spatial organization or visualization.

Given the hierarchical structure of cognitive ability, there has been an interest in understanding whether the more specialized abilities are more useful than general cognitive ability or the general factor in explaining important individual outcomes. Tests of specialized abilities have been used to determine whether these abilities are more effective than the general factor in predicting success in a given job. Although specialized abilities explain unique variation in an individual’s performance in a job, the amount of unique variance explained by specialized abilities is minimal and rarely exceeds 3 percent over and above general mental ability. Thus, most tests of cognitive ability used for personnel selection are designed to assess general mental ability rather than specialized abilities.

Cognitive ability is an important and meaningful construct insofar as it satisfies criteria for scientific significance. One criterion is that the items in a test of cognitive ability demonstrate acceptable levels of internal consistency reliability. The second criterion is that cognitive ability be useful in explaining important individual outcomes. Numerous tests of cognitive ability demonstrate acceptable levels of internal consistency reliability, suggesting that the questions on these tests measure the same construct. Research overwhelmingly indicates that cognitive ability explains significant variation in numerous outcomes of life, including work. A full understanding of how cognitive ability influences life, work, and career outcomes first necessitates a discussion of its influence in skill acquisition and school settings.

Individuals’ cognitive abilities impact their capacity to learn. Skill acquisition is an important first step in learning how to perform novel tasks. Skill acquisition involves three phases: cognitive, associative, and autonomous. Cognitive ability is most influential during the first phase (cognitive), which occurs when individuals are first confronted with the novel task. Other abilities are relevant during different phases of skill acquisition. The first phase places the greatest demands on the individual’s cognitive resources. Performance on the task increases with practice and at a faster rate for individuals high in cognitive ability. However, the important role that cognitive ability plays in the process of skill acquisition depends on the type of task that an individual is trying to learn. Tasks can be either consistent or inconsistent. Consistent tasks can be learned and become automatic or routine with practice (e.g., fast and effortless). The correlation between cognitive ability and performance decreases after practice on consistent tasks. In contrast, inconsistent tasks retain their novelty over time and do not become routine. The relationship between cognitive ability and performance remains positive and strong even after practice on inconsistent tasks. Thus, the degree of association between cognitive ability and performance depends on the consistency of the task and the phase of skill acquisition.

The school setting is one of the first settings in which individuals acquire basic skills and develop these skills to levels that permit them to function effectively in society. The early years of formal education are the most formative years, in which the basic skills of reading, writing, and arithmetic are acquired and developed. Subsequent education in high school, college, and graduate studies provides for the development of more specialized skills and the acquisition of specialized knowledge. General mental ability influences an individual’s achievement in each of these stages of academic development. More specifically, general mental ability has been shown to explain approximately 36 percent to 49 percent of the variation in course grades in elementary school, 25 percent to 36 percent in high school, and 16 percent to 25 percent in college. Furthermore, general mental ability has been shown to explain significant variation in numerous indicators of success in graduate studies. For example, an individual’s score on tests of general mental ability predicts attainment of the graduate degree, the time required to complete the degree, research productivity, scores on comprehensive examinations, and general performance ratings by faculty members.

Success in the academic arena equips individuals with the required knowledge and skills to function in society at large. General mental ability continues to explain variation in life outcomes even after school. The world of work is another setting in which general mental ability has proven its utility. Numerous years of research have demonstrated that general mental ability predicts job performance in both civilian and military settings. A full understanding of the extent to which cognitive ability relates to job performance requires knowledge of the different components or categories of job performance on which employees are measured. One category represents the core functions or duties of a job and has been labeled task performance. A second category, labeled organizational citizenship, contextual, or extra-role performance, represents nontask behaviors that contribute in a positive way to the organization. A third category, labeled counterproductive work behavior, represents negative behaviors that detract from the goals of the organization. Cognitive ability is a stronger predictor of task performance than of organizational citizenship behavior or of counterproductive work performance. However, it is a significant predictor of all three categories of job performance. Furthermore, cognitive ability is a stronger predictor of objective measures of performance (e.g., dollar sales) than subjective measures (e.g., supervisory ratings) but is a significant predictor of both.

The predictive power or utility of cognitive ability also depends on the complexity of the job. Jobs at the highest level of complexity include professional, scientific, and upper-management jobs, which account for approximately 14 percent of the jobs in the economy of the United States. Cognitive ability explains the greatest percentage of variation in performance in these jobs, approximately 34 percent. Cognitive ability explains the smallest percentage of variation in job performance, approximately 5 percent, in jobs at the lowest level of complexity, which account for approximately 2.5 percent of the jobs in the U.S. economy. Thus, the utility of cognitive ability is directly related to the information-processing requirements of the job. Cognitive ability is more useful in predicting job performance for cognitively complex jobs.

An important consideration for any construct that is used to make decisions about selection into educational programs or occupational settings is that the construct, in this case cognitive ability, is fair or is not inadvertently biased against members of different racial, ethnic, or gender groups. Concerns about racial bias have haunted cognitive ability tests for years. These concerns focus on the language and structure of cognitive ability tests. One popular model for assessing predictive bias in employment tests is called the regression model. An application of this model requires data on cognitive ability test scores for minority and majority group members and ratings of job performance for these members. Regressions predicting job performance from ability test scores are computed separately for majority and minority group members. These regression lines are then tested for significant differences in their slopes, intercepts, and error variances. Predictive bias is said to occur if the slope is significantly smaller for minority group members relative to majority group members. Research

Tuesday, 18 November 2014

OPTIMIZER TOOLS AND INFORMATION

Aerodynamics, electronics, chemistry, biochemistry, planning, and business are just a few of the fields in which optimization plays a role. Because optimization is of interest to so many problem-solving areas, research goes on everywhere, information is abundant, and optimization tools proliferate. Where can this information be found? What tools and products are available?

Brute force optimizers are usually buried in software packages aimed primarily at tasks other than optimization; they are usually not available on their own. In the world of trading, products like TradeStation and SuperCharts from Omega Research (800-292-3453), Excalibur from Futures Truth (828-697-0273), and MetaStock from Equis International (800-882-3040) have built-in brute force optimizers. If you write your own software, brute force optimization is so trivial to implement using in-line programming code that the use of special libraries or components is superfluous. Products and code able to carry out brute force optimization may also serve well for user-guided optimization.

Although sometimes appearing as built-in tools in specialized programs, genetic optimizers are more often distributed in the form of class libraries or software components, add-ons to various application packages, or stand-alone research instruments. As an example of a class library written with the component paradigm in mind, consider OptEvolve, the C++ genetic optimizer from Scientific Consultant Services (516-696-3333): This general-purpose genetic optimizer implements several algorithms, including differential evolution, and is sold in the form of highly portable C++ code that can be used in UNIX/LINUX, DOS, and Windows environments. TS-Evolve, available from Ruggiero Associates (800-211-9785), gives users of TradeStation the ability to perform full-blown genetic optimizations. The Evolver, which can be purchased from Palisade Corporation (800-432-7475), is a general-purpose genetic optimizer for Microsoft’s Excel spreadsheet; it comes with a dynamic link library (DLL) that can provide genetic optimization services to user programs written in any language able to call DLL functions. GENESIS, a stand-alone instrument aimed at the research community, was written by John Grefenstette of the Naval Research Laboratory; the product is available in the form of generic C source code. While genetic optimizers can occasionally be found in modeling tools for chemists and in other specialized products, they do not yet form a native part of popular software packages designed for traders.

Information about genetic optimization is readily available. Genetic algorithms are discussed in many books, magazines, and journals and on Internet newsgroups. A good overview of the field of genetic optimization can be found in the Handbook of Genetic Algorithms (Davis, 1991). Price and Storn (1997) described an algorithm for “differential evolution,” which has been shown to be an exceptionally powerful technique for optimization problems involving real-valued parameters. Genetic algorithms are currently the focus of many academic journals and conference proceedings. Lively discussions on all aspects of genetic optimization take place in several Internet newsgroups, of which comp.ai.genetic is the most noteworthy.

A basic exposition of simulated annealing can be found in Numerical Recipes in C (Press et al., 1992), as can C functions implementing optimizers for both combinatorial and real-valued problems. Neural, Novel & Hybrid Algorithms for Time Series Prediction (Masters, 1995) also discusses annealing-based optimization and contains relevant C++ code on the included CD-ROM. Like genetic optimization, simulated annealing is the focus of many research studies, conference presentations, journal articles, and Internet newsgroup discussions. Algorithms and code for conjugate gradient and variable metric optimization, two fairly sophisticated analytic methods, can be found in Numerical Recipes in C (Press et al., 1992) and Numerical Recipes (Press et al., 1986). Masters (1995) provides an assortment of analytic optimization procedures in C++ (on the CD-ROM that comes with his book), as well as a good discussion of the subject.

Additional procedures for analytic optimization are available in the IMSL and NAG libraries (from Visual Numerics, Inc., and the Numerical Algorithms Group, respectively) and in the optimization toolbox for MATLAB (a general-purpose mathematical package from The MathWorks, 508-647-7000, that has gained popularity in the financial engineering community). Finally, Microsoft’s Excel spreadsheet contains a built-in analytic optimizer, the Solver, which employs conjugate gradient or Newtonian methods.

As a source of general information about optimization applied to trading system development, consult Design, Testing and Optimization of Trading Systems by Robert Pardo (1992). Among other things, this book shows the reader how to optimize profitably, how to avoid undesirable curve-fitting, and how to carry out walk-forward tests.

ALTERNATIVES TO TRADITIONAL OPTIMIZATION

There are two major alternatives to traditional optimization: walk-forward optimization and self-adaptive systems. Both of these techniques have the advantage that any tests carried out are, from start to finish, effectively out-of-sample. Examine the performance data, run some inferential statistics, plot the equity curve, and the system is ready to be traded. Everything is clean and mathematically unimpeachable. Corrections for shrinkage or multiple tests, worries over excessive curve-fitting, and many of the other concerns that plague traditional optimization methodologies can be forgotten. Moreover, with today’s modern computer technology, walk-forward and self-adaptive models are practical and not even difficult to implement.

The principle behind walk-forward optimization (also known as walk-forward testing) is to emulate the steps involved in actually trading a system that requires periodic optimization. It works like this: Optimize the system on data points 1 through M. Then simulate trading on data points M + 1 through M + K. Reoptimize the system on data points K + 1 through K + M. Then simulate trading on points (K + M) + 1 through (K + M) + K. Advance through the data series in this fashion until no more data points are left to analyze. As should be evident, the system is optimized on a sample of historical data and then traded. After some period of time, the system is reoptimized and trading is resumed. The sequence of events guarantees that the data on which trades take place is always in the future relative to the optimization process; all trades occur on what is, essentially, out-of-sample data. In walk-forward testing, M is the look-back or optimization window and K the reoptimization interval.
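
For concreteness, here is a minimal C++ sketch of the loop just described (bars are indexed from zero here; optimizeOn and simulateTradingOn are simple stand-ins for whatever optimizer and trading simulator are actually being used):

#include <cstdio>

// Stand-ins for a real optimizer and trading simulator.
void optimizeOn(int first, int last)        { std::printf("optimize on bars %d..%d\n", first, last); }
void simulateTradingOn(int first, int last) { std::printf("trade bars %d..%d (out-of-sample)\n", first, last); }

// Walk-forward loop: M is the look-back (optimization) window,
// K is the reoptimization interval. Each pass fits on M bars and then
// trades the next K bars, which the optimizer never saw.
void walkForward(int nBars, int M, int K) {
    for (int start = 0; start + M + K <= nBars; start += K) {
        optimizeOn(start, start + M - 1);
        simulateTradingOn(start + M, start + M + K - 1);
    }
}

int main() { walkForward(1000, 200, 50); }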

Self-adaptive systems work in a similar manner, except that the optimization or adaptive process is part of the system, rather than the test environment. As each bar or data point comes along, a self-adaptive system updates its internal state (its parameters or rules) and then makes decisions concerning actions required on the next bar or data point. When the next bar arrives, the decided-upon actions are carried out and the process repeats. Internal updates, which are how the system learns about or adapts to the market, need not occur on every single bar. They can be performed  at fixed intervals or whenever deemed necessary by the model.
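
A bare-bones C++ skeleton of the idea might look like the following; everything named here is a hypothetical placeholder rather than code from any particular package:

#include <vector>

// Skeleton of a self-adaptive system: the adaptive step is part of the
// system itself rather than of the test harness.
struct SelfAdaptiveSystem {
    int updateInterval;        // adapt every this many bars (could instead be event-driven)
    int barsSinceUpdate = 0;

    void onBar(const std::vector<double>& history) {
        if (++barsSinceUpdate >= updateInterval) {
            adapt(history);            // re-estimate parameters or rules from the data seen so far
            barsSinceUpdate = 0;
        }
        decideNextAction(history);     // orders to be carried out on the next bar
    }

    void adapt(const std::vector<double>&) { /* internal optimization goes here */ }
    void decideNextAction(const std::vector<double>&) { /* emit buy, sell, or hold for the next bar */ }
};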

The trader planning to work with self-adapting systems will need a powerful, component-based development platform that employs a strong language, such as C++, Object Pascal, or Visual Basic, and that provides good access to third-party libraries and software components. Components are designed to be incorporated into user-written software, including the special-purpose software that constitutes an adaptive system. The more components that are available, the less work there is to do. At the very least, a trader venturing into self-adaptive systems should have at hand genetic optimizer and trading simulator components that can be easily embedded within a trading model. Adaptive systems will be demonstrated in later chapters, showing how this technique works in practice.

There is no doubt that walk-forward optimization and adaptive systems will become more popular over time as the markets become more efficient and difficult to trade, and as commercial software packages become available that place these techniques within reach of the average trader.

Verification of Results

After optimizing the rules and parameters of a trading system to obtain good behavior on the development or in-sample data, but before risking any real money, it is essential to verify the system’s performance in some manner. Verification of system performance is important because it gives the trader a chance to veto failure and embrace success: Systems that fail the test of verification can be discarded; ones that pass can be traded with confidence. Verification is the single most critical step on the road to success with optimization or, in fact, with any other method of discovering a trading model that really works.

To ensure success, verify any trading solution using out-of-sample tests or inferential statistics, preferably both. Discard any solution that fails to be profitable in an out-of-sample test: It is likely to fail again when the rubber hits the road. Compute inferential statistics on all tests, both in-sample and out-of-sample. These statistics reveal the probability that the performance observed in a sample reflects something real that will hold up in other samples and in real-time trading. Inferential statistics work by making probability inferences based on the distribution of profitability in a system’s trades or returns. Be sure to use statistics that are corrected for multiple tests when analyzing in-sample optimization results. Out-of-sample tests should be analyzed with standard, uncorrected statistics. Such statistics appear in some of the performance reports that are displayed in the chapter on simulators. The use of statistics to evaluate trading systems is covered in depth in the following chapter. Develop a working knowledge of statistics; it will make you a better trader.
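
As one simple example of the kind of inference involved, the C++ sketch below computes a t-statistic for the hypothesis that the mean per-trade profit is zero; the result would be compared against a Student's t distribution with n - 1 degrees of freedom, and any correction for multiple tests would be applied separately (the profit figures are purely illustrative):

#include <cmath>
#include <cstdio>
#include <vector>

// t-statistic for the null hypothesis that the mean trade profit is zero.
double tStatistic(const std::vector<double>& tradeProfits) {
    const double n = static_cast<double>(tradeProfits.size());
    double sum = 0.0, sumSq = 0.0;
    for (double p : tradeProfits) { sum += p; sumSq += p * p; }
    const double mean = sum / n;
    const double variance = (sumSq - n * mean * mean) / (n - 1.0);   // sample variance
    return mean / std::sqrt(variance / n);                           // t with n - 1 degrees of freedom
}

int main() {
    std::vector<double> profits = {120.0, -80.0, 45.0, 200.0, -60.0, 95.0};   // illustrative per-trade profits
    std::printf("t = %.3f on %zu trades\n", tStatistic(profits), profits.size());
}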

Some suggest checking a model for sensitivity to small changes in parameter values. A model highly tolerant of such changes is more “robust” than a model not as tolerant, it is said. Do not pay too much attention to these claims. In truth, parameter tolerance cannot be relied upon as a gauge of model robustness. Many extremely robust models are highly sensitive to the values of certain parameters. The only true arbiters of system robustness are statistical and, especially, out-of-sample tests.

Few Rules and Parameters

To achieve success, limit the number of free rules and parameters, especially when working with small data samples. For a given sample size, the fewer the rules or parameters to optimize, the greater the likelihood that a trading system will maintain its performance in out-of-sample tests and real-time trading. Although several dozen parameters may be acceptable when working with several thousand trades taken on 100,000 1-minute bars (about 1 year for the S&P 500 futures), even two or three parameters may be excessive when developing a system using a few years of end-of-day data. If a particular model requires many parameters, then significant effort should be put into assembling a mammoth sample (the legendary Gann supposedly went back over 1,000 years in his study of wheat prices). An alternative that sometimes works is optimizing a trading model on a whole portfolio, using the same rules and parameters across all markets, a technique used extensively in this book.

Large, Representative Samples

As suggested earlier, failure is often a consequence of presenting an optimizer with the wrong problem to solve. Conversely, success is likely when the system is optimized on data from the near future, the data that will actually be traded; do that and watch the profits roll in. The catch is where to find tomorrow’s data today. Since the future has not yet happened, it is impossible to present the optimizer with precisely the problem that needs to be solved. Consequently, it is necessary to attempt the next-best alternative: to present the optimizer with a broader problem, the solution to which should be as applicable as possible to the actual, but impossible-to-solve, problem. One way to accomplish this is with a data sample that, even though not drawn from the future, embodies many characteristics that might appear in future samples. Such a data sample should include bull and bear markets, trending and nontrending periods, and even crashes. In addition, the data in the sample should be as recent as possible so that it will reflect current patterns of market behavior. This is what is meant by a representative sample. As well as representative, the sample should be large. Large samples make it harder for optimizers to uncover spurious or artifact-determined solutions. Shrinkage, the expected decline in performance on unoptimized data, is reduced when large samples are employed in the optimization process.

Sometimes, however, a trade-off must be made between the sample’s size and the extent to which it is representative. As one goes farther back in history to bolster a sample, the data may become less representative of current market conditions. In some instances, there is a clear transition point beyond which the data become much less representative: For example, the S&P 500 futures began trading in 1983, effecting a structural change in the general market. Trade-offs become much less of an issue when working with intraday data on short time frames, where tens of thousands or even hundreds of thousands of bars of data can be gathered without going back beyond the recent past.

Finally, when running simulations and optimizations, pay attention to the number of trades a system takes. Like large data samples, it is highly desirable that simulations and tests involve numerous trades. Chance or artifact can easily be responsible for any profits produced by a system that takes only a few trades, regardless of the number of data points used in the test!

HOW TO SUCCEED WITH OPTIMIZATION

Four steps can be taken to avoid failure and increase the odds of achieving successful optimization. As a first step, optimize on the largest possible representative sample and make sure many simulated trades are available for analysis. The second step is to keep the number of free parameters or rules small, especially in relation to sample size. A third step involves running tests on out-of-sample data, that is, data not used or even seen during the optimization process. As a fourth and final step, it may be worthwhile to statistically assess the results.

No Verification

One of the better ways to get into trouble is by failing to verify model performance using out-of-sample tests or inferential statistics. Without such tests, the spurious solutions resulting from small samples and large parameter sets, not to mention other less obvious causes, will go undetected. The trading system that appears to be ideal on the development sample will be put “on-line,” and devastating losses will follow. Developing systems without subjecting them to out-of-sample and statistical tests is like flying blind, without a safety belt, in an uninspected aircraft.

Large Parameter Sets

An excessive number of free parameters or rules will impact an optimization effort in a manner similar to an insufficient number of data points. As the number of elements undergoing optimization rises, a model’s ability to capitalize on idiosyncrasies in the development sample increases along with the proportion of the model’s fitness that can be attributed to mathematical artifact. The result of optimizing a large number of variables (whether rules, parameters, or both) will be a model that performs well on the development data, but poorly on out-of-sample test data and in actual trading.

It is not the absolute number of free parameters that should be of concern, but the number of parameters relative to the number of data points. The shrinkage formula discussed in the context of small samples is also heuristically relevant here: It illustrates how the relationship between the number of data points and the number of parameters affects the outcome. When there are too many parameters, given the number of data points, mathematical artifacts and capitalization on chance (curve-fitting, in the bad sense) become reasons for failure.

Small Samples

Consider the impact of small samples on the optimization process. Small samples of market data are unlikely to be representative of the universe from which they are drawn; consequently, they will probably differ significantly from other samples obtained from the same universe. Applied to a small development sample, an optimizer will faithfully discover the best possible solution. The best solution for the development sample, however, may turn out to be a dreadful solution for the later sample on which genuine trades will be taken. Failure ensues, not because optimization has found a bad solution, but because it has found a good solution to the wrong problem!

Optimization on inadequate samples is also good at spawning solutions that represent only mathematical artifact. As the number of data points declines to the number of free (adjustable) parameters, most models (trading, regression, or otherwise) will attain a perfect fit to even random data. The principle involved is the same one responsible for the fact that a line, which is a two-parameter model, can always be drawn through any two distinct points, but cannot always be made to intersect three arbitrary points. In statistics, this is known as the degrees-of-freedom issue; there are as many degrees of freedom as there are data points beyond the number that can be fitted perfectly for purely mathematical reasons. Even when there are enough data points to avoid a totally artifact-determined solution, some part of the model fitness obtained through optimization will be of an artifact-determined nature, a by-product of the process.

For multiple regression models, a formula is available that can be used to estimate how much “shrinkage” would occur in the multiple correlation coefficient (a measure of model fitness) if the artifact-determined component were removed. The shrinkage correction formula, which shows the relationship between the number of parameters (regression coefficients) being optimized, sample size, and decreased levels of apparent fitness (correlation) in tests on new samples, is shown below in FORTRAN-style notation:
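
(The expression below is a reconstruction of the standard Wherry-style correction, written with the variable names defined in the next paragraph; the original notation may have differed slightly.)

RC = SQRT(1.0 - (1.0 - R*R) * (N - 1.0) / (N - P - 1.0))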

In this equation, N represents the number of data points, P the number of model parameters, R the multiple correlation coefficient determined for the sample by the regression (optimization) procedure, and RC the shrinkage-corrected multiple correlation coefficient. The inverse formula, one that estimates the optimization-inflated correlation (R) given the true correlation (RC) existing in the population from which the data were sampled, appears below:
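
(Again a reconstruction, consistent with the definitions above.)

R = SQRT(1.0 - (1.0 - RC*RC) * (N - P - 1.0) / (N - 1.0))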

These formulas, although legitimate only for linear regression, are not bad for estimating how well a fully trained neural network model (which is nothing more than a particular kind of nonlinear regression) will generalize. When working with neural networks, let P represent the total number of connection weights in the model. In addition, make sure that simple correlations are used when working with these formulas; if a neural network or regression package reports the squared multiple correlation, take the square root.
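
For convenience, the same correction can be packaged as a small C++ helper; the function and variable names are illustrative, and the square root should be taken beforehand if a package reports only the squared correlation:

#include <cmath>

// Shrinkage-corrected correlation. r is the simple (not squared) multiple
// correlation from the fit, n the number of data points, and p the number of
// free parameters; for a neural network, p is the total count of connection weights.
double shrinkageCorrected(double r, double n, double p) {
    double rc2 = 1.0 - (1.0 - r * r) * (n - 1.0) / (n - p - 1.0);
    return rc2 > 0.0 ? std::sqrt(rc2) : 0.0;   // heavy overfitting can push the estimate below zero
}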

HOW TO FAIL WITH OPTIMIZATION

Most traders do not seek failure, at least not consciously. However, knowledge of the way failure is achieved can be of great benefit when seeking to avoid it. Failure with an optimizer is easy to accomplish by following a few key rules. First, be sure to use a small data sample when running simulations: The smaller the sample, the greater the likelihood it will poorly represent the data on which the trading model will actually be traded. Next, make sure the trading system has a large number of parameters and rules to optimize: For a given data sample, the greater the number of variables that must be estimated, the easier it will be to obtain spurious results. It would also be beneficial to employ only a single sample on which to run tests; annoying out-of-sample data sets have no place in the rose-colored world of the ardent loser. Finally, do avoid the headache of inferential statistics. Follow these rules and failure is guaranteed.

What shape will failure take? Most likely, system performance will look great in tests, but terrible in real-time trading. Neural network developers call this phenomenon “poor generalization”; traders are acquainted with it through the experience of margin calls and a serious loss of trading capital. One consequence of such a failure-laden outcome is the formation of a popular misconception: that all optimization is dangerous and to be feared.

In actual fact, optimizers are not dangerous and not all optimization should be feared. Only bad optimization is dangerous and frightening. Optimization of large parameter sets on small samples, without out-of-sample tests or inferential statistics, is simply a bad practice that invites unhappy results for a variety of reasons.

Linear Programming

The techniques of linear programming are designed for optimization problems involving linear cost or fitness functions, and linear constraints on the parameters or input variables. Linear programming is typically used to solve resource allocation problems. In the world of trading, one use of linear programming might be to allocate capital among a set of investments to maximize net profit. If risk-adjusted profit is to be optimized, linear programming methods cannot be used: Risk-adjusted profit is not a linear function of the amount of capital allocated to each of the investments; in such instances, other techniques (e.g., genetic algorithms) must be employed. Linear programming methods are rarely useful in the development of trading systems. They are mentioned here only to inform readers of their existence.

Analytic Optimizers

Analysis (as in “real analysis” or “complex analysis”) is an extension of classical college calculus. Analytic optimizers involve the well-developed machinery of analysis, specifically differential calculus and the study of analytic functions, in the solution of practical problems. In some instances, analytic methods can yield a direct (noniterative) solution to an optimization problem. This happens to be the case for multiple regression, where solutions can be obtained with a few matrix calculations. In multiple regression, the goal is to find a set of regression weights that minimize the sum of the squared prediction errors. In other cases, iterative techniques must be used. The connection weights in a neural network, for example, cannot be directly determined. They must be estimated using an iterative procedure, such as back-propagation.
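
In the multiple regression case mentioned above, for example, the entire solution reduces to the normal equations: writing X for the matrix of predictor values, y for the vector of target values, and X' for the transpose of X, the weight vector that minimizes the sum of squared errors is

w = inverse(X'X) * X'y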

Many iterative techniques used to solve multivariate optimization problems (those involving several variables or parameters) employ some variation on the theme of steepest ascent. In its most basic form, optimization by steepest ascent works as follows: A point in the domain of the fitness function (that is, a set of parameter values) is chosen by some means. The gradient vector at that point is evaluated by computing the derivatives of the fitness function with respect to each of the variables or parameters; this defines the direction in n-dimensional parameter space for which a fixed amount of movement will produce the greatest increase in fitness. A small step is taken up the hill in fitness space, along the direction of the gradient. The gradient is then recomputed at this new point, and another, perhaps smaller, step is taken. The process is repeated until convergence occurs.
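
The following C++ sketch implements this basic loop with a finite-difference gradient, a fixed step size, and a toy two-parameter fitness function standing in for a real one; a production implementation would adapt the step and use analytic derivatives where available:

#include <cmath>
#include <cstdio>
#include <vector>

// Toy fitness function: a smooth hill with its peak at (3, -2).
double fitness(const std::vector<double>& p) {
    return -((p[0] - 3.0) * (p[0] - 3.0) + (p[1] + 2.0) * (p[1] + 2.0));
}

// Plain steepest ascent: estimate the gradient numerically, step uphill, repeat.
std::vector<double> steepestAscent(std::vector<double> params, double step, int maxIter) {
    const double eps = 1e-6;
    for (int iter = 0; iter < maxIter; ++iter) {
        const double base = fitness(params);
        std::vector<double> grad(params.size());
        double norm = 0.0;
        for (size_t i = 0; i < params.size(); ++i) {        // finite-difference gradient
            std::vector<double> probe = params;
            probe[i] += eps;
            grad[i] = (fitness(probe) - base) / eps;
            norm += grad[i] * grad[i];
        }
        if (std::sqrt(norm) < 1e-8) break;                   // gradient near zero: converged (or stuck on a plateau)
        for (size_t i = 0; i < params.size(); ++i)
            params[i] += step * grad[i];                     // small step uphill along the gradient
    }
    return params;
}

int main() {
    std::vector<double> best = steepestAscent({0.0, 0.0}, 0.1, 1000);
    std::printf("peak found near (%.3f, %.3f)\n", best[0], best[1]);
}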

A real-world implementation of steepest ascent optimization has to specify how the step size will be determined at each iteration, and how the direction defined by the gradient will be adjusted for better overall convergence of the optimization process. Naive implementations assume that there is an analytic fitness surface (one that can be approximated locally by a convergent power series) having hills that must be climbed. More sophisticated implementations go further, commonly assuming that the fitness function can be well approximated locally by a quadratic form. If a fitness function satisfies this assumption, then much faster convergence to a solution can be achieved. However, when the fitness surface has many irregularly shaped hills and valleys, quadratic forms often fail to provide a good approximation. In such cases, the more sophisticated methods break down entirely or their performance seriously degrades.

Worse than degraded performance is the problem of local solutions. Almost all analytic methods, whether elementary or sophisticated, are easily trapped by local maxima: they generally fail to locate the globally best solution when there are many hills and valleys in the fitness surface. Least-squares, neural network predictive modeling gives rise to fitness surfaces that, although clearly analytic, are full of bumps, troughs, and other irregularities that lead standard analytic techniques (including back-propagation, a variant on steepest ascent) astray. Local maxima and other hazards that accompany such fitness surfaces can, however, be sidestepped by cleverly marrying a genetic algorithm with an analytic one. For fitness surfaces amenable to analytic optimization, such a combined algorithm can provide the best of both worlds: fast, accurate solutions that are also likely to be globally optimal.

Some fitness surfaces are simply not amenable to analytic optimization. More specifically, analytic methods cannot be used when the fitness surface has flat areas or discontinuities in the region of parameter space where a solution is to be sought. Flat areas imply null gradients, hence the absence of a preferred direction in which to take a step. At points of discontinuity, the gradient is not defined; again, a stepping direction cannot be determined. Even if a method does not explicitly use gradient information, such information is employed implicitly by the optimization algorithm. Unfortunately, many fitness functions of interest to traders (including, for instance, all functions that involve net profit, drawdown, percentage of winning trades, risk-to-reward ratios, and other like items) have plateaus and discontinuities. They are, therefore, not tractable using analytic methods. Although the discussion has centered on the maximization of fitness, everything said applies as well to the minimization of cost. Any maximization technique can be used for minimization, and vice versa: Multiply a fitness function by -1 to obtain an equivalent cost function; multiply a cost function by -1 and a fitness function is the result. If a minimization algorithm takes your fancy, but a maximization is required, use this trick to avoid having to recode the optimization algorithm.

Optimization by Simulated Annealing

Optimizers based on annealing mimic the thermodynamic process by which liquids freeze and metals anneal. Starting out at a high temperature, the atoms of a liquid or molten metal bounce rapidly about in a random fashion. Slowly cooled, they arrange themselves into an orderly configuration (a crystal) that represents a minimal energy state for the system. Simulated in software, this thermodynamic process readily solves large-scale optimization problems.

As with genetic optimization, optimization by simulated annealing is a very powerful stochastic technique, modeled upon a natural phenomenon, that can find globally optimal solutions and handle ill-behaved fitness functions. Simulated annealing has effectively solved significant combinatorial problems, including the famous “traveling salesman problem,” and the problem of how best to arrange the millions of circuit elements found on modern integrated circuit chips, such as those that power computers. Methods based on simulated annealing should not be construed as limited to combinatorial optimization; they can readily be adapted to the optimization of real-valued parameters. Consequently, optimizers based on simulated annealing are applicable to a wide variety of problems, including those faced by traders.

Since genetic optimizers perform so well, we have experienced little need to explore optimizers based on simulated annealing. In addition, there have been a few reports suggesting that, in many cases, annealing algorithms do not perform as well as genetic algorithms. For these reasons, we have not provided examples of simulated annealing and have little more to say about the method.

Genetic Optimizers

Imagine something powerful enough to solve all the problems inherent in the creation of a human being. That something surely represents the ultimate in problem solving and optimization. What is it? It is the familiar process of evolution. Genetic optimizers endeavor to harness some of that incredible problem-solving power through a crude simulation of the evolutionary process. In terms of overall performance and the variety of problems that may be solved, there is no general-purpose optimizer more powerful than a properly crafted genetic one.

Genetic optimizers are stochastic optimizers in the sense that they take advantage of random chance in their operation. It may not seem believable that tossing dice can be a great way to solve problems, but, done correctly, it can be! In addition to randomness, genetic optimizers employ selection and recombination. The clever integration of random chance, selection, and recombination is responsible for the genetic optimizer’s great power. A full discussion of genetic algorithms, which are the basis for genetic optimizers, appears in Part II.

Genetic optimizers have many highly desirable characteristics. One such characteristic is speed, especially when faced with combinatorial explosion. A genetic optimizer can easily be many orders of magnitude faster than a brute force optimizer when there is a multiplicity of rules, or parameters that have many possible values, to manipulate. This is because, like user-guided optimization, genetic optimization can focus on important regions of solution space while mostly ignoring blind alleys. In contrast to user-guided optimization, the benefit of a selective search is achieved without the need for human intervention.

Genetic optimizers can swiftly solve complex problems, and they are also more immune than other kinds of optimizers to the effects of local maxima in the fitness surface or, equivalently, local minima in the cost surface. Analytic methods are worst in that they almost always walk right to the top of the nearest hill or bottom of the nearest valley, without regard to whether higher hills or lower valleys exist elsewhere. In contrast, a good genetic optimizer often locates the globally best solution, quite an impressive feat when accomplished for cantankerous fitness surfaces, such as those associated with matrices of neural connection weights.

Another characteristic of genetic optimization is that it works well with fitness surfaces marked by discontinuities, flat regions, and other troublesome irregularities. Genetic optimization shares this characteristic with brute force, user-guided, annealing-based, and other nonanalytic optimization methods. Solutions that maximize such items as net profit, return on investment, the Sharpe Ratio, and others that define difficult, nonanalytic fitness landscapes can be found using a genetic optimizer. Genetic optimizers shine with difficult fitness functions that lie beyond the purview of analytic methods. This does not mean that they cannot be used to solve problems having more tractable fitness surfaces: Perhaps slower than the analytic methods, they have the virtue of being more resistant to the traps set by local optima.

Overall, genetic optimizers are the optimizers of choice when there are many parameters or rules to adapt, when a global solution is desired, or when arbitrarily complex (and not necessarily differentiable or continuous) fitness or cost functions must be handled. Although special-purpose optimizers can outperform genetic optimizers on specific kinds of problems, for general-purpose optimization, genetic optimizers are among the most powerful tools available.

What does a genetic optimizer look like in action? The dual moving-average crossover system discussed earlier was translated to C++ so that the genetic optimizer in the C-Trader toolkit could be used to solve for the two system parameters, LenA and LenB. LenA, the period of the first moving average, was examined over the range of 2 through 50, as was LenB, the period of the second moving average. Optimization was for net profit so that the results would be directly comparable with those produced earlier by brute force optimization. Below is the C++ code for the crossover system:
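
The original listing is not reproduced here; what follows is only a minimal C++ sketch of the dual moving-average crossover logic it implements, with the data handling and order placement of the C-Trader toolkit omitted:

#include <vector>

// Simple moving average of the 'len' most recent closes ending at bar i.
double sma(const std::vector<double>& close, int i, int len) {
    double sum = 0.0;
    for (int k = i - len + 1; k <= i; ++k) sum += close[k];
    return sum / len;
}

// Dual moving-average crossover: go long when the LenA average crosses above
// the LenB average, go short on the opposite cross. Returns +1, -1, or 0 per bar.
std::vector<int> crossoverSignals(const std::vector<double>& close, int LenA, int LenB) {
    std::vector<int> signal(close.size(), 0);
    const int maxLen = LenA > LenB ? LenA : LenB;
    for (size_t i = static_cast<size_t>(maxLen); i < close.size(); ++i) {
        double fastPrev = sma(close, i - 1, LenA), slowPrev = sma(close, i - 1, LenB);
        double fastNow  = sma(close, i, LenA),     slowNow  = sma(close, i, LenB);
        if (fastPrev <= slowPrev && fastNow > slowNow) signal[i] = +1;   // buy
        if (fastPrev >= slowPrev && fastNow < slowNow) signal[i] = -1;   // sell
    }
    return signal;
}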

To solve for the best parameters, brute force optimization would require that 2,041 tests be performed; in TradeStation, that works out to about 56 minutes of computing time, extrapolating from the earlier illustration in which a small subset of the current solution space was examined. Only 1 minute of running time was required by the genetic optimizer; in an attempt to put it at a significant disadvantage, it was prematurely stopped after performing only 133 tests.

The output from the genetic optimizer appears in Table 3-2. In this table, P1 represents the period of the faster moving average, P2 the period of the slower moving average, NET the total net profit, NETLNG the net profit for long positions, NETSHT the net profit for short positions, PFAC the profit factor, ROA% the annualized return on account, DRAW the maximum drawdown, TRDS the number of trades taken by the system, WIN% the percentage of winning trades, AVGT the profit or loss resulting from the average trade, and FIT the fitness of the solution (which, in this instance, is merely the total net profit). As with the brute force data in Table 3-1, the genetic data have been sorted by net profit (fitness) and only the 25 best solutions were presented. Comparison of the brute force and genetic optimization results (Tables 3-1 and 3-2, respectively) reveals that the genetic optimizer isolated a solution with a greater net profit ($172,725) than did the brute force optimizer ($145,125). This is no surprise since a larger solution space, not decimated by increments, was explored. The surprise is that the better solution was found so quickly, despite the handicap of a prematurely stopped evolutionary process. Results like these demonstrate the incredible effectiveness of genetic optimization.

Brute Force Optimizers

A brute force optimizer searches for the best possible solution by systematically testing all potential solutions, i.e., all definable combinations of rules, parameters, or both. Because every possible combination must be tested, brute force optimization can be very slow. Lack of speed becomes a serious issue as the number of combinations to be examined grows. Consequently, brute force optimization is subject to the law of “combinatorial explosion.” Just how slow is brute force optimization? Consider a case where there are four parameters to optimize and where each parameter can take on any of 50 values. Brute force optimization would require that 50^4 (about 6 million) tests or simulations be conducted before the optimal parameter set could be determined: if one simulation was executed every 1.62 seconds (typical for TradeStation), the optimization process would take about 4 months to complete. This approach is not very practical, especially when many systems need to be tested and optimized, when there are many parameters, when the parameters can take on many values, or when you have a life. Nevertheless, brute force optimization is useful and effective. If properly done, it will always find the best possible solution. Brute force is a good choice for small problems where combinatorial explosion is not an issue and solutions can be found in minutes, rather than days or years.

Only a small amount of programming code is needed to implement brute force optimization. Simple loop constructs are commonly employed. Parameters to be optimized are stepped from a start value to a stop value by some increment using a For loop (C, C++, Basic, Pascal/Delphi) or a Do loop (FORTRAN). A brute force optimizer for two parameters, when coded in a modern dialect of Basic, might appear as follows:
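
(The Basic listing referred to above is not reproduced in this excerpt; the sketch below expresses the same nested-loop idea in C++, with evaluateSystem standing in for a full trading simulation and the parameter ranges chosen purely for illustration.)

#include <cstdio>

// Dummy stand-in for a complete trading simulation that would return,
// say, the net profit obtained with a given pair of parameter values.
double evaluateSystem(int lenA, int lenB) {
    return -1.0 * ((lenA - 6) * (lenA - 6) + (lenB - 30) * (lenB - 30));
}

// Step each parameter from a start value to a stop value by an increment,
// keeping track of the best combination found so far.
int main() {
    double bestFitness = -1e300;
    int bestA = 0, bestB = 0;
    for (int lenA = 2; lenA <= 10; lenA += 2)
        for (int lenB = 2; lenB <= 50; lenB += 2) {
            double f = evaluateSystem(lenA, lenB);
            if (f > bestFitness) { bestFitness = f; bestA = lenA; bestB = lenB; }
        }
    std::printf("best: LenA=%d LenB=%d fitness=%.2f\n", bestA, bestB, bestFitness);
}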

Because brute force optimizers are conceptually simple and easy to program, they are often built into the more advanced software packages that are available for traders.

As a practical illustration of brute force optimization, TradeStation was used to optimize the moving averages in a dual moving-average crossover system. Optimization was for net profit, the only trading system characteristic that TradeStation can optimize without the aid of add-on products. The EasyLanguage code for the dual moving-average trading model appears below:

The system was optimized by stepping the length of the first moving average (LenA) from 2 to 10 in increments of 2. The length of the second moving average (LenB) was advanced from 2 to 50 with the same increments. Increments were set greater than 1 so that fewer than 200 combinations would need to be tested (TradeStation can only save data on a maximum of 200 optimization runs). Since not all possible combinations of values for the two parameters were explored, the optimization was less thorough than it could have been; the best solution may have been missed in the search. Notwithstanding, the optimization required 125 tests, which took 3 minutes and 24 seconds to complete on 5 years of historical, end-of-day data, using an Intel 486 machine running at 66 megahertz. The results generated by the optimization were loaded into an Excel spreadsheet and sorted for net profit. Table 3-1 presents various performance measures for the top 25 solutions. In the table, LENA represents the period of the shorter moving average, LENB the period of the longer moving average, NetPrft the total net profit, L:NetPrft the net profit for long positions, S:NetPrft the net profit for short positions, PFact the profit factor, ROA the total (unannualized) return on account, MaxDD the maximum drawdown, #Trds the total number of trades taken, and %Prft the percentage of profitable trades.

Since optimization is a problem-solving search procedure, it frequently results in surprising discoveries. The optimization performed on the dual moving-average crossover system was no exception to the rule. Conventional trading wisdom says that “the trend is your friend.” However, with a second moving average that is faster than the first, the most profitable solutions in Table 3-1 trade against the trend. These profitable countertrend solutions might not have been discovered without the search performed by the optimization procedure.

Successful user-guided optimization calls for skill, domain knowledge, or both, on the part of the person guiding the optimization process. Given adequate skill and experience, not to mention a tractable problem, user-guided optimization can be extremely efficient and dramatically faster than brute force methods. The speed and efficiency derive from the addition of intelligence to the search process: Zones with a high probability of paying off can be recognized and carefully examined, while time-consuming investigations of regions unlikely to yield good results can be avoided.

User-guided optimization is most appropriate when ballpark results have already been established by other means, when the problem is familiar or well understood, or when only a small number of parameters need to be manipulated. As a means of “polishing” an existing solution, user-guided optimization is an excellent choice. It is also useful for studying model sensitivity to changes in rules or parameter values.

Monday, 17 November 2014

Implicit Optimizers

A mouse cannot be used to click on a button that says “optimize.” There is no special command to enter. In fact, there is no special software or even machine in sight. Does this mean there is no optimizer? No. Even when there is no optimizer apparent, and it seems as though no optimization is going on, there is. It is known as implicit optimization and works as follows: The trader tests a set of rules based upon some ideas regarding the market. Performance of the system is poor, and so the trader reworks the ideas, modifies the system’s rules, and runs another simulation. Better performance is observed. The trader repeats this process a few times, each time making changes based on what has been learned along the way. Eventually, the trader builds a system worthy of being traded with real money. Was this system an optimized one? Since no parameters were ever explicitly adjusted and no rules were ever rearranged by the software, it appears as if the trader has succeeded in creating an unoptimized system. However, more than one solution from a set of many possible solutions was tested and the best solution was selected for use in trading or further study. This means that the system was optimized after all! Any form of problem solving in which more than one solution is examined and the best is chosen constitutes de facto optimization. The trader has a powerful brain that employs mental problem-solving algorithms, e.g., heuristically guided trial-and-error ones, which are exceptionally potent optimizers. This means that optimization is always present: optimizers are always at work. There is no escape!

TYPES OF OPTIMIZERS

There are many kinds of optimizers, each with its own special strengths and weaknesses, advantages and disadvantages. Optimizers can be classified along such dimensions as human versus machine, complex versus simple, special purpose versus general purpose, and analytic versus stochastic. All optimizers, regardless of kind, efficiency, or reliability, execute a search for the best of many potential solutions to a formally specified problem.

HOW OPTIMIZERS ARE USED

Optimizers are wonderful tools that can be used in a myriad of ways. They help shape the aircraft we fly, design the cars we drive, and even select delivery routes for our mail. Traders sometimes use optimizers to discover rule combinations that trade profitably. In Part II, we will demonstrate how a genetic optimizer can evolve profitable rule-based entry models. More commonly, traders call upon optimizers to determine the most appropriate values for system parameters; almost any kind of optimizer, except perhaps an analytic optimizer, may be employed for this purpose. Various kinds of optimizers, including powerful genetic algorithms, are effective for training or evolving neural or fuzzy logic networks. Asset allocation problems yield to appropriate optimization strategies. Sometimes it seems as if the only limit on how optimizers may be employed is the user’s imagination, and therein lies a danger: It is easy to be seduced into “optimizer abuse” by the great and alluring power of this tool. The correct and incorrect applications of optimizers are discussed later in this chapter.

WHAT OPTIMIZERS DO

Optimizers exist to find the best possible solution to a problem. What is meant by the best possible solution to a problem? Before attempting to define that phrase, let us first consider what constitutes a solution. In trading, a solution is a particular set of trading rules and perhaps system parameters.

All trading systems have at least two rules (an entry rule and an exit rule), and most have one or more parameters. Rules express the logic of the trading system, and generally appear as “if-then” clauses in whatever language the trading system has been written in. Parameters determine the behavior of the logic expressed in the rules; they can include lengths of moving averages, connection weights in neural networks, thresholds used in comparisons, values that determine placements for stops and profit targets, and other similar items. The simple moving-average crossover system, used in the previous chapter to illustrate various trading simulators, had two rules: one for the buy order and one for the sell order. It also had a single parameter, the length of the moving average. Rules and parameters completely define a trading system and determine its performance. To obtain the best performance from a trading system, parameters may need to be adjusted and rules juggled.

There is no doubt that some rule and parameter combinations define systems that trade well, just as others specify systems that trade poorly; i.e., solutions differ in their quality. The goodness of a solution or trading model, in terms of how well it performs when measured against some standard, is often called fitness. The converse of fitness, the inadequacy of a solution, is frequently referred to as cost.

In practice, fitness is evaluated by a fitness function, a block of programming code that calculates a single number that reflects the relative desirability of any solution. A fitness function can be written to appraise fitness howsoever the trader desires. For example, fitness might be interpreted as net profit penalized for excessive drawdown. A cost function works in exactly the same way, but higher numbers signify worse solutions. The sum of the squared errors, commonly computed when working with linear regression or neural network models, is a cost function.
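
For instance, a fitness function of the kind just described might be sketched in C++ as follows; the penalty weight on drawdown is an arbitrary, illustrative choice:

// Fitness: net profit penalized for drawdown (larger is better).
// maxDrawdown is passed in as a positive number.
double fitness(double netProfit, double maxDrawdown) {
    return netProfit - 2.0 * maxDrawdown;
}

// The matching cost function is simply the negation (smaller is better).
double cost(double netProfit, double maxDrawdown) {
    return -(netProfit - 2.0 * maxDrawdown);
}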

The best possible solution to a problem can now be defined: It is that particular solution that has the greatest fitness or the least cost. Optimizers endeavor to find the best possible solution to a problem by maximizing fitness, as measured by a fitness function, or minimizing cost, as computed by a cost function.
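To make this concrete, here is a minimal C++ sketch of a fitness function of the kind just described, computed as net profit penalized for drawdown. The function name, the penalty weight, and the use of a bar-by-bar equity vector are illustrative assumptions for this sketch, not part of any particular toolkit.

#include <algorithm>
#include <vector>

// Illustrative fitness function: net profit penalized for drawdown.
// The equity vector holds simulated account equity, bar by bar.
// The penalty weight (2.0 by default) is an arbitrary choice for the sketch.
double fitness(const std::vector<double>& equity, double ddPenalty = 2.0) {
    if (equity.size() < 2) return 0.0;

    double netProfit = equity.back() - equity.front();

    // Maximum drawdown: the worst peak-to-valley decline in the equity curve.
    double peak = equity.front();
    double maxDrawdown = 0.0;
    for (double e : equity) {
        peak = std::max(peak, e);
        maxDrawdown = std::max(maxDrawdown, peak - e);
    }

    // Higher is better; a cost function could simply return the negative.
    return netProfit - ddPenalty * maxDrawdown;
}

A corresponding cost function would return the same quantity with its sign reversed, so that lower numbers signify better solutions.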

The best possible solution to a problem may be discovered in any number of ways. Sometimes problems can be solved by simple trial-and-error, especially when guided by human insight into the problem being worked. Alternatively, sophisticated procedures and algorithms may be necessary. For example, simulating the process of evolution (as genetic optimizers do) is a very powerful way to discover or evolve high-quality solutions to complex problems. In some cases, the best problem solver is an analytic (calculus-based) procedure, such as a conjugate gradient. Analytic optimization is an efficient approach for problems with smooth (differentiable) fitness surfaces, such as those encountered in training neural networks, developing multiple linear regression models, or computing simple-structure factor rotations.

SIMULATORS USED IN THIS BOOK

We personally prefer simulators built using modern, object-oriented programming practices. One reason for our choice is that an object-oriented simulator makes it easy to create as many simulation instances or simulated accounts as might be desired. This is especially useful when simulating the behavior of a trading system on an entire portfolio of tradables (as is done in most tests in this book), rather than on a single instrument. An object-oriented simulator also comes in handy when building adaptive, self-optimizing systems, where it is sometimes necessary to implement internal simulations. In addition, such software makes the construction of metasystems (systems that trade in or out of the equity curves of other systems) a simple matter. Asset allocation models, for instance, may be treated as metasystems that dynamically allocate capital to individual trading systems or accounts. A good object-oriented simulator can generate the portfolio equity curves and other information needed to create and back-test asset allocation models operating on top of multiple trading systems. For these reasons, and such others as familiarity, most tests carried out in this book have been performed using the C-Trader toolkit. Do not be alarmed. It is not necessary to have any expertise in C++ or modern software practices to benefit from this book. The logic of every system or system element examined will be explained in great detail in the text.

CHOOSING THE RIGHT SIMULATOR

If you are serious about developing sophisticated trading systems, need to work with large portfolios, or wish to perform tests using individual contracts or options, then buckle down, climb the learning curve, and go for an advanced simulator that employs a generic programming language such as C++ or Object Pascal. Such a simulator will have an open architecture that provides access to an incredible selection of add-ons and libraries: technical analysis libraries, such as those from PM Labs (609-261-7357) and Scientific Consultant Services (516-696-3333); and general numerical algorithm libraries, such as Numerical Recipes (800-872-7423), the Numerical Algorithms Group (NAG) (44-1865-511-245), and the International Mathematics and Statistics Library (IMSL), which cover statistics, linear algebra, spectral analysis, differential equations, and other mathematics. Even neural network and genetic algorithm libraries are readily available. Advanced simulators that employ generic programming languages also open up a world of third-party components and graphical controls, which cover everything from sophisticated charting and data display to advanced database management, and which are compatible with Borland's C++ Builder and Delphi, as well as with Microsoft's Visual Basic and Visual C++.

If your needs are somewhat less stringent, choose a complete, integrated solution. Make sure the simulation language permits procedures residing in DLLs to be called when necessary. Be wary of products that are primarily charting tools with limited programming capabilities if your intention is to develop, back-test, and trade mechanical systems that go significantly beyond traditional or "canned" indicators.

RELIABILITY OF SIMULATORS

Trading simulators vary in their reliability and trustworthiness. No complex software, including trading simulation software, is completely bug-free. This is true even for reputable vendors with great products. Other problems pertain to the assumptions made regarding ambiguous situations in which any of several orders could be executed in any of several sequences during a bar. Some of these items, e.g., the so-called bouncing tick (Ruggiero, 1998), can make it seem as though the best system ever has been discovered when, in fact, it could bankrupt any trader. It is better for a simulator to make worst-case assumptions in ambiguous situations: this way, when actual trading begins, there is a greater likelihood of a pleasant, rather than an unpleasant, surprise. All of this boils down to the fact that, when choosing a simulator, you should select one that has been carefully debugged, that has a proven track record of reliability, and in which the assumptions and handling of ambiguous situations are explicitly stated. In addition, learn the simulator's quirks and how to work around them.

SIMULATOR PERFORMANCE

Trading simulators vary dramatically in such aspects of performance as speed, capacity, and power. Speed is important when there is a need to carry out many tests or perform complex optimizations, genetic or otherwise. It is also essential when developing systems on complete portfolios or using long, intraday data series involving thousands of trades and hundreds of thousands of data points. In some instances, speed may determine whether certain explorations can even be attempted. Some problems are simply not practical to study unless the analyses can be accomplished in a reasonable length of time. Simulator capacity involves problem size restrictions regarding the number of bars on which a simulation may be performed and the quantity of system code the simulator can handle. Finally, the power a simulator gives the user to express and test complex trading ideas, and to run tests and even system optimizations on complete portfolios, can be significant to the serious, professional trader. A fairly powerful simulator is required, for example, to run many of the trading models examined in this book.

Speed
The most significant determinant of simulation processing speed is the nature of the scripting or programming language used by the simulator, that is, whether the language is compiled or interpreted. Modern optimizing compilers for generic languages, such as C++, FORTRAN, and Pascal/Delphi, translate the user-written source code into highly efficient machine code that the processor can execute directly at full bore; this makes simulator toolkits that use such languages and compilers remarkably fast. On the other hand, proprietary, interpreted languages, such as Microsoft's Visual Basic for Applications and Omega's Easy Language, must be translated and fed to the processor line by line. Simulators that employ interpreted languages can be quite sluggish, especially when executing complex or "loopy" source code. Just how much speed can be gained using a compiled language over an interpreted one? We have heard claims of systems running about 50 times faster after they were converted from proprietary languages to C++!

Capacity
While speed is primarily a function of language handling (interpreted versus compiled), capacity is mostly determined by whether 16-bit or 32-bit software is used. Older, 16-bit software is often subject to the dreaded 64K limit. In practical terms, this means that only about 15,000 bars of data (about 4 days of ticks, or 7 weeks of 1-minute bars on the S&P 500) can be loaded for system testing. In addition, as the system code is embellished, expect to receive a message to the effect that the system is too large to verify. Modern C++ or FORTRAN products, on the other hand, work with standard 32-bit C++ or FORTRAN compilers. Consequently, they have a much greater problem size capacity: With continuous-contract data on a machine with sufficient memory, every single tick of the S&P 500 since its inception in 1983 can easily be loaded and studied! In addition, there are virtually no limits on the number of trades a system can take, or on the system's size and complexity. All modern C++, FORTRAN, and Pascal/Delphi compilers are now full 32-bit programs that generate code for, and run under, 32-bit operating systems, such as Windows 95, Windows NT, or Linux/Unix. Any simulator that works with such a compiler should be able to handle large problems and enormous data sets with ease. Since most software packages are upgrading to 32-bit status, the issue of problem size capacity is rapidly becoming less significant than it once was.

Power
Differences in simulator power are attributable mostly to language and to design. Consider language first: In this case, it is not whether the language is compiled or interpreted, as was the case for speed, but rather its expressive power. Can the most elaborate and unusual trading ideas be expressed with precision and grace? In some languages they can; in others they cannot. It is unfortunate that the most powerful languages have steep learning curves. However, if one can climb the curve, a language like C++ makes it possible to do almost anything imaginable. Your word processor, spreadsheet, web browser, and even operating system were all probably written in C++ or its predecessor, C. Languages like C++ and Object Pascal (the basis of Borland's Delphi) are also extensible and can easily be customized for the purpose of trading system development through the use of appropriate libraries and add-on components. Visual Basic and Easy Language, although not as powerful as general-purpose, object-oriented languages like C++ or Object Pascal, have gentler learning curves and are still quite capable as languages go. Much less powerful, and not really adequate for the advanced system developer, are the macro-like languages embedded in popular charting packages, e.g., Equis International's MetaStock. The rule of thumb is: the more powerful the language, the more powerful the simulator.

Design issues are also a consideration in a simulator's power. Extensibility and modularity are especially important. Simulators that employ C++ or Object Pascal (Borland's Delphi) as their native language are incredibly extensible and can be highly modular, because such general-purpose, object-oriented languages are themselves highly extensible and modular; they were designed to be so from the ground up. Class libraries permit the definition of new data types and operators. Components can provide encapsulated functionality, such as charting and database management. Even old-fashioned function libraries (like the Numerical Algorithms Group library, the International Mathematics and Statistics Library, and the Numerical Recipes library) are available to satisfy a variety of needs. Easy Language, too, is highly extensible and modular: Modules called User Functions can be created in Easy Language, and functions written in other languages (including C++) can be called if they are placed in a dynamic link library (DLL). Macro-like languages, on the other hand, are not as flexible, which greatly limits their usefulness to the advanced system developer. In our view, the ability to access modules written in other languages is absolutely crucial: Different languages have different expressive foci, and even with a powerful language like C++, it sometimes makes sense to write one or more modules in another language, such as Prolog (a language designed for writing expert systems).
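To make the point about DLL-resident routines concrete, the following is a minimal, Windows-specific sketch of a C++ function exported from a DLL so that, in principle, a proprietary simulation language supporting DLL calls could invoke it. The function name, its signature, and the export mechanics are illustrative assumptions for this sketch and not a recipe for any specific product.

// Illustrative only: a simple exponential moving average routine exported
// from a Windows DLL. A simulation language that supports DLL calls could
// declare and invoke a function with a signature such as this one.
extern "C" __declspec(dllexport)
double __stdcall ExpMovAvg(const double* prices, int count, int length) {
    if (count <= 0 || length <= 0) return 0.0;
    double alpha = 2.0 / (length + 1.0);   // standard EMA smoothing factor
    double ema = prices[0];
    for (int i = 1; i < count; ++i)
        ema = alpha * prices[i] + (1.0 - alpha) * ema;
    return ema;
}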

One additional design issue, unrelated to the language employed, is relevant when discussing simulator power: whether a simulator can work with whole portfolios as well as with individual tradables. Many products are not designed to perform simulations and optimizations on whole portfolios at once, although sometimes add-ons are available that make it possible to generate portfolio performance analyses after the fact. On the other hand, an appropriately designed simulator can make multiple-account or portfolio simulations and system optimizations straightforward.

Trade-by-Trade Reports

Illustrative trade-by-trade reports were prepared using the simulators contained in TradeStation (Table 2-3) and in the C-Trader toolkit (Table 2-4). Both reports pertain to the same simple moving-average crossover system used in various ways throughout this discussion. Since hundreds of trades were taken by this system, the original reports are quite lengthy. Consequently, large blocks of trades have been edited out and ellipses inserted where the deletions were made. Because these reports are presented merely for illustration, such deletions were considered acceptable.

In contrast to a performance report, which provides an overall evaluation of a trading system's behavior, a detail or trade-by-trade report contains detailed information on each trade taken in the simulated account. A minimal detail report contains each trade's entry and exit dates (and times, if the simulation involves intraday data), the prices at which these entries and exits occurred, the positions held (in numbers of contracts, long or short), and the profit or loss resulting from each trade. A more comprehensive trade-by-trade report might also provide information on the type of order responsible for each entry or exit (e.g., stop, limit, or market), where in the bar the order was executed (at the open, the close, or in between), the number of bars each trade was held, the account equity at the start of each trade, the maximum favorable and adverse excursions within each trade, and the account equity on exit from each trade.

Most trade-by-trade reports contain the date (and time, if applicable) each trade was entered, whether a buy or sell was involved (that is, a long or short position established), the number of contracts in the transaction, the date the trade was exited, the profit or loss on the trade, and the cumulative profit or loss on all trades up to and including the trade under consideration. Reports also provide the name of the order on which the trade was entered and the name of the exit order. A better trade-by-trade report might include the fields for maximum favorable excursion (the greatest unrealized profit to occur during each trade), the maximum adverse excursion (the largest unrealized loss), and the number of bars each trade was held.
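A report of this kind maps naturally onto a simple record structure, one row per trade. The C++ sketch below shows one possible layout; the field names are chosen for illustration and are not taken from any particular simulator.

#include <string>

// Illustrative layout for one row of a detail (trade-by-trade) report.
struct TradeRecord {
    std::string entryDate;      // date (and time, if intraday) of entry
    std::string exitDate;       // date (and time) of exit
    std::string entryOrder;     // name of the order that opened the trade, e.g., "A"
    std::string exitOrder;      // name of the order that closed it, e.g., "B"
    int         contracts;      // positive for long, negative for short
    double      entryPrice;
    double      exitPrice;
    int         barsHeld;       // number of bars the trade was held
    double      profitLoss;     // realized profit or loss on the trade
    double      cumProfitLoss;  // running total up to and including this trade
    double      maxFavorable;   // greatest unrealized profit during the trade
    double      maxAdverse;     // largest unrealized loss during the trade
};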

As with the performance summaries, there are differences between various trade-by-trade reports with respect to the ways they are formatted and in the assumptions underlying the computations on which they are based.

While the performance summary provides a picture of the whole forest, a good trade-by-trade report focuses on the trees. In a good trade-by-trade report, each trade is scrutinized in detail: What was the worst paper loss sustained in this trade? What would the profit have been with a perfect exit? What was the actual profit (or loss) on the trade? Has the trading been fairly consistent? Are recent trades worse than those of the past? Or are they better? How might some of the worst trades be characterized in a way that would improve the trading system? These are the kinds of questions that cannot be answered by a distant panoramic view of the forest (a summary report), but they can be answered with a good trade-by-trade or detail report. In addition, a properly formatted detail report can be loaded into a spreadsheet for further analysis. Spreadsheets are convenient for sorting and displaying data. They make it easy, for instance, to draw histograms. Histograms can be very useful in decisions regarding the placement of stops (Sweeney, 1993). Histograms can show how much of the potential profit in the trades is being captured by the system's exit strategy and are also helpful in designing profit targets. Finally, a detailed examination of the worst and best trades may generate ideas for improving the system under study.

Performance Summary Reports

As an illustration of the appearance of performance summary reports, two have been prepared using the same moving-average crossover system employed to illustrate simulator programming. Both the TradeStation (Table 2-1) and C-Trader (Table 2-2) implementations of this system were run using their respective target software applications. In each instance, the length parameter (which controls the period of the moving average) was set to 4. Such style factors as the total number of trades, the number of winning trades, the number of losing trades, the percentage of profitable trades, the maximum numbers of consecutive winners and losers, and the average numbers of bars in winners and losers also appear in performance summary reports. Reward, risk, and style are critical aspects of system performance that these reports address.

Although all address the issues of reward, risk, and trading style, there are a number of differences between various performance summary reports. Least significant are differences in formatting. Some reports, in an effort to cram as much information as possible into a limited amount of space, round dollar values to the nearest whole integer, scale up certain values by some factor of 10 to avoid the need for decimals, and arrange their output in a tabular, spreadsheet-like format. Other reports use less cryptic descriptors, do not round dollar values or rescale numbers, and format their output to resemble more traditional reports.

Somewhat more significant than differences in formatting are the variations between performance summary reports that result from the definitions and assumptions made in various calculations. For instance, the number of winning trades may differ slightly between reports because of how winners are defined. Some simulators count as a winner any trade in which the P/L (profit/loss) figure is greater than or equal to zero, whereas others count as winners only trades for which the P/L is strictly greater than zero. This difference in calculation also affects figures for the average winning trade and for the ratio of the average winner to the average loser. Likewise, the average number of bars in a trade may be greater or fewer, depending on how they are counted. Some simulators include the entry bar in all bar counts; others do not. Return-on-account figures may also differ, depending, for instance, on whether or not they are annualized.

Differences in content between performance summary reports may be even more significant. Some only break down their performance analyses into long positions, short positions, and all trades combined. Others break them down into in-sample and out-of-sample trades as well. The additional breakdown makes it easy to see whether a system optimized on one sample of data (the in-sample set) shows similar behavior on another sample (the out-of-sample data) used for verification; out-of-sample tests are imperative for optimized systems. Other important information, such as total bar counts, maximum run-up (the converse of drawdown), adverse and favorable excursion numbers, peak equity, lowest equity, annualized return in dollars, trade variability (expressed as a standard deviation), and the annualized risk-to-reward ratio (a variant of the Sharpe Ratio), is present in some reports. The calculation of inferential statistics, such as the t-statistic and its associated probability, either for a single test or corrected for multiple tests or optimizations, is also a desirable feature. Statistical items, such as t-tests and probabilities, are important since they help reveal whether a system's performance reflects the capture of a valid market inefficiency or is merely due to chance or excessive curve-fitting. Many additional, possibly useful statistics can also be calculated, some of them on the basis of the information present in performance summaries. Among these statistics (Stendahl, 1999) are net positive outliers, net negative outliers, select net profit (calculated after the removal of outlier trades), loss ratio (greatest loss divided by net profit), run-up-to-drawdown ratio, longest flat period, and buy-and-hold return (useful as a baseline). Finally, some reports also contain a text-based plot of account equity as a function of time.

To the degree that history repeats itself, a clear image of the past seems like an excellent foundation from which to envision a likely future. A good performance summary provides a panoramic view of a trading method's historical behavior. Figures on return and risk show how well the system traded on test data from the historical period under study. The Sharpe Ratio, or annualized risk-to-reward ratio, measures return on a risk- or stability-adjusted scale. T-tests and related statistics may be used to determine whether a system's performance derives from some real market inefficiency or is an artifact of chance, multiple tests, or inappropriate optimization. Performance due to real market inefficiency may persist for a time, while that due to artifact is unlikely to recur in the future. In short, a good performance summary aids in capturing profitable market phenomena likely to persist; the capture of persistent market inefficiency is, of course, the basis for any sustained success as a trader.
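An annualized risk-to-reward figure of the kind mentioned above can be approximated in a few lines. The C++ sketch below assumes a series of periodic returns and uses one common form of the calculation; the exact definition may differ from that used by any given simulator or report.

#include <cmath>
#include <vector>

// Illustrative annualized risk-to-reward ratio (a Sharpe-like figure)
// computed from a series of periodic returns. periodsPerYear would be
// 12 for monthly returns, roughly 252 for daily returns, and so on.
double annualizedRiskReward(const std::vector<double>& returns,
                            int periodsPerYear) {
    if (returns.size() < 2) return 0.0;

    double mean = 0.0;
    for (double r : returns) mean += r;
    mean /= returns.size();

    double var = 0.0;
    for (double r : returns) var += (r - mean) * (r - mean);
    var /= (returns.size() - 1);          // sample variance
    double sd = std::sqrt(var);
    if (sd == 0.0) return 0.0;

    // Annualize by scaling the per-period ratio by the square root of the
    // number of periods per year.
    return (mean / sd) * std::sqrt(static_cast<double>(periodsPerYear));
}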

This wraps up the discussion of one kind of report obtainable within most trading simulation environments. Next we consider the other type of output that most simulators provide: the trade-by-trade report.

SIMULATOR OUTPUT

All good trading simulators generate output containing a wealth of information about the performance of the user’s simulated account. Expect to obtain data on gross and net profit, number of winning and losing trades, worst-case drawdown, and related system characteristics, from even the most basic simulators.

Better simulators provide figures for maximum run-up, average favorable and adverse excursion, inferential statistics, and more, not to mention highly detailed analyses of individual trades. An extraordinary simulator might also include in its output some measure of risk relative to reward, such as the annualized risk-to-reward ratio (ARRR) or the Sharpe Ratio, an important and well-known measure used to compare the performances of different portfolios, systems, or funds (Sharpe, 1994).

The output from a trading simulator is typically presented to the user in the form of one or more reports. Two basic kinds of reports are available from most trading simulators: the performance summary and the trade-by-trade, or "detail," report. The information contained in these reports can help the trader evaluate a system's "trading style" and determine whether the system is worthy of real-money trading. Other kinds of reports may also be generated, and the information from the simulator may be formatted in a way that can easily be loaded into a spreadsheet for further analysis. Almost all the tables and charts that appear in this book were produced in this manner: The output from the simulator was written to a file that could be read by Excel, where the information was further processed and formatted for presentation.

PROGRAMMING THE SIMULATOR

Regardless of whether an integrated or component-based simulator is employed, the trading logic of the user's system must be programmed into it using some computer language. The language used may be either a generic programming language, such as C++ or FORTRAN, or a proprietary scripting language. Without the aid of a formal language, it would be impossible to express a system's trading rules with the precision required for an accurate simulation. The need for programming of some kind should not be looked upon as a necessary evil. Programming can actually benefit the trader by encouraging an explicit and disciplined expression of trading ideas.

For an example of how trading logic is programmed into a simulator, consider TradeStation, a popular integrated product from Omega Research that contains an interpreter for a basic system-writing language (called Easy Language) with historical simulation capabilities. Omega's Easy Language is a proprietary, trading-specific language based on Pascal (a generic programming language). What does a simple trading system look like when programmed in Easy Language? The following code implements a simple moving-average crossover system:

{ Simple moving-average crossover system in Easy Language }
Inputs: Len(4); { length parameter }

If (Close > Average(Close, Len)) And
   (Close[1] <= Average(Close, Len)[1]) Then
   Buy("A") 1 Contract At Market; { buys at open of next bar }
If (Close <= Average(Close, Len)) And
   (Close[1] > Average(Close, Len)[1]) Then
   Sell("B") 1 Contract At Market; { sells at open of next bar }

This system goes long one contract at tomorrow's open when the close crosses above its moving average, and goes short one contract when the close crosses below the moving average. Each order is given a name or identifier: A for the buy, B for the sell. The length of the moving average (Len) may be set by the user or optimized by the software.

Below is the same system programmed in C++ using Scientific Consultant Services' component-based C-Trader toolkit, which includes the C++ Trading Simulator.
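Since the toolkit's exact calls are not reproduced here, the sketch that follows is only a rough, self-contained illustration of the same crossover logic in C++. The SimAccount class and its buyOpen and sellOpen methods are invented stand-ins rather than the actual C-Trader API, although the explicit current-bar index (cb) and simulator-instance reference (ts) mirror the features discussed next.

#include <vector>

// Invented stand-in for a simulator/account class; NOT the C-Trader API.
class SimAccount {
public:
    void buyOpen(char id, int contracts)  { /* queue a buy at next bar's open  */ }
    void sellOpen(char id, int contracts) { /* queue a sell at next bar's open */ }
};

// Simple moving average of the last 'len' closes ending at bar 'cb'.
static double average(const std::vector<double>& close, int len, int cb) {
    double sum = 0.0;
    for (int i = cb - len + 1; i <= cb; ++i) sum += close[i];
    return sum / len;
}

// Moving-average crossover logic, bar by bar, against one simulated account.
void runModel(const std::vector<double>& close, int len, SimAccount& ts) {
    for (int cb = len; cb < static_cast<int>(close.size()); ++cb) {
        double maNow  = average(close, len, cb);
        double maPrev = average(close, len, cb - 1);
        if (close[cb] > maNow && close[cb - 1] <= maPrev)
            ts.buyOpen('A', 1);   // go long one contract at the next open
        if (close[cb] <= maNow && close[cb - 1] > maPrev)
            ts.sellOpen('B', 1);  // go short one contract at the next open
    }
}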

Except for syntax and naming conventions, the differences between the C++ and Easy Language implementations are small. Most significant are the explicit references to the current bar (cb) and to a particular simulated trading account or simulator class instance (ts) in the C++ implementation. In C++, it is possible to explicitly declare and reference any number of simulated accounts; this becomes important when working with portfolios and metasystems (systems that trade the accounts of other systems), and when developing models that incorporate an implicit walk-forward adaptation.

TYPES OF SIMULATORS

There are two major forms of trading simulators. One form is the integrated, easy-to-use software application that provides some basic historical analysis and simulation along with data collection and charting. The other form is the specialized software component or class library that can be incorporated into user-written software to provide system testing and evaluation functionality. Software components and class libraries offer open architecture, advanced features, and high levels of performance, but they require programming expertise and such additional elements as graphics, report generation, and data management to be useful. Integrated application packages, although generally offering less powerful simulation and testing capabilities, are much more accessible to the novice.

DATA SOURCES AND VENDORS

Today there are a great many sources from which data may be acquired. Data may be purchased from value-added vendors, downloaded from any of several exchanges, and extracted from a wide variety of databases accessible over the Internet and on compact discs.

Value-added vendors, such as Tick Data and Pinnacle, whose data have been used extensively in this work, can supply the trader with relatively clean data in easy-to-use form. They also provide convenient update services and, at least in the case of Pinnacle, error corrections that are handled automatically by the downloading software, which makes the task of maintaining a reliable, up-to-date database very straightforward. Popular suppliers of end-of-day commodities data include Pinnacle Data Corporation (800-724-4903), Prophet Financial Systems (650-322-4183), Commodities Systems Incorporated (CSI, 800-274-4727), and Technical Tools (800-231-8005). Intraday historical data, which are needed for testing short-time-frame systems, may be purchased from Tick Data (800-822-8425) and Genesis Financial Data Services (800-621-2628). Day traders should also look into Data Transmission Network (DTN, 800-485-4000), Data Broadcasting Corporation (DBC, 800-367-4670), Bonneville Market Information (BMI, 800-532-3400), and FutureSource-Bridge (800-621-2628); these data distributors can provide the fast, real-time data feeds necessary for successful day trading. For additional information on data sources, consult Marder (1999). For a comparative review of end-of-day data, see Knight (1999).

Data need not always be acquired from a commercial vendor. Sometimes it can be obtained directly from the originator. For instance, various exchanges occasionally furnish data directly to the public. Options data can currently be downloaded over the Internet from the Chicago Board of Trade (CBOT). When a new contract is introduced and the exchange wants to encourage traders, it will often release a kit containing data and other information of interest. Sometimes this is the only way to acquire certain kinds of data cheaply and easily.

Finally, a vast, mind-boggling array of databases may be accessed using an Internet web browser or ftp client. These days almost everything is on-line. For example, the Federal Reserve maintains files containing all kinds of economic time series and business cycle indicators. NASA is a great source for solar and astronomical data. Climate and geophysical data may be downloaded from the National Climatic Data Center (NCDC) and the National Geophysical Data Center (NGDC), respectively. For the ardent net-surfer, there is an overwhelming abundance of data in a staggering variety of formats. Therein, however, lies another problem: A certain level of skill is required in the art of the search, as is perhaps some basic programming or scripting experience, as well as the time and effort to find, tidy up, and reformat the data. Since "time is money," it is generally best to rely on a reputable, value-added data vendor for basic pricing data, and to employ the Internet and other sources for data that are more specialized or difficult to acquire.

Additional sources of data also include databases available through libraries and on compact discs. ProQuest and other periodical databases offer full text retrieval capabilities and can frequently be found at the public library. Bring a floppy disk along and copy any data of interest. Finally, do not forget newspapers such as Investor’s Business Daily, Barron’s, and the Wall Street Journal; these can be excellent sources for certain kinds of information and are available on microfilm from many libraries.

In general, it is best to maintain data in a standard text-based (ASCII) format. Such a format has the virtue of being simple, portable across most operating systems and hardware platforms, and easily read by all types of software, from text editors to charting packages.
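As a small illustration of how easily such text files can be consumed, the C++ sketch below reads a comma-delimited file of date, open, high, low, close, and volume fields. The column layout shown is an assumption made for the example and will vary from vendor to vendor.

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// One bar of price data as it might appear in a comma-delimited ASCII file.
struct Bar {
    std::string date;
    double open, high, low, close;
    long   volume;
};

// Reads lines of the assumed form: YYYYMMDD,open,high,low,close,volume
std::vector<Bar> readAsciiBars(const std::string& path) {
    std::vector<Bar> bars;
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        Bar b;
        std::string field;
        if (!std::getline(ss, b.date, ',')) continue;   // skip blank lines
        std::getline(ss, field, ','); b.open   = std::stod(field);
        std::getline(ss, field, ','); b.high   = std::stod(field);
        std::getline(ss, field, ','); b.low    = std::stod(field);
        std::getline(ss, field, ','); b.close  = std::stod(field);
        std::getline(ss, field, ','); b.volume = std::stol(field);
        bars.push_back(b);
    }
    return bars;
}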

DATA QUALITY

Data quality varies from excellent to awful. Since bad data can wreak havoc with all forms of analysis, lead to misleading results, and waste precious time, only use the best data that can be found when running tests and trading simulations. Some forecasting models, including those based on neural networks, can be exceedingly sensitive to a few errant data points; in such cases, the need for clean, error-free data is extremely important. Time spent finding good data, and then giving it a final scrubbing, is time well spent.

Data errors take many forms, some more innocuous than others. In real-time trading, for example, ticks are occasionally received that have extremely deviant, if not obviously impossible, prices. The S&P 500 may appear to be trading at 952.00 one moment and at 250.50 the next! Is this the ultimate market crash? No: a few seconds later, another tick comes along, indicating the S&P 500 is again trading at 952.00 or thereabouts. What happened? A bad tick, a "noise spike," occurred in the data. This kind of data error, if not detected and eliminated, can skew the results produced by almost any mechanical trading model. Although anything but innocuous, such errors are obvious, are easy to detect (even automatically), and are readily corrected or otherwise handled. More innocuous, albeit less obvious and harder to find, are the common, small errors in the settling price and other numbers reported by the exchanges, which are frequently passed on to the consumer by the data vendor. Better data vendors repeatedly check their data and post corrections as such errors are detected. For example, on an almost daily basis, Pinnacle Data posts error corrections that are handled automatically by its software. Many of these common, small errors are not seriously damaging to software-based trading simulations, but one never knows for sure.

Depending on the sensitivity of the trading or forecasting model being analyzed, and on such other factors as the availability of data-checking software, it may be worthwhile to run miscellaneous statistical scans to highlight suspicious data points. There are many ways to flag these data points, or outliers, as they are sometimes referred to by statisticians. Missing, extra, and logically inconsistent data points are also occasionally seen; they should be noted and corrected. As an example of data checking, two data sets were run through a utility program that scans for missing data points, outliers, and logical inconsistencies. The results appear in Tables 1-1 and 1-2, respectively.

Table 1-1 shows the output produced by the data-checking program when it was used on Pinnacle Data Corporation's (800-724-4903) end-of-day, continuous-contract data for the S&P 500 futures. The utility found no illogical prices or volumes in this data set; there were no observed instances of a high that was less than the close, a low that was greater than the open, a volume that was less than zero, or of any cognate data faux pas. Two data points (bars) with suspiciously high ranges, however, were noted by the software: One bar with unusual range occurred on 10/19/87 (or 871019 in the report). The other was dated 10/13/89. The abnormal range observed on 10/19/87 does not reflect an error, just the normal volatility associated with a major crash like that of Black Monday; nor is a data error responsible for the aberrant range seen on 10/13/89, which appeared due to the so-called anniversary effect. Since these statistically aberrant data points were not errors, corrections were unnecessary. Nonetheless, the presence of such data points should emphasize the fact that market events involving exceptional ranges do occur and must be managed adequately by a trading system. All ranges shown in Table 1-1 are standardized ranges, computed by dividing a bar's range by the average range over the last 20 bars. As is common with market data, the distribution of the standardized range had a longer tail than would be expected given a normally distributed underlying process. Nevertheless, the events of 10/19/87 and 10/13/89 appear to be statistically exceptional: The distribution of all other range data declined, in an orderly fashion, to zero at a standardized value of 7, well below the range of 10 seen for the critical bars.
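A scan of the kind just described is straightforward to sketch. The C++ version below checks the basic logical relationships among the price fields and flags bars whose standardized range (the bar's range divided by the trailing 20-bar average range) exceeds a threshold; the threshold value and the field layout are arbitrary illustrative choices, not those of the utility used to produce Tables 1-1 and 1-2.

#include <cstdio>
#include <vector>

struct Bar { long date; double open, high, low, close; double volume; };

// Flags logically inconsistent bars and bars with unusually large ranges.
// The standardized range divides a bar's range by the average range of the
// preceding 20 bars; the threshold of 6 is an arbitrary illustrative choice.
void scanBars(const std::vector<Bar>& bars) {
    const int lookback = 20;
    const double threshold = 6.0;
    for (size_t i = 0; i < bars.size(); ++i) {
        const Bar& b = bars[i];

        // Logical consistency checks on the price and volume fields.
        if (b.high < b.low || b.high < b.close || b.high < b.open ||
            b.low > b.close || b.low > b.open || b.volume < 0)
            std::printf("%ld: logically inconsistent bar\n", b.date);

        // Standardized range check against the trailing average range.
        if (i >= static_cast<size_t>(lookback)) {
            double avgRange = 0.0;
            for (size_t j = i - lookback; j < i; ++j)
                avgRange += bars[j].high - bars[j].low;
            avgRange /= lookback;
            if (avgRange > 0.0 && (b.high - b.low) / avgRange > threshold)
                std::printf("%ld: suspiciously large range\n", b.date);
        }
    }
}

A close-to-close deviance check of the sort described next would work the same way, substituting the absolute close-to-close change for the bar's range.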

The data-checking utility also flagged 5 bars as having exceptionally deviant closing prices. As with range, deviance has been defined in terms of a distribution, using a standardized close-to-close price measure. In this instance, the standardized measure was computed by dividing the absolute value of the difference between each closing price and its predecessor by the average of the preceding 20 such absolute values. When the 5 flagged (and most deviant) bars were omitted, the same distributional behavior that characterized the range was observed: a long-tailed distribution of close-to-close price change that fell off, in an orderly fashion, to zero at 7 standardized units. Standardized close-to-close deviance scores (DEV) of 8 were noted for 3 of the aberrant bars, and scores of 10 were observed for the remaining 2 bars. Examination of the flagged data points again suggests that unusual market activity, rather than data error, was responsible for their statistical deviance. It is not surprising that the 2 most deviant data points were the same ones noted earlier for their abnormally high range. Finally, the data-checking software did not find any missing bars, bars falling on weekends, or bars with duplicate or out-of-order dates. The only outliers detected appear to be the result of bizarre market conditions, not corrupted data. Overall, the S&P 500 data series appears to be squeaky-clean. This was expected: In our experience, Pinnacle Data Corporation (the source of the data) supplies data of very high quality.

As an example of how bad data quality can get, and the kinds of errors that can be expected when dealing with low-quality data, another data set was analyzed with the same data-checking utility. This data, obtained from an acquaintance, was for Apple Computer (AAPL). The data-checking results appear in Table 1-2.

In this data set, unlike in the previous one, 2 bars were flagged for having outright logical inconsistencies. One logically invalid data point had an opening price of zero, which was also lower than the low, while the other bar had a high price that was lower than the closing price. Another data point was detected as having an excessive range, which may or may not be a data error. In addition, several bars evidenced extreme closing price deviance, perhaps reflecting uncorrected stock splits. There were no duplicate or out-of-order dates, but quite a few data points were missing. In this instance, the missing data points were holidays and, therefore, only reflect differences in data handling: for a variety of reasons, we usually fill holidays with data from previous bars. Considering that the data series extended only from 1/2/97 through 11/6/98 (in contrast to the S&P 500, which ran from 1/3/83 to 5/21/98), it is distressing that several serious errors, including logical violations, were detected by a rather simple scan.

The implication of this exercise is that data should be purchased only from a reputable vendor who takes data quality seriously; this will save time and help ensure reliable, error-free data for system development, testing, and trading. In addition, all data should be scanned for errors to avoid disturbing surprises. For an in-depth discussion of data quality, which includes coverage of how data is produced, transmitted, received, and stored, see Jurik (1999).