Harnessing the (mis)Behavior of Markets: Brownian Motion and Stock Prices
by Rick Martinelli and Neil Rhoads
Haiku Laboratories, March 2006
Copyright © 2006


Note: this document was published in slightly abbreviated form in the June 2006 issue of Technical Analysis of Stocks & Commodities, under the title Harnessing the (mis)Behavior of Markets.


Introduction
Brownian Motion and Stock Prices
Trading Schemes and the Fortune Indicator
More Examples
Conclusions and Open Questions
References
Table I
Figures

1. INTRODUCTION

In 1900 Louis Bachelier received a doctorate from the University of Paris with a dissertation entitled “Théorie de la Spéculation”, an event that marked the first time a serious academic paper addressed the behavior of markets [1].  In his dissertation, Bachelier proposed that market prices could be modeled as something called Brownian motion.  Slowly, his ideas were adopted by the financial community and are now the foundation of modern financial engineering.  The idea of Brownian motion arose when a botanist named Robert Brown described the chaotic behavior of pollen grains suspended in a fluid and viewed under a microscope.  Their motion was later shown to be due to large numbers of random molecular forces impinging on the grains.  Using similar reasoning, Bachelier assumed that market prices vary due to large numbers of random effects, such as the whims of individual traders, and hence may be modeled as Brownian motion.

Three critical assumptions underlie the Brownian model, namely:

                                  1. price changes are statistically independent,

                                  2. price changes are normally distributed, and

                                  3. price-change statistics do not vary over time. 

The first assumption means that price changes behave like coin tosses, where the current change is not influenced by past changes and has no influence on future changes.  The second assumption says the changes follow a bell-shaped curve.  This assumption is relevant whenever random behavior is due to many small influences.  It provides a distribution function characterized by only two parameters, the mean and standard deviation, and implies a certain ‘contained’ behavior of the changes.  The third assumption says the mean and standard deviation do not change with time.

Knowledgeable investors might take exception to one or all of these assumptions.  In fact, there is ample evidence that these assumptions are often violated in the real markets.  The recent book The (mis)Behavior of Markets [2] documents many of these violations.  It discusses, for example, the 1987 stock market crash, in which the Dow Jones Industrial Average showed a price change equal to about 18 standard deviations, an event with a probability of about one in 10^17 if the second assumption is true.  A glance at price changes for many of the more volatile stocks over a long enough time span suggests that the third assumption is often violated as well.  The violation of either assumption 2 or 3 may produce similar effects, namely, larger than usual excursions in price.  A question is: are these excursions accompanied by precursors, that is, smaller changes in the same direction, as with earthquakes?  If so, then the changes are locally correlated, in violation of assumptions 1 and 3, and it may be possible to harness those correlations by means of a simple linear predictor and some statistical data.  This article describes some results from an attempt to do just that.  We focus on the statistical behavior of the stock charts and ignore real-world issues such as commissions on trades.

2. BROWNIAN MOTION AND STOCK PRICES

The three assumptions listed in the Introduction constitute the technical definition of a white noise.  Consequently, a Brownian motion is characterized by the fact that its changes are a white noise.  That is, if we suppose a sequence of stock closing prices P0, P1,…,PN is a sample from a Brownian motion process, then its price changes W1, W2,…,WN defined by

Wn = Pn – Pn-1,

form a white noise sequence characterized by

       1. each Wn is normally distributed with mean zero and (fixed) standard deviation σ

       2. the Wn are statistically independent (uncorrelated).

A larger standard deviation of W means a broader bell curve and greater volatility in the prices.  Figure 1 shows a graph of 253 days of price changes for General Motors (GM), starting 11/01/04 and ending 10/28/05 (data source: Yahoo Finance, adjusted closes).  The numbers have been normalized so that vertical units are standard deviations (sigmas): their mean (-0.039) has been subtracted and the result divided by their standard deviation (σ = 0.727).  For reference, Figure 2 shows a graph of 253 points of computer-generated white noise having the same mean and standard deviation.  This sequence is typical in that its values rarely exceed two sigmas, in contrast to GM, whose largest change is more than seven sigmas.  Our goal is to capitalize on these large sigmas.
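The normalization behind Figure 1 and the white-noise comparison in Figure 2 can be sketched in a few lines of Python.  This is only an illustration, not the authors' spreadsheet: the function name normalized_changes, the random seed, and the choice of the population standard deviation are all assumptions made here.

```python
import numpy as np

def normalized_changes(prices):
    """Price changes W_n = P_n - P_(n-1), shifted and scaled to sigma units.

    Assumption: the population standard deviation (ddof=0) is used.
    """
    w = np.diff(np.asarray(prices, dtype=float))
    return (w - w.mean()) / w.std()

# Hypothetical white noise with the mean and standard deviation quoted for GM.
rng = np.random.default_rng(seed=0)          # arbitrary seed, for repeatability
noise = rng.normal(loc=-0.039, scale=0.727, size=253)
z = (noise - noise.mean()) / noise.std()

# For Gaussian noise the largest |z| over 253 points is usually far below 7.
print("largest excursion in sigmas:", np.abs(z).max())
```

Running normalized_changes on a real series of closes and comparing its largest excursion with the simulated one gives a quick, informal view of how far the data stray from the bell curve.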

Prices may be recovered from their changes according to the simple formula

$$P_k = P_0 + \sum_{n=1}^{k} W_n ,$$

where P0 is the starting price.  In the case of GM, P0 is known to be 37.16 on 11/01/04 and the resulting prices are shown in Figure 3.  Applying the same starting price to the computer-generated values in Figure 2 gives the simulated prices shown in Figure 4; the obvious differences between Figures 1 and 2 are not so obvious in Figures 3 and 4.  As pointed out in [2], “fake charts” like Figure 4 are virtually indistinguishable from real stock charts.
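Recovering prices from their changes is a running sum added to the starting price.  A minimal Python sketch follows; the function name prices_from_changes and the seed are hypothetical, and the simulated series is not the one plotted in Figure 4.

```python
import numpy as np

def prices_from_changes(p0, changes):
    """Return P_0..P_N, where P_k = P_0 + (W_1 + ... + W_k)."""
    return p0 + np.concatenate(([0.0], np.cumsum(changes)))

# A "fake chart": white noise changes accumulated from GM's starting price.
rng = np.random.default_rng(seed=1)          # arbitrary seed
w = rng.normal(-0.039, 0.727, size=253)      # mean and sigma quoted for GM
fake = prices_from_changes(37.16, w)

# Differencing the prices gives back the changes, so the two views are equivalent.
assert np.allclose(np.diff(fake), w)
```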

The summation that appears in the last equation represents the difference in the stock’s price between day 0 and day k.  Consequently

$$F_k = \frac{1}{P_0} \sum_{n=1}^{k} W_n$$

represents the fractional change in price between day 0 and day k; that is, an investment of D dollars on day 0 results in a profit/loss of D·Fk on day k.  For this reason, Fk is called the Fortune indicator.  But suppose we modify the equation slightly to get

$$F_k = \frac{1}{P_0} \sum_{n=1}^{k} A_{n-1} W_n ,$$

where the An’s have the special (magic?) property that

An-1 = +1 if  Wn > 0

An-1 = -1  if  Wn < 0

making An-1Wn always positive (the case Wn = 0 is ignored).  In this case, a graph of Fk starts at zero and increases thereafter, a sort of Holy Grail for traders!  Of course, the problem is that An-1 must be calculated on the day before Wn is measured, that is, predicted on the previous day.  A trading scheme that predicts the An’s in an attempt to capture this behavior is described next.
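The effect of the “magic” An’s can be illustrated with a short Python sketch.  It assumes the Fortune is the running sum of An-1·Wn divided by the starting price P0, as the description in this section suggests; the function name fortune and the four sample changes are hypothetical.

```python
import numpy as np

def fortune(p0, changes, positions):
    """Running Fortune F_1..F_N: cumulative sum of A_(n-1) * W_n, divided by P_0.

    positions[i] is the position A_(n-1) that multiplies changes[i] = W_n.
    """
    w = np.asarray(changes, dtype=float)
    a = np.asarray(positions, dtype=float)
    return np.cumsum(a * w) / p0

w = np.array([0.5, -1.0, 0.25, -0.75])       # made-up price changes
buy_hold = fortune(20.0, w, np.ones_like(w)) # always long: plain fractional change
magic = fortune(20.0, w, np.sign(w))         # perfect foresight of each sign

print(buy_hold)   # rises and falls with the price
print(magic)      # never decreases
```

With all positions set to +1 the function reduces to the ordinary fractional price change, which is why the same indicator can serve both the buy-and-hold comparison and the trading scheme.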

3. TRADING SCHEMES AND THE FORTUNE INDICATOR

A trading scheme can be thought of here as a device for generating buy/sell signals when presented with a set of market data.  The device employed in this scheme is a one-day-ahead predictor, which is based on the previous behavior of the market item.  A buy signal occurs when the predicted price change is large relative to the most recent changes.  A wager of one unit is placed and the resulting profit/loss taken at the close on the following day (there is no explicit sell signal).  Each day’s profit/loss is added to the previous day’s total to get the Fortune to date.

It is called a wager because this is not a buy and hold scheme; rather the scheme is designed primarily to exploit any mis-behavior in the data.  It is unrealistic in the sense that no one can trade at exactly the closing price each time.  However, if the scheme is to be implemented in the real world, one can monitor the stock’s intraday price and buy just before the close.  Similarly, the stock may be sold any time before the close on the following day.

A long or short position is determined by predicted direction of the price movement.  The question of when to place the wager is determined as follows.  If ΔP denotes the predicted closing price change, and σ the standard deviation of the most recent price changes, then the Alpha Indicator is defined as their ratio:

α = ΔP / σ .

(Notice that alpha is dimensionless.)  If alpha is greater than one or less than negative one, the predicted price change is larger in magnitude than the standard deviation of the recent changes.  Relatively large, positive values of alpha indicate a long position, and relatively large, negative values of alpha indicate a short position.  On a buy signal, a position of one investing unit is taken, and after the next close, the position is canceled and a profit/loss taken.  Then, tomorrow’s position is calculated and the procedure repeats.  Profits and losses are accumulated in the trader’s Fortune.

This then is the trading scheme, or algorithm.  It has been assumed that correlation within the price changes will be captured in the forecast to produce a larger ΔP than usual, which is in turn captured in α, whose calculation is now considered. 

Let Pn denote the closing price of a stock and ΔPn = Pn - Pn-1 its price change on day n.  The algorithm for finding α on day n, αn, is as follows; it is assumed that all computations are carried out in Excel.

·         use the built-in FORECAST function to calculate P̂n+1, the estimate of Pn+1, based on the 3 most recent prices: Pn, Pn-1, Pn-2

·         calculate ΔP̂n = P̂n+1 - Pn, the estimated price change for day n+1

·         use the built-in function STDEVP to find σn, the standard deviation of the last seven price changes: ΔPn, ΔPn-1, ΔPn-2, ΔPn-3, ΔPn-4, ΔPn-5, ΔPn-6

·         calculate αn = ΔP̂n / σn
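The steps above can be sketched outside Excel as well.  The Python below is an approximation under stated assumptions, not the authors' spreadsheet: FORECAST is taken to be a least-squares line through the three prices against consecutive day indices, extrapolated one day ahead, and STDEVP is taken to be the population standard deviation.  The function name alpha_series and its parameters are hypothetical.

```python
import numpy as np

def alpha_series(prices, fit_len=3, vol_len=7):
    """alpha[n] = (forecast of P_(n+1) minus P_n) / sigma_n, NaN where undefined."""
    p = np.asarray(prices, dtype=float)
    changes = np.diff(p)                     # changes[i] = p[i+1] - p[i]
    alpha = np.full(len(p), np.nan)
    x = np.arange(fit_len)                   # day indices 0, 1, 2 for the fit
    start = max(vol_len, fit_len - 1)        # first day with enough history
    for n in range(start, len(p)):
        # Least-squares line through the last fit_len prices (like FORECAST).
        slope, intercept = np.polyfit(x, p[n - fit_len + 1 : n + 1], 1)
        predicted = slope * fit_len + intercept      # one step past the window
        # Population std of the vol_len changes ending on day n (like STDEVP).
        sigma = changes[n - vol_len : n].std()
        if sigma > 0:
            alpha[n] = (predicted - p[n]) / sigma
    return alpha

# Hypothetical closes: a choppy stretch followed by three aligned rising days.
closes = [10, 11, 10, 11, 10, 11, 10, 11, 12, 13]
print(alpha_series(closes)[-1])
```

In this made-up series the last three closes lie on a straight line, so the forecast change is 1 while the recent changes have a standard deviation below 1, which pushes alpha above one.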

A forecast lag of three was chosen because it is the smallest value that can capture a trend, and it yields more potential wagers than larger values.  In calculating σn it was found that a lag of seven produced enough averaging while not involving data from the remote past.  If the lag is too short, small values of σn produce large values of αn, which may lead to losses.  Figure 5 shows αn values computed from the GM data in Figure 3.  Values range roughly between 2 and –2, and a question that arises is exactly what constitutes the “relatively large” value mentioned above.

A crucial parameter in this algorithm is the Alpha cutoff, denoted by C, where αn > C signals a long position and αn < -C a short position.  For a long position, An is set to +1, and for a short position it is set to –1; the 1 represents one investing unit.  Otherwise it is set to 0, and there is no wager for day n+1.  Thus the algorithm for finding An is

                                                      An = +1 if  αn > C

                                                      An = -1  if  αn < -C

                                                      An =  0  if -C ≤ αn ≤ C

Unlike the magic numbers described in Section 2 above, however, these An’s will not always yield a positive factor An-1Wn.  Figure 6 shows a graph of the Fortune indicator for GM using C = 1.04.  Notice that the graph is generally increasing with time, while GM is generally decreasing over the same time period.  The Excel spreadsheet was programmed to test values of C between 1 and 4 in increments of 0.01.  The value that maximized the Fortune on the last day (the LDF) was chosen as optimal C (1.04 in GM’s case).  The LDF for GM was 0.294, with the maximum Fortune of 0.305 occurring at day 239, just 14 days before the end of the data.  The number of wagers was 49, with 26 winners and 23 losers for a win-ratio of 0.531 (see Figure 7).  The two factors contributing to the rise in the Fortune are the win-ratio and the sizes of the wins and losses.  In the case of GM, the winners were generally larger than the losers.
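The cutoff rule and the search for the optimal C can be sketched as follows.  This Python is illustrative only: the function names are hypothetical, the Fortune is assumed to be normalized by the starting price, and alpha[n] is assumed to be the value computed at the close of day n.  Note that C is chosen with hindsight over the same data on which the Fortune is measured.

```python
import numpy as np

def positions_from_alpha(alpha, cutoff):
    """A_n = +1 if alpha_n > C, -1 if alpha_n < -C, else 0 (NaN counts as 0)."""
    al = np.nan_to_num(np.asarray(alpha, dtype=float), nan=0.0)
    a = np.zeros(len(al))
    a[al > cutoff] = 1.0
    a[al < -cutoff] = -1.0
    return a

def last_day_fortune(prices, alpha, cutoff):
    """LDF: sum of A_n * W_(n+1) over all days, divided by the starting price."""
    p = np.asarray(prices, dtype=float)
    a = positions_from_alpha(alpha, cutoff)
    w = np.diff(p)                           # w[i] is the change on day i+1
    return float(np.sum(a[:-1] * w) / p[0])  # the last position has no outcome yet

def best_cutoff(prices, alpha):
    """Scan C from 1.00 to 4.00 in steps of 0.01 and keep the largest LDF."""
    grid = np.round(np.arange(1.0, 4.0 + 1e-9, 0.01), 2)
    ldfs = [last_day_fortune(prices, alpha, c) for c in grid]
    i = int(np.argmax(ldfs))                 # first maximum wins on ties
    return float(grid[i]), ldfs[i]

# Hypothetical inputs, small enough to check by hand.
prices = [10, 11, 12, 11, 13]
alpha = [1.5, 0.5, -2.0, 1.2, 0.0]
print(best_cutoff(prices, alpha))
```

When several cutoffs give the same LDF, this sketch returns the smallest one; how the spreadsheet broke such ties is not stated.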

For comparison the so-called “buy-and-hold” position was also calculated as

(PN - P0) / P0.

This represents the fraction of the wager on day zero that is won or lost by simply waiting for day N, and must be compared with the LDF.  In the case of GM, which decreased in value, the B&H position was –0.266, hence the difference between the LDF and B&H was 0.560.  This means a trader employing the current scheme realizes a return 56 percentage points higher than that of an investor holding the stock.
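The comparison is simple arithmetic, shown here as a small Python sketch.  The function name buy_and_hold and the three sample prices are hypothetical; the LDF and B&H values are the GM figures quoted above.

```python
def buy_and_hold(prices):
    """Fractional gain or loss from holding between the first and last close."""
    return (prices[-1] - prices[0]) / prices[0]

# Made-up closes: a fall from 20 to 15 is a buy-and-hold position of -0.25.
print(buy_and_hold([20.0, 18.0, 15.0]))

# GM figures quoted in the text.
ldf, bh = 0.294, -0.266
diff = ldf - bh          # 0.294 - (-0.266) = 0.560
print(round(diff, 3))
```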

The Fortune indicator for the Brownian motion in Figure 4 is shown in Figure 8.  Here C = 1.42, the LDF is 0.117, the maximum Fortune is 0.160 occurring at day 144, the number of wagers is 30, with 18 winners and 12 losers for a win-ratio of 0.60.  The B&H value for this simulation was 0.45, with a difference of 0.333.  The two graphs in Figures 6 and 8 are similar in that they are both increasing on average, but the LDF in the Brownian case is somewhat smaller.  Figure 9 shows fewer wagers than GM, as would be expected from a comparison of Figures 1 and 2.  However, this behavior is not “typical”.  After numerous experiments, it appears that almost any other LDF, from zero to greater than one, can be obtained with a different computer-generated white noise sequence, depending on how the prices are patterned.

Price patterns can be understood by looking more closely at price data in the vicinity of a win or loss.  Figure 10 shows a segment of GM data near day 128 (May 4, 2005).  On this date GM jumped by 4.88 points, while price changes for the previous 7 days had a standard deviation of only 0.319.  The 3 previous closes were nicely aligned, yielding a predicted close of 27.45, with a change of 0.51, and an Alpha equal to 1.59.  Since this exceeded the cutoff of 1.04, the wager was set to +1 and a win of 0.181 was realized, a clear example of a large change being preceded by several smaller ones, all in the same direction.  (Note that this win was immediately followed by a loss of 0.059 when the stock had a loss of 1.89 on a predicted gain of 2.03.)  This win can be traced directly back to the +7 sigma outlier on day 128, an apparent violation of the Brownian assumption.  The other large outlier in Figure 1 is about –7 sigmas at day 94 (March 16, 2005) and yielded a win of 0.140.  The obvious question now is: how many other stocks enjoy this type of mis-behavior?
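The day-128 figures can be checked with a few lines of arithmetic.  In the Python sketch below, dividing each day's change by the prior close is an assumption made here because it happens to reproduce the quoted win and loss; how the spreadsheet normalized each wager is not spelled out in this passage, so this should be read as an observation rather than as the authors' formula.  All variable names are hypothetical.

```python
# Figures quoted in the text for GM around day 128.
predicted_close = 27.45
predicted_change = 0.51
sigma_7 = 0.319          # std of the previous 7 price changes
cutoff = 1.04
actual_change = 4.88
next_change = -1.89      # following day, taken after a predicted gain of 2.03

alpha = predicted_change / sigma_7                   # about 1.60
prior_close = predicted_close - predicted_change     # 26.94
position = 1 if alpha > cutoff else (-1 if alpha < -cutoff else 0)

# Assumption: the per-unit result is the change divided by the prior close.
win = position * actual_change / prior_close         # about 0.181
loss = next_change / (prior_close + actual_change)   # about -0.059, long position

print(round(alpha, 2), position, round(win, 3), round(loss, 3))
```

The small gap between the computed alpha and the quoted 1.59 is consistent with the inputs having been rounded for publication.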

4. MORE EXAMPLES

Thirty-two more experiments were conducted on twenty-eight stocks/indexes and four simulated Brownian motions.  The stock charts were selected at random from Yahoo’s most active list in some cases, and well-known items were chosen from the Dow and Nasdaq in others.  The results are summarized in Table I; items BM-1 through BM-4 are the simulated Brownian motions.  Column headings SDATE and EDATE represent start and end dates, respectively, for the item in the first column.  Approximately one year of data was used in each case.  The optimal C value is in the column labeled C; MAXF, B&H and LDF are, respectively, the maximum Fortune achieved in the date range, the buy-and-hold position and the last-day Fortune value.  The difference between the LDF and B&H position is shown in column DIFF; the table is ordered by decreasing values of DIFF.  Column #W is the total number of wagers and WLR is the win-ratio.

Several observations can be made. 

5. CONCLUSIONS AND OPEN QUESTIONS

The intent of this analysis was to test the feasibility of harnessing any mis-behavior in stock prices, which amounted to calculating the An’s that appear in the equation for the Fortune indicator.  In this regard, the majority of items in Table I did indeed beat their buy-and-hold position, the reason for which was traced to a particular price pattern as exemplified in Figure 10. Essentially, the price must move in the same direction for four consecutive days and, if the last change is large enough, a profit is realized.   

The related questions of whether most stocks are Brownian motions, and whether this has an effect on the Fortune indicator, are not as clear.  It appears as though most items are approximately Brownian motions and Bachelier’s model works well most of the time.  But when it fails, e.g., in large price excursions, the scheme often captures this behavior with an accompanying increase in the Fortune indicator.  For GM, the pattern had a seven-sigma price change, reducing the likelihood that it is a Brownian motion.  However, the data in Table I suggest there is no difference between true Brownian motions (the BM-items) and real-world market items, as far as the trading scheme is concerned.

The results suggest there are many items which would yield a positive DIFF under this scheme.  One outstanding question is how to choose them.  A closely related issue is: will an item continue to behave as it did historically or will assumption 3 in Section 1 be violated with disastrous consequences.  In general, how do we choose optimal C in the real world?  More questions:

The authors will attempt to address these issues in a future article.

6. REFERENCES

  1. "Théorie de la Spéculation", Louis Bachelier, Annales Scientifiques de l'École Normale Supérieure, 1900.
  2. The (mis)Behavior of Markets, Benoit Mandelbrot & Richard Hudson, Basic Books, New York, 2004.

Table I.  Summary of results from experiments on 28 stocks/indexes and 4 simulated Brownian motions.  Definitions of column headings are given in Section 4.

STOCK | SDATE    | EDATE    | C    | MAXF  | B&H    | LDF   | DIFF   | #W | WLR
------+----------+----------+------+-------+--------+-------+--------+----+-----
BM-1  |          |          | 1.27 | 2.924 |  1.614 | 2.606 |  0.992 | 29 | 0.55
RMBS  | 01/03/05 | 12/30/05 | 1.00 | 0.536 | -0.292 | 0.488 |  0.780 | 64 | 0.75
F     | 01/03/05 | 01/06/06 | 1.10 | 0.243 | -0.410 | 0.213 |  0.623 | 52 | 0.62
GM    | 11/01/04 | 10/28/05 | 1.04 | 0.305 | -0.266 | 0.294 |  0.560 | 49 | 0.53
BM-3  |          |          | 1.00 | 0.216 | -0.365 | 0.128 |  0.493 | 59 | 0.51
SYMC  | 01/10/05 | 01/09/06 | 1.31 | 0.172 | -0.180 | 0.172 |  0.352 | 41 | 0.54
IBM   | 10/22/04 | 10/21/05 | 1.19 | 0.242 | -0.038 | 0.231 |  0.269 | 47 | 0.65
SUNW  | 11/08/04 | 11/07/05 | 1.19 | 0.169 | -0.191 | 0.077 |  0.268 | 45 | 0.64
DELL  | 11/09/04 | 11/07/05 | 1.91 | 0.026 | -0.209 | 0.018 |  0.227 | 10 | 0.50
GE    | 01/07/05 | 01/05/06 | 2.22 | 0.000 | -0.207 | 0.000 |  0.207 |  0 |
CSCO  | 11/09/04 | 11/07/05 | 1.04 | 0.077 | -0.096 | 0.072 |  0.168 | 49 | 0.56
MSFT  | 01/10/05 | 01/09/06 | 1.00 | 0.203 |  0.014 | 0.177 |  0.163 | 76 | 0.66
ZMH   | 01/07/05 | 01/06/06 | 2.04 | 0.000 | -0.144 | 0.000 |  0.144 |  0 |
ORCL  | 01/07/05 | 01/06/05 | 1.04 | 0.174 | -0.016 | 0.106 |  0.122 | 54 | 0.54
BM-2  |          |          | 1.41 | 0.108 | -0.001 | 0.102 |  0.103 | 23 | 0.52
AMZN  | 01/07/05 | 01/06/06 | 1.00 | 0.263 |  0.131 | 0.233 |  0.102 | 68 | 0.58
DJ    | 01/07/05 | 01/06/06 | 1.50 | 0.028 | -0.054 | 0.024 |  0.078 | 13 | 0.38
INTC  | 11/09/04 | 11/08/05 | 1.34 | 0.145 |  0.074 | 0.120 |  0.046 | 32 | 0.72
SEBL  | 01/07/05 | 01/06/06 | 1.28 | 0.178 |  0.130 | 0.172 |  0.042 | 18 | 0.44
SIRI  | 01/10/05 | 01/09/06 | 1.30 | 0.076 | -0.008 | 0.025 |  0.033 | 30 | 0.57
TWX   | 08/06/04 | 08/04/05 | 1.01 | 0.097 |  0.075 | 0.081 |  0.006 | 59 | 0.54
BM-4  |          |          | 1.17 | 0.112 |  0.101 | 0.102 |  0.001 | 40 | 0.65
QQQQ  | 11/23/04 | 11/21/05 | 1.10 | 0.088 |  0.077 | 0.076 | -0.001 | 50 | 0.61
DJI   | 01/07/05 | 01/06/06 | 1.40 | 0.024 |  0.026 | 0.010 | -0.016 | 26 | 0.58
KCN   | 01/07/05 | 01/06/06 | 1.65 | 0.006 |  0.035 | 0.006 | -0.029 |  3 | 0.67
DIA   | 01/06/05 | 01/06/06 | 1.35 | 0.021 |  0.055 | 0.016 | -0.039 | 31 | 0.58
DUK   | 01/07/05 | 01/06/06 | 1.00 | 0.046 |  0.166 | 0.017 | -0.149 | 60 | 0.43
AZN   | 01/07/05 | 01/06/06 | 1.02 | 0.256 |  0.435 | 0.197 | -0.238 | 65 | 0.63
LSI   | 01/07/05 | 01/06/06 | 1.12 | 0.308 |  0.554 | 0.291 | -0.263 | 46 | 0.62
WFMI  | 01/03/05 | 12/30/05 | 2.42 | 0.000 |  0.666 | 0.000 | -0.666 |  0 |
GOOG  | 01/03/05 | 12/30/05 | 1.41 | 0.116 |  1.047 | 0.075 | -0.972 | 29 | 0.59
AAPL  | 01/10/05 | 01/09/06 | 1.38 | 0.188 |  1.206 | 0.163 | -1.043 | 29 | 0.52

7. FIGURES 

Figure 1. Normalized price changes for GM: 253 days, starting 11/01/04 and ending 10/28/05, mean=-0.039, standard deviation=0.727.

Figure 2. Computer generated white noise: 253 points, mean=-0.039, standard deviation=0.727.

Figure 3. Closing prices for GM – 253 days, generated from Figure 1.

Figure 4. A Brownian Motion - simulated closing prices from the white noise sequence in Figure 2 – 253 points

Figure 5.  αk values for GM data in Figure 3.

Figure 6.  Fortune indicator for GM data in Figure 3, C = 1.04

Figure 7.  Wins and Losses for GM data in Figure 3, C = 1.04

Figure 8.  Fortune indicator for Brownian motion data in Figure 4, C = 1.42

Figure 9.  Wins and Losses for Brownian motion data in Figure 4

Figure 10.  GM prices in the vicinity of a win, 12 points

Figure 11.  Last-day Fortunes versus Buy & Hold positions for the 32 items in Table I.  Red points are simulated Brownian motions.