In it's simplest form, an option is a financial derivative that gives the holder the right, but not the obligation, to either purchase (known as a call option) or to sell (known as a put option) an underlying security for a predetermined price, known as the strike price on a predetermined date, known as the exercise date. This kind of option, with a single exercise date at the end of the options life, is known as a European option. While this version of an option is the most simple, both conceptually and in terms of ease of modelling its value/risks, nearly all options traded on futures exchanges and nearly all stock options (options written on equity) are American. An American option differs from it's European counterpart in one important way. American options give the holder exercise rights at any moment in time during the contracts lifetime.
The widespread use of American options clearly gives rise to the need for accurate and reliable pricing and risk tools. In the following, we will derive the Black-Scholes equation for a European option on a simple stock option with continuous dividend yield, before generalising the result to American options. With the equation in place, we will propose and implement a general purpose solver for American options, and test the results on a few simple test cases.
In the Black-Scholes model, it is assumed that we operate within in a frictionless market without transaction costs, where prices can be observed continuously, and the market itself is efficient, in the sense that asset prices incorporate all available information and no arbitrage opportunities exist.
Consider an underlying asset emitting dividends continuously at a yield of over each infinitesimal time interval . While this assumption may seem unreasonable for a single stock, it becomes closer to reality in the case where our asset tracks an index, like the S&P 500. Likewise, consider a bank-account process , where investors can place their money to earn a risk-free return over any infinitesimal time increment . We assume that follows a geometric Brownian motion, with constant mean return and volatility . Specifically, we have the system
where is a Brownian motion. Clearly, the term arises from the fact that as dividends are paid out, the price of the underlying adjusts downwards to preclude any arbitrage opportunities.
The key insight of Black, Scholes and Merton that led to the famous Black Scholes PDE is that, in a complete market it is possible to dynamically hedge a position in an option using a varying (but known) quantity in the underlying asset. By buying and selling the asset in the correct quantity at each moment in time, it is theoretically possible to completely counteract any unexpected moves in the overall portfolio. Furthermore, if a portfolio grows at a predictable rate, it must grow at the risk-free rate - if not, one could finance their position using the bank-account and earn a guaranteed profit, an arbitrage opportunity we have assumed cannot exist.
Let our option have expiry date and let it's value at time be denoted by . Clearly, is a function of both and of . Consider a portfolio consisting of units of the stock at each time along with a short position in the option. We will denote the portfolio's value at time by , so that
By using a result of stochastic calculus known as Ito's Lemma on , we find that satisfies the stochastic differential equation
Next, by noting that in each time interval we receive units of dividend, the right-hand side of equation (1) satisfies
Hence, we find that
As noted earlier, by prudent choice of our quantity of underlying asset, we can remove the uncertainty associated with the above portfolio in any infinitesimal interval . By observing the first bracketed term in the above, we see that the choice
achieves this goal. Substituting in this value for yields
As this change contains no randomness, we find that or after substituting in equation (1),
Substituting this expression into equation (2) and rearranging yields
In turn, this gives us the celebrated Black-Scholes PDE
Along with a terminal payoff , the above gives us a well-defined PDE to solve for at all points and . Note that technically we need boundary conditions for and . A condition when drops directly out of the PDE itself - we find that at ,
which implies that
Likewise, when we find that the PDE is dominated by the second-order term and hence that
Empirically, particularly for vanilla options, it suffices to consider an upper pound anywhere larger than 3 standard deviations above the asset price, supposing that the initial value of the asset was such that the option was initially at-the-money, i.e. . Solving the SDE for and setting , where is a standard normal that we have set equal to 5 standard deviations, we find that satisfying
is a sufficient upper bound for which to set .
Without diving too deep into the body of theory, finite difference methods treat partial derivatives in PDEs such as equation (3) above as difference equations in their respective variables. For example, for a function , the finite difference approximation to the derivative could be given by
Assuming is sufficiently regular, a Taylor series expansion on the above shows that , i.e. this approximation is second-order in as .
The second key insight of finite difference methods is to discretise our dimensions on a discrete grid of points. By introducing natural numbers , representing the number of grid points in space and time respectively, we can divide our spatial dimension and time dimensions up uniformly via
We first perform this discretisation in , setting and using the finite-difference approximations
Isolating the terms in the Black-Scholes PDE that do not contain time derivatives, performing the spatial discretisations and substituting the above approximations gives
By discretising the boundary condition and considering the ghost point we see that
Substituting this condition into equation (4) shows that
With the above, we have a set of equations, one for each spatial grid point , . Denoting by , and the terms relating the point to , and respectively in equation (4), we may write equation (4) more succinctly as
where and . Specifically, we find that
Denoting the vector of points by
we find that the above yields matrix-vector system
where
With this system in place, we proceed to discretise in the dimension. To that end, we set and . Finally, we use a theta rule with to write
This setup encompasses both the implicit and explicit time-stepping schemes. Rewriting yields
With and the above can be rewritten as the system of equations
With we arrive at the explicit scheme, which is the easiest to implement but exhibits instability when too few timesteps are used relative to spatial steps . In general, we require a relationship of the form for some , where depends on the specific parameters of the problem, to ensure stability of the system. The explicit scheme generally exhibits first-order convergence in time and second-order convergence in space.
Setting brings us to the implicit scheme, where the solution must be found as the implicit solution of an equation of the form at each step . Fortunately, the matrix is sparse and efficient algorithms exist to solve problems of this form in linear time. Unlike the explicit scheme, the implicit scheme is unconditionally stable. Similarly to the explicit scheme, the implicit scheme also exhibits first-order convergence in time and second-order convergence in space.
Another scheme, known as the Crank-Nicolson scheme, where is a popular choice for the finite-difference scheme (5). This scheme offers unconditional stability as in the implicit case, but in this case also offers second-order convergence in both space and time. Like the implicit scheme, an implicit system must be solved at each step in time. Generally speaking, when using the Crank-Nicolson scheme, users will select so that the solution converges to the true solution at second-order in . Heuristically, Denoting the error by , we find that
Similarly, the computational cost of the calculation is linear in the spatial steps at each time step, and linear in time, so that
Roughly speaking, this says that decreasing the error by a factor of will require a corresponding increase in computational effort of , which can be thought of roughly as the amount of real-world time the calculation will take on a computer.
In the framework we have developed above, time flows in discrete steps of size . Hence, for our purposes it will be more convenient to consider options known as Bermudan options, which are similar to American options, except that they offer exercise rights only on certain dates instead of on all dates. It should be clear that a Bermudan option with exercise rights on the dates will converge in price to the equivalent American option when and .
To preclude arbitrage, at each timestep , the holder of the Bermudan option considered above will make a judgement as to whether there is more value in holding their option for a further timestep, or whether they are better off exercising at that moment in time. Specifically, if we let the solution to system (5) at time be denoted by , the Bermudan option has price at time given by satisfying
where denotes the element-wise maximum of the two vectors and . Hence, the updated algorithm
converges to the solution of the American option problem as .
With the finite difference scheme developed above, we are in a position to test our method in terms of its rate of convergence in its discretisation parameters and as well as its hyperparameter which determines the type of finite difference scheme we are using. In the following, we will consider the price of a call option on underlying with continuous dividends as presented above. We will use the parameters
In the case of European options, there is an analytic formula available for the price of call/put options under the Black-Scholes model. The formula, with the notation above, is given by
where is the standard cumulative normal distribution function. A straddle option consists of a combination of a put and a call option with the same strike and maturity. Below is a plot of the above at time for various values of along with the payoff function .
As noted previously, the explicit finite difference scheme, with , is unstable when we use too few time steps per space step, so we will not consider the explicit scheme. By comparing the error in the results of the finite difference method developed above for the implicit scheme and the Crank-Nicolson scheme as we increase , we can deduce the rate of convergence for different finite difference schemes. These results can be seen below.
We observe linear plots in log-log space for both schemes, with a slope of -1 for the implicit scheme and a slope of 02 for the Crank-Nicolson scheme, corresponding to first and second order convergence respectively. Due to its favourable convergence characteristics, we will only consider the Crank-Nicolson scheme when moving on to the American option problem.
Using the same option parameters as above, we now turn to the American option problem. Below is a plot of the American straddle price versus the underlying, with the European option price included for reference.
Unlike the European option, the value of the American option never drops below the payoff value. This is to be expected in a no-arbitrage environment. Likewise, since the American option gives the holder the same benefits as the European option holder plus more, it is also to be expected that the American option price should be just as great as its European equivalent - this is also reflected in the graph. To see this difference in more detail, we consider the spread in the American option versus its European equivalent and plot this against the underlying.
The above plot confirms that the American option is always worth at least as much as its European equivalent. It also shows that near the money, their prices are broadly similar, but looking at the wings we see that their prices start to diverge at an increasing rate. This observation does not always hold. For example, in the case of a European call with no dividends, the in-the-money European and American option both have exactly the same value - the payoff value. The above observation on the straddle on a dividend paying underlying shows that a sensible investor holding the in-the-money straddle on the call side would exercise early, lock in their profits and earn dividends on the underlying that they now hold.
With an American option pricing scheme in place, we can run a similar error analysis to that of the European option. However, in this case there is no analytic value to compare our approximations to. To sidestep this issue, there are a few proxies we could consider. Here, we consider the error in successive approximations on a finer grid. Observing the slope of a log-log plot of these differences versus the number of steps used gives us a proxy for the order of convergence of the error of our approximation compared to the true value. To see this, set and assume that our error satisfies . Roughly, we find that for some constant . Furthermore, by the triangle inequality, the difference in successive error terms satisfies
The left-hand-side of this inequality can be simplified to . If we assume that for some base then we find that this simplifies further to for some constant . Similarly, we find that the right-hand-side of the inequality simplifies to for some . If we observe that the middle term, the difference that we calculate, satisfies for some , then we are forced to conclude that . Below is a log-log plot of the norm of successive differences in our approximation.
In contrast to the European case, it appears that the Crank-Nicolson scheme has dropped to first-order convergence. This is a downside of the naive American scheme developed above. More complex iterative schemes for solving the problem have been proposed and explored, such as the use of successive over-relaxation schemes, or schemes based on Jacobi's method. While these schemes can offer improved performance, for 1D problems such as the option pricing problem above, our scheme performs sufficiently fast for most purposes and has the added benefit of easy interpretability.
While finite-difference methods perform excellently for low-dimensional problems and are perfectly suited for backward moving American option problems, if we introduce extra-dimensionality or path-dependence, finite-difference methods quickly run into issues. The other major paradigm for option pricing are Monte Carlo methods. Monte Carlo methods run stochastic simulations forwards through time, then average over all these simulations to calculate the option's price as an expectation. In contrast to finite-difference methods, Monte Carlo methods are inherently forward moving, so handle path-dependence with ease. They also scale excellently to high-dimensional problems. While the forward moving nature of Monte Carlo methods is a plus for path-dependence, it is a major downside when it comes to pricing options with embedded optionality like American or Bermudan options. In these cases, a naive application of iterated expectations leads to an exponential explosion in the computational complexity of the problem. A breakthrough in the pricing of American options by Monte Carlo methods came in 2001, in the form of the Longstaff-Schwartz method. In a following post, we will explore the pricing of American Options in the Monte Carlo framework by utilising the work of Longstaff and Schwartz.