Pricing American Options I – Finite Difference Methods

Numerical Methods




In it's simplest form, an option is a financial derivative that gives the holder the right, but not the obligation, to either purchase (known as a call option) or to sell (known as a put option) an underlying security for a predetermined price, known as the strike price on a predetermined date, known as the exercise date. This kind of option, with a single exercise date at the end of the options life, is known as a European option. While this version of an option is the most simple, both conceptually and in terms of ease of modelling its value/risks, nearly all options traded on futures exchanges and nearly all stock options (options written on equity) are American. An American option differs from it's European counterpart in one important way. American options give the holder exercise rights at any moment in time during the contracts lifetime.

The widespread use of American options clearly gives rise to the need for accurate and reliable pricing and risk tools. In the following, we will derive the Black-Scholes equation for a European option on a simple stock option with continuous dividend yield, before generalising the result to American options. With the equation in place, we will propose and implement a general purpose solver for American options, and test the results on a few simple test cases.

The Black-Scholes Equation

In the Black-Scholes model, it is assumed that we operate within in a frictionless market without transaction costs, where prices can be observed continuously, and the market itself is efficient, in the sense that asset prices incorporate all available information and no arbitrage opportunities exist.

Consider an underlying asset (S(t))t0(S(t))_{t\ge 0} emitting dividends continuously at a yield of yy over each infinitesimal time interval dt\text{d}t. While this assumption may seem unreasonable for a single stock, it becomes closer to reality in the case where our asset tracks an index, like the S&P 500. Likewise, consider a bank-account process (B(t))t0(B(t))_{t\ge 0}, where investors can place their money to earn a risk-free return rr over any infinitesimal time increment dt\text{d}t. We assume that SS follows a geometric Brownian motion, with constant mean return μ\mu and volatility σ\sigma. Specifically, we have the system

dS(t)=S(t)((μy)dt+σdW(t)),dB(t)=B(t)rdt, \begin{aligned} \text{d}S(t) & = S(t)\left( (\mu - y) \text{d}t + \sigma \text{d}W(t) \right), \\ \text{d}B(t) & = B(t) r \text{d} t, \end{aligned}

where (W(t))t0(W(t))_{t\ge 0} is a Brownian motion. Clearly, the yS(t)dt-yS(t)\text{d}t term arises from the fact that as dividends are paid out, the price of the underlying adjusts downwards to preclude any arbitrage opportunities.

The key insight of Black, Scholes and Merton that led to the famous Black Scholes PDE is that, in a complete market it is possible to dynamically hedge a position in an option using a varying (but known) quantity in the underlying asset. By buying and selling the asset in the correct quantity at each moment in time, it is theoretically possible to completely counteract any unexpected moves in the overall portfolio. Furthermore, if a portfolio grows at a predictable rate, it must grow at the risk-free rate - if not, one could finance their position using the bank-account and earn a guaranteed profit, an arbitrage opportunity we have assumed cannot exist.

Let our option have expiry date T>0T > 0 and let it's value at time t[0,T]t \in [0,T] be denoted by VV(t,S(t))V \equiv V(t, S(t)). Clearly, VV is a function of both tt and of S(t)S(t). Consider a portfolio consisting of Δ(t)\Delta(t) units of the stock at each time tt along with a short position in the option. We will denote the portfolio's value at time tt by P(t)P(t), so that

P(t)=Δ(t)S(t)V.(1) P(t) = \Delta(t) S(t) - V. \tag{1}

By using a result of stochastic calculus known as Ito's Lemma on VV, we find that VV satisfies the stochastic differential equation

dV=VSdS(t)+(12σ2S(t)22VS2+Vt)dt. \text{d}V = \frac{\partial V}{\partial S} \text{d} S(t) + \left(\frac{1}{2}\sigma^2 S(t)^2 \frac{\partial^2 V}{\partial S^2} + \frac{\partial V}{\partial t} \right) \text{d}t.

Next, by noting that in each time interval dt\text{d} t we receive Δ(t)S(t)ydt\Delta(t)S(t)y \text{d}t units of dividend, the right-hand side of equation (1) satisfies

d(Δ(t)S(t)V)=Δ(t)dS(t)+yΔ(t)S(t)dtdV. \text{d} \left( \Delta(t) S(t) - V \right) = \Delta(t) \text{d} S(t) + y\Delta(t)S(t) \text{d}t - \text{d} V.

Hence, we find that

dP(t)=(Δ(t)VS)dS(t)+(yΔ(t)S(t)12σ2S(t)22VS2Vt)dt. \text{d}P(t) = \left( \Delta(t) - \frac{\partial V}{\partial S} \right) \text{d} S(t) + \left( y\Delta(t)S(t) - \frac{1}{2}\sigma^2 S(t)^2 \frac{\partial^2 V}{\partial S^2} - \frac{\partial V}{\partial t} \right) \text{d}t.

As noted earlier, by prudent choice of our quantity of underlying asset, we can remove the uncertainty associated with the above portfolio in any infinitesimal interval dt\text{d}t. By observing the first bracketed term in the above, we see that the choice

Δ(t)=VS \Delta(t) = \frac{\partial V}{\partial S}

achieves this goal. Substituting in this value for Δ(t)\Delta(t) yields

dP(t)=(yS(t)VS12σ2S(t)22VS2Vt)dt.(2) \text{d}P(t) = \left( y S(t) \frac{\partial V}{\partial S} - \frac{1}{2}\sigma^2 S(t)^2 \frac{\partial^2 V}{\partial S^2} - \frac{\partial V}{\partial t} \right) \text{d}t. \tag{2}

As this change contains no randomness, we find that dP(t)=rP(t)dt\text{d}P(t) = r P(t)\text{d}t or after substituting in equation (1),

dP(t)=r(S(t)VSV)dt. \text{d}P(t) = r \left(S(t) \frac{\partial V}{\partial S} - V\right)\text{d}t.

Substituting this expression into equation (2) and rearranging yields

(Vt+12σ2S(t)22VS2+(ry)S(t)VSrV)dt=0. \left(\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S(t)^2 \frac{\partial^2 V}{\partial S^2} + (r-y) S(t)\frac{\partial V}{\partial S} - rV \right) \text{d}t = 0.

In turn, this gives us the celebrated Black-Scholes PDE

Vt+12σ2S(t)22VS2+(ry)S(t)VSrV=0.(3) \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S(t)^2 \frac{\partial^2 V}{\partial S^2} + (r-y) S(t)\frac{\partial V}{\partial S} - r V = 0. \tag{3}

Along with a terminal payoff V(T,S(T)):=P(S)V(T, S(T)) := P(S), the above gives us a well-defined PDE to solve for V(t,S)V(t,S) at all points t[0,T]t \in [0,T] and S[0,)S \in [0,\infty). Note that technically we need boundary conditions for S=0S=0 and SS \rightarrow \infty. A condition when S=0S=0 drops directly out of the PDE itself - we find that at S=0S=0,

V(t,0)trV(t,0)=0, \frac{\partial V(t,0)}{\partial t} - rV(t,0) = 0,

which implies that

V(t,0)=P(S)exp(r(Tt)). V(t,0) = P(S) \exp(-r (T-t)).

Likewise, when SS \rightarrow \infty we find that the PDE is dominated by the second-order term and hence that

2V(t,S)S20. \frac{\partial^2 V(t, S)}{\partial S^2} \rightarrow 0.

Empirically, particularly for vanilla options, it suffices to consider an upper pound anywhere larger than 3 standard deviations above the asset price, supposing that the initial value of the asset was such that the option was initially at-the-money, i.e. S(0)=KS(0) = K. Solving the SDE for S(t)S(t) and setting W(T)=TZ=5TW(T) = \sqrt{T}\cdot Z = 5\sqrt{T}, where ZN(0,1)Z \sim \mathcal{N}(0,1) is a standard normal that we have set equal to 5 standard deviations, we find that S(T)S(T) satisfying

Sˉ=Kexp((r12σ2)T+5σT), \bar{S} = K \exp\left( \left(r - \tfrac{1}{2}\sigma^2 \right)T + 5\sigma\sqrt{T} \right),

is a sufficient upper bound for which to set 2V(t,Sˉ)S2=0\frac{\partial^2 V(t, \bar{S})}{\partial S^2} = 0.

The Finite-Difference Method

Without diving too deep into the body of theory, finite difference methods treat partial derivatives in PDEs such as equation (3) above as difference equations in their respective variables. For example, for a function f(x)f(x), the finite difference approximation to the derivative f(x)f'(x) could be given by

Dhf(x)=f(x+h)f(xh)2h. D_h f(x) = \frac{f(x+h) - f(x-h)}{2h}.

Assuming ff is sufficiently regular, a Taylor series expansion on the above shows that Dhf(x)f(x)O(h2)D_h f(x) - f'(x) \sim \mathcal{O}(h^2), i.e. this approximation is second-order in hh as h0h \rightarrow 0.

The second key insight of finite difference methods is to discretise our dimensions on a discrete grid of points. By introducing natural numbers N,MN,M, representing the number of grid points in space and time respectively, we can divide our spatial dimension and time dimensions up uniformly via

Sn=nΔS,  ΔS=SˉN1,  n=0,,N1,tm=mΔt,  Δt=TM1,  m=0,,M1. \begin{aligned} S_n & = n \Delta S, \; & \Delta S = \frac{\bar{S}}{N-1}, \; n & = 0,\ldots,N-1, \\ t_m & = m \Delta t, \; & \Delta t = \frac{T}{M-1}, \; m & = 0,\ldots,M-1. \end{aligned}

We first perform this discretisation in SS, setting Vn(t):=V(t,Sn)V_n(t) := V(t, S_n) and using the finite-difference approximations

VnSVn+1Vn12ΔS=:D1Vn(t),2VnS2Vn+12Vn+Vn1ΔS2=:D2Vn(t). \begin{aligned} \frac{\partial V_n}{\partial S} & \approx \frac{V_{n+1} - V_{n-1}}{2 \Delta S} =: D_1 V_n(t), \\ \frac{\partial^2 V_n}{\partial S^2} & \approx \frac{V_{n+1} - 2V_n + V_{n-1}}{\Delta S^2} =: D_2 V_n(t). \\ \end{aligned}

Isolating the terms in the Black-Scholes PDE that do not contain time derivatives, performing the spatial discretisations and substituting the above approximations gives

LVn(t)=12σ2Sn2D2Vn(t)+(ry)SnD1Vn(t)rVn(t)=12σ2n2(Vn+12Vn+Vn1)+12(ry)n(Vn+1Vn1)rVn(t),=12n(σ2n(ry))Vn1(t)+(σ2n2r)Vn(t)+12n(σ2n+(ry))Vn+1(t).(4) \begin{aligned} L V_n(t) & = \frac{1}{2}\sigma^2 S_n^2 D_2 V_n(t) + (r-y) S_n D_1 V_n(t) - r V_n(t) \\ & = \frac{1}{2}\sigma^2 n^2 \left(V_{n+1} - 2V_n + V_{n-1}\right) + \frac{1}{2} (r-y) n \left( V_{n+1} - V_{n-1} \right) - r V_n(t), \\ & = \frac{1}{2} n \left( \sigma^2 n - (r-y) \right)V_{n-1}(t) + \left(-\sigma^2 n^2 - r\right)V_n(t) + \frac{1}{2} n \left(\sigma^2 n + (r-y) \right) V_{n+1}(t). \tag{4} \end{aligned}

By discretising the boundary condition 2V(t,Sˉ)S2=0\frac{\partial^2 V(t, \bar{S})}{\partial S^2} = 0 and considering the ghost point VN(t)V_N(t) we see that

VN(t)=2VN1(t)VN2(t). V_N(t) = 2V_{N-1}(t) - V_{N-2}(t).

Substituting this condition into equation (4) shows that

LVN1(t)=12(N1)(σ2(N1)(ry))VN2(t)+(σ2(N1)2r)VN1(t)+12(N1)(σ2(N1)+(ry))(2VN1(t)VN2(t))=(N1)(ry)VN2(t)+(r+(N1)(ry))VN1(t). \begin{aligned} L V_{N-1}(t) & = \frac{1}{2} (N-1) \left( \sigma^2 (N-1) - (r-y) \right)V_{N-2}(t) + \left(-\sigma^2 (N-1)^2 - r\right)V_{N-1}(t) \\ & \quad\quad\quad + \frac{1}{2} (N-1) \left(\sigma^2 (N-1) + (r-y) \right) (2V_{N-1}(t) - V_{N-2}(t)) \\ & = -(N-1)\left(r-y\right)V_{N-2}(t) + \left(-r + (N-1)(r-y) \right)V_{N-1}(t). \end{aligned}

With the above, we have a set of NN equations, one for each spatial grid point SnS_n, n=0,N1n=0\ldots,N-1. Denoting by ana_n, bnb_n and cnc_n the terms relating the point VnV_n to Vn1V_{n-1}, VnV_n and Vn+1V_{n+1} respectively in equation (4), we may write equation (4) more succinctly as

LVn(t)=anVn1(t)+bnVn(t)+cnVn+1(t),n=0,,N1, L V_n(t) = a_n V_{n-1}(t) + b_n V_n(t) + c_n V_{n+1}(t), \quad n = 0,\ldots,N-1,

where a0=0a_0 = 0 and cN1=0c_{N-1} = 0. Specifically, we find that

an={12n(σ2n(ry)),n<N1,(N1)(ry),n=N1,bn={σ2n2r,n<N1,r+(N1)(ry),n=N1,cn={12n(σ2n+(ry)),n<N1,0,n=N1. \begin{aligned} a_n & = \begin{cases} \frac{1}{2} n \left( \sigma^2 n - (r-y) \right), & n < N-1, \\ -(N-1)\left(r-y\right), & n = N-1, \end{cases} \\ b_n & = \begin{cases} -\sigma^2 n^2 - r, & n < N-1, \\ -r + (N-1)(r-y), & n = N-1, \end{cases} \\ c_n & = \begin{cases} \frac{1}{2} n \left(\sigma^2 n + (r-y) \right), & n < N-1, \\ 0, & n = N-1. \end{cases} \end{aligned}

Denoting the vector of points V0(t),,VN1(t)V_0(t),\ldots,V_{N-1}(t) by

V(t)=(V0(t),,VN1(t)), V(t) = (V_0(t),\ldots, V_{N-1}(t))^\intercal,

we find that the above yields matrix-vector system

V(t)t=AV(t), \frac{\partial V(t)}{\partial t} = -A V(t),

where

A=(b0c000a1b1c10000aN2bN2cN200aN1bN1). A = \left( \begin{array}{cccccc} b_0 & c_0 & 0 & \cdots & \cdots & 0 \\ a_1 & b_1 & c_1 & \ddots & \cdots & 0 \\ 0 & \ddots & \ddots & \ddots & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & \ddots & a_{N-2} & b_{N-2} & c_{N-2} \\ 0 & \cdots & \cdots & 0 & a_{N-1} & b_{N-1} \end{array} \right).

With this system in place, we proceed to discretise in the tt dimension. To that end, we set Vm:=V(tm)V^m := V(t_m) and Vnm:=Vn(tm)V^m_n := V_n(t_m). Finally, we use a theta rule with θ[0,1]\theta \in [0,1] to write

Vm1VmΔt=θAVm1+(1θ)AVm,m=1,,M. \frac{V^{m-1} - V^m}{\Delta t} = \theta A V^{m-1} + (1-\theta) A V^m, \quad m = 1,\ldots,M.

This setup encompasses both the implicit and explicit time-stepping schemes. Rewriting yields

(IΔtθA)Vm1=(I+Δt(1θ)A)Vm,m=1,,M. (I - \Delta t \theta A) V^{m-1} = (I + \Delta t (1-\theta) A) V^m, \quad m = 1,\ldots,M.

With K1θ:=IΔtθAK_1^\theta := I - \Delta t \theta A and K2θ:=I+Δt(1θ)AK_2^\theta := I + \Delta t (1-\theta) A the above can be rewritten as the system of equations

VM=(P(S0),,P(SN1)),K1θVm1=K2θVm,m=1,,M.(5) \begin{aligned} V^M & = (P(S_0),\ldots,P(S_{N-1}))^\intercal, \\ K_1^\theta V^{m-1} & = K_2^\theta V^m, \quad m = 1,\ldots,M. \tag{5} \end{aligned}

With θ=0\theta = 0 we arrive at the explicit scheme, which is the easiest to implement but exhibits instability when too few timesteps MM are used relative to spatial steps NN. In general, we require a relationship of the form MλN2M \ge \lambda N^2 for some λ>0\lambda > 0, where λ\lambda depends on the specific parameters of the problem, to ensure stability of the system. The explicit scheme generally exhibits first-order convergence in time and second-order convergence in space.

Setting θ=1\theta = 1 brings us to the implicit scheme, where the solution Vm1V^{m-1} must be found as the implicit solution of an equation of the form KVm1=VmK V^{m-1} = V^m at each step m=1,,N1m = 1,\ldots,N-1. Fortunately, the matrix KK is sparse and efficient algorithms exist to solve problems of this form in linear time. Unlike the explicit scheme, the implicit scheme is unconditionally stable. Similarly to the explicit scheme, the implicit scheme also exhibits first-order convergence in time and second-order convergence in space.

Another scheme, known as the Crank-Nicolson scheme, where θ=12\theta = \tfrac{1}{2} is a popular choice for the finite-difference scheme (5). This scheme offers unconditional stability as in the implicit case, but in this case also offers second-order convergence in both space and time. Like the implicit scheme, an implicit system must be solved at each step in time. Generally speaking, when using the Crank-Nicolson scheme, users will select N=MN = M so that the solution converges to the true solution at second-order in NN. Heuristically, Denoting the error by ϵ\epsilon, we find that

ϵO(N2). \epsilon \sim \mathcal{O}(N^{-2}).

Similarly, the computational cost of the calculation is linear in the spatial steps at each time step, and linear in time, so that

C(ϵ)O(N2)O(ϵ1). C(\epsilon) \sim \mathcal{O}(N^2) \sim \mathcal{O}(\epsilon^{-1}).

Roughly speaking, this says that decreasing the error by a factor of α\alpha will require a corresponding increase in computational effort of α\alpha, which can be thought of roughly as the amount of real-world time the calculation will take on a computer.

Solving the American Option Problem

In the framework we have developed above, time flows in discrete steps of size Δt\Delta t. Hence, for our purposes it will be more convenient to consider options known as Bermudan options, which are similar to American options, except that they offer exercise rights only on certain dates instead of on all dates. It should be clear that a Bermudan option with exercise rights on the dates t0,,tM1t_0,\ldots,t_{M-1} will converge in price to the equivalent American option when MM \rightarrow \infty and Δt0\Delta t \rightarrow 0.

To preclude arbitrage, at each timestep tmt_m, the holder of the Bermudan option considered above will make a judgement as to whether there is more value in holding their option for a further timestep, or whether they are better off exercising at that moment in time. Specifically, if we let the solution to system (5) at time mm be denoted by HmH^m, the Bermudan option has price at time mm given by VmV^m satisfying

Vm=max(Hm,P(S)), V^m = \max(H^m, P(S)),

where max(X,Y)\max(X,Y) denotes the element-wise maximum of the two vectors XX and YY. Hence, the updated algorithm

VM=(P(S0),,P(SN1)),K1θHm1=K2θVm,Vm1=max(Hm1,P(S)),m=1,,M, \begin{aligned} V^M & = (P(S_0),\ldots,P(S_{N-1}))^\intercal, \\ K_1^\theta H^{m-1} & = K_2^\theta V^m, \\ V^{m-1} & = \max(H^{m-1}, P(S)), \quad m = 1,\ldots,M, \end{aligned}

converges to the solution of the American option problem as N,MN,M \rightarrow \infty.

Results

With the finite difference scheme developed above, we are in a position to test our method in terms of its rate of convergence in its discretisation parameters NN and MM as well as its hyperparameter thetatheta which determines the type of finite difference scheme we are using. In the following, we will consider the price of a call option on underlying with continuous dividends as presented above. We will use the parameters

r=0.05,y=0.1,σ=0.2,K=150,T=2. \begin{aligned} r & = 0.05, \\ y & = 0.1, \\ \sigma & = 0.2, \\ K & = 150, \\ T & = 2. \end{aligned}
European straddle option

In the case of European options, there is an analytic formula available for the price of call/put options under the Black-Scholes model. The formula, with the notation above, is given by

C(t,S)=er(Tt)(FN(d1)KN(d2)),P(t,S)=er(Tt)(KN(d2)FN(d1)),F:=Se(ry)(Tt),d1:=1σTt[log(SK)+(ry+12σ2)(Tt)]d2:=d1σTt, \begin{aligned} C(t, S) & = e^{-r(T-t)}\left(F N(d_1) - K N(d_2) \right), \\ P(t, S) & = e^{-r(T-t)}\left(K N(-d_2) - F N(-d_1) \right), \\ F & := S e^{(r-y)(T-t)}, \\ d_1 & := \frac{1}{\sigma\sqrt{T-t}}\left[\log\left(\frac{S}{K}\right) + \left(r-y +\tfrac{1}{2}\sigma^2\right)(T-t) \right] \\ d_2 & := d_1 - \sigma\sqrt{T-t}, \end{aligned}

where N()N(\cdot) is the standard cumulative normal distribution function. A straddle option consists of a combination of a put and a call option with the same strike and maturity. Below is a plot of the above at time t=0t = 0 for various values of SS along with the payoff function P(S)P(S).

As noted previously, the explicit finite difference scheme, with θ=0\theta = 0, is unstable when we use too few time steps per space step, so we will not consider the explicit scheme. By comparing the L2L_2 error in the results of the finite difference method developed above for the implicit scheme and the Crank-Nicolson scheme as we increase N=MN = M, we can deduce the rate of convergence for different finite difference schemes. These results can be seen below.

We observe linear plots in log-log space for both schemes, with a slope of -1 for the implicit scheme and a slope of 02 for the Crank-Nicolson scheme, corresponding to first and second order convergence respectively. Due to its favourable convergence characteristics, we will only consider the Crank-Nicolson scheme when moving on to the American option problem.

American straddle option

Using the same option parameters as above, we now turn to the American option problem. Below is a plot of the American straddle price versus the underlying, with the European option price included for reference.

Unlike the European option, the value of the American option never drops below the payoff value. This is to be expected in a no-arbitrage environment. Likewise, since the American option gives the holder the same benefits as the European option holder plus more, it is also to be expected that the American option price should be just as great as its European equivalent - this is also reflected in the graph. To see this difference in more detail, we consider the spread in the American option versus its European equivalent and plot this against the underlying.

The above plot confirms that the American option is always worth at least as much as its European equivalent. It also shows that near the money, their prices are broadly similar, but looking at the wings we see that their prices start to diverge at an increasing rate. This observation does not always hold. For example, in the case of a European call with no dividends, the in-the-money European and American option both have exactly the same value - the payoff value. The above observation on the straddle on a dividend paying underlying shows that a sensible investor holding the in-the-money straddle on the call side would exercise early, lock in their profits and earn dividends on the underlying that they now hold.

With an American option pricing scheme in place, we can run a similar error analysis to that of the European option. However, in this case there is no analytic value to compare our approximations to. To sidestep this issue, there are a few proxies we could consider. Here, we consider the L2L_2 error in successive approximations on a finer grid. Observing the slope of a log-log plot of these differences versus the number of steps used gives us a proxy for the order of convergence of the error of our approximation compared to the true value. To see this, set N=MN=M and assume that our error ϵ(N)=V(N)V\epsilon(N) = \left\vert\left\vert V(N) - V\right\vert\right\vert satisfies ϵ(N)O(Nα)\epsilon(N) \sim \mathcal{O}(N^{-\alpha}). Roughly, we find that ϵ(N)=CNα\epsilon(N) = C N^{-\alpha} for some constant CC. Furthermore, by the triangle inequality, the difference in successive error terms satisfies

ϵ(N1)ϵ(N2)(V(N1)V)(V(N2)V)=V(N1)V(N2)ϵ(N1)+ϵ(N2). \epsilon(N_1) - \epsilon(N_2) \le \left\vert\left\vert (V(N_1) - V) - (V(N_2) - V)\right\vert\right\vert = \left\vert\left\vert V(N_1) - V(N_2) \right\vert\right\vert \le \epsilon(N_1) + \epsilon(N_2).

The left-hand-side of this inequality can be simplified to C(N1α+N2α)C(N_1^{-\alpha} + N_2^{-\alpha}). If we assume that N2=aN1N_2 = a N^1 for some base a>1a > 1 then we find that this simplifies further to C(1+aα)N1αC0N1αC(1 + a^{-\alpha})N_1^{-\alpha} \equiv C_0 N_1^{-\alpha} for some constant C0>0C_0 > 0. Similarly, we find that the right-hand-side of the inequality simplifies to C1N1αC_1 N_1^{-\alpha} for some C1>0C_1 > 0. If we observe that the middle term, the difference that we calculate, satisfies C2N1βC_2 N_1^{-\beta} for some C2>0C_2 > 0, then we are forced to conclude that α=β\alpha = \beta. Below is a log-log plot of the L2L_2 norm of successive differences in our approximation.

In contrast to the European case, it appears that the Crank-Nicolson scheme has dropped to first-order convergence. This is a downside of the naive American scheme developed above. More complex iterative schemes for solving the problem have been proposed and explored, such as the use of successive over-relaxation schemes, or schemes based on Jacobi's method. While these schemes can offer improved performance, for 1D problems such as the option pricing problem above, our scheme performs sufficiently fast for most purposes and has the added benefit of easy interpretability.

Extensions

While finite-difference methods perform excellently for low-dimensional problems and are perfectly suited for backward moving American option problems, if we introduce extra-dimensionality or path-dependence, finite-difference methods quickly run into issues. The other major paradigm for option pricing are Monte Carlo methods. Monte Carlo methods run stochastic simulations forwards through time, then average over all these simulations to calculate the option's price as an expectation. In contrast to finite-difference methods, Monte Carlo methods are inherently forward moving, so handle path-dependence with ease. They also scale excellently to high-dimensional problems. While the forward moving nature of Monte Carlo methods is a plus for path-dependence, it is a major downside when it comes to pricing options with embedded optionality like American or Bermudan options. In these cases, a naive application of iterated expectations leads to an exponential explosion in the computational complexity of the problem. A breakthrough in the pricing of American options by Monte Carlo methods came in 2001, in the form of the Longstaff-Schwartz method. In a following post, we will explore the pricing of American Options in the Monte Carlo framework by utilising the work of Longstaff and Schwartz.

© 2022 Will Douglas, All Rights Reserved.