Description

Superlinear convergence is a rate of convergence for iterative algorithms in which the error shrinks faster than any fixed linear rate as the number of iterations increases. In common usage the term describes convergence behavior between linear and quadratic (strictly speaking, quadratic convergence is also superlinear), making it an important concept in numerical analysis and optimization.

Formally, a sequence {xₖ} converging to a solution x* exhibits superlinear convergence if:

limₖ→∞ (|xₖ₊₁ - x*| / |xₖ - x*|) = 0

This means that, asymptotically, the error at step k+1 becomes negligible compared to the error at step k. It improves on linear convergence, where the error shrinks by a fixed proportion each step, without requiring the error-squaring (digit-doubling) behavior of quadratic convergence.

How It Works

To understand convergence behavior, we observe how the error sequence eₖ = |xₖ - x*| evolves as k increases.

  • In linear convergence, eₖ₊₁ ≈ C·eₖ for 0 < C < 1.
  • In quadratic convergence, eₖ₊₁ ≈ C·eₖ², which is much faster.
  • In superlinear convergence, there’s no fixed rate C, but:
eₖ₊₁ / eₖ → 0

In practice, this means the factor by which each step shrinks the error keeps improving, though not as explosively as under quadratic convergence.
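
To make the contrast concrete, here is a minimal sketch in plain Python; the three error sequences are synthetic, chosen only to illustrate each regime, and the loop prints the ratio eₖ₊₁ / eₖ in each case:

e_lin = e_sup = e_quad = 0.5
for k in range(1, 8):
    new_lin = 0.5 * e_lin        # linear: fixed contraction factor C = 0.5
    new_sup = e_sup / (k + 1)    # superlinear: factor 1/(k+1) -> 0
    new_quad = e_quad ** 2       # quadratic: error squared each step
    print(f"k={k}: linear {new_lin / e_lin:.3f}, "
          f"superlinear {new_sup / e_sup:.3f}, quadratic {new_quad / e_quad:.2e}")
    e_lin, e_sup, e_quad = new_lin, new_sup, new_quad

The linear ratio stays pinned at 0.5, while the superlinear and quadratic ratios fall toward zero, which is exactly the defining signature described above.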

Key Components

1. Error (eₖ)

Defined as the distance between the current iterate and the true solution:

eₖ = |xₖ - x*|

2. Convergence Order (p)

The order of convergence determines the classification:

  • Linear: p = 1
  • Superlinear: 1 < p < 2 in the typical case (more generally, any sequence with eₖ₊₁ / eₖ → 0 qualifies)
  • Quadratic: p = 2

3. Asymptotic Ratio

For superlinear convergence:

limₖ→∞ eₖ₊₁ / eₖ = 0

This ratio distinguishes it from linear convergence, where the ratio would stabilize at a constant C < 1.

Mathematical Definition

A sequence {xₖ} converges to x* with order p and rate constant C if:

limₖ→∞ |xₖ₊₁ - x*| / |xₖ - x*|ᵖ = C,   where C > 0 and p > 1

If this limit exists and 1 < p < 2, the convergence is superlinear but sub-quadratic.
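
When the true solution is known, the order p can be estimated numerically from three consecutive errors via p ≈ ln(eₖ₊₁ / eₖ) / ln(eₖ / eₖ₋₁). A minimal sketch (the error sequence here is synthetic, built so the true order is exactly 1.5):

import math

# Synthetic errors obeying e_{k+1} = e_k ** 1.5, i.e. order p = 1.5.
errors = [0.5]
for _ in range(6):
    errors.append(errors[-1] ** 1.5)

# Estimate p from each consecutive triple of errors.
for k in range(1, len(errors) - 1):
    p = math.log(errors[k + 1] / errors[k]) / math.log(errors[k] / errors[k - 1])
    print(f"k={k}: estimated order p ≈ {p:.3f}")

Every estimate comes out at 1.500, confirming that the formula recovers the order from observed errors alone.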

Use Cases

🧠 Optimization Algorithms

  • Quasi-Newton methods (e.g., BFGS, DFP) converge superlinearly under mild conditions.
  • Typically faster than gradient descent, and cheaper per iteration than Newton’s Method (which needs explicit Hessians).

🔁 Root-Finding

  • Secant Method achieves superlinear convergence with order ≈ 1.618 (the golden ratio).
  • Useful when derivatives are not available.

🧮 Iterative Linear Solvers

  • Krylov subspace methods like GMRES may exhibit superlinear convergence when preconditioned effectively.
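
As a concrete illustration, here is a minimal sketch assuming SciPy is available (scipy.sparse.linalg.gmres is SciPy’s GMRES solver; the tridiagonal test matrix is chosen only for this example):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

n = 100
# 1-D Laplacian: a standard sparse test matrix.
A = diags([-1, 2, -1], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

x, info = gmres(A, b, restart=n)  # full (non-restarted) GMRES
print("converged" if info == 0 else f"stopped with info={info}")
print("residual norm:", np.linalg.norm(b - A @ x))

Whether the convergence looks superlinear depends on the eigenvalue distribution of the (preconditioned) matrix; clustering the spectrum with a good preconditioner is what produces the superlinear phase.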

📊 Nonlinear Equation Systems

  • Fixed-point iteration schemes with smart acceleration techniques may reach superlinear rates.
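
One classic acceleration technique is Aitken’s Δ² process, which extrapolates from three consecutive fixed-point iterates; applied repeatedly (Steffensen-style), it lifts a linearly convergent iteration to a superlinear one for suitable maps. A minimal sketch (the map cos(x) and its fixed point are purely illustrative):

import math

def aitken_accelerate(g, x0, steps=5):
    """Steffensen-style use of Aitken's Δ² to accelerate x = g(x)."""
    x = x0
    for _ in range(steps):
        x1 = g(x)
        x2 = g(x1)
        denom = x2 - 2 * x1 + x
        if denom == 0:  # differences vanished: already converged
            return x
        x = x - (x1 - x) ** 2 / denom  # Aitken's Δ² extrapolation
    return x

print(aitken_accelerate(math.cos, 1.0))  # fixed point of cos(x) ≈ 0.7390851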

Examples

1. Secant Method

A root-finding algorithm that approximates the derivative:

xₖ₊₁ = xₖ - f(xₖ) · (xₖ - xₖ₋₁) / (f(xₖ) - f(xₖ₋₁))

It converges superlinearly with order ≈ 1.618 (φ, the golden ratio), provided the function is smooth and the initial guesses are close enough.

2. BFGS Algorithm

An iterative quasi-Newton optimization method that updates an approximate Hessian matrix using gradient evaluations:

  • Does not require second derivatives
  • Converges superlinearly for convex, smooth functions
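
A minimal sketch assuming SciPy is available (scipy.optimize.minimize with method="BFGS"; the Rosenbrock test function rosen and its gradient rosen_der ship with SciPy and serve here purely as a standard example):

import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# Minimize the Rosenbrock function with BFGS from a standard starting point.
result = minimize(rosen, x0=np.array([-1.2, 1.0]), jac=rosen_der, method="BFGS")
print(result.x)    # ≈ [1. 1.], the known minimizer
print(result.nit)  # number of iterations taken

With the analytic gradient supplied via jac, BFGS builds its Hessian approximation from gradient differences alone; near the solution the steps show the superlinear speed-up described above.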

Benefits and Limitations

✅ Benefits

  • Faster than Linear: Noticeable acceleration in convergence, especially after a few iterations.
  • Less Computational Cost Than Newton’s Method: Superlinear methods like BFGS avoid computing second derivatives.
  • Applicable in Derivative-Free Scenarios: E.g., the Secant method doesn’t require explicit derivatives.

❌ Limitations

  • Not as Fast as Quadratic: In problems where extreme precision is needed, quadratic methods may be preferred.
  • Requires Good Initialization: Like most iterative methods, poor starting points can ruin convergence.
  • Problem-Dependent Behavior: Superlinear convergence is not guaranteed unless certain smoothness and regularity conditions are met.

Comparison with Other Convergence Types

Convergence Type | Condition            | Speed          | Examples
Linear           | eₖ₊₁ ≈ C · eₖ        | Moderate       | Gradient Descent
Superlinear      | lim (eₖ₊₁ / eₖ) = 0  | Fast           | Secant, BFGS
Quadratic        | eₖ₊₁ ≈ C · eₖ²       | Very fast      | Newton’s Method
Cubic            | eₖ₊₁ ≈ C · eₖ³       | Extremely fast | Halley’s Method

Visualization Insight

Imagine you’re approaching a target on a number line:

  • With linear convergence, you reduce the gap by 50% each time.
  • With superlinear, the first few steps might be modest, but then you make sharper reductions.
  • With quadratic, the error practically collapses after just a few iterations.

Superlinear convergence is like a turbo mode that gradually kicks in, particularly when you’re near the solution.

Real-World Analogy

Think of superlinear convergence like tuning into a radio signal. At first, there’s noise, and progress is slow. But as you get closer to the right frequency, the signal clears up faster and faster—even though you’re still not quite locked in. The improvement accelerates, just not as explosively as if you jumped straight to the correct station (quadratic).

Code Illustration: Secant Method (Superlinear Example)

def secant_method(f, x0, x1, tol=1e-10, max_iter=100):
    """Find a root of f via the secant method (superlinear, order ≈ 1.618)."""
    for _ in range(max_iter):
        fx0, fx1 = f(x0), f(x1)
        if fx1 - fx0 == 0:
            raise ZeroDivisionError("Zero denominator in secant update.")
        # Newton-style step with the derivative replaced by a finite difference.
        x2 = x1 - fx1 * (x1 - x0) / (fx1 - fx0)
        if abs(x2 - x1) < tol:  # successive iterates agree to tolerance
            return x2
        x0, x1 = x1, x2
    return x1

Try it with:

f = lambda x: x**2 - 2
root = secant_method(f, 1, 2)
print(root)  # ≈ 1.41421356

You’ll see convergence that accelerates: superlinear, though not quite as explosive as Newton’s Method. The snippet below makes the rate visible by tracking the error at each step.
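
To verify the rate empirically, this sketch reruns the secant update while measuring the error against the known root √2 (the instrumentation is illustrative and separate from the function above):

import math

f = lambda x: x**2 - 2
root = math.sqrt(2)  # known solution, used only to measure the error

x0, x1 = 1.0, 2.0
e_prev = abs(x1 - root)
for k in range(1, 8):
    if f(x1) == f(x0):
        break  # denominator vanished: iterates have effectively converged
    x0, x1 = x1, x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
    e = abs(x1 - root)
    if e == 0:
        break
    print(f"k={k}: error {e:.2e}, ratio e_k+1/e_k = {e / e_prev:.2e}")
    e_prev = e

The printed ratios fall toward zero rather than settling at a constant, which is the superlinear signature; a linearly convergent method such as bisection would instead show a ratio hovering around 0.5.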

Key Formulas Summary

  • Error Definition:
    eₖ = |xₖ - x*|
  • Superlinear Convergence Condition:
    lim (eₖ₊₁ / eₖ) = 0
  • Order of Convergence (General):
    If
    lim (eₖ₊₁ / eₖᵖ) = C with C > 0 and p > 1,
    then the sequence has order p
  • Secant Method Update:
    xₖ₊₁ = xₖ - f(xₖ) * (xₖ - xₖ₋₁) / (f(xₖ) - f(xₖ₋₁))

Related Keywords

  • BFGS Algorithm
  • Convergence Rate
  • Error Bound
  • Newton Method
  • Order of Convergence
  • Quasi-Newton Method
  • Root-Finding
  • Secant Method
  • Superconvergence
  • Tolerance Level