Computational Performance in Fintech Applications

Many types of applications do not require high-performance computation because they simply perform few or infrequent calculations. This is true of many financial applications as well. There are, however, financial applications, such as banking or insurance systems, that at times must perform a very large number of computations.

In this article, I will show an example of an application that calculates simulations of profits from investment funds, where the simulation time was reduced 100-fold (in words: a hundred times).

The observations made are universal and can be applied in other cases as well.

Calculation Accuracy

As we know, calculation accuracy depends on the data type used. In Java, for example, we have:

  • float and double, which have limited precision
  • int and long, which store integers losslessly
  • BigDecimal, which stores decimal numbers with any specified precision

In most projects I have seen, financial calculations were performed using the BigDecimal type due to the ability to perform lossless calculations.

Is this correct? Let’s examine the properties of individual types.

Floating-point calculations

The IEEE 754 standard, first published in 1985, defines two basic binary floating-point formats - single and double precision.

The float type is stored on 32 bits. The double type on 64 bits.
The figure below shows the meaning of all 32 bits of a float number.

source: [https://en.wikipedia.org/wiki/Floating-point_arithmetic](https://en.wikipedia.org/wiki/Floating-point_arithmetic)

where:

  • S - 1 sign bit
  • E - exponent, 8 bits
  • M - mantissa, 23 bits

Each part (S, E, M) can be interpreted as an integer. Then the decimal number encoded this way has the value given by the formula:

value  =  (-1)^S  *  2^(E-127) *  (1 + M/2^23)

For the example from the figure (S=0, E=124, M=2097152) we have:

value = (-1)^0 * 2^(124-127) * (1 + 2097152/8388608)
      = 1 * 2^(-3) * (1 + 0.25)
      = 0.15625

You can check from Java code if this is true:

int i = Float.floatToIntBits(0.15625f);
System.out.println(String.format("%32s", Integer.toBinaryString(i)).replace(' ', '0'));
// 00111110001000000000000000000000
// S=0  E=01111100  M=01000000000000000000000
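Going one step further, the S, E and M fields can be pulled apart with bit operations and plugged back into the formula (a quick sanity check, not production code):

```java
public class FloatDecode {
    public static void main(String[] args) {
        int bits = Float.floatToIntBits(0.15625f);

        int S = (bits >>> 31) & 0x1;    // 1 sign bit
        int E = (bits >>> 23) & 0xFF;   // 8 exponent bits
        int M = bits & 0x7FFFFF;        // 23 mantissa bits

        // value = (-1)^S * 2^(E-127) * (1 + M/2^23)
        double value = Math.pow(-1, S) * Math.pow(2, E - 127) * (1 + M / 8388608.0);

        System.out.println(S + " " + E + " " + M);  // 0 124 2097152
        System.out.println(value);                  // 0.15625
    }
}
```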

In the case of the double type, 11 bits are allocated for the exponent, and 52 bits for the mantissa.

Floating-point accuracy

The float type (32-bit) can represent numbers from the range of -3.4e+38 to 3.4e+38 with some approximation. Meanwhile, on 32 bits, only a little over 4 billion different values can be stored. This means that many mathematically different numbers have the same bit representation in the float type.

Let’s perform a simple test:

float x = 0;  
while (x + 1 > x) {  
    x += 1;  
}  
  
System.out.printf("x = %.3f %n", x);   // x = 16777216.000

x = x + 3;  
System.out.printf("x = %.3f %n", x);   // x = 16777220.000

In the case of the float type, we very quickly reach a value after which integers can no longer be represented without error. You can see that 16_777_216 + 3 resulted in 16_777_220. The float type, due to its short mantissa (only 23 bits), has significant inaccuracies that manifest in many calculations.

For the double type, the number after which the type stops accurately representing integers is already 9_007_199_254_740_992. The precision of double is sufficient in many applications, though of course not in all.
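This limit, which is exactly 2^53, is easy to verify directly:

```java
public class DoubleLimit {
    public static void main(String[] args) {
        double limit = 9_007_199_254_740_992d;   // 2^53

        // 2^53 + 1 has no exact double representation and rounds back to 2^53
        System.out.println(limit + 1 == limit);        // true

        // below 2^53, every integer is still represented exactly
        System.out.println((limit - 1) + 1 == limit);  // true
    }
}
```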

Exact Calculations

As an alternative, we have types that allow performing calculations with arbitrarily specified precision or even with full precision if needed. Developers of many applications are tempted by this possibility.

BigDecimal

The Java standard library provides the BigDecimal class, which allows storing any decimal number without any error.

In brief, a BigDecimal object stores, among other things, three fields:

final BigInteger intVal;
final int scale;
final long intCompact;

The intVal field stores all the digits of the number. The scale field indicates where the decimal separator is, i.e., how many trailing digits form the fractional part. The intCompact field holds the unscaled value as a long when it fits in the long range; intVal is then null.
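These fields are private, but the public API exposes the same information, so we can peek at a value's anatomy without reflection:

```java
import java.math.BigDecimal;

public class BigDecimalAnatomy {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("701.90334405");

        System.out.println(a.unscaledValue()); // 70190334405 (all digits, as BigInteger)
        System.out.println(a.scale());         // 8 (last 8 digits form the fraction)
        System.out.println(a.precision());     // 11 (total number of significant digits)
    }
}
```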

For example, let’s analyze a few operations performed on a BigDecimal object:


var a = new BigDecimal("701.90334405");

// a.toString   = 701.90334405
// a.intVal     = null
// a.scale      = 8
// a.intCompact = 70190334405

var b = a.multiply(new BigDecimal("905.11003377"));

// b.toString   = 635299.7594363714285685
// b.intVal     = 6352997594363714285685 (as BigInteger object)
// b.scale      = 16
// b.intCompact = INFLATED


var c = b.multiply(new BigDecimal("29.33001212"));

// c            = 18633349.644101858368735819250220
// c.intVal     = 18633349644101858368735819250220 (as BigInteger object)
// c.scale      = 24
// c.intCompact = INFLATED


var d = c.multiply(new BigDecimal("3.12009999"));

// d            = 58137914.03822871185527404595525322949780
// d.intVal     = 5813791403822871185527404595525322949780 (as BigInteger object)
// d.scale      = 32
// d.intCompact = INFLATED


var e = d.multiply(d);

// e            = 3380017048716471.1258068402704706905982726021476447557679021816679765560402048400
// e.intVal     = 33800170487164711258068402704706905982726021476447557679021816679765560402048400 (BigInteger)
// e.scale      = 64
// e.intCompact = INFLATED

Object a stores its value in the intCompact field because the unscaled value (70190334405) has only 11 digits, which fits in a long variable.

Unfortunately, the value of object b consists of 22 digits, so BigDecimal allocates a new BigInteger object in which it holds all the digits. After a few multiplications, object e stores the mathematically exact result as 80 digits, of which 64 form the fractional part.

When BigDecimal holds the value in a BigInteger object, all operations become expensive (see for example the BigInteger#multiply method). Moreover, the cost of these operations increases as the number of digits increases.

Unfortunately, if we want to perform lossless calculations (without rounding errors), we must reckon with the fact that the number of digits stored by BigDecimal grows very quickly. Fortunately, we can impose precision, agreeing to some rounding errors, which will speed up operations on BigDecimal objects.

Let’s compare two approaches: one without limiting precision, the other with maximum precision set to 12 significant digits.

Without limiting precision:

var a = BigDecimal.ONE;  
for (int i = 1; i <= 7; i++) {  
    a = a.multiply(new BigDecimal("1.12345678"));  
    System.out.printf("a%d = %s %n", i, a);  
}

// output
// a1 = 1.12345678 
// a2 = 1.2621551365279684 
// a3 = 1.417976745544171758605752 
// a4 = 1.59303558866393455169015543139856 
// a5 = 1.7897066328657884135725655786585371142368 
// a6 = 2.010658050904040823273542821338556825870967485504 
// a7 = 2.25888741954972979230344347725313034144001782674902051712 

With limiting precision to 12 significant digits:

final MathContext PRECISION = new MathContext(12);
var a = BigDecimal.ONE;
for (int i = 1; i <= 7; i++) {
    a = a.multiply(new BigDecimal("1.12345678"), PRECISION);
    System.out.printf("a%d = %s %n", i, a);
}

// output
// a1 = 1.12345678 
// a2 = 1.26215513653 
// a3 = 1.41797674555 
// a4 = 1.59303558867 
// a5 = 1.78970663287 
// a6 = 2.01065805091 
// a7 = 2.25888741956 

In the vast majority of cases, the method with limited precision is much more efficient - remember that the performance of calculations in the BigDecimal class worsens with the number of digits.

Decimal4j

There are many libraries that provide alternatives to the BigDecimal type. One of the recognized libraries is Decimal4j (https://github.com/tools4j/decimal4j), which is a Java library for fast fixed-point arithmetic based on longs with support for up to 18 decimal places.

In other words, decimal numbers handled by Decimal4j are stored in long variables, and the object's class determines how many of the trailing digits form the fractional part.

Decimal4j objects are immutable, so each operation creates a new instance containing the result.
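The underlying idea, stripped of the library's specifics, can be sketched with plain longs. The sketch below uses scale 3 (three decimal places); the class and method names are mine, and this illustrates the principle rather than Decimal4j's actual implementation:

```java
public class FixedPoint3 {
    // scale 3: a value x is stored as round(x * 1000)
    static final long ONE = 1_000;

    static long fromDouble(double v) { return Math.round(v * ONE); }
    static double toDouble(long fx)  { return fx / (double) ONE; }

    // multiply two scale-3 values; the raw product has scale 6,
    // so divide by ONE with HALF_UP rounding (for non-negative values)
    static long multiply(long a, long b) { return (a * b + ONE / 2) / ONE; }

    public static void main(String[] args) {
        long a = fromDouble(1.123456);                 // stored as 1123
        long b = multiply(a, fromDouble(1.10109999));  // 1123 * 1101 / 1000 -> 1236

        System.out.println(toDouble(a));  // 1.123
        System.out.println(toDouble(b));  // 1.236
    }
}
```

Note that this matches the Decimal3f example below: the operands are rounded to the target scale on creation, and each operation rounds its result back to that scale.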

For example, if we want all calculations to be performed with precision to 3 decimal places, we should use objects of the Decimal3f class; if we need precision to 10 places, then the Decimal10f class is necessary. For example:

Decimal3f a = Decimal3f.valueOf(1.123456);  
Decimal3f b = a.multiply(Decimal3f.valueOf(1.10109999));

// a = 1.123
// b = 1.236

Benchmark

Below are the results of a simple benchmark comparing the speed of calculations using:

  • double
  • BigDecimal without limiting precision
  • BigDecimal with limiting precision to 16 digits
  • Decimal4j with 16-digit precision

The calculations are simple and look like this:

public double calc_double(long seed) {
	double a = 0.00021 * seed;
	double b = a * 12.5;
	double c = b * a;
	double d = a + b;
	double e = c / d;
	double f = e * a;
	double g = f + (a * 10);
	double h = g + (b * 10);
	return h;
}

public double calc_bigdecimal(long seed) {
	BigDecimal a = BigDecimal.valueOf(0.00021 * seed);
	BigDecimal b = a.multiply(BigDecimal.valueOf(12.5));
	BigDecimal c = b.multiply(a);
	BigDecimal d = a.add(b);
	BigDecimal e = c.divide(d, PRECISION);
	BigDecimal f = e.multiply(a);
	BigDecimal g = f.add(a.multiply(BigDecimal.TEN));
	BigDecimal h = g.add(b.multiply(BigDecimal.TEN));
	return h.doubleValue();
}

static final MathContext PRECISION = new MathContext(16);
public double calc_bigdecimal_precision(long seed) {
	BigDecimal a = BigDecimal.valueOf(0.00021 * seed).round(PRECISION);
	BigDecimal b = a.multiply(BigDecimal.valueOf(12.5), PRECISION);
	BigDecimal c = b.multiply(a, PRECISION);
	BigDecimal d = a.add(b, PRECISION);
	BigDecimal e = c.divide(d, PRECISION);
	BigDecimal f = e.multiply(a, PRECISION);
	BigDecimal g = f.add(a.multiply(BigDecimal.TEN, PRECISION), PRECISION);
	BigDecimal h = g.add(b.multiply(BigDecimal.TEN, PRECISION), PRECISION);
	return h.doubleValue();
}

public double calc_decimal4j(long seed) {
	Decimal16f a = Decimal16f.valueOf(0.00021).multiply(seed);
	Decimal16f b = a.multiply(Decimal16f.valueOf(12.5));
	Decimal16f c = b.multiply(a);
	Decimal16f d = a.add(b);
	Decimal16f e = c.divide(d, RoundingMode.HALF_UP);
	Decimal16f f = e.multiply(a);
	Decimal16f g = f.add(a.multiply(Decimal16f.TEN));
	Decimal16f h = g.add(b.multiply(Decimal16f.TEN));
	return h.doubleValue();
}

static ScaleMetrics scale15 = Scales.getScaleMetrics(15);
static DecimalArithmetic arith = scale15.getDefaultArithmetic();
public double calc_decimal4j_noalloc(long seed) {
	long a = arith.multiply(arith.fromDouble(0.00021), arith.fromLong(seed));
	long b = arith.multiply(a, arith.fromDouble(12.5));
	long c = arith.multiply(b, a);
	long d = arith.add(a, b);
	long e = arith.divide(c, d);
	long f = arith.multiply(e, a);
	long g = arith.add(f, arith.multiply(a, arith.fromLong(10)));
	long h = arith.add(g, arith.multiply(b, arith.fromLong(10)));
	return arith.toDouble(h);
}

Each method takes a seed variable and performs calculations on it.
Importantly, for each seed value from 1 to 1000, all of these methods return the same result to within 1e-10.
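This agreement can be reproduced with a quick self-contained check, restricted here to the double and precision-limited BigDecimal variants (the class and method names are mine):

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class AgreementCheck {
    static final MathContext PRECISION = new MathContext(16);

    static double calcDouble(long seed) {
        double a = 0.00021 * seed;
        double b = a * 12.5;
        double c = b * a;
        double d = a + b;
        double e = c / d;
        double f = e * a;
        double g = f + (a * 10);
        double h = g + (b * 10);
        return h;
    }

    static double calcBigDecimal(long seed) {
        BigDecimal a = BigDecimal.valueOf(0.00021 * seed).round(PRECISION);
        BigDecimal b = a.multiply(BigDecimal.valueOf(12.5), PRECISION);
        BigDecimal c = b.multiply(a, PRECISION);
        BigDecimal d = a.add(b, PRECISION);
        BigDecimal e = c.divide(d, PRECISION);
        BigDecimal f = e.multiply(a, PRECISION);
        BigDecimal g = f.add(a.multiply(BigDecimal.TEN, PRECISION), PRECISION);
        BigDecimal h = g.add(b.multiply(BigDecimal.TEN, PRECISION), PRECISION);
        return h.doubleValue();
    }

    public static void main(String[] args) {
        double maxDiff = 0;
        for (long seed = 1; seed <= 1000; seed++) {
            maxDiff = Math.max(maxDiff,
                    Math.abs(calcDouble(seed) - calcBigDecimal(seed)));
        }
        System.out.println(maxDiff < 1e-10);  // true
    }
}
```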

The benchmark itself, performed using the JMH framework, looks like this:

static long seed = 0;   
static void updateSeed() {  
    seed = 1 + (seed + 1) % 1_000;  
}  
  
@Benchmark  
public void calc_bigdecimal(Blackhole hole) {  
    updateSeed();  
    hole.consume( calc_bigdecimal(seed) );  
}

// ...

Benchmark Results

The benchmark results printed by JMH look as follows:

Benchmark                                Mode  Cnt      Score   Units

CalcBenchmark.calc_bigdecimal            avgt   20    838.736   ns/op
CalcBenchmark.calc_bigdecimal_precision  avgt   20    408.332   ns/op
CalcBenchmark.calc_decimal4j             avgt   20     62.067   ns/op
CalcBenchmark.calc_decimal4j_noalloc     avgt   20     70.498   ns/op
CalcBenchmark.calc_double                avgt   20      8.342   ns/op


The first conclusion is that calculations using the double type (64-bit IEEE 754) are about 100 times faster than calculations using BigDecimal without limiting precision. Additionally, calculations with BigDecimal cause a high rate of object allocation on the heap, since each intermediate result is a new object. Calculations with the double type, on the other hand, are GC-friendly and can even be parallelized at the CPU level (SIMD instructions).

Can We Sacrifice Accuracy?

I have seen many fintech projects where all calculations were performed using BigDecimal without limiting precision. Only at the very end was the result rounded to a specific number of decimal places. Is this necessary?

I have written a dozen or so fintech systems. For each of them, I received a calculation specification (a reference implementation) and a set of test data along with the expected results:

  • in 2 cases, the reference implementation was written in R,
  • in all other cases, it was created in Excel.

I looked at these reference implementations, and one thing struck me:

Both R and Excel work with double types (64-bit IEEE 754). Therefore, if calculations in these systems are to be consistent with the tests and reference implementation, they must also be performed on double variables.

Real-life Example

Below is a fragment of a function (Groovy language) that performs intensive calculations on multiple time windows on investment fund unit prices:


// history of investment fund unit prices
def p = get(...)

...
def K0 = p.size() - 1
def r = new BigDecimal[K0 + 1]
def sig = new BigDecimal[K0 - w]

// calculating r[i] for each day of selected window (from fund unit prices)
r[i] = ...

// calculating sigma array for each day
for(int ti = 1; ti <= K0 - w; ti++) {
    
    // using moving window from ti to ti+w
    def rsum = 0
    for(int k = ti; k <= ti + w; k++) {
        rsum += r[k]
    }
    def M1 = rsum / (w+1)

    def delta = 0
    for(int k = ti; k <= ti + w; k++) {
        delta = delta + (r[k] - M1)**2
    }
    
    sig[ti - 1] = Math.sqrt(delta / (w+1))
}

def K1 = ...
def K2 = 0, K3 = 0, K4 = 0
for (int i = 1; i <= K0; i++) {
    def x = r[i] - K1
    K2 += x**2
    K3 += x**3
    K4 += x**4
}
...

Notice that the calculations are performed on BigDecimal objects because the r and sig arrays are defined as follows:

def r = new BigDecimal[K0 + 1]
def sig = new BigDecimal[K0 - w]

Believe it or not, it was enough to change these 2 lines to:

def r = new double[K0 + 1]
def sig = new double[K0 - w]

for the entire function to run about 100 times faster!
Due to Groovy's dynamic typing, the calculations were from then on performed on the double type, because the r[i] values were now doubles.

This is not a made-up story, it really happened.
