Case Study 14.2: Floating-Point Pitfalls in Financial Calculations

Open Assembly Language Project

Why Finance and Floating-Point Do Not Mix

The IEEE 754 binary formats store binary fractions. Most decimal fractions (0.1, 0.01, 0.35, etc.) are not exactly representable in binary; only values whose denominator is a power of two, such as 0.25 or 0.5, are exact. This is not a bug; it is a consequence of the base-2 representation. But for financial calculations where exactness is a legal requirement, it produces errors that cause real problems.

Demonstration: Accumulation Error

; Add 1 cent (0.01) one million times to an account
; Expected result: $10,000.00 = 10000.0
; What we get: approximately 10000.0000002 (an error on the order of 1e-7)

section .data
    one_cent:    dq 0.01           ; 0.01 as double
    iterations:  dq 1000000        ; one million
    result_fp:   dq 0.0            ; accumulated total (floating-point)
    result_int:  dq 0              ; accumulated total (integer, in cents)

section .text
global accumulation_demo

accumulation_demo:
    ; Floating-point accumulation:
    xorpd  xmm0, xmm0             ; xmm0 = 0.0 (accumulated sum)
    movsd  xmm1, [rel one_cent]   ; xmm1 = 0.01
    mov    rcx, [rel iterations]  ; loop counter

.fp_loop:
    addsd  xmm0, xmm1             ; sum += 0.01
    dec    rcx
    jnz    .fp_loop

    movsd  [rel result_fp], xmm0  ; store floating-point result

    ; Integer accumulation (correct):
    mov    rax, [rel iterations]
    mov    [rel result_int], rax  ; 1,000,000 cents = $10,000.00 exactly
    ret

The difference: result_fp ≈ 10000.0000002 (off by roughly 1e-7 dollars) vs. result_int = 1000000 (exactly $10,000.00 in cents).
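
The same experiment can be cross-checked from C. The following is a minimal sketch rather than part of the original listing: the function names `accumulate_fp` and `accumulate_cents` are hypothetical, and it assumes the compiler evaluates `double` arithmetic in IEEE 754 double precision, as SSE2 code on x86-64 does.

```c
#include <stdint.h>

/* Sum 0.01 n times in double precision, mirroring the addsd loop. */
double accumulate_fp(int64_t n)
{
    double sum = 0.0;
    for (int64_t i = 0; i < n; i++)
        sum += 0.01;            /* each addition rounds to 53 bits */
    return sum;
}

/* Add one cent n times using integer cents; every step is exact. */
int64_t accumulate_cents(int64_t n)
{
    int64_t cents = 0;
    for (int64_t i = 0; i < n; i++)
        cents += 1;
    return cents;
}
```

Calling `accumulate_fp(1000000)` should return a value close to, but not equal to, 10000.0, while `accumulate_cents(1000000)` returns exactly 1000000.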

Why 0.01 is Not Exactly Representable

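In binary: 0.01 = 0.0000001010001111010111000010100011... (the 20-bit block 00001010001111010111 repeats forever)

The IEEE 754 double representation rounds this to: 0.010000000000000000208166817117216851...

The stored constant is therefore too large by about 2.1 × 10^-19, which by itself contributes only about 2.1 × 10^-13 after one million additions. The larger error comes from rounding every running sum to 53 bits: near 10,000, adjacent doubles are 2^-39 ≈ 1.8 × 10^-12 apart, so each addition can be off by up to about 9 × 10^-13. These per-step errors are correlated and do not average out, which is why the observed total error is on the order of 10^-7.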
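
One way to confirm that the stored constant is not exactly 0.01 is to look at its raw bit pattern. This is a small sketch under the assumption that `double` is the IEEE 754 binary64 format; the helper name `double_bits` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 64-bit pattern of a double (sign, exponent, fraction). */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined way to reinterpret the bytes */
    return bits;
}
```

For 0.01 the pattern is 0x3F847AE147AE147B: the fraction field ends in ...47B because the repeating hex digits 47AE14... had to be rounded up at the 52nd fraction bit. For 0.25 the pattern is 0x3FD0000000000000, with an all-zero fraction field, because 0.25 is a power of two and needs no rounding.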

Catastrophic Cancellation

; Computing (a + b) - a when b is much smaller than a
; Should return b exactly; here b is lost in the addition and the result is 0.0

section .data
    val_a:  dq 1.0e16        ; large value: a = 10^16 (above 2^53)
    val_b:  dq 1.0           ; small value: b = 1.0
    ; Expected: (a + b) - a = b = 1.0
    ; Actual: 0.0 (b is completely lost!)

section .text
cancellation_demo:
    movsd xmm0, [rel val_a]    ; xmm0 = 1e16
    movsd xmm1, [rel val_b]    ; xmm1 = 1.0
    addsd xmm0, xmm1           ; exact sum would be 10000000000000001.0
    ; A double holds every integer exactly only up to 2^53 (about 9.007e15).
    ; Between 2^53 and 2^54 representable values are 2 apart, so 1e16 + 1.0
    ; is a tie; round-to-nearest-even (the default mode) picks 1e16.

    movsd xmm2, [rel val_a]    ; xmm2 = 1e16 (original)
    subsd xmm0, xmm2           ; xmm0 = (1e16 + 1.0) - 1e16
    ; Expected: 1.0
    ; Actual: 0.0 (with b = 3.0 the result would be 4.0)
    ret

An IEEE 754 double has a 53-bit significand, roughly 15.95 decimal digits, so every integer up to 2^53 = 9,007,199,254,740,992 is exact. 10^16 lies above that limit, where adjacent doubles are 2 apart: 10^16 + 1 cannot be stored and rounds back to 10^16. The 1.0 is absorbed by the addition, and the subtraction then returns 0.0 instead of 1.0. The subtraction itself is exact; it only exposes the digits that the addition already discarded.
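
The boundary at 2^53 can be probed from C. This is a minimal sketch with a hypothetical function name, `add_then_subtract`. The `volatile` qualifier is an assumption-driven precaution: it forces the intermediate sum to be stored as a 64-bit double even on targets that keep wider intermediates.

```c
/* Compute (a + b) - a, rounding the intermediate sum to double precision. */
double add_then_subtract(double a, double b)
{
    volatile double sum = a + b;   /* rounded to 53 bits here */
    return sum - a;                /* exact; reveals what the addition kept */
}
```

With a = 1e15 the sum is below 2^53 and the function returns 1.0 for b = 1.0. With a = 1e16 it returns 0.0 for b = 1.0 and 4.0 for b = 3.0, because both sums are ties that round to the neighbor with the even significand.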

The Solution: Fixed-Point Arithmetic

For financial calculations, use integer arithmetic in the smallest currency unit (cents, or for fractions of cents, milli-cents or pico-cents depending on requirements):

; Fixed-point financial arithmetic: all amounts in cents (integer)
; Amount 100.00 is stored as 10000 (cents)
; Amount 0.01 is stored as 1 (cent)
; Maximum safe value: 2^63 / 100 = ~92,233,720,368,547,758 dollars (about 92 quadrillion)

section .text
global interest_calculation

; int64_t interest_calculation(int64_t principal_cents, int64_t rate_millipercent)
; rate_millipercent: e.g., 3500 = 3.500% annual rate (1 millipercent = 0.001%)
; Returns interest in cents for one year (truncated)
; RDI = principal_cents, RSI = rate_millipercent

interest_calculation:
    ; interest = principal * rate / 100000  (divide by 100 for percent, 1000 for milli)
    ; Use 128-bit intermediate to avoid overflow
    mov  rax, rdi           ; rax = principal_cents
    imul rsi                ; rdx:rax = principal * rate_millipercent (128-bit)
    mov  rcx, 100000        ; divisor in a caller-saved register (rbx is callee-saved)
    idiv rcx                ; rax = (principal * rate) / 100000 = interest in cents
                            ; (raises #DE if the quotient does not fit in 64 bits)
    ret

The key: integer addition, subtraction, and multiplication are exact as long as the result does not overflow. There is no representation error and no accumulation error. The only imprecision is the truncation of fractional cents in the division, which is an explicit and controllable business-logic decision.
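
A C reference model is handy for cross-checking the assembly routine. The sketch below assumes a compiler that provides `__int128` (a GCC/Clang extension, not standard C) to stand in for the RDX:RAX pair; the name `interest_cents` is hypothetical.

```c
#include <stdint.h>

/* Interest in cents for one period, truncated toward zero.
 * rate_millipercent is in units of 0.001%, so 3500 means 3.5%. */
int64_t interest_cents(int64_t principal_cents, int64_t rate_millipercent)
{
    /* 128-bit product, like the one-operand IMUL result in RDX:RAX */
    __int128 product = (__int128)principal_cents * rate_millipercent;
    /* C integer division truncates toward zero, like IDIV.
     * The cast assumes the quotient fits in 64 bits. */
    return (int64_t)(product / 100000);
}
```

For example, $10,000.00 (1000000 cents) at 3500 millipercent yields 35000 cents, or $350.00. A principal of 199 cents at the same rate yields 6 cents, with the fractional 0.965 cent dropped.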

Comparison: Errors After 1000 Transactions

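Method	Computation	Worst-Case Error After 1000 Additions of $0.01 (Total $10)
Double	`xmm0 += 0.01` (addsd)	below about 1e-12 dollars
Float	`xmm0 += 0.01` (addss, single precision)	below about 5e-4 dollars (0.05 cent)
Fixed-point (cents)	`rax += 1`	0 (exact)
Fixed-point (millicents)	`rax += 1000`	0 (exact)

The floating-point figures are bounds derived from the rounding step of each addition: half the spacing between adjacent values near $10, which is about 9e-16 for double and 4.8e-7 for float, times 1000 additions. For float (32-bit), the error grows to whole cents as totals grow: once a running total reaches $262,144, adjacent floats are $0.03125 apart and adding $0.01 no longer changes the sum at all. For double (64-bit), the error stays far below one cent for typical transaction volumes but accumulates in large-scale systems.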
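
The floating-point errors can be measured directly rather than estimated. This is a rough sketch with hypothetical function names; it assumes `float` and `double` are the IEEE 754 32-bit and 64-bit formats and that `n / 100.0` is exact for the chosen `n` (true for n = 1000).

```c
/* Error in dollars after adding 0.01 n times in single precision. */
double float_sum_error(int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += 0.01f;               /* rounds to 24 bits each step */
    return (double)sum - n / 100.0;
}

/* Error in dollars after adding 0.01 n times in double precision. */
double double_sum_error(int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += 0.01;                /* rounds to 53 bits each step */
    return sum - n / 100.0;
}
```

Printing both results for n = 1000 with a format such as `%.3e` shows the gap between the two precisions. The fixed-point variants need no such measurement because their error is zero by construction.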

The Assembly Implementation: Fixed-Point Interest Accumulation

; Simulate 12 monthly interest applications (compound interest in cents)
; principal: starting amount in cents
; monthly_rate_millipct: monthly rate in millipercents (e.g., 292 ≈ 3.5% / 12 per month)

; int64_t compound_12_months(int64_t principal_cents, int64_t monthly_rate_millipct)
; RDI = principal, RSI = monthly_rate_millipct

compound_12_months:
    push rbp
    mov  rbp, rsp
    push rbx
    push r12

    mov  rbx, rdi           ; current principal in cents
    mov  r12, rsi           ; monthly rate
    mov  rcx, 12            ; 12 months

.month_loop:
    ; interest = principal * rate / 100000
    mov  rax, rbx
    imul r12                ; rdx:rax = principal * rate (128-bit)
    mov  rdi, 100000
    idiv rdi                ; rax = interest in cents (truncated)
    add  rbx, rax           ; principal += interest

    dec  rcx
    jnz  .month_loop

    mov  rax, rbx           ; return final principal in cents

    pop  r12
    pop  rbx
    pop  rbp
    ret
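
The loop above can be checked against a C reference model. This sketch assumes `__int128` is available (a GCC/Clang extension); the name `compound_12_months_ref` is hypothetical and chosen so it does not clash with the assembly symbol.

```c
#include <stdint.h>

/* Apply monthly interest 12 times, truncating fractional cents each month. */
int64_t compound_12_months_ref(int64_t principal_cents,
                               int64_t monthly_rate_millipct)
{
    int64_t balance = principal_cents;
    for (int month = 0; month < 12; month++) {
        /* 128-bit product, then truncating division, as in the IMUL/IDIV pair */
        __int128 product = (__int128)balance * monthly_rate_millipct;
        balance += (int64_t)(product / 100000);
    }
    return balance;
}
```

Starting from $1,000.00 (100000 cents) at 1000 millipercent (1%) per month, the model returns 112678 cents. The untruncated value 1000 × 1.01^12 is about $1,126.83, so per-month truncation costs a few cents here. That is the explicit policy decision mentioned earlier, not a hidden error.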

Banker's Rounding vs. Truncation

The IDIV above truncates (rounds toward zero). For financial calculations, several rounding modes exist:

- Truncate: always rounds toward zero (biases down for positive numbers)
- Round half up: standard "school" rounding
- Banker's rounding (round half to even): reduces systematic bias over many transactions

Implementing banker's rounding in assembly:

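; Divide RDX:RAX by RBX, apply banker's rounding to quotient in RAX
; (Assumes 128-bit dividend already set up, RBX != 0, quotient fits in 64 bits)
; Clobbers RCX, RDX, R8, R9
bankers_round_div:
    idiv  rbx               ; rax = truncated quotient, rdx = remainder (sign of dividend)
    test  rdx, rdx
    jz    .done             ; exact division: nothing to round

    mov   rcx, rdx
    xor   rcx, rbx          ; sign bit set if remainder and divisor differ in sign
    sar   rcx, 63           ; rcx = 0 (true quotient positive) or -1 (negative)
    or    rcx, 1            ; rcx = +1 or -1: one step away from zero

    mov   r8, rdx
    neg   r8
    cmovs r8, rdx           ; r8 = |remainder|
    mov   r9, rbx
    neg   r9
    cmovs r9, rbx           ; r9 = |divisor| (treated as unsigned)
    sub   r9, r8            ; r9 = |divisor| - |remainder|; avoids overflow of 2*remainder

    cmp   r8, r9            ; |remainder| vs. |divisor| - |remainder| (unsigned)
    jb    .done             ; below half: keep the truncated quotient
    ja    .round            ; above half: round away from zero
    test  al, 1             ; exactly half: round only if the quotient is odd
    jz    .done
.round:
    add   rax, rcx
.done:
    ret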
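
For comparison, here is one way round-half-to-even division might look in C for 64-bit operands. It is a sketch, not a definitive implementation: the name `div_round_half_even` is hypothetical, the divisor must be nonzero, and the INT64_MIN / -1 case is left to the caller.

```c
#include <stdint.h>

/* Divide n by d and round the quotient half-to-even (banker's rounding). */
int64_t div_round_half_even(int64_t n, int64_t d)
{
    int64_t q = n / d;              /* truncates toward zero, like IDIV */
    int64_t r = n % d;              /* remainder has the sign of n */
    if (r == 0)
        return q;                   /* exact: nothing to round */

    /* Magnitudes as unsigned values so that INT64_MIN is handled. */
    uint64_t abs_r = r < 0 ? 0 - (uint64_t)r : (uint64_t)r;
    uint64_t abs_d = d < 0 ? 0 - (uint64_t)d : (uint64_t)d;
    uint64_t rest  = abs_d - abs_r; /* compare r with d - r instead of 2r with d */

    /* One step away from zero: -1 if the true quotient is negative. */
    int64_t step = ((r < 0) != (d < 0)) ? -1 : 1;

    if (abs_r > rest || (abs_r == rest && (q & 1)))
        q += step;                  /* above half, or a tie with an odd quotient */
    return q;
}
```

Comparing the remainder against `abs_d - abs_r` rather than doubling the remainder sidesteps the overflow that `2 * r` could cause for very large divisors.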

The Lesson for Assembly Programmers

When implementing financial logic in assembly:

Never use floating-point for exact monetary amounts. Use int64_t in the smallest unit you need.
For percentage calculations, scale up. 3.5% becomes 3500 millipercent or 3,500,000 micropercent as an integer, then divide at the end.
Use 128-bit intermediates (RDX:RAX + IMUL/IDIV) for the intermediate products before the final division. This prevents overflow when multiplying large amounts by rate values.
Be explicit about truncation/rounding policy. IDIV truncates toward zero. Verify this matches your requirements; implement banker's rounding if needed.
Test at the boundaries. Test with $0.01, $0.00, $999,999,999.99 and confirm the integer arithmetic handles them without overflow.

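The C standard library's printf("%.2f") rounds the binary floating-point value to two decimal places for display only, which can hide an accumulated error until it crosses a rounding boundary. To print a fixed-point amount, format cents / 100 and cents % 100 as integers instead. For input parsing, read the digits as integers directly rather than calling strtod() if you need exact cents.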
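
Integer-only parsing and formatting might be sketched as follows. Both function names are hypothetical, the accepted syntax (optional minus sign, digits, optional one or two decimals) is an assumption made for illustration, and overflow of very long inputs is not checked.

```c
#include <stdint.h>
#include <stdio.h>

/* Parse "[-]digits[.d[d]]" into cents. Returns 0 on success, -1 on bad input. */
int parse_cents(const char *s, int64_t *out)
{
    int negative = 0, ndigits = 0, nfrac = 0;
    int64_t units = 0, frac = 0;

    if (*s == '-') { negative = 1; s++; }
    while (*s >= '0' && *s <= '9') {
        units = units * 10 + (*s - '0');
        s++; ndigits++;
    }
    if (*s == '.') {
        s++;
        while (*s >= '0' && *s <= '9' && nfrac < 2) {
            frac = frac * 10 + (*s - '0');
            s++; nfrac++;
        }
        if (nfrac == 0) return -1;       /* "12." has no decimals */
    }
    if (ndigits == 0 || *s != '\0') return -1;  /* empty, junk, or a 3rd decimal */
    if (nfrac == 1) frac *= 10;          /* "1.5" means 50 cents */

    *out = units * 100 + frac;
    if (negative) *out = -*out;
    return 0;
}

/* Format cents as "[-]D.CC" using only integer operations. */
int format_cents(int64_t cents, char *buf, size_t n)
{
    uint64_t mag = cents < 0 ? 0 - (uint64_t)cents : (uint64_t)cents;
    return snprintf(buf, n, "%s%llu.%02llu", cents < 0 ? "-" : "",
                    (unsigned long long)(mag / 100),
                    (unsigned long long)(mag % 100));
}
```

With these helpers, "123.45" parses to 12345 and -5 formats as "-0.05", with no floating-point value involved at any point.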