Case Study 14.2: Floating-Point Pitfalls in Financial Calculations
Why Finance and Floating-Point Do Not Mix
The IEEE 754 binary formats store values as binary fractions. Most decimal fractions (0.1, 0.01, 0.35, etc.) are not exactly representable in binary; only fractions whose denominator is a power of two, such as 0.25 or 0.5, are. This is not a bug; it is a consequence of the base-2 representation. But for financial calculations, where exactness is often a legal requirement, the resulting errors cause real problems.
Demonstration: Accumulation Error
; Add 1 cent (0.01) one million times to an account
; Expected result: $10,000.00 = 10000.0
; What we get: approximately 10000.00000017
section .data
one_cent: dq 0.01 ; 0.01 as double
iterations: dq 1000000 ; one million
result_fp: dq 0.0 ; accumulated total (floating-point)
result_int: dq 0 ; accumulated total (integer, in cents)
section .text
global accumulation_demo
accumulation_demo:
; Floating-point accumulation:
xorpd xmm0, xmm0 ; xmm0 = 0.0 (accumulated sum)
movsd xmm1, [rel one_cent] ; xmm1 = 0.01
mov rcx, [rel iterations] ; loop counter
.fp_loop:
addsd xmm0, xmm1 ; sum += 0.01
dec rcx
jnz .fp_loop
movsd [rel result_fp], xmm0 ; store floating-point result
; Integer accumulation (exact): adding 1 cent N times yields N cents,
; so the total is simply the iteration count
mov rax, [rel iterations]
mov [rel result_int], rax ; 1,000,000 cents = $10,000.00 exactly
ret
The difference: result_fp ≈ 10000.00000017 vs. result_int = 1000000 (exactly $10,000.00 in cents).
Why 0.01 is Not Exactly Representable
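In binary: 0.01 = 0.0000001010001111010111000010100011... (the 20-bit block 01000111101011100001 repeats forever)
The IEEE 754 double representation rounds this to: 0.01000000000000000020816681711721685...
The stored constant is therefore off by about 2 × 10^-19, which by itself would contribute only about 2 × 10^-13 after one million additions. The dominant error comes from rounding each partial sum to 53 bits: once the total is in the thousands, adjacent doubles are about 10^-12 apart, so each addition can be off by up to roughly 10^-12. Here those roundings mostly go in the same direction, so they accumulate to about 2 × 10^-7 instead of cancelling.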
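The rounding of 0.01 can be inspected directly. The following C sketch splits a double into an integer significand and a power-of-two exponent, so that value = significand × 2^exponent. The helper name `decode_double` is hypothetical, and the code assumes that `double` is IEEE 754 binary64 and that the input is a normal, finite number.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split a normal, finite binary64 value into an integer
   significand and a power-of-two exponent such that
   |x| == significand * 2^exponent. Assumes double is IEEE 754 binary64. */
void decode_double(double x, int64_t *significand, int *exponent)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);                 /* well-defined type pun */

    uint64_t fraction = bits & ((UINT64_C(1) << 52) - 1);  /* low 52 bits */
    int biased = (int)((bits >> 52) & 0x7FF);               /* 11-bit exponent */

    /* Restore the implicit leading 1 of a normal number. */
    *significand = (int64_t)(fraction | (UINT64_C(1) << 52));
    *exponent = biased - 1023 - 52;
}
```

For 0.01 this yields a significand of 5764607523034235 and an exponent of -59. Multiplying that significand by 100 gives 2^59 + 12 rather than 2^59, which shows that the stored value is slightly larger than 1/100.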
Catastrophic Cancellation
; Catastrophic cancellation: computing (a + b) - a when b is much smaller than a
; Should return b exactly; here b is lost entirely
section .data
val_a: dq 1.0e16 ; large value: a = 10^16 (above 2^53 ≈ 9.007e15)
val_b: dq 1.0 ; small value: b = 1.0
; Expected: (a + b) - a = b = 1.0
; Actual: 0.0 (b is completely lost!)
section .text
cancellation_demo:
movsd xmm0, [rel val_a] ; xmm0 = 1e16
movsd xmm1, [rel val_b] ; xmm1 = 1.0
addsd xmm0, xmm1 ; exact sum would be 10000000000000001.0
; But a double has a 53-bit significand, so above 2^53 only even integers
; are representable. 1e16 + 1.0 lies halfway between 1e16 and 1e16 + 2,
; and round-half-to-even picks 1e16.
movsd xmm2, [rel val_a] ; xmm2 = 1e16 (original)
subsd xmm0, xmm2 ; xmm0 = (1e16 + 1.0) - 1e16
; Expected: 1.0
; Actual: 0.0 (with a = 1e15 the result would still be exactly 1.0)
ret
The IEEE 754 double has a 53-bit significand, approximately 15.95 decimal digits, so every integer up to 2^53 (about 9.007 × 10^15) is exact. Adding 1.0 to 10^16 (17 significant digits) rounds the result back to 10^16, completely losing the 1.0; the subsequent subtraction then cancels the leading digits and exposes the loss.
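Whether the small addend survives depends on the magnitude of the large one: a double represents every integer up to 2^53 exactly, and only beyond that does adding 1.0 get rounded away. A minimal C sketch of the same experiment follows. The function names are hypothetical, and the code assumes IEEE 754 binary64 doubles with the default round-to-nearest-even mode.

```c
#include <stdint.h>

/* Computes (a + b) - a in double precision. The volatile store forces the
   intermediate sum to be rounded to a 64-bit double, even on targets that
   might otherwise keep extra precision in registers. */
double add_then_subtract(double a, double b)
{
    volatile double sum = a + b;
    return sum - a;
}

/* The same computation on 64-bit integers is exact as long as a + b
   does not overflow. */
int64_t add_then_subtract_int(int64_t a, int64_t b)
{
    return (a + b) - a;
}
```

Calling `add_then_subtract(1.0e15, 1.0)` returns 1.0 because 10^15 + 1 is still below 2^53, while `add_then_subtract(1.0e16, 1.0)` returns 0.0. The integer version returns 1 in both cases.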
The Solution: Fixed-Point Arithmetic
For financial calculations, use integer arithmetic in the smallest currency unit (cents, or for fractions of cents, milli-cents or pico-cents depending on requirements):
; Fixed-point financial arithmetic: all amounts in cents (integer)
; Amount 100.00 is stored as 10000 (cents)
; Amount 0.01 is stored as 1 (cent)
; Maximum safe value: 2^63 / 100 = ~92,233,720,368,547,758 dollars (about 92 quadrillion)
section .text
global interest_calculation
; int64_t interest_calculation(int64_t principal_cents, int64_t rate_millipercent)
; rate_millipercent: e.g., 3500 = 3.50% annual rate (1 millipercent = 0.001%)
; Returns interest in cents for one year (truncated)
; RDI = principal_cents, RSI = rate_millipercent
interest_calculation:
; interest = principal * rate / 100000 (divide by 100 for percent, 1000 for milli)
; Use 128-bit intermediate to avoid overflow
mov rax, rdi ; rax = principal_cents
imul rsi ; rdx:rax = principal * rate_millipercent (128-bit)
mov rcx, 100000 ; divisor (RCX is caller-saved; RBX would have to be preserved)
idiv rcx ; rax = (principal * rate) / 100000 = interest in cents
ret
The key: integer addition, subtraction, and multiplication are exact as long as the result does not overflow. There is no hidden rounding and no accumulation error. The only imprecision is the truncation of fractional cents in the division, which is an explicit and controllable business decision.
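A C model of the same calculation is useful for cross-checking the assembly routine. The sketch below is one possible reference, not the routine's definition: the name `interest_cents` is hypothetical, and `__int128` is a GCC/Clang extension on 64-bit targets that stands in for the RDX:RAX pair.

```c
#include <stdint.h>

/* Hypothetical reference model of the interest calculation.
   rate_millipercent: 3500 means 3.50% (1 millipercent = 0.001%).
   Assumes the compiler provides __int128 (GCC/Clang on 64-bit targets)
   and that the final quotient fits in 64 bits. */
int64_t interest_cents(int64_t principal_cents, int64_t rate_millipercent)
{
    /* 128-bit product, mirroring the one-operand IMUL result in RDX:RAX. */
    __int128 product = (__int128)principal_cents * rate_millipercent;

    /* C division truncates toward zero, exactly like IDIV. */
    return (int64_t)(product / 100000);
}
```

For example, `interest_cents(1000000, 3500)` models $10,000.00 at 3.50% and returns 35000 cents ($350.00). A principal of 9 × 10^16 cents at the same rate produces an intermediate product above 2^63, which is why the wide intermediate matters.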
Comparison: Errors After 1000 Transactions
| Method | Computation | Error After 1000 Additions of $0.01 |
|---|---|---|
| Double | xmm0 += 0.01 | ~1.7e-14 relative (about 1.7e-13 dollars on $10) |
| Float | (single precision) | ~1.3e-5 relative (about 1.3e-4 dollars on $10, roughly a hundredth of a cent) |
| Fixed-point (cents) | rax += 1 | 0 (exact) |
| Fixed-point (millicents) | rax += 1000 | 0 (exact) |
For float (32-bit), the error after only 1000 additions is already a measurable fraction of a cent, and it becomes visible in financial reporting as volumes grow. For double (64-bit), the error stays far below the cent level for typical transaction volumes but still accumulates in large-scale systems.
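The comparison can be reproduced with a short C sketch. The function names are hypothetical, and the code assumes IEEE 754 binary32 and binary64 arithmetic evaluated at the declared precision, as on x86-64 with SSE.

```c
#include <stdint.h>

/* Add $0.01 n times in double precision. */
double sum_double(int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += 0.01;
    return total;
}

/* Add $0.01 n times in single precision. */
float sum_float(int n)
{
    float total = 0.0f;
    for (int i = 0; i < n; i++)
        total += 0.01f;
    return total;
}

/* Add 1 cent n times as an integer: always exact. */
int64_t sum_cents(int n)
{
    int64_t total = 0;
    for (int i = 0; i < n; i++)
        total += 1;
    return total;
}
```

Subtracting the exact total (10.0 for n = 1000) from each result shows the drift: the single-precision drift is many orders of magnitude larger than the double-precision drift, and `sum_cents` has none.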
The Assembly Implementation: Fixed-Point Interest Accumulation
; Simulate 12 monthly interest applications (compound interest in cents)
; principal: starting amount in cents
; monthly_rate_millipct: monthly rate in millipercent (e.g., 292 ≈ 3.50% / 12 per month)
; int64_t compound_12_months(int64_t principal_cents, int64_t monthly_rate_millipct)
; RDI = principal, RSI = monthly_rate_millipct
compound_12_months:
push rbp
mov rbp, rsp
push rbx
push r12
mov rbx, rdi ; current principal in cents
mov r12, rsi ; monthly rate
mov rcx, 12 ; 12 months
.month_loop:
; interest = principal * rate / 100000
mov rax, rbx
imul r12 ; rdx:rax = principal * rate (128-bit)
mov rdi, 100000
idiv rdi ; rax = interest in cents (truncated)
add rbx, rax ; principal += interest
dec rcx
jnz .month_loop
mov rax, rbx ; return final principal in cents
pop r12
pop rbx
pop rbp
ret
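When testing a routine like this, a C reference model makes it easy to compare results month by month. The following is a sketch under the same assumptions as the assembly (truncating division, rate in millipercent). The name `compound_12_months_ref` is hypothetical, and `__int128` is a GCC/Clang extension.

```c
#include <stdint.h>

/* Hypothetical reference model: apply monthly interest 12 times,
   truncating fractional cents each month as IDIV does.
   Assumes __int128 is available (GCC/Clang on 64-bit targets). */
int64_t compound_12_months_ref(int64_t principal_cents,
                               int64_t monthly_rate_millipct)
{
    for (int month = 0; month < 12; month++) {
        /* Wide product, like RDX:RAX after the one-operand IMUL. */
        __int128 product = (__int128)principal_cents * monthly_rate_millipct;
        int64_t interest = (int64_t)(product / 100000);  /* truncated */
        principal_cents += interest;
    }
    return principal_cents;
}
```

Starting from 1,000,000 cents ($10,000.00) at 292 millipercent per month, this model returns 1,035,603 cents. Because each month's fractional cent is dropped, the result is slightly below what unrounded compounding would give, and that loss is an explicit policy choice, not an arithmetic accident.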
Banker's Rounding vs. Truncation
The IDIV above truncates (rounds toward zero). For financial calculations, several rounding modes exist:
- Truncate: always rounds toward zero (biases down for positive numbers)
- Round half up: standard "school" rounding
- Banker's rounding (round half to even): reduces systematic bias over many transactions
Implementing banker's rounding in assembly:
; Divide RDX:RAX by RBX, apply banker's rounding to quotient in RAX
; (Assumes 128-bit dividend already set up, dividend >= 0, divisor > 0;
; signed inputs additionally need absolute values and a round-away-from-zero step)
bankers_round_div:
; First: get quotient and remainder
idiv rbx ; rax = quotient (truncated), rdx = remainder, 0 <= rdx < rbx
; Compare remainder with (divisor - remainder). This is equivalent to
; comparing 2 * remainder with divisor but cannot overflow:
; if remainder < divisor - remainder: truncate (already done)
; if remainder > divisor - remainder: round up
; if remainder == divisor - remainder: round to even (half-way case)
mov rcx, rbx
sub rcx, rdx ; rcx = divisor - remainder
cmp rdx, rcx
jb .done ; below half: keep truncated quotient
ja .round_up ; above half: round up
test al, 1 ; exactly half: round up only if quotient is odd
jz .done
.round_up:
inc rax
.done:
ret
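The sign handling is easier to see in a higher-level model. The C sketch below shows one way to implement round-half-to-even division for signed operands. The name `div_round_half_even` is hypothetical, and the code assumes a nonzero divisor and a representable result (it does not handle INT64_MIN / -1).

```c
#include <stdint.h>

/* Hypothetical helper: n / d rounded half to even, for signed operands.
   Assumes d != 0 and that the result is representable in int64_t. */
int64_t div_round_half_even(int64_t n, int64_t d)
{
    int64_t q = n / d;   /* truncates toward zero, like IDIV */
    int64_t r = n % d;   /* remainder takes the sign of n */
    if (r == 0)
        return q;

    /* Work with magnitudes; unsigned negation is well defined. */
    uint64_t abs_r = r < 0 ? 0 - (uint64_t)r : (uint64_t)r;
    uint64_t abs_d = d < 0 ? 0 - (uint64_t)d : (uint64_t)d;

    /* Compare |r| with |d| - |r| instead of 2*|r| with |d| to avoid overflow. */
    uint64_t rest = abs_d - abs_r;
    int round_away = (abs_r > rest) || (abs_r == rest && (q & 1));

    if (round_away)
        q += ((n < 0) != (d < 0)) ? -1 : 1;  /* step away from zero */
    return q;
}
```

With this model, 5/2 and -5/2 round to 2 and -2 (ties go to the even neighbour), while 7/2 and -7/2 round to 4 and -4.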
The Lesson for Assembly Programmers
When implementing financial logic in assembly:
- Never use floating-point for exact monetary amounts. Use int64_t in the smallest unit you need.
- For percentage calculations, scale up. 3.5% becomes 3500 millipercent or 3,500,000 micropercent as an integer; multiply first, then divide by the scale factor at the end.
- Use 128-bit intermediates (RDX:RAX + IMUL/IDIV) for the intermediate products before the final division. This prevents overflow when multiplying large amounts by rate values.
- Be explicit about truncation/rounding policy. IDIV truncates toward zero. Verify this matches your requirements; implement banker's rounding if needed.
- Test at the boundaries. Test with $0.01, $0.00, $999,999,999.99 and confirm the integer arithmetic handles them without overflow.
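The C standard library's printf("%.2f") formats a binary floating-point value and rounds it for display, so it cannot restore exactness that was already lost. Format amounts from the integer cent count instead (whole part = cents / 100, fraction = cents % 100). For input parsing, parse the digits into an integer directly rather than going through strtod() if you need exact cents.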
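As an illustration of integer-only input and output, the C sketch below parses a decimal amount into cents and formats cents back into text without touching floating-point. The names `parse_cents` and `format_cents` are hypothetical. The parser accepts an optional minus sign, at least one digit, and at most two fractional digits, and it performs no overflow checking.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: parse "[-]digits[.d[d]]" into cents.
   Returns 0 on success, -1 on malformed input. No overflow checking. */
int parse_cents(const char *s, int64_t *out)
{
    int negative = 0, digits = 0;
    int64_t cents = 0;

    if (*s == '-') { negative = 1; s++; }
    while (isdigit((unsigned char)*s)) {
        cents = cents * 10 + (*s - '0');
        s++; digits++;
    }
    if (digits == 0) return -1;
    cents *= 100;                        /* whole units -> cents */

    if (*s == '.') {
        int frac_digits = 0;
        int64_t frac = 0;
        s++;
        while (isdigit((unsigned char)*s) && frac_digits < 2) {
            frac = frac * 10 + (*s - '0');
            s++; frac_digits++;
        }
        if (frac_digits == 0) return -1; /* "12." is rejected */
        if (frac_digits == 1) frac *= 10; /* "12.5" means 50 cents */
        cents += frac;
    }
    if (*s != '\0') return -1;           /* trailing garbage or a 3rd digit */

    *out = negative ? -cents : cents;
    return 0;
}

/* Hypothetical helper: format cents as "[-]units.cc" using integers only.
   Returns the snprintf result. */
int format_cents(int64_t cents, char *buf, size_t size)
{
    uint64_t magnitude = cents < 0 ? 0 - (uint64_t)cents : (uint64_t)cents;
    return snprintf(buf, size, "%s%" PRIu64 ".%02" PRIu64,
                    cents < 0 ? "-" : "", magnitude / 100, magnitude % 100);
}
```

Rejecting a third fractional digit instead of silently rounding keeps the rounding policy explicit; a system that accepts sub-cent input would parse into a smaller unit such as millicents.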