Case Study 29-2: The PIT — Programming the PC Timer

Setting Up 100Hz and Measuring Time in Bare Metal

The Programmable Interval Timer is one of the oldest surviving pieces of the original IBM PC architecture. Although it dates back to the first IBM PC of 1981, it is still present (today as a block inside the chipset) in practically every x86 PC, still wired to IRQ0, and still the simplest hardware timer an OS can rely on at boot. This case study programs the PIT completely, uses it to calibrate the TSC, and builds a tick-based millisecond delay function for MinOS.

PIT Channel Architecture

The 8253/8254 PIT has three independent counter channels, each with its own 16-bit counter register:

                 ┌──────────────────────────────────┐
                 │     8253/8254 PIT                 │
   1.193182 MHz  │                                   │
   ──────────────┤   Channel 0: ───────────────────────→ IRQ0 (timer interrupt)
                 │   (Mode 3, divisor N → 1.193182 MHz / N)  │
                 │                                   │
                 │   Channel 1: ─── (legacy DRAM refresh, unused)
                 │                                   │
                 │   Channel 2: ───────────────────────→ PC Speaker
                 │   (controls speaker frequency)    │
                 └──────────────────────────────────┘

Mode/Command Port (0x43):
  Write only. Selects channel, access mode, and operating mode.

Channel 0 (0x40), Channel 1 (0x41), Channel 2 (0x42):
  Both read and write the counter value.
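
The byte written to port 0x43 packs four fields. As a rough host-side sketch (plain C rather than MinOS assembly, and with the hypothetical names `pit_command` and `enum pit_access`, which are not part of the driver), the encoding can be modeled like this:

```c
#include <stdint.h>

/* Access-mode field values for the mode/command register (port 0x43).
 * The names are illustrative; only the bit layout comes from the hardware. */
enum pit_access { PIT_LATCH = 0, PIT_LO_ONLY = 1, PIT_HI_ONLY = 2, PIT_LO_HI = 3 };

/* Bits 7-6: channel, bits 5-4: access mode, bits 3-1: operating mode,
 * bit 0: 0 = binary counting, 1 = BCD counting. */
uint8_t pit_command(unsigned channel, enum pit_access access,
                    unsigned mode, unsigned bcd)
{
    return (uint8_t)(((channel & 0x3u) << 6) |
                     (((unsigned)access & 0x3u) << 4) |
                     ((mode & 0x7u) << 1) |
                     (bcd & 0x1u));
}
```

With this model, `pit_command(0, PIT_LO_HI, 3, 0)` yields 0x36, the value the driver below writes, and `pit_command(0, PIT_LATCH, 0, 0)` yields 0x00, the channel 0 latch command.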

PIT Operating Modes

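Mode 0 (Interrupt on Terminal Count): The output goes low when the count is loaded, and the counter decrements from the loaded value to 0. At 0 the output goes high, which raises IRQ0 once on channel 0, and it stays high until a new count or mode is written.

; Mode 0: single shot, about N / 1,193,182 seconds after the count is written
; After firing, write a new count to restart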

Mode 2 (Rate Generator): The counter decrements from N to 1. When it reaches 1, the output goes low for one input clock, then returns high as the counter reloads to N; that low-to-high edge raises IRQ0. The interval between interrupts is exactly N / 1,193,182 seconds, so frequency = 1,193,182 / N Hz.

Mode 3 (Square Wave Generator, recommended here for the periodic timer): The counter decrements by 2 on every input clock, and the output toggles each time the count runs out, so one full output period still takes N input clocks. For an even divisor the output is high for N/2 clocks and low for N/2 clocks, a perfect 50% duty cycle; for an odd divisor the high phase is one clock longer than the low phase. IRQ0 is raised once per period, on the rising edge of the output, at frequency = 1,193,182 / N Hz. This is the traditional mode for the system timer.
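
To make the divisor arithmetic concrete, here is a small host-side sketch in C. The function names `pit_output_millihertz` and `pit_mode3_phases` are made up for illustration, and the odd-divisor split assumes the documented 8254 behavior of a high phase that is one clock longer than the low phase.

```c
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u   /* PIT input clock in Hz */

/* Output frequency for divisor n (1..65536) in mode 2 or mode 3,
 * in millihertz so the fractional part stays visible. */
uint64_t pit_output_millihertz(uint32_t n)
{
    return ((uint64_t)PIT_INPUT_HZ * 1000u) / n;
}

/* Mode 3 high and low phase lengths, in input clocks.
 * Even n splits evenly; odd n gives the high phase one extra clock. */
void pit_mode3_phases(uint32_t n, uint32_t *high, uint32_t *low)
{
    *high = (n + 1) / 2;
    *low  = n / 2;
}
```

For the divisor 11931 used later in this case study, the model gives 100006 mHz (about 100.006 Hz rather than exactly 100 Hz) and a 5966/5965 clock split between the high and low phases.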

Full PIT Initialization Code

; minOS/drivers/pit.asm — Complete PIT driver

section .data
    pit_ticks:   dq 0        ; total ticks since boot
    pit_hz:      dq 100      ; configured frequency
    pit_ms_per_tick: dq 10   ; 1000ms / 100Hz = 10ms per tick

section .text

;=============================================================================
; pit_init: configure PIT channel 0 for given frequency
; RDI = desired frequency in Hz (1 to 1193182)
;=============================================================================
global pit_init
pit_init:
    ; Bounds check
    test rdi, rdi
    jz .invalid
    cmp rdi, 1193182
    ja .invalid

    ; Calculate divisor
    ; divisor = 1,193,182 / hz
    ; Store hz and ms_per_tick
    mov [pit_hz], rdi

    ; ms_per_tick = 1000 / hz (may be 0 for hz > 1000)
    mov rax, 1000
    xor rdx, rdx
    div rdi
    mov [pit_ms_per_tick], rax

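    ; Calculate divisor = 1,193,182 / hz
    mov rax, 1193182
    xor rdx, rdx
    div rdi                 ; rax = divisor (>= 1 because hz <= 1193182)

    ; The reload register is 16 bits, so the divisor only fits if hz >= 19.
    ; For lower frequencies, clamp to the maximum (0x10000 = 65536 = ~18.2Hz);
    ; the hardware treats a written count of 0 as 65536.
    cmp rax, 0x10000
    jb .divisor_ok
    mov rax, 0x10000        ; low 16 bits are 0, which is what gets written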

.divisor_ok:
    ; Write Mode/Command: Channel 0, lo+hi access, Mode 3 (square wave), binary
    ; 0x36 = 00_11_011_0 = ch0 | lo+hi | mode3 | binary
    push rax
    mov al, 0x36
    out 0x43, al

    ; Write divisor: low byte, then high byte
    pop rax
    out 0x40, al            ; low byte
    shr rax, 8
    out 0x40, al            ; high byte

    ; Reset tick counter
    mov qword [pit_ticks], 0
    ret

.invalid:
    ret                     ; invalid frequency, do nothing

;=============================================================================
; pit_irq_handler: IRQ0 handler — called at pit_hz frequency
; No arguments. All registers preserved.
;=============================================================================
global pit_irq_handler
pit_irq_handler:
    push rax
    inc qword [pit_ticks]

    ; Optional: invoke scheduler tick here
    ; call scheduler_tick

    ; Send EOI to PIC1
    mov al, 0x20
    out 0x20, al

    pop rax
    iretq

;=============================================================================
; pit_get_ticks: return current tick count
; Returns: RAX = ticks since pit_init was called
;=============================================================================
global pit_get_ticks
pit_get_ticks:
    mov rax, [pit_ticks]
    ret

;=============================================================================
; pit_sleep_ms: busy-wait for at least N milliseconds
; RDI = milliseconds
; Note: granularity is 1/hz seconds (10ms at 100Hz), and the wait is
;       rounded up to whole ticks, so short delays still take up to one tick.
; Interrupts must be enabled, or pit_ticks never advances.
;=============================================================================
global pit_sleep_ms
pit_sleep_ms:
    ; Convert ms to ticks: ticks = ms / ms_per_tick = ms * hz / 1000
    mov rax, rdi            ; rax = ms
    imul rax, [pit_hz]
    xor rdx, rdx
    mov rcx, 1000
    div rcx                 ; rax = ticks needed (round down)
    inc rax                 ; add 1 tick for safety (avoid 0-tick wait)

    ; Target tick = current + rax
    add rax, [pit_ticks]
    ; Busy-wait
.wait:
    pause                   ; spin-loop hint to the CPU
    cmp [pit_ticks], rax
    jb  .wait               ; unsigned compare: spin until tick_count >= target
    ret

;=============================================================================
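; pit_sleep_ticks: busy-wait until N more ticks have been counted
; RDI = tick count
; The first tick may arrive almost immediately, so the real delay is
; between N-1 and N tick periods.
;=============================================================================
global pit_sleep_ticks
pit_sleep_ticks:
    add rdi, [pit_ticks]    ; target = current + n
.wait:
    cmp [pit_ticks], rdi
    jb  .wait               ; unsigned compare on the tick counter
    ret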

;=============================================================================
; pit_read_counter: read the current value of PIT channel 0's counter
; Returns: RAX = current 16-bit counter value, counting down
; Clobbers: RCX
; This gives sub-tick time resolution. The value maps directly to elapsed
; time only in mode 2; in mode 3 the counter steps by 2 and wraps twice
; per tick.
;=============================================================================
global pit_read_counter
pit_read_counter:
    ; Latch the current count (Mode/Command = 00_00_0000 = channel 0 latch)
    mov al, 0x00
    out 0x43, al

    ; Read low byte, then high byte
    in  al, 0x40
    movzx ecx, al           ; ecx = low byte (saved before AL is reused)
    in  al, 0x40
    movzx eax, al           ; eax = high byte
    shl eax, 8
    or  eax, ecx            ; rax = (high << 8) | low
    ret
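
The integer arithmetic in pit_init and pit_sleep_ms is easy to check on a host machine. The following is a minimal sketch in C, not the driver itself; the names `pit_divisor_for`, `pit_divisor_bytes`, and `pit_ms_to_ticks` are hypothetical, and the clamp to 65536 is an assumption about how out-of-range low frequencies should be handled.

```c
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u
#define PIT_MAX_DIV  65536u      /* written to the hardware as 0 */

/* Divisor for a requested frequency, clamped to what the 16-bit reload
 * register can represent. Returns 0 for a request pit_init would reject. */
uint32_t pit_divisor_for(uint32_t hz)
{
    if (hz == 0 || hz > PIT_INPUT_HZ)
        return 0;
    uint32_t div = PIT_INPUT_HZ / hz;
    return div > PIT_MAX_DIV ? PIT_MAX_DIV : div;
}

/* The two bytes sent to port 0x40, low byte first. 65536 becomes 0x00, 0x00. */
void pit_divisor_bytes(uint32_t div, uint8_t *lo, uint8_t *hi)
{
    *lo = (uint8_t)(div & 0xFFu);
    *hi = (uint8_t)((div >> 8) & 0xFFu);
}

/* Ticks that pit_sleep_ms waits for: floor(ms * hz / 1000) + 1. */
uint64_t pit_ms_to_ticks(uint64_t ms, uint64_t hz)
{
    return ms * hz / 1000u + 1u;
}
```

At 100 Hz the model gives a divisor of 11931 (bytes 0x9B then 0x2E), and a 500 ms sleep waits for 51 ticks because of the extra safety tick.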

High-Resolution Time Measurement

The PIT counter provides sub-tick resolution. By latching and reading the count between ticks, you can measure time with roughly microsecond granularity (one count lasts about 0.838 microseconds):

; Get current time in microseconds since pit_init
; us = tick_count × 10,000 + elapsed_counts × 10,000 / 11,931
; Assumes the 100Hz setup (divisor 11931) with channel 0 in mode 2
; (command byte 0x34), where the counter steps down by 1 per input clock.
; In mode 3 the counter steps by 2 and wraps twice per tick, so the
; sub-tick part is not meaningful there.

pit_get_us:
    ; Each tick at 100Hz = 10,000 microseconds
    ; PIT runs at 1.193182 MHz → 0.838 microseconds per count

.retry:
    ; 1. Snapshot completed ticks (r8 survives the call below)
    mov r8, [pit_ticks]

    ; 2. Get remaining counter (counts DOWN from the divisor)
    call pit_read_counter   ; rax = counts remaining in this tick
    cmp r8, [pit_ticks]
    jne .retry              ; a tick fired in between: sample again

    ; 3. Convert elapsed counts to us
    ; elapsed = divisor - remaining; elapsed_us = elapsed * 10000 / 11931
    mov rcx, 11931
    sub rcx, rax            ; rcx = counts elapsed in this tick
    imul rax, rcx, 10000
    xor rdx, rdx
    mov rcx, 11931
    div rcx                 ; rax = microseconds within current tick

    ; 4. Combine with the whole ticks
    imul r8, r8, 10000      ; r8 = total us from complete ticks
    add rax, r8
    ret
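
The conversion from a tick count plus a latched counter value to microseconds can be checked on a host with a few lines of C. This is a sketch under stated assumptions: the function name `pit_us` is hypothetical, the counter is assumed to step down by 1 from the divisor (mode 2), and the divisor and tick rate are passed in rather than read from the driver.

```c
#include <stdint.h>

/* Microseconds since the tick counter was reset.
 * ticks     : completed timer ticks
 * remaining : latched counter value (counts left in the current tick)
 * divisor   : reload value programmed into channel 0
 * hz        : configured tick rate */
uint64_t pit_us(uint64_t ticks, uint32_t remaining, uint32_t divisor, uint32_t hz)
{
    uint64_t us_per_tick = 1000000u / hz;
    if (remaining > divisor)            /* defensive: never go negative */
        remaining = divisor;
    uint32_t elapsed = divisor - remaining;   /* counts since last reload */
    return ticks * us_per_tick + (uint64_t)elapsed * us_per_tick / divisor;
}
```

Note that the sub-tick term uses the counts already elapsed (divisor minus the latched value), not the counts remaining. For example, `pit_us(0, 5966, 11931, 100)` is 4999, roughly half of a 10,000 microsecond tick.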

Implementing a Delay Calibration

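For accurate short delays, use the PIT to calibrate the CPU's time-stamp counter: measure how many TSC cycles elapse during one PIT tick, then scale by the tick rate to estimate the TSC frequency: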

; Estimate the TSC frequency by timing one PIT tick with RDTSC
; Returns: RAX = estimated TSC cycles per second
; Requires interrupts enabled so that pit_ticks advances.
pit_calibrate:
    ; 1. Wait for tick_count to increment (start on a tick boundary)
    ; 2. Read RDTSC
    ; 3. Wait for the next tick
    ; 4. Read RDTSC again; cpu_hz ≈ rdtsc_delta × pit_configured_hz

    mov rax, [pit_ticks]
.wait_tick:
    cmp [pit_ticks], rax
    je .wait_tick           ; wait for tick to increment

    rdtsc                   ; read right after the tick
    shl rdx, 32
    or  rax, rdx
    mov r8, rax             ; start TSC (r8 is caller-saved, unlike rbx)

    mov rax, [pit_ticks]
    inc rax                 ; target = one tick later
.wait_tick2:
    cmp [pit_ticks], rax
    jb .wait_tick2

    rdtsc
    shl rdx, 32
    or  rax, rdx
    sub rax, r8             ; TSC delta for one tick

    ; cpu_cycles_per_tick = rax
    ; cpu_hz ≈ rax × pit_hz (approximate: the real tick rate at divisor
    ; 11931 is about 100.007Hz, so the estimate reads slightly low)
    imul rax, [pit_hz]      ; cpu Hz estimate
    ; Store for RDTSC-based timing (Chapter 33 will use this)
    ret
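
The calibration arithmetic can be sanity-checked off-target. The sketch below is plain C with hypothetical names (`tsc_hz_estimate`, `tsc_hz_from_divisor`, `tsc_delta_to_ns`); it assumes the cycle count for one tick has already been measured, and it shows one possible refinement, scaling by the exact PIT rate instead of the nominal one.

```c
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u

/* Nominal estimate, as in pit_calibrate: cycles per tick times tick rate. */
uint64_t tsc_hz_estimate(uint64_t cycles_per_tick, uint64_t pit_hz)
{
    return cycles_per_tick * pit_hz;
}

/* Refined estimate using the true tick rate, PIT_INPUT_HZ / divisor. */
uint64_t tsc_hz_from_divisor(uint64_t cycles_per_tick, uint32_t divisor)
{
    return cycles_per_tick * PIT_INPUT_HZ / divisor;
}

/* Convert a TSC delta to nanoseconds. Splitting into whole seconds and a
 * remainder keeps the intermediate product inside 64 bits for any
 * realistic tsc_hz. */
uint64_t tsc_delta_to_ns(uint64_t delta, uint64_t tsc_hz)
{
    uint64_t secs = delta / tsc_hz;
    uint64_t rem  = delta % tsc_hz;
    return secs * 1000000000u + rem * 1000000000u / tsc_hz;
}
```

With 30,000,000 cycles measured per tick, the nominal estimate is exactly 3.0 GHz, while the divisor-based one comes out about 0.007% higher, which is the size of the error introduced by treating the tick rate as exactly 100 Hz.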

Testing in MinOS

# MinOS with 100Hz PIT, serial console output:
qemu-system-x86_64 -drive format=raw,file=minOS.img -serial stdio

# Expected output:
PIT initialized at 100Hz (divisor=11931)
tick_count after 1 second: 100
Sleeping 500ms...
Sleep complete. tick_count: 150
PC speaker test: 440Hz tone for 500ms

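The PIT's simplicity (three I/O port writes to configure, one I/O port write for EOI) makes it one of the most-programmed hardware devices in history. Every PC since 1981 has exposed the same timer interface: the same ports, the same registers, and the same 1.193182 MHz input clock, even though the discrete 8253/8254 chip has long since been folded into the chipset.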

⚡ Performance Note: At 100Hz, the timer fires 8,640,000 times per day. Each IRQ has a fixed overhead: roughly 300 cycles for interrupt entry, plus the EOI port write and register save/restore. Even with a generous budget of 3,000 cycles per interrupt, 100Hz costs 300,000 cycles per second, about 0.01% of a 3GHz CPU, which is essentially free. At 1000Hz (a common Linux tick rate) it is 0.1%, still negligible. Interrupt cost is therefore not what limits the PIT; high-resolution timers typically use the LAPIC timer (per-CPU, much higher resolution) rather than cranking up the PIT frequency.
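
The overhead figures above follow from simple arithmetic, sketched here in C with hypothetical helper names. The per-interrupt cycle cost is an assumed input, not a measurement; the result is in parts per million so the math stays in integers (100 ppm = 0.01%).

```c
#include <stdint.h>

/* Timer interrupts per day at a given tick rate. */
uint64_t timer_irqs_per_day(uint64_t irq_hz)
{
    return irq_hz * 86400u;
}

/* Share of one core spent handling the timer interrupt, in parts per
 * million, given an assumed cost per interrupt in CPU cycles. */
uint64_t timer_overhead_ppm(uint64_t irq_hz, uint64_t cycles_per_irq,
                            uint64_t cpu_hz)
{
    return irq_hz * cycles_per_irq * 1000000u / cpu_hz;
}
```

With a 3,000 cycle budget on a 3 GHz core, 100 Hz comes out at 100 ppm and 1000 Hz at 1000 ppm, matching the 0.01% and 0.1% figures.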