how lossless compression preserves audio quality
nicholas chen · march 14, 2026 · 6 min read

in an era of streaming and convenience, audio quality is often overlooked. however, for those who care about the nuances of sound, lossless audio is the gold standard. in this post, i will explore what it is and why it matters.
what is lossless audio?
lossless audio compression reduces the file size of an audio track without losing any data. unlike lossy formats like MP3 or AAC, which discard information to save space, lossless formats like FLAC or ALAC preserve every single bit of the original recording.
lossless vs lossy
lossy formats like MP3 use psychoacoustics — they throw away information humans can't easily hear: sounds masked by louder nearby frequencies, very high frequencies (above ~16 kHz for most adults), and quiet sounds during loud moments (temporal masking).
lossless formats (FLAC, WAV) keep every sample exactly. they compress like a zip file: perfectly reconstructable, nothing discarded.

common lossless formats
there are several lossless formats available today, each with its own advantages. WAV and AIFF are uncompressed formats, while FLAC and ALAC are compressed but still lossless.
what is FLAC?
FLAC = Free Lossless Audio Codec. open source by Xiph.org (same as Ogg Vorbis). completely free, no patents. common alternatives: ALAC (Apple's version, same idea), WAV/AIFF (lossless but uncompressed, no LPC), WavPack (slightly better compression). FLAC is the standard for lossless archival: open, well-supported, ~50–60% compression.
ALAC (apple lossless audio codec)
ALAC is apple's lossless format — originally proprietary, open-sourced by apple in 2011. it is similar to FLAC but designed for use within the apple ecosystem, including itunes and apple music.
WAV and AIFF
these are uncompressed formats that store raw PCM audio. the audio quality is identical to FLAC or ALAC — lossless is lossless — but they have the largest file sizes and limited metadata support.
MP3 under the hood
MP3 splits audio into short frames, uses MDCT and a psychoacoustic model to compute a masking threshold per band, then allocates bits only where the signal is audible. you're compressing perceptual error — the removed data is designed to be inaudible. at 128 kbps it mostly works; at 64 kbps artifacts appear.
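to make the bit-allocation idea concrete, here's a cartoon sketch — the band boundaries, energies, thresholds, and allocation rule are made-up illustrative numbers, not real MP3 internals:

```javascript
// Toy perceptual bit allocation (illustrative numbers, not real MP3).
// Bands whose energy sits below the masking threshold get zero bits.
const bands = [
  { range: "0-500 Hz",  energy: 80, mask: 40 },
  { range: "500-2k Hz", energy: 55, mask: 50 },
  { range: "2k-8k Hz",  energy: 30, mask: 45 }, // masked → discarded
  { range: "8k-16k Hz", energy: 20, mask: 35 }, // masked → discarded
];
for (const b of bands) {
  // Crude rule: spend bits in proportion to how far the signal clears the mask.
  const bits = b.energy > b.mask ? Math.ceil((b.energy - b.mask) / 6) : 0;
  console.log(`${b.range}: ${bits} bits/sample`);
}
```

the point is just the shape of the algorithm: masked bands cost nothing, audible bands get bits proportional to how audible they are.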
why does it matter?
the primary benefit is sound quality. lossless audio provides more detail, better dynamic range, and a wider soundstage. it's also essential for archiving and professional audio work. if you ever need to convert your music to another format, starting from a lossless source ensures the best possible results.
honest answer: for casual listening on spotify through airpods, you generally cannot hear the difference. studies show ABX tests at 320 kbps are basically coin flips for most people. the meaningful difference is archival and editing, not perceptual quality on a good encode.
how lossless audio is compressed
FLAC and similar codecs use linear prediction plus entropy coding to shrink the file without losing a single sample. below is how it works.
linear prediction in depth
a "sample" is one number in the sequence. at CD quality, 44,100 numbers per second — each is air pressure at that moment. a 3-minute song is ~8 million integers per channel: \( 44{,}100 \times 180 \approx 7.9 \) million.
why is audio predictable? sound waves are smooth. if the last four samples were 100, 105, 110, 115, the next is probably ~120. the predictor finds coefficients \( a_1, \dots, a_p \) so that the next sample is best predicted by: \( \hat{x}[n] = \sum_{k=1}^{p} a_k\, x[n-k] \).
concrete example. say \( p = 3 \) and FLAC found these coefficients: \( a_1 = 1.5 \), \( a_2 = -0.7 \), \( a_3 = 0.2 \), and the last three samples were \( x[n-1] = 100 \), \( x[n-2] = 90 \), \( x[n-3] = 80 \). the prediction is: \( \hat{x}[n] = 1.5 \cdot 100 - 0.7 \cdot 90 + 0.2 \cdot 80 = 103 \). if the actual sample \( x[n] = 105 \), the residual is: \( e[n] = x[n] - \hat{x}[n] = 2 \). so instead of storing 105 (needs ~8 bits), you store 2 (needs ~2 bits). that's the compression.
```javascript
// Same example: p=3, coefficients a1=1.5, a2=-0.7, a3=0.2
const a = [1.5, -0.7, 0.2];
const prev = [100, 90, 80]; // x[n-1], x[n-2], x[n-3]
let pred = 0;
for (let k = 0; k < a.length; k++) pred += a[k] * prev[k];
// pred = 1.5*100 + (-0.7)*90 + 0.2*80 = 103
const xActual = 105;
const residual = xActual - pred; // e[n] = 105 - 103 = 2
// Store residual (small) instead of 105 (large) → compression.
```
how does FLAC find the coefficients?
uses the Levinson–Durbin algorithm — solves a system of equations called the Yule–Walker equations. basically finds the \( a_k \) values that minimize the average squared residual: \( \sum_n \big( x[n] - \sum_{k=1}^{p} a_k\, x[n-k] \big)^2 \).
this is just least squares regression but for time series. FLAC stores the winning coefficients in the subframe header so the decoder can reconstruct.
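a minimal sketch of that pipeline — autocorrelation of the block, then the Levinson–Durbin recursion — assuming the textbook floating-point formulation (real FLAC additionally quantizes the coefficients to integers):

```javascript
// Sketch: estimate LPC coefficients via autocorrelation + Levinson-Durbin.
// Textbook float version, not FLAC's actual integer-quantized implementation.

function autocorrelation(x, maxLag) {
  const r = [];
  for (let lag = 0; lag <= maxLag; lag++) {
    let sum = 0;
    for (let n = lag; n < x.length; n++) sum += x[n] * x[n - lag];
    r.push(sum);
  }
  return r;
}

function levinsonDurbin(r, order) {
  let a = new Array(order + 1).fill(0); // a[1..order] are the coefficients
  let err = r[0];                       // prediction error energy
  for (let i = 1; i <= order; i++) {
    let acc = r[i];
    for (let j = 1; j < i; j++) acc -= a[j] * r[i - j];
    const k = acc / err;                // reflection coefficient
    const next = a.slice();
    next[i] = k;
    for (let j = 1; j < i; j++) next[j] = a[j] - k * a[i - j];
    a = next;
    err *= 1 - k * k;
  }
  return a.slice(1); // [a1, ..., ap]
}

// Demo: a smooth signal → residuals far smaller than the raw samples.
const x = [];
for (let n = 0; n < 1000; n++) x.push(Math.round(1000 * Math.sin(n * 0.05)));

const p = 3;
const coeffs = levinsonDurbin(autocorrelation(x, p), p);

let rawEnergy = 0, residualEnergy = 0;
for (let n = p; n < x.length; n++) {
  let pred = 0;
  for (let k = 1; k <= p; k++) pred += coeffs[k - 1] * x[n - k];
  const e = x[n] - Math.round(pred); // integer residual, stored exactly
  rawEnergy += x[n] * x[n];
  residualEnergy += e * e;
}
console.log(residualEnergy < rawEnergy); // residual energy is a tiny fraction of raw
```

the recursion solves the Yule–Walker system in \( O(p^2) \) instead of the \( O(p^3) \) a general linear solver would need — that's why it's the standard choice here.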
order — how many previous samples?
higher order = better prediction = smaller residuals = better compression, but you have to store more coefficients. FLAC tries multiple orders and picks the best tradeoff — typically around \( p = 2 \) to \( 8 \) for music (the format allows up to 32). order 1 (just previous sample) works okay for slowly varying signals. order 8+ captures more complex wave patterns like harmonics.
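FLAC also has cheap "fixed" predictors (orders 0–4), which are equivalent to repeated differencing of the signal — no coefficients to store at all. a sketch of trying each order and keeping the cheapest (the cost metric here is just total residual magnitude, a stand-in for actual coded size):

```javascript
// Sketch: FLAC-style fixed predictors = repeated differencing.
// Try orders 0..4, keep the one with the smallest total residual magnitude.
function diff(x) {
  const d = [];
  for (let n = 1; n < x.length; n++) d.push(x[n] - x[n - 1]);
  return d;
}

const x = [];
for (let n = 0; n < 200; n++) x.push(Math.round(1000 * Math.sin(n * 0.1)));

let best = { order: 0, cost: Infinity };
let r = x.slice();
for (let order = 0; order <= 4; order++) {
  const cost = r.reduce((s, e) => s + Math.abs(e), 0);
  if (cost < best.cost) best = { order, cost };
  r = diff(r); // next order's residuals = difference of the current ones
}
console.log(best); // a higher order wins on this smooth signal
```

notice the tradeoff in action: each differencing pass shrinks the smooth part of the signal but amplifies the noisy part, so the winner is some middle order, not always the highest.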
the key point:
the decoder has the same coefficients and the residuals. it just runs \( x[n] = \hat{x}[n] + e[n] \). since nothing was ever approximated or thrown away — residuals stored exactly — you get back the original perfectly every time.
what is n?
n is just the index — the position of the current sample in the sequence. so if you have \( p = 3 \) and you're predicting sample number 50, then \( n = 50 \): the predictor uses \( x[49] \), \( x[48] \), \( x[47] \). the formula works for any \( n \) — you slide it across the whole audio sequence, predicting each sample from the ones before it.
why store the error?
the residual is the predictor's mistake. if it guessed 103 and the real sample was 105, \( e[n] = 2 \). we store the error because it's almost always a small number — small numbers need fewer bits. raw sample 105 could be anything in \( [-32768, 32767] \) (16 bits); residual 2 needs ~2–3 bits. audio is smooth so the predictor is usually close; the error is the small unpredictable part. we never round — we store it exactly. that's what makes it lossless.
so instead of writing 16 bits for every sample, you write: the predictor coefficients once per frame (small, fixed cost), and 2–3 bits per residual instead of 16 bits per sample. across millions of samples that difference is massive.
why are residuals almost always small? because audio is smooth — the predictor is pretty good, so the error is rarely large. the distribution looks like: \( e = 0, \pm 1, \pm 2 \) → very common; \( \pm 3 \) to \( \pm 10 \) → common; \( |e| > 100 \) → rare. Rice coding exploits this: small numbers get short codes, large numbers get long codes. since large residuals are rare, the average bits per sample stays low.
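you can see this shape even with the dumbest possible predictor — "next sample equals the last one" — on a smooth synthetic tone (the thresholds below are the same illustrative buckets as above, not anything FLAC defines):

```javascript
// Residuals of a naive order-1 predictor on a smooth signal cluster near zero.
const x = [];
for (let n = 0; n < 1000; n++) x.push(Math.round(1000 * Math.sin(n * 0.01)));

const hist = { small: 0, medium: 0, large: 0 }; // |e| <= 10, <= 100, > 100
for (let n = 1; n < x.length; n++) {
  const e = Math.abs(x[n] - x[n - 1]); // predict x̂[n] = x[n-1]
  if (e <= 10) hist.small++;
  else if (e <= 100) hist.medium++;
  else hist.large++;
}
console.log(hist); // almost everything lands in "small"
```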
the lossless guarantee: if you stored an approximation of \( e[n] \) instead of \( e[n] \) itself, you'd get a different \( x[n] \) back. so FLAC stores the exact integer residual every time — no rounding.
how does the error give back the audio?
it doesn't — on its own the error is meaningless. the decoder needs both: \( \hat{x}[n] \) and \( e[n] \). it has the coefficients, runs the same predictor to get \( \hat{x}[n] \), then adds the stored residual. prediction + error = original sample exactly. example: predictor 103, residual 2 → 103 + 2 = 105 ✓. think of it like directions: "start at the coffee shop (prediction) and walk 2 steps east (residual)." together they get you exactly there.
```javascript
// Decoder: prediction + residual → original sample
const pred = 103;     // from same coefficients + previous samples
const residual = 2;   // stored in the bitstream
const xReconstructed = pred + residual; // 103 + 2 = 105 ✓
```
rice coding
goal: small residuals are common, large ones rare. give small numbers short codes, large numbers long codes — like Morse code (E is one dot).
the parameter \( k \): with \( k = 2 \) you split at the boundary \( 2^k = 4 \). quotient \( q = \lfloor e / 2^k \rfloor \) (how many 4s fit), remainder \( r = e \bmod 2^k \) (the leftover). store \( q \) in unary (\( q \) ones then a zero), then \( r \) in \( k \) bits. (negative residuals are first mapped to non-negative integers — FLAC zigzags them — so assume \( e \ge 0 \) here.)
concrete example: \( e = 6 \), \( k = 2 \). then \( q = \lfloor 6/4 \rfloor = 1 \), \( r = 6 \bmod 4 = 2 \). store \( q \) in unary: 1 one followed by a zero → 10. store \( r = 2 \) in binary with \( k = 2 \) bits → 10. full code: 10 10 = 4 bits. versus storing 6 as a plain byte = 8 bits. already saving bits.
why unary for \( q \)? unary means \( q \) ones then a zero: \( q = 0 \) → 0 (1 bit); \( q = 1 \) → 10 (2 bits); \( q = 2 \) → 110 (3 bits). small \( q \) (small residual) = short code.
e.g. \( q = 0 \) is the most common case — the predictor was close enough that the residual fits in its \( k \) remainder bits — so the unary part is the shortest possible, just 1 bit.
| e | normal binary | Rice code (k=2) |
|---|---|---|
| 0 | 0000 | 000 (3 bits) |
| 1 | 0001 | 001 (3 bits) |
| 2 | 0010 | 010 (3 bits) |
| 4 | 0100 | 1000 (4 bits) |
FLAC tries multiple values of \( k \) and picks whichever gives the smallest total size for that block of residuals. it stores the chosen \( k \) in the subframe so the decoder knows how to decode.
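here's a small sketch of the encode/decode path using bit strings (the helper names are mine; real FLAC works on a packed bitstream, and the zigzag helpers show the signed-to-unsigned mapping mentioned above):

```javascript
// Rice coding sketch: unary quotient, then k remainder bits.
// Works on non-negative integers; zigzag maps signed residuals into that range.
function zigzag(e)   { return e >= 0 ? 2 * e : -2 * e - 1; } // 0,-1,1,-2 → 0,1,2,3
function unzigzag(u) { return u % 2 === 0 ? u / 2 : -(u + 1) / 2; }

function riceEncode(u, k) {
  const q = u >> k;               // how many 2^k fit
  const r = u & ((1 << k) - 1);   // the leftover
  const unary = "1".repeat(q) + "0";
  const rBits = k === 0 ? "" : r.toString(2).padStart(k, "0");
  return unary + rBits;
}

function riceDecode(bits, k) {
  let q = 0, i = 0;
  while (bits[i] === "1") { q++; i++; }
  i++; // skip the terminating 0
  const r = k === 0 ? 0 : parseInt(bits.slice(i, i + k), 2);
  return (q << k) | r;
}

console.log(riceEncode(6, 2));  // "1010" — the worked example from the text
console.log(riceDecode("1010", 2)); // 6
console.log(unzigzag(riceDecode(riceEncode(zigzag(-5), 2), 2))); // -5, exact round-trip
```

the exact round-trip on a negative residual is the whole game: nothing in the chain — zigzag, split, unary, binary — ever approximates, so decode(encode(e)) is always e.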