Universal code (data compression)
In data compression, a universal code for integers is a prefix code that maps the positive integers onto binary codewords.
Description
The mapping has the additional property that, whatever the true probability distribution on the integers, as long as the distribution is monotonic (i.e., p(i) ≥ p(i + 1) for every positive integer i), the expected length of the codewords is within a constant factor of the expected length that the optimal prefix code for that distribution would have assigned.
A universal code is asymptotically optimal if the ratio between the actual and optimal expected lengths is bounded by a function of the information entropy of the distribution that, in addition to being bounded, approaches 1 as the entropy approaches infinity.
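A symbolic restatement of these two conditions may help. This is a sketch under stated assumptions: P denotes a monotonic distribution on the positive integers, H(P) its entropy in bits, and E_P[ℓ] the code's expected codeword length under P; the guard max(H(P), 1), which avoids division by a near-zero entropy, is an added convention, not taken from the article.

```latex
% Universal: the expected codeword length is within a constant factor
% of the optimal expected length, which is of the order of H(P).
\[
  \mathbb{E}_P[\ell] \;\le\; C \cdot \max\bigl(H(P),\, 1\bigr)
  \qquad \text{for some constant } C \text{ and every monotonic } P .
\]
% Asymptotically optimal: the ratio of actual to optimal expected
% length is bounded by a function f of the entropy, with f(h) -> 1.
\[
  \frac{\mathbb{E}_P[\ell]}{H(P)} \;\le\; f\bigl(H(P)\bigr),
  \qquad f \text{ bounded and } \lim_{h \to \infty} f(h) = 1 .
\]
```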
In general, most prefix codes for integers assign longer codewords to larger integers.
Such a code can be used to efficiently communicate a message drawn from a set of possible messages, by simply ordering the set of messages by decreasing probability and then sending the index of the intended message.
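As a concrete illustration, here is a minimal Python sketch of the Elias gamma code, a well-known universal code that follows exactly this pattern: larger integers receive longer codewords, and a message is transmitted as the codeword of its rank in a probability-ordered list. The function names and the toy message set are illustrative, not part of the article.

```python
def elias_gamma_encode(n: int) -> str:
    """Encode a positive integer with the Elias gamma code.

    The codeword is floor(log2(n)) zero bits followed by the binary
    representation of n, so larger integers get longer codewords.
    """
    if n < 1:
        raise ValueError("Elias gamma codes only the positive integers")
    binary = bin(n)[2:]                      # binary digits of n, no '0b' prefix
    return "0" * (len(binary) - 1) + binary  # leading zeros signal the length


def elias_gamma_decode(bits: str) -> int:
    """Decode a single Elias gamma codeword from a bit string."""
    zeros = len(bits) - len(bits.lstrip("0"))  # count the leading zeros
    return int(bits[zeros:2 * zeros + 1], 2)   # next zeros+1 bits encode n


# Sending a message as the codeword of its index: the message set is
# assumed (hypothetical toy data) to be ordered by decreasing
# probability, so more probable messages get shorter codewords.
messages = ["the", "of", "and", "universal"]   # most to least likely
rank = messages.index("of") + 1                # 1-based rank of the message
codeword = elias_gamma_encode(rank)            # rank 2 -> '010'
assert elias_gamma_decode(codeword) == rank
print(codeword)                                # prints: 010
```

The gamma code is universal in the sense defined above: its codeword for n is 2⌊log2 n⌋ + 1 bits long, within a constant factor of the roughly log2 n bits an optimal code would use for a monotonic distribution.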
Universal codes are generally not used for precisely known probability distributions, and no universal code is known to be optimal for any distribution used in practice.