Character encoding

From Wiki @ Karl Jones dot com

In computing, a character encoding represents a repertoire of characters by means of an encoding system that assigns a code to each character.

Description

Depending on the abstraction level and context, the code points of an encoding and the resulting code space may be regarded as bit patterns, octets, natural numbers, electrical pulses, and so on.
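To make these views concrete, the following minimal sketch shows one character regarded as a natural number, as a code point, as octets, and as a bit pattern. The helper name `describe` is hypothetical, and UTF-8 is chosen here only as an assumed example encoding; other encodings would produce different octets for the same character.

```python
def describe(ch: str) -> dict:
    """Return several views of a single character (illustrative helper)."""
    code_point = ord(ch)            # the character as a natural number
    octets = ch.encode("utf-8")     # the character as octets (UTF-8 assumed)
    return {
        "natural_number": code_point,
        "code_point": f"U+{code_point:04X}",
        "octets": [f"{b:02X}" for b in octets],
        "bits": " ".join(f"{b:08b}" for b in octets),
    }


info = describe("\u00e9")  # LATIN SMALL LETTER E WITH ACUTE
for view, value in info.items():
    print(f"{view}: {value}")
```

The same abstract character therefore has a single code point but may occupy one or more octets, depending on the encoding in use.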

Character encodings are used in computation, data storage, and the transmission of textual data.

Character set, character map, codeset, and code page are related, but not identical, terms.

Early character codes associated with the optical or electrical telegraph could represent only a subset of the characters used in written languages, sometimes restricted to upper case letters, numerals, and some punctuation.

The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as Unicode) which represent more of the characters used in many written languages.
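The difference in repertoire can be sketched in a few lines. This example uses ASCII as a stand-in for a restricted character code (it is not a telegraph code) and UTF-8 as one encoding of Unicode; the helper name `can_encode` and the sample strings are assumptions made for illustration.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Report whether every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


sample = "na\u00efve \u65e5\u672c\u8a9e"  # Latin text with a diaeresis, plus Japanese

print(can_encode("HELLO 123", "ascii"))  # True: within the restricted repertoire
print(can_encode(sample, "ascii"))       # False: characters outside ASCII
print(can_encode(sample, "utf-8"))       # True: Unicode covers all of them
```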

Character encoding using internationally accepted standards permits worldwide interchange of text in electronic form.

See also

  • Alt code
  • Charset sniffing – used in some applications when character encoding metadata is not available
  • Hexadecimal
  • Mojibake – garbled text produced when text is decoded with the wrong character encoding.
  • Mojikyo – a system ("glyph set") that includes over 100,000 Chinese character drawings, modern and ancient, popular and obscure.
  • Symbol
  • TRON, part of the TRON project, is an encoding system that does not use Han Unification; instead, it uses "control codes" to switch between 16-bit "planes" of characters.
  • Universal Character Set character

External links