About: Universal Character Set characters

Property	Value
dbo:abstract	Unikoda alfabeto (eo) The Unicode Consortium and ISO/IEC JTC 1/SC 2/WG 2 jointly maintain the list of characters in the Universal Coded Character Set. The Universal Coded Character Set, most commonly called the Universal Character Set (abbr. UCS, official designation: ISO/IEC 10646), is an international standard that maps characters, the discrete symbols used in natural language, mathematics, music, and other domains, to unique machine-readable data values. This mapping lets software from different vendors interoperate and interchange UCS-encoded text strings. Because the map is universal, it can represent multiple languages at the same time. This avoids the confusion of multiple legacy character encodings, in which the same sequence of codes can have different interpretations depending on the encoding in use, producing mojibake if the wrong one is chosen. UCS has a potential capacity of over 1 million characters. Each UCS character is abstractly represented by a code point, an integer between 0 and 1,114,111 (1,114,112 = 2^20 + 2^16 = 17 × 2^16 = 0x110000 code points), which represents the character within the internal logic of text processing software. As of Unicode 15.0, released in September 2022, 293,168 (26%) of these code points are allocated, 149,251 (13%) have been assigned characters, 137,468 (12.3%) are reserved for private use, 2,048 are used to enable the mechanism of surrogates, and 66 are designated as noncharacters, leaving the remaining 820,944 (74%) unallocated. The number of encoded characters is made up as follows:
* 149,014 graphical characters (some of which do not have a visible glyph, but are still counted as graphical)
* 237 characters for control and formatting
ISO maintains the basic mapping of characters from character name to code point. Often, the terms character and code point are used interchangeably. 
However, when a distinction is made, a code point refers to the integer of the character: what one might think of as its address. A character in ISO/IEC 10646, by contrast, is the combination of the code point and its name. Unicode adds many other useful properties to the character set, such as block, category, script, and directionality. In addition to the UCS, the supplementary Unicode Standard (not a joint project with ISO, but a publication of the Unicode Consortium) provides other implementation details such as:
* mappings between UCS and other character sets
* different collations of characters and character strings for different languages
* an algorithm for laying out bidirectional text ("the BiDi algorithm"), where text on the same line may shift between left-to-right ("LTR") and right-to-left ("RTL")
* a case-folding algorithm
Computer software end users enter these characters into programs through various input methods, for example, physical keyboards or virtual character palettes. The UCS can be divided in various ways, such as by plane, block, character category, or character property. 
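(en) By simple calculation, Unicode and ISO/IEC 10646 have 1,114,112 = 2^20 + 2^16 code points, U+0000 through U+10FFFF. As of Unicode 5.0.0, 102,012 (9.2%) of these code points are assigned; in addition, 137,468 (12.3%) are reserved for private use and 2,048 for surrogates, 66 are designated as noncharacters, and 872,582 (78.3%) remain unassigned. The assigned code points break down as follows:
* 2,684 are reserved for assignment within specific blocks.
* 98,893 are graphic characters.
* 435 are for control, formatting, and glyph/character variation selection.
Unicode characters can be classified in various ways. Every character is assigned a script, although many characters are assigned the script "Common" (used across scripts) or "Inherited" (taking the script of the adjacent character). In Unicode, a script is a coherent writing system that includes not only letters but also script-specific punctuation, diacritical marks, and other marks, numerals, and symbols. One script supports one or more languages. Characters are assigned to blocks. A block is usually a run of code points whose size is a multiple of 8; many are grouped into blocks of, for example, 128 or 256 code points. Every character is also assigned a general category and a subcategory. The general categories are: letter, mark, number, punctuation, symbol, and control (in other words, format or non-graphic characters). Blocks are assigned to planes. At present most characters are assigned to the first plane, the Basic Multilingual Plane. Because the Basic Multilingual Plane can be addressed with only two octets, this eases the transition from legacy software. Characters outside the first plane are usually highly specialized or rarely used. The first 256 code points correspond to those of ISO/IEC 8859-1, the 8-bit character encoding most widely used in the Western world. As a result, the first 128 characters are also equivalent to ASCII. Unicode refers to these as Latin-script blocks, but the two blocks contain many characters that are broadly useful outside the Latin script as well. (ja) This is a list of the groups of characters in the Unicode Standard 15.0. The characters are divided over seventeen planes, each with room for 65,536 characters, numbered 0 through 16 (sometimes also hexadecimal 0 through 10): (nl)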
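The arithmetic and terminology in the abstract above (code point range, planes, the ISO/IEC 8859-1 and ASCII correspondence, per-character properties, and mojibake) can be sketched in Python using only the standard library. This is an illustrative sketch, not part of the DBpedia record: the names `CODE_SPACE_SIZE`, `PLANE_SIZE`, `plane_of`, and `describe` are hypothetical helpers introduced here, and the property values returned depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata

# Assumed constants for illustration, taken from the figures in the abstract.
CODE_SPACE_SIZE = 0x110000  # 1,114,112 code points: U+0000 .. U+10FFFF
PLANE_SIZE = 0x10000        # 65,536 code points per plane


def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) that contains the code point."""
    if not 0 <= code_point < CODE_SPACE_SIZE:
        raise ValueError(f"not a UCS code point: {code_point:#x}")
    return code_point // PLANE_SIZE


def describe(char: str) -> dict:
    """Collect a few properties that Python's unicodedata module exposes."""
    cp = ord(char)
    return {
        "code_point": f"U+{cp:04X}",
        "plane": plane_of(cp),
        "name": unicodedata.name(char, "<unnamed>"),
        "category": unicodedata.category(char),
        "bidi_class": unicodedata.bidirectional(char),
    }


# The code space size matches both decompositions given in the abstract.
assert CODE_SPACE_SIZE == 2**20 + 2**16 == 17 * 2**16 == 1_114_112

# The first 256 code points coincide with ISO/IEC 8859-1,
# and the first 128 with ASCII.
assert all(bytes([b]).decode("latin-1") == chr(b) for b in range(256))
assert all(bytes([b]).decode("ascii") == chr(b) for b in range(128))

# Decoding bytes with the wrong legacy encoding yields mojibake.
utf8_bytes = "Ä".encode("utf-8")         # b'\xc3\x84'
mojibake = utf8_bytes.decode("latin-1")  # 'Ã\x84'

if __name__ == "__main__":
    for ch in ("A", "Ä", "א", "😀"):
        print(describe(ch))
```

Running the module prints one dictionary per sample character. Integer division by the plane size is enough to locate a code point's plane because every plane holds exactly 65,536 code points. Block and script are not exposed by `unicodedata`, so they are left out of this sketch.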
dbo:thumbnail	wiki-commons:Special:FilePath/New_Unicode_logo.svg?width=300
dbo:wikiPageExternalLink	https://unicode.org/cldr/utility/properties.jsp https://www.unicode.org https://web.archive.org/web/20140312143430/http:/www.decodeunicode.org/ https://www.unicode.org/versions/corrigendum9.html
dbo:wikiPageID	10633237 (xsd:integer)
dbo:wikiPageLength	56142 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	1110676012 (xsd:integer)
dbo:wikiPageWikiLink	dbr:Private_Use_Planes dbr:Mojibake dbr:Big-endian dbr:Brahmic_scripts dbr:Decimal dbr:Apple_Advanced_Typography dbc:Unicode dbr:Little-endian dbr:Right-to-left_script dbr:Character_(computing) dbr:Character_encoding dbr:Cursive dbr:UTF-16 dbr:UTF-8 dbr:Unicode dbr:Unicode_Consortium dbr:Unicode_Standard dbr:Unicode_character_property dbr:Unicode_compatibility_characters dbr:Gaiji dbr:Zero-width_joiner dbr:Virtual_keyboard dbr:Python_programming_language dbr:Hanzi dbr:Computer_keyboard dbr:Orthography dbr:Collations dbr:Endianness dbr:Glyph dbr:Musical_notation dbr:ConScript_Unicode_Registry dbr:Control_character dbr:Virama dbr:Tai_Xuan_Jing dbr:Apple_Inc. dbr:Arabic dbr:Arabic_alphabet dbr:Logograph dbr:Zero-width_non-joiner dbr:Ä dbr:Ö dbr:Document_Type_Definition dbr:Machine-readable_data dbr:String_(computer_science) dbr:Mathematical_notation dbr:Byte-order_mark dbr:Typeface dbr:Typography dbr:Western_world dbr:Code_points dbr:Featural_alphabet dbr:Logogram dbr:ASCII dbr:Abugida dbr:Alphabet dbr:Cyrillic dbr:Font dbr:Georgian_(Unicode_block) dbr:Grapheme dbr:SGML_entity dbr:Han_characters dbr:HTML dbr:Hangul dbr:Hanja dbr:Hebrew_alphabet dbr:Hebrew_language dbr:Hexadecimal dbr:International_Electrotechnical_Commission dbr:International_Organization_for_Standardization dbr:International_Phonetic_Alphabet dbr:Thaana dbr:Armenian_(Unicode_block) dbc:Character_sets dbc:IEC_standards dbr:Abjad dbr:Character_encodings dbr:Kana dbr:Kanji dbr:Latin_script dbr:Bidirectional_text dbr:Code_point dbr:Diacritic dbr:Diaeresis_(diacritic) dbr:Byte_order_mark dbr:Plane_(Unicode) dbr:Software_company dbr:Greek_and_Coptic dbr:End_users dbr:Han_script dbr:Information_interchange dbr:Integer dbr:Klingon_scripts dbr:Octet_(computing) dbr:OpenType dbr:Chancery_hand dbr:CJK_ideograph dbr:Working_group dbr:XML dbr:Unicode_block dbr:Unified_Canadian_Aboriginal_Syllabics_(Unicode_block) dbr:Syriac_Abbreviation_Mark dbr:Script_(Unicode) dbr:UTF-32 
dbr:Universal_Character_Set dbr:Text_processing dbr:ISO/IEC_JTC_1/SC_2 dbr:Left_to_right dbr:Page_layout dbr:Syllabary dbr:Private_use_(unicode) dbr:Interoperate dbr:Case_fold dbr:Character_name dbr:ISO_8859-1 dbr:Fonts_on_the_Mac dbr:Format_character dbr:Hexadecimal_digit dbr:Ethiopic dbr:Input_methods dbr:Combining_diacritical_mark dbr:South_Indic dbr:Wikt:interchange dbr:File:Writing_systems_worldwide.png dbr:File:New_Unicode_logo.svg dbr:File:AppleChancery1¼FractionExample.png dbr:File:AppleChancery4and221-225thsExample.svg dbr:File:LetterA.svg
dbp:above	Index of predominant national and selected regional or minority scripts (en)
dbp:abovestyle	background:transparent;font-size:110%;padding:0;font-weight:bold; (en)
dbp:below	(en) a Featural-alphabetic. b Limited. (en)
dbp:border	none (en)
dbp:col1header	Alphabetic (en)
dbp:col3header	dbr:Abjad
dbp:col4header	dbr:Abugida
dbp:colheaderstyle	padding:0.15em 0.15em 0.25em;font-weight:normal; (en)
dbp:colstyle	white-space:nowrap; (en)
dbp:style	90.0
dbp:wikiPageUsesTemplate	dbt:Anchor dbt:Commons_category dbt:Further dbt:Legend dbt:Main dbt:Mono dbt:Navbox_with_columns dbt:Nbsp dbt:Proper_name dbt:Redirect dbt:Reflist dbt:See_also dbt:Short_description dbt:Smallcaps dbt:Smaller dbt:Unichar dbt:Longitem dbt:\ dbt:Abbr. dbt:Unicode_navigation dbt:SpecialChars
dcterms:subject	dbc:Unicode dbc:IEC_standards
rdf:type	owl:Thing yago:Abstraction100002137 yago:Measure100033615 yago:WikicatIECStandards yago:Standard107260623 yago:SystemOfMeasurement113577171
rdfs:comment	Unikoda alfabeto (eo) This is a list of the groups of characters in the Unicode Standard 15.0. The characters are divided over seventeen planes, each with room for 65,536 characters, numbered 0 through 16 (sometimes also hexadecimal 0 through 10): (nl) The Unicode Consortium and ISO/IEC JTC 1/SC 2/WG 2 jointly maintain the list of characters in the Universal Coded Character Set. The Universal Coded Character Set, most commonly called the Universal Character Set (abbr. UCS, official designation: ISO/IEC 10646), is an international standard that maps characters, the discrete symbols used in natural language, mathematics, music, and other domains, to unique machine-readable data values. This mapping lets software from different vendors interoperate and interchange UCS-encoded text strings. Because the map is universal, it can represent multiple languages at the same time. This avoids the confusion of using multiple legacy character encodings, which can result in the same (en) By simple calculation, Unicode and ISO/IEC 10646 have 1,114,112 = 2^20 + 2^16 code points, U+0000 through U+10FFFF. As of Unicode 5.0.0, 102,012 (9.2%) of these code points are assigned; in addition, 137,468 (12.3%) are reserved for private use and 2,048 for surrogates, 66 are designated as noncharacters, and 872,582 (78.3%) remain unassigned. The assigned code points break down as follows: * 2,684 are reserved for assignment within specific blocks. * 98,893 are graphic characters. * 435 are for control, formatting, and glyph/character variation selection. Unicode characters can be classified in various ways. Every character is assigned a script, although many characters are assigned the script "Common" (used across scripts) or "Inherited" (taking the script of the adjacent character). In Unicode, a script is a coherent writing system that includes not only letters but also script-specific punctuation, diacritical marks, and other marks, numerals, and symbols. One script supports one or more languages. (ja)
rdfs:label	Unikoda alfabeto (eo) Unicode文字のマッピング (ja) Lijst van Unicode-subbereiken (nl) Universal Character Set characters (en)
rdfs:seeAlso	dbr:Unicode_control_characters dbr:HTML_character_entity_references dbr:List_of_XML
owl:sameAs	freebase:Universal Character Set characters yago-res:Universal Character Set characters wikidata:Universal Character Set characters dbpedia-eo:Universal Character Set characters dbpedia-ja:Universal Character Set characters dbpedia-nl:Universal Character Set characters https://global.dbpedia.org/id/53Xp4
prov:wasDerivedFrom	wikipedia-en:Universal_Character_Set_characters?oldid=1110676012&ns=0
foaf:depiction	wiki-commons:Special:FilePath/AppleChancery1¼FractionExample.png wiki-commons:Special:FilePath/LetterA.svg wiki-commons:Special:FilePath/Writing_systems_worldwide.png wiki-commons:Special:FilePath/AppleChancery4and221-225thsExample.svg wiki-commons:Special:FilePath/New_Unicode_logo.svg
foaf:isPrimaryTopicOf	wikipedia-en:Universal_Character_Set_characters
is dbo:wikiPageRedirects of	dbr:Unicode_characters dbr:Low_Surrogates dbr:Low_surrogate dbr:Non-character dbr:Noncharacter dbr:High_PU_Surrogates dbr:High_Surrogates dbr:High_surrogate dbr:Surrogate_code_points dbr:Surrogate_code_point dbr:Surrogate_mechanism dbr:Surrogate_pair dbr:Surrogates_in_Unicode dbr:High_Private_Use_Surrogates dbr:Mapping_of_Unicode_characters dbr:Mapping_of_Unicode_graphic_characters dbr:Universal_Character_Set_character dbr:Universal_Character_Set_Characters dbr:Low_Surrogates_(Unicode_block) dbr:Mapping_of_Unicode_blocks dbr:High_Private_Use_Surrogates_(Unicode_block) dbr:High_Surrogates_(Unicode_block) dbr:List_of_Unicode_ranges dbr:Unicode_character dbr:Unicode_codepage dbr:Unicode_range
is dbo:wikiPageWikiLink of	dbr:List_of_Unicode_characters dbr:Per_mille dbr:Character_(computing) dbr:Character_encoding dbr:UTF-16 dbr:UTF-8 dbr:Unicode_Consortium dbr:Unicode_characters dbr:Universal_Disk_Format dbr:Runes dbr:Rust_(programming_language) dbr:Orders_of_magnitude_(data) dbr:Low_Surrogates dbr:Low_surrogate dbr:Specials_(Unicode_block) dbr:Latin_script_in_Unicode dbr:Non-character dbr:Noncharacter dbr:List_of_Latin_letters_by_shape dbr:List_of_Special_Characters_for_Passwords dbr:High_PU_Surrogates dbr:High_Surrogates dbr:High_surrogate dbr:Astrological_sign dbr:Unicode_input dbr:Phonetic_symbols_in_Unicode dbr:Surrogate_code_points dbr:Surrogate_code_point dbr:Surrogate_mechanism dbr:Surrogate_pair dbr:Surrogates_in_Unicode dbr:High_Private_Use_Surrogates dbr:Zodiac dbr:Byte_order_mark dbr:Plane_(Unicode) dbr:Zalgo_text dbr:Universal_Coded_Character_Set dbr:Western_astrology dbr:Mapping_of_Unicode_characters dbr:Mapping_of_Unicode_graphic_characters dbr:Valid_characters_in_XML dbr:Unicode_symbols dbr:Universal_Character_Set_character dbr:Universal_Character_Set_Characters dbr:Low_Surrogates_(Unicode_block) dbr:Mapping_of_Unicode_blocks dbr:High_Private_Use_Surrogates_(Unicode_block) dbr:High_Surrogates_(Unicode_block) dbr:List_of_Unicode_ranges dbr:Unicode_character dbr:Unicode_codepage dbr:Unicode_range
is rdfs:seeAlso of	dbr:Unicode
is foaf:primaryTopic of	wikipedia-en:Universal_Character_Set_characters