Extension:AntiSpoof/Equivalence sets

From MAT'54 Hondekop in 3D

This is an editable copy of AntiSpoof/equivset.in

This page has been split up, to make it easier to edit. The sections are:

# This is the input file for generateEquivset.php
# The format is:
#    <hexadecimal codepoint> <character> => [<hexadecimal codepoint>] <character>
# If the codepoint is given, it must match the character, or else a warning 
# will be issued and the line will be ignored.
# The effect of such a line is to conflate the two identified characters, i.e.
# to put them in the same set. If two sets share a member, then they will be 
# merged into a single larger set.
# We have attempted to include the following types of equivalence:
#    * Case folding. Although letters of different cases are often visually 
#      distinct, they can easily be confused by people who are familiar with
#      the alphabet. Two words with a different case may be read as the same 
#      word. This is a popular technique for impersonation.
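#    * Visually similar characters. Some cross-script pairs are included, but
#      these tend to produce false conflations within a script, because two
#      distinct characters of one script can end up in the same set through
#      a shared look-alike in another script. New cross-script pairs should
#      therefore be avoided. The software implements a blanket restriction
#      against cross-script strings, which makes such pairs mostly redundant.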
#    * Chinese Simplified/Traditional pairs. 
# The list is based on one by Neil Harris, which was derived by unknown methods.
# That list also contained transliteration pairs, which we considered excessive
# and have attempted to remove. For example, the Latin E and H were considered 
# equivalent, because the Latin transliteration of the Greek "Η" (which 
# looks like Latin H) is "E". 
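
The line format and the set-merging rule described in the header can be illustrated with a short Python sketch. This is not the code of generateEquivset.php; the names `LINE_RE`, `parse_line` and `build_sets`, the regular expression, and the sample lines are all assumptions made for illustration. It assumes each character is a single code point, treats `#` lines as comments, and silently skips a line whose code point does not match its character (the real script is described as issuing a warning).

```python
import re

# Hypothetical pattern for: <hex codepoint> <char> => [<hex codepoint>] <char>
LINE_RE = re.compile(
    r"^([0-9A-Fa-f]+)\s+(\S)\s*=>\s*(?:([0-9A-Fa-f]+)\s+)?(\S)$"
)


def parse_line(line):
    """Return (left_char, right_char), or None if the line should be ignored."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    left_hex, left_char, right_hex, right_char = match.groups()
    # A given code point must match its character, otherwise skip the line.
    if int(left_hex, 16) != ord(left_char):
        return None
    if right_hex is not None and int(right_hex, 16) != ord(right_char):
        return None
    return left_char, right_char


def build_sets(lines):
    """Merge pairs into equivalence sets with a small union-find."""
    parent = {}

    def find(char):
        parent.setdefault(char, char)
        while parent[char] != char:
            parent[char] = parent[parent[char]]  # path halving
            char = parent[char]
        return char

    for line in lines:
        pair = parse_line(line)
        if pair is None:
            continue
        root_a, root_b = find(pair[0]), find(pair[1])
        if root_a != root_b:
            parent[root_b] = root_a  # sets sharing a member become one set

    sets = {}
    for char in list(parent):
        sets.setdefault(find(char), set()).add(char)
    return list(sets.values())


# Made-up sample input; the last line is skipped because 0x42 is "B", not "A".
sample = [
    "# comment",
    "30 0 => 4F O",
    "4F O => 6F o",
    "41 A => 61 a",
    "42 A => 62 b",
]
equivalence_sets = build_sets(sample)
```

In this sample the first two pairs share the member "O", so "0", "O" and "o" end up in one set, while "A" and "a" form a second set.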

Template:Extension:AntiSpoof/Equivalence sets/equivset 1
Template:Extension:AntiSpoof/Equivalence sets/equivset 2
Template:Extension:AntiSpoof/Equivalence sets/equivset 3