Abstract
CL-UNICODE is a library which provides Common Lisp implementations with knowledge about Unicode characters including their name, their general category, the scripts and blocks they belong to, their numerical value, and several other properties. It also provides the ability to replace the standard syntax for reading Lisp characters with one that is Unicode-aware and is used to enhance CL-PPCRE with Unicode properties.
CL-UNICODE is based on Unicode 5.1.
The code comes with a BSD-style license so you can basically do with it whatever you want.
Download shortcut: http://weitz.de/files/cl-unicode.tar.gz.
The library comes with a system definition for ASDF and you compile and load it in the usual way. It depends on CL-PPCRE.
CL-UNICODE builds parts of its source code automatically the first time it is compiled. This is done by parsing several Unicode data files which are included with the distribution and might take some time. This happens only once. FLEXI-STREAMS is needed for this process, but it is not used anymore once CL-UNICODE has been built.
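As a rough illustration of "the usual way", the following sketch loads the library through ASDF and calls one exported function as a smoke test. It assumes that ASDF is available via REQUIRE, that ASDF can already find the cl-unicode system and its dependencies, and that your ASDF is recent enough to provide ASDF:LOAD-SYSTEM; none of this is prescribed by CL-UNICODE itself.

```lisp
;; Assumption: ASDF ships with the Lisp and can locate :cl-unicode.
(require "asdf")

;; The first load may take a while because parts of the source are generated.
(asdf:load-system :cl-unicode)

;; Quick smoke test using a function documented below.
(defparameter *smoke-test-name* (cl-unicode:unicode-name #\A))
;; *smoke-test-name* => "LATIN CAPITAL LETTER A"
```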
You can run a test suite which tests most aspects of the library with
(asdf:oos 'asdf:test-op :cl-unicode)
(Some of these tests are expected to fail if your Lisp has a very low CHAR-CODE-LIMIT like, for example, CMUCL.)
If you want to send patches, please read this first.
[Generic function]
general-category c => name, symbol
Returns the general category of a character as a string. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. The second return value is the property symbol of the category.
CL-USER 1 > (general-category #\A)
"Lu"
CL-UNICODE-NAMES::LU
CL-USER 2 > (general-category #\-)
"Pd"
CL-UNICODE-NAMES::PD
CL-USER 3 > (general-category #\8)
"Nd"
CL-UNICODE-NAMES::ND
See also GENERAL-CATEGORIES.
[Generic function]
script c => name, symbol
Returns the script of a character as a string or NIL if there is no script for that particular character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. The second return value (if there is one) is the property symbol of the script.
CL-USER 1 > (script #\B)
"Latin"
CL-UNICODE-NAMES::LATIN
CL-USER 2 > (script (code-char #x5d0))
"Hebrew"
CL-UNICODE-NAMES::HEBREW
See also SCRIPTS.
[Generic function]
code-block c => name, symbol
Returns the block of a character as a string or NIL if there is no block for that particular character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. The second return value (if there is one) is the property symbol of the block.
CL-USER 1 > (code-block #\a)
"Basic Latin"
CL-UNICODE-NAMES::BASICLATIN
CL-USER 2 > (code-block #\ä)
"Latin-1 Supplement"
CL-UNICODE-NAMES::LATIN1SUPPLEMENT
See also CODE-BLOCKS.
[Generic function]
has-binary-property c property => generalized-boolean
Checks whether a character has the binary property property. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. property can be a string naming the property or the corresponding property symbol. If a true value is returned, it is the property symbol.
CL-USER 1 > (has-binary-property #\Space "White_Space")
CL-UNICODE-NAMES::WHITESPACE
CL-USER 2 > (has-binary-property #\F "ASCII_Hex_Digit")
CL-UNICODE-NAMES::ASCIIHEXDIGIT
CL-USER 3 > (has-binary-property #\- "Dash")
CL-UNICODE-NAMES::DASH
CL-USER 4 > (has-binary-property #\= "Dash")
NIL
See also BINARY-PROPERTIES.
[Generic function]
numeric-type c => name, symbol
Returns the numeric type of a character (one of "Decimal", "Digit", or "Numeric") as a string or NIL if that particular character has no numeric type. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. The second return value (if there is one) is the property symbol of the numeric type.
CL-USER 1 > (numeric-type #\3)
"Decimal"
CL-UNICODE-NAMES::DECIMAL
CL-USER 2 > (numeric-type (character-named "VULGAR FRACTION THREE QUARTERS"))
"Numeric"
CL-UNICODE-NAMES::NUMERIC
CL-USER 3 > (numeric-type #\z)
NIL
NIL
[Generic function]
numeric-value c => number-or-nil
Returns the numeric value of a character as a Lisp rational or NIL (for NaN). c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
CL-USER 1 > (numeric-value #\3)
3
CL-USER 2 > (numeric-value (character-named "VULGAR FRACTION THREE QUARTERS"))
3/4
CL-USER 3 > (numeric-value #\z)
NIL
[Generic function]
bidi-class c => name, symbol
Returns the bidirectional (Bidi) class of a character as a string or NIL if there is no bidirectional class for that particular character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. The second return value (if there is one) is the property symbol of the class.
CL-USER 1 > (bidi-class #\Space)
"WS"
CL-UNICODE-NAMES::WS
CL-USER 2 > (bidi-class #\A)
"L"
CL-UNICODE-NAMES::L
CL-USER 3 > (bidi-class (character-named "HEBREW LETTER ALEF"))
"R"
CL-UNICODE-NAMES::R
See also BIDI-CLASSES.
[Function]
bidi-mirroring-glyph c &key want-code-point-p => char-or-code-point
Returns the Bidi mirroring glyph for a character if the character has the BidiMirrored property and an appropriate mirroring glyph is defined. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
Returns the code point instead of the character if want-code-point-p is true. This can be especially useful for Lisp implementations where CHAR-CODE-LIMIT is smaller than +CODE-POINT-LIMIT+.
CL-USER 1 > (bidi-mirroring-glyph #\[)
#\]
CL-USER 2 > (bidi-mirroring-glyph #\])
#\[
CL-USER 3 > (bidi-mirroring-glyph #\|)
NIL
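The effect of want-code-point-p can be sketched as follows. The example asks for a code point instead of a character object and then feeds that integer back into another CL-UNICODE function, which works because the functions described here accept code points as well as characters. The variable names are made up for this illustration, and loading via ASDF:LOAD-SYSTEM is an assumption about your setup.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; Ask for an integer code point rather than a Lisp character.
(defparameter *mirror-code-point*
  (cl-unicode:bidi-mirroring-glyph #\[ :want-code-point-p t))
;; *mirror-code-point* => 93

;; Code points can be passed straight back to other functions,
;; so no character object is ever needed.
(defparameter *mirror-name*
  (cl-unicode:unicode-name *mirror-code-point*))
;; *mirror-name* => "RIGHT SQUARE BRACKET"
```

Working with integers throughout avoids CODE-CHAR entirely, which matters on Lisps that cannot represent every code point as a character.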
[Function]
lowercase-mapping c &key want-code-point-p => char-or-code-point
Returns the simple lowercase mapping of a character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. Returns the character itself if no such mapping is explicitly defined. Note that case mapping only makes sense for characters with the LC property.
Returns the code point instead of the character if want-code-point-p is true. This can be especially useful for Lisp implementations where CHAR-CODE-LIMIT is smaller than +CODE-POINT-LIMIT+.
CL-USER 1 > (lowercase-mapping #\Ä)
#\ä
CL-USER 2 > (unicode-name (lowercase-mapping (character-named "GEORGIAN CAPITAL LETTER AN")))
"GEORGIAN SMALL LETTER AN"
CL-USER 3 > (lowercase-mapping (character-named "LATIN CAPITAL LETTER SHARP S"))
#\ß
[Function]
uppercase-mapping c &key want-code-point-p => char-or-code-point
Returns the simple uppercase mapping of a character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. Returns the character itself if no such mapping is explicitly defined. Note that case mapping only makes sense for characters with the LC property.
Returns the code point instead of the character if want-code-point-p is true. This can be especially useful for Lisp implementations where CHAR-CODE-LIMIT is smaller than +CODE-POINT-LIMIT+.
CL-USER 1 > (uppercase-mapping #\s)
#\S
CL-USER 2 > (unicode-name (uppercase-mapping (character-named "GLAGOLITIC SMALL LETTER AZU")))
"GLAGOLITIC CAPITAL LETTER AZU"
[Function]
titlecase-mapping c &key want-code-point-p => char-or-code-point
Returns the simple titlecase mapping of a character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point. Returns the character itself if no such mapping is explicitly defined. Note that case mapping only makes sense for characters with the LC property.
Returns the code point instead of the character if want-code-point-p is true. This can be especially useful for Lisp implementations where CHAR-CODE-LIMIT is smaller than +CODE-POINT-LIMIT+.
CL-USER 1 > (unicode-name (titlecase-mapping (char-code (character-named "LATIN SMALL LETTER DZ WITH CARON"))))
"LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON"
CL-USER 2 > (unicode-name (uppercase-mapping (char-code (character-named "LATIN SMALL LETTER DZ WITH CARON"))))
"LATIN CAPITAL LETTER DZ WITH CARON"
[Generic function]
combining-class c => class
Returns the combining class of a character as a non-negative integer. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
CL-USER 1 > (combining-class #\~)
0
CL-USER 2 > (combining-class (character-named "COMBINING TILDE OVERLAY"))
1
CL-USER 3 > (combining-class (character-named "NON-SPACING DOUBLE OVERSCORE"))
230
[Generic function]
age c => age
Returns the age of a character or NIL if there is no age entry for that particular character. The age of a character is a list of two integers denoting the major and minor number of the Unicode version where the character first appeared. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
CL-USER 1 > (age #\K)
(1 1)
CL-USER 2 > (age (character-named "HANGUL SYLLABLE PWILH"))
(2 0)
CL-USER 3 > (age (character-named "LATIN CAPITAL LETTER SHARP S"))
(5 1)
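Because the age is a two-element list, comparing it against a given Unicode version takes a little care. The following is a minimal sketch of such a comparison; INTRODUCED-BY-P is a hypothetical helper written for this illustration and is not part of CL-UNICODE, and loading through ASDF:LOAD-SYSTEM is an assumption about your setup.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; Hypothetical helper, not part of CL-UNICODE.
(defun introduced-by-p (c major minor)
  "True if C has an age entry and first appeared in Unicode version
MAJOR.MINOR or earlier.  C is a character or a code point."
  (let ((age (cl-unicode:age c)))
    (and age
         (destructuring-bind (char-major char-minor) age
           (or (< char-major major)
               (and (= char-major major)
                    (<= char-minor minor)))))))

;; #\K has age (1 1); U+1E9E (LATIN CAPITAL LETTER SHARP S) has age (5 1).
(introduced-by-p #\K 1 1)     ; true
(introduced-by-p #x1E9E 5 0)  ; false, the character arrived in 5.1
```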
[Function]
general-categories => list
Returns a sorted list of all general categories known to CL-UNICODE. These are the possible return values of GENERAL-CATEGORY.
CL-USER 1 > (general-categories)
("Cc" "Cf" "Cn" "Co" "CS" "Ll" "Lm" "Lo" "Lt" "Lu" "Mc" "Me" "Mn" "Nd" "Nl" "No" "Pc" "Pd" "Pe" "Pf" "Pi" "Po" "Ps" "Sc" "Sk" "Sm" "So" "Zl" "Zp" "Zs")
[Function]
scripts => list
Returns a sorted list of all scripts known to CL-UNICODE. These are the possible return values of SCRIPT.
[Function]
code-blocks => list
Returns a sorted list of all blocks known to CL-UNICODE. These are the possible return values of CODE-BLOCK.
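One plausible use of SCRIPTS and CODE-BLOCKS is validating user-supplied names before passing them on. The sketch below assumes exact, case-sensitive matching against the returned strings; KNOWN-SCRIPT-P and KNOWN-CODE-BLOCK-P are hypothetical helpers for this illustration, not CL-UNICODE functions, and loading through ASDF:LOAD-SYSTEM is an assumption about your setup.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; Hypothetical helpers, not part of CL-UNICODE.
(defun known-script-p (name)
  "True if NAME is exactly one of the strings returned by SCRIPTS."
  (and (member name (cl-unicode:scripts) :test #'string=) t))

(defun known-code-block-p (name)
  "True if NAME is exactly one of the strings returned by CODE-BLOCKS."
  (and (member name (cl-unicode:code-blocks) :test #'string=) t))

(known-script-p "Latin")            ; true
(known-script-p "Klingon")          ; false
(known-code-block-p "Basic Latin")  ; true
```

Note that this compares the names literally; the look-up functions themselves canonicalize names first and are therefore more lenient.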
[Function]
binary-properties => list
Returns a sorted list of all binary properties known to CL-UNICODE. These are the allowed second arguments (modulo canonicalization) to HAS-BINARY-PROPERTY.
CL-USER 1 > (binary-properties)
("ASCII_Hex_Digit" "BidiMirrored" "Bidi_Control" "Dash" "Deprecated" "Diacritic" "Extender" "Hex_Digit" "Hyphen" "Ideographic" "IDS_Binary_Operator" "IDS_Trinary_Operator" "Join_Control" "Logical_Order_Exception" "Other_Alphabetic" "Other_Default_Ignorable_Code_Point" "Other_Grapheme_Extend" "Other_ID_Continue" "Other_ID_Start" "Other_Lowercase" "Other_Math" "Other_Uppercase" "Pattern_Syntax" "Pattern_White_Space" "Quotation_Mark" "Radical" "Soft_Dotted" "STerm" "Terminal_Punctuation" "Unified_Ideograph" "Variation_Selector" "White_Space")
[Function]
bidi-classes => list
Returns a sorted list of all Bidi classes known to CL-UNICODE. These are the possible return values of BIDI-CLASS.
CL-USER 1 > (bidi-classes)
("AL" "AN" "B" "BN" "CS" "EN" "ES" "ET" "L" "LRE" "LRO" "NSM" "ON" "PDF" "R" "RLE" "RLO" "S" "WS")
[Function]
has-property c property => generalized-boolean
Checks whether a character has the named property property. property can be a string naming a property (which will be used for look-up after canonicalization) or it can be a property symbol (see PROPERTY-SYMBOL). c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
Properties in the sense of CL-UNICODE can be names of general categories, scripts, blocks, binary properties, or Bidi classes, amongst other things. If there are a block and a script with the same name (like, say, "Cyrillic"), the bare name denotes the script. Prepend "Block:" to the name to refer to the block. (You can also prepend "Script:" to refer to the script unambiguously.) Names of Bidi classes must be prepended with "BidiClass:" if there's a potential for ambiguity.
This function also recognizes several aliases for properties (like "Symbol" for "S") and you can, as in Perl, prepend block names with "In" instead of "Block:" and most other properties with "Is". See RECOGNIZED-PROPERTIES.
Signals an error if no property named property was found.
CL-USER 1 > (has-property #\A "L")
T
CL-USER 2 > (has-property #\A "Letter")
T
CL-USER 3 > (has-property #\A "LC")
T
CL-USER 4 > (has-property #\A "CasedLetter")
T
CL-USER 5 > (has-property #\A "Lu")
T
CL-USER 6 > (has-property #\A "UppercaseLetter")
T
CL-USER 7 > (has-property #\A "IsUppercaseLetter")
T
CL-USER 8 > (has-property #\A "LowercaseLetter")
NIL
CL-USER 9 > (has-property #\A "Latin")
T
CL-USER 10 > (has-property #\A "Script:Latin")
T
CL-USER 11 > (has-property #\A "Script:Hebrew")
NIL
CL-USER 12 > (has-property #\A "Basic Latin")
T
CL-USER 13 > (has-property #\A "Block:BasicLatin")
T
CL-USER 14 > (has-property #\A "InBasicLatin")
T
CL-USER 15 > (has-property #\A "Block:Arabic")
NIL
CL-USER 16 > (has-property #\A "WhiteSpace")
NIL
CL-USER 17 > (has-property #\A "HexDigit")
CL-UNICODE-NAMES::HEXDIGIT
CL-USER 18 > (has-property #\A "BidiClass:L")
T
CL-USER 19 > (has-property #\A "BidiClass:Left-to-Right")
T
CL-USER 20 > (has-property #\A "LeftToRight")
T
CL-USER 21 > (has-property #\A "Any")
T
CL-USER 22 > (has-property #\A "Assigned")
T
CL-USER 23 > (has-property #\A "Unassigned")
NIL
CL-USER 24 > (has-property #\A "ASCII")
T
See also PROPERTY-TEST.
[Generic function]
property-test property &key errorp => function
Returns a unary function which can test code points or Lisp characters for the named property property. property is interpreted as in HAS-PROPERTY. PROPERTY-TEST is actually used internally by HAS-PROPERTY, but it might come in handy if you need a faster way to test for property (as you're saving the time to look up the property on every call).
If no property named property was found, returns NIL, or signals an error if errorp is true.
CL-USER 1 > (let ((ascii-tester (property-test "ASCII_Hex_Digit")))
              (count-if 'identity (map 'list ascii-tester "ALEF")))
3
See also CL-PPCRE's CREATE-OPTIMIZED-TEST-FUNCTION.
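The time saving mentioned above comes from looking the property up once and reusing the resulting function. A minimal sketch of that pattern follows; *UPPERCASE-P* and UPPERCASE-LETTERS are hypothetical names chosen for this illustration, errorp is passed explicitly so the example does not rely on its default, and loading through ASDF:LOAD-SYSTEM is an assumption about your setup.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; Look the property up once and keep the test function around.
;; "Lu" is the general category of uppercase letters.
(defparameter *uppercase-p* (cl-unicode:property-test "Lu" :errorp t))

;; Hypothetical helper, not part of CL-UNICODE.
(defun uppercase-letters (string)
  "Returns a string with only those characters of STRING which are in Lu."
  (remove-if-not *uppercase-p* string))

(uppercase-letters "Hello World")  ; => "HW"

;; With :ERRORP NIL an unknown property yields NIL instead of an error.
(cl-unicode:property-test "NoSuchProperty" :errorp nil)  ; => NIL
```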
[Function]
list-all-characters property &key want-code-point-p => list
Lists all characters (ordered by code point) which have the property property where property is interpreted as in HAS-PROPERTY. If want-code-point-p is true, a list of code points instead of a list of characters is returned. (If CHAR-CODE-LIMIT is smaller than +CODE-POINT-LIMIT+ in your Lisp implementation, the list of code points can actually be longer than the list of characters.)
CL-USER 1 > (mapcar 'unicode-name (list-all-characters "Grapheme_Link" :want-code-point-p t))
("DEVANAGARI SIGN VIRAMA" "BENGALI SIGN VIRAMA" "GURMUKHI SIGN VIRAMA" "GUJARATI SIGN VIRAMA" "ORIYA SIGN VIRAMA" "TAMIL SIGN VIRAMA" "TELUGU SIGN VIRAMA" "KANNADA SIGN VIRAMA" "MALAYALAM SIGN VIRAMA" "SINHALA SIGN AL-LAKUNA" "THAI CHARACTER PHINTHU" "TIBETAN MARK HALANTA" "MYANMAR SIGN VIRAMA" "MYANMAR SIGN ASAT" "TAGALOG SIGN VIRAMA" "HANUNOO SIGN PAMUDPOD" "KHMER SIGN COENG" "BALINESE ADEG ADEG" "SUNDANESE SIGN PAMAAEH" "SYLOTI NAGRI SIGN HASANTA" "SAURASHTRA SIGN VIRAMA" "REJANG VIRAMA" "KHAROSHTHI VIRAMA")
[Function]
recognized-properties &optional all => list
Returns a list of all property names known to CL-UNICODE. These are the allowed second arguments (modulo canonicalization) to HAS-PROPERTY. If all is true, known aliases (like Letter for L) are also included.
CL-USER 1 > (length (recognized-properties t))
996
[Function]
property-symbol name => symbol, name
Returns a symbol in the CL-UNICODE-NAMES package (which is only used for this purpose) which can stand in for the string name in look-ups. The symbol's name is the result of canonicalizing and then upcasing name.
A symbol returned by this function is only really useful and only actually a property symbol if the second return value is true.
All exported functions of CL-UNICODE which return strings which are property names return the corresponding property symbol as their second return value. All exported functions of CL-UNICODE which accept property names as arguments will also accept property symbols.
CL-USER 1 > (property-symbol "XID_Start")
CL-UNICODE-NAMES::XIDSTART
"XIDStart"
CL-USER 2 > (property-symbol "Foo")
CL-UNICODE-NAMES::FOO
NIL
See also PROPERTY-NAME.
[Function]
property-name symbol => name-or-nil
Returns a name (not the name) for a property symbol symbol if it is known to CL-UNICODE. Note that
(STRING= (PROPERTY-NAME (PROPERTY-SYMBOL <string>)) <string>)
is not necessarily true even if the property name is not NIL while
(EQ (PROPERTY-SYMBOL (PROPERTY-NAME <symbol>)) <symbol>)
always holds if there is a property name for <symbol>.
CL-USER 1 > (property-name 'cl-unicode-names::asciihexdigit)
"ASCII_Hex_Digit"
See also PROPERTY-SYMBOL.
[Function]
canonicalize-name name => name'
Converts the string name into a canonicalized name which can be used for unambiguous look-ups by removing all whitespace, hyphens, and underline characters.
Tries not to remove hyphens preceded by spaces or underlines if this could lead to ambiguities as described in http://unicode.org/unicode/reports/tr18/#Name_Properties.
All CL-UNICODE functions which accept string names for characters or properties will canonicalize the name first using this function and will then look up the name case-insensitively.
CL-USER 1 > (canonicalize-name "Left-to-Right")
"LefttoRight"
CL-USER 2 > (canonicalize-name "Left_To_Right")
"LeftToRight"
CL-USER 3 > (string-equal * **)
T
CL-USER 4 > (canonicalize-name "TIBETAN LETTER A")
"TIBETANLETTERA"
CL-USER 5 > (canonicalize-name "TIBETAN LETTER -A")
"TIBETANLETTER -A"
CL-USER 6 > (canonicalize-name (canonicalize-name "TIBETAN LETTER A"))
"TIBETANLETTERA"
CL-USER 7 > (canonicalize-name (canonicalize-name "TIBETAN LETTER -A"))
"TIBETANLETTER -A"
CL-USER 8 > (canonicalize-name "Tibetan_Letter_-A")
"TibetanLetter -A"
Note that the preceding character is relevant in the ambiguous cases (but there are only three of them):
CL-USER 8 > (char= (character-named "TibetanLetter A") (character-named "TibetanLetter -A"))
NIL
CL-USER 9 > (char= (character-named "TibetanLetterA") (character-named "TibetanLetter-A"))
T
[Generic function]
unicode-name c => name-or-nil
Returns the Unicode name of a character as a string or NIL if there is no name for that particular character. c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
CL-USER 1 > (unicode-name #\ß)
"LATIN SMALL LETTER SHARP S"
CL-USER 2 > (unicode-name #\ü)
"LATIN SMALL LETTER U WITH DIAERESIS"
CL-USER 3 > (unicode-name #xd4db)
"HANGUL SYLLABLE PWILH"
[Generic function]
unicode1-name c => name-or-nil
Returns the Unicode 1.0 name of a character as a string or NIL if there is no name for that particular character. This name is only non-NIL if it is significantly different from the Unicode name (see UNICODE-NAME). For control characters, sometimes the ISO 6429 name is returned instead.
c can be the character's code point (a positive integer) or a (Lisp) character assuming its character code is also its Unicode code point.
CL-USER 1 > (unicode-name (code-char 1))
NIL
CL-USER 2 > (unicode1-name (code-char 1))
"START OF HEADING"
CL-USER 3 > (unicode-name (code-char #x67e))
"ARABIC LETTER PEH"
CL-USER 4 > (unicode1-name (code-char #x67e))
"ARABIC LETTER TAA WITH THREE DOTS BELOW"
[Function]
character-named name &key want-code-point-p try-unicode1-names-p try-abbreviations-p scripts-to-try try-hex-notation-p try-lisp-names-p => char-or-code-point
Returns the character which has the name name (a string) by looking up the Unicode name (see UNICODE-NAME).
If try-unicode1-names-p is true, the Unicode 1.0 name (see UNICODE1-NAME) will be used as a fallback.
If try-abbreviations-p is true, name is treated as an abbreviation as follows: If name contains a colon, it is interpreted as "<script>:<short-name>" and the function tries to look up, in turn, the characters named "<script> <size> LETTER <short-name>", "<script> LETTER <short-name>", and "<script> <short-name>" where <size> is "SMALL" if none of the characters in <short-name> is uppercase, "CAPITAL" otherwise. If name does not contain a colon, the same algorithm as above is tried with name instead of <short-name> and each element of the list of strings scripts-to-try as <script>. (scripts-to-try can also be a single string which is interpreted as a one-element list.)
If try-hex-notation-p is true, name can be of the form "U+<x>" where <x> is a hexadecimal number with four to six digits with the obvious meaning.
If try-lisp-names-p is true, the function returns the character with the character name name (if there is one) or, if name is exactly one character, it returns this character.
All the keyword-governed alternatives are tried in the order they're described above.
See also *TRY-UNICODE1-NAMES-P*, *TRY-ABBREVIATIONS-P*, *SCRIPTS-TO-TRY*, *TRY-HEX-NOTATION-P*, and *TRY-LISP-NAMES-P*.
Returns the code point instead of the character if want-code-point-p is true. This can be especially useful for Lisp implementations where CHAR-CODE-LIMIT is smaller than +CODE-POINT-LIMIT+.
CL-USER 1 > (character-named "LATIN SMALL LETTER SHARP S")
#\ß
CL-USER 2 > (character-named "latin small letter sharp s")
#\ß
CL-USER 3 > (character-named "LatinSmallLetterSharpS")
#\ß
CL-USER 4 > (character-named "Latin:sharps" :try-abbreviations-p t)
#\ß
CL-USER 5 > (character-named "sharps" :try-abbreviations-p t :scripts-to-try "Latin")
#\ß
CL-USER 6 > (character-named "Backspace")
#\Backspace
CL-USER 7 > (character-named "Backspace" :try-unicode1-names-p nil)
NIL
CL-USER 8 > (character-named "Newline")
NIL
CL-USER 9 > (character-named "Newline" :try-lisp-names-p t)
#\Newline
CL-USER 10 > (character-named "U+0020" :try-hex-notation-p t)
#\Space
[Special variable]
*try-unicode1-names-p*
This is the default value for the try-unicode1-names-p keyword argument to CHARACTER-NAMED. Its initial value is T.
[Special variable]
*try-abbreviations-p*
This is the default value for the try-abbreviations-p keyword argument to CHARACTER-NAMED. Its initial value is NIL.
[Special variable]
*scripts-to-try*
This is the default value for the scripts-to-try keyword argument to CHARACTER-NAMED. Its initial value is NIL.
[Special variable]
*try-hex-notation-p*
This is the default value for the try-hex-notation-p keyword argument to CHARACTER-NAMED. Its initial value is NIL.
[Special variable]
*try-lisp-names-p*
This is the default value for the try-lisp-names-p keyword argument to CHARACTER-NAMED. Its initial value is NIL.
[Macro]
enable-alternative-character-syntax => |
Enables an alternative Lisp character syntax which replaces the usual syntax: After a sharpsign (#\#) and a backslash (#\\) have been read, at least one more character is read. Reading then continues as long as ASCII letters, digits, underlines, hyphens, colons, or plus signs are read. The resulting string is then used as input to CHARACTER-NAMED to produce a character.
This macro expands into an EVAL-WHEN so that if you use it as a top-level form in a file to be loaded and/or compiled it'll do what you expect. Technically, this'll push the current readtable on a stack so that matching calls of this macro and DISABLE-ALTERNATIVE-CHARACTER-SYNTAX can be nested.
Note that by default the alternative character syntax is not enabled after loading CL-UNICODE.
CL-USER 1 > (enable-alternative-character-syntax)
CL-USER 2 > (setq *try-abbreviations-p* t)
T
CL-USER 3 > (setq *scripts-to-try* "Hebrew")
"Hebrew"
CL-USER 4 > (char-code #\Alef)
1488
(It is recommended that you set *TRY-LISP-NAMES-P* to a true value when enabling the alternative syntax, so that you can still use the short syntax (like #\a) for characters.)
For an alternative syntax for strings see CL-INTERPOL.
[Macro]
disable-alternative-character-syntax => |
Restores the readtable which was active before the last call to ENABLE-ALTERNATIVE-CHARACTER-SYNTAX. If there was no such call, the standard readtable is used.
This macro expands into an EVAL-WHEN so that if you use it as a top-level form in a file to be loaded and/or compiled it'll do what you expect. Technically, this'll pop a readtable from the stack described in ENABLE-ALTERNATIVE-CHARACTER-SYNTAX so that matching calls of these macros can be nested.
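A typical way to use the two macros together is to bracket the part of a source file that needs the alternative syntax. The sketch below assumes the file is processed form by form with LOAD, that your Lisp can represent the character in question, and that ASDF:LOAD-SYSTEM works in your setup; *ALEF* is a name made up for this illustration.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; From here on, #\ followed by a name is resolved via CHARACTER-NAMED.
(cl-unicode:enable-alternative-character-syntax)

;; The full Unicode name works with the default settings; underlines are
;; removed when the name is canonicalized.
(defparameter *alef* #\Hebrew_Letter_Alef)

;; Restore the readtable that was active before the matching enable call.
(cl-unicode:disable-alternative-character-syntax)
```

Pairing the calls like this keeps the changed syntax from leaking into code that is read later.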
[Constant]
+code-point-limit+
#x110000, the smallest integer which is not a code point in the Unicode codespace.
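The constant is handy for range checks and for finding out whether your Lisp can represent every code point as a character. The following sketch shows both; VALID-CODE-POINT-P and *NEEDS-CODE-POINTS-P* are hypothetical names introduced for this illustration, and loading through ASDF:LOAD-SYSTEM is an assumption about your setup.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; Hypothetical helper, not part of CL-UNICODE.
(defun valid-code-point-p (n)
  "True if N is an integer inside the Unicode codespace."
  (and (integerp n)
       (<= 0 n)
       (< n cl-unicode:+code-point-limit+)))

;; True on Lisps where some code points have no character object, in
;; which case the WANT-CODE-POINT-P arguments become important.
(defparameter *needs-code-points-p*
  (< char-code-limit cl-unicode:+code-point-limit+))

(valid-code-point-p #x10FFFF)  ; true, the last code point
(valid-code-point-p #x110000)  ; false
```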
[Condition type]
unicode-error
All errors signalled by CL-UNICODE are of this type.
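Since all of the library's errors share this type, a single handler clause is enough to guard a call. The sketch below wraps HAS-PROPERTY, which is documented above to signal an error for unknown property names; SAFE-HAS-PROPERTY and the :ERROR return value are conventions invented for this illustration, and loading through ASDF:LOAD-SYSTEM is an assumption about your setup.

```lisp
(require "asdf")
(asdf:load-system :cl-unicode)

;; Hypothetical helper, not part of CL-UNICODE.
(defun safe-has-property (c property)
  "Like HAS-PROPERTY, but returns :ERROR instead of signalling a
UNICODE-ERROR, for instance when PROPERTY is not a known property."
  (handler-case (cl-unicode:has-property c property)
    (cl-unicode:unicode-error ()
      :error)))

(safe-has-property #\A "Lu")              ; true
(safe-has-property #\A "NoSuchProperty")  ; => :ERROR
```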
*scripts-to-try*
*try-abbreviations-p*
*try-hex-notation-p*
*try-lisp-names-p*
*try-unicode1-names-p*
+code-point-limit+
age
bidi-class
bidi-classes
bidi-mirroring-glyph
binary-properties
canonicalize-name
character-named
code-block
code-blocks
combining-class
disable-alternative-character-syntax
enable-alternative-character-syntax
general-categories
general-category
has-binary-property
has-property
list-all-characters
lowercase-mapping
numeric-type
numeric-value
property-name
property-symbol
property-test
recognized-properties
script
scripts
titlecase-mapping
unicode-error
unicode-name
unicode1-name
uppercase-mapping
This documentation was prepared with DOCUMENTATION-TEMPLATE.
$Header: /usr/local/cvsrep/cl-unicode/doc/index.html,v 1.13 2008/07/24 14:56:33 edi Exp $