* new utf8proc_map_custom for hooking in user-defined custom mappings
* whoops, add test program
* NEWS, version bump for 2.1
* change test functions to static so that gcc doesn't complain about missing prototypes
* Split codepoint sequence normalisation out into separate function.
This creates utf8proc_normalize_utf32() which takes and returns
a UTF-32 string, applying the following options:
- UTF8PROC_NLF2LS
- UTF8PROC_NLF2PS
- UTF8PROC_NLF2LF
- UTF8PROC_STRIPCC
- UTF8PROC_COMPOSE
- UTF8PROC_STABLE
The utf8proc_reencode() function has been updated to call the
new utf8proc_normalize_utf32().
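
A rough sketch of the kind of in-place pass that an option such as UTF8PROC_NLF2LF implies for a UTF-32 buffer. The function name `nlf_to_lf_utf32` and the set of newline functions handled (CR LF, CR, LF, NEL) are assumptions for illustration; this is not the library's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rewrites every newline function (CR LF, CR, LF, NEL)
 * in a UTF-32 buffer to a single LF, in place, and returns the new length. */
static size_t nlf_to_lf_utf32(int32_t *buf, size_t len) {
    size_t rpos, wpos = 0;
    assert(buf != NULL || len == 0);
    for (rpos = 0; rpos < len; rpos++) {
        int32_t c = buf[rpos];
        /* Treat CR LF as one newline by skipping over the LF. */
        if (c == 0x000D && rpos + 1 < len && buf[rpos + 1] == 0x000A) rpos++;
        if (c == 0x000A || c == 0x000D || c == 0x0085) {
            buf[wpos++] = 0x000A;   /* normalised newline */
        } else {
            buf[wpos++] = c;        /* everything else is copied unchanged */
        }
    }
    return wpos;
}
```

Because the write position never overtakes the read position, the buffer can shrink in place and the caller only needs the returned length.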
* Update code documentation: utf8proc_reencode handles UTF8PROC_CHARBOUND.
* convert sequences to utf-16 (saves 25kb)
* store sequence length in properties instead of using -1 termination (saves 10kb)
* cache index for slightly faster data creation
* store lower/upper/title mapping in sequence array (saves 25kb). Add utf8proc_totitle, as title_mapping can no longer be used to get the title codepoint. Rename xxx_mapping to xxx_seqindex, so that programs assuming a value with the old meaning fail at compile time
* change combination array data type to uint16 (saves 40kb)
* merge 1st and 2nd comb index (saves 50kb)
* kill empty prefix/suffix in combination array (saves 50kb)
* there was no need to have a separate combination start array; it can be merged into a single array
* some fixes
* mark the table as const again
* and regen
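
A minimal sketch of how sequences can be read back once they are stored as UTF-16 code units with the length kept outside the array, rather than as -1 terminated runs. The table contents, the function name `read_sequence` and the convention that the length counts codepoints are all assumptions for illustration, not the generated data or the library's decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: two sequences stored back to back with no terminator. */
static const uint16_t seq_table[] = {
    0x0041, 0x030A,   /* index 0, length 2: A + combining ring above */
    0xD835, 0xDC00    /* index 2, length 1: U+1D400 as a surrogate pair */
};

/* Decodes `length` codepoints starting at table[index] into out[].
 * Returns the number of 16-bit code units consumed. */
static size_t read_sequence(const uint16_t *table, size_t index,
                            size_t length, int32_t *out) {
    size_t pos = index, i;
    assert(table != NULL && out != NULL);
    for (i = 0; i < length; i++) {
        int32_t cp = table[pos++];
        if ((cp & 0xFC00) == 0xD800) {
            /* High surrogate: combine with the low surrogate that follows. */
            cp = 0x10000 + (((cp & 0x03FF) << 10) | (table[pos++] & 0x03FF));
        }
        out[i] = cp;
    }
    return pos - index;
}
```

Most codepoints in such sequences fit in one 16-bit unit, which is where the size saving over 32-bit entries comes from; only codepoints above U+FFFF cost two units.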
* Updates for Unicode 9.0.0 TR29 Changes
- New rules GB10 and GB12/GB13 are used, respectively, to combine
emoji-zwj sequences and to force grapheme breaks every two RI
codepoints. Unfortunately this breaks statelessness of
grapheme-boundary determination. Deal with this by ignoring the
problem in utf8proc_grapheme_break, and by hacking in a special case
in decompose
- ZWJ moved to its own boundclass, update what is now GB9 accordingly.
- Add comments to indicate which rule a given case implements
- The number of bound classes now exceeds what fits in 4 bits; expand
the field to 8 bits and reorganize fields
* Import Unicode 9 data
* Update Grapheme break API to expose state override
* Bump MAJOR version
Use relative symlinks that are independent of installation prefix.
Drop superfluous .so.MAJOR.MINOR symlink, which is not, and should never
be, needed in practice. The purpose of shared library symlinks is to
provide libraries for compile-time linking (.so) and for run-time
linking using the SONAME (.so.MAJOR).
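
The resulting layout can be sketched as a small planning function that only computes the two link names and their targets. The struct, the function name and the assumption that the real file is named `<base>.so.MAJOR.MINOR.PATCH` are hypothetical and not taken from the build files.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one symlink: its name and the target it stores. */
struct so_link {
    char name[64];
    char target[64];
};

/* Fills links[0] (.so, compile-time linking) and links[1] (.so.MAJOR, the
 * run-time SONAME). `base` must be a bare library name such as "libexample". */
static int plan_so_links(const char *base, int major, int minor, int patch,
                         struct so_link links[2]) {
    char real[64];
    int n = snprintf(real, sizeof real, "%s.so.%d.%d.%d",
                     base, major, minor, patch);
    if (n < 0 || (size_t)n >= sizeof real) return -1;
    /* The target is a bare file name, so the link stays valid wherever the
     * library directory ends up. */
    assert(strchr(real, '/') == NULL);
    snprintf(links[0].name, sizeof links[0].name, "%s.so", base);
    snprintf(links[1].name, sizeof links[1].name, "%s.so.%d", base, major);
    strcpy(links[0].target, real);
    strcpy(links[1].target, real);
    return 0;
}
```

Each planned entry could then be passed to POSIX `symlink(target, path)` with `path` built from the library directory and the link name; no `.so.MAJOR.MINOR` entry is produced.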