Selene Unicode

LuaFAR 3

Selene Unicode


/*
  Porting to Lua 5.2 and bug fixes: (C) Shmuel Zeigerman, 2010-2014.
  MIT license.
*/

/*
*   Selene Unicode/UTF-8
*   This additions
*   Copyright (c) 2005 Malete Partner, Berlin, [email protected]
*   Available under "Lua 5.0 license", see http://www.lua.org/license.html#5
*   $Id: slnunico.c,v 1.5 2006/07/26 17:20:04 paul Exp $
*
*   contains code from
** lstrlib.c,v 1.109 2004/12/01 15:46:06 roberto Exp
** Standard library for string operations and pattern-matching
** See Copyright Notice in lua.h
*
*   uses the udata table and a couple of expressions from Tcl 8.4.x UTF-8
* which comes with the following license.terms:

This software is copyrighted by the Regents of the University of
California, Sun Microsystems, Inc., Scriptics Corporation, ActiveState
Corporation and other parties.  The following terms apply to all files
associated with the software unless explicitly disclaimed in
individual files.

The authors hereby grant permission to use, copy, modify, distribute,
and license this software and its documentation for any purpose, provided
that existing copyright notices are retained in all copies and that this
notice is included verbatim in any distributions. No written agreement,
license, or royalty fee is required for any of the authorized uses.
Modifications to this software may be copyrighted by their authors
and need not follow the licensing terms described here, provided that
the new terms are clearly indicated on the first page of each file where
they apply.

IN NO EVENT SHALL THE AUTHORS OR DISTRIBUTORS BE LIABLE TO ANY PARTY
FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES
ARISING OUT OF THE USE OF THIS SOFTWARE, ITS DOCUMENTATION, OR ANY
DERIVATIVES THEREOF, EVEN IF THE AUTHORS HAVE BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

THE AUTHORS AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGEMENT.  THIS SOFTWARE
IS PROVIDED ON AN "AS IS" BASIS, AND THE AUTHORS AND DISTRIBUTORS HAVE
NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR
MODIFICATIONS.

GOVERNMENT USE: If you are acquiring this software on behalf of the
U.S. government, the Government shall have only "Restricted Rights"
in the software and related documentation as defined in the Federal
Acquisition Regulations (FARs) in Clause 52.227.19 (c) (2).  If you
are acquiring the software on behalf of the Department of Defense, the
software shall be classified as "Commercial Computer Software" and the
Government shall have only "Restricted Rights" as defined in Clause
252.227-7013 (c) (1) of DFARs.  Notwithstanding the foregoing, the
authors grant the U.S. Government and others acting in its behalf
permission to use and distribute the software in accordance with the
terms specified in this license.

(end of Tcl license terms)
*/

/*
According to http://ietf.org/rfc/rfc3629.txt we support up to 4-byte
(21 bit) sequences encoding the UTF-16 reachable 0-0x10FFFF.
Any byte not part of a 2-4 byte sequence in that range decodes to itself.
Ill formed (non-shortest) "C0 80" will be decoded as two code points C0 and 80,
not code point 0; see security considerations in the RFC.
However, UTF-16 surrogates (D800-DFFF) are accepted.

See http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries
for default grapheme clusters.
Lazy westerners we are (and lacking the Hangul_Syllable_Type data),
we care for base char + Grapheme_Extend, but not for Hangul syllable sequences.

For http://unicode.org/Public/UNIDATA/UCD.html#Grapheme_Extend
we use Mn (NON_SPACING_MARK) + Me (ENCLOSING_MARK),
ignoring the 18 mostly south asian Other_Grapheme_Extend (16 Mc, 2 Cf) from
http://www.unicode.org/Public/UNIDATA/PropList.txt
*/

/* Contains code from:
Quylthulg Copyright (C) 2009 Kein-Hong Man <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
*/