Perl Compatible Regular Expressions

Compiling PCRE on non-Unix systems ---------------------------------- This document contains the following sections: General Generic instructions for the PCRE C library The C++ wrapper functions Building for virtual Pascal Comments about Win32 builds Building under Windows with BCC5.5 Building PCRE on OpenVMS GENERAL I (Philip Hazel) have no knowledge of Windows or VMS sytems and how their libraries work. The items in the PCRE distribution and Makefile that relate to anything other than Unix-like systems are untested by me. There are some other comments and files in the Contrib directory on the ftp site that you may find useful. See ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib If you want to compile PCRE for a non-Unix system (especially for a system that does not support "configure" and "make" files), note that the basic PCRE library consists entirely of code written in Standard C, and so should compile successfully on any system that has a Standard C compiler and library. The C++ wrapper functions are a separate issue (see below). The PCRE distribution contains some experimental support for "cmake", but this is incomplete and not documented. However if you are a "cmake" user you might like to try building with "cmake". GENERIC INSTRUCTIONS FOR THE PCRE C LIBRARY The following are generic comments about building the PCRE C library "by hand". (1) Copy or rename the file config.h.generic as config.h, and edit the macro settings that it contains to whatever is appropriate for your environment. In particular, if you want to force a specific value for newline, you can define the NEWLINE macro. An alternative approach is not to edit config.h, but to use -D on the compiler command line to make any changes that you need. NOTE: There have been occasions when the way in which certain parameters in config.h are used has changed between releases. (In the configure/make world, this is handled automatically.) When upgrading to a new release, you are strongly advised to review config.h.generic before re-using what you had previously. (2) Copy or rename the file pcre.h.generic as pcre.h. (3) EITHER: Copy or rename file pcre_chartables.c.dist as pcre_chartables.c. OR: Compile dftables.c as a stand-alone program, and then run it with the single argument "pcre_chartables.c". This generates a set of standard character tables and writes them to that file. The tables are generated using the default C locale for your system. If you want to use a locale that is specified by LC_xxx environment variables, add the -L option to the dftables command. You must use this method if you are building on a system that uses EBCDIC code. The tables in pcre_chartables.c are defaults. The caller of PCRE can specify alternative tables at run time. (4) Compile the following source files: pcre_chartables.c pcre_compile.c pcre_config.c pcre_dfa_exec.c pcre_exec.c pcre_fullinfo.c pcre_get.c pcre_globals.c pcre_info.c pcre_maketables.c pcre_newline.c pcre_ord2utf8.c pcre_refcount.c pcre_study.c pcre_tables.c pcre_try_flipped.c pcre_ucp_searchfuncs.c pcre_valid_utf8.c pcre_version.c pcre_xclass.c Now link them all together into an object library in whichever form your system keeps such libraries. This is the basic PCRE C library. If your system has static and shared libraries, you may have to do this once for each type. (5) Similarly, compile pcreposix.c and link it (on its own) as the pcreposix library. (6) Compile the test program pcretest.c. This needs the functions in the pcre and pcreposix libraries when linking. (7) Run pcretest on the testinput files in the testdata directory, and check that the output matches the corresponding testoutput files. Note that the supplied files are in Unix format, with just LF characters as line terminators. You may need to edit them to change this if your system uses a different convention. (8) If you want to use the pcregrep command, compile and link pcregrep.c; it uses only the basic PCRE library (it does not need the pcreposix library). THE C++ WRAPPER FUNCTIONS The PCRE distribution also contains some C++ wrapper functions and tests, contributed by Google Inc. On a system that can use "configure" and "make", the functions are automatically built into a library called pcrecpp. It should be straightforward to compile the .cc files manually on other systems. The files called xxx_unittest.cc are test programs for each of the corresponding xxx.cc files. BUILDING FOR VIRTUAL PASCAL A script for building PCRE using Borland's C++ compiler for use with VPASCAL was contributed by Alexander Tokarev. Stefan Weber updated the script and added additional files. The following files in the distribution are for building PCRE for use with VP/Borland: makevp_c.txt, makevp_l.txt, makevp.bat, pcregexp.pas. COMMENTS ABOUT WIN32 BUILDS There are two ways of building PCRE using the "configure, make, make install" paradigm on Windows systems: using MinGW or using Cygwin. These are not at all the same thing; they are completely different from each other. There is also some experimental, undocumented support for building using "cmake", which you might like to try if you are familiar with "cmake". However, at the present time, the "cmake" process builds only a static library (not a dll), and the tests are not automatically run. The MinGW home page (http://www.mingw.org/) says this: MinGW: A collection of freely available and freely distributable Windows specific header files and import libraries combined with GNU toolsets that allow one to produce native Windows programs that do not rely on any 3rd-party C runtime DLLs. The Cygwin home page (http://www.cygwin.com/) says this: Cygwin is a Linux-like environment for Windows. It consists of two parts: . A DLL (cygwin1.dll) which acts as a Linux API emulation layer providing substantial Linux API functionality . A collection of tools which provide Linux look and feel. The Cygwin DLL currently works with all recent, commercially released x86 32 bit and 64 bit versions of Windows, with the exception of Windows CE. On both MinGW and Cygwin, PCRE should build correctly using: ./configure && make && make install This should create two libraries called libpcre and libpcreposix, and, if you have enabled building the C++ wrapper, a third one called libpcrecpp. These are independent libraries: when you like with libpcreposix or libpcrecpp you must also link with libpcre, which contains the basic functions. (Some earlier releases of PCRE included the basic libpcre functions in libpcreposix. This no longer happens.) If you want to statically link your program against a non-dll .a file, you must define PCRE_STATIC before including pcre.h, otherwise the pcre_malloc() and pcre_free() exported functions will be declared __declspec(dllimport), with unwanted results. Using Cygwin's compiler generates libraries and executables that depend on cygwin1.dll. If a library that is generated this way is distributed, cygwin1.dll has to be distributed as well. Since cygwin1.dll is under the GPL licence, this forces not only PCRE to be under the GPL, but also the entire application. A distributor who wants to keep their own code proprietary must purchase an appropriate Cygwin licence. MinGW has no such restrictions. The MinGW compiler generates a library or executable that can run standalone on Windows without any third party dll or licensing issues. But there is more complication: If a Cygwin user uses the -mno-cygwin Cygwin gcc flag, what that really does is to tell Cygwin's gcc to use the MinGW gcc. Cygwin's gcc is only acting as a front end to MinGW's gcc (if you install Cygwin's gcc, you get both Cygwin's gcc and MinGW's gcc). So, a user can: . Build native binaries by using MinGW or by getting Cygwin and using -mno-cygwin. . Build binaries that depend on cygwin1.dll by using Cygwin with the normal compiler flags. The test files that are supplied with PCRE are in Unix format, with LF characters as line terminators. It may be necessary to change the line terminators in order to get some of the tests to work. We hope to improve things in this area in future. BUILDING UNDER WINDOWS WITH BCC5.5 Michael Roy sent these comments about building PCRE under Windows with BCC5.5: Some of the core BCC libraries have a version of PCRE from 1998 built in, which can lead to pcre_exec() giving an erroneous PCRE_ERROR_NULL from a version mismatch. I'm including an easy workaround below, if you'd like to include it in the non-unix instructions: When linking a project with BCC5.5, pcre.lib must be included before any of the libraries cw32.lib, cw32i.lib, cw32mt.lib, and cw32mti.lib on the command line. BUILDING PCRE ON OPENVMS Dan Mooney sent the following comments about building PCRE on OpenVMS. They relate to an older version of PCRE that used fewer source files, so the exact commands will need changing. See the current list of source files above. "It was quite easy to compile and link the library. I don't have a formal make file but the attached file [reproduced below] contains the OpenVMS DCL commands I used to build the library. I had to add #define POSIX_MALLOC_THRESHOLD 10 to pcre.h since it was not defined anywhere. The library was built on: O/S: HP OpenVMS v7.3-1 Compiler: Compaq C v6.5-001-48BCD Linker: vA13-01 The test results did not match 100% due to the issues you mention in your documentation regarding isprint(), iscntrl(), isgraph() and ispunct(). I modified some of the character tables temporarily and was able to get the results to match. Tests using the fr locale did not match since I don't have that locale loaded. The study size was always reported to be 3 less than the value in the standard test output files." ========================= $! This DCL procedure builds PCRE on OpenVMS $! $! I followed the instructions in the non-unix-use file in the distribution. $! $ COMPILE == "CC/LIST/NOMEMBER_ALIGNMENT/PREFIX_LIBRARY_ENTRIES=ALL_ENTRIES $ COMPILE DFTABLES.C $ LINK/EXE=DFTABLES.EXE DFTABLES.OBJ $ RUN DFTABLES.EXE/OUTPUT=CHARTABLES.C $ COMPILE MAKETABLES.C $ COMPILE GET.C $ COMPILE STUDY.C $! I had to set POSIX_MALLOC_THRESHOLD to 10 in PCRE.H since the symbol $! did not seem to be defined anywhere. $! I edited pcre.h and added #DEFINE SUPPORT_UTF8 to enable UTF8 support. $ COMPILE PCRE.C $ LIB/CREATE PCRE MAKETABLES.OBJ, GET.OBJ, STUDY.OBJ, PCRE.OBJ $! I had to set POSIX_MALLOC_THRESHOLD to 10 in PCRE.H since the symbol $! did not seem to be defined anywhere. $ COMPILE PCREPOSIX.C $ LIB/CREATE PCREPOSIX PCREPOSIX.OBJ $ COMPILE PCRETEST.C $ LINK/EXE=PCRETEST.EXE PCRETEST.OBJ, PCRE/LIB, PCREPOSIX/LIB $! C programs that want access to command line arguments must be $! defined as a symbol $ PCRETEST :== "$ SYS$ROADSUSERS:[DMOONEY.REGEXP]PCRETEST.EXE" $! Arguments must be enclosed in quotes. $ PCRETEST "-C" $! Test results: $! $! The test results did not match 100%. The functions isprint(), iscntrl(), $! isgraph() and ispunct() on OpenVMS must not produce the same results $! as the system that built the test output files provided with the $! distribution. $! $! The study size did not match and was always 3 less on OpenVMS. $! $! Locale could not be set to fr $! ========================= Last Updated: 13 June 2007 ****