This duplication of runtime libraries is usually harmless since there are no namespace conflicts and memory overhead is minimal. However, there is serious problem related to the fact that modules do not share type-information. This is particularly a problem when working with C++ (as described next).
When compiled into two separate modules, the code does not work properly. In fact, you get a type error such as the following :// File : a.i %module a // Here is a base class class a { public: a(); ~a(); void foo(double); }; // File : b.i %module b // Here is a derived class %extern a.i // Gets definition of base class class b : public a { public: bar(); };
[beazley@guinness shadow]$ python Python 1.4 (Jan 16 1997) [GCC 2.7.2] Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam >>> from a import * >>> from b import * >>> # Create a new "b" >>> b = new_b() >>> # Call a function in the base class ... >>> a_foo(b,3) Traceback (innermost last): File "<stdin>", line 1, in ? TypeError: Type error in argument 1 of a_foo. Expected _a_p. >>>
However, from our class definitions we know that "b" is an "a" by inheritance and there should be no type-error. This problem is directly due to the lack of type-sharing between modules. If we look closely at the module modules created here, they look like this :
The type information listed shows the acceptable values for various C datatypes. In the "a" module, we see that "a" can only accept instances of itself. In the "b" module, we see that "a" can accept both "a" and "b" instances--which is correct given that a "b" is an "a" by inheritance.
Unfortunately, this problem is inherent in the method by which SWIG makes modules. When we made the "a" module, we had no idea what derived classes might be used at a later time. However, it's impossible to produce the proper type information until after we know all of the derived classes. A nice problem to be sure, but one that can be fixed by making all modules share a single copy of the SWIG run-time library.
1. Build the SWIG run-time libraries. The SWIG1.1/Runtime directory contains a makefile for doing this. If successfully built, you will end up with 6 files that are usually installed in /usr/local/lib.
libswigtcl.a # Tcl library (static) libswigtcl.so # Tcl library (shared) libswigpl.a # Perl library (static) libswigpl.so # Perl library (shared) libswigpy.a # Python library (static) libswigpy.so # Python library (shared)
Note that certain libraries may be missing due to missing packages or unsupported features (like dynamic loading) on your machine.
2. Compile all SWIG modules using the -c option. For example :
The -c option tells SWIG to omit runtime support. It's now up to you to provide it separately--which we will do using our libraries.% swig -c -python a.i % swig -c -python b.i
3. Build SWIG modules by linking against the appropriate runtime libraries.
or if building a new executable (static linking)% swig -c -python a.i % swig -c -python b.i % gcc -c a_wrap.c b_wrap.c -I/usr/local/include % ld -shared a_wrap.o b_wrap.o -lswigpy -o a.so
% swig -c -tcl -ltclsh.i a.i % gcc a_wrap.c -I/usr/local/include -L/usr/local/lib -ltcl -lswigtcl -lm -o mytclsh
When completed you should now end up with a collection of modules like this :
In this configuration, the runtime library manages all datatypes and other information between modules. This management process is dynamic in nature--when new modules are loaded, they contribute information to the run-time system. In the C++ world, one could incrementally load classes as needed. As this process occurs, type information is updated and base-classes learn about derived classes as needed.
1. Rebuild the scripting language executable with the SWIG runtime library attached to it. This is actually, fairly easy to do using SWIG. For example :
%module mytclsh %{ static void *__embedfunc(void *a) { return a}; %} void *__embedfunc(void *); %include tclsh.i
Now, run SWIG and compile as follows :
This produces a new executable "tclsh" that contains a copy of the SWIG runtime library. The weird __embedfunc() function is needed to force the functions in the runtime library to be included in the final executable.% swig -c -tcl mytclsh.i % gcc mytclsh_wrap.c -I/usr/local/include -L/usr/local/lib -ltcl -lswigtcl -ldl -lm \ -o tclsh
To make new dynamically loadable SWIG modules, simply compile as follows :
Linking against the swigtcl library is no longer necessary as all of the functions are now included in the tclsh executable and will be resolved when your module is loaded.% swig -c -tcl example.i % gcc -c example_wrap.c -I/usr/local/include % ld -shared example_wrap.o -o example.so
2. Using shared library versions of the runtime library
If supported on your machine, the runtime libraries will be built as shared libraries (indicated by a .so, .sl, or .dll suffix). To compile using the runtime libraries, you link process should look something like this :
In order for the libswigtcl.so library to work, it needs to be placed in a location where the dynamic loader can find it. Typically this is a system library directory (ie. /usr/local/lib or /usr/lib).% ld -shared swigtcl_wrap.o -o libswigtcl.so # Irix % gcc -shared swigtcl_wrap.o -o libswigtcl.so # Linux % ld -G swigtcl_wrap.o -o libswigtcl.so # Solaris
When running with the shared libary version, you may get error messages such as the following
This indicates that the loader was unable to find the shared libary at run-time. To find shared libaries, the loader looks through a collection of predetermined paths. If the libswigtcl.so file is not in any of these directories, it results in an error. On most machines, you can change the loader search path by changing the Unix environment variable LD_LIBRARY_PATH. For example :Unable to locate libswigtcl.so
A somewhat better approach is to link your module with the proper path encoded. This is typically done using the `-rpath' or `-R' option to your linker (see the man page). For example :% setenv LD_LIBRARY_PATH .:/home/beazley/packages/lib
The -rpath option encodes the location of shared libraries into your modules and gets around having to set the LD_LIBRARY_PATH variable.% ld -shared example_wrap.o example.o -rpath /home/beazley/packages/lib \ -L/home/beazley/packages/lib -lswigtcl.so -o example.so
If all else fails, pull up the man pages for your linker and start playing around.
While the process of building C++ modules is, by no means, and exact science, here are a few rules of thumb to follow :
The SWIG distribution contains some additional documentation about C++ modules in the Doc directory as well.
However, the matching process is complicated by the fact that datatypes may use a variety of different names. For example, the following declarations
define two sets of equivalent types :typedef double Real; typedef Real * RealPtr; typedef double Float;
All of the types in each set are freely interchangable and the type-checker knows about the relationships by managing a table of equivalences such as the following :{double, Real, Float} {RealPtr, Real *}
double => { Real, Float } Real => { double, Float } Float => { double, Real } RealPtr => { Real * } Real * => { RealPtr }
When you declare a function such as the following :
SWIG first checks to see if the argument passed is a "Real *". If not, it checks to see if it is any of the other equivalent types (double *, RealPtr, Float *). If so, the value is accepted and no error occurs.void foo(Real *a);
Derived versions of the various datatypes are also legal. For example, if you had a function like this,
The type-checker will accept pointers of type double *** and Real ***. However, the type-checker does not always capture the full-range of possibilities. For example, a datatype of `RealPtr **' is equivalent to a `Float ***' but would be flagged as a type error. If you encounter this kind of problem, you can manually force SWIG to make an equivalence as follows:void bar(Float ***a);
Doing this should hardly ever be necessary (I have never encountered a case where this was necessary), but if all else fails, you can force the run-time type checker into doing what you want.// Tell the type checker that `Float_ppp' and `RealPtr_pp' are equivalent. %init %{ SWIG_RegisterMapping("Float_ppp","RealPtr_pp",0); %}
Type-equivalence of C++ classes is handled in a similar manner, but is encoded in a manner to support inheritance. For example, consider the following classes hierarchy :
The type-checker encodes this into the following sets :class A { }; class B : public A { }; class C : public B { }; class D {}; class E : public C, public D {};
The encoding reflects the class hierarchy. For example, any object of type "A" will also accept objects of type B,C, and E because these are all derived from A. However, it is not legal to go the other way. For example, a function operating on a object from class E will not accept an object from class A.A => { B, C, E } "B isa A, C isa A, E isa A" B => { C, E } "C isa B, E isa B" C => { E } "E isa C" D => { E } "E isa D" E => { }
All pointers are internally represented as void *, but conversion functions are always invoked when pointer values are converted between base and derived classes in a C++ class hierarchy.void *EtoA(void *ptr) { E *in = (E *) ptr; A *out = (A *) in; // Cast using C++ return (void *) out; }
An alternative to a name based scheme would be to generate type-signatures based on the structure of a datatype. Such a scheme would result in perfect type-checking, but I think it would also result in a very confusing scripting language module. For this reason, I see SWIG sticking with the name-based approach--at least for the foreseeable future.
Most well-structured C codes will find an exact match on the first attempt, providing the best possible performance. For C++ codes, it is quite common to be passing various objects of a common base-class around between functions. When base-class functions are invoked, it almost always results in a miscompare (because the type-checker is looking for the base-type). In this case, we drop down to a small cache of recently used datatypes. If we've used a pointer of the same type recently, it will be in the cache and we can match against it. For tight loops, this results in about 10-15% overhead over finding a match on the first try. Finally, as a last resort, we need to search the internal pointer tables for a match. This involves a combination of hash table lookup and linear search. If a match is found, it is placed into the cache and the result returned. If not, we finally report a type-mismatch.
As a rule of thumb, C++ programs require somewhat more processing than C programs, but this seems to be avoidable. Also, keep in mind that performance penalties in the type-checker don't necessarily translate into big penalties in the overall application. Performance is most greatly affected by the efficiency of the target scripting language and the types of operations your C code is performing.