(Note : This is a work in progress.)
First a little background: SWIG was started in 1995 as a one-person project and continued in this mode of operation until about 1998. Most of this development was driven by ideas submitted by early SWIG users as opposed to being motivated by a grand design. As a result, the code ended up being a pretty horrible C++ coding disaster. A mostly working disaster perhaps, but a disaster nonetheless.
With that said, the primary goal of future SWIG development is to reengineer the original system, fix most of its inherent design flaws, and to produce what I hope will become a highly extensible and modular interface compiler framework. To this do this, there are a few critical areas of work. First, I want to restructure SWIG as a collection of loosely coupled modules written in either ANSI C or an scripting language. Second, I want the system to be minimalistic in its use of data structures and interconnections. The primary reason for this is that the fewer data structures there are, the less users will have to remember. This will also make the system more accessible to non-experts. Finally, I want to reevaluate the whole idea of a SWIG module is and expand the definition to include just about anything from parsers, preprocessors, optimizers, interface editors, and code generators.
The rest of this document outlines a few general rules of how code
should be developed within the SWIG project. These rules are
primarily drawn from my own experience developing software and
observing the practices of other successful projects.
2. Programming Languages and Libraries
All SWIG modules must be written in either ANSI C or one of the
scripting languages for which SWIG can generate an interface (e.g.,
Perl, Python, or Tcl). C++ is currently being used to write
SWIG modules, but it is only being utilized to avoid working with
a lot of pointers to functions. Advanced C++ features like namespaces, templates,
and overloading should not be used..
Module writers should make every attempt to use only those functions
described in the POSIX.1 standard. This includes most of the
functions contained the Kernighan and Ritchie C programming book. Use
of operating system dependent functionality such as socket libraries
should always be included inside a conditional compilation block so
that it can be omitted on problematic platforms. If you are unsure
about a library call, check the man page or contact Dave.
3. The Source Directory and Module Names
All SWIG modules are contained within the "Source" directory. Within
this directory, each module is placed into its own subdirectory. The
name of this subdirectory should exactly match the name of the module.
For example, if you are creating a module called "Tcl", all of your
files should be placed in a directory "Tcl".
When choosing a module name, please pick a name that is not
currently in use. As a general convention, the first letter of a
module name is capitalized such as "Perl". Alternatives such as
"perl" or "PERL" should be avoided. In certain instances, the first
two letters may be capitalized as in "CParse." The exact usage of
this is somewhat inconsistent and isn't terribly important--just make
sure the first letter is capitalized. Also, module names should not
start with numbers, include underscores or any other special
non-alphanumeric characters.
5. File Structure
Each file in a module should be given a filename that is all lowercase letters
such as "parser.c", not "Parser.c" or "PARSER.c". Please note that filenames
are case-insensitive on Windows so this convention will prevent you from inadvertently
creating two files that differ in case-only.
Each file should include a short abstract, author information, copyright information, and a SVN revision tag like this:
The SVN revision tag should be placed into a static string as shown above mangled with the name of the file. This adds the revision information to the SWIG executable and makes it possible to extract version information from a raw binary (sometimes useful in debugging)./* ----------------------------------------------------------------------------- * See the LICENSE file for information on copyright, usage and redistribution * of SWIG, and the README file for authors - http://www.swig.org/release.html. * * cwrap.c * * This file defines a variety of wrapping rules for C/C++ handling including * the naming of local variables, calling conventions, and so forth. * ----------------------------------------------------------------------------- */ char cvsroot_cwrap_c[] = "$Id: engineering.html 10008 2007-10-17 22:10:32Z wsfulton $"; #include "swig.h" /* Declarations */ typedef struct { int x, y; } Foo; ... /* Private Declarations (used only in this file) */ static int avariable; ... /* Functions */ ...
As a general rule, files start to get unmanageable once they exceed
about 2000 lines. Files larger than this should be broken up into
multiple files. Similarly, you should avoid the temptation to create
many small files as this increases compilation time and makes the
directory structure too complicated.
6. Bottom-Up Design
Within each source file, the preferred organization is to use what is
known as "bottom-up" design. Under this scheme, lower-level functions
appear first and the highest level function appears last. The easy
way to remember is that the "main" function of your module should
always appear last in the source file. For example:
This choice of design is somewhat arbitrary however it has a number of benefits particular to C. In particular, a bottom-up design generally eliminates the need to include forward references--resulting in cleaner code and fewer compilation errors./* Simple bottom-up program */ #include <stdio.h> int foo(int x, int y) { /* Implement foo */ ... } int bar() { ... foo(i,j); ... } ... int main(int argc, char **argv) { ... bar(); ... }
In the function declaration, the return type and any specifiers (extern or static) should appear on a separate line followed by the function name and arguments as shown above. The left curly brace should appear on the same line as the function name./* ------------------------------------------------------------------------- * Swig_add_directory() * * Adds a directory to the SWIG search path. * ------------------------------------------------------------------------- */ void Swig_add_directory(DOH *dirname) { ... }
Function declarations should NOT use the pre-ANSI function
declaration syntax. The ANSI standard has been around long enough for
this to be a non-issue.
8. Naming Conventions
The following conventions are used to name various objects throughout SWIG.
In general, the module name should match the name of the module subdirectory and the function name should be in all lowercase with words separated by underscores.Preprocessor_define() Swig_add_directory()
In this case, both the name of the module and the type should be capitalized. Also, whenever possible, you should use the "typedef struct Name { ... } Name" form when defining new data structures.typedef struct SwigScanner { ... } SwigScanner; typedef struct LParseType { ... } LParseType;
Separate words in a constant should be separated by underscores as with functions.#define SWIG_TOKEN_LPAREN 1
typedef struct SwigScanner { DOH *text; /* Current token value */ DOH *scanobjs; /* Objects being scanned */ DOH *str; /* Current object being scanned */ char *idstart; /* Optional identifier start characters */ int next_token; /* Next token to be returned */ int start_line; /* Starting line of certain declarations */ int yylen; /* Length of text pushed into text */ DOH *file; /* Current file name */ } SwigScanner;
It is better to hide the implementation of Foo and provide an function-call interface like this:typedef struct Foo { int line; } Foo;
Although this results in worse performance, there are many practical reasons for doing this. The most important reason is that it allows you to change the internal representation of Foo without breaking all of the other modules or having to recompile the entire universe after making your changes.typedef struct Foo Foo; extern int Foo_getline(Foo *f); extern void Foo_setline(Foo *f, int line);
There are two types of tags, internal (aka personal) and external. Internal tags are used by SWIG developers primarily, whereas external tags are used when communicating with people w/ anonymous svn access.