tcc-xref
bellard 2002-01-05 19:50:17 +00:00
parent fd20e74b2f
commit ebe9e87ccf
3 changed files with 199 additions and 64 deletions

36
README

@ -15,6 +15,9 @@ Features:
heading toward full ISOC99 compliance. TCC can of course compile heading toward full ISOC99 compliance. TCC can of course compile
itself. itself.
- SAFE! tcc includes an optional memory and bound checker. Bound
checked code can be mixed freely with standard code.
- Compile and execute C source directly. No linking or assembly - Compile and execute C source directly. No linking or assembly
necessary. Full C preprocessor included. necessary. Full C preprocessor included.
@ -27,7 +30,7 @@ Documentation:
1) Installation 1) Installation
***TCC currently only works on Linux x86***. *** TCC currently only works on Linux x86 with glibc >= 2.1 ***.
Type 'make install' to compile and install tcc in /usr/local/bin and Type 'make install' to compile and install tcc in /usr/local/bin and
/usr/local/lib/tcc. /usr/local/lib/tcc.
@ -49,21 +52,7 @@ launch the C code as a shell or perl script :-) The command line
arguments are put in 'argc' and 'argv' of the main functions, as in arguments are put in 'argc' and 'argv' of the main functions, as in
ANSI C. ANSI C.
3) Invokation 3) Examples
'-Idir' : specify an additionnal include path. The
default ones are: /usr/include, /usr/lib/tcc, /usr/local/lib/tcc.
'-Dsym' : define preprocessor symbol 'sym' to 1.
'-lxxx' : dynamically link your program with library
libxxx.so. Standard library paths are checked, including those
specificed with LD_LIBRARY_PATH.
'-i file' : compile C source 'file' before main C source. With this
command, multiple C files can be compiled and linked together.
4) Examples
ex1.c: simplest example (hello world). Can also be launched directly ex1.c: simplest example (hello world). Can also be launched directly
as a script: './ex1.c'. as a script: './ex1.c'.
@ -84,7 +73,7 @@ generator.
prog.c: auto test for TCC which tests many subtle possible bugs. Used prog.c: auto test for TCC which tests many subtle possible bugs. Used
when doing 'make test'. when doing 'make test'.
5) Full Documentation 4) Full Documentation
Please read tcc-doc.html for a description of all the features of TCC. Please read tcc-doc.html for a description of all the features of TCC.
@ -105,7 +94,7 @@ assembly), but it allows to be very fast and surprisingly not so
complicated. complicated.
The TCC code generator is register based. It means that it could even The TCC code generator is register based. It means that it could even
generate good code for RISC processors. On x86, three temporary generate not so bad code for RISC processors. On x86, three temporary
registers are used. When more registers are needed, one register is registers are used. When more registers are needed, one register is
flushed in a new local variable. flushed in a new local variable.
@ -113,13 +102,12 @@ Constant propagation is done for all operations. Multiplications and
divisions are optimized to shifts when appropriate. Comparison divisions are optimized to shifts when appropriate. Comparison
operators are optimized by maintaining a special cache for the operators are optimized by maintaining a special cache for the
processor flags. &&, || and ! are optimized by maintaining a special processor flags. &&, || and ! are optimized by maintaining a special
'jmp target' value. No other jmp optimization is currently performed 'jump target' value. No other jump optimization is currently performed
because it would require storing the code in a more abstract fashion. because it would require storing the code in a more abstract fashion.
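As a rough illustration of the shift optimization mentioned above, the sketch below shows the arithmetic identities involved for unsigned operands and power-of-two constants. The function names are hypothetical, and this is not TCC's actual code generator, only the equivalence such an optimization can rely on.

```c
/* Hypothetical helpers, for illustration only: each pair computes the
   same value for unsigned operands, which is what allows a compiler to
   replace a multiplication or division by a shift when the constant is
   a power of two. */
unsigned mul_by_8(unsigned x) { return x * 8; }
unsigned shl_by_3(unsigned x) { return x << 3; }

unsigned div_by_4(unsigned x) { return x / 4; }
unsigned shr_by_2(unsigned x) { return x >> 2; }
```

Signed division needs more care: C99 division truncates toward zero, while an arithmetic right shift rounds toward negative infinity, so a plain shift is only a drop-in replacement for non-negative values.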
The types and values descriptions are stored in a single 'int' The types are stored in a single 'int' variable (see VT_xxx
variable (see VT_xxx constants). It was chosen in the first stages of constants). It was chosen in the first stages of development when tcc
development when tcc was much simpler. Now, it may not be the best was much simpler. Now, it may not be the best solution.
solution.
License: License:
------- -------
@ -130,4 +118,4 @@ file).
I accept only patches where you give your copyright explicitly to me I accept only patches where you give your copyright explicitly to me
to simplify licensing issues. to simplify licensing issues.
Fabrice Bellard - Nov 17, 2001. Fabrice Bellard.

18
TODO

@ -1,25 +1,27 @@
TODO list: TODO list:
Critical: Critical:
- finish float/double support. add function type convertion. - optimize slightly bound checking when doing addition + dereference.
- section generation and GNUC __attributte__ handling. - better section generator (suppress some mmaps).
- D option with '=' handling - To check: bound checking and float/long long/struct copy code
- 0 is pointer - fix type compare
- To check: 'sizeof' may not work if too complex expression is given. - To check: 'sizeof' may not work if too complex expression is given.
- fix 'char' and 'short' casts (only in function parameters and in - fix bound check code with '&' on local variables (currently done
assignment). only for local arrays).
Not critical: Not critical:
- interactive mode - add PowerPC or ARM code generator and improve codegen for RISC (need
to suppress VT_LOCAL and use a base register instead).
- interactive mode / integrated debugger
- fix multiple compound literals inits in blocks (ISOC99 normative - fix multiple compound literals inits in blocks (ISOC99 normative
example - only relevant when using gotos! -> must add boolean example - only relevant when using gotos! -> must add boolean
variable to tell if compound literal was already initialized). variable to tell if compound literal was already initialized).
- add more bounds checked functions (strcpy, ...)
- fix L"\x1234" wide string case (need to store them as utf8 ?) - fix L"\x1234" wide string case (need to store them as utf8 ?)
- fix preprocessor symbol redefinition - fix preprocessor symbol redefinition
- better constant opt (&&, ||, ?:) - better constant opt (&&, ||, ?:)
- add ELF executable and shared library output option (would be needed - add ELF executable and shared library output option (would be needed
for completeness!). for completeness!).
- add PowerPC code generator. - D option with all #define cases (needs C parser)
- add portable byte code generator and interpreter for other - add portable byte code generator and interpreter for other
unsupported architectures. unsupported architectures.


@ -14,49 +14,51 @@ Tiny C Compiler Reference Documentation
<h2>Introduction</h2> <h2>Introduction</h2>
TinyCC (aka TCC) is a small but very fast C compiler. Unlike other C TinyCC (aka TCC) is a small but hyper fast C compiler. Unlike other C
compilers, it is meant to be self-sufficient: you do not need an compilers, it is meant to be self-sufficient: you do not need an
external assembler or linker because TCC does that for you. external assembler or linker because TCC does that for you.
<P> <P>
TCC compiles so <em>fast</em> that even for big projects <tt>Makefile</tt>s may
TCC compiles so fast that even for big projects <tt>Makefile</tt>s may
not be necessary. not be necessary.
<P> <P>
TCC not only supports ANSI C, but also most of the new ISO C99
standard and many GNUC extensions.
<P>
TCC can also be used to make <I>C scripts</I>, TCC can also be used to make <I>C scripts</I>,
i.e. pieces of C source that you run as a Perl or Python i.e. pieces of C source that you run as a Perl or Python
script. Compilation is so fast that your script will be as fast as if script. Compilation is so fast that your script will be as fast as if
it was an executable. it was an executable.
<P>
TCC can also automatically generate <A HREF="#bounds">memory and bound
checks</A> while allowing all C pointers operations. TCC can do these
checks even if non patched libraries are used.
</P>
<h2>Exact differences with ANSI C</h2> <h2>Full ANSI C support</h2>
TCC implements almost all the ANSI C standard, except floating points TCC implements all the ANSI C standard, including structure bit fields
numbers. and floating point numbers (<tt>long double</tt>, <tt>double</tt>, and
<tt>float</tt> fully supported). The following limitations are known:
<ul> <ul>
<li> The preprocessor tokens are the same as C. It means that in some <li> The preprocessor tokens are the same as C. It means that in some
rare cases, preprocessed numbers are not handled exactly as in ANSI rare cases, preprocessed numbers are not handled exactly as in ANSI
C. This approach has the advantage of being simpler and FAST! C. This approach has the advantage of being simpler and FAST!
<li> Floating point numbers are not fully supported yet (some
implicit casts are missing).
<li> Some typing errors are not signaled.
</ul> </ul>
<h2>ISOC99 extensions</h2> <h2>ISOC99 extensions</h2>
TCC implements many features of the new C standard: ISO C99. Currently TCC implements many features of the new C standard: ISO C99. Currently
missing items are: complex and imaginary numbers (will come with ANSI missing items are: complex and imaginary numbers and variable length
C floating point numbers), <tt>long long</tt>s and variable length
arrays. arrays.
Currently implemented ISOC99 features: Currently implemented ISOC99 features:
<ul> <ul>
<li> <tt>'inline'</tt> keyword is ignored. <li> 64 bit <tt>'long long'</tt> types are fully supported.
<li> <tt>'restrict'</tt> keyword is ignored. <li> The boolean type <tt>'_Bool'</tt> is supported.
<li> <tt>'__func__'</tt> is a string variable containing the current <li> <tt>'__func__'</tt> is a string variable containing the current
function name. function name.
@ -68,7 +70,7 @@ function name.
</PRE> </PRE>
<tt>dprintf</tt> can then be used with a variable number of parameters. <tt>dprintf</tt> can then be used with a variable number of parameters.
<li> Declarations can appear anywhere in a block as in C++. <li> Declarations can appear anywhere in a block (as in C++).
<li> Array and struct/union elements can be initialized in any order by <li> Array and struct/union elements can be initialized in any order by
using designators: using designators:
@ -85,11 +87,6 @@ function name.
to initialize a pointer pointing to an initialized array. The same to initialize a pointer pointing to an initialized array. The same
works for structures and strings. works for structures and strings.
<li> The boolean type <tt>'_Bool'</tt> is supported.
<li> <tt>'long long'</tt> types not supported yet, except in type
definition or <tt>'sizeof'</tt>.
<li> Hexadecimal floating point constants are supported: <li> Hexadecimal floating point constants are supported:
<PRE> <PRE>
double d = 0x1234p10; double d = 0x1234p10;
@ -98,11 +95,15 @@ is the same as writing
<PRE> <PRE>
double d = 4771840.0; double d = 4771840.0;
</PRE> </PRE>
<li> <tt>'inline'</tt> keyword is ignored.
<li> <tt>'restrict'</tt> keyword is ignored.
</ul> </ul>
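To make the hexadecimal floating point notation above concrete: the digits before <tt>p</tt> form a hexadecimal mantissa and the number after <tt>p</tt> is a power-of-two exponent written in decimal. A minimal sketch follows; the function names are illustrative only.

```c
/* 0x1234 is 4660 in decimal; the p10 suffix scales it by 2^10 = 1024. */
double hex_constant(void)
{
    return 0x1234p10;
}

/* The same value spelled out by hand, for comparison. */
double decimal_constant(void)
{
    return 4660.0 * 1024.0;
}
```

Both functions return 4771840.0, matching the equivalence given in the list above.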
<h2>GNU C extensions</h2> <h2>GNU C extensions</h2>
TCC implements some GNU C extensions which are found in many C sources: TCC implements some GNU C extensions:
<ul> <ul>
@ -122,6 +123,45 @@ instead of
<li> <tt>'\e'</tt> is ASCII character 27. <li> <tt>'\e'</tt> is ASCII character 27.
<li> case ranges : ranges can be used in <tt>case</tt>s:
<PRE>
switch(a) {
case 1 ... 9:
printf("range 1 to 9\n");
break;
default:
printf("unexpected\n");
break;
}
</PRE>
<li> The keyword <tt>__attribute__</tt> is handled to specify variable or
function attributes. The following attributes are supported:
<ul>
<li> <tt>aligned(n)</tt>: align data to n bytes (must be a power of two).
<li> <tt>section(name)</tt>: generate function or data in assembly
section name (name is a string containing the section name) instead
of the default section.
<li> <tt>unused</tt>: specify that the variable or the function is unused.
</ul>
<BR>
Here are some examples:
<PRE>
int a __attribute__ ((aligned(8), section(".mysection")));
</PRE>
<BR>
align variable <tt>'a'</tt> to 8 bytes and put it in section <tt>.mysection</tt>.
<PRE>
int my_add(int a, int b) __attribute__ ((section(".mycodesection")))
{
return a + b;
}
</PRE>
<BR>
generate function <tt>'my_add'</tt> in section <tt>.mycodesection</tt>.
</ul> </ul>
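The case range and <tt>aligned</tt> attribute described above can be combined in a small self-checking sketch. It assumes a compiler that accepts these GNU C extensions (TCC or GCC, for instance); the variable and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical variable, aligned to 8 bytes with the GNU C attribute. */
int aligned_counter __attribute__ ((aligned(8)));

/* Returns 1 for values 1..9 and 0 otherwise, using a GNU C case range. */
int in_range_1_to_9(int a)
{
    switch (a) {
    case 1 ... 9:
        return 1;
    default:
        return 0;
    }
}

/* Returns 1 when the address of the variable is a multiple of 8. */
int counter_is_aligned(void)
{
    return ((uintptr_t)&aligned_counter % 8) == 0;
}
```

Note the spaces around <tt>...</tt> in the case range: without them, <tt>1...9</tt> would be lexed as a malformed number.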
<h2>TinyCC extensions</h2> <h2>TinyCC extensions</h2>
@ -138,35 +178,140 @@ indicate that you use TCC.
<li> Binary digits can be entered (<tt>'0b101'</tt> instead of <li> Binary digits can be entered (<tt>'0b101'</tt> instead of
<tt>'5'</tt>). <tt>'5'</tt>).
<li> <tt>__BOUNDS_CHECKING_ON</tt> is defined if bound checking is activated.
</ul> </ul>
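A short sketch of how code might react to <tt>__BOUNDS_CHECKING_ON</tt> and use a binary constant. The function names are hypothetical; only the macro and the <tt>0b</tt> notation come from the list above.

```c
/* Returns 1 when the file was compiled with bound checking active
   (the macro is defined), and 0 otherwise. */
int bound_checking_active(void)
{
#ifdef __BOUNDS_CHECKING_ON
    return 1;
#else
    return 0;
#endif
}

/* 0b101 is the binary spelling of 5. */
int binary_five(void)
{
    return 0b101;
}
```

A test program could use the first function to skip timing-sensitive checks when the slower checked build is in use.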
<h2> Command line invokation </h2> <h2>TinyCC Memory and Bound checks</h2>
<A NAME="bounds"></a>
This feature is activated with the <A HREF="#invoke"><tt>'-b'</tt>
option</A>.
<P>
Note that pointer size is <em>unchanged</em> and that code generated
with bound checks is <em>fully compatible</em> with unchecked
code. When a pointer comes from unchecked code, it is assumed to be
valid. Even very obscure C code with casts should work correctly.
</P>
<P> To have more information about the ideas behind this method, <A
HREF="http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html">check
here</A>.
</P>
<P>
Here are some examples of caught errors:
</P>
<TABLE BORDER=1>
<TR>
<TD>
<PRE>
{
char tab[10];
memset(tab, 0, 11);
}
</PRE>
</TD><TD VALIGN=TOP>Invalid range with standard string function</TD>
<TR>
<TD>
<PRE>
{
int tab[10];
for(i=0;i<11;i++) {
sum += tab[i];
}
}
</PRE>
</TD><TD VALIGN=TOP>Bound error in global or local arrays</TD>
<TR>
<TD>
<PRE>
{
int *tab;
tab = malloc(20 * sizeof(int));
for(i=0;i<21;i++) {
sum += tab[i];
}
free(tab);
}
</PRE>
</TD><TD VALIGN=TOP>Bound error in allocated data</TD>
<TR>
<TD>
<PRE>
{
int *tab;
tab = malloc(20 * sizeof(int));
free(tab);
for(i=0;i<20;i++) {
sum += tab[i];
}
}
</PRE>
</TD><TD VALIGN=TOP>Access to a freed region</TD>
<TR>
<TD>
<PRE>
{
int *tab;
tab = malloc(20 * sizeof(int));
free(tab);
free(tab);
}
</PRE>
</TD><TD VALIGN=TOP>Freeing an already freed region</TD>
</TABLE>
<h2> Command line invocation </h2>
<A NAME="invoke"></a>
<PRE> <PRE>
usage: tcc [-Idir] [-Dsym] [-llib] [-i infile] infile [infile_args...] usage: tcc [-Idir] [-Dsym[=val]] [-Usym] [-llib] [-g] [-b]
[-i infile] infile [infile_args...]
</PRE> </PRE>
<table> <table>
<tr><td>'-Idir'</td> <tr><td>'-Idir'</td>
<td>specify an additional include path. The default ones are: <td>Specify an additional include path. The default ones are:
/usr/include, /usr/lib/tcc, /usr/local/lib/tcc.</td> /usr/include, /usr/lib/tcc, /usr/local/lib/tcc.</td>
<tr><td>'-Dsym'</td> <tr><td>'-Dsym[=val]'</td> <td>Define preprocessor symbol 'sym' to
<td>define preprocessor symbol 'sym' to 1.</td> val. If val is not present, its value is '1'. NOTE: currently, only
integers and strings are supported as values.</td>
<tr><td>'-Usym'</td> <td>Undefine preprocessor symbol 'sym'.</td>
<tr><td>'-lxxx'</td> <tr><td>'-lxxx'</td>
<td>dynamically link your program with library <td>Dynamically link your program with library
libxxx.so. Standard library paths are checked, including those libxxx.so. Standard library paths are checked, including those
specificed with LD_LIBRARY_PATH.</td> specified with LD_LIBRARY_PATH.</td>
<tr><td>'-g'</td>
<td>Generate run time debug information so that you get clear run time
error messages: <tt> test.c:68: in function 'test5()': dereferencing
invalid pointer</tt> instead of the laconic <tt>Segmentation
fault</tt>.
</td>
<tr><td>'-b'</td> <td>Generate additional support code to check
memory allocations and array/pointer bounds. '-g' is implied. Note
that the generated code is slower and bigger in this case.
</td>
<tr><td>'-i file'</td> <tr><td>'-i file'</td>
<td>compile C source 'file' before main C source. With this <td>Compile C source 'file' before main C source. With this
command, multiple C files can be compiled and linked together.</td> command, multiple C files can be compiled and linked together.</td>
</table> </table>
<br>
Note: the <tt>'-o file'</tt> option to generate an ELF executable is
currently unsupported.
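As a sketch of how <tt>'-Dsym[=val]'</tt> interacts with source code, the hypothetical fragment below supplies defaults for symbols that were not given on the command line. The symbol names <tt>MAX_ITEMS</tt> and <tt>APP_NAME</tt> are made up for illustration.

```c
/* Defaults used when the symbols are not defined with -D. */
#ifndef MAX_ITEMS
#define MAX_ITEMS 16
#endif

#ifndef APP_NAME
#define APP_NAME "demo"
#endif

/* Accessors so the effective values can be inspected at run time. */
int max_items(void)
{
    return MAX_ITEMS;
}

const char *app_name(void)
{
    return APP_NAME;
}
```

Following the option table above, compiling with <tt>-DMAX_ITEMS=32</tt> should make <tt>max_items()</tt> return 32, while a bare <tt>-DMAX_ITEMS</tt> would define the symbol as 1.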
<hr> <hr>
Copyright (c) 2001 Fabrice Bellard <hr> Copyright (c) 2001, 2002 Fabrice Bellard <hr>
Fabrice Bellard - <em> fabrice.bellard at free.fr </em> - <A HREF="http://fabrice.bellard.free.fr/"> http://fabrice.bellard.free.fr/ </A> - <A HREF="http://www.tinycc.org/"> http://www.tinycc.org/ </A> Fabrice Bellard - <em> fabrice.bellard at free.fr </em> - <A HREF="http://bellard.org/"> http://bellard.org/ </A> - <A HREF="http://www.tinycc.org/"> http://www.tinycc.org/ </A>
</body> </body>
</html> </html>