spell fixes + texinfo format fixes by Peter Lund

tcc-xref
bellard 2003-04-10 00:03:26 +00:00
parent d575137648
commit 3605c50b36
1 changed files with 213 additions and 132 deletions


@ -1,16 +1,48 @@
\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename tcc-doc.info
@settitle Tiny C Compiler Reference Documentation
@c %**end of header
@include version.texi
@ifinfo
Bla bla bla
@end ifinfo
@iftex
@titlepage
@afourpaper
@sp 7
@center @titlefont{Tiny C Compiler Reference Documentation}
@sp 3
@end titlepage
@headings double
@end iftex
@ifnothtml
@contents
@end ifnothtml
@ifnottex
@node Top, Introduction, (dir), (dir)
@top Tiny C Compiler Reference Documentation
This manual documents version @value{VERSION} of the Tiny C Compiler.
@menu
* Introduction:: Introduction to tcc.
* Invoke:: Invocation of tcc (command line, options).
* Bounds:: Automatic bounds-checking of C code.
* Libtcc:: bla bla bla.
@end menu
@end ifnottex
@node Introduction
@chapter Introduction
TinyCC (aka TCC) is a small but hyper fast C compiler. Unlike other C
compilers, it is meant to be self-suffisant: you do not need an
compilers, it is meant to be self-reliant: you do not need an
external assembler or linker because TCC does that for you.
TCC compiles so @emph{fast} that even for big projects @code{Makefile}s may
@ -24,13 +56,13 @@ that you run as a Perl or Python script. Compilation is so fast that
your script will be as fast as if it was an executable.
TCC can also automatically generate memory and bound checks
(@xref{bounds}) while allowing all C pointers operations. TCC can do
(@pxref{Bounds}) while allowing all C pointer operations. TCC can do
these checks even if unpatched libraries are used.
With @code{libtcc}, you can use TCC as a backend for dynamic code
generation (@xref{libtcc}).
generation (@pxref{Libtcc}).
@node invoke
@node Invoke
@chapter Command line invocation
@section Quick start
@ -41,45 +73,46 @@ usage: tcc [-c] [-o outfile] [-Bdir] [-bench] [-Idir] [-Dsym[=val]] [-Usym]
[--] infile1 [infile2... --] [infile_args...]
@end example
TCC options are a very much like gcc. The main difference is that TCC
@noindent
TCC options are very much like gcc options. The main difference is that TCC
can also directly execute the resulting program and give it runtime
arguments.
Here are some examples to understand the logic:
@table @code
@item tcc a.c
Compile a.c and execute it directly
@item @samp{tcc a.c}
Compile @file{a.c} and execute it directly
@item tcc a.c arg1
@item @samp{tcc a.c arg1}
Compile a.c and execute it directly. arg1 is given as first argument to
the @code{main()} of a.c.
@item tcc -- a.c b.c -- arg1
Compile a.c and b.c, link them together and execute them. arg1 is given
@item @samp{tcc -- a.c b.c -- arg1}
Compile @file{a.c} and @file{b.c}, link them together and execute them. arg1 is given
as first argument to the @code{main()} of the resulting program. Because
multiple C files are specified, @code{--} are necessary to clearly separate the
multiple C files are specified, @option{--} are necessary to clearly separate the
program arguments from the TCC options.
@item tcc -o myprog a.c b.c
Compile a.c and b.c, link them and generate the executable myprog.
@item @samp{tcc -o myprog a.c b.c}
Compile @file{a.c} and @file{b.c}, link them and generate the executable @file{myprog}.
@item tcc -o myprog a.o b.o
link a.o and b.o together and generate the executable myprog.
@item @samp{tcc -o myprog a.o b.o}
link @file{a.o} and @file{b.o} together and generate the executable @file{myprog}.
@item tcc -c a.c
Compile a.c and generate object file a.o
@item @samp{tcc -c a.c}
Compile @file{a.c} and generate object file @file{a.o}.
@item tcc -c asmfile.S
Preprocess with C preprocess and assemble asmfile.S and generate
object file asmfile.o.
@item @samp{tcc -c asmfile.S}
Preprocess with the C preprocessor, assemble @file{asmfile.S} and generate
object file @file{asmfile.o}.
@item tcc -c asmfile.s
Assemble (but not preprocess) asmfile.s and generate object file
asmfile.o.
@item @samp{tcc -c asmfile.s}
Assemble (but not preprocess) @file{asmfile.s} and generate object file
@file{asmfile.o}.
@item tcc -r -o ab.o a.c b.c
Compile a.c and b.c, link them together and generate the object file ab.o.
@item @samp{tcc -r -o ab.o a.c b.c}
Compile @file{a.c} and @file{b.c}, link them together and generate the object file @file{ab.o}.
@end table
@ -93,19 +126,19 @@ need to add @code{#!/usr/local/bin/tcc} at the start of your C source:
#include <stdio.h>
int main()
{
@{
printf("Hello World\n");
return 0;
}
@}
@end example
@section Option summary
General Options:
@table @samp
@table @option
@item -c
Generate an object file (@samp{-o} option must also be given).
Generate an object file (@option{-o} option must also be given).
@item -o outfile
Put object file, executable, or dll into output file @file{outfile}.
@ -120,54 +153,54 @@ Output compilation statistics.
Preprocessor options:
@table @samp
@table @option
@item -Idir
Specify an additionnal include path. Include paths are searched in the
Specify an additional include path. Include paths are searched in the
order they are specified.
System include paths are always searched after. The default system
include paths are: @file{/usr/local/include}, @file{/usr/include}
and @file{PREFIX/lib/tcc/include}. (@code{PREFIX} is usually
and @file{PREFIX/lib/tcc/include}. (@file{PREFIX} is usually
@file{/usr} or @file{/usr/local}).
@item -Dsym[=val]
Define preprocessor symbol 'sym' to
val. If val is not present, its value is '1'. Function-like macros can
also be defined: @code{'-DF(a)=a+1'}
Define preprocessor symbol @samp{sym} to
val. If val is not present, its value is @samp{1}. Function-like macros can
also be defined: @option{-DF(a)=a+1}
@item -Usym
Undefine preprocessor symbol 'sym'.
Undefine preprocessor symbol @samp{sym}.
@end table
Linker options:
@table @samp
@table @option
@item -Ldir
Specify an additionnal static library path for the @samp{-l} option. The
Specify an additional static library path for the @option{-l} option. The
default library paths are @file{/usr/local/lib}, @file{/usr/lib} and @file{/lib}.
@item -lxxx
Link your program with dynamic library libxxx.so or static library
libxxx.a. The library is searched in the paths specified by the
@samp{-L} option.
@option{-L} option.
@item -shared
Generate a shared library instead of an executable (@samp{-o} option
Generate a shared library instead of an executable (@option{-o} option
must also be given).
@item -static
Generate a statically linked executable (default is a dynamically linked
executable) (@samp{-o} option must also be given).
executable) (@option{-o} option must also be given).
@item -r
Generate an object file combining all input files (@samp{-o} option must
Generate an object file combining all input files (@option{-o} option must
also be given).
@end table
Debugger options:
@table @samp
@table @option
@item -g
Generate run time debug information so that you get clear run time
error messages: @code{ test.c:68: in function 'test5()': dereferencing
@ -175,17 +208,17 @@ invalid pointer} instead of the laconic @code{Segmentation
fault}.
@item -b
Generate additionnal support code to check
memory allocations and array/pointer bounds. @samp{-g} is implied. Note
Generate additional support code to check
memory allocations and array/pointer bounds. @option{-g} is implied. Note
that the generated code is slower and bigger in this case.
@item -bt N
Display N callers in stack traces. This is useful with @samp{-g} or
@samp{-b}.
Display N callers in stack traces. This is useful with @option{-g} or
@option{-b}.
@end table
Note: GCC options @samp{-Ox}, @samp{-Wx}, @samp{-fx} and @samp{-mx} are
Note: GCC options @option{-Ox}, @option{-Wx}, @option{-fx} and @option{-mx} are
ignored.
@chapter C language support
@ -206,11 +239,11 @@ Currently implemented ISOC99 features:
@itemize
@item 64 bit @code{'long long'} types are fully supported.
@item 64 bit @code{long long} types are fully supported.
@item The boolean type @code{'_Bool'} is supported.
@item The boolean type @code{_Bool} is supported.
@item @code{'__func__'} is a string variable containing the current
@item @code{__func__} is a string variable containing the current
function name.
@item Variadic macros: @code{__VA_ARGS__} can be used for
@ -218,6 +251,8 @@ function name.
@example
#define dprintf(level, ...) printf(__VA_ARGS__)
@end example
@noindent
@code{dprintf} can then be used with a variable number of parameters.
@item Declarations can appear anywhere in a block (as in C++).
@ -225,14 +260,14 @@ function name.
@item Array and struct/union elements can be initialized in any order by
using designators:
@example
struct { int x, y; } st[10] = { [0].x = 1, [0].y = 2 };
struct @{ int x, y; @} st[10] = @{ [0].x = 1, [0].y = 2 @};
int tab[10] = { 1, 2, [5] = 5, [9] = 9};
int tab[10] = @{ 1, 2, [5] = 5, [9] = 9@};
@end example
@item Compound initializers are supported:
@example
int *p = (int []){ 1, 2, 3 };
int *p = (int [])@{ 1, 2, 3 @};
@end example
to initialize a pointer pointing to an initialized array. The same
works for structures and strings.
@ -241,14 +276,16 @@ works for structures and strings.
@example
double d = 0x1234p10;
@end example
@noindent
is the same as writing
@example
double d = 4771840.0;
@end example
@item @code{'inline'} keyword is ignored.
@item @code{inline} keyword is ignored.
@item @code{'restrict'} keyword is ignored.
@item @code{restrict} keyword is ignored.
@end itemize
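As a rough illustration, several of the ISOC99 features listed above can be collected into one small file. This is only a sketch: the names @code{demo_tab}, @code{demo_st}, @code{sum3}, @code{compound_literal_sum}, @code{hex_float}, @code{current_function} and @code{format_into} are made up for the example and do not come from TCC.

```c
#include <stdio.h>

struct point { int x, y; };

/* Designated initializers: elements that are not named are zero-initialized. */
int demo_tab[10] = { 1, 2, [5] = 5, [9] = 9 };
struct point demo_st[10] = { [0].x = 1, [0].y = 2 };

/* Helper (hypothetical): sum the first three elements of an array. */
int sum3(const int *p)
{
    return p[0] + p[1] + p[2];
}

/* Compound literal: an unnamed initialized array passed directly. */
int compound_literal_sum(void)
{
    return sum3((int []){ 1, 2, 3 });
}

/* Hexadecimal floating point constant: 0x1234 * 2^10. */
double hex_float(void)
{
    return 0x1234p10;
}

/* __func__ is the name of the enclosing function. */
const char *current_function(void)
{
    return __func__;
}

/* C99 variadic macro: "..." in the parameter list, __VA_ARGS__ in the body.
   buf must be an array so that sizeof gives its capacity. */
#define format_into(buf, ...) snprintf((buf), sizeof(buf), __VA_ARGS__)
```

Calling @code{compound_literal_sum()} adds up the three elements of the unnamed array, and @code{format_into(buf, "%d-%d", 1, 2)} writes the formatted text into a local @code{char} array @code{buf}.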
@section GNU C extensions
@ -259,30 +296,30 @@ TCC implements some GNU C extensions:
@item array designators can be used without '=':
@example
int a[10] = { [0] 1, [5] 2, 3, 4 };
int a[10] = @{ [0] 1, [5] 2, 3, 4 @};
@end example
@item Structure field designators can be a label:
@example
struct { int x, y; } st = { x: 1, y: 1};
struct @{ int x, y; @} st = @{ x: 1, y: 1@};
@end example
instead of
@example
struct { int x, y; } st = { .x = 1, .y = 1};
struct @{ int x, y; @} st = @{ .x = 1, .y = 1@};
@end example
@item @code{'\e'} is ASCII character 27.
@item @code{\e} is ASCII character 27.
@item case ranges : ranges can be used in @code{case}s:
@example
switch(a) {
case 1 ... 9:
switch(a) @{
case 1 @dots{} 9:
printf("range 1 to 9\n");
break;
default:
printf("unexpected\n");
break;
}
@}
@end example
@item The keyword @code{__attribute__} is handled to specify variable or
@ -307,20 +344,22 @@ Here are some examples:
int a __attribute__ ((aligned(8), section(".mysection")));
@end example
align variable @code{'a'} to 8 bytes and put it in section @code{.mysection}.
@noindent
align variable @code{a} to 8 bytes and put it in section @code{.mysection}.
@example
int my_add(int a, int b) __attribute__ ((section(".mycodesection")))
{
@{
return a + b;
}
@}
@end example
generate function @code{'my_add'} in section @code{.mycodesection}.
@noindent
generate function @code{my_add} in section @code{.mycodesection}.
@item GNU style variadic macros:
@example
#define dprintf(fmt, args...) printf(fmt, ## args)
#define dprintf(fmt, args@dots{}) printf(fmt, ## args)
dprintf("no arg\n");
dprintf("one arg %d\n", 1);
@ -341,26 +380,31 @@ to get the alignment of a type or an expression.
used to jump on the pointer resulting from @code{expr}.
@item Inline assembly with asm instruction:
@cindex inline assembly
@cindex assembly, inline
@cindex __asm__
@example
static inline void * my_memcpy(void * to, const void * from, size_t n)
{
@{
int d0, d1, d2;
__asm__ __volatile__(
"rep ; movsl\n\t"
"testb $2,%b4\n\t"
"je 1f\n\t"
"movsw\n"
"1:\ttestb $1,%b4\n\t"
"je 2f\n\t"
"movsb\n"
"2:"
: "=&c" (d0), "=&D" (d1), "=&S" (d2)
:"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
: "memory");
"rep ; movsl\n\t"
"testb $2,%b4\n\t"
"je 1f\n\t"
"movsw\n"
"1:\ttestb $1,%b4\n\t"
"je 2f\n\t"
"movsb\n"
"2:"
: "=&c" (d0), "=&D" (d1), "=&S" (d2)
:"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
: "memory");
return (to);
}
@}
@end example
@noindent
@cindex gas
TCC includes its own x86 inline assembler with a @code{gas}-like (GNU
assembler) syntax. No intermediate files are generated. GCC 3.x named
operands are supported.
@ -371,13 +415,13 @@ operands are supported.
@itemize
@item @code{__TINYC__} is a predefined macro to @code{'1'} to
@item @code{__TINYC__} is a predefined macro to @code{1} to
indicate that you use TCC.
@item @code{'#!'} at the start of a line is ignored to allow scripting.
@item @code{#!} at the start of a line is ignored to allow scripting.
@item Binary digits can be entered (@code{'0b101'} instead of
@code{'5'}).
@item Binary digits can be entered (@code{0b101} instead of
@code{5}).
@item @code{__BOUNDS_CHECKING_ON} is defined if bound checking is activated.
@ -452,6 +496,16 @@ They can be defined several times in the same source. Use 'b'
@end itemize
@section Directives
@cindex assembler directives
@cindex directives, assembler
@cindex .align
@cindex .skip
@cindex .space
@cindex .byte
@cindex .word
@cindex .short
@cindex .int
@cindex .long
All directives are preceded by a '.'. The following directives are
supported:
@ -468,6 +522,7 @@ supported:
@end itemize
@section X86 Assembler
@cindex assembler
All X86 opcodes are supported. Only ATT syntax is supported (source
then destination operand order). If no size suffix is given, TinyCC
@ -476,20 +531,22 @@ tries to guess it from the operand sizes.
Currently, MMX opcodes are supported but not SSE ones.
@chapter TinyCC Linker
@cindex linker
@section ELF file generation
@cindex ELF
TCC can directly output relocatable ELF files (object files),
executable ELF files and dynamic ELF libraries without relying on an
external linker.
Dynamic ELF libraries can be output but the C compiler does not generate
position independant code (PIC) code. It means that the dynamic librairy
position independent code (PIC). It means that the dynamic library
code generated by TCC cannot be shared among processes yet.
TCC linker cannot currently suppress unused object code. But TCC
TCC linker cannot currently eliminate unused object code. But TCC
will soon integrate a novel feature not found in GNU tools: unused code
will be suppressed at the function or variable level, provided you only
will be eliminated at the function or variable level, provided you only
use TCC to compile your files.
@section ELF file loader
@ -498,10 +555,14 @@ TCC can load ELF object files, archives (.a files) and dynamic
libraries (.so).
@section GNU Linker Scripts
@cindex scripts, linker
@cindex linker scripts
@cindex GROUP, linker command
@cindex FILE, linker command
Because on many Linux systems some dynamic libraries (such as
@file{/usr/lib/libc.so}) are in fact GNU ld link scripts (horrible!),
TCC linker also support a subset of GNU ld scripts.
the TCC linker also supports a subset of GNU ld scripts.
The @code{GROUP} and @code{FILE} commands are supported.
@ -513,78 +574,80 @@ Example from @file{/usr/lib/libc.so}:
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a )
@end example
@node bounds
@node Bounds
@chapter TinyCC Memory and Bound checks
@cindex bound checks
@cindex memory checks
This feature is activated with the @code{'-b'} (@xref{invoke}).
This feature is activated with the @option{-b} option (@pxref{Invoke}).
Note that pointer size is @emph{unchanged} and that code generated
with bound checks is @emph{fully compatible} with unchecked
code. When a pointer comes from unchecked code, it is assumed to be
valid. Even very obscure C code with casts should work correctly.
To have more information about the ideas behind this method, check at
For more information about the ideas behind this method, see
@url{http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html}.
Here are some examples of catched errors:
Here are some examples of caught errors:
@table @asis
@item Invalid range with standard string function:
@example
{
@{
char tab[10];
memset(tab, 0, 11);
}
@}
@end example
@item Bound error in global or local arrays:
@item Out-of-bounds error in global or local arrays:
@example
{
@{
int tab[10];
for(i=0;i<11;i++) {
for(i=0;i<11;i++) @{
sum += tab[i];
}
}
@}
@}
@end example
@item Bound error in allocated data:
@item Out-of-bounds error in malloc'ed data:
@example
{
@{
int *tab;
tab = malloc(20 * sizeof(int));
for(i=0;i<21;i++) {
for(i=0;i<21;i++) @{
sum += tab[i];
}
@}
free(tab);
}
@}
@end example
@item Access to a freed region:
@item Access of freed memory:
@example
{
@{
int *tab;
tab = malloc(20 * sizeof(int));
free(tab);
for(i=0;i<20;i++) {
for(i=0;i<20;i++) @{
sum += tab[i];
}
}
@}
@}
@end example
@item Freeing an already freed region:
@item Double free:
@example
{
@{
int *tab;
tab = malloc(20 * sizeof(int));
free(tab);
free(tab);
}
@}
@end example
@end table
@node libtcc
@node Libtcc
@chapter The @code{libtcc} library
The @code{libtcc} library enables you to use TCC as a backend for
@ -597,7 +660,7 @@ The idea consists in giving a C string containing the program you want
to compile directly to @code{libtcc}. Then you can access any global
symbol (function or variable) defined.
@chapter Developper's guide
@chapter Developer's guide
This chapter gives some hints to understand how TCC works. You can skip
it if you do not intend to modify the TCC code.
@ -617,7 +680,7 @@ expansion.
@code{tok} contains the current token (see the @code{TOK_xxx}
constants). Identifiers and keywords are also keywords. @code{tokc}
contains additionnal infos about the token (for example a constant value
contains additional information about the token (for example a constant value
if number or string token).
@section Parser
@ -771,22 +834,24 @@ contain the exported symbols (currently only used for debugging).
@end table
@section Code generation
@cindex code generation
@subsection Introduction
The TCC code generator directly generates linked binary code in one
pass. It is rather unusual these days (see gcc for example which
generates text assembly), but it allows to be very fast and surprisingly
not so complicated.
generates text assembly), but it makes TCC very fast and keeps the code
generator surprisingly simple.
The TCC code generator is register based. Optimization is only done at
the expression level. No intermediate representation of expression is
kept except the current values stored in the @emph{value stack}.
On x86, three temporary registers are used. When more registers are
needed, one register is flushed in a new local variable.
needed, one register is spilled into a new temporary variable on the stack.
@subsection The value stack
@cindex value stack, introduction
When an expression is parsed, its value is pushed on the value stack
(@var{vstack}). The top of the value stack is @var{vtop}. Each value
@ -794,7 +859,7 @@ stack entry is the structure @code{SValue}.
@code{SValue.t} is the type. @code{SValue.r} indicates how the value is
currently stored in the generated code. It is usually a CPU register
index (@code{REG_xxx} constants), but additionnal values and flags are
index (@code{REG_xxx} constants), but additional values and flags are
defined:
@example
@ -835,8 +900,8 @@ put in a normal register.
@item VT_JMP
@itemx VT_JMPI
indicates that the value is the consequence of a jmp. For VT_JMP, it is
1 if the jump is taken, 0 otherwise. For VT_JMPI it is inverted.
indicates that the value is the consequence of a conditional jump. For VT_JMP,
it is 1 if the jump is taken, 0 otherwise. For VT_JMPI it is inverted.
These values are used to compile the @code{||} and @code{&&} logical
operators.
@ -857,10 +922,10 @@ understand how TCC works.
@itemx VT_LVAL_SHORT
@itemx VT_LVAL_UNSIGNED
if the lvalue has an integer type, then these flags give its real
type. The type alone is not suffisant in case of cast optimisations.
type. The type alone is not enough in case of cast optimisations.
@item VT_LLOCAL
is a saved lvalue on the stack. @code{VT_LLOCAL} should be suppressed
is a saved lvalue on the stack. @code{VT_LLOCAL} should be eliminated
ASAP because its semantics are rather complicated.
@item VT_MUSTCAST
@ -877,17 +942,18 @@ are only used for optional bound checking.
@end table
@subsection Manipulating the value stack
@cindex value stack
@code{vsetc()} and @code{vset()} push a new value on the value
stack. If the previous @code{vtop} was stored in a very unsafe place(for
stack. If the previous @var{vtop} was stored in a very unsafe place (for
example in the CPU flags), then some code is generated to put the
previous @code{vtop} in a safe storage.
previous @var{vtop} in safe storage.
@code{vpop()} pops @code{vtop}. In some cases, it also generates cleanup
@code{vpop()} pops @var{vtop}. In some cases, it also generates cleanup
code (for example if stacked floating point registers are used as on
x86).
The @code{gv(rc)} function generates code to evaluate @code{vtop} (the
The @code{gv(rc)} function generates code to evaluate @var{vtop} (the
top value of the stack) into registers. @var{rc} selects in which
register class the value should be put. @code{gv()} is the @emph{most
important function} of the code generator.
@ -896,7 +962,7 @@ important function} of the code generator.
entries.
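The push, pop and load-to-register operations described in this subsection can be pictured with a toy model. The sketch below is not TCC's implementation: every name in it (@code{toy_value}, @code{toy_vset}, @code{toy_vtop}, @code{toy_vpop}, @code{toy_gv}) is hypothetical, it knows only two storage kinds, and it emits no machine code. It only shows how a stack entry records where its value currently lives.

```c
#include <stddef.h>

#define TOY_VSTACK_SIZE 16

/* Where a value currently lives (hypothetical, far simpler than SValue.r). */
enum toy_loc { TOY_CONST, TOY_REG };

struct toy_value {
    enum toy_loc loc; /* storage kind */
    int c;            /* the constant, or a register index */
};

static struct toy_value toy_vstack[TOY_VSTACK_SIZE];
static int toy_depth; /* number of entries currently on the stack */

/* Push a new value; returns 0 on success, -1 if the stack is full. */
int toy_vset(enum toy_loc loc, int c)
{
    if (toy_depth >= TOY_VSTACK_SIZE)
        return -1;
    toy_vstack[toy_depth].loc = loc;
    toy_vstack[toy_depth].c = c;
    toy_depth++;
    return 0;
}

/* Return the top entry, or NULL if the stack is empty. */
struct toy_value *toy_vtop(void)
{
    return toy_depth > 0 ? &toy_vstack[toy_depth - 1] : NULL;
}

/* Pop the top entry; returns 0 on success, -1 if the stack is empty. */
int toy_vpop(void)
{
    if (toy_depth == 0)
        return -1;
    toy_depth--;
    return 0;
}

/* Pretend to load the top value into register reg. A real code generator
   would emit a load instruction here; the toy only updates the entry.
   Returns the register that holds the value, or -1 on an empty stack. */
int toy_gv(int reg)
{
    struct toy_value *v = toy_vtop();
    if (v == NULL)
        return -1;
    if (v->loc != TOY_REG) {
        v->loc = TOY_REG;
        v->c = reg;
    }
    return v->c;
}
```

After @code{toy_vset(TOY_CONST, 42)}, a call to @code{toy_gv(1)} marks the top entry as living in register 1, which mirrors in miniature the role the text gives to @code{gv()}.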
@subsection CPU dependent code generation
@cindex CPU dependent
See the @file{i386-gen.c} file for an example.
@table @code
@ -939,12 +1005,18 @@ floating point to floating point of different size conversion.
@item gen_bounded_ptr_add()
@item gen_bounded_ptr_deref()
are only used for bound checking.
are only used for bounds checking.
@end table
@section Optimizations done
@cindex optimizations
@cindex constant propagation
@cindex strength reduction
@cindex comparison operators
@cindex caching processor flags
@cindex flags, caching
@cindex jump optimization
Constant propagation is done for all operations. Multiplications and
divisions are optimized to shifts when appropriate. Comparison
operators are optimized by maintaining a special cache for the
@ -952,3 +1024,12 @@ processor flags. &&, || and ! are optimized by maintaining a special
'jump target' value. No other jump optimization is currently performed
because it would require storing the code in a more abstract fashion.
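The rewriting of multiplications and divisions into shifts can be sketched as follows. This is an illustration of the general technique, not code taken from TCC; the function names are hypothetical, and the sketch is restricted to unsigned operands because a signed division by a power of two is not a plain shift.

```c
/* Return k if n == 2^k, or -1 if n is zero or not a power of two. */
int shift_for_power_of_two(unsigned n)
{
    int k = 0;
    if (n == 0 || (n & (n - 1)) != 0)
        return -1;
    while ((n >> k) != 1)
        k++;
    return k;
}

/* Multiply x by the constant c, using a shift when c is a power of two. */
unsigned mul_by_constant(unsigned x, unsigned c)
{
    int k = shift_for_power_of_two(c);
    return k >= 0 ? x << k : x * c;
}

/* Divide x by the non-zero constant c, using a shift when c is a power
   of two. The caller must ensure c != 0. */
unsigned div_by_constant(unsigned x, unsigned c)
{
    int k = shift_for_power_of_two(c);
    return k >= 0 ? x >> k : x / c;
}
```

A compiler applies the same test at compile time, when one operand is a known constant, and emits the shift instruction instead of the multiplication or division.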
@unnumbered Concept Index
@printindex cp
@bye
@c Local variables:
@c fill-column: 78
@c texinfo-column-for-description: 32
@c End: