Kirill Smelkov efd9d92b7c lib/bcheck: Don't assume heap goes right after bss
At startup __bound_init() wants to mark the malloc zone as invalid memory,
so that any access to heap memory not allocated through malloc is treated
as invalid. Other pages are initialized as empty regions, access to which
is not treated as invalid by bounds-checking.

The problem is that the code incorrectly assumed the heap starts right
after bss, which is not true in two cases:

    1) if we are running from `tcc -b -run`, the program's text, data and bss
       will already be in malloced memory, possibly in an mmapped region
       instead of the heap, and marking memory as invalid from _end
       will not cover the heap and will probably wrongly mark valid regions.

    2) if address space randomization is turned on, again heap does not
       start from _end, and we'll mark as invalid something else instead
       of malloc area.

For example with the following diagnostic patch ...

    diff --git a/tcc.c b/tcc.c
    index 5dd5725..31c46e8 100644
    --- a/tcc.c
    +++ b/tcc.c
    @@ -479,6 +479,8 @@ static int parse_args(TCCState *s, int argc, char **argv)
         return optind;
     }

    +extern int _etext, _edata, _end;
    +
     int main(int argc, char **argv)
     {
         int i;
    @@ -487,6 +489,18 @@ int main(int argc, char **argv)
         int64_t start_time = 0;
         const char *default_file = NULL;

    +    void *brk;
    +
    +    brk = sbrk(0);
    +
    +    fprintf(stderr, "\n>>> TCC\n\n");
    +    fprintf(stderr, "etext:\t%10p\n",  &_etext);
    +    fprintf(stderr, "edata:\t%10p\n",  &_edata);
    +    fprintf(stderr, "end:\t%10p\n",    &_end);
    +    fprintf(stderr, "brk:\t%10p\n",    brk);
    +    fprintf(stderr, "stack:\t%10p\n",  &brk);
    +
    +    fprintf(stderr, "&errno: %p\n", &errno);
         s = tcc_new();

         output_type = TCC_OUTPUT_EXE;

    diff --git a/tccrun.c b/tccrun.c
    index 531f46a..25ed30a 100644
    --- a/tccrun.c
    +++ b/tccrun.c
    @@ -91,6 +91,8 @@ LIBTCCAPI int tcc_run(TCCState *s1, int argc, char **argv)
         int (*prog_main)(int, char **);
         int ret;

    +    fprintf(stderr, "\n\ntcc_run() ...\n\n");
    +
         if (tcc_relocate(s1, TCC_RELOCATE_AUTO) < 0)
             return -1;

    diff --git a/lib/bcheck.c b/lib/bcheck.c
    index ea5b233..8b26a5f 100644
    --- a/lib/bcheck.c
    +++ b/lib/bcheck.c
    @@ -296,6 +326,8 @@ static void mark_invalid(unsigned long addr, unsigned long size)
         start = addr;
         end = addr + size;

    +    fprintf(stderr, "mark_invalid  %10p - %10p\n", (void *)addr, (void *)end);
    +
         t2_start = (start + BOUND_T3_SIZE - 1) >> BOUND_T3_BITS;
         if (end != 0)
             t2_end = end >> BOUND_T3_BITS;

... Look how memory is laid out for `tcc -b -run ...`:

    $ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\"  -run   \
        -DONE_SOURCE ./tcc.c -B. -c x.c

    >>> TCC

    etext:   0x8065477
    edata:   0x8070220
    end:     0x807a95c
    brk:     0x807b000
    stack:  0xaffff0f0
    &errno: 0xa7e25688

    tcc_run() ...

    mark_invalid  0xfff80000 -      (nil)
    mark_invalid  0xa7c31d98 - 0xafc31d98

    >>> TCC

    etext:  0xa7c22767
    edata:  0xa7c2759c
    end:    0xa7c31d98
    brk:     0x8211000
    stack:  0xafffeff0
    &errno: 0xa7e25688
    Runtime error: dereferencing invalid pointer
    ./tccpp.c:1953: at 0xa7beebdf parse_number() (included from ./libtcc.c, ./tcc.c)
    ./tccpp.c:3003: by 0xa7bf0708 next() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:4465: by 0xa7bfe348 block() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:4440: by 0xa7bfe212 block() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:5529: by 0xa7c01929 gen_function() (included from ./libtcc.c, ./tcc.c)
    ./tccgen.c:5767: by 0xa7c02602 decl0() (included from ./libtcc.c, ./tcc.c)

The second mark_invalid starts right after the in-memory-compiled program's
_end, and oops, that's not where the malloc zone is (it starts from brk), and
oops again, mark_invalid covers e.g. errno. The compiled tcc then crashes in
bcheck on an errno access:

    1776 static void parse_number(const char *p)
    1777 {
    1778     int b, t, shift, frac_bits, s, exp_val, ch;
         ...
    1951             *q = '\0';
    1952             t = toup(ch);
    1953             errno = 0;

The solution here is to use sbrk(0) as approximation for the program
break start instead of &_end:

    - if we are a separately compiled program, __bound_init() runs early,
      and sbrk(0) should be equal or very near to start_brk (in case other
      constructors malloc something), or

    - if we are running from under `tcc -b -run`, sbrk(0) will return the
      start of the heap portion that is under this program's control, so
      earlier allocated memory is not marked as invalid.
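
The two candidate heap starts can be compared directly. This is a minimal
sketch, assuming Linux with glibc and GNU ld (which provide sbrk() and the
_end symbol); the helper names are hypothetical and this is not the code in
lib/bcheck.c.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* makes glibc declare sbrk() under strict -std= modes */
#endif
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

extern char _end;  /* linker-defined: first address past the executable's bss */

/* Old assumption: the heap starts right after bss. */
static uintptr_t heap_start_from_end(void)
{
    return (uintptr_t)&_end;
}

/* Approach described above: use the current program break. */
static uintptr_t heap_start_from_brk(void)
{
    return (uintptr_t)sbrk(0);
}

/* Print both candidates and the signed distance between them. */
static void report_heap_start(FILE *out)
{
    uintptr_t e = heap_start_from_end();
    uintptr_t b = heap_start_from_brk();

    fprintf(out, "end: %p\n", (void *)e);
    fprintf(out, "brk: %p\n", (void *)b);
    fprintf(out, "gap: %lld bytes\n", (long long)(intptr_t)(b - e));
}
```

In the log above, the natively started tcc shows the two values less than a
page apart (0x807a95c vs 0x807b000), while under `tcc -b -run` they are far
apart (0xa7c31d98 vs 0x8211000).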

With this patch `tcc -b -run tcc.c ...` succeeds in compiling the small
test program above (the diagnostic patch is still applied too):

    $ ./tcc -B. -b -DTCC_TARGET_I386 -DCONFIG_MULTIARCHDIR=\"i386-linux-gnu\"  -run   \
        -DONE_SOURCE ./tcc.c -B. -c x.c

    >>> TCC

    etext:   0x8065477
    edata:   0x8070220
    end:     0x807a95c
    brk:     0x807b000
    stack:  0xaffff0f0
    &errno: 0xa7e25688

    tcc_run() ...

    mark_invalid  0xfff80000 -      (nil)
    mark_invalid   0x8211000 - 0x10211000

    >>> TCC

    etext:  0xa7c22777
    edata:  0xa7c275ac
    end:    0xa7c31da8
    brk:     0x8211000
    stack:  0xafffeff0
    &errno: 0xa7e25688

    (completes ok)

but running `tcc -b -run tcc.c -run tests/tcctest.c` sigsegv's - that's
the plot for the next patch.
2012-12-09 19:05:36 +04:00
examples Revert "Make ex1.c and ex4.c be executable on any systems" 2012-06-12 15:45:13 +02:00
include Remove semicolon in x86-64 va_arg definition. 2011-08-05 20:32:57 +02:00
lib lib/bcheck: Don't assume heap goes right after bss 2012-12-09 19:05:36 +04:00
tests tests: btest should only run on targets supporting bcheck 2012-11-24 12:54:03 +04:00
tests2 Create a clean target for tests2/Makefile 2012-11-07 14:56:37 +01:00
win32 win32: tcc.exe uses libtcc.dll 2012-04-18 18:38:11 +02:00
.gitignore Update .gitignore 2012-11-22 10:40:02 +04:00
COPYING changed license to LGPL 2003-05-24 14:18:56 +00:00
Changelog Add support for arm hardfloat calling convention 2012-06-05 23:09:55 +02:00
Makefile Define TCC_ARM_EABI if using hardfloat ABI 2012-11-20 11:36:13 +01:00
README Document in README that ex4.c can be executed. 2011-07-07 12:15:43 +02:00
TODO re-apply VLA by Thomas Preud'homme 2011-04-06 09:17:03 -07:00
VERSION update Changelog, bump version: 0.9.25 2009-05-11 19:01:26 +02:00
arm-gen.c arm-gen.c: Invalid operator test always false 2012-11-28 22:26:39 +01:00
c67-gen.c rename error/warning -> tcc_(error/warning) 2011-08-11 17:07:56 +02:00
coff.h C67 COFF executable format support (TK) 2004-10-05 22:33:55 +00:00
configure Add armv6l to ARM supported processors 2012-11-11 20:01:01 +01:00
elf.h Add support for R_ARM_THM_{JUMP24,CALL} relocs 2012-10-28 19:55:12 +01:00
i386-asm.c rename error/warning -> tcc_(error/warning) 2011-08-11 17:07:56 +02:00
i386-asm.h i386-asm: support "pause" opcode 2011-02-24 09:38:13 -08:00
i386-gen.c i386: We can change 'lea 0(%ebp),r' to 'mov %ebp,r' 2012-11-16 10:22:45 +04:00
i386-tok.h integrate x86_64-asm.c into i386-asm.c 2009-12-19 22:16:20 +01:00
il-gen.c rename error/warning -> tcc_(error/warning) 2011-08-11 17:07:56 +02:00
il-opcodes.h added CIL target 2002-02-10 16:14:03 +00:00
libtcc.c fix #include_next infinite loop bug, see http://savannah.nongnu.org/bugs/?31357 2012-09-20 22:12:05 +03:00
libtcc.h tccrun: another incompatible change to the tcc_relocate API 2012-09-01 11:33:34 +02:00
stab.def added 2002-12-08 14:36:36 +00:00
stab.h added 2002-12-08 14:36:36 +00:00
tcc-doc.texi Inform user that -b only exists on i386. 2012-03-13 19:43:43 +01:00
tcc.c tcc.c: fix argv index for parse_args 2012-06-12 15:32:44 +02:00
tcc.h Make tcc work after self-compiling with bounds-check enabled 2012-12-09 18:06:09 +04:00
tccasm.c Compile tccasm.c conditionally (TCC_CONFIG_ASM) 2012-01-06 18:34:21 +01:00
tcccoff.c rename error/warning -> tcc_(error/warning) 2011-08-11 17:07:56 +02:00
tccelf.c Generate PLT thumb stub only when necessary 2012-11-17 10:01:11 +01:00
tccgen.c Make tcc work after self-compiling with bounds-check enabled 2012-12-09 18:06:09 +04:00
tccpe.c pe: fix tcc not linking to user32 and gdi32 2012-11-02 16:59:21 +08:00
tccpp.c Make tcc work after self-compiling with bounds-check enabled 2012-12-09 18:06:09 +04:00
tccrun.c tccrun: another incompatible change to the tcc_relocate API 2012-09-01 11:33:34 +02:00
tcctok.h tcctok.h: fix ifdef target/host confusion 2011-04-12 00:11:47 -07:00
texi2pod.pl automatic man page generation from tcc-doc.texi 2003-05-18 18:11:06 +00:00
x86_64-asm.h x86-64: fix udiv, add cqto instruction 2009-12-19 22:16:19 +01:00
x86_64-gen.c x86-64: Fix call saved register restore 2012-06-10 09:01:26 +02:00

README

Tiny C Compiler - C Scripting Everywhere - The Smallest ANSI C compiler
-----------------------------------------------------------------------

Features:
--------

- SMALL! You can compile and execute C code everywhere, for example on
  rescue disks.

- FAST! tcc generates optimized x86 code. No byte code
  overhead. Compile, assemble and link about 7 times faster than 'gcc
  -O0'.

- UNLIMITED! Any C dynamic library can be used directly. TCC is
  heading toward full ISO C99 compliance. TCC can of course compile
  itself.

- SAFE! tcc includes an optional memory and bound checker. Bound
  checked code can be mixed freely with standard code.

- Compile and execute C source directly. No linking or assembly
  necessary. Full C preprocessor included. 

- C script supported : just add '#!/usr/local/bin/tcc -run' at the first
  line of your C source, and execute it directly from the command
  line.

Documentation:
-------------

1) Installation on an i386 Linux host (for Windows read tcc-win32.txt)

   ./configure
   make
   make test
   make install

By default, tcc is installed in /usr/local/bin.
./configure --help  shows configuration options.


2) Introduction

We assume here that you know ANSI C. Look at the example ex1.c to see
what the programs look like.

The include file <tcclib.h> can be used if you want a small basic libc
include support (especially useful for floppy disks). Of course, you
can also use standard headers, although they are slower to compile.

You can begin your C script with '#!/usr/local/bin/tcc -run' on the first
line and set its execute bits (chmod a+x your_script). Then, you can
launch the C code as a shell or perl script :-) The command line
arguments are put in 'argc' and 'argv' of the main functions, as in
ANSI C.
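
As an illustration of argument handling in such a script, here is a minimal
sketch (it is not one of the shipped examples). The helper name sum_args and
the file name sum.c are made up for this illustration. To run it as a script,
put '#!/usr/local/bin/tcc -run' on the first line, add the main() shown in the
comment, and set the execute bits.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: sum the numeric command line arguments.
   argv[0] is the script name and is skipped. */
static long sum_args(int argc, char **argv)
{
    long total = 0;
    int i;

    for (i = 1; i < argc; i++)
        total += strtol(argv[i], NULL, 10);
    return total;
}

/* In the script itself, main() would be:
 *
 *     int main(int argc, char **argv)
 *     {
 *         printf("%ld\n", sum_args(argc, argv));
 *         return 0;
 *     }
 */
```

Saved as sum.c with the first line and main() added, './sum.c 1 2 39' would
print 42.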

3) Examples

ex1.c: simplest example (hello world). Can also be launched directly
as a script: './ex1.c'.

ex2.c: more complicated example: find a number with the four
operations given a list of numbers (benchmark).

ex3.c: compute fibonacci numbers (benchmark).

ex4.c: more complicated: X11 program. A very demanding test in fact,
because standard headers are being used! As for ex1.c, can also be launched
directly as a script: './ex4.c'.

ex5.c: 'hello world' with standard glibc headers.

tcc.c: TCC can of course compile itself. Used to check the code
generator.

tcctest.c: auto test for TCC which tests many subtle possible bugs. Used
when doing 'make test'.

4) Full Documentation

Please read tcc-doc.html for a description of all the features of TCC.

Additional information is available for the Windows port in tcc-win32.txt.

License:
-------

TCC is distributed under the GNU Lesser General Public License (see
COPYING file).

Fabrice Bellard.