c++ - How DMC handles segments & Optlink oddities

Hi everyone,

I'm trying to get DMC working for non-PC 16-bit x86 (80186) 
hardware. I started writing a CRT for that target, but once I 
began building real programs, I ran into problems with how DMC 
handles segments.

DMC seems to create a sensible set of segments:

_TEXT for code
_DATA for initialized RW data
CONST for constant data?
_BSS for uninitialized RW data?

I'm using a memory model close to the small and tiny models, but 
CS, DS and SS are all different: the code is in ROM, the data 
lives in a limited memory space (I can't copy the whole 
executable there, and eventually there could be more than 64K of 
data, which would have to be paged, but that's another problem), 
and the stack has its own area.

My first issue is that I don't understand how DMC assigns data to 
each segment, and I can't find a way to make sure it does what I 
want.

It seems to put everything into _DATA. It honours __cs, but the 
CONST segment stays empty whatever code I write.

Here is an example:

#include <stdio.h>

int non_init_var;             /* Should be in _BSS    */
int inited_var = 2;           /* Should be in _DATA   */
const int constant_var = 53;  /* Should be in CONST   */

int main(int argc, char *argv[])
{
    const int const_local_var = 20;          /* Should be in TEXT or CONST */
    static int static_local_var;             /* Should be in _BSS          */
    static int static_inited_local_var = 5;  /* Should be in _DATA         */
    int local_var = 42;                      /* Should be on stack         */

    /* Text should be in CONST or in TEXT, as it is implicitly const here. */
    printf("Hello World!");

    /* Just to make sure all the variables are used and are not stripped */
    non_init_var = 29;
    static_local_var++;
    local_var = non_init_var + inited_var + constant_var +
                const_local_var + static_local_var + static_inited_local_var;
    return local_var;
}

After building with

dmc -c -a1 -NL test.c

I got this annotated disassembly output from the freeware version 
of IDA Pro 5:
FLAT:0000 ;
FLAT:0000 ; +-------------------------------------------------------------------------+
FLAT:0000 ; ¦     This file is generated by The Interactive Disassembler (IDA)        ¦
FLAT:0000 ; ¦     Copyright (c) 2010 by Hex-Rays SA, <support hex-rays.com>           ¦
FLAT:0000 ; ¦                      Licensed to: Freeware version                      ¦
FLAT:0000 ; +-------------------------------------------------------------------------+
FLAT:0000 ;
FLAT:0000 ; Input MD5   : F86FF1C59D493A04AE64BE3441AE11FD
FLAT:0000
FLAT:0000 ; File Name   : C:\users\crossover\Desktop\My Mac Desktop\AC2016\AC2016\test.obj
FLAT:0000 ; Format      : Object Module Format (OMF/Microsoft)
FLAT:0000 ; Module name      : test.c
FLAT:0000 ; MS parameters    :
FLAT:0000 ; Debug info type  : CodeView
FLAT:0000
FLAT:0000
FLAT:0000                 .386
FLAT:0000                 .model flat
FLAT:0000
FLAT:0000 ; ---------------------------------------------------------------------------
FLAT:0000
FLAT:0000 ; Segment type: Group
FLAT:0000 FLAT            group
FLAT:0000
extn00:0001 ; Near data, 4 bytes
extn00:0001 ; ---------------------------------------------------------------------------
extn00:0001
extn00:0001 ; Segment type: Externs
extn00:0001 ; extn00
extn00:0001                 extrn _non_init_var:byte:4 ; DATA XREF: _main+1Cw
extn00:0005                 extrn __acrtused_con:far
extn00:0005
extn01:0006 ; ---------------------------------------------------------------------------
extn01:0006
extn01:0006 ; Segment type: Externs
extn01:0006 ; extn01
extn01:0006 ; int printf(const char *,...)
extn01:0006                 extrn _printf:near      ; CODE XREF: _main+12p
extn01:0006
_TEXT:00000007 ; ---------------------------------------------------------------------------
_TEXT:00000007
_TEXT:00000007 ; Segment type: Pure code
_TEXT:00000007 _TEXT           segment dword public 'CODE' use32
_TEXT:00000007                 assume cs:_TEXT
_TEXT:00000007                 ;org 7
_TEXT:00000007                 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
_TEXT:00000007
_TEXT:00000007 ; ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦ S U B R O U T I N E ¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦¦
_TEXT:00000007
_TEXT:00000007 ; Attributes: bp-based frame
_TEXT:00000007
_TEXT:00000007                 public _main
_TEXT:00000007 _main           proc near
_TEXT:00000007
_TEXT:00000007 local_var       = dword ptr -4
_TEXT:00000007
_TEXT:00000007                 enter   4, 0
_TEXT:0000000B                 push    ebx
_TEXT:0000000C                 mov     eax, 20
_TEXT:00000011                 mov     [ebp+local_var], eax
_TEXT:00000014                 push    offset aHelloWorld ; "Hello World!"
_TEXT:00000019                 call    _printf
_TEXT:0000001E                 mov     ecx, 29
_TEXT:00000023                 mov     dword ptr ds:_non_init_var, ecx
_TEXT:00000029                 inc     ds:static_local_var
_TEXT:0000002F                 mov     edx, ds:_inited_var
_TEXT:00000035                 lea     ebx, [ecx+edx]
_TEXT:00000038                 add     ebx, ds:_constant_var
_TEXT:0000003E                 add     ebx, [ebp+local_var]
_TEXT:00000041                 add     ebx, ds:static_local_var
_TEXT:00000047                 add     ebx, ds:static_inited_local_var
_TEXT:0000004D                 mov     eax, ebx
_TEXT:0000004F                 add     esp, 4
_TEXT:00000052                 pop     ebx
_TEXT:00000053                 leave
_TEXT:00000054                 retn
_TEXT:00000054 _main           endp
_TEXT:00000054
_TEXT:00000054 _TEXT           ends
_TEXT:00000054
_DATA:00000005 ; ---------------------------------------------------------------------------
_DATA:00000005
_DATA:00000005 ; Segment type: Pure data
_DATA:00000005 _DATA           segment dword public 'DATA' use32
_DATA:00000005                 assume cs:_DATA
_DATA:00000005                 ;org 5
_DATA:00000005                 public _inited_var
_DATA:00000005 _inited_var     dd 2                    ; DATA XREF: _main+28r
_DATA:00000009                 public _constant_var
_DATA:00000009 _constant_var   dd 53                   ; DATA XREF: _main+31r
_DATA:0000000D static_inited_local_var dd 5            ; DATA XREF: _main+40r
_DATA:00000011 ; char aHelloWorld[]
_DATA:00000011 aHelloWorld     db 'Hello World!',0     ; DATA XREF: _main+Do
_DATA:00000011 _DATA           ends
_DATA:00000011
CONST:0000000E ; ---------------------------------------------------------------------------
CONST:0000000E
CONST:0000000E ; Segment type: Zero-length
CONST:0000000E CONST           segment dword public 'CONST' use32
CONST:0000000E CONST           ends
CONST:0000000E
_BSS:0000000F ; ---------------------------------------------------------------------------
_BSS:0000000F
_BSS:0000000F ; Segment type: Uninitialized
_BSS:0000000F _BSS            segment dword public 'BSS' use32
_BSS:0000000F                 assume cs:_BSS
_BSS:0000000F                 ;org 0Fh
_BSS:0000000F                 assume es:nothing, ss:nothing, ds:nothing, fs:nothing, gs:nothing
_BSS:0000000F static_local_var dd ?                   ; DATA XREF: _main+22w
_BSS:0000000F                                         ; _main+3Ar
_BSS:0000000F _BSS            ends
_BSS:0000000F
_BSS:0000000F
_BSS:0000000F                 end

What I see here is that the CONST segment is completely empty, 
even though several things should be treated as constant because 
they are explicitly or implicitly declared as such (the string 
passed to printf, for example, should be constant).

Also, two variables defined in this C file but not initialised 
are emitted as externs instead of going into _BSS. That is 
clearly not what I would expect.
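
One possible reading of the extrn entry, offered as an assumption 
rather than confirmed DMC behaviour: in C, a file-scope declaration 
without an initializer, such as `int non_init_var;`, is a tentative 
definition, and OMF compilers commonly emit these as communal 
(COMDEF) records, which a disassembler may list next to true 
externals. The sketch below contrasts that form with an explicitly 
initialized variable and a static one. All names are hypothetical, 
and the segment each one lands in would still have to be checked in 
the resulting .obj.

```c
/* Tentative definition: no initializer, external linkage.
   Assumption: this is the form that may show up as an extrn/communal. */
int tentative_var;

/* Full definition: the explicit initializer leaves no room for a
   communal record, so the compiler has to allocate it in a segment. */
int explicit_zero_var = 0;

/* Internal linkage: never visible to other modules. */
static int file_local_var;

/* Uses all three so that none of them is stripped. */
int sum_defaults(void)
{
    return tentative_var + explicit_zero_var + file_local_var;
}
```

All three variables start at zero according to the C standard, so 
only their placement in the object file differs, not the program's 
behaviour.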

I ran this with 32-bit x86 output, but all the 16-bit DOS memory 
models give the same result.

It is possible that on some systems the linker merges the CONST 
and DATA segments, since everything is in RAM there anyway, but 
the C compiler should not be the one doing that.

I should only have to copy the non-constant variables into the RW 
data RAM, and not all the constants, which could be really big in 
some projects.

Is this a bug in DMC?
Is this done on purpose? If so, how can I force it to use the 
CONST segment for everything that is constant, and make sure it 
does not create externs for variables that are clearly defined in 
the current file?
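
A minimal sketch of the __cs route mentioned earlier, for constants 
that must stay in ROM. ROM_CONST and USE_DMC_CS are hypothetical 
names introduced here, not DMC features, and the position of __cs 
inside the declaration is an assumption to be checked against the 
DMC documentation. On any other compiler the macro expands to 
nothing, so the same source still builds.

```c
/* USE_DMC_CS is a hypothetical switch: define it only when building
   a 16-bit target with DMC. ROM_CONST is a hypothetical wrapper. */
#ifdef USE_DMC_CS
#define ROM_CONST __cs   /* assumed placement of the DMC keyword */
#else
#define ROM_CONST        /* plain const on other compilers */
#endif

/* Constant table that should stay in ROM with the code. */
static const int ROM_CONST lookup_table[4] = { 53, 20, 5, 42 };

/* Reads the table through ordinary indexing and sums its entries. */
int table_sum(void)
{
    int i;
    int sum = 0;

    for (i = 0; i < 4; i++)
        sum += lookup_table[i];
    return sum;
}
```

Wrapping the keyword in a macro keeps the ROM placement in one 
place, so it can be changed or removed without touching every 
declaration.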


By the way, there are also a few things I don't understand about 
Optlink. I've created a third group to represent a specific area 
in memory:

section PRAM      class=IRAM location=00000h
             resb  0200h ; Put at byte 200h
_tickL:     resw    1   ; Low word of tick counter
_tickH:     resw    1   ; High word of tick counter
group CGROUP	_TEXT
group	DGROUP	C_COMMON _DATA _BSS _CONST ENDDATA
group IRGROUP  PRAM

Optlink seems to have merged DGROUP with IRGROUP, which I did not 
expect at all. In a C file that uses _tickL/_tickH, instead of the 
symbols being referenced through a far pointer (DS is not set to 
this section), the section just ends up merged into DGROUP, which 
is not the behaviour I expected.
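
To make the intended layout concrete, here is a small sketch of 
where the two counters should live and how they combine, assuming 
PRAM really sits at segment 0000h and the counters follow the 200h 
reserved bytes. The constant and function names are hypothetical; 
only the real-mode rule (linear address = segment * 16 + offset) is 
standard.

```c
#include <stdint.h>

#define PRAM_SEGMENT 0x0000u  /* assumed from location=00000h      */
#define TICKL_OFFSET 0x0200u  /* right after the 200h reserved bytes */
#define TICKH_OFFSET 0x0202u  /* one word after _tickL              */

/* Real-mode linear address: segment * 16 + offset. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Builds the 32-bit tick count from its low and high words. */
uint32_t combine_tick(uint16_t low, uint16_t high)
{
    return ((uint32_t)high << 16) | low;
}
```

With these assumptions, linear_address(PRAM_SEGMENT, TICKL_OFFSET) 
gives 200h, which is where _tickL should end up if the section is 
kept separate from DGROUP.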

I have a similar problem with the usual trick for getting the size 
of the RW data: put a pointer at the beginning of DGROUP, put an 
end pointer in the ENDDATA section, and use the two to copy the RW 
data into memory. When the assembler builds the CRT, it does not 
know where DGROUP will end up in the resulting file, so it creates 
a relocation marker saying "I still don't know where this will be 
put". At link time, however, Optlink does not replace that marker 
with the real value, so I end up with a file that contains 
relocations. That is not what I want, as I'm currently using 
exe2com (updated to work on 64-bit computers), which does not 
accept relocatable EXEs.

Is there a way to copy DGROUP into memory without generating a 
relocation?
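
For reference, the copy step itself can be sketched in portable C 
as below. This only illustrates the intended start-up behaviour: 
the function name and parameters are hypothetical stand-ins for the 
start/end markers, and in the real CRT the addresses and sizes 
would come from the linker, which is exactly the part that is 
causing trouble.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical CRT helper: copies the initialized data image from
   ROM into RAM, then clears the uninitialized area that follows it.
   data_size would be (end marker - start marker) in the real CRT. */
void crt_init_data(unsigned char *ram,
                   const unsigned char *rom_image,
                   size_t data_size,
                   size_t bss_size)
{
    memcpy(ram, rom_image, data_size);     /* _DATA: ROM -> RAM   */
    memset(ram + data_size, 0, bss_size);  /* _BSS: zero-filled   */
}
```

Only the bytes between the two markers are copied, so constants 
that stay in ROM never take up space in the RW data area.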
Thanks

Manoel
Apr 23 2016