digitalmars.com                        
Last update Thu May 31 14:33:42 2018

Using Assembly Language Functions

This chapter describes how to call assembly language functions from both C and C++ and how to create an interface to assembly language modules. It explains conventions for function return values, register usage, and data alignment at the assembly language level.

Conventions for both 16- and 32-bit compilations are covered. When describing register usage and contents, the name of the corresponding 32-bit register appears in parentheses after the name of a 16-bit register.

For information about the advantages of writing assembly language code inline, instead of assembling it separately, see Using the Inline Assembler.

Implications of Type-Safe Linkage

Type-safe linkage affects how you call assembly language functions from a C++ program: you cannot use the standard C-to-assembly language interface for C++ functions. For more information, see "Type-Safe Linkage" in Mixing Languages.

The Easy Way to Call Assembly Code from C++

In many cases in which you want to call an assembly language routine from a C++ function, you can use the following method, which does not require you to worry about function-naming or parameter-passing conventions:

extern "C"
{
  int assembler_routine(int x);
}

This method tells the C++ compiler through its function prototype that your assembly language routine uses C linkage. This is the easiest method of specifying C linkage and does not involve any change to the naming of the assembly language routine.

For complete information, see the section "Creating Routines With C++ Linkage" in this chapter.

Using existing assembly language modules

If you already have some assembly language routines written for use with DMC or Microsoft C, you can almost certainly use them with DMC++. However, you will need an ANSI C standard header file containing the function prototypes for these routines, and you will need to modify it to declare the functions as having C linkage.

The best method of specifying C linkage is to enclose the prototypes of your assembly language functions in an extern "C" { } block, as shown in the section "The Easy Way to Call Assembly Code from C++" above. The advantage is that you can use the same routine with both C and C++ modules.

Provided you include the header file containing the function prototypes in all the source files that use the assembly language routines, you will not even have to reassemble their code.
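A header that is shared between C and C++ modules is commonly written with a guard on the standard __cplusplus macro, so that only the C++ compiler sees the extern "C" wrapper. The following is a minimal sketch of that pattern. The routine name asm_add and its C++ stand-in body are hypothetical; they exist only so that the example builds without an assembler.

```cpp
#include <cassert>   // assert, for the usage shown after the listing

// --- contents of a hypothetical shared header ---
#ifdef __cplusplus
extern "C" {                 // seen only by a C++ compiler
#endif

int asm_add(int a, int b);   // prototype of the assembly language routine

#ifdef __cplusplus
}                            // a C compiler sees only the plain prototype
#endif
// --- end of header ---

// Stand-in for the assembly language module (assumption: the real
// routine would be assembled separately and export the same symbol).
extern "C" int asm_add(int a, int b)
{
    return a + b;
}
```

With the stand-in in place, a call such as asm_add(2, 3) from C++ code returns 5, and the same header can be included unchanged by a C module.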

Similarly, when calling a C++ function from an assembly language routine, declare that function as having C linkage in your C++ program. The only exception to this rule is member functions, which cannot be given C linkage.
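Because member functions cannot be given C linkage, one common workaround (not described in this chapter, so treat it as an assumption) is a free function with C linkage that forwards to the member function. In the sketch below, the class Counter and the wrapper counter_increment are hypothetical names used only for illustration.

```cpp
#include <cassert>   // assert, for the usage shown after the listing

class Counter {
public:
    Counter() : count_(0) {}
    int increment() { return ++count_; }   // member function: no C linkage possible
private:
    int count_;
};

// Free function with C linkage. An assembly language routine could call
// this wrapper, passing the address of the object as the only argument.
extern "C" int counter_increment(Counter *c)
{
    return c->increment();
}
```

Calling counter_increment twice on the same object returns 1 and then 2, exactly as calling the member function directly would.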

Organization of Object Files

Digital Mars .com files are not the same as those produced by other compilers. In most other compilers, CS==SS==DS for .com files, and the entire size of the program, plus stack and heap, must be less than 64KB. In Digital Mars .com files, only the size of the code plus DGROUP areas must be less than 64KB. Considerably larger .com programs can thus be created. Also, the only difference between a Digital Mars Tiny model program and a Small model program is how it is linked.

In all but the Tiny model, the STACK segment is set to 128 bytes in length. This is enough to allow the operating system to start up the program. Code in the C++ startup module, c.asm, then allocates a full stack elsewhere. The 128 bytes are subsequently used to store the program command line so that it is addressable using the DS register. In the Tiny model, the STACK segment is zero bytes in length.

All BSS segments are cleared to 0 by the startup module, regardless of the memory model in use.

For the Tiny, Small, and Medium models, there are two schemes for allocation of the near heap. These schemes are selected by the value of the global variable _okbigbuf. For more information on memory allocation, see "Choosing a memory model" in Compiling Code.

Layout of Assembly Language Modules

To work with DMC++, assembly language code must be divided into code and data segments. Executable code and functions callable from C or C++ go into the code segment. Static and global data declarations go into the data segment.

The pseudo-ops for defining the code and data segments for each memory model are different. Therefore, use the macros begcode, endcode, begdata, and enddata defined in macros.asm for each memory model. The general layout for an .asm source file is:

INCLUDE MACROS.ASM ;define memory model macros
                   ;EXTRN statements for C/C++
                   ;functions to call go here

begdata            ;define start of data
                   ;EXTRN statements for
                   ;external data globals go here

enddata            ;define end of data segment

begcode modulename ;define start of code
                   ;executable code goes here

endcode modulename ;define end of code segment

END                ;define end of module

Function Return Values for 16-Bit Models

For the 16-bit memory models (Tiny, Small, Medium, Compact, and Large), near pointers, ints, unsigned ints, and shorts are returned in AX. Chars are returned in AL. Longs and unsigned longs are returned in DX, AX, where DX contains the most significant 16 bits and AX contains the least significant 16 bits. Far pointers are returned in DX, AX, where DX has the segment portion and AX has the offset.

When C linkage is in effect, floats are returned in DX, AX, and doubles are returned in AX, BX, CX, DX, where AX contains the most significant 16 bits and DX contains the least significant. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.

1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in DX, AX. With larger structures, the method used depends on the linkage in use for the function. For C linkage, a function that returns a structure actually returns a pointer to the structure, which is in the static data segment. This means that C functions that return structures are not reentrant. C++ linkage creates a temporary copy on the stack and returns a pointer to it, which is reentrant.
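The DX, AX convention for longs can be modeled in portable C++ as a split into two 16-bit halves. This is only an illustrative sketch: the type RegPair16 and the functions split_long and join_long are hypothetical names, and the code models the register contents rather than touching real registers.

```cpp
#include <cassert>   // assert, for the usage shown after the listing
#include <cstdint>

// Model of the DX, AX register pair used for 32-bit results.
struct RegPair16 {
    std::uint16_t dx;   // most significant 16 bits (segment for far pointers)
    std::uint16_t ax;   // least significant 16 bits (offset for far pointers)
};

// Split a 32-bit value the way a long is returned in DX, AX.
inline RegPair16 split_long(std::uint32_t value)
{
    RegPair16 r;
    r.dx = static_cast<std::uint16_t>(value >> 16);
    r.ax = static_cast<std::uint16_t>(value & 0xFFFFu);
    return r;
}

// Recombine DX, AX into the 32-bit value the caller sees.
inline std::uint32_t join_long(RegPair16 r)
{
    return (static_cast<std::uint32_t>(r.dx) << 16) | r.ax;
}
```

For example, an assembly language function returning the long 0x12345678 would leave 0x1234 in DX and 0x5678 in AX.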

Function Return Values for 32-Bit Models

Near pointers, ints, unsigned ints, longs, and unsigned longs are returned in EAX. Chars are returned in AL; shorts are returned in AX. Far pointers are returned in DX, EAX, where DX contains the segment and EAX contains the offset. long longs are returned in EDX, EAX.

When C linkage is in effect, floats are returned in EAX and doubles in EDX, EAX, where EDX contains the most significant 32 bits and EAX contains the least significant 32 bits. When C++ linkage is in effect, the compiler creates a temporary copy of the variable on the stack and returns a pointer to it. Both these techniques are reentrant.
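The EDX, EAX split for 64-bit results can be sketched the same way in portable C++. The names RegPair32, split_u64, and split_double are hypothetical, and the expected bit pattern for a double assumes the IEEE 754 format used by x86 hardware.

```cpp
#include <cassert>   // assert, for the usage shown after the listing
#include <cstdint>
#include <cstring>

// Model of the EDX, EAX register pair used for 64-bit results.
struct RegPair32 {
    std::uint32_t edx;   // most significant 32 bits
    std::uint32_t eax;   // least significant 32 bits
};

// Split a 64-bit integer the way a long long is returned in EDX, EAX.
inline RegPair32 split_u64(std::uint64_t bits)
{
    RegPair32 r;
    r.edx = static_cast<std::uint32_t>(bits >> 32);
    r.eax = static_cast<std::uint32_t>(bits & 0xFFFFFFFFu);
    return r;
}

// Split the bit pattern of a double as it would appear under C linkage.
inline RegPair32 split_double(double d)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);   // copy the raw bits safely
    return split_u64(bits);
}
```

Under these assumptions, the double 1.0 (bit pattern 0x3FF0000000000000) would be returned with 0x3FF00000 in EDX and 0 in EAX.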

1-byte structs are returned in AL, 2-byte structs in AX, and 4-byte structs in EAX. With larger structures, the compiler creates a temporary copy of the variable on the stack and returns a (reentrant) pointer to it.

For 32-bit C++ code, where a struct has no constructors or destructors declared for it, 1-byte structs are returned in AL, 2-byte structs in AX, 4-byte structs in EAX, and 8-byte structs in EDX, EAX.

Warning: In previous versions of DMC++, small structs without constructors in 32-bit C++ code were passed through a hidden pointer to the return value. The change described above was made for compatibility with Microsoft. Due to this change, if you build part of an application with the current version of DMC++, you need to rebuild all of the application; otherwise, crash bugs could be introduced.

Register usage and data alignment for 16-bit models

When interfacing to 16-bit memory models, assembly language functions can change the values in AX, BX, CX, DX, or ES. Functions must preserve the values in SI, DI, BP, SP, SS, CS, and DS. The direction flag must always be set to forward.

Data should be aligned along 16-bit boundaries to maximize speed on 16-bit buses.

Register usage and data alignment for 32-bit models

When interfacing to 32-bit memory models, assembly language functions can change the values in EAX, ECX, EDX, or ES. Functions must preserve the values in EBX, ESI, EDI, EBP, ESP, SS, FS, CS, and DS. The direction flag must always be set to forward.

Data should be aligned along 32-bit boundaries to maximize speed on 32-bit buses.

Macros in macros.asm

There are macros defined in macros.asm that aid in the development of memory model-independent assembly language files. The macros are:

Table 5-1 Macros defined in macros.asm
Macro    Description
begcode  Define start of code segment
endcode  Define end of code segment
begdata  Define start of initialized data segment
enddata  Define end of initialized data segment
SIZEPTR  Default pointer size in bytes (2 for Tiny, Small, Medium models; 4 for Compact, Large, Phar Lap, and DOSX models)
P        Offset of first parameter from BP (EBP)
SPTR     Non-zero if pointers are near by default (Tiny, Small, Medium, Phar Lap, and DOSX memory models)
LPTR     Non-zero if pointers are far by default (Compact and Large memory models)
LCODE    Non-zero if large code (Medium or Large memory models)
SSeqDS   Non-zero if SS == DS
ESeqDS   Non-zero if ES == DS
uses     Pushes registers that must be saved
unuse    Pops saved registers
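The model-dependent values in Table 5-1 can be summarized as a small lookup. The sketch below is written in C++ purely for illustration; the enum MemoryModel, the struct ModelFlags, and the function model_flags are hypothetical names, and the values are taken only from the table above.

```cpp
#include <cassert>   // assert, for the usage shown after the listing

enum MemoryModel { Tiny, Small, Medium, Compact, Large, PharLap, DOSX };

struct ModelFlags {
    int  sizeptr;   // SIZEPTR: default pointer size in bytes
    bool sptr;      // SPTR: pointers are near by default
    bool lptr;      // LPTR: pointers are far by default
    bool lcode;     // LCODE: large code (far functions)
};

// Return the values that macros.asm is documented to define for a model.
inline ModelFlags model_flags(MemoryModel m)
{
    ModelFlags f;
    f.lptr    = (m == Compact || m == Large);
    f.sptr    = !f.lptr;
    f.lcode   = (m == Medium || m == Large);
    f.sizeptr = (m == Tiny || m == Small || m == Medium) ? 2 : 4;
    return f;
}
```

Note that the Phar Lap and DOSX models have near pointers (SPTR is non-zero) even though SIZEPTR is 4, because a near pointer is a 32-bit offset in those models.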

Creating Routines with C Linkage

Calling an assembly language routine directly from a C function is much easier than calling an assembly language routine from C++.

Subroutine linkage

The BP register (EBP for 32-bit compilations) is dedicated to pointing to the current stack frame. A subroutine with C linkage is called by pushing the arguments onto the stack from right to left; then the subroutine is called. The called subroutine saves the old BP (EBP) on the stack, sets BP (EBP) to point to it, allocates space on the stack for all local variables, and pushes SI (ESI) and DI (EDI) if they are needed by the function. 32-bit code must also save the EBX register. The body of the subroutine is then executed.

The subroutine returns by popping EBX (32-bit code only), DI (EDI) and SI (ESI), deallocating space on the stack for the local variables, popping off the old value of BP (EBP), and returning. The calling code then removes the parameters from the stack.

Organization of the stack frame

The stack frame of a function is the current state of the stack and variables in it at a given point in the execution of the function. The table below shows the normal organization of the stack frame.

Table 5-2 Normal organization of stack frame
Pointer    Stack contents
           High memory
           Previous stack frame
           Parameters
           Return address
BP (EBP)   Old value of BP (EBP)
           Local variables and temporaries
           SI (ESI)
SP (ESP)   DI (EDI)
           Low memory

The stack grows downward (toward lower addresses).
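The calling sequence and frame layout described above can be traced with a toy model of a 16-bit stack. This is a sketch in portable C++, not real calling code: the class Stack16, the struct Frame, the function enter_near_call, and the placeholder values used for the return address and the old BP are all hypothetical.

```cpp
#include <cassert>   // assert, for the usage shown after the listing
#include <cstdint>
#include <vector>

// Toy 16-bit stack: byte-addressed memory, SP moves down by 2 per push.
class Stack16 {
public:
    explicit Stack16(std::uint16_t top) : mem_(top, 0), sp_(top) {}
    void push(std::uint16_t v) {
        sp_ -= 2;                                       // stack grows downward
        mem_[sp_]     = static_cast<std::uint8_t>(v & 0xFF);
        mem_[sp_ + 1] = static_cast<std::uint8_t>(v >> 8);
    }
    std::uint16_t word_at(std::uint16_t addr) const {
        return static_cast<std::uint16_t>(mem_[addr] | (mem_[addr + 1] << 8));
    }
    std::uint16_t sp() const { return sp_; }
private:
    std::vector<std::uint8_t> mem_;
    std::uint16_t sp_;
};

struct Frame {
    Stack16 stack;
    std::uint16_t bp;   // value of BP after "mov BP,SP"
};

// Trace a near call f(first, second) with C linkage, up to the point
// where the callee has executed "push BP" and "mov BP,SP".
inline Frame enter_near_call(std::uint16_t first, std::uint16_t second)
{
    Stack16 s(64);
    s.push(second);    // caller pushes arguments right to left
    s.push(first);
    s.push(0x1234);    // call pushes the return address (placeholder value)
    s.push(0xBBBB);    // callee saves the old BP (placeholder value)
    Frame f = { s, s.sp() };
    return f;
}
```

After enter_near_call(10, 20), the first argument is found at BP+4 and the second at BP+6, with the return address at BP+2 and the old BP at BP+0.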

Small model example

The example below shows a short C++ program that calls an assembly language function using C linkage. This function sets the cursor position to the coordinates x, y. All macros are expanded, and the calling function is translated to assembly language to further show how the compiler translates a function with C linkage. The utility to translate the function is obj2asm.exe.

Here is the C++ program:

// essential!
extern "C" void gotoxy(int x, int y);
// normally in a header file

int main()
{
  gotoxy(10, 20); // set cursor position at ROW 10, COL 20
  return 1;
}

After compiling the C++ program to an object file, use the utility OBJ2ASM to produce the assembly language equivalent below:

_TEXT segment
_main:
  mov AX,014h   ; move 20 into AX
  push AX       ; push on stack (2 BYTES)
  mov AX,0Ah    ; move 10 into AX
  push AX       ; push on stack (2 BYTES)
  callm _gotoxy ; call gotoxy() function
  add SP,4      ; adjust stack ptr. (4 BYTES)
  ret
_TEXT ends

or for a 32-bit memory model:

_TEXT segment
_main:
  push 014h     ; push 20 on stack (4 BYTES)
  push 0Ah      ; push 10 on stack (4 BYTES)
  callm _gotoxy ; call gotoxy() function
  add ESP,8     ; adjust stack ptr. (8 BYTES)
  ret
_TEXT ends

Since the function gotoxy has been defined as using C linkage, the variables are pushed on the stack from right to left.

First, the column (20) is pushed on the stack. Next, the row (10) is pushed on the stack. Finally, the call to gotoxy is made, pushing the instruction pointer (IP) on the stack. Note that the 32-bit version pushes the parameters directly onto the stack, whereas the 16-bit version first moves them into AX; this is one of the advantages of generating 32-bit code.

The compiler prepends an underscore to the function names main and gotoxy, giving _main and _gotoxy. The _TEXT segment is the CODE segment. Table 5-3 shows how the stack looks after the call to _gotoxy(10,20):

Table 5-3 Stack after the call to _gotoxy(10,20)
               High memory
BP+4 (EBP+8)   20
BP+2 (EBP+4)   10
BP+0 (EBP+0)   IP (EIP) return address
               Low memory

The assembly language function below uses a set of utility macros for MASM 5.0 and above that is supplied with the compiler and normally installed in the INCLUDE directory. All macros are defined in macros.asm. The 32-bit version is selected when the macro DOS386 is defined.

include macros.asm ; pull in defs of macros
begcode gotoxy     ; define start of code seg called gotoxy

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; C++ interface routine, C linkage.
; Puts cursor at row, col.
; Usage:
; void gotoxy(int row, int col);

IFNDEF DOS386
public _gotoxy      ; make gotoxy global
_gotoxy proc near   ; define start of func
    push BP         ; save old stack frame
    mov BP,SP       ; set BP to point to old BP
    mov DH,P[BP]    ; DH = row
    mov DL,P[BP+2]  ; DL = col
    mov AH,2        ; BIOS function set cursor pos.
    xor BX,BX       ; page 0
    int 10h         ; BIOS video interrupt
    pop BP          ; restore old BP
    ret             ; return to caller
_gotoxy endp        ; define end of func
ELSE
public _gotoxy      ; make gotoxy global
_gotoxy proc near   ; define start of func
    push EBP        ; save old stack frame
    mov EBP,ESP     ; set EBP to point to old EBP
    uses   ; saves registers that are used
    mov DH,P[EBP]   ; DH = row
    mov DL,P[EBP+4] ; DL = col
    mov AH,2        ; BIOS function set cursor pos.
    xor EBX,EBX     ; page 0
    int 10h         ; BIOS video interrupt
    unuse  ; note reverse order
    pop EBP         ; restore old EBP
    ret             ; return to caller
_gotoxy endp        ; define end of func
ENDIF
endcode gotoxy      ; define end of code seg
end

The assembly language function begins by first pushing BP (EBP) onto the stack and moving the stack pointer into BP (EBP). This permits access to the variables pushed onto the stack by the calling function.

This is done by using BP (EBP) to point to offsets within the stack. In the example above, MOV DH, P[BP] (MOV DH, P[EBP]) obtains the row number from the stack and places it in DH. P is the offset from BP (EBP) to the first parameter on the stack; it expands to 4 for the Tiny, Small, and Compact models, 6 for the Medium and Large models, and 8 for the Phar Lap and DOSX models. The table below shows the variables and their positions on the stack.

Table 5-4 Stack for Tiny, Small, and Compact memory models
                High memory
BP+6 (EBP+12)   20
BP+4 (EBP+8)    10
BP+2 (EBP+4)    IP (EIP) return address
BP+0 (EBP+0)    Previous BP (EBP)
                Low memory

After completing the function, restore BP (EBP) and return to the calling function. The ret pops IP (EIP) off the stack, and execution resumes at the instruction following the call to _gotoxy. The next instruction in the calling function is ADD SP, 4 (ADD ESP, 8). This instruction resets the stack pointer to the position it occupied before the parameters were pushed.

The above example pertains to the Tiny, Small, Compact, and the 32-bit models. If the Large or Medium memory models are used, the far call also pushes CS onto the stack. This changes the position of the variables on the stack to those shown below:

Table 5-5 Stack for Large and Medium memory models
       High memory
BP+8   20
BP+6   10
BP+4   CS return segment
BP+2   IP return address
BP+0   Previous BP
       Low memory

Using the P macro (defined in macros.asm) compensates for these differences.
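The effect of P on parameter offsets can be checked with a short calculation. The sketch below is in C++ for illustration only; MemoryModel, p_offset, and int_param_offset are hypothetical names, and the calculation assumes that every parameter is an int (2 bytes in 16-bit models, 4 bytes in the Phar Lap and DOSX models).

```cpp
#include <cassert>   // assert, for the usage shown after the listing

enum MemoryModel { Tiny, Small, Medium, Compact, Large, PharLap, DOSX };

// Value that the P macro is documented to expand to for each model.
inline int p_offset(MemoryModel m)
{
    switch (m) {
    case Medium:
    case Large:   return 6;   // old BP + far return address (IP and CS)
    case PharLap:
    case DOSX:    return 8;   // old EBP + 32-bit return address
    default:      return 4;   // old BP + near return address
    }
}

// Offset from BP (EBP) of the index-th int parameter, counting from 0.
inline int int_param_offset(MemoryModel m, int index)
{
    const int slot = (m == PharLap || m == DOSX) ? 4 : 2;   // bytes per int
    return p_offset(m) + index * slot;
}
```

For gotoxy(10, 20), this gives BP+4 and BP+6 in the Small model, BP+6 and BP+8 in the Large model, and EBP+8 and EBP+12 in the 32-bit models, which agrees with Tables 5-4 and 5-5.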

Model-independent example

This example illustrates an assembly language routine that implements the C function func2 shown below. The routine is written so that it assembles correctly for any memory model:

// C++ MODULE
extern int var1;
int var2;
extern "C" int func1(int *p, int a);// essential!

int func2(int *pa, int a)
{
  int b;
  *pa = b;
  var2 = b + var1 + func1(&b, a);
  return a - var2;
}

Here is the corresponding assembly language module:

; Assembler MODULE
include MACROS.ASM

IFNDEF DOS386
begdata                 ; define start of data seg
extrn   _var1:word
_var2   dw      0       ; allocate var2
enddata                 ; end of data segment

IF LCODE                ; if large code model
extrn   _func1:far      ; then far function
ELSE
extrn   _func1:near     ; else near function
ENDIF

begcode func2
public  _func2          ; make func2 global
IF LCODE
_func2  proc far        ; define function func2
ELSE
_func2  proc near       ; define function func2
ENDIF
push    BP              ; save old frame pointer
mov     BP,SP           ; set new frame pointer
sub     SP,2            ; create room for b
mov     AX,-2[BP]       ; AX = b
IF SPTR                 ; if small memory model
mov     BX,P[BP]        ; BX = pa
mov     [BX],AX         ; *pa = b
ELSE                    ; else large memory model
les     BX,P[BP]        ; ES:BX = pa
mov     ES:[BX],AX      ; *pa = b
ENDIF
push    P+SIZEPTR[BP]   ; push a onto stack
IF LPTR                 ; if far pointers
push    SS              ; push segment of b
ENDIF
lea     AX,-2[BP]       ; AX = offset of b
push    AX
call    _func1          ; call func1(&b, a)
add     SP,SIZEPTR+2    ; restore the stack
add     AX,_var1        ; func1 returned result in AX
add     AX,-2[BP]       ; AX = b + var1 + func1(&b, a)
mov     _var2,AX
mov     AX,P+SIZEPTR[BP] ; AX = a
sub     AX,_var2        ; AX = a -var2
mov     SP,BP           ; dump local variables
pop     BP              ; restore old frame pointer
ret                     ; AX has return value
_func2  endp            ; end of function func2
endcode func2           ; end of code segment

ELSE
begdata                 ; start of data seg
extrn _var1:dword
_var2 dd 0              ; allocate var2
enddata                 ; end of data segment

extrn   _func1:near     ; near function
begcode func2
public  _func2          ; make func2 global
_func2  proc near       ; define function func2
push    EBP             ; save old frame pointer
mov     EBP,ESP         ; set new frame pointer
sub     ESP,4           ; create room for b
uses    <EBX>           ; preserve EBX
mov     EAX,-4[EBP]     ; EAX = b
mov     EBX,P[EBP]      ; EBX = pa
mov     [EBX],EAX       ; *pa = b
push    P+SIZEPTR[EBP]  ; push a onto stack
lea     EAX,-4[EBP]     ; EAX = offset of b
push    EAX
call    near ptr _func1 ; call func1(&b, a)
add     ESP,SIZEPTR+4   ; restore the stack
add     EAX,_var1       ; func1 returned result in EAX
add     EAX,-4[EBP]     ; EAX = b + var1 + func1(&b, a)
mov     _var2,EAX
mov     EAX,P+SIZEPTR[EBP] ; EAX = a
sub     EAX,_var2       ; EAX = a - var2
unuse   <EBX>           ; restore EBX
mov     ESP,EBP         ; dump local variables
pop     EBP             ; restore old frame ptr.
ret                     ; EAX has return value
_func2 endp             ; end of function func2
endcode func2           ; end of code segment
ENDIF
END                     ; end of module

EXTRN statements for code should be placed outside the begcode/endcode pairs; otherwise, the linker can report fixup errors when using the Medium or Large models.

Creating Routines With C++ Linkage

In almost all cases, it is better to use C linkage for assembly language functions that will be called from C++ code. This ensures compatibility with future versions of DMC++ and other compilers and avoids the problems associated with subtle differences in C++ calling conventions in different situations.

Digital Mars recommends writing assembly language functions inline or using C linkage (that is, declaring them as extern "C"), rather than using C++ linkage. If you must use C++ linkage, see the book Microsoft Object Mapping Specification for implementation details.

Running MASM

When you call MASM, the include file macros.asm sets up macros, depending on which memory model you desire. You indicate the memory model by defining a symbol on the command line:

MASM /MX /DI8086? module;

where ? is one of S, M, C, L, or V, corresponding to the appropriate memory model. The Small model is the default. You can see this by looking at the file macros.asm. Do not define I8086T for Tiny model programs; use I8086S instead. (Remember that the only difference between Tiny and Small programs is how they are linked, not how they are compiled or assembled.) For 32-bit programs, define the symbol as /DDOS386.
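The rules above for choosing the symbol can be captured in a small helper, for example in a build script generator. The sketch below is a C++ illustration only; the function names masm_model_define and masm_command are hypothetical, and the helper knows nothing about MASM beyond the switches named in this section.

```cpp
#include <cassert>   // assert, for the usage shown after the listing
#include <string>

// Return the /D switch for a memory model letter, or an empty string
// if the letter is not one of the models listed in this section.
inline std::string masm_model_define(char model, bool is32bit)
{
    if (is32bit)
        return "/DDOS386";          // 32-bit programs
    if (model == 'T')
        model = 'S';                // Tiny is assembled as Small
    const std::string known = "SMCLV";
    if (known.find(model) == std::string::npos)
        return "";                  // unknown model letter
    return std::string("/DI8086") + model;
}

// Build the full command line in the form shown above.
inline std::string masm_command(const std::string &module,
                                char model, bool is32bit)
{
    return "MASM /MX " + masm_model_define(model, is32bit) + " " + module + ";";
}
```

For example, masm_command("module", 'M', false) produces the string MASM /MX /DI8086M module; and the Tiny model yields the same switch as the Small model.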

The /MX switch is necessary so that all global names are case sensitive. Do not use the /ML switch; it causes some versions of MASM to assemble 8087 opcodes incorrectly.

The /R switch enables the assembling of 8087 opcodes.

DMC++ offers built-in support for MASM (Versions 5.0 and higher; Version 5.1 is recommended). If a file argument to the compiler ends in .asm, the compiler tries to assemble it with MASM. If you specify a memory model, the compiler passes the appropriate define to MASM. The compiler passes -g, -D, -v, and -I options to MASM as the corresponding MASM switches.

Support for 386ASM

DMC also supports the Phar Lap assembler, 386ASM. To assemble test.asm using 386ASM, use:

dmc -mp test

Using Register Variables

DMC++ defines the following register variables:

Table 5-6 Register variables
_EAX  _AX  _AH  _AL
_EBX  _BX  _BH  _BL
_ECX  _CX  _CH  _CL
_EDX  _DX  _DH  _DL
_ESI  _SI
_EDI  _DI
_EBP  _BP
_ESP  _SP

The extended registers are not available in 16-bit compilations.

The register variables have the following types:

Register Variable Types
Register             Type
Byte registers       unsigned char
Word registers       unsigned short
Extended registers   unsigned long
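Each row of Table 5-6 names overlapping views of one x86 register: the word register is the low 16 bits of the extended register, and the two byte registers are the high and low halves of the word register. The sketch below models that relationship in portable C++. The struct Reg32 and its member names are hypothetical, and fixed-width types stand in for the unsigned char, unsigned short, and unsigned long listed above (assumption: unsigned long is 32 bits, as in DMC++, which is not true of every compiler).

```cpp
#include <cassert>   // assert, for the usage shown after the listing
#include <cstdint>

// Model of one row of the register table, for example _EAX/_AX/_AH/_AL.
struct Reg32 {
    std::uint32_t e;   // extended register, for example _EAX

    // Word register: low 16 bits of the extended register (_AX).
    std::uint16_t x() const { return static_cast<std::uint16_t>(e & 0xFFFFu); }

    // High byte of the word register (_AH).
    std::uint8_t h() const { return static_cast<std::uint8_t>((e >> 8) & 0xFFu); }

    // Low byte of the word register (_AL).
    std::uint8_t l() const { return static_cast<std::uint8_t>(e & 0xFFu); }
};
```

With the extended register holding 0x12345678, the word view is 0x5678, the high byte is 0x56, and the low byte is 0x78.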

Keep the following limitations in mind when you use register variables:

Using the __emit__ Function

The __emit__ function lets you insert inline machine instructions into your program in byte pairs. Although of limited usefulness in writing large routines (use the inline assembler instead), the __emit__ function is comparable to the inline assembler for implementing simple functions.

Note: The __emit__ function replaces the asm() function supported in Zortech 3.1.

Calls to __emit__ have the form:

__emit__(arg1, arg2, ...);

The type of each argument determines the number of bytes stored, with this exception: If the argument is of type int and has a value in the range 0 to 255, only one byte is stored. Therefore, to store sizeof(int) bytes, cast the argument to unsigned:

__emit__(1,(unsigned) 23,6);

or use the u postfix:

__emit__(1,23u, 6);
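The size rule for __emit__ arguments can be modeled with ordinary overloads. The sketch below does not call __emit__ itself; emit_size is a hypothetical helper that only reports how many bytes an argument would store under the rule stated above, using the sizes of the compiler that builds the sketch.

```cpp
#include <cassert>   // assert, for the usage shown after the listing
#include <cstddef>

// General rule: the type of the argument determines the number of bytes.
template <class T>
inline std::size_t emit_size(T)
{
    return sizeof(T);
}

// Exception: an int in the range 0 to 255 stores only one byte.
inline std::size_t emit_size(int v)
{
    return (v >= 0 && v <= 255) ? 1 : sizeof(int);
}
```

Under this model, the call __emit__(1, (unsigned) 23, 6) stores 1 + sizeof(unsigned) + 1 bytes, while writing 23 without the cast or the u postfix would store a single byte for that argument.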
