Everything You Never Wanted To Know About DLLs

I've recently had cause to investigate how dynamic linking is implemented on Windows. This post is basically a brain dump of everything I've learnt on the issue. This is mostly for my future reference, but I hope it will be useful to others too as I'm going to bring together lots of information you would otherwise have to hunt around for.

Without further ado, here we go:

Export and import directories

The Windows executable loader is responsible for doing all dynamic loading and symbol resolution before running the code. The loader works out which functions are exported or imported by each image (an image is a DLL or EXE file) by inspecting the .edata and .idata sections of those images, respectively.

The contents of these sections are covered in detail by the PE/COFF specification.

The .edata section

This section records the exports of the image (yes, EXEs can export things). This takes the form of:

  • The export address table: an array of length N holding the addresses of the exported functions/data (the addresses are stored relative to the image base). Indexes into this table are called ordinals.
  • The export name pointer table: an array of length M holding pointers to strings that represent the name of an export. This array is lexically ordered by name, to allow binary searches for a given export.
  • The export ordinal table: a parallel array of length M holding the ordinal of the corresponding name in the export name pointer table.

(As an alternative to importing an image's export by its name, it is possible to import by specifying an ordinal. Importing by ordinal is slightly faster at runtime because the dynamic linker doesn't have to do a lookup. Furthermore, if the export is not given a name by the exporting DLL, importing by ordinal is the only way to do the import.)
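
To make the relationship between the three tables concrete, here is a minimal sketch of the two lookup paths. It is a simulation, not loader code: the ExportTables struct, the function names and the demo values are all invented for illustration, the tables hold plain C values rather than RVAs, and the ordinal base is ignored so that an ordinal is simply an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory model of the three export tables. A real .edata
 * section stores RVAs rather than C pointers; plain arrays keep the sketch
 * portable. */
typedef struct {
    const uint32_t    *addresses;   /* export address table, length n_addresses */
    size_t             n_addresses;
    const char *const *names;       /* export name pointer table, sorted by name */
    const uint16_t    *ordinals;    /* export ordinal table, parallel to names */
    size_t             n_names;
} ExportTables;

/* Import by ordinal: a direct index, no searching. Returns 0 if out of range. */
uint32_t lookup_by_ordinal(const ExportTables *t, uint16_t ordinal) {
    assert(t != NULL);
    return ordinal < t->n_addresses ? t->addresses[ordinal] : 0;
}

/* Import by name: binary search the sorted name table, then go through the
 * ordinal table to reach the address table. Returns 0 if not found. */
uint32_t lookup_by_name(const ExportTables *t, const char *name) {
    size_t lo = 0, hi = t->n_names;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(name, t->names[mid]);
        if (cmp == 0)
            return lookup_by_ordinal(t, t->ordinals[mid]);
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return 0;
}

/* Demo tables: three exports, of which the one at index 1 has no name. */
const uint32_t    demo_addresses[] = { 0x200c, 0x3000, 0x10f0 };
const char *const demo_names[]     = { "data_export", "function_export" };
const uint16_t    demo_ordinals[]  = { 0, 2 };
const ExportTables demo_tables = {
    demo_addresses, 3, demo_names, demo_ordinals, 2
};
```

With these demo tables, lookup_by_name(&demo_tables, "function_export") goes name -> ordinal 2 -> address, whereas the unnamed export at index 1 can only be reached with lookup_by_ordinal.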

How does the .edata section get created in the first place? There are two main methods:

  1. Most commonly, they start life in the object files created by compiling some source code that defines a function/some data that was declared with the __declspec(dllexport) modifier. The compiler just emits an appropriate .edata section naming these exports.

  2. Less commonly, the programmer might write a .def file specifying which functions they would like to export. By supplying this to dlltool --output-exp, an export file can be generated. An export file is just an object file which only contains a .edata section, exporting (via some unresolved references that will be filled in by the linker in the usual way) the symbols named in the .def file. This export file must be named by the programmer when they come to link their object files into a DLL.

In both these cases, the linker collects the .edata sections from all objects named on the link line to build the .edata for the overall image file. One last possible way that the .edata can be created is by the linker itself, without having to put .edata into any object files:

  1. The linker could choose to export all symbols defined by object files named on the link line. For example, this is the default behaviour of GNU ld when nothing has been exported explicitly (the behaviour can also be asked for using --export-all-symbols). In this case, the linker generates the .edata section itself. (GNU ld also supports specifying a .def file on the command line, in which case the generated section will export just those things named by the .def).

The .idata section

The .idata section records those things that the image imports. It consists of:

  • For every image from which symbols are imported:

    • The filename of the image. Used by the dynamic linker to locate it on disk.

    • The import lookup table: an array of length N, in which each entry is either an ordinal or a pointer to a string representing the name to import.

    • The import address table: an array of N pointers. The dynamic linker is responsible for filling out this array with the address of the function/data named by the corresponding symbol in the import lookup table.
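
The dynamic linker's job with these tables can be sketched as a small simulation. Everything here is hypothetical (the struct names, the tagged LookupEntry, and the fake exports); the real on-disk format packs the ordinal-or-name choice into a single word and uses RVAs, not C pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One import lookup table entry: either an ordinal or a name.
 * (Hypothetical tagged struct, chosen for readability.) */
typedef struct {
    int         by_ordinal;  /* non-zero: use `ordinal`; zero: use `name` */
    uint16_t    ordinal;
    const char *name;
} LookupEntry;

/* Everything imported from one image. */
typedef struct {
    const char        *dll_name;   /* filename the loader searches for */
    const LookupEntry *lookup;     /* import lookup table, length n */
    void             **addresses;  /* import address table, length n */
    size_t             n;
} ImportDescriptor;

/* Stand-in for a loaded DLL: exports indexed by ordinal, optionally named. */
typedef struct {
    const char *name;     /* NULL for an export with no name */
    void       *address;
} LoadedExport;

/* Fill the import address table. Returns the number of unresolved entries. */
size_t bind_imports(const ImportDescriptor *imp,
                    const LoadedExport *exports, size_t n_exports) {
    size_t unresolved = 0;
    assert(imp != NULL && exports != NULL);
    for (size_t i = 0; i < imp->n; i++) {
        const LookupEntry *e = &imp->lookup[i];
        void *found = NULL;
        if (e->by_ordinal) {
            if (e->ordinal < n_exports)
                found = exports[e->ordinal].address;
        } else {
            /* Linear scan for brevity; the sorted name table allows a
             * binary search instead. */
            for (size_t j = 0; j < n_exports; j++) {
                if (exports[j].name && strcmp(exports[j].name, e->name) == 0) {
                    found = exports[j].address;
                    break;
                }
            }
        }
        imp->addresses[i] = found;
        if (!found)
            unresolved++;
    }
    return unresolved;
}

/* Demo: a fake library.dll with two data exports, one imported by name and
 * one by ordinal. */
int export_a = 1, export_b = 2;
const LoadedExport demo_exports[] = { { "a", &export_a }, { NULL, &export_b } };
const LookupEntry  demo_lookup[]  = { { 0, 0, "a" }, { 1, 1, NULL } };
void *demo_iat[2];
const ImportDescriptor demo_import = { "library.dll", demo_lookup, demo_iat, 2 };
```

After bind_imports(&demo_import, demo_exports, 2) returns, demo_iat[0] and demo_iat[1] hold the addresses of export_a and export_b, which is the state the import address table is in once the image has been loaded.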

The ways in which .idata entries are created are as follows:

  1. Most commonly, they originate in a library of object files called an import library. This import library can be created by using dlltool on the DLL you wish to import from, or on a .def file of the type we discussed earlier. Just like the export file, the import library must be named by the user on the link line.

  2. Alternatively, some linkers (like GNU ld) let you specify a DLL directly on the link line. The linker will automatically generate .idata entries for any symbols that you must import from the DLL.

Notice that unlike the case when we were exporting symbols, __declspec(dllimport) does not cause .idata sections to be generated.

Import libraries are a bit more complicated than they first appear. The Windows dynamic loader fills the import address table with the addresses of the imported symbols (say, the address of a function Func). However, when the assembly code in other object files says call Func, it expects Func to name the address of that code. But we don't know that address until runtime: the only thing we know statically is the address where that address will be placed by the dynamic linker. We will call this address __imp__Func.

To deal with this extra level of indirection, the import library exports a function Func that just dereferences __imp__Func (to get the actual function pointer) and then jmps to it. All of the other object files in the project can now say call Func just as they would if Func had been defined in some other object file, rather than a DLL. For this reason, saying __declspec(dllimport) in the declaration of a dynamically linked function is optional (though in fact you will get slightly more efficient code if you add them, as we will see later).
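
The stub can be mimicked in portable C. This is only an analogy: real_Func, imp_Func and fake_load are invented names, and the real stub is a single indirect jmp rather than a C function that calls through a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real function living inside the DLL. */
int real_Func(int x) { return x + 1; }

/* Stand-in for __imp__Func: a slot in the import address table. It is empty
 * in the file and filled in by the dynamic linker at load time. */
int (*imp_Func)(int) = NULL;

/* Stand-in for the stub the import library provides under the name Func. */
int Func(int x) {
    assert(imp_Func != NULL);  /* the image must have been "loaded" first */
    return imp_Func(x);
}

/* What the dynamic linker does, reduced to one assignment. */
void fake_load(void) { imp_Func = real_Func; }

/* Without dllimport, the caller goes through the stub... */
int call_via_stub(int x) { return Func(x); }

/* ...with dllimport, the compiler calls through the slot directly. */
int call_via_slot(int x) { return (*imp_Func)(x); }
```

Both call paths reach real_Func once fake_load has run; the second simply skips one jump, which is the efficiency gain alluded to above.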

Unfortunately, there is no equivalent trick if you want to import data from another DLL. If we have some imported data myData, there is no way the import library can be defined so that a mov %eax, myData in an object file linked against it writes to the storage for myData in that DLL. Instead, the import library defines a symbol __imp__myData that resolves to the address at which the linked-in address of the storage can be found. The compiler then ensures that when you read or write a variable declared with __declspec(dllimport), those reads and writes go through the __imp__myData indirection. Because different code needs to be generated at the use site, __declspec declarations on data imports are not optional.

Practical example

Theory is all very well but it can be helpful to see all the pieces in play.

Building a DLL

First, let's build a simple DLL exporting both functions and data. For maximum clarity, we'll use an explicit export file rather than decorating our functions with __declspec(dllexport) or supplying a .def file to the linker.

First, let's write the .def file, library.def:

LIBRARY library
EXPORTS
   function_export
   data_export      DATA

(The DATA keyword and LIBRARY line only affect how the import library is generated, as explained later on. Ignore them for now.)

Build an export file from that:

$ dlltool --output-exp library_exports.o -d library.def

The resulting object basically just contains an .edata section that exports the symbols _data_export and _function_export under the names data_export and function_export respectively:

$ objdump -xs library_exports.o

...

There is an export table in .edata at 0x0

The Export Tables (interpreted .edata section contents)

Export Flags                    0
Time/Date stamp                 4e10e5c1
Major/Minor                     0/0
Name                            00000028 library_exports.o.dll
Ordinal Base                    1
Number in:
        Export Address Table            00000002
        [Name Pointer/Ordinal] Table    00000002
Table Addresses
        Export Address Table            00000040
        Name Pointer Table              00000048
        Ordinal Table                   00000050

Export Address Table -- Ordinal Base 1

[Ordinal/Name Pointer] Table
        [   0] data_export
        [   1] function_export

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000000  2**2
                  ALLOC
  3 .edata        00000070  00000000  00000000  000000b4  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
SYMBOL TABLE:
[  0](sec -2)(fl 0x00)(ty   0)(scl 103) (nx 1) 0x00000000 fake
File
[  2](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000028 name
[  3](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000040 afuncs
[  4](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000048 anames
[  5](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000050 anords
[  6](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000054 n1
[  7](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000060 n2
[  8](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 10](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 12](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 14](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .edata
AUX scnlen 0x70 nreloc 8 nlnno 0
[ 16](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 _data_export
[ 17](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 _function_export


RELOCATION RECORDS FOR [.edata]:
OFFSET   TYPE              VALUE
0000000c rva32             .edata
0000001c rva32             .edata
00000020 rva32             .edata
00000024 rva32             .edata
00000040 rva32             _data_export
00000044 rva32             _function_export
00000048 rva32             .edata
0000004c rva32             .edata


Contents of section .edata:
 0000 00000000 c1e5104e 00000000 28000000  .......N....(...
 0010 01000000 02000000 02000000 40000000  ............@...
 0020 48000000 50000000 6c696272 6172795f  H...P...library_
 0030 6578706f 7274732e 6f2e646c 6c000000  exports.o.dll...
 0040 00000000 00000000 54000000 60000000  ........T...`...
 0050 00000100 64617461 5f657870 6f727400  ....data_export.
 0060 66756e63 74696f6e 5f657870 6f727400  function_export.
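
It can be instructive to decode that hex dump by hand. The sketch below embeds the same 0x70 bytes and reads out the export directory fields in the order the interpreted listing shows them. The helper names (rd32, rd16, read_export_directory and so on) are my own, and the code assumes the relocations have not been applied yet, so each "RVA" is still just an offset from the start of the section.

```c
#include <assert.h>
#include <stdint.h>

/* The 0x70 bytes of .edata as dumped above. */
const uint8_t edata[0x70] = {
    /* 0000 */ 0x00,0x00,0x00,0x00, 0xc1,0xe5,0x10,0x4e,
               0x00,0x00,0x00,0x00, 0x28,0x00,0x00,0x00,
    /* 0010 */ 0x01,0x00,0x00,0x00, 0x02,0x00,0x00,0x00,
               0x02,0x00,0x00,0x00, 0x40,0x00,0x00,0x00,
    /* 0020 */ 0x48,0x00,0x00,0x00, 0x50,0x00,0x00,0x00,
               'l','i','b','r','a','r','y','_',
    /* 0030 */ 'e','x','p','o','r','t','s','.','o','.','d','l','l',0,0,0,
    /* 0040 */ 0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
               0x54,0x00,0x00,0x00, 0x60,0x00,0x00,0x00,
    /* 0050 */ 0x00,0x00, 0x01,0x00,
               'd','a','t','a','_','e','x','p','o','r','t',0,
    /* 0060 */ 'f','u','n','c','t','i','o','n','_','e','x','p','o','r','t',0
};

/* Little-endian reads at a byte offset into the section. */
uint32_t rd32(uint32_t off) {
    assert(off + 4 <= sizeof edata);
    return (uint32_t)edata[off]
         | (uint32_t)edata[off + 1] << 8
         | (uint32_t)edata[off + 2] << 16
         | (uint32_t)edata[off + 3] << 24;
}

uint16_t rd16(uint32_t off) {
    assert(off + 2 <= sizeof edata);
    return (uint16_t)(edata[off] | edata[off + 1] << 8);
}

/* The export directory header, field by field. */
typedef struct {
    uint32_t flags, timestamp;
    uint16_t major, minor;
    uint32_t name, ordinal_base, n_addresses, n_names;
    uint32_t address_table, name_pointer_table, ordinal_table;
} ExportDirectory;

ExportDirectory read_export_directory(void) {
    ExportDirectory d;
    d.flags              = rd32(0);
    d.timestamp          = rd32(4);
    d.major              = rd16(8);
    d.minor              = rd16(10);
    d.name               = rd32(12);
    d.ordinal_base       = rd32(16);
    d.n_addresses        = rd32(20);
    d.n_names            = rd32(24);
    d.address_table      = rd32(28);
    d.name_pointer_table = rd32(32);
    d.ordinal_table      = rd32(36);
    return d;
}

/* Name recorded for the image itself. */
const char *image_name(void) {
    return (const char *)&edata[read_export_directory().name];
}

/* i-th entry of the export name pointer table. */
const char *export_name(uint32_t i) {
    ExportDirectory d = read_export_directory();
    assert(i < d.n_names);
    return (const char *)&edata[rd32(d.name_pointer_table + 4 * i)];
}

/* i-th entry of the export ordinal table (an index into the address table). */
uint16_t export_index(uint32_t i) {
    ExportDirectory d = read_export_directory();
    assert(i < d.n_names);
    return rd16(d.ordinal_table + 2 * i);
}
```

Note that the two export address table slots at offset 0x40 are still zero here: they are exactly the places the rva32 relocations against _data_export and _function_export will fill in at link time.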

We'll fulfil these symbols with a trivial implementation of the DLL, library.c:

int data_export = 42;

int function_export() {
    return 1337 + data_export;
}

We can put it together into a DLL:

$ gcc -shared -o library.dll library.c library_exports.o

The export table for the DLL is as follows, showing that we have exported what we wanted:

The Export Tables (interpreted .edata section contents)

Export Flags                    0
Time/Date stamp                 4e10e5c1
Major/Minor                     0/0
Name                            00005028 library_exports.o.dll
Ordinal Base                    1
Number in:
        Export Address Table            00000002
        [Name Pointer/Ordinal] Table    00000002
Table Addresses
        Export Address Table            00005040
        Name Pointer Table              00005048
        Ordinal Table                   00005050

Export Address Table -- Ordinal Base 1
        [   0] +base[   1] 200c Export RVA
        [   1] +base[   2] 10f0 Export RVA

[Ordinal/Name Pointer] Table
        [   0] data_export
        [   1] function_export
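
For comparison, here is a sketch of the more common route described earlier, where the compiler emits the export information itself and no separate export file is needed. The EXPORT macro is a hypothetical convenience of mine, there only so the file still compiles on toolchains that do not understand __declspec.

```c
/* EXPORT is a hypothetical helper macro, not part of any toolchain. */
#if defined(_WIN32) || defined(__CYGWIN__)
#  define EXPORT __declspec(dllexport)
#else
#  define EXPORT
#endif

EXPORT int data_export = 42;

EXPORT int function_export(void) {
    return 1337 + data_export;
}
```

Assuming the same toolchain as above, building this with gcc -shared and no library_exports.o on the link line should give a DLL exporting the same two names.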

Using the DLL

When we come to look at using the DLL, things become a lot more interesting. First, we need an import library:

$ dlltool --output-lib library.dll.a -d library.def

(The reason that we have an import library but an export object is that using a library for the imports allows the linker to discard .idata for any imports that are not used. Contrariwise, the linker can never discard any .edata entry because any export may potentially be used by a user of the DLL).

This import library is rather complex. It contains one object for each export (disds00000.o and disds00001.o) but also two other object files (disdt.o and disdh.o) that set up the header and footer of the import list. (The header of the import list contains, among other things, the name of the DLL to link in at runtime, as derived from the LIBRARY line of the .def file.)


$ objdump -xs library.dll.a
In archive library.dll.a:

disdt.o:     file format pe-i386

...

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000000  2**2
                  ALLOC
  3 .idata$4      00000004  00000000  00000000  00000104  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  4 .idata$5      00000004  00000000  00000000  00000108  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  5 .idata$7      0000000c  00000000  00000000  0000010c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
[  0](sec -2)(fl 0x00)(ty   0)(scl 103) (nx 1) 0x00000000 fake
File
[  2](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0
[  4](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[  6](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[  8](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .idata$4
AUX scnlen 0x4 nreloc 0 nlnno 0
[ 10](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .idata$5
AUX scnlen 0x4 nreloc 0 nlnno 0
[ 12](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .idata$7
AUX scnlen 0x7 nreloc 0 nlnno 0
[ 14](sec  6)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __library_dll_a_iname


Contents of section .idata$4:
 0000 00000000                             ....
Contents of section .idata$5:
 0000 00000000                             ....
Contents of section .idata$7:
 0000 6c696272 6172792e 646c6c00           library.dll.

disdh.o:     file format pe-i386

...

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000000  2**2
                  ALLOC
  3 .idata$2      00000014  00000000  00000000  00000104  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
  4 .idata$5      00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
  5 .idata$4      00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
SYMBOL TABLE:
[  0](sec -2)(fl 0x00)(ty   0)(scl 103) (nx 1) 0x00000000 fake
File
[  2](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 hname
[  3](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 fthunk
[  4](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0
[  6](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[  8](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 10](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .idata$2
AUX scnlen 0x14 nreloc 3 nlnno 0
[ 12](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$4
[ 13](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$5
[ 14](sec  4)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __head_library_dll_a
[ 15](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __library_dll_a_iname


RELOCATION RECORDS FOR [.idata$2]:
OFFSET   TYPE              VALUE
00000000 rva32             .idata$4
0000000c rva32             __library_dll_a_iname
00000010 rva32             .idata$5


Contents of section .idata$2:
 0000 00000000 00000000 00000000 00000000  ................
 0010 00000000                             ....

disds00001.o:     file format pe-i386

...

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000008  00000000  00000000  0000012c  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000000  2**2
                  ALLOC
  3 .idata$7      00000004  00000000  00000000  00000134  2**2
                  CONTENTS, RELOC
  4 .idata$5      00000004  00000000  00000000  00000138  2**2
                  CONTENTS, RELOC
  5 .idata$4      00000004  00000000  00000000  0000013c  2**2
                  CONTENTS, RELOC
  6 .idata$6      00000012  00000000  00000000  00000140  2**1
                  CONTENTS
SYMBOL TABLE:
[  0](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .text
[  1](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .data
[  2](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .bss
[  3](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$7
[  4](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$5
[  5](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$4
[  6](sec  7)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$6
[  7](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 _function_export
[  8](sec  5)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __imp__function_export
[  9](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __head_library_dll_a


RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
00000002 dir32             .idata$5


RELOCATION RECORDS FOR [.idata$7]:
OFFSET   TYPE              VALUE
00000000 rva32             __head_library_dll_a


RELOCATION RECORDS FOR [.idata$5]:
OFFSET   TYPE              VALUE
00000000 rva32             .idata$6


RELOCATION RECORDS FOR [.idata$4]:
OFFSET   TYPE              VALUE
00000000 rva32             .idata$6


Contents of section .text:
 0000 ff250000 00009090                    .%......
Contents of section .idata$7:
 0000 00000000                             ....
Contents of section .idata$5:
 0000 00000000                             ....
Contents of section .idata$4:
 0000 00000000                             ....
Contents of section .idata$6:
 0000 01006675 6e637469 6f6e5f65 78706f72  ..function_expor
 0010 7400                                 t.

disds00000.o:     file format pe-i386

...

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, READONLY, CODE
  1 .data         00000000  00000000  00000000  00000000  2**2
                  ALLOC, LOAD, DATA
  2 .bss          00000000  00000000  00000000  00000000  2**2
                  ALLOC
  3 .idata$7      00000004  00000000  00000000  0000012c  2**2
                  CONTENTS, RELOC
  4 .idata$5      00000004  00000000  00000000  00000130  2**2
                  CONTENTS, RELOC
  5 .idata$4      00000004  00000000  00000000  00000134  2**2
                  CONTENTS, RELOC
  6 .idata$6      0000000e  00000000  00000000  00000138  2**1
                  CONTENTS
SYMBOL TABLE:
[  0](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .text
[  1](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .data
[  2](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .bss
[  3](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$7
[  4](sec  5)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$5
[  5](sec  6)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$4
[  6](sec  7)(fl 0x00)(ty   0)(scl   3) (nx 0) 0x00000000 .idata$6
[  7](sec  5)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __imp__data_export
[  8](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __head_library_dll_a


RELOCATION RECORDS FOR [.idata$7]:
OFFSET   TYPE              VALUE
00000000 rva32             __head_library_dll_a


RELOCATION RECORDS FOR [.idata$5]:
OFFSET   TYPE              VALUE
00000000 rva32             .idata$6


RELOCATION RECORDS FOR [.idata$4]:
OFFSET   TYPE              VALUE
00000000 rva32             .idata$6


Contents of section .idata$7:
 0000 00000000                             ....
Contents of section .idata$5:
 0000 00000000                             ....
Contents of section .idata$4:
 0000 00000000                             ....
Contents of section .idata$6:
 0000 00006461 74615f65 78706f72 7400      ..data_export.

Note that the object corresponding to data_export has an empty .text section, whereas function_export does define some code. If we disassemble it we get this:

00000000 <_function_export>:
   0:   ff 25 00 00 00 00       jmp    *0x0
                        2: dir32        .idata$5
   6:   90                      nop
   7:   90                      nop

The relocation of type dir32 tells the linker how to fill in the address being dereferenced by the jmp. We can see that _function_export, when entered, will jump directly to the function at the address loaded from the memory named .idata$5. Inspection of the complete .idata section satisfies us that .idata$5 corresponds to the address of the fragment of the import address table corresponding to the function_export import name, and hence the address where the absolute address of the loaded function_export import can be found.

Although only function_export gets a corresponding _function_export function, both of the exports have led to a symbol with the __imp__ prefix (__imp__data_export and __imp__function_export) being defined in the import library. As discussed before, this symbol stands for the address at which the pointer to the data/function will be inserted by the dynamic linker. As such, the __imp__ symbols always point directly into the import address table.

With an import library in hand, we are capable of writing some client code that uses our exports, main1.c:

#include <stdio.h>

__declspec(dllimport) extern int function_export(void);
__declspec(dllimport) extern int data_export;

int main(int argc, char **argv) {
    printf("%d\n", function_export());
    printf("%d\n", data_export);

    data_export++;

    printf("%d\n", function_export());
    printf("%d\n", data_export);

    return 0;
}

Build and link it against the import library and we will get the results we expect:

$ gcc main1.c library.dll.a -o main1 && ./main1
1379
42
1380
43

This works even though there is no data_export symbol defined by library.dll.a because the __declspec(dllimport) qualifier on our data_export declaration in main1.c has caused the compiler to generate code that uses the __imp__data_export symbol directly, as we can see if we disassemble the generated code:

$ gcc -c main1.c -o main1.o && objdump --disassemble -r main1.o

main1.o:     file format pe-i386


Disassembly of section .text:

00000000 <_main>:
   0:   8d 4c 24 04             lea    0x4(%esp),%ecx
   4:   83 e4 f0                and    $0xfffffff0,%esp
   7:   ff 71 fc                pushl  -0x4(%ecx)
   a:   55                      push   %ebp
   b:   89 e5                   mov    %esp,%ebp
   d:   51                      push   %ecx
   e:   83 ec 14                sub    $0x14,%esp
  11:   e8 00 00 00 00          call   16 <_main+0x16>
                        12: DISP32      ___main
  16:   a1 00 00 00 00          mov    0x0,%eax
                        17: dir32       __imp__function_export
  1b:   ff d0                   call   *%eax
  1d:   89 44 24 04             mov    %eax,0x4(%esp)
  21:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        24: dir32       .rdata
  28:   e8 00 00 00 00          call   2d <_main+0x2d>
                        29: DISP32      _printf
  2d:   a1 00 00 00 00          mov    0x0,%eax
                        2e: dir32       __imp__data_export
  32:   8b 00                   mov    (%eax),%eax
  34:   89 44 24 04             mov    %eax,0x4(%esp)
  38:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        3b: dir32       .rdata
  3f:   e8 00 00 00 00          call   44 <_main+0x44>
                        40: DISP32      _printf
  44:   a1 00 00 00 00          mov    0x0,%eax
                        45: dir32       __imp__data_export
  49:   8b 00                   mov    (%eax),%eax
  4b:   8d 50 01                lea    0x1(%eax),%edx
  4e:   a1 00 00 00 00          mov    0x0,%eax
                        4f: dir32       __imp__data_export
  53:   89 10                   mov    %edx,(%eax)
  55:   a1 00 00 00 00          mov    0x0,%eax
                        56: dir32       __imp__function_export
  5a:   ff d0                   call   *%eax
  5c:   89 44 24 04             mov    %eax,0x4(%esp)
  60:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        63: dir32       .rdata
  67:   e8 00 00 00 00          call   6c <_main+0x6c>
                        68: DISP32      _printf
  6c:   a1 00 00 00 00          mov    0x0,%eax
                        6d: dir32       __imp__data_export
  71:   8b 00                   mov    (%eax),%eax
  73:   89 44 24 04             mov    %eax,0x4(%esp)
  77:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        7a: dir32       .rdata
  7e:   e8 00 00 00 00          call   83 <_main+0x83>
                        7f: DISP32      _printf
  83:   b8 00 00 00 00          mov    $0x0,%eax
  88:   83 c4 14                add    $0x14,%esp
  8b:   59                      pop    %ecx
  8c:   5d                      pop    %ebp
  8d:   8d 61 fc                lea    -0x4(%ecx),%esp
  90:   c3                      ret
  91:   90                      nop
  92:   90                      nop
  93:   90                      nop

In fact, we can see that the generated code doesn't even use the _function_export symbol, preferring __imp__function_export. Essentially, the code of the _function_export symbol in the import library has been inlined at every use site. This is why using __declspec(dllimport) can improve performance of cross-DLL calls, even though it is entirely optional on function declarations.

We might wonder what happens if we drop the __declspec(dllimport) qualifier on our declarations. Because of our discussion about the difference between data and function imports earlier, you might expect linking to fail. Our test file, main2.c, is:

#include <stdio.h>

extern int function_export(void);
extern int data_export;

int main(int argc, char **argv) {
    printf("%d\n", function_export());
    printf("%d\n", data_export);

    data_export++;

    printf("%d\n", function_export());
    printf("%d\n", data_export);

    return 0;
}

Let's try it out:

$ gcc main2.c library.dll.a -o main2 && ./main2
1379
42
1380
43

What the hell -- it worked? This is a bit surprising. It works, despite the import library library.dll.a not defining the _data_export symbol, because of a nifty feature of GNU ld called auto-import. Without auto-import the link fails as we would expect:

$ gcc main2.c library.dll.a -o main2 -Wl,--disable-auto-import && ./main2
/tmp/ccGd8Urx.o:main2.c:(.text+0x2c): undefined reference to `_data_export'
/tmp/ccGd8Urx.o:main2.c:(.text+0x41): undefined reference to `_data_export'
/tmp/ccGd8Urx.o:main2.c:(.text+0x49): undefined reference to `_data_export'
/tmp/ccGd8Urx.o:main2.c:(.text+0x63): undefined reference to `_data_export'
collect2: ld returned 1 exit status

The Microsoft linker does not implement auto-import, so this is the error you would get if you were using the Microsoft toolchain.

However, there is a way to write client code that does not depend on auto-import or use the __declspec(dllimport) keyword. Our new client, main3.c, is as follows:

#include <stdio.h>

extern int (*_imp__function_export)(void);
extern int *_imp__data_export;

#define function_export (*_imp__function_export)
#define data_export (*_imp__data_export)

int main(int argc, char **argv) {
    printf("%d\n", function_export());
    printf("%d\n", data_export);

    data_export++;

    printf("%d\n", function_export());
    printf("%d\n", data_export);

    return 0;
}

In this code, we directly use the __imp__-prefixed symbols from the import library. These name an address at which the real address of the import can be found, which is reflected by our C-preprocessor definitions of data_export and function_export.

This code compiles perfectly even without auto-import:

$ gcc main3.c library.dll.a -o main3 -Wl,--disable-auto-import && ./main3
1379
42
1380
43

If you have followed along until this point you should have a solid understanding of how DLL import and export are implemented on Windows.

How auto-import works

As a bonus, I'm going to explain how auto-import is implemented by the GNU linker. It is a rather cute hack you may get a kick out of.

As a reminder, auto-import is a feature of the linker that allows the programmer to declare an item of DLL-imported data with a simple extern keyword, without having to explicitly use __declspec(dllimport). This is extremely convenient because this is exactly how most *nix source code declares symbols it expects to import from a shared library, so by supporting this use case that *nix code becomes more portable to Windows.

Auto-import kicks in whenever the linker finds an object file making use of a symbol foo which is not defined by any other object in the link, but where a symbol __imp_foo is defined by some object. In this case, it assumes that the use of foo is an attempt to access some DLL-imported data item called foo.

Now, the problem is that the linker needs to replace the use of foo with the address of foo itself. However, all we seem to know statically is an address where that address will be placed at runtime (__imp_foo). To square the circle, the linker plays a clever trick.

The trick is to extend the .idata of the image being created with an entry for a "new" DLL. The new entry is set up as follows:

  • The filename of the image being imported is set to the same filename as the .idata entry covering __imp_foo. So if __imp_foo was being filled out by an address in Bar.dll, our new .idata entry will use Bar.dll here.

  • The import lookup table is of length 1, whose sole entry is a pointer to the name of the imported symbol corresponding to __imp_foo. So if __imp_foo is filled out by the address of the foo export from Bar.dll, the name of the symbol we put in here will be foo.

  • The import address table is of length 1 -- and here is the clever bit -- is located precisely at the location in the object file that was referring to the (undefined) symbol foo.

This solution neatly defers the task of filling out the address that the object file wants to the dynamic linker. The reason that the linker can play this trick is that it can see all of the object code that goes into the final image, and can thus fix all of the sites that need to refer to the imported data.
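
The trick is easier to see in a toy model. The sketch below is a simulation with invented names (CodeSite, AutoImportEntry, fake_load); real object code embeds the address as an instruction operand, whereas here it is just a pointer-sized field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a data item exported by Bar.dll. */
int foo_in_dll = 42;

/* Stand-in for compiled code that refers to foo: the "operand" is the
 * address the linker could not resolve statically. */
typedef struct {
    int *foo_address;
} CodeSite;

/* Hypothetical model of the extra one-symbol import entry described above. */
typedef struct {
    const char *dll_name;  /* same DLL as the entry covering __imp_foo */
    const char *symbol;    /* import lookup table, length 1 */
    int       **address;   /* import address table, length 1 */
} AutoImportEntry;

CodeSite site = { NULL };

/* The linker's trick: the one-entry import address table *is* the operand. */
AutoImportEntry entry = { "Bar.dll", "foo", &site.foo_address };

/* The dynamic linker knows nothing about the trick; it just fills the table. */
void fake_load(const AutoImportEntry *e) {
    assert(e != NULL);
    if (strcmp(e->dll_name, "Bar.dll") == 0 && strcmp(e->symbol, "foo") == 0)
        *e->address = &foo_in_dll;
}

/* The "code" then uses its operand as if foo had been linked statically. */
int read_foo(const CodeSite *s) { return *s->foo_address; }
```

Nothing in fake_load is special-cased for auto-import: it writes to whatever location the import address table pointer names, and the linker has arranged for that location to be inside the code.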

Note that in general the final image's .idata will contain several entries for the same DLL: one from the import library, and one for every place in any object file in the link which referred to some data exported by the DLL. Although this is somewhat unusual behaviour, the Windows dynamic linker has no problem with there being several imports of the same DLL.

A wrinkle

Unfortunately, the scheme described above only works if the object code has an undefined reference to foo itself. What if instead it has a reference to foo+N, an address N bytes after the address of foo itself? There is no way to set up the .idata so that the dynamic linker adds a constant to the address it fills in, so we seem to be stuck.

Alas, such relocations are reasonably common, and originate from code that accesses a field of a DLL-imported structure type. Cygwin actually contains another hack to make auto-import work in such cases, known as "pseudo-relocations". If you want to know the details of how these work, there is more information in the original thread on the topic.
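
I won't reproduce the real mechanism here, but the general idea of a startup-time fix-up can be sketched as follows. This is purely illustrative: the PseudoReloc record, its fields and the function names are invented for the sketch and are not taken from the Cygwin or binutils sources. The assumption is that the dynamic linker writes the address of foo into the site as before, and some code that runs before main then adds the constant N.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a structure exported as data by a DLL. */
struct Pair { int first; int second; };
struct Pair foo_in_dll = { 1, 2 };

/* Operand inside the "code" that should end up holding foo + N,
 * where N is the offset of the field being accessed. */
uintptr_t site;

/* Hypothetical fix-up record: add `addend` to whatever is at `target`. */
typedef struct {
    ptrdiff_t  addend;
    uintptr_t *target;
} PseudoReloc;

PseudoReloc relocs[] = { { offsetof(struct Pair, second), &site } };

/* Step 1: the dynamic linker fills the site with the address of foo itself. */
void fake_load(void) { site = (uintptr_t)&foo_in_dll; }

/* Step 2: startup code walks the records and adds each constant. */
void apply_pseudo_relocs(const PseudoReloc *r, size_t n) {
    assert(r != NULL);
    for (size_t i = 0; i < n; i++)
        *r[i].target += (uintptr_t)r[i].addend;
}

/* The "code" dereferences its operand, now pointing at foo.second. */
int read_second(void) { return *(int *)site; }
```

The point of the sketch is only that the missing "add a constant" step moves out of the dynamic linker and into code shipped with the program.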

Conclusion

Dynamic linking on Windows is hairier than it at first appears. I hope this article has gone some way to clearing up the meaning of the mysterious dllimport and dllexport keywords, and to clarifying the role of the import and export libraries.

Linux and friends implement dynamic linking in a totally different manner to Windows. The scheme they use is more flexible and allows more in-memory sharing of code, but incurs a significant runtime penalty (especially on i386). For more details see here and the Dynamic Linking section of the ELF spec.

