Everything You Never Wanted To Know About DLLs
I’ve recently had cause to investigate how dynamic linking is implemented on Windows. This post is basically a brain dump of everything I’ve learnt on the issue. This is mostly for my future reference, but I hope it will be useful to others too as I’m going to bring together lots of information you would otherwise have to hunt around for.
Without further ado, here we go:
Export and import directories
The Windows executable loader is responsible for doing all dynamic loading and symbol resolution before running the code. The linker works out what functions are exported or imported by each image (an image is a DLL or EXE file) by inspecting the .edata
and .idata
sections of those images, respectively.
The contents of these sections is covered in detail by the PE/COFF specification.
The .edata
section
This section records the exports of the image (yes, EXEs can export things). This takes the form of:
-
The export address table: an array of length N holding the addresses of the exported functions/data (the addresses are stored relative to the image base). Indexes into this table are called ordinals.
-
The export name pointer table: an array of length M holding pointers to strings that represent the name of an export. This array is lexically ordered by name, to allow binary searches for a given export.
-
The export ordinal table: a parallel array of length M holding the ordinal of the corresponding name in the export name pointer table.
(As an alternative to importing an image’s export by its name, it is possible to import by specifying an ordinal. Importing by ordinal is slightly faster at runtime because the dynamic linker doesn’t have to do a lookup. Furthermore, if the import is not given a name by the exporting DLL, importing by ordinal is the only way to do the import.)
How does the .edata
section get created in the first place? There are two main methods:
-
Most commonly, they start life in the object files created by compiling some source code that defines a function/some data that was declared with the
__declspec(dllimport)
modifier. The compiler just emits an appropriate.edata
section naming these exports. -
Less commonly, the programmer might write a .def file specifying which functions they would like to export. By supplying this to
dlltool --output-exp
, an export file can be generated. An export file is just an object file which only contains a.edata
section, exporting (via some unresolved references that will be filled in by the linker in the usual way) the symbols named in the .def file. This export library must be named by the programmer when he comes to link together his object files into a DLL.
In both these cases, the linker collects the.edata
sections from all objects named on the link line to build the.edata
for the overall image file. One last possible way that the.edata
can be created is by the linker itself, without having to put.edata
into any object files: -
The linker could choose to export all symbols defined by object files named on the link line. For example, this is the default behaviour of GNU ld (the behaviour can also be explicitly asked for using
–-export-all-symbols
). In this case, the linker generates the.edata
section itself. (GNU ld also supports specifying a .def file on the command line, in which case the generated section will export just those things named by the .def).
The .idata
section
The .idata
section records those things that the image imports. It consists of:
-
For every image from which symbols are imported:
-
The filename of the image. Used by the dynamic linker to locate it on disk.
-
The import lookup table: an array of length N, which each entry is either an ordinal or a pointer to a string representing the name to import.
-
The import address table: an array of N pointers. The dynamic linker is responsible for filling out this array with the address of the function/data named by the corresponding symbol in the import lookup table.
-
The ways in which .idata
entries are created are as follows:
-
Most commonly, they originate in a library of object files called an
import library'. This import library can be created by using
dlltool` on the DLL you wish to export or a .def file of the type we discussed earlier. Just like the export library, the import library must be named by the user on the link line. -
Alternatively, some linkers (like GNU ld) let you specify a DLL directly on the link line. The linker will automatically generate
.idata
entries for any symbols that you must import from the DLL.
Notice that unlike the case when we were exporting symbols, __declspec(dllimport)
does not cause .idata
sections to be generated.
Import libraries are a bit more complicated than they first appear. The Windows dynamic loader fills the import address table with the addresses of the imported symbols (say, the address of a function Func
). However, when the assembly code in other object files says call Func
they expect that Func
to name the address of that code. But we don’t know that address until runtime: the only thing we know statically is the address where that address will be placed by the dynamic linker. We will call this address __imp__Func
.
To deal with this extra level of indirection, the import library exports a function Func
that just dereferences __imp__Func
(to get the actual function pointer) and then jmp
s to it. All of the other object files in the project can now say call Func
just as they would if Func
had been defined in some other object file, rather than a DLL. For this reason, saying __declspec(dllimport)
in the declaration of a dynamically linked function is optional (though in fact you will get slightly more efficient code if you add them, as we will see later).
Unfortunately, there is no equivalent trick if you want to import data from another DLL. If we have some imported data myData
, there is no way the import library can be defined so that a mov $eax, myData
in an object file linked against it writes to the storage for myData
in that DLL. Instead, the import library defines a symbol __imp__myData
that resolves to the address at which the linked-in address of the storage can be found. The compiler then ensures that when you read or write from a variable defined with __declspec(dllimport)
those reads and writes go through the __imp_myData
indirection. Because different code needs to be generated at the use site, __declspec
declarations on data imports are not optional.
Practical example
Theory is all very well but it can be helpful to see all the pieces in play.
Building a DLL
First, lets build a simple DLL exporting both functions and data. For maximum clarity, we’ll use an explicit export library rather instead of decorating our functions with declspec(dllexport)
or supply a .def file to the linker.
First lets write the .def file, library.def
:
LIBRARY library
EXPORTS
function_export
data_export DATA
(The DATA
keyword and LIBRARY
line only affects how the import library is generated, as explained later on. Ignore them for now.)
Build an export file from that:
$ dlltool --output-exp library_exports.o -d library.def
The resulting object basically just contains an .edata
section that exports the symbols _data_export
and _function_export
under the names data_export
and function_export
respectively:
$ objdump -xs library_exports.o
...
There is an export table in .edata at 0x0
The Export Tables (interpreted .edata section contents)
Export Flags 0
Time/Date stamp 4e10e5c1
Major/Minor 0/0
Name 00000028 library_exports.o.dll
Ordinal Base 1
Number in:
Export Address Table 00000002
[Name Pointer/Ordinal] Table 00000002
Table Addresses
Export Address Table 00000040
Name Pointer Table 00000048
Ordinal Table 00000050
Export Address Table -- Ordinal Base 1
[Ordinal/Name Pointer] Table
[ 0] data_export
[ 1] function_export
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000000 2**2
ALLOC
3 .edata 00000070 00000000 00000000 000000b4 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
SYMBOL TABLE:
[ 0](sec -2)(fl 0x00)(ty 0)(scl 103) (nx 1) 0x00000000 fake
File
[ 2](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000028 name
[ 3](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000040 afuncs
[ 4](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000048 anames
[ 5](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000050 anords
[ 6](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000054 n1
[ 7](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000060 n2
[ 8](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 10](sec 2)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 12](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 14](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .edata
AUX scnlen 0x70 nreloc 8 nlnno 0
[ 16](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 _data_export
[ 17](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 _function_export
RELOCATION RECORDS FOR [.edata]:
OFFSET TYPE VALUE
0000000c rva32 .edata
0000001c rva32 .edata
00000020 rva32 .edata
00000024 rva32 .edata
00000040 rva32 _data_export
00000044 rva32 _function_export
00000048 rva32 .edata
0000004c rva32 .edata
Contents of section .edata:
0000 00000000 c1e5104e 00000000 28000000 .......N....(...
0010 01000000 02000000 02000000 40000000 ............@...
0020 48000000 50000000 6c696272 6172795f H...P...library_
0030 6578706f 7274732e 6f2e646c 6c000000 exports.o.dll...
0040 00000000 00000000 54000000 60000000 ........T...`...
0050 00000100 64617461 5f657870 6f727400 ....data_export.
0060 66756e63 74696f6e 5f657870 6f727400 function_export.
We’ll fulfil these symbol with a trivial implementation of the DLL, library.c
:
We can put it together into a DLL:
$ gcc -shared -o library.dll library.c library_exports.o
The export table for the DLL is as follows, showing that we have exported what we wanted:
The Export Tables (interpreted .edata section contents)
Export Flags 0
Time/Date stamp 4e10e5c1
Major/Minor 0/0
Name 00005028 library_exports.o.dll
Ordinal Base 1
Number in:
Export Address Table 00000002
[Name Pointer/Ordinal] Table 00000002
Table Addresses
Export Address Table 00005040
Name Pointer Table 00005048
Ordinal Table 00005050
Export Address Table -- Ordinal Base 1
[ 0] +base[ 1] 200c Export RVA
[ 1] +base[ 2] 10f0 Export RVA
[Ordinal/Name Pointer] Table
[ 0] data_export
[ 1] function_export
Using the DLL
When we come to look at using the DLL, things become a lot more interesting. First, we need an import library:
$ dlltool --output-lib library.dll.a -d library.def
(The reason that we have an import library but an export object is because using a library for the imports allows the linker to discard .idata
for any imports that are not used. Contrariwise ,he linker can never discard any .edata
entry because any export may potentially be used by a user of the DLL).
This import library is rather complex. It contains one object for each export (disds00000.o
and disds00001.o
) but also two other object files (distdt.o
and disdh.o
) that set up the header and footer of the import list. (The header of the import list contains, among other things, the name of the DLL to link in at runtime, as derived from the LIBRARY
line of the .def file.)
$ objdump -xs library.dll.a
In archive library.dll.a:
disdt.o: file format pe-i386
...
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000000 2**2
ALLOC
3 .idata$4 00000004 00000000 00000000 00000104 2**2
CONTENTS, ALLOC, LOAD, DATA
4 .idata$5 00000004 00000000 00000000 00000108 2**2
CONTENTS, ALLOC, LOAD, DATA
5 .idata$7 0000000c 00000000 00000000 0000010c 2**2
CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
[ 0](sec -2)(fl 0x00)(ty 0)(scl 103) (nx 1) 0x00000000 fake
File
[ 2](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 4](sec 2)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 6](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 8](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .idata$4
AUX scnlen 0x4 nreloc 0 nlnno 0
[ 10](sec 5)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .idata$5
AUX scnlen 0x4 nreloc 0 nlnno 0
[ 12](sec 6)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .idata$7
AUX scnlen 0x7 nreloc 0 nlnno 0
[ 14](sec 6)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __library_dll_a_iname
Contents of section .idata$4:
0000 00000000 ....
Contents of section .idata$5:
0000 00000000 ....
Contents of section .idata$7:
0000 6c696272 6172792e 646c6c00 library.dll.
disdh.o: file format pe-i386
...
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000000 2**2
ALLOC
3 .idata$2 00000014 00000000 00000000 00000104 2**2
CONTENTS, ALLOC, LOAD, RELOC, DATA
4 .idata$5 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
5 .idata$4 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
SYMBOL TABLE:
[ 0](sec -2)(fl 0x00)(ty 0)(scl 103) (nx 1) 0x00000000 fake
File
[ 2](sec 6)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 hname
[ 3](sec 5)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 fthunk
[ 4](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .text
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 6](sec 2)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 8](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[ 10](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 1) 0x00000000 .idata$2
AUX scnlen 0x14 nreloc 3 nlnno 0
[ 12](sec 6)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$4
[ 13](sec 5)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$5
[ 14](sec 4)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __head_library_dll_a
[ 15](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __library_dll_a_iname
RELOCATION RECORDS FOR [.idata$2]:
OFFSET TYPE VALUE
00000000 rva32 .idata$4
0000000c rva32 __library_dll_a_iname
00000010 rva32 .idata$5
Contents of section .idata$2:
0000 00000000 00000000 00000000 00000000 ................
0010 00000000 ....
disds00001.o: file format pe-i386
...
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000008 00000000 00000000 0000012c 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000000 2**2
ALLOC
3 .idata$7 00000004 00000000 00000000 00000134 2**2
CONTENTS, RELOC
4 .idata$5 00000004 00000000 00000000 00000138 2**2
CONTENTS, RELOC
5 .idata$4 00000004 00000000 00000000 0000013c 2**2
CONTENTS, RELOC
6 .idata$6 00000012 00000000 00000000 00000140 2**1
CONTENTS
SYMBOL TABLE:
[ 0](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .text
[ 1](sec 2)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .data
[ 2](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .bss
[ 3](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$7
[ 4](sec 5)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$5
[ 5](sec 6)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$4
[ 6](sec 7)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$6
[ 7](sec 1)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 _function_export
[ 8](sec 5)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __imp__function_export
[ 9](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __head_library_dll_a
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000002 dir32 .idata$5
RELOCATION RECORDS FOR [.idata$7]:
OFFSET TYPE VALUE
00000000 rva32 __head_library_dll_a
RELOCATION RECORDS FOR [.idata$5]:
OFFSET TYPE VALUE
00000000 rva32 .idata$6
RELOCATION RECORDS FOR [.idata$4]:
OFFSET TYPE VALUE
00000000 rva32 .idata$6
Contents of section .text:
0000 ff250000 00009090 .%......
Contents of section .idata$7:
0000 00000000 ....
Contents of section .idata$5:
0000 00000000 ....
Contents of section .idata$4:
0000 00000000 ....
Contents of section .idata$6:
0000 01006675 6e637469 6f6e5f65 78706f72 ..function_expor
0010 7400 t.
disds00000.o: file format pe-i386
...
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, READONLY, CODE
1 .data 00000000 00000000 00000000 00000000 2**2
ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000000 2**2
ALLOC
3 .idata$7 00000004 00000000 00000000 0000012c 2**2
CONTENTS, RELOC
4 .idata$5 00000004 00000000 00000000 00000130 2**2
CONTENTS, RELOC
5 .idata$4 00000004 00000000 00000000 00000134 2**2
CONTENTS, RELOC
6 .idata$6 0000000e 00000000 00000000 00000138 2**1
CONTENTS
SYMBOL TABLE:
[ 0](sec 1)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .text
[ 1](sec 2)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .data
[ 2](sec 3)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .bss
[ 3](sec 4)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$7
[ 4](sec 5)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$5
[ 5](sec 6)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$4
[ 6](sec 7)(fl 0x00)(ty 0)(scl 3) (nx 0) 0x00000000 .idata$6
[ 7](sec 5)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __imp__data_export
[ 8](sec 0)(fl 0x00)(ty 0)(scl 2) (nx 0) 0x00000000 __head_library_dll_a
RELOCATION RECORDS FOR [.idata$7]:
OFFSET TYPE VALUE
00000000 rva32 __head_library_dll_a
RELOCATION RECORDS FOR [.idata$5]:
OFFSET TYPE VALUE
00000000 rva32 .idata$6
RELOCATION RECORDS FOR [.idata$4]:
OFFSET TYPE VALUE
00000000 rva32 .idata$6
Contents of section .idata$7:
0000 00000000 ....
Contents of section .idata$5:
0000 00000000 ....
Contents of section .idata$4:
0000 00000000 ....
Contents of section .idata$6:
0000 00006461 74615f65 78706f72 7400 ..data_export.
Note that the object corresponding to data_export
has an empty .text
section, whereas function_export
does define some code. If we disassemble it we get this:
00000000 <_function_export>:
0: ff 25 00 00 00 00 jmp *0x0
2: dir32 .idata$5
6: 90 nop
7: 90 nop
The relocation of type dir32
tells the linker how to fill in the address being dereferenced by the jmp
. We can see that _function_export
, when entered, will jump directly to the function at the address loaded from the memory named .idata$5
. Inspection of the complete .idata
section satisfies us that .idata$5
corresponds to the address of the fragment of the import address table corresponding to the function_export
import name, and hence the address where the absolute address of the loaded function_export
import can be found.
Although only function_export
gets a corresponding _function_export
function, both of the exports have lead to a symbol with the __imp__
prefix (__imp__data_export
and __imp__function_export
) being defined in the import library. As discussed before, this symbol stands for the address at which the pointer to the data/function will be inserted by the dynamic linker. As such, the __imp__
symbols always point directly into the import address table.
With an import library in hand, we are capable of writing some client code that uses our exports, main1.c
:
Build and link it against the import library and we will get the results we expect:
$ gcc main1.c library.dll.a -o main1 && ./main1
1379
42
1380
43
The reason that this works even though there is no data_export
symbol defined by library.dll.a
is because the __declspec(dllimport)
qualifier on our data_export
declaration in main.c
has caused the compiled to generate code that uses the __imp_data_export
symbol directly, as we can see if we disassemble the generated code:
$ gcc -c main1.c -o main1.o && objdump --disassemble -r main1.o
main1.o: file format pe-i386
Disassembly of section .text:
00000000 <_main>:
0: 8d 4c 24 04 lea 0x4(%esp),%ecx
4: 83 e4 f0 and $0xfffffff0,%esp
7: ff 71 fc pushl -0x4(%ecx)
a: 55 push %ebp
b: 89 e5 mov %esp,%ebp
d: 51 push %ecx
e: 83 ec 14 sub $0x14,%esp
11: e8 00 00 00 00 call 16 <_main+0x16>
12: DISP32 ___main
16: a1 00 00 00 00 mov 0x0,%eax
17: dir32 __imp__function_export
1b: ff d0 call *%eax
1d: 89 44 24 04 mov %eax,0x4(%esp)
21: c7 04 24 00 00 00 00 movl $0x0,(%esp)
24: dir32 .rdata
28: e8 00 00 00 00 call 2d <_main+0x2d>
29: DISP32 _printf
2d: a1 00 00 00 00 mov 0x0,%eax
2e: dir32 __imp__data_export
32: 8b 00 mov (%eax),%eax
34: 89 44 24 04 mov %eax,0x4(%esp)
38: c7 04 24 00 00 00 00 movl $0x0,(%esp)
3b: dir32 .rdata
3f: e8 00 00 00 00 call 44 <_main+0x44>
40: DISP32 _printf
44: a1 00 00 00 00 mov 0x0,%eax
45: dir32 __imp__data_export
49: 8b 00 mov (%eax),%eax
4b: 8d 50 01 lea 0x1(%eax),%edx
4e: a1 00 00 00 00 mov 0x0,%eax
4f: dir32 __imp__data_export
53: 89 10 mov %edx,(%eax)
55: a1 00 00 00 00 mov 0x0,%eax
56: dir32 __imp__function_export
5a: ff d0 call *%eax
5c: 89 44 24 04 mov %eax,0x4(%esp)
60: c7 04 24 00 00 00 00 movl $0x0,(%esp)
63: dir32 .rdata
67: e8 00 00 00 00 call 6c <_main+0x6c>
68: DISP32 _printf
6c: a1 00 00 00 00 mov 0x0,%eax
6d: dir32 __imp__data_export
71: 8b 00 mov (%eax),%eax
73: 89 44 24 04 mov %eax,0x4(%esp)
77: c7 04 24 00 00 00 00 movl $0x0,(%esp)
7a: dir32 .rdata
7e: e8 00 00 00 00 call 83 <_main+0x83>
7f: DISP32 _printf
83: b8 00 00 00 00 mov $0x0,%eax
88: 83 c4 14 add $0x14,%esp
8b: 59 pop %ecx
8c: 5d pop %ebp
8d: 8d 61 fc lea -0x4(%ecx),%esp
90: c3 ret
91: 90 nop
92: 90 nop
93: 90 nop
In fact, we can see that the generated code doesn’t even use the _function_export
symbol, preferring __imp__function_export
. Essentially, the code of the _function_export
symbol in the import library has been inlined at every use site. This is why using __declspec(dllimport)
can improve performance of cross-DLL calls, even though it is entirely optional on function declarations.
We might wonder what happens if we drop the __declspec(dllimport)
qualifier on our declarations. Because of our discussion about the difference between data and function imports earlier, you might expect linking to fail. Our test file, main2.c
is:
Let’s try it out:
$ gcc main2.c library.dll.a -o main2 && ./main2
1379
42
1380
43
What the hell – it worked? This is a bit uprising. The reason that it works despite the fact that the import library library.dll.a
not defining the _data_export
symbol is because of a nifty feature of GNU ld called auto-import. Without auto-import the link fails as we would expect:
$ gcc main2.c library.dll.a -o main2 -Wl,--disable-auto-import && ./main2
/tmp/ccGd8Urx.o:main2.c:(.text+0x2c): undefined reference to `_data_export'
/tmp/ccGd8Urx.o:main2.c:(.text+0x41): undefined reference to `_data_export'
/tmp/ccGd8Urx.o:main2.c:(.text+0x49): undefined reference to `_data_export'
/tmp/ccGd8Urx.o:main2.c:(.text+0x63): undefined reference to `_data_export'
collect2: ld returned 1 exit status
The Microsoft linker does not implement auto-import, so this is the error you would get if you were using the Microsoft toolchain.
However, there is a way to write client code that does not depend on auto-import or use the __declspec(dllimport)
keyword. Our new client, main3.c
is as follows:
In this code, we directly use the __imp__
-prefixed symbols from the import library. These name an address at which the real address of the import can be found, which is reflected by our C-preprocessor definitions of data_export
and function_export
.
This code compiles perfectly even without auto-import:
$ gcc main3.c library.dll.a -o main3 -Wl,--disable-auto-import && ./main3
1379
42
1380
43
If you have followed along until this point you should have a solid understanding of how DLL import and export are implemented on Windows.
How auto-import works
As a bonus, I’m going to explain how auto-import is implemented by the GNU linker. It is a rather cute hack you may get a kick out of.
As a reminder, auto-import is a feature of the linker that allows the programmer to declare an item of DLL-imported data with a simple extern
keyword, without having to explicitly use __declspec(dllimport)
. This is extremely convenient because this is exactly how most _nix source code declares symbols it expects to import from a shared library, so by supporting this use case that_nix code becomes more portable to Windows.
Auto-import kicks in whenever the linker finds an object file making use of a symbol foo
which is not defined by any other object in the link, but where a symbol __imp_foo
is defined by some object. In this case, it assumes that the use of foo
is an attempt to access some DLL-imported data item called foo
.
Now, the problem is that the linker needs to replace the use of foo
with the address of foo
itself. However, all we seem to know statically is an address where that address will be placed at runtime (__imp_foo
). To square the circle, the linker plays a clever trick.
The trick is to extend the .idata
of the image being created with an entry for a “new” DLL. The new entry is set up as follows:
-
The filename of the image being imported is set to the same filename as the
.idata
entry covering__imp_foo
. So if__imp_foo
was being filled out by an address inBar.dll
, our new.idata
entry will useBar.dll
here. -
The import lookup table is of length 1, whose sole entry is a pointer to the name of the imported symbol corresponding to
__imp_foo
. So if__imp_foo
is filled out by the address of thefoo
export fromBar.dll
, the name of the symbol we put in here will befoo
. -
The import address table is of length 1 – and here is the clever bit – is located precisely at the location in the object file that was referring to the (undefined) symbol
foo
.
This solution neatly defers the task of filling out the address that the object file wants to the dynamic linker. The reason that the linker can play this trick is that it can see all of the object code that goes into the final image, and can thus fix all of the sites that need to refer to the imported data.
Note that in general the final image’s .idata
will contain several entries for the same DLL: one from the import library, and one for every place in any object file in the link which referred to some data exported by the DLL. Although this is somewhat unusual behaviour, the Windows linker has no problem with there being several imports of the same DLL.
A wrinkle
Unfortunately, the scheme described above only works if the object code has an undefined reference to foo
itself. What if instead it has a reference to foo+N
, an address N bytes after the address of foo
itself? There is no way to set up the .idata
so that the dynamic linker adds a constant to the address it fills in, so we seem to be stuck.
Alas, such relocations are reasonably common, and originate from code that accesses a field of a DLL-imported structure type. Cygwin actually contains another hack to make auto-import work in such cases, known as “pseudo-relocations”. If you want to know the details of how these works, there is more information in the original thread on the topic.
Conclusion
Dynamic linking on Windows is hairier than it at first appears. I hope this article has gone some way to clearing up the meaning of the mysterious dllimport
and dllexport
keywords, and at clarifying the role of the import and export libraries.
Linux and friends implement dynamic linking in a totally different manner to Windows. The scheme they use is more flexible and allows more in-memory sharing of code, but incurs a significant runtime penalty (especially on i386). For more details see here and the Dynamic Linking section of the the ELF spec.
Just curious, which compiler were you using?
Not sure: whatever Cygwin 1.7 installs. According to http://cygwin.com/cgi-bin2/package-grep.cgi?grep=gcc-core this may be gcc-4.5.2 or gcc-3.4.4