binutils/2.18/ld/ldint.texinfo

    1: \input texinfo
    2: @setfilename ldint.info
    3: @c Copyright 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
    4: @c 2003, 2007
    5: @c Free Software Foundation, Inc.
    6: 
    7: @ifinfo
    8: @format
    9: START-INFO-DIR-ENTRY
   10: * Ld-Internals: (ldint).        The GNU linker internals.
   11: END-INFO-DIR-ENTRY
   12: @end format
   13: @end ifinfo
   14: 
   15: @copying
   16: This file documents the internals of the GNU linker ld.
   17: 
   18: Copyright @copyright{} 1992, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2007
   19: Free Software Foundation, Inc.
   20: Contributed by Cygnus Support.
   21: 
   22: Permission is granted to copy, distribute and/or modify this document
   23: under the terms of the GNU Free Documentation License, Version 1.1 or
   24: any later version published by the Free Software Foundation; with the
   25: Invariant Sections being ``GNU General Public License'' and ``Funding
   26: Free Software'', the Front-Cover texts being (a) (see below), and with
   27: the Back-Cover Texts being (b) (see below).  A copy of the license is
   28: included in the section entitled ``GNU Free Documentation License''.
   29: 
   30: (a) The FSF's Front-Cover Text is:
   31: 
   32:      A GNU Manual
   33: 
   34: (b) The FSF's Back-Cover Text is:
   35: 
   36:      You have freedom to copy and modify this GNU Manual, like GNU
   37:      software.  Copies published by the Free Software Foundation raise
   38:      funds for GNU development.
   39: @end copying
   40: 
   41: @iftex
   42: @finalout
   43: @setchapternewpage off
   44: @settitle GNU Linker Internals
   45: @titlepage
   46: @title{A guide to the internals of the GNU linker}
   47: @author Per Bothner, Steve Chamberlain, Ian Lance Taylor, DJ Delorie
   48: @author Cygnus Support
   49: @page
   50: 
   51: @tex
   52: \def\$#1${{#1}}  % Kluge: collect RCS revision info without $...$
   53: \xdef\manvers{2.10.91}  % For use in headers, footers too
   54: {\parskip=0pt
   55: \hfill Cygnus Support\par
   56: \hfill \manvers\par
   57: \hfill \TeX{}info \texinfoversion\par
   58: }
   59: @end tex
   60: 
   61: @vskip 0pt plus 1filll
   62: Copyright @copyright{} 1992, 93, 94, 95, 96, 97, 1998, 2000
   63: Free Software Foundation, Inc.
   64: 
   65:       Permission is granted to copy, distribute and/or modify this document
   66:       under the terms of the GNU Free Documentation License, Version 1.1
   67:       or any later version published by the Free Software Foundation;
   68:       with no Invariant Sections, with no Front-Cover Texts, and with no
   69:       Back-Cover Texts.  A copy of the license is included in the
   70:       section entitled "GNU Free Documentation License".
   71: 
   72: @end titlepage
   73: @end iftex
   74: 
   75: @node Top
   76: @top
   77: 
   78: This file documents the internals of the GNU linker @code{ld}.  It is a
   79: collection of miscellaneous information with little form at this point.
   80: Mostly, it is a repository into which you can put information about
   81: GNU @code{ld} as you discover it (or as you design changes to @code{ld}).
   82: 
   83: This document is distributed under the terms of the GNU Free
   84: Documentation License.  A copy of the license is included in the
   85: section entitled "GNU Free Documentation License".
   86: 
   87: @menu
   88: * README::                      The README File
   89: * Emulations::                  How linker emulations are generated
   90: * Emulation Walkthrough::       A Walkthrough of a Typical Emulation
   91: * Architecture Specific::       Some Architecture Specific Notes
   92: * GNU Free Documentation License::  GNU Free Documentation License
   93: @end menu
   94: 
   95: @node README
   96: @chapter The @file{README} File
   97: 
   98: Check the @file{README} file; it often has useful information that does not
   99: appear anywhere else in the directory.
  100: 
  101: @node Emulations
  102: @chapter How linker emulations are generated
  103: 
  104: Each linker target has an @dfn{emulation}.  The emulation includes the
  105: default linker script, and certain emulations also modify certain types
  106: of linker behaviour.
  107: 
  108: Emulations are created during the build process by the shell script
  109: @file{genscripts.sh}.
  110: 
  111: The @file{genscripts.sh} script starts by reading a file in the
  112: @file{emulparams} directory.  This is a shell script which sets various
  113: shell variables used by @file{genscripts.sh} and the other shell scripts
  114: it invokes.
  115: 
  116: The @file{genscripts.sh} script will invoke a shell script in the
  117: @file{scripttempl} directory in order to create default linker scripts
  118: written in the linker command language.  The @file{scripttempl} script
  119: will be invoked 5 to 8 times, with different
  120: assignments to shell variables, to create different default scripts.
  121: The choice of script is made based on the command line options.
  122: 
  123: After creating the scripts, @file{genscripts.sh} will invoke yet another
  124: shell script, this time in the @file{emultempl} directory.  That shell
  125: script will create the emulation source file, which contains C code.
  126: This C code permits the linker emulation to override various linker
  127: behaviours.  Most targets use the generic emulation code, which is in
  128: @file{emultempl/generic.em}.
  129: 
  130: To summarize, @file{genscripts.sh} reads three shell scripts: an
  131: emulation parameters script in the @file{emulparams} directory, a linker
  132: script generation script in the @file{scripttempl} directory, and an
  133: emulation source file generation script in the @file{emultempl}
  134: directory.
  135: 
  136: For example, the Sun 4 linker sets up variables in
  137: @file{emulparams/sun4.sh}, creates linker scripts using
  138: @file{scripttempl/aout.sc}, and creates the emulation code using
  139: @file{emultempl/sunos.em}.
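
The three-stage flow summarized above can be sketched as a toy shell script.  This is only an illustration under stated assumptions, not the real @file{genscripts.sh}: the emulation name @samp{demo}, the temporary directory layout, and the contents of the three stand-in scripts are all invented for the example, and only one linker script is generated.

```shell
#!/bin/sh
# Toy model of the flow described above; NOT the real genscripts.sh.
# The emulation name "demo" and all file contents are hypothetical.
set -e
work=$(mktemp -d)
mkdir "$work/emulparams" "$work/scripttempl" "$work/emultempl" "$work/ldscripts"

# 1. emulparams script: only sets shell variables.
cat > "$work/emulparams/demo.sh" <<'EOF'
SCRIPT_NAME=demo
TEMPLATE_NAME=demo
OUTPUT_FORMAT="a.out-sunos-big"
ARCH=sparc
EOF

# 2. scripttempl script: writes a linker script to standard output.
cat > "$work/scripttempl/demo.sc" <<'EOF'
cat <<EOT
OUTPUT_FORMAT("${OUTPUT_FORMAT}")
OUTPUT_ARCH(${ARCH})
EOT
EOF

# 3. emultempl script: writes C source to standard output.
cat > "$work/emultempl/demo.em" <<'EOF'
cat <<EOT
/* emulation source for ${EMULATION_NAME} */
EOT
EOF

# The driver: read the parameters, then run the two generators.
EMULATION_NAME=demo
. "$work/emulparams/$EMULATION_NAME.sh"
( . "$work/scripttempl/$SCRIPT_NAME.sc" ) > "$work/ldscripts/$EMULATION_NAME.x"
( . "$work/emultempl/$TEMPLATE_NAME.em" ) > "$work/e$EMULATION_NAME.c"

script_out=$(cat "$work/ldscripts/$EMULATION_NAME.x")
source_out=$(cat "$work/e$EMULATION_NAME.c")
rm -rf "$work"

printf '%s\n%s\n' "$script_out" "$source_out"
```

The point of the sketch is the data flow: variables set by the first script are visible to the other two because all three run in the same shell environment.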
  140: 
  141: Note that the linker can support several emulations simultaneously,
  142: depending upon how it is configured.  An emulation can be selected with
  143: the @code{-m} option.  The @code{-V} option will list all supported
  144: emulations.
  145: 
  146: @menu
  147: * emulation parameters::        @file{emulparams} scripts
  148: * linker scripts::              @file{scripttempl} scripts
  149: * linker emulations::           @file{emultempl} scripts
  150: @end menu
  151: 
  152: @node emulation parameters
  153: @section @file{emulparams} scripts
  154: 
  155: Each target selects a particular file in the @file{emulparams} directory
  156: by setting the shell variable @code{targ_emul} in @file{configure.tgt}.
  157: This shell variable is used by the @file{configure} script to control
  158: building an emulation source file.
  159: 
  160: Certain conventions are enforced.  Suppose the @code{targ_emul} variable
  161: is set to @var{emul} in @file{configure.tgt}.  The name of the emulation
  162: shell script will be @file{emulparams/@var{emul}.sh}.  The
  163: @file{Makefile} must have a target named @file{e@var{emul}.c}; this
  164: target must depend upon @file{emulparams/@var{emul}.sh}, as well as the
  165: appropriate scripts in the @file{scripttempl} and @file{emultempl}
  166: directories.  The @file{Makefile} target must invoke @code{GENSCRIPTS}
  167: with two arguments: @var{emul}, and the value of the make variable
  168: @code{tdir_@var{emul}}.  The value of the latter variable will be set by
  169: the @file{configure} script, and is used to set the default target
  170: directory to search.
  171: 
  172: By convention, the @file{emulparams/@var{emul}.sh} shell script should
  173: only set shell variables.  It may set shell variables which are to be
  174: interpreted by the @file{scripttempl} and the @file{emultempl} scripts.
  175: Certain shell variables are interpreted directly by the
  176: @file{genscripts.sh} script.
  177: 
  178: Here is a list of shell variables interpreted by @file{genscripts.sh},
  179: as well as some conventional shell variables interpreted by the
  180: @file{scripttempl} and @file{emultempl} scripts.
  181: 
  182: @table @code
  183: @item SCRIPT_NAME
  184: This is the name of the @file{scripttempl} script to use.  If
  185: @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will use
  186: the script @file{scripttempl/@var{script}.sc}.
  187: 
  188: @item TEMPLATE_NAME
  189: This is the name of the @file{emultempl} script to use.  If
  190: @code{TEMPLATE_NAME} is set to @var{template}, @file{genscripts.sh} will
  191: use the script @file{emultempl/@var{template}.em}.  If this variable is
  192: not set, the default value is @samp{generic}.
  193: 
  194: @item GENERATE_SHLIB_SCRIPT
  195: If this is set to a nonempty string, @file{genscripts.sh} will invoke
  196: the @file{scripttempl} script an extra time to create a shared library
  197: script.  @xref{linker scripts}.
  198: 
  199: @item OUTPUT_FORMAT
  200: This is normally set to indicate the BFD output format to use (e.g.,
  201: @samp{"a.out-sunos-big"}).  The @file{scripttempl} script will normally
  202: use it in an @code{OUTPUT_FORMAT} expression in the linker script.
  203: 
  204: @item ARCH
  205: This is normally set to indicate the architecture to use (e.g.,
  206: @samp{sparc}).  The @file{scripttempl} script will normally use it in an
  207: @code{OUTPUT_ARCH} expression in the linker script.
  208: 
  209: @item ENTRY
  210: Some @file{scripttempl} scripts use this to set the entry address, in an
  211: @code{ENTRY} expression in the linker script.
  212: 
  213: @item TEXT_START_ADDR
  214: Some @file{scripttempl} scripts use this to set the start address of the
  215: @samp{.text} section.
  216: 
  217: @item SEGMENT_SIZE
  218: The @file{genscripts.sh} script uses this to set the default value of
  219: @code{DATA_ALIGNMENT} when running the @file{scripttempl} script.
  220: 
  221: @item TARGET_PAGE_SIZE
  222: If @code{SEGMENT_SIZE} is not defined, the @file{genscripts.sh} script
  223: uses this to define it.
  224: 
  225: @item ALIGNMENT
  226: Some @file{scripttempl} scripts set this to a number to pass to
  227: @code{ALIGN} to set the required alignment for the @code{end} symbol.
  228: @end table
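
As an illustration of the variables listed above, a hypothetical @file{emulparams} file might look like the following.  The emulation name and all values are invented for the example.  The last few lines are a simplified model of the @code{SEGMENT_SIZE} fallback and the @code{DATA_ALIGNMENT} default described above; they are an assumption about the general shape, not a quotation from @file{genscripts.sh}.

```shell
# Start from a clean slate so the example is reproducible.
unset SEGMENT_SIZE

# Hypothetical emulparams/myemul.sh: it only sets shell variables.
SCRIPT_NAME=aout
TEMPLATE_NAME=generic
OUTPUT_FORMAT="a.out-sunos-big"
ARCH=sparc
TEXT_START_ADDR=0x2020
TARGET_PAGE_SIZE=0x2000
ALIGNMENT=8

# Simplified model of what the driver does afterwards (assumption):
# fall back to TARGET_PAGE_SIZE when SEGMENT_SIZE is unset, then derive
# the default DATA_ALIGNMENT from SEGMENT_SIZE.
SEGMENT_SIZE=${SEGMENT_SIZE-${TARGET_PAGE_SIZE}}
DATA_ALIGNMENT="ALIGN(${SEGMENT_SIZE})"

echo "SEGMENT_SIZE=$SEGMENT_SIZE DATA_ALIGNMENT=$DATA_ALIGNMENT"
```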
  229: 
  230: @node linker scripts
  231: @section @file{scripttempl} scripts
  232: 
  233: Each linker target uses a @file{scripttempl} script to generate the
  234: default linker scripts.  The name of the @file{scripttempl} script is
  235: set by the @code{SCRIPT_NAME} variable in the @file{emulparams} script.
  236: If @code{SCRIPT_NAME} is set to @var{script}, @file{genscripts.sh} will
  237: invoke @file{scripttempl/@var{script}.sc}.
  238: 
  239: The @file{genscripts.sh} script will invoke the @file{scripttempl}
  240: script 5 to 8 times.  Each time it will set the shell variable
  241: @code{LD_FLAG} to a different value.  When the linker is run, the
  242: options used will direct it to select a particular script.  (Script
  243: selection is controlled by the @code{get_script} emulation entry point;
  244: this describes the conventional behaviour).
  245: 
  246: The @file{scripttempl} script should just write a linker script, written
  247: in the linker command language, to standard output.  If the emulation
  248: name--the name of the @file{emulparams} file without the @file{.sh}
  249: extension--is @var{emul}, then the output will be directed to
  250: @file{ldscripts/@var{emul}.@var{extension}} in the build directory,
  251: where @var{extension} changes each time the @file{scripttempl} script is
  252: invoked.
  253: 
  254: Here is the list of values assigned to @code{LD_FLAG}.
  255: 
  256: @table @code
  257: @item (empty)
  258: The script generated is used by default (when none of the following
  259: cases apply).  The output has an extension of @file{.x}.
  260: @item n
  261: The script generated is used when the linker is invoked with the
  262: @code{-n} option.  The output has an extension of @file{.xn}.
  263: @item N
  264: The script generated is used when the linker is invoked with the
  265: @code{-N} option.  The output has an extension of @file{.xbn}.
  266: @item r
  267: The script generated is used when the linker is invoked with the
  268: @code{-r} option.  The output has an extension of @file{.xr}.
  269: @item u
  270: The script generated is used when the linker is invoked with the
  271: @code{-Ur} option.  The output has an extension of @file{.xu}.
  272: @item shared
  273: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
  274: this value if @code{GENERATE_SHLIB_SCRIPT} is defined in the
  275: @file{emulparams} file.  The @file{emultempl} script must arrange to use
  276: this script at the appropriate time, normally when the linker is invoked
  277: with the @code{-shared} option.  The output has an extension of
  278: @file{.xs}.
  279: @item c
  280: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
  281: this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
  282: @file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf}. The
  283: @file{emultempl} script must arrange to use this script at the appropriate
  284: time, normally when the linker is invoked with the @code{-z combreloc}
  285: option.  The output has an extension of
  286: @file{.xc}.
  287: @item cshared
  288: The @file{scripttempl} script is only invoked with @code{LD_FLAG} set to
  289: this value if @code{GENERATE_COMBRELOC_SCRIPT} is defined in the
  290: @file{emulparams} file or if @code{SCRIPT_NAME} is @code{elf} and
  291: @code{GENERATE_SHLIB_SCRIPT} is defined in the @file{emulparams} file.
  292: The @file{emultempl} script must arrange to use this script at the
  293: appropriate time, normally when the linker is invoked with the @code{-shared
  294: -z combreloc} option.  The output has an extension of @file{.xsc}.
  295: @end table
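
The mapping in the table above can be restated as a small shell function.  The function name @code{ld_flag_extension} and the emulation name @samp{myemul} are hypothetical; only the flag and extension pairs come from the table.

```shell
# Map an LD_FLAG value to the output extension listed in the table.
# The function name is hypothetical.
ld_flag_extension () {
    case "$1" in
        "")      echo x ;;
        n)       echo xn ;;
        N)       echo xbn ;;
        r)       echo xr ;;
        u)       echo xu ;;
        shared)  echo xs ;;
        c)       echo xc ;;
        cshared) echo xsc ;;
        *)       echo "unknown LD_FLAG: $1" >&2; return 1 ;;
    esac
}

# List the script file that each run would produce for emulation "myemul".
for flag in "" n N r u shared c cshared; do
    echo "LD_FLAG='$flag' -> ldscripts/myemul.$(ld_flag_extension "$flag")"
done
```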
  296: 
  297: Besides the shell variables set by the @file{emulparams} script, and the
  298: @code{LD_FLAG} variable, the @file{genscripts.sh} script will set
  299: certain variables for each run of the @file{scripttempl} script.
  300: 
  301: @table @code
  302: @item RELOCATING
  303: This will be set to a non-empty string when the linker is doing a final
  304: relocation (i.e., all scripts other than @code{-r} and @code{-Ur}).
  305: 
  306: @item CONSTRUCTING
  307: This will be set to a non-empty string when the linker is building
  308: global constructor and destructor tables (i.e., all scripts other than
  309: @code{-r}).
  310: 
  311: @item DATA_ALIGNMENT
  312: This will be set to an @code{ALIGN} expression when the output should be
  313: page aligned, or to @samp{.} when generating the @code{-N} script.
  314: 
  315: @item CREATE_SHLIB
  316: This will be set to a non-empty string when generating a @code{-shared}
  317: script.
  318: 
  319: @item COMBRELOC
  320: When generating @code{-z combreloc} scripts, this will be set to a
  321: non-empty string: a temporary file name which can be used during script generation.
  322: @end table
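
A minimal sketch of how some of these variables depend on @code{LD_FLAG}, following only the rules stated in the table above.  The helper name @code{set_run_variables} is hypothetical, and treating @samp{cshared} like @samp{shared} for @code{CREATE_SHLIB} is an assumption.  @code{DATA_ALIGNMENT} and @code{COMBRELOC} are left out because their exact values are not given here.

```shell
# Hypothetical helper: set the per-run variables for a given LD_FLAG.
set_run_variables () {
    LD_FLAG=$1
    unset RELOCATING CONSTRUCTING CREATE_SHLIB
    case "$LD_FLAG" in
        r) ;;                                   # -r: neither is set
        u) CONSTRUCTING=" " ;;                  # -Ur: constructor tables only
        *) RELOCATING=" "; CONSTRUCTING=" " ;;  # every other script
    esac
    case "$LD_FLAG" in
        shared|cshared) CREATE_SHLIB=" " ;;     # cshared is an assumption
    esac
}

# Print which variables are defined for a few LD_FLAG values.
show_run () {
    set_run_variables "$1"
    echo "LD_FLAG='$1' RELOCATING=${RELOCATING+set}" \
         "CONSTRUCTING=${CONSTRUCTING+set} CREATE_SHLIB=${CREATE_SHLIB+set}"
}

for flag in "" n N r u shared; do
    show_run "$flag"
done
```

The values are a single space rather than a word because the templates only test whether each variable is defined.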
  323: 
  324: The conventional way to write a @file{scripttempl} script is to first
  325: set a few shell variables, and then write out a linker script using
  326: @code{cat} with a here document.  The linker script will use variable
  327: substitutions, based on the above variables and those set in the
  328: @file{emulparams} script, to control its behaviour.
  329: 
  330: When there are parts of the @file{scripttempl} script which should only
  331: be run when doing a final relocation, they should be enclosed within a
  332: variable substitution based on @code{RELOCATING}.  For example, on many
  333: targets special symbols such as @code{_end} should be defined when doing
  334: a final link.  Naturally, those symbols should not be defined when doing
  335: a relocatable link using @code{-r}.  The @file{scripttempl} script
  336: could use a construct like this to define those symbols:
  337: @smallexample
  338:   $@{RELOCATING+ _end = .;@}
  339: @end smallexample
  340: This will do the symbol assignment only if the @code{RELOCATING}
  341: variable is defined.
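
The effect of the substitution can be checked in isolation with a miniature stand-in for a @file{scripttempl} script.  The function name @code{emit_fragment} is hypothetical, and the fragment is not taken from any real template.

```shell
# Hypothetical miniature scripttempl: prints one line of a linker script
# from a here document, as the real templates conventionally do.
emit_fragment () {
    cat <<EOF
  ${RELOCATING+ _end = .;}
EOF
}

# Relocatable link (-r): RELOCATING is undefined, so nothing is emitted.
unset RELOCATING
echo "with RELOCATING unset: [$(emit_fragment)]"

# Final link: RELOCATING is defined, so the assignment appears.
RELOCATING=" "
echo "with RELOCATING set:   [$(emit_fragment)]"
```

Note that the shell's @code{+} form tests only whether the variable is defined, so an empty but defined @code{RELOCATING} would still produce the assignment.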
  342: 
  343: The basic job of the linker script is to put the sections in the correct
  344: order, and at the correct memory addresses.  For some targets, the
  345: linker script may have to do some other operations.
  346: 
  347: For example, on most MIPS platforms, the linker is responsible for
  348: defining the special symbol @code{_gp}, used to initialize the
  349: @code{$gp} register.  It must be set to the start of the small data
  350: section plus @code{0x8000}.  Naturally, it should only be defined when
  351: doing a final relocation.  This will typically be done like this:
  352: @smallexample
  353:   $@{RELOCATING+ _gp = ALIGN(16) + 0x8000;@}
  354: @end smallexample
  355: This line would appear just before the sections which compose the small
  356: data section (@samp{.sdata}, @samp{.sbss}).  All those sections would be
  357: contiguous in memory.
  358: 
  359: Many COFF systems build constructor tables in the linker script.  The
  360: compiler will arrange to output the address of each global constructor
  361: in a @samp{.ctors} section, and the address of each global destructor in
  362: a @samp{.dtors} section (this is done by defining
  363: @code{ASM_OUTPUT_CONSTRUCTOR} and @code{ASM_OUTPUT_DESTRUCTOR} in the
  364: @code{gcc} configuration files).  The @code{gcc} runtime support
  365: routines expect the constructor table to be named @code{__CTOR_LIST__}.
  366: They expect it to be a list of words, with the first word being the
  367: count of the number of entries.  There should be a trailing zero word.
  368: (Actually, the count may be -1 if the trailing word is present, and the
  369: trailing word may be omitted if the count is correct, but, as the
  370: @code{gcc} behaviour has changed slightly over the years, it is safest
  371: to provide both).  Here is a typical way that might be handled in a
  372: @file{scripttempl} file.
  373: @smallexample
  374:     $@{CONSTRUCTING+ __CTOR_LIST__ = .;@}
  375:     $@{CONSTRUCTING+ LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)@}
  376:     $@{CONSTRUCTING+ *(.ctors)@}
  377:     $@{CONSTRUCTING+ LONG(0)@}
  378:     $@{CONSTRUCTING+ __CTOR_END__ = .;@}
  379:     $@{CONSTRUCTING+ __DTOR_LIST__ = .;@}
  380:     $@{CONSTRUCTING+ LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)@}
  381:     $@{CONSTRUCTING+ *(.dtors)@}
  382:     $@{CONSTRUCTING+ LONG(0)@}
  383:     $@{CONSTRUCTING+ __DTOR_END__ = .;@}
  384: @end smallexample
  385: The use of @code{CONSTRUCTING} ensures that these linker script commands
  386: will only appear when the linker is supposed to be building the
  387: constructor and destructor tables.  This example is written for a target
  388: which uses 4 byte pointers.
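
The count expression in the example can be checked with a little arithmetic.  The region between @code{__CTOR_LIST__} and @code{__CTOR_END__} holds the count word, the entries, and the trailing zero word, so with @var{n} entries it spans @var{n} + 2 words; dividing the byte length by 4 and subtracting 2 recovers @var{n}.  The sketch below only models that arithmetic; the function name and the start address are hypothetical.

```shell
# Model the count word computed by
#   LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)
# for a table with a given number of entries and 4 byte words.
ctor_count () {
    entries=$1
    list_start=$2
    word_size=4
    # count word + entries + trailing zero word
    list_end=$(( list_start + (entries + 2) * word_size ))
    echo $(( (list_end - list_start) / word_size - 2 ))
}

ctor_count 3 4096   # prints 3
```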
  389: 
  390: Embedded systems often need to set a stack address.  This is normally
  391: best done by using the @code{PROVIDE} construct with a default stack
  392: address.  This permits the user to easily override the stack address
  393: using the @code{--defsym} option.  Here is an example:
  394: @smallexample
  395:   $@{RELOCATING+ PROVIDE (__stack = 0x80000000);@}
  396: @end smallexample
  397: The value of the symbol @code{__stack} would then be used in the startup
  398: code to initialize the stack pointer.
  399: 
  400: @node linker emulations
  401: @section @file{emultempl} scripts
  402: 
  403: Each linker target uses an @file{emultempl} script to generate the
  404: emulation code.  The name of the @file{emultempl} script is set by the
  405: @code{TEMPLATE_NAME} variable in the @file{emulparams} script.  If the
  406: @code{TEMPLATE_NAME} variable is not set, the default is
  407: @samp{generic}.  If the value of @code{TEMPLATE_NAME} is @var{template},
  408: @file{genscripts.sh} will use @file{emultempl/@var{template}.em}.
  409: 
  410: Most targets use the generic @file{emultempl} script,
  411: @file{emultempl/generic.em}.  A different @file{emultempl} script is
  412: only needed if the linker must support unusual actions, such as linking
  413: against shared libraries.
  414: 
  415: The @file{emultempl} script is normally written as a simple invocation
  416: of @code{cat} with a here document.  The document will use a few
  417: variable substitutions.  Typically each function name uses a
  418: substitution involving @code{EMULATION_NAME}, for ease of debugging when
  419: the linker supports multiple emulations.
  420: 
  421: Every function and variable in the emitted file should be static.  The
  422: only globally visible object must be named
  423: @code{ld_@var{EMULATION_NAME}_emulation}, where @var{EMULATION_NAME} is
  424: the name of the emulation set in @file{configure.tgt} (this is also the
  425: name of the @file{emulparams} file without the @file{.sh} extension).
  426: The @file{genscripts.sh} script will set the shell variable
  427: @code{EMULATION_NAME} before invoking the @file{emultempl} script.
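
A minimal sketch of the here document technique follows.  It is not a working emulation: the emulation name @samp{myemul}, the function body, and the exact function name are illustrative assumptions, and the definition of the single global object is deliberately left as a comment because its initializer must match @code{struct ld_emulation_xfer_struct} in @file{ldemul.h}.

```shell
# Hypothetical miniature emultempl script.  In a real build the driver
# sets EMULATION_NAME; it is set here so the sketch runs on its own.
EMULATION_NAME=myemul

emit_emulation_source () {
    cat <<EOF
/* Generated for emulation ${EMULATION_NAME}; illustration only.  */

/* Every function is static, and its name embeds the emulation name.  */
static void
gld${EMULATION_NAME}_before_parse (void)
{
  /* Emulation-specific setup would go here.  */
}

/* The only global object, ld_${EMULATION_NAME}_emulation, would be
   defined here.  */
EOF
}

emit_emulation_source
```

Embedding the emulation name in each static function name keeps the symbols distinct in a debugger when several emulations are compiled into one linker.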
  428: 
  429: The @code{ld_@var{EMULATION_NAME}_emulation} variable must be a
  430: @code{struct ld_emulation_xfer_struct}, as defined in @file{ldemul.h}.
  431: It defines a set of function pointers which are invoked by the linker,
  432: as well as strings for the emulation name (normally set from the shell
  433: variable @code{EMULATION_NAME}) and the default BFD target name (normally
  434: set from the shell variable @code{OUTPUT_FORMAT} which is normally set
  435: by the @file{emulparams} file).
  436: 
  437: The @file{genscripts.sh} script will set the shell variable
  438: @code{COMPILE_IN} when it invokes the @file{emultempl} script for the
  439: default emulation.  In this case, the @file{emultempl} script should
  440: include the linker scripts directly, and return them from the
  441: @code{get_script} entry point.  When the emulation is not the default,
  442: the @code{get_script} entry point should just return a file name.  See
  443: @file{emultempl/generic.em} for an example of how this is done.
  444: 
  445: At some point, the linker emulation entry points should be documented.
  446: 
  447: @node Emulation Walkthrough
  448: @chapter A Walkthrough of a Typical Emulation
  449: 
  450: This chapter is to help people who are new to the way emulations
  451: interact with the linker, or who are suddenly thrust into the position
  452: of having to work with existing emulations.  It will discuss the files
  453: you need to be aware of.  It will tell you when the given "hooks" in
  454: the emulation will be called.  It will, hopefully, give you enough
  455: information about when and how things happen that you'll be able to
  456: get by.  As always, the source is the definitive reference to this.
  457: 
  458: The starting point for the linker is in @file{ldmain.c} where
  459: @code{main} is defined.  The bulk of the code that's emulation
  460: specific will initially be in @file{emultempl/@var{emulation}.em} but
  461: will end up in @file{e@var{emulation}.c} when the build is done.
  462: Most of the work to select and interface with emulations is in
  463: @file{ldemul.h} and @file{ldemul.c}.  Specifically, @file{ldemul.h}
  464: defines the @code{ld_emulation_xfer_struct} structure your emulation
  465: exports.
  466: 
  467: Your emulation file exports a symbol
  468: @code{ld_@var{EMULATION_NAME}_emulation}.  If your emulation is
  469: selected (it usually is, since usually there's only one),
  470: @file{ldemul.c} sets the variable @code{ld_emulation} to point to it.
  471: @file{ldemul.c} also defines a number of API functions that interface
  472: to your emulation, like @code{ldemul_after_parse} which simply calls
  473: your @code{ld_@var{EMULATION_NAME}_emulation.after_parse} function.  For
  474: the rest of this section, the functions will be mentioned, but you
  475: should assume the indirect reference to your emulation also.
  476: 
  477: We will also skip or gloss over parts of the link process that don't
  478: relate to emulations, like setting up internationalization.
  479: 
  480: After initialization, @code{main} selects an emulation by pre-scanning
  481: the command line arguments.  It calls @code{ldemul_choose_target} to
  482: choose a target.  If you set @code{choose_target} to
  483: @code{ldemul_default_target}, it picks your @code{target_name} by
  484: default.
  485: 
  486: @code{main} calls @code{ldemul_before_parse}, then @code{parse_args}.
  487: @code{parse_args} calls @code{ldemul_parse_args} for each arg, which
  488: must update the @code{getopt} globals if it recognizes the argument.
  489: If the emulation doesn't recognize it, then parse_args checks to see
  490: if it recognizes it.
  491: 
  492: Now that the emulation has had access to all its command-line options,
  493: @code{main} calls @code{ldemul_set_symbols}.  This can be used for any
  494: initialization that may be affected by options.  It is also supposed
  495: to set up any variables needed by the emulation script.
  496: 
  497: @code{main} now calls @code{ldemul_get_script} to get the emulation
  498: script to use (based on arguments, no doubt, @pxref{Emulations}) and
  499: runs it.  While parsing, @code{ldgram.y} may call @code{ldemul_hll} or
  500: @code{ldemul_syslib} to handle the @code{HLL} or @code{SYSLIB}
  501: commands.  It may call @code{ldemul_unrecognized_file} if you asked
  502: the linker to link a file it doesn't recognize.  It will call
  503: @code{ldemul_recognized_file} for each file it does recognize, in case
  504: the emulation wants to handle some files specially.  All the while,
  505: it's loading the files (possibly calling
  506: @code{ldemul_open_dynamic_archive}) and symbols and stuff.  After it's
  507: done reading the script, @code{main} calls @code{ldemul_after_parse}.
  508: Use the after-parse hook to set up anything that depends on stuff the
  509: script might have set up, like the entry point.
  510: 
  511: @code{main} next calls @code{lang_process} in @file{ldlang.c}.  This
  512: appears to be the main core of the linking itself, as far as emulation
  513: hooks are concerned(*).  It first opens the output file's BFD, calling
  514: @code{ldemul_set_output_arch}, and calls
  515: @code{ldemul_create_output_section_statements} in case you need to use
  516: other means to find or create object files (i.e. shared libraries
  517: found on a path, or fake stub objects).  Despite the name, nobody
  518: creates output sections here.
  519: 
  520: (*) In most cases, the BFD library does the bulk of the actual
  521: linking, handling symbol tables, symbol resolution, relocations, and
  522: building the final output file.  See the BFD reference for all the
  523: details.  Your emulation is usually concerned more with managing
  524: things at the file and section level, like "put this here, add this
  525: section", etc.
  526: 
  527: Next, the objects to be linked are opened and BFDs created for them,
  528: and @code{ldemul_after_open} is called.  At this point, you have all
  529: the objects and symbols loaded, but none of the data has been placed
  530: yet.
  531: 
  532: Next comes the Big Linking Thingy (except for the parts BFD does).
  533: All input sections are mapped to output sections according to the
  534: script.  If a section doesn't get mapped by default,
  535: @code{ldemul_place_orphan} will get called to figure out where it goes.
  536: Next it figures out the offsets for each section, calling
  537: @code{ldemul_before_allocation} before and
  538: @code{ldemul_after_allocation} after deciding where each input section
  539: ends up in the output sections.
  540: 
  541: The last part of @code{lang_process} is to figure out all the symbols'
  542: values.  After assigning final values to the symbols,
  543: @code{ldemul_finish} is called, and after that, any undefined symbols
  544: are turned into fatal errors.
  545: 
  546: OK, back to @code{main}, which calls @code{ldwrite} in
  547: @file{ldwrite.c}.  @code{ldwrite} calls BFD's final_link, which does
  548: all the relocation fixups and writes the output bfd to disk, and we're
  549: done.
  550: 
  551: In summary,
  552: 
  553: @itemize @bullet
  554: 
  555: @item @code{main()} in @file{ldmain.c}
  556: @item @file{emultempl/@var{EMULATION}.em} has your code
  557: @item @code{ldemul_choose_target} (defaults to your @code{target_name})
  558: @item @code{ldemul_before_parse}
  559: @item Parse argv, calls @code{ldemul_parse_args} for each
  560: @item @code{ldemul_set_symbols}
  561: @item @code{ldemul_get_script}
  562: @item parse script
  563: 
  564: @itemize @bullet
  565: @item may call @code{ldemul_hll} or @code{ldemul_syslib}
  566: @item may call @code{ldemul_open_dynamic_archive}
  567: @end itemize
  568: 
  569: @item @code{ldemul_after_parse}
  570: @item @code{lang_process()} in @file{ldlang.c}
  571: 
  572: @itemize @bullet
  573: @item create @code{output_bfd}
  574: @item @code{ldemul_set_output_arch}
  575: @item @code{ldemul_create_output_section_statements}
  576: @item read objects, create input bfds - all symbols exist, but have no values
  577: @item may call @code{ldemul_unrecognized_file}
  578: @item will call @code{ldemul_recognized_file}
  579: @item @code{ldemul_after_open}
  580: @item map input sections to output sections
  581: @item may call @code{ldemul_place_orphan} for remaining sections
  582: @item @code{ldemul_before_allocation}
  583: @item gives input sections offsets into output sections, places output sections
  584: @item @code{ldemul_after_allocation} - section addresses valid
  585: @item assigns values to symbols
  586: @item @code{ldemul_finish} - symbol values valid
  587: @end itemize
  588: 
  589: @item output bfd is written to disk
  590: 
  591: @end itemize
  592: 
  593: @node Architecture Specific
  594: @chapter Some Architecture Specific Notes
  595: 
  596: This is the place for notes on the behavior of @code{ld} on
  597: specific platforms.  Currently, only Intel x86 is documented (and 
  598: of that, only the auto-import behavior for DLLs).
  599: 
  600: @menu
  601: * ix86::                        Intel x86
  602: @end menu
  603: 
  604: @node ix86
  605: @section Intel x86
  606: 
  607: @code{ld} can create DLLs that operate with various runtimes available
  608: on a common x86 operating system.  These runtimes include native (using
  609: the mingw "platform"), cygwin, and pw.
  610: @table @emph
  611: 
  612: @item auto-import from DLLs 
  613: @enumerate
  614: @item
  615: With this feature on, DLL clients can import variables from a DLL
  616: without any concern from their side (for example, without any source
  617: code modifications).  Auto-import can be enabled using the 
  618: @code{--enable-auto-import} flag, or disabled via the 
  619: @code{--disable-auto-import} flag.  Auto-import is disabled by default.
  620: 
  621: @item
  622: This is done completely within the bounds of the PE specification (to be
  623: fair, there's a minor violation of the spec at one point, but in practice
  624: auto-import works on all known variants of that common x86 operating
  625: system).  So, the resulting DLL can be used with any other PE
  626: compiler/linker.
  627: 
  628: @item
  629: Auto-import is fully compatible with the standard import method, in which
  630: variables are decorated using attribute modifiers. Libraries of either
  631: type may be mixed together.
  632: 
  633: @item
  634: Overhead (space): 8 bytes per imported symbol, plus 20 for each
  635: reference to it; Overhead (load time): negligible; Overhead 
  636: (virtual/physical memory): should be less than the effect of DLL
  637: relocation.
  638: @end enumerate
  639: 
  640: Motivation
  641: 
  642: The obvious and only way to get rid of dllimport insanity is
  643: to make the client access the variable directly in the DLL, bypassing
  644: the extra dereference imposed by ordinary DLL runtime linking.
  645: I.e., whenever the client contains something like
  646: 
  647: @code{mov dll_var,%eax},
  648: 
  649: the address of dll_var in the instruction should be relocated to point
  650: into the loaded DLL.  The aim is to make the OS loader do so, and then
  651: make ld help with that.  The import section of a PE file is organized
  652: the following way: there's a vector of structures, each describing imports
  653: from a particular DLL.  Each such structure points to two other
  654: parallel vectors: one holding imported names, and one which
  655: will hold the address of the corresponding imported name.  So, the
  656: solution is to de-vectorize these structures, making import
  657: locations sparse and pointing directly into code.
  658: 
  659: Implementation
  660: 
  661: For each reference to a data symbol to be imported from a DLL (a
  662: symbol named <sym> belongs to this set if __imp_<sym> is found in
  663: the import library), an import fixup entry is generated.  That
  664: entry is of type IMAGE_IMPORT_DESCRIPTOR and is stored in the .idata$3
  665: subsection.  Each fixup entry contains a pointer to the symbol's address
  666: within the .text section (marked with a __fuN_<sym> symbol, where N is
  667: an integer), a pointer to the DLL name (so the DLL name is referenced by
  668: multiple entries), and a pointer to the symbol name thunk.  The symbol name
  669: thunk is a singleton vector (__nm_th_<symbol>) pointing to an
  670: IMAGE_IMPORT_BY_NAME structure (__nm_<symbol>) that directly contains the
  671: imported name.  Here comes the ``on the edge'' problem mentioned above:
  672: the PE specification states that the name vector (OriginalFirstThunk) should
  673: run in parallel with the address vector (FirstThunk), i.e., that they
  674: should have the same number of elements and be terminated with zero.  We
  675: violate this, since FirstThunk points directly into machine code.  But in
  676: practice, the OS loader is implemented the sane way: it goes through
  677: OriginalFirstThunk and puts addresses into FirstThunk, and does nothing
  678: else.  It should be noted once again that the DLL and symbol name
  679: structures are reused across fixup entries and must be there
  680: anyway to support the standard import mechanism, so the sustained overhead
  681: is 20 bytes (one IMAGE_IMPORT_DESCRIPTOR) per reference.  Another question is
  682: whether having several IMAGE_IMPORT_DESCRIPTORs for the same DLL is possible.
  683: The answer is yes; it is done even by the native compiler/linker (libth32's
  684: functions are in fact resident in Windows 9x kernel32.dll, so if you use it,
  685: you have two IMAGE_IMPORT_DESCRIPTORs for kernel32.dll).  Yet another question
  686: is whether referencing the same PE structures several times is valid.
  687: The answer is: why not?  Prohibiting that (detecting the violation) would
  688: require more work on the part of the loader than allowing it.
  689: 
  690: @end table
  691: 
  692: @node GNU Free Documentation License
  693: @chapter GNU Free Documentation License
  694: 
  695:                 GNU Free Documentation License
  696:                 
  697:                    Version 1.1, March 2000
  698: 
  699:  Copyright (C) 2000  Free Software Foundation, Inc.
  700:   51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
  701:      
  702:  Everyone is permitted to copy and distribute verbatim copies
  703:  of this license document, but changing it is not allowed.
  704: 
  705: 
  706: 0. PREAMBLE
  707: 
  708: The purpose of this License is to make a manual, textbook, or other
  709: written document "free" in the sense of freedom: to assure everyone
  710: the effective freedom to copy and redistribute it, with or without
  711: modifying it, either commercially or noncommercially.  Secondarily,
  712: this License preserves for the author and publisher a way to get
  713: credit for their work, while not being considered responsible for
  714: modifications made by others.
  715: 
  716: This License is a kind of "copyleft", which means that derivative
  717: works of the document must themselves be free in the same sense.  It
  718: complements the GNU General Public License, which is a copyleft
  719: license designed for free software.
  720: 
  721: We have designed this License in order to use it for manuals for free
  722: software, because free software needs free documentation: a free
  723: program should come with manuals providing the same freedoms that the
  724: software does.  But this License is not limited to software manuals;
  725: it can be used for any textual work, regardless of subject matter or
  726: whether it is publi