glibc/2.7/manual/pattern.texi

    1: @node Pattern Matching, I/O Overview, Searching and Sorting, Top
    2: @c %MENU% Matching shell ``globs'' and regular expressions
    3: @chapter Pattern Matching
    4: 
    5: The GNU C Library provides pattern matching facilities for two kinds of
    6: patterns: regular expressions and file-name wildcards.  The library also
    7: provides a facility for expanding variable and command references and
    8: parsing text into words in the way the shell does.
    9: 
   10: @menu
   11: * Wildcard Matching::    Matching a wildcard pattern against a single string.
   12: * Globbing::             Finding the files that match a wildcard pattern.
   13: * Regular Expressions::  Matching regular expressions against strings.
   14: * Word Expansion::       Expanding shell variables, nested commands,
   15:                             arithmetic, and wildcards.
   16:                             This is what the shell does with shell commands.
   17: @end menu
   18: 
   19: @node Wildcard Matching
   20: @section Wildcard Matching
   21: 
   22: @pindex fnmatch.h
   23: This section describes how to match a wildcard pattern against a
   24: particular string.  The result is a yes or no answer: does the
   25: string fit the pattern or not.  The symbols described here are all
   26: declared in @file{fnmatch.h}.
   27: 
   28: @comment fnmatch.h
   29: @comment POSIX.2
   30: @deftypefun int fnmatch (const char *@var{pattern}, const char *@var{string}, int @var{flags})
   31: This function tests whether the string @var{string} matches the pattern
   32: @var{pattern}.  It returns @code{0} if they do match; otherwise, it
   33: returns the nonzero value @code{FNM_NOMATCH}.  The arguments
   34: @var{pattern} and @var{string} are both strings.
   35: 
   36: The argument @var{flags} is a combination of flag bits that alter the
   37: details of matching.  See below for a list of the defined flags.
   38: 
   39: In the GNU C Library, @code{fnmatch} cannot experience an ``error''---it
   40: always returns an answer for whether the match succeeds.  However, other
   41: implementations of @code{fnmatch} might sometimes report ``errors''.
   42: They would do so by returning nonzero values that are not equal to
   43: @code{FNM_NOMATCH}.
   44: @end deftypefun
   45: 
   46: These are the available flags for the @var{flags} argument:
   47: 
   48: @table @code
   49: @comment fnmatch.h
   50: @comment GNU
   51: @item FNM_FILE_NAME
   52: Treat the @samp{/} character specially, for matching file names.  If
   53: this flag is set, wildcard constructs in @var{pattern} cannot match
   54: @samp{/} in @var{string}.  Thus, the only way to match @samp{/} is with
   55: an explicit @samp{/} in @var{pattern}.
   56: 
   57: @comment fnmatch.h
   58: @comment POSIX.2
   59: @item FNM_PATHNAME
   60: This is an alias for @code{FNM_FILE_NAME}; it comes from POSIX.2.  We
   61: don't recommend this name because we don't use the term ``pathname'' for
   62: file names.
   63: 
   64: @comment fnmatch.h
   65: @comment POSIX.2
   66: @item FNM_PERIOD
   67: Treat the @samp{.} character specially if it appears at the beginning of
   68: @var{string}.  If this flag is set, wildcard constructs in @var{pattern}
   69: cannot match @samp{.} as the first character of @var{string}.
   70: 
   71: If you set both @code{FNM_PERIOD} and @code{FNM_FILE_NAME}, then the
   72: special treatment applies to @samp{.} following @samp{/} as well as to
   73: @samp{.} at the beginning of @var{string}.  (The shell uses the
   74: @code{FNM_PERIOD} and @code{FNM_FILE_NAME} flags together for matching
   75: file names.)
   76: 
   77: @comment fnmatch.h
   78: @comment POSIX.2
   79: @item FNM_NOESCAPE
   80: Don't treat the @samp{\} character specially in patterns.  Normally,
   81: @samp{\} quotes the following character, turning off its special meaning
   82: (if any) so that it matches only itself.  When quoting is enabled, the
   83: pattern @samp{\?} matches only the string @samp{?}, because the question
   84: mark in the pattern acts like an ordinary character.
   85: 
   86: If you use @code{FNM_NOESCAPE}, then @samp{\} is an ordinary character.
   87: 
   88: @comment fnmatch.h
   89: @comment GNU
   90: @item FNM_LEADING_DIR
   91: Ignore a trailing sequence of characters starting with a @samp{/} in
   92: @var{string}; that is to say, test whether @var{string} starts with a
   93: directory name that @var{pattern} matches.
   94: 
   95: If this flag is set, either @samp{foo*} or @samp{foobar} as a pattern
   96: would match the string @samp{foobar/frobozz}.
   97: 
   98: @comment fnmatch.h
   99: @comment GNU
  100: @item FNM_CASEFOLD
  101: Ignore case in comparing @var{string} to @var{pattern}.
  102: 
  103: @comment fnmatch.h
  104: @comment GNU
  105: @item FNM_EXTMATCH
  106: @cindex Korn Shell
  107: @pindex ksh
  108: Besides the normal patterns, also recognize the extended patterns
  109: introduced in @file{ksh}.  The patterns are written in the form
  110: explained in the following table, where @var{pattern-list} is a
  111: @code{|}-separated list of patterns.
  112: 
  113: @table @code
  114: @item ?(@var{pattern-list})
  115: The pattern matches if zero or one occurrences of any of the patterns
  116: in the @var{pattern-list} allow matching the input string.
  117: 
  118: @item *(@var{pattern-list})
  119: The pattern matches if zero or more occurrences of any of the patterns
  120: in the @var{pattern-list} allow matching the input string.
  121: 
  122: @item +(@var{pattern-list})
  123: The pattern matches if one or more occurrences of any of the patterns
  124: in the @var{pattern-list} allow matching the input string.
  125: 
  126: @item @@(@var{pattern-list})
  127: The pattern matches if exactly one occurrence of any of the patterns in
  128: the @var{pattern-list} allows matching the input string.
  129: 
  130: @item !(@var{pattern-list})
  131: The pattern matches if the input string cannot be matched with any of
  132: the patterns in the @var{pattern-list}.
  133: @end table
  134: @end table
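
As a minimal sketch of how the flags combine, the helper below matches a name the way the text says the shell does, with both the slash and the leading period treated specially.  The function name @code{shell_match} and its return convention are hypothetical and chosen only for illustration.  It uses the POSIX.2 name @code{FNM_PATHNAME} because the GNU alias @code{FNM_FILE_NAME} may require @code{_GNU_SOURCE} to be defined.

```c
#include <fnmatch.h>

/* Hypothetical helper: return 1 if NAME matches PATTERN under shell-like
   rules, 0 if it does not, and -1 if fnmatch returns some other nonzero
   value (which other implementations may use to report an error).  */
int
shell_match (const char *pattern, const char *name)
{
  /* FNM_PATHNAME: wildcards never match '/'.
     FNM_PERIOD: wildcards never match a leading '.', and together with
     FNM_PATHNAME also not a '.' that directly follows a '/'.  */
  int result = fnmatch (pattern, name, FNM_PATHNAME | FNM_PERIOD);

  if (result == 0)
    return 1;
  if (result == FNM_NOMATCH)
    return 0;
  return -1;
}
```

With these flags, @code{shell_match ("*.c", "foo.c")} reports a match, while @code{shell_match ("*.c", "dir/foo.c")} and @code{shell_match ("*", ".hidden")} do not, because the wildcard may match neither the slash nor the leading period.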
  135: 
  136: @node Globbing
  137: @section Globbing
  138: 
  139: @cindex globbing
  140: The archetypal use of wildcards is for matching against the files in a
  141: directory, and making a list of all the matches.  This is called
  142: @dfn{globbing}.
  143: 
  144: You could do this using @code{fnmatch}, by reading the directory entries
  145: one by one and testing each one with @code{fnmatch}.  But that would be
  146: slow (and complex, since you would have to handle subdirectories by
  147: hand).
  148: 
  149: The library provides a function @code{glob} to make this particular use
  150: of wildcards convenient.  @code{glob} and the other symbols in this
  151: section are declared in @file{glob.h}.
  152: 
  153: @menu
  154: * Calling Glob::             Basic use of @code{glob}.
  155: * Flags for Globbing::       Flags that enable various options in @code{glob}.
  156: * More Flags for Globbing::  GNU specific extensions to @code{glob}.
  157: @end menu
  158: 
  159: @node Calling Glob
  160: @subsection Calling @code{glob}
  161: 
  162: The result of globbing is a vector of file names (strings).  To return
  163: this vector, @code{glob} uses a special data type, @code{glob_t}, which
  164: is a structure.  You pass @code{glob} the address of the structure, and
  165: it fills in the structure's fields to tell you about the results.
  166: 
  167: @comment glob.h
  168: @comment POSIX.2
  169: @deftp {Data Type} glob_t
  170: This data type holds a pointer to a word vector.  More precisely, it
  171: records both the address of the word vector and its size.  The GNU
  172: implementation contains some more fields which are non-standard
  173: extensions.
  174: 
  175: @table @code
  176: @item gl_pathc
  177: The number of elements in the vector, excluding the initial null entries
  178: if the @code{GLOB_DOOFFS} flag is used (see @code{gl_offs} below).
  179: 
  180: @item gl_pathv
  181: The address of the vector.  This field has type @w{@code{char **}}.
  182: 
  183: @item gl_offs
  184: The offset of the first real element of the vector, from its nominal
  185: address in the @code{gl_pathv} field.  Unlike the other fields, this
  186: is always an input to @code{glob}, rather than an output from it.
  187: 
  188: If you use a nonzero offset, then that many elements at the beginning of
  189: the vector are left empty.  (The @code{glob} function fills them with
  190: null pointers.)
  191: 
  192: The @code{gl_offs} field is meaningful only if you use the
  193: @code{GLOB_DOOFFS} flag.  Otherwise, the offset is always zero
  194: regardless of what is in this field, and the first real element comes at
  195: the beginning of the vector.
  196: 
  197: @item gl_closedir
  198: The address of an alternative implementation of the @code{closedir}
  199: function.  It is used if the @code{GLOB_ALTDIRFUNC} bit is set in
  200: the flag parameter.  The type of this field is
  201: @w{@code{void (*) (void *)}}.
  202: 
  203: This is a GNU extension.
  204: 
  205: @item gl_readdir
  206: The address of an alternative implementation of the @code{readdir}
  207: function used to read the contents of a directory.  It is used if the
  208: @code{GLOB_ALTDIRFUNC} bit is set in the flag parameter.  The type of
  209: this field is @w{@code{struct dirent *(*) (void *)}}.
  210: 
  211: This is a GNU extension.
  212: 
  213: @item gl_opendir
  214: The address of an alternative implementation of the @code{opendir}
  215: function.  It is used if the @code{GLOB_ALTDIRFUNC} bit is set in
  216: the flag parameter.  The type of this field is
  217: @w{@code{void *(*) (const char *)}}.
  218: 
  219: This is a GNU extension.
  220: 
  221: @item gl_stat
  222: The address of an alternative implementation of the @code{stat} function
  223: to get information about an object in the filesystem.  It is used if the
  224: @code{GLOB_ALTDIRFUNC} bit is set in the flag parameter.  The type of
  225: this field is @w{@code{int (*) (const char *, struct stat *)}}.
  226: 
  227: This is a GNU extension.
  228: 
  229: @item gl_lstat
  230: The address of an alternative implementation of the @code{lstat}
  231: function to get information about an object in the filesystem, not
  232: following symbolic links.  It is used if the @code{GLOB_ALTDIRFUNC} bit
  233: is set in the flag parameter.  The type of this field is @code{@w{int
  234: (*) (const char *,} @w{struct stat *)}}.
  235: 
  236: This is a GNU extension.
  237: @end table
  238: @end deftp
  239: 
  240: For use with the @code{glob64} function, @file{glob.h} contains another
  241: definition for a very similar type.  @code{glob64_t} differs from
  242: @code{glob_t} only in the types of the members @code{gl_readdir},
  243: @code{gl_stat}, and @code{gl_lstat}.
  244: 
  245: @comment glob.h
  246: @comment GNU
  247: @deftp {Data Type} glob64_t
  248: This data type holds a pointer to a word vector.  More precisely, it
  249: records both the address of the word vector and its size.  The GNU
  250: implementation contains some more fields which are non-standard
  251: extensions.
  252: 
  253: @table @code
  254: @item gl_pathc
  255: The number of elements in the vector, excluding the initial null entries
  256: if the @code{GLOB_DOOFFS} flag is used (see @code{gl_offs} below).
  257: 
  258: @item gl_pathv
  259: The address of the vector.  This field has type @w{@code{char **}}.
  260: 
  261: @item gl_offs
  262: The offset of the first real element of the vector, from its nominal
  263: address in the @code{gl_pathv} field.  Unlike the other fields, this
  264: is always an input to @code{glob}, rather than an output from it.
  265: 
  266: If you use a nonzero offset, then that many elements at the beginning of
  267: the vector are left empty.  (The @code{glob} function fills them with
  268: null pointers.)
  269: 
  270: The @code{gl_offs} field is meaningful only if you use the
  271: @code{GLOB_DOOFFS} flag.  Otherwise, the offset is always zero
  272: regardless of what is in this field, and the first real element comes at
  273: the beginning of the vector.
  274: 
  275: @item gl_closedir
  276: The address of an alternative implementation of the @code{closedir}
  277: function.  It is used if the @code{GLOB_ALTDIRFUNC} bit is set in
  278: the flag parameter.  The type of this field is
  279: @w{@code{void (*) (void *)}}.
  280: 
  281: This is a GNU extension.
  282: 
  283: @item gl_readdir
  284: The address of an alternative implementation of the @code{readdir64}
  285: function used to read the contents of a directory.  It is used if the
  286: @code{GLOB_ALTDIRFUNC} bit is set in the flag parameter.  The type of
  287: this field is @w{@code{struct dirent64 *(*) (void *)}}.
  288: 
  289: This is a GNU extension.
  290: 
  291: @item gl_opendir
  292: The address of an alternative implementation of the @code{opendir}
  293: function.  It is used if the @code{GLOB_ALTDIRFUNC} bit is set in
  294: the flag parameter.  The type of this field is
  295: @w{@code{void *(*) (const char *)}}.
  296: 
  297: This is a GNU extension.
  298: 
  299: @item gl_stat
  300: The address of an alternative implementation of the @code{stat64} function
  301: to get information about an object in the filesystem.  It is used if the
  302: @code{GLOB_ALTDIRFUNC} bit is set in the flag parameter.  The type of
  303: this field is @w{@code{int (*) (const char *, struct stat64 *)}}.
  304: 
  305: This is a GNU extension.
  306: 
  307: @item gl_lstat
  308: The address of an alternative implementation of the @code{lstat64}
  309: function to get information about an object in the filesystem, not
  310: following symbolic links.  It is used if the @code{GLOB_ALTDIRFUNC} bit
  311: is set in the flag parameter.  The type of this field is @code{@w{int
  312: (*) (const char *,} @w{struct stat64 *)}}.
  313: 
  314: This is a GNU extension.
  315: @end table
  316: @end deftp
  317: 
  318: @comment glob.h
  319: @comment POSIX.2
  320: @deftypefun int glob (const char *@var{pattern}, int @var{flags}, int (*@var{errfunc}) (const char *@var{filename}, int @var{error-code}), glob_t *@var{vector-ptr})
  321: The function @code{glob} does globbing using the pattern @var{pattern}
  322: in the current directory.  It puts the result in a newly allocated
  323: vector, and stores the size and address of this vector into
  324: @code{*@var{vector-ptr}}.  The argument @var{flags} is a combination of
  325: bit flags; see @ref{Flags for Globbing}, for details of the flags.
  326: 
  327: The result of globbing is a sequence of file names.  The function
  328: @code{glob} allocates a string for each resulting word, then
  329: allocates a vector of type @code{char **} to store the addresses of
  330: these strings.  The last element of the vector is a null pointer.
  331: This vector is called the @dfn{word vector}.
  332: 
  333: To return this vector, @code{glob} stores both its address and its
  334: length (number of elements, not counting the terminating null pointer)
  335: into @code{*@var{vector-ptr}}.
  336: 
  337: Normally, @code{glob} sorts the file names alphabetically before
  338: returning them.  You can turn this off with the flag @code{GLOB_NOSORT}
  339: if you want to get the information as fast as possible.  Usually it's
  340: a good idea to let @code{glob} sort them---if you process the files in
  341: alphabetical order, the users will have a feel for the rate of progress
  342: that your application is making.
  343: 
  344: If @code{glob} succeeds, it returns 0.  Otherwise, it returns one
  345: of these error codes:
  346: 
  347: @vtable @code
  348: @comment glob.h
  349: @comment POSIX.2
  350: @item GLOB_ABORTED
  351: There was an error opening a directory, and you used the flag
  352: @code{GLOB_ERR} or your specified @var{errfunc} returned a nonzero
  353: value.
  354: @iftex
  355: See below
  356: @end iftex
  357: @ifinfo
  358: @xref{Flags for Globbing},
  359: @end ifinfo
  360: for an explanation of the @code{GLOB_ERR} flag and @var{errfunc}.
  361: 
  362: @comment glob.h
  363: @comment POSIX.2
  364: @item GLOB_NOMATCH
  365: The pattern didn't match any existing files.  If you use the
  366: @code{GLOB_NOCHECK} flag, then you never get this error code, because
  367: that flag tells @code{glob} to @emph{pretend} that the pattern matched
  368: at least one file.
  369: 
  370: @comment glob.h
  371: @comment POSIX.2
  372: @item GLOB_NOSPACE
  373: It was impossible to allocate memory to hold the result.
  374: @end vtable
  375: 
  376: In the event of an error, @code{glob} stores information in
  377: @code{*@var{vector-ptr}} about all the matches it has found so far.
  378: 
  379: It is important to notice that the @code{glob} function will not fail if
  380: it encounters directories or files which cannot be handled without the
  381: LFS interfaces.  The implementation of @code{glob} is supposed to use
  382: these functions internally.  This, at least, is the assumption made by
  383: the Unix standard.  The GNU extension of allowing the user to provide
  384: their own directory handling and @code{stat} functions complicates things
  385: a bit.  If these callback functions are used and a large file or directory
  386: is encountered, @code{glob} @emph{can} fail.
  387: @end deftypefun
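
The calling sequence described above can be sketched as follows.  The function name @code{list_matches} and its return convention are hypothetical.  The structure is cleared before the call so that releasing it with @code{globfree} is harmless on every path; this is a defensive choice made for the sketch, not a requirement stated in this section.

```c
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print every file name matching PATTERN, one per
   line.  Return the number of matches, 0 if nothing matched, or -1 if
   glob failed with GLOB_ABORTED or GLOB_NOSPACE.  */
int
list_matches (const char *pattern)
{
  glob_t results;
  size_t i;
  int status;
  int count;

  memset (&results, 0, sizeof results);
  status = glob (pattern, 0, NULL, &results);

  if (status == 0)
    {
      /* gl_pathc counts the names; gl_pathv[gl_pathc] is a null pointer.  */
      for (i = 0; i < results.gl_pathc; i++)
        puts (results.gl_pathv[i]);
      count = (int) results.gl_pathc;
    }
  else if (status == GLOB_NOMATCH)
    count = 0;
  else
    count = -1;

  /* On an error glob may still have stored partial results, so the
     structure is released in every case.  */
  globfree (&results);
  return count;
}
```

A call such as @code{list_matches ("*.c")} would print the C source files in the current directory in sorted order, since no @code{GLOB_NOSORT} flag is given.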
  388: 
  389: @comment glob.h
  390: @comment GNU
  391: @deftypefun int glob64 (const char *@var{pattern}, int @var{flags}, int (*@var{errfunc}) (const char *@var{filename}, int @var{error-code}), glob64_t *@var{vector-ptr})
  392: The @code{glob64} function was added as part of the Large File Summit
  393: extensions but is not part of the original LFS proposal.  The reason for
  394: this is simple: it is not necessary.  The need for a @code{glob64}
  395: function arises from the extension in the GNU @code{glob}
  396: implementation that lets the user provide their own directory handling
  397: and @code{stat} functions.  The @code{readdir} and @code{stat} functions
  398: do depend on the choice of @code{_FILE_OFFSET_BITS} since the definition
  399: of the types @code{struct dirent} and @code{struct stat} will change
  400: depending on the choice.
  401: 
  402: Besides this difference, @code{glob64} works just like @code{glob} in
  403: all aspects.
  404: 
  405: This function is a GNU extension.
  406: @end deftypefun
  407: 
  408: @node Flags for Globbing
  409: @subsection Flags for Globbing
  410: 
  411: This section describes the flags that you can specify in the
  412: @var{flags} argument to @code{glob}.  Choose the flags you want,
  413: and combine them with the C bitwise OR operator @code{|}.
  414: 
  415: @vtable @code
  416: @comment glob.h
  417: @comment POSIX.2
  418: @item GLOB_APPEND
  419: Append the words from this expansion to the vector of words produced by
  420: previous calls to @code{glob}.  This way you can effectively expand
  421: several words as if they were concatenated with spaces between them.
  422: 
  423: In order for appending to work, you must not modify the contents of the
  424: word vector structure between calls to @code{glob}.  And, if you set
  425: @code{GLOB_DOOFFS} in the first call to @code{glob}, you must also
  426: set it when you append to the results.
  427: 
  428: Note that the pointer stored in @code{gl_pathv} may no longer be valid
  429: after you call @code{glob} the second time, because @code{glob} might
  430: have relocated the vector.  So always fetch @code{gl_pathv} from the
  431: @code{glob_t} structure after each @code{glob} call; @strong{never} save
  432: the pointer across calls.
  433: 
  434: @comment glob.h
  435: @comment POSIX.2
  436: @item GLOB_DOOFFS
  437: Leave blank slots at the beginning of the vector of words.
  438: The @code{gl_offs} field says how many slots to leave.
  439: The blank slots contain null pointers.
  440: 
  441: @comment glob.h
  442: @comment POSIX.2
  443: @item GLOB_ERR
  444: Give up right away and report an error if there is any difficulty
  445: reading the directories that must be read in order to expand @var{pattern}
  446: fully.  Such difficulties might include a directory in which you don't
  447: have the requisite access.  Normally, @code{glob} tries its best to keep
  448: on going despite any errors, reading whatever directories it can.
  449: 
  450: You can exercise even more control than this by specifying an
  451: error-handler function @var{errfunc} when you call @code{glob}.  If
  452: @var{errfunc} is not a null pointer, then @code{glob} doesn't give up
  453: right away when it can't read a directory; instead, it calls
  454: @var{errfunc} with two arguments, like this:
  455: 
  456: @smallexample
  457: (*@var{errfunc}) (@var{filename}, @var{error-code})
  458: @end smallexample
  459: 
  460: @noindent
  461: The argument @var{filename} is the name of the directory that
  462: @code{glob} couldn't open or couldn't read, and @var{error-code} is the
  463: @code{errno} value that was reported to @code{glob}.
  464: 
  465: If the error handler function returns nonzero, then @code{glob} gives up
  466: right away.  Otherwise, it continues.
  467: 
  468: @comment glob.h
  469: @comment POSIX.2
  470: @item GLOB_MARK
  471: If the pattern matches the name of a directory, append @samp{/} to the
  472: directory's name when returning it.
  473: 
  474: @comment glob.h
  475: @comment POSIX.2
  476: @item GLOB_NOCHECK
  477: If the pattern doesn't match any file names, return the pattern itself
  478: as if it were a file name that had been matched.  (Normally, when the
  479: pattern doesn't match anything, @code{glob} returns that there were no
  480: matches.)
  481: 
  482: @comment glob.h
  483: @comment POSIX.2
  484: @item GLOB_NOSORT
  485: Don't sort the file names; return them in no particular order.
  486: (In practice, the order will depend on the order of the entries in
  487: the directory.)  The only reason @emph{not} to sort is to save time.
  488: 
  489: @comment glob.h
  490: @comment POSIX.2
  491: @item GLOB_NOESCAPE
  492: Don't treat the @samp{\} character specially in patterns.  Normally,
  493: @samp{\} quotes the following character, turning off its special meaning
  494: (if any) so that it matches only itself.  When quoting is enabled, the
  495: pattern @samp{\?} matches only the string @samp{?}, because the question
  496: mark in the pattern acts like an ordinary character.
  497: 
  498: If you use @code{GLOB_NOESCAPE}, then @samp{\} is an ordinary character.
  499: 
  500: @code{glob} does its work by calling the function @code{fnmatch}
  501: repeatedly.  It handles the flag @code{GLOB_NOESCAPE} by turning on the
  502: @code{FNM_NOESCAPE} flag in calls to @code{fnmatch}.
  503: @end vtable
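
The interplay of @code{GLOB_DOOFFS}, @code{GLOB_APPEND} and @var{errfunc} can be sketched as below.  The names @code{report_glob_error} and @code{build_ls_args} are hypothetical, as is the choice of filling the reserved slots with the words @samp{ls} and @samp{-l}.  The sketch sets @code{GLOB_DOOFFS} on both calls and reads @code{gl_pathv} only after the last call, as the descriptions above require.

```c
#include <glob.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error handler: report the directory that could not be
   read and return 0 so that glob keeps going.  */
int
report_glob_error (const char *filename, int error_code)
{
  fprintf (stderr, "glob: %s: %s\n", filename, strerror (error_code));
  return 0;
}

/* Hypothetical helper: fill *ARGS with a vector of the form
   { "ls", "-l", matches of PATTERN1, matches of PATTERN2, NULL }.
   Return 0 on success or the error code reported by glob.  */
int
build_ls_args (const char *pattern1, const char *pattern2, glob_t *args)
{
  int status;

  memset (args, 0, sizeof *args);
  args->gl_offs = 2;            /* Reserve two slots at the front.  */

  status = glob (pattern1, GLOB_DOOFFS | GLOB_NOCHECK,
                 report_glob_error, args);
  if (status != 0)
    return status;

  /* GLOB_DOOFFS was set on the first call, so it is set here as well.  */
  status = glob (pattern2, GLOB_DOOFFS | GLOB_NOCHECK | GLOB_APPEND,
                 report_glob_error, args);
  if (status != 0)
    return status;

  /* Fetch gl_pathv only now; the second call may have moved the vector.  */
  args->gl_pathv[0] = (char *) "ls";
  args->gl_pathv[1] = (char *) "-l";
  return 0;
}
```

Because of @code{GLOB_NOCHECK}, a pattern that matches nothing appears in the vector unchanged, which is what a shell would hand to the command.  The caller releases the vector with @code{globfree} when it is done.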
  504: 
  505: @node More Flags for Globbing
  506: @subsection More Flags for Globbing
  507: 
  508: Besides the flags described in the last section, the GNU implementation of
  509: @code{glob} allows a few more flags which are also defined in the
  510: @file{glob.h} file.  Some of the extensions implement functionality
  511: which is available in modern shell implementations.
  512: 
  513: @vtable @code
  514: @comment glob.h
  515: @comment GNU
  516: @item GLOB_PERIOD
  517: The @code{.} character (period) is treated specially.  It cannot be
  518: matched by wildcards.  @xref{Wildcard Matching}, @code{FNM_PERIOD}.
  519: 
  520: @comment glob.h
  521: @comment GNU
  522: @item GLOB_MAGCHAR
  523: The @code{GLOB_MAGCHAR} value is not to be given to @code{glob} in the
  524: @var{flags} parameter.  Instead, @code{glob} sets this bit in the
  525: @code{gl_flags} element of the @code{glob_t} structure provided as the
  526: result if the pattern used for matching contains any wildcard character.
  527: 
  528: @comment glob.h
  529: @comment GNU
  530: @item GLOB_ALTDIRFUNC
  531: Instead of using the normal functions for accessing the
  532: filesystem, the @code{glob} implementation uses the user-supplied
  533: functions specified in the structure pointed to by the @var{pglob}
  534: parameter.  For more information about these functions, refer to the
  535: sections about directory handling; see @ref{Accessing Directories}, and
  536: @ref{Reading Attributes}.
  537: 
  538: @comment glob.h
  539: @comment GNU
  540: @item GLOB_BRACE
  541: If this flag is given, the handling of braces in the pattern is changed.
  542: Braces must then be correctly grouped; i.e., for each
  543: opening brace there must be a closing one.  Braces can be nested,
  544: so it is possible to define one brace expression inside
  545: another one.  It is important to note that the range of each brace
  546: expression is completely contained in the outer brace expression (if
  547: there is one).
  548: 
  549: The string between the matching braces is separated into single
  550: expressions by splitting at @code{,} (comma) characters.  The commas
  551: themselves are discarded.  Because brace expressions can be nested,
  552: the commas used to separate the subexpressions must
  553: be at the same level.  Commas in nested brace subexpressions are not
  554: treated as separators at this level; they are used during expansion of
  555: the brace expression of the deeper level.  For example, the call
  556: 
  557: @smallexample
  558: glob ("@{foo/@{,bar,biz@},baz@}", GLOB_BRACE, NULL, &result)
  559: @end smallexample
  560: 
  561: @noindent
  562: is equivalent to the sequence
  563: 
  564: @smallexample
  565: glob ("foo/", GLOB_BRACE, NULL, &result)
  566: glob ("foo/bar", GLOB_BRACE|GLOB_APPEND, NULL, &result)
  567: glob ("foo/biz", GLOB_BRACE|GLOB_APPEND, NULL, &result)
  568: glob ("baz", GLOB_BRACE|GLOB_APPEND, NULL, &result)
  569: @end smallexample
  570: 
  571: @noindent
  572: if we leave aside error handling.
  573: 
  574: @comment glob.h
  575: @comment GNU
  576: @item GLOB_NOMAGIC
  577: If the pattern contains no wildcard constructs (it is a literal file name),
  578: return it as the sole ``matching'' word, even if no file exists by that name.
  579: 
  580: @comment glob.h
  581: @comment GNU
  582: @item GLOB_TILDE
  583: If this flag is used, the character @code{~} (tilde) is handled specially
  584: if it appears at the beginning of the pattern.  Instead of being taken
  585: verbatim, it is used to represent the home directory of a known user.
  586: 
  587: If @code{~} is the only character in the pattern or it is followed by a
  588: @code{/} (slash), the home directory of the process owner is
  589: substituted.  Using @code{getlogin} and @code{getpwnam} the information
  590: is read from the system databases.  As an example take user @code{bart}
  591: with his home directory at @file{/home/bart}.  For him a call like
  592: 
  593: @smallexample
  594: glob ("~/bin/*", GLOB_TILDE, NULL, &result)
  595: @end smallexample
  596: 
  597: @noindent
  598: would return the contents of the directory @file{/home/bart/bin}.
  599: Instead of referring to your own home directory, it is also possible to
  600: name the home directory of another user.  To do so, append the
  601: user name after the tilde character.  So the contents of user
  602: @code{homer}'s @file{bin} directory can be retrieved by
  603: 
  604: @smallexample
  605: glob ("~homer/bin/*", GLOB_TILDE, NULL, &result)
  606: @end smallexample
  607: 
  608: If the user name is not valid or the home directory cannot be determined
  609: for some reason, the pattern is left untouched and itself used as the
  610: result.  I.e., if in the last example @code{homer} is not available, the
  611: tilde expansion yields @code{"~homer/bin/*"} and @code{glob} is not
  612: looking for a directory named @code{~homer}.
  613: 
  614: This functionality is equivalent to what is available in C-shells if the
  615: @code{nonomatch} flag is set.
  616: 
  617: @comment glob.h
  618: @comment GNU
  619: @item GLOB_TILDE_CHECK
  620: If this flag is used, @code{glob} behaves as if @code{GLOB_TILDE} were
  621: given.  The only difference is that if the user name is not available or
  622: the home directory cannot be determined for other reasons, this leads to
  623: an error.  @code{glob} will return @code{GLOB_NOMATCH} instead of using
  624: the pattern itself as the name.
  625: 
  626: This functionality is equivalent to what is available in C-shells if
  627: the @code{nonomatch} flag is not set.
  628: 
  629: @comment glob.h
  630: @comment GNU
  631: @item GLOB_ONLYDIR
  632: If this flag is used, the globbing function takes this as a
  633: @strong{hint} that the caller is only interested in directories
  634: matching the pattern.  If the information about the type of the file
  635: is easily available, non-directories will be rejected, but no extra
  636: work will be done to determine the information for each file.  I.e.,
  637: the caller must still be prepared to filter out non-directories.
  638: 
  639: This functionality is only available with the GNU @code{glob}
  640: implementation.  It is mainly used internally to increase the
  641: performance but might be useful for a user as well and therefore is
  642: documented here.
  643: @end vtable
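
The equivalence claimed for @code{GLOB_BRACE} can be tried out with a sketch like the one below.  Both function names are hypothetical.  @code{count_brace_matches} lets @code{glob} expand the braces, and @code{count_by_appending} performs the corresponding sequence of calls by hand, adding @code{GLOB_APPEND} once an earlier call has produced results.  The sketch assumes the GNU C Library, since @code{GLOB_BRACE} is a GNU extension.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* GLOB_BRACE is a GNU extension.  */
#endif
#include <glob.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expand PATTERN with brace expansion enabled and
   return the number of file names found, or -1 on an error other than
   GLOB_NOMATCH.  */
int
count_brace_matches (const char *pattern)
{
  glob_t result;
  int status;
  int count;

  memset (&result, 0, sizeof result);
  status = glob (pattern, GLOB_BRACE, NULL, &result);
  if (status == 0)
    count = (int) result.gl_pathc;
  else if (status == GLOB_NOMATCH)
    count = 0;
  else
    count = -1;
  globfree (&result);
  return count;
}

/* Hypothetical helper: glob each of the N patterns in turn into one
   vector, as in the hand-written sequence, and return the total number
   of file names found, or -1 on an error other than GLOB_NOMATCH.  */
int
count_by_appending (const char *const *patterns, size_t n)
{
  glob_t result;
  int have_results = 0;
  int count;
  size_t i;

  memset (&result, 0, sizeof result);
  for (i = 0; i < n; i++)
    {
      int flags = GLOB_BRACE | (have_results ? GLOB_APPEND : 0);
      int status = glob (patterns[i], flags, NULL, &result);

      if (status == 0)
        have_results = 1;
      else if (status != GLOB_NOMATCH)
        {
          globfree (&result);
          return -1;
        }
    }
  count = have_results ? (int) result.gl_pathc : 0;
  globfree (&result);
  return count;
}
```

For a pattern such as @code{"@{foo/@{,bar,biz@},baz@}"}, the first function should report the same count as the second one called with the four patterns @code{"foo/"}, @code{"foo/bar"}, @code{"foo/biz"} and @code{"baz"}.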
  644: 
  645: Calling @code{glob} will in most cases allocate resources which are used
  646: to represent the result of the function call.  If the same object of
  647: type @code{glob_t} is used in multiple calls to @code{glob}, the resources
  648: are freed or reused so that no leaks appear.  But this does not cover
  649: the time after the last @code{glob} call, when you must free them yourself.
  650: 
  651: @comment glob.h
  652: @comment POSIX.2
  653: @deftypefun void globfree (glob_t *@var{pglob})
  654: The @code{globfree} function frees all resources allocated by previous
  655: calls to @code{glob} associated with the object pointed to by
  656: @var{pglob}.  This function should be called whenever the
  657: @code{glob_t} object is no longer needed.
  658: @end deftypefun
  659: 
  660: @comment glob.h
  661: @comment GNU
  662: @deftypefun void globfree64 (glob64_t *@var{pglob})
  663: This function is equivalent to @code{globfree} but it frees records of
  664: type @code{glob64_t} which were allocated by @code{glob64}.
  665: @end deftypefun
  666: 
  667: 
  668: @node Regular Expressions
  669: @section Regular Expression Matching
  670: 
  671: The GNU C library supports two interfaces for matching regular
  672: expressions.  One is the standard POSIX.2 interface, and the other is
  673: what the GNU system has had for many years.
  674: 
  675: Both interfaces are declared in the header file @file{regex.h}.
  676: If you define @w{@code{_POSIX_C_SOURCE}}, then only the POSIX.2
  677: functions, structures, and constants are declared.
  678: @c !!! we only document the POSIX.2 interface here!!
  679: 
  680: @menu
  681: * POSIX Regexp Compilation::    Using @code{regcomp} to prepare to match.
  682: * Flags for POSIX Regexps::     Syntax variations for @code{regcomp}.
  683: * Matching POSIX Regexps::      Using @code{regexec} to match the compiled
  684:                                    pattern that you get from @code{regcomp}.
  685: * Regexp Subexpressions::       Finding which parts of the string were matched.
  686: * Subexpression Complications:: Fine points of which parts were matched.
  687: * Regexp Cleanup::              Freeing storage; reporting errors.
  688: @end menu
  689: 
  690: @node POSIX Regexp Compilation
  691: @subsection POSIX Regular Expression Compilation
  692: 
  693: Before you can actually match a regular expression, you must
  694: @dfn{compile} it.  This is not true compilation---it produces a special
  695: data structure, not machine instructions.  But it is like ordinary
  696: compilation in that its purpose is to enable you to ``execute'' the
  697: pattern fast.  (@xref{Matching POSIX Regexps}, for how to use the
  698: compiled regular expression for matching.)
  699: 
  700: There is a special data type for compiled regular expressions:
  701: 
  702: @comment regex.h
  703: @comment POSIX.2
  704: @deftp {Data Type} regex_t
  705: This type of object holds a compiled regular expression.
  706: It is actually a structure.  It has just one field that your programs
  707: should look at:
  708: 
  709: @table @code
  710: @item re_nsub
  711: This field holds the number of parenthetical subexpressions in the
  712: regular expression that was compiled.
  713: @end table
  714: 
  715: There are several other fields, but we don't describe them here, because
  716: only the functions in the library should use them.
  717: @end deftp
  718: 
  719: After you create a @code{regex_t} object, you can compile a regular
  720: expression into it by calling @code{regcomp}.
  721: 
  722: @comment regex.h
  723: @comment POSIX.2
  724: @deftypefun int regcomp (regex_t *restrict @var{compiled}, const char *restrict @var{pattern}, int @var{cflags})
  725: The function @code{regcomp} ``compiles'' a regular expression into a
  726: data structure that you can use with @code{regexec} to match against a
  727: string.  The compiled regular expression format is designed for
  728: efficient matching.  @code{regcomp} stores it into @code{*@var{compiled}}.
  729: 
  730: It's up to you to allocate an object of type @code{regex_t} and pass its
  731: address to @code{regcomp}.
  732: 
  733: The argument @var{cflags} lets you specify various options that control
  734: the syntax and semantics of regular expressions.  @xref{Flags for POSIX
  735: Regexps}.
  736: 
  737: If you use the flag @code{REG_NOSUB}, then @code{regcomp} omits from
  738: the compiled regular expression the information necessary to record
  739: how subexpressions actually match.  In this case, you might as well
  740: pass @code{0} for the @var{matchptr} and @var{nmatch} arguments when
  741: you call @code{regexec}.
  742: 
  743: If you don't use @code{REG_NOSUB}, then the compiled regular expression
  744: does have the capacity to record how subexpressions match.  Also,
  745: @code{regcomp} tells you how many subexpressions @var{pattern} has, by
  746: storing the number in @code{@var{compiled}->re_nsub}.  You can use that
  747: value to decide how long an array to allocate to hold information about
  748: subexpression matches.
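
The two cases just described can be sketched as follows.  The function names are hypothetical, and the sketch assumes the @code{REG_EXTENDED} flag and the @code{regfree} function from @file{regex.h}, neither of which has been introduced at this point.  The first function reads @code{re_nsub} after a successful compilation; the second compiles with @code{REG_NOSUB} and therefore passes @code{0} and a null pointer to @code{regexec}.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return the number of parenthetical
   subexpressions in PATTERN, or -1 if it does not compile.  */
long
count_subexpressions (const char *pattern)
{
  regex_t compiled;
  long count;

  if (regcomp (&compiled, pattern, REG_EXTENDED) != 0)
    return -1;
  /* An array of re_nsub + 1 elements would hold the whole match plus
     one entry for each subexpression.  */
  count = (long) compiled.re_nsub;
  regfree (&compiled);
  return count;
}

/* Hypothetical helper: return 1 if STRING matches PATTERN, 0 if it does
   not, and -1 if PATTERN does not compile.  No subexpression
   information is recorded because of REG_NOSUB.  */
int
pattern_matches (const char *pattern, const char *string)
{
  regex_t compiled;
  int status;

  if (regcomp (&compiled, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  status = regexec (&compiled, string, 0, NULL, 0);
  regfree (&compiled);
  return status == 0;
}
```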
  749: 
  750: @code{regcomp} returns @co