glibc/2.7/manual/io.texi

    1: @node I/O Overview, I/O on Streams, Pattern Matching, Top
    2: @c %MENU% Introduction to the I/O facilities
    3: @chapter Input/Output Overview
    4: 
    5: Most programs need to do either input (reading data) or output (writing
    6: data), or most frequently both, in order to do anything useful.  The GNU
    7: C library provides such a large selection of input and output functions
    8: that the hardest part is often deciding which function is most
    9: appropriate!
   10: 
   11: This chapter introduces concepts and terminology relating to input
   12: and output.  Other chapters relating to the GNU I/O facilities are:
   13: 
   14: @itemize @bullet
   15: @item
   16: @ref{I/O on Streams}, which covers the high-level functions
   17: that operate on streams, including formatted input and output.
   18: 
   19: @item
   20: @ref{Low-Level I/O}, which covers the basic I/O and control
   21: functions on file descriptors.
   22: 
   23: @item
   24: @ref{File System Interface}, which covers functions for operating on
   25: directories and for manipulating file attributes such as access modes
   26: and ownership.
   27: 
   28: @item
   29: @ref{Pipes and FIFOs}, which includes information on the basic interprocess
   30: communication facilities.
   31: 
   32: @item
   33: @ref{Sockets}, which covers a more complicated interprocess communication
   34: facility with support for networking.
   35: 
   36: @item
   37: @ref{Low-Level Terminal Interface}, which covers functions for changing
   38: how input and output to terminals or other serial devices are processed.
   39: @end itemize
   40: 
   41: 
   42: @menu
   43: * I/O Concepts::       Some basic information and terminology.
   44: * File Names::         How to refer to a file.
   45: @end menu
   46: 
   47: @node I/O Concepts, File Names,  , I/O Overview
   48: @section Input/Output Concepts
   49: 
   50: Before you can read or write the contents of a file, you must establish
   51: a connection or communications channel to the file.  This process is
   52: called @dfn{opening} the file.  You can open a file for reading, writing,
   53: or both.
   54: @cindex opening a file
   55: 
   56: The connection to an open file is represented either as a stream or as a
   57: file descriptor.  You pass this as an argument to the functions that do
   58: the actual read or write operations, to tell them which file to operate
   59: on.  Certain functions expect streams, and others are designed to
   60: operate on file descriptors.
   61: 
   62: When you have finished reading from or writing to the file, you can
   63: terminate the connection by @dfn{closing} the file.  Once you have
   64: closed a stream or file descriptor, you cannot do any more input or
   65: output operations on it.
   66: 
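As a minimal sketch of the open, operate, close sequence described above,
the two hypothetical helpers below read the first byte of a file, once
through a stream and once through a file descriptor.  The helper names are
illustrations only and are not part of the library.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Read the first byte of FILENAME through a stream.
   Returns the byte, or EOF if the file cannot be opened or is empty.  */
int
first_byte_via_stream (const char *filename)
{
  FILE *stream = fopen (filename, "r");   /* open for reading */
  if (stream == NULL)
    return EOF;
  int c = getc (stream);                  /* the stream names the file */
  fclose (stream);                        /* no further I/O on STREAM */
  return c;
}

/* The same operation through a file descriptor.  */
int
first_byte_via_descriptor (const char *filename)
{
  int fd = open (filename, O_RDONLY);     /* open for reading */
  if (fd < 0)
    return EOF;
  unsigned char byte;
  ssize_t n = read (fd, &byte, 1);        /* the descriptor names the file */
  close (fd);                             /* no further I/O on FD */
  return n == 1 ? byte : EOF;
}
```

Both helpers return the same byte for the same file.  Once @code{fclose}
or @code{close} has returned, the stream or descriptor must not be used
again.
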
   67: @menu
   68: * Streams and File Descriptors::    The GNU Library provides two ways
   69:                                      to access the contents of files.
   70: * File Position::                   The number of bytes from the
   71:                                      beginning of the file.
   72: @end menu
   73: 
   74: @node Streams and File Descriptors, File Position,  , I/O Concepts
   75: @subsection Streams and File Descriptors
   76: 
   77: When you want to do input or output to a file, you have a choice of two
   78: basic mechanisms for representing the connection between your program
   79: and the file: file descriptors and streams.  File descriptors are
   80: represented as objects of type @code{int}, while streams are represented
   81: as @code{FILE *} objects.
   82: 
   83: File descriptors provide a primitive, low-level interface to input and
   84: output operations.  Both file descriptors and streams can represent a
   85: connection to a device (such as a terminal), or a pipe or socket for
   86: communicating with another process, as well as a normal file.  But, if
   87: you want to do control operations that are specific to a particular kind
   88: of device, you must use a file descriptor; there are no facilities to
   89: use streams in this way.  You must also use file descriptors if your
   90: program needs to do input or output in special modes, such as
   91: nonblocking (or polled) input (@pxref{File Status Flags}).
   92: 
   93: Streams provide a higher-level interface, layered on top of the
   94: primitive file descriptor facilities.  The stream interface treats all
   95: kinds of files pretty much alike---the sole exception being the three
   96: styles of buffering that you can choose (@pxref{Stream Buffering}).
   97: 
   98: The main advantage of using the stream interface is that the set of
   99: functions for performing actual input and output operations (as opposed
  100: to control operations) on streams is much richer and more powerful than
  101: the corresponding facilities for file descriptors.  The file descriptor
  102: interface provides only simple functions for transferring blocks of
  103: characters, but the stream interface also provides powerful formatted
  104: input and output functions (@code{printf} and @code{scanf}) as well as
  105: functions for character- and line-oriented input and output.
  106: @c !!! glibc has dprintf, which lets you do printf on an fd.
  107: 
  108: Since streams are implemented in terms of file descriptors, you can
  109: extract the file descriptor from a stream and perform low-level
  110: operations directly on the file descriptor.  You can also initially open
  111: a connection as a file descriptor and then make a stream associated with
  112: that file descriptor.
  113: 
  114: In general, you should stick with using streams rather than file
  115: descriptors, unless there is some specific operation you want to do that
  116: can only be done on a file descriptor.  If you are a beginning
  117: programmer and aren't sure what functions to use, we suggest that you
  118: concentrate on the formatted input functions (@pxref{Formatted Input})
  119: and formatted output functions (@pxref{Formatted Output}).
  120: 
  121: If you are concerned about portability of your programs to systems other
  122: than GNU, you should also be aware that file descriptors are not as
  123: portable as streams.  You can expect any @w{ISO C} implementation to
  124: support streams, but non-GNU systems may not support file descriptors at
  125: all, or may only implement a subset of the GNU functions that operate on
  126: file descriptors.  Most of the file descriptor functions in the GNU
  127: library are included in the POSIX.1 standard, however.
  128: 
  129: @node File Position,  , Streams and File Descriptors, I/O Concepts
  130: @subsection File Position
  131: 
  132: One of the attributes of an open file is its @dfn{file position}, which
  133: keeps track of where in the file the next character is to be read or
  134: written.  In the GNU system, and all POSIX.1 systems, the file position
  135: is simply an integer representing the number of bytes from the beginning
  136: of the file.
  137: 
  138: The file position is normally set to the beginning of the file when it
  139: is opened, and each time a character is read or written, the file
  140: position is incremented.  In other words, access to the file is normally
  141: @dfn{sequential}.
  142: @cindex file position
  143: @cindex sequential-access files
  144: 
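A small sketch of sequential access: the hypothetical function
@code{position_after_reading} writes some text to a temporary file,
returns to the beginning, reads a number of characters, and reports the
file position with @code{ftell}.  The function name is an assumption made
for this example.

```c
#include <stdio.h>

/* Write TEXT to a temporary file, go back to the start, read up to N
   characters, and return the resulting file position (-1 on error).  */
long
position_after_reading (const char *text, int n)
{
  FILE *stream = tmpfile ();
  if (stream == NULL)
    return -1;
  fputs (text, stream);
  rewind (stream);                 /* position 0, as after a fresh open */
  for (int i = 0; i < n; i++)
    if (getc (stream) == EOF)      /* each successful read advances by one */
      break;
  long pos = ftell (stream);
  fclose (stream);
  return pos;
}
```

For the text @code{"hello"}, reading three characters leaves the position
at 3, and trying to read ten leaves it at 5, the end of the file.
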
  145: Ordinary files permit read or write operations at any position within
  146: the file.  Some other kinds of files may also permit this.  Files which
  147: do permit this are sometimes referred to as @dfn{random-access} files.
  148: You can change the file position using the @code{fseek} function on a
  149: stream (@pxref{File Positioning}) or the @code{lseek} function on a file
  150: descriptor (@pxref{I/O Primitives}).  If you try to change the file
  151: position on a file that doesn't support random access, you get the
  152: @code{ESPIPE} error.
  153: @cindex random-access files
  154: 
  155: Streams and descriptors that are opened for @dfn{append access} are
  156: treated specially for output: output to such files is @emph{always}
  157: appended sequentially to the @emph{end} of the file, regardless of the
  158: file position.  However, the file position is still used to control where in
  159: the file reading is done.
  160: @cindex append-access files
  161: 
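The sketch below demonstrates append access on a descriptor under a few
assumptions: the scratch file lives in @file{/tmp}, and the function name
@code{append_demo} and the file name template are made up for this
example.  It moves the file position to the start before writing, and the
output is still placed at the end.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Write "abc" to a scratch file, reopen it for append access, move the
   file position back to the start, and write "X".  The final contents
   are stored in CONTENTS (SIZE bytes, SIZE > 0, null-terminated).
   Returns 0 on success, -1 on error.  */
int
append_demo (char *contents, size_t size)
{
  char name[] = "/tmp/append-demo-XXXXXX";   /* hypothetical scratch name */
  int fd = mkstemp (name);
  if (fd < 0)
    return -1;
  if (write (fd, "abc", 3) != 3)
    {
      close (fd);
      unlink (name);
      return -1;
    }
  close (fd);

  fd = open (name, O_RDWR | O_APPEND);       /* open for append access */
  unlink (name);                    /* file disappears once FD is closed */
  if (fd < 0)
    return -1;

  lseek (fd, 0, SEEK_SET);                   /* position 0 ...  */
  ssize_t written = write (fd, "X", 1);      /* ... output goes to the end */

  lseek (fd, 0, SEEK_SET);                   /* reading honors the position */
  ssize_t n = read (fd, contents, size - 1);
  close (fd);
  if (written != 1 || n < 0)
    return -1;
  contents[n] = '\0';
  return 0;
}
```

With a sufficiently large buffer, the contents read back are
@code{"abcX"}: the @code{X} was appended even though the file position was
0 when it was written, while the final @code{read} started from position 0.
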
  162: If you think about it, you'll realize that several programs can read a
  163: given file at the same time.  In order for each program to be able to
  164: read the file at its own pace, each program must have its own file
  165: position, which is not affected by anything the other programs do.
  166: 
  167: In fact, each opening of a file creates a separate file position.
  168: Thus, if you open a file twice even in the same program, you get two
  169: streams or descriptors with independent file positions.
  170: 
  171: By contrast, if you open a descriptor and then duplicate it to get
  172: another descriptor, these two descriptors share the same file position:
  173: changing the file position of one descriptor will affect the other.
  174: 
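The difference between opening a file twice and duplicating a descriptor
can be sketched as follows.  The function name
@code{position_seen_by_peer} is hypothetical, and the example assumes
that the named file contains at least one byte.

```c
#include <fcntl.h>
#include <unistd.h>

/* Obtain two descriptors for FILENAME, read one byte through the first,
   and return the file position as seen through the second.  If
   DUPLICATE is nonzero the second descriptor is made with dup;
   otherwise the file is opened a second time.  Returns -1 on error.  */
off_t
position_seen_by_peer (const char *filename, int duplicate)
{
  int first = open (filename, O_RDONLY);
  if (first < 0)
    return -1;
  int second = duplicate ? dup (first) : open (filename, O_RDONLY);
  if (second < 0)
    {
      close (first);
      return -1;
    }

  char byte;
  off_t pos = -1;
  if (read (first, &byte, 1) == 1)           /* advances FIRST's position */
    pos = lseek (second, 0, SEEK_CUR);       /* query SECOND's position */

  close (first);
  close (second);
  return pos;
}
```

With @var{duplicate} nonzero the result is 1, because both descriptors
share one file position.  With @var{duplicate} zero the result is 0,
because the second opening has its own position.
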
  175: @node File Names,  , I/O Concepts, I/O Overview
  176: @section File Names
  177: 
  178: In order to open a connection to a file, or to perform other operations
  179: such as deleting a file, you need some way to refer to the file.  Nearly
  180: all files have names that are strings---even files which are actually
  181: devices such as tape drives or terminals.  These strings are called
  182: @dfn{file names}.  You specify the file name to say which file you want
  183: to open or operate on.
  184: 
  185: This section describes the conventions for file names and how the
  186: operating system works with them.
  187: @cindex file name
  188: 
  189: @menu
  190: * Directories::                 Directories contain entries for files.
  191: * File Name Resolution::        A file name specifies how to look up a file.
  192: * File Name Errors::            Error conditions relating to file names.
  193: * File Name Portability::       File name portability and syntax issues.
  194: @end menu
  195: 
  196: 
  197: @node Directories, File Name Resolution,  , File Names
  198: @subsection Directories
  199: 
  200: In order to understand the syntax of file names, you need to understand
  201: how the file system is organized into a hierarchy of directories.
  202: 
  203: @cindex directory
  204: @cindex link
  205: @cindex directory entry
  206: A @dfn{directory} is a file that contains information to associate other
  207: files with names; these associations are called @dfn{links} or
  208: @dfn{directory entries}.  Sometimes, people speak of ``files in a
  209: directory'', but in reality, a directory only contains pointers to
  210: files, not the files themselves.
  211: 
  212: @cindex file name component
  213: The name of a file contained in a directory entry is called a @dfn{file
  214: name component}.  In general, a file name consists of a sequence of one
  215: or more such components, separated by the slash character (@samp{/}).  A
  216: file name which is just one component names a file with respect to its
  217: directory.  A file name with multiple components names a directory, and
  218: then a file in that directory, and so on.
  219: 
  220: Some other documents, such as the POSIX standard, use the term
  221: @dfn{pathname} for what we call a file name, and either @dfn{filename}
  222: or @dfn{pathname component} for what this manual calls a file name
  223: component.  We don't use this terminology because a ``path'' is
  224: something completely different (a list of directories to search), and we
  225: think that ``pathname'' used for something else will confuse users.  We
  226: always use ``file name'' and ``file name component'' (or sometimes just
  227: ``component'', where the context is obvious) in GNU documentation.  Some
  228: macros use the POSIX terminology in their names, such as
  229: @code{PATH_MAX}.  These macros are defined by the POSIX standard, so we
  230: cannot change their names.
  231: 
  232: You can find more detailed information about operations on directories
  233: in @ref{File System Interface}.
  234: 
  235: @node File Name Resolution, File Name Errors, Directories, File Names
  236: @subsection File Name Resolution
  237: 
  238: A file name consists of file name components separated by slash
  239: (@samp{/}) characters.  On the systems that the GNU C library supports,
  240: multiple successive @samp{/} characters are equivalent to a single
  241: @samp{/} character.
  242: 
  243: @cindex file name resolution
  244: The process of determining what file a file name refers to is called
  245: @dfn{file name resolution}.  This is performed by examining the
  246: components that make up a file name in left-to-right order, and locating
  247: each successive component in the directory named by the previous
  248: component.  Of course, each of the files that are referenced as
  249: directories must actually exist, be directories instead of regular
  250: files, and have the appropriate permissions to be accessible by the
  251: process; otherwise the file name resolution fails.
  252: 
  253: @cindex root directory
  254: @cindex absolute file name
  255: If a file name begins with a @samp{/}, the first component in the file
  256: name is located in the @dfn{root directory} of the process (usually all
  257: processes on the system have the same root directory).  Such a file name
  258: is called an @dfn{absolute file name}.
  259: @c !!! xref here to chroot, if we ever document chroot. -rm
  260: 
  261: @cindex relative file name
  262: Otherwise, the first component in the file name is located in the
  263: current working directory (@pxref{Working Directory}).  This kind of
  264: file name is called a @dfn{relative file name}.
  265: 
  266: @cindex parent directory
  267: The file name components @file{.} (``dot'') and @file{..} (``dot-dot'')
  268: have special meanings.  Every directory has entries for these file name
  269: components.  The file name component @file{.} refers to the directory
  270: itself, while the file name component @file{..} refers to its
  271: @dfn{parent directory} (the directory that contains the link for the
  272: directory in question).  As a special case, @file{..} in the root
  273: directory refers to the root directory itself, since it has no parent;
  274: thus @file{/..} is the same as @file{/}.
  275: 
  276: Here are some examples of file names:
  277: 
  278: @table @file
  279: @item /a
  280: The file named @file{a}, in the root directory.
  281: 
  282: @item /a/b
  283: The file named @file{b}, in the directory named @file{a} in the root directory.
  284: 
  285: @item a
  286: The file named @file{a}, in the current working directory.
  287: 
  288: @item /a/./b
  289: This is the same as @file{/a/b}.
  290: 
  291: @item ./a
  292: The file named @file{a}, in the current working directory.
  293: 
  294: @item ../a
  295: The file named @file{a}, in the parent directory of the current working
  296: directory.
  297: @end table
  298: 
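One way to check that two file names resolve to the same file is to
compare the device and file serial numbers reported by @code{stat}.  The
helper @code{same_file} below is a hypothetical illustration, not a
library function.

```c
#include <sys/stat.h>

/* Return 1 if NAME1 and NAME2 resolve to the same file, 0 if they
   resolve to different files, and -1 if either cannot be resolved.  */
int
same_file (const char *name1, const char *name2)
{
  struct stat st1, st2;
  if (stat (name1, &st1) != 0 || stat (name2, &st2) != 0)
    return -1;
  /* Two names refer to the same file when both the device and the
     file serial number match.  */
  return st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino;
}
```

For instance, @code{same_file ("/", "/..")} and
@code{same_file (".", ".//.")} both return 1.
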
  299: @c An empty string may ``work'', but I think it's confusing to
  300: @c try to describe it.  It's not a useful thing for users to use--rms.
  301: A file name that names a directory may optionally end in a @samp{/}.
  302: You can specify a file name of @file{/} to refer to the root directory,
  303: but the empty string is not a meaningful file name.  If you want to
  304: refer to the current working directory, use a file name of @file{.} or
  305: @file{./}.
  306: 
  307: Unlike some other operating systems, the GNU system doesn't have any
  308: built-in support for file types (or extensions) or file versions as part
  309: of its file name syntax.  Many programs and utilities use conventions
  310: for file names---for example, files containing C source code usually
  311: have names suffixed with @samp{.c}---but there is nothing in the file
  312: system itself that enforces this kind of convention.
  313: 
  314: @node File Name Errors, File Name Portability, File Name Resolution, File Names
  315: @subsection File Name Errors
  316: 
  317: @cindex file name errors
  318: @cindex usual file name errors
  319: 
  320: Functions that accept file name arguments usually detect these
  321: @code{errno} error conditions relating to the file name syntax or
  322: trouble finding the named file.  These errors are referred to throughout
  323: this manual as the @dfn{usual file name errors}.
  324: 
  325: @table @code
  326: @item EACCES
  327: The process does not have search permission for a directory component
  328: of the file name.
  329: 
  330: @item ENAMETOOLONG
  331: This error is used when either the total length of a file name is
  332: greater than @code{PATH_MAX}, or when an individual file name component
  333: has a length greater than @code{NAME_MAX}.  @xref{Limits for Files}.
  334: 
  335: In the GNU system, there is no imposed limit on overall file name
  336: length, but some file systems may place limits on the length of a
  337: component.
  338: 
  339: @item ENOENT
  340: This error is reported when a file referenced as a directory component
  341: in the file name doesn't exist, or when a component is a symbolic link
  342: whose target file does not exist.  @xref{Symbolic Links}.
  343: 
  344: @item ENOTDIR
  345: A file that is referenced as a directory component in the file name
  346: exists, but it isn't a directory.
  347: 
  348: @item ELOOP
  349: Too many symbolic links were resolved while trying to look up the file
  350: name.  The system has an arbitrary limit on the number of symbolic links
  351: that may be resolved in looking up a single file name, as a primitive
  352: way to detect loops.  @xref{Symbolic Links}.
  353: @end table
  354: 
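As a sketch of how a program sees these errors, the hypothetical helper
@code{open_error} below tries to open a file and returns the
@code{errno} value if the attempt fails.  The example file names in the
note that follows are assumptions about a typical GNU/Linux system.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to open FILENAME for reading.  Returns 0 if it could be opened,
   or the errno value describing why it could not.  */
int
open_error (const char *filename)
{
  int fd = open (filename, O_RDONLY);
  if (fd < 0)
    return errno;               /* e.g. ENOENT, ENOTDIR, EACCES */
  close (fd);
  return 0;
}
```

On a system where @file{/dev/null} exists, @code{open_error
("/dev/null/x")} returns @code{ENOTDIR}, since a directory component of
the name is not a directory.  A name whose leading directory does not
exist yields @code{ENOENT}.
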
  355: 
  356: @node File Name Portability,  , File Name Errors, File Names
  357: @subsection Portability of File Names
  358: 
  359: The rules for the syntax of file names discussed in @ref{File Names},
  360: are the rules normally used by the GNU system and by other POSIX
  361: systems.  However, other operating systems may use other conventions.
  362: 
  363: There are two reasons why it can be important for you to be aware of
  364: file name portability issues:
  365: 
  366: @itemize @bullet
  367: @item
  368: If your program makes assumptions about file name syntax, or contains
  369: embedded literal file name strings, it is more difficult to get it to
  370: run under other operating systems that use different syntax conventions.
  371: 
  372: @item
  373: Even if you are not concerned about running your program on machines
  374: that run other operating systems, it may still be possible to access
  375: files that use different naming conventions.  For example, you may be
  376: able to access file systems on another computer running a different
  377: operating system over a network, or read and write disks in formats used
  378: by other operating systems.
  379: @end itemize
  380: 
  381: The @w{ISO C} standard says very little about file name syntax, only that
  382: file names are strings.  In addition to varying restrictions on the
  383: length of file names and what characters can validly appear in a file
  384: name, different operating systems use different conventions and syntax
  385: for concepts such as structured directories and file types or
  386: extensions.  Some concepts such as file versions might be supported by
  387: some operating systems and not by others.
  388: 
  389: The POSIX.1 standard allows implementations to put additional
  390: restrictions on file name syntax, both on what characters are
  391: permitted in file names and on the length of file name and file name
  392: component strings.  However, in the GNU system, you do not need to worry
  393: about these restrictions; any character except the null character is
  394: permitted in a file name string, and there are no limits on the length
  395: of file name strings.
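
Individual file systems may still limit the length of a file name
component, as noted under @code{ENAMETOOLONG}.  A program that wants to
know the limit for a particular directory can ask with @code{pathconf}.
The wrapper @code{component_limit} below is a hypothetical sketch.

```c
#include <errno.h>
#include <unistd.h>

/* Return the maximum length of a file name component in DIRECTORY,
   0 if the system reports no fixed limit, or -1 on error.  */
long
component_limit (const char *directory)
{
  errno = 0;
  long limit = pathconf (directory, _PC_NAME_MAX);
  if (limit == -1)
    return errno == 0 ? 0 : -1;   /* errno unchanged means "no limit" */
  return limit;
}
```

The value depends on the file system that holds @var{directory}, so a
portable program should query it rather than assume a constant.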