
@node String and Array Utilities, Character Set Handling, Character Handling, Top
@c %MENU% Utilities for copying and comparing strings and arrays
@chapter String and Array Utilities

Operations on strings (or arrays of characters) are an important part of
many programs.  The GNU C library provides an extensive set of string
utility functions, including functions for copying, concatenating,
comparing, and searching strings.  Many of these functions can also
operate on arbitrary regions of storage; for example, the @code{memcpy}
function can be used to copy the contents of any kind of array.

It's fairly common for beginning C programmers to ``reinvent the wheel''
by duplicating this functionality in their own code, but it pays to
become familiar with the library functions and to make use of them,
since this offers benefits in maintenance, efficiency, and portability.

For instance, you could easily compare one string to another in two
lines of C code, but if you use the built-in @code{strcmp} function,
you're less likely to make a mistake.  And, since these library
functions are typically highly optimized, your program may run faster
too.

@menu
* Representation of Strings::   Introduction to basic concepts.
* String/Array Conventions::    Whether to use a string function or an
                                 arbitrary array function.
* String Length::               Determining the length of a string.
* Copying and Concatenation::   Functions to copy the contents of strings
                                 and arrays.
* String/Array Comparison::     Functions for byte-wise and character-wise
                                 comparison.
* Collation Functions::         Functions for collating strings.
* Search Functions::            Searching for a specific element or substring.
* Finding Tokens in a String::  Splitting a string into tokens by looking
                                 for delimiters.
* strfry::                      Function for flash-cooking a string.
* Trivial Encryption::          Obscuring data.
* Encode Binary Data::          Encoding and Decoding of Binary Data.
* Argz and Envz Vectors::       Null-separated string vectors.
@end menu

@node Representation of Strings
@section Representation of Strings
@cindex string, representation of

This section is a quick summary of string concepts for beginning C
programmers.  It describes how character strings are represented in C
and some common pitfalls.  If you are already familiar with this
material, you can skip this section.

@cindex string
@cindex multibyte character string
A @dfn{string} is an array of @code{char} objects.  But string-valued
variables are usually declared to be pointers of type @code{char *}.
Such variables do not include space for the text of a string; that has
to be stored somewhere else---in an array variable, a string constant,
or dynamically allocated memory (@pxref{Memory Allocation}).  It's up to
you to store the address of the chosen memory space into the pointer
variable.  Alternatively you can store a @dfn{null pointer} in the
pointer variable.  The null pointer does not point anywhere, so
attempting to reference the string it points to gets an error.

@cindex wide character string
The term ``string'' normally refers to multibyte character strings as
opposed to wide character strings.  Wide character strings are arrays
of @code{wchar_t} objects and, as with multibyte character strings,
are usually accessed through pointers of type @code{wchar_t *}.

@cindex null character
@cindex null wide character
By convention, a @dfn{null character}, @code{'\0'}, marks the end of a
multibyte character string and the @dfn{null wide character},
@code{L'\0'}, marks the end of a wide character string.  For example, in
testing to see whether the @code{char *} variable @var{p} points to a
null character marking the end of a string, you can write
@code{!*@var{p}} or @code{*@var{p} == '\0'}.
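
The termination convention just described can be sketched as follows.
The helpers @code{count_bytes} and @code{count_wide_chars} are
hypothetical names chosen for this illustration; they are not library
functions, and in real code you would normally call @code{strlen} or
@code{wcslen} instead.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not a library function: count the bytes that
   precede the terminating null character of S.  */
size_t
count_bytes (const char *s)
{
  const char *p = s;

  assert (s != NULL);
  while (*p != '\0')            /* Stop at the null character.  */
    ++p;
  return (size_t) (p - s);
}

/* Hypothetical wide character counterpart: stop at the null wide
   character instead.  */
size_t
count_wide_chars (const wchar_t *ws)
{
  const wchar_t *p = ws;

  assert (ws != NULL);
  while (*p != L'\0')           /* Stop at the null wide character.  */
    ++p;
  return (size_t) (p - ws);
}
```

Both loops rely only on the terminator; neither knows how much memory
was allocated for the array it walks.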

A null character is quite different conceptually from a null pointer,
although both are represented by the integer @code{0}.

@cindex string literal
@dfn{String literals} appear in C program source as strings of
characters between double-quote characters (@samp{"}).  If the initial
double-quote character is immediately preceded by a capital @samp{L}
(ell) character (as in @code{L"foo"}), the literal is a wide character
string literal.  In @w{ISO C}, string literals can also be formed by
@dfn{string concatenation}: @code{"a" "b"} is the same as @code{"ab"}.
For wide character strings one can either use @code{L"a" L"b"} or
@code{L"a" "b"}.  Modification of string literals is not allowed by the
GNU C compiler, because literals are placed in read-only storage.

Character arrays that are declared @code{const} cannot be modified
either.  It's generally good style to declare non-modifiable string
pointers to be of type @code{const char *}, since this often allows the
C compiler to detect accidental modifications as well as providing some
amount of documentation about what your program intends to do with the
string.

The amount of memory allocated for the character array may extend past
the null character that normally marks the end of the string.  In this
document, the term @dfn{allocated size} is always used to refer to the
total amount of memory allocated for the string, while the term
@dfn{length} refers to the number of characters up to (but not
including) the terminating null character.
@cindex length of string
@cindex allocation size of string
@cindex size of string
@cindex string length
@cindex string allocation

A notorious source of program bugs is trying to put more characters in a
string than fit in its allocated size.
When writing code that extends
strings or moves characters into a pre-allocated array, you should be
very careful to keep track of the length of the text and make explicit
checks for overflowing the array.  Many of the library functions
@emph{do not} do this for you!  Remember also that you need to allocate
an extra byte to hold the null character that marks the end of the
string.

@cindex single-byte string
@cindex multibyte string
Originally strings were sequences of bytes where each byte represents a
single character.  This is still true today if the strings are encoded
using a single-byte character encoding.  Things are different if the
strings are encoded using a multibyte encoding (for more information on
encodings see @ref{Extended Char Intro}).  There is no difference in
the programming interface for these two kinds of strings; the programmer
has to be aware of this and interpret the byte sequences accordingly.

But since there is no separate interface taking care of these
differences, the byte-based string functions are sometimes hard to use.
Since the count parameters of these functions specify bytes, a call to
@code{strncpy} could cut a multibyte character in the middle and put an
incomplete (and therefore unusable) byte sequence in the target buffer.

@cindex wide character string
To avoid these problems, later versions of the @w{ISO C} standard
introduced a second set of functions that operate on @dfn{wide
characters} (@pxref{Extended Char Intro}).  These functions don't have
the problems the single-byte versions have since every wide character is
a legal, interpretable value.  This does not mean that cutting wide
character strings at arbitrary points is without problems.
It normally
is for alphabet-based languages (except for non-normalized text) but
languages based on syllables still have the problem that more than one
wide character is necessary to complete a logical unit.  This is a
higher level problem which the @w{C library} functions are not designed
to solve.  But it is at least good that no invalid byte sequences can be
created.  Also, higher level functions can operate much more easily on
wide characters than on multibyte characters, so a general piece of
advice is to use wide characters internally whenever text is more than
simply copied.

The remainder of this chapter discusses the functions for handling
wide character strings in parallel with the discussion of the multibyte
character strings, since there is almost always an exact equivalent
available.

@node String/Array Conventions
@section String and Array Conventions

This chapter describes both functions that work on arbitrary arrays or
blocks of memory, and functions that are specific to null-terminated
arrays of characters and wide characters.

Functions that operate on arbitrary blocks of memory have names
beginning with @samp{mem} and @samp{wmem} (such as @code{memcpy} and
@code{wmemcpy}) and invariably take an argument which specifies the size
(in bytes and wide characters respectively) of the block of memory to
operate on.  The array arguments and return values for these functions
have type @code{void *} or @code{wchar_t *}.  As a matter of style, the
elements of the arrays used with the @samp{mem} functions are referred
to as ``bytes''.  You can pass any kind of pointer to these functions,
and the @code{sizeof} operator is useful in computing the value for the
size argument.  Parameters to the @samp{wmem} functions must be of type
@code{wchar_t *}.
These functions are not really usable with anything
but arrays of this type.

In contrast, functions that operate specifically on strings and wide
character strings have names beginning with @samp{str} and @samp{wcs}
respectively (such as @code{strcpy} and @code{wcscpy}) and look for a
null character to terminate the string instead of requiring an explicit
size argument to be passed.  (Some of these functions accept a specified
maximum length, but they also check for premature termination with a
null character.)  The array arguments and return values for these
functions have type @code{char *} and @code{wchar_t *} respectively, and
the array elements are referred to as ``characters'' and ``wide
characters''.

In many cases, there are both @samp{mem} and @samp{str}/@samp{wcs}
versions of a function.  The one that is more appropriate to use depends
on the exact situation.  When your program is manipulating arbitrary
arrays or blocks of storage, then you should always use the @samp{mem}
functions.  On the other hand, when you are manipulating null-terminated
strings it is usually more convenient to use the @samp{str}/@samp{wcs}
functions, unless you already know the length of the string in advance.
The @samp{wmem} functions should be used for wide character arrays with
known size.

@cindex wint_t
@cindex parameter promotion
Some of the memory and string functions take single characters as
arguments.  Since a value of type @code{char} is automatically promoted
into a value of type @code{int} when used as a parameter, the functions
are declared with @code{int} as the type of the parameter in question.
In the case of the wide character functions the situation is similar: the
parameter type for a single wide character is @code{wint_t} and not
@code{wchar_t}.
For many implementations this would not be necessary,
since @code{wchar_t} is large enough that it is not automatically
promoted; but because the @w{ISO C} standard does not require such a
choice of types, the @code{wint_t} type is used.

@node String Length
@section String Length

You can get the length of a string using the @code{strlen} function.
This function is declared in the header file @file{string.h}.
@pindex string.h

@comment string.h
@comment ISO
@deftypefun size_t strlen (const char *@var{s})
The @code{strlen} function returns the length of the null-terminated
string @var{s} in bytes.  (In other words, it returns the offset of the
terminating null character within the array.)

For example,
@smallexample
strlen ("hello, world")
    @result{} 12
@end smallexample

When applied to a character array, the @code{strlen} function returns
the length of the string stored there, not its allocated size.  You can
get the allocated size of the character array that holds a string using
the @code{sizeof} operator:

@smallexample
char string[32] = "hello, world";
sizeof (string)
    @result{} 32
strlen (string)
    @result{} 12
@end smallexample

But beware, this will not work unless @var{string} is the character
array itself, not a pointer to it.  For example:

@smallexample
char string[32] = "hello, world";
char *ptr = string;
sizeof (string)
    @result{} 32
sizeof (ptr)
    @result{} 4  /* @r{(on a machine with 4 byte pointers)} */
@end smallexample

This is an easy mistake to make when you are working with functions that
take string arguments; those arguments are always pointers, not arrays.
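
One common way around this pitfall is to pass the allocated size as a
separate argument.  The following is only a sketch under that
assumption; @code{remaining_space} and @code{remaining_space_demo} are
hypothetical names used for illustration and are not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function.  BUF is a pointer here,
   so sizeof (buf) would yield the size of a pointer; the caller has to
   pass the allocated size of the array explicitly.  */
size_t
remaining_space (const char *buf, size_t allocated)
{
  size_t used;

  assert (buf != NULL);
  /* Bytes occupied by the text plus its terminating null character.  */
  used = strlen (buf) + 1;
  assert (used <= allocated);
  return allocated - used;
}

/* The caller still sees the array itself, so sizeof gives its
   allocated size of 32 bytes.  */
size_t
remaining_space_demo (void)
{
  char string[32] = "hello, world";

  return remaining_space (string, sizeof (string));
}
```

Here the text occupies 12 bytes plus the terminating null character, so
19 of the 32 allocated bytes remain unused.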

It must also be noted that for multibyte encoded strings the return
value does not have to correspond to the number of characters in the
string.  To get this value the string can be converted to wide
characters and @code{wcslen} can be used or something like the following
code can be used:

@smallexample
/* @r{The input is in @code{string}.}
   @r{The length is expected in @code{n}.}  */
@{
  mbstate_t t;
  const char *scopy = string;
  /* In initial state.  */
  memset (&t, '\0', sizeof (t));
  /* Determine number of characters.  */
  n = mbsrtowcs (NULL, &scopy, strlen (scopy), &t);
@}
@end smallexample

This is cumbersome to do, so if the number of characters (as opposed to
bytes) is needed often, it is better to work with wide characters.
@end deftypefun

The wide character equivalent is declared in @file{wchar.h}.

@comment wchar.h
@comment ISO
@deftypefun size_t wcslen (const wchar_t *@var{ws})
The @code{wcslen} function is the wide character equivalent to
@code{strlen}.  The return value is the number of wide characters in the
wide character string pointed to by @var{ws} (this is also the offset of
the terminating null wide character of @var{ws}).

Since there are no multi wide character sequences making up one
character, the return value is not only the offset in the array, it is
also the number of wide characters.

This function was introduced in @w{Amendment 1} to @w{ISO C90}.
@end deftypefun

@comment string.h
@comment GNU
@deftypefun size_t strnlen (const char *@var{s}, size_t @var{maxlen})
The @code{strnlen} function returns the length of the string @var{s} in
bytes if this length is smaller than @var{maxlen} bytes.  Otherwise it
returns @var{maxlen}.  Therefore this function is equivalent to
@code{(strlen (@var{s}) < @var{maxlen} ?
strlen (@var{s}) : @var{maxlen})} but it
is more efficient and works even if the string @var{s} is not
null-terminated.

@smallexample
char string[32] = "hello, world";
strnlen (string, 32)
    @result{} 12
strnlen (string, 5)
    @result{} 5
@end smallexample

This function is a GNU extension and is declared in @file{string.h}.
@end deftypefun

@comment wchar.h
@comment GNU
@deftypefun size_t wcsnlen (const wchar_t *@var{ws}, size_t @var{maxlen})
@code{wcsnlen} is the wide character equivalent to @code{strnlen}.  The
@var{maxlen} parameter specifies the maximum number of wide characters.

This function is a GNU extension and is declared in @file{wchar.h}.
@end deftypefun

@node Copying and Concatenation
@section Copying and Concatenation

You can use the functions described in this section to copy the contents
of strings and arrays, or to append the contents of one string to
another.  The @samp{str} and @samp{mem} functions are declared in the
header file @file{string.h} while the @samp{wcs} and @samp{wmem}
functions are declared in the file @file{wchar.h}.
@pindex string.h
@pindex wchar.h
@cindex copying strings and arrays
@cindex string copy functions
@cindex array copy functions
@cindex concatenating strings
@cindex string concatenation functions

A helpful way to remember the ordering of the arguments to the functions
in this section is that it corresponds to an assignment expression, with
the destination array specified to the left of the source array.  Most
of these functions return the address of the destination array;
exceptions such as @code{mempcpy} and @code{stpcpy} are noted in their
descriptions.

Most of these functions do not work properly if the source and
destination arrays overlap.
For example, if the beginning of the
destination array overlaps the end of the source array, the original
contents of that part of the source array may get overwritten before it
is copied.  Even worse, in the case of the string functions, the null
character marking the end of the string may be lost, and the copy
function might get stuck in a loop trashing all the memory allocated to
your program.

All functions that have problems copying between overlapping arrays are
explicitly identified in this manual.  In addition to functions in this
section, there are a few others like @code{sprintf} (@pxref{Formatted
Output Functions}) and @code{scanf} (@pxref{Formatted Input
Functions}).

@comment string.h
@comment ISO
@deftypefun {void *} memcpy (void *restrict @var{to}, const void *restrict @var{from}, size_t @var{size})
The @code{memcpy} function copies @var{size} bytes from the object
beginning at @var{from} into the object beginning at @var{to}.  The
behavior of this function is undefined if the two arrays @var{to} and
@var{from} overlap; use @code{memmove} instead if overlapping is possible.

The value returned by @code{memcpy} is the value of @var{to}.

Here is an example of how you might use @code{memcpy} to copy the
contents of an array:

@smallexample
struct foo *oldarray, *newarray;
int arraysize;
@dots{}
memcpy (newarray, oldarray, arraysize * sizeof (struct foo));
@end smallexample
@end deftypefun

@comment wchar.h
@comment ISO
@deftypefun {wchar_t *} wmemcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
The @code{wmemcpy} function copies @var{size} wide characters from the object
beginning at @var{wfrom} into the object beginning at @var{wto}.
The
behavior of this function is undefined if the two arrays @var{wto} and
@var{wfrom} overlap; use @code{wmemmove} instead if overlapping is possible.

The following is a possible implementation of @code{wmemcpy} but there
are more optimizations possible.

@smallexample
wchar_t *
wmemcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom,
         size_t size)
@{
  return (wchar_t *) memcpy (wto, wfrom, size * sizeof (wchar_t));
@}
@end smallexample

The value returned by @code{wmemcpy} is the value of @var{wto}.

This function was introduced in @w{Amendment 1} to @w{ISO C90}.
@end deftypefun

@comment string.h
@comment GNU
@deftypefun {void *} mempcpy (void *restrict @var{to}, const void *restrict @var{from}, size_t @var{size})
The @code{mempcpy} function is nearly identical to the @code{memcpy}
function.  It copies @var{size} bytes from the object beginning at
@var{from} into the object pointed to by @var{to}.  But instead of
returning the value of @var{to} it returns a pointer to the byte
following the last written byte in the object beginning at @var{to}.
I.e., the value is @code{((void *) ((char *) @var{to} + @var{size}))}.

This function is useful in situations where a number of objects are to
be copied to consecutive memory positions.

@smallexample
void *
combine (void *o1, size_t s1, void *o2, size_t s2)
@{
  void *result = malloc (s1 + s2);
  if (result != NULL)
    mempcpy (mempcpy (result, o1, s1), o2, s2);
  return result;
@}
@end smallexample

This function is a GNU extension.
@end deftypefun

@comment wchar.h
@comment GNU
@deftypefun {wchar_t *} wmempcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
The @code{wmempcpy} function is nearly identical to the @code{wmemcpy}
function.
It copies @var{size} wide characters from the object
beginning at @var{wfrom} into the object pointed to by @var{wto}.  But
instead of returning the value of @var{wto} it returns a pointer to the
wide character following the last written wide character in the object
beginning at @var{wto}.  I.e., the value is @code{@var{wto} + @var{size}}.

This function is useful in situations where a number of objects are to
be copied to consecutive memory positions.

The following is a possible implementation of @code{wmempcpy} but there
are more optimizations possible.

@smallexample
wchar_t *
wmempcpy (wchar_t *restrict wto, const wchar_t *restrict wfrom,
          size_t size)
@{
  return (wchar_t *) mempcpy (wto, wfrom, size * sizeof (wchar_t));
@}
@end smallexample

This function is a GNU extension.
@end deftypefun

@comment string.h
@comment ISO
@deftypefun {void *} memmove (void *@var{to}, const void *@var{from}, size_t @var{size})
@code{memmove} copies the @var{size} bytes at @var{from} into the
@var{size} bytes at @var{to}, even if those two blocks of space
overlap.  In the case of overlap, @code{memmove} is careful to copy the
original values of the bytes in the block at @var{from}, including those
bytes which also belong to the block at @var{to}.

The value returned by @code{memmove} is the value of @var{to}.
@end deftypefun

@comment wchar.h
@comment ISO
@deftypefun {wchar_t *} wmemmove (wchar_t *@var{wto}, const wchar_t *@var{wfrom}, size_t @var{size})
@code{wmemmove} copies the @var{size} wide characters at @var{wfrom}
into the @var{size} wide characters at @var{wto}, even if those two
blocks of space overlap.
In the case of overlap, @code{wmemmove} is
careful to copy the original values of the wide characters in the block
at @var{wfrom}, including those wide characters which also belong to the
block at @var{wto}.

The following is a possible implementation of @code{wmemmove} but there
are more optimizations possible.

@smallexample
wchar_t *
wmemmove (wchar_t *wto, const wchar_t *wfrom, size_t size)
@{
  return (wchar_t *) memmove (wto, wfrom, size * sizeof (wchar_t));
@}
@end smallexample

The value returned by @code{wmemmove} is the value of @var{wto}.

This function was introduced in @w{Amendment 1} to @w{ISO C90}.
@end deftypefun

@comment string.h
@comment SVID
@deftypefun {void *} memccpy (void *restrict @var{to}, const void *restrict @var{from}, int @var{c}, size_t @var{size})
This function copies no more than @var{size} bytes from @var{from} to
@var{to}, stopping if a byte matching @var{c} is found.  The return
value is a pointer into @var{to} one byte past where @var{c} was copied,
or a null pointer if no byte matching @var{c} appeared in the first
@var{size} bytes of @var{from}.
@end deftypefun

@comment string.h
@comment ISO
@deftypefun {void *} memset (void *@var{block}, int @var{c}, size_t @var{size})
This function copies the value of @var{c} (converted to an
@code{unsigned char}) into each of the first @var{size} bytes of the
object beginning at @var{block}.  It returns the value of @var{block}.
@end deftypefun

@comment wchar.h
@comment ISO
@deftypefun {wchar_t *} wmemset (wchar_t *@var{block}, wchar_t @var{wc}, size_t @var{size})
This function copies the value of @var{wc} into each of the first
@var{size} wide characters of the object beginning at @var{block}.  It
returns the value of @var{block}.
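
As a rough illustration of how the @var{size} argument is counted in
each case, consider the sketch below.  The names @code{fill_wide},
@code{fill_wide_demo} and @code{memset_byte_demo} are hypothetical and
exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not a library function: fill BUF, which has
   room for SIZE wide characters, with SIZE - 1 copies of WC followed
   by a terminating null wide character.  */
wchar_t *
fill_wide (wchar_t *buf, size_t size, wchar_t wc)
{
  assert (buf != NULL && size > 0);
  wmemset (buf, wc, size - 1);  /* The count is in wide characters.  */
  buf[size - 1] = L'\0';
  return buf;
}

/* Fill an 8-element array and report the resulting string length.  */
size_t
fill_wide_demo (void)
{
  wchar_t line[8];

  /* Divide by the element size to get a wide character count.  */
  fill_wide (line, sizeof (line) / sizeof (line[0]), L'-');
  return wcslen (line);
}

/* memset counts bytes and converts its int argument to unsigned char,
   so only the low-order 8 bits of 0x141 are stored (assuming 8-bit
   bytes).  */
int
memset_byte_demo (void)
{
  unsigned char bytes[4];

  memset (bytes, 0x141, sizeof (bytes));
  return bytes[0];
}
```

Passing @code{sizeof (line)} directly to @code{wmemset} would be a bug
here, because that value is a byte count rather than a number of wide
characters.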
@end deftypefun

@comment string.h
@comment ISO
@deftypefun {char *} strcpy (char *restrict @var{to}, const char *restrict @var{from})
This copies characters from the string @var{from} (up to and including
the terminating null character) into the string @var{to}.  Like
@code{memcpy}, this function has undefined results if the strings
overlap.  The return value is the value of @var{to}.
@end deftypefun

@comment wchar.h
@comment ISO
@deftypefun {wchar_t *} wcscpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom})
This copies wide characters from the string @var{wfrom} (up to and
including the terminating null wide character) into the string
@var{wto}.  Like @code{wmemcpy}, this function has undefined results if
the strings overlap.  The return value is the value of @var{wto}.
@end deftypefun

@comment string.h
@comment ISO
@deftypefun {char *} strncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
This function is similar to @code{strcpy} but always copies exactly
@var{size} characters into @var{to}.

If the length of @var{from} is more than @var{size}, then @code{strncpy}
copies just the first @var{size} characters.  Note that in this case
there is no null terminator written into @var{to}.

If the length of @var{from} is less than @var{size}, then @code{strncpy}
copies all of @var{from}, followed by enough null characters to add up
to @var{size} characters in all.  This behavior is rarely useful, but it
is specified by the @w{ISO C} standard.

The behavior of @code{strncpy} is undefined if the strings overlap.

Using @code{strncpy} as opposed to @code{strcpy} is a way to avoid bugs
relating to writing past the end of the allocated space for @var{to}.
However, it can also make your program much slower in one common case:
copying a string which is probably small into a potentially large buffer.
In this case, @var{size} may be large, and when it is, @code{strncpy} will
waste a considerable amount of time copying null characters.
@end deftypefun

@comment wchar.h
@comment ISO
@deftypefun {wchar_t *} wcsncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
This function is similar to @code{wcscpy} but always copies exactly
@var{size} wide characters into @var{wto}.

If the length of @var{wfrom} is more than @var{size}, then
@code{wcsncpy} copies just the first @var{size} wide characters.  Note
that in this case there is no null terminator written into @var{wto}.

If the length of @var{wfrom} is less than @var{size}, then
@code{wcsncpy} copies all of @var{wfrom}, followed by enough null wide
characters to add up to @var{size} wide characters in all.  This
behavior is rarely useful, but it is specified by the @w{ISO C}
standard.

The behavior of @code{wcsncpy} is undefined if the strings overlap.

Using @code{wcsncpy} as opposed to @code{wcscpy} is a way to avoid bugs
relating to writing past the end of the allocated space for @var{wto}.
However, it can also make your program much slower in one common case:
copying a string which is probably small into a potentially large buffer.
In this case, @var{size} may be large, and when it is, @code{wcsncpy} will
waste a considerable amount of time copying null wide characters.
@end deftypefun

@comment string.h
@comment SVID
@deftypefun {char *} strdup (const char *@var{s})
This function copies the null-terminated string @var{s} into a newly
allocated string.  The string is allocated using @code{malloc}; see
@ref{Unconstrained Allocation}.
If @code{malloc} cannot allocate space
for the new string, @code{strdup} returns a null pointer.  Otherwise it
returns a pointer to the new string.
@end deftypefun

@comment wchar.h
@comment GNU
@deftypefun {wchar_t *} wcsdup (const wchar_t *@var{ws})
This function copies the null-terminated wide character string @var{ws}
into a newly allocated string.  The string is allocated using
@code{malloc}; see @ref{Unconstrained Allocation}.  If @code{malloc}
cannot allocate space for the new string, @code{wcsdup} returns a null
pointer.  Otherwise it returns a pointer to the new wide character
string.

This function is a GNU extension.
@end deftypefun

@comment string.h
@comment GNU
@deftypefun {char *} strndup (const char *@var{s}, size_t @var{size})
This function is similar to @code{strdup} but copies at most
@var{size} characters into the newly allocated string.

If the length of @var{s} is more than @var{size}, then @code{strndup}
copies just the first @var{size} characters and adds a closing null
terminator.  Otherwise all characters are copied and the string is
terminated.

This function differs from @code{strncpy} in that it always
terminates the destination string.

@code{strndup} is a GNU extension.
@end deftypefun

@comment string.h
@comment Unknown origin
@deftypefun {char *} stpcpy (char *restrict @var{to}, const char *restrict @var{from})
This function is like @code{strcpy}, except that it returns a pointer to
the end of the string @var{to} (that is, the address of the terminating
null character @code{to + strlen (from)}) rather than the beginning.

For example, this program uses @code{stpcpy} to concatenate @samp{foo}
and @samp{bar} to produce @samp{foobar}, which it then prints.

@smallexample
@include stpcpy.c.texi
@end smallexample

This function is not part of the ISO or POSIX standards, and is not
customary on Unix systems, but we did not invent it either.  Perhaps it
comes from MS-DOG.

Its behavior is undefined if the strings overlap.  The function is
declared in @file{string.h}.
@end deftypefun

@comment wchar.h
@comment GNU
@deftypefun {wchar_t *} wcpcpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom})
This function is like @code{wcscpy}, except that it returns a pointer to
the end of the string @var{wto} (that is, the address of the terminating
null wide character @code{wto + wcslen (wfrom)}) rather than the beginning.

This function is not part of ISO or POSIX but was found useful while
developing the GNU C Library itself.

The behavior of @code{wcpcpy} is undefined if the strings overlap.

@code{wcpcpy} is a GNU extension and is declared in @file{wchar.h}.
@end deftypefun

@comment string.h
@comment GNU
@deftypefun {char *} stpncpy (char *restrict @var{to}, const char *restrict @var{from}, size_t @var{size})
This function is similar to @code{stpcpy} but always copies exactly
@var{size} characters into @var{to}.

If the length of @var{from} is more than @var{size}, then @code{stpncpy}
copies just the first @var{size} characters and returns a pointer to the
character directly following the one which was copied last.  Note that in
this case there is no null terminator written into @var{to}.

If the length of @var{from} is less than @var{size}, then @code{stpncpy}
copies all of @var{from}, followed by enough null characters to add up
to @var{size} characters in all.  This behavior is rarely useful, but it
is provided for contexts that rely on the corresponding behavior of
@code{strncpy}.
@code{stpncpy} returns a pointer to the
@emph{first} written null character.

This function is not part of ISO or POSIX but was found useful while
developing the GNU C Library itself.

Its behavior is undefined if the strings overlap.  The function is
declared in @file{string.h}.
@end deftypefun

@comment wchar.h
@comment GNU
@deftypefun {wchar_t *} wcpncpy (wchar_t *restrict @var{wto}, const wchar_t *restrict @var{wfrom}, size_t @var{size})
This function is similar to @code{wcpcpy} but always copies exactly
@var{size} wide characters into @var{wto}.

If the length of @var{wfrom} is more than @var{size}, then
@code{wcpncpy} copies just the first @var{size} wide characters and
returns a pointer to the wide character directly following the last
wide character copied.  Note that in this case there is no null
terminator written into @var{wto}.

If the length of @var{wfrom} is less than @var{size}, then @code{wcpncpy}
copies all of @var{wfrom}, followed by enough null wide characters to
add up to @var{size} wide characters in all.  This behavior is rarely
useful, but it is provided for contexts that rely on the corresponding
behavior of @code{wcsncpy}.  @code{wcpncpy} returns a pointer to the
@emph{first} written null wide character.

This function is not part of ISO or POSIX but was found useful while
developing the GNU C Library itself.

Its behavior is undefined if the strings overlap.

@code{wcpncpy} is a GNU extension and is declared in @file{wchar.h}.
@end deftypefun

@comment string.h
@comment GNU
@deftypefn {Macro} {char *} strdupa (const char *@var{s})
This macro is similar to @code{strdup} but allocates the new string
using @code{alloca} instead of @code{malloc} (@pxref{Variable Size
Automatic}).
This means of course the returned string has the same
limitations as any block of memory allocated using @code{alloca}.

For obvious reasons @co