utils-std

Collection of commonly available Unix tools

git clone https://anongit.hacktivis.me/git/utils-std.git

wc.1 (2249B)


.\" utils-std: Collection of commonly available Unix tools
.\" Copyright 2017 Haelwenn (lanodan) Monnier <contact+utils@hacktivis.me>
.\" SPDX-License-Identifier: MPL-2.0
.Dd 2024-04-24
.Dt WC 1
.Os
.Sh NAME
.Nm wc
.Nd count lines, words, and bytes or characters
.Sh SYNOPSIS
.Nm
.Op Fl c Ns | Ns Fl m
.Op Fl lw
.Op Ar file...
.Sh DESCRIPTION
.Nm
reads each given
.Ar file
and by default reports its number of newlines, words and bytes.
If no
.Ar file
is given, then
.Nm
reads from standard input.
.Pp
A word is defined as a non-empty string delimited by whitespace.
Some other implementations additionally exclude
non-printable characters.
.Sh OPTIONS
.Bl -tag -width __
.It Fl c
Explicitly use single-byte mode, and write the number of bytes in each
.Ar file .
.It Fl l
Write the number of newlines in each
.Ar file .
.It Fl m
Switch to multi-byte mode, and write the number of codepoints in each
.Ar file .
The encoding depends on the
.Xr locale 1
environment variables.
.Pp
Note that while codepoints are often close enough to characters,
some characters use multiple codepoints.
Moreover, by design
.Nm
cannot count glyphs, since it performs no rendering.
.Pp
For example, with a decomposed é (e followed by a combining acute accent) in a
.Ql C.UTF-8
locale:
.Bd -literal -compact
$ printf '\\145\\314\\201\\n'
é
$ printf '\\145\\314\\201' | wc -c
3
$ printf '\\145\\314\\201' | wc -m
2
.Ed
.It Fl w
Write the number of words in each
.Ar file .
.El
.Pp
If any option is specified,
.Nm
reports only the requested information.
The order in which options are given does not change the order of the output fields.
The default is equivalent to
.Cm wc
.Fl clw .
.Sh ENVIRONMENT VARIABLES
See
.Xr locale 1 .
.Sh STDOUT
By default, each file is reported on the standard output in the form:
.Bd -literal
"%d %d %d %s", <newlines>, <words>, <bytes>, <file>
.Ed
.Pp
Like GNU and BusyBox, this implementation does not print trailing
whitespace, which most scripts would otherwise have to trim.
.Pp
If more than one
.Ar file
is given, a final line is printed with "total" instead of a pathname.
.Sh EXIT STATUS
.Ex -std
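.Sh EXAMPLES
The following invocations are an illustrative sketch rather than captured output:
the counts follow from the definitions above, and the spacing assumes the format given in
.Sx STDOUT .
.Pp
Count newlines, words and bytes read from standard input:
.Bd -literal -compact
$ printf 'one two\\nthree\\n' | wc
2 3 14
.Ed
.Pp
Options select which fields are written,
but their order on the command line does not reorder the output:
.Bd -literal -compact
$ printf 'one two\\nthree\\n' | wc -w -l
2 3
.Ed
.Pp
Runs of mixed whitespace delimit words:
.Bd -literal -compact
$ printf 'a\\tb  c\\n' | wc -w
3
.Ed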
.Sh STANDARDS
.Nm
should be compliant with the
IEEE Std 1003.1-2024 (“POSIX.1”)
specification.
.Sh AUTHORS
.An Haelwenn (lanodan) Monnier Aq Mt contact+utils@hacktivis.me