utils-std

Collection of commonly available Unix tools

tr.1

.\" SPDX-License-Identifier: BSD-3-Clause
.\" $OpenBSD: tr.1,v 1.25 2015/02/28 21:51:57 bentley Exp $
.\" $NetBSD: tr.1,v 1.5 1994/12/07 08:35:13 jtc Exp $
.\"
.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)tr.1 8.1 (Berkeley) 6/6/93
.\"
.Dd $Mdocdate: February 28 2015 $
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm tr
.Op Fl Ccs
.Ar string1 string2
.Nm tr
.Op Fl Cc
.Fl d
.Ar string1
.Nm tr
.Op Fl Cc
.Fl s
.Ar string1
.Nm tr
.Op Fl Cc
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The options are as follows:
.Bl -tag -width Ds
.It Fl C
Complements the set of characters in
.Ar string1 ;
for instance,
.Dq -C\ ab
includes every character except for
.Sq a
and
.Sq b .
.It Fl c
The same as
.Fl C .
.It Fl d
The
.Fl d
option causes the characters in
.Ar string1
to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation are completed.
.El
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
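.Pp
As an illustrative sketch of the four forms, the following commands apply
each in turn to an arbitrarily chosen input word; the output lines show
what the rules above imply:
.Bd -literal -offset indent
$ echo hello | tr el ip
hippo
$ echo hello | tr -d l
heo
$ echo hello | tr -s l
helo
$ echo hello | tr -ds h l
elo
.Ed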
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2, or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.Pp
.Bl -tag -width "nn" -offset indent -compact
.It \ea
<alert character>
.It \eb
<backspace>
.It \ef
<form-feed>
.It \en
<newline>
.It \er
<carriage return>
.It \et
<tab>
.It \ev
<vertical tab>
.El
.Pp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusive.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.Pp
.Bl -tag -width "xdigit" -offset indent -compact
.It alnum
<alphanumeric characters>
.It alpha
<alphabetic characters>
.It blank
<blank characters>
.It cntrl
<control characters>
.It digit
<numeric characters>
.It graph
<graphic characters>
.It lower
<lower-case alphabetic characters>
.It print
<printable characters>
.It punct
<punctuation characters>
.It space
<space characters>
.It upper
<upper-case alphabetic characters>
.It xdigit
<hexadecimal characters>
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the
.Dq upper
and
.Dq lower
classes, characters
in the classes are in unspecified order.
In the
.Dq upper
and
.Dq lower
classes, characters are entered in
ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr isalnum 3 ,
.Xr isalpha 3 ,
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If
there is a secondary ordering within the equivalence class, the characters
are ordered in ascending sequence.
Otherwise, they are ordered by their encoded values.
An example of an equivalence class might be
.Dq c
and
.Dq ch
in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
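.Pp
The conventions above might be combined as in the following sketch, which
uses a range, a backslash escape, a repeat expression, and a character
class; the inputs are arbitrary and the output lines show what the
descriptions above imply:
.Bd -literal -offset indent
$ echo abc123 | tr a-c A-C
ABC123
$ echo a:b:c | tr : "\en"
a
b
c
$ echo abcd | tr abcd "x[y*]"
xyyy
$ echo a1b22c | tr -d "[:digit:]"
abc
.Ed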
.Sh EXIT STATUS
.Ex -std tr
.Sh EXAMPLES
The following examples are shown as given to the shell:
.Pp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.Pp
.Dl $ tr -cs Qo [:alpha:] Qc Qo \en Qc < file1
.Pp
Translate the contents of file1 to upper-case.
.Pp
.Dl $ tr Qo [:lower:] Qc Qo [:upper:] Qc < file1
.Pp
Strip out non-printable characters from file1.
.Pp
.Dl $ tr -cd Qo [:print:] Qc < file1
.Sh SEE ALSO
.Xr sed 1
.Sh STANDARDS
The
.Nm
utility is compliant with the
.St -p1003.1-2008
specification,
except that the
.Fl C
option behaves the same as the
.Fl c
option since
.Nm
is not locale-aware.
.Pp
System V has historically implemented character ranges using the syntax
.Dq [c-c]
instead of the
.Dq c-c
used by historic
.Bx
implementations and
standardized by POSIX.
System V shell scripts should work under this implementation as long as
the range is intended to map onto another range, i.e., the command
.Dq tr [a-z] [A-Z]
will work as it will map the
.Sq \&[
character in
.Ar string1
to the
.Sq \&[
character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command
.Dq tr\ -d\ [a-z] ,
the characters
.Sq \&[
and
.Sq \&]
will be
included in the deletion or compression list, which would not have happened
under an historic System V implementation.
Additionally, any scripts that depended on the sequence
.Dq a-z
to represent the three characters
.Sq a ,
.Sq - ,
and
.Sq z
will have to be rewritten as
.Dq a\e-z .
.Pp
The
.Nm
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, has stripped NULs from its input stream.
This implementation treats that behavior as a bug and has removed it.
.Pp
The
.Nm
utility has historically been extremely forgiving of syntax errors:
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit invalid syntax.
.Pp
Duplication of the last character of
.Ar string2
when
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the
.Dq [#*]
convention instead of relying on this behavior.
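.Pp
As a sketch of that portable form, a script that wants every vowel mapped
to one replacement character can spell out the padding explicitly rather
than rely on duplication of the last character (the input word and the
replacement character are arbitrary):
.Bd -literal -offset indent
$ echo education | tr aeiou "[x*]"
xdxcxtxxn
.Ed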