tr.1 (9018B)
- .\" SPDX-License-Identifier: BSD-3-Clause
- .\" $OpenBSD: tr.1,v 1.25 2015/02/28 21:51:57 bentley Exp $
- .\" $NetBSD: tr.1,v 1.5 1994/12/07 08:35:13 jtc Exp $
- .\"
- .\" Copyright (c) 1991, 1993
- .\" The Regents of the University of California. All rights reserved.
- .\"
- .\" This code is derived from software contributed to Berkeley by
- .\" the Institute of Electrical and Electronics Engineers, Inc.
- .\"
- .\" Redistribution and use in source and binary forms, with or without
- .\" modification, are permitted provided that the following conditions
- .\" are met:
- .\" 1. Redistributions of source code must retain the above copyright
- .\" notice, this list of conditions and the following disclaimer.
- .\" 2. Redistributions in binary form must reproduce the above copyright
- .\" notice, this list of conditions and the following disclaimer in the
- .\" documentation and/or other materials provided with the distribution.
- .\" 3. Neither the name of the University nor the names of its contributors
- .\" may be used to endorse or promote products derived from this software
- .\" without specific prior written permission.
- .\"
- .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
- .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
- .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- .\" SUCH DAMAGE.
- .\"
- .\" @(#)tr.1 8.1 (Berkeley) 6/6/93
- .\"
- .Dd $Mdocdate: February 28 2015 $
- .Dt TR 1
- .Os
- .Sh NAME
- .Nm tr
- .Nd translate characters
- .Sh SYNOPSIS
- .Nm tr
- .Op Fl Ccs
- .Ar string1 string2
- .Nm tr
- .Op Fl Cc
- .Fl d
- .Ar string1
- .Nm tr
- .Op Fl Cc
- .Fl s
- .Ar string1
- .Nm tr
- .Op Fl Cc
- .Fl ds
- .Ar string1 string2
- .Sh DESCRIPTION
- The
- .Nm
- utility copies the standard input to the standard output with substitution
- or deletion of selected characters.
- .Pp
- The options are as follows:
- .Bl -tag -width Ds
- .It Fl C
- Complements the set of characters in
- .Ar string1 ;
- for instance,
- .Dq -C\ ab
- includes every character except for
- .Sq a
- and
- .Sq b .
- .It Fl c
- The same as
- .Fl C .
- .It Fl d
- The
- .Fl d
- option causes characters to be deleted from the input.
- .It Fl s
- The
- .Fl s
- option squeezes multiple occurrences of the characters listed in the last
- operand (either
- .Ar string1
- or
- .Ar string2 )
- in the input into a single instance of the character.
- This occurs after all deletion and translation is completed.
- .El
- .Pp
- In the first synopsis form, the characters in
- .Ar string1
- are translated into the characters in
- .Ar string2
- where the first character in
- .Ar string1
- is translated into the first character in
- .Ar string2
- and so on.
- If
- .Ar string1
- is longer than
- .Ar string2 ,
- the last character found in
- .Ar string2
- is duplicated until
- .Ar string1
- is exhausted.
- .Pp
- In the second synopsis form, the characters in
- .Ar string1
- are deleted from the input.
- .Pp
- In the third synopsis form, the characters in
- .Ar string1
- are compressed as described for the
- .Fl s
- option.
- .Pp
- In the fourth synopsis form, the characters in
- .Ar string1
- are deleted from the input, and the characters in
- .Ar string2
- are compressed as described for the
- .Fl s
- option.
- .Pp
- The following conventions can be used in
- .Ar string1
- and
- .Ar string2
- to specify sets of characters:
- .Bl -tag -width [:equiv:]
- .It character
- Any character not described by one of the following conventions
- represents itself.
- .It \eoctal
- A backslash followed by 1, 2, or 3 octal digits represents a character
- with that encoded value.
- To follow an octal sequence with a digit as a character, left zero-pad
- the octal sequence to the full 3 octal digits.
- .It \echaracter
- A backslash followed by certain special characters maps to special
- values.
- .Pp
- .Bl -tag -width "nn" -offset indent -compact
- .It \ea
- <alert character>
- .It \eb
- <backspace>
- .It \ef
- <form-feed>
- .It \en
- <newline>
- .It \er
- <carriage return>
- .It \et
- <tab>
- .It \ev
- <vertical tab>
- .El
- .Pp
- A backslash followed by any other character maps to that character.
- .It c-c
- Represents the range of characters between the range endpoints, inclusively.
- .It [:class:]
- Represents all characters belonging to the defined character class.
- Class names are:
- .Pp
- .Bl -tag -width "xdigit" -offset indent -compact
- .It alnum
- <alphanumeric characters>
- .It alpha
- <alphabetic characters>
- .It blank
- <blank characters>
- .It cntrl
- <control characters>
- .It digit
- <numeric characters>
- .It graph
- <graphic characters>
- .It lower
- <lower-case alphabetic characters>
- .It print
- <printable characters>
- .It punct
- <punctuation characters>
- .It space
- <space characters>
- .It upper
- <upper-case characters>
- .It xdigit
- <hexadecimal characters>
- .El
- .Pp
- .\" All classes may be used in
- .\" .Ar string1 ,
- .\" and in
- .\" .Ar string2
- .\" when both the
- .\" .Fl d
- .\" and
- .\" .Fl s
- .\" options are specified.
- .\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
- .\" .Ar string2
- .\" and then only when the corresponding class (``upper'' for ``lower''
- .\" and vice-versa) is specified in the same relative position in
- .\" .Ar string1 .
- .\" .Pp
- With the exception of the
- .Dq upper
- and
- .Dq lower
- classes, characters
- in the classes are in unspecified order.
- In the
- .Dq upper
- and
- .Dq lower
- classes, characters are entered in
- ascending order.
- .Pp
- For specific information as to which ASCII characters are included
- in these classes, see
- .Xr isalnum 3 ,
- .Xr isalpha 3 ,
- and related manual pages.
- .It [=equiv=]
- Represents all characters or collating (sorting) elements belonging to
- the same equivalence class as
- .Ar equiv .
- If
- there is a secondary ordering within the equivalence class, the characters
- are ordered in ascending sequence.
- Otherwise, they are ordered after their encoded values.
- An example of an equivalence class might be
- .Dq c
- and
- .Dq ch
- in Spanish;
- English has no equivalence classes.
- .It [#*n]
- Represents
- .Ar n
- repeated occurrences of the character represented by
- .Ar # .
- This
- expression is only valid when it occurs in
- .Ar string2 .
- If
- .Ar n
- is omitted or is zero, it is interpreted as large enough to extend the
- .Ar string2
- sequence to the length of
- .Ar string1 .
- If
- .Ar n
- has a leading zero, it is interpreted as an octal value; otherwise,
- it's interpreted as a decimal value.
- .El
- .Sh EXIT STATUS
- .Ex -std tr
- .Sh EXAMPLES
- The following examples are shown as given to the shell:
- .Pp
- Create a list of the words in file1, one per line, where a word is taken to
- be a maximal string of letters.
- .Pp
- .Dl $ tr -cs '[:alpha:]' '[\en*]' < file1
- .Pp
- Translate the contents of file1 to upper-case.
- .Pp
- .Dl $ tr '[:lower:]' '[:upper:]' < file1
- .Pp
- Strip out non-printable characters from file1.
- .Pp
- .Dl $ tr -cd '[:print:]' < file1
- .\" .Pp
- .\" Strip out diacritical marks from the base character
- .\" .Ql e .
- .\" .Dl $ tr '[=e=]' '[e*]' < file1 > file2
- .Sh SEE ALSO
- .Xr sed 1
- .Sh STANDARDS
- The
- .Nm
- utility is compliant with the
- .St -p1003.1-2008
- specification,
- except that the
- .Fl C
- option behaves the same as the
- .Fl c
- option since
- .Nm
- is not locale-aware.
- .Pp
- System V has historically implemented character ranges using the syntax
- .Dq [c-c]
- instead of the
- .Dq c-c
- used by historic
- .Bx
- implementations and
- standardized by POSIX.
- System V shell scripts should work under this implementation as long as
- the range is intended to map in another range, i.e., the command
- .Dq tr [a-z] [A-Z]
- will work as it will map the
- .Sq \&[
- character in
- .Ar string1
- to the
- .Sq \&[
- character in
- .Ar string2 .
- However, if the shell script is deleting or squeezing characters as in
- the command
- .Dq tr\ -d\ [a-z] ,
- the characters
- .Sq \&[
- and
- .Sq \&]
- will be
- included in the deletion or compression list, which would not have happened
- under an historic System V implementation.
- Additionally, any scripts that depended on the sequence
- .Dq a-z
- to represent the three characters
- .Sq a ,
- .Sq - ,
- and
- .Sq z
- will have to be rewritten as
- .Dq a\e-z .
- .Pp
- The
- .Nm
- utility has historically not permitted the manipulation of NUL bytes in
- its input and, additionally, has stripped NUL's from its input stream.
- This implementation has removed this behavior as a bug.
- .Pp
- The
- .Nm
- utility has historically been extremely forgiving of syntax errors:
- for example, the
- .Fl c
- and
- .Fl s
- options were ignored unless two strings were specified.
- This implementation will not permit illegal syntax.
- .Pp
- It should be noted that the feature wherein the last character of
- .Ar string2
- is duplicated if
- .Ar string2
- has less characters than
- .Ar string1
- is permitted by POSIX but is not required.
- Shell scripts attempting to be portable to other POSIX systems should use
- the
- .Dq [#*]
- convention instead of relying on this behavior.