oasis-root

Compiled tree of Oasis Linux based on own branch at <https://hacktivis.me/git/oasis/>
git clone https://anongit.hacktivis.me/git/oasis-root.git

cut.1p (12185B)


'\" et
.TH CUT "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual"
.\"
.SH PROLOG
This manual page is part of the POSIX Programmer's Manual.
The Linux implementation of this interface may differ (consult
the corresponding Linux manual page for details of Linux behavior),
or the interface may not be implemented on Linux.
.\"
.SH NAME
cut
\(em cut out selected fields of each line of a file
.SH SYNOPSIS
.LP
.nf
cut -b \fIlist \fB[\fR-n\fB] [\fIfile\fR...\fB]\fR
.P
cut -c \fIlist \fB[\fIfile\fR...\fB]\fR
.P
cut -f \fIlist \fB[\fR-d \fIdelim\fB] [\fR-s\fB] [\fIfile\fR...\fB]\fR
.fi
.SH DESCRIPTION
The
.IR cut
utility shall cut out bytes (\c
.BR \-b
option), characters (\c
.BR \-c
option), or character-delimited fields (\c
.BR \-f
option) from each line in one or more files, concatenate them, and
write them to standard output.
.SH OPTIONS
The
.IR cut
utility shall conform to the Base Definitions volume of POSIX.1\(hy2017,
.IR "Section 12.2" ", " "Utility Syntax Guidelines".
.P
The application shall ensure that the option-argument
.IR list
(see options
.BR \-b ,
.BR \-c ,
and
.BR \-f
below) is a
<comma>-separated
list or
<blank>-separated
list of positive numbers and ranges. Ranges can be in three forms. The
first is two positive numbers separated by a
<hyphen-minus>
(\c
.IR low \-\c
.IR high ),
which represents all fields from the first number to the second
number. The second is a positive number preceded by a
<hyphen-minus>
(\-\c
.IR high ),
which represents all fields from field number 1 to that number. The
third is a positive number followed by a
<hyphen-minus>
(\c
.IR low \-),
which represents that number to the last field, inclusive. The elements
in
.IR list
can be repeated, can overlap, and can be specified in any order, but
the bytes, characters, or fields selected shall be written in the order
of the input data. If an element appears in the selection list more
than once, it shall be written exactly once.
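.P
As a minimal sketch of the three range forms and of the ordering rule
(the input string and the variable names are illustrative assumptions,
not part of the standard, and
.BR \-c
is used only because it is the simplest option to show on ASCII input):
.sp
.RS 4
.nf
```shell
# Illustrative input; any single line of ASCII text would do.
line=abcdefghij

mid=$(echo "$line" | cut -c 2-4)      # low-high form
first3=$(echo "$line" | cut -c -3)    # -high form, starting at position 1
rest=$(echo "$line" | cut -c 8-)      # low- form, through the end of the line
dup=$(echo "$line" | cut -c 3,1,3)    # repeated, out-of-order elements

echo "$mid $first3 $rest $dup"
```
.fi
.P
.RE
.P
On a conforming implementation this should print
.BR "bcd abc hij ac" ;
the last value shows that a repeated, out-of-order list still yields
each selected character once, in input order.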
.P
The following options shall be supported:
.IP "\fB\-b\ \fIlist\fR" 10
Cut based on a
.IR list
of bytes. Each selected byte shall be output unless the
.BR \-n
option is also specified. It shall not be an error to select bytes not
present in the input line.
.IP "\fB\-c\ \fIlist\fR" 10
Cut based on a
.IR list
of characters. Each selected character shall be output. It shall not
be an error to select characters not present in the input line.
.IP "\fB\-d\ \fIdelim\fR" 10
Set the field delimiter to the character
.IR delim .
The default is the
<tab>.
.IP "\fB\-f\ \fIlist\fR" 10
Cut based on a
.IR list
of fields, assumed to be separated in the file by a delimiter character
(see
.BR \-d ).
Each selected field shall be output. Output fields shall be separated
by a single occurrence of the field delimiter character. Lines with no
field delimiters shall be passed through intact, unless
.BR \-s
is specified. It shall not be an error to select fields not present in
the input line.
.IP "\fB\-n\fP" 10
Do not split characters. When specified with the
.BR \-b
option, each element in
.IR list
of the form
.IR low \-\c
.IR high
(\c
<hyphen-minus>-separated
numbers) shall be modified as follows:
.RS 10
.IP " *" 4
If the byte selected by
.IR low
is not the first byte of a character,
.IR low
shall be decremented to select the first byte of the character
originally selected by
.IR low .
If the byte selected by
.IR high
is not the last byte of a character,
.IR high
shall be decremented to select the last byte of the character prior to
the character originally selected by
.IR high ,
or zero if there is no prior character. If the resulting range element
has
.IR high
equal to zero or
.IR low
greater than
.IR high ,
the list element shall be dropped from
.IR list
for that input line without causing an error.
.P
Each element in
.IR list
of the form
.IR low \-
shall be treated as above with
.IR high
set to the number of bytes in the current line, not including the
terminating
<newline>.
Each element in
.IR list
of the form \-\c
.IR high
shall be treated as above with
.IR low
set to 1. Each element in
.IR list
of the form
.IR num
(a single number) shall be treated as above with
.IR low
set to
.IR num
and
.IR high
set to
.IR num .
.RE
.IP "\fB\-s\fP" 10
Suppress lines with no delimiter characters, when used with the
.BR \-f
option. Unless specified, lines with no delimiters shall be passed
through untouched.
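.P
A small sketch of how
.BR \-f ,
.BR \-d ,
and
.BR \-s
interact, using two made-up input lines (the
.IR sample
function and its data are assumptions for illustration only):
.sp
.RS 4
.nf
```shell
# Hypothetical input: the second line contains no ':' delimiter.
sample() {
    echo "alice:x:1000"
    echo "no delimiter here"
}

kept=$(sample | cut -d : -f 1,3)        # delimiter-less line passes through
dropped=$(sample | cut -d : -f 1,3 -s)  # -s suppresses that line
absent=$(echo "a:b" | cut -d : -f 5)    # a missing field is not an error

echo "$kept"
echo "$dropped"
echo "[$absent]"
```
.fi
.P
.RE
.P
The selected fields are rejoined with a single delimiter, so the first
record becomes
.BR alice:1000 ,
and selecting a field that is not present yields an empty line rather
than a diagnostic.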
.SH OPERANDS
The following operand shall be supported:
.IP "\fIfile\fR" 10
A pathname of an input file. If no
.IR file
operands are specified, or if a
.IR file
operand is
.BR '\-' ,
the standard input shall be used.
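.P
As a brief illustration (the input text and variable names are
assumptions), omitting the operand and naming standard input explicitly
should behave the same way:
.sp
.RS 4
.nf
```shell
# No file operand: standard input is read implicitly.
implicit=$(echo "a:b:c" | cut -d : -f 2)
# A '-' operand names standard input explicitly.
explicit=$(echo "a:b:c" | cut -d : -f 2 -)

echo "$implicit $explicit"
```
.fi
.P
.RE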
.SH STDIN
The standard input shall be used only if no
.IR file
operands are specified, or if a
.IR file
operand is
.BR '\-' .
See the INPUT FILES section.
.SH "INPUT FILES"
The input files shall be text files, except that line lengths shall be
unlimited.
.SH "ENVIRONMENT VARIABLES"
The following environment variables shall affect the execution of
.IR cut :
.IP "\fILANG\fP" 10
Provide a default value for the internationalization variables that are
unset or null. (See the Base Definitions volume of POSIX.1\(hy2017,
.IR "Section 8.2" ", " "Internationalization Variables"
for the precedence of internationalization variables used to determine
the values of locale categories.)
.IP "\fILC_ALL\fP" 10
If set to a non-empty string value, override the values of all the
other internationalization variables.
.IP "\fILC_CTYPE\fP" 10
Determine the locale for the interpretation of sequences of bytes of
text data as characters (for example, single-byte as opposed to
multi-byte characters in arguments and input files).
.IP "\fILC_MESSAGES\fP" 10
.br
Determine the locale that should be used to affect the format and
contents of diagnostic messages written to standard error.
.IP "\fINLSPATH\fP" 10
Determine the location of message catalogs for the processing of
.IR LC_MESSAGES .
.SH "ASYNCHRONOUS EVENTS"
Default.
.SH STDOUT
The
.IR cut
utility output shall be a concatenation of the selected bytes,
characters, or fields (one of the following):
.sp
.RS 4
.nf
"%s\en", <\fIconcatenation of bytes\fR>
.P
"%s\en", <\fIconcatenation of characters\fR>
.P
"%s\en", <\fIconcatenation of fields and field delimiters\fR>
.fi
.P
.RE
.SH STDERR
The standard error shall be used only for diagnostic messages.
.SH "OUTPUT FILES"
None.
.SH "EXTENDED DESCRIPTION"
None.
.SH "EXIT STATUS"
The following exit values shall be returned:
.IP "\00" 6
All input files were output successfully.
.IP >0 6
An error occurred.
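.P
A minimal sketch of checking these values from a script; the pathname
below is hypothetical and is assumed not to exist, and the exact
non-zero value is implementation-dependent:
.sp
.RS 4
.nf
```shell
# Hypothetical pathname, assumed not to exist on the system.
missing=/nonexistent/cut-example-input

# Successful run: the status of the pipeline is that of cut.
echo "a:b" | cut -d : -f 2 >/dev/null
ok_status=$?

# Failing run: capture the status without aborting the script.
err_status=0
cut -c 1 "$missing" 2>/dev/null || err_status=$?

echo "success: $ok_status, failure: $err_status"
```
.fi
.P
.RE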
.SH "CONSEQUENCES OF ERRORS"
Default.
.LP
.IR "The following sections are informative."
.SH "APPLICATION USAGE"
The
.IR cut
and
.IR fold
utilities can be used to create text files out of files with
arbitrary line lengths. The
.IR cut
utility should be used when the number of lines (or records) needs
to remain constant. The
.IR fold
utility should be used when the contents of long lines need to be
kept contiguous.
.P
Earlier versions of the
.IR cut
utility worked in an environment where bytes and characters were
considered equivalent (modulo
<backspace>
and
<tab>
processing in some implementations). In the extended world of
multi-byte characters, the new
.BR \-b
option has been added. The
.BR \-n
option (used with
.BR \-b )
allows it to be used to act on bytes rounded to character boundaries.
The algorithm specified for
.BR \-n
guarantees that:
.sp
.RS 4
.nf
cut -b 1-500 -n file > file1
cut -b 501- -n file > file2
.fi
.P
.RE
.P
ends up with all the characters in
.BR file
appearing exactly once in
.BR file1
or
.BR file2 .
(There is, however, a
<newline>
in both
.BR file1
and
.BR file2
for each
<newline>
in
.BR file .)
.SH EXAMPLES
Examples of the option qualifier list:
.IP 1,4,7 8
Select the first, fourth, and seventh bytes, characters, or fields and
field delimiters.
.IP "1\-3,8" 8
Equivalent to 1,2,3,8.
.IP "\-5,10" 8
Equivalent to 1,2,3,4,5,10.
.IP "3\-" 8
Equivalent to the third through the last, inclusive.
.P
The
.IR low \-\c
.IR high
forms are not always equivalent when used with
.BR \-b
and
.BR \-n
and multi-byte characters; see the description of
.BR \-n .
.P
The following command:
.sp
.RS 4
.nf
cut -d : -f 1,6 /etc/passwd
.fi
.P
.RE
.P
reads the System V password file (user database) and produces lines of
the form:
.sp
.RS 4
.nf
<\fIuser ID\fR>:<\fIhome directory\fR>
.fi
.P
.RE
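.P
The same selection can be tried without reading the real user database.
The sketch below feeds two made-up records (the
.IR records
function and its contents are assumptions for illustration) through the
same options:
.sp
.RS 4
.nf
```shell
# Hypothetical records in the System V password file format.
records() {
    echo "root:x:0:0:root:/root:/bin/sh"
    echo "alice:x:1000:1000:Alice:/home/alice:/bin/sh"
}

# Fields 1 and 6 are the user name and the home directory.
result=$(records | cut -d : -f 1,6)
echo "$result"
```
.fi
.P
.RE
.P
Each output line should consist of the first and sixth fields joined by
a single <colon>, for example
.BR root:/root .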
.P
Most utilities in this volume of POSIX.1\(hy2017 work on text files. The
.IR cut
utility can be used to turn files with arbitrary line lengths into a
set of text files containing the same data. The
.IR paste
utility can be used to create (or recreate) files with arbitrary line
lengths. For example, if
.BR file
contains long lines:
.sp
.RS 4
.nf
cut -b 1-500 -n file > file1
cut -b 501- -n file > file2
.fi
.P
.RE
.P
creates
.BR file1
(a text file) with lines no longer than 500 bytes (plus the
<newline>)
and
.BR file2
that contains the remainder of the data from
.BR file .
(Note that
.BR file2
is not a text file if there are lines in
.BR file
that are longer than 500 +
{LINE_MAX}
bytes.) The original file can be recreated from
.BR file1
and
.BR file2
using the command:
.sp
.RS 4
.nf
paste -d "\e0" file1 file2 > file
.fi
.P
.RE
.SH RATIONALE
Some historical implementations do not count
<backspace>
characters in determining character counts with the
.BR \-c
option. This may be useful for using
.IR cut
for processing
.IR nroff
output. It was deliberately decided not to have the
.BR \-c
option treat either
<backspace>
or
<tab>
characters in any special fashion. The
.IR fold
utility does treat these characters specially.
.P
Unlike other utilities, some historical implementations of
.IR cut
exit after not finding an input file, rather than continuing to process
the remaining
.IR file
operands. This behavior is prohibited by this volume of
POSIX.1\(hy2017, where only the exit
status is affected by this problem.
.P
The behavior of
.IR cut
when provided with either mutually-exclusive options or options that do
not work logically together has been deliberately left unspecified in
favor of global wording in
.IR "Section 1.4" ", " "Utility Description Defaults".
.P
The OPTIONS section was changed in response to IEEE PASC Interpretation
1003.2 #149. The change represents historical practice on all known
systems. The original standard was ambiguous on the nature of the
output.
.P
The
.IR list
option-arguments are historically used to select the portions of the
line to be written, but do not affect the order of the data. For
example:
.sp
.RS 4
.nf
echo abcdefghi | cut -c6,2,4-7,1
.fi
.P
.RE
.P
yields
.BR \(dqabdefg\(dq .
.P
A proposal to enhance
.IR cut
with the following option:
.IP "\fB\-o\fP" 6
Preserve the selected field order. When this option is specified, each
byte, character, or field (or ranges of such) shall be written in the
order specified by the
.IR list
option-argument, even if this requires multiple outputs of the same
bytes, characters, or fields.
.P
was rejected because this type of enhancement is outside the scope of
the IEEE\ P1003.2b draft standard.
.SH "FUTURE DIRECTIONS"
None.
.SH "SEE ALSO"
.IR "Section 2.5" ", " "Parameters and Variables",
.IR "\fIfold\fR\^",
.IR "\fIgrep\fR\^",
.IR "\fIpaste\fR\^"
.P
The Base Definitions volume of POSIX.1\(hy2017,
.IR "Chapter 8" ", " "Environment Variables",
.IR "Section 12.2" ", " "Utility Syntax Guidelines"
.\"
.SH COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1-2017, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 7, 2018 Edition,
Copyright (C) 2018 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group.
In the event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online at
http://www.opengroup.org/unix/online.html .
.PP
Any typographical or formatting errors that appear
in this page are most likely
to have been introduced during the conversion of the source files to
man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .