csplit.1p (8911B)
- '\" et
- .TH CSPLIT "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual"
- .\"
- .SH PROLOG
- This manual page is part of the POSIX Programmer's Manual.
- The Linux implementation of this interface may differ (consult
- the corresponding Linux manual page for details of Linux behavior),
- or the interface may not be implemented on Linux.
- .\"
- .SH NAME
- csplit
- \(em split files based on context
- .SH SYNOPSIS
- .LP
- .nf
- csplit \fB[\fR-ks\fB] [\fR-f \fIprefix\fB] [\fR-n \fInumber\fB] \fIfile arg\fR...
- .fi
- .SH DESCRIPTION
- The
- .IR csplit
- utility shall read the file named by the
- .IR file
- operand, write all or part of that file into other files as directed
- by the
- .IR arg
- operands, and write the sizes of the files.
- .SH OPTIONS
- The
- .IR csplit
- utility shall conform to the Base Definitions volume of POSIX.1\(hy2017,
- .IR "Section 12.2" ", " "Utility Syntax Guidelines".
- .P
- The following options shall be supported:
- .IP "\fB\-f\ \fIprefix\fR" 10
- Name the created files
- .IR prefix \c
- .BR 00 ,
- .IR prefix \c
- .BR 01 ,
- \&.\|.\|.,
- .IR prefixn .
- The default is
- .BR xx00
- \&.\|.\|.
- .BR xx \c
- .IR n .
- If the
- .IR prefix
- argument would create a filename exceeding
- {NAME_MAX}
- bytes, an error shall result,
- .IR csplit
- shall exit with a diagnostic message, and no files shall be created.
- .IP "\fB\-k\fP" 10
- Leave previously created files intact. By default,
- .IR csplit
- shall remove created files if an error occurs.
- .IP "\fB\-n\ \fInumber\fR" 10
- Use
- .IR number
- decimal digits to form filenames for the file pieces. The default
- shall be 2.
- .IP "\fB\-s\fP" 10
- Suppress the output of file size messages.
- .SH OPERANDS
- The following operands shall be supported:
- .IP "\fIfile\fR" 10
- The pathname of a text file to be split. If
- .IR file
- is
- .BR '\-' ,
- the standard input shall be used.
- .P
- Each
- .IR arg
- operand can be one of the following:
- .IP "/\fIrexp\fR/\fB[\fIoffset\fB]\fR" 10
- .br
- A file shall be created using the content of the lines from the current
- line up to, but not including, the line that results from the
- evaluation of the regular expression with
- .IR offset ,
- if any, applied. The regular expression
- .IR rexp
- shall follow the rules for basic regular expressions described in the Base Definitions volume of POSIX.1\(hy2017,
- .IR "Section 9.3" ", " "Basic Regular Expressions".
- The application shall use the sequence
- .BR \(dq\e/\(dq
- to specify a
- <slash>
- character within the
- .IR rexp .
- The optional offset shall be a positive or negative integer value
- representing a number of lines. A positive integer value can be
- preceded by
- .BR '\(pl' .
- If the selection of lines from an
- .IR offset
- expression of this type would create a file with zero lines, or one
- with more lines than remain in the input file, the
- results are unspecified. After the section is created, the current line
- shall be set to the line that results from the evaluation of the
- regular expression with any offset applied. If the current line is the
- first line in the file and a regular expression operation has not yet
- been performed, the pattern match of
- .IR rexp
- shall be applied from the current line to the end of the file.
- Otherwise, the pattern match of
- .IR rexp
- shall be applied from the line following the current line to the end of
- the file.
- .IP "%\fIrexp\fR%\fB[\fIoffset\fB]\fR" 10
- .br
- Equivalent to /\fIrexp\fR/\fB[\fIoffset\fB]\fR, except that no
- file shall be created for the selected section of the input file. The
- application shall use the sequence
- .BR \(dq\e%\(dq
- to specify a
- <percent-sign>
- character within the
- .IR rexp .
- .IP "\fIline_no\fR" 10
- Create a file from the current line up to (but not including) the line
- number
- .IR line_no .
- Lines in the file shall be numbered starting at one. The current line
- becomes
- .IR line_no .
- .IP "{\fInum\fR}" 10
- Repeat operand. This operand can follow any of the operands described
- previously. If it follows a
- .IR rexp
- type operand, that operand shall be applied
- .IR num
- more times. If it follows a
- .IR line_no
- operand, the file shall be split every
- .IR line_no
- lines,
- .IR num
- times, from that point.
- .P
- An error shall be reported if an operand does not reference a line
- between the current position and the end of the file.
- .SH STDIN
- See the INPUT FILES section.
- .SH "INPUT FILES"
- The input file shall be a text file.
- .SH "ENVIRONMENT VARIABLES"
- The following environment variables shall affect the execution of
- .IR csplit :
- .IP "\fILANG\fP" 10
- Provide a default value for the internationalization variables that are
- unset or null. (See the Base Definitions volume of POSIX.1\(hy2017,
- .IR "Section 8.2" ", " "Internationalization Variables"
- for the precedence of internationalization variables used to determine
- the values of locale categories.)
- .IP "\fILC_ALL\fP" 10
- If set to a non-empty string value, override the values of all the
- other internationalization variables.
- .IP "\fILC_COLLATE\fP" 10
- .br
- Determine the locale for the behavior of ranges, equivalence classes,
- and multi-character collating elements within regular expressions.
- .IP "\fILC_CTYPE\fP" 10
- Determine the locale for the interpretation of sequences of bytes of
- text data as characters (for example, single-byte as opposed to
- multi-byte characters in arguments and input files) and the behavior of
- character classes within regular expressions.
- .IP "\fILC_MESSAGES\fP" 10
- .br
- Determine the locale that should be used to affect the format and
- contents of diagnostic messages written to standard error.
- .IP "\fINLSPATH\fP" 10
- Determine the location of message catalogs for the processing of
- .IR LC_MESSAGES .
- .SH "ASYNCHRONOUS EVENTS"
- If the
- .BR \-k
- option is specified, created files shall be retained. Otherwise, the
- default action occurs.
- .SH STDOUT
- Unless the
- .BR \-s
- option is used, the standard output shall consist of one line per
- file created, with a format as follows:
- .sp
- .RS 4
- .nf
- "%d\en", <\fIfile size in bytes\fR>
- .fi
- .P
- .RE
- .SH STDERR
- The standard error shall be used only for diagnostic messages.
- .SH "OUTPUT FILES"
- The output files shall contain portions of the original input file,
- otherwise unchanged.
- .SH "EXTENDED DESCRIPTION"
- None.
- .SH "EXIT STATUS"
- The following exit values shall be returned:
- .IP "\00" 6
- Successful completion.
- .IP >0 6
- An error occurred.
- .SH "CONSEQUENCES OF ERRORS"
- By default, created files shall be removed if an error occurs. When the
- .BR \-k
- option is specified, created files shall not be removed if an error
- occurs.
- .LP
- .IR "The following sections are informative."
- .SH "APPLICATION USAGE"
- None.
- .SH EXAMPLES
- .IP " 1." 4
- This example creates four files,
- .BR cobol00
- \&.\|.\|.
- .BR cobol03 :
- .RS 4
- .sp
- .RS 4
- .nf
- csplit -f cobol file \(aq/procedure division/\(aq /par5./ /par16./
- .fi
- .P
- .RE
- .P
- After the split files have been edited, they can be recombined as follows:
- .sp
- .RS 4
- .nf
- cat cobol0[0-3] > file
- .fi
- .P
- .RE
- .P
- Note that this example overwrites the original file.
- .RE
- .IP " 2." 4
- This example splits the file after the first 99 lines, and every
- 100 lines thereafter, up to 9\|999 lines. The first piece holds 99 lines
- rather than 100 because the operand names the line before which the split
- occurs and lines are numbered from 1 rather than zero, for historical reasons:
- .RS 4
- .sp
- .RS 4
- .nf
- csplit -k file 100 {99}
- .fi
- .P
- .RE
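- .P
- As a smaller illustration of the same repeat form, the following sketch
- assumes a hypothetical ten-line file named
- .BR nums
- (the name and contents are illustrative only) and writes the size of
- each piece to standard output:
- .sp
- .RS 4
- .nf
- # Hypothetical input: ten lines, each holding its own line number.
- printf \(aq%s\en\(aq 1 2 3 4 5 6 7 8 9 10 > nums
- # Split before line 4, then once more, 4 lines further on (before line 8).
- csplit nums 4 {1}
- .fi
- .P
- .RE
- .P
- The pieces would be expected to hold lines 1 to 3, 4 to 7, and 8 to 10,
- so the sizes reported would be 6, 8, and 7 bytes.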
- .RE
- .IP " 3." 4
- Assuming that
- .BR prog.c
- follows the C-language coding convention of ending routines with a
- .BR '}'
- at the beginning of the line, this example creates a separate file for
- each C routine (up to 21) in
- .BR prog.c :
- .RS 4
- .sp
- .RS 4
- .nf
- csplit -k prog.c \(aq%main(%\(aq \(aq/\(ha}/+1\(aq {20}
- .fi
- .P
- .RE
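- .P
- The interaction of the two operand types can be seen on a smaller input.
- In the following sketch the file name
- .BR small
- and its five lines are hypothetical:
- .sp
- .RS 4
- .nf
- # Hypothetical input: five short lines.
- printf \(aq%s\en\(aq a start b end c > small
- # Discard the lines before the first line matching start, then create
- # a file running through the next line matching end.
- csplit -s small \(aq%start%\(aq \(aq/end/+1\(aq
- .fi
- .P
- .RE
- .P
- Line 1 is discarded by the
- .BR %
- operand;
- .BR xx00
- would be expected to hold the lines start, b, and end, and
- .BR xx01
- the remaining line c.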
- .RE
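- .IP " 4." 4
- As a further illustrative sketch, the following combines the
- .BR \-f ,
- .BR \-n ,
- and
- .BR \-s
- options. The file name
- .BR notes ,
- its contents, and the prefix
- .BR part
- are hypothetical:
- .RS 4
- .sp
- .RS 4
- .nf
- # Hypothetical input: an introduction followed by two sections.
- printf \(aq%s\en\(aq intro ch1 a ch2 b > notes
- # Split before each of the first two lines matching ch, naming the
- # pieces with the prefix part and three digits, without size messages.
- csplit -s -f part -n 3 notes /ch/ {1}
- # The concatenated pieces should match the original; cmp is then silent.
- cat part00[0-2] | cmp - notes
- .fi
- .P
- .RE
- .P
- Three files,
- .BR part000
- to
- .BR part002 ,
- would be expected.
- .RE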
- .SH RATIONALE
- The
- .BR \-n
- option was added to extend the range of filenames that could be
- handled.
- .P
- Consideration was given to adding a
- .BR \-a
- flag to use the alphabetic filename generation used by the historical
- .IR split
- utility, but the functionality added by the
- .BR \-n
- option was deemed to make alphabetic naming unnecessary.
- .SH "FUTURE DIRECTIONS"
- None.
- .SH "SEE ALSO"
- .IR "\fIsed\fR\^",
- .IR "\fIsplit\fR\^"
- .P
- The Base Definitions volume of POSIX.1\(hy2017,
- .IR "Chapter 8" ", " "Environment Variables",
- .IR "Section 9.3" ", " "Basic Regular Expressions",
- .IR "Section 12.2" ", " "Utility Syntax Guidelines"
- .\"
- .SH COPYRIGHT
- Portions of this text are reprinted and reproduced in electronic form
- from IEEE Std 1003.1-2017, Standard for Information Technology
- -- Portable Operating System Interface (POSIX), The Open Group Base
- Specifications Issue 7, 2018 Edition,
- Copyright (C) 2018 by the Institute of
- Electrical and Electronics Engineers, Inc and The Open Group.
- In the event of any discrepancy between this version and the original IEEE and
- The Open Group Standard, the original IEEE and The Open Group Standard
- is the referee document. The original Standard can be obtained online at
- http://www.opengroup.org/unix/online.html .
- .PP
- Any typographical or formatting errors that appear
- in this page are most likely
- to have been introduced during the conversion of the source files to
- man page format. To report such errors, see
- https://www.kernel.org/doc/man-pages/reporting_bugs.html .