file.1p (23562B)
- '\" et
- .TH FILE "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual"
- .\"
- .SH PROLOG
- This manual page is part of the POSIX Programmer's Manual.
- The Linux implementation of this interface may differ (consult
- the corresponding Linux manual page for details of Linux behavior),
- or the interface may not be implemented on Linux.
- .\"
- .SH NAME
- file
- \(em determine file type
- .SH SYNOPSIS
- .LP
- .nf
- file \fB[\fR-dh\fB] [\fR-M \fIfile\fB] [\fR-m \fIfile\fB] \fIfile\fR...
- .P
- file -i \fB[\fR-h\fB] \fIfile\fR...
- .fi
- .SH DESCRIPTION
- The
- .IR file
- utility shall perform a series of tests in sequence on each specified
- .IR file
- in an attempt to classify it:
- .IP " 1." 4
- If
- .IR file
- does not exist, cannot be read, or its file status could not be
- determined, the output shall indicate that the file was processed, but
- that its type could not be determined.
- .IP " 2." 4
- If the file is not a regular file, its file type shall be identified.
- The file types directory, FIFO, socket, block special, and character
- special shall be identified as such. Other implementation-defined file
- types may also be identified. If
- .IR file
- is a symbolic link, by default the link shall be resolved and
- .IR file
- shall test the type of file referenced by the symbolic link. (See the
- .BR \-h
- and
- .BR \-i
- options below.)
- .IP " 3." 4
- If the length of
- .IR file
- is zero, it shall be identified as an empty file.
- .IP " 4." 4
- The
- .IR file
- utility shall examine an initial segment of
- .IR file
- and shall make a guess at identifying its contents based on
- position-sensitive tests. (The answer is not guaranteed to be correct;
- see the
- .BR \-d ,
- .BR \-M ,
- and
- .BR \-m
- options below.)
- .IP " 5." 4
- The
- .IR file
- utility shall examine
- .IR file
- and make a guess at identifying its contents based on context-sensitive
- default system tests. (The answer is not guaranteed to be correct.)
- .IP " 6." 4
- The file shall be identified as a data file.
- .P
- If
- .IR file
- does not exist, cannot be read, or its file status could not be
- determined, the output shall indicate that the file was processed, but
- that its type could not be determined.
- .P
- If
- .IR file
- is a symbolic link, by default the link shall be resolved and
- .IR file
- shall test the type of file referenced by the symbolic link.
- .SH OPTIONS
- The
- .IR file
- utility shall conform to the Base Definitions volume of POSIX.1\(hy2017,
- .IR "Section 12.2" ", " "Utility Syntax Guidelines",
- except that the order of the
- .BR \-m ,
- .BR \-d ,
- and
- .BR \-M
- options shall be significant.
- .P
- The following options shall be supported by the implementation:
- .IP "\fB\-d\fP" 10
- Apply any position-sensitive default system tests and
- context-sensitive default system tests to the file. This is the
- default if no
- .BR \-M
- or
- .BR \-m
- option is specified.
- .IP "\fB\-h\fP" 10
- When a symbolic link is encountered, identify the file as a symbolic
- link. If
- .BR \-h
- is not specified and
- .IR file
- is a symbolic link that refers to a nonexistent file,
- .IR file
- shall identify the file as a symbolic link, as if
- .BR \-h
- had been specified.
- .IP "\fB\-i\fP" 10
- If a file is a regular file, do not attempt to classify the type of the
- file further, but identify the file as specified in the STDOUT section.
- .IP "\fB\-M\ \fIfile\fR" 10
- Specify the name of a file containing position-sensitive tests that
- shall be applied to a file in order to classify it (see the EXTENDED
- DESCRIPTION). No position-sensitive default system tests nor
- context-sensitive default system tests shall be applied unless the
- .BR \-d
- option is also specified.
- .IP "\fB\-m\ \fIfile\fR" 10
- Specify the name of a file containing position-sensitive tests that
- shall be applied to a file in order to classify it (see the EXTENDED
- DESCRIPTION).
- .P
- If the
- .BR \-m
- option is specified without specifying the
- .BR \-d
- option or the
- .BR \-M
- option, position-sensitive default system tests shall be applied after
- the position-sensitive tests specified by the
- .BR \-m
- option. If the
- .BR \-M
- option is specified with the
- .BR \-d
- option, the
- .BR \-m
- option, or both, or the
- .BR \-m
- option is specified with the
- .BR \-d
- option, the concatenation of the position-sensitive tests specified by
- these options shall be applied in the order specified by the appearance
- of these options. If a
- .BR \-M
- or
- .BR \-m
- .IR file
- option-argument is
- .BR \- ,
- the results are unspecified.
- .SH OPERANDS
- The following operand shall be supported:
- .IP "\fIfile\fR" 10
- A pathname of a file to be tested.
- .SH STDIN
- The standard input shall be used if a
- .IR file
- operand is
- .BR '\-'
- and the implementation treats the
- .BR '\-'
- as meaning standard input.
- Otherwise, the standard input shall not be used.
- .SH "INPUT FILES"
- The
- .IR file
- can be any file type.
- .SH "ENVIRONMENT VARIABLES"
- The following environment variables shall affect the execution of
- .IR file :
- .IP "\fILANG\fP" 10
- Provide a default value for the internationalization variables that are
- unset or null. (See the Base Definitions volume of POSIX.1\(hy2017,
- .IR "Section 8.2" ", " "Internationalization Variables"
- for the precedence of internationalization variables used to determine
- the values of locale categories.)
- .IP "\fILC_ALL\fP" 10
- If set to a non-empty string value, override the values of all the
- other internationalization variables.
- .IP "\fILC_CTYPE\fP" 10
- Determine the locale for the interpretation of sequences of bytes of
- text data as characters (for example, single-byte as opposed to
- multi-byte characters in arguments and input files).
- .IP "\fILC_MESSAGES\fP" 10
- .br
- Determine the locale that should be used to affect the format and
- contents of diagnostic messages written to standard error and
- informative messages written to standard output.
- .IP "\fINLSPATH\fP" 10
- Determine the location of message catalogs for the processing of
- .IR LC_MESSAGES .
- .SH "ASYNCHRONOUS EVENTS"
- Default.
- .SH STDOUT
- In the POSIX locale, the following format shall be used to identify
- each operand,
- .IR file
- specified:
- .sp
- .RS 4
- .nf
- "%s: %s\en", <\fIfile\fR>, <\fItype\fR>
- .fi
- .P
- .RE
- .P
- The values for <\fItype\fP> are unspecified, except that in the POSIX
- locale, if
- .IR file
- is identified as one of the types listed in the following table,
- <\fItype\fP> shall contain (but is not limited to) the corresponding
- string, unless the file is identified by a position-sensitive test
- specified by a
- .BR \-M
- or
- .BR \-m
- option. Each
- <space>
- shown in the strings shall be exactly one
- <space>.
- .br
- .sp
- .ce 1
- \fBTable 4-9: File Utility Output Strings\fR
- .TS
- center tab(@) box;
- cB | cB | cB
- l | l | l.
- If \fIfile\fP is:@<\fItype\fP> shall contain the string:@Notes
- _
- Nonexistent@cannot open
- .P
- Block special@block special@1
- Character special@character special@1
- Directory@directory@1
- FIFO@fifo@1
- Socket@socket@1
- Symbolic link@symbolic link to@1
- Regular file@regular file@1,2
- Empty regular file@empty@3
- Regular file that cannot be read@cannot open@3
- .P
- Executable binary@executable@3,4,6
- \fIar\fR archive library (see \fIar\fP)@archive@3,4,6
- Extended \fIcpio\fP format (see \fIpax\fP)@cpio archive@3,4,6
- Extended \fItar\fP format (see \fBustar\fP in \fIpax\fP)@tar archive@3,4,6
- .P
- Shell script@commands text@3,5,6
- C-language source@c program text@3,5,6
- FORTRAN source@fortran program text@3,5,6
- .P
- Regular file whose type cannot be determined@data@3
- .TE
- .TP 10
- .BR Notes:
- .RS 10
- .IP " 1." 4
- This is a file type test.
- .IP " 2." 4
- This test is applied only if the
- .BR \-i
- option is specified.
- .IP " 3." 4
- This test is applied only if the
- .BR \-i
- option is not specified.
- .IP " 4." 4
- This is a position-sensitive default system test.
- .IP " 5." 4
- This is a context-sensitive default system test.
- .IP " 6." 4
- Position-sensitive default system tests and context-sensitive
- default system tests are not applied if the
- .BR \-M
- option is specified unless the
- .BR \-d
- option is also specified.
- .RE
- .P
- .P
- In the POSIX locale, if
- .IR file
- is identified as a symbolic link (see the
- .BR \-h
- option), the following alternative output format shall be used:
- .sp
- .RS 4
- .nf
- "%s: %s %s\en", <\fIfile\fR>, <\fItype\fR>, <\fIcontents of link\fR>"
- .fi
- .P
- .RE
- .P
- If the file named by the
- .IR file
- operand does not exist, cannot be read, or the type of the file named
- by the
- .IR file
- operand cannot be determined, this shall not be considered an error
- that affects the exit status.
- .SH STDERR
- The standard error shall be used only for diagnostic messages.
- .SH "OUTPUT FILES"
- None.
- .SH "EXTENDED DESCRIPTION"
- A file specified as an option-argument to the
- .BR \-m
- or
- .BR \-M
- options shall contain one position-sensitive test per line, which shall
- be applied to the file. If the test succeeds, the message field of the
- line shall be printed and no further tests shall be applied, with the
- exception that tests on immediately following lines beginning with a
- single
- .BR '>'
- character shall be applied.
- .P
- Each line shall be composed of the following four
- <tab>-separated
- fields. (Implementations may allow any combination of one or more
- white-space characters other than
- <newline>
- to act as field separators.)
- .IP "\fIoffset\fP" 10
- An unsigned number (optionally preceded by a single
- .BR '>'
- character) specifying the
- .IR offset ,
- in bytes, of the value in the file that is to be compared against the
- .IR value
- field of the line. If the file is shorter than the specified offset,
- the test shall fail.
- .RS 10
- .P
- If the
- .IR offset
- begins with the character
- .BR '>' ,
- the test contained in the line shall not be applied to the file unless
- the test on the last line for which the
- .IR offset
- did not begin with a
- .BR '>'
- was successful. By default, the
- .IR offset
- shall be interpreted as an unsigned decimal number. With a leading 0x
- or 0X, the
- .IR offset
- shall be interpreted as a hexadecimal number; otherwise, with a leading
- 0, the
- .IR offset
- shall be interpreted as an octal number.
- .RE
- .IP "\fItype\fP" 10
- The type of the value in the file to be tested. The type shall consist
- of the type specification characters
- .BR d ,
- .BR s ,
- and
- .BR u ,
- specifying signed decimal, string, and unsigned decimal, respectively.
- .RS 10
- .P
- The
- .IR type
- string shall be interpreted as the bytes from the file starting at the
- specified
- .IR offset
- and including the same number of bytes specified by the
- .IR value
- field. If insufficient bytes remain in the file past the
- .IR offset
- to match the
- .IR value
- field, the test shall fail.
- .P
- The type specification characters
- .BR d
- and
- .BR u
- can be followed by an optional unsigned decimal integer that specifies
- the number of bytes represented by the type. The type specification
- characters
- .BR d
- and
- .BR u
- can be followed by an optional
- .BR C ,
- .BR S ,
- .BR I ,
- or
- .BR L ,
- indicating that the value is of type
- .BR char ,
- .BR short ,
- .BR int ,
- or
- .BR long ,
- respectively.
- .P
- The default number of bytes represented by the type specifiers
- .BR d ,
- .BR f ,
- and
- .BR u
- shall correspond to their respective C-language types as follows. If
- the system claims conformance to the C-Language Development Utilities
- option, those specifiers shall correspond to the default sizes used in
- the
- .IR c99
- utility. Otherwise, the default sizes shall be implementation-defined.
- .P
- For the type specifier characters
- .BR d
- and
- .BR u ,
- the default number of bytes shall correspond to the size of a basic
- integer type of the implementation. For these specifier
- characters, the implementation shall support values of the optional
- number of bytes to be converted corresponding to the number of bytes in
- the C-language types
- .BR char ,
- .BR short ,
- .BR int ,
- or
- .BR long .
- These numbers can also be specified by an application as the characters
- .BR C ,
- .BR S ,
- .BR I ,
- and
- .BR L ,
- respectively. The byte order used when interpreting numeric values is
- implementation-defined, but shall correspond to the order in which a
- constant of the corresponding type is stored in memory on the system.
- .P
- All type specifiers, except for
- .BR s ,
- can be followed by a mask specifier of the form &\fInumber\fP. The mask
- value shall be AND'ed with the value of the input file before the
- comparison with the
- .IR value
- field of the line is made. By default, the mask shall be interpreted as
- an unsigned decimal number. With a leading 0x or 0X, the mask shall be
- interpreted as an unsigned hexadecimal number; otherwise, with a
- leading 0, the mask shall be interpreted as an unsigned octal number.
- .P
- The strings
- .BR byte ,
- .BR short ,
- .BR long ,
- and
- .BR string
- shall also be supported as type fields, being interpreted as
- .BR dC ,
- .BR dS ,
- .BR dL ,
- and
- .BR s ,
- respectively.
- .RE
- .IP "\fIvalue\fP" 10
- The
- .IR value
- to be compared with the value from the file.
- .RS 10
- .P
- If the specifier from the type field is
- .BR s
- or
- .BR string ,
- then interpret the value as a string. Otherwise, interpret it as a
- number. If the value is a string, then the test shall succeed only when
- a string value exactly matches the bytes from the file.
- .P
- If the
- .IR value
- is a string, it can contain the following sequences:
- .IP "\e\fIcharacter\fR" 12
- The
- <backslash>-escape
- sequences as specified in the Base Definitions volume of POSIX.1\(hy2017,
- .IR "Table 5-1" ", " "Escape Sequences and Associated Actions"
- (\c
- .BR '\e\e' ,
- .BR '\ea' ,
- .BR '\eb' ,
- .BR '\ef' ,
- .BR '\en' ,
- .BR '\er' ,
- .BR '\et' ,
- .BR '\ev' ).
- In addition, the escape sequence
- .BR '\e\ '
- (the
- <backslash>
- character followed by a
- <space>
- character) shall be recognized to represent a
- <space>
- character. The results of using any other character, other than an
- octal digit, following the
- <backslash>
- are unspecified.
- .IP "\e\fIoctal\fR" 12
- Octal sequences that can be used to represent characters with specific
- coded values. An octal sequence shall consist of a
- <backslash>
- followed by the longest sequence of one, two, or three octal-digit
- characters (01234567).
- .P
- By default, any value that is not a string shall be interpreted as a
- signed decimal number. Any such value, with a leading 0x or 0X, shall
- be interpreted as an unsigned hexadecimal number; otherwise, with a
- leading zero, the value shall be interpreted as an unsigned octal
- number.
- .P
- If the value is not a string, it can be preceded by a character
- indicating the comparison to be performed. Permissible characters and
- the comparisons they specify are as follows:
- .IP "\fR=\fP" 6
- The test shall succeed if the value from the file equals the
- .IR value
- field.
- .IP "\fR<\fP" 6
- The test shall succeed if the value from the file is less than the
- .IR value
- field.
- .IP "\fR>\fP" 6
- The test shall succeed if the value from the file is greater than the
- .IR value
- field.
- .IP "\fR&\fP" 6
- The test shall succeed if all of the set bits in the
- .IR value
- field are set in the value from the file.
- .IP "\fR^\fP" 6
- The test shall succeed if at least one of the set bits in the
- .IR value
- field is not set in the value from the file.
- .IP "\fRx\fP" 6
- The test shall succeed if the file is large enough to contain a value
- of the type specified starting at the offset specified.
- .RE
- .IP "\fImessage\fP" 10
- The
- .IR message
- to be printed if the test succeeds. The
- .IR message
- shall be interpreted using the notation for the
- .IR printf
- formatting specification; see
- .IR printf .
- If the
- .IR value
- field was a string, then the value from the file shall be the argument
- for the
- .IR printf
- formatting specification; otherwise, the value from the file shall be
- the argument.
- .br
- .SH "EXIT STATUS"
- The following exit values shall be returned:
- .IP "\00" 6
- Successful completion.
- .IP >0 6
- An error occurred.
- .SH "CONSEQUENCES OF ERRORS"
- Default.
- .LP
- .IR "The following sections are informative."
- .SH "APPLICATION USAGE"
- The
- .IR file
- utility can only be required to guess at many of the file types because
- only exhaustive testing can determine some types with certainty. For
- example, binary data on some implementations might match the initial
- segment of an executable or a
- .IR tar
- archive.
- .P
- Note that the table indicates that the output contains the stated
- string. Systems may add text before or after the string. For
- executables, as an example, the machine architecture and various facts
- about how the file was link-edited may be included. Note also that on
- systems that recognize shell script files starting with
- .BR \(dq#!\(dq
- as executable files, these may be identified as executable binary files
- rather than as shell scripts.
- .SH EXAMPLES
- Determine whether an argument is a binary executable file:
- .sp
- .RS 4
- .nf
- file -- "$1" | grep -q \(aq:.*executable\(aq &&
- printf "%s is executable.\en$1"
- .fi
- .P
- .RE
- .SH RATIONALE
- The
- .BR \-f
- option was omitted because the same effect can (and should) be obtained
- using the
- .IR xargs
- utility.
- .P
- Historical versions of the
- .IR file
- utility attempt to identify the following types of files: symbolic
- link, directory, character special, block special, socket,
- .IR tar
- archive,
- .IR cpio
- archive, SCCS archive, archive library, empty,
- .IR compress
- output,
- .IR pack
- output, binary data, C source, FORTRAN source, assembler source,
- .IR nroff /\c
- .IR troff /\c
- .IR eqn /\c
- .IR tbl
- source
- .IR troff
- output, shell script, C shell script, English text, ASCII text, various
- executables, APL workspace, compiled terminfo entries, and CURSES
- screen images. Only those types that are reasonably well specified in
- POSIX or are directly related to POSIX utilities are listed in the
- table.
- .P
- Historical systems have used a ``magic file'' named
- .BR /etc/magic
- to help identify file types. Because it is generally useful for users
- and scripts to be able to identify special file types, the
- .BR \-m
- flag and a portable format for user-created magic files has been
- specified. No requirement is made that an implementation of
- .IR file
- use this method of identifying files, only that users be permitted to
- add their own classifying tests.
- .P
- In addition, three options have been added to historical practice. The
- .BR \-d
- flag has been added to permit users to cause their tests to follow any
- default system tests. The
- .BR \-i
- flag has been added to permit users to test portably for regular files
- in shell scripts. The
- .BR \-M
- flag has been added to permit users to ignore any default system
- tests.
- .P
- The POSIX.1\(hy2008 description of default system tests and the interaction
- between the
- .BR \-d ,
- .BR \-M ,
- and
- .BR \-m
- options did not clearly indicate that there were two types of ``default
- system tests''. The ``position-sensitive tests'' determine file types
- by looking for certain string or binary values at specific offsets in
- the file being examined. These position-sensitive tests were
- implemented in historical systems using the magic file described above.
- Some of these tests are now built into the
- .IR file
- utility itself on some implementations so the output can provide more
- detail than can be provided by magic files. For example, a magic file
- can easily identify a
- .BR core
- file on most implementations, but cannot name the program file that
- dropped the core. A magic file could produce output such as:
- .sp
- .RS 4
- .nf
- /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1
- .fi
- .P
- .RE
- .P
- but by building the test into the
- .IR file
- utility, you could get output such as:
- .sp
- .RS 4
- .nf
- /home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from \(aqtestprog\(aq
- .fi
- .P
- .RE
- .P
- These extended built-in tests are still to be treated as
- position-sensitive default system tests even if they are not listed in
- .BR /etc/magic
- or any other magic file.
- .P
- The context-sensitive default system tests were always built into the
- .IR file
- utility. These tests looked for language constructs in text files
- trying to identify shell scripts, C, FORTRAN, and other computer
- language source files, and even plain text files. With the addition of
- the
- .BR \-m
- and
- .BR \-M
- options the distinction between position-sensitive and
- context-sensitive default system tests became important because the
- order of testing is important. The context-sensitive system default
- tests should never be applied before any position-sensitive tests even
- if the
- .BR \-d
- option is specified before a
- .BR \-m
- option or
- .BR \-M
- option due to the high probability that the context-sensitive system
- default tests will incorrectly identify arbitrary text files as text
- files before position-sensitive tests specified by the
- .BR \-m
- or
- .BR \-M
- option would be applied to give a more accurate identification.
- .P
- Leaving the meaning of
- .BR "\-M \-"
- and
- .BR "\-m \-"
- unspecified allows an existing prototype of these options to continue
- to work in a backwards-compatible manner. (In that implementation,
- .BR "\-M \-"
- was roughly equivalent to
- .BR \-d
- in POSIX.1\(hy2008.)
- .P
- The historical
- .BR \-c
- option was omitted as not particularly useful to users or portable
- shell scripts. In addition, a reasonable implementation of the
- .IR file
- utility would report any errors found each time the magic file is
- read.
- .P
- The historical format of the magic file was the same as that specified
- by the Rationale in the ISO\ POSIX\(hy2:\|1993 standard for the
- .IR offset ,
- .IR value ,
- and
- .IR message
- fields; however, it used less precise type fields than the format
- specified by the current normative text. The new type field values are
- a superset of the historical ones.
- .P
- The following is an example magic file:
- .sp
- .RS 4
- .nf
- 0 short 070707 cpio archive
- 0 short 0143561 Byte-swapped cpio archive
- 0 string 070707 ASCII cpio archive
- 0 long 0177555 Very old archive
- 0 short 0177545 Old archive
- 0 short 017437 Old packed data
- 0 string \e037\e036 Packed data
- 0 string \e377\e037 Compacted data
- 0 string \e037\e235 Compressed data
- >2 byte&0x80 >0 Block compressed
- >2 byte&0x1f x %d bits
- 0 string \e032\e001 Compiled Terminfo Entry
- 0 short 0433 Curses screen image
- 0 short 0434 Curses screen image
- 0 string <ar> System V Release 1 archive
- 0 string !<arch>\en__.SYMDEF Archive random library
- 0 string !<arch> Archive
- 0 string ARF_BEGARF PHIGS clear text archive
- 0 long 0x137A2950 Scalable OpenFont binary
- 0 long 0x137A2951 Encrypted scalable OpenFont binary
- .fi
- .P
- .RE
- .P
- The use of a basic integer data type is intended to allow the
- implementation to choose a word size commonly used by applications
- on that architecture.
- .P
- Earlier versions of this standard allowed for implementations with
- bytes other than eight bits, but this has been modified in this
- version.
- .SH "FUTURE DIRECTIONS"
- None.
- .SH "SEE ALSO"
- .IR "\fIar\fR\^",
- .IR "\fIls\fR\^",
- .IR "\fIpax\fR\^",
- .IR "\fIprintf\fR\^"
- .P
- The Base Definitions volume of POSIX.1\(hy2017,
- .IR "Table 5-1" ", " "Escape Sequences and Associated Actions",
- .IR "Chapter 8" ", " "Environment Variables",
- .IR "Section 12.2" ", " "Utility Syntax Guidelines"
- .\"
- .SH COPYRIGHT
- Portions of this text are reprinted and reproduced in electronic form
- from IEEE Std 1003.1-2017, Standard for Information Technology
- -- Portable Operating System Interface (POSIX), The Open Group Base
- Specifications Issue 7, 2018 Edition,
- Copyright (C) 2018 by the Institute of
- Electrical and Electronics Engineers, Inc and The Open Group.
- In the event of any discrepancy between this version and the original IEEE and
- The Open Group Standard, the original IEEE and The Open Group Standard
- is the referee document. The original Standard can be obtained online at
- http://www.opengroup.org/unix/online.html .
- .PP
- Any typographical or formatting errors that appear
- in this page are most likely
- to have been introduced during the conversion of the source files to
- man page format. To report such errors, see
- https://www.kernel.org/doc/man-pages/reporting_bugs.html .