FILE(1P) POSIX Programmer's Manual FILE(1P)
PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux
implementation of this interface may differ (consult the corresponding
Linux manual page for details of Linux behavior), or the interface may
not be implemented on Linux.
NAME
file - determine file type
SYNOPSIS
file [-dh][-M file][-m file] file ...
file -i [-h] file ...
DESCRIPTION
The file utility shall perform a series of tests in sequence on each
specified file in an attempt to classify it:
1. If file does not exist, cannot be read, or its file status could
not be determined, the output shall indicate that the file was pro‐
cessed, but that its type could not be determined.
2. If the file is not a regular file, its file type shall be identi‐
fied. The file types directory, FIFO, socket, block special, and
character special shall be identified as such. Other implementa‐
tion-defined file types may also be identified. If file is a sym‐
bolic link, by default the link shall be resolved and file shall
test the type of file referenced by the symbolic link. (See the -h
and -i options below.)
3. If the length of file is zero, it shall be identified as an empty
file.
4. The file utility shall examine an initial segment of file and shall
make a guess at identifying its contents based on position-sensi‐
tive tests. (The answer is not guaranteed to be correct; see the
-d, -M, and -m options below.)
5. The file utility shall examine file and make a guess at identifying
its contents based on context-sensitive default system tests. (The
answer is not guaranteed to be correct.)
6. The file shall be identified as a data file.
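The file-type part of this sequence (steps 1 to 3, with step 6 as the fallback) can be approximated with the test utility alone. The following is a minimal sketch, not a description of how any implementation of file works: the function name classify is hypothetical, the content-based guesses of steps 4 and 5 are omitted, and a dangling symbolic link is reported without the contents of the link.

```shell
# Hypothetical helper: approximate steps 1-3 and 6 using only test(1).
# Steps 4 and 5 (guessing from the file contents) are deliberately left out.
classify() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        kind='symbolic link'         # dangling link: treated as if -h were given
    elif [ ! -e "$f" ]; then
        kind='cannot open'           # step 1: nonexistent or status unavailable
    elif [ -d "$f" ]; then
        kind='directory'             # step 2: test(1) follows symbolic links
    elif [ -p "$f" ]; then
        kind='fifo'
    elif [ -S "$f" ]; then
        kind='socket'
    elif [ -b "$f" ]; then
        kind='block special'
    elif [ -c "$f" ]; then
        kind='character special'
    elif [ ! -r "$f" ]; then
        kind='cannot open'           # step 1: regular file that cannot be read
    elif [ ! -s "$f" ]; then
        kind='empty'                 # step 3: zero length
    else
        kind='data'                  # step 6: fallback
    fi
    printf '%s: %s\n' "$f" "$kind"
}
```

Because test follows symbolic links, the sketch resolves links by default, as step 2 requires.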
If file does not exist, cannot be read, or its file status could not be
determined, the output shall indicate that the file was processed, but
that its type could not be determined.
If file is a symbolic link, by default the link shall be resolved and
file shall test the type of file referenced by the symbolic link.
OPTIONS
The file utility shall conform to the Base Definitions volume of
IEEE Std 1003.1-2001, Section 12.2, Utility Syntax Guidelines, except
that the order of the -m, -d, and -M options shall be significant.
The following options shall be supported by the implementation:
-d Apply any position-sensitive default system tests and context-
sensitive default system tests to the file. This is the default
if no -M or -m option is specified.
-h When a symbolic link is encountered, identify the file as a sym‐
bolic link. If -h is not specified and file is a symbolic link
that refers to a nonexistent file, file shall identify the file
as a symbolic link, as if -h had been specified.
-i If a file is a regular file, do not attempt to classify the type
of the file further, but identify the file as specified in the
STDOUT section.
-M file
Specify the name of a file containing position-sensitive tests
that shall be applied to a file in order to classify it (see the
EXTENDED DESCRIPTION). No position-sensitive default system
tests nor context-sensitive default system tests shall be
applied unless the -d option is also specified.
-m file
Specify the name of a file containing position-sensitive tests
that shall be applied to a file in order to classify it (see the
EXTENDED DESCRIPTION).
If the -m option is specified without specifying the -d option or the
-M option, position-sensitive default system tests shall be applied
after the position-sensitive tests specified by the -m option. If the
-M option is specified with the -d option, the -m option, or both, or
the -m option is specified with the -d option, the concatenation of the
position-sensitive tests specified by these options shall be applied in
the order specified by the appearance of these options. If a -M or -m
file option-argument is -, the results are unspecified.
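The ordering rules in the preceding paragraph can be modeled in a few lines of shell. This is a sketch of one reading of those rules, not part of the standard: the function name position_test_order is hypothetical, the word "default" stands for the position-sensitive default system tests, and grouped options such as -dh are not handled.

```shell
# Hypothetical helper: print the order in which sets of position-sensitive
# tests would be applied for a given sequence of -d, -M and -m options.
# "default" stands for the position-sensitive default system tests.
position_test_order() {
    order='' has_d=no has_M=no
    while [ $# -gt 0 ]; do
        case $1 in
            -d) order="$order default"; has_d=yes; shift ;;
            -M) order="$order $2"; has_M=yes; shift 2 ;;
            -m) order="$order $2"; shift 2 ;;
            *)  break ;;
        esac
    done
    # With neither -d nor -M (including no options at all), the default
    # tests follow any tests named by -m.
    if [ "$has_d" = no ] && [ "$has_M" = no ]; then
        order="$order default"
    fi
    printf '%s\n' "${order# }"
}

position_test_order -m my.magic             # my.magic default
position_test_order -M my.magic             # my.magic
position_test_order -d -M my.magic          # default my.magic
position_test_order -m a.magic -M b.magic   # a.magic b.magic
```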
OPERANDS
The following operand shall be supported:
file A pathname of a file to be tested.
INPUT FILES
The file can be any file type.
ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of file:
LANG Provide a default value for the internationalization variables
that are unset or null. (See the Base Definitions volume of
IEEE Std 1003.1-2001, Section 8.2, Internationalization Vari‐
ables for the precedence of internationalization variables used
to determine the values of locale categories.)
LC_ALL If set to a non-empty string value, override the values of all
the other internationalization variables.
LC_CTYPE
Determine the locale for the interpretation of sequences of
bytes of text data as characters (for example, single-byte as
opposed to multi-byte characters in arguments and input files).
LC_MESSAGES
Determine the locale that should be used to affect the format
and contents of diagnostic messages written to standard error
and informative messages written to standard output.
NLSPATH
Determine the location of message catalogs for the processing of
LC_MESSAGES.
STDOUT
In the POSIX locale, the following format shall be used to identify
each file operand specified:
"%s: %s\n", <file>, <type>
The values for <type> are unspecified, except that in the POSIX locale,
if file is identified as one of the types listed in the following ta‐
ble, <type> shall contain (but is not limited to) the corresponding
string, unless the file is identified by a position-sensitive test
specified by a -M or -m option. Each space shown in the strings shall
be exactly one <space>.
Table: File Utility Output Strings

If file is:                                   <type> shall contain the string:  Notes
Nonexistent                                   cannot open
Block special                                 block special                     1
Character special                             character special                 1
Directory                                     directory                         1
FIFO                                          fifo                              1
Socket                                        socket                            1
Symbolic link                                 symbolic link to                  1
Regular file                                  regular file                      1,2
Empty regular file                            empty                             3
Regular file that cannot be read              cannot open                       3
Executable binary                             executable                        4,6
ar archive library (see ar)                   archive                           4,6
Extended cpio format (see pax)                cpio archive                      4,6
Extended tar format (see ustar in pax)        tar archive                       4,6
Shell script                                  commands text                     5,6
C-language source                             c program text                    5,6
FORTRAN source                                fortran program text              5,6
Regular file whose type cannot be determined  data

Notes:
1. This is a file type test.
2. This test is applied only if the -i option is specified.
3. This test is applied only if the -i option is not specified.
4. This is a position-sensitive default system test.
5. This is a context-sensitive default system test.
6. Position-sensitive default system tests and context-sensi‐
tive default system tests are not applied if the -M option
is specified unless the -d option is also specified.
In the POSIX locale, if file is identified as a symbolic link (see the
-h option), the following alternative output format shall be used:
"%s: %s %s\n", <file>, <type>, <contents of link>
If the file named by the file operand does not exist, cannot be read,
or the type of the file named by the file operand cannot be determined,
this shall not be considered an error that affects the exit status.
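Because <type> only has to contain the listed string, a script that consumes this output is more robust if it matches substrings than if it compares whole lines. The sketch below illustrates that approach. The function name type_category and the category names it prints are hypothetical, and only the file type strings from the table are handled.

```shell
# Hypothetical helper: given the operand and the line written for it,
# print a coarse category based on the strings in the output table.
type_category() {
    operand=$1 line=$2
    # Strip the known "<file>: " prefix instead of splitting on ": ",
    # because the pathname itself may contain a colon and a space.
    kind=${line#"$operand: "}
    case $kind in
        *'cannot open'*)       echo unreadable ;;
        *'symbolic link to'*)  echo symlink ;;
        *'block special'*|*'character special'*)
                               echo device ;;
        *directory*)           echo directory ;;
        *fifo*)                echo fifo ;;
        *socket*)              echo socket ;;
        *)                     echo other ;;
    esac
}

type_category /tmp '/tmp: directory'                 # directory
type_category 'a: b' 'a: b: symbolic link to c'      # symlink
```

The patterns are tested in order, so the more specific strings are listed first.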
STDERR
The standard error shall be used only for diagnostic messages.
EXTENDED DESCRIPTION
A file specified as an option-argument to the -m or -M options shall
contain one position-sensitive test per line, which shall be applied to
the file. If the test succeeds, the message field of the line shall be
printed and no further tests shall be applied, with the exception that
tests on immediately following lines beginning with a single '>' char‐
acter shall be applied.
Each line shall be composed of the following four <blank>-separated
fields:
offset An unsigned number (optionally preceded by a single '>'
character) specifying the offset, in bytes, of the value in the file
that is to be compared against the value field of the line. If
the file is shorter than the specified offset, the test shall
fail.
If the offset begins with the character '>', the test contained in the
line shall not be applied to the file unless the test on the last line
for which the offset did not begin with a '>' was successful. By
default, the offset shall be interpreted as an unsigned decimal number.
With a leading 0x or 0X, the offset shall be interpreted as a hexadeci‐
mal number; otherwise, with a leading 0, the offset shall be inter‐
preted as an octal number.
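The offset rules map directly onto shell arithmetic, which uses the same conventions for decimal, octal, and hexadecimal constants. A minimal sketch follows. The function name parse_offset and its output format (a continuation flag followed by the offset in decimal) are assumptions made for illustration.

```shell
# Hypothetical helper: print "<continuation> <offset in decimal>" for one
# offset field, or return 1 if the field is not a valid unsigned number.
parse_offset() {
    spec=$1 cont=no
    case $spec in
        '>'*) cont=yes; spec=${spec#'>'} ;;
    esac
    case $spec in
        0[xX]*) digits=${spec#0[xX]}
                case $digits in ''|*[!0-9a-fA-F]*) return 1 ;; esac ;;
        0*)     case $spec in *[!0-7]*) return 1 ;; esac ;;
        *)      case $spec in ''|*[!0-9]*) return 1 ;; esac ;;
    esac
    # Shell arithmetic reads 0x... as hexadecimal and 0... as octal.
    printf '%s %d\n' "$cont" "$(($spec))"
}

parse_offset 0x10    # no 16
parse_offset 010     # no 8
parse_offset '>2'    # yes 2
```

The same number syntax is used for mask values and for numeric value fields, so the validation part of the sketch would apply to them as well.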
type The type of the value in the file to be tested. The type shall
consist of the type specification characters c, d, f, s, and u,
specifying character, signed decimal, floating point, string,
and unsigned decimal, respectively.
The type string shall be interpreted as the bytes from the file start‐
ing at the specified offset and including the same number of bytes
specified by the value field. If insufficient bytes remain in the file
past the offset to match the value field, the test shall fail.
The type specification characters d, f, and u can be followed by an
optional unsigned decimal integer that specifies the number of bytes
represented by the type. The type specification character f can be
followed by an optional F, D, or L, indicating that the value is of
type float, double, or long double, respectively. The type specifica‐
tion characters d and u can be followed by an optional C, S, I, or L,
indicating that the value is of type char, short, int, or long,
respectively.
The default number of bytes represented by the type specifiers d, f,
and u shall correspond to their respective C-language types as follows.
If the system claims conformance to the C-Language Development Utili‐
ties option, those specifiers shall correspond to the default sizes
used in the c99 utility. Otherwise, the default sizes shall be
implementation-defined.
For the type specifier characters d and u, the default number of bytes
shall correspond to the size of a basic integer type of the implementa‐
tion. For these specifier characters, the implementation shall support
values of the optional number of bytes to be converted corresponding to
the number of bytes in the C-language types char, short, int, or long.
These numbers can also be specified by an application as the characters
C, S, I, and L, respectively. The byte order used when interpreting
numeric values is implementation-defined, but shall correspond to the
order in which a constant of the corresponding type is stored in memory
on the system.
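Since numeric values are read in the byte order of the system, the same two bytes can match different value fields on different machines. The od utility also reads multi-byte integers in host order, so it can be used to see what a 2-byte test would compare. This is a sketch only: the function name host_short is hypothetical, and an od that accepts -t u2 is assumed.

```shell
# Hypothetical helper: read the first two bytes of standard input as an
# unsigned 2-byte integer in the byte order of the host.
host_short() {
    od -An -tu2 -N2 | tr -d ' \n'
}

# Bytes 0x71 0xC7: prints 29127 (octal 070707) on a big-endian host and
# 51057 (octal 0143561) on a little-endian host.
printf '\161\307' | host_short; echo
```

The two octal constants are the values used for "cpio archive" and "Byte-swapped cpio archive" in the example magic file in RATIONALE.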
For the type specifier f, the default number of bytes shall correspond
to the number of bytes in the basic double precision floating-point
data type of the underlying implementation. The implementation shall
support values of the optional number of bytes to be converted corre‐
sponding to the number of bytes in the C-language types float, double,
and long double. These numbers can also be specified by an application
as the characters F, D, and L, respectively.
All type specifiers, except for s, can be followed by a mask specifier
of the form &number. The mask value shall be AND'ed with the value of
the input file before the comparison with the value field of the line
is made. By default, the mask shall be interpreted as an unsigned deci‐
mal number. With a leading 0x or 0X, the mask shall be interpreted as
an unsigned hexadecimal number; otherwise, with a leading 0, the mask
shall be interpreted as an unsigned octal number.
The strings byte, short, long, and string shall also be supported as
type fields, being interpreted as dC, dS, dL, and s, respectively.
value The value to be compared with the value from the file.
If the specifier from the type field is s or string, then interpret the
value as a string. Otherwise, interpret it as a number. If the value is
a string, then the test shall succeed only when a string value exactly
matches the bytes from the file.
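A string test of this kind can be imitated with dd, which extracts exactly as many bytes as the value field contains. The sketch below is illustrative only. The function name string_test is hypothetical, and the value is assumed to be plain single-byte text without NUL bytes, because a shell variable cannot hold NUL.

```shell
# Hypothetical helper: succeed if the bytes of FILE starting at OFFSET
# exactly match STRING. Assumes STRING is single-byte text without NUL.
string_test() {
    f=$1 off=$2 want=$3
    # The trailing "." keeps command substitution from stripping any
    # newlines that are part of the data; it is removed again below.
    got=$(dd if="$f" bs=1 skip="$off" count="${#want}" 2>/dev/null; echo .)
    [ "${got%.}" = "$want" ]
}
```

If fewer bytes remain past the offset than the value requires, dd returns a shorter string and the comparison fails, which matches the rule for insufficient bytes.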
If the value is a string, it can contain the following sequences:
The backslash-escape sequences as specified in the Base Defini‐
tions volume of IEEE Std 1003.1-2001, Table 5-1, Escape
Sequences and Associated Actions ( '\\', '\a', '\b', '\f', '\n',
'\r', '\t', '\v' ). The results of using any other character,
other than an octal digit, following the backslash are
unspecified.
Octal sequences that can be used to represent characters with
specific coded values. An octal sequence shall consist of a
backslash followed by the longest sequence of one, two, or three
octal-digit characters (01234567). If the size of a byte on the
system is greater than 9 bits, the valid escape sequence used to
represent a byte is implementation-defined.
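The printf utility accepts the same backslash-octal notation in its format operand, which makes it a convenient way to build small inputs for trying out a string test. The sketch below writes the two bytes \037\235 followed by one further byte. The function name write_sample_header and the choice of the third byte (octal 220) are assumptions for illustration, and mktemp is assumed to be available.

```shell
# Hypothetical helper: write a three-byte header to the named file.
# The first two bytes are the ones a value field of \037\235 would match.
write_sample_header() {
    printf '\037\235\220' > "$1"
}

sample=$(mktemp)
write_sample_header "$sample"
od -An -to1 "$sample"      # lists the three bytes in octal: 037 235 220
rm -f "$sample"
```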
By default, any value that is not a string shall be interpreted as a
signed decimal number. Any such value, with a leading 0x or 0X, shall
be interpreted as an unsigned hexadecimal number; otherwise, with a
leading zero, the value shall be interpreted as an unsigned octal
number.
If the value is not a string, it can be preceded by a character indi‐
cating the comparison to be performed. Permissible characters and the
comparisons they specify are as follows:
= The test shall succeed if the value from the file equals the
value field.
< The test shall succeed if the value from the file is less than
the value field.
> The test shall succeed if the value from the file is greater
than the value field.
& The test shall succeed if all of the set bits in the value field
are set in the value from the file.
^ The test shall succeed if at least one of the set bits in the
value field is not set in the value from the file.
x The test shall succeed if the file is large enough to contain a
value of the type specified starting at the offset specified.
message The message to be printed if the test succeeds. The message
shall be interpreted using the notation for the printf format‐
ting specification; see printf(). If the value field was a
string, then the value from the file shall be the argument for
the printf formatting specification; otherwise, the value from
the file shall be the argument.
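The mask, comparison, and message rules for numeric tests can be tied together in a short sketch. It is not how any implementation of file evaluates magic lines: the function name magic_compare is hypothetical, the value read from the file is supplied as a plain number, and the caller is assumed to have already checked that the file is long enough.

```shell
# Hypothetical helper: apply one numeric comparison. OP is one of
# = < > & ^ x, FROM_FILE is the (already masked) value read from the
# file, and FIELD is the number taken from the value field.
magic_compare() {
    op=$1 from_file=$2 field=${3:-0}
    case $op in
        '=') [ "$(( $from_file == $field ))" -eq 1 ] ;;
        '<') [ "$(( $from_file <  $field ))" -eq 1 ] ;;
        '>') [ "$(( $from_file >  $field ))" -eq 1 ] ;;
        '&') [ "$(( ($from_file & $field) == $field ))" -eq 1 ] ;;
        '^') [ "$(( ($from_file & $field) != $field ))" -eq 1 ] ;;
        x)   true ;;
        *)   return 2 ;;
    esac
}

# Assumed sample: the byte found at offset 2 of some file.
byte=0x90

# Magic line ">2 byte&0x80 >0 Block compressed"
magic_compare '>' "$(( $byte & 0x80 ))" 0 && printf 'Block compressed\n'

# Magic line ">2 byte&0x1f x %d bits": the masked value is the argument
# for the printf conversion in the message.
magic_compare x "$(( $byte & 0x1f ))" && printf '%d bits\n' "$(( $byte & 0x1f ))"
```

With the assumed byte 0x90, the sketch prints "Block compressed" and then "16 bits".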
EXIT STATUS
The following exit values shall be returned:
0 Successful completion.
>0 An error occurred.
CONSEQUENCES OF ERRORS
Default.
The following sections are informative.
APPLICATION USAGE
The file utility can only be required to guess at many of the file
types because only exhaustive testing can determine some types with
certainty. For example, binary data on some implementations might match
the initial segment of an executable or a tar archive.
Note that the table indicates that the output contains the stated
string. Systems may add text before or after the string. For executa‐
bles, as an example, the machine architecture and various facts about
how the file was link-edited may be included. Note also that on systems
that recognize shell script files starting with "#!" as executable
files, these may be identified as executable binary files rather than
as shell scripts.
EXAMPLES
Determine whether an argument is a binary executable file:
file "$1" | grep -Fq executable &&
printf "%s is executable.\n" "$1"
RATIONALE
The -f option was omitted because the same effect can (and should) be
obtained using the xargs utility.
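A minimal sketch of the xargs replacement for the historical -f option follows. The function name file_from_list is hypothetical, and the list is assumed to contain pathnames without whitespace, quotes, or backslashes, because xargs treats those characters specially.

```shell
# Hypothetical helper: run file on every pathname listed in LIST.
# xargs splits its input on blanks and newlines, so pathnames that
# contain whitespace, quotes or backslashes are not handled here.
file_from_list() {
    xargs file < "$1"
}

# Usage (names.txt is an assumed list with one pathname per line):
#   file_from_list names.txt
```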
Historical versions of the file utility attempt to identify the follow‐
ing types of files: symbolic link, directory, character special, block
special, socket, tar archive, cpio archive, SCCS archive, archive
library, empty, compress output, pack output, binary data, C source,
FORTRAN source, assembler source, nroff/troff/eqn/tbl source, troff
output, shell script, C shell script, English text, ASCII text, various
executables, APL workspace, compiled terminfo entries, and CURSES
screen images. Only those types that are reasonably well specified in
POSIX or are directly related to POSIX utilities are listed in the
table.
Historical systems have used a "magic file" named /etc/magic to help
identify file types. Because it is generally useful for users and
scripts to be able to identify special file types, the -m flag and a
portable format for user-created magic files have been specified. No
requirement is made that an implementation of file use this method of
identifying files, only that users be permitted to add their own
classifying tests.
In addition, three options have been added to historical practice. The
-d flag has been added to permit users to cause their tests to follow
any default system tests. The -i flag has been added to permit users to
test portably for regular files in shell scripts. The -M flag has been
added to permit users to ignore any default system tests.
The IEEE Std 1003.1-2001 description of default system tests and the
interaction between the -d, -M, and -m options did not clearly indicate
that there were two types of "default system tests". The
"position-sensitive tests" determine file types by looking for certain string or
binary values at specific offsets in the file being examined. These
position-sensitive tests were implemented in historical systems using
the magic file described above. Some of these tests are now built into
the file utility itself on some implementations so the output can pro‐
vide more detail than can be provided by magic files. For example, a
magic file can easily identify a core file on most implementations, but
cannot name the program file that dropped the core. A magic file could
produce output such as:
/home/dwc/core: ELF 32-bit MSB core file SPARC Version 1
but by building the test into the file utility, you could get output
such as:
/home/dwc/core: ELF 32-bit MSB core file SPARC Version 1, from 'testprog'
These extended built-in tests are still to be treated as position-sen‐
sitive default system tests even if they are not listed in /etc/magic
or any other magic file.
The context-sensitive default system tests were always built into the
file utility. These tests looked for language constructs in text files
trying to identify shell scripts, C, FORTRAN, and other computer lan‐
guage source files, and even plain text files. With the addition of the
-m and -M options the distinction between position-sensitive and con‐
text-sensitive default system tests became important because the order
of testing is important. The context-sensitive default system tests
should never be applied before any position-sensitive tests, even if the
-d option is specified before a -m or -M option. Otherwise, there is a
high probability that the context-sensitive default system tests would
identify an arbitrary file as a text file before the position-sensitive
tests specified by the -m or -M option could be applied to give a more
accurate identification.
Leaving the meaning of -M - and -m - unspecified allows an existing
prototype of these options to continue to work in a backwards-compati‐
ble manner. (In that implementation, -M - was roughly equivalent to -d
in IEEE Std 1003.1-2001.)
The historical -c option was omitted as not particularly useful to
users or portable shell scripts. In addition, a reasonable implementa‐
tion of the file utility would report any errors found each time the
magic file is read.
The historical format of the magic file was the same as that specified
by the Rationale in the ISO POSIX-2:1993 standard for the offset,
value, and message fields; however, it used less precise type fields
than the format specified by the current normative text. The new type
field values are a superset of the historical ones.
The following is an example magic file:
0    short        070707               cpio archive
0    short        0143561              Byte-swapped cpio archive
0    string       070707               ASCII cpio archive
0    long         0177555              Very old archive
0    short        0177545              Old archive
0    short        017437               Old packed data
0    string       \037\036             Packed data
0    string       \377\037             Compacted data
0    string       \037\235             Compressed data
>2   byte&0x80    >0                   Block compressed
>2   byte&0x1f    x                    %d bits
0    string       \032\001             Compiled Terminfo Entry
0    short        0433                 Curses screen image
0    short        0434                 Curses screen image
0    string       <ar>                 System V Release 1 archive
0    string       !<arch>\n__.SYMDEF   Archive random library
0    string       !<arch>              Archive
0    string       ARF_BEGARF           PHIGS clear text archive
0    long         0x137A2950           Scalable OpenFont binary
0    long         0x137A2951           Encrypted scalable OpenFont binary
The use of a basic integer data type is intended to allow the
implementation to choose a word size commonly used by applications on that
processor.
SEE ALSO
ar, ls, pax
COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 6, Copyright (C) 2001-2003 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group. In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online
at http://www.opengroup.org/unix/online.html .
IEEE/The Open Group 2003 FILE(1P)