This lecture looks at how to create batch files under Unix,
and discusses the most common commands used in them.
It also looks at file redirection.
File permissions
Files within a file system usually have some attributes
set which control permissions to access them.
The mode of a Unix file is shown by ls -l as 10 characters: a file-type
character followed by nine permission bits, as in
-rwxr-x---
(read, write, execute for owner; read, execute for group; none for others).
Modes can be added or subtracted using the chmod command, e.g.
chmod ug+rx file
chmod o-rwx file
Modes may also be set numerically using a 3-digit octal number. The first
digit sets user permissions, the second sets group permissions and the last
sets other permissions. Read has value 4, write has value 2, execute has value
1, and the bits are or-ed together to give a single octal digit. For example
chmod 751 file
gives r,w,x permission to user; r,x to group; x to other.
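
As a rough check of the arithmetic above, the following sketch sets the same mode once numerically and once symbolically and compares what ls reports. The scratch file name is made up for the illustration, and ls -l with cut is just one convenient way of reading the mode string.

```shell
# Hypothetical scratch file; the name is only for illustration.
f=/tmp/perm_demo.$$
: > "$f"                          # create an empty file

chmod 751 "$f"                    # 7 = rwx (4+2+1), 5 = r-x (4+1), 1 = --x
numeric=$(ls -l "$f" | cut -c1-10)

chmod 000 "$f"                    # clear everything, then rebuild symbolically
chmod u+rwx,g+rx,o+x "$f"
symbolic=$(ls -l "$f" | cut -c1-10)

echo "$numeric"
echo "$symbolic"
rm -f "$f"
```

Both lines printed should be the same mode string, since the two chmod forms describe the same permission bits.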
A batch job is a command or set of commands that are entered into a queue of
jobs to be executed at some time suitable to the O/S.
A batch job must contain enough information to allow the O/S to determine
input and output files, resources required, etc.
MSDOS command files are required to end in .bat. command.com then reads the
set of instructions and executes them. It recognises a particular language,
e.g.
cls
PATH c:\dos
if exist c:lcl.bat c:\lcl
(clear the screen; set the search path for executables to c:\dos;
if the file c:lcl.bat exists then execute it.)
The common Unix command line interpreters attempt to execute files if execute
permission is set. Executable files produced by the compilers have a `magic'
number stored in the first few bytes of the file to say what type it is.
Executable files without a magic number are assumed to be command files.
The information about magic numbers is stored in the file
/etc/magic. For example, an executable for a VAX starts with
(octal) 0575, whereas an MSDOS library file begins with (hex) 0xf0.
If the first line of the command file is
#!interpreter
(e.g. #!/bin/sh) then that interpreter is executed to process the command file.
Otherwise the current command interpreter is used to execute the command file.
For the current course, as used at UC in 95/2, the bash interpreter is
used. To make sure the correct shell is used, make the first line of
your program
#!/bin/bash
A full programming language is accepted by the Unix command interpreters such
as sh, csh, tcsh, bash, ksh, zsh. A typical file might be
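clear
PATH=/bin:/usr/bin
if [ -x myfile ]
then
./myfile
fi
(This does the Unix equivalent of the MSDOS script above. Directories in PATH
are separated by colons; since the current directory is not in this PATH, the
file is run as ./myfile.)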
To make a Unix command file (or shell script) you first edit the file to store
the commands and save the file. You then change the mode to executable by
chmod u+x file
It can then be executed by just typing its name.
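
The edit, chmod, run cycle described above can be sketched as follows. To keep the sketch self-contained, a here-document stands in for the editor, and the script name is a made-up example. The script is run by its full path because /tmp is not normally in PATH.

```shell
script=/tmp/hello.$$              # hypothetical file name
# Store the commands in the file (normally done with an editor).
cat > "$script" <<'EOF'
#!/bin/sh
echo "hello from a shell script"
EOF

chmod u+x "$script"               # make it executable for the owner
output=$("$script")               # run it by name and capture what it prints
echo "$output"
rm -f "$script"
```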
A program may often need to execute with a set of arguments given at runtime.
These are usually either keyword arguments or positional arguments, e.g. on
the A9, linking of internal files to external files is done by the FILE word.
Example:
In C, the arguments are made available to the program by an array of strings
(conventionally called argv). This array is indexed by position, where the
zero'th position is the command name itself.
#include <stdio.h>

/* argv[0] holds the name the program was invoked with */
int main(int argc, char *argv[])
{
    printf("program name is %s\n", argv[0]);
    return 0;
}
MSDOS command files make arguments available as positional arguments, %0,
%1, ..., %9. This echoes the command name for any .bat file:
echo Command name is %0
The Unix shells have positional arguments $0 (name of the command file), $1,
$2,...,$9. In addition $* is the string of all arguments $1,..., and $# is
the total number of arguments, e.g.
echo "the name of this \
command is $0"
echo "and it has $# arguments"
(Lines that are too long may be broken up using the continuation
character `\'. Inside double quotes the backslash and the newline are both
removed, so the space before the backslash is needed to keep the words apart.)
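
A small sketch of the positional parameters in use. It uses a shell function rather than a separate command file so that it can be pasted straight into a shell; inside a function $1, $# and $* refer to the function's own arguments in the same way. The function name show_args is invented for the illustration.

```shell
# show_args is a hypothetical helper; inside a function, $1, $# and $*
# refer to the function's own arguments, just as they would in a script.
show_args() {
    echo "count: $#"
    echo "first: $1"
    echo "all: $*"
}

show_args alpha beta gamma
```

Note that $0 is not changed by calling a function, so the command name itself is best examined from a real command file.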
In batch processing systems there is no interactive I/O. In interactive systems
there is an input stream and an output stream, and perhaps an error output
stream. Usually these will be connected by the command interpreter to the
keyboard and screen. Some systems allow these to be changed so that a command
can be run with different I/O streams.
Example:
MSDOS allows redirection of input and output
command < input-file
command > output-file
e.g. to display the contents of file myfile.txt
more < myfile.txt
Unix allows redirection of standard input (stdin, 0), standard output
(stdout, 1) and error output (stderr, 2):
command 0< in 1> out 2> error
command < in > out 2> error
In addition, output may be appended to a file (rather than overwriting
existing contents) by
command >> file
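
The redirections above can be tried out with a sketch like the following. The file names are made up, and the >&2 used to produce some error output is an extra piece of shell syntax not covered above (it sends a command's output to stderr).

```shell
out=/tmp/redir_out.$$             # hypothetical file names
err=/tmp/redir_err.$$

# A command group that writes one line to stdout and one to stderr;
# each stream is sent to its own file.
{ echo "normal output"; echo "error output" >&2; } > "$out" 2> "$err"

echo "appended line" >> "$out"    # >> adds to the file instead of truncating

out_lines=$(wc -l < "$out" | tr -d ' ')
err_text=$(cat "$err")
echo "stdout file has $out_lines lines; stderr file holds: $err_text"
rm -f "$out" "$err"
```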
The shells also allow the output of one command to be made the input to another,
forming a pipeline
command1 | command2
Programs designed to work in pipelines are often called filters.
These can all be used in command files. For example, you can create a command
called ``manprint'' that contains
man $1 | lpr
that is run by
manprint cp
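
As a further illustration of filters in a pipeline, this sketch finds the most frequent word in a short list. The word list is invented; sort, uniq, and head are standard filters.

```shell
# Count how many times each word occurs: sort groups duplicates together,
# uniq -c counts each group, and sort -rn puts the largest count first.
top=$(printf '%s\n' pear apple pear fig apple pear |
      sort | uniq -c | sort -rn | head -n 1)
echo "$top"
```

Each stage reads standard input and writes standard output, which is what lets the stages be rearranged or replaced freely.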
There are about 200 commands that are supplied with every Unix system, many
of them quite obscure. The Unix philosophy is to have lots of tools specialised
to particular tasks rather than big tools that do everything.
cmp
Compare two files for differences. Returns true (exit status 0) if they are
the same. This command is mainly useful in shell scripts where a Boolean
value is required from a file comparison.
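
A minimal sketch of cmp supplying the Boolean value for an if statement. The file names are hypothetical; the -s option keeps cmp silent so that only its exit status matters.

```shell
a=/tmp/cmp_a.$$; b=/tmp/cmp_b.$$; c=/tmp/cmp_c.$$   # hypothetical names
echo "same text" > "$a"
echo "same text" > "$b"
echo "other text" > "$c"

# -s suppresses cmp's own messages so that only the exit status is used.
if cmp -s "$a" "$b"; then ab=identical; else ab=different; fi
if cmp -s "$a" "$c"; then ac=identical; else ac=different; fi
echo "a vs b: $ab"
echo "a vs c: $ac"
rm -f "$a" "$b" "$c"
```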
find
Find files that match a pattern and perform an action on them. This command
searches for files recursively from a given directory, e.g.
find . -name "*" -print
find / -name core -exec rm {} \;
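
To try these forms of find safely, the sketch below builds a small scratch directory (all names are made up) and runs the same kinds of search inside it rather than from / or the current directory.

```shell
dir=/tmp/find_demo.$$             # hypothetical scratch directory
mkdir -p "$dir/sub"
: > "$dir/notes.txt"
: > "$dir/sub/more.txt"
: > "$dir/sub/core"

# Quote the pattern so the shell passes it to find unexpanded.
txt_count=$(find "$dir" -name "*.txt" -print | wc -l | tr -d ' ')

# -exec runs rm once per match; {} is replaced by the file's path.
find "$dir" -name core -exec rm {} \;
core_left=$(find "$dir" -name core -print | wc -l | tr -d ' ')

echo "$txt_count .txt files found, $core_left core files left"
rm -rf "$dir"
```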
grep
Search for lines in a file containing a pattern.
grep pattern file
It prints each line in the file that matches the pattern.
You won't believe how useful this is till you have a lot of files...
With multiple files, you also get the filename as well as the line
matched.
The -l option just tells you which files contain matching lines.
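
The multi-file and -l behaviour can be seen with a sketch like this one. The three small files and their contents are invented for the illustration.

```shell
dir=/tmp/grep_demo.$$             # hypothetical scratch directory
mkdir -p "$dir"
printf 'root entry\nguest entry\n' > "$dir/one"
printf 'nothing here\n'            > "$dir/two"
printf 'another root line\n'       > "$dir/three"

# With several files, each matching line is prefixed by its file name.
matches=$(cd "$dir" && grep root one two three)

# With -l, only the names of files containing a match are listed.
files=$(cd "$dir" && grep -l root one two three)

echo "$matches"
echo "$files"
rm -rf "$dir"
```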
sed
Stream editor. This is useful for on-the-fly editing, typically of small strings.
sed 1,10d file
prints file with lines 1-10 deleted
sed 20q file
prints first 20 lines and then quits
sed -n 20,30p file
-n turns off default printing, so only prints lines 20 to 30
sed 's/old/new/' file
prints file with the first occurrence of ``old'' on each line changed to ``new''
(use 's/old/new/g' to change every occurrence).
If no file is given, sed reads from standard input, e.g. to remove the first
header line from ps output
ps | sed 1d
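
Because sed reads standard input when no file is named, the commands above are easy to experiment with on made-up input. This is only a sketch; the five numbered lines stand in for a real file or for the output of a command such as ps.

```shell
# Five numbered lines stand in for a file's contents.
input=$(printf 'line %s\n' 1 2 3 4 5)

no_header=$(echo "$input" | sed 1d)          # drop the first line
middle=$(echo "$input" | sed -n 2,3p)        # print only lines 2 to 3
first_only=$(echo "a-a-a" | sed 's/a/b/')    # first match on the line
every=$(echo "a-a-a" | sed 's/a/b/g')        # g flag: all matches

echo "$no_header"
echo "$middle"
echo "$first_only $every"
```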
Many editors allow you to search for a string in a file. Usually this is just
an ordinary piece of text. Sometimes you want something more complex. e.g.
search for either ``the'' or ``The''.
The Turbo Pascal editor allows Ctrl-A to stand for any single character.
The Unix utilities grep, ed, vi, sed, awk, emacs etc., all support a particular
type of pattern.
This is also available from C using the regexp or regex libraries, and is available
in some languages such as tcl and perl.
Because the utilities grep and sed are often used in Unix shell programs it
is worth looking at their pattern mechanism.
The simplest patterns are
^ matches the start of a line
$ matches the end of a line
. matches any single character
* after a character matches zero or more occurrences of that character
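
These four patterns can be tried out with grep on a small word list. The list and the helper function count_matches are invented for this sketch; grep -c simply counts the matching lines.

```shell
# A small hypothetical word list, one word per line.
words=$(printf '%s\n' the The then other bathe ct cat caat)

# count_matches is a helper invented for this sketch: it reports how many
# lines of the word list match the pattern given as its argument.
count_matches() { echo "$words" | grep -c "$1"; }

echo "starts with the:  $(count_matches '^the')"     # the, then
echo "ends with the:    $(count_matches 'the$')"     # the, bathe
echo "c, any char, t:   $(count_matches '^c.t$')"    # cat
echo "c, zero+ a's, t:  $(count_matches '^ca*t$')"   # ct, cat, caat
```

Quoting the patterns with single quotes stops the shell from interpreting $ and * itself before grep sees them.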
To match everything from the beginning of the line up to (but not including)
the first full stop, then the full stop itself, then everything from there to
the end of the line, saving both parts (but not the full stop):
^\([^.]*\)\.\(.*\)$
The two saved parts can then be referred to, in reverse order, as
\2\1
For example, to change ``John.Smith'' to ``Smith, John'':
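
One way to do this, as a sketch, is to put the pattern into a sed substitution and write the two saved parts back in reverse order with a comma and a space between them:

```shell
# \1 is the text before the full stop, \2 the text after it.
swapped=$(echo "John.Smith" | sed 's/^\([^.]*\)\.\(.*\)$/\2, \1/')
echo "$swapped"
```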
The examples keep doing a
cd /tmp
If you want to create a temporary file, you need somewhere to do it.
You could create it in your current directory.
But what if you wanted to, say, count the number of files in your
current directory - you would have just pushed the count out by one
with your new temporary file!
The directory /tmp is for programs that need temporary
files. Create your own in there. Files there are usually removed
at regular intervals, so don't keep anything important there.
As a courtesy, and to save file space, remove your temporary files when
you are done with them.
We all like simple filenames, like tmp.
In a multi-user system, where we have switched to a common directory
like /tmp we can't all have filenames /tmp/tmp.
If you create the file and I want to use it, then I can't because of
file permissions - or worse, if you let me write to it then we end up with
each other's garbage in it.
The shell value $$ is the current process id. This is guaranteed to be
unique among all existing processes. Whenever you use $$ it substitutes
the current process id. Everyone (like you from now on) adopts the convention
of using this on filenames where there might be a clash of names.
When you have finished with this filename
clean up afterwards by removing the file.
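
Putting the last few points together, a sketch of the convention might look like this. The base name count is arbitrary; the task (counting the entries in the current directory) is the one mentioned earlier, and because the temporary file lives in /tmp it does not disturb the count.

```shell
tmpfile=/tmp/count.$$             # e.g. /tmp/count.12345; the base name is arbitrary

ls > "$tmpfile"                   # list the current directory into the temp file
n=$(wc -l < "$tmpfile" | tr -d ' ')
echo "current directory has $n entries"

rm -f "$tmpfile"                  # clean up when finished
```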
There are variations among the different versions of Unix, which are
fortunately disappearing. Some of these variations concern common files.
Here I usually want to look at commands that use text files, and
/etc/ is about the only place that I can guarantee finding
common text files. This is why I use it.
This directory is vital to System Administrators. It is probably not
important to you, unless you want to delve into what is going on
in this part of Unix.