UNIT 1 THEORETICAL CONCEPTS OF UNIX OPERATING SYSTEM


Structure
1.0 Introduction
1.1 Objectives
1.2 Basic Features of UNIX Operating System
1.3 File Structure
1.4 CPU Scheduling
1.5 Memory Management
1.5.1 Swapping
1.5.2 Demand Paging
1.6 File System
1.6.1 Blocks and Fragments
1.6.2 Inodes
1.6.3 Directory Structure
1.7 Summary
1.8 Model Answers

1.0 INTRODUCTION
In Block 2 we discussed several theoretical concepts of operating systems in general. It is
often useful to see them in practice. In this block we present an in-depth study of the UNIX
operating system as an example of the various concepts presented in Block 2. In this unit we
take up how file management, process management and memory management are done in
UNIX. The rest of the units discuss UNIX commands, shell programming, editors, system
administration, etc.

1.1 OBJECTIVES
After going through this unit you will be able to:
List the basic features of the UNIX operating system
Describe UNIX file structure
Discuss CPU scheduling in UNIX system
Discuss memory management schemes in UNIX
Discuss file systems in UNIX operating system

1.2 BASIC FEATURES OF UNIX OPERATING SYSTEM


* It is written in a high-level language, C, making it easy to port to different configurations.
* It is a good operating system, especially for programmers. The UNIX programming
  environment is unusually rich and productive. It provides features that allow complex
  programs to be built from simpler programs.
* It uses a hierarchical file system that allows easy maintenance and efficient implementation.
* It uses a consistent format for files, the byte stream, making application programs
  easier to write.
* It is a multi-user, multiprocess system. Each user can execute several processes
  simultaneously.
* It hides the machine architecture from the user, making it easier to write programs
  that run on different hardware implementations.
UNIX Operating System-I

UNIX System Architecture
As do most computer systems, UNIX consists of two separable parts: the kernel and the system
programs. We can view the UNIX operating system as being layered, as shown in figure 1.

                Application programs created by users

  Shells, editors and commands (who, wc, grep, comp)
  Compilers and interpreters
  System libraries

                System call interface to the kernel

  Signals               File system             CPU scheduling
  Terminal handling     Swapping                Page replacement
  Character I/O system  Block I/O system        Demand paging
  Terminal drivers      Disk and tape drivers   Virtual memory

                Kernel interface to hardware

  Terminal controllers  Device controllers      Memory controllers
  Terminals             Disks and tapes         Physical memory

Figure 1 : UNIX System Architecture

Everything below the system call interface and above the physical hardware is the kernel.
The kernel provides the file system, CPU scheduling, memory management and other
operating system functions through system calls.
Programs such as the shell (sh) and editors (vi) shown in the top layer interact with the kernel
by invoking a well-defined set of system calls. The system calls instruct the kernel to do
various operations for the calling program and exchange data between the kernel and the
program.

System calls for UNIX can be roughly grouped into three categories: file manipulation,
process control and information manipulation. Another category can be considered for
device manipulation, but, since devices in UNIX are treated as (special) files, the same
system calls support both files and devices.
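
The point that devices are handled through the ordinary file calls can be tried out with a short sketch. It uses Python's os module, which exposes thin wrappers over open, write and close; the helper name write_to_device is hypothetical, and /dev/null is chosen only because it exists on essentially every UNIX-like system.

```python
import os


def write_to_device(path: str, payload: bytes) -> int:
    """Write payload to a device file using the ordinary file system calls.

    Returns the number of bytes the kernel reports as written.
    """
    fd = os.open(path, os.O_WRONLY)       # the same open used for regular files
    try:
        return os.write(fd, payload)      # the same write; the kernel routes it to the driver
    finally:
        os.close(fd)                      # the same close


if __name__ == "__main__":
    # /dev/null is a character special file that discards whatever is written to it.
    print(write_to_device("/dev/null", b"hello"))
```

Nothing in the helper is specific to devices: passing the path of a regular file would go through exactly the same three calls.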

1.3 FILE STRUCTURE


A file in UNIX is a sequence of bytes. Different programs expect various levels of structure,
but the kernel does not impose any structure on files, and no meaning is attached to their
contents - the meaning of the bytes depends solely on the programs that interpret the file.
This is true not just of disk files but of peripheral devices as well. Magnetic tapes, mail
messages, characters typed on the keyboard, line printer output, data flowing in pipes - each
of these is just a sequence of bytes as far as the system and the programs in it are concerned.
Files are organized in tree-structured directories. Directories are themselves files that contain
information on how to find other files. A path name to a file is a text string that identifies a
file by specifying a path through the directory structure to the file. Syntactically it consists
of individual file name elements separated by the slash character. For example, in
/usr/Akshay/data, the first slash indicates the root of the directory tree, called the root
directory. The next element, usr, is a subdirectory of the root, Akshay is a subdirectory of usr,
and data is a file or a directory in the directory Akshay.
Figure 2 shows a typical UNIX file system. The file system is organised as a tree with a
single root node called root (written "/"); every non-leaf node of the file system structure is
a directory of files, and the files at the leaf nodes of the tree are either directories, regular
files or special device files. /dev contains device files, such as /dev/console, /dev/lp0,
/dev/mt0 and so on; /bin contains the binaries of essential UNIX system programs.

Figure 2 : UNIX file system

creat, open, read, write, close, unlink and trunc are the system calls used for basic
file manipulation. The creat system call, given a path name, creates an empty file (or
truncates an existing one). An existing file is opened by the open system call, which takes a
path name and a mode (such as read, write or read-write) and returns a small integer called a
file descriptor, which may then be passed to a read or write system call (along with a buffer
address and the number of bytes to transfer) to perform data transfer to or from the file.
A file descriptor is an index into a small table of open files for this process. Descriptors start
at 0 and seldom get higher than 6 or 7 for typical programs, depending on the maximum
number of simultaneously open files.
Each read or write updates the current offset into the file, which is associated with the file
table entry and is used to determine the position in the file for the next read or write.
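
A small experiment makes the descriptor and offset behaviour concrete. The sketch below is only an illustration: it uses Python's os wrappers around open, read, write, lseek, close and unlink, the function name demo_offsets and the scratch file name are hypothetical, and the O_CREAT | O_TRUNC flags are used to approximate the create-or-truncate behaviour described for creat.

```python
import os
import tempfile


def demo_offsets() -> tuple:
    """Return (offset after write, first read, offset after read, second read)."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "data")            # hypothetical scratch file
        # Create the file if missing, or truncate it if it already exists.
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, b"0123456789")                 # write moves the offset forward
            after_write = os.lseek(fd, 0, os.SEEK_CUR)  # ask for the current offset
            os.lseek(fd, 0, os.SEEK_SET)                # rewind to the start of the file
            first = os.read(fd, 4)                      # reads bytes 0..3
            after_read = os.lseek(fd, 0, os.SEEK_CUR)   # read moved the offset too
            second = os.read(fd, 4)                     # continues where the last read stopped
            return after_write, first, after_read, second
        finally:
            os.close(fd)
            os.unlink(path)


if __name__ == "__main__":
    print(demo_offsets())
```

The second read needs no position argument: the kernel remembers the offset in the file table entry, so consecutive reads walk through the file.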

Processing Environment
A process is a program in execution. Many processes can execute simultaneously on a UNIX
system (this feature is sometimes called multiprogramming or multitasking) with no logical
limit to their number, and many instances of a program (such as copy) can exist
simultaneously in the system. Various system calls allow processes to create new processes,
terminate processes, synchronize stages of process execution, and communicate with the rest
of the world. For example, in UNIX new processes are created by the fork system call.
Every process except process 0 is created when another process executes the fork system
call. The process that invoked the fork system call is the parent process and the newly created
process is the child process. Every process has one parent process, but a process can have
many child processes. The kernel identifies each process by its process number, called the
process ID (PID). Process 0 is a special process that is created when the system boots; after
forking a child process (process 1), process 0 becomes the swapper process. Process 1,
known as init, is the ancestor of every process in the system.
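
The parent and child relationship created by fork can be observed directly. The following is a minimal sketch using Python's os.fork, os.getppid and os.waitpid, which wrap the corresponding UNIX system calls; the function name fork_and_wait and the exit code 255 used to signal a mismatch are illustrative choices.

```python
import os


def fork_and_wait(exit_code: int) -> tuple:
    """Fork a child that exits with exit_code; return (child PID, child's exit code)."""
    parent_pid = os.getpid()
    pid = os.fork()                      # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        # Child: its parent process ID should be the PID of the process that forked it.
        same_parent = os.getppid() == parent_pid
        os._exit(exit_code if same_parent else 255)
    _, status = os.waitpid(pid, 0)       # parent sleeps until the child terminates
    return pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    child_pid, code = fork_and_wait(7)
    print("child", child_pid, "exited with", code)
```

The single call returns twice, once in each process, and the return value is the only way the two copies can tell which one they are.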

Check Your Progress


Q1. What are the important features of UNIX operating system?

Q2. What is a file structure in UNIX? What is its importance in UNIX?

1.4 CPU SCHEDULING


CPU scheduling in UNIX is designed to benefit interactive processes. Processes are given
small CPU time slices by a priority algorithm that reduces to round-robin scheduling for
CPU-bound jobs.
The scheduler on a UNIX system belongs to the general class of operating system schedulers
known as round robin with multilevel feedback, which means that the kernel allocates
CPU time to a process for a small time slice, preempts a process that exceeds its time slice and
feeds it back into one of several priority queues. A process may need many iterations through
the "feedback loop" before it finishes. When the kernel does a context switch and restores the
context of a process, the process resumes execution from the point where it had been
suspended.
Each process table entry contains a priority field used for process scheduling. The priority
of a process is lower if it has recently used the CPU, and vice versa.
The more CPU time a process accumulates, the lower (more positive) its priority becomes,
and vice versa, so there is negative feedback in CPU scheduling and it is difficult for a single
process to take all the CPU time. Process aging is employed to prevent starvation.
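
The negative feedback described above can be seen in a toy simulation. This is a simplified model and not the actual kernel algorithm: the base priority of 60, the 60 clock ticks per second, and the rule of halving recent CPU usage once per second are illustrative assumptions, and the function name simulate is hypothetical.

```python
def simulate(names, seconds, ticks_per_second=60, base=60):
    """Toy priority-feedback scheduler; a lower number means a higher priority.

    Returns how many one-second slots each process was given.
    """
    cpu = {n: 0 for n in names}       # recent CPU usage of each process
    prio = {n: base for n in names}   # current priority value of each process
    ran = {n: 0 for n in names}       # number of slots each process has run
    for _ in range(seconds):
        # Run the process with the numerically lowest priority value;
        # ties go to the process listed first.
        current = min(names, key=lambda n: (prio[n], names.index(n)))
        ran[current] += 1
        cpu[current] += ticks_per_second      # the running process accumulates CPU time
        for n in names:
            cpu[n] //= 2                      # assumed decay: old usage fades away
            prio[n] = base + cpu[n] // 2      # more recent usage gives a worse priority
    return ran


if __name__ == "__main__":
    print(simulate(["A", "B", "C"], 6))
```

With three CPU-bound processes the model hands out the CPU in turn, which is the sense in which the priority algorithm reduces to round-robin scheduling for CPU-bound jobs.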
Older UNIX systems used a 1-second quantum for the round-robin scheduling. 4.3BSD
reschedules processes every 0.1 second and recomputes priorities every second. The
round-robin scheduling is accomplished by the time-out mechanism, which tells the clock
interrupt driver to call a kernel subroutine after a specified interval; the subroutine to be
called in this case causes the rescheduling and then resubmits a time-out to call itself again.
The priority recomputation is also timed by a subroutine that resubmits a time-out for itself.
There is no pre-emption of one process by another in the kernel. A process may relinquish
the CPU because it is waiting on I/O or because its time slice has expired. When a process
chooses to relinquish the CPU, it goes to sleep on an event. The kernel primitive used for
this purpose is called sleep (not to be confused with the user-level library routine of the same
name). It takes an argument, which is by convention the address of a kernel data structure
related to an event that the process wants to occur before that process is awakened. When the
event occurs, the system process that knows about it calls wakeup with the address
corresponding to the event, and all processes that had done a sleep on the same address are
put in the ready queue to be run.
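
The sleep and wakeup mechanism is essentially a table keyed by event address. The sketch below models only that bookkeeping; the class ToyKernel, its attribute names, and the use of plain integers as event addresses are hypothetical stand-ins for kernel data structures.

```python
from collections import defaultdict, deque


class ToyKernel:
    """Bookkeeping-only model of sleep and wakeup on event addresses."""

    def __init__(self):
        self.sleeping = defaultdict(list)   # event address -> PIDs sleeping on it
        self.ready = deque()                # ready queue of runnable PIDs

    def sleep(self, pid, event):
        """Record that process pid is waiting for the given event."""
        self.sleeping[event].append(pid)

    def wakeup(self, event):
        """Move every process sleeping on event to the ready queue."""
        woken = self.sleeping.pop(event, [])
        self.ready.extend(woken)
        return woken


if __name__ == "__main__":
    kernel = ToyKernel()
    kernel.sleep(1, 0x100)
    kernel.sleep(2, 0x100)
    kernel.sleep(3, 0x200)
    print(kernel.wakeup(0x100), list(kernel.ready))
```

Note that wakeup releases every sleeper on the address, not just one; each awakened process has to check for itself whether the condition it was waiting for still holds.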
1.5 MEMORY MANAGEMENT

CPU scheduling is strongly influenced by memory management schemes. At least part
of a process must be contained in primary memory to run; a process cannot be executed by a
CPU if it exists entirely in secondary storage. It is also not possible to keep all active
processes in main memory. For example, a 4 MB main memory cannot provide
space for a 5 MB process. It is the job of the memory management module to decide which
processes should reside (at least partially) in main memory, and to manage the parts of the
virtual address space of a process that reside on secondary storage devices. It monitors the
amount of available physical memory and provides swapping of processes between physical
memory and secondary storage devices.

1.5.1 Swapping
Early UNIX systems transferred entire processes between primary
memory and a secondary storage device but did not transfer parts of a process independently,
except for shared text. Such a memory management policy is called swapping. UNIX was
first implemented on the PDP-11, where the total physical memory was limited to 256 Kbytes.
The total memory resources were insufficient to justify or support complex memory
management algorithms. Thus, UNIX swapped entire process memory images.
Allocation of both main memory and swap space is done first-fit. When the size of a
process' memory image increases (due to either stack expansion or data expansion), a new
piece of memory big enough for the whole image is allocated. The memory image is copied,
the old memory is freed, and the appropriate tables are updated. (An attempt is made in some
systems to find memory contiguous to the end of the current piece, to avoid some copying.)
If no single piece of main memory is large enough, the process is swapped out such that it
will be swapped back in with the new size.
There is no need to swap out a sharable text segment, because it is read-only, and there is no
need to read in a sharable text segment for a process when another instance is already in
memory. That is one of the main reasons for keeping track of sharable text segments: less
swap traffic. The other reason is the reduced amount of main memory required for multiple
processes using the same text segment.
Decisions regarding which processes to swap in or swap out are made by the scheduler
process (also known as the swapper). The scheduler wakes up at least once every 4
seconds to check for processes to be swapped in or out. A process is more likely to be
swapped out if it is idle, has been in main memory for a long time, or is large; if no
obvious candidates are found, other processes are picked by age. A process is more likely
to be swapped in if it has been swapped out for a long time, or is small. There are checks to
prevent thrashing, basically by not letting a process be swapped out if it has not been in
memory for a certain amount of time.
If jobs do not need to be swapped out, the process table is searched for a process deserving to
be brought in (determined by how small the process is and how long it has been swapped
out). Processes are swapped in until there is not enough memory available.
Many UNIX systems still use the swapping scheme just described. All Berkeley UNIX
systems, on the other hand, depend primarily on paging for memory-contention management,
and depend only secondarily on swapping. A scheme similar in outline to the traditional one
is used to determine which processes get swapped in or out, but the details differ and the
influence of swapping is less.

1.5.2 Demand Paging
Berkeley introduced demand paging to UNIX with BSD (Berkeley Software Distribution), which
transferred memory pages instead of processes to and from a secondary device; recent
releases of the UNIX system also support demand paging. Demand paging is done in a
straightforward manner. When a process needs a page and the page is not there, a page fault
to the kernel occurs, a frame of main memory is allocated, and then the page is loaded into
the frame by the kernel.
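
The fault, allocate and load sequence can be counted in a small model. The sketch below is an illustration only: the text does not say what happens when no free frame is left, so the first-in first-out eviction used here is an assumption, and the function name count_page_faults is hypothetical. It expects at least one frame.

```python
from collections import OrderedDict


def count_page_faults(references, num_frames):
    """Count page faults for a sequence of page references (num_frames >= 1)."""
    resident = OrderedDict()                 # page -> frame, oldest loaded page first
    free_frames = list(range(num_frames))    # frames not yet holding any page
    faults = 0
    for page in references:
        if page in resident:
            continue                         # page is in memory: no fault
        faults += 1                          # page fault: the kernel takes over
        if free_frames:
            frame = free_frames.pop(0)       # allocate a free frame
        else:
            # Assumed policy: evict the page that was loaded earliest (FIFO).
            _, frame = resident.popitem(last=False)
        resident[page] = frame               # load the page into the frame
    return faults


if __name__ == "__main__":
    print(count_page_faults([1, 2, 3, 1, 2, 4, 1], 3))
```

Only the pages a process actually touches are ever loaded, which is why a process larger than physical memory can still run.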
The advantage of a demand paging policy is that it permits greater flexibility in mapping the
virtual address space of a process into the physical memory of a machine, usually allowing the
size of a process to be greater than the amount of available physical memory and allowing
more processes to fit into main memory. The advantage of a swapping policy is that it is
easier to implement and results in less system overhead.

1.6 FILE SYSTEM


The UNIX file system supports two main objects: files and directories. Directories are just
files with a special format, so the representation of a file is the basic UNIX concept.

1.6.1 Blocks and Fragments


Most of the file system is taken up by data blocks, which contain whatever the users have put
in their files. Let us consider how these data blocks are stored on the disk.
The hardware disk sector is usually 512 bytes. A block size larger than 512 bytes is desirable
for speed. However, because UNIX file systems usually contain a very large number of small
files, much larger blocks would cause excessive internal fragmentation. That is why the
earlier 4.1BSD file system was limited to a 1024-byte (1K) block.

The 4.2BSD solution is to use two block sizes for files which have no indirect blocks: all the
blocks of a file are of a large block size (such as 8K), except the last. The last block is an
appropriate multiple of a smaller fragment size (for example, 1024) to fill out the file. Thus,
a file of size 18,000 bytes would have two 8K blocks and one 2K fragment (which would not
be filled completely).
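
The 18,000-byte example can be checked with a few lines of arithmetic. The sketch below applies the rule as stated (full blocks, then a tail rounded up to whole fragments) and is meant only for files small enough to have no indirect blocks; the function name layout and its return format are assumptions.

```python
def layout(file_size, block_size=8192, frag_size=1024):
    """Return (full blocks, bytes in the tail fragment, unused bytes)."""
    full_blocks, tail = divmod(file_size, block_size)
    frags = -(-tail // frag_size)            # tail rounded up to whole fragments
    if frags * frag_size == block_size:
        # A tail that needs every fragment of a block is simply a full block.
        full_blocks, frags = full_blocks + 1, 0
    allocated = full_blocks * block_size + frags * frag_size
    return full_blocks, frags * frag_size, allocated - file_size


if __name__ == "__main__":
    print(layout(18000))
```

For 18,000 bytes the function reports two full blocks, a 2048-byte fragment and 432 unused bytes, matching the two 8K blocks and one partly filled 2K fragment in the text.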
The block and fragment sizes are set during file-system creation according to the intended
use of the file system: if many small files are expected, the fragment size should be small; if
repeated transfers of large files are expected, the basic block size should be large.
Implementation details force a maximum block-to-fragment ratio of 8 : 1, and a minimum
block size of 4K, so typical choices are 4096 : 512 for the former case and 8192 : 1024 for
the latter.
Suppose data are written to a file in transfer sizes of 1K bytes, and the block and fragment
sizes of the file system are 4K and 512 bytes. The file system will allocate a 1K fragment to
contain the data from the first transfer. The next transfer will cause a new 2K fragment to be
allocated. The data from the original fragment must be copied into this new fragment,
followed by the second 1K transfer. The allocation routines do attempt to find the required
space on the disk immediately following the existing fragment so that no copying is necessary,
but, if they cannot do so, up to seven copies may be required before the fragment becomes a
block. Provisions have been made for programs to discover the block size for a file so that
transfers of that size can be made, to avoid fragment recopying.

1.6.2 Inodes

Associated with each file in UNIX is a little table (on disk) called an i-node. An inode is a
record that describes the attributes of a file, including the layout of its data on disk. Inodes
exist in a static form on disk, and the kernel reads them into main memory to manipulate
them. Disk inodes consist of the following fields:
- File owner identifier - File ownership is divided between an individual owner and a
  group owner, and defines the set of users who have access rights to a file. The
  superuser has access rights to all files in the system.
- File type - Files may be of type regular, directory, character or block special, or
  pipes.
- File access permissions - The system protects files according to three classes: the
  owner, the group owner of the file, and other users; each class has access rights
  to read, write and execute the file, which can be set individually. Although a directory
  is a file, it cannot be executed; execute permission for a directory gives the
  right to search the directory for a file name.
- File access times - The time the file was last modified and the time it was last
  accessed.
In addition, the inode contains 15 pointers to the disk blocks containing the data contents of
the file. The first 12 of these pointers (as shown in figure 3) point to direct blocks; that is,
they contain addresses of blocks that contain data of the file. Thus, the data for small files
(no more than 12 blocks) can be referenced immediately, because a copy of the inode is kept
in main memory while a file is open. If the block size is 4K, then up to 48K of data may be
accessed directly from the inode.

Figure 3 : Direct and indirect blocks of inode

The next three pointers in the inode point to indirect blocks. If the file is large enough to use
indirect blocks, the indirect blocks are each of the major block size; the fragment size applies
to only data blocks. The first indirect block pointer is the address of a single indirect block.
The single indirect block is an index block, containing not data, but rather the addresses of
blocks that do contain data. Then, there is a double-indirect-block pointer, the address of a
block that contains the addresses of blocks that contain pointers to the actual data blocks.
The last pointer would contain the address of a triple indirect block; however, there is no
need for it. The minimum block size for a file system in 4.2BSD is 4K, so files with as many
as 2^32 bytes will use only double, not triple, indirection. That is, as each block pointer takes 4
bytes, we have 49,152 (4K x 12) bytes accessible in direct blocks, 4,194,304 bytes
accessible by a single indirection, and 4,294,967,296 bytes reachable through double
indirection, for a total of 4,299,210,752 bytes, which is larger than 2^32 bytes. The number 2^32
is significant because the file offset in the file structure in main memory is kept in a 32-bit
word. Files therefore cannot be larger than 2^32 bytes. Since file pointers are signed integers
(for seeking backward and forward in a file), the actual maximum file size is 2^31 - 1 bytes.
Two gigabytes is large enough for most purposes.
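
The figures quoted above follow from the block size, the pointer size and the number of direct pointers. As a check, the short sketch below recomputes them; the function name inode_capacity is hypothetical, and the default values are the ones used in the text (4K blocks, 4-byte pointers, 12 direct pointers).

```python
def inode_capacity(block_size=4096, pointer_size=4, direct=12):
    """Return (direct, single-indirect, double-indirect, total) capacities in bytes."""
    pointers_per_block = block_size // pointer_size       # 1024 pointers fit in a 4K block
    direct_bytes = direct * block_size                    # reached straight from the inode
    single_bytes = pointers_per_block * block_size        # one level of index blocks
    double_bytes = pointers_per_block ** 2 * block_size   # two levels of index blocks
    return direct_bytes, single_bytes, double_bytes, direct_bytes + single_bytes + double_bytes


if __name__ == "__main__":
    d, s, dd, total = inode_capacity()
    print(d, s, dd, total)
    print(total > 2 ** 32)      # double indirection already covers a 32-bit offset
    print(2 ** 31 - 1)          # largest offset that fits in a signed 32-bit integer
```

Double indirection alone reaches exactly 2^32 bytes with these parameters, which is why the triple indirect pointer is never needed under a 32-bit file offset.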

1.6.3 Directory Structure


Before a file can be read, it must be opened. When a file is opened, the operating system uses
the path name supplied by the user to locate the disk blocks, so that it can read and write the
file later. Mapping path names onto i-nodes (or the equivalent) brings us to the subject of
how directory systems are organized. These vary from quite simple to reasonably
sophisticated.
Now let us consider some examples of systems with hierarchical directory trees. Figure 4
shows an MS-DOS directory entry. It is 32 bytes long and contains the file name and the first
block number, among other items. The first block number can be used as an index into the
FAT, to find the second block number, and so on. In this way all the blocks can be found for
a given file. Except for the root directory, which is of fixed size (112 entries for a 360K floppy
disk), MS-DOS directories are files and may contain an arbitrary number of entries.

Figure 4 : The MS-DOS directory entry

The directory structure used in UNIX is extremely simple, as shown in figure 5. Each entry
contains just a file name and its i-node number. All the information about the type, size,
times, ownership, and disk blocks is contained in the i-node (see figure 3). All directories in
UNIX are files, and may contain arbitrarily many of these entries.

Bytes        2                14
       i-node number       File name

Figure 5 : A UNIX directory entry

When a file is opened, the file system must take the file name supplied and locate its disk
blocks. Let us consider how the path name /usr/ast/mbox is looked up. We will use UNIX as
an example, but the algorithm is basically the same for all hierarchical directory systems.
First the file system locates the root directory. In UNIX its i-node is located at a fixed place
on the disk.
Then it looks up the first component of the path, usr, in the root directory to find the i-node
of the file /usr. From this i-node, the system locates the directory for /usr and looks up the
next component, ast, in it. When it has found the entry for ast, it has the i-node for the directory
/usr/ast. From this i-node it can find the directory itself and look up mbox. The i-node for
this file is then read into memory and kept there until the file is closed. The lookup process is
illustrated in figure 6.

Root directory                    Looking up usr yields i-node 6
I-node 6 is for /usr              I-node 6 says that /usr is in block 132
Block 132 is /usr directory       /usr/ast is i-node 26
I-node 26 is for /usr/ast         I-node 26 says that /usr/ast is in block 406
Block 406 is /usr/ast directory   /usr/ast/mbox is i-node 60

Figure 6 : The steps in looking up /usr/ast/mbox

Relative path names are looked up the same way as absolute ones, only starting from the
working directory instead of starting from the root directory. Every directory has entries for .
and .. which are put there when the directory is created. The entry . has the i-node number for
the current directory, and the entry .. has the i-node number for the parent directory. Thus
a lookup of a name beginning with .. simply finds the i-node number of the parent directory
in the working directory and searches that directory for the next component. No special
mechanism is needed to handle these names. As far as the directory system is concerned, they
are just ordinary ASCII strings.
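
The lookup of /usr/ast/mbox, and of relative names through . and .., can be traced with a small table-driven sketch. The i-node numbers 6, 26 and 60 and the block numbers 132 and 406 are the ones used in the example above; the root i-node number, the root block number, and the names ROOT_INODE, INODE_TO_BLOCK, BLOCKS and lookup are assumptions made for the illustration.

```python
ROOT_INODE = 1    # assumed value; the text only says the root i-node is at a fixed place

# Directory i-node -> number of the disk block holding that directory.
INODE_TO_BLOCK = {1: 1, 6: 132, 26: 406}      # the root's block number is assumed

# Disk block -> directory entries (file name -> i-node number).
BLOCKS = {
    1:   {".": 1, "..": 1, "usr": 6},         # root directory
    132: {".": 6, "..": 1, "ast": 26},        # /usr directory
    406: {".": 26, "..": 6, "mbox": 60},      # /usr/ast directory
}


def lookup(path, cwd_inode=ROOT_INODE):
    """Return the i-node number that path names, resolving one component at a time."""
    # Absolute names start at the root, relative names at the working directory.
    inode = ROOT_INODE if path.startswith("/") else cwd_inode
    for name in path.split("/"):
        if not name:
            continue                           # skip empty components such as a leading slash
        block = INODE_TO_BLOCK[inode]          # the i-node says where the directory is stored
        inode = BLOCKS[block][name]            # the directory entry gives the next i-node
    return inode


if __name__ == "__main__":
    print(lookup("/usr/ast/mbox"))
    print(lookup("../ast/mbox", cwd_inode=26))
```

The loop has no special case for . or ..; they are found in the directory like any other name, which is the point made in the text.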

1.7 SUMMARY
In this unit we discussed issues broadly related to CPU scheduling, memory management
schemes and file systems of the UNIX operating system. We did not go into the implementation
details of these schemes or of the system calls. Students are strongly advised to
refer to the book "The Design of the UNIX Operating System" by Maurice J. Bach for a detailed
discussion. The main points covered in this unit are:
UNIX provides a good programming environment. It supports features that allow
complex programs to be built from simpler programs.
The UNIX file is simply a sequence of bytes without any meaning imposed on it. Its
meaning depends on the programs that interpret it.
The kernel allocates the CPU to a process for a small fraction of time, preempts the
process that exceeds its time slice and feeds it back into one of several priority
queues. A substantial difference between UNIX and many other systems is the
ease with which multiple processes can be created and manipulated.
Memory management in UNIX is swapping supported by paging.
The UNIX file system supports two main objects: files and directories. Directories
are just files with a special format, so the representation of a file is the basic UNIX
concept.

1.8 MODEL ANSWERS

Check Your Progress

1. UNIX is written in a high-level language, C. This makes the system highly portable.
UNIX supports a good programming environment. It allows complex programs to
be built from smaller programs.
It uses a hierarchical file system that allows easy maintenance and efficient
implementation.
2. The UNIX system uses a hierarchical file system structure. Any file may be located by
tracing a path from the root directory. Each supported device is associated with one or
more special files. Input/output to special files is done in the same manner as with
ordinary disk files, but these requests cause activation of the associated devices.
