
LECTURE 4: COMPUTER MEMORY SYSTEM

The complex subject of computer memory is made more manageable if we classify memory
systems according to their key characteristics. The most important of these are:

i) Location: Refers to whether memory is internal or external to the computer.


- Internal memory is often equated with main memory but there are other forms of
internal memory such as registers and caches.
- External memory consists of peripheral storage devices, such as disk and tape, that
are accessible to the processor via I/O controllers.

ii) Capacity: For internal and external memory, capacity is typically expressed in terms of
bytes. Three related concepts apply to internal memory:
- Unit of transfer
- Word size
- Number of words
For example, an internal memory of 1M words with a 32-bit (4-byte) word size has a capacity of 4 MB.
iii) Method of accessing units of data. These include:
- Sequential access: Memory is organized into units of data, called records. Access
must be made in a specific linear sequence, so the time to access an arbitrary record is
highly variable. Tape units are sequential access.
- Direct access: Individual blocks or records have a unique address based on physical
location. Access is accomplished by direct access to the general area of the desired
information, followed by some searching to reach the final location. Access time is
variable, but less so than for sequential access. Disk units are direct access.
- Random access: Each addressable location has a unique physical address. Access is
made directly to the desired location, and access time is constant and independent of
prior accesses. Main memory and some cache systems are random access.

- Associative: Data is located by a comparison with the contents of a portion of the store.
Access time is constant and independent of prior accesses. Cache memories may
employ associative access.

iv) Performance: Three performance parameters are used:


- Access time (latency): For random-access memory, this is the time it takes to perform a
read or write operation, that is, the time from the instant that an address is presented
to the memory to the instant that data have been stored or made available for use.
For non-random-access memory, access time is the time it takes to position the read-write
mechanism at the desired location.
- Memory cycle time: This concept is primarily applied to random-access memory
and consists of the access time plus any additional time required before a second
access can commence.
- Transfer rate: Rate at which data can be transferred into or out of a memory unit.
Generally measured in bits/second.
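For non-random-access memory, these parameters combine: the approximate time to read a block of N bits is Tn = Ta + N/R, where Ta is the average access (positioning) time and R the transfer rate. The C sketch below works one illustrative case (all figures are invented for the example):

    #include <stdio.h>

    int main(void)
    {
        double Ta = 8e-3;          /* average positioning time: 8 ms     */
        double R  = 1e9;           /* transfer rate: 1 Gbit/s            */
        double N  = 4096.0 * 8;    /* block size: 4 KB expressed in bits */
        double Tn = Ta + N / R;    /* Tn = Ta + N/R                      */
        printf("Tn = %.3f ms\n", Tn * 1e3);
        return 0;
    }

For small blocks, the positioning time Ta dominates, which is why such devices are accessed in large blocks.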
v) Physical types: A variety of physical types of memory have been employed. The most
common today are semiconductor memory (RAM), magnetic surface memory (disk,
tape) and optical memory (CD, DVD). Several physical characteristics of data storage
are important; memory can be:
- Volatile: Information is lost when power is switched off (RAM).
- Non-volatile: Information remains without deterioration until changed, no electrical
power is needed (disk).



- Non-erasable memory: Cannot be altered, except by destroying storage unit (ROM).
There are trade-offs between three key characteristics of memory: cost, capacity, and access time.
The following relationships hold:
- Faster access time = greater cost per bit
- Greater capacity = less cost per bit
- Greater capacity = slower access time
The designer would like to use memory technologies that provide for large-capacity memory,
both because the capacity is needed and because the cost per bit is low. However, to meet
performance requirements, the designer needs to use expensive, relatively lower-capacity
memories with short access times. The way out of this dilemma is not to rely on a single memory
component or technology, but to employ a memory hierarchy.
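A quick way to see why the hierarchy works: if a fraction H of accesses (the hit ratio) is found in the faster level, the average access time of a two-level memory is Tavg = H*T1 + (1 - H)*(T1 + T2). The C sketch below tabulates this for illustrative timings (the figures are invented):

    #include <stdio.h>

    int main(void)
    {
        double T1 = 1.0;    /* ns, faster level (e.g., cache)       */
        double T2 = 60.0;   /* ns, slower level (e.g., main memory) */
        for (double H = 0.80; H <= 1.0001; H += 0.05) {
            /* a miss touches both levels: T1 + T2 */
            double Tavg = H * T1 + (1.0 - H) * (T1 + T2);
            printf("H = %.2f  ->  Tavg = %5.2f ns\n", H, Tavg);
        }
        return 0;
    }

Even a 95% hit ratio keeps the average access time close to that of the fast level, which is exactly what the decreasing frequency of access down the hierarchy buys.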

The Memory Hierarchy

As one goes down the memory hierarchy, one finds:

 decreasing cost per bit,
 increasing capacity,
 increasing access time,
 decreasing frequency of access of the memory by the processor.



At the highest level (closest to the processor) are the processor registers. Next come one or more
levels (L1, L2, L3) of cache. Next comes main memory, which is usually made of dynamic
random-access memory (DRAM). All of these are considered internal to the computer system
(directly accessible by the processor).

The hierarchy continues with external memory (accessible by the processor via an I/O module),
with the next level typically being a fixed hard disk.

One or more levels below that consist of removable media such as ZIP cartridges, optical disks
and tape.

The success of the memory hierarchy scheme depends upon the principle of locality of reference:
during the course of execution of a program, memory references, both to instructions and to data,
tend to cluster.
- Temporal locality: if a location is referenced, it is likely to be referenced again in the near
future.
- Spatial (positional) locality: when a location is referenced, it is probably close to the last
location referenced.
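As a concrete illustration in C, the loop below exhibits both kinds of locality: sum is re-referenced on every iteration (temporal), and the array is walked sequentially, so neighbouring references fall in the same cache block (spatial):

    #include <stddef.h>

    long sum_array(const long *a, size_t n)
    {
        long sum = 0;            /* re-referenced each pass: temporal locality */
        for (size_t i = 0; i < n; i++)
            sum += a[i];         /* sequential walk: spatial locality          */
        return sum;
    }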

Cache Memory

- Cache memory is intended to give memory speed approaching that of the fastest memories
available, and at the same time provide a large memory size at the price of less expensive
types of semiconductor memories.
- There is a relatively large and slow main memory together with a smaller, faster cache
memory. The cache sits between the CPU and main memory.

- Cache contains a copy of portions of main memory.


- When the processor attempts to read a word of memory, a check is made to determine if the
word is in the cache. If so, the word is delivered to the processor. If not, a block of main
memory is read into the cache and then the word is delivered to the processor.
- Because of locality of reference, when a block of data is fetched into cache to satisfy a single
memory reference, it is likely that there will be future references to that memory location or
to other words in block.
- A computer can have more than one cache; modern systems use multiple levels of cache. (A
sketch of the basic read path is given below.)
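A minimal behavioural sketch of this read path, assuming a toy direct-mapped cache and a flat word-addressed main memory (all sizes and names are invented for illustration):

    #include <stdint.h>
    #include <string.h>

    #define LINE_WORDS 16                          /* words per block         */
    #define NUM_LINES  256                         /* lines in the cache      */

    static uint32_t memory[1u << 20];              /* toy main memory (words) */
    static struct {
        int      valid;
        uint32_t tag;
        uint32_t data[LINE_WORDS];
    } cache[NUM_LINES];

    uint32_t read_word(uint32_t addr)              /* addr is a word index    */
    {
        uint32_t offset = addr % LINE_WORDS;
        uint32_t line   = (addr / LINE_WORDS) % NUM_LINES;
        uint32_t tag    =  addr / LINE_WORDS / NUM_LINES;

        if (!cache[line].valid || cache[line].tag != tag) {
            /* Miss: read the whole containing block from main memory. */
            memcpy(cache[line].data, &memory[addr - offset],
                   LINE_WORDS * sizeof(uint32_t));
            cache[line].tag   = tag;
            cache[line].valid = 1;
        }
        return cache[line].data[offset];           /* hit: deliver the word   */
    }

Note that a miss fills the entire block, not just the requested word; this is where locality of reference pays off.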



Cache Memory Principles

- Cache connects to processor via data, control and address lines.


- Data and address lines are also attached to data and address buffers, which are attached to
the system bus, from which main memory is reached.
- When a cache hit occurs, the data and address buffers are disabled and communication takes
place only between the processor and the cache, without use of the system bus.
- When a cache miss occurs, the desired address is loaded onto the system bus and the data are
returned through the data buffer to both the cache and the processor.

Elements of Cache Design


There are a few basic design elements that serve to classify and differentiate cache architectures:

1. Cache address: Almost all processors support virtual memory. If virtual addresses are used, the
system designer may choose to place the cache between the processor and the memory
management unit (MMU) or between the MMU and main memory.
- A logical cache, also known as a virtual cache, stores data using virtual addresses. The processor
accesses the cache directly, without going through the MMU.
- A physical cache stores data using main memory physical addresses.

2. Cache size: We would like the size of the cache to be


- small enough so that overall average cost per bit is close to that of main memory alone
- large enough so that overall average access time is close to that of cache alone



- Large caches tend to be slightly slower than small ones, since they require more gates
for cache addressing.
- Available chip and board area also limits cache size.
- Performance is sensitive to nature of workload, so there is no single “optimum” size.

3. Mapping function: Because there are fewer cache lines than main memory blocks, an algorithm
is needed for mapping main memory blocks into cache lines. Further, a means is needed for
determining which main memory block currently occupies a cache line.
- The choice of the mapping function dictates how the cache is organized. Three techniques
can be used: direct mapping, associative mapping and set-associative mapping. (A sketch of
the direct-mapped address split is given below.)
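For direct mapping, each main memory block maps to exactly one cache line, and the address is interpreted as a tag, a line number and a byte offset. A C sketch of the split, assuming an invented 64 KB cache with 64-byte lines:

    #include <stdint.h>
    #include <stdio.h>

    #define OFFSET_BITS 6    /* 2^6  = 64-byte line       */
    #define LINE_BITS   10   /* 2^10 = 1024 lines (64 KB) */

    int main(void)
    {
        uint32_t addr   = 0x12345678;
        uint32_t offset =  addr                 & ((1u << OFFSET_BITS) - 1);
        uint32_t line   = (addr >> OFFSET_BITS) & ((1u << LINE_BITS) - 1);
        uint32_t tag    =  addr >> (OFFSET_BITS + LINE_BITS);
        printf("tag = 0x%x, line = %u, offset = %u\n", tag, line, offset);
        return 0;
    }

Associative mapping lets a block go into any line (the whole tag is compared everywhere at once); set-associative mapping is the compromise, mapping a block to any line within one set.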

4. Replacement algorithms: When all lines are occupied, bringing in a new block requires that an
existing line be overwritten. For direct mapping, there is only one possible line for any particular
block and no choice is possible. For the associative and set-associative techniques, a replacement
algorithm is required. To achieve high speed, the algorithm must be implemented in hardware.
The four most common algorithms are:

- Least-recently-used (LRU): The idea is to replace the block in the set that has been in the
cache longest with no reference to it. It is probably the most effective method (a
behavioural sketch is given below).
- First-in-first-out (FIFO): The idea is to replace the block in the set that has been in the cache
longest. The implementation uses a round-robin or circular buffer technique.
- Least-frequently-used (LFU): The idea is to replace the block in the set that has experienced
the fewest references. For implementation, associate a counter with each line and increment
it on each reference.
- Random: The idea is to replace a random block in the set. This is interesting because it
provides only slightly inferior performance to algorithms based on usage.
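A behavioural C sketch of LRU for one set of a 4-way set-associative cache (timestamps are a convenient software model; real hardware uses more compact encodings, and all names here are invented):

    #include <stdint.h>

    #define WAYS 4

    struct set {
        uint32_t tag[WAYS];
        int      valid[WAYS];
        uint64_t stamp[WAYS];             /* time of last reference */
    };

    static uint64_t now;                  /* global access counter  */

    /* Returns the way that hit, or the way chosen as the LRU victim. */
    int access_set(struct set *s, uint32_t tag)
    {
        int victim = 0;
        for (int w = 0; w < WAYS; w++) {
            if (s->valid[w] && s->tag[w] == tag) {
                s->stamp[w] = ++now;      /* hit: refresh its timestamp */
                return w;
            }
            if (!s->valid[w])
                victim = w;               /* prefer an empty way        */
            else if (s->valid[victim] && s->stamp[w] < s->stamp[victim])
                victim = w;               /* otherwise the oldest stamp */
        }
        s->tag[victim]   = tag;           /* miss: install the new block */
        s->valid[victim] = 1;
        s->stamp[victim] = ++now;
        return victim;
    }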

5. Write policy: If a block has been altered in the cache, it is necessary to write it back out to main
memory before replacing it with another block. There are two problems to contend with:
- More than one device may have access to main memory. I/O modules may be able to read/write
directly to memory. If a word has been altered only in the cache, then the corresponding memory
word is invalid, and vice versa.
- Multiple CPUs may be attached to the same bus, each with its own cache. Then, if a word is
altered in one cache, it could invalidate the corresponding word in the other caches.
There are some techniques to overcome these problems:
i) Write through
- All write operations are made to main memory as well as to cache, so main memory is
always valid.
- Other CPUs monitor traffic to main memory to update their caches when needed.
- This generates substantial memory traffic and may create a bottleneck.
ii) Write back
- Minimizes memory writes. Updates are done only in cache.
- When an update occurs, an UPDATE bit associated with the line is set, so that when a block is
replaced, it is written back to memory if and only if the UPDATE bit is set.
- Portions of main memory are invalid, so accesses by I/O modules must occur through
cache.



- Multiple caches can still become inconsistent, unless some cache coherency system is used. (A
behavioural contrast of the two write policies is sketched below.)
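A behavioural contrast of the two policies for a single cache line, with the dirty flag playing the role of the UPDATE bit (memory_write stands in for the actual bus operation and is assumed, not part of any real API):

    #include <stdint.h>

    struct line { uint32_t tag, data; int valid, dirty; };

    extern void memory_write(uint32_t addr, uint32_t v);  /* assumed bus op */

    /* Write through: every write goes to the cache AND to main memory,
       so main memory is always valid. */
    void write_through(struct line *l, uint32_t addr, uint32_t v)
    {
        l->data = v;
        memory_write(addr, v);
    }

    /* Write back: update only the cache and set the UPDATE (dirty) bit. */
    void write_back(struct line *l, uint32_t v)
    {
        l->data  = v;
        l->dirty = 1;
    }

    /* On replacement, the block reaches memory iff the bit is set. */
    void evict(struct line *l, uint32_t addr)
    {
        if (l->dirty)
            memory_write(addr, l->data);
        l->valid = l->dirty = 0;
    }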
6. Line size: As the block size increases, more useful data is brought into the cache.
- But larger blocks reduce the number of blocks that fit into the cache, and a small number of
blocks results in data being overwritten shortly after it is fetched.
- As a block becomes larger, each additional word is farther from the requested word and
therefore less likely to be needed in the near future.
- A block size of 8 to 64 bytes seems close to optimum.
7. Number of caches: When caches were originally introduced, the typical system had a single
cache.
- More recently, the use of multiple caches has become the norm.
- Two aspects of this design issue concern the number of levels of caches and the use of unified
versus split caches.
i) Multilevel Caches
- As logic density has increased, it has become possible to have a cache on the same chip as
the processor: the on-chip cache.
- Most contemporary designs include both on-chip and external caches. The simplest such
organization is known as a two-level cache, with the internal cache designated as level 1
(L1) and the external cache designated as level 2 (L2).
- With the increasing availability of on-chip area for cache, most contemporary
microprocessors have moved the L2 cache onto the processor chip and, more recently, have
added an on-chip level 3 (L3) cache as well.

ii) Unified versus split caches


- Unified cache: A single cache stores both data and instructions. It has a higher hit rate than a
split cache, because it automatically balances the load between instruction and data fetches,
and only one cache needs to be designed and implemented.

- Split cache: One cache is dedicated to instructions and one cache is dedicated to data.
These two caches both exist at the same level, typically as two L1 caches. The trend is toward
split caches, particularly for superscalar machines (e.g., the Pentium). The key advantage of the split
cache design is that it eliminates contention for the cache between the instruction
fetch/decode unit and the execution unit.

SEMICONDUCTOR MAIN MEMORY

Nowadays, the use of semiconductor chips for main memory is almost universal.

Organization
The basic element of a semiconductor memory is the memory cell.
Although a variety of electronic technologies are used, all semiconductor memory cells share
certain properties:
- They exhibit two stable (or semi-stable) states, which can represent binary 1 and 0.
- They are capable of being written into (at least once), to set the state.
- They are capable of being read to sense the state.
Most commonly, the cell has three functional terminals capable of carrying an electrical signal.
- Select terminal: Selects a memory cell for a read or write operation.
- Control terminal: Indicates read or write.
- Other terminal: For writing, it provides an electrical signal that sets state of cell to 1 or 0.
For reading, it is used for output of cell’s state.
Semiconductor memory types include RAM, ROM, PROM, EPROM, EEPROM and flash memory.

Characteristics of RAM
- It is possible both to read data from the memory and to write new data into the memory
easily and rapidly. Both the reading and writing are accomplished through the use of
electrical signals.
- RAM is volatile. A RAM must be provided with a constant power supply. If the power is
interrupted, then the data are lost. Thus, RAM can be used only as temporary storage. The
two traditional forms of RAM used in computers are DRAM and SRAM.

Dynamic RAM (DRAM)


- A dynamic RAM (DRAM) is made with cells that store data as charge on capacitors.
- The presence or absence of charge in a capacitor is interpreted as a binary 1 or 0.
- Because capacitors have a natural tendency to discharge, dynamic RAMs require periodic
charge refreshing to maintain data storage.
- The term dynamic refers to this tendency of the stored charge to leak away, even with
power continuously applied.



Fig: Typical DRAM structure
- The figure shows a typical DRAM structure for an individual cell that stores 1 bit.
- The address line is activated when the bit value from this cell is to be read or written.
- The transistor acts as a switch that is closed (allowing current to flow) if a voltage is applied
to the address line and open (no current flows) if no voltage is present on the address line.

- For the write operation, a voltage signal is applied to the bit line; a high voltage represents 1,
and a low voltage represents 0. A signal is then applied to the address line, allowing a charge
to be transferred to the capacitor.

- For the read operation, when the address line is selected, the transistor turns on and the
charge stored on the capacitor is fed out onto a bit line and to a sense amplifier. The sense
amplifier compares the capacitor voltage to a reference value and determines if the cell
contains a logic 1 or a logic 0. The readout from the cell discharges the capacitor, which
must be restored to complete the operation.

- Although the DRAM cell is used to store a single bit (0 or 1), it is essentially an analog
device. The capacitor can store any charge value within a range; a threshold value
determines whether the charge is interpreted as 1 or 0.

Static RAM (SRAM)


- In contrast, a static RAM (SRAM) is a digital device that uses the same logic elements used
in the processor.
- In a SRAM, binary values are stored using traditional flip-flop logic-gate configurations.
- A static RAM will hold its data as long as power is supplied to it.
- Unlike the DRAM, no refresh is needed to retain data.

SRAM vs. DRAM


- Both static and dynamic RAMs are volatile; that is, power must be continuously supplied
to the memory to preserve the bit values.
- A dynamic memory cell is simpler and smaller than a static memory cell. Thus, a DRAM is
denser (smaller cells mean more cells per unit area) and less expensive than a
corresponding SRAM.
- DRAM requires the supporting refresh circuitry. For larger memories, the fixed cost of the
refresh circuitry is more than compensated for by the smaller variable cost of DRAM cells.
Thus, DRAMs tend to be favored for large memory requirements.
- SRAMs are generally somewhat faster than DRAMs. Because of these relative
characteristics, SRAM is used for cache memory (both on and off chip), and DRAM is used
for main memory.
Read-Only Memory (ROM)
- Contains a permanent pattern of data which cannot be changed. It is non-volatile: no
power source is required to maintain the bit values in memory.
- While it is possible to read a ROM, it is not possible to write new data into it.
- The data is actually wired into the chip as part of the fabrication process. This presents two
problems:
 The data insertion step has a large fixed cost.
 There is no room for error. If one bit is wrong, the whole batch of ROMs must be
thrown out.

Types of Read-Only Memory (ROM) are:


- Programmable ROM (PROM)
- Erasable programmable ROM (EPROM)
- Electrically erasable programmable ROM (EEPROM)
- Flash memory
In fact, only PROM is truly read-only memory, whereas EPROM, EEPROM and flash memory are
read-mostly memories.

Programmable ROM (PROM)


When only a small number of ROMs with a particular memory content is needed, a less expensive
alternative is the programmable ROM (PROM).
- Like the ROM, the PROM is non-volatile and may be written into only once.
- The writing process is performed electrically and may be performed by a supplier or customer at
a time later than the original chip fabrication.
- Special equipment is required for the writing process.
- PROMs provide flexibility and convenience.

Erasable Programmable ROM (EPROM)


- EPROM is read and written electrically, as with PROM.
- However, before a write operation, all the storage cells must be erased to the same initial state by
exposure of the packaged chip to ultraviolet radiation.
- Erasure is performed by shining an intense ultraviolet light through a window that is designed
into the memory chip.
- This erasure process can be performed repeatedly; each erasure can take as much as 20 minutes.
- Thus, the EPROM can be altered multiple times and holds its data virtually indefinitely.
- EPROM is more expensive than PROM.

Electrically Erasable Programmable ROM (EEPROM)


- EEPROM can be written into at any time without erasing prior contents; only the byte or bytes
addressed are updated.
- The write operation takes considerably longer than the read operation, on the order of several
hundred microseconds per byte.
- The EEPROM combines the advantage of non-volatility with the flexibility of being updatable in
place, using ordinary bus control, address, and data lines.
- EEPROM is more expensive than EPROM and also less dense, supporting fewer bits per chip.



Flash Memory
- Flash memory gets its name because the microchip is organized so that a section of memory
cells is erased in a single action (a "flash").
- It was first introduced in the mid-1980s.
- Flash memory is intermediate between EPROM and EEPROM in both cost and functionality.
- Like EEPROM, flash memory uses an electrical erasing technology. An entire flash memory can
be erased in one or a few seconds, which is much faster than EPROM.
- In addition, it is possible to erase just blocks of memory rather than an entire chip. However,
flash memory does not provide byte-level erasure.
- It has the same density as EPROM.

Error Correction

A semiconductor memory is subject to errors, which can be categorized as hard failures and soft
errors.

A hard failure is a permanent physical defect such that the memory cell or cells affected cannot
reliably store data. Hard errors can be caused by manufacturing defects.

A soft error is a random, non-destructive event that alters the contents of one or more memory
cells without damaging the memory. Soft errors can be caused by power supply problems or by
alpha particles, which result from radioactive decay.

Both hard and soft errors are clearly undesirable and most modern main memory systems include
logic for both detecting and correcting errors.

Example:

When data are to be written into memory, a calculation, depicted as a function f, is performed on
the data to produce a code.

Both the code and the data are stored. Thus, if an M-bit word of data is to be stored and the code is
of length K bits, then the actual size of the stored word is M+K bits.



When the previously stored word is read out, the code is used to detect and possibly correct errors.

- A new set of K code bits is generated from the M data bits and compared with the fetched code bits.

Comparison yields one of three results:

- No errors are detected; the fetched data bits are sent out.

- An error is detected and it is possible to correct it; the data bits plus error-correction bits
are fed into a corrector, which produces a corrected set of M bits to be sent out.
- An error is detected, but it is not possible to correct it; this condition is reported.

Codes that operate in this fashion are referred to as error-correcting codes.
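A minimal sketch of this store/fetch check, using a single parity bit as the function f (one parity bit can detect a single-bit error but, unlike the Hamming code described next, cannot locate and therefore cannot correct it):

    #include <stdint.h>

    /* f: even parity over the 8 data bits. */
    static int parity(uint8_t m)
    {
        int p = 0;
        while (m) { p ^= m & 1; m >>= 1; }
        return p;
    }

    /* On write, store M = 8 data bits plus K = 1 code bit. */
    uint16_t store(uint8_t data)
    {
        return (uint16_t)(((uint16_t)data << 1) | (uint16_t)parity(data));
    }

    /* On read, recompute the code and compare: nonzero means an error
       was detected (but not where it is). */
    int error_detected(uint16_t word)
    {
        return parity((uint8_t)(word >> 1)) != (int)(word & 1);
    }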

Error correcting techniques

i) Hamming code

The simplest of error-correcting codes is the Hamming code devised by Richard Hamming at Bell
Laboratories.

The figure below uses Venn diagrams to illustrate the use of this code on 4-bit words (M=4).

- The 4 data bits (1110) are assigned to the inner compartments. The remaining
compartments are filled with what are called parity bits.

- Each parity bit is chosen so that the total number of 1s in its circle is even.

In order to determine the length of the code, the following inequality is used:

2^K - 1 ≥ M + K

where M: length of data

K: length of code (check bits)

Example: develop a code that can detect and correct single-bit errors in an 8-bit word.



- The length of the check bits (K) for a word of 8 data bits (M = 8) is calculated as:

if K = 3, then 2^3 - 1 = 7 < 8 + 3 = 11 (insufficient)

if K = 4, then 2^4 - 1 = 15 ≥ 8 + 4 = 12 (sufficient)

- So, 8 data bits require 4 check bits.
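A C sketch of this single-error-correcting code for M = 8, K = 4: the check bits live at the power-of-two positions 1, 2, 4 and 8 of a 12-bit codeword, following the usual Hamming position-numbering convention (the code is illustrative, not production-quality):

    #include <stdint.h>
    #include <stdio.h>

    /* Encode 8 data bits into a 12-bit codeword. Data bits occupy the
       positions that are not powers of two; check bits occupy 1, 2, 4, 8. */
    static uint16_t hamming_encode(uint8_t data)
    {
        uint16_t code = 0;
        int d = 0;
        for (int pos = 1; pos <= 12; pos++)
            if ((pos & (pos - 1)) != 0)            /* not a power of two */
                if ((data >> d++) & 1)
                    code |= (uint16_t)(1u << pos);
        /* Check bit p covers every position whose index has bit p set;
           it is chosen so that the parity of its group is even. */
        for (int p = 1; p <= 8; p <<= 1) {
            int parity = 0;
            for (int pos = 1; pos <= 12; pos++)
                if ((pos & p) && ((code >> pos) & 1))
                    parity ^= 1;
            if (parity)
                code |= (uint16_t)(1u << p);
        }
        return code;
    }

    /* Recompute the group parities: read as a number, the K parity results
       (the syndrome) are 0 for no error; otherwise they give the exact
       position of the single flipped bit, which is then inverted. */
    static uint16_t hamming_correct(uint16_t code)
    {
        int syndrome = 0;
        for (int p = 1; p <= 8; p <<= 1) {
            int parity = 0;
            for (int pos = 1; pos <= 12; pos++)
                if ((pos & p) && ((code >> pos) & 1))
                    parity ^= 1;
            if (parity)
                syndrome |= p;
        }
        if (syndrome)
            code ^= (uint16_t)(1u << syndrome);
        return code;
    }

    int main(void)
    {
        uint16_t cw  = hamming_encode(0xA7);
        uint16_t bad = cw ^ (uint16_t)(1u << 5);   /* flip position 5 */
        printf("single-bit error corrected: %s\n",
               hamming_correct(bad) == cw ? "yes" : "no");
        return 0;
    }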

Advanced DRAM Organization

In recent years, a number of enhancements to the basic DRAM architecture have been explored,
and some of these are now on the market.

The schemes that currently dominate the market are SDRAM, DDR-SDRAM and RDRAM.
CDRAM has also received considerable attention.

i) Synchronous DRAM (SDRAM)

Unlike the traditional DRAM, which is asynchronous, the SDRAM exchanges data with the
processor synchronized to an external clock signal and running at the full speed of the
processor/memory bus without imposing wait states. With synchronous access, the DRAM moves
data in and out under control of the system clock.

The processor or other master issues the instruction and address information, which is latched by
the DRAM. Meanwhile, the master can safely do other tasks while the SDRAM is processing the
request. The DRAM then responds after a set number of clock cycles.

ii) Double Data Rate SDRAM (DDR-SDRAM)

SDRAM is limited by the fact that it can only send data to the processor once per bus clock cycle.

There is now an enhanced version of SDRAM, known as double data rate SDRAM (DDR-SDRAM),
that overcomes this once-per-cycle limitation.

DDR-SDRAM can send data to the processor twice per clock cycle, once on the rising edge of the
clock pulse and once on the falling edge.
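For example, on a 100 MHz memory bus with a 64-bit (8-byte) data path, SDRAM peaks at
100 × 8 = 800 MB/s, whereas DDR-SDRAM, transferring on both clock edges, peaks at
2 × 100 × 8 = 1600 MB/s (figures illustrative of the first DDR generation).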

There have been two generations of improvement to the DDR technology.

DDR2 increases the data transfer rate by increasing the operational frequency of the RAM chip
and by increasing the prefetch buffer from 2 bits to 4 bits per chip. The prefetch buffer is a
memory cache located on the RAM chip. The buffer enables the RAM chip to pre-position bits to
be placed on the data bus as rapidly as possible.

DDR3, introduced in 2007, increased the prefetch buffer size to 8 bits.
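For example, with an 8-bit prefetch, a DRAM core cycling at 100 MHz can keep the external data
pins busy at 8 × 100 = 800 million transfers per second, which corresponds to DDR3-800.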

iii) Rambus DRAM (RDRAM)



RDRAM chips are packaged with all pins on one side, and are designed to plug into an RDRAM
bus (a special high-speed bus just for memory). Rambus memory sends data more frequently: it
reads data on both the rising and falling edges of the clock signal.

iv) Cache DRAM (CDRAM)

CDRAM integrates a small SRAM cache (16 Kb) onto a basic DRAM chip. The SRAM on the
CDRAM can be used in two ways:

- As a true cache.

- As a buffer to support serial access of a block of data.

