Media and Storage

Digitization
Session 7
AM Zeus-Brown
Learning outcomes
• What is digitisation
• Some simple methods of digitisation
• What's digital and what's analogue
Group Discussion
• In two groups, spend 5 minutes discussing the following:
– What is analogue media?
– What is digital media?
– The difference between digital and analogue
Areas of study
• Two main areas we will look at
– Image digitisation
– Audio digitisation
Perception
• The next few slides will be shown to you quickly; write down the first thing you see
Image 1
Images
Digitisation
Picture
• Taking an image from the real world to the digital
– You can take a photograph using a conventional film camera, process the film chemically, print it onto photographic paper and then use a digital scanner to sample the print (record the pattern of light as a series of pixel values).
– You can directly sample the original light that bounces off your subject, immediately breaking that light pattern down into a series of pixel values. In other words, you can use a digital camera.
CCD and CMOS
• Two types of capture
– A CCD transports the charge across the chip and reads it at one corner of the array. An analog-to-digital converter (ADC) then turns each pixel's value into a digital value by measuring the amount of charge at each photosite and converting that measurement to binary form.
– CMOS devices use several transistors at each pixel to amplify and move the charge using more traditional wires. The conversion to digital usually happens on the CMOS chip itself, so its output is already digital and it needs no separate ADC.
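The ADC step described above can be sketched as follows. This is a minimal illustration rather than a model of any real sensor: the function name `adc_convert`, the full-well capacity of 20,000 electrons and the 8-bit depth are all assumed values chosen for the example.

```python
def adc_convert(charge, full_well=20_000, bits=8):
    """Map a photosite charge (in electrons) to an integer code and its binary form.

    full_well and bits are illustrative assumptions, not values from a real sensor.
    """
    levels = 2 ** bits                       # number of distinct output codes
    charge = max(0, min(charge, full_well))  # clip to the measurable range
    code = min(levels - 1, int(charge / full_well * levels))
    return code, format(code, f"0{bits}b")


for electrons in (0, 5_000, 10_000, 20_000):
    code, binary = adc_convert(electrons)
    print(electrons, code, binary)
```

More bits give more distinct brightness levels per pixel, at the cost of more data to store for each photosite.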
Differences
• Pros and cons:
– CCD sensors create high-quality, low-noise images. CMOS sensors are generally more susceptible to noise.
– Because each pixel on a CMOS sensor has several transistors located next to it, the light sensitivity of a CMOS chip is lower: many of the photons hit the transistors instead of the photodiode.
– CMOS sensors traditionally consume little power.
– CCDs, on the other hand, use a process that consumes lots of power. CCDs consume as much as 100 times more power than an equivalent CMOS sensor.
– CCD sensors have been mass produced for a longer period of time, so they are more mature. They tend to have higher quality pixels, and more of them.
Colour
• Additive colour (starts black)
– Adding colours in the RGB range goes to white
• Where might you see this
– The world
– Adding light frequencies
• Subtractive colour (starts white)
– Adding colours in the CMY range (cyan, magenta, yellow) goes to black
• Where might you see this
– Printers
– Filtering out light frequencies
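The two mixing models can be sketched in a few lines. The example assumes 8-bit RGB triples, and the helper names `add_light` and `apply_filter` are hypothetical. Printer inks are modelled here as ideal cyan, magenta and yellow filters, which is an idealisation: real inks do not reach a true black, which is why printers carry a separate black ink.

```python
def add_light(a, b):
    """Additive mixing: combine two light sources channel by channel, capped at 255."""
    return tuple(min(255, x + y) for x, y in zip(a, b))


def apply_filter(light, ink):
    """Subtractive mixing: an ink passes only the fraction of each channel it reflects."""
    return tuple(x * y // 255 for x, y in zip(light, ink))


BLACK, WHITE = (0, 0, 0), (255, 255, 255)
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
CYAN, MAGENTA, YELLOW = (0, 255, 255), (255, 0, 255), (255, 255, 0)

screen = BLACK                      # additive colour starts black
for light in (RED, GREEN, BLUE):
    screen = add_light(screen, light)

paper = WHITE                       # subtractive colour starts white
for ink in (CYAN, MAGENTA, YELLOW):
    paper = apply_filter(paper, ink)

print(screen)  # (255, 255, 255): adding light moves towards white
print(paper)   # (0, 0, 0): adding inks moves towards black
```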
Expensive CCD
• Splitting the colours
– each colour is then mapped
– also see rotation filter
• This method tends to be used in larger, more expensive cameras
Cheaper CCD
• The Bayer filter
– overlay the filter over each individual photosite
Figure: a sample of the image zoomed in
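A rough sketch of what a Bayer filter does to the captured data: each photosite records only one colour. The RGGB layout used here is one common arrangement and is an assumption, as are the function names `bayer_channel` and `bayer_mosaic`.

```python
def bayer_channel(row, col):
    """Return which colour the filter passes at this photosite (RGGB layout assumed)."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def bayer_mosaic(image):
    """Keep one colour value per photosite from a grid of (r, g, b) tuples."""
    index = {"R": 0, "G": 1, "B": 2}
    return [
        [pixel[index[bayer_channel(r, c)]] for c, pixel in enumerate(line)]
        for r, line in enumerate(image)
    ]


# A 2x2 test scene in which every pixel is the same orange colour.
scene = [[(250, 120, 10)] * 2 for _ in range(2)]
print(bayer_mosaic(scene))  # [[250, 120], [120, 10]]
```

Half of the photosites in this layout are green. The two colours missing at each photosite are later estimated from neighbouring photosites, a step known as demosaicing.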
Scanners
• Scanners work in much the same way
– the light bar sends out light
– this is picked up by the sensor, and from then on it's the same as a camera
Discrete cosine transform
• The DCT is a linear, invertible function $F : \mathbb{R}^N \to \mathbb{R}^N$ (where $\mathbb{R}$ denotes the set of real numbers), or equivalently an invertible $N \times N$ square matrix.
• There are several variants of the DCT with slightly modified definitions.
• The $N$ real numbers $x_0, \ldots, x_{N-1}$ are transformed into the $N$ real numbers $X_0, \ldots, X_{N-1}$ according to one of the formulas:
DCT-I
• The DCT-I is exactly equivalent (up to an overall scale factor of 2) to a DFT of $2N - 2$ real numbers with even symmetry. For example, a DCT-I of $N = 5$ real numbers abcde is exactly equivalent to a DFT of eight real numbers abcdedcb (even symmetry), divided by two. (In contrast, DCT types II-IV involve a half-sample shift in the equivalent DFT.)
• Note, however, that the DCT-I is not defined for $N$ less than 2. (All other DCT types are defined for any positive $N$.)
• Thus, the DCT-I corresponds to the boundary conditions: $x_n$ is even around $n = 0$ and even around $n = N - 1$; similarly for $X_k$.
• There are many other versions of the DCT; if you wish to study these in your own time, please feel free to do so.
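The equivalence between the DCT-I and a mirrored DFT can be checked numerically. The slide's own formula did not survive, so this sketch assumes the common unnormalised definition $X_k = \tfrac{1}{2}\left(x_0 + (-1)^k x_{N-1}\right) + \sum_{n=1}^{N-2} x_n \cos\left[\frac{\pi n k}{N-1}\right]$, which matches the "divided by two" scaling described above. The function names `dct1` and `dft` and the sample values are illustrative.

```python
import cmath
import math


def dct1(x):
    """DCT-I using the unnormalised definition stated in the lead-in."""
    n_pts = len(x)
    out = []
    for k in range(n_pts):
        total = 0.5 * (x[0] + (-1) ** k * x[-1])
        for n in range(1, n_pts - 1):
            total += x[n] * math.cos(math.pi * n * k / (n_pts - 1))
        out.append(total)
    return out


def dft(x):
    """Plain O(N^2) discrete Fourier transform."""
    size = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * n * k / size) for n in range(size))
        for k in range(size)
    ]


abcde = [1.0, 2.0, 3.0, 4.0, 5.0]
mirrored = abcde + abcde[-2:0:-1]  # abcdedcb: eight values with even symmetry
half_dft = [value.real / 2 for value in dft(mirrored)[:5]]

print([round(v, 6) for v in dct1(abcde)])
print([round(v, 6) for v in half_dft])
```

The two printed lists should agree to within floating-point rounding, since the imaginary parts of the DFT cancel for an even-symmetric input.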
Feedback from lab
• What did you notice in the lab?
Other image files
• Digitally created images
– Working in layers
– Vector and raster
– Working out fill or no fill
– Simple algorithm
• Count the lines crossed to reach a point from outside the shape
• Fill if the count is odd
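The "count the lines, fill on odd" idea is usually called the even-odd rule. A minimal sketch, assuming the shape is a simple list of polygon vertices and the lines are counted along a horizontal ray running to the right of the point; the function name `inside_even_odd` is hypothetical.

```python
def inside_even_odd(point, polygon):
    """Return True if a ray from point to the right crosses an odd number of edges."""
    px, py = point
    crossings = 0
    for i in range(len(polygon)):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % len(polygon)]
        if (y1 > py) != (y2 > py):  # the edge spans the ray's height
            x_at_py = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at_py > px:        # the crossing lies to the right of the point
                crossings += 1
    return crossings % 2 == 1       # odd count means "fill"


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(inside_even_odd((2, 2), square))   # True: one crossing
print(inside_even_odd((5, 2), square))   # False: no crossings
print(inside_even_odd((-1, 2), square))  # False: two crossings
```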
• The human eye is good at seeing small differences
in brightness over a relatively large area, but not
so good at distinguishing the exact strength of a
high frequency brightness variation.
• This fact allows one to get away with greatly
reducing the amount of information in the high
frequency components.
• This is done by simply dividing each component in
the frequency domain by a constant for that
component, and then rounding to the nearest
integer.
• This is the main lossy operation in the whole
process.
• As a result of this, it is typically the case that many
of the higher frequency components are rounded
to zero, and many of the rest become small
positive or negative numbers, which take many
fewer bits to store.
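The divide-and-round step described above can be sketched as follows. The coefficient values and the per-component divisors are illustrative numbers chosen for the example, not data from the lecture, and the function names are hypothetical.

```python
def quantise(coefficients, divisors):
    """Divide each frequency component by its own constant and round to the nearest integer."""
    return [round(c / d) for c, d in zip(coefficients, divisors)]


def dequantise(quantised, divisors):
    """Approximate reconstruction: multiply back by the same constants."""
    return [q * d for q, d in zip(quantised, divisors)]


# Illustrative values only: low frequencies first, larger divisors for higher frequencies.
coefficients = [-415.4, -30.2, -61.2, 27.2, 56.1, -20.1, -2.4, 0.5]
divisors = [16, 11, 10, 16, 24, 40, 51, 61]

quantised = quantise(coefficients, divisors)
print(quantised)                        # [-26, -3, -6, 2, 2, -1, 0, 0]
print(dequantise(quantised, divisors))  # [-416, -33, -60, 32, 48, -40, 0, 0]
```

The reconstructed values differ from the originals, which is the information lost, and the highest-frequency components have become zero.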
Audio
Digitisation
Digitised sound
• CDs
– 44,100 samples/second * 16 bits/sample * 2 channels = 1,411,200 bits per second
• So what does that mean?
– Let's break that down:
• 1.4 million bits per second equals 176,400 bytes per second.
• If an average song is three minutes long, then the average song on a CD consumes about 32 million bytes of space.
• That's a lot of space for one song, and it's especially large when you consider that over a 56K modem it would take well over an hour to download that one song (about 76 minutes even at the modem's nominal 56,000 bits per second).
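The arithmetic above can be checked directly. The 56,000 bits per second figure is the modem's nominal rate and is an assumption here; real connections were slower, so actual download times would be longer than the value computed.

```python
SAMPLE_RATE = 44_100            # samples per second
BITS_PER_SAMPLE = 16
CHANNELS = 2
SONG_SECONDS = 3 * 60
MODEM_BITS_PER_SECOND = 56_000  # nominal 56K rate (assumed)

bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS
bytes_per_second = bits_per_second // 8
song_bytes = bytes_per_second * SONG_SECONDS
download_minutes = song_bytes * 8 / MODEM_BITS_PER_SECOND / 60

print(bits_per_second)             # 1411200
print(bytes_per_second)            # 176400
print(song_bytes)                  # 31752000
print(round(download_minutes, 1))  # 75.6
```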
Think back to compression
• To make a good compression algorithm for sound, a technique called perceptual noise shaping is used.
• It is "perceptual" partly because the MP3 format uses characteristics of the human ear to design the compression algorithm. For example:
– There are certain sounds that the human ear cannot hear.
– There are certain sounds that the human ear hears much better than others.
– If there are two sounds playing simultaneously, we hear the louder one but cannot hear the softer one.
Sound
• Sample rate and bit depth: what are they?
– Sample rate: how often you sample
– Bit depth (often loosely called "bitrate"): how detailed each sample is
– Sample rate of 9 per 1 sec
– Bit depth of 8 per sample
Digitised version
• How much detail will be lost in digitisation?
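One way to see how much detail is lost is to digitise a known signal and compare. This sketch assumes a 1 Hz sine wave, 9 samples per second, and that "8 per sample" means eight possible levels; the function names `digitise` and `reconstruct` are hypothetical.

```python
import math


def digitise(signal, sample_rate, levels, duration=1.0):
    """Sample signal(t) (range -1..1) and round each sample to one of `levels` steps."""
    samples = []
    for i in range(int(sample_rate * duration)):
        value = signal(i / sample_rate)
        step = round((value + 1) / 2 * (levels - 1))  # nearest level index, 0..levels-1
        samples.append(step)
    return samples


def reconstruct(steps, levels):
    """Map level indices back to amplitudes in the range -1..1."""
    return [step / (levels - 1) * 2 - 1 for step in steps]


def wave(t):
    return math.sin(2 * math.pi * t)  # a 1 Hz tone


steps = digitise(wave, sample_rate=9, levels=8)
errors = [abs(wave(i / 9) - r) for i, r in enumerate(reconstruct(steps, 8))]
print(steps)                  # the nine stored level indices
print(round(max(errors), 3))  # largest amplitude error caused by rounding
```

Raising the sample rate captures faster changes in the signal, while raising the number of levels shrinks the rounding error on each sample.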

Perceptually based compression
• The most effective lossy compression for audio data is to:
– Identify data that doesn't matter, in the sense that removing it leaves the perceived audio no different from the original
– Thus discard sounds that the human ear can't hear
Reasons you might not hear
• There are two particular reasons why the human ear may fail to hear a sound:
– A sound may be too quiet to hear
– A sound may be masked by another sound
• Of course, neither of the reasons is as simple as it may first seem
– The threshold of hearing is the minimum audible level, but it varies non-linearly with frequency
• A very high or very low frequency sound must be much louder than a mid-range tone to be heard
– Why do we hear better at mid-range?
• There is no point storing audio that falls below this threshold
– To do this we apply the psychoacoustic model, a mathematical description of the way the ear and brain react to sound
Threshold of hearing
• Sound level measurements in decibels are generally referenced to a standard threshold of hearing at 1000 Hz for the human ear, which can be stated in terms of sound intensity: $I_0 = 10^{-12}$ W/m$^2$
• or in terms of sound pressure: $P_0 = 2 \times 10^{-5}$ N/m$^2$
• This value has wide acceptance as a nominal standard threshold and corresponds to 0 decibels.
• It represents a pressure change of less than one billionth of standard atmospheric pressure.
• This is indicative of the incredible sensitivity of human hearing.
• The actual average threshold of hearing at 1000 Hz is more like $2.5 \times 10^{-12}$ W/m$^2$, or about 4 decibels, but zero decibels is a convenient reference.
• The threshold of hearing varies with frequency, as illustrated by the measured hearing curves.
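The decibel figures above can be reproduced with a short calculation. The sketch assumes the conventional reference values of $10^{-12}$ W/m$^2$ for intensity and $2 \times 10^{-5}$ Pa for pressure, and 101,325 Pa for standard atmospheric pressure; the function names are hypothetical.

```python
import math

I0 = 1e-12              # W/m^2, conventional reference intensity (0 dB), assumed
P0 = 2e-5               # Pa, conventional reference sound pressure (0 dB), assumed
ATMOSPHERE = 101_325.0  # Pa, standard atmospheric pressure


def intensity_to_db(intensity):
    """Sound intensity level in decibels relative to I0."""
    return 10 * math.log10(intensity / I0)


def pressure_to_db(pressure):
    """Sound pressure level in decibels relative to P0."""
    return 20 * math.log10(pressure / P0)


print(round(intensity_to_db(2.5e-12), 2))  # 3.98, i.e. "about 4 decibels"
print(P0 / ATMOSPHERE < 1e-9)              # True: under one billionth of an atmosphere
```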
Threshold
Figure: Threshold of hearing curve in a quiet environment [Zwicker and Fastl, 1999].
Simultaneous masking
• Threshold of detection curve in the presence of a masking noise with a bandwidth equal to the critical bandwidth, a centre frequency of 1 kHz and a level of 60 dB SPL [Zwicker and Fastl, 1999].
• The threshold of detection caused by a masking noise with a bandwidth equal to the critical bandwidth, a centre frequency of 1 kHz and various levels [Zwicker and Fastl, 1999].
MP3
Feedback from lab
• What did you notice from the lab?
• Audio task
1. Chapman, N. and Chapman, J. (2002). Digital Multimedia. Wiley.
2. Zwicker, E. and Fastl, H. (1999). Psychoacoustics: Facts and Models. Springer Series in Information Sciences, 22. Springer, Berlin; New York, 2nd updated edition.
3. http://www.cs.sfu.ca/CC/365/li/interactive-jpeg/Ijpeg.html
