
Non-volatile Memory in the Storage Hierarchy: Opportunities and Challenges

Dhruva Chakrabarti
Hewlett-Packard Laboratories

2012 Storage Developer Conference. Hewlett-Packard Company. All Rights Reserved.

From Disks to Flash and Beyond

- Historical situation with non-volatile storage (hard disks)
  - Slow devices, slow hardware interfaces
  - Overheads in the software stack
- Flash memory is a huge leap
  - Orders of magnitude faster than hard disks
  - But still much slower than main memory
  - Capacity, price, and performance somewhere in the middle
  - Software being optimized
- NVRAM presents even bigger opportunities
  - Performs like DRAM
  - Access interface like DRAM


Approximate Device Characteristics [1, 2]

                     HDD         SSD (NAND Flash)   DRAM        NVRAM
Density (bit/cm^2)   10^11       10^10              10^9        > 10^10
Retention            > 10 yrs    10 yrs             64 ms       > 10 yrs
Endurance (cycles)   Unlimited   10^4 - 10^5        Unlimited   10^9
Read latency         3-5 ms      0.1 ms             < 10 ns     20-40 ns
Write latency        3-5 ms      100 us             < 10 ns     50-100 ns
Cost ($/GB)          0.1                            10          < 10

[1] International Technology Roadmap for Semiconductors (ITRS): Emerging Research Devices, 2011.
[2] Qureshi et al., Scalable High Performance Main Memory System using Phase-Change Memory Technology, ISCA 2009.

Storage Class Memory (SCM) [1]

- Benefits of DRAM and archival capabilities of HDD
- Two levels of SCM possible:
  - S-SCM, accessed through the I/O subsystem. Key factors:
    - Retention
    - Cost per bit
  - M-SCM, accessed through the memory subsystem. Key factors:
    - Access latency
    - Endurance

[1] International Technology Roadmap for Semiconductors (ITRS): Emerging Research Devices, 2011.


Access Interface Choices for SCM

- Block interface
  - Operating system overheads [3]
  - Legacy optimizations for disks
  - SSD-specific optimizations
- Byte-addressable CPU load/store model
  - Fast (occurs at CPU speed)
  - Fine-granularity persistence, potentially immediate
  - But more exposed to failure-related consistency issues

[3] Caulfield et al., Moneta: A High-performance Storage Array Architecture for Next-generation, Non-volatile Memories, MICRO 2010.


Architectural Model for NVRAM

- Both DRAM and NVRAM may co-exist
- Volatile buffers and caches are still present
- Updates may linger within volatile structures

[Figure: CPU with store buffer and caches sitting in front of both DRAM and NVRAM]


Failure Models

- Fail-stop (processes need to be crash-tolerant)
  - A large percentage of failures are indeed fail-stop [4]
- Byzantine (BFT techniques have been studied)
- Arbitrary state corruption (hardening techniques exist) [5]
- Memory vulnerability during system/application crashes
  - Memory protection can achieve a high degree of reliability [6]
- Requirement: stores to memory must be failure-atomic
- Invariant: data in volatile buffers and caches does not survive a failure

[4] Chandra et al., How Fail-Stop are Faulty Programs?, FTCS 1998.
[5] Correia et al., Practical Hardening of Crash-Tolerant Systems, USENIX ATC 2012.
[6] Chen et al., The Rio File Cache: Surviving Operating System Crashes, ASPLOS 1996.

NVRAM Opportunities and Challenges

[Figure: spectrum of usage models (OS components such as filesystems, SQL databases, HPC-style checkpointing, logging, persistent data structures, general programming), ordered by how much flexibility each requires]

Opportunities:
- Achieve durability practically for free
  - No interface translation
  - Low write latencies
- Reuse and share durable data

Challenges:
- How do we keep persistent data consistent?
- What's the programming complexity?


Visibility Ordering Requirements

[Figure: a thread performs Allocate, Initialize, Publish; the updates sit in the volatile cache, and a crash strikes before they all reach NVRAM]

- A crash may leave a pointer to uninitialized memory
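The hazard can be illustrated with a minimal C sketch. The names here (`root`, `publish`, the value 42) are hypothetical, and the comment about NVRAM placement is an assumption for illustration:

```c
#include <stddef.h>

typedef struct node { int value; struct node *next; } node;

node *root = NULL;  /* assume this pointer cell lives in NVRAM */

/* Allocate/Initialize/Publish in program order. Without any ordering
 * control, the store to `root` (Publish) may reach NVRAM before the
 * initializing stores do, so a crash in between can leave a durable
 * pointer to uninitialized memory. */
void publish(node *n) {
    n->value = 42;    /* Initialize */
    n->next  = NULL;
    root     = n;     /* Publish */
}
```

In normal (crash-free) execution the result is of course consistent; the problem only appears at a crash point between the two groups of stores.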


Potential Inconsistencies

Wild pointers
Pointers to uninitialized memory
Incomplete updates
Violation of data structure invariants
Persistent memory leaks


A Quick Detour

Do we have an analog in multithreading?

Initially: volatile int x = 10, *y = NULL; int r = 0;

Thread 1:        Thread 2:
  x = 15;          if (y)
  y = &x;            r = *y;

Can r == 10?

- Either add fences between the stores or use C++11 atomics
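The atomics fix can be sketched with C11 release/acquire atomics (the C counterpart of the C++11 atomics mentioned above). With this ordering, a reader that observes `y != NULL` is guaranteed to also observe `x == 15`, so `r == 10` becomes impossible:

```c
#include <stdatomic.h>
#include <stddef.h>

int x = 10;
_Atomic(int *) y = NULL;
int r = 0;

void writer(void) {
    x = 15;
    /* Release store: all prior stores (x = 15) become visible to any
     * thread that subsequently acquire-loads y and sees this value. */
    atomic_store_explicit(&y, &x, memory_order_release);
}

void reader(void) {
    /* Acquire load: if we see a non-NULL y, we also see x == 15. */
    int *p = atomic_load_explicit(&y, memory_order_acquire);
    if (p)
        r = *p;
}
```

Run concurrently, `r` ends up either 0 (reader ran before the publish was visible) or 15, never the stale 10.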


Back to NVRAM and Visibility Issues

How do we ensure that a store to x is visible in NVRAM before a store to y?

- Insert a cache-line flush to ensure visibility in NVRAM
  - Similar to a memory fence
  - Reminiscent of a disk cache flush

Without ordering:     With ordering:
  Allocate              Allocate
  Initialize            Initialize
  Publish               fence
                        flush
                        fence
                        Publish
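On x86 the fence/flush/fence sequence can be sketched with the clflush and mfence intrinsics (mfence rather than sfence, since clflush is only guaranteed to be ordered by mfence). The names `root` and `nvram_publish`, and the claim that `root` lives in NVRAM, are assumptions for illustration; this sketch also assumes an x86-64 build:

```c
#include <emmintrin.h>  /* _mm_clflush (SSE2), _mm_mfence (SSE2) */
#include <stddef.h>

typedef struct node { int value; struct node *next; } node;

node *root = NULL;  /* assume this pointer cell lives in NVRAM */

/* Publish an initialized node so that, at any crash point, NVRAM never
 * holds `root` pointing at uninitialized memory. */
void nvram_publish(node *n) {
    n->value = 42;          /* Initialize */
    n->next  = NULL;
    _mm_mfence();           /* order the initializing stores        */
    _mm_clflush(n);         /* flush the initialized line to memory */
    _mm_mfence();           /* flush completes before Publish       */
    root = n;               /* Publish */
    _mm_clflush(&root);     /* make the new root itself durable     */
    _mm_mfence();
}
```

Note that one clflush covers a single cache line; a node spanning multiple lines would need one flush per line.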


Flavors of Cache Line Flushes

- x86: clflush (flushes and invalidates a cache line)
- Power: dcbst (flushes a data cache block), dcbf (flushes and invalidates a data cache block)
- IA-64: fc (flushes and invalidates a cache line)
- ARM: separate operations for flush and flush & invalidate are provided
- UltraSPARC: block store (flushes and invalidates)


Issues with Cache Flushes

- Must honor the intended semantics
  - Volatile buffers in the memory hierarchy must be flushed
  - Invalidation should be separated from flush
- Processor instruction or a different API?
- Cost
- Granularity
- How to track what to flush
- Can we use existing binaries on NVRAM? Is recompilation sufficient?


Alternatives

- Use other caching modes, such as uncacheable and write-through
  - Cost
  - x86 memory model not well understood
  - Weaker store ordering


Other Requirements

- Higher-level abstractions
  - Interactions with other threads and processes
  - Memory-management issues
- Failure-atomicity enforcement
  - What's the right granularity?
  - Should we wrap all NVRAM stores within transactions?
- Store ordering does not prevent memory leaks
  - Garbage collection is a necessity, but is it always possible?
- Significant performance cost
  - But still significantly faster than persisting on block devices
- Creation, identification, and listing of persistent data
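Failure atomicity for updates is commonly enforced with write-ahead logging; a minimal single-entry undo-log sketch follows. All names are hypothetical, and the `flush` stub stands in for a real cache-line flush plus fence (e.g., clflush + mfence on x86):

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in for clflush + fence; on real NVRAM hardware this must push
 * the addressed bytes out of the volatile cache before returning. */
static void flush(const void *addr, size_t len) { (void)addr; (void)len; }

typedef struct {
    uint64_t *addr;   /* location being updated       */
    uint64_t  old;    /* value to restore on recovery */
    int       valid;  /* entry is complete and armed  */
} undo_entry;

undo_entry log_entry;  /* assume this lives in NVRAM */

/* Failure-atomic single-word store: the old value is durable in the
 * log before the in-place update, so recovery can always roll back. */
void failure_atomic_store(uint64_t *addr, uint64_t val) {
    log_entry.addr = addr;
    log_entry.old  = *addr;
    flush(&log_entry, sizeof log_entry);
    log_entry.valid = 1;                          /* arm the entry   */
    flush(&log_entry.valid, sizeof log_entry.valid);
    *addr = val;                                  /* in-place update */
    flush(addr, sizeof *addr);
    log_entry.valid = 0;                          /* retire entry    */
    flush(&log_entry.valid, sizeof log_entry.valid);
}

/* Recovery after a crash: if an armed entry exists, undo it. */
void recover(void) {
    if (log_entry.valid) {
        *log_entry.addr = log_entry.old;
        flush(log_entry.addr, sizeof *log_entry.addr);
        log_entry.valid = 0;
        flush(&log_entry.valid, sizeof log_entry.valid);
    }
}
```

Every word written is flushed at least twice (log plus data), which illustrates the performance cost noted above, yet each flush still costs far less than a block-device round trip.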


Conclusion

- Non-volatility is moving closer to the CPU
- Byte-addressability offers significant benefits
- But failures complicate matters:
  - What is the right API?
  - How much additional programmer effort is needed?
  - What are the costs of implementing the API?
