Nano-Memory Simulation
by Bradley Berg
December 14, 2005
ABSTRACT
The wires interconnecting a grid of sub-lithographic memory
cells are too small to be directly manipulated. Randomized
address decoders can be used to access individual memory
cells. Either differentiated wires are randomly positioned
or undifferentiated wires are randomly connected.
Randomized access requires a mapping memory to translate
ordered addresses into random addresses. A paged serial access
chip architecture is presented to minimize the size of the
memory map and to compensate for the resultant timing latencies.
Simulations were performed to determine plausible memory
configurations for differentiated Core-Shell wires and Random-
Particle decoders. With both approaches about 25% of the cells
were usable assuming a projected 10% failure rate for wires and
cells. About 25% of the cells were lost due to failures and
the remaining half were lost as a result of randomized access.
1. Introduction
Initial nano-memory offerings will compete against dominant incumbent
technologies, DRAM, SRAM, and flash, as well as several emerging technologies.
Random access nano-memories will have to outperform either DRAM (< 50ns) or
SRAM (< 8ns) at lower cost. If a near-term random access memory technology
(e.g. Ovonic phase change memory [1]) displaces DRAM, nano-memories will need
to be even faster, raising the performance bar. Non-volatile memory has less
stringent performance constraints, but has higher density requirements.
Flash memory can be read quickly (< 60ns), but writes can take milliseconds.
Some proposed nano-memories may be fast enough to compete for use as main
memory, but many will not. A claim has been made that the nano-memory being
developed by Nantero has an access time of 1/2 nanosecond [2]. Even if the
storage medium itself is fast the total chip level access time may be
substantially slower. Proposed nano-memory schemes need to cope with high
failure rates and randomized configuration. These issues are addressed with
mesa-scale (usually CMOS) address translation maps and error correction.
An additional factor that may degrade nano-memory performance is output
coupling delay. This occurs at the junction of a nano-scale read sensor and a
mesa-scale output line. It takes time for the small read sensor to transfer
enough charge to register on the large output line due to its higher
capacitance.
Considering these factors it is likely that nano-memories will be better
positioned to compete on their density advantage rather than on speed. File
systems are stored in paged memory and comprise the bulk of the storage bits in
a computing system. Currently, file systems are stored on low-cost disk with
access times in the tens of milliseconds. New high capacity memory chips can
store frequently accessed pages in fast solid-state devices to dramatically
improve file system access speed [3].
This paper considers chip level architectures for paged nano-memory devices.
It is applicable to nano-memory technologies whose performance is not
sufficient to compete with incumbent and emerging random access devices.
Plausible circuitry to access pages of nano-memory is devised using simulation.
The simulations take into account varying degrees of fabrication faults and
are used to compare configuration options.
2. Paged Memory Configuration
Solid-state paged memory can be used to improve file system performance and to
build portable storage devices. With the advent of low cost non-volatile
solid-state memory, paged memory can be combined with rotating disks to create
high performance file systems. There are four different uses for solid-state
paged memory in general purpose computer systems.
* Portable drives are already used widely in pen drives and music players.
Over time it may be desirable for personal data to migrate to pen drives [4].
* Non-volatile memory can be used to cache data for rotating disk drives.
Microsoft Windows Vista (a.k.a. Longhorn) buffers disk reads in main memory
while disk writes are buffered in non-volatile memory on a hybrid drive [5].
* Solid state disk can potentially sustain peak transfer rates for a given
transfer protocol with access times significantly faster than hard drives.
Transfer rates for SATA 1 are 150MB/second. The recently introduced SATA
2.5 specification has a peak rate of 300MB/second. The next generation
SATA 3 will double the rate again to 600MB/second [6].
* Even higher transfer rates can be achieved by storing local file systems in
non-volatile memory placed on the mainboard. Local storage can use full
speed DMA channels to transfer data in and out of main memory. This can
greatly improve the performance of local desktop and server systems and
vector data in and out of supercomputer systems.
DeHon [7, 8] proposed that nano-memory chips be hierarchically organized using
a set of crossbar grids of nano-wires. Within a grid nano-wires are
partitioned into bundles. Each bundle is individually addressable by
mesa-scale (typically CMOS) wiring. Throughout this project each grid contains
64 usable bundles per axis. The actual number will be greater due to flaws,
unaddressable wires, and parity bits.
Bundles are kept small enough that the bulk of address decoding can be done
with reliable mesa-scale circuits. A practical bundle size is near the ratio
of mesa-scale to nano-scale pitches. This is expected to be about 10 (e.g.
90nm:9nm). On this basis the page address size will be set at 3 bits and is
used to select a wire within a bundle using a decoder.
GRID ADDRESSING
+-------------------------------------------------------+
| X Bundle(6) | Y Bundle(6) | X Page(3) | Y Page(3) |
+-------------------------------------------------------+
-++++++++++++++-++++++++++++++-
-++++++++++++++-++++++++++++++-
+-----+ -++++++++++++++-++++++++++++++-
->| | -++++++++++++++-++++++++++++++-
Page(3) ->| Map | -++++++++++++++-++++++++++++++-
->| | |||||||||||||| ||||||||||||||
+-----+ |||||||||||||| ||||||||||||||
||||| |||||||||||||| ||||||||||||||
Decoder Inputs ||||| |||||||||||||| ||||||||||||||
||||| |||||||||||||| ||||||||||||||
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
Bundle -+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
||||| |||||||||||||| ||||||||||||||
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
Bundle -+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
-+++++--------++++++++++++++-++++++++++++++----
|||||||||||||| ||||||||||||||
The X and Y Bundle addresses are scanned and the X and Y Page addresses select
a particular 4096 bit page out of 64 pages within the grid. When a page is
accessed, bits are selected over the entire grid. Correspondingly, the heat at
cross points will be dissipated over a wide area, avoiding hot spots on the
chip. Memory cells this small are likely to be particularly susceptible to
thermal perturbations. Heat can more easily change their analog characteristics
or damage the device. Spreading out access over pages also increases endurance.
Over a long period of time no small group of bits will be repeatedly changed.
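As a concrete illustration, the following sketch splits an 18 bit grid address
into the four fields of the GRID ADDRESSING layout. The field widths come from
the layout above; the high-to-low packing order is an assumption made for
illustration.

    def split_grid_address(addr):
        # Split an 18 bit grid address into (X Bundle, Y Bundle,
        # X Page, Y Page). The packing order is assumed.
        assert 0 <= addr < 1 << 18
        y_page   =  addr        & 0x7    # 3 bits: wire within Y bundle
        x_page   = (addr >> 3)  & 0x7    # 3 bits: wire within X bundle
        y_bundle = (addr >> 6)  & 0x3F   # 6 bits: one of 64 Y bundles
        x_bundle = (addr >> 12) & 0x3F   # 6 bits: one of 64 X bundles
        return x_bundle, y_bundle, x_page, y_page

    # Page (5, 2) of bundle pair (10, 33):
    print(split_grid_address((10 << 12) | (33 << 6) | (5 << 3) | 2))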
3. Fault Model
This section details sources of faults within a grid. However, with respect to
page addressing the decoder logic simply needs to know whether or not a
particular page address can access a fully functional nano-wire. Faults are
categorized into always on, always off, and intermittent [9, 10]. Hard faults
are found using a discovery process. Each address in the grid is tested to see
if data can be successfully written. As invalid addresses are found the
corresponding map entries are marked.
The address map is scanned sequentially and invalid or unused addresses are
skipped. The validity of neighboring addresses has no effect on a mapped
address, so faults can be treated as independent events. Consequently a single
composite fault metric is used.
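The sketch below illustrates the discovery scan. The per-address test is
hypothetical; on real hardware it would write and read back every cell
reachable through the address, while here it simply fails with the single
composite fault probability.

    import random

    def discover(num_addresses, fault_rate=0.10):
        # Mark each address usable (True) or faulty (False). Each entry
        # is an independent draw, per the composite fault model.
        return [random.random() >= fault_rate
                for _ in range(num_addresses)]

    def scan(addr_map):
        # Scan the map sequentially, skipping invalid entries; the
        # validity of neighboring addresses is never consulted.
        for addr, ok in enumerate(addr_map):
            if ok:
                yield addr

    addr_map = discover(512)
    print(len(list(scan(addr_map))), "of 512 addresses usable")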
Simplifying the fault model down to a single parameter implies the fundamental
chip architecture does not need to be altered as the capacity or fault rate
changes. At most a small number of bundles need be added to or removed from
grids to accommodate design changes.
Intermittent faults occur while the chip is in use and cannot be detected at
the time of fabrication. These are managed using error correction and are
discussed in section 4. Any hard fault has a corresponding intermittent
counterpart so the same categorization of hard faults can also be used to
structure a detailed analysis of intermittent faults.
3.1 Crossbar faults
Crossbar faults occur in an individual memory cell at the point where two
nano-wires cross.
* Short: The wires can short causing failures along both intersecting wires.
Mark both wires as faulty.
* Open: Only the single cell will appear to be always 0 or 1.
Additional logic to remap single cells would increase the circuit size.
It is better to just mark both wires as faulty.
* Cell: The memory cell itself could fail and appear as always 0 or 1.
As before both cross wires are marked faulty.
3.2 Wire Faults
Due to their delicate nature nano-wires can easily break. Their close
proximity means that a small fabrication error can allow them to come in
contact with each other. Note that within a bundle duplicate wires may be
selected by the same address, operating as a single line. In some cases a
break in one of the wires may be masked by the other.
* Broken: Cells past the point of breakage will not be accessible.
Disable the address for the broken wire.
* Touch: Two or more touching wires will probably have different addresses.
Once a faulty wire is found the discovery process needs to account
for other interacting addresses. All interacting addresses must
be marked faulty.
3.3 Contact Faults
All wires in a bundle are activated and then their activation is selectively
blocked by enabled decoder input (control) lines. An input line blocks a wire
when it is activated and passes if deactivated. A faulty uncontrolled contact
never blocks and conversely a faulty controlled contact always blocks.
There are no hard contact faults possible for the Random-Particle decoder since
the contacts are randomly present or not. However, intermittent contact faults
might still occur.
* Uncontrolled: An uncontrolled contact never blocks its wire, even when its
                input line is activated. This causes a wire to be activated
                when it should not be. If an address activates more than one
                wire then the wires will interfere.
With a linear decoder the wire will always be selected so the
address is discarded. For a logarithmic decoder, if each
wire is accessible through unique addresses despite the
uncontrolled contact, it is still usable. In fact this
condition may be undetectable.
* Controlled: When a contact is always controlled the wire is not selected
when it should be. In this case the address will not have a
detectable wire and is indistinguishable from a wire missing
from the bundle.
4. Error Correction
Error correction is required to correct intermittent faults that occur after a
chip is fabricated. Reed-Solomon codes are effective at correcting errors in
paged access memories. The page is divided into a sequence of k-bit symbols
and additional parity symbols are appended to the page. The parity symbols
can be used to correct errors in up to a fixed number of symbols determined
by design parameters. Any number of bits within a symbol can be corrected.
Consequently, lengthy sequences of errors can be corrected.
Unlike mesa-scale memory grids, failures in a nano-scale grid are likely to
involve complete wires. Serially reading cells along a faulty nano-wire yields
a contiguous sequence of failed bits. Reading cells perpendicular to a faulty
nano-wire distributes the failures throughout the page. In this case many
symbols need correcting as each failed cell occurs in a different symbol. This
requires many correctable symbols and consequently many parity bits.
A more balanced error pattern can be achieved by scanning the grid linearly
halfway along both axes. The grid can be divided into quadrants and accessed
along a different axis in each quadrant as the following diagram illustrates.
This cuts the number of distributed faults in half.
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
----------------||||||||||||||||
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
||||||||||||||||----------------
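The scan order in the diagram can be sketched as follows. Two quadrants are
read along rows and two along columns, matching the diagram above, so any
single faulty wire has only half of its cells read out contiguously; the other
half fall in a quadrant scanned along the other axis and are spread across the
page.

    def quadrant_scan(n):
        # Yield (row, col) coordinates for an n by n grid (n even).
        h = n // 2
        for r in range(h):              # top-left: along rows
            for c in range(h):
                yield r, c
        for c in range(h, n):           # top-right: along columns
            for r in range(h):
                yield r, c
        for c in range(h):              # bottom-left: along columns
            for r in range(h, n):
                yield r, c
        for r in range(h, n):           # bottom-right: along rows
            for c in range(h, n):
                yield r, c

    print(list(quadrant_scan(4)))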
Dividing each 4096 bit page into three RS(255, 223) code words reduces the
number of parity bits needed even further. RS(255, 223) is a Reed-Solomon code
with 8 bit symbols of which up to 16 can be corrected per code word. Each
code word contains up to 223 data symbols and 32 parity symbols. Together the
three code words use (32 * 8 * 3) 768 parity bits per page. Alternatively,
using a single code word for each page requires a 10 bit symbol resulting in
(32 * 10 * 3) 960 parity bits per page.
As the page is scanned cells are transferred to alternating code words. Invalid
data due to a nano-wire failure is then evenly distributed over all three
code words. Consequently the number of failures per code word will not exceed
the upper bound of 16 corrections per code word.
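The round-robin interleave can be sketched as follows. The 609 symbols are the
(171 + 32) * 3 data and parity symbols of one page (see the capacity
arithmetic below); a run of bad symbols from one failed wire is dealt out
across all three code words.

    def interleave(symbols, n_codewords=3):
        # Deal consecutive 8 bit symbols round-robin into code words.
        words = [[] for _ in range(n_codewords)]
        for i, s in enumerate(symbols):
            words[i % n_codewords].append(s)
        return words

    # Suppose a failed wire corrupts symbols 30..69 of the page scan.
    # Each code word then sees only 13 or 14 bad symbols, inside the
    # 16 symbol correction limit of RS(255, 223).
    bad = set(range(30, 70))
    for w, syms in enumerate(interleave(range(609))):
        print("code word", w, "bad symbols:", sum(s in bad for s in syms))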
The data capacity of the three RS(255, 223) code words is (223 * 8 * 3) 5352
bits, which is more than is needed. Rounding up the 4096 bit page size to a
multiple of 3 symbols gives (171 * 8 * 3) 4104 data bits. The capacity of the
grid is then (4104 + 768) 4872 total bits per page, which can be stored in
(70 * 70) 4900 grid bits. A 70 by 70 bit grid size leaves enough room to store
an additional byte per code word that can be used as a checksum. The checksum
can detect when errors exceed the correctable limit.
Once a fault is corrected, the corresponding nano-wires are re-mapped to spare
bundles. This is done very simply by updating the bundle map and rewriting the
page. At some point there might be no more spare bundles available to be
remapped. Spare grids can also be added to the chip and the entire grid can be
rewritten to a spare grid.
5. Nano-Memory Simulation
Simulation is used to determine plausible chip configurations for nano-memories.
In particular Core-Shell and Random-Particle decoders are simulated with
varying defect rates, bundle sizes, and the number of decoder inputs.
5.1 Core-Shell Nano-Wires
Core-Shell decoders can double up the contacts to increase reliability. If
contact failures are independent the probability of a fault is squared (for
example, a 10% contact fault rate becomes 1%). For a given fabrication process
contact failures may not be totally independent, so the actual failure rate
will be somewhat higher than the square. Random-Particle contacts cannot be
doubled to increase reliability, as the contacts are random and cannot be
duplicated.
Different Core-Shell nano-wires are developed through a chemical design
process. It can be expected that chemistry bounds the number of different
shell types that can be made to a small number. Consequently the simulated
models use a small, but reasonable, number of wire types.
The input address map for a Core-Shell decoder can use one bit for each wire
type. When a wire type is addressable in a bundle the bit is set. To
determine if a particular wire type (address) is present the map bits are
shifted and the address is decremented each time a one is encountered. Another
counter counts the number of shifts. As the address reaches each 0 the shift
counter has the input line values.
The number of cycles needed for decoding each bundle can be up to the number of
wire types. The decoding process has to be faster than the memory access time
to achieve maximum access speed. For paged access it is assumed that the memory
is slower than DRAM, so there is plenty of time for the computation. A more
complex equivalent circuit could perform the same computation in a single cycle.
Example: 7 of 8 addresses usable in a bundle with map settings of:
A3: 0 1 2 3 4 5 6 7
Map: 1 0 1 1 0 1 0 0 1 0 1 1
A4: 0 1 2 3 4 5 6 7 8 9 10 11 Miss
A3 = 3 (in) A4 = 0 (out)
2 1
2 2
1 3
0 4
0 5 Hit
A3 = 7 (in) A4 = 0 (out)
6 1
6 2
5 3
4 4
4 5
3 6
3 7
3 8
2 9
2 10
1 11
0 Miss
To prevent the upper addresses from always being mapped to a miss, the
addresses need to be cycled. This can be done using the exclusive or of the
low bits of the bundle index and the input address. In the example above the
low 3 bits of the bundle index would be exclusive or'ed with A(in).
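A sketch of the decode and the address cycling, reproducing the worked example
above:

    MAP = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1]   # example map from above

    def decode(a3, bundle_map):
        # Return the input line value A4 for input address a3: the
        # position of the (a3 + 1)-th set map bit, or None on a miss.
        for a4, bit in enumerate(bundle_map):
            if bit:
                if a3 == 0:
                    return a4            # hit
                a3 -= 1
        return None                      # miss

    def cycled_decode(a3, bundle_index, bundle_map):
        # Cycle addresses with the low 3 bits of the bundle index so the
        # upper addresses are not always the ones mapped to a miss.
        return decode(a3 ^ (bundle_index & 0x7), bundle_map)

    print(decode(3, MAP))   # -> 5, matching the first trace
    print(decode(7, MAP))   # -> None (miss), matching the second trace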
5.1.2 Single Core-Shell [11, 12]
The decoder for the Single Core-Shell modeled here can use either a linear
encoding (8 control lines) or a dual-rail log encoder (6 control lines). A
preliminary simulation was run to determine reasonable bundle sizes and to
observe the effect of disabling individual input addresses in groups. Each
simulation was run to determine the number of bundles required along each axis
to produce a working 64 by 64 bundle grid. There are 8 wire types with no
faults and 1000 grids were generated. Simulations were not run for the blank
fields, as it became apparent after a few runs that 2 bit maps were
impractical.
map bits per bundle -> (bundles per axis, wires, map bits per axis)
bundle         2                 4                 8
   6                        205, 1230, 820    124,  744, 992
   7                        175, 1225, 700    103,  721, 824
   8     198, 1584, 396     118,  944, 472    *88,  704, 704
   9                        126, 1134, 504    *91,  819, 728
  10     160, 1600, 320     116, 1160, 464    *85,  850, 680
  12     145, 1740, 290     100, 1200, 400     80,  960, 640
  14     125, 1750, 250      88, 1232, 352     77, 1078, 616
Note that the more plausible configurations are marked with an asterisk in this
and subsequent tables.
Observations:
* Using fewer map bits than wire types requires many more wires.
This is because addressable wires are being discarded.
* Using fewer wires per bundle than wire types requires more wires
and more map bits.
* Using more wires per bundle than wire types uses more wires but
fewer map bits, with diminishing returns. 8 to 10 wires
per bundle seem reasonable.
Simulations incorporating different fault rates were run next. Ten 70 by 70
grids were generated in each of two runs. The average number of bundles
required was used to determine the number of wires and map bits needed for each
axis. The first table contains the raw results and the second multiplies the
number of bundles to yield the number of nano-wires and map bits required.
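A minimal sketch of the flavor of these runs follows. The usability rule is an
assumption made for illustration: a wire type in a bundle is usable if at
least one non-faulty wire of that type is present (duplicates act as one
line), and bundles are generated until an axis has (70 * 8) 560 usable wires.
The simulator's exact rules are richer, so the counts below will not match the
tables exactly.

    import random

    def bundles_needed(types=8, bundle=9, fault=0.10, goal=70 * 8):
        # Generate random bundles until enough usable wires are found.
        usable, bundles = 0, 0
        while usable < goal:
            wires = [random.randrange(types) for _ in range(bundle)]
            good = {t for t in wires if random.random() >= fault}
            usable += len(good)          # distinct usable wire types
            bundles += 1
        return bundles

    runs = [bundles_needed() for _ in range(10)]
    print(sum(runs) // len(runs), "bundles per axis on average")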
(run1 run2 | truncated average bundles per grid)
Fault 8 wire bundles 9 wire bundles 10 wire bundles
0% 103 99 | 101 98 98 | 98 92 92 | 92
1% 117 117 | 117 110 110 | 110 103 104 | 103
5% 125 125 | 125 121 120 | 120 114 112 | 113
10% 137 137 | 137 129 127 | 128 126 123 | 124
20% 162 166 | 164 158 157 | 157 151 150 | 150
30% 201 200 | 200 196 196 | 196 191 191 | 191
(average bundles, wires, map bits)
Fault 8 wire bundles 9 wire bundles 10 wire bundles
0% 101, 808, 808 * 98, 882, 784 92, 920, 736
1% 117, 936, 936 *110, 990, 880 103,1030, 824
5% 125,1000,1000 *120,1080, 960 113,1130, 904
10% 137,1096,1096 *128,1152,1024 124,1240, 992
20% 164,1312,1312 *157,1413,1256 150,1500,1200
30% 200,1600,1600 *196,1764,1568 191,1910,1528
Observations:
* Using nine wires per bundle provides a good trade-off between map size
and wire utilization.
* For fault rates of 10% and under wire utilization is dominated by duplicate
wire addresses and not faulty wires. Note that the ideal utilization is
(70 * 8) 560 wires. The 0% fault case shows the utilization due solely to
duplicate wires.
5.1.3 Double Core-Shell
Coating nano-wires with two shells reduces the number of different material
types and etching steps. Four material types can be used to fabricate up to 12
different wire types. The table below shows each etching step (columns) for
each of the possible 12 combination of wire coatings.
In 5 etchings there are 9 wire types.
k1 k2
k1 k3
k1 k4
k2 k3
k2 k4
k3 k4
k2 k1
k3 k1
k4 k1
For 11 wire types add a 6th etch:
k3 k2
k4 k2
For 12 wire types add a 7th etch:
k4 k3
As before simulations for ten 70 by 70 grids were run for 9, 11, and 12 wire
types. This shows the effect of using 5, 6, or 7 etches respectively.
9 wire types with 9 bit shift/count map
(run1 run2 | truncated average bundles per grid)
Fault 9 wire bundles 10 wire bundles 11 wire bundles
0% 104 102 | 103 98 96 | 97 92 93 | 92
1% 105 104 | 104 101 101 | 101 95 95 | 95
5% 112 113 | 112 106 106 | 106 102 103 | 102
10% 123 126 | 124 121 121 | 121 112 113 | 112
20% 146 147 | 146 143 141 | 142 139 139 | 139
30% 181 180 | 180 181 181 | 181 176 176 | 176
(average bundles, wires, map bits)
Fault 9 wire bundles 10 wire bundles 11 wire bundles
0% 103, 927, 927 * 97, 970, 873 92,1012, 828
1% 104, 936, 936 101,1010, 909 * 95,1045, 855
5% 112,1008,1008 106,1060, 954 *102,1122, 918
10% 124,1116,1116 121,1210,1089 *112,1232,1008
20% 146,1314,1314 *142,1420,1278 139,1529,1251
30% *180,1620,1620 181,1810,1629 176,1936,1584
11 wire types with 11 bit shift/count map
(run1 run2 | truncated average bundles per grid)
Fault 11 wire bundles 12 wire bundles 13 wire bundles
0% 86 86 | 86 83 81 | 82 78 79 | 78
1% 88 86 | 87 85 84 | 84 82 82 | 82
5% 93 93 | 93 90 90 | 90 86 88 | 87
10% 100 103 | 101 100 98 | 99 95 95 | 95
20% 124 123 | 123 119 119 | 119 117 117 | 117
30% 149 149 | 149 147 145 | 146 142 143 | 142
(average bundles, wires, map bits)
Fault 11 wire bundles 12 wire bundles 13 wire bundles
0% 86, 946, 946 82, 984, 902 78,1014, 858
1% 87, 957, 957 84,1008, 924 82,1066, 902
5% 93,1023,1023 90,1080, 990 87,1131, 957
10% 101,1111,1111 99,1188,1089 95,1235,1045
20% 123,1353,1353 119,1428,1309 117,1521,1287
30% 149,1639,1639 146,1752,1606 142,1846,1562
12 wire types with 12 bit shift/count map
(run1 run2 | truncated average bundles per grid)
Fault 12 wire bundles 13 wire bundles 14 wire bundles
0% 86 80 | 83 82 81 | 81 77 77 | 77
1% 81 81 | 81 79 79 | 79 77 77 | 77
5% 86 88 | 87 84 84 | 84 83 83 | 83
10% 95 95 | 95 93 90 | 91 88 89 | 88
20% 109 115 | 112 108 109 | 108 106 108 | 107
30% 138 135 | 136 133 133 | 133 132 132 | 132
(average bundles, wires, map bits)
Fault 12 wire bundles 13 wire bundles 14 wire bundles
0% 83, 996, 996 81,1053, 972 77,1078, 924
1% 81, 972, 972 79,1027, 948 77,1078, 924
5% 87,1044,1044 84,1092,1008 83,1162, 996
10% 95,1140,1140 91,1183,1092 88,1232,1056
20% 112,1344,1344 108,1404,1296 107,1498,1284
30% 136,1632,1632 133,1729,1596 132,1848,1584
Observations:
* Typical wire utilization and map size were close for 8 to 12 wire types.
It is probably not worth the additional cost to perform the sixth and
seventh etches.
5.2 Random Particle
Williams and Kuekes describe a scheme for building a decoder based on the
random deposition of gold particles [13]. Control lines with random controlled
and uncontrolled contacts are produced. The deposition process is tuned such
that there is an even distribution of controlled and uncontrolled contacts.
The number of input lines needs to be larger than in the decoder for Core-Shell
in order to uniquely address nano-wires. Consequently the dense mapping scheme
used for Core-Shell decoders cannot be used. Instead each 3 bit page address
needs to be mapped to the setting for each input line. For each bundle 8 input
lines use an (8 * 8) 64 bit map, 10 use (8 * 10) 80 bits, and a 12 line decoder
uses (8 * 12) 96 bits.
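The sketch below gives one illustrative reading of how a bundle's map can be
built. C(w) is the set of input lines with a controlled (blocking) contact on
wire w; activating every line outside C(w) selects w alone unless some other
wire v has C(v) contained in C(w). The contact model and selection test here
are assumptions, not the patent's exact procedure.

    import random

    def build_map(inputs=10, wires=12, pages=8, p_contact=0.5):
        # Random controlled-contact sets, one per wire in the bundle.
        contacts = [frozenset(l for l in range(inputs)
                              if random.random() < p_contact)
                    for _ in range(wires)]
        addressable = []
        for w, cw in enumerate(contacts):
            if all(not cv <= cw
                   for v, cv in enumerate(contacts) if v != w):
                # Store the input line settings: 'inputs' bits per page.
                addressable.append(tuple(int(l not in cw)
                                         for l in range(inputs)))
        if len(addressable) < pages:
            return None              # bundle unusable; generate another
        return addressable[:pages]   # pages * inputs map bits

    print(build_map() or "bundle discarded")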
A preliminary simulation was run to determine a reasonable number of input
lines and bundle size. The six most promising configurations were selected
for further analysis. The simulation results for the six chosen input and
bundle size combinations are shown in the second set of tables.
(average bundles, wires, map bits)
Inputs  bundle 8       bundle 10       bundle 12       bundle 14       bundle 16
   8    120,960,7680   106,1060,6784   *98,1176,6272    93,1116,5952    93,1488,5952
  10     95,760,7600   *81, 810,6480   *78, 936,6240   *77,1078,6160    74,1184,5920
  12     83,664,7968    75, 750,7200    70, 840,6720   *68, 952,6528   *67,1072,6432
(input lines, wires per bundle) -> (average bundles, wires, map bits)
Fault 8, 12 64 bit map
0% 106,1272, 6784
1% 112,1344, 7168
5% 117,1404, 7488
10% 124,1488, 7936
20% 138,1656, 8832
30% 158,1896,10112
Fault 10, 10 10, 12 10, 14 80 bit map
0% 89, 890, 7120 *83, 996,6640 82,1148,6560
1% 94, 940, 7520 *86,1032,6880 82,1148,6560
5% 98, 980, 7840 *88,1056,7040 86,1204,6880
10% 103,1030, 8240 93,1116,7440 87,1218,6960
20% 114,1140, 9120 103,1236,8240 96,1344,7680
30% 132,1320,10560 119,1428,9520 109,1526,8720
Fault 12, 14 12, 16 96 bit map
0% 74,1036,7104 74,1184,7104
1% 75,1050,7200 75,1200,7200
5% 75,1050,7200 75,1200,7200
10% *77,1078,7392 76,1216,7296
20% 83,1162,7968 *80,1280,7680
30% 91,1274,8736 *85,1360,8160
Observations:
* The number of nano-wires used is close to that for Core-Shell wires.
* The number of map bits required is about 6 to 7 times greater than for
Core-Shell wires.
6. Conclusions
Using a paged address scheme for nano-memories relaxes several design
parameters, lowering technical risk. This is particularly relevant for first
generation devices. Access is distributed over many bits, ensuring there are no
hot spots at the chip or memory cell level. Dispersed access also means longer
endurance (number of rewrites) over the life of the chip.
Support logic for paged access is simpler than the logic for random access.
This is particularly true of nano-memory, which has to cope with randomized
wire placement and high fault rates. Fault management can also result in
irregular timing, which is undesirable in random access memory. Buffers used
in paged access memory eliminate the irregularities. The overall access speeds
required for paged access are less stringent than for random access.
The following recommendations are based on the simulation runs:
Single Core-Shell
* Use 8 wire types in 9 wire bundles.
* Use either an 8 bit linear decoder or a 3 bit dual rail log decoder.
* Double up the decoder input lines to reduce contact faults.
Double Core-Shell
* With a fault rate of 10%, use 9 wire types in 11 wire bundles.
* Use a 4 bit dual rail log decoder.
* Double up the decoder input lines to reduce contact faults.
* Compress the input map using the shift/count mapping method.
Random-Particle
* Use a 10 to 12 bit log decoder.
* Use 12 to 16 wires per bundle.
* As the map size is the same for paged and random access,
Random-Particle wires are suitable for either access mode.
All 3 fabrication methods are limited by the small number of wire types leading
to many duplicate nano-wires. At a 10% fault rate duplicate wires dominate
wire utilization over faulty wires by about a factor of two. The major
limitation of the Core-Shell process is the cost of making additional wire
types. The limiting factor for Random-Particle is the large input map size.
With a typical 10:1 mesa-scale to nano-scale pitch ratio a nano-scale grid
packs about 100 times as many cells into a given area. With roughly 25% of
those cells usable, the potential density increase for nano-memories is 22 to
28 times that of conventional CMOS memory.
ACKNOWLEDGEMENT
Thanks to Eric Rachlin for his assistance with the discovery algorithm for the
Random-Particle decoder.
BIBLIOGRAPHY
[1] Stefan Lai (Intel) and Tyler Lowrey (Ovonyx). OUM - A 180 nm Nonvolatile
Memory Cell Element Technology For Stand Alone and Embedded Applications.
ftp://download.intel.com/technology/silicon/OUM_doc.pdf
[2] On the Tube. The Economist. May 8th 2003.
http://www.economist.com/science/displaystory.cfm?story_id=1763552
[3] Bradley A. Berg. New Computers Based on Non-Volatile Random Access
Memory. July 18, 2003.
http://www.techneon.com/paper/nvram.html
[4] Bradley A. Berg. Securing Personal Portable Storage. May 12, 2005.
http://www.techneon.com/paper/pen.html
[5] Jack Creasey (Microsoft). Hybrid Hard Drives with Non-Volatile Flash
and Longhorn.
http://download.microsoft.com/download/9/8/f/98f3fe47-dfc3-4e74-92a3-088782200fe7/TWST05002_WinHEC05.ppt#309,10,Technical Assumptions for Hybrid Disk
[6] Michael Alexenko (Maxtor). ATA for the Enterprise: The Present and
Future State of ATA. February 21, 2001.
http://www.sata-io.org/docs/srvrio0201b.pdf
[7] Andre DeHon. Array-Based Architecture for FET-Based, Nanoscale
Electronics. IEEE Transactions on Nanotechnology, VOL. 2, NO. 1,
March 2003 pp. 23-32.
[8] Andre DeHon, Patrick Lincoln, and John Savage. Stochastic Assembly of
Sublithographic Nanoscale Interfaces. IEEE Transactions on
Nanotechnology, vol. 2, no. 3, pp. 165-174, 2003.
[9] Myung-Hyun Lee, Young Kwan Kim, and Yoon-Hwa Choi. A Defect-Tolerant
Memory Architecture for Molecular Electronics. IEEE Transactions on
Nanotechnology, VOL. 3, NO. 1, March 2004.
[10] Philip J. Kuekes, Warren Robinett, Gadiel Seroussi and R. Stanley
Williams. Defect-tolerant Interconnect to Nanoelectronic Circuits:
Internally Redundant Demultiplexers Based on Error-correcting Codes.
Quantum Science Research, Hewlett-Packard Labs, 1501 Page Mill Road,
Palo Alto, CA.
[11] Lincoln J. Lauhon, Mark S. Gudiksen, Deli Wang, and Charles M. Lieber.
Epitaxial Core-shell and Core-multishell Nanowire Heterostructures.
Nature, Vol. 420, pp. 57-61 (2002).
[12] Dongmok Whang, Song Jin, and Charles M. Lieber. Nanolithography Using
Hierarchically Assembled Nanowire Masks. Nano Letters, Vol. 3, No. 7,
pp. 951-954 (2003).
[13] Stanley Williams and Philip Kuekes. Demultiplexer for a Molecular Wire
Crossbar Network. United States Patent Number: 6,256,767, July 3, 2001.
http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=6,256,767.WKU.&OS=PN/6,256,767&RS=PN/6,256,767