The recently introduced DDR3 SDRAM technology paves the way to higher data rates (from 800 Mbps to 1600 Mbps) and provides higher performance for many systems that depend on data, video, or packet processing.
Every architectural change for higher performance comes at a price, however, and one aspect is measured in additional man hours of system design time, simulation, and troubleshooting. DDR3 SDRAM is an evolutionary step from DDR2 and provides enhanced features to enable higher data rates. It also maintains enough backward compatibility with DDR2 to provide system designers with the benefit of not having to reinvent the wheel on all aspects of controller and interface design.
In the case of FPGA-based designs, some FPGA vendors have taken on the task of designing a complete controller and physical layer interface. This article outlines the major differences between DDR3 and DDR2 SDRAM architecture, the challenges that come with architectural changes for higher data rates, and also reviews them in the context of a Xilinx Virtex-5 FPGA reference design tested in hardware at 800 Mbps. The reference design is available free for downloading.
DDR3 evolution from DDR2 SDRAM
DDR3 SDRAMs are following the same path that previous generations of DDR SDRAMs have, and are thus slated to provide higher data rates than DDR2 SDRAMs. This new architecture comes with enhancements that help with the design of the physical layer, system, and controller interface. Architectural changes were required to enable external controllers to communicate at these higher rates and the DDR3 memory device to provide I/O throughput up to 1600 Mbps.
The performance limiter for DRAMs has been – and continues to be – the internal memory array. Increasing external data rates to keep up with performance demands is difficult to scale at the internal memory array level. The DDR2 architecture uses a 4n prefetch scheme that enables four words to be accessed in the internal memory array for every command. This was not sufficient to keep up with higher external data rates, however, and an 8n prefetch was introduced with DDR3.
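The prefetch arithmetic can be sketched as follows. This is a minimal illustration, not vendor data: the function name is hypothetical, and the 800 Mbps DDR2 figure is chosen only as a comparison point. Each array access supplies n bits per data pin, so the array must be accessed at the external data rate divided by the prefetch depth.

```python
def core_access_rate_mhz(data_rate_mbps: float, prefetch_n: int) -> float:
    """Array access rate (MHz) needed to sustain a per-pin data rate (Mbps)
    when each array access supplies prefetch_n bits per pin."""
    return data_rate_mbps / prefetch_n

ddr2_core = core_access_rate_mhz(800, 4)         # DDR2-style 4n prefetch
ddr3_with_4n = core_access_rate_mhz(1600, 4)     # hypothetical: 1600 Mbps kept at 4n
ddr3_core = core_access_rate_mhz(1600, 8)        # DDR3 8n prefetch

print(f"4n prefetch at  800 Mbps: array at {ddr2_core:.0f} MHz")
print(f"4n prefetch at 1600 Mbps: array at {ddr3_with_4n:.0f} MHz")
print(f"8n prefetch at 1600 Mbps: array at {ddr3_core:.0f} MHz")
```

Under these assumptions, doubling the prefetch depth lets the external rate double while the array access rate stays the same.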
The trade-off is that for applications that require shorter bursts of four, this longer prefetch mode is a mismatch. A special command will chop the burst of eight in half for those cases when a burst of four is required. The burst of four will have gaps on the data bus between bursts, since the internal memory array access is an eight-word access.
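To visualize the gaps, the sketch below draws the data bus for back-to-back bursts. It assumes, as a simplification, that every access occupies a full eight-beat slot whether or not the burst is chopped; the function names are illustrative only.

```python
def bus_timeline(num_bursts: int, beats_used: int, slot_beats: int = 8) -> str:
    """Draw the data bus: 'D' is a beat carrying data, '-' is an idle beat.
    Assumes each access occupies slot_beats beats on the bus."""
    return ("D" * beats_used + "-" * (slot_beats - beats_used)) * num_bursts

def utilization(timeline: str) -> float:
    """Fraction of beats that carry data."""
    return timeline.count("D") / len(timeline)

full = bus_timeline(2, 8)     # two bursts of eight
chopped = bus_timeline(2, 4)  # two bursts chopped to four

print(full, utilization(full))
print(chopped, utilization(chopped))
```

In this simplified model, an application that only ever needs bursts of four uses half of the available data-bus beats.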
Increasing the burst rate will increase the operating power, but the DDR3 architecture gives designers a break by lowering the core and I/O voltage from 1.8 V to 1.5 V. Additionally, the BGA package has been improved: additional power and ground balls better shield signals at higher data rates and reduce crosstalk.
The higher transition rates for both data and address lines can be a challenge. Improved connectivity options are necessary to give designers more options for matching the driver to PCB impedance and thus minimizing signal reflections. The ZQ pin option with an external resistor provides the control mechanism to calibrate the impedance of the DDR3 drivers. Reducing the reflections improves the data eye and consequently improves timing margins, especially in systems that have multi-load topologies and more predispositions to signal reflections.
Because calibration for improved timing is essential with higher data rates, the DDR3 architecture has introduced two new features: read calibration through a multi-purpose register (MPR) and write leveling. For read calibration during initialization, the MPR provides the means for calibrating the read operation at the controller side, with a data pattern sent out from this register rather than from the memory array. Because the pattern comes from the register, the controller can calibrate reads without first having to write known data into the memory array.
For applications requiring the use of DDR3 DIMMs, the JEDEC standard prescribes the fly-by topology. This differs from the star or T topology used for DDR2 DIMMs, in which the signals are routed to arrive at the same time at all of the memory components on the DIMM. In the fly-by topology, signals are routed to each component in serial-like fashion, so flight times to the memory devices are skewed. This improves signal integrity by reducing stubs but adds complexity to the controller. The controller function that compensates for the skew between signals to satisfy write timing is called "write leveling."
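The idea behind write leveling can be sketched with a small simulation. This is a simplified model under stated assumptions, not the reference design's implementation: each memory device is assumed to sample the clock at the rising edge of DQS and report the sampled level back, and the controller steps its DQS output delay until that feedback changes from 0 to 1. The flight times, the 50 ps tap size, and all function names are hypothetical.

```python
def ck_sampled_by_dqs(dqs_delay_ps, ck_flight_ps, dqs_flight_ps, period_ps):
    """Model of the device feedback: clock level seen at the DQS rising edge.
    The clock is modeled as high during the first half of each period."""
    phase = (dqs_flight_ps + dqs_delay_ps - ck_flight_ps) % period_ps
    return 1 if phase < period_ps / 2 else 0

def level_one_device(ck_flight_ps, dqs_flight_ps, period_ps, tap_ps, max_taps):
    """Step the DQS delay one tap at a time until the feedback goes 0 -> 1.
    Returns the tap count at the transition, or None if none is found."""
    prev = ck_sampled_by_dqs(0, ck_flight_ps, dqs_flight_ps, period_ps)
    for tap in range(1, max_taps + 1):
        cur = ck_sampled_by_dqs(tap * tap_ps, ck_flight_ps, dqs_flight_ps, period_ps)
        if prev == 0 and cur == 1:
            return tap
        prev = cur
    return None

PERIOD_PS = 2500                       # 400 MHz clock, i.e. 800 Mbps data
TAP_PS = 50                            # hypothetical delay-tap resolution
CK_FLIGHT_PS = [400, 600, 800, 1000]   # hypothetical fly-by clock arrival times
DQS_FLIGHT_PS = 300                    # hypothetical point-to-point DQS flight time

taps = [level_one_device(ck, DQS_FLIGHT_PS, PERIOD_PS, TAP_PS, 64)
        for ck in CK_FLIGHT_PS]
print(taps)
```

With these made-up flight times, devices farther along the fly-by chain end up with proportionally larger DQS delays, which is the per-device compensation the controller has to provide.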
Table 1 overviews the major changes introduced by the DDR3 architecture from previous-generation DDR2 SDRAMs.
 Table 1. Comparison between DDR2 and DDR3 SDRAMs.
For those designers who like to read industry standards, a 188-page description of the DDR3 JEDEC standard is located at www.jedec.org/DOWNLOAD/search/JESD79-3.pdf.
Memory interface and controller design challenges
Designing the memory interface and controller with an FPGA presents a number of challenges: a physical layer interface comprising read and write data capture, controller state machine, bank management, and board design issues related to signal integrity and layout. Let's look at these challenges and how the new DDR3 SDRAM architecture plays a role in making them more, or less, difficult.
Physical layer – read and write data capture
DDR3 SDRAMs were built to enable higher data rates than DDR2, but these rates come with a price. The data valid window (the period during which data can be reliably latched) for reads by the controller and for writes by the memory device is rapidly shrinking, with data rates moving to 800 Mbps and beyond.
The data valid window is shrinking faster than the data period due to timing uncertainties that do not scale well with increasing data rates. These uncertainties are caused by parameter changes such as clock jitter, device process, and system voltage changes. The capture mechanism for read data is particularly challenging because data (DQ) signals are transmitted by the memory device edge-aligned with the clocking strobe (DQS). It is up to the FPGA I/O and logic circuitry to properly center-align them for reliable capture.
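A quick calculation shows why the window shrinks faster than the bit period. The sketch assumes, purely for illustration, a fixed 600 ps total of timing uncertainty that does not scale with data rate; real budgets depend on the device, board, and clocking scheme.

```python
UNCERTAINTY_PS = 600.0  # hypothetical fixed budget: jitter, skew, process/voltage effects

def bit_period_ps(data_rate_mbps: float) -> float:
    """Duration of one data bit in picoseconds."""
    return 1e6 / data_rate_mbps

def valid_window_ps(data_rate_mbps: float) -> float:
    """Bit period minus the fixed uncertainty budget, floored at zero."""
    return max(0.0, bit_period_ps(data_rate_mbps) - UNCERTAINTY_PS)

for rate in (400, 800, 1600):
    period = bit_period_ps(rate)
    window = valid_window_ps(rate)
    print(f"{rate:>4} Mbps: period {period:.0f} ps, "
          f"window {window:.0f} ps ({100 * window / period:.0f}% of period)")
```

Under this assumption the bit period halves with each doubling of the data rate, but the usable window falls from about three quarters of the period to only a few percent of it.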
The DDR3 data and strobe signaling is very similar to the DDR2 interface for read cycles. The previous DDR2 design methodology can be leveraged if it involves calibration or "read leveling" to correctly align various DQS signals to their respective DQs during system initialization.
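One common way to approach such a calibration is to sweep a programmable delay, record at which settings a known pattern is read back correctly, and then park the delay in the middle of the passing range. The sketch below shows only the selection step; the pass/fail list is made up, and the function name is hypothetical rather than taken from any vendor design.

```python
def center_of_longest_pass(results):
    """Return the middle index of the longest run of True values in results,
    or None if no delay setting passed."""
    best_start, best_len = None, 0
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last tap.
    for i, ok in enumerate(list(results) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2

# Hypothetical sweep: taps 5 through 15 capture the pattern correctly.
sweep = [False] * 5 + [True] * 11 + [False] * 4
print(center_of_longest_pass(sweep))
```

Choosing the center of the passing range leaves roughly equal margin on both sides, which is the practical meaning of center-aligning the strobe in the data valid window.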
Write operations have become more complex for DDR3 DIMM implementations where the fly-by topology is mandated by the JEDEC standard. The burden is on the controller to generate the data strobe (DQS) and data (DQs) signals with the proper delays to compensate for different flight times to each memory device on the DIMM.
If the design involves point-to-point component implementations, the fly-by topology is not needed and signaling for writes is similar to the DDR2 architecture. In either case, the FPGA needs to have programmable I/O delays for DQ and DQS signals to meet the read and write timing requirements.
Controller state machine design
Capturing data for reads and properly sending data out for writes is only one aspect of the controller and interface design. The state machine that manages the physical layer interface and the user or back-end interface that communicates with the rest of the FPGA design can be elaborate and time-consuming to implement from scratch.
The DDR3 SDRAM architecture is different, but maintains a great deal of backward compatibility with previous-generation DDR2, ensuring that some of the DDR2 basic functions implemented in RTL code can be leveraged in the new architecture. One notable aspect is the internal bank structure access algorithm, which can be used identically for both DDR2 and DDR3. Other aspects have changed with the 8n prefetch scheme. Some modifications are required for changes in burst length from four to eight if the previous DDR2 design was only using a burst length of four.
System design issue – signal integrity
Transmission of data through parallel lines from chip to chip at high rates has always presented a challenge, both because of crosstalk and ground bounce from multiple signals switching at the same time and because of ringing caused by impedance mismatching. With higher data rates and lower voltages, these issues have become very serious. Fortunately, the package and driver design for both controller and memory device can help deal with these challenges.
The DDR3 package has additional power and ground pins for better immunity to crosstalk. Some FPGA vendors have also introduced improved packages with sufficient power and ground balls surrounding the signal I/Os, in addition to power and ground planes and decoupling capacitors inside the BGA package.
The impedance mismatching between chip drivers and the PCB has always caused signal reflections that disturb the valid eye with unwanted ringing. Even though the DDR3 architecture provides the means to control driver impedance (ODT and ZQ pin), IBIS simulations are a necessary design practice to verify FPGA and DDR3 driver calibration and the actual signal behavior with a particular board topology.
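As a first-order illustration of why driver calibration matters (and not a substitute for IBIS simulation), the reflection coefficient at a termination on a transmission line is (Z_term - Z0) / (Z_term + Z0). The sketch below evaluates it for a few illustrative driver impedances against an assumed 50-ohm trace; all values are hypothetical.

```python
def reflection_coefficient(z_term_ohm: float, z0_ohm: float) -> float:
    """Fraction of an incident wave reflected where a line of characteristic
    impedance z0_ohm meets a termination of z_term_ohm."""
    return (z_term_ohm - z0_ohm) / (z_term_ohm + z0_ohm)

Z0_OHM = 50.0  # assumed PCB trace impedance

for driver_ohm in (25.0, 34.0, 40.0, 50.0):  # illustrative driver impedances
    gamma = reflection_coefficient(driver_ohm, Z0_OHM)
    print(f"driver {driver_ohm:.0f} ohm on {Z0_OHM:.0f} ohm trace: gamma = {gamma:+.3f}")
```

The closer the calibrated driver impedance is to the trace impedance, the smaller the reflected fraction, and the less ringing intrudes on the data eye.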
Xilinx has alleviated many of these challenges by implementing in hardware a 32-bit-wide DDR3 interface and controller running at 800 Mbps, using the Virtex-5 FPGA on the ML561 evaluation board for memory interfaces.