Author: Li Yu, Application Engineer, Tektronix
Since its introduction, the PCIe interface has become the most important interface on PCs and servers. To increase data throughput, PCI-SIG has continually updated the standard: from 8 GT/s in PCIe 3.0 to 16 GT/s in PCIe 4.0, and then to 32 GT/s in PCIe 5.0. PCI-SIG has achieved this doubling of the rate while still allowing ordinary FR4 boards and inexpensive connectors, mainly thanks to two improvements. One is the use of 128b/130b encoding instead of 8b/10b encoding, which greatly improves coding efficiency; the other is the use of dynamic equalization in place of the previous generation's static equalization.
Here we focus on the dynamic equalization technology in PCIe 3.0 and 4.0, and introduce its principle, implementation, and related conformance tests. The specification calls this dynamic equalization technology "Link Equalization" (LEQ for short). This series is divided into two parts: this theoretical part introduces the working principle of PCIe 3.0/4.0 link equalization, and the next, practical part focuses on testing and debugging link equalization.
In addition, Tektronix PCI Express expert David Bouse will host a live session, "PCI Express 5.0 Specification Update Interpretation and Test Reveal," on Friday, April 10, 13:00-16:00, to explain how to solve the new test challenges of PCIe 5.0: https://info.tek.com/cn-pcie-mofu.html
Link Equalization for PCIe 3.0 & 4.0
The link equalization technology in PCIe 3.0 and 4.0 is much more complex than that of the previous generation. It can be discussed from two aspects:
• Equalization characteristics: compared with the previous generation, equalization in 3.0 and 4.0 places higher demands on hardware performance.
• Protocol: dynamically adjusting the equalization settings requires the cooperation of the protocol layer, which is realized through the Recovery.Equalization sub-state of the LTSSM state machine in the PHY layer.
Let's first look at PCIe 3.0 and 4.0 from the perspective of equalization characteristics. The equalization techniques used in PCIe 3.0/4.0 are as follows. On the Tx side there is an FFE (Feed Forward Equalizer); on the Rx side there are a CTLE (Continuous Time Linear Equalizer) and a DFE (Decision Feedback Equalizer). The FFE and CTLE remove most of the jitter introduced by ISI (inter-symbol interference); the DFE removes further ISI and can also remove some of the reflections caused by impedance mismatches. Together, these equalizers open the eye diagram at the receiver's decision input as far as possible.
In addition to supporting the equalization features above, the protocol layer (LTSSM) stipulates that the equalization settings on the link be adjusted dynamically by means of the protocol. This whole process is called Link Equalization (LEQ). During link equalization:
• The local end sends data using an initial Tx EQ setting;
• When the peer end receives the data, it judges whether the Tx EQ is appropriate based on the bit error rate or signal quality;
• If it is not appropriate, the peer end requests a new Tx EQ value from the local end through the protocol;
• After receiving the requested value, the local end changes its Tx EQ accordingly.
This dynamic process drives the Tx EQ on the link toward its optimal value. Meanwhile, the local end and the peer end also adjust their own Rx EQ. By dynamically adjusting both Tx EQ and Rx EQ, the link can adapt flexibly to different channel conditions.
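The request-response loop described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the Tx EQ is reduced to a single integer, the class names `Transmitter` and `Receiver` are hypothetical, and the receiver's quality metric is a stand-in for a real bit-error-rate or eye-quality measurement. It is not how any particular device implements LEQ.

```python
from dataclasses import dataclass


@dataclass
class Transmitter:
    """Local end: holds the Tx EQ setting it is currently using."""
    tx_eq: int

    def apply_request(self, requested: int) -> None:
        # The local end changes its Tx EQ to the value the peer asked for.
        self.tx_eq = requested


class Receiver:
    """Peer end: judges the incoming Tx EQ and requests a better one."""

    def __init__(self, best_setting: int):
        # Hypothetical stand-in for the channel: the setting that would
        # give the lowest error rate on this link.
        self.best_setting = best_setting

    def error_metric(self, tx_eq: int) -> int:
        # Toy quality measure: distance from the ideal setting.
        return abs(tx_eq - self.best_setting)

    def evaluate(self, tx_eq: int):
        """Return None if tx_eq is acceptable, else the value to request."""
        if self.error_metric(tx_eq) == 0:
            return None
        step = 1 if tx_eq < self.best_setting else -1
        return tx_eq + step


def run_leq(tx: Transmitter, rx: Receiver, max_rounds: int = 16):
    """Iterate request/response until the peer is satisfied or rounds run out."""
    history = [tx.tx_eq]
    for _ in range(max_rounds):
        request = rx.evaluate(tx.tx_eq)
        if request is None:
            break  # peer accepts the current Tx EQ
        tx.apply_request(request)
        history.append(tx.tx_eq)
    return history
```

The `max_rounds` bound reflects the fact that the negotiation cannot continue indefinitely; the value 16 is arbitrary.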
Figure 1 Block diagram of LEQ hardware implementation
Equalization at the transmitter: FFE
PCIe 3.0 and 4.0 both use a 3-tap FFE, as shown in Figure 2a. In the model, the input is a digital signal that takes the value ±1, each tap is weighted by an FFE tap coefficient, and the output is the analog signal at the transmitter output.
(a) Model block diagram of FFE (b) Analog voltage output of FFE
Figure 2 3-tap FFE used by PCIe 3.0 & 4.0 transmitter
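Because each of the three taps sees an input of ±1, the ideal differential output can take 2^3 = 8 values, which correspond to 2^3 ÷ 2 = 4 distinct amplitudes once polarity is ignored. The PCIe standard labels these four voltage amplitudes Va, Vb, Vc, and Vd (as shown in Figure 2b).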
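The count of four amplitudes can be checked with a short Python sketch that enumerates all eight input patterns of a 3-tap FFE. The names `c_pre`, `c_main`, and `c_post` and the example coefficient values are assumptions chosen for illustration, as is the convention that the pre-cursor and post-cursor taps are negative; they are not taken from the figures.

```python
from itertools import product


def ffe_output(c_pre, c_main, c_post, nxt, cur, prev):
    """3-tap FFE: weighted sum of the next, current, and previous bits (each ±1)."""
    return c_pre * nxt + c_main * cur + c_post * prev


def amplitude_levels(c_pre, c_main, c_post):
    """Enumerate all 2**3 bit patterns and collect the distinct output amplitudes."""
    levels = set()
    for nxt, cur, prev in product((-1, 1), repeat=3):
        v = ffe_output(c_pre, c_main, c_post, nxt, cur, prev)
        # Round to hide floating-point noise; abs() merges the two polarities.
        levels.add(round(abs(v), 9))
    return sorted(levels)


# Illustrative coefficients (assumed): magnitudes sum to 1.
levels = amplitude_levels(-0.1, 0.7, -0.2)
print(levels)  # four distinct amplitudes for eight input patterns
```

With these example coefficients the four amplitudes are 0.4, 0.6, 0.8, and 1.0 of the maximum swing; the smallest occurs inside a long run of identical bits and the largest on an isolated bit.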
Among them, Vb is called the de-emphasis voltage, Vc the preshoot voltage, and Vd the maximum amplitude (boost) voltage; the PCIe standard gives Va no special name. On this basis, the standard uses three ratios to fully describe the behavior of the FFE: de-emphasis = 20·log10(Vb/Va), preshoot = 20·log10(Vc/Vb), and boost = 20·log10(Vd/Vb).
Without constraints, the tap coefficients could be combined in infinitely many ways, but not all combinations are suitable in practice. One of the most important constraints is that the de-emphasis voltage Vb cannot be too small, because too small a de-emphasis voltage makes the eye height of the signal at the receiver too low. The amplitude of the de-emphasized voltage is therefore limited through the boost ratio: for full-swing Tx output the specification requires boost ≤ 9.5 dB, and for reduced-swing Tx output it requires boost ≤ 3.5 dB. The result is a matrix table like the one in Figure 3, where the coefficients have a granularity of 1/24. Practical implementations can use other granularities, such as 1/64; a finer granularity gives the coefficient space more possible values and allows finer adjustment during LEQ.
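As a rough illustration of how the boost limit prunes the coefficient space, the Python sketch below derives the four voltage levels from two tap magnitudes and checks the boost ratio against the limits quoted above. The ratio definitions follow the PCIe Base Specification as I understand it (de-emphasis = Vb/Va, preshoot = Vc/Vb, boost = Vd/Vb, all in dB); the function names, the normalization to a maximum amplitude of 1, and the mapping of Va and Vc to bit positions are assumptions made for this example.

```python
import math


def voltage_levels(c_pre: float, c_post: float):
    """Return (Va, Vb, Vc, Vd), normalised so that the maximum amplitude is 1.

    c_pre and c_post are the magnitudes of the pre-cursor and post-cursor
    taps (hypothetical names). The main tap is assumed to take the remainder
    so that the three magnitudes sum to 1.
    """
    c_main = 1.0 - c_pre - c_post
    va = c_main - c_pre + c_post  # assumed: first bit after a transition
    vb = c_main - c_pre - c_post  # de-emphasised level inside a long run
    vc = c_main + c_pre - c_post  # assumed: last bit before a transition
    vd = c_main + c_pre + c_post  # isolated bit, maximum amplitude
    return va, vb, vc, vd


def to_db(ratio: float) -> float:
    return 20.0 * math.log10(ratio)


def eq_ratios(c_pre: float, c_post: float) -> dict:
    """De-emphasis, preshoot, and boost in dB for the given tap magnitudes."""
    va, vb, vc, vd = voltage_levels(c_pre, c_post)
    return {
        "de_emphasis_db": to_db(vb / va),
        "preshoot_db": to_db(vc / vb),
        "boost_db": to_db(vd / vb),
    }


def within_boost_limit(c_pre: float, c_post: float, full_swing: bool = True) -> bool:
    """Check boost against 9.5 dB (full swing) or 3.5 dB (reduced swing)."""
    _, vb, _, vd = voltage_levels(c_pre, c_post)
    if vb <= 0:
        return False  # de-emphasised level collapsed: never acceptable
    limit_db = 9.5 if full_swing else 3.5
    return to_db(vd / vb) <= limit_db


print(eq_ratios(0.1, 0.2))
print(within_boost_limit(0.2, 0.2))  # Vb = 0.2 of full swing: boost too large
```

For tap magnitudes of 0.1 and 0.2, Vb is 0.4 of the maximum swing, which gives a boost of roughly 8 dB and so passes the full-swing limit but not the reduced-swing one.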
Figure 3 Example matrix table of the transmitter equalization coefficient space
Because the coefficient space offers so many possible values, PCI-SIG extensively studied the optimal coefficient combinations for different insertion losses while developing the specification, and defined a set of preset values based on the results. In the actual LEQ process, both sides of the link can use the preset values for coarse adjustment; if the result is still not optimal, it can be fine-tuned within the coefficient space, finally achieving a balance between speed and accuracy.
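The coarse-then-fine idea can be sketched as a two-stage search: pick the best of a few presets, then walk through neighbouring coefficient settings while the result keeps improving. Everything here is illustrative: the preset list, the `figure_of_merit` callback, and the representation of a setting as a `(pre, post)` pair of integer steps are assumptions, not values or algorithms from the specification.

```python
def coarse_then_fine(presets, figure_of_merit, max_steps=50):
    """Pick the best preset, then hill-climb in the coefficient space.

    presets         -- list of (pre, post) tap settings in integer steps (assumed)
    figure_of_merit -- callable scoring a setting; higher is better (assumed)
    """
    # Coarse stage: try only the presets.
    best = max(presets, key=figure_of_merit)

    # Fine stage: move one coefficient step at a time while the score improves.
    for _ in range(max_steps):
        pre, post = best
        neighbours = [
            (pre + d_pre, post + d_post)
            for d_pre, d_post in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if pre + d_pre >= 0 and post + d_post >= 0
        ]
        candidate = max(neighbours, key=figure_of_merit)
        if figure_of_merit(candidate) <= figure_of_merit(best):
            break  # no neighbour is better: stop fine-tuning
        best = candidate
    return best


# Toy example: the (unknown to the search) optimum is pre=3, post=5.
def toy_merit(setting):
    pre, post = setting
    return -((pre - 3) ** 2 + (post - 5) ** 2)


print(coarse_then_fine([(0, 0), (0, 4), (3, 0), (2, 5)], toy_merit))
```

Starting from a preset keeps the number of fine steps small, which is the speed-versus-accuracy trade-off the text describes.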
Equalization at the receiver: CTLE and DFE
The PCIe 3.0 and 4.0 Base Specifications do not prescribe the structure of the receiver; they specify receiver performance only from the perspective of measurement. To that end, the specification defines a behavioral CTLE and a behavioral DFE. These behavioral models can serve as design guides. For a device under test to meet the requirements of the specification, the performance of the receiver design must generally be at least equal to that of the behavioral models: it may be stronger, but it cannot be weaker.
Figure 4 Frequency response curve of behavioral CTLE: (a) PCIe 3.0 (b) PCIe 4.0
After the transmitter output has passed through a long FR4 trace, CTLE alone may not be enough, so PCIe 3.0 and 4.0 also use DFE. PCIe 3.0 uses a 1-tap DFE; because the rate in 4.0 is double that of 3.0, PCIe 4.0 uses a 2-tap DFE to remove the larger ISI.
Unlike the linear equalizers FFE and CTLE, DFE is a nonlinear equalizer. The basic idea of DFE is this: if the previous bit has been received correctly, its influence on the current bit is known, so it can be compensated through feedback, which further reduces the effects of jitter and noise. The nonlinearity lies in the fact that the feedback signal is a digital signal taken after the decision, and the decision circuit is a nonlinear circuit. The more taps there are in the feedback path, the better the cancellation can be; this is why a 1-tap DFE is used in 3.0 and a 2-tap DFE in 4.0.
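The feedback idea can be illustrated with a minimal Python sketch of an N-tap DFE operating on one sample per bit. The toy `channel` function, the tap values, and the bit pattern are assumptions chosen so that a plain slicer makes an error; this is not the behavioral DFE defined in the specification.

```python
def channel(bits, post_cursors):
    """Toy channel: each received sample is the bit plus ISI from earlier bits."""
    out = []
    for n, b in enumerate(bits):
        isi = sum(
            h * bits[n - k]
            for k, h in enumerate(post_cursors, start=1)
            if n - k >= 0
        )
        out.append(b + isi)
    return out


def slicer(samples):
    """Decision circuit without equalization: just the sign of each sample."""
    return [1 if x >= 0 else -1 for x in samples]


def dfe(samples, taps):
    """N-tap decision feedback equalizer.

    Before each decision, subtract the ISI predicted from the previous
    decisions (most recent first), weighted by the feedback taps.
    """
    decisions = []
    history = [0] * len(taps)  # previous decisions, most recent first
    for x in samples:
        feedback = sum(t * d for t, d in zip(taps, history))
        bit = 1 if (x - feedback) >= 0 else -1  # nonlinear decision step
        decisions.append(bit)
        history = [bit] + history[:-1]
    return decisions


bits = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1]
received = channel(bits, [0.6, 0.5])       # two post-cursor ISI terms (assumed)
print(slicer(received) == bits)            # plain slicer makes errors here
print(dfe(received, [0.6, 0.5]) == bits)   # 2-tap DFE recovers the pattern
```

Because the feedback uses past decisions rather than past samples, it cancels ISI without amplifying noise, but a wrong decision would be fed back as well; the sketch assumes the taps match the channel exactly.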
Figure 5 Structure of a behavioral DFE: (a) PCIe 3.0 (b) PCIe 4.0
Link Equalization Process
When the two ends of a link first establish communication, they do not know the physical characteristics of the channel, such as how large the insertion loss is or whether there are impedance discontinuities. Because the insertion loss of PCIe 3.0 and 4.0 channels can vary widely, a static equalization setting cannot cover all situations. Both ends of the link therefore need to adjust their equalization settings dynamically so that the settings are optimal for the current physical channel. Assuming that Port A and Port B are the two ends of a link, the link equalization process needs to do the following:
• Configure the initial equalization settings of Port A and Port B;
• Configure the equalization settings in the Port A Tx → Port B Rx direction;
• Configure the equalization settings in the Port B Tx → Port A Rx direction.
Next, we use the Port A Tx → Port B Rx direction to illustrate how link equalization is achieved. As shown in Figure 6, when a link running at 8 GT/s or 16 GT/s starts to establish communication, Port A sends TS1/TS2 sequences with its initial, unoptimized TX EQ, and indicates in those TS1/TS2 sequences the TX EQ value it is using.
Figure 6 LEQ: Local side sends unoptimized initial TX EQ
When Port B Rx receives these TS1/TS2 sequences, a circuit or algorithm inside the chip evaluates whether the current TX EQ is suitable. If it is not, Port B sends TS1 sequences, as shown in Figure 7, to request a new TX EQ.
Figure 7 LEQ: The peer requests a new TX EQ
Then, Port A will receive the TS1 sequence requesting to set the TX EQ, as shown in Figure 8, and adjust the FFE setting on its TX side.
Figure 8 LEQ: The peer’s request is correctly received locally, and a new TX EQ is set
After Port A adjusts its Tx FFE setting, as shown in Figure 9, it writes the new TX EQ value into the TS1/TS2 sequences and sends them to Port B. If Port B still considers the TX EQ not optimal, steps 2 to 4 in the figure are repeated until the optimal TX EQ is reached. Of course, this process cannot continue indefinitely; it must be completed within about 32 ms.
Figure 9 LEQ: The local end informs the peer end that a new TX EQ has been successfully set
While steps 2 to 4 above are in progress, the RX side of Port B is also continuously adjusting its RX EQ, as shown in Figure 10. As Figures 6 to 10 show, LEQ achieves dynamic equalization through a request-response mechanism. In the PCIe specification, LEQ comprises four phases: Phase 0, Phase 1, Phase 2, and Phase 3. The upstream port goes through all four phases; the downstream port does not go through Phase 0.
Figure 10 LEQ: Adjusting the RX EQ simultaneously throughout the process
Figure 11 shows that the upstream port and the downstream port behave differently during the LEQ process. The description above covers how both ends of the link adjust the Tx EQ during LEQ. As for the Rx EQ, according to the Base Specification, both ends of the link can adjust it at any time during the entire LEQ process and during subsequent normal operation.
Figure 11 State transition diagram of LEQ