[MonteCarlo][European Vanilla Core] Max delay path issue

The DSP multiplication behaves unexpectedly: synthesis does not seem to use the DSP pipeline registers to shorten the critical path. This appears to be caused by the feedback loop, in which the result is fed back as an input of the multiplication.

Faulty Chisel code (EuropeanVanillaCore.scala:78):

val result = Wire(elemType)
result := Mux(iterationIndex === 0.U, s0, result) * factor

This leads to the following critical path:

                         DSP48E1 (Prop_dsp48e1_A[16]_PCOUT[47])
                                                      2.962     5.566 r  cores_0/_T_643__2/PCOUT[47]
                         net (fo=1, unplaced)         0.000     5.566    cores_0/_T_643__2_n_106
                         DSP48E1 (Prop_dsp48e1_PCIN[47]_PCOUT[47])
                                                      1.219     6.785 r  cores_0/_T_643__3/PCOUT[47]
                         net (fo=1, unplaced)         0.000     6.785    cores_0/_T_643__3_n_106
                         DSP48E1 (Prop_dsp48e1_PCIN[47]_PCOUT[47])
                                                      1.219     8.004 r  cores_0/_T_643__4/PCOUT[47]
                         net (fo=1, unplaced)         0.000     8.004    cores_0/_T_643__4_n_106
                         DSP48E1 (Prop_dsp48e1_PCIN[47]_P[1])
                                                      1.077     9.081 r  cores_0/_T_643__5/P[1]
                         net (fo=4, unplaced)         0.466     9.548    cores_0/_T_643__5_n_104
                         LUT3 (Prop_lut3_I0_O)        0.043     9.591 r  cores_0/_T_642_0[40]_i_12/O
                         net (fo=3, unplaced)         0.288     9.879    cores_0/_T_642_0[40]_i_12_n_0
                         LUT5 (Prop_lut5_I1_O)        0.043     9.922 r  cores_0/_T_642_0[40]_i_4/O
                         net (fo=1, unplaced)         0.281    10.203    cores_0/_T_642_0[40]_i_4_n_0
                         CARRY4 (Prop_carry4_DI[1]_CO[3])
                                                      0.253    10.456 r  cores_0/_T_642_0_reg[40]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    10.456    cores_0/_T_642_0_reg[40]_i_1_n_0
                         CARRY4 (Prop_carry4_CI_CO[3])
                                                      0.054    10.510 r  cores_0/_T_642_0_reg[44]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    10.510    cores_0/_T_642_0_reg[44]_i_1_n_0
                         CARRY4 (Prop_carry4_CI_CO[3])
                                                      0.054    10.564 r  cores_0/_T_642_0_reg[48]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    10.564    cores_0/_T_642_0_reg[48]_i_1_n_0
                         CARRY4 (Prop_carry4_CI_CO[3])
                                                      0.054    10.618 r  cores_0/_T_642_0_reg[52]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    10.618    cores_0/_T_642_0_reg[52]_i_1_n_0
                         CARRY4 (Prop_carry4_CI_CO[3])
                                                      0.054    10.672 r  cores_0/_T_642_0_reg[56]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    10.672    cores_0/_T_642_0_reg[56]_i_1_n_0
                         CARRY4 (Prop_carry4_CI_CO[3])
                                                      0.054    10.726 r  cores_0/_T_642_0_reg[60]_i_1/CO[3]
                         net (fo=1, unplaced)         0.000    10.726    cores_0/_T_642_0_reg[60]_i_1_n_0
                         CARRY4 (Prop_carry4_CI_O[1])
                                                      0.173    10.899 r  cores_0/_T_642_0_reg[63]_i_1/O[1]
                         net (fo=1, unplaced)         0.000    10.899    cores_0/_GEN_25[62]

Unsuccessful attempts:

  • delaying the Mux with a register
  • replacing the result wire with a register
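
A possible explanation, not confirmed by the report: with a single register anywhere in the loop, the whole multiplication still has to complete between one register output and the next register input, so the DSP cascade stays on one combinational path. Pipelining the multiplier needs several registers inside the loop, which changes the recurrence unless several independent paths are interleaved. The sketch below illustrates that idea under stated assumptions: the module name `InterleavedMul`, its ports, and the `stages` parameter are hypothetical, plain `UInt` stands in for `elemType`, and the product is simply truncated to its low bits. Whether the synthesis tool absorbs these registers into the DSP48E1 pipeline stages depends on the tool and its settings.

```scala
import chisel3._
import chisel3.util.ShiftRegister

// Hypothetical sketch, not taken from EuropeanVanillaCore.scala.
// `stages` independent paths share one multiplier; each path comes back
// to the loop input every `stages` cycles.
class InterleavedMul(width: Int, stages: Int) extends Module {
  require(stages >= 1, "at least one register is needed to break the loop")

  val io = IO(new Bundle {
    val load   = Input(Bool())         // high: start the path in the current slot from s0
    val s0     = Input(UInt(width.W))
    val factor = Input(UInt(width.W))  // must belong to the path in the current slot
    val out    = Output(UInt(width.W)) // value of the path in the current slot
  })

  // Output of the register chain, fed back as multiplier operand.
  val feedback = Wire(UInt(width.W))

  // Same structure as the original Mux: fresh start value or fed-back result.
  val operand = Mux(io.load, io.s0, feedback)

  // Full product is 2 * width bits wide; keep the low bits (simplification).
  val product = (operand * io.factor)(width - 1, 0)

  // `stages` registers after the multiplier give the synthesis tool
  // something to move into the multiplier's pipeline.
  feedback := ShiftRegister(product, stages)

  io.out := feedback
}
```

With `stages = 1` this reduces to the single-register loop, which would be expected to show the same critical path as above. With `stages = N`, the surrounding logic has to present `load`, `s0`, and `factor` for the path occupying the current slot, so the iteration counter would advance once every N cycles rather than every cycle.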