April 24, 2016

Hold Time Violations

How often has someone asked you how to fix setup time violations? And how often have you replied with a list of techniques: cell upsizing, logical retiming, Vt swapping, useful clock skew, or perhaps even a reduction in the clock frequency?

And how often has someone tried to trap you by asking about the impact of clock frequency on hold time?


Let's imagine a scenario. You designed a chip, and it has been manufactured. You then discover that there is one hold time violation with, let's say, a slack of -10 ps. The logical answer would be to throw that chip away, since a hold violation cannot be fixed by tweaking the clock frequency. But if it were that simple, I wouldn't have asked this question! :P
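To see why the clock frequency alone cannot rescue a hold violation, here is a minimal sketch of the usual single-cycle setup and hold slack checks. The function names and every delay value are illustrative assumptions (picked so that the hold slack comes out to the -10 ps of this scenario), not numbers from a real design.

```python
# All times are in picoseconds. Values are hypothetical.

def setup_slack(period, launch_clk, capture_clk, clk_to_q, comb_max, t_setup):
    """Setup check: data launched on one edge must arrive before the
    next capturing edge, so the clock period appears in the equation."""
    required = period + capture_clk - t_setup
    arrival = launch_clk + clk_to_q + comb_max
    return required - arrival

def hold_slack(launch_clk, capture_clk, clk_to_q, comb_min, t_hold):
    """Hold check: launch and capture use the same clock edge,
    so the clock period does not appear at all."""
    arrival = launch_clk + clk_to_q + comb_min
    required = capture_clk + t_hold
    return arrival - required

# Hypothetical path: the capture clock arrives 60 ps later than the launch clock.
path = dict(launch_clk=100, capture_clk=160, clk_to_q=50)

for period in (1000, 2000):  # 1 GHz and 500 MHz
    s = setup_slack(period, comb_max=800, t_setup=40, **path)
    h = hold_slack(comb_min=30, t_hold=30, **path)
    print(f"period={period} ps  setup slack={s} ps  hold slack={h} ps")
```

Doubling the period improves the setup slack by exactly the added period, while the hold slack is the same in both runs.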

Now, think a little. What "engineering tweaks" can you apply in order to make the chip work, or I should say, to try and make the chip work?


I expect a healthy discussion on this question, and I'm sure I will end up learning a few things that I have not appreciated until now. Please enlighten me with your thoughts.


Thanks! :)

25 comments:

  1. Reducing the voltage can help reduce the hold violation.
    Please correct me if I am wrong.
    Thank you
    Venu

    Replies
    1. Hi Perni,

      That's absolutely correct! But reducing the voltage alone won't help, because we would then see setup time violations on the most critical path. So the voltage has to be decreased in conjunction with reducing the clock frequency by an appropriate margin!

      -Naman

    2. Even if reducing the voltage (and clock frequency) would help fix the hold violation, is it recommended? Lowering the voltage means the current will also decrease, thereby increasing propagation delay. Is this method used in the industry?

    3. It's better to use a device at a lower frequency than to throw it away.

  2. Increase the temperature of operation.

    Replies
    1. Hi Pandit,

      That's indeed a good starting point, considering that the magnitude of the violation was just 10 ps. However, I'm just a tad concerned about the temperature inversion phenomenon exhibited by lower technology nodes. But yes, you are correct: increasing the temperature should most probably take care of this hold violation.

      Thanks for your answer! :)

      -Naman

    2. Hi Naman/Pandit,

      I am kind of scratching my head. How will increasing the temperature fix a hold violation? My understanding is that increasing the temperature makes devices faster, in which case it will worsen this hold-violating path. Correct me if my thinking is wrong.

    3. Increasing the temperature causes more collisions inside the semiconductor lattice, which greatly reduces the mobility of electrons/holes. This increases the propagation delay as the temperature rises.

      However, for sub-65nm technology nodes, a phenomenon called temperature inversion has been observed, where increasing the temperature might first cause a decrease in propagation delay, and then an increase. The dependence of delay on temperature is slightly complex, but the general trends remain the same.

      Your perspective is correct, maybe you were talking about a more specific case.

  3. How about adding additional buffers in the path that are activated with an external signal? The selection between one path with buffers (delayed path) and one without buffers (regular path) can be done using a mux. Thus, if we encounter a hold time violation, we can activate the path with buffers, which increases the contamination delay of the path.

    Replies
    1. Silicon is out! The chip has already been fabricated. I was talking about all the methods you can try after the silicon has been fabricated.

      What you suggested can only be done in the design phase. I hope it makes sense.

      -Naman

    2. This comment has been removed by the author.

    3. This comment has been removed by the author.

    4. If only the Si is fabricated:
      1) You can fix it in a metal-only ECO, using spare cells.
      2) Using PDLY cells, if you have them in the design, to increase delay.
      3) Additional load or cap can be added to the data path.
      If the chip is manufactured, then only external phenomena can change it:
      1) Voltage - needs to be above threshold.
      2) Temp

  4. Do we have a mechanism to figure out whether a functional failure is because of setup or hold after the silicon is out?

  5. I guess if the functional failure persists after relaxing the clock frequency to a huge extent, we can infer that a hold violation has occurred. In that case we can reduce the voltage iteratively at that relaxed clock frequency, and once we reach a point where the functional failure vanishes, we can start increasing the clock frequency while keeping the voltage constant. We keep increasing the clock frequency until a functional failure occurs again.
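The procedure in the comment above can be sketched as a two-stage sweep. Everything here is hypothetical: `toy_chip_passes` is a made-up pass/fail model standing in for a real tester, tuned so that the hold slack is -10 ps at the nominal 1000 mV and assuming the data path slows down faster than the clock skew grows as the voltage drops. On real silicon that assumption may not hold.

```python
def toy_hold_slack_ps(v_mv):
    """Toy model (assumption): the data path scales linearly with the
    slowdown factor, the clock skew only with its square root."""
    slowdown = 1000 / v_mv
    data_delay = 40.0 * slowdown
    clock_skew = 45.0 * slowdown ** 0.5
    t_hold = 5.0
    return data_delay - clock_skew - t_hold

def toy_chip_passes(v_mv, f_mhz):
    """Stand-in for running a functional pattern on the tester."""
    slowdown = 1000 / v_mv
    period_ps = 1e6 / f_mhz
    setup_ok = period_ps >= 900.0 * slowdown + 50.0  # toy critical path
    return toy_hold_slack_ps(v_mv) >= 0 and setup_ok

def sweep(passes, v_nom_mv, v_min_mv, v_step_mv, f_slow_mhz, f_max_mhz, f_step_mhz):
    """Return (voltage_mV, frequency_MHz) of a working point, or None."""
    # Stage 1: at a very relaxed clock, setup cannot be the culprit,
    # so lower the voltage until the failure disappears.
    v = v_nom_mv
    while v >= v_min_mv and not passes(v, f_slow_mhz):
        v -= v_step_mv
    if v < v_min_mv:
        return None  # no voltage in range fixes the failure
    # Stage 2: hold the voltage and raise the frequency until the next
    # step would fail; keep the last passing frequency.
    f = f_slow_mhz
    while f + f_step_mhz <= f_max_mhz and passes(v, f + f_step_mhz):
        f += f_step_mhz
    return v, f

print(sweep(toy_chip_passes, v_nom_mv=1000, v_min_mv=500, v_step_mv=50,
            f_slow_mhz=100, f_max_mhz=1000, f_step_mhz=25))
```

Voltages are kept as integer millivolts and frequencies as integer MHz so that repeated stepping does not accumulate floating-point error.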

  6. If the failure is in DFT only, while functional mode is working properly, then I think we can add the pattern for masking.

    If it is a functional failure, then at the same voltage the first approach should be to increase the frequency slightly, as long as no setup violation occurs (it should not occur at the same point, as that point is already hold critical). We do know that clock activity does impact setup and hold time, so why not try and see if a higher frequency fixes the hold issue? This way we can maintain the performance as well, with a slight increase in dynamic power, which will further raise the ambient temperature, and hence the data path may also get slightly more resistive, adding more delay (the assumption is that the clock path won't scale as much as the data path).

    If this doesn't resolve the issue, then a voltage and frequency sweep iteration should be performed.

    Replies
    1. Makes sense.

      Just a few comments: while I agree that frequency can impact hold time, relying on that is architecturally wrong. One should never have any hold paths that are frequency dependent.

      If increasing the frequency reduces the hold time violations, that would mean that decreasing the frequency would cause more hold time violations, which is not acceptable.

  7. Hi Naman Guupta,

    I am a Design Engineer (Physical Design) with almost 1.5 years of experience and little real-time project experience. My situation at my current company is alarming, and while trying to change jobs I am having trouble answering interview questions. Can you help me? If you reply, I will be able to explain in detail.

    Sorry to use this platform.

    Thanks!!

  8. Temperature and voltage are global variables which affect not just the path of interest but the whole chip. Similarly, burn-in is also applied to the complete chip. Burn-in is generally used to identify and get rid of early-failing chips. Can the same be done here, since it might make the paths slower and so fix hold? This might cause setup violations, but those can again be fixed by lowering the frequency of operation.

  9. If you don't mind proceed with this extraordinary work and I anticipate a greater amount of your magnificent blog entries. Tucson Code Enforcement Violations

  10. This is interesting, Naman.
    If we reduce the voltage to get the condition t-combo > t-hold and reduce the frequency so as not to violate t-setup, are you considering the effect of voltage on t-hold too? For example, does t-hold increase at low voltage?

  11. Hi Naman,

    In my view, this can be rectified if the path with the hold violation is routed in the lower layers. As we know, lower metal layers have high resistance, so increasing the length of the metal adds resistance and in turn adds delay to the path.

  12. Can reducing the voltage really fix the hold violation?
    Considering that the hold time of a sequential cell is small compared to the data path or clk delays,
    we can generally say that a hold violation occurs when the sampling clk path delay is higher than the generating clk and data path delays combined. If you lower the voltage, all delays will degrade and the situation will still be the same:
    sampling clk delay > generating clk delay + data path delay.
    Also, moving to lower voltages leads to more deviation in cell delays, due to the lower Vgs - Vt, and I believe more chips will fail with hold violations after reducing the voltage.

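The argument in the last comment can be checked with a small sketch, under the same simplifications the commenter makes: the hold time is neglected, and lowering the voltage is modeled as multiplying every delay by one common factor `k`. The function names and delay values are illustrative assumptions only.

```python
# Times in picoseconds; all values are hypothetical.

def hold_slack(generating_clk, data_path, sampling_clk, t_hold=0.0):
    """Positive when the data arrives after the sampling clock edge
    (plus hold time); negative means a hold violation."""
    return generating_clk + data_path - sampling_clk - t_hold

def hold_slack_after_scaling(k, generating_clk, data_path, sampling_clk):
    """Model a voltage drop as a uniform slowdown k > 1 of every delay."""
    return hold_slack(k * generating_clk, k * data_path, k * sampling_clk)

nominal = hold_slack(generating_clk=100.0, data_path=50.0, sampling_clk=160.0)
slowed = hold_slack_after_scaling(1.5, generating_clk=100.0, data_path=50.0,
                                  sampling_clk=160.0)
print(nominal, slowed)
```

With uniform scaling the slack is simply multiplied by `k`, so a negative slack stays negative and grows in magnitude. Whether reducing the voltage helps on real silicon therefore depends on the data path and the clock paths scaling differently, which this simplified model does not capture.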