FDE vs latch?


What's the big deal about a latch? Is it less efficient in floorspace than 
an FDE? Or is it just some amount of combinatorial concerns? For example:

    tsc_start <= tsc when sof_in_n = '0' and rising_edge(clk);
versus
    tsc_start <= tsc when sof_in_n = '0';

What concerns are there with crossing clock domains with either? For 
example, a 10 ns tick clock would be more directly meaningful than, say, a 
125 MHz clock in this application. (The logic reading the value runs on a 
different clock than the logic updating the value.)

Also, I had apparently missed the explanation of the semantic difference 
between 'to' and 'downto'. Does simple intuition apply here, that with 'to', 
low indexes reference the more significant bits or words?
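On the 'to' versus 'downto' point, a small simulation-only check may help. It assumes the numeric_std convention, in which the leftmost element of an unsigned is the most significant bit whatever the direction; a plain std_logic_vector carries no numeric meaning of its own. The entity and constant names are made up for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking demo, not part of the design discussed here.
entity range_demo is
end entity;

architecture sim of range_demo is
    -- Same literal, opposite index directions.
    constant up   : unsigned(0 to 7)     := "10000000";
    constant down : unsigned(7 downto 0) := "10000000";
begin
    check : process
    begin
        -- The leftmost literal character lands on the left bound of the range:
        -- index 0 for 'to', index 7 for 'downto'.
        assert up(0) = '1' and down(7) = '1'
            report "unexpected element placement" severity failure;
        -- numeric_std treats the leftmost element as the MSB in both cases.
        assert to_integer(up) = 128 and to_integer(down) = 128
            report "unexpected numeric value" severity failure;
        wait;
    end process;
end architecture;
```

So with 'to', the low index is the left end, and for numeric_std types that is the more significant end.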

Last, does the following infer a multiplier and adder in synthesis? I 
wouldn't expect to see one, and didn't see one in the synthesis log or RTL 
schematic, but the whole thing got really messy with simply adding the very 
wide registers.

    function w_tsc(val : std_logic_vector; i : natural) return std_logic_vector is
        variable lo : natural := i * 8;
        variable hi : natural := lo + 7;
    begin
        return val(hi downto lo);
    end function;

begin -- arch
    with tx_state select
        dout <= ....
                w_tsc(tsc_start, 7) when SEND_TSC_START,
                w_tsc(tsc_start, 6) when SEND_TSC_START_1,
                w_tsc(tsc_start, 5) when SEND_TSC_START_2,
                w_tsc(tsc_start, 4) when SEND_TSC_START_3,
                w_tsc(tsc_start, 3) when SEND_TSC_START_4,
                w_tsc(tsc_start, 2) when SEND_TSC_START_5,
                ...

Truly and finally last, is there a good way to generate the above, in the 
midst of a selected assignment having other, unrelated states?

Thanks.

Reply boat042-nospam (121) 5/15/2012 7:31:39 PM

>What's the big deal about a latch? 
>

It's usually harder to meet Static Timing Analysis with latches.
They are as troublesome as a very troublesome thing.
If you think you need to use one, think again.
	   
					
Reply robert.ingham2755 (75) 5/16/2012 4:17:05 PM


"RCIngham" <robert.ingham@n_o_s_p_a_m.n_o_s_p_a_m.gmail.com> wrote in 
message news:kt-dnRe2Lb2cTi7SnZ2dnUVZ_rydnZ2d@giganews.com...
> >What's the big deal about a latch?
>>
>
> It's usually harder to meet Static Timing Analysis with latches.
> They are as troublesome as a very troublesome thing.
> If you think you need to use one, think again.

In this particular case, it's grabbing a rather large 100 ns tick count, to 
inject into a message as a timestamp later. Timing isn't an issue. The only 
concern was relative space efficiency. There seems no difference between a 
latch and FDE, eating up a slice every 2 bits.

I "solved" the other problem of serializing the multi-word timestamps by 
sending 8-bit counts after the first full timestamp. The question remains, 
though, how to efficiently byte-serialize the large data word. I hate to 
hand write a state machine to do this, for every word size I might 
encounter. Shifting the latched value seems unnecessarily painful.

Rotating a single hot bit in a surrogate shadow word, one bit representing a 
data byte, seems so far the best solution, but I can't dream up a way to 
generate the required statements to go with it. Rotating the surrogate mask 
effectively makes for a cheap, easy one-hot FSM, but I don't have the 
language skill to generate the corresponding:

    with vmask select
        dout <= w(47 downto 40) when b"100000",
                w(39 downto 32) when b"010000",
                w(31 downto 24) when b"001000",
                ...

Writing one for each N-sized word is tedious and error prone. Any help or 
ideas?

Reply boat042-nospam (121) 5/16/2012 7:31:13 PM

WRT latches: With an FDE, all timing paths start and/or end at the FDE.
With a latch, they don't (e.g. when the latch is transparent, paths run
straight through it). This makes for many more false paths, etc., which
cause STA to be very conservative, to the point that you often cannot
meet timing unless you use a lot of false path constraints. The problem
with those constraints is that they are inherently very difficult to
verify (that they are correctly stated and applied) except by inspection
and manual analysis.

Also some (most) FPGA architectures do not have a latch primitive, but
use a macro built from one or more LUTs. These circuits are
notoriously glitchy in the presence of two or more inputs changing
simultaneously, especially likely since upstream logic is often merged
into the same LUT. Unless very carefully designed around, this can
cause latches to unlatch even though the enable did not change.

Short version: it hurts when you do that.
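To make the difference concrete, here is a minimal sketch of the two capture styles from the original question, written as explicit processes. The entity name, the two output names and the 64-bit width are assumptions for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper; clk, sof_in_n and tsc follow the original post,
-- everything else (entity name, output names, width) is assumed.
entity tsc_capture is
    port (
        clk       : in  std_logic;
        sof_in_n  : in  std_logic;
        tsc       : in  std_logic_vector(63 downto 0);
        tsc_ff    : out std_logic_vector(63 downto 0);
        tsc_latch : out std_logic_vector(63 downto 0)
    );
end entity;

architecture rtl of tsc_capture is
begin
    -- Flip-flop with clock enable (FDE style): the output can change only
    -- on a rising clock edge while sof_in_n is low.
    capture_ff : process (clk)
    begin
        if rising_edge(clk) then
            if sof_in_n = '0' then
                tsc_ff <= tsc;
            end if;
        end if;
    end process;

    -- Level-sensitive latch: transparent for as long as sof_in_n is low,
    -- so the output follows every change on tsc until sof_in_n goes high.
    capture_latch : process (sof_in_n, tsc)
    begin
        if sof_in_n = '0' then
            tsc_latch <= tsc;
        end if;
    end process;
end architecture;
```

In the first process every path into tsc_ff ends at a clock edge; in the second, anything downstream of tsc_latch sees tsc directly while the latch is open, which is where the extra timing paths come from.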

WRT your function. If the i argument is known at synthesis time (this
includes the index of a for-loop), then no hardware is synthesized at
all, just wires. Otherwise, it will just be multiplexers, with no
adder or multiplier (multiplying or dividing by a power of two is just a
shift, and adding 7 to a number that is already known to have zeroes in
its three least significant bits does not take an adder, just a LUT that
always outputs 1, which may be shared with others, and some more wires).
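A quick way to convince yourself of the constant-index case is a self-checking sketch like the one below. The function is the one from the original post; the demo entity, the constant value and the signal names are assumptions made up for the check.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only demo entity.
entity w_tsc_demo is
end entity;

architecture sim of w_tsc_demo is
    -- Byte-slicing function as posted: i * 8 and lo + 7 are evaluated
    -- at elaboration when i is a constant, so only a slice remains.
    function w_tsc(val : std_logic_vector; i : natural) return std_logic_vector is
        variable lo : natural := i * 8;
        variable hi : natural := lo + 7;
    begin
        return val(hi downto lo);
    end function;

    -- Assumed test value; the real tsc_start is a signal in the design.
    constant tsc_start : std_logic_vector(63 downto 0) := x"0123456789ABCDEF";

    signal top_byte : std_logic_vector(7 downto 0);
    signal low_byte : std_logic_vector(7 downto 0);
begin
    -- Constant indexes: each assignment is just eight wires.
    top_byte <= w_tsc(tsc_start, 7);
    low_byte <= w_tsc(tsc_start, 0);

    check : process
    begin
        wait for 1 ns;
        assert top_byte = x"01" report "top byte mismatch" severity failure;
        assert low_byte = x"EF" report "low byte mismatch" severity failure;
        wait;
    end process;
end architecture;
```

Note that the function as written relies on the actual parameter having a descending range that starts at 0.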

I do not understand your last question.


Andy
Reply jonesandy (110) 5/16/2012 7:49:23 PM

"Andy" <jonesandy@comcast.net> wrote in message 
news:2afbb4b2-b865-4f40-b706-4cd60365535d@t35g2000yqd.googlegroups.com...
> WRT your function. If the i argument is known at synthesis time (this
> includes the index of a for-loop), then no hardware is synthesized at
> all, just wires. Otherwise, it will just be multiplexers, with no
> adder or multiplier (multiply/divide by power of two is just a shift,
> and the addition of 7 to a number that is already known to have zeroes
> in the least three bits does not take an adder, just a lut that always
> outputs 1, which may be shared with others, and some more wires).

Doh! A for loop does seem the obvious answer. Does that synthesize in a 
clocked process?

Thanks.

Reply boat042-nospam (121) 5/16/2012 8:16:32 PM

On Wed, 16 May 2012 15:16:32 -0500
"MikeWhy" <boat042-nospam@yahoo.com> wrote:

> Doh! A for loop does seem the obvious answer. Does that synthesize in a 
> clocked process?
> 
> Thanks.
> 

It does if it's synthesizable.

More specifically, if the loop can be unrolled into some mess of
combinational logic, then that logic can be placed in front of a
register, and the for loop can be synthesized.  If there is no
combinational logic that would generate that function (or if there is,
but it's enormous, e.g. a divider) then you can't synthesize it.
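As an illustration of a loop that unrolls cleanly, here is a minimal sketch of a for loop inside a clocked process. It registers a byte-reversed copy of its input; the entity name, port names and NBYTES generic are assumptions, and byte reversal is just a convenient stand-in for any per-byte operation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; nothing here comes from the original design.
entity byte_swap_reg is
    generic (NBYTES : positive := 8);  -- assumed word size in bytes
    port (
        clk  : in  std_logic;
        din  : in  std_logic_vector(NBYTES*8 - 1 downto 0);
        dout : out std_logic_vector(NBYTES*8 - 1 downto 0)
    );
end entity;

architecture rtl of byte_swap_reg is
begin
    swap : process (clk)
    begin
        if rising_edge(clk) then
            -- The loop bounds are constants, so the loop unrolls into
            -- NBYTES fixed slice assignments: wires in front of registers.
            for i in 0 to NBYTES - 1 loop
                dout(i*8 + 7 downto i*8) <=
                    din((NBYTES - 1 - i)*8 + 7 downto (NBYTES - 1 - i)*8);
            end loop;
        end if;
    end process;
end architecture;
```

Because every index is known after unrolling, there is no multiplier, adder or mux in the result, only the output registers.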

-- 
Rob Gaddi, Highland Technology -- www.highlandtechnology.com
Email address domain is currently out of order.  See above to fix.
Reply rgaddi1 (124) 5/16/2012 8:26:44 PM

"Rob Gaddi" <rgaddi@technologyhighland.invalid> wrote in message 
news:20120516132644.4ee18da4@rg.highlandtechnology.com...
> It does if it's synthesizable.
>
> More specifically, if the loop can be unrolled into some mess of
> combinational logic, then that logic can be placed in front of a
> register, and the for loop can be synthesized.  If there is no
> combinational logic that would generate that function (or if there is,
> but it's enormous, e.g. a divider) then you can't synthesize it.

A counter and an array of bytes, rather than a shift mask and whatever, was 
the trick. It synthesized to a counter and a big mux. Thanks for the help.

    do_vcount : process (clk)
    ...

    v_out <= word_bytes(vcount);

    init_vword : for iword in 0 to NWORDS-1 generate
        word_bytes(iword) <= vword(word, iword);
    end generate;
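For anyone landing here later, a fuller sketch of that counter-plus-byte-array arrangement might look like the following. The entity name, the start port, the NBYTES generic and the most-significant-byte-first ordering are assumptions for illustration, not the code actually used above; the elided do_vcount body and the vword function are not reproduced.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical serializer; only the signal names word, word_bytes,
-- vcount and v_out echo the fragment above.
entity byte_serializer is
    generic (NBYTES : positive := 6);  -- assumed: 48-bit word
    port (
        clk   : in  std_logic;
        start : in  std_logic;  -- assumed: restart at the top byte
        word  : in  std_logic_vector(NBYTES*8 - 1 downto 0);
        v_out : out std_logic_vector(7 downto 0)
    );
end entity;

architecture rtl of byte_serializer is
    type byte_array is array (0 to NBYTES - 1) of std_logic_vector(7 downto 0);
    signal word_bytes : byte_array;
    signal vcount     : natural range 0 to NBYTES - 1 := NBYTES - 1;
begin
    -- The generate index is static, so this is pure wiring.
    split : for i in 0 to NBYTES - 1 generate
        word_bytes(i) <= word(i*8 + 7 downto i*8);
    end generate;

    -- Down-counter that walks from the top byte to byte 0, then holds.
    do_vcount : process (clk)
    begin
        if rising_edge(clk) then
            if start = '1' then
                vcount <= NBYTES - 1;
            elsif vcount > 0 then
                vcount <= vcount - 1;
            end if;
        end if;
    end process;

    -- The only real logic besides the counter: an NBYTES-to-1 byte mux.
    v_out <= word_bytes(vcount);
end architecture;
```

Changing NBYTES resizes the array, the counter range and the mux together, which avoids hand-writing a selected assignment for each word size.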

Reply boat042-nospam (121) 5/17/2012 12:12:03 AM
