Dear All,
I'm not an expert in VHDL; I'm just a curious person trying to solve a
research problem with an FPGA.
I'm using a 32-bit accumulator in an IP, as part of a SoC project with
a MicroBlaze, implemented on a Digilent Spartan-3 SKB (the FPGA is a
Xilinx XC3S200). The code is included at the end of this message. The
input is a 32-bit signed integer coded in two's complement, and the
output is also a 32-bit signed integer. What I would like the accumulator
to do is accumulate synchronously with the rising edge of clk when
enb=1 and hold the result stable at the output when enb=0 (enb is
an asynchronous signal generated elsewhere in the system).
But it does not work this way; it behaves strangely.
Sometimes I get the expected results, but often I get strange values
(large when they should be small, often negative instead of positive,
etc.). If I look at the binary representation of the output, it looks
as if the output didn't have time to sum and propagate to the output
again. In fact, the post-place-and-route simulation shows that when
the enb signal goes to 0, the output stays in an undetermined condition
(you know, red line with XXXX).
I'm guessing I'm making a very basic mistake that has something to do
with the timing of the enb signal, but after 3 days of banging my head
against the wall, all I have is a monumental headache.
Can some kind soul help me with this?
jmariano
================
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity int_accum is
  port (clk : in  std_logic;
        clr : in  std_logic;
        enb : in  std_logic;
        d   : in  std_logic_vector(31 downto 0);
        ovf : out std_logic;      -- overflow
        q   : out std_logic_vector(31 downto 0));
end int_accum;

architecture archi of int_accum is
  signal tmp : signed(32 downto 0);
begin
  process(clk, clr)
  begin
    if (clr = '1') then
      tmp <= (others => '0');
    elsif (rising_edge (clk)) then
      if (enb = '1') then
        -- The result of the adder will be on 33 bits to keep the carry
        tmp <= tmp + signed ('0'& d);
      end if;
    end if;
  end process;

  -- The carry is extracted from the most significant bit of the result
  ovf <= tmp(32);

  -- The q output is the 32 least significant bits of sum
  q <= std_logic_vector (tmp(31 downto 0));
end archi;
jmariano65 (8) | 7/2/2012 11:20:52 PM

This is the key to your problem:
> enb is an asynchronous signal generated elsewhere in the system
You can't take an asynchronous signal into multiple (32 in this case)
registers in a synchronous domain and expect it to work reliably. You
need to first synchronize the asynchronous input to the synchronous
clock domain before you can use it.
Ed McGettigan
--
Xilinx Inc.
ed.mcgettigan (73) | 7/3/2012 12:19:59 AM

On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote:
> You can't expect to take an asynchronous signal into multiple (32 in
> this case) registers in a synchronous domain and expect that it will
> work reliably. You need to first synchronize the asynchronous input to
> the synchronous clock domain before you can use it.
Which means that you should latch enb in a register, with the same clock
that you're using to twiddle your accumulator, and use the output of that
register as your enable signal.
Paranoid logic designers will have a string of two or three registers to
avoid metastability, but I've been told that's not necessary. (I'm not
much of a logic designer).
--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
tim866 (392) | 7/3/2012 5:24:02 AM

On Mon, 02 Jul 2012 16:20:52 -0700, jmariano wrote:
> Dear All,
>
> I'm not an expert in VHDL, i'm just a curious trying to solve a research
> problem with an FPGA.
>
> I'm using a 32 bit accumulator in a IP, .... The
> input is a 32 bit signed integer coded in two's complement and the
> output also a 32 bit signed integer.
> But it does not work in this way, it behaves in a strange manner...
You have one likely answer from Ed and Tim: unless you KNOW that the
input signals "enb" and "d" are already synchronous with "clk", you MUST
synchronise them.
But there is another problem:
    tmp <= tmp + signed ('0'& d);
This is NOT how to add a leading bit to d.
It will convert a small negative d to a very large positive value!
Instead you must replicate d's sign bit (MSB) into the leading bit:
    tmp <= tmp + signed (d(d'high) & d);
(Or look for the "resize" function in numeric_std to do this for you.)
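As a minimal sketch of the "resize" route mentioned above: the entity and port names here (sign_extend_demo, ext) are made up for illustration and are not part of the original design. The only library call used is numeric_std's resize on a signed argument, which sign-extends when the target is wider.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical demo entity: widens a 32-bit two's complement input to 33 bits.
entity sign_extend_demo is
  port (d   : in  std_logic_vector(31 downto 0);
        ext : out signed(32 downto 0));
end sign_extend_demo;

architecture rtl of sign_extend_demo is
begin
  -- resize on a signed argument replicates the sign bit into the new MSB,
  -- so -1 (x"FFFFFFFF") becomes the 33-bit value -1 rather than 2**32 - 1.
  ext <= resize(signed(d), ext'length);
end rtl;
```

Inside the accumulator process, the equivalent line would presumably read tmp <= tmp + resize(signed(d), tmp'length);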
This is far more likely to be the problem, especially if you are
detecting these errors in behavioural simulation (as you should be).
Incidentally, unless this is the top level of your design, I would
consider making the D and Q ports signed. Apart from keeping the type
conversions to a minimum, this means the external view of the design (the
entity specification) better reflects (or documents) what the design
does, preventing surprises when someone re-uses it with unsigned data...
- Brian
brian7107 (195) | 7/3/2012 2:31:10 PM

On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
> Which means that you should latch enb in a register, with the same clock
> that you're using to twiddle your accumulator, and use the output of that
> register as your enable signal.
>
> Paranoid logic designers will have a string of two or three registers to
> avoid metastability, but I've been told that's not necessary. (I'm not
> much of a logic designer).
It isn't just the paranoid logic designer, it should be every logic
designer.
A single register only partially solves the problem of an asynchronous
input with multiple register destinations, but it does not solve the
very real metastability problem. At least two registers should be used
to ensure that the metastability condition has resolved and with
increasing clock frequency and finer process nodes using three or more
stages may be necessary.
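A two-stage synchronizer of the kind described above might be sketched as follows. The entity and signal names (sync_2ff, async_in, sync_out, meta, stable) are hypothetical, and any tool-specific placement or timing constraints for the two flip-flops are not shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flip-flop synchronizer for a single-bit asynchronous input.
entity sync_2ff is
  port (clk      : in  std_logic;
        async_in : in  std_logic;   -- e.g. the asynchronous enb
        sync_out : out std_logic);  -- safe to use in the clk domain
end sync_2ff;

architecture rtl of sync_2ff is
  signal meta   : std_logic := '0';
  signal stable : std_logic := '0';
begin
  process(clk)
  begin
    if rising_edge(clk) then
      meta   <= async_in;  -- first stage: may go metastable
      stable <= meta;      -- second stage: gives it a clock period to resolve
    end if;
  end process;

  sync_out <= stable;
end rtl;
```

The accumulator would then test sync_out instead of the raw enb, so that all of its registers see the same, clock-aligned enable. A third stage is one more assignment in the same process.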
Ed McGettigan
--
Xilinx Inc.
ed.mcgettigan (73) | 7/3/2012 9:45:44 PM

On Jul 3, 5:45 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> It isn't just the paranoid logic designer, it should be every logic
> designer.
>
> A single register only partially solves the problem of an asynchronous
> input with multiple register destinations, but it does not solve the
> very real metastability problem. At least two registers should be used
> to ensure that the metastability condition has resolved and with
> increasing clock frequency and finer process nodes using three or more
> stages may be necessary.
Hi Ed. The way it was explained to me, I believe by Peter Alfke,
is that what really resolves metastability is the slack time in a
register-to-register path. Over the years, FPGA process improvements
have resulted in FFs which only need a couple of ns to resolve
metastability to 1 in a million operation years or something like that
(I don't remember the metric, but it was good enough for anything I do).
It doesn't matter that you have logic in that path; you just need those
few ns in every part of the path. In theory, even if you use multiple
registers with no logic, what really matters is the slack time in the
path, and that is not guaranteed even with no logic. So the design
protocol should be to ensure that the paths from the input register to
all subsequent registers have sufficient slack time.
Do you remember how much time that needs to be? I want to say 2 ns,
but it might be more like 5 ns, I just can't recall. Of course it
depends on your clock rates, but I believe Peter picked some more
aggressive speeds like 100 MHz for his example.
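For reference, the usual textbook model behind the slack-time argument is sketched below. The symbols are generic: t_slack is the time available for the first flip-flop to settle, tau and T_0 are device-dependent constants, and f_clk and f_data are the clock rate and the rate of asynchronous input transitions. No Spartan-3 values are implied here; they would have to come from the vendor's characterization data.

```latex
\mathrm{MTBF} \;=\; \frac{e^{\,t_{\mathrm{slack}}/\tau}}{T_0 \, f_{\mathrm{clk}} \, f_{\mathrm{data}}}
```

Because the slack appears in the exponent, each additional tau of settling time multiplies the mean time between failures by e, which is why a few ns of slack can matter more than the number of register stages as such.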
Rick
gnuarm (2644) | 7/4/2012 7:49:07 PM

Dear All,
Thank you very much for your input, and sorry for the late reply.
It is really great to be able to get the opinion of such experts,
especially since, at my current location and within a radius of some 200
km, I must be the only person working with FPGAs and VHDL! I'm also
glad that the discussion has evolved to levels of complexity far beyond
my knowledge.
I was hoping that by now I would be able to say that the thing was
working as expected but, unfortunately, no.
I've synchronized the enable signal, as suggested by Ed and Tim, using
3 FFs (I'm not paranoid, I just have room). Also, following Brian's
suggestions, I've cleaned up the code regarding type conversions. All
this has allowed me to isolate the remaining source of error, thank you
very much.
Here's the full story: I'm implementing a gated integrator as part
of a boxcar averager. This is the standard noise reduction technique
used in nuclear magnetic resonance (NMR). This is research, not a
commercial product! The module gets its data from four 8-bit ADCs at 5
MHz (adc0, adc90, adc180, adc270) and accumulates while enb=1. enb is
generated in a different module. The module does this:
1 - generates the acquisition clock (adc_clk) by dividing the
S3-SKB 50 MHz main clock by 10
2 - generates the accumulation clock (acc_clk) by inverting adc_clk.
In this way, there is a delay of 100 ns from the moment the ADCs
receive the rising edge of the clock to the moment when the data gets
registered at the output.
3 - converts the data from the ADCs to excess 128 (bipolar ADC) and
extends it to 32-bit signed
4 - calculates u = adc0-adc180 and v = adc90-adc270. u and v go through
a switch and emerge as r and i, to be delivered to 2 identical
accumulators.
Of course, 3 and 4 must occur in less than 100 ns.
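Steps 3 and 4 might look roughly like the sketch below. It assumes the ADCs deliver 8-bit excess-128 (offset binary) codes that need to become two's complement, which is one reading of step 3; the entity and signal names (adc_diff, s0, s180) are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: u = adc0 - adc180 as a 32-bit signed value.
entity adc_diff is
  port (adc0   : in  std_logic_vector(7 downto 0);  -- assumed excess-128 code
        adc180 : in  std_logic_vector(7 downto 0);  -- assumed excess-128 code
        u      : out signed(31 downto 0));
end adc_diff;

architecture rtl of adc_diff is
  signal s0, s180 : signed(7 downto 0);
begin
  -- Inverting the MSB of an excess-128 code gives the two's complement value
  -- (code 128 -> 0, code 0 -> -128, code 255 -> +127).
  s0   <= signed(adc0)   xor "10000000";
  s180 <= signed(adc180) xor "10000000";

  -- resize sign-extends each sample to 32 bits before the subtraction.
  u <= resize(s0, 32) - resize(s180, 32);
end rtl;
```

This is purely combinational, so its delay counts against the 100 ns budget mentioned above; v would be formed the same way from adc90 and adc270.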
The switch unit is very simple: it has a control signal, s[1:0], that
comes from a different module, and the following table: 00 -> r=u,
i=v; 01 -> r=v, i=-u; 10 -> r=-u, i=-v; 11 -> r=-v, i=u. The s signal
is generated in a different clock domain and is stable 500 us before
enb. enb has a typical duration of 10 us. The code is at the end
of this message.
I continue to get errors, especially when the input values are close
to zero, which means that the result is changing from, say, FFFFFFFF to
00000001, so lots of bits change.
I have (I think!) traced the source of the error to the switch_unit
because, if I tie the s signal to a fixed value, 11 for example, the
unit works well, but if I connect it to a real s signal, I get errors. So
I thought this must be because the real s is noisy and r and i change
during the acquisition period (1mm ns), so I synchronized s with
acc_clk, but the problem persists. What is stranger is that, if I
do s <= "01" inside the synchronization process, I also get the same
type of errors.
Really, I don't know what to do next.
jmariano
=================
architecture archi of int_su is
begin
  process(u, v, s)
  begin
    case s is
      when "00" =>
        r <=  u;
        i <=  v;
      when "01" =>
        r <=  v;
        i <= -u;
      when "10" =>
        r <= -u;
        i <= -v;
      when "11" =>
        r <= -v;
        i <=  u;
      when others =>
        r <= (others => 'X');
        i <= (others => 'X');
    end case;
  end process;
end archi;
============
jmariano65 (8) | 7/5/2012 11:44:53 AM

I'm not entirely clear on your description of your design, but if you
are really generating clocks from the 50 MHz clock, I recommend that
inside the FPGA you instead use a single clock and generate clock
enables for the various functions. When you use multiple clocks in a
circuit, you have to do extra work for every signal that crosses a
clock domain. Could that be your problem?
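A clock enable in place of a divided clock could be sketched like this, assuming the 50 MHz board clock and the divide-by-10 ratio from the description. The names ce_div10, ce and count are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical clock-enable generator: one pulse every 10 cycles of clk.
entity ce_div10 is
  port (clk : in  std_logic;            -- assumed 50 MHz system clock
        ce  : out std_logic := '0');    -- high for one clk cycle at 5 MHz
end ce_div10;

architecture rtl of ce_div10 is
  signal count : integer range 0 to 9 := 0;
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if count = 9 then
        count <= 0;
        ce    <= '1';   -- registered, so it is aligned with clk
      else
        count <= count + 1;
        ce    <= '0';
      end if;
    end if;
  end process;
end rtl;
```

Every downstream process then stays on clk and simply wraps its body in if ce = '1' then ... end if; so there is only one clock domain inside the FPGA.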
I don't see anything in your original post about simulation. Do you
simulate your modules? I highly recommend that you write a test
bench for each and every module you code. You may think this takes
too much time, but I believe it pays off in the end with shorter
integration time.
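A minimal test bench for the int_accum entity from the first post might look like the sketch below. It assumes int_accum is compiled into the work library; the test bench name, the 20 ns clock period and the stimulus values are arbitrary choices for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity int_accum_tb is
end int_accum_tb;

architecture sim of int_accum_tb is
  signal clk  : std_logic := '0';
  signal clr  : std_logic := '1';
  signal enb  : std_logic := '0';
  signal d    : std_logic_vector(31 downto 0) := (others => '0');
  signal ovf  : std_logic;
  signal q    : std_logic_vector(31 downto 0);
  signal done : boolean := false;
begin
  dut : entity work.int_accum
    port map (clk => clk, clr => clr, enb => enb, d => d, ovf => ovf, q => q);

  -- Free-running clock that stops once the stimulus process is finished.
  clk <= not clk after 10 ns when not done else '0';

  stimulus : process
  begin
    wait for 25 ns;
    clr <= '0';                               -- release the asynchronous clear
    wait until falling_edge(clk);
    d   <= std_logic_vector(to_signed(-1, 32));
    enb <= '1';                               -- changed away from the rising edge
    for n in 1 to 3 loop
      wait until falling_edge(clk);           -- one accumulation per iteration
    end loop;
    enb <= '0';
    wait until falling_edge(clk);             -- one more edge with enb low: must hold
    assert signed(q) = to_signed(-3, 32)
      report "accumulator did not hold -3 after three additions of -1"
      severity error;
    done <= true;
    wait;
  end process;
end sim;
```

Further cases worth adding along the same lines are mixed positive and negative inputs, and a check on ovf.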
Rick
gnuarm (2644) | 7/5/2012 5:12:02 PM

On Wednesday, July 4, 2012 12:49:07 PM UTC-7, rickman wrote:
> On Jul 3, 5:45=A0pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> > On Monday, July 2, 2012 10:24:02 PM UTC-7, Tim Wescott wrote:
> > > On Mon, 02 Jul 2012 17:19:59 -0700, Ed McGettigan wrote:
> >
> > > > On Jul 2, 4:20=A0pm, jmariano <jmarian...@gmail.com> wrote:
> > > >> Dear All,
> >
> > > >> I'm not an expert in VHDL, i'm just a curious trying to solve a
> > > >> research problem with an FPGA.
> >
> > > >> I'm using a 32 bit accumulator in a IP, as part of a SoC project w=
ith a
> > > >> microblaze, implemented in a Digilent Spartan-3 SKB ( the FPGA is =
a
> > > >> Xilinx XC3S200). The code is included at the end of this message. =
=A0The
> > > >> input is a 32 bit signed integer coded in two's complement and the
> > > >> output also a 32 bit signed integer. What I would like the accumul=
ator
> > > >> to do is to accumulate synchronously with the rising edge of clk w=
hen
> > > >> enb=3D1 and maintain the result stable at the output when enb=3D0 =
( enb is
> > > >> a asynchronous signal generated elsewhere in the system)
> >
> > > > This is the key to your problem:
> >
> > > >> enb is a asynchronous signal generated elsewhere in the system
> >
> > > > You can't expect to take an asynchronous signal into multiple (32 in
> > > > this case) registers in a synchronous domain and expect that it will
> > > > work reliably. You need to first synchronize the asynchronous input to
> > > > the synchronous clock domain before you can use it.
> >
> > > Which means that you should latch enb in a register, with the same clock
> > > that you're using to twiddle your accumulator, and use the output of that
> > > register as your enable signal.
> >
> > > Paranoid logic designers will have a string of two or three registers to
> > > avoid metastability, but I've been told that's not necessary. (I'm not
> > > much of a logic designer).
> >
> > > --
> > > Tim Wescott
> > > Control system and signal processing consulting
> > >www.wescottdesign.com
> >
> > It isn't just the paranoid logic designer, it should be every logic
> > designer.
> >
> > A single register only partially solves the problem of an asynchronous
> > input with multiple register destinations, but it does not solve the very
> > real metastability problem. At least two registers should be used to
> > ensure that the metastability condition has resolved and with increasing
> > clock frequency and finer process nodes using three or more stages may be
> > necessary.
> >
> > Ed McGettigan
> > --
> > Xilinx Inc.
>
> Hi Ed. They way it was explained to me, I believe from Peter Alfke,
> is that what really resolves metastability is the slack time in a
> register to register path. Over the years FPGA process has resulted
> in FFs which only need a couple of ns to resolve metastability to 1 in
> a million operation years or something like that (I don't remember the
> metric, but it was good enough for anything I do). It doesn't matter
> that you have logic in that path, you just need those few ns in every
> part of the path. In theory, even if you use multiple registers with
> no logic, what really matters is the slack time in the path and that
> is not guaranteed even with no logic. So the design protocol should
> be to assure the slack time from the input register to all subsequent
> registers have sufficient slack time.
>
> Do you remember how much time that needs to be? I want to say 2 ns,
> but it might be more like 5 ns, I just can't recall. Of course it
> depends on your clock rates, but I believe Peter picked some more
> aggressive speeds like 100 MHz for his example.
>
> Rick
I'm glad to see that one of my 5-6 attempts to post was finally accepted
by Google. I have got to switch to something else.
Peter Alfke's publications on metastability definitely fall into the
seminal category, but you must be careful when extrapolating the original
data to the latest technology nodes, circuits and design requirements.
There are two major factors that impact the metastability equations: the
tau, or metastability decay rate, and the settling time.
The tau value is an inherent characteristic of the circuit and technology
node. For a long time the expectation was that it would decrease with
each generation, but this has stopped being true.
The settling time, Ts, is dependent on the design and is under the user's
control. Ts is a function of the destination clock frequency and the
timing slack between registers. If you have a 100 MHz clock frequency but
use up 9.5 ns to get to the destination, your slack is only 500 ps.
Adding register stages allows maximum use of the clock period, increasing
the settling time, and each additional stage increases it again.
Ed McGettigan
--
Xilinx Inc.
Posted by ed.mcgettigan, 7/5/2012 6:03:09 PM
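As a rough illustration of the register chain discussed in this exchange, here is a minimal sketch of a two-stage synchronizer for a single asynchronous bit such as enb. The entity and signal names (sync_2ff, async_in, meta, sync_out) are made up for this example and do not come from the original design. The sketch assumes the destination clock is the same clk that drives the accumulator.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: two-stage synchronizer for one asynchronous bit.
entity sync_2ff is
  port (
    clk      : in  std_logic;  -- destination clock
    async_in : in  std_logic;  -- asynchronous input, e.g. enb
    sync_out : out std_logic   -- copy that is safe to fan out in the clk domain
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta : std_logic := '0';  -- first stage, may go metastable
  signal sync : std_logic := '0';  -- second stage, drives the rest of the design
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta <= async_in;  -- sample the asynchronous input
      sync <= meta;      -- give the first stage one clock period to settle
    end if;
  end process;

  sync_out <= sync;
end architecture rtl;
```

Only sync_out should be fanned out to the accumulator's enable. Because the path from meta to sync contains no logic, nearly the whole clock period is available as settling time, provided the tools place the two flip-flops close together.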
Hi Rick, thanks for your help.
> I'm not real clear on your description of your design, but if you are
> really generating clocks from the 50 MHz, I recommend that inside the
> FPGA you instead use a single clock and generate clock enables for the
> various functions.
Yes, I generate a 5 MHz clock inside the module from the main 50 MHz
clock by simple division by 10 because I need a 5 MHz adc clock. I can't
use clock enable because the AD9058 adc does not have an enable input,
just clock.
> When you use multiple clocks in a circuit you have
> to do extra work for every signal that crosses a clock domain. Could
> that be your problem?
What is the extra work? I have no idea! Synchronization?
> I don't see anything in your original post about simulation. Do you
> simulate your modules? I highly recommend that you write a test
> bench for each and every module you code. You may think this takes
> too much time, but I believe it pays off in the end with shorter
> integration time.
Sorry about that, I did, in fact, simulate each module and the top
entity. The behavioral simulation gives the expected results, the post
place and route simulation gives some errors that I could not
understand, but I'll run the simulations again and post the results here.
jmariano
Posted by jmariano65, 7/5/2012 6:04:36 PM
On Jul 5, 11:04 am, jmariano <jmarian...@gmail.com> wrote:
> > I don't see anything in your original post about simulation. Do you
> > simulate your modules? I highly recommend that you write a test
> > bench for each and every module you code. You may think this takes
> > too much time, but I believe it pays off in the end with shorter
> > integration time.
>
> Sorry about that, I did, in fact, simulate each module and the top
> entity. The behavioral simulation gives the expected results, the post
> place and route simulation gives some errors that I could not
> understand, but I'll run the simulations again and post the results here.
>
> jmariano
The good news here is that you have a simulation that shows the same
behavior as the hardware. Looking at these simulation runs should tell
you exactly what the problem is. I don't think that anyone here will
be able to do the same without the full source code for the design.
Ed McGettigan
--
Xilinx Inc.
Posted by ed.mcgettigan, 7/5/2012 6:09:51 PM
On Jul 5, 2:03 pm, Ed McGettigan <ed.mcgetti...@xilinx.com> wrote:
> I'm glad to see that one of my 5-6 attempts to post was finally accepted
> by Google. I have got to switch to something else.
>
> Peter Alfke's publications on metastability definitely fall into the
> seminal category, but you must be careful when extrapolating the original
> data to the latest technology nodes, circuits and design requirements.
> There are two major factors that impact the metastability equations: the
> tau, or metastability decay rate, and the settling time.
>
> The tau value is an inherent characteristic of the circuit and technology
> node. For a long time the expectation was that it would decrease with
> each generation, but this has stopped being true.
>
> The settling time, Ts, is dependent on the design and is under the user's
> control. Ts is a function of the destination clock frequency and the
> timing slack between registers. If you have a 100 MHz clock frequency but
> use up 9.5 ns to get to the destination, your slack is only 500 ps.
> Adding register stages allows maximum use of the clock period, increasing
> the settling time, and each additional stage increases it again.
>
> Ed McGettigan
> --
> Xilinx Inc.
The info I am referring to are posts that were made here and pertained
to the "current" generation of some six or eight years ago. At that
time Peter made the point that the "tau" as you call it, had gotten so
fast that the impact was negligible for all but the most stringent
designs and only a small amount of slack time is needed.
A quick search found these two posts about V2Pro devices. I assume
your newer devices are at least as good as 10 year old technology.
Note that Peter makes a point that the capture window T0, which is a
product in the formula, is not an important parameter. Tau is an
exponent (in ratio with Tslack) in the formula and so makes much
larger contribution to the result. The same is true for the two clock
frequencies, they are just products in the formula and so don't make
huge changes to the MTBF.
So it seems like not much would have changed in 10 years in how a
designer should deal with metastability. Leaving 2 ns of slack time
in the first register to register path should make literally all
designs extremely robust regardless of how many registers are
receiving the first register output or if there is logic in the path.
Just make sure there is 2 ns slack time and your designs should be
good for many, many years!
Rick
====================================
Peter Alfke comp.arch.fpga Oct 10 2002, 8:40 pm
You mentioned metastability, and that caught my attention.
Metastability is a reality, but it (and the fear of it) is highly
overrated.
We recently tested Virtex-IIPro flip-flops, made on 130 nm technology.
You might call that cutting edge technology, but not exotic.
When a 330 MHz clock synchronized a ~50 MHz input, there was a 200 ps
extra metastable delay (causing a clock-to-out + short routing + set-up
total of 1.5 ns) once every second. That translates into a metastable
capture window that has a width of 3 ns divided by 100 million (since
we looked at both edges of the 50 MHz signal).
So the window for a 200 ps extra delay is 0.03 femtoseconds.
If you can tolerate 500 ps more, the MTBF increases 100 000 times, and
the capture window gets that much smaller.
Metastability is a real, but highly overrated problem.
Peter Alfke, Xilinx Applications
====================================
Peter Alfke comp.arch.fpga Oct 15 2002, 1:11 pm
Here are the K2 values for Virtex-IIPro:
CLB @1.50V: K2 = 27.2, i.e. 1/K2 = tau = 36.8 picoseconds
CLB @1.35V: K2 = 23.3, i.e. 1/K2 = tau = 42.9 picoseconds
CLB @1.65V: K2 = 35.7, i.e. 1/K2 = tau = 28.0 picoseconds
IOB @1.50V: K2 = 24.4, i.e. 1/K2 = tau = 41.0 picoseconds
IOB @1.35V: K2 = 19.24, i.e. 1/K2 = tau = 52.0 picoseconds
IOB @1.65V: K2 = 44.05, i.e. 1/K2 = tau = 22.7 picoseconds
For each extra 100 ps of acceptable metastable delay,
the MTBF increases by a factor 10.3 for CLB @ 1.35 V,
or a factor 6.85 for IOB @ 1.35 V.
Much better values, of course, at nominal or high Vcc.
Click on
http://support.xilinx.com/support/techxclusives/techX-home.htm
in early November.
Here is the worst-case data point:
50 MHz asynchronous data rate, 330 MHz clock, single-stage
synchronizer in IOB, Vcc = 1.35 V:
clock-to-Q + short routing + set-up time + metastable delay exceeds
clock period once per 30,000 years.
At nominal Vcc: once per 100 million years.
At a 250 MHz clock rate, delay exceeds clock period less often than
once per billion years.
Peter Alfke, Xilinx Applications
====================================
Posted by gnuarm, 7/5/2012 9:16:33 PM
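The formula Rick describes in words is usually written as below. The symbol names T_slack, f_clk and f_data are labels chosen here for illustration, and reading Peter's K2 as 1/tau in units of 1/ns follows his second quoted post.

```latex
\[
\mathrm{MTBF} = \frac{e^{T_{\mathrm{slack}}/\tau}}{T_0 \, f_{\mathrm{clk}} \, f_{\mathrm{data}}}
\]
% Each extra \Delta t of slack multiplies the MTBF by
% e^{\Delta t/\tau} = e^{K_2 \Delta t}. For \Delta t = 0.1 ns:
\[
\text{CLB @ 1.35 V: } e^{23.3 \times 0.1} = e^{2.33} \approx 10.3,
\qquad
\text{IOB @ 1.35 V: } e^{19.24 \times 0.1} = e^{1.924} \approx 6.85
\]
```

Both results agree with the factors Peter quotes for an extra 100 ps, which shows why the exponent dominates while T0 and the two frequencies only scale the result linearly.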
On Jul 5, 8:04 pm, jmariano <jmarian...@gmail.com> wrote:
> Hi Rick, thanks for your help.
>
> > I'm not real clear on your description of your design, but if you are
> > really generating clocks from the 50 MHz, I recommend that inside the
> > FPGA you instead use a single clock and generate clock enables for the
> > various functions.
>
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz
> clock by simple division by 10 because I need a 5 MHz adc clock. I can't
> use clock enable because the AD9058 adc does not have an enable input,
> just clock.
>
You could just have a state machine running at 50 MHz that grabs the
data and sets/clears the clock, which I guess is partly what you have
in your divide by 10.
-Lasse
Posted by langwadt1, 7/5/2012 9:48:43 PM
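Lasse's suggestion can be sketched roughly as follows: a counter in the 50 MHz domain produces the 5 MHz ADC clock as a registered output and, separately, a one-cycle strobe that the internal logic uses as a clock enable. All names here (adc_clk_gen, SAMPLE_AT, sample_stb) are hypothetical, and the count at which the ADC data is stable is an assumption that would have to be checked against the AD9058 datasheet timing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: divide-by-10 ADC clock plus a sample strobe,
-- both generated in the 50 MHz domain.
entity adc_clk_gen is
  generic (
    SAMPLE_AT : integer range 0 to 9 := 8  -- assumed count at which ADC data is stable
  );
  port (
    clk        : in  std_logic;  -- 50 MHz system clock
    adc_clk    : out std_logic;  -- 5 MHz, drives the ADC clock pin only
    sample_stb : out std_logic   -- high for one clk cycle per ADC period
  );
end entity adc_clk_gen;

architecture rtl of adc_clk_gen is
  signal count     : integer range 0 to 9 := 0;
  signal adc_clk_r : std_logic := '0';
  signal stb_r     : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Free-running modulo-10 counter
      if count = 9 then
        count <= 0;
      else
        count <= count + 1;
      end if;

      -- Registered 5 MHz square wave: high for five counts, low for five
      if count < 5 then
        adc_clk_r <= '1';
      else
        adc_clk_r <= '0';
      end if;

      -- One-cycle strobe used as a clock enable by the internal logic
      if count = SAMPLE_AT then
        stb_r <= '1';
      else
        stb_r <= '0';
      end if;
    end if;
  end process;

  adc_clk    <= adc_clk_r;
  sample_stb <= stb_r;
end architecture rtl;
```

With this arrangement nothing inside the FPGA is clocked by the 5 MHz signal, so there is no second clock domain to cross; the ADC outputs are simply registered on clk when sample_stb is high.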
On Thu, 05 Jul 2012 11:04:36 -0700, jmariano wrote:
> Hi Rick, thanks for your help.
>
>> I'm not real clear on your description of your design, but if you are
>> really generating clocks from the 50 MHz, I recommend that inside the
>> FPGA you instead use a single clock and generate clock enables for the
>> various functions.
>
> Yes, I generate a 5 MHz clock inside the module from the main 50 MHz
> clock by simple division by 10 because I need a 5 MHz adc clock. I can't
> use clock enable because the AD9058 adc does not have a enable input,
> just clock.
That's OK.
But you need to register the AD9058 outputs, inside the FPGA, to your
internal 50MHz clock. I would also register the S input and the U,V
outputs from the switch. (In fact I would make the switch a synch process
with only "clk" in its sensitivity list - it will effectively register
the switch outputs for you)
All these can be combined into a single synchronous process.
-- assuming u,v,r,i,adcnn are all signed!
process(clk)
begin
   if rising_edge(clk) then
      -- First pipe stage... synchronise the inputs
      if adc_enable then -- 10 MHz, when ADC is stable
         adc0_int <= adc0;
         ...
      end if;
      -- Second pipe stage... add the (synchronised inputs)
      u <= adc0_int - adc90_int;
      v <= ...
      -- I assume "s" has to be synchronised to "adcnn"
      -- so pipeline it to the same depth (also syncs it)
      s_int <= s;
      s_int2 <= s_int;
      -- Third pipe stage ... the switch
      case s_int2 is
         when "00" =>
            r <= u;
            i <= v;
         when "01" =>
            ...
      end case;
      -- etc
   end if;
end process;
Addition at 50MHz in a Spartan-3 should be no problem.
As your sample rate is 1/10 of the clock rate, I would expect you can
afford a few cycles for internal processing. (If this is not the case you
need to think carefully about how you pipeline the design)
>> I don't see anything in your original post about simulation.
> Sorry about that, I did, in fact, simulate each module and the top
> entity. The behavior simulation gives the expected results, the post and
> place simulation gives same errors that I could not understand,
Excellent.
Before changing the design, I would sim with low level zero-crossing
signals, and see which inputs (s, ADCs) and internal signals (U,V, R,I)
are unstable whenever the large unexpected outputs are occurring.
Then what you need to do to fix will be clear.
You can also install multiple versions in the testbench, asserting their
outputs are the same, and reporting any difference.
- Brian
Posted by brian7107, 7/6/2012 9:43:42 AM
In article <nZ-dnch1rrNvHG_SnZ2dnUVZ_qSdnZ2d@web-ster.com>,
Tim Wescott <tim@seemywebsite.please> writes:
>Paranoid logic designers will have a string of two or three registers to
>avoid metastability, but I've been told that's not necessary. (I'm not
>much of a logic designer).
Ahh, but are they paranoid enough?
The key is settling time.
In the old days of TTL chips, a pair of FFs (with no logic in between)
got you settling time of as much logic as the worst case delay for
the rest of the system. In practice, that was enough.
With FPGAs, routing is important. A pair of FFs close together
is probably good enough. If you put them on opposite sides of a big
chip, the routing delays may match the long path of the logic delays
and eat up all of your slack time.
Have any FPGA vendors published recent metastability info?
(Many thanks to Peter Alfke for all his good work in this area.)
I'm not a silicon wizard. Is it reasonable to simulate this stuff?
I'd like to know worst case rather than typicals. It should be possible
to do something like verify simulations with lab typicals and then
use simulations to find the numbers for the nasty corners.
--
These are my opinions. I hate spam.
Posted by hal-usenet, 7/7/2012 2:00:22 AM
On Jul 6, 10:00 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
Murray) wrote:
> I'm not a silicon wizard. Is it reasonable to simulate this stuff?
> I'd like to know worst case rather than typicals. It should be possible
> to do something like verify simulations with lab typicals and then
> use simulations to find the numbers for the nasty corners.
I'm not sure what you would want to simulate. Metastability is
probabilistic. For a given length of settling time there is
some probability of it happening. Increasing the settling time
reduces the probability but it will never be zero meaning there is no
max length of time it takes for the output of a metastable ff to
settle.
Is that what you are asking?
Rick
Posted by gnuarm, 7/8/2012 10:38:45 PM
Ed McGettigan <ed.mcgettigan@xilinx.com> wrote:
(snip)
>> >> But it does not work in this way, it behaves in a strange manner...
>> >> Some times I get the expected results but often I get strange values
>> >> (large when they should be small, often negative instead of positive,
>> >> etc.). If I look at the binary representation of the output, it looks
>> >> like if the output din't had time to sum and propagate to the output
>> >> again. In fact, the post place and route simulation shows that when the
>> >> enb signal goes to 0, the output stays in a undetermined condition (you
>> >> know, red line with XXXX).
(snip)
> It isn't just the paranoid logic designer, it should be every
> logic designer.
> A single register only partially solves the problem of an
> asynchronous input with multiple register destinations, but it
> does not solve the very real metastability problem.
> At least two registers should be used to ensure that the
> metastability condition has resolved and with increasing
> clock frequency and finer process nodes using three or more
> stages may be necessary.
Metastability can be a problem, but often the problem is clocking
multiple FFs off the same clock edge, with different delays on
either the clock or data. (The chance of the delays being exactly
equal is close to zero.) The two effects are different.
Note, for example, the common FIFO implementation using a
gray code counter (or binary to gray code converter).
That avoids the clock edge problem, as either value will
work correctly.
Metastability is a different problem, but one that also occurs
when using asynchronous input values.
-- glen
Posted by gah, 7/9/2012 5:25:31 AM
rickman <gnuarm@gmail.com> wrote:
(snip)
>> > Paranoid logic designers will have a string of two or three registers to
>> > avoid metastability, but I've been told that's not necessary. (I'm not
>> > much of a logic designer).
(snip)
> Hi Ed. They way it was explained to me, I believe from Peter Alfke,
> is that what really resolves metastability is the slack time in a
> register to register path. Over the years FPGA process has resulted
> in FFs which only need a couple of ns to resolve metastability to 1 in
> a million operation years or something like that (I don't remember the
> metric, but it was good enough for anything I do). It doesn't matter
> that you have logic in that path, you just need those few ns in every
> part of the path. In theory, even if you use multiple registers with
> no logic, what really matters is the slack time in the path and that
> is not guaranteed even with no logic. So the design protocol should
> be to assure the slack time from the input register to all subsequent
> registers have sufficient slack time.
I suppose that is true, but really it shouldn't be a problem.
It is usual for many systems to clock as fast as you can,
consistent with the critical path delay. As metastability
is exponential, even a slightly shorter delay is usually enough
to make enough difference in the exponent.
That assumes that there is a FF to FF path that is faster than
the FF logic FF path. I believe that is usual for FPGAs, but
if you manage to get a critical path with only one LUT, then
I am not so sure. But that is pretty hard in most real systems.
> Do you remember how much time that needs to be? I want to say 2 ns,
> but it might be more like 5 ns, I just can't recall. Of course it
> depends on your clock rates, but I believe Peter picked some more
> aggressive speeds like 100 MHz for his example.
I would expect most systems to have at least a 10% margin.
That is, the clock period is at least 10% longer than the
critical path delay. Probably closer to 20%, but maybe 10%.
So, with a 10ns clock there might be only 1ns slack.
Assuming some delay, say 1ns minimum from FF to FF, that
has nine times the slack, and that is in an exponent.
-- glen
Posted by gah, 7/9/2012 5:38:51 AM
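glen's "nine times the slack" remark can be put into numbers. The sketch below assumes, purely for illustration, the tau of about 42.9 ps that Peter Alfke reported for Virtex-IIPro CLB flip-flops at 1.35 V; a Spartan-3 will have a different value, so only the order of magnitude is meaningful.

```latex
% Critical path: 1 ns slack. Direct FF-to-FF path with 1 ns delay: 9 ns slack.
% T_0 and the two frequencies cancel in the ratio.
\[
\frac{\mathrm{MTBF}_{9\,\mathrm{ns}}}{\mathrm{MTBF}_{1\,\mathrm{ns}}}
  = e^{(9\,\mathrm{ns} - 1\,\mathrm{ns})/\tau}
  = e^{8000\,\mathrm{ps}/42.9\,\mathrm{ps}}
  \approx e^{186.5}
  \approx 10^{81}
\]
```

Under that assumption the direct register-to-register path is more reliable by roughly eighty orders of magnitude, which is the sense in which the extra slack "is in an exponent".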
jmariano <jmariano65@gmail.com> wrote:
(snip)
> Thank you very much for your input and sorry for the late reply.
> It is really great to be able to get the opinion of such experts,
> specially since, at my current location and in a radius of some 200
> km, I must be the only person working with FPGA and VHDL! I'm also
> glad that the discussion as evolved to levels of complexity far beyond
> my knowledge.
> I was hoping that by now I would be able to say that the thing was
> working as expected but, unfortunately, no.
(snip)
> Here's the full story: I'm implementing a gated integrator, as a part
> of a boxcar averager. This is the standard noise reduction technique
> used in nuclear magnetic resonance (nmr). This is research, not a
> commercial product! The module gets is data from 4 8 bits ADC's at 5
> MHz (adc0, adc90, adc180, adc270) and accumulates wile enb=1. enb is
> generated in a different module. The module does this:
(big snip)
I believe that most FPGA families have FFs with clock enable.
Be sure that you are writing your logic in such a way that
the tools figure that out. In most cases, I believe that means
not writing it as a gated clock. Write it as FF's with enable.
(I know how to write it in verilog but not VHDL.)
-- glen
Posted by gah, 7/9/2012 5:42:12 AM
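For the VHDL side of glen's advice, a minimal sketch of an accumulator written as flip-flops with a clock enable might look like this. It is not the original poster's design: the entity name int_accum_ce and the port ce are invented here, the clear is made synchronous as a design choice for this example, the ovf output is left out, and ce is assumed to be already synchronous to clk.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: 32-bit signed accumulator using a clock enable
-- rather than a gated clock.
entity int_accum_ce is
  port (
    clk : in  std_logic;
    clr : in  std_logic;                      -- synchronous clear (choice made here)
    ce  : in  std_logic;                      -- enable, assumed synchronous to clk
    d   : in  std_logic_vector(31 downto 0);  -- two's complement input
    q   : out std_logic_vector(31 downto 0)   -- running sum, wraps on overflow
  );
end entity int_accum_ce;

architecture rtl of int_accum_ce is
  signal acc : signed(31 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if clr = '1' then
        acc <= (others => '0');
      elsif ce = '1' then
        acc <= acc + signed(d);  -- accumulate only while enabled
      end if;
      -- When ce = '0' nothing is assigned, so acc holds its value.
    end if;
  end process;

  q <= std_logic_vector(acc);
end architecture rtl;
```

Writing the enable as a condition inside the clocked process, instead of ANDing it with the clock, is the form that lets the tools map ce onto the flip-flops' enable input. Note also that signed(d) keeps the sign of negative inputs, whereas prepending '0' as in the original post zero-extends them; that difference affects only the extra 33rd bit, since the low 32 bits are the same modulo 2^32.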
Hal Murray <hal-usenet@ip-64-139-1-69.sjc.megapath.net> wrote:
(snip)
> With FPGAs, routing is important. A pair of FFs close together
> is probably good enough. If you put them on opposite sides of a big
> chip, the routing delays may match the long path of the logic delays
> and eat up all of your slack time.
That is a good question. I usually assume that they won't have
a long route, but that might not be a good assumption.
Some time ago, I was working on a small design in a very large FPGA.
Expanding to fill the available space, things were very far apart.
(And, as I had so much space, I put three FFs in to synchronize,
but with long enough routes even that could fail.)
> Have any FPGA vendors published recent metastability info?
> (Many thanks to Peter Alfke for all his good work in this area.)
> I'm not a silicon wizard. Is it reasonable to simulate this stuff?
> I'd like to know worst case rather than typicals. It should be possible
> to do something like verify simulations with lab typicals and then
> use simulations to find the numbers for the nasty corners.
As I noted previously, though, often the problem isn't metastability
but multiple FFs on the same asynchronous clock. Different problem.
-- glen
Posted by gah, 7/9/2012 6:12:51 AM
rickman <gnuarm@gmail.com> wrote:
(snip)
> I'm not sure what you would want to simulate. Metastability is
> probabilistic. For a given length of settling time there is
> some probability of it happening. Increasing the settling time
> reduces the probability but it will never be zero meaning there is no
> max length of time it takes for the output of a metastable ff to
> settle.
A favorite statistical physics problem is calculating the
probability that all the air molecules will move to one half
of a room. There are many other problems with a very small,
but non-zero, probability.
-- glen
Posted by gah, 7/9/2012 6:15:20 AM
Dear All,
I would like to thank you all for your contributions. I finally solved
the problem. It was not in the code, as I had immediately assumed since
I'm not very experienced in VHDL, but rather in my misinterpretation of
the AD9058's datasheet. I feel very stupid!
It was thanks to all your comments that I finally decided to rethink
the project as a whole and spotted the problem.
"God saves the internet and the good people that lives there"
jmariano
Posted by jmariano65, 7/9/2012 3:36:05 PM
On Sun, 08 Jul 2012 15:38:45 -0700, rickman wrote:
> I'm not sure what you would want to simulate. Metastability is
> probabilistic. For a given length of settling time there is
> some probability of it happening. Increasing the settling time reduces
> the probability but it will never be zero meaning there is no max length
> of time it takes for the output of a metastable ff to settle.
The drivers of metastability are probabilistic, yes. But given enough
information you could certainly simulate the positive feedback loop that
is a flip-flop.
I suspect that unless the ball that is the flip-flop state is poised
right on the top of the mountain between the Valley of Zero and the
Valley of One, that the problem is mostly deterministic. It's only when
the after-strobe balance is perfect and the gain is so low that the FF
voltage is affected more by noise than by actual circuit forces that the
problem would remain probabilistic _after_ the strobe happened.
"Enough information", in this case, would involve a whole lot of deep
knowledge of the inner workings of the FPGA, and the simulation would be
an analog circuits problem. So I suspect that you couldn't do it for any
specific part unless you worked at the company in question.
--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?
Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
Posted by tim177, 7/9/2012 6:35:11 PM
On Jul 9, 1:38 am, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> rickman <gnu...@gmail.com> wrote:
>
> (snip)
>
> >> > Paranoid logic designers will have a string of two or three registers to
> >> > avoid metastability, but I've been told that's not necessary. (I'm not
> >> > much of a logic designer).
>
> (snip)
>
> > Hi Ed. They way it was explained to me, I believe from Peter Alfke,
> > is that what really resolves metastability is the slack time in a
> > register to register path. Over the years FPGA process has resulted
> > in FFs which only need a couple of ns to resolve metastability to 1 in
> > a million operation years or something like that (I don't remember the
> > metric, but it was good enough for anything I do). It doesn't matter
> > that you have logic in that path, you just need those few ns in every
> > part of the path. In theory, even if you use multiple registers with
> > no logic, what really matters is the slack time in the path and that
> > is not guaranteed even with no logic. So the design protocol should
> > be to assure the slack time from the input register to all subsequent
> > registers have sufficient slack time.
>
> I suppose that is true, but really it shouldn't be a problem.
> It is usual for many systems to clock as fast as you can,
> consistent with the critical path delay. As metastability
> is exponential, even a slightly shorter delay is usually enough
> to make enough difference in the exponent.
>
> That assumes that there is a FF to FF path that is faster than
> the FF logic FF path. I believe that is usual for FPGAs, but
> if you manage to get a critical path with only one LUT, then
> I am not so sure. But that is pretty hard in most real systems.
>
> > Do you remember how much time that needs to be? I want to say 2 ns,
> > but it might be more like 5 ns, I just can't recall. Of course it
> > depends on your clock rates, but I believe Peter picked some more
> > aggressive speeds like 100 MHz for his example.
>
> I would expect most systems to have at least a 10% margin.
> That is, the clock period is at least 10% longer than the
> critical path delay. Probably closer to 20%, but maybe 10%.
> So, with a 10ns clock there might be only 1ns slack.
> Assuming some delay, say 1ns minimum from FF to FF, that
> has nine times the slack, and that is in an exponent.
>
> -- glen
You keep talking about the critical path delay as if the metastable
input is driving the critical path. There is only one critical path
in a design normally. All other paths are faster. Are you assuming
that all paths have the same amount of delay?
Regardless, all I am saying is that you don't need to use a path that
has no logic to obtain *enough* slack to give enough settling time to
the metastable input. But in all cases you need to verify this. As
mentioned in another post, Peter Alfke's numbers show that you only
need about 2 ns to get 100 million years MTBF. Of course whether this
is good enough depends on just how reliable your systems have to be
and how many there are. It is 100 million years for one unit, but for
10 million units it will only be 10 years MTBF for the group.
Rick
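
A worked version of the fleet arithmetic above, as an illustration rather than part of the original post. It assumes failures are independent and that each unit has the same constant failure rate, so the rates of N units simply add:

```latex
\[
\mathrm{MTBF}_{\text{fleet}}
  = \frac{\mathrm{MTBF}_{\text{unit}}}{N}
  = \frac{10^{8}\ \text{years}}{10^{7}\ \text{units}}
  = 10\ \text{years}
\]
```
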
Posted by gnuarm, 7/10/2012 12:14:12 AM

On Jul 9, 11:36 am, jmariano <jmarian...@gmail.com> wrote:
> Dear All,
>
> I would like to thank you all for your contributions. I finally solved the
> problem, which was not in the code, as I had immediately assumed since I'm not
> very experienced in VHDL, but rather in my misinterpretation of the AD9058's
> datasheet. I feel very stupid!
>
> It was thanks to all your comments that I decided to finally rethink the
> project as a whole and spotted the problem.
>
> "God saves the internet and the good people that lives there"
>
> jmariano
Don't think of it as a stupid mistake, think of it as a "good
catch"!
Rick
Posted by gnuarm, 7/10/2012 12:16:01 AM

On Jul 9, 2:35 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> On Sun, 08 Jul 2012 15:38:45 -0700, rickman wrote:
> > On Jul 6, 10:00 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
> > Murray) wrote:
> >> In article <nZ-dnch1rrNvHG_SnZ2dnUVZ_qSdn...@web-ster.com>,
> >> Tim Wescott <t...@seemywebsite.please> writes:
>
> >> >Paranoid logic designers will have a string of two or three registers
> >> >to avoid metastability, but I've been told that's not necessary. (I'm
> >> >not much of a logic designer).
>
> >> Ahh, but are they paranoid enough?
>
> >> The key is settling time.
>
> >> In the old days of TTL chips, a pair of FFs (with no logic in between)
> >> got you settling time of as much logic as the worst case delay for the
> >> rest of the system. In practice, that was enough.
>
> >> With FPGAs, routing is important. A pair of FFs close together is
> >> probably good enough. If you put them on opposite sides of a big chip,
> >> the routing delays may match the long path of the logic delays and eat
> >> up all of your slack time.
>
> >> Have any FPGA vendors published recent metastability info? (Many thanks
> >> to Peter Alfke for all his good work in this area.)
>
> >> I'm not a silicon wizard. Is it reasonable to simulate this stuff? I'd
> >> like to know worst case rather than typicals. It should be possible to
> >> do something like verify simulations with lab typicals and then use
> >> simulations to find the numbers for the nasty corners.
>
> > I'm not sure what you would want to simulate. Metastability is
> > probabilistic. For a given length of settling time there is
> > some probability of it happening. Increasing the settling time reduces
> > the probability but it will never be zero meaning there is no max length
> > of time it takes for the output of a metastable ff to settle.
>
> The drivers of metastability are probabilistic, yes. But given enough
> information you could certainly simulate the positive feedback loop that
> is a flip-flop.
>
> I suspect that unless the ball that is the flip-flop state is poised
> right on the top of the mountain between the Valley of Zero and the
> Valley of One, that the problem is mostly deterministic. It's only when
> the after-strobe balance is perfect and the gain is so low that the FF
> voltage is affected more by noise than by actual circuit forces that the
> problem would remain probabilistic _after_ the strobe happened.
>
> "Enough information", in this case, would involve a whole lot of deep
> knowledge of the inner workings of the FPGA, and the simulation would be
> an analog circuits problem. So I suspect that you couldn't do it for any
> specific part unless you worked at the company in question.
>
> --
> My liberal friends think I'm a conservative kook.
> My conservative friends think I'm a liberal kook.
> Why am I not happy that they have found common ground?
>
> Tim Wescott, Communications, Control, Circuits & Software
> http://www.wescottdesign.com
That's what probability is all about, dealing with the lack of
knowledge. You don't know the exact voltage of the input when the
clock edge changed and you don't know how fast either signal was
changing... etc. But you do know how often you expect all of these
events to line up to produce metastability and you know the
distribution of delay is a logarithmic taper.
I won't try to argue about how many angels can dance on the head of a
pin, but I have no information to show me that the formula that Peter
used is not accurate, even for extreme cases.
Rick
Posted by gnuarm, 7/10/2012 12:20:26 AM

rickman <gnuarm@gmail.com> wrote:
(snip, I wrote)
>> I suppose that is true, but really it shouldn't be a problem.
>> It is usual for many systems to clock as fast as you can,
>> consistent with the critical path delay. As metastability
>> is exponential, even a slightly shorter delay is usually enough
>> to make enough difference in the exponent.
>> That assumes that there is a FF to FF path that is faster than
>> the FF logic FF path. I believe that is usual for FPGAs, but
>> if you manage to get a critical path with only one LUT, then
>> I am not so sure. But that is pretty hard in most real systems.
(snip)
> You keep talking about the critical path delay as if the metastable
> input is driving the critical path. There is only one critical path
> in a design normally. All other paths are faster. Are you assuming
> that all paths have the same amount of delay?
No, but it might be that many have about the same delay. Well,
my favorite things to design are systolic arrays, where there is
the same logic (though with different routing, different delay)
between a large number of FFs.
For any pipelined processor, the most efficient logic has about
the same delay between successive registers.
> Regardless, all I am saying is that you don't need to use a path that
> has no logic to obtain *enough* slack to give enough settling time to
> the metastable input. But in all cases you need to verify this.
Yes. One would hope that no logic would have the shortest delay,
though in the case of FPGAs, you might not be able to count on that.
> As mentioned in another post, Peter Alfke's numbers show that you only
> need about 2 ns to get 100 million years MTBF. Of course whether this
> is good enough depends on just how reliable your systems have to be
> and how many there are. It is 100 million years for one unit, but for
> 10 million units it will only be 10 years MTBF for the group.
I have done designs with at most two LUTs between registers,
and might even be able to do one.
-- glen
Posted by gah, 7/10/2012 1:23:45 AM

On Mon, 09 Jul 2012 17:20:26 -0700, rickman wrote:
> On Jul 9, 2:35 pm, Tim Wescott <t...@seemywebsite.com> wrote:
>> On Sun, 08 Jul 2012 15:38:45 -0700, rickman wrote:
>> > On Jul 6, 10:00 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
>> > Murray) wrote:
>> >> In article <nZ-dnch1rrNvHG_SnZ2dnUVZ_qSdn...@web-ster.com>,
>> >> Tim Wescott <t...@seemywebsite.please> writes:
>>
>> >> >Paranoid logic designers will have a string of two or three
>> >> >registers to avoid metastability, but I've been told that's not
>> >> >necessary. (I'm not much of a logic designer).
>>
>> >> Ahh, but are they paranoid enough?
>>
>> >> The key is settling time.
>>
>> >> In the old days of TTL chips, a pair of FFs (with no logic in
>> >> between) got you settling time of as much logic as the worst case
>> >> delay for the rest of the system. In practice, that was enough.
>>
>> >> With FPGAs, routing is important. A pair of FFs close together is
>> >> probably good enough. If you put them on opposite sides of a big
>> >> chip, the routing delays may match the long path of the logic delays
>> >> and eat up all of your slack time.
>>
>> >> Have any FPGA vendors published recent metastability info? (Many
>> >> thanks to Peter Alfke for all his good work in this area.)
>>
>> >> I'm not a silicon wizard. Is it reasonable to simulate this stuff?
>> >> I'd like to know worst case rather than typicals. It should be
>> >> possible to do something like verify simulations with lab typicals
>> >> and then use simulations to find the numbers for the nasty corners.
>>
>> > I'm not sure what you would want to simulate. Metastability is
>> > probabilistic. For a given length of settling time there is
>> > some probability of it happening. Increasing the settling time
>> > reduces the probability but it will never be zero meaning there is no
>> > max length of time it takes for the output of a metastable ff to
>> > settle.
>>
>> The drivers of metastability are probabilistic, yes. But given enough
>> information you could certainly simulate the positive feedback loop
>> that is a flip-flop.
>>
>> I suspect that unless the ball that is the flip-flop state is poised
>> right on the top of the mountain between the Valley of Zero and the
>> Valley of One, that the problem is mostly deterministic. It's only
>> when the after-strobe balance is perfect and the gain is so low that
>> the FF voltage is affected more by noise than by actual circuit forces
>> that the problem would remain probabilistic _after_ the strobe
>> happened.
>>
>> "Enough information", in this case, would involve a whole lot of deep
>> knowledge of the inner workings of the FPGA, and the simulation would
>> be an analog circuits problem. So I suspect that you couldn't do it
>> for any specific part unless you worked at the company in question.
>>
>> --
>> My liberal friends think I'm a conservative kook. My conservative
>> friends think I'm a liberal kook. Why am I not happy that they have
>> found common ground?
>>
>> Tim Wescott, Communications, Control, Circuits & Software
>> http://www.wescottdesign.com
>
> That's what probability is all about, dealing with the lack of
> knowledge. You don't know the exact voltage of the input when the clock
> edge changed and you don't know how fast either signal was changing...
> etc. But you do know how often you expect all of these events to line
> up to produce metastability and you know the distribution of delay is a
> logarithmic taper.
>
> I won't try to argue about how many angels can dance on the head of a
> pin, but I have no information to show me that the formula that Peter
> used is not accurate, even for extreme cases.
Well, first, I wasn't trying to contradict you -- I just picked the wrong
place in the thread to answer Hal's question.
And second, before you can know the necessary inputs to your statistical
calculations, you need to do some simulating to see how long it takes for
the state to come down from various places on the mountaintop.
The difference between a circuit that has a narrow & sharp potential peak
vs. one that has a wide, flat, broad one is significant.
(One that had a true stable spot at 1/2 voltage would be mucho worse, but
that's not too likely in this day and age).
--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?
Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
Posted by tim177, 7/10/2012 1:47:47 AM

Tim Wescott <tim@seemywebsite.com> wrote:
(snip, someone wrote)
>> That's what probability is all about, dealing with the lack of
>> knowledge. You don't know the exact voltage of the input when the clock
>> edge changed and you don't know how fast either signal was changing...
>> etc. But you do know how often you expect all of these events to line
>> up to produce metastability and you know the distribution of delay is a
>> logarithmic taper.
(snip)
> Well, first, I wasn't trying to contradict you -- I just picked the wrong
> place in the thread to answer Hal's question.
> And second, before you can know the necessary inputs to your statistical
> calculations, you need to do some simulating to see how long it takes for
> the state to come down from various places on the mountaintop.
> The difference between a circuit that has a narrow & sharp potential peak
> vs. one that has a wide, flat, broad one is significant.
Story I heard some years ago, the sharper and narrower the peak,
the harder it is to get into the metastable state, but the
longer it stays when it actually gets there.
-- glen
Posted by gah, 7/10/2012 4:12:21 AM

On Tue, 10 Jul 2012 04:12:21 +0000, glen herrmannsfeldt wrote:
> Tim Wescott <tim@seemywebsite.com> wrote:
>
> (snip, someone wrote)
>>> That's what probability is all about, dealing with the lack of
>>> knowledge. You don't know the exact voltage of the input when the
>>> clock edge changed and you don't know how fast either signal was
>>> changing... etc. But you do know how often you expect all of these
>>> events to line up to produce metastability and you know the
>>> distribution of delay is a logarithmic taper.
>
>
> (snip)
>> Well, first, I wasn't trying to contradict you -- I just picked the
>> wrong place in the thread to answer Hal's question.
>
>> And second, before you can know the necessary inputs to your
>> statistical calculations, you need to do some simulating to see how
>> long it takes for the state to come down from various places on the
>> mountaintop.
>
>> The difference between a circuit that has a narrow & sharp potential
>> peak vs. one that has a wide, flat, broad one is significant.
>
> Story I heard some years ago, the sharper and narrower the peak,
> the harder it is to get into the metastable state, but the longer it
> stays when it actually gets there.
Wow. That's counter-intuitive. I would think that the sharper the peak
the less likely that the device would be stuck without knowing which way
to fall.
--
Tim Wescott
Control system and signal processing consulting
www.wescottdesign.com
Posted by tim866, 7/10/2012 3:34:26 PM

Tim Wescott <tim@seemywebsite.please> wrote:
(snip, I wrote)
>> Story I heard some years ago, the sharper and narrower the peak,
>> the harder it is to get into the metastable state, but the longer it
>> stays when it actually gets there.
> Wow. That's counter-intuitive. I would think that the sharper the peak
> the less likely that the device would be stuck without knowing which way
> to fall.
First, remember that it is conditional on actually getting it
to the metastable state.
I don't know if it is convincing or not, but consider balancing
a knife on its edge on a table. You have a sharp and dull knife.
Once you get the sharp knife balanced, it will make a deeper
impression into the table and so stay up longer.
For the actual physics, there are some symmetries that require
some correlations in the probability of getting into, and getting
out of, a certain state. If you get it wrong, then energy
conservation fails.
There is an old favorite, of putting a dark and a light colored
object in a mirrored room. (Consider an ellipsoidal mirror
with two spheres at the foci.) Now, consider the effect of
black body radiation with a black and a white sphere.
The black sphere absorbs most radiation (mostly IR light)
but the white one doesn't absorb as much. Conservation of
energy requires that the black one emit more black body
radiation (that is where the name comes from). If not,
the black one would get warmer, and you could extract
energy from the temperature difference.
Note that this is why heat sinks are (usually) black.
(To get a connection to DSP.)
Warm objects have more electrons in higher (metastable) states.
-- glen
Posted by gah, 7/10/2012 7:16:36 PM

rickman wrote:
> Regardless, all I am saying is that you don't need to use a path that
> has no logic to obtain *enough* slack to give enough settling time to
> the metastable input.
Well, one thing that I learned on this group is that metastability
is not the most likely problem, it is time skew. If an unsynchronized
input is fed to a number of LUTs scattered around the chip, it can have
several ns of skew between them. The clocks have tightly controlled
skew, so the unsynched input can be sensed differently at two locations.
I ran into this on a state machine and it caused the state logic to go
to undefined states. This was finally explained, I think by one of the
guys at Xilinx, who said that it can have a thousand times higher
probability than true metastability of a single FF.
Jon
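
One common way to avoid the skew problem Jon describes is to register the asynchronous input once in the clock domain (twice, to leave settling time) and fan out only the registered copy. A minimal sketch, assuming a single-bit input; the entity name sync_2ff and its port names are hypothetical, not from the thread:

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper (not from the thread): two-stage synchronizer
-- for a single-bit asynchronous input.
entity sync_2ff is
  port (
    clk      : in  std_logic;
    async_in : in  std_logic;  -- e.g. the asynchronous enb signal
    sync_out : out std_logic   -- registered copy, safe to fan out
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta_ff : std_logic := '0';  -- first stage, may go metastable
  signal sync_ff : std_logic := '0';  -- second stage, settled copy
begin
  process (clk)
  begin
    if rising_edge(clk) then
      meta_ff <= async_in;  -- sample the asynchronous input exactly once
      sync_ff <= meta_ff;   -- one clock period of settling time
    end if;
  end process;

  sync_out <= sync_ff;
end architecture rtl;
```

In the accumulator from the first post, enb would pass through something like this and the clock-enable test would use sync_out, so that every flip-flop of tmp sees the same enable value on a given clock edge. The path between the two stages still needs its slack checked, as discussed earlier in the thread.
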
Posted by jmelson1, 7/10/2012 8:35:40 PM

On Jul 9, 9:47 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> On Mon, 09 Jul 2012 17:20:26 -0700, rickman wrote:
> > On Jul 9, 2:35 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> >> On Sun, 08 Jul 2012 15:38:45 -0700, rickman wrote:
> >> > On Jul 6, 10:00 pm, hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal
> >> > Murray) wrote:
> >> >> In article <nZ-dnch1rrNvHG_SnZ2dnUVZ_qSdn...@web-ster.com>,
> >> >> Tim Wescott <t...@seemywebsite.please> writes:
>
> >> >> >Paranoid logic designers will have a string of two or three
> >> >> >registers to avoid metastability, but I've been told that's not
> >> >> >necessary. (I'm not much of a logic designer).
>
> >> >> Ahh, but are they paranoid enough?
>
> >> >> The key is settling time.
>
> >> >> In the old days of TTL chips, a pair of FFs (with no logic in
> >> >> between) got you settling time of as much logic as the worst case
> >> >> delay for the rest of the system. In practice, that was enough.
>
> >> >> With FPGAs, routing is important. A pair of FFs close together is
> >> >> probably good enough. If you put them on opposite sides of a big
> >> >> chip, the routing delays may match the long path of the logic delays
> >> >> and eat up all of your slack time.
>
> >> >> Have any FPGA vendors published recent metastability info? (Many
> >> >> thanks to Peter Alfke for all his good work in this area.)
>
> >> >> I'm not a silicon wizard. Is it reasonable to simulate this stuff?
> >> >> I'd like to know worst case rather than typicals. It should be
> >> >> possible to do something like verify simulations with lab typicals
> >> >> and then use simulations to find the numbers for the nasty corners.
>
> >> > I'm not sure what you would want to simulate. Metastability is
> >> > probabilistic. For a given length of settling time there is
> >> > some probability of it happening. Increasing the settling time
> >> > reduces the probability but it will never be zero meaning there is no
> >> > max length of time it takes for the output of a metastable ff to
> >> > settle.
>
> >> The drivers of metastability are probabilistic, yes. But given enough
> >> information you could certainly simulate the positive feedback loop
> >> that is a flip-flop.
>
> >> I suspect that unless the ball that is the flip-flop state is poised
> >> right on the top of the mountain between the Valley of Zero and the
> >> Valley of One, that the problem is mostly deterministic. It's only
> >> when the after-strobe balance is perfect and the gain is so low that
> >> the FF voltage is affected more by noise than by actual circuit forces
> >> that the problem would remain probabilistic _after_ the strobe
> >> happened.
>
> >> "Enough information", in this case, would involve a whole lot of deep
> >> knowledge of the inner workings of the FPGA, and the simulation would
> >> be an analog circuits problem. So I suspect that you couldn't do it
> >> for any specific part unless you worked at the company in question.
>
> >> --
> >> My liberal friends think I'm a conservative kook. My conservative
> >> friends think I'm a liberal kook. Why am I not happy that they have
> >> found common ground?
>
> >> Tim Wescott, Communications, Control, Circuits & Software
> >> http://www.wescottdesign.com
>
> > That's what probability is all about, dealing with the lack of
> > knowledge. You don't know the exact voltage of the input when the clock
> > edge changed and you don't know how fast either signal was changing...
> > etc. But you do know how often you expect all of these events to line
> > up to produce metastability and you know the distribution of delay is a
> > logarithmic taper.
>
> > I won't try to argue about how many angels can dance on the head of a
> > pin, but I have no information to show me that the formula that Peter
> > used is not accurate, even for extreme cases.
>
> Well, first, I wasn't trying to contradict you -- I just picked the wrong
> place in the thread to answer Hal's question.
>
> And second, before you can know the necessary inputs to your statistical
> calculations, you need to do some simulating to see how long it takes for
> the state to come down from various places on the mountaintop.
>
> The difference between a circuit that has a narrow & sharp potential peak
> vs. one that has a wide, flat, broad one is significant.
>
> (One that had a true stable spot at 1/2 voltage would be mucho worse, but
> that's not too likely in this day and age).
>
> --
> My liberal friends think I'm a conservative kook.
> My conservative friends think I'm a liberal kook.
> Why am I not happy that they have found common ground?
>
> Tim Wescott, Communications, Control, Circuits & Software
> http://www.wescottdesign.com
Sorry if my tone sounded like I was offended at all, I'm not. I was
just trying to make the point that you don't know the shape of the
"mountain" the ball is balanced on and I doubt a simulation could
model it very well. But that is outside my expertise so if I am
wrong...
But I still fail to see how that shape would affect anything
significantly. Unless it has flat spots or even indentations that
were local minima, what would the shape change? It would most likely
only change the speed at which the ball falls off the "mountain" which
is part of what is measured when they characterize a device the way
Peter Alfke did.
Still, even if there are some abnormalities in the shape of the
"mountain", is that really important? The goal is to get the
probability of failure so low that you just don't have to think about it.
If the shape changes the probability by a factor of 10 either way it
shouldn't be a problem. Just add another 200 ps to the slack and
get another order of magnitude in the MTBF. Or was it 100 ps?
Rick
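
The "another order of magnitude per 100 or 200 ps" rule of thumb follows from the exponential model usually quoted for synchronizer MTBF. The thread does not spell the formula out, so the form below is the commonly used one rather than necessarily the exact one Peter Alfke used, and the symbols (settling time t_r, resolution time constant tau, device constant T_0, clock and data rates f_clk and f_data) are assumptions for illustration:

```latex
\[
\mathrm{MTBF} = \frac{e^{\,t_r/\tau}}{T_0\, f_{\text{clk}}\, f_{\text{data}}}
\qquad\Longrightarrow\qquad
\frac{\mathrm{MTBF}(t_r + \Delta t)}{\mathrm{MTBF}(t_r)} = e^{\,\Delta t/\tau}
\]
\[
e^{\,\Delta t/\tau} = 10
\;\Longleftrightarrow\;
\Delta t = \tau \ln 10 \approx 2.3\,\tau
\]
```

Under this model, a factor of 10 per 200 ps would correspond to tau of roughly 87 ps, and a factor of 10 per 100 ps to roughly 43 ps; which one applies depends on the device.
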
Posted by gnuarm, 7/10/2012 11:22:40 PM

On Jul 9, 9:23 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> rickman <gnu...@gmail.com> wrote:
>
> (snip, I wrote)
>
> >> I suppose that is true, but really it shouldn't be a problem.
> >> It is usual for many systems to clock as fast as you can,
> >> consistent with the critical path delay. As metastability
> >> is exponential, even a slightly shorter delay is usually enough
> >> to make enough difference in the exponent.
> >> That assumes that there is a FF to FF path that is faster than
> >> the FF logic FF path. I believe that is usual for FPGAs, but
> >> if you manage to get a critical path with only one LUT, then
> >> I am not so sure. But that is pretty hard in most real systems.
>
> (snip)
>
> > You keep talking about the critical path delay as if the metastable
> > input is driving the critical path. There is only one critical path
> > in a design normally. All other paths are faster. Are you assuming
> > that all paths have the same amount of delay?
>
> No, but it might be that many have about the same delay. Well,
> my favorite things to design are systolic arrays, where there is
> the same logic (though with different routing, different delay)
> between a large number of FFs.
>
> For any pipelined processor, the most efficient logic has about
> the same delay between successive registers.
We are still not talking about the same thing. The *max* delay in
each stage will be roughly even, but within a stage there will be all
sorts of delays. If you are balancing all paths to achieve even
delays you are working on a very intense design akin to the original
Cray computers with hand designed ECL chip logic.
> > Regardless, all I am saying is that you don't need to use a path that
> > has no logic to obtain *enough* slack to give enough settling time to
> > the metastable input. But in all cases you need to verify this.
>
> Yes. One would hope that no logic would have the shortest delay,
> though in the case of FPGAs, you might not be able to count on that.
Yes, that is the point, you need to verify the required slack time no
matter what is in the path.
> > As mentioned in another post, Peter Alfke's numbers show that you only
> > need about 2 ns to get 100 million years MTBF. Of course whether this
> > is good enough depends on just how reliable your systems have to be
> > and how many there are. It is 100 million years for one unit, but for
> > 10 million units it will only be 10 years MTBF for the group.
>
> I have done designs with at most two LUTs between registers,
> and might even be able to do one.
That would be good, but I don't know if it is very practical. To make
that useful you also have to optimize the placement to minimize
routing delays. I haven't seen that done since some of Ray Andraka's
designs which are actually fairly small by today's standards. I can't
conceive of trying that with many current designs.
Rick
Posted by gnuarm, 7/10/2012 11:29:37 PM