FrenziedEngi
18th March 2009, 12:20 AM
So I did a little bit of searching tonight about what is going on with Altera / Xilinx memories and portable designs. What I found is that Altera (please correct me if I am wrong here) has a "super-wrapper" for memories called altsyncram, which is configurable by generics and synthesizable. This means that you can create a few wrappers of different types (RAM, ROM, dual-port, etc.) and they are portable across Altera devices.
Xilinx, on the other hand, uses generated cores to build its memory structures from BlockRAMs. These are configuration AND device specific (which really sucks :mad:), meaning you need to generate new cores every time you want to change the size of a memory or move to a different device.
The way around using these cores on a Xilinx device (or even using the super wrapper on Altera devices) is modeling a memory in VHDL in such a way that the synthesis tools you are using recognize it and implement it into a blockram. This is really ideal since you are not bound to a device or vendor. Unfortunately this requires work. Some memories are easy to model (single port sync RAM), while other memories (dual port RAM with asymmetric port sizes) can get a lot more complicated.
Is this ringing true with anyone?
http://www.fpgarelated.com/usenet/fpga/show/52897-1.php
tcdev
18th March 2009, 12:56 PM
The way around using these cores on a Xilinx device (or even using the super wrapper on Altera devices) is modeling a memory in VHDL in such a way that the synthesis tools you are using recognize it and implement it into a blockram. This is really ideal since you are not bound to a device or vendor. Unfortunately this requires work. Some memories are easy to model (single port sync RAM), while other memories (dual port RAM with asymmetric port sizes) can get a lot more complicated.
That's pretty much how I understand the problem!
Someone more experienced with Xilinx could probably add to this discussion. IIRC, it was pointed out to me a while back that there may be some such memories in one or more of the projects I ported to PACE from external sources?!? I've been too lazy to confirm...:o
overclocked
6th July 2009, 08:48 AM
[I saw that this thread was old, but I still thought maybe someone would like advice or suggestions.]
I second that, FrenziedEngi!
But I see 3 alternatives in Xilinx ISE. You either:
1) Use the underlying device primitives and write your own wrapper for them. This could be quite tedious because it means fiddling with multiple BlockRAM instances which together form the width and/or depth you need. But at least you've got low-level control.
2) Use CoreGen to generate device/family-specific macros. It covers the different ways of using the built-in BlockRAM resources (RAM and ROM, pipelining for maximum performance) and it also handles initializing the memory in a simple way. The low-level primitives are handled automatically.
3) Use your RTL-language of preference (VHDL or Verilog) and write the source in a way that makes ISE infer the correct BlockRAM usage.
---------------------
I would only use 1) if building a really simple thing that uses only 1 BlockRAM resource and also fits the maximum 18-bit width that a BlockRAM uses.
2) is my normal choice. It is easy to use as long as the size of the RAM/ROM does not change from day to day. Also, changing the initial RAM contents would mean regenerating the CoreGen core, I think.
What this project would need is probably 3): a simple way to instantiate device-independent RAM/ROM memories with the possibility of initializing the memory contents. If we are talking VHDL, we have a straightforward solution: use the "Language Templates" tool that is present within the ISE environment. It has been present in many versions, and I think the actual syntax of the example cores has been the same for some time.
Wouldn't these suit PACE quite well and also be easy to implement? What do you think?
Here are some examples of the types of memories that can be implemented "by force" :
BlockRAM
Example Code
Single Port:
No-Change Mode
No-Change Mode w/ 2-bit write-enable
No-Change Mode w/ 4-bit write-enable
Read First Mode
Read First Mode w/ 2-bit write-enable
Read First Mode w/ 4-bit write-enable
Write First Mode
Write First Mode w/ 2-bit write-enable
Dual Port:
1 Clock, 1 Read/Write port, 1 Read port
1 Clock, 1 Write port, 1 Read port
2 Clocks, 1 Read/Write port, 1 Read port
2 Clocks, 1 Write port, 1 Read port
2 Clocks, 2 Read/Write ports
ROM
Example Code
The "Example Code" entries are the simple ones that come with all the code needed; the other ones are partial and need to be combined with the example code. Examples:
Simple RAM memory Example:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity rams_01 is
    port (CLK  : in  std_logic;
          WE   : in  std_logic;
          EN   : in  std_logic;
          ADDR : in  std_logic_vector(5 downto 0);
          DI   : in  std_logic_vector(15 downto 0);
          DO   : out std_logic_vector(15 downto 0));
end rams_01;

architecture syn of rams_01 is
    type ram_type is array (63 downto 0) of std_logic_vector (15 downto 0);
    signal RAM : ram_type;
begin
    process (CLK)
    begin
        if CLK'event and CLK = '1' then
            if EN = '1' then
                if WE = '1' then
                    RAM(conv_integer(ADDR)) <= DI;
                end if;
                DO <= RAM(conv_integer(ADDR));
            end if;
        end if;
    end process;
end syn;
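For a device-independent memory whose size can change without regenerating anything, the rams_01 example above can be parameterised with generics. The following is only a sketch under assumptions: the entity name, generic names and port names are hypothetical and do not come from the ISE templates, and it uses ieee.numeric_std instead of std_logic_unsigned. Whether a given tool maps it onto block RAM still has to be checked in the synthesis report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: generic-sized single-port synchronous RAM.
entity generic_spram is
    generic (
        ADDR_WIDTH : natural := 6;   -- 2**ADDR_WIDTH words
        DATA_WIDTH : natural := 16   -- bits per word
    );
    port (
        clk  : in  std_logic;
        en   : in  std_logic;
        we   : in  std_logic;
        addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
        din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
        dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
    );
end entity generic_spram;

architecture syn of generic_spram is
    type ram_type is array (0 to 2**ADDR_WIDTH - 1)
        of std_logic_vector(DATA_WIDTH - 1 downto 0);
    signal ram : ram_type;
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if en = '1' then
                if we = '1' then
                    ram(to_integer(unsigned(addr))) <= din;
                end if;
                -- Signal assignment takes effect after the process suspends,
                -- so a write cycle still reads the old word (read-first).
                dout <= ram(to_integer(unsigned(addr)));
            end if;
        end if;
    end process;
end architecture syn;
```

With the default generics this describes the same 64 x 16 memory as rams_01.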
2 Clocks, 2 Read/Write ports
-- Ensure that the <ram_name> is correctly defined. Please refer to the RAM Type
-- Declaration template for more info.
-- Note: the ":=" assignments below mean <ram_name> must be declared as a
-- shared variable (not a signal), since two processes write to it.
process (<clockA>)
begin
    if (<clockA>'event and <clockA> = '1') then
        if (<enableA> = '1') then
            if (<write_enableA> = '1') then
                <ram_name>(conv_integer(<addressA>)) := <input_dataA>;
            end if;
            <ram_outputA> <= <ram_name>(conv_integer(<addressA>));
        end if;
    end if;
end process;

process (<clockB>)
begin
    if (<clockB>'event and <clockB> = '1') then
        if (<enableB> = '1') then
            if (<write_enableB> = '1') then
                <ram_name>(conv_integer(<addressB>)) := <input_dataB>;
            end if;
            <ram_outputB> <= <ram_name>(conv_integer(<addressB>));
        end if;
    end if;
end process;
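Since the template above is only a fragment, here is one way it might look once combined with an entity and a RAM declaration. This is a sketch, not the ISE template itself: the entity and port names and the 64 x 16 size are assumptions, numeric_std replaces conv_integer, and the unprotected shared variable is VHDL-93 style, which synthesis tools generally accept but stricter simulators may warn about.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: 2 clocks, 2 read/write ports, 64 x 16 bits.
entity dpram_2clk is
    port (
        clka  : in  std_logic;
        ena   : in  std_logic;
        wea   : in  std_logic;
        addra : in  std_logic_vector(5 downto 0);
        dia   : in  std_logic_vector(15 downto 0);
        doa   : out std_logic_vector(15 downto 0);
        clkb  : in  std_logic;
        enb   : in  std_logic;
        web   : in  std_logic;
        addrb : in  std_logic_vector(5 downto 0);
        dib   : in  std_logic_vector(15 downto 0);
        dob   : out std_logic_vector(15 downto 0)
    );
end entity dpram_2clk;

architecture syn of dpram_2clk is
    type ram_type is array (0 to 63) of std_logic_vector(15 downto 0);
    -- Shared variable so that both port processes can write the array.
    shared variable ram : ram_type;
begin
    port_a : process (clka)
    begin
        if rising_edge(clka) then
            if ena = '1' then
                if wea = '1' then
                    ram(to_integer(unsigned(addra))) := dia;
                end if;
                -- Variable assignment is immediate, so a write cycle
                -- returns the newly written word (write-first).
                doa <= ram(to_integer(unsigned(addra)));
            end if;
        end if;
    end process port_a;

    port_b : process (clkb)
    begin
        if rising_edge(clkb) then
            if enb = '1' then
                if web = '1' then
                    ram(to_integer(unsigned(addrb))) := dib;
                end if;
                dob <= ram(to_integer(unsigned(addrb)));
            end if;
        end if;
    end process port_b;
end architecture syn;
```

Simultaneous writes from both ports to the same address are not resolved by this description, and what the hardware does in that case depends on the device.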
Simple ROM memory:
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity rams_21b is
    port (CLK  : in  std_logic;
          EN   : in  std_logic;
          ADDR : in  std_logic_vector(5 downto 0);
          DATA : out std_logic_vector(19 downto 0));
end rams_21b;

architecture syn of rams_21b is
    type rom_type is array (63 downto 0) of std_logic_vector (19 downto 0);
    signal ROM : rom_type := (X"0200A", X"00300", X"08101", X"04000", X"08601", X"0233A",
                              X"00300", X"08602", X"02310", X"0203B", X"08300", X"04002",
                              X"08201", X"00500", X"04001", X"02500", X"00340", X"00241",
                              X"04002", X"08300", X"08201", X"00500", X"08101", X"00602",
                              X"04003", X"0241E", X"00301", X"00102", X"02122", X"02021",
                              X"00301", X"00102", X"02222", X"04001", X"00342", X"0232B",
                              X"00900", X"00302", X"00102", X"04002", X"00900", X"08201",
                              X"02023", X"00303", X"02433", X"00301", X"04004", X"00301",
                              X"00102", X"02137", X"02036", X"00301", X"00102", X"02237",
                              X"04004", X"00304", X"04040", X"02500", X"02500", X"02500",
                              X"0030D", X"02341", X"08201", X"0400D");
    signal rdata : std_logic_vector(19 downto 0);
begin
    rdata <= ROM(conv_integer(ADDR));

    process (CLK)
    begin
        if (CLK'event and CLK = '1') then
            if (EN = '1') then
                DATA <= rdata;
            end if;
        end if;
    end process;
end syn;
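If the ROM contents change often, typing them into the source as above gets tedious. One common pattern is to fill the array from a text file at elaboration time with a function. The sketch below assumes a hypothetical file rom_init.txt holding 64 lines of 20 binary digits each; the entity name, generic name and file format are all assumptions, and whether a particular version of ISE or Quartus accepts file-based initialisation during synthesis needs to be verified.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use std.textio.all;

-- Hypothetical entity: 64 x 20 ROM initialised from a text file.
entity rom_from_file is
    generic (
        INIT_FILE : string := "rom_init.txt"  -- assumed: one binary word per line
    );
    port (
        clk  : in  std_logic;
        en   : in  std_logic;
        addr : in  std_logic_vector(5 downto 0);
        data : out std_logic_vector(19 downto 0)
    );
end entity rom_from_file;

architecture syn of rom_from_file is
    type rom_type is array (0 to 63) of std_logic_vector(19 downto 0);

    -- Reads one 20-bit binary word per line, starting at address 0.
    impure function load_rom (file_name : string) return rom_type is
        file     init     : text open read_mode is file_name;
        variable l        : line;
        variable word     : bit_vector(19 downto 0);
        variable contents : rom_type;
    begin
        for i in rom_type'range loop
            readline(init, l);
            read(l, word);
            contents(i) := to_stdlogicvector(word);
        end loop;
        return contents;
    end function load_rom;

    signal rom : rom_type := load_rom(INIT_FILE);
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if en = '1' then
                -- Registered read, as in the rams_21b example.
                data <= rom(to_integer(unsigned(addr)));
            end if;
        end if;
    end process;
end architecture syn;
```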
Best Regards
Magnus
Talus
24th July 2009, 11:40 AM
Inference is the way to go, in my opinion, as it is device-independent and, in principle, technology-independent. If you instantiate BlockRAM directly, then the synthesizer can't substitute distributed RAM in case you run out of blocks. I'll be the first to admit it's a poor substitute (resource usage and synthesis time will skyrocket!), but it's better than no substitute and just might make a design fit that otherwise wouldn't.
There was a thread on RAM inference patterns in comp.arch.fpga not too long ago.
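With an inferred memory, the choice between block and distributed RAM mentioned above can also be steered by hand through a synthesis attribute. The sketch below uses XST's ram_style attribute on a small, hypothetical 16 x 8 RAM; the entity name and sizes are made up for illustration. Quartus has a similar attribute spelled ramstyle with its own set of values, so treat the exact values as tool-specific and check the synthesis documentation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: small RAM with an explicit implementation hint.
entity small_ram is
    port (
        clk  : in  std_logic;
        we   : in  std_logic;
        addr : in  std_logic_vector(3 downto 0);
        din  : in  std_logic_vector(7 downto 0);
        dout : out std_logic_vector(7 downto 0)
    );
end entity small_ram;

architecture syn of small_ram is
    type ram_type is array (0 to 15) of std_logic_vector(7 downto 0);
    signal ram : ram_type;

    -- XST hint: "auto" lets the tool choose, while "block" and
    -- "distributed" request a specific resource.
    attribute ram_style : string;
    attribute ram_style of ram : signal is "distributed";
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if we = '1' then
                ram(to_integer(unsigned(addr))) <= din;
            end if;
            dout <= ram(to_integer(unsigned(addr)));
        end if;
    end process;
end architecture syn;
```

Leaving the attribute out keeps the description fully portable and lets the synthesizer pick whichever resource is available.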
overclocked
25th July 2009, 11:07 AM
My reason for advocating 3) from my list above (VHDL written so that ISE infers the memory automatically) was that this could potentially be a single way to infer both ROM and RAM for all devices, at least for both Quartus/Altera and ISE/Xilinx, which would make all projects simpler to handle.
I mean, I have examples in my VHDL book of RAM and ROM which both should automatically generate the use of dedicated internal resources, but maybe the Xilinx way works right away for Quartus too, or the other way around.
It would be most interesting if anyone with a newer Quartus already installed could try out the above examples and see what types of resources they are mapped to. It is of course possible that the two competing companies see gains in NOT having the same inference syntax, but I don't know.