Microblaze + Linux: Sample design of a custom peripheral

This post was written by eli on August 14, 2011
Posted Under: Linux kernel,Microblaze

Scope

Even though Xilinx supplies a cute wizard for creating peripherals in its EDK (version 13.2 in my case), it’s just enough to work as a demo. For a real-life case there’s no escape from getting down to the system’s guts. As it turns out, things are pretty well organized under EDK’s hood, which makes the attempt to cover it all up with a wizard even more questionable.

This post is a jot-down of the technicalities behind designing a minimal bare-boned peripheral and its Linux driver. With “bare-boned” I mean that it has the absolutely simplest bus interface (AXI4 Lite without even decoding the addresses), and that it’s in pure Verilog. No external IP is used.

This peripheral connects to the SP605’s four LEDs. Any write to its address region updates the LEDs’ state according to the written value’s four LSBs. Reading back from any of its covered addresses gives the four LSBs back. That’s all.
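The register behaviour described above can be summed up in a few lines of C. This is only a software model for illustration, not part of the design; the names minimal_model, minimal_write and minimal_read are made up for this sketch.

```c
#include <stdint.h>

/* Software model of the peripheral's single register (illustration only). */
struct minimal_model {
    uint8_t leds; /* state of the four LEDs, one bit each */
};

/* A write to any address in the region: only the four LSBs are kept. */
static void minimal_write(struct minimal_model *m, uint32_t value)
{
    m->leds = (uint8_t)(value & 0xF);
}

/* A read from any address in the region: the four LSBs, zero-extended. */
static uint32_t minimal_read(const struct minimal_model *m)
{
    return m->leds;
}
```

For example, writing 0x12345678 leaves 0x8 in the register, and that is what a subsequent read returns.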

Sources of information

This post assumes that you’re familiar with running Linux on Microblaze. I have a pretty extensive tutorial on the subject for reference.

These are worth looking at:

  • The official cores’ sources (in VHDL) can be found at ISE_DS\EDK\hw\XilinxProcessorIPLib\pcores (path from where ISE is installed). It’s interesting in particular to look at axi_lite_ipif_v1_00_a.
  • The AMBA AXI protocol specification, downloaded free from ARM’s site.
  • Platform Specification Format Reference Manual (UG642, psf_rm.pdf): Describes the file formats in detail. Necessary when editing the files.
  • EDK Concepts, Tools, and Techniques (UG683, edk_ctt.pdf): The chapter about Creating Your Own Intellectual Property is somewhat helpful to try out the Wizard.

Understanding the process

It looks like the missing link in Xilinx’ documentation is to explain how the whole machinery works with regard to adopting a custom made peripheral. I’ll try to fill in that gap now.

Generally speaking, the minimal core consists of the following files, which should be in a dedicated directory under the “pcores” directory, which is under the EDK project’s top directory:

  • data/minimal_v2_1_0.mpd: This file is what EDK looks at when an IP is added to a project. It contains all the information used directly by the EDK. The peripheral’s GUI is set up according to this information, and it’s also used when the EDK generates wrappers and connections for it. Its format is well documented, but it looks like it’s easier to just copy snippets from existing core’s MPD files. It’s also possible to generate this file automatically with PsfUtility from the toplevel source file, but it’s not clear if it’s worth the effort to learn yet another tool.
  • data/minimal_v2_1_0.pao: This file supplies EDK with a list of HDL files which need to be synthesized to create the peripheral. It also sets the order of synthesis.
  • hdl/verilog/minimal.v: The Verilog file constituting the peripheral. Of course there may be several files, which need to be mentioned in the PAO file.
  • Note that “black box” modules (presynthesized netlists) are listed in BBD files, which are not necessary in this example. When used, the MPD file is set to reflect this.

The file names above relate to a peripheral called “minimal”. They change according to the project’s setting and version numbers.

All in all, the flow is pretty simple: Only the MPD file is considered by EDK, and only at platform netlist generation are the HDL files synthesized according to the PAO file. The instantiation and connection depend on the settings within the EDK (actually, the MHS file).

It’s easiest to create just any peripheral with the wizard, see what it generates, and then modify the files.

Going from Wizard’s output to minimal peripheral

This is a short outline of the stages. The result is given in the next section.

  • Edit the data/*.pao file: Remove all files and insert the single Verilog file, changing the type to verilog.
  • In the data/*.mpd file, change OPTION HDL = VHDL to VERILOG too. Also add, under ## Ports, PORT minimal_leds = "", DIR = O, VEC = [3:0] so that the I/O port is known (note the = "" part).
  • Remove the data/*.prj file so no unnecessary files are included (even though this file seems to be ignored).
  • Roughly translate the VHDL file to Verilog. Use Verilog parameters instead of VHDL generics.
  • Rename/remove the devl directory, since its information is not in sync with the new situation, and Custom IP Wizard can’t do any good at this point.
  • And finally, in EDK, Project > Rescan User Repositories
  • Remove the LED_4bits core from the project, choosing “Delete instance and any connections to internal nets”. This will keep the net names used for connecting to the LEDs, and make them available for connection to the new peripheral. Otherwise, the external net names need to be set, and the system.ucf given at the “project” tab updated to reflect the new nets.
  • Add the minimal core to the project, and connect the just released LEDs_4Bits_TRI_O to its minimal_leds port.
  • Create bitfile

The synthesis of the peripheral’s HDL takes place during the “create netlist” flow (which is, of course, part of generating bitfile). For example, the synthesis of an instance named minimal_0 will appear as follows in the console

INSTANCE:minimal_0 - C:\tryperipheral\system.mhs line 424 - Running XST
synthesis
PMSPEC -- Overriding Xilinx file
<C:/ise13_2/ISE_DS/EDK/spartan6/data/spartan6.acd> with local file
<C:/ise13_2/ISE_DS/ISE/spartan6/data/spartan6.acd>

And if there are errors in the HDL, they will show up at this point.

Sample files

These are the files used for the minimal peripheral. They are a sloppy adaptation of the files generated by the Custom IP Wizard, so they’re very likely to contain unnecessary declarations.

First, the Verilog file:

module minimal #(
 parameter C_S_AXI_DATA_WIDTH = 32,
 parameter C_S_AXI_ADDR_WIDTH = 32,
 parameter C_S_AXI_MIN_SIZE = 'h000001FF,
 parameter C_USE_WSTRB = 0,
 parameter C_DPHASE_TIMEOUT = 8,
 parameter C_BASEADDR = 'hFFFFFFFF,
 parameter C_HIGHADDR = 'h00000000,
 parameter C_FAMILY = "spartan6",
 parameter C_NUM_REG = 1,
 parameter C_NUM_MEM = 1,
 parameter C_SLV_AWIDTH = 32,
 parameter C_SLV_DWIDTH = 32
 )
 (
 input S_AXI_ACLK,
 input S_AXI_ARESETN,
 input [(C_S_AXI_ADDR_WIDTH-1):0] S_AXI_AWADDR,
 input S_AXI_AWVALID,
 input  [(C_S_AXI_DATA_WIDTH-1):0] S_AXI_WDATA,
 input  [((C_S_AXI_DATA_WIDTH/8)-1):0] S_AXI_WSTRB,
 input S_AXI_WVALID,
 input S_AXI_BREADY,
 input  [(C_S_AXI_ADDR_WIDTH-1):0] S_AXI_ARADDR,
 input S_AXI_ARVALID,
 input S_AXI_RREADY,
 output S_AXI_ARREADY,
 output  [(C_S_AXI_DATA_WIDTH-1):0] S_AXI_RDATA,
 output  [1:0] S_AXI_RRESP,
 output S_AXI_RVALID,
 output S_AXI_WREADY,
 output [1:0] S_AXI_BRESP,
 output reg S_AXI_BVALID,
 output S_AXI_AWREADY,

 output reg [3:0] minimal_leds
 );

 assign  S_AXI_RDATA = minimal_leds;
 assign  S_AXI_RRESP = 0; // OKAY on AXI4
 assign  S_AXI_ARREADY = 1; // Always ready for read address
 assign  S_AXI_AWREADY = 1; // Always ready for write address
 assign  S_AXI_RVALID = 1; // Read data always valid (ILLEGAL)
 assign  S_AXI_WREADY = 1; // Always ready to write
 assign  S_AXI_BRESP = 0; // OKAY on AXI4

 // This will not work OK if several "bursts" are sent with no BVALIDs
 // inbetween. Not an expected scenario.

 always @(posedge S_AXI_ACLK)
   if (S_AXI_WVALID)
     begin
        S_AXI_BVALID <= 1;
        minimal_leds <= S_AXI_WDATA;
     end
   else if (S_AXI_BREADY && S_AXI_BVALID) // Active BRESP cycle
     S_AXI_BVALID <= 0;

endmodule

Most of the parameters at the top can be removed, I believe. It appears that they are necessary only when creating the MPD file with PsfUtility.

All ports except minimal_leds are standard AXI4-Lite ports. The implementation of the interface is no example of anything except a quick and dirty peripheral which responds to bus requests. The only things it does actively are updating minimal_leds when necessary and toggling S_AXI_BVALID, so that only one write response is sent for each write cycle (which is always one clock long in AXI4-Lite). It’s OK not to decode the address, since it’s the interconnect’s job to make sure each peripheral gets only what is directed to it.
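To make the write-side behaviour easier to follow, here is a cycle-level C model of the always block in the Verilog listing. It is a sketch for reasoning about the logic, not a replacement for simulation; the struct and function names are invented for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* State held in flip-flops by the Verilog always block. */
struct write_chan {
    bool    bvalid; /* models S_AXI_BVALID */
    uint8_t leds;   /* models minimal_leds (four bits) */
};

/*
 * One rising edge of S_AXI_ACLK. The arguments are the input values
 * sampled at that edge; the struct is updated to the values the
 * registers hold after the edge.
 */
static void write_chan_clock(struct write_chan *s, bool wvalid,
                             uint32_t wdata, bool bready)
{
    if (wvalid) {
        /* Write data arrived: latch the LEDs and raise the response. */
        s->bvalid = true;
        s->leds = (uint8_t)(wdata & 0xF);
    } else if (bready && s->bvalid) {
        /* The response was accepted by the master: drop it. */
        s->bvalid = false;
    }
}
```

Clocking the model once with wvalid set raises bvalid, which then stays high until a later edge on which bready is set and wvalid is not.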

Holding S_AXI_RVALID high all the time violates the AXI4 spec, which allows RVALID to be asserted only after both ARVALID and ARREADY have been asserted. But the interconnect tolerated this anyhow.
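For comparison, one possible way to keep the read channel within the handshake rules is sketched below as a cycle-level C model. This is not what the Verilog listing does and it has not been tried on hardware: it assumes ARREADY is driven low while a read response is pending, and all names are hypothetical.

```c
#include <stdbool.h>

/* State of a read channel that follows the handshake ordering. */
struct read_chan {
    bool rvalid; /* models S_AXI_RVALID */
};

/* ARREADY (combinatorial): accept an address only when no response is pending. */
static bool read_chan_arready(const struct read_chan *s)
{
    return !s->rvalid;
}

/* One rising clock edge, given the master's ARVALID and RREADY at that edge. */
static void read_chan_clock(struct read_chan *s, bool arvalid, bool rready)
{
    if (s->rvalid) {
        /* RVALID is held until the master accepts the data. */
        if (rready)
            s->rvalid = false;
    } else if (arvalid) {
        /* ARREADY is high in this state, so this is an address handshake. */
        s->rvalid = true;
    }
}
```

The price is one extra clock cycle per read, which hardly matters for a peripheral like this.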

Now to minimal_v2_1_0.mpd:

BEGIN minimal

## Peripheral Options
OPTION IPTYPE = PERIPHERAL
OPTION IMP_NETLIST = TRUE
OPTION HDL = VERILOG
OPTION IP_GROUP = MICROBLAZE:USER
OPTION DESC = MINIMAL
OPTION LONG_DESC = A minimal peripheral to start off with
OPTION ARCH_SUPPORT_MAP = (others=DEVELOPMENT)

## Bus Interfaces
BUS_INTERFACE BUS = S_AXI, BUS_STD = AXI, BUS_TYPE = SLAVE

## Generics for VHDL or Parameters for Verilog
PARAMETER C_S_AXI_DATA_WIDTH = 32, DT = INTEGER, BUS = S_AXI, ASSIGNMENT = CONSTANT
PARAMETER C_S_AXI_ADDR_WIDTH = 32, DT = INTEGER, BUS = S_AXI, ASSIGNMENT = CONSTANT
PARAMETER C_S_AXI_MIN_SIZE = 0x000001ff, DT = std_logic_vector, BUS = S_AXI
PARAMETER C_USE_WSTRB = 0, DT = INTEGER
PARAMETER C_DPHASE_TIMEOUT = 8, DT = INTEGER
PARAMETER C_BASEADDR = 0xffffffff, DT = std_logic_vector, MIN_SIZE = 0x0, PAIR = C_HIGHADDR, ADDRESS = BASE, BUS = S_AXI
PARAMETER C_HIGHADDR = 0x00000000, DT = std_logic_vector, PAIR = C_BASEADDR, ADDRESS = HIGH, BUS = S_AXI
PARAMETER C_FAMILY = virtex6, DT = STRING
PARAMETER C_NUM_REG = 1, DT = INTEGER
PARAMETER C_NUM_MEM = 1, DT = INTEGER
PARAMETER C_SLV_AWIDTH = 32, DT = INTEGER
PARAMETER C_SLV_DWIDTH = 32, DT = INTEGER
PARAMETER C_S_AXI_PROTOCOL = AXI4LITE, TYPE = NON_HDL, ASSIGNMENT = CONSTANT, DT = STRING, BUS = S_AXI

## Ports
PORT S_AXI_ACLK = "", DIR = I, SIGIS = CLK, BUS = S_AXI
PORT S_AXI_ARESETN = ARESETN, DIR = I, SIGIS = RST, BUS = S_AXI
PORT S_AXI_AWADDR = AWADDR, DIR = I, VEC = [(C_S_AXI_ADDR_WIDTH-1):0], ENDIAN = LITTLE, BUS = S_AXI
PORT S_AXI_AWVALID = AWVALID, DIR = I, BUS = S_AXI
PORT S_AXI_WDATA = WDATA, DIR = I, VEC = [(C_S_AXI_DATA_WIDTH-1):0], ENDIAN = LITTLE, BUS = S_AXI
PORT S_AXI_WSTRB = WSTRB, DIR = I, VEC = [((C_S_AXI_DATA_WIDTH/8)-1):0], ENDIAN = LITTLE, BUS = S_AXI
PORT S_AXI_WVALID = WVALID, DIR = I, BUS = S_AXI
PORT S_AXI_BREADY = BREADY, DIR = I, BUS = S_AXI
PORT S_AXI_ARADDR = ARADDR, DIR = I, VEC = [(C_S_AXI_ADDR_WIDTH-1):0], ENDIAN = LITTLE, BUS = S_AXI
PORT S_AXI_ARVALID = ARVALID, DIR = I, BUS = S_AXI
PORT S_AXI_RREADY = RREADY, DIR = I, BUS = S_AXI
PORT S_AXI_ARREADY = ARREADY, DIR = O, BUS = S_AXI
PORT S_AXI_RDATA = RDATA, DIR = O, VEC = [(C_S_AXI_DATA_WIDTH-1):0], ENDIAN = LITTLE, BUS = S_AXI
PORT S_AXI_RRESP = RRESP, DIR = O, VEC = [1:0], BUS = S_AXI
PORT S_AXI_RVALID = RVALID, DIR = O, BUS = S_AXI
PORT S_AXI_WREADY = WREADY, DIR = O, BUS = S_AXI
PORT S_AXI_BRESP = BRESP, DIR = O, VEC = [1:0], BUS = S_AXI
PORT S_AXI_BVALID = BVALID, DIR = O, BUS = S_AXI
PORT S_AXI_AWREADY = AWREADY, DIR = O, BUS = S_AXI
PORT minimal_leds = "", DIR = O, VEC = [3:0]

END

This file is exactly as generated by the Wizard, except for the HDL option in the beginning changed to VERILOG, and the added port minimal_leds at the end. Note its assignment to “”. This file is best created by looking at examples of existing cores.

Now to minimal_v2_1_0.pao:

lib minimal_v1_00_a minimal verilog

which was rewritten to reflect that the peripheral consists of one single Verilog file.

The device tree file

The device tree file needs to be generated as described in one of my posts. The relevant section is given here, since it relates to kernel code presented next:

minimal_0: minimal@7ae00000 {
 compatible = "xlnx,minimal-1.00.a";
 reg = < 0x7ae00000 0x10000 >;
 xlnx,dphase-timeout = <0x8>;
 xlnx,family = "spartan6";
 xlnx,num-mem = <0x1>;
 xlnx,num-reg = <0x1>;
 xlnx,s-axi-min-size = <0x1ff>;
 xlnx,slv-awidth = <0x20>;
 xlnx,slv-dwidth = <0x20>;
 xlnx,use-wstrb = <0x0>;
 };

It’s pretty evident that some of these parameters have no use.
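Property values are stored in the device tree blob as big-endian 32-bit cells, regardless of the processor’s endianness, which is why kernel code on a little-endian Microblaze converts them with be32_to_cpu(). The following user-space sketch shows the same decoding on raw bytes. It assumes one address cell and one size cell, as the two-cell reg entry above suggests; the helper names are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit device tree cell from raw property bytes. */
static uint32_t dt_cell(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

struct dt_reg {
    uint32_t base; /* first cell: physical base address */
    uint32_t size; /* second cell: size of the region in bytes */
};

/*
 * Decode a "reg" property made of one <address size> pair.
 * Returns 0 on success, -1 if the property is too short.
 */
static int dt_decode_reg(const uint8_t *prop, size_t len, struct dt_reg *out)
{
    if (len < 8)
        return -1;
    out->base = dt_cell(prop);
    out->size = dt_cell(prop + 4);
    return 0;
}
```

Feeding it the eight bytes behind reg = < 0x7ae00000 0x10000 > gives a base of 0x7ae00000 and a size of 0x10000, and the single cell of xlnx,slv-dwidth = <0x20> decodes to 32.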

The driver

First, it’s convenient to create a makefile for cross compilation. Even though the correct way is to set the environment variables in the shell, and run the module compilation in the same way the kernel itself is compiled, it’s much more convenient to go just “make” or “make clean” with this makefile. It’s not good for distribution, as the paths to both the kernel tree and cross compiler are hardcoded.

So here’s a dirty, yet convenient makefile:

export CROSS_COMPILE=/path/to/microblazeel-unknown-linux-gnu/bin/microblazeel-unknown-linux-gnu-
export ARCH=microblaze

ifneq ($(KERNELRELEASE),)
obj-m    := minimal.o

else
KDIR := /path/to/linux-2.6.38.6

default:
	@echo $(TARGET) > module.target
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

clean:
	@rm -f *.ko *.o modules.order Module.symvers *.mod.? .minimal.* *~
	@rm -rf .tmp_versions module.target

minimal.ko:
	$(MAKE)
endif

And now to the driver itself, minimal.c:

#include <linux/platform_device.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/of_platform.h>
#include <asm/io.h>

/* Match table for of_platform binding */
static struct of_device_id minimal_of_match[] __devinitdata = {
 { .compatible = "xlnx,minimal-1.00.a", },
 {}
};

MODULE_ALIAS("minimal");

static void __iomem *regs;
static struct resource res;

static int __devinit
minimal_of_probe(struct platform_device *op, const struct of_device_id *match)
{

 const int *width;
 int ret;
 int val;

 ret = of_address_to_resource(op->dev.of_node, 0, &res);
 if (ret) {
 printk(KERN_WARNING "minimal: Failed to obtain device tree resource\n");
 return ret;
 }

 printk(KERN_WARNING "minimal: Physical address to resource is %x\n", (unsigned int) res.start);

 if (!request_mem_region(res.start, 32, "mimimal")) {
 printk(KERN_WARNING "minimal: Failed to request I/O memory\n");
 return -EBUSY;
 }

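 regs = of_iomap(op->dev.of_node, 0);
 if (!regs) {
 printk(KERN_WARNING "minimal: Failed to map I/O memory\n");
 release_mem_region(res.start, 32);
 return -ENOMEM;
 }

 printk(KERN_WARNING "minimal: Access address to registers is %x\n", (unsigned int) regs);

 /* Property values are big-endian; the property may also be missing */
 width = of_get_property(op->dev.of_node, "xlnx,slv-dwidth", NULL);
 if (width)
 printk(KERN_WARNING "minimal: Obtained width=%d\n", be32_to_cpu(*width));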

 val = ioread32(regs);
 printk(KERN_WARNING "minimal: Read %d, writing %d\n", val, val+1);

 iowrite32(++val, regs);

 return 0; /* Success */
}

static int __devexit minimal_of_remove(struct platform_device *op)
{
 iounmap(regs);
 release_mem_region(res.start, 32);
 return 0; /* Success */
}

static struct of_platform_driver minimal_of_driver = {
 .probe = minimal_of_probe,
 .remove = __devexit_p(minimal_of_remove),
 .driver = {
 .name = "minimal",
 .owner = THIS_MODULE,
 .of_match_table = minimal_of_match,
 },
};

int __init minimal_init(void)
{
 int ret;
 ret = of_register_platform_driver(&minimal_of_driver);
 return ret;
}

void __exit minimal_exit(void)
{
 of_unregister_platform_driver(&minimal_of_driver);
}

module_init(minimal_init);
module_exit(minimal_exit);

MODULE_AUTHOR("Eli Billauer");
MODULE_DESCRIPTION("Microblaze minimal module");
MODULE_LICENSE("GPL");

It doesn’t do anything special, except for changing the state of the LEDs every time it’s loaded. The driver also reads one of the parameters from the device tree structure. Not fascinating, but keeps the code, well, minimal.

This code should be pretty straightforward to programmers who are familiar with PCI device drivers, with probing and removal working in more or less the same way. I’ve chosen a hardcoded segment of 32 bytes as the requested region. This depends on the peripheral, of course.
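The arithmetic behind the requested region is simple, but easy to get off by one. A small sketch, with helper names invented for illustration, computes the last address of the region and checks that it lies within the range the device tree assigns to the peripheral:

```c
#include <stdbool.h>
#include <stdint.h>

/* Last address of a region of len bytes starting at start (len must be > 0). */
static uint32_t region_end(uint32_t start, uint32_t len)
{
    return start + len - 1;
}

/*
 * True if the len bytes starting at start lie entirely within the
 * device's range, given as base and size from the "reg" property.
 * Written so that none of the intermediate values can overflow.
 */
static bool region_fits(uint32_t start, uint32_t len,
                        uint32_t dev_base, uint32_t dev_size)
{
    return len > 0 && start >= dev_base && len <= dev_size &&
           (start - dev_base) <= dev_size - len;
}
```

With a start of 0x7ae00000 and 32 bytes, the region ends at 0x7ae0001f, comfortably inside the 0x10000 bytes listed in the device tree entry.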

A test run

This is the transcript of the session on the UART console, as run on a freshly booted system. LEDs did indeed go on and off as reflected by the numbers.

/ # insmod minimal.ko
minimal: Physical address to resource is 7ae00000
minimal: Access address to registers is c87e0000
minimal: Obtained width=32
minimal: Read 0, writing 1
/ # lsmod
minimal 1978 0 - Live 0xc8056000
ipv6 305961 10 - Live 0xc8763000
/ # cat /proc/iomem
40600000-4060000f : uartlite
40a00000-40a0ffff : xilinx_spi
40e00000-40e0ffff : xilinx_emaclite
7ae00000-7ae0001f : mimimal
/ # rmmod minimal
rmmod: module 'minimal' not found
/ # cat /proc/iomem
40600000-4060000f : uartlite
40a00000-40a0ffff : xilinx_spi
40e00000-40e0ffff : xilinx_emaclite
/ # lsmod
ipv6 305961 10 - Live 0xc8763000
/ # insmod minimal.ko
minimal: Physical address to resource is 7ae00000
minimal: Access address to registers is c8820000
minimal: Obtained width=32
minimal: Read 1, writing 2

Note that rmmod produces an error message, which makes it look as if it failed to remove the module, when in fact all went well.

The physical address was indeed detected correctly (see device tree), and mapped to another kernel virtual address each time.

Reader Comments

I want to implement this peripheral on Zynq and have already created an IP to read the status of the switches and control the LEDs on my Zedboard. It uses two 32-bit registers and works well. Now I need to control my peripheral through a Linux app.

But this driver doesn’t seem to be correct for the Zynq platform. I know that an ioremap function should be used instead of of_iomap.

And what else should I do to make it work?

Looking forward to your reply.

#1 
Written By Nighseas on October 9th, 2012 @ 11:22

There are so many things that can go wrong, so I can’t help you much here.

As a side note, I don’t necessarily agree with you on the ioremap remark: The standard way is to add an entry for your peripheral in Linux’ device tree, and have the driver fetch the physical address from there. But for a one-off project, hardcoding the physical address of the peripheral in the driver is fairly acceptable.

#2 
Written By eli on October 9th, 2012 @ 13:38

Linux is new for me, and this howto is very helpful. Thank you!! By the way, do you know if it is possible to have access to this driver from the user application, like we can do with ioctl on char drivers, or is it limited to the kernel?
Thanks.

#3 
Written By Fred on March 1st, 2013 @ 10:29

This specific driver does nothing except loading itself. For interaction with a user-space program, you probably want it to stand behind a device file.

Recommended reading: http://lwn.net/Kernel/LDD3/

#4 
Written By eli on March 1st, 2013 @ 12:06

I found that you did not use C_BASEADDR in your Verilog file. I want to know how your peripheral knows that the data is intended for it. I am a student from China, learning how to develop AXI4-Lite peripherals. Please forgive my poor English!

#5 
Written By Anonymous on August 4th, 2014 @ 08:56

The address range of each peripheral is set when it’s connected to the bus in XPS (or Vivado’s block builder). The AXI interconnect makes sure that any request that arrives at the peripheral is intended for it, so the peripheral doesn’t need to do any address decoding.

As a result, the peripheral doesn’t need to know what its base address is.

#6 
Written By eli on August 4th, 2014 @ 09:01

Thank you very much!

#7 
Written By lwei on August 4th, 2014 @ 14:34
