Discussion:
[Qemu-devel] [RFC v2 3/4] hw/pci-host: Add a generic PCI host controller for virtual platforms
Alvise Rigo
2014-11-21 18:07:40 UTC
Permalink
Add a generic PCI host controller for virtual platforms, based on the
previous work by Rob Herring:
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html

The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller. For the time being, the device expects a GIC v2 to be used
by the guest.
Only mach-virt has been used to test the controller.

Signed-off-by: Alvise Rigo <***@virtualopensystems.com>
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h

diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o

# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ * Author: Rob Herring <***@linaro.org>
+ *
+ * Based on ARM Versatile PCI controller (hw/pci-host/versatile.c):
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ nodename = g_strdup_printf("/pci@%" PRIx64, cfg->addr);
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
+}
+
+static const TypeInfo pci_generic_host_info = {
+ .name = TYPE_GENERIC_PCI_HOST,
+ .parent = TYPE_PCI_DEVICE,
+ .instance_size = sizeof(GenericPCIHostState),
+ .class_init = pci_generic_host_class_init,
+};
+
+static void pci_generic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = pci_generic_host_realize;
+ dc->vmsd = &pci_generic_host_vmstate;
+}
+
+static const TypeInfo pci_generic_info = {
+ .name = TYPE_GENERIC_PCI,
+ .parent = TYPE_PCI_HOST_BRIDGE,
+ .instance_size = sizeof(PCIGenState),
+ .instance_init = pci_generic_host_init,
+ .class_init = pci_generic_class_init,
+};
+
+static void generic_pci_host_register_types(void)
+{
+ type_register_static(&pci_generic_info);
+ type_register_static(&pci_generic_host_info);
+}
+
+type_init(generic_pci_host_register_types)
\ No newline at end of file
diff --git a/include/hw/pci-host/generic-pci.h b/include/hw/pci-host/generic-pci.h
new file mode 100644
index 0000000..43c7a0f
--- /dev/null
+++ b/include/hw/pci-host/generic-pci.h
@@ -0,0 +1,74 @@
+#ifndef QEMU_GENERIC_PCI_H
+#define QEMU_GENERIC_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+
+#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
+
+enum {
+ PCI_CFG = 0,
+ PCI_IO,
+ PCI_MEM,
+};
+
+typedef struct {
+ /* addr_map[PCI_CFG].addr = base address of configuration memory
+ addr_map[PCI_CFG].size = configuration memory size
+ ... */
+ struct addr_map {
+ hwaddr addr;
+ hwaddr size;
+ } addr_map[3];
+
+ const char *gic_node_name;
+ uint32_t base_irq;
+ uint32_t irqs;
+} GenericPCIPlatformData;
+
+typedef struct {
+ PCIDevice parent_obj;
+
+ struct irqmap {
+ /* devfn_idx_map[i][j] = index inside slot_irq_map for device at slot i,
+ fn j */
+ uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
+ /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached to
+ the bus */
+ uint8_t devfn_irq_map[MAX_PCI_DEVICES];
+ } irqmap;
+ /* number of devices attached to the bus */
+ uint32_t devfns;
+} GenericPCIHostState;
+
+typedef struct {
+ PCIHostState parent_obj;
+
+ qemu_irq irq[MAX_PCI_DEVICES];
+ MemoryRegion mem_config;
+ /* Containers representing the PCI address spaces */
+ MemoryRegion pci_io_space;
+ MemoryRegion pci_mem_space;
+ /* Alias regions into PCI address spaces which we expose as sysbus regions.
+ * The offsets into pci_mem_space is fixed. */
+ MemoryRegion pci_io_window;
+ MemoryRegion pci_mem_window;
+ PCIBus pci_bus;
+ GenericPCIHostState pci_gen;
+
+ /* Platform dependent data */
+ GenericPCIPlatformData plat_data;
+} PCIGenState;
+
+#define TYPE_GENERIC_PCI "generic_pci"
+#define PCI_GEN(obj) \
+ OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
+
+#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
+#define PCI_GEN_HOST(obj) \
+ OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
+
+#endif
\ No newline at end of file
--
2.1.0
Alvise Rigo
2014-11-21 18:07:39 UTC
Permalink
Signed-off-by: Alvise Rigo <***@virtualopensystems.com>
---
hw/arm/virt.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e8d527d..4e7b869 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -151,6 +151,10 @@ static VirtBoardInfo *find_machine_info(const char *cpu)
{
int i;

+ if (!cpu) {
+ return NULL;
+ }
+
for (i = 0; i < ARRAY_SIZE(machines); i++) {
if (strcmp(cpu, machines[i].cpu_model) == 0) {
return &machines[i];
--
2.1.0
Peter Maydell
2015-01-05 15:36:41 UTC
Permalink
Post by Alvise Rigo
---
hw/arm/virt.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e8d527d..4e7b869 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -151,6 +151,10 @@ static VirtBoardInfo *find_machine_info(const char *cpu)
{
int i;
+ if (!cpu) {
+ return NULL;
+ }
+
for (i = 0; i < ARRAY_SIZE(machines); i++) {
if (strcmp(cpu, machines[i].cpu_model) == 0) {
return &machines[i];
What's the motivation for this change? We can never call this
function with a NULL pointer at the moment...

thanks
-- PMM
alvise rigo
2015-01-05 16:31:32 UTC
Permalink
It's just because in patch 1/4 of this series we use
find_machine_info(machine->cpu_model), which could be a NULL pointer.
Indeed this patch can be avoided reworking a bit the calling function code.

Regards,
alvise
Post by Peter Maydell
Post by Alvise Rigo
---
hw/arm/virt.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index e8d527d..4e7b869 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -151,6 +151,10 @@ static VirtBoardInfo *find_machine_info(const char *cpu)
{
int i;
+ if (!cpu) {
+ return NULL;
+ }
+
for (i = 0; i < ARRAY_SIZE(machines); i++) {
if (strcmp(cpu, machines[i].cpu_model) == 0) {
return &machines[i];
What's the motivation for this change? We can never call this
function with a NULL pointer at the moment...
thanks
-- PMM
Peter Maydell
2015-01-05 16:42:38 UTC
Permalink
Post by alvise rigo
It's just because in patch 1/4 of this series we use
find_machine_info(machine->cpu_model), which could be a NULL pointer.
Well, don't do that, because in that case you'll do the wrong thing
if we take the default value of the cpu model. But I don't think
patch 1 is the right approach anyway.

(Also, if you have a series where patch B requires a change or
fix that is put in patch A, then A should come before B in the series.)

thanks
-- PMM
Alvise Rigo
2014-11-21 18:07:38 UTC
Permalink
Keep a global list with all the functions that need to modify the device
tree. Using qemu_add_machine_init_done_notifier we register a notifier
that executes all the functions on the list and loads the kernel.

Signed-off-by: Alvise Rigo <***@virtualopensystems.com>
---
hw/arm/virt.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 314e55b..e8d527d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -70,6 +70,16 @@ enum {
VIRT_RTC,
};

+typedef void (*modify_dtb_func)(void *fdt, DeviceState *dev);
+typedef struct DTModifier {
+ QLIST_ENTRY(DTModifier) entry;
+ modify_dtb_func modify_dtb;
+ DeviceState *dev;
+} DTModifier;
+
+static QLIST_HEAD(, DTModifier) dtb_modifiers =
+ QLIST_HEAD_INITIALIZER(dtb_modifiers);
+
typedef struct MemMapEntry {
hwaddr base;
hwaddr size;
@@ -149,6 +159,23 @@ static VirtBoardInfo *find_machine_info(const char *cpu)
return NULL;
}

+static void add_dtb_modifier(modify_dtb_func func, DeviceState *dev)
+{
+ DTModifier *mod_entry = g_new(DTModifier, 1);
+ mod_entry->modify_dtb = func;
+ mod_entry->dev = dev;
+ QLIST_INSERT_HEAD(&dtb_modifiers, mod_entry, entry);
+}
+
+static void free_dtb_modifiers(void)
+{
+ while (!QLIST_EMPTY(&dtb_modifiers)) {
+ DTModifier *modifier = QLIST_FIRST(&dtb_modifiers);
+ QLIST_REMOVE(modifier, entry);
+ g_free(modifier);
+ }
+}
+
static void create_fdt(VirtBoardInfo *vbi)
{
void *fdt = create_device_tree(&vbi->fdt_size);
@@ -527,6 +554,29 @@ static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
return board->fdt;
}

+static void machvirt_finalize_dt(Notifier *notify, void *data)
+{
+ VirtBoardInfo *vbi;
+ MachineState *machine;
+
+ machine = MACHINE(qdev_get_machine());
+
+ vbi = find_machine_info(machine->cpu_model);
+ if (!vbi) {
+ vbi = find_machine_info("cortex-a15");
+ }
+
+ struct DTModifier *modifier, *next;
+ QLIST_FOREACH_SAFE(modifier, &dtb_modifiers, entry, next) {
+ modifier->modify_dtb(vbi->fdt, modifier->dev);
+ }
+
+ free_dtb_modifiers();
+
+ /* Load the kernel only after that the device tree has been modified. */
+ arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
+}
+
static void machvirt_init(MachineState *machine)
{
qemu_irq pic[NUM_IRQS];
@@ -604,6 +654,10 @@ static void machvirt_init(MachineState *machine)
*/
create_virtio_devices(vbi, pic);

+ Notifier *finalize_dtb_notifier = g_new(Notifier, 1);
+ finalize_dtb_notifier->notify = machvirt_finalize_dt;
+ qemu_add_machine_init_done_notifier(finalize_dtb_notifier);
+
vbi->bootinfo.ram_size = machine->ram_size;
vbi->bootinfo.kernel_filename = machine->kernel_filename;
vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
@@ -612,7 +666,6 @@ static void machvirt_init(MachineState *machine)
vbi->bootinfo.board_id = -1;
vbi->bootinfo.loader_start = vbi->memmap[VIRT_MEM].base;
vbi->bootinfo.get_dtb = machvirt_dtb;
- arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
}

static QEMUMachine machvirt_a15_machine = {
--
2.1.0
Claudio Fontana
2014-11-24 11:47:42 UTC
Permalink
Post by Alvise Rigo
Keep a global list with all the functions that need to modify the device
tree. Using qemu_add_machine_init_done_notifier we register a notifier
that executes all the functions on the list and loads the kernel.
Peter, could you weigh in about whether this is a good idea or not?
Post by Alvise Rigo
---
hw/arm/virt.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 54 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 314e55b..e8d527d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -70,6 +70,16 @@ enum {
VIRT_RTC,
};
+typedef void (*modify_dtb_func)(void *fdt, DeviceState *dev);
+typedef struct DTModifier {
+ QLIST_ENTRY(DTModifier) entry;
+ modify_dtb_func modify_dtb;
+ DeviceState *dev;
+} DTModifier;
+
+static QLIST_HEAD(, DTModifier) dtb_modifiers =
+ QLIST_HEAD_INITIALIZER(dtb_modifiers);
+
typedef struct MemMapEntry {
hwaddr base;
hwaddr size;
@@ -149,6 +159,23 @@ static VirtBoardInfo *find_machine_info(const char *cpu)
return NULL;
}
+static void add_dtb_modifier(modify_dtb_func func, DeviceState *dev)
+{
+ DTModifier *mod_entry = g_new(DTModifier, 1);
+ mod_entry->modify_dtb = func;
+ mod_entry->dev = dev;
+ QLIST_INSERT_HEAD(&dtb_modifiers, mod_entry, entry);
+}
+
+static void free_dtb_modifiers(void)
+{
+ while (!QLIST_EMPTY(&dtb_modifiers)) {
+ DTModifier *modifier = QLIST_FIRST(&dtb_modifiers);
+ QLIST_REMOVE(modifier, entry);
+ g_free(modifier);
+ }
+}
+
static void create_fdt(VirtBoardInfo *vbi)
{
void *fdt = create_device_tree(&vbi->fdt_size);
@@ -527,6 +554,29 @@ static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
return board->fdt;
}
+static void machvirt_finalize_dt(Notifier *notify, void *data)
+{
+ VirtBoardInfo *vbi;
+ MachineState *machine;
+
+ machine = MACHINE(qdev_get_machine());
+
+ vbi = find_machine_info(machine->cpu_model);
+ if (!vbi) {
+ vbi = find_machine_info("cortex-a15");
+ }
+
+ struct DTModifier *modifier, *next;
+ QLIST_FOREACH_SAFE(modifier, &dtb_modifiers, entry, next) {
+ modifier->modify_dtb(vbi->fdt, modifier->dev);
+ }
+
+ free_dtb_modifiers();
+
+ /* Load the kernel only after that the device tree has been modified. */
+ arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
+}
+
static void machvirt_init(MachineState *machine)
{
qemu_irq pic[NUM_IRQS];
@@ -604,6 +654,10 @@ static void machvirt_init(MachineState *machine)
*/
create_virtio_devices(vbi, pic);
+ Notifier *finalize_dtb_notifier = g_new(Notifier, 1);
+ finalize_dtb_notifier->notify = machvirt_finalize_dt;
+ qemu_add_machine_init_done_notifier(finalize_dtb_notifier);
+
vbi->bootinfo.ram_size = machine->ram_size;
vbi->bootinfo.kernel_filename = machine->kernel_filename;
vbi->bootinfo.kernel_cmdline = machine->kernel_cmdline;
@@ -612,7 +666,6 @@ static void machvirt_init(MachineState *machine)
vbi->bootinfo.board_id = -1;
vbi->bootinfo.loader_start = vbi->memmap[VIRT_MEM].base;
vbi->bootinfo.get_dtb = machvirt_dtb;
- arm_load_kernel(ARM_CPU(first_cpu), &vbi->bootinfo);
}
static QEMUMachine machvirt_a15_machine = {
Peter Maydell
2015-01-05 15:36:14 UTC
Permalink
On 24 November 2014 at 11:47, Claudio Fontana
Post by Claudio Fontana
Post by Alvise Rigo
Keep a global list with all the functions that need to modify the device
tree. Using qemu_add_machine_init_done_notifier we register a notifier
that executes all the functions on the list and loads the kernel.
Peter, could you weigh in about whether this is a good idea or not?
Sorry, I think I must have missed this series first time around.
I'm not convinced -- I don't see any reason why we should treat
the PCI host controller differently from other devices in the
virt board: just generate an appropriate dtb fragment in virt.c.

thanks
-- PMM
alvise rigo
2015-01-05 16:14:21 UTC
Permalink
Hi,
Post by Peter Maydell
On 24 November 2014 at 11:47, Claudio Fontana
Post by Claudio Fontana
Post by Alvise Rigo
Keep a global list with all the functions that need to modify the device
tree. Using qemu_add_machine_init_done_notifier we register a notifier
that executes all the functions on the list and loads the kernel.
Peter, could you weigh in about whether this is a good idea or not?
Sorry, I think I must have missed this series first time around.
I'm not convinced -- I don't see any reason why we should treat
the PCI host controller differently from other devices in the
The reason for this is that the PCI host controller needs to generate
its device node after all the PCI devices have been added to the bus
(also those specified with the -device option).
This is required by the interrupt-map node property, that specifies
for each PCI device an interrupt map entry. Since we have one device
requiring this 'postponed' node generation, this patch allows also
other devices to do the same.
Recently I was thinking to another approach, which is to have multiple
dtb modifiers called by arm_boot_info.modify_dtb(). Is this
reasonable?
Post by Peter Maydell
virt board: just generate an appropriate dtb fragment in virt.c.
Of course, the method that generates the device node fragment can be
moved to virt.c, but the requirement of postponing its call after the
machine has been created still applies.

Thank you for your feedback,
alvise
Post by Peter Maydell
thanks
-- PMM
Peter Maydell
2015-01-05 16:41:09 UTC
Permalink
Post by alvise rigo
Post by Peter Maydell
Sorry, I think I must have missed this series first time around.
I'm not convinced -- I don't see any reason why we should treat
the PCI host controller differently from other devices in the
The reason for this is that the PCI host controller needs to generate
its device node after all the PCI devices have been added to the bus
(also those specified with the -device option).
This is required by the interrupt-map node property, that specifies
for each PCI device an interrupt map entry. Since we have one device
requiring this 'postponed' node generation, this patch allows also
other devices to do the same.
What? This doesn't sound right -- you can have hot-plugged PCI
devices, for a start. Device tree is only supposed to be
needed for the bits of hardware that can't be probed, and
we can rely on PCI itself to probe the other devices.

interrupt-map as far as I can tell just specifies how the
interrupt lines are mapped for each PCI slot; it won't
change based on whether devices are present or not. The
example in the wiki:
http://devicetree.org/Device_Tree_Usage#Advanced_Interrupt_Mapping
cares about number of slots, but that's all.

thanks
-- PMM
alvise rigo
2015-01-05 17:35:56 UTC
Permalink
Post by Peter Maydell
Post by alvise rigo
Post by Peter Maydell
Sorry, I think I must have missed this series first time around.
I'm not convinced -- I don't see any reason why we should treat
the PCI host controller differently from other devices in the
The reason for this is that the PCI host controller needs to generate
its device node after all the PCI devices have been added to the bus
(also those specified with the -device option).
This is required by the interrupt-map node property, that specifies
for each PCI device an interrupt map entry. Since we have one device
requiring this 'postponed' node generation, this patch allows also
other devices to do the same.
What? This doesn't sound right -- you can have hot-plugged PCI
devices, for a start. Device tree is only supposed to be
needed for the bits of hardware that can't be probed, and
we can rely on PCI itself to probe the other devices.
OK, I see.
Post by Peter Maydell
interrupt-map as far as I can tell just specifies how the
interrupt lines are mapped for each PCI slot; it won't
change based on whether devices are present or not. The
http://devicetree.org/Device_Tree_Usage#Advanced_Interrupt_Mapping
cares about number of slots, but that's all.
So I suppose we need to define a fixed number of PCI slots according
to the number of interrupts available in mach-virt. In this regard,
should this number be a qdev property?

Thank you,
alvise
Post by Peter Maydell
thanks
-- PMM
Peter Maydell
2015-01-05 18:07:34 UTC
Permalink
Post by alvise rigo
So I suppose we need to define a fixed number of PCI slots according
to the number of interrupts available in mach-virt. In this regard,
should this number be a qdev property?
The PCI spec means that a bus has an implicit maximum of
32 slots (the devfn in a PCI address is only 5 bits).
Note that this doesn't imply having 32 interrupt lines.

What you can do is something like this (which is what the
versatilepb device-tree-binding will have):

+ interrupt-map-mask = <0x1800 0 0 7>;
+ interrupt-map = <0x1800 0 0 1 &sic 28
+ 0x1800 0 0 2 &sic 29
+ 0x1800 0 0 3 &sic 30
+ 0x1800 0 0 4 &sic 27
+
+ 0x1000 0 0 1 &sic 27
+ 0x1000 0 0 2 &sic 28
+ 0x1000 0 0 3 &sic 29
+ 0x1000 0 0 4 &sic 30
+
+ 0x0800 0 0 1 &sic 30
+ 0x0800 0 0 2 &sic 27
+ 0x0800 0 0 3 &sic 28
+ 0x0800 0 0 4 &sic 29
+
+ 0x0000 0 0 1 &sic 29
+ 0x0000 0 0 2 &sic 30
+ 0x0000 0 0 3 &sic 27
+ 0x0000 0 0 4 &sic 28>;

That says "we have four interrupts, which are swizzled in
the usual way between slots", and doesn't impose any
particular maximum number of slots. (Notice that the
mask value is 0x1800 0 0 0 7, which means we only match
on the low two bits of the devfn, so a unit-interrupt-specifier
of <0x2000 0x0 0x0 1> for slot 4 matches the entry
<0x0000 0x0 0x0 1> in the map table, as required.)

(You'll want to do something more sensible than 27..30,
which is just what the interrupt lines on the versatilepb
happen to be.)

-- PMM
alvise rigo
2015-01-06 09:56:16 UTC
Permalink
Thank you. I will keep this in mind for the next spin of the patches.

Regards,
alvise
Post by Peter Maydell
Post by alvise rigo
So I suppose we need to define a fixed number of PCI slots according
to the number of interrupts available in mach-virt. In this regard,
should this number be a qdev property?
The PCI spec means that a bus has an implicit maximum of
32 slots (the devfn in a PCI address is only 5 bits).
Note that this doesn't imply having 32 interrupt lines.
What you can do is something like this (which is what the
+ interrupt-map-mask = <0x1800 0 0 7>;
+ interrupt-map = <0x1800 0 0 1 &sic 28
+ 0x1800 0 0 2 &sic 29
+ 0x1800 0 0 3 &sic 30
+ 0x1800 0 0 4 &sic 27
+
+ 0x1000 0 0 1 &sic 27
+ 0x1000 0 0 2 &sic 28
+ 0x1000 0 0 3 &sic 29
+ 0x1000 0 0 4 &sic 30
+
+ 0x0800 0 0 1 &sic 30
+ 0x0800 0 0 2 &sic 27
+ 0x0800 0 0 3 &sic 28
+ 0x0800 0 0 4 &sic 29
+
+ 0x0000 0 0 1 &sic 29
+ 0x0000 0 0 2 &sic 30
+ 0x0000 0 0 3 &sic 27
+ 0x0000 0 0 4 &sic 28>;
That says "we have four interrupts, which are swizzled in
the usual way between slots", and doesn't impose any
particular maximum number of slots. (Notice that the
mask value is 0x1800 0 0 0 7, which means we only match
on the low two bits of the devfn, so a unit-interrupt-specifier
of <0x2000 0x0 0x0 1> for slot 4 matches the entry
<0x0000 0x0 0x0 1> in the map table, as required.)
(You'll want to do something more sensible than 27..30,
which is just what the interrupt lines on the versatilepb
happen to be.)
-- PMM
Eric Auger
2015-01-06 09:18:24 UTC
Permalink
Post by alvise rigo
Hi,
Post by Peter Maydell
On 24 November 2014 at 11:47, Claudio Fontana
Post by Claudio Fontana
Post by Alvise Rigo
Keep a global list with all the functions that need to modify the device
tree. Using qemu_add_machine_init_done_notifier we register a notifier
that executes all the functions on the list and loads the kernel.
Peter, could you weigh in about whether this is a good idea or not?
Sorry, I think I must have missed this series first time around.
I'm not convinced -- I don't see any reason why we should treat
the PCI host controller differently from other devices in the
The reason for this is that the PCI host controller needs to generate
its device node after all the PCI devices have been added to the bus
(also those specified with the -device option).
This is required by the interrupt-map node property, that specifies
for each PCI device an interrupt map entry. Since we have one device
requiring this 'postponed' node generation, this patch allows also
other devices to do the same.
Recently I was thinking to another approach, which is to have multiple
dtb modifiers called by arm_boot_info.modify_dtb(). Is this
reasonable?
Post by Peter Maydell
virt board: just generate an appropriate dtb fragment in virt.c.
Of course, the method that generates the device node fragment can be
moved to virt.c, but the requirement of postponing its call after the
machine has been created still applies.
Hi Alvise, Peter,

Besides the PCI aspects, the dt generation problem that is addressed
here is identical to the one related to VFIO platform device dt node
generation that also needs to happen after machine init.

In "machvirt dynamic sysbus device instantiation"
(https://www.mail-archive.com/qemu-***@nongnu.org/msg272506.html),
arm_load_kernel is proposed to be executed in a machine init done
notifier and VFIO node creation happens in another machine init done
notifier registered after this latter.

Best Regards

Eric
Post by alvise rigo
Thank you for your feedback,
alvise
Post by Peter Maydell
thanks
-- PMM
alvise rigo
2015-01-06 09:29:55 UTC
Permalink
Hi Eric,

You are right. In fact, I've also spent some time to see if it was
possible to use the code you mentioned.
However, it's not needed anymore: the node generation will happen at
machine init for the reasons discussed in this thread.

Regards,
alvise
Post by Eric Auger
Post by alvise rigo
Hi,
Post by Peter Maydell
On 24 November 2014 at 11:47, Claudio Fontana
Post by Claudio Fontana
Post by Alvise Rigo
Keep a global list with all the functions that need to modify the device
tree. Using qemu_add_machine_init_done_notifier we register a notifier
that executes all the functions on the list and loads the kernel.
Peter, could you weigh in about whether this is a good idea or not?
Sorry, I think I must have missed this series first time around.
I'm not convinced -- I don't see any reason why we should treat
the PCI host controller differently from other devices in the
The reason for this is that the PCI host controller needs to generate
its device node after all the PCI devices have been added to the bus
(also those specified with the -device option).
This is required by the interrupt-map node property, that specifies
for each PCI device an interrupt map entry. Since we have one device
requiring this 'postponed' node generation, this patch allows also
other devices to do the same.
Recently I was thinking to another approach, which is to have multiple
dtb modifiers called by arm_boot_info.modify_dtb(). Is this
reasonable?
Post by Peter Maydell
virt board: just generate an appropriate dtb fragment in virt.c.
Of course, the method that generates the device node fragment can be
moved to virt.c, but the requirement of postponing its call after the
machine has been created still applies.
Hi Alvise, Peter,
Besides the PCI aspects, the dt generation problem that is addressed
here is identical to the one related to VFIO platform device dt node
generation that also needs to happen after machine init.
In "machvirt dynamic sysbus device instantiation"
arm_load_kernel is proposed to be executed in a machine init done
notifier and VFIO node creation happens in another machine init done
notifier registered after this latter.
Best Regards
Eric
Post by alvise rigo
Thank you for your feedback,
alvise
Post by Peter Maydell
thanks
-- PMM
Peter Maydell
2015-01-06 09:51:37 UTC
Permalink
Post by Eric Auger
Besides the PCI aspects, the dt generation problem that is addressed
here is identical to the one related to VFIO platform device dt node
generation that also needs to happen after machine init.
Right, for VFIO we need it; but for PCI we don't, so we shouldn't
tangle the two up together unnecessarily.

-- PMM
Eric Auger
2015-01-06 10:05:53 UTC
Permalink
Post by Peter Maydell
Post by Eric Auger
Besides the PCI aspects, the dt generation problem that is addressed
here is identical to the one related to VFIO platform device dt node
generation that also needs to happen after machine init.
Right, for VFIO we need it; but for PCI we don't, so we shouldn't
tangle the two up together unnecessarily.
OK understood

Eric
Post by Peter Maydell
-- PMM
Claudio Fontana
2014-11-24 10:34:21 UTC
Permalink
Post by Alvise Rigo
Add a generic PCI host controller for virtual platforms, based on the
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
Would it not be better to add rob herring's signoff, and then the explanation
of what you changed from his patch, followed by your signoff?
Or did you change so much that you had to redo the original work by Rob Herring
from scratch?
Post by Alvise Rigo
The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller.
For the time being, the device expects a GIC v2 to be used
by the guest.
That's a pretty big limitation for a "generic" controller.
If it's only about the size of a parent interrupt cell or something like
that, why don't we provide it as part of the platform description that
you pass to the machinery anyway?
Post by Alvise Rigo
Only mach-virt has been used to test the controller.
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ *
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
I would add a comment:
/* the MemoryRegionOps require uint64_t val, but we can only do 32bit */

there are already asserts in pci_host.c, so that should be sufficient.
Post by Alvise Rigo
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
/* same here */
Post by Alvise Rigo
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
TYPE_PCI_BUS, until we actually support PCIE.
Post by Alvise Rigo
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
s/mememory/memory/
Post by Alvise Rigo
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
cannot this be generic enough that we don't talk about gic here?
Post by Alvise Rigo
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
? Can you remove this empty comment ?
Post by Alvise Rigo
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
ah now I see, I think you forgot a comment above. But maybe here is a better place.
The above needs to be commented heavily, to make it obvious what each field is,
and which "spec" this is following. You have mentioned already in the previous discussion,
but it needs to be in the code.
Post by Alvise Rigo
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
interrupt-map
Post by Alvise Rigo
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
Is it commented somewhere sensible that the generic pci has this layout:

cfg = whatever base address
io = base + 0x1000000
mem = base + 0x2000000
Post by Alvise Rigo
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
hmm could it be gic-independent?
Post by Alvise Rigo
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
and what about 0, 0, 7, what do they mean? How does the mapping work,
what do all the fields mean, and which paper/spec are you implementing?..
Post by Alvise Rigo
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
could this be gic independent?
Post by Alvise Rigo
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
+}
+
+static const TypeInfo pci_generic_host_info = {
+ .name = TYPE_GENERIC_PCI_HOST,
+ .parent = TYPE_PCI_DEVICE,
+ .instance_size = sizeof(GenericPCIHostState),
+ .class_init = pci_generic_host_class_init,
+};
+
+static void pci_generic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = pci_generic_host_realize;
+ dc->vmsd = &pci_generic_host_vmstate;
+}
+
+static const TypeInfo pci_generic_info = {
+ .name = TYPE_GENERIC_PCI,
+ .parent = TYPE_PCI_HOST_BRIDGE,
+ .instance_size = sizeof(PCIGenState),
+ .instance_init = pci_generic_host_init,
+ .class_init = pci_generic_class_init,
+};
+
+static void generic_pci_host_register_types(void)
+{
+ type_register_static(&pci_generic_info);
+ type_register_static(&pci_generic_host_info);
+}
+
+type_init(generic_pci_host_register_types)
\ No newline at end of file
diff --git a/include/hw/pci-host/generic-pci.h b/include/hw/pci-host/generic-pci.h
new file mode 100644
index 0000000..43c7a0f
--- /dev/null
+++ b/include/hw/pci-host/generic-pci.h
@@ -0,0 +1,74 @@
+#ifndef QEMU_GENERIC_PCI_H
+#define QEMU_GENERIC_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+
+#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
+
maybe this would be a good place to talk about what the address space
layout assumptions of generic pci are, like

/* The Configuration space starts at the PCI base address passed in the virtual platform
* information,
* The I/O space starts at an offset of 0x1000000 from the PCI base address
* The Mem space starts at an offset of 0x2000000 from the PCI base address
*/

these the assumptions right?
Post by Alvise Rigo
+enum {
+ PCI_CFG = 0,
+ PCI_IO,
+ PCI_MEM,
+};
+
+typedef struct {
+ /* addr_map[PCI_CFG].addr = base address of configuration memory
+ addr_map[PCI_CFG].size = configuration memory size
+ ... */
+ struct addr_map {
+ hwaddr addr;
+ hwaddr size;
+ } addr_map[3];
+
+ const char *gic_node_name;
+ uint32_t base_irq;
+ uint32_t irqs;
+} GenericPCIPlatformData;
+
+typedef struct {
+ PCIDevice parent_obj;
+
+ struct irqmap {
+ /* devfn_idx_map[i][j] = index inside slot_irq_map for device at slot i,
+ fn j */
+ uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
+ /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached to
+ the bus */
+ uint8_t devfn_irq_map[MAX_PCI_DEVICES];
+ } irqmap;
+ /* number of devices attached to the bus */
+ uint32_t devfns;
+} GenericPCIHostState;
+
+typedef struct {
+ PCIHostState parent_obj;
+
+ qemu_irq irq[MAX_PCI_DEVICES];
+ MemoryRegion mem_config;
+ /* Containers representing the PCI address spaces */
+ MemoryRegion pci_io_space;
+ MemoryRegion pci_mem_space;
+ /* Alias regions into PCI address spaces which we expose as sysbus regions.
+ * The offsets into pci_mem_space is fixed. */
+ MemoryRegion pci_io_window;
+ MemoryRegion pci_mem_window;
+ PCIBus pci_bus;
+ GenericPCIHostState pci_gen;
+
+ /* Platform dependent data */
+ GenericPCIPlatformData plat_data;
+} PCIGenState;
+
+#define TYPE_GENERIC_PCI "generic_pci"
+#define PCI_GEN(obj) \
+ OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
+
+#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
+#define PCI_GEN_HOST(obj) \
+ OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
+
+#endif
\ No newline at end of file
alvise rigo
2014-11-24 14:57:56 UTC
Permalink
Hi Claudio,

Thank you for your review. Please see my in-line comments.

On Mon, Nov 24, 2014 at 11:34 AM, Claudio Fontana
Post by Claudio Fontana
Post by Alvise Rigo
Add a generic PCI host controller for virtual platforms, based on the
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
Would it not be better to add rob herring's signoff, and then the explanation
of what you changed from his patch, followed by your signoff?
Or did you change so much that you had to redo the original work by Rob Herring
from scratch?
It was actually difficult to create some meaningful patches over the
initial ones.
It's sure that for a final version these details will be properly
addressed (signoff and diff over the original patches if possible).
Post by Claudio Fontana
Post by Alvise Rigo
The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller.
For the time being, the device expects a GIC v2 to be used
by the guest.
That's a pretty big limitation for a "generic" controller.
If it's only about the size of a parent interrupt cell or something like
that, why don't we provide it as part of the platform description that
you pass to the machinery anyway?
Yes we can, however being totally interrupt controller independent
means that the platform has to fully provide the way to describe a
parent interrupt.
For the time being, we use a description compatible with GICv2 (and
v3), to cover the use case mach-virt + generic-pci (this is why I also
called the interrupt specific structures "gic*").
If we really need to be more general, I can see two solutions: the
first one is to find a way to pass all the necessary interrupt
controller information to the device machinery (phandle, #cells,
interrupt number, etc.).
I don't know how such a solution could get complicated in the future,
with some other virt-platform that requires a complicated interrupt
mapping for example.
The second would be to move all the device node generation back to the
platform, making its code even more crowded.
Are there other solutions that I'm missing?
Post by Claudio Fontana
Post by Alvise Rigo
Only mach-virt has been used to test the controller.
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ *
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
/* the MemoryRegionOps require uint64_t val, but we can only do 32bit */
there are already asserts in pci_host.c, so that should be sufficient.
Agreed.
Post by Claudio Fontana
Post by Alvise Rigo
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
/* same here */
Post by Alvise Rigo
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
TYPE_PCI_BUS, until we actually support PCIE.
This was present in the first RFC as well since harmless, but I will
make it PCI only.
Post by Claudio Fontana
Post by Alvise Rigo
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
s/mememory/memory/
ACK.
Post by Claudio Fontana
Post by Alvise Rigo
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
cannot this be generic enough that we don't talk about gic here?
Yes. I will change it.
Post by Claudio Fontana
Post by Alvise Rigo
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
? Can you remove this empty comment ?
Of course, I missed it.
Post by Claudio Fontana
Post by Alvise Rigo
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
ah now I see, I think you forgot a comment above. But maybe here is a better place.
The above needs to be commented heavily, to make it obvious what each field is,
and which "spec" this is following. You have mentioned already in the previous discussion,
but it needs to be in the code.
I will do it.
Post by Claudio Fontana
Post by Alvise Rigo
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
interrupt-map
ACK.
Post by Claudio Fontana
Post by Alvise Rigo
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
cfg = whatever base address
io = base + 0x1000000
mem = base + 0x2000000
The value 0x1000000 actually means that the following address has to
be interpreted as I/O address, while 0x2000000 means that it's a 32bit
address.
I will add some more documentation here, for now I point to page 4 of
the following document where the whole encoding is explained:
http://www.o3one.org/hwdocs/openfirmware/pci_supplement_2_1.pdf
All the documentation should also be available at the website
http://www.openfirmware.org (as pointed also by the kernel
documentation file Documentation/devicetree/bindings/pci/pci.txt),
however the domain does not seem active.
Post by Claudio Fontana
Post by Alvise Rigo
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
hmm could it be gic-independent?
Post by Alvise Rigo
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
and what about 0, 0, 7, what do they mean? How does the mapping work,
what do all the fields mean, and which paper/spec are you implementing?..
0, 0 are needed to mask out the phys.mid and hys.low bits of the PCI
address (since are not needed to resolve the interrupt).
I will add a more detailed documentation in the next version.
Post by Claudio Fontana
Post by Alvise Rigo
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
could this be gic independent?
Yes, but still we need a field necessary to store the interrupt
controller phandle.
Post by Claudio Fontana
Post by Alvise Rigo
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
+}
+
+static const TypeInfo pci_generic_host_info = {
+ .name = TYPE_GENERIC_PCI_HOST,
+ .parent = TYPE_PCI_DEVICE,
+ .instance_size = sizeof(GenericPCIHostState),
+ .class_init = pci_generic_host_class_init,
+};
+
+static void pci_generic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = pci_generic_host_realize;
+ dc->vmsd = &pci_generic_host_vmstate;
+}
+
+static const TypeInfo pci_generic_info = {
+ .name = TYPE_GENERIC_PCI,
+ .parent = TYPE_PCI_HOST_BRIDGE,
+ .instance_size = sizeof(PCIGenState),
+ .instance_init = pci_generic_host_init,
+ .class_init = pci_generic_class_init,
+};
+
+static void generic_pci_host_register_types(void)
+{
+ type_register_static(&pci_generic_info);
+ type_register_static(&pci_generic_host_info);
+}
+
+type_init(generic_pci_host_register_types)
\ No newline at end of file
diff --git a/include/hw/pci-host/generic-pci.h b/include/hw/pci-host/generic-pci.h
new file mode 100644
index 0000000..43c7a0f
--- /dev/null
+++ b/include/hw/pci-host/generic-pci.h
@@ -0,0 +1,74 @@
+#ifndef QEMU_GENERIC_PCI_H
+#define QEMU_GENERIC_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+
+#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
+
maybe this would be a good place to talk about what the address space
layout assumptions of generic pci are, like
/* The Configuration space starts at the PCI base address passed in the virtual platform
* information,
* The I/O space starts at an offset of 0x1000000 from the PCI base address
* The Mem space starts at an offset of 0x2000000 from the PCI base address
*/
these the assumptions right?
Actually this is not completely true. As explained before, 0x01000000
and 0x02000000 indicate an I/O region and a 32bit addressable region.
The I/O window (pci_io_window) starts at the beginning of the I/O
address space (pci_io_space), no matter where it is mapped to by the
platform.
For example, mach-virt maps it to the CPU address 0x11000000 but from
the PCI bus point of view, it starts at address 0x0.
For the memory window (pci_mem_window) instead we need to place it at
the same address in the memory space (mem_io_space) to which it has
been mapped by the platform.
In mach-virt, this region starts at CPU (and also PCI) address 0x12000000.
I hope to have been enough clear.

Regards,
alvise
Post by Claudio Fontana
Post by Alvise Rigo
+enum {
+ PCI_CFG = 0,
+ PCI_IO,
+ PCI_MEM,
+};
+
+typedef struct {
+ /* addr_map[PCI_CFG].addr = base address of configuration memory
+ addr_map[PCI_CFG].size = configuration memory size
+ ... */
+ struct addr_map {
+ hwaddr addr;
+ hwaddr size;
+ } addr_map[3];
+
+ const char *gic_node_name;
+ uint32_t base_irq;
+ uint32_t irqs;
+} GenericPCIPlatformData;
+
+typedef struct {
+ PCIDevice parent_obj;
+
+ struct irqmap {
+ /* devfn_idx_map[i][j] = index inside slot_irq_map for device at slot i,
+ fn j */
+ uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
+ /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached to
+ the bus */
+ uint8_t devfn_irq_map[MAX_PCI_DEVICES];
+ } irqmap;
+ /* number of devices attached to the bus */
+ uint32_t devfns;
+} GenericPCIHostState;
+
+typedef struct {
+ PCIHostState parent_obj;
+
+ qemu_irq irq[MAX_PCI_DEVICES];
+ MemoryRegion mem_config;
+ /* Containers representing the PCI address spaces */
+ MemoryRegion pci_io_space;
+ MemoryRegion pci_mem_space;
+ /* Alias regions into PCI address spaces which we expose as sysbus regions.
+ * The offsets into pci_mem_space is fixed. */
+ MemoryRegion pci_io_window;
+ MemoryRegion pci_mem_window;
+ PCIBus pci_bus;
+ GenericPCIHostState pci_gen;
+
+ /* Platform dependent data */
+ GenericPCIPlatformData plat_data;
+} PCIGenState;
+
+#define TYPE_GENERIC_PCI "generic_pci"
+#define PCI_GEN(obj) \
+ OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
+
+#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
+#define PCI_GEN_HOST(obj) \
+ OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
+
+#endif
\ No newline at end of file
Claudio Fontana
2014-11-25 10:20:02 UTC
Permalink
Post by alvise rigo
Hi Claudio,
Thank you for your review. Please see my in-line comments.
On Mon, Nov 24, 2014 at 11:34 AM, Claudio Fontana
Post by Claudio Fontana
Post by Alvise Rigo
Add a generic PCI host controller for virtual platforms, based on the
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
Would it not be better to add rob herring's signoff, and then the explanation
of what you changed from his patch, followed by your signoff?
Or did you change so much that you had to redo the original work by Rob Herring
from scratch?
It was actually difficult to create some meaningful patches over the
initial ones.
It's sure that for a final version these details will be properly
addressed (signoff and diff over the original patches if possible).
Post by Claudio Fontana
Post by Alvise Rigo
The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller.
For the time being, the device expects a GIC v2 to be used
by the guest.
That's a pretty big limitation for a "generic" controller.
If it's only about the size of a parent interrupt cell or something like
that, why don't we provide it as part of the platform description that
you pass to the machinery anyway?
Yes we can, however being totally interrupt controller independent
means that the platform has to fully provide the way to describe a
parent interrupt.
For the time being, we use a description compatible with GICv2 (and
v3), to cover the use case mach-virt + generic-pci (this is why I also
called the interrupt specific structures "gic*").
If we really need to be more general, I can see two solutions: the
first one is to find a way to pass all the necessary interrupt
controller information to the device machinery (phandle, #cells,
interrupt number, etc.).
I don't know how such a solution could get complicated in the future,
with some other virt-platform that requires a complicated interrupt
mapping for example.
The second would be to move all the device node generation back to the
platform, making its code even more crowded.
Are there other solutions that I'm missing?
Post by Claudio Fontana
Post by Alvise Rigo
Only mach-virt has been used to test the controller.
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ *
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
/* the MemoryRegionOps require uint64_t val, but we can only do 32bit */
there are already asserts in pci_host.c, so that should be sufficient.
Agreed.
Post by Claudio Fontana
Post by Alvise Rigo
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
/* same here */
Post by Alvise Rigo
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
TYPE_PCI_BUS, until we actually support PCIE.
This was present in the first RFC as well since harmless, but I will
make it PCI only.
Post by Claudio Fontana
Post by Alvise Rigo
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
s/mememory/memory/
ACK.
Post by Claudio Fontana
Post by Alvise Rigo
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
cannot this be generic enough that we don't talk about gic here?
Yes. I will change it.
Post by Claudio Fontana
Post by Alvise Rigo
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
? Can you remove this empty comment ?
Of course, I missed it.
Post by Claudio Fontana
Post by Alvise Rigo
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
ah now I see, I think you forgot a comment above. But maybe here is a better place.
The above needs to be commented heavily, to make it obvious what each field is,
and which "spec" this is following. You have mentioned already in the previous discussion,
but it needs to be in the code.
I will do it.
Post by Claudio Fontana
Post by Alvise Rigo
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
interrupt-map
ACK.
Post by Claudio Fontana
Post by Alvise Rigo
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
cfg = whatever base address
io = base + 0x1000000
mem = base + 0x2000000
The value 0x1000000 actually means that the following address has to
be interpreted as I/O address, while 0x2000000 means that it's a 32bit
address.
I will add some more documentation here, for now I point to page 4 of
http://www.o3one.org/hwdocs/openfirmware/pci_supplement_2_1.pdf
All the documentation should also be available at the website
http://www.openfirmware.org (as pointed also by the kernel
documentation file Documentation/devicetree/bindings/pci/pci.txt),
however the domain does not seem active.
Post by Claudio Fontana
Post by Alvise Rigo
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
hmm could it be gic-independent?
Post by Alvise Rigo
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
and what about 0, 0, 7, what do they mean? How does the mapping work,
what do all the fields mean, and which paper/spec are you implementing?..
0, 0 are needed to mask out the phys.mid and hys.low bits of the PCI
address (since are not needed to resolve the interrupt).
I will add a more detailed documentation in the next version.
Post by Claudio Fontana
Post by Alvise Rigo
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
could this be gic independent?
Yes, but still we need a field necessary to store the interrupt
controller phandle.
Post by Claudio Fontana
Post by Alvise Rigo
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
+}
+
+static const TypeInfo pci_generic_host_info = {
+ .name = TYPE_GENERIC_PCI_HOST,
+ .parent = TYPE_PCI_DEVICE,
+ .instance_size = sizeof(GenericPCIHostState),
+ .class_init = pci_generic_host_class_init,
+};
+
+static void pci_generic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = pci_generic_host_realize;
+ dc->vmsd = &pci_generic_host_vmstate;
+}
+
+static const TypeInfo pci_generic_info = {
+ .name = TYPE_GENERIC_PCI,
+ .parent = TYPE_PCI_HOST_BRIDGE,
+ .instance_size = sizeof(PCIGenState),
+ .instance_init = pci_generic_host_init,
+ .class_init = pci_generic_class_init,
+};
+
+static void generic_pci_host_register_types(void)
+{
+ type_register_static(&pci_generic_info);
+ type_register_static(&pci_generic_host_info);
+}
+
+type_init(generic_pci_host_register_types)
\ No newline at end of file
diff --git a/include/hw/pci-host/generic-pci.h b/include/hw/pci-host/generic-pci.h
new file mode 100644
index 0000000..43c7a0f
--- /dev/null
+++ b/include/hw/pci-host/generic-pci.h
@@ -0,0 +1,74 @@
+#ifndef QEMU_GENERIC_PCI_H
+#define QEMU_GENERIC_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+
+#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
+
maybe this would be a good place to talk about what the address space
layout assumptions of generic pci are, like
/* The Configuration space starts at the PCI base address passed in the virtual platform
* information,
* The I/O space starts at an offset of 0x1000000 from the PCI base address
* The Mem space starts at an offset of 0x2000000 from the PCI base address
*/
these the assumptions right?
Actually this is not completely true. As explained before, 0x01000000
and 0x02000000 indicate an I/O region and a 32bit addressable region.
The I/O window (pci_io_window) starts at the beginning of the I/O
address space (pci_io_space), no matter where it is mapped to by the
platform.
For example, mach-virt maps it to the CPU address 0x11000000 but from
the PCI bus point of view, it starts at address 0x0.
For the memory window (pci_mem_window) instead we need to place it at
the same address in the memory space (mem_io_space) to which it has
been mapped by the platform.
In mach-virt, this region starts at CPU (and also PCI) address 0x12000000.
I hope to have been enough clear.
Well yes, but the problem is that using those magic numbers (0x1000000 and 0x2000000)
instead of properly named macros, along with the lack of documentation about what is being done,
causes exactly the kind of confusion for the reader which I incurred into.
Post by alvise rigo
Regards,
alvise
Post by Claudio Fontana
Post by Alvise Rigo
+enum {
+ PCI_CFG = 0,
+ PCI_IO,
+ PCI_MEM,
+};
+
+typedef struct {
+ /* addr_map[PCI_CFG].addr = base address of configuration memory
+ addr_map[PCI_CFG].size = configuration memory size
+ ... */
+ struct addr_map {
+ hwaddr addr;
+ hwaddr size;
+ } addr_map[3];
+
+ const char *gic_node_name;
+ uint32_t base_irq;
+ uint32_t irqs;
+} GenericPCIPlatformData;
+
+typedef struct {
+ PCIDevice parent_obj;
+
+ struct irqmap {
+ /* devfn_idx_map[i][j] = index inside slot_irq_map for device at slot i,
+ fn j */
+ uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
+ /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached to
+ the bus */
+ uint8_t devfn_irq_map[MAX_PCI_DEVICES];
+ } irqmap;
+ /* number of devices attached to the bus */
+ uint32_t devfns;
+} GenericPCIHostState;
+
+typedef struct {
+ PCIHostState parent_obj;
+
+ qemu_irq irq[MAX_PCI_DEVICES];
+ MemoryRegion mem_config;
+ /* Containers representing the PCI address spaces */
+ MemoryRegion pci_io_space;
+ MemoryRegion pci_mem_space;
+ /* Alias regions into PCI address spaces which we expose as sysbus regions.
+ * The offsets into pci_mem_space is fixed. */
+ MemoryRegion pci_io_window;
+ MemoryRegion pci_mem_window;
+ PCIBus pci_bus;
+ GenericPCIHostState pci_gen;
+
+ /* Platform dependent data */
+ GenericPCIPlatformData plat_data;
+} PCIGenState;
+
+#define TYPE_GENERIC_PCI "generic_pci"
+#define PCI_GEN(obj) \
+ OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
+
+#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
+#define PCI_GEN_HOST(obj) \
+ OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
+
+#endif
\ No newline at end of file
--
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München

office: +49 89 158834 4135
mobile: +49 15253060158
Claudio Fontana
2014-11-24 10:38:49 UTC
Permalink
Instantiate a generic-pci PCI controller to add a PCI bus to the
mach-virt platform. The platform memory map has now three more memory
ranges to map the device's memory regions (Configuration region, I/O
region and Memory region). Now that a PCI bus is provided, the machine
could be launched with multiple PCI devices through the -device option
(e.g., -device virtio-blk-pci).
---
hw/arm/virt.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4e7b869..74e6838 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -32,6 +32,7 @@
#include "hw/arm/arm.h"
#include "hw/arm/primecell.h"
#include "hw/devices.h"
+#include "hw/pci-host/generic-pci.h"
#include "net/net.h"
#include "sysemu/block-backend.h"
#include "sysemu/device_tree.h"
@@ -44,6 +45,7 @@
#include "qemu/error-report.h"
#define NUM_VIRTIO_TRANSPORTS 32
+#define NUM_PCI_IRQS 20
/* Number of external interrupt lines to configure the GIC with */
#define NUM_IRQS 128
@@ -68,6 +70,9 @@ enum {
VIRT_UART,
VIRT_MMIO,
VIRT_RTC,
+ VIRT_PCI_CFG,
+ VIRT_PCI_IO,
+ VIRT_PCI_MEM,
};
typedef void (*modify_dtb_func)(void *fdt, DeviceState *dev);
@@ -120,6 +125,9 @@ static const MemMapEntry a15memmap[] = {
[VIRT_MMIO] = { 0x0a000000, 0x00000200 },
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that size */
/* 0x10000000 .. 0x40000000 reserved for PCI */
+ [VIRT_PCI_CFG] = { 0x10000000, 0x01000000 },
+ [VIRT_PCI_IO] = { 0x11000000, 0x00010000 },
+ [VIRT_PCI_MEM] = { 0x12000000, 0x2e000000 },
[VIRT_MEM] = { 0x40000000, 30ULL * 1024 * 1024 * 1024 },
};
@@ -127,6 +135,7 @@ static const int a15irqmap[] = {
[VIRT_UART] = 1,
[VIRT_RTC] = 2,
[VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+ [VIRT_PCI_CFG] = 47,
why not say
[VIRT_PCI_CFG] = 16 + NUM_VIRTIO_TRANSPORTS

instead of "47"?

By the way isn't 47 wrong anyway? From 16 to 47 it's virtio transports no?
Should it not be 48?
};
static VirtBoardInfo machines[] = {
@@ -550,6 +559,50 @@ static void create_flash(const VirtBoardInfo *vbi)
g_free(nodename);
}
+static void create_pci_host(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+ PCIBus *pci_bus;
+ DeviceState *dev;
+ SysBusDevice *busdev;
+ PCIGenState *pgdev;
+ GenericPCIPlatformData *pdata;
+ int i;
+ const MemMapEntry *mme = NULL;
+
+ dev = qdev_create(NULL, "generic_pci");
+ busdev = SYS_BUS_DEVICE(dev);
+ pgdev = PCI_GEN(dev);
+
+ mme = vbi->memmap;
+
+ /* Pass platform dependent data to the controller. */
+ pdata = &pgdev->plat_data;
+ pdata->addr_map[PCI_CFG].addr = mme[VIRT_PCI_CFG].base;
+ pdata->addr_map[PCI_CFG].size = mme[VIRT_PCI_CFG].size;
+ pdata->addr_map[PCI_IO].addr = mme[VIRT_PCI_IO].base;
+ pdata->addr_map[PCI_IO].size = mme[VIRT_PCI_IO].size;
+ pdata->addr_map[PCI_MEM].addr = mme[VIRT_PCI_MEM].base;
+ pdata->addr_map[PCI_MEM].size = mme[VIRT_PCI_MEM].size;
+ pdata->gic_node_name = "/intc";
+ pdata->base_irq = vbi->irqmap[VIRT_PCI_CFG];
+ pdata->irqs = NUM_PCI_IRQS;
+ qdev_init_nofail(dev);
+
+ sysbus_mmio_map(busdev, 0, mme[VIRT_PCI_CFG].base); /* PCI config */
+ sysbus_mmio_map(busdev, 1, mme[VIRT_PCI_IO].base); /* PCI I/O */
+ sysbus_mmio_map(busdev, 2, mme[VIRT_PCI_MEM].base); /* PCI memory window */
+
+ for ( i = 0; i < NUM_PCI_IRQS; i++) {
+ sysbus_connect_irq(busdev, i, pic[vbi->irqmap[VIRT_PCI_CFG] + i]);
+ }
+
+ pci_bus = (PCIBus *)qdev_get_child_bus(dev, "pci");
+ pci_create_simple(pci_bus, -1, "pci-ohci");
+ pci_create_simple(pci_bus, -1, "lsi53c895a");
+
+ add_dtb_modifier(pci_controller_build_dt_node, dev);
+}
+
static void *machvirt_dtb(const struct arm_boot_info *binfo, int *fdt_size)
{
const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
@@ -658,6 +711,8 @@ static void machvirt_init(MachineState *machine)
*/
create_virtio_devices(vbi, pic);
+ create_pci_host(vbi, pic);
+
Notifier *finalize_dtb_notifier = g_new(Notifier, 1);
finalize_dtb_notifier->notify = machvirt_finalize_dt;
qemu_add_machine_init_done_notifier(finalize_dtb_notifier);
alvise rigo
2014-11-24 10:47:45 UTC
Permalink
Yes, I forgot to remove this hard-coded (wrong) value. I will fix it in the
next release.

On Mon, Nov 24, 2014 at 11:38 AM, Claudio Fontana <
Instantiate a generic-pci PCI controller to add a PCI bus to the
mach-virt platform. The platform memory map has now three more memory
ranges to map the device's memory regions (Configuration region, I/O
region and Memory region). Now that a PCI bus is provided, the machine
could be launched with multiple PCI devices through the -device option
(e.g., -device virtio-blk-pci).
---
hw/arm/virt.c | 55
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4e7b869..74e6838 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -32,6 +32,7 @@
#include "hw/arm/arm.h"
#include "hw/arm/primecell.h"
#include "hw/devices.h"
+#include "hw/pci-host/generic-pci.h"
#include "net/net.h"
#include "sysemu/block-backend.h"
#include "sysemu/device_tree.h"
@@ -44,6 +45,7 @@
#include "qemu/error-report.h"
#define NUM_VIRTIO_TRANSPORTS 32
+#define NUM_PCI_IRQS 20
/* Number of external interrupt lines to configure the GIC with */
#define NUM_IRQS 128
@@ -68,6 +70,9 @@ enum {
VIRT_UART,
VIRT_MMIO,
VIRT_RTC,
+ VIRT_PCI_CFG,
+ VIRT_PCI_IO,
+ VIRT_PCI_MEM,
};
typedef void (*modify_dtb_func)(void *fdt, DeviceState *dev);
@@ -120,6 +125,9 @@ static const MemMapEntry a15memmap[] = {
[VIRT_MMIO] = { 0x0a000000, 0x00000200 },
/* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that
size */
/* 0x10000000 .. 0x40000000 reserved for PCI */
+ [VIRT_PCI_CFG] = { 0x10000000, 0x01000000 },
+ [VIRT_PCI_IO] = { 0x11000000, 0x00010000 },
+ [VIRT_PCI_MEM] = { 0x12000000, 0x2e000000 },
[VIRT_MEM] = { 0x40000000, 30ULL * 1024 * 1024 * 1024 },
};
@@ -127,6 +135,7 @@ static const int a15irqmap[] = {
[VIRT_UART] = 1,
[VIRT_RTC] = 2,
[VIRT_MMIO] = 16, /* ...to 16 + NUM_VIRTIO_TRANSPORTS - 1 */
+ [VIRT_PCI_CFG] = 47,
why not say
[VIRT_PCI_CFG] = 16 + NUM_VIRTIO_TRANSPORTS
instead of "47"?
By the way isn't 47 wrong anyway? From 16 to 47 it's virtio transports no?
Should it not be 48?
};
static VirtBoardInfo machines[] = {
@@ -550,6 +559,50 @@ static void create_flash(const VirtBoardInfo *vbi)
g_free(nodename);
}
+static void create_pci_host(const VirtBoardInfo *vbi, qemu_irq *pic)
+{
+ PCIBus *pci_bus;
+ DeviceState *dev;
+ SysBusDevice *busdev;
+ PCIGenState *pgdev;
+ GenericPCIPlatformData *pdata;
+ int i;
+ const MemMapEntry *mme = NULL;
+
+ dev = qdev_create(NULL, "generic_pci");
+ busdev = SYS_BUS_DEVICE(dev);
+ pgdev = PCI_GEN(dev);
+
+ mme = vbi->memmap;
+
+ /* Pass platform dependent data to the controller. */
+ pdata = &pgdev->plat_data;
+ pdata->addr_map[PCI_CFG].addr = mme[VIRT_PCI_CFG].base;
+ pdata->addr_map[PCI_CFG].size = mme[VIRT_PCI_CFG].size;
+ pdata->addr_map[PCI_IO].addr = mme[VIRT_PCI_IO].base;
+ pdata->addr_map[PCI_IO].size = mme[VIRT_PCI_IO].size;
+ pdata->addr_map[PCI_MEM].addr = mme[VIRT_PCI_MEM].base;
+ pdata->addr_map[PCI_MEM].size = mme[VIRT_PCI_MEM].size;
+ pdata->gic_node_name = "/intc";
+ pdata->base_irq = vbi->irqmap[VIRT_PCI_CFG];
+ pdata->irqs = NUM_PCI_IRQS;
+ qdev_init_nofail(dev);
+
+ sysbus_mmio_map(busdev, 0, mme[VIRT_PCI_CFG].base); /* PCI config */
+ sysbus_mmio_map(busdev, 1, mme[VIRT_PCI_IO].base); /* PCI I/O */
+ sysbus_mmio_map(busdev, 2, mme[VIRT_PCI_MEM].base); /* PCI memory
window */
+
+ for ( i = 0; i < NUM_PCI_IRQS; i++) {
+ sysbus_connect_irq(busdev, i, pic[vbi->irqmap[VIRT_PCI_CFG] +
i]);
+ }
+
+ pci_bus = (PCIBus *)qdev_get_child_bus(dev, "pci");
+ pci_create_simple(pci_bus, -1, "pci-ohci");
+ pci_create_simple(pci_bus, -1, "lsi53c895a");
+
+ add_dtb_modifier(pci_controller_build_dt_node, dev);
+}
+
static void *machvirt_dtb(const struct arm_boot_info *binfo, int
*fdt_size)
{
const VirtBoardInfo *board = (const VirtBoardInfo *)binfo;
@@ -658,6 +711,8 @@ static void machvirt_init(MachineState *machine)
*/
create_virtio_devices(vbi, pic);
+ create_pci_host(vbi, pic);
+
Notifier *finalize_dtb_notifier = g_new(Notifier, 1);
finalize_dtb_notifier->notify = machvirt_finalize_dt;
qemu_add_machine_init_done_notifier(finalize_dtb_notifier);
Claudio Fontana
2014-11-24 15:50:40 UTC
Permalink
Another general question about this series use:

why do all these other devices that are unrelated to the virt platform show up?
Here I am running on the guest with just virtio-net, virtio-blk and virtio-rng:

(qemu) info pci
Bus 0, device 0, function 0:
Class 2880: PCI device 1b36:1234
id ""
Bus 0, device 1, function 0:
USB controller: PCI device 106b:003f
IRQ 0.
BAR0: 32 bit memory at 0xffffffffffffffff [0x000000fe].
id ""
Bus 0, device 2, function 0:
SCSI controller: PCI device 1000:0012
IRQ 0.
BAR0: I/O at 0xffffffffffffffff [0x00fe].
BAR1: 32 bit memory at 0xffffffffffffffff [0x000003fe].
BAR2: 32 bit memory at 0xffffffffffffffff [0x00001ffe].
id ""
Bus 0, device 3, function 0:
SCSI controller: PCI device 1af4:1001
IRQ 0.
BAR0: I/O at 0x0100 [0x013f].
id "blk0"
Bus 0, device 4, function 0:
Class 0255: PCI device 1af4:1005
IRQ 0.
BAR0: I/O at 0x0140 [0x015f].
id ""
Bus 0, device 5, function 0:
Ethernet controller: PCI device 1af4:1000
IRQ 0.
BAR0: I/O at 0x0160 [0x017f].
BAR6: 32 bit memory at 0xffffffffffffffff [0x0003fffe].
id ""

Also what is the BAR6 for the virtio-net device? I am struggling to understand where it is coming from...

Thanks,

Claudio
This patch series is based on the previous work [1] and [2] by Rob
Herring and on [3] by myself. For sake of readability and since this is
still a RFC, these patches come as a stand alone work, so there's no
need to apply first [1][2][3]. it tries to enhance this work on these
- The code should be general enough to allow the use of the controller
with other platforms, not only with mach-virt. The only assumption
made is that a GIC v2 is used at guest side (the interrupt-map
property describes the parent interrupts using the three cells
format).
- The interrupt-map node generation has been enhanced in the following
- support of multi-function PCI device has been added
- a PCI device can now use an interrupt pin different from #INTA
Since some other works like [4] require to modify the device tree only
when all the devices have been instantiated, the PATCH[1/4] proposes a
solution for mach-virt to allow multiple agents (e.g., generic-pci,
VFIO) to modify the device tree. The approach in simple: a global list
is kept to store all the routines that perform the modification of the
device tree. Eventually, when the machine is completed, all these
routines are sequentially executed and the kernel is loaded to the guest
by mean of a machine_init_done_notifier.
Rather than postponing the arm_load_kernel call as this patch does,
should we use instead the modify_dtb call provided by arm_boot_info to
modify the device tree?
If so, shouldn't modify_dtb be used to modify only *user* provided
device trees?
This work has been tested attaching several PCI devices to the mach-virt
platform using an ARMv7 CPU. The tested devices are: virtio-blk-pci,
virtio-net-pci, lsi53c895a and pci-ohci (all attached at the same time).
- Add MSI, MSI-X support
- PCI-E support. Due to a lack of devices, this part is a bit hard to
accomplish at the moment.
Thank you, alvise
[1]
"[Qemu-devel] [RFC PATCH 1/2] hw/pci-host: add a generic PCI host"
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
[2]
"[Qemu-devel] [RFC PATCH 2/2] hw/arm/virt: Add generic PCI host device"
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03483.html
[3]
"[Qemu-devel] [RFC PATCH 0/8] Add Generic PCI host device update"
https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg01957.html
[4]
http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg03816.html
hw/arm/virt: Allow multiple agents to modify dt
hw/arm/virt: find_machine_info: handle NULL value
hw/pci-host: Add a generic PCI host controller for virtual platforms
hw/arm/virt: Add generic-pci device support
hw/arm/virt.c | 114 +++++++++++++++-
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
4 files changed, 468 insertions(+), 2 deletions(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
--
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München

office: +49 89 158834 4135
mobile: +49 15253060158
alvise rigo
2014-11-25 10:28:10 UTC
Permalink
On Mon, Nov 24, 2014 at 4:50 PM, Claudio Fontana
Post by Claudio Fontana
why do all these other devices that are unrelated to the virt platform show up?
There are two devices added to the platform by default in mach-virt.
Look at the end of create_pci_host() in virt.c, you will find that a
pci-ohci and a lsi53c895a device are created.
Post by Claudio Fontana
(qemu) info pci
Class 2880: PCI device 1b36:1234
id ""
USB controller: PCI device 106b:003f
IRQ 0.
BAR0: 32 bit memory at 0xffffffffffffffff [0x000000fe].
id ""
SCSI controller: PCI device 1000:0012
IRQ 0.
BAR0: I/O at 0xffffffffffffffff [0x00fe].
BAR1: 32 bit memory at 0xffffffffffffffff [0x000003fe].
BAR2: 32 bit memory at 0xffffffffffffffff [0x00001ffe].
id ""
SCSI controller: PCI device 1af4:1001
IRQ 0.
BAR0: I/O at 0x0100 [0x013f].
id "blk0"
Class 0255: PCI device 1af4:1005
IRQ 0.
BAR0: I/O at 0x0140 [0x015f].
id ""
Ethernet controller: PCI device 1af4:1000
IRQ 0.
BAR0: I/O at 0x0160 [0x017f].
BAR6: 32 bit memory at 0xffffffffffffffff [0x0003fffe].
id ""
Also what is the BAR6 for the virtio-net device? I am struggling to understand where it is coming from...
I think it has to do with the efi-virtio.rom which is supplied to the
virtio-net-pci device.
At the end of pci_add_option_rom() in hw/pci/pci.c you will see the
sixth bar registered.

alvise
Post by Claudio Fontana
Thanks,
Claudio
This patch series is based on the previous work [1] and [2] by Rob
Herring and on [3] by myself. For sake of readability and since this is
still a RFC, these patches come as a stand alone work, so there's no
need to apply first [1][2][3]. it tries to enhance this work on these
- The code should be general enough to allow the use of the controller
with other platforms, not only with mach-virt. The only assumption
made is that a GIC v2 is used at guest side (the interrupt-map
property describes the parent interrupts using the three cells
format).
- The interrupt-map node generation has been enhanced in the following
- support of multi-function PCI device has been added
- a PCI device can now use an interrupt pin different from #INTA
Since some other works like [4] require to modify the device tree only
when all the devices have been instantiated, the PATCH[1/4] proposes a
solution for mach-virt to allow multiple agents (e.g., generic-pci,
VFIO) to modify the device tree. The approach in simple: a global list
is kept to store all the routines that perform the modification of the
device tree. Eventually, when the machine is completed, all these
routines are sequentially executed and the kernel is loaded to the guest
by mean of a machine_init_done_notifier.
Rather than postponing the arm_load_kernel call as this patch does,
should we use instead the modify_dtb call provided by arm_boot_info to
modify the device tree?
If so, shouldn't modify_dtb be used to modify only *user* provided
device trees?
This work has been tested attaching several PCI devices to the mach-virt
platform using an ARMv7 CPU. The tested devices are: virtio-blk-pci,
virtio-net-pci, lsi53c895a and pci-ohci (all attached at the same time).
- Add MSI, MSI-X support
- PCI-E support. Due to a lack of devices, this part is a bit hard to
accomplish at the moment.
Thank you, alvise
[1]
"[Qemu-devel] [RFC PATCH 1/2] hw/pci-host: add a generic PCI host"
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
[2]
"[Qemu-devel] [RFC PATCH 2/2] hw/arm/virt: Add generic PCI host device"
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03483.html
[3]
"[Qemu-devel] [RFC PATCH 0/8] Add Generic PCI host device update"
https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg01957.html
[4]
http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg03816.html
hw/arm/virt: Allow multiple agents to modify dt
hw/arm/virt: find_machine_info: handle NULL value
hw/pci-host: Add a generic PCI host controller for virtual platforms
hw/arm/virt: Add generic-pci device support
hw/arm/virt.c | 114 +++++++++++++++-
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
4 files changed, 468 insertions(+), 2 deletions(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
--
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München
office: +49 89 158834 4135
mobile: +49 15253060158
Alexander Graf
2015-01-05 17:13:43 UTC
Permalink
Post by Alvise Rigo
Add a generic PCI host controller for virtual platforms, based on the
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller. For the time being, the device expects a GIC v2 to be used
by the guest.
Only mach-virt has been used to test the controller.
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ *
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
Never use NATIVE_ENDIAN unless you're 100% sure it's correct. In this
case, please make it LITTLE_ENDIAN.
Post by Alvise Rigo
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
Why is IO space that big? It's only 64k usually, no?
Post by Alvise Rigo
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
Where does this come from? Why isn't it exposed as qdev parameters?
Post by Alvise Rigo
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
meme?
Post by Alvise Rigo
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
What if we simply postpone the creation of those regions and the
creation of the PCIBus to the realize function? At that point we know
the size of the regions.
Post by Alvise Rigo
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
Is this still necessary? In fact, wouldn't the realize of our PHB and
the creation of child devices happen consecutively usually?
Post by Alvise Rigo
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
Is this a reserved ID?
Post by Alvise Rigo
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
The interrupt-map should describe interrupt mappings for every slot, not
every device. If you only describe the current state of affairs, hotplug
won't work.
Post by Alvise Rigo
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
Device tree generation should *never* be part of device code. It's
machine / board specific. If one day someone can convince me that it's
safe to share device tree bits of one device with multiple boards I'm
happy to work with them to achieve that goal then, but for now, please
leave device tree bits in the machine logic.


Alex
Post by Alvise Rigo
+}
+
+static const TypeInfo pci_generic_host_info = {
+ .name = TYPE_GENERIC_PCI_HOST,
+ .parent = TYPE_PCI_DEVICE,
+ .instance_size = sizeof(GenericPCIHostState),
+ .class_init = pci_generic_host_class_init,
+};
+
+static void pci_generic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = pci_generic_host_realize;
+ dc->vmsd = &pci_generic_host_vmstate;
+}
+
+static const TypeInfo pci_generic_info = {
+ .name = TYPE_GENERIC_PCI,
+ .parent = TYPE_PCI_HOST_BRIDGE,
+ .instance_size = sizeof(PCIGenState),
+ .instance_init = pci_generic_host_init,
+ .class_init = pci_generic_class_init,
+};
+
+static void generic_pci_host_register_types(void)
+{
+ type_register_static(&pci_generic_info);
+ type_register_static(&pci_generic_host_info);
+}
+
+type_init(generic_pci_host_register_types)
\ No newline at end of file
diff --git a/include/hw/pci-host/generic-pci.h b/include/hw/pci-host/generic-pci.h
new file mode 100644
index 0000000..43c7a0f
--- /dev/null
+++ b/include/hw/pci-host/generic-pci.h
@@ -0,0 +1,74 @@
+#ifndef QEMU_GENERIC_PCI_H
+#define QEMU_GENERIC_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+
+#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
+
+enum {
+ PCI_CFG = 0,
+ PCI_IO,
+ PCI_MEM,
+};
+
+typedef struct {
+ /* addr_map[PCI_CFG].addr = base address of configuration memory
+ addr_map[PCI_CFG].size = configuration memory size
+ ... */
+ struct addr_map {
+ hwaddr addr;
+ hwaddr size;
+ } addr_map[3];
+
+ const char *gic_node_name;
+ uint32_t base_irq;
+ uint32_t irqs;
+} GenericPCIPlatformData;
+
+typedef struct {
+ PCIDevice parent_obj;
+
+ struct irqmap {
+ /* devfn_idx_map[i][j] = index inside slot_irq_map for device at slot i,
+ fn j */
+ uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
+ /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached to
+ the bus */
+ uint8_t devfn_irq_map[MAX_PCI_DEVICES];
+ } irqmap;
+ /* number of devices attached to the bus */
+ uint32_t devfns;
+} GenericPCIHostState;
+
+typedef struct {
+ PCIHostState parent_obj;
+
+ qemu_irq irq[MAX_PCI_DEVICES];
+ MemoryRegion mem_config;
+ /* Containers representing the PCI address spaces */
+ MemoryRegion pci_io_space;
+ MemoryRegion pci_mem_space;
+ /* Alias regions into PCI address spaces which we expose as sysbus regions.
+ * The offsets into pci_mem_space is fixed. */
+ MemoryRegion pci_io_window;
+ MemoryRegion pci_mem_window;
+ PCIBus pci_bus;
+ GenericPCIHostState pci_gen;
+
+ /* Platform dependent data */
+ GenericPCIPlatformData plat_data;
+} PCIGenState;
+
+#define TYPE_GENERIC_PCI "generic_pci"
+#define PCI_GEN(obj) \
+ OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
+
+#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
+#define PCI_GEN_HOST(obj) \
+ OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
+
+#endif
\ No newline at end of file
alvise rigo
2015-01-06 08:47:41 UTC
Permalink
Hi,
Post by Alexander Graf
Post by Alvise Rigo
Add a generic PCI host controller for virtual platforms, based on the
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller. For the time being, the device expects a GIC v2 to be used
by the guest.
Only mach-virt has been used to test the controller.
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ *
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
Never use NATIVE_ENDIAN unless you're 100% sure it's correct. In this
case, please make it LITTLE_ENDIAN.
Agreed.
Post by Alexander Graf
Post by Alvise Rigo
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
Why is IO space that big? It's only 64k usually, no?
This was just to prevent a definition of the io space smaller than the
actual size of the memory region.
This could be avoided as you suggested later, by postponing the
creation of the PCIBus.
Post by Alexander Graf
Post by Alvise Rigo
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
Where does this come from? Why isn't it exposed as qdev parameters?
It comes from the platform (e.g. mach-virt).
Indeed, I will expose them as qdev properties.
Post by Alexander Graf
Post by Alvise Rigo
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
meme?
Ack.
Post by Alexander Graf
Post by Alvise Rigo
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
What if we simply postpone the creation of those regions and the
creation of the PCIBus to the realize function? At that point we know
the size of the regions.
I'm not sure but I think I've already tried this path and for some
reason I've abandoned it.
I will explore it again and I'll let you know.
Post by Alexander Graf
Post by Alvise Rigo
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
Is this still necessary? In fact, wouldn't the realize of our PHB and
the creation of child devices happen consecutively usually?
Sure, I will fix it.
Post by Alexander Graf
Post by Alvise Rigo
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
Is this a reserved ID?
Not at all. I see now from the documentation (docs/specs/pci-ids.txt)
that a value within the range 0000 -> 00ff should be used.
I suppose that eventually we will pick up one of the available?
Post by Alexander Graf
Post by Alvise Rigo
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
The interrupt-map should describe interrupt mappings for every slot, not
every device. If you only describe the current state of affairs, hotplug
won't work.
Ack, as agreed also with Peter in the other thread.
Post by Alexander Graf
Post by Alvise Rigo
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
Device tree generation should *never* be part of device code. It's
machine / board specific. If one day someone can convince me that it's
Actually you already convinced me that the device tree generation
should stay in the board code :)
I will rework the code accordingly.

Thank you for your feedback,
alvise
Post by Alexander Graf
safe to share device tree bits of one device with multiple boards I'm
happy to work with them to achieve that goal then, but for now, please
leave device tree bits in the machine logic.
Alex
Post by Alvise Rigo
+}
+
+static const TypeInfo pci_generic_host_info = {
+ .name = TYPE_GENERIC_PCI_HOST,
+ .parent = TYPE_PCI_DEVICE,
+ .instance_size = sizeof(GenericPCIHostState),
+ .class_init = pci_generic_host_class_init,
+};
+
+static void pci_generic_class_init(ObjectClass *klass, void *data)
+{
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ dc->realize = pci_generic_host_realize;
+ dc->vmsd = &pci_generic_host_vmstate;
+}
+
+static const TypeInfo pci_generic_info = {
+ .name = TYPE_GENERIC_PCI,
+ .parent = TYPE_PCI_HOST_BRIDGE,
+ .instance_size = sizeof(PCIGenState),
+ .instance_init = pci_generic_host_init,
+ .class_init = pci_generic_class_init,
+};
+
+static void generic_pci_host_register_types(void)
+{
+ type_register_static(&pci_generic_info);
+ type_register_static(&pci_generic_host_info);
+}
+
+type_init(generic_pci_host_register_types)
\ No newline at end of file
diff --git a/include/hw/pci-host/generic-pci.h b/include/hw/pci-host/generic-pci.h
new file mode 100644
index 0000000..43c7a0f
--- /dev/null
+++ b/include/hw/pci-host/generic-pci.h
@@ -0,0 +1,74 @@
+#ifndef QEMU_GENERIC_PCI_H
+#define QEMU_GENERIC_PCI_H
+
+#include "hw/pci/pci.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_host.h"
+
+#define MAX_PCI_DEVICES (PCI_SLOT_MAX * PCI_FUNC_MAX)
+
+enum {
+ PCI_CFG = 0,
+ PCI_IO,
+ PCI_MEM,
+};
+
+typedef struct {
+ /* addr_map[PCI_CFG].addr = base address of configuration memory
+ addr_map[PCI_CFG].size = configuration memory size
+ ... */
+ struct addr_map {
+ hwaddr addr;
+ hwaddr size;
+ } addr_map[3];
+
+ const char *gic_node_name;
+ uint32_t base_irq;
+ uint32_t irqs;
+} GenericPCIPlatformData;
+
+typedef struct {
+ PCIDevice parent_obj;
+
+ struct irqmap {
+ /* devfn_idx_map[i][j] = index inside slot_irq_map for device at slot i,
+ fn j */
+ uint8_t devfn_idx_map[PCI_SLOT_MAX][PCI_FUNC_MAX];
+ /* devfn_irq_map[i] = IRQ num. of the i-th device (devfn) attached to
+ the bus */
+ uint8_t devfn_irq_map[MAX_PCI_DEVICES];
+ } irqmap;
+ /* number of devices attached to the bus */
+ uint32_t devfns;
+} GenericPCIHostState;
+
+typedef struct {
+ PCIHostState parent_obj;
+
+ qemu_irq irq[MAX_PCI_DEVICES];
+ MemoryRegion mem_config;
+ /* Containers representing the PCI address spaces */
+ MemoryRegion pci_io_space;
+ MemoryRegion pci_mem_space;
+ /* Alias regions into PCI address spaces which we expose as sysbus regions.
+ * The offsets into pci_mem_space is fixed. */
+ MemoryRegion pci_io_window;
+ MemoryRegion pci_mem_window;
+ PCIBus pci_bus;
+ GenericPCIHostState pci_gen;
+
+ /* Platform dependent data */
+ GenericPCIPlatformData plat_data;
+} PCIGenState;
+
+#define TYPE_GENERIC_PCI "generic_pci"
+#define PCI_GEN(obj) \
+ OBJECT_CHECK(PCIGenState, (obj), TYPE_GENERIC_PCI)
+
+#define TYPE_GENERIC_PCI_HOST "generic_pci_host"
+#define PCI_GEN_HOST(obj) \
+ OBJECT_CHECK(GenericPCIHostState, (obj), TYPE_GENERIC_PCI_HOST)
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev);
+
+#endif
\ No newline at end of file
Alexander Graf
2015-01-06 11:18:11 UTC
Permalink
Post by alvise rigo
Hi,
Post by Alexander Graf
Post by Alvise Rigo
Add a generic PCI host controller for virtual platforms, based on the
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
The controller creates a PCI bus for PCI devices; it is intended to
receive from the platform all the needed information to initiate the
memory regions expected by the PCI system. Due to this dependence, a
header file is used to define the structure that the platform has to
fill with the data needed by the controller. The device provides also a
routine for the device tree node generation, which mostly has to take
care of the interrupt-map property generation. This property is fetched
by the guest operating system to map any PCI interrupt to the interrupt
controller. For the time being, the device expects a GIC v2 to be used
by the guest.
Only mach-virt has been used to test the controller.
---
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
3 files changed, 355 insertions(+), 1 deletion(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
diff --git a/hw/pci-host/Makefile.objs b/hw/pci-host/Makefile.objs
index bb65f9c..8ef9fac 100644
--- a/hw/pci-host/Makefile.objs
+++ b/hw/pci-host/Makefile.objs
@@ -1,4 +1,4 @@
-common-obj-y += pam.o
+common-obj-y += pam.o generic-pci.o
# PPC devices
common-obj-$(CONFIG_PREP_PCI) += prep.o
diff --git a/hw/pci-host/generic-pci.c b/hw/pci-host/generic-pci.c
new file mode 100644
index 0000000..9c90263
--- /dev/null
+++ b/hw/pci-host/generic-pci.c
@@ -0,0 +1,280 @@
+/*
+ * Generic PCI host controller
+ *
+ * Copyright (c) 2014 Linaro, Ltd.
+ *
+ * Copyright (c) 2006-2009 CodeSourcery.
+ * Written by Paul Brook
+ *
+ * This code is licensed under the LGPL.
+ */
+
+#include "hw/sysbus.h"
+#include "hw/pci-host/generic-pci.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
+
+static const VMStateDescription pci_generic_host_vmstate = {
+ .name = "generic-host-pci",
+ .version_id = 1,
+ .minimum_version_id = 1,
+};
+
+static void pci_cam_config_write(void *opaque, hwaddr addr,
+ uint64_t val, unsigned size)
+{
+ PCIGenState *s = opaque;
+ pci_data_write(&s->pci_bus, addr, val, size);
+}
+
+static uint64_t pci_cam_config_read(void *opaque, hwaddr addr, unsigned size)
+{
+ PCIGenState *s = opaque;
+ uint32_t val;
+ val = pci_data_read(&s->pci_bus, addr, size);
+ return val;
+}
+
+static const MemoryRegionOps pci_vpb_config_ops = {
+ .read = pci_cam_config_read,
+ .write = pci_cam_config_write,
+ .endianness = DEVICE_NATIVE_ENDIAN,
Never use NATIVE_ENDIAN unless you're 100% sure it's correct. In this
case, please make it LITTLE_ENDIAN.
Agreed.
Post by Alexander Graf
Post by Alvise Rigo
+};
+
+static void pci_generic_set_irq(void *opaque, int irq_num, int level)
+{
+ qemu_irq *pic = opaque;
+ qemu_set_irq(pic[irq_num], level);
+}
+
+static void pci_generic_host_init(Object *obj)
+{
+ PCIHostState *h = PCI_HOST_BRIDGE(obj);
+ PCIGenState *s = PCI_GEN(obj);
+ GenericPCIHostState *gps = &s->pci_gen;
+
+ memory_region_init(&s->pci_io_space, OBJECT(s), "pci_io", 1ULL << 32);
Why is IO space that big? It's only 64k usually, no?
This was just to prevent a definition of the io space smaller than the
actual size of the memory region.
This could be avoided as you suggested later, by postponing the
creation of the PCIBus.
Post by Alexander Graf
Post by Alvise Rigo
+ memory_region_init(&s->pci_mem_space, OBJECT(s), "pci_mem", 1ULL << 32);
+
+ pci_bus_new_inplace(&s->pci_bus, sizeof(s->pci_bus), DEVICE(obj), "pci",
+ &s->pci_mem_space, &s->pci_io_space,
+ PCI_DEVFN(0, 0), TYPE_PCIE_BUS);
+ h->bus = &s->pci_bus;
+
+ object_initialize(gps, sizeof(*gps), TYPE_GENERIC_PCI_HOST);
+ qdev_set_parent_bus(DEVICE(gps), BUS(&s->pci_bus));
+ gps->devfns = 0;
+}
+
+static int generic_pci_map_irq_fn(PCIDevice *pci_dev, int pin)
+{
+ BusState *bus = qdev_get_parent_bus(&pci_dev->qdev);
+ PCIBus *pci_bus = PCI_BUS(bus);
+ PCIDevice *pdev = pci_bus->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+
+ return gps->irqmap.devfn_idx_map[PCI_SLOT(pci_dev->devfn)]
+ [PCI_FUNC(pci_dev->devfn)];
+}
+
+static void pci_generic_host_realize(DeviceState *dev, Error **errp)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ SysBusDevice *sbd = SYS_BUS_DEVICE(dev);
+ GenericPCIPlatformData *pdata = &s->plat_data;
Where does this come from? Why isn't it exposed as qdev parameters?
It comes from the platform (e.g. mach-virt).
Indeed, I will expose them as qdev properties.
Post by Alexander Graf
Post by Alvise Rigo
+ int i;
+
+ if (!pdata->addr_map[PCI_CFG].size || !pdata->addr_map[PCI_IO].size ||
+ !pdata->addr_map[PCI_MEM].size || !pdata->irqs) {
+ hw_error("generic_pci: PCI controller not fully configured.");
+ }
+
+ for (i = 0; i < pdata->irqs; i++) {
+ sysbus_init_irq(sbd, &s->irq[i]);
+ }
+
+ pci_bus_irqs(&s->pci_bus, pci_generic_set_irq, generic_pci_map_irq_fn,
+ s->irq, pdata->irqs);
+
+ memory_region_init_io(&s->mem_config, OBJECT(s), &pci_vpb_config_ops, s,
+ "pci-config", pdata->addr_map[PCI_CFG].size);
+ sysbus_init_mmio(sbd, &s->mem_config);
+
+ /* Depending on the available memory of the board using the controller, we
+ create a window on both the I/O and mememory space. */
meme?
Ack.
Post by Alexander Graf
Post by Alvise Rigo
+ memory_region_init_alias(&s->pci_io_window, OBJECT(s), "pci-io-win",
+ &s->pci_io_space, 0,
+ pdata->addr_map[PCI_IO].size);
+ sysbus_init_mmio(sbd, &s->pci_io_window);
+
+ memory_region_init_alias(&s->pci_mem_window, OBJECT(s), "pci-mem-win",
+ &s->pci_mem_space,
+ pdata->addr_map[PCI_MEM].addr,
+ pdata->addr_map[PCI_MEM].size);
+ sysbus_init_mmio(sbd, &s->pci_mem_window);
What if we simply postpone the creation of those regions and the
creation of the PCIBus to the realize function? At that point we know
the size of the regions.
I'm not sure but I think I've already tried this path and for some
reason I've abandoned it.
I will explore it again and I'll let you know.
Post by Alexander Graf
Post by Alvise Rigo
+
+ /* TODO Remove once realize propagates to child devices. */
+ object_property_set_bool(OBJECT(&s->pci_gen), true, "realized", errp);
Is this still necessary? In fact, wouldn't the realize of our PHB and
the creation of child devices happen consecutively usually?
Sure, I will fix it.
Post by Alexander Graf
Post by Alvise Rigo
+}
+
+static void pci_generic_host_class_init(ObjectClass *klass, void *data)
+{
+ PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+ DeviceClass *dc = DEVICE_CLASS(klass);
+
+ k->vendor_id = PCI_VENDOR_ID_REDHAT;
+ k->device_id = 0x1234;
Is this a reserved ID?
Not at all. I see now from the documentation (docs/specs/pci-ids.txt)
that a value within the range 0000 -> 00ff should be used.
I suppose that eventually we will pick up one of the available?
Post by Alexander Graf
Post by Alvise Rigo
+ k->class_id = PCI_CLASS_PROCESSOR_CO;
+ /*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+ dc->cannot_instantiate_with_device_add_yet = true;
+}
+
+struct dt_irq_mapping {
+ int irqs;
+ uint32_t gic_phandle;
+ int base_irq_num;
+ uint64_t *data;
+};
+
+/* */
+#define IRQ_MAP_ENTRY_DESC_SZ 14
+static void gen_int_mapping_fn(PCIBus *b, PCIDevice *d, void *opaque)
+{
+ struct dt_irq_mapping *map_data;
+ int pin;
+
+ PCIDevice *pdev = b->devices[PCI_DEVFN(0, 0)];
+ GenericPCIHostState *gps = PCI_GEN_HOST(pdev);
+ map_data = (struct dt_irq_mapping *)opaque;
+
+ /* Check if the platform has enough IRQs available. */
+ if (gps->devfns > map_data->irqs) {
+ hw_error("generic_pci: too many PCI devices.");
+ }
+
+ pin = pci_get_byte(d->config + PCI_INTERRUPT_PIN);
+ if (pin) {
+ uint64_t *data_ptr = map_data->data + (IRQ_MAP_ENTRY_DESC_SZ * gps->devfns);
+ uint8_t pci_slot = PCI_SLOT(d->devfn);
+ uint8_t pci_func = PCI_FUNC(d->devfn);
+ uint8_t *devfn_idx = &gps->irqmap.devfn_idx_map[pci_slot][0];
+ uint8_t *devfn_irq = gps->irqmap.devfn_irq_map;
+
+ devfn_idx[pci_func] = gps->devfns;
+ devfn_irq[gps->devfns] = map_data->base_irq_num + gps->devfns;
+
+ uint64_t buffer[IRQ_MAP_ENTRY_DESC_SZ] =
+ {1, (pci_slot << 11)|(pci_func << 8), 2, 0x00000000, 1, pin,
+ 1, map_data->gic_phandle, 1, 0x0, 1, devfn_irq[gps->devfns],
+ 1, 0x1};
+
+ memcpy(data_ptr, buffer, IRQ_MAP_ENTRY_DESC_SZ * sizeof(*buffer));
+ gps->devfns++;
+ }
+}
+
+/* Generate the "interrup-map" node's data and store it in map_data */
+static void generate_int_mapping(struct dt_irq_mapping *map_data,
+ PCIGenState *s)
+{
+ pci_for_each_device(s->parent_obj.bus, 0, gen_int_mapping_fn, map_data);
The interrupt-map should describe interrupt mappings for every slot, not
every device. If you only describe the current state of affairs, hotplug
won't work.
Ack, as agreed also with Peter in the other thread.
Post by Alexander Graf
Post by Alvise Rigo
+}
+
+static void generate_dt_node(void *fdt, DeviceState *dev)
+{
+ PCIGenState *s = PCI_GEN(dev);
+ char *nodename;
+ uint32_t gic_phandle;
+ uint32_t plat_acells;
+ uint32_t plat_scells;
+
+ SysBusDevice *busdev;
+ busdev = SYS_BUS_DEVICE(dev);
+ MemoryRegion *cfg = sysbus_mmio_get_region(busdev, 0);
+ MemoryRegion *io = sysbus_mmio_get_region(busdev, 1);
+ MemoryRegion *mem = sysbus_mmio_get_region(busdev, 2);
+
+ qemu_fdt_add_subnode(fdt, nodename);
+ qemu_fdt_setprop_string(fdt, nodename, "compatible",
+ "pci-host-cam-generic");
+ qemu_fdt_setprop_string(fdt, nodename, "device_type", "pci");
+ qemu_fdt_setprop_cell(fdt, nodename, "#address-cells", 0x3);
+ qemu_fdt_setprop_cell(fdt, nodename, "#size-cells", 0x2);
+ qemu_fdt_setprop_cell(fdt, nodename, "#interrupt-cells", 0x1);
+
+ plat_acells = qemu_fdt_getprop_cell(fdt, "/", "#address-cells");
+ plat_scells = qemu_fdt_getprop_cell(fdt, "/", "#size-cells");
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", plat_acells, cfg->addr,
+ plat_scells, memory_region_size(cfg));
+
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "ranges",
+ 1, 0x01000000, 2, 0x00000000, /* I/O space */
+ 2, io->addr, 2, memory_region_size(io),
+ 1, 0x02000000, 2, mem->addr, /* 32bit memory space */
+ 2, mem->addr, 2, memory_region_size(mem));
+
+ gic_phandle = qemu_fdt_get_phandle(fdt, s->plat_data.gic_node_name);
+ /* A mask covering bits [15,8] of phys.high allows to honor the
+ function number when resolving a triggered PCI interrupt. */
+ qemu_fdt_setprop_sized_cells(fdt, nodename, "interrupt-map-mask",
+ 1, 0xff00, 1, 0x0, 1, 0x0, 1, 0x7);
+
+ uint64_t *int_mapping_data = g_malloc0(IRQ_MAP_ENTRY_DESC_SZ *
+ sizeof(uint64_t) * s->plat_data.irqs);
+ struct dt_irq_mapping dt_map_data = {
+ .irqs = s->plat_data.irqs,
+ .gic_phandle = gic_phandle,
+ .base_irq_num = s->plat_data.base_irq,
+ .data = int_mapping_data
+ };
+ /* Generate the interrupt mapping according to the devices attached
+ * to the PCI bus of the controller. */
+ generate_int_mapping(&dt_map_data, s);
+ qemu_fdt_setprop_sized_cells_from_array(fdt, nodename, "interrupt-map",
+ (s->pci_gen.devfns * IRQ_MAP_ENTRY_DESC_SZ)/2, int_mapping_data);
+
+ g_free(int_mapping_data);
+ g_free(nodename);
+}
+
+void pci_controller_build_dt_node(void *fdt, DeviceState *dev)
+{
+ generate_dt_node(fdt, dev);
Device tree generation should *never* be part of device code. It's
machine / board specific. If one day someone can convince me that it's
Actually you already convinced me that the device tree generation
should stay in the board code :)
I will rework the code accordingly.
Meanwhile I've taken a stab at implementing a generic PCIe PHB instead.
I'm not sure we should really bother with legacy PCI at this point.

Please give it a try and see whether all of the devices that worked for
you with this patch also work with the PCIe version:


https://github.com/agraf/qemu/commit/ff56c2b2f8f50fe165b3febee0ab2d773b26040b


Alex
Claudio Fontana
2015-01-12 16:26:14 UTC
Permalink
This patch series is based on the previous work [1] and [2] by Rob
Herring and on [3] by myself. For sake of readability and since this is
still a RFC, these patches come as a stand alone work, so there's no
need to apply first [1][2][3]. it tries to enhance this work on these
- The code should be general enough to allow the use of the controller
with other platforms, not only with mach-virt. The only assumption
made is that a GIC v2 is used at guest side (the interrupt-map
property describes the parent interrupts using the three cells
format).
- The interrupt-map node generation has been enhanced in the following
- support of multi-function PCI device has been added
- a PCI device can now use an interrupt pin different from #INTA
Since some other works like [4] require to modify the device tree only
when all the devices have been instantiated, the PATCH[1/4] proposes a
solution for mach-virt to allow multiple agents (e.g., generic-pci,
VFIO) to modify the device tree. The approach in simple: a global list
is kept to store all the routines that perform the modification of the
device tree. Eventually, when the machine is completed, all these
routines are sequentially executed and the kernel is loaded to the guest
by mean of a machine_init_done_notifier.
Rather than postponing the arm_load_kernel call as this patch does,
should we use instead the modify_dtb call provided by arm_boot_info to
modify the device tree?
If so, shouldn't modify_dtb be used to modify only *user* provided
device trees?
This work has been tested attaching several PCI devices to the mach-virt
platform using an ARMv7 CPU. The tested devices are: virtio-blk-pci,
virtio-net-pci, lsi53c895a and pci-ohci (all attached at the same time).
- Add MSI, MSI-X support
- PCI-E support. Due to a lack of devices, this part is a bit hard to
accomplish at the moment.
Thank you, alvise
[1]
"[Qemu-devel] [RFC PATCH 1/2] hw/pci-host: add a generic PCI host"
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03482.html
[2]
"[Qemu-devel] [RFC PATCH 2/2] hw/arm/virt: Add generic PCI host device"
http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg03483.html
[3]
"[Qemu-devel] [RFC PATCH 0/8] Add Generic PCI host device update"
https://lists.gnu.org/archive/html/qemu-devel/2014-07/msg01957.html
[4]
http://lists.nongnu.org/archive/html/qemu-devel/2014-10/msg03816.html
hw/arm/virt: Allow multiple agents to modify dt
hw/arm/virt: find_machine_info: handle NULL value
hw/pci-host: Add a generic PCI host controller for virtual platforms
hw/arm/virt: Add generic-pci device support
hw/arm/virt.c | 114 +++++++++++++++-
hw/pci-host/Makefile.objs | 2 +-
hw/pci-host/generic-pci.c | 280 ++++++++++++++++++++++++++++++++++++++
include/hw/pci-host/generic-pci.h | 74 ++++++++++
4 files changed, 468 insertions(+), 2 deletions(-)
create mode 100644 hw/pci-host/generic-pci.c
create mode 100644 include/hw/pci-host/generic-pci.h
Tested with modified OSv guest on AArch64.

Tested-by: Claudio Fontana <***@huawei.com>

Loading...