
PCI probing takes 4 time units in qemu

Open shenki opened this issue 6 years ago • 5 comments

Running @legoater's qemu tree (3.1 based) with the powernv model:

[    0.270844220,5] PCI: Resetting PHBs and training links...
[    0.272165254,7] PHB#0000: FRESET: Assert skipped
[    0.272202593,7] PHB#0000: FRESET: Deassert
[    1.272533870,7] PHB#0000: LINK: Start polling
[    1.324060833,7] PHB#0000: LINK: Electrical link detected
[    1.376168984,7] PHB#0000: LINK: Link is up
[    1.377293537,7] PHB#0001: FRESET: Assert skipped
[    1.377331212,7] PHB#0001: FRESET: Deassert
[    2.377747704,7] PHB#0001: LINK: Start polling
[    2.429170789,7] PHB#0001: LINK: Electrical link detected
[    2.480562886,7] PHB#0001: LINK: Link is up
[    2.480931339,7] PHB#0002: FRESET: Assert skipped
[    2.480958483,7] PHB#0002: FRESET: Deassert
[    3.481109850,7] PHB#0002: LINK: Start polling
[    4.020507587,7] PHB#0002: LINK: Electrical link detected
[    4.071916279,7] PHB#0002: LINK: Link is up
[    4.072493625,5] PCI: Probing slots...
[    4.073907375,7] PHB#0000:00:00.0 Link up at x1 width
[    4.073995548,7] PHB#0000:00:00.0 Scanning (upstream+downsteam)...
[    4.076130479,7] PHB#0000:00:00.0 Found VID:1014 DEV:03dc TYP:4 MF- BR+ EX+
[    4.083441488,7] PHB#0000:00:00.0 Bus 01..ff  scanning...
[    4.101693656,7] PHB#0001:00:00.0 Link up at x1 width
[    4.101785886,7] PHB#0001:00:00.0 Scanning (upstream+downsteam)...
[    4.102132269,7] PHB#0001:00:00.0 Found VID:1014 DEV:03dc TYP:4 MF- BR+ EX+
[    4.105512107,7] PHB#0001:00:00.0 Bus 01..ff  scanning...
[    4.112340762,7] PHB#0002:00:00.0 Link up at x1 width
[    4.112368924,7] PHB#0002:00:00.0 Scanning (upstream+downsteam)...
[    4.112476835,7] PHB#0002:00:00.0 Found VID:1014 DEV:03dc TYP:4 MF- BR+ EX+
[    4.113921390,7] PHB#0002:00:00.0 Bus 01..ff  scanning...
[    4.118325244,5] PCI Summary:
[    4.118992280,5] PHB#0000:00:00.0 [ROOT] 1014 03dc R:00 C:060400 B:01..01 
[    4.119569928,5] PHB#0001:00:00.0 [ROOT] 1014 03dc R:00 C:060400 B:01..01 
[    4.120173134,5] PHB#0002:00:00.0 [ROOT] 1014 03dc R:00 C:060400 B:01..01 

We should see if the model, or skiboot, can be modified to reduce this time.

shenki avatar Oct 23 '18 23:10 shenki

That log says that PCI probing is pretty much instant:

[    4.072493625,5] PCI: Probing slots...
...
[    4.118325244,5] PCI Summary:

The time-consuming part is resetting each PHB and training the links. IIRC there are a bunch of mandatory waits in there, so if you're booting with a single CPU it'll be slow.
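For anyone who wants to check where the time goes, here is a rough C sketch that pulls the leading timestamp out of two log lines and subtracts them. It assumes the first field inside the brackets is elapsed time in seconds; `log_timestamp` and `log_elapsed` are hypothetical helper names, not skiboot functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse the leading timestamp of a skiboot-style log line such as
 * "[    4.072493625,5] PCI: Probing slots...".
 * Assumption: the field before the comma is elapsed time in seconds.
 * Returns the timestamp, or -1.0 if the line does not start with one.
 */
double log_timestamp(const char *line)
{
    double t;

    assert(line != NULL);
    if (sscanf(line, " [ %lf", &t) != 1)
        return -1.0;
    return t;
}

/* Elapsed time between two log lines, or -1.0 if either fails to parse. */
double log_elapsed(const char *start, const char *end)
{
    double a = log_timestamp(start);
    double b = log_timestamp(end);

    if (a < 0.0 || b < 0.0)
        return -1.0;
    return b - a;
}
```

Fed the "Resetting PHBs" and "Probing slots" lines from the log in this issue, this gives about 3.80 for reset and link training, against about 0.046 between "Probing slots" and "PCI Summary".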

oohal avatar Oct 24 '18 02:10 oohal

Yeah, it'll be slow because of those compulsory waits. Arguably we should do them all at once, though, and take one unit of waits rather than several.
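To put numbers on that, a minimal sketch of the arithmetic, assuming each PHB has one fixed wait and that the waits could fully overlap if all resets were started together. The function names are made up for illustration and nothing here reflects how skiboot actually schedules the resets.

```c
#include <assert.h>
#include <stddef.h>

/* Total delay when each PHB's wait is taken one after another. */
unsigned serial_wait_ms(const unsigned *wait_ms, size_t n)
{
    unsigned total = 0;

    assert(wait_ms != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += wait_ms[i];
    return total;
}

/*
 * Total delay when every reset is started at the same time and the
 * waits overlap (assumption: nothing else serialises the PHBs).
 */
unsigned overlapped_wait_ms(const unsigned *wait_ms, size_t n)
{
    unsigned longest = 0;

    assert(wait_ms != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (wait_ms[i] > longest)
            longest = wait_ms[i];
    return longest;
}
```

With the three PHBs in the log, each showing roughly 1000ms between "FRESET: Deassert" and "LINK: Start polling", the serial total is 3000ms and the overlapped total is 1000ms.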

ghost avatar Nov 20 '18 01:11 ghost

So I was looking at this today and we might be able to make it a bit faster. Currently the PHB3 model always reports that there is in-band presence since the value of the training control register is hard-coded. You can hack that with this:

diff --git a/hw/pci-host/pnv_phb3.c b/hw/pci-host/pnv_phb3.c
index 92828fa2d1b6..79196e28bbf0 100644
--- a/hw/pci-host/pnv_phb3.c
+++ b/hw/pci-host/pnv_phb3.c
@@ -585,6 +585,7 @@ void pnv_phb3_reg_write(void *opaque, hwaddr off, uint64_t val, unsigned size)
 uint64_t pnv_phb3_reg_read(void *opaque, hwaddr off, unsigned size)
 {
     PnvPHB3 *phb = opaque;
+    PCIHostState *pci = PCI_HOST_BRIDGE(phb);
     uint64_t val;
 
     if ((off & 0xfffc) == PHB_CONFIG_DATA) {
@@ -614,6 +615,8 @@ uint64_t pnv_phb3_reg_read(void *opaque, hwaddr off, unsigned size)
 
     /* Link training always appears trained */
     case PHB_PCIE_DLP_TRAIN_CTL:
+        if (!pci_find_device(pci->bus, 1, 0))
+            return 0;
         return PHB_PCIE_DLP_INBAND_PRESENCE | PHB_PCIE_DLP_TC_DL_LINKACT;
 
     /* FFI Lock */

Unfortunately, this doesn't make things any faster right now since there's a 1s timeout when we're waiting for the in-band presence bit to go high. That said, the PCIe spec says that:

a) Devices must enter the link Detect state within 20ms of PERST being lifted, and
b) The transition from Detect to Polling (where in-band presence is 1) should be 24ms.

So we should be able to make it faster. There might be a good reason for the longer timeout though, @mikey or @ozbenh might know why.
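As a rough illustration of what a shorter timeout would buy for an empty slot, here is a simulated polling loop in C. The 20ms and 24ms constants are the figures quoted above and the 1000ms constant is the current 1s timeout; the probe callbacks and `poll_presence` are hypothetical and only model elapsed time, they do not touch any hardware or skiboot code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    PERST_TO_DETECT_MS   = 20,   /* figure quoted in the comment above */
    DETECT_TO_POLLING_MS = 24,   /* figure quoted in the comment above */
    CURRENT_TIMEOUT_MS   = 1000  /* the 1s timeout mentioned above */
};

/* Hypothetical probe: returns true once in-band presence is seen. */
typedef bool (*presence_probe)(unsigned elapsed_ms, void *ctx);

/*
 * Poll `probe` every `step_ms` of simulated time until it reports
 * presence or `timeout_ms` has elapsed. Stores the simulated time spent
 * in *spent_ms and returns whether presence was detected.
 */
bool poll_presence(presence_probe probe, void *ctx,
                   unsigned timeout_ms, unsigned step_ms,
                   unsigned *spent_ms)
{
    unsigned elapsed = 0;

    assert(probe != NULL && step_ms > 0 && spent_ms != NULL);
    for (;;) {
        if (probe(elapsed, ctx)) {
            *spent_ms = elapsed;
            return true;
        }
        if (elapsed >= timeout_ms)
            break;
        elapsed += step_ms;
        if (elapsed > timeout_ms)
            elapsed = timeout_ms;   /* do not overshoot the deadline */
    }
    *spent_ms = elapsed;
    return false;
}

/* Probe for an empty slot: presence never appears. */
bool never_present(unsigned elapsed_ms, void *ctx)
{
    (void)elapsed_ms;
    (void)ctx;
    return false;
}

/* Probe for a device that shows presence after *(unsigned *)ctx ms. */
bool present_after(unsigned elapsed_ms, void *ctx)
{
    return elapsed_ms >= *(const unsigned *)ctx;
}
```

In this model an empty slot costs the full timeout, so 1000ms today versus 44ms if the timeout were cut to the sum of the two figures above. Whether that is safe on real hardware is exactly the open question.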

oohal avatar Mar 22 '19 01:03 oohal

Do you want me to merge the patch above into the PHB3 model?

legoater avatar Mar 22 '19 07:03 legoater

@legoater Sure, it won't fix the problem here, but it makes the model a little better and shouldn't break anything. The PHB4 model has the same issue too.

oohal avatar Mar 25 '19 22:03 oohal