PSParseHTML

ConvertFrom-HtmlTable Doesn't enable arbitrary headers

Open ld0614 opened this issue 1 year ago • 2 comments

I am reading an HTML page which has a number of tables; however, the most important table doesn't include a header row. Currently, running `ConvertFrom-HtmlTable -Content $ErrorPageData` makes the first row a header. As this data is dynamic, that makes programmatically accessing the table contents much harder.

Ideally I'd love to see the following (parameter names are just suggestions to get the point across):

Ability to filter to just a specific table (I think this would be needed as the tables have different numbers of columns): `ConvertFrom-HtmlTable -Content $ErrorPageData -TableNumber 2`

Which would return only that table (and none of the other tables in the HTML):

Data 1-1 Data 1-2
Data 2-1 Data 2-2

Ability to specify the heading names: `ConvertFrom-HtmlTable -Content $ErrorPageData -TableNumber 2 -Headings @("Heading 1", "Heading 2")`

Which would return

Heading 1 Heading 2
Data 1-1 Data 1-2
Data 2-1 Data 2-2

Alternatively, just using column numbers would also work. This would support multiple tables in the page and would probably be simpler to write: `ConvertFrom-HtmlTable -Content $ErrorPageData -NoHeadings`

Which would return

1 2
Data 1-1 Data 1-2
Data 2-1 Data 2-2

ld0614 Dec 06 '24 14:12

Unless you can provide a website or HTML page that replicates the issue you're describing, it's hard to test or address it.

PrzemyslawKlys Dec 06 '24 14:12

LogFile.txt (just rename to HTML)

I've sanitised the file for public consumption. Running the following code:

$ErrorPageData = Get-Content -Path 'C:\temp\LogFile.txt' | Out-String

$HTMLtables = ConvertFrom-HtmlTable -Content $ErrorPageData

$ErrorTable = $HTMLtables[2]

$ErrorTable

Returns the output:

image

Which means that, by default, accessing the first column looks like this:

$ErrorTable.'Failed to copy from Source : The copy is not identical to the original. The original file may have been modified after it was copied'

And I can't use the Replace Headers option, as this would wipe the first row in the list and relies on me knowing what the first row in the table is at scripting time.
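To make the request concrete, here is a minimal sketch of the behaviour I'm after, written as a post-processing step in plain PowerShell. `ConvertTo-PositionalTable` is a hypothetical helper name, not something PSParseHTML provides. It assumes the parser has turned the first data row into property names unchanged; if that row contained empty or duplicate cells, the recovered values may not match the original HTML.

```powershell
# Hypothetical helper (not part of PSParseHTML): rebuilds a parsed table so
# that the row which was promoted to a header becomes ordinary data again.
function ConvertTo-PositionalTable {
    param(
        [Parameter(Mandatory)] [object[]] $Rows,
        [string[]] $Headings
    )

    # The property names of the first object are the original first row.
    $firstRow = @($Rows[0].PSObject.Properties.Name)

    # Without explicit headings, fall back to column numbers: "1", "2", ...
    if (-not $PSBoundParameters.ContainsKey('Headings')) {
        $Headings = 1..($firstRow.Count) | ForEach-Object { "$_" }
    }

    # Collect every row as a plain array of cell values, promoted row first.
    $valueRows = [System.Collections.Generic.List[object[]]]::new()
    $valueRows.Add($firstRow)
    foreach ($row in $Rows) {
        $valueRows.Add(@($row.PSObject.Properties.Value))
    }

    # Re-emit each row as an object keyed by the chosen headings.
    foreach ($values in $valueRows) {
        $record = [ordered]@{}
        for ($i = 0; $i -lt $Headings.Count; $i++) {
            $record[$Headings[$i]] = $values[$i]
        }
        [pscustomobject]$record
    }
}

# Sample input shaped like the current output: 'Data 1-1'/'Data 1-2' ended up
# as property names and only the second row survived as data.
$parsed = @(
    [pscustomobject]@{ 'Data 1-1' = 'Data 2-1'; 'Data 1-2' = 'Data 2-2' }
)

ConvertTo-PositionalTable -Rows $parsed -Headings 'Heading 1', 'Heading 2'
```

With the sample input this yields two objects, so `Data 1-1` is reachable as `.'Heading 1'` on the first row. Leaving out `-Headings` names the columns `1` and `2` instead, which is the `-NoHeadings` idea from my original post.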

Thanks for looking at this 😄

ld0614 Dec 06 '24 15:12

So in the V2 version, the file you uploaded gives 4 tables.

Image

Not sure if that was what you were after.

$Path = Join-Path $PSScriptRoot '\Input\headless_table.html'
$Content = Get-Content -LiteralPath $Path -Raw

$Tables = ConvertFrom-HtmlTable -Content $Content
$Tables[0] | Format-Table -AutoSize

$Tables[1] | Format-Table -AutoSize

$Tables[2] | Format-Table -AutoSize

$Tables[3] | Format-Table -AutoSize

PrzemyslawKlys Jun 07 '25 19:06

That's amazing, thanks so much for fixing this ❤️

ld0614 Jun 07 '25 19:06