PSParseHTML
ConvertFrom-HtmlTable Doesn't enable arbitrary headers
I am reading an HTML page that contains a number of tables, but the most important table doesn't include a header row. Currently, running ConvertFrom-HtmlTable -Content $ErrorPageData makes the first row a header. Because this data is dynamic, that makes programmatically accessing the table contents much harder.
Ideally I'd love to see the following (parameter names are just suggestions to get the point across):
Ability to filter to just a specific table (I think this would be needed as the tables have different numbers of columns)
ConvertFrom-HtmlTable -Content $ErrorPageData -TableNumber 2
Which would return only that table (and none of the other tables in the HTML):
| Data 1-1 | Data 1-2 |
|---|---|
| Data 2-1 | Data 2-2 |
Ability to specify the heading names
ConvertFrom-HtmlTable -Content $ErrorPageData -TableNumber 2 -Headings @("Heading 1", "Heading 2")
Which would return
| Heading 1 | Heading 2 |
|---|---|
| Data 1-1 | Data 1-2 |
| Data 2-1 | Data 2-2 |
Alternatively, just using column numbers would also work. This would support multiple tables in the page and would probably be simpler to write:
ConvertFrom-HtmlTable -Content $ErrorPageData -NoHeadings
Which would return
| 1 | 2 |
|---|---|
| Data 1-1 | Data 1-2 |
| Data 2-1 | Data 2-2 |
Unless you can provide a website or HTML page that replicates the issue you're describing, it's hard to test or address it.
LogFile.txt (just rename to HTML)
I've sanitised the file for public consumption. Running the following code:
$ErrorPageData = Get-Content -Path 'C:\temp\LogFile.txt' | Out-String
$HTMLtables = ConvertFrom-HtmlTable -Content $ErrorPageData
$ErrorTable = $HTMLtables[2]
$ErrorTable
Returns the output:
Which leads to the default array access experience being:
$ErrorTable.'Failed to copy from Source : The copy is not identical to the original. The original file may have been modified after it was copied'
And I can't use the Replace Headers option, as this would wipe the first row in the list and relies on me knowing what the first row in the table is at scripting time.
Thanks for looking at this 😄
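For anyone hitting the same first-row-as-header behaviour, here is a rough workaround sketch in plain PowerShell. It does not use any PSParseHTML parameter. It takes the objects already returned for one table, treats their property names as the "lost" first row, and re-emits everything keyed by column number. The function name Convert-HeaderRowToData and the sample data are hypothetical, and it assumes the module kept the header cell text unchanged as property names (duplicate or empty cells may not round-trip).

```powershell
# Hypothetical helper, not part of PSParseHTML.
function Convert-HeaderRowToData {
    param(
        [Parameter(Mandatory)]
        [object[]] $Table
    )

    # The property names of the first object are the cells that were promoted to headers.
    $names = @($Table[0].PSObject.Properties.Name)

    # Emit those cells as the first data row, keyed by column number.
    # Keys are strings on purpose: integer keys on an ordered dictionary index by position.
    $first = [ordered]@{}
    for ($i = 0; $i -lt $names.Count; $i++) {
        $first["$($i + 1)"] = $names[$i]
    }
    [pscustomobject]$first

    # Re-emit every remaining row under the same numbered columns.
    foreach ($row in $Table) {
        $out = [ordered]@{}
        for ($i = 0; $i -lt $names.Count; $i++) {
            $out["$($i + 1)"] = $row.PSObject.Properties[$names[$i]].Value
        }
        [pscustomobject]$out
    }
}

# Made-up sample standing in for one parsed table whose first row became the header.
$sample = @(
    [pscustomobject]([ordered]@{ 'Data 1-1' = 'Data 2-1'; 'Data 1-2' = 'Data 2-2' })
)
$fixed = @(Convert-HeaderRowToData -Table $sample)
$fixed | Format-Table -AutoSize
```

With the code from the report, the same call would presumably look like Convert-HeaderRowToData -Table $HTMLtables[2], after which cells can be read as $fixed[0].'1' instead of via the long dynamic property name.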
So in the V2 version, the file you uploaded gives 4 tables.
Not sure if that was what you were after.
$Path = Join-Path $PSScriptRoot '\Input\headless_table.html'
$Content = Get-Content -LiteralPath $Path -Raw
$Tables = ConvertFrom-HtmlTable -Content $Content
$Tables[0] | Format-Table -AutoSize
$Tables[1] | Format-Table -AutoSize
$Tables[2] | Format-Table -AutoSize
$Tables[3] | Format-Table -AutoSize
That's amazing thanks so much for fixing ❤️