HTML being cut off in PDF

Open gitn00b1337 opened this issue 8 years ago • 31 comments

I am rendering a very simple invoice, and the HTML is cut off in the middle of the page when I set top and bottom page margins. When the margins are removed, it seems to sort itself out. I can send the HTML it is trying to parse if that would help.

gitn00b1337 avatar Mar 01 '16 22:03 gitn00b1337

I am on the latest NuGet package of HtmlRenderer.PdfSharp.

broken htmlrenderer.txt

I have attached a test method. Most noticeably, the table headers won't appear. Any help is hugely appreciated.

Thanks.

gitn00b1337 avatar Mar 01 '16 22:03 gitn00b1337

Fixed. See the example. Note that it requires a prerelease version of PdfSharp, and these changes are not included in the NuGet package yet.

gyfke avatar Apr 28 '16 06:04 gyfke

thank you! I will upgrade to pre-release

gitn00b1337 avatar May 15 '16 19:05 gitn00b1337

Are there still plans to update the nuget package to include these changes?

Thanks, Pam

pdreid avatar Aug 15 '16 14:08 pdreid

Where can I get the prerelease version that you are talking about?

Thanks

ElVisPL avatar Oct 10 '16 13:10 ElVisPL

Hey guys. I just uploaded a pre-release version of the packages to nuget (*-beta1). That is the most up to date code and includes the fix gyfke made for page breaks.

Cheers!

PawelMaj avatar Nov 14 '16 23:11 PawelMaj

PawelMaj thank you so much!! I'm not seeing it when I search within NuGet though, even though I have "Include Prerelease" selected.

pdreid avatar Nov 15 '16 15:11 pdreid

https://www.nuget.org/packages/HtmlRenderer.Core/1.5.1-beta1

PawelMaj avatar Nov 15 '16 15:11 PawelMaj

Thanks very much, I wasn't familiar with the package manager console.

One other question: is this compatible with the HtmlRenderer.PdfSharp library? When I try to generate a PDF in my application after updating the core to the beta release, it's blank except for the footer and one horizontal rule. Interestingly, it has the correct spacing and number of pages for the text that should be there.

Thanks, Pam

pdreid avatar Nov 21 '16 17:11 pdreid

@pdreid I am currently using it in a large project specifically with PdfSharp. Make sure you have the right version of PdfSharp installed. You need the beta version.

PawelMaj avatar Dec 01 '16 23:12 PawelMaj

Hello, I'm getting the same with the prerelease version, simple <p>'s with LOL text, margins top-bottom 60, left-right 10, pageSize Letter and Orientation Landscape:

image

Result:

image

Any ideas?

darkguy2008 avatar Jan 06 '17 16:01 darkguy2008

@darkguy2008 Hey! I have recently decided to join this project and started looking into the page-breaking code specifically. That being said, I think words will break across a page correctly if the immediate parent element has page-break-inside: avoid set on it and if the parent element is smaller than the page.

Should work

Hello

Should fail

Hello

If you only really care about making sure words are not split in half you can change the following code:

CssRect.cs ln307

From:

if (!box.IsFixed && box.PageBreakInside == CssConstants.Avoid) { word.BreakPage(); }

To:

if (!box.IsFixed) { word.BreakPage(); }

If I am correct (I may be wrong, I have only started to get into this code) this change will make sure that no words will ever be split on a page break. I am a bit busy right now, but if I find time I will try to push these changes in and get a new version into nuget.

PawelMaj avatar Jan 06 '17 17:01 PawelMaj

Hello @PawelMaj! Thanks for your answer! While I would love to help you in this endeavor, the library's code is too large for my timeframe, and I'm also surprised at how you were able to find that out. I made the change and I can say it's working nicely, at least for one-line paragraphs. Now I'm working with a table using rowspans and the split is happening again :( See:

image

I tried to look at the code in that function, but it's hard for me to understand. What do you think it could be? Thanks again for your time! It would be great if you could make a NuGet package out of that fix too :)

darkguy2008 avatar Jan 09 '17 15:01 darkguy2008

+1

Jedarc avatar Mar 06 '17 18:03 Jedarc

@darkguy2008 The table code uses different logic unfortunately....

Edit:

The fix I am suggesting will work for any single word/image.

PawelMaj avatar Mar 09 '17 03:03 PawelMaj

+1

PhilizSwiss avatar Mar 13 '17 10:03 PhilizSwiss

This is still having problems. I'm using this for a production application and this is the last remaining bug. Please fix this.

KthProg avatar Mar 13 '17 16:03 KthProg

@KthProg I solved it by using wkhtmltopdf (another project), just making an HTML document and rendering it through WebKit. It might be unrelated, but it solved my problem. This renderer has a lot of issues regarding line breaks and so on, and the code is too convoluted for me to understand :(

darkguy2008 avatar Mar 13 '17 16:03 darkguy2008

@darkguy2008 How did you do "rendering through WebKit"?

KthProg avatar Mar 13 '17 16:03 KthProg

@KthProg that project does it, there's a NuGet package called TuesPechkin ( https://www.nuget.org/packages/TuesPechkin/ ) which is a wrapper for that project, you just make an HTML and supply it to the wrapper and it works kinda like PhantomJS, but with PDF output built-in.

darkguy2008 avatar Mar 13 '17 16:03 darkguy2008

@darkguy2008 I'll take a look thank you!

KthProg avatar Mar 13 '17 17:03 KthProg

So I can implement the quick fix I mentioned earlier. I will not push it to a NuGet package just yet because I am afraid that the change may have some side effects. This quick fix would be changing the code:

From: if (!box.IsFixed && box.PageBreakInside == CssConstants.Avoid) { word.BreakPage(); }

To: if (!box.IsFixed) { word.BreakPage(); }

From what I have seen, this will make sure that each INDIVIDUAL word/picture will not be cut in half on page breaks. It may introduce a smaller issue, though: if the words in a single line have different line heights, there is a possibility of some words being placed on the next page while others stay on the previous page.

Now I have thought of doing a more complex fix to this. Unfortunately the entire process seems to be using recursion to determine where to place each small element(aka word/image). I would have to redo a large/important part of the code base. I am not comfortable doing this, since I did not write any of the code, nor do I have time to do that. Thus, I will just make a commit with my simple fix for now and hope it does not cause other problems. I will try to do this tonight...

PawelMaj avatar Mar 13 '17 17:03 PawelMaj
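
To make the trade-off described above concrete, here is a minimal, self-contained sketch of the general idea of pushing a word that straddles a page boundary onto the next page. It is a simplified model under stated assumptions, not the library's actual BreakPage implementation: PageBreakSketch and AvoidSplit are hypothetical names, page margins are ignored, and coordinates are measured from the top of the first page.

```csharp
using System;

public static class PageBreakSketch
{
    // Hypothetical helper: returns the top coordinate a word/image rectangle
    // should have so that it does not straddle a page boundary.
    // Assumption: pages are stacked vertically with no margins between them.
    public static double AvoidSplit(double top, double height, double pageHeight)
    {
        // A rectangle taller than a page cannot fit on any single page; leave it alone.
        if (height >= pageHeight) return top;

        // Distance from the top of the page the rectangle starts on.
        double offsetInPage = top % pageHeight;

        // If the rectangle fits in what is left of its page, keep it where it is.
        if (offsetInPage + height <= pageHeight) return top;

        // Otherwise move it to the start of the next page.
        return top - offsetInPage + pageHeight;
    }

    public static void Main()
    {
        // Two words on the same line, but with different heights (page height 100).
        Console.WriteLine(AvoidSplit(85, 10, 100)); // short word still fits on page 1
        Console.WriteLine(AvoidSplit(80, 25, 100)); // tall word is pushed to page 2
    }
}
```

In this model the short word stays at 85 while the tall word on the same line moves to 100, which is the kind of side effect mentioned above: each word is kept whole, but a single line can end up distributed over two pages.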

@PawelMaj That fix would do the trick for me. If you add this fix how would I pull it down/build from it? (I know I should know this, but I use TFS at work...)

KthProg avatar Mar 13 '17 18:03 KthProg

@PawelMaj I made the change you suggested and did my own build. When I used the updated DLLs my problem was fixed. Thanks 👍

KthProg avatar Mar 13 '17 23:03 KthProg

Yes, I can confirm that the above change fixes this problem for my simple test case as well.

DanielSundberg avatar Mar 15 '17 17:03 DanielSundberg

@PawelMaj can this fix be merged to master and pushed to nuget?

vip32 avatar May 30 '17 09:05 vip32

@PawelMaj As mentioned, could you possibly merge your fix to master and update the nuget package? This is the last of the issues with my PDF generation project, would be good to see it fixed.

mattytommo avatar Jun 24 '17 16:06 mattytommo

Hey guys, to make the whole solution clearer, follow the steps below:
1 - Add https://www.nuget.org/packages/HtmlRenderer.Core/1.5.1-beta1 to your project.
2 - Add https://www.nuget.org/packages/HtmlRenderer.PdfSharp/1.5.1-beta1 to your project.
3 - After that, if you want to control page breaks in table, td, h3 and so on, just add a rule such as td { page-break-inside: avoid; } and it will work.
4 - If you want to control page breaks around images, put the img tag inside a span and add page-break-inside: avoid; to the span.

Cheers Guys.

JUchoa avatar Sep 14 '17 19:09 JUchoa
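
The steps above could be put together roughly as follows. This is a sketch, not a verified recipe: it assumes both 1.5.1-beta1 packages are installed and that PdfGenerator.GeneratePdf and PdfGenerateConfig behave in that version as they do in the published HtmlRenderer.PdfSharp API. The markup, the "keep" class, logo.png, and the output file name are placeholders, and the margin values are only examples.

```csharp
using PdfSharp;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;

public static class InvoicePdfSketch
{
    public static void Main()
    {
        // Placeholder markup. The CSS rules follow steps 3 and 4 above:
        // page-break-inside: avoid on td/h3, and on a span wrapping an img.
        const string html = @"
<html>
<head>
<style>
  td, h3 { page-break-inside: avoid; }
  span.keep { page-break-inside: avoid; }
</style>
</head>
<body>
  <h3>Invoice</h3>
  <table>
    <tr><td>Item</td><td>Price</td></tr>
  </table>
  <span class=""keep""><img src=""logo.png"" /></span>
</body>
</html>";

        // Example page setup; top/bottom margins are where the cut-off
        // described in this issue was reported.
        var config = new PdfGenerateConfig
        {
            PageSize = PageSize.A4,
            MarginTop = 60,
            MarginBottom = 60,
            MarginLeft = 10,
            MarginRight = 10
        };

        // Render the HTML into a PdfSharp document and write it to disk.
        PdfDocument document = PdfGenerator.GeneratePdf(html, config);
        document.Save("invoice.pdf");
    }
}
```

If lines are still split with this setup, the earlier comments in this thread suggest the remaining cases (tables with rowspans, for example) go through different layout logic that these CSS rules may not cover.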

Hi guys, I'm new to HTML-Renderer, but its use has been simple and straightforward until now. I'm still having one issue, the page break one. I've updated both the core and PdfSharp libs to beta1 and added the page-break-inside CSS rule too, but I'm still seeing lines cut between 2 pages. Does anyone have any idea why? It appears inside a

for example or an h3.

I really need this to work and can't use a solution like TuesPechkin since my app is running as an Azure web app.

Here's how I create the pdf from some simple html. Again, page-breaking appears even in

tags... sample margin sc error

PierreLasvigne avatar May 20 '18 09:05 PierreLasvigne

The sample code I downloaded works perfectly, but when I move the same thing into my actual code it breaks the line as in the image below. Please suggest what I have missed. I appreciate your help in advance. image

sureshbabu-n avatar Sep 01 '21 23:09 sureshbabu-n