pdfminer.six
Text Extraction: first character of LTTextLine totally disappears
Hi,
I am trying to extract several text blocks using pdfquery (https://github.com/jcushman/pdfquery), which mostly depends on the pdfminer backend. Most extractions work well, but sometimes the first character of a line (often a capital letter) simply disappears. I have explored the layout tree, and the character really does not exist in it.
I tried to solve this myself by resizing the extraction box and by tweaking the LAParams, without success.
Here is an example of the result, one string per LTTextLine:

[ "éplacer des produits vers", "la zone de stockage", "Accueillir une clientèle", "écharger des", "marchandises, des produits", "ncaisser le montant d'une", "vente", "rocédures d'encaissement", "roposer un service, produit", "adapté à la demande client", "éaliser la mise en rayon", "epérer et signaler les", "produits détériorés ou", "manquants", "rier et répartir les colis,", "marchandises selon les", "indications (codification,", "format, poids, nombre, ...)", "" ]
As you can see, after the first block, the first character of each block has disappeared. Is this a problem you have already encountered?
Thank you in advance for your help!
Can you share the PDF for us to investigate?
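In the meantime, it may help to dump the individual characters that pdfminer.six places in each LTTextLine, to check whether the missing glyph is absent from the whole layout tree or has only been assigned to a different line. Below is a minimal sketch, assuming a pdfminer.six version that provides `extract_pages` in `pdfminer.high_level`. The helper names `iter_nodes`, `describe_line`, and `dump_pdf` are hypothetical and not part of the library.

```python
def iter_nodes(node):
    """Depth-first walk over a layout tree; containers are assumed to be iterable."""
    yield node
    if isinstance(node, (str, bytes)):
        return
    try:
        children = iter(node)
    except TypeError:
        # Leaf objects (e.g. LTChar, LTAnno) are not iterable.
        return
    for child in children:
        yield from iter_nodes(child)


def describe_line(line):
    """Return (text, x0, fontname) for every real glyph in a text line.

    Objects without a `fontname` attribute (such as the virtual spaces and
    newlines that layout analysis inserts) are skipped.
    """
    return [
        (ch.get_text(), round(ch.x0, 1), ch.fontname)
        for ch in line
        if hasattr(ch, "fontname")
    ]


def dump_pdf(path, **laparams_kwargs):
    """Print every text line of a PDF together with its per-glyph details."""
    # Imported lazily so the helpers above can be used without pdfminer.six.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LAParams, LTTextLine

    laparams = LAParams(**laparams_kwargs)
    for page in extract_pages(path, laparams=laparams):
        for node in iter_nodes(page):
            if isinstance(node, LTTextLine):
                print(repr(node.get_text()), describe_line(node))
```

Calling, for example, `dump_pdf("sample.pdf", char_margin=2.0)` (the file name is a placeholder) prints each line with the position and font of its glyphs. If the missing capital shows up in another line, possibly with a different font name, the layout parameters are a likely suspect. If it appears nowhere, the glyph was probably never produced as text for that page.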