Problem with `mutate()` column `discurso`

Open pedrocapetti opened this issue 3 years ago • 3 comments

speech <- speechbr::speech_data(keyword = "polícia", reference_date = "2018-12-31", qtd_days = 6939, orador = "Jair Bolsonaro")

Error: Problem with `mutate()` column `discurso`.
i `discurso = txt`.
i `discurso` must be size 180 or 1, not 179.

pedrocapetti avatar Feb 09 '22 23:02 pedrocapetti

Hi Pedro, the problem was solved with commit 53f85d6ea99b406ed6ff9647027453538ed70c2c. To explain: some rows of the table (in this case only one) have no speech text. This causes the error, because the table has 180 rows but only 179 speech texts were actually found.
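To illustrate the size mismatch, here is a minimal sketch in base R. It is not the package's code: `tab` and `txt` are hypothetical stand-ins for the scraped table and the scraped texts, and base R assignment is used in place of `mutate()` because it applies a comparable size check.

```r
# Hypothetical stand-in for the scraped table: 180 rows of speech metadata.
tab <- data.frame(id = seq_len(180))

# Hypothetical stand-in for the scraped texts: one speech has no transcript,
# so only 179 texts come back.
txt <- paste("speech text", seq_len(179))

# Attaching a 179-element vector to a 180-row table fails, because the
# vector is neither length 180 nor a length that recycles evenly.
result <- tryCatch(
  {
    tab$discurso <- txt
    "ok"
  },
  error = function(e) "size mismatch"
)
print(result)
```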

Thank you, and don't forget to install the package again to update it, with devtools::install_github("dcardosos/speechbr").

dcardosos avatar Feb 10 '22 03:02 dcardosos

Hi Douglas, thank you! I tried again after updating the package, but I still get this error. The same error occurs if I increase qtd_days so that the range reaches a date before 2001, for example qtd_days > 10000. Furthermore, the new commit filters out rows where sessao is empty, but it would be useful to keep the references to speeches that are not transcribed and are available only as PDF. Thank you, and big congratulations!

pedrocapetti avatar Feb 10 '22 22:02 pedrocapetti

Hi Pedro, thank you for the feedback! Extracting the PDF link is a hard challenge, because these links do not follow a pattern, but thanks, I will try to do it in the next versions of the package!

I think I fixed this problem, or part of it, with a condition in the transform_url function that marks empty values as "empty"; later, the main function discards these values to avoid the mismatch in the number of rows.

The commits are 9153f011b306c89fab5a312d3b71248de523c600 and 6b79162113f57fb3ed7cd958e9daa624276728bf.
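The placeholder approach described above could look roughly like the sketch below. This is an assumption-laden illustration, not the code in those commits: `extract_text` is a hypothetical helper standing in for transform_url, and `pages` is made-up input.

```r
# Hypothetical stand-in for the per-URL extractor: it returns the
# placeholder "empty" when a page has no transcribed text, so the output
# always has exactly one element per input.
extract_text <- function(page) {
  if (is.na(page) || !nzchar(page)) "empty" else page
}

# Made-up inputs: two transcribed speeches and two without text.
pages <- c("first speech", "", "third speech", NA)

tab <- data.frame(id = seq_along(pages))

# The lengths now match, so attaching the column cannot fail.
tab$discurso <- vapply(pages, extract_text, character(1), USE.NAMES = FALSE)

# Discard the placeholder rows only after the column has been attached.
tab <- tab[tab$discurso != "empty", , drop = FALSE]
print(tab)
```

Skipping the final filter would instead keep one row per speech, including those without a transcript, at the cost of carrying the placeholder value in the text column.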

Thank you again! I am very happy with these first issues on GitHub.

dcardosos avatar Feb 11 '22 01:02 dcardosos