Add dataset: [bnl_newspapers1841-1879]
A URL for this dataset
https://data.bnl.lu/data/historical-newspapers/
Dataset description
630,709 articles from historical newspapers (1841-1879), along with metadata and the full text.
- 21 newspaper titles
- 24,415 newspaper issues
- 99,957 scanned pages

Transcribed using a variety of OCR engines and corrected using https://github.com/natliblux/nautilusocr (95% threshold).
The newspapers used are:
- Der Arbeiter (1878)
- L'Arlequin (1848)
- L'Avenir (1868-1871)
- Courrier du Grand-Duché de Luxembourg (1844-1868)
- Cäcilia (1863-1871)
- Diekircher Wochenblatt (1841-1848)
- Le Gratis luxembourgeois (1857-1858)
- L'Indépendance luxembourgeoise (1871-1879)
- Kirchlicher Anzeiger für die Diözese Luxemburg (1871-1879)
- La Gazette du Grand-Duché de Luxembourg (1878)
- Luxemburger Anzeiger (1856)
- Luxemburger Bauernzeitung (1857)
- Luxemburger Volks-Freund (1869-1876)
- Luxemburger Wort (1848-1879)
- Luxemburger Zeitung (1844-1845)
- Luxemburger Zeitung = Journal de Luxembourg (1858-1859)
- L'Union (1860-1871)
- Das Vaterland (1869-1870)
- Der Volksfreund (1848-1849)
- Der Wächter an der Sauer (1849-1869)
- D'Wäschfra (1868-1879)
Dataset modality
Text
Dataset licence
Creative Commons Public Domain Dedication and Certification
Other licence
No response
How can you access this data
As a download from a repository/website
Size of dataset
500MB-2GB
Confirm the dataset has an open licence
- [X] To the best of my knowledge, this dataset is accessible via an open licence
Contact details for data custodian
I'll add it to: https://huggingface.co/datasets/biglam/bnl_newspapers1841-1879
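
Once the dataset is uploaded to that repository, loading it could look roughly like the sketch below. This is a minimal example, not a confirmed recipe: the helper names `hub_url` and `load_bnl_newspapers` are hypothetical, it assumes the Hugging Face `datasets` library is installed, and it makes no assumption about split or column names, since those depend on how the upload is structured.

```python
# Repository ID taken from the Hub URL given in this issue.
REPO_ID = "biglam/bnl_newspapers1841-1879"


def hub_url(repo_id: str = REPO_ID) -> str:
    """Build the dataset page URL on the Hugging Face Hub."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_bnl_newspapers(streaming: bool = True):
    """Load the dataset from the Hub (hypothetical helper).

    Assumes the repository exists and is loadable by `datasets`.
    With streaming=True, records are fetched lazily instead of
    downloading the whole 500MB-2GB dataset up front.
    """
    # Imported lazily so the module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, streaming=streaming)
```

Calling `load_bnl_newspapers()` returns whatever splits the repository defines, so inspecting the returned object first is the safest way to find the available splits and fields.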