faker
docs(lorem): define allowed words
Description
Document the expected words for a locale in the lorem module.
Related to
This is the first part of #2884.
- #2884
Deploy Preview for fakerjs ready!
| Name | Link |
|---|---|
| Latest commit | 79e8529ed2dadceaab9b87db80fd5e95b6806676 |
| Latest deploy log | https://app.netlify.com/sites/fakerjs/deploys/665995697eeb600007b0a2fd |
| Deploy Preview | https://deploy-preview-2885.fakerjs.dev |
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 99.96%. Comparing base (a082ed2) to head (79e8529).
Additional details and impacted files
@@ Coverage Diff @@
## next #2885 +/- ##
========================================
Coverage 99.95% 99.96%
========================================
Files 2986 2986
Lines 215926 215929 +3
Branches 598 950 +352
========================================
+ Hits 215839 215855 +16
+ Misses 87 74 -13
| Files | Coverage Δ | |
|---|---|---|
| src/modules/lorem/index.ts | 100.00% <100.00%> (ø) |
I thought about using normal words if they don't use Latin. 🤔
Oh wow, then I completely misunderstood what we discussed in the team meeting when we made the decision. Discuss again in tomorrow's meeting?
We should survey what current non-Latin locales actually do and document that (e.g. are they nonsense words, real words, or transliterations of Latin lorem?)
25 locales currently have a lorem/words.ts file. For each, I generated 5 sample words.
Latin
| code | script | words(5) | notes |
|---|---|---|---|
| cs_CZ | Latn | quasi neque quasi delectus minima | standard Latin lorem |
| de | Latn | excepturi inventore nihil eveniet velit | standard Latin lorem |
| en | Latn | crur capillus denique veritas audacia | standard Latin lorem |
| fr | Latn | aliquid vitae accusamus suscipit est | standard Latin lorem |
| fr_CH | Latn | iure ratione dicta voluptas illo | standard Latin lorem |
| nl | Latn | veritatis quibusdam maxime magnam possimus | standard Latin lorem |
| pl | Latn | accusamus eaque deleniti quam distinctio | standard Latin lorem |
| pt_BR | Latn | vitae aliquid temporibus laudantium nam | standard Latin lorem |
| sk | Latn | quidem possimus corrupti odio voluptate | standard Latin lorem |
| tr | Latn | optio natus quis aspernatur molestias | standard Latin lorem |
| uz_UZ_latin | Latn | tutamen ullam magni auctor delectatio | standard Latin lorem |
| en_BORK | Latn | thees lebureeuoos gesh ooccoor injuy | real English words with spelling modifications |
| lv | Latn | māxīmē vulnēro xīphīās soļ āēgrotātīo | standard Latin lorem with extra diacritics |
| es_MX | Latn | Fichero Incorpóreo Basurear Engarbarse Gendarme | random real and nonsense Spanish words |
| vi | Latn | yêu bè vàng ngọt độc | random real Vietnamese words |
Non-Latin
(I'm not really enough of an expert in non-Latin languages to tell whether these are nonsense or real words.)
| code | script | words(5) | notes |
|---|---|---|---|
| ar | Arab | الذات اصرخ ليونة أتذكر فشيأ | |
| dv | Thaa | އިންގިލާބެއް އެންމެ ތަރައްގީ މުޅިން ގެއްލުންނުވާ | |
| el | Grek | nihil similique laudantium aliquid qui | standard Latin lorem (perhaps should be changed) |
| fa | Arab | تمام به پایان بلکه ستون داشت | |
| he | Hebr | דולור תוק לפתיעם רוגצה קלאצי | Transliteration mimicking lorem ipsum in Hebrew |
| hy | Armn | աշխարհում բոլորն մեկ հասած իրենց | |
| ja | Jpan | 色々 独裁 錠 めいがら たて | |
| ko | Kore | 형에 확정될 자유를 정한다. 범하고 | |
| ru | Cyrl | направлений модели внедрения профессионального играет | Random real words |
| ur | Arab | چاسدسد چسد ساسدبھ اسدفگبطاسدفد اسداسدھدسبابگ ابنسد | |
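As a rough aid for a survey like this, a sample can be classified by script automatically. The sketch below is only an illustration: `classifyScript` is a hypothetical helper, not part of faker, and it relies on JavaScript's Unicode property escapes (`\p{Script=Latin}`, `\p{L}`). It can tell that a locale returns Latin-script text, but not whether the words are real or nonsense.

```typescript
// Hypothetical helper, not part of faker: classifies a sample by script.
// The patterns are built with the RegExp constructor so the Unicode property
// escapes are evaluated at runtime, independent of the compile target.
const latinLetter = new RegExp("\\p{Script=Latin}", "u");
const anyLetter = new RegExp("\\p{L}", "u");

type ScriptClass = "latin" | "non-latin" | "mixed" | "no-letters";

function classifyScript(sample: string): ScriptClass {
  let latinCount = 0;
  let otherCount = 0;
  for (const ch of sample) {
    if (!anyLetter.test(ch)) continue; // skip spaces, punctuation, digits
    if (latinLetter.test(ch)) latinCount++;
    else otherCount++;
  }
  if (latinCount === 0 && otherCount === 0) return "no-letters";
  if (otherCount === 0) return "latin";
  if (latinCount === 0) return "non-latin";
  return "mixed";
}

// Samples copied from the tables above.
const samples: Array<[string, string]> = [
  ["de", "excepturi inventore nihil eveniet velit"],
  ["el", "nihil similique laudantium aliquid qui"],
  ["ru", "направлений модели внедрения профессионального играет"],
];

for (const [code, words] of samples) {
  console.log(code, classifyScript(words));
}
```

Run over all locales, a check like this would flag cases such as `el`, where the locale's script is Greek but the lorem words are Latin.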
Team Proposal
- The lorem module will always return the Latin lorem words/sentences.
- The lorem word definitions will be inlined into the module and removed from the locale data
- We will add replacement methods to the word module, that generate sentences/paragraphs with words of that locale
Would it make more sense to move the lorem definitions to the base locale?
I was not available at last team meeting, and I'm missing any reasons 👀 Why will lorem always return latin?
Because all locales that use Latin characters use Latin anyway. And those that don't use normal words, and are sometimes even incompatible with the Latin sentence structure.
Team Task
Everybody should make their suggestions for the lorem module definitions and expectations. We will discuss the proposals/expectations in the next team meeting.
My expectation:
If we do not at least change the behavior of lorem, there should be, e.g., a word/lorem.blindtext method that generates randomized but localized placeholder texts (wiki:de:Blindtext, wiki:en:Filler_text).
Name and module are up for discussion.
@Shinigami92 Could you please elaborate on what you would consider the defining difference between the word module and the lorem module?
Personally, I would say word is for generating words like nouns, verbs and so on, while lorem is more for placeholder texts, for example to test responsive table cells in a frontend.
However, I have not read our docs yet, and both modules date from long before I joined the project, so I wouldn't like to be made responsible for any historical decisions. Instead, I would rather find a good way forward and not look into the past. So I don't care whether it is called lorem, word or anything else, as long as the functionality is provided.
I wouldn't like to be made responsible for any historical decisions.
That was not my intention. I'm sorry.
I spent quite some time thinking about this. The following represents my personal opinion. I'll split the answer into multiple comments to make it easier to react to them using emojis. I hope the reactions (and other answers) help us all determine where we are on the same page and where we are not.
For me, lorem (module) is a specific type of blind text that you are not supposed to read/be able to understand.
If we define lorem to be Latin, then those locales that currently return locale-specific words would be impacted. If we define lorem to be locale-specific, then the locales that currently use Latin would lose their original intent of creating un-understandable text. If we do not define lorem, then the locales that use Latin are unable to generate (pseudo-)understandable blind texts unless we add new methods for that.
If we add the explicit concept of (pseudo-)understandable blind text, then it is likely that it will require the same or at least similar methods to the current lorem module, so that you are able to generate a string matching your length requirements.
These would cause conflicts with the existing methods and would need some form of disambiguation.
Either by prefixing them with blind e.g. blindText, blindSentence, or moving them to a different new module.
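To make the two disambiguation options concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `PlaceholderText` shape, the `makeGenerator` helper, and the `optionA`/`optionB` objects are hypothetical and do not reflect faker's actual module layout. The localized sample words are taken from the `ru` row of the survey table.

```typescript
// Shared shape, loosely modelled on the kind of methods a lorem-like module
// offers. The exact method set is an assumption for illustration.
interface PlaceholderText {
  word(): string;
  sentence(wordCount: number): string;
}

function makeGenerator(
  words: readonly string[],
  pickIndex: (max: number) => number
): PlaceholderText {
  const word = (): string => words[pickIndex(words.length)];
  return {
    word,
    sentence(wordCount: number): string {
      const parts: string[] = [];
      for (let i = 0; i < wordCount; i++) parts.push(word());
      const text = parts.join(" ");
      // Capitalize the first letter and end with a period.
      return text.charAt(0).toUpperCase() + text.slice(1) + ".";
    },
  };
}

const randomIndex = (max: number): number => Math.floor(Math.random() * max);
const latinText = makeGenerator(["lorem", "ipsum", "dolor", "sit", "amet"], randomIndex);
const localizedText = makeGenerator(["направлений", "модели", "внедрения"], randomIndex);

// Option A: one module, with blind-prefixed methods next to the existing ones.
const optionA = {
  word: latinText.word,
  sentence: latinText.sentence,
  blindWord: localizedText.word,
  blindSentence: localizedText.sentence,
};

// Option B: a separate module that reuses the same method names.
const optionB = { lorem: latinText, blind: localizedText };

console.log(optionA.blindSentence(3));
console.log(optionB.blind.sentence(3));
```

Option B keeps the method names identical across both modules, which makes switching between Latin and localized text a one-word change at the call site. Option A keeps everything discoverable in one place at the cost of longer names.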
I think the real question is: does "lorem" mean general blind text or specifically the Latin lorem ipsum text?
Does the idea of "Chinese lorem" or "Thai lorem" or "Hebrew lorem" make sense?
Important question.
For me, Chinese lorem is just a blind text.
If it doesn't use Latin(-like) characters, including translations, it is a blind text but not lorem.
If it is a "phonetic translation", then it is just gibberish using the locale's characters, no better than faker.string.fromCharacters.
The closest non-Latin-character equivalent I can think of is the Japanese DoReMi spelling training thing(?), because I assume their brain turns off as soon as it recognizes the "intro". Which kind of leads me to "we should probably start our lorem text with lorem ipsum", but that is a different feature request altogether.
@matthewmayer Could you please share your opinion on any of these questions? Ultimately we need answers/shared opinions to form any kind of consensus.
I don't really know. I think we need to try and involve some native speakers of non-Latin languages.
Team Proposal
- We want the lorem module to consist only of lorem ipsum (Latin)
- We want to make a new text (or similar) module that generates blind texts in the current locale
- We are not sure yet whether these will be hard-coded pseudo realistic sentences or wild combinations of words that may or may not follow the normal syntax of the language (e.g. `firstName drives through city` vs `noun verb adjective` vs `word word word`)
- The module should roughly mimic the methods in the lorem module
- The lorem module and the blind text module should have links (at each method) between each other to help with discovery
- The actual redefinition of the lorem module is for v10, not now (after we have the blind-text module)
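
The open question between hard-coded templates and free word combinations can be illustrated with a small sketch. All names here are hypothetical (`localeData`, `fromTemplate`, `fromCategories`), and the tiny word lists are made-up stand-ins, not real faker locale definitions.

```typescript
type Picker = <T>(items: readonly T[]) => T;

// Hypothetical, tiny stand-in for locale data; real locale definitions differ.
const localeData = {
  firstName: ["Anna", "Ben"],
  city: ["Berlin", "Oslo"],
  noun: ["table", "river"],
  verb: ["moves", "holds"],
  adjective: ["quiet", "green"],
};

type Category = keyof typeof localeData;

// Strategy 1: hard-coded, pseudo-realistic templates with placeholders.
const sentenceTemplates = ["{firstName} drives through {city}."];

function fromTemplate(template: string, pick: Picker): string {
  // Unknown placeholders are left untouched rather than guessed.
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    key in localeData ? pick(localeData[key as Category]) : match
  );
}

// Strategy 2: a sequence of word categories, e.g. noun verb adjective,
// which may or may not follow the normal syntax of the language.
function fromCategories(categories: readonly Category[], pick: Picker): string {
  return categories.map((category) => pick(localeData[category])).join(" ");
}

const randomPick: Picker = (items) => items[Math.floor(Math.random() * items.length)];

console.log(fromTemplate(sentenceTemplates[0], randomPick));
console.log(fromCategories(["noun", "verb", "adjective"], randomPick));
```

Templates read more naturally but have to be written per locale. Category sequences need only word lists, so they would be easier to provide for many locales, at the cost of less realistic sentences.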