SpelledOut.jl
Draft: Implement spelled out for Māori
Made some good progress tonight: at a7e63b8 the Māori test results were:
Test Summary: | Pass  Error  Total  Time
Māori         |   28     33     61  3.0s
Now (at e233741), they are:
Test Summary: | Pass  Fail  Error  Total  Time
Māori         |   43     1     17     61  7.6s
Since 7b8e491, doing better still:
Test Summary: | Pass  Fail  Error  Total  Time
Māori         |   52     6      3     61  4.1s
Still to do:
- Implement evaluation strings for empty tests
- Implement commas:
Māori: Test Failed at /Users/jakeireland/projects/SpelledOut.jl/test/runtests.jl:114
Expression: spelled_out(1982, lang = :mi) == "kotahi mano, iwa rau, waru tekau mā rua"
Evaluated: "kotahi mano waru tekau mā rua" == "kotahi mano, iwa rau, waru tekau mā rua"
- Account for numbers ≥ 10,000
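To make the two remaining items concrete, here is a rough sketch of one way to split a number into groups and join them with commas. It is written in Python purely for illustration and is not the SpelledOut.jl implementation. The function names (`spell_mi`, `below_thousand`, and so on) are hypothetical, and the grammar it encodes rests on assumptions that are still open: a count of one is written "kotahi" before rau and the scale words, a comma goes between every group, and numbers ≥ 10,000 are formed by spelling the thousands group and appending "mano".

```python
# Illustrative sketch only; names and grammar choices are assumptions.
UNITS = ["", "tahi", "rua", "toru", "whā", "rima", "ono", "whitu", "waru", "iwa"]
# Scale words as used in this discussion: 10^3, 10^6, 10^9.
SCALES = ["", "mano", "miriona", "piriona"]


def counted(count, word):
    """Spell "<count> <word>" for count 1..9, assuming one is written "kotahi"."""
    return f"{'kotahi' if count == 1 else UNITS[count]} {word}"


def below_hundred(n):
    """Spell 1..99, e.g. 82 -> "waru tekau mā rua"."""
    tens, unit = divmod(n, 10)
    if tens == 0:
        return UNITS[unit]
    tens_word = "tekau" if tens == 1 else f"{UNITS[tens]} tekau"
    return f"{tens_word} mā {UNITS[unit]}" if unit else tens_word


def below_thousand(n):
    """Spell 1..999 as a list of parts: the rau part, then the tens/units part."""
    hundreds, rest = divmod(n, 100)
    parts = []
    if hundreds:
        parts.append(counted(hundreds, "rau"))
    if rest:
        parts.append(below_hundred(rest))
    return parts


def spell_mi(n, sep=", "):
    """Spell 1 <= n < 10^12, joining the groups with `sep`."""
    if not 1 <= n < 10**12:
        raise ValueError("sketch only covers 1 <= n < 10^12")
    groups = []  # three-digit groups, least significant first
    while n:
        n, group = divmod(n, 1000)
        groups.append(group)
    parts = []
    for power in range(len(groups) - 1, -1, -1):
        group = groups[power]
        if group == 0:
            continue  # skip empty groups so no part is emitted for them
        if power == 0:
            parts.extend(below_thousand(group))
        elif group == 1:
            parts.append(f"kotahi {SCALES[power]}")
        else:
            parts.append(" ".join(below_thousand(group)) + " " + SCALES[power])
    return sep.join(parts)


print(spell_mi(1982))   # the string the failing test expects
print(spell_mi(10000))  # thousands group spelled first, then "mano"
```

Keeping the rau part as its own entry in the list of parts is what stops it from being dropped, which is what the "Evaluated" string above appears to show. Passing `sep=" "` gives the comma-free form.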
The primary questions I have had are:
- How do you say 100? (Kotahi or kotahi rau?) What about 9,100? (Iwa mano, kotahi, or iwa mano, kotahi rau?)
- Are there modifiers for very large numbers? For example, we have mano for thousand, miriona for one thousand thousand (one million), and piriona for one thousand million (one billion). Are there words for one trillion (one thousand billion), or one quadrillion (one thousand trillion)? What about higher?
- When do we use kotahi vs. tahi to mean "one"? (I have seen both.)
- What do you say for negative numbers? In English, you can write "negative one-hundred," or "minus one-hundred."
- Where do I put commas in large numbers, for example, 1984? I have seen this written without commas (kotahi mano iwa rau waru tekau mā whā), with commas between each order of magnitude (kotahi mano, iwa rau, waru tekau mā whā), or with just one comma (kotahi mano, iwa rau waru tekau mā whā).
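Until the comma question is settled, one option is to keep the choice as a parameter rather than hard-coding it. The sketch below (Python, for illustration only) joins groups that have already been spelled out, using each of the three styles described in the last question. The function name `join_groups` and the style labels `"none"`, `"all"`, and `"first"` are made up for this example; "first" assumes the single comma always follows the leading group, which is only one reading of the third variant.

```python
def join_groups(groups, style="all"):
    """Join already-spelled groups (largest first) using one comma style."""
    if style == "none":
        # No commas at all.
        return " ".join(groups)
    if style == "all":
        # A comma between every order of magnitude.
        return ", ".join(groups)
    if style == "first":
        # A single comma after the leading group only (assumed placement).
        head, rest = groups[0], groups[1:]
        return head + (", " + " ".join(rest) if rest else "")
    raise ValueError(f"unknown comma style: {style!r}")


# The three groups of 1984, as written in the question above.
groups = ["kotahi mano", "iwa rau", "waru tekau mā whā"]
for style in ("none", "all", "first"):
    print(f"{style:>5}: {join_groups(groups, style)}")
```

With this shape, switching the default once kaumātua have advised on the convention is a one-line change, and the tests can pin whichever style is chosen.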
I have asked some people these questions, but I'm still trying to find kaumātua to provide me with assurance that I am doing this correctly.
Some important questions to ask:
- What is the rationale as to why you would use numbers in te reo Māori?
- How do numbers in any shape or form benefit our whānau?
- Do the numbers tell a narrative or more so have a whakapapa?
The goal of the project as a whole (not te reo specific) is to provide a programmatic interface for writing out numbers in long form. Specifically for te reo Māori: I am proud of our culture and language. I think this functionality is useful in many contexts, from providing readable metrics in report writing or data science to natural language processing and machine learning. Providing this functionality for te reo Māori enables the use of this library throughout Aotearoa. I particularly see this being useful in government, where language and accessibility are most important. I think this benefits our whānau and Aotearoa simply by providing visibility and accessibility to te reo Māori. The numbers output by this programme do not tell a narrative in themselves; that depends entirely on the use case. This programme is a library, to be used in others' programmes.
That said, I'm curious to know how this is done in computer science. My intentions with this library are good, and I genuinely think that the accessibility this would enable is beneficial, but you can't necessarily be sure that everyone who uses this library has good intentions, no? Same can be said for any library. I'd be interested to know how this is tackled in computer science.
Other questions:
- How did Māori write before Europeans?
- Consider regional differences and Elsdon Best's article on vigesimal numeration.
Note also, regarding big numbers, that if you think of navigation by the many stars, I'm sure there would have been numbers or co-ordinates that map their geographical and spatial dimensions of time to land. So I suspect there are numbers that are large, or in some way relevant in this area. Still unsure about the modifiers question, though.