v2.ocaml.org
Add new script for checking how up to date translations are
Currently we have no insight into how up to date (or most likely not) the translations of the site are.
One partial solution that at least provides insight into the problem is to have a script which compares the last known modification time (using git) of pages which are supposed to be translations of each other. This script could then dump this list to stdout, and we would then have a basis for integrating it into the build process (perhaps in the future it could mark the translated pages as likely being out of date with respect to, most likely, the English version).
As a prototype we would probably need:
- A new `script` which uses https://github.com/mirage/ocaml-git to compare this information. This would have to iterate over files in the `site` directory and do the comparisons.
- A new Makefile target (e.g. `make check`) which builds and executes the script.
This was first suggested in https://github.com/ocaml/ocaml.org/issues/824. Of course there are problems: for example, if a translation is partially updated (but still out of date) it will get a new modification time. But this is meant to provide insight rather than be a total solution, and it is likely to be an improvement over having nothing at all.
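As a rough illustration of the comparison described above, here is a minimal sketch. It deliberately swaps in the `git` command-line tool (via `Sys.command`) instead of ocaml-git, so it is not the approach proposed in this issue, only a way to see the idea end to end. The function names `last_commit_time`, `lag` and `report` are made up for this example.

```ocaml
(* Sketch only: asks the git command-line tool (not ocaml-git) for the Unix
   time of the last commit touching [path]. Returns None if git fails or the
   file has no history. Assumes a POSIX shell is available. *)
let last_commit_time (path : string) : int option =
  let tmp = Filename.temp_file "lastmod" ".txt" in
  let cmd =
    Printf.sprintf "git log -1 --format=%%ct -- %s > %s 2>/dev/null"
      (Filename.quote path) (Filename.quote tmp)
  in
  let status = Sys.command cmd in
  let result =
    if status <> 0 then None
    else begin
      let ic = open_in tmp in
      let line = try Some (input_line ic) with End_of_file -> None in
      close_in ic;
      Option.bind line (fun s -> int_of_string_opt (String.trim s))
    end
  in
  Sys.remove tmp;
  result

(* Seconds by which the translation lags behind the original; a negative
   value means the translation was committed more recently. *)
let lag ~(original : int) ~(translation : int) : int = original - translation

(* Print one line saying which two files were compared and the difference. *)
let report ~(original : string) ~(translation : string) : unit =
  match last_commit_time original, last_commit_time translation with
  | Some o, Some t ->
    Printf.printf "%s vs %s: %d seconds behind\n" translation original
      (lag ~original:o ~translation:t)
  | _ ->
    Printf.printf "%s vs %s: no git history found\n" translation original
```

Run from the repository root, a call such as `report ~original:"site/index.md" ~translation:"site/index.fr.md"` would print one line for that pair; those two paths are assumed here purely for illustration.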
@patricoferris can I work on this?
@Srinithyee sure go ahead! And ask any questions if you're stuck :)
@gs0510 okay! I'll start working on this. Yes, I'll make sure I reach out to you guys when I am stuck. Thanks a lot :)
> This script could then dump this list to stdout and at least then we would have a basis for integrating this into the build process (perhaps in the future it could mark the translated pages as likely being out of date w.r.t. probably the English version).
@patricoferris I'd like to clarify the issue so that I understand it better. What would you like the list to hold? The time of the change of a translated page, or something like a key-value pair?
And what exactly did you mean by scripts?
@patricoferris Here's how I plan to work on this issue:
- Create a map of the files in English to their corresponding French files using `module Map : Stdlib.Map.S with type Map.key = t`
- Use `val compare_by_date : t -> t -> int` on each iteration over the map
- Store the results of the second step in a list to be dumped to stdout.
Am I on the right track?
Hi @Srinithyee,
That sounds okay. By scripts I mean an OCaml file in the script directory.
How you go about implementing it is entirely up to you, but be aware that there are some pages with more than just French translations. The list printed to stdout should be easily interpretable, i.e. we need to know which two files are being compared and what the difference in time is, not just that there's a difference. Sorting them from least to most out of date might be a nice touch.
Once you are ready, go ahead and open a PR and we can work on it there (especially for any help with the code) :)) Thanks.
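To make the requested output a bit more concrete, here is one possible sketch of grouping translations under their original file and printing them from least to most out of date. Everything in it is an assumption made for illustration: the naming rule (`index.fr.md` translates `index.md`), the `row` record, and the `mtime` lookup, which stands in for whatever ends up supplying last-modification times.

```ocaml
module StringMap = Map.Make (String)

(* Assumed naming rule: "index.fr.md" is a translation of "index.md".
   Returns (original, language) for such names and None otherwise. *)
let original_of (file : string) : (string * string) option =
  match String.split_on_char '.' file with
  | [ base; lang; ext ] when String.length lang = 2 ->
    Some (base ^ "." ^ ext, lang)
  | _ -> None

(* Group every translation under the original file it corresponds to. *)
let group (files : string list) : string list StringMap.t =
  List.fold_left
    (fun acc file ->
      match original_of file with
      | None -> acc
      | Some (original, _lang) ->
        let existing =
          Option.value (StringMap.find_opt original acc) ~default:[]
        in
        StringMap.add original (file :: existing) acc)
    StringMap.empty files

(* One report row: which two files were compared and the time difference. *)
type row = { original : string; translation : string; behind : int }

(* [mtime] is a hypothetical lookup returning a Unix timestamp per file.
   Rows are sorted from least to most out of date. *)
let rows ~(mtime : string -> int) (files : string list) : row list =
  StringMap.fold
    (fun original translations acc ->
      List.fold_left
        (fun acc translation ->
          let behind = mtime original - mtime translation in
          { original; translation; behind } :: acc)
        acc translations)
    (group files) []
  |> List.sort (fun a b -> compare a.behind b.behind)

let print_rows (rs : row list) : unit =
  List.iter
    (fun r ->
      Printf.printf "%s <- %s: %d s behind\n" r.original r.translation r.behind)
    rs
```

Because the map is keyed on the original file, pages with several translations (not just French) fall out naturally: each translation simply becomes another row.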
@patricoferris Being a beginner to OCaml, I have a couple of questions (I'm sorry if they are too basic):
- Do you have a general format of how the script working with git should be?
- I went through this, but I'll need help with understanding the syntax and how to use `compare_by_date` and `map`
- Do you have any script that uses mirage? I am not too sure about how to use it. I did try running the example given in the readme here
But I ended up with this:


Can you please help? :)
Awesome work so far! Getting to this point (and hopefully having an understanding of what's going on) is great. Mirage is a tool for building unikernels. It has a lot of requirements, so libraries (like this one) are often written both for what we need (Unix) and for Mirage. We don't need to worry about that.
By the way, this is non-trivial OCaml code. I recommend having a quick read of https://mirage.io/wiki/tutorial-lwt to understand the Lwt bits.
The error you got is that you are trying to make a Commit module with the Make functor whose argument should be a Hash module (https://mirage.github.io/ocaml-git/git/Git/Commit/index.html) but you provided it with a Digestif.S module. So first you will need a Hash module to pass to a Commit.Make.
utop # module Hash = Git.Hash.Make(Digestif.SHA1);;
utop # module Commit = Git.Commit.Make(Hash);;
The `compare_by_date` function takes two commits and compares them. Reading the API docs and understanding the signatures is a good way to get to know what's possible and how you can join the functions together. The test directory is another way to get familiar with the library. Hopefully that unblocks you a little :))
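If functors are new, a toy version of the same pattern may help. This sketch does not use ocaml-git at all: `HASH`, `Make_commit` and `Toy_hash` are invented stand-ins, and the only thing borrowed from the library is the shape `compare_by_date : t -> t -> int`. The point is that a functor only accepts an argument matching the signature it asks for.

```ocaml
(* The signature the functor requires of its argument. *)
module type HASH = sig
  type t
  val of_string : string -> t
  val to_hex : t -> string
end

(* Hypothetical stand-in for a functor like Commit.Make: applying it to a
   module that does not match HASH is a type error. *)
module Make_commit (H : HASH) = struct
  type t = { id : H.t; date : int }

  let make ~id ~date = { id = H.of_string id; date }

  (* Negative if [a] is older than [b], zero if equal, positive otherwise. *)
  let compare_by_date a b = compare a.date b.date
end

(* A trivial module that satisfies HASH. *)
module Toy_hash : HASH = struct
  type t = string
  let of_string s = s
  let to_hex s = s
end

(* First build the argument module, then apply the functor to it. *)
module Commit = Make_commit (Toy_hash)
```

Passing a module with a different signature to `Make_commit` would be rejected by the compiler, which is the same kind of error as the one above.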
@patricoferris I see that there is already a script file, lang_of_file, that identifies the language of a file. I would like to use the output of this script within my script. Could you please help me with how to include a script within another script?
Hi @Srinithyee,
Of course :)) So there's a couple of things you can do:
- First of all, you will probably want to read up on compiling OCaml programs. Note how `lang_of_filename.ml` is able to use the `Utils` module; that is because of how it is compiled https://github.com/ocaml/ocaml.org/blob/master/Makefile.common#L50-L53 (hint 😉).
- Next you will probably want to extract any common logic out into `Utils` so you can use it in your script.
- You could then rewrite `lang_of_filename`, which is an executable, to use `Utils`, and then you can use the same functions in your script file.
Does this make sense? Let me know if you need some more guidance :))
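A small sketch of what "extracting common logic" could look like. The `Utils` module below is a nested module only so the example fits in one file; in the repository it would be its own file compiled before the scripts. The helper's behaviour (taking the middle part of `index.fr.md` and defaulting to `"en"`) is a guess for illustration, not what the real `lang_of_filename.ml` does.

```ocaml
(* Hypothetical shared helpers. In a real build this would live in its own
   file and be compiled before the scripts that use it. *)
module Utils = struct
  (* "index.fr.md" -> "fr"; names without a language part default to "en"
     (an assumption made for this sketch). *)
  let lang_of_filename (file : string) : string =
    match String.split_on_char '.' (Filename.basename file) with
    | [ _; lang; _ ] -> lang
    | _ -> "en"
end

(* Two separate "scripts" reusing the same helper instead of copying it. *)

(* What an executable like lang_of_filename might do with the helper. *)
let print_lang (file : string) : unit =
  print_endline (Utils.lang_of_filename file)

(* What a translation-checking script might do with the same helper. *)
let is_translation (file : string) : bool =
  Utils.lang_of_filename file <> "en"
```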
@patricoferris I am currently trying to run various statements on utop to get familiar with the syntax and way it works. I am yet to start working on the script. But, for now I do know the following:
- I'll have to include `open Utils`, `open mirage`, `open Printf`, `open Lwt.infix` to use the utilities of mirage.
- I'm not too sure about why I am running into these errors. I did cd to my site directory, but it is not able to locate about.md

- I'm having trouble understanding `t` there

- When I start working on the script, it should be saved in /scripts. How will I be able to access the files of /site? Should I change the path within the script?
Could you please help me here? I'm sorry to bombard you with too many questions. I am super new to all this and am slowly understanding it :)
> Next you will probably want to extract any common logic out into `Utils` so you can use it in your script.
I'm going to need help understanding this. What do you mean by "Extract common logic"? Do you mean extract common logic from lang_of_filename?
No worries :))
- You shouldn't need to touch `Mirage` at all, you will need `Git` but not `Mirage`. We're not using `Mirage`, it is just a backend like `Unix`.
- I would suggest having a read about Git internals (this seems pretty accessible and not too complicated https://www.freecodecamp.org/news/git-internals-objects-branches-create-repo/). This is directly related to the `Git` API that you are using. For example you wrote `Hash.Map.exists "readme.md"` but the type is telling you the first argument is a `Search.hash` not a `string` (for reviewing types: https://ocaml.org/learn/tutorials/a_first_hour_with_ocaml.html). When you said "I did cd into my site directory" did you mean the repository or the `./site` directory in the repository? If you are following this example it is important to follow it when it says "(* get store located in current root's .git folder *)".
- `t` is a type. It is a common idiom in OCaml to name the "main" type of a module `t`. You will likely have seen the primitive type `int`, the `Int` module has a type `t` (i.e. `Int.t`) which is equal to `int`. Maybe https://ocaml.org/learn/tutorials/modules.html#Abstract-types would be useful reading.
- We can cross that bridge when you get to it, just use hard-coded paths for say comparing `index.md` and `index.fr.md` first, once we've got things being compared then we can work on scaling it to the whole site :))
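For the `t` idiom, a tiny example may make it click. The first two lines only rely on the standard library; the `Timestamp` module is invented for illustration and shows a module whose main type `t` is kept abstract by its signature.

```ocaml
(* Int.t is the same type as int, so the two can be mixed freely. *)
let x : Int.t = 42
let y : int = x + 1

(* A made-up module with its own "main" type t. The signature hides that t
   is an int, so values can only be built and used through the module. *)
module Timestamp : sig
  type t
  val of_seconds : int -> t
  val diff : t -> t -> int
end = struct
  type t = int
  let of_seconds s = s
  let diff a b = a - b
end

(* Outside the module, Timestamp.t is opaque: this compiles... *)
let elapsed =
  Timestamp.diff (Timestamp.of_seconds 10) (Timestamp.of_seconds 4)

(* ...whereas writing [Timestamp.diff 10 4] would be a type error. *)
```

When an API signature mentions `t`, it usually means "the main type of this module", which is why the module name matters when reading it.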
This is actually quite a difficult problem so please don't feel bad or feel the need to apologise, hopefully you are learning a lot of OCaml 🐫.
> I'm going to need help understanding this. What do you mean by "Extract common logic"? Do you mean extract common logic from lang_of_filename?
Yes exactly, again for the moment feel free to copy and paste and we can do that later.
@patricoferris Thanks for being so kind and helping me. The resources you've shared are extremely useful. I hope to make some progress :)