About the JSON file format and Go library support

Open artikell opened this issue 2 years ago • 0 comments

Go library support

When I wanted to use the corpora project, I found that there was no Go version of the library, so I spent some time writing one. The repository is artikell/gocorpora.

JSON file format

However, when I tried to build the models, I found that some of the JSON files use a format that is unfriendly to strongly typed languages such as Go and Java. For example, corpora/hexagrams.json at master · dariusk/corpora:

{
  "description": "I Ching hexagrams and descriptions, by Ashley Blewer.",
  "source": "https://bits.ashleyblewer.com/i-ching/",
  "hexagrams": {
    "111111": {
      "definition": "01. Force (乾 qián); The Creative; Possessing Creative Power & Skill",
      "hexagram": " ䷀ ",
      "number": "1",
      "description": "Heaven above and Heaven below: Heaven in constant motion. With the strength of the dragon, the Superior Person steels herself for ceaseless activity. Productive activity. Potent Influence. Sublime success if you keep to your course."
    },
    "000000": {
      "definition": "02. Field (坤 kūn); The Receptive; Needing Knowledge & Skill; Do not force matters and go with the flow",
      "hexagram": " ䷁ ",
      "number": "2",
      "description": "Earth above and Earth below: The Earth contains and sustains. In th"
    }
  }
}

In Go, this JSON has to be modeled as the following structure:

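// HexagramsFile is an illustrative type name; the original snippet was anonymous.
type HexagramsFile struct {
	Description string `json:"description"`
	Hexagrams   struct {
		// One field per JSON key. Field names start with an uppercase letter
		// because encoding/json skips unexported fields such as _000000.
		H000000 struct {
			Definition  string `json:"definition"`
			Description string `json:"description"`
			Hexagram    string `json:"hexagram"`
			Number      int64  `json:"number,string"`
		} `json:"000000"`
		H000001 struct {
			Definition  string `json:"definition"`
			Description string `json:"description"`
			Hexagram    string `json:"hexagram"`
			Number      int64  `json:"number,string"`
		} `json:"000001"`
	} `json:"hexagrams"`
}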

Keys such as 000000 and 000001 have each become an attribute of hexagrams, so the model structure keeps growing whenever more data is added. Of course, this layout has advantages: it guarantees that each entry is unique, and lookups by key are efficient. However, I think the data should have a consistent structure with a fixed set of attributes rather than an unbounded number of them, which is friendlier to strongly typed languages.
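To make the idea of a consistent structure concrete, here is a minimal sketch, not the gocorpora implementation. It assumes every entry has the four fields shown above, and it decodes the keyed layout into one fixed Hexagram type by treating the keys as map keys. The names Hexagram, hexagramFile, decodeHexagrams, and the Pattern field are hypothetical, and the sample data is abbreviated from the file quoted earlier.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Hexagram is one fixed shape shared by every entry (hypothetical type name).
type Hexagram struct {
	Pattern     string `json:"-"` // hypothetical field: copied from the map key, not read from JSON
	Definition  string `json:"definition"`
	Description string `json:"description"`
	Hexagram    string `json:"hexagram"`
	Number      int64  `json:"number,string"` // the file stores the number as a JSON string
}

// hexagramFile mirrors the top level of the keyed layout.
type hexagramFile struct {
	Description string              `json:"description"`
	Source      string              `json:"source"`
	Hexagrams   map[string]Hexagram `json:"hexagrams"`
}

// decodeHexagrams parses the keyed layout and returns a slice of
// uniformly shaped entries, sorted by Number.
func decodeHexagrams(data []byte) ([]Hexagram, error) {
	var f hexagramFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	out := make([]Hexagram, 0, len(f.Hexagrams))
	for pattern, h := range f.Hexagrams {
		h.Pattern = pattern // keep the key so no information is lost
		out = append(out, h)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Number < out[j].Number })
	return out, nil
}

// sample is an abbreviated stand-in for hexagrams.json.
const sample = `{
  "description": "sample",
  "source": "sample",
  "hexagrams": {
    "111111": {"definition": "Force", "hexagram": "䷀", "number": "1", "description": "..."},
    "000000": {"definition": "Field", "hexagram": "䷁", "number": "2", "description": "..."}
  }
}`

func main() {
	entries, err := decodeHexagrams([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, h := range entries {
		fmt.Println(h.Number, h.Pattern, h.Definition)
	}
}
```

With this approach the Go model no longer grows with the data. A file that stored the entries as a JSON array, with the pattern as an ordinary field, would decode directly into []Hexagram without the extra loop.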

artikell, Apr 26 '22 07:04