Interface to fetch entries in primitive types from `DataPack`
This PR is the first step towards fixing #881
Description of changes
Currently, when fetching entries from a DataPack or MultiPack using the get method, Forte converts data store entries into objects. We wanted a way for users to work with the underlying entries directly. This PR modifies the get method of DataPack so that it can return an entry in its primitive form, as stored in DataStore, without converting it to an object.
Additionally, because DataStore entries are stored as lists and are hard to interpret, this PR introduces a way to keep entries in their primitive form while making them more readable by converting them to dictionaries. This is done by the transform_data_store_entry method in data_store.py. An example follows:
# Entry of type 'ft.onto.base_ontology.Sentence'
data_store_entry = [
171792711812874531962213686690228233530,
'ft.onto.base_ontology.Sentence',
0,
164,
0,
'-',
0,
{},
{},
{}
]
transformed_entry = pack.transform_data_store_entry(
data_store_entry
)
# transformed_entry = {
# 'begin': 0,
# 'end': 164,
# 'payload_idx': 0,
# 'speaker': '-',
# 'part_id': 0,
# 'sentiment': {},
# 'classification': {},
# 'classifications': {},
# 'tid': 171792711812874531962213686690228233530,
# 'type': 'ft.onto.base_ontology.Sentence'
# }
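The list-to-dictionary mapping in the example above can be illustrated with a minimal standalone sketch. It does not use Forte and is not the implementation of transform_data_store_entry; the helper name `entry_to_dict` and the `SENTENCE_FIELDS` ordering are hypothetical, with the field order inferred only from the Sentence example in this description.

```python
from typing import Any, Dict, List

# Hypothetical field order, inferred from the Sentence example in this PR
# description. The real layout is defined by Forte's DataStore.
SENTENCE_FIELDS: List[str] = [
    "tid", "type", "begin", "end", "payload_idx",
    "speaker", "part_id", "sentiment", "classification", "classifications",
]


def entry_to_dict(entry: List[Any], fields: List[str]) -> Dict[str, Any]:
    """Pair each positional value of a list entry with its field name."""
    if len(entry) != len(fields):
        raise ValueError(
            f"expected {len(fields)} values, got {len(entry)}"
        )
    return dict(zip(fields, entry))


raw_entry = [
    171792711812874531962213686690228233530,
    "ft.onto.base_ontology.Sentence",
    0, 164, 0, "-", 0, {}, {}, {},
]
as_dict = entry_to_dict(raw_entry, SENTENCE_FIELDS)
```

The dictionary keeps the same primitive values as the list; only the positional indices are replaced by names, which is what makes the entry easier to read.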
Possible influences of this PR.
By allowing DataPack or MultiPack to fetch entries in their primitive form, users can work with the data held in DataStore more easily.
Test Conducted
The get method with the get_raw argument set to True was tested in data_pack_test.py and multi_pack_test.py.
Codecov Report
Merging #900 (77f4483) into master (72e8bce) will increase coverage by 0.05%. The diff coverage is 92.59%.
@@ Coverage Diff @@
## master #900 +/- ##
==========================================
+ Coverage 80.87% 80.93% +0.05%
==========================================
Files 253 253
Lines 19619 19677 +58
==========================================
+ Hits 15867 15925 +58
Misses 3752 3752
| Impacted Files | Coverage Δ | |
|---|---|---|
| tests/forte/data/data_store_serialization_test.py | 98.43% <ø> (ø) | |
| tests/forte/data/data_store_test.py | 95.58% <ø> (ø) | |
| forte/data/multi_pack.py | 83.01% <80.00%> (+0.82%) | :arrow_up: |
| forte/data/data_pack.py | 84.90% <86.36%> (-0.37%) | :arrow_down: |
| forte/data/data_store.py | 93.31% <95.23%> (+0.39%) | :arrow_up: |
| forte/data/base_pack.py | 76.75% <100.00%> (+0.07%) | :arrow_up: |
| forte/data/ontology/top.py | 78.16% <100.00%> (+0.05%) | :arrow_up: |
| tests/forte/data/data_pack_test.py | 98.98% <100.00%> (+0.13%) | :arrow_up: |
| tests/forte/data/multi_pack_test.py | 97.05% <100.00%> (+0.15%) | :arrow_up: |
| ... and 2 more | | |
Quick comment on the title: it should not say "fetch entries directly from Data Store" but rather "fetch primitive types from data pack". The data store is still invisible to users.