Get the oldest available data?
Is there a way to get the date of the oldest available data in Google Fit?
With some Apple HealthKit packages, you can set the start date to 1970 and the end date to today, and limit the results to just 1, meaning the query returns as soon as the first result is found. Is there a similar workaround?
Interesting question. Based on https://developers.google.com/android/reference/com/google/android/gms/fitness/request/DataReadRequest.Builder#setLimit(int)
it looks like a definite yes, with a lot of flexibility, but it needs testing to see whether it really works.
Other than that, depending on what kind of data you are trying to get, the approach could be very "challenging".
For example, step data is a special case: it is broken into many small buckets, so the first data point could span a year or one second depending on your bucket unit.
Without relying on any such feature, my approach with the current tools is to start with:
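import moment from 'moment';

const opt = {
  // ISO 8601 strings, from the epoch up to the end of today
  startDate: moment([1970]).startOf('day').toISOString(),
  endDate: moment().endOf('day').toISOString(),
  bucketInterval: 365, // one bucket per 365 days
  bucketUnit: 'DAY',
};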
The 365-day interval locates the year in which the oldest data falls, and it is pretty fast because there are no more than 60 buckets (1970-2021) at most.
We could probably use the day difference between 1970 and today as the interval instead, to get exactly one bucket with all steps, but I'm not sure whether the Fitness API would still give us the correct startDate instead of 1970. You can test it out; if it does, that's the fastest way!
After getting that data, check the entries inside rawSteps. The startDate of the first entry should give you the oldest timestamp you are looking for, but with the wrong value, since it is the accumulated total over 365 days.
Then take that startDate and do a second search:
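// startDate here is the startDate of the first entry from the previous query
const opt = {
  startDate: moment(startDate).startOf('day').toISOString(),
  endDate: moment(startDate).endOf('day').toISOString(),
  bucketInterval: 1, // one bucket per hour within that day
  bucketUnit: 'HOUR',
};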
Again, take the first step entry.
Depending on what you need, this will give you the oldest available data within a 1-hour span.
You can think of the entire process as something like a binary search; this is probably the fastest and most accurate approach I can think of.
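The two-pass idea above can be sketched roughly as follows. This is only an illustration under assumptions: `queryBuckets` is a hypothetical stand-in for whatever bucketed step query you use, and it is assumed to resolve to an array of `{ startDate, value }` entries sorted oldest first, where `startDate` is the timestamp of the first real data in the bucket (which, as noted above, still needs to be verified). The day boundaries are computed in UTC for simplicity.

```javascript
// Hypothetical helper: queryBuckets(opt) is NOT a real library call. It stands
// in for your bucketed read and must resolve to [{ startDate, value }, ...],
// oldest first, containing only buckets that actually hold data.
async function findOldestBucket(queryBuckets, now = new Date()) {
  // Pass 1: coarse scan from the epoch in 365-day buckets.
  const coarse = await queryBuckets({
    startDate: new Date(0).toISOString(),
    endDate: now.toISOString(),
    bucketInterval: 365,
    bucketUnit: 'DAY',
  });
  if (coarse.length === 0) return null; // no data at all

  // Pass 2: rescan only the (UTC) day of the first hit, in 1-hour buckets.
  const first = new Date(coarse[0].startDate);
  const dayStart = new Date(
    Date.UTC(first.getUTCFullYear(), first.getUTCMonth(), first.getUTCDate())
  );
  const dayEnd = new Date(dayStart.getTime() + 24 * 60 * 60 * 1000 - 1);
  const fine = await queryBuckets({
    startDate: dayStart.toISOString(),
    endDate: dayEnd.toISOString(),
    bucketInterval: 1,
    bucketUnit: 'HOUR',
  });

  // Fall back to the coarse hit if the fine pass unexpectedly finds nothing.
  return fine.length > 0 ? fine[0] : coarse[0];
}
```

Passing the query function in as a parameter keeps the sketch independent of any particular package version, and makes it easy to try against fake data before running it on a device.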
@aboveyunhai Would you say the "binary search" approach is the best one for something like heart rate, where bucketing doesn't work (unless bucketing not working for HR is a bug)?
@shmkane I did read through the HR code. It doesn't use buckets internally, so I cannot give you an answer without trying.
If it works, then yes. Two separate fast queries that return only a few dozen entries in the worst case vs. one query that blindly searches your entire history and could take forever and crash your app: I would personally choose the former.
Of course, this all assumes that the Google Fit API itself does not provide a built-in function for this.
For data without buckets, we do indeed have to find another approach.
@aboveyunhai This is borderline a different issue, but after some frustration getting this sort of functionality to work, I found this SO thread that claims that you can't read more than a certain timeframe back. I wasn't sure if this project had support for the mentioned enableServerQueries().
My current naive solution for no-bucket data is to start from the year 2000 (could be older... but let's be real, who has older data in Fit?) and iteratively, month by month, query and check whether any data exists. Once I find a month that contains data, its first data point should be the oldest data object available...
However, because I can't look back more than around a couple weeks, I'm stuck.
Am I approaching this the wrong way?
Furthermore, I'd love to contribute to this project, however I'm not sure how to get a working local copy of this repo. Aside from the contribution doc, is there anywhere else I can find info?
@shmkane 1970 is the earliest timestamp the Google Fit API accepts. If you query before that, you will get an app crash if there is no error handler.
So I think enableServerQueries() is meant to deal with history that is not synced across multiple devices, rather than meaning that you can't read more than a certain timeframe back, as long as the data exists on the server or on local devices tied to that account. But this does point out that we need some workaround for it.
However, based on my past tests, 2500-3000 entries is the rough threshold for a single DataReadRequest, which means that if your app makes a relatively wide-range request (5 years with 1-day buckets, or 1 hour with 1s buckets), it might get stuck or even crash because the amount of data exceeds the threshold, assuming you get a perfect 3600 entries for that particular hour.
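To make that threshold concrete, here is a small sketch that estimates how many buckets a request would produce and splits a wide range into smaller requests. The limit of 2500 is only the rough figure from my tests above, not a documented constant, and the function names are made up for illustration.

```javascript
// Assumption: rough per-request threshold from informal tests, not an official limit.
const MAX_BUCKETS = 2500;

// Number of buckets a request covering [startMs, endMs) would produce.
function bucketCount(startMs, endMs, bucketMs) {
  return Math.ceil((endMs - startMs) / bucketMs);
}

// Split [startMs, endMs) into consecutive sub-ranges of at most maxBuckets buckets each.
function splitRange(startMs, endMs, bucketMs, maxBuckets = MAX_BUCKETS) {
  const step = bucketMs * maxBuckets;
  const ranges = [];
  for (let s = startMs; s < endMs; s += step) {
    ranges.push({ startMs: s, endMs: Math.min(s + step, endMs) });
  }
  return ranges;
}

// One hour with 1-second buckets is 3600 buckets, so it gets split in two.
const oneHour = 60 * 60 * 1000;
console.log(bucketCount(0, oneHour, 1000)); // 3600
console.log(splitRange(0, oneHour, 1000).length); // 2
```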
So for your naive solution, I feel it might just be slow; it could be faster if we could use buckets.
Unfortunately setLimit only takes the latest entries. Hopefully I missed some doc that describes the reverse operation; otherwise it's quite annoying to build a workaround.
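For reference, the month-by-month scan for unbucketed data discussed above could look roughly like this. It is a sketch under assumptions: `querySamples` is a hypothetical stand-in for an unbucketed read (heart rate, for example) that resolves to an array of samples sorted oldest first. It does nothing about data that is not available locally, and a busy month may still return more entries than the rough threshold mentioned above.

```javascript
// Hypothetical helper: querySamples(startISO, endISO) is NOT a real library
// call. It stands in for an unbucketed read and must resolve to an array of
// samples, oldest first.
async function findOldestSample(querySamples, fromYear = 2000, now = new Date()) {
  let cursor = new Date(Date.UTC(fromYear, 0, 1));
  while (cursor < now) {
    // Start of the following month; Date.UTC rolls month 12 over into the next year.
    const next = new Date(
      Date.UTC(cursor.getUTCFullYear(), cursor.getUTCMonth() + 1, 1)
    );
    const end = next < now ? next : now;
    const samples = await querySamples(cursor.toISOString(), end.toISOString());
    if (samples.length > 0) return samples[0]; // first non-empty month holds the oldest sample
    cursor = next;
  }
  return null; // nothing found between fromYear and now
}
```

In the worst case this issues one query per month since `fromYear`, so it is slow but each individual request stays small.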
@aboveyunhai not worried about anything before 1970 😅
I do want to explicitly ask: heart rate doesn't have an option to bucket right?
A bit about my setup- I'm using an Apple Watch connected to an iPhone with Google Fit installed. I have a separate Samsung S10 that also has Google Fit installed. The device I'm testing on is the S10.
You mentioned unsync across multiple devices - I wonder if I'm experiencing this issue because of the devices?
For example, I know for a fact that I have HR data from my watch going back to 2019, but if I try to query for data from the beginning of September, I get an empty array returned.
HR does not have that option in the current implementation (it was implemented by other devs long before I started helping to maintain the project), but it could. For your second question: yes, that might be the issue. You can install the official Google Fit app on your S10 to see whether the data exists. I also believe the current implementation only reads local data.
The official Google Fit app is installed on my S10, and I can see the data in the app, but I'm unable to read it from this package.
To be clear, I can get the past week or so of data, but nothing from the beginning of last month or earlier.
@shmkane There are a few things that can cause the issue.
First, you are able to read the data, but it only exists on the server, so you could use enableServerQueries() to test that.
Second, the data was created by the Google Fit app. By default, you will not read data created by other apps. This one requires some modifications to rework the current implementation.
A caveat: you can read, but are not allowed to update, data created by other apps, and vice versa.
Either one requires understanding and modifying the source code.
Did you find a way to enable ServerQueries in this project?