
Missing data display in time series

Open kennyxue opened this issue 3 years ago • 14 comments

Describe the bug Data plot error with the Y axis scale and zoom for time series.

To Reproduce Steps to reproduce the behavior:

  1. Set up a time series labeling project with the attached data and config file
  2. Zoom in and out in the data display panel
  3. See the error: the data line doesn't display.

Expected behavior fix the data plot error for time series.

Screenshots failure

Environment (please complete the following information):

  • OS: ISO WINDOWS10
  • Label Studio Version 0.8.0

Additional context attached data and config file config.xml.csv

2020-12-1comb_history.csv

kennyxue avatar Nov 18 '21 00:11 kennyxue

We have the same issue with files that have more than 20k data points

https://user-images.githubusercontent.com/55290094/142640603-ace1bda5-7719-4dfa-bba7-c8d8b48629ec.mp4


Fourkane avatar Nov 19 '21 14:11 Fourkane

Been having the same problem.

jardinetsouffleton avatar Nov 19 '21 14:11 jardinetsouffleton

I'm facing the exact same problem on Ubuntu (docker) and Windows 10 (pip installation). Unfortunately, the time series editor is the key feature I am really looking forward to trying out and integrating into my workflow.

swssl avatar Apr 01 '22 07:04 swssl

Same thing here, Windows 10, Chrome browser.

jeroenboeye avatar May 18 '22 10:05 jeroenboeye

Hi, same thing here: my time series are not displayed when the number of data points to be displayed is apparently too high. Attached are a view where the time series are visible (slider zoom at the bottom set to 3 weeks) and another where the time series visualization gets buggy (when I try to visualize more than 1 month of data). Note that the default zoom is set too large.

view1_normal view2_bug

michaelhoarau avatar May 24 '22 11:05 michaelhoarau

@michaelhoarau Thank you for your report. Is it possible to share the data and the labeling config to reproduce this issue?

makseq avatar May 27 '22 00:05 makseq

@makseq

I have a similar issue. I can drag out the x axis and see a complete view of my data, but the label cursor will not appear and the y axis will bug out with time series data containing H:M:S format. It works with just dates though.

Works: 2022/2/7

Doesn't work: 2022/2/7 12:23:42

Buggy Y axis: (five screenshots attached)

dataset: total.csv

config:

<View>
  <TimeSeries name="ts" valueType="url" value="$csv" sep="," timeColumn="time_y" timeFormat="%Y-%m-%d %H:%M:%S" overviewChannels="toplevel">
    <Channel column="toplevel" units="ft" displayFormat=",.1f" strokeColor="#1f77b4" legend="toplevel"/>
  </TimeSeries>
  <!-- Control tag for region labels -->
  <TimeSeriesLabels name="label" toName="ts">
    <Label value="Leak" background="red"/>
  </TimeSeriesLabels>
</View>

berrnine avatar May 31 '22 14:05 berrnine

Thank you for providing the data & config, it helps us a lot!

makseq avatar Jun 01 '22 22:06 makseq

I believe this might be a sort order problem with the data imported in the time column. If I take either of the datasets, and order it prior to uploading the data by the time column in either asc or desc, there are no errors and it works entirely as expected. I am going to take a look further into other precondition processes to see if we can handle this automatically without impacting other workflows. In the meantime, can you ensure the imported data is sorted in order of the time column?
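A minimal sketch of the pre-sorting workaround described above, using only the Python standard library. The helper name `sort_csv_by_time` and the sample rows are made up for illustration; the column name and format string mirror the `timeColumn` and `timeFormat` values from the config posted earlier. Python's `strptime` is used here only to order the rows, and it is an assumption that its directives match how Label Studio parses `timeFormat`.

```python
import csv
import io
from datetime import datetime


def sort_csv_by_time(text, time_column, time_format, sep=","):
    """Return CSV text with rows sorted ascending by the parsed time column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=sep)
    rows = list(reader)
    # Sort on the parsed timestamp, not on the raw string.
    rows.sort(key=lambda row: datetime.strptime(row[time_column], time_format))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter=sep, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample rows, deliberately out of order.
sample = (
    "time_y,toplevel\n"
    "2022-02-07 12:23:42,3.1\n"
    "2022-02-07 12:23:40,2.9\n"
    "2022-02-07 12:23:41,3.0\n"
)
sorted_text = sort_csv_by_time(sample, "time_y", "%Y-%m-%d %H:%M:%S")
print(sorted_text)
```

For a real file, read its contents into a string, pass them through the helper, and upload the result in place of the original CSV.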

bmartel avatar Jun 08 '22 20:06 bmartel

As for datasets which are too large causing an issue, I don't know that this is necessarily the same issue. I'll try to reproduce that, but as a first check I would make sure the datasets are ordered correctly by time: in the code, I am currently seeing problems with datasets that are ordered by a column other than the timeColumn.

Output based on original csv total.csv

Screen Shot 2022-06-08 at 3 52 50 PM

Output based on the csv sorted by time_y ascending: Untitled spreadsheet - total.csv

Screen Shot 2022-06-08 at 3 52 22 PM

bmartel avatar Jun 08 '22 20:06 bmartel

I believe this might be a sort order problem with the data imported in the time column. If I take either of the datasets, and order it prior to uploading the data by the time column in either asc or desc, there are no errors and it works entirely as expected. I am going to take a look further into other precondition processes to see if we can handle this automatically without impacting other workflows. In the meantime, can you ensure the imported data is sorted in order of the time column?

Thank you, this worked!

berrnine avatar Jun 09 '22 14:06 berrnine

I'm having the same problem even with my data sorted by the time column. Any thoughts?

Data: data.csv

Config:

<View>
    <!-- No region selected section -->
    <View visibleWhen="no-region-selected" style="height:120px">

        <!-- Control tag for region labels -->
        <TimeSeriesLabels name="label" toName="ts">
            <Label value="Region" background="#5b5"/>
        </TimeSeriesLabels>
    </View>

    <!-- Region selected section with choices and rating -->
    <View visibleWhen="region-selected" style="height:120px">
        <!-- Per region Choices  -->
        <Choices name="choices" toName="ts" showInline="true" required="true" perRegion="true">
            <Choice value="Good"/>
            
            
        </Choices>
    </View>

    <!-- Object tag for time series data source -->
    <TimeSeries name="ts" valueType="url" value="$csv" sep="," timeColumn="time" timeFormat="%Y-%m-%d %H:%M:%S.%f">
        <Channel column="thing1" strokeColor="#17b" legend="thing1"/>
        <Channel column="thing2" strokeColor="#17b" legend="thing2"/>
        <Channel column="thing3" strokeColor="#17b" legend="thing3"/>
        <Channel column="thing4" strokeColor="#17b" legend="thing4"/>
        <Channel column="thing5" strokeColor="#17b" legend="thing5"/>
        <Channel column="thing6" strokeColor="#17b" legend="thing6"/>
        <Channel column="thing7" strokeColor="#17b" legend="thing7"/>
        <Channel column="thing8" strokeColor="#17b" legend="thing8"/>
        <Channel column="thing9" strokeColor="#17b" legend="thing9"/>
    </TimeSeries>
</View>

sopeeweje avatar Jul 08 '22 02:07 sopeeweje

Hi, I tried different things concerning this issue and want to share my findings. Used software:

  • LS v1.5.0 (Local pip installation)
  • Python 3.9
  • Windows 10, Edge Browser

Used Dataset: https://archive.ics.uci.edu/ml/datasets/Appliances+energy+prediction

Used labeling interface:

<View>
    <TimeSeries name="ts" valueType="url" value="$csv"
                sep=";"
                timeColumn="date"
                timeFormat="%Y-%m-%d %H:%M:%S"
                timeDisplayFormat="%Y-%m-%d %H:%M:%S">
      <Channel column="T1"
                 units="unit"
                 displayFormat=",.2f"
                 strokeColor="#ff1122"
                 legend="T1"/>
      <Channel column="Visibility"
                 units="unit"
                 displayFormat=",.2f"
                 strokeColor="#1f77b4"
                 legend="Visibility"/>
      <Channel column="rv1"
                 units="unit"
                 displayFormat=",.2f"
                 strokeColor="#1f77b4"
                 legend="random"/>  
    </TimeSeries>
    <TimeSeriesLabels name="label" toName="ts">
        <Label value="Region" background="red" />
    </TimeSeriesLabels>
    <Choices name="region_type" toName="ts"
          perRegion="true" required="true">
        <Choice value="Outlier"/>
        <Choice value="Anomaly"/>
    </Choices>
</View>

What I found:

  1. When importing the time series data as-is (after deleting the " characters and replacing , with ;), all three channels (T1, Visibility, rv1) are visible for a window of about 1970 data points (each covering a 10-minute period, see the dataset description). When zooming out, the T1 column behaves as described above: the scale of the graph seems to shift so that only the peaks are visible (Screenshot_20220713_134205). When this happens and I shift the time window to touch the right end of the data, suddenly everything is hidden. This behaviour doesn't depend on the order of the channels (T1 is always affected), the colour settings, or the number of channels (I tried 2 and 3).

  2. When copying the data from the working "Visibility" column to the "T1" column, everything works fine, independent of order, colour settings, etc. I didn't touch (or sort) the "date" column, so is it possible that I overlooked something in the documentation? Or is there another rule that the data has to fulfill?

- Thanks
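One rule suggested earlier in this thread is that rows be ordered by the time column. A rough diagnostic sketch for checking that, and for spotting timestamps that do not match the configured format, might look like the following. The helper `check_time_column` and the sample rows are invented for illustration, and Python's `strptime` is only assumed to approximate how Label Studio interprets `timeFormat`.

```python
import csv
import io
from datetime import datetime


def check_time_column(text, time_column, time_format, sep=";"):
    """Count rows whose timestamp fails to parse or goes backwards in time."""
    unparsable = 0
    out_of_order = 0
    previous = None
    for row in csv.DictReader(io.StringIO(text), delimiter=sep):
        try:
            current = datetime.strptime(row[time_column], time_format)
        except ValueError:
            unparsable += 1
            continue
        # Compare each parsable timestamp with the last parsable one.
        if previous is not None and current < previous:
            out_of_order += 1
        previous = current
    return {"unparsable": unparsable, "out_of_order": out_of_order}


# Made-up rows: the third goes backwards, the fourth uses another format.
sample = (
    "date;T1\n"
    "2016-01-11 17:00:00;19.89\n"
    "2016-01-11 17:20:00;19.89\n"
    "2016-01-11 17:10:00;19.89\n"
    "11/01/2016 17:30;19.85\n"
)
report = check_time_column(sample, "date", "%Y-%m-%d %H:%M:%S")
print(report)
```

A report with both counts at zero would rule out ordering and format mismatches as the cause, which would point towards the channel values themselves.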

swssl avatar Jul 13 '22 12:07 swssl

Hello, I'm facing this problem: when I zoom or select a label, the time series often goes out of scale. Should I create another issue for that or is it related?

See the third time series on the first image and the second one on the second plot (starting from the top): the top part is missing.

image

image

pidefrem avatar Aug 12 '22 09:08 pidefrem

Also seeing this with large segments: everything shows fine for ~38,000 data points, but when I zoom out further than that, the time series are clipped along the top:

image

image

(You can also see in the latter image that the highlight on the summary bar below the plots is wrong: it should cover most of the time series.)

Video:

Screencast from 20-10-22 08:59:36.webm

Datasets are sorted in ascending order by time (left-most column of CSV). Label config below:

<View>
  <Header value="Record version" />
  <Text name="record-version" valueType="text" value="$record_version" />

  <Header value="Original filename" />
  <Text name="original-filename" valueType="text" value="$original_filename" />

  <Header value="Tidal breathing H2S" />

  <Header name="flow-rate-label" value="(Negative flow rate = inhale)" size="6" />

  <TimeSeriesLabels name="tidal-breathing-labels" toName="tidal-breathing">
    <Label value="h2s-baseline"/>
    <Label value="tidal-breathing"/>
  </TimeSeriesLabels>

  <TimeSeries
    name="tidal-breathing"
    valueType="url"
    value="$csv"
    timeColumn="millis"
    overviewChannels="flow,h2s"
  >
    <Channel column="flow" units="L/m" displayFormat=",.1f" legend="Flow rate"></Channel>
    <Channel column="h2s" units="ppm" displayFormat=",.1f" legend="H2S concentration"></Channel>
  </TimeSeries>

  <Header value="Data errors" size="5" />
  <Choices name="errors" toName="tidal-breathing" choice="multiple">
    <Choice value="Flow rate"/>
    <Choice value="H2S"/>
  </Choices>

  <TextArea name="notes" toName="tidal-breathing" placeholder="Notes" />
</View>

harrymander avatar Oct 19 '22 20:10 harrymander

There are currently a couple of different issues here; I have a fix for the main one under review. We will hopefully have this resolved soon, and I will update this ticket as soon as it's merged.

bmartel avatar Oct 20 '22 12:10 bmartel

Sorry to be that guy, but is there an update on this? 1.7.0 seemed to improve it somewhat but I am still seeing issues especially when there is a large range in the y-axis. Issue seems to be related to weird behaviour with the overview box - box gets bigger when zooming in, disappears etc.

https://user-images.githubusercontent.com/41089556/215891506-01107263-b3da-4521-8d1e-6ee16d02c8a5.mp4

harrymander avatar Jan 31 '23 21:01 harrymander

@harrymander Can you provide the exact config and a sample dataset which was used in the screen recording? This should be fixed in 1.7.0, but from this recording it looks like the fix didn't quite cover all of the issues.

bmartel avatar Jan 31 '23 22:01 bmartel

@harrymander no need to be sorry, we appreciate your patience and persistence. We're working on tightening up the feedback loop from community issues, and are grateful that you're helping make this experience better for everyone.

hogepodge avatar Jan 31 '23 22:01 hogepodge

@bmartel I have this config, and the dates look like 12/8/2022 4:45:00 AM. I can't read the data and I think it's because of the date format. Any ideas, please?

<View>
    <TimeSeries name="ts" valueType="url" value="$csv" sep="," timeColumn="time" timeFormat="%m/%d/%Y %H:%M:%S %p">
        <Channel column="A-1"/>
        <Channel column="A-2"/>
        <Channel column="A-3"/>
        <Channel column="A-4"/>
    </TimeSeries>
    <TimeSeriesLabels name="label" toName="ts">
        <Label value="0" background="red"/>
        <Label value="1" background="#FFA39E"/>
    </TimeSeriesLabels>
</View>

test.csv

Screenshot 2023-07-20 161748
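The sample timestamp uses a 12-hour clock with AM/PM, while the config pairs `%p` with `%H`, which in strftime-style formats is the 24-hour hour (`%I` is the 12-hour one). Whether that mismatch is the cause is not confirmed in this thread. One thing to try, sketched below, is rewriting the time column into the 24-hour `%Y-%m-%d %H:%M:%S` layout used by other configs here and setting `timeFormat` to match. The helper name `to_24_hour` is made up, and `%p` parsing in Python assumes an English-style locale.

```python
from datetime import datetime

# 12-hour source layout, as in "12/8/2022 4:45:00 AM".
SOURCE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
# 24-hour target layout, to pair with timeFormat="%Y-%m-%d %H:%M:%S".
TARGET_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_24_hour(value):
    """Convert one 12-hour timestamp string to the 24-hour target layout."""
    return datetime.strptime(value, SOURCE_FORMAT).strftime(TARGET_FORMAT)


converted = to_24_hour("12/8/2022 4:45:00 AM")
print(converted)  # 2022-12-08 04:45:00
```

Applying the helper to every value in the `time` column before upload would leave the channel columns untouched.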

khadijakhaldi avatar Jul 20 '23 21:07 khadijakhaldi