neptune-client

Feature Request: fetch runs table with selected fields only

Open wjaskowski opened this issue 4 years ago • 8 comments

In my project I have 800 runs, each of which has 18000 hyperparameters. As a result,

project.fetch_runs_table()

takes 51 seconds.

I rarely need all those parameters. Usually, I need just the run IDs or a few selected columns. Maybe you also do not need the additional traffic ;). I wish I had something like:

project.fetch_runs_table(columns=['sys/run_id', 'node/param1'])

wjaskowski avatar Oct 12 '21 07:10 wjaskowski

Hey @wjaskowski

Prince here,

Thank you very much for bringing this up!

I will pass it to the engineering team.

Blaizzy avatar Oct 12 '21 11:10 Blaizzy

I did some work for you, guys. Here are the profiler results for fetching the leaderboard. I leave the interpretation to you, but it seems to me that the performance issues are on the client side and depend on the number of objects sent.

image

@Herudaio

wjaskowski avatar Oct 26 '21 12:10 wjaskowski

It already takes 1 minute and 15 seconds to call fetch_runs_table for my project...

wjaskowski avatar Oct 29 '21 07:10 wjaskowski

Hi Wojciech

Thank you very much for such detailed profiling 💯.

I have contacted the engineering team.

So either they will reach out soon, or I will let you know their comments and the path forward.

Blaizzy avatar Oct 29 '21 08:10 Blaizzy

Today it takes 1 min 46 seconds for fetch_runs_table. It is getting completely unusable...

wjaskowski avatar Nov 01 '21 20:11 wjaskowski

My dataframe obtained by fetch_runs_table() takes 1.4GB. Are you planning to do anything about this issue?

image

wjaskowski avatar Aug 11 '22 13:08 wjaskowski

At the same time, the object from which the pandas dataframe is converted takes >7GB: image

Simply unusable.

wjaskowski avatar Aug 11 '22 14:08 wjaskowski

Hey @wjaskowski, filtering the results by column names should be available soon. Optimistically within 2 weeks, but may be up to 4 - it's already in the short dev queue.

Herudaio avatar Aug 12 '22 14:08 Herudaio

Hello @wjaskowski, sorry for the delay in communication here.

We introduced this feature in neptune-client release 0.16.7.

You can use the columns parameter of the fetch_runs_table() method to filter the columns you need. More information is available in our API reference here: https://docs.neptune.ai/api/project/#fetch_runs_table.
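As a rough illustration of the call described above, here is a minimal sketch. The helper name `fetch_selected_columns` is hypothetical, and the project name in the usage comment is a placeholder. It also assumes a project object obtained through the client (for example via `neptune.new.get_project`, which may differ between client versions) and that `sys/id` is the field holding the run identifier.

```python
from typing import Any, Sequence


def fetch_selected_columns(project: Any, columns: Sequence[str]) -> Any:
    """Fetch the runs table restricted to `columns` and convert it to pandas.

    `project` is assumed to be a Neptune project object exposing
    fetch_runs_table(columns=...); this helper itself is hypothetical.
    """
    # Pass only the needed field paths, so the full set of fields per run
    # is not requested when only a few are needed.
    table = project.fetch_runs_table(columns=list(columns))
    return table.to_pandas()


# Usage against a real project (assumes neptune-client >= 0.16.7 and a
# configured API token; the project name is a placeholder):
#
#   import neptune.new as neptune
#   project = neptune.get_project(name="my-workspace/my-project")
#   df = fetch_selected_columns(project, ["sys/id", "node/param1"])
```

Taking the project as an argument keeps the sketch independent of how the project object is created, which is the part most likely to vary across client versions.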

I am closing this feature request for now; however, please feel free to reach out in case you have any other questions. We appreciate your feedback and feature requests :)

SiddhantSadangi avatar Jan 25 '23 09:01 SiddhantSadangi