
[extract_round_trips] ValueError: cannot insert dt, already exists.

Open prediction-labs opened this issue 7 years ago • 9 comments

Hi. I'm using pyfolio to analyze output from a zipline run.

import pandas as pd
import pyfolio as pf

perf = pd.read_pickle('1594.pickle')  # read in perf
returns, positions, transactions = pf.utils.extract_rets_pos_txn_from_zipline(perf)
rts = pf.round_trips.extract_round_trips(transactions, portfolio_value=positions.sum(axis='columns') / (returns + 1))

I then get this error.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-43-74bd4b98ea0f> in <module>()
----> 1 rts = pf.round_trips.extract_round_trips(transactions,portfolio_value=positions.sum(axis='columns') / (returns + 1))

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyfolio/round_trips.py in extract_round_trips(transactions, portfolio_value)
    197         rt_returns are the returns in regards to the invested capital
    198         into that partiulcar round-trip.
--> 199     """
    200 
    201     transactions = _groupby_consecutive(transactions)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyfolio/round_trips.py in _groupby_consecutive(txn, max_delta)
    122     out = []
    123     for sym, t in txn.groupby('symbol'):
--> 124         t = t.sort_index()
    125         t.index.name = 'dt'
    126         t = t.reset_index()

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
   2959         mapper, index, columns : dict-like or function, optional
   2960             dict-like or functions transformations to apply to
-> 2961             that axis' values. Use either ``mapper`` and ``axis`` to
   2962             specify the axis to target with ``mapper``, or ``index`` and
   2963             ``columns``.

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
   2447             exclude = (exclude,) if exclude is not None else ()
   2448 
-> 2449         selection = tuple(map(frozenset, (include, exclude)))
   2450 
   2451         if not any(selection):

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/internals.py in insert(self, loc, item, value, allow_duplicates)
   3508                         if m.any():
   3509                             b = b.coerce_to_target_dtype(d)
-> 3510                             new_rb.extend(b.putmask(m, d, inplace=True))
   3511                         else:
   3512                             new_rb.append(b)

ValueError: cannot insert dt, already exists
pf.create_round_trip_tear_sheet(returns, positions, transactions, round_trips=True)
In [25]:

pf.round_trips.extract_round_trips(transactions,portfolio_value=positions.sum(axis='columns') / (returns + 1))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-25-f2a68aa7b6d7> in <module>()
----> 1 pf.round_trips.extract_round_trips(transactions,portfolio_value=positions.sum(axis='columns') / (returns + 1))

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyfolio/round_trips.py in extract_round_trips(transactions, portfolio_value)
    197     """
    198 
--> 199     transactions = _groupby_consecutive(transactions)
    200     roundtrips = []
    201 

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pyfolio/round_trips.py in _groupby_consecutive(txn, max_delta)
    122         t = t.sort_index()
    123         t.index.name = 'dt'
--> 124         t = t.reset_index()
    125 
    126         t['order_sign'] = t.amount > 0

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
   2959                     name = tuple(name_lst)
   2960             values = _maybe_casted_values(self.index)
-> 2961             new_obj.insert(0, name, values)
   2962 
   2963         new_obj.index = new_index

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
   2447         value = self._sanitize_column(column, value)
   2448         self._data.insert(loc, column, value,
-> 2449                           allow_duplicates=allow_duplicates)
   2450 
   2451     def assign(self, **kwargs):

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/internals.py in insert(self, loc, item, value, allow_duplicates)
   3508         if not allow_duplicates and item in self.items:
   3509             # Should this be a different kind of error??
-> 3510             raise ValueError('cannot insert %s, already exists' % item)
   3511 
   3512         if not isinstance(loc, int):

ValueError: cannot insert dt, already exists

Any ideas would be much appreciated.

Thanks!

D

prediction-labs avatar Jan 21 '18 03:01 prediction-labs

Are there identical transactions with the same dt in there by any chance?

twiecki avatar Jan 30 '18 11:01 twiecki

I'm seeing this as well, and with only 1 transaction. I imagine this is a version difference in pandas. Going to dig in more.

>>> group_txn(transactions[:0])
[]
>>> group_txn(transactions[:1])
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/projects/zip/analyze.py", line 31, in group_txn
    t = t.reset_index()
  File "/usr/local/lib/python3.5/site-packages/pandas/core/internals.py", line 3445, in insert
    raise ValueError('cannot insert %s, already exists' % item)
ValueError: cannot insert dt, already exists
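
A minimal sketch of what seems to trigger this, using pandas only and made-up data. It assumes the transactions frame already carries a 'dt' column (which the workaround later in this thread suggests); the two lines from the traceback then try to insert a second 'dt' column. This is an illustration of the failure mode, not a confirmed diagnosis of pyfolio itself.

```python
import pandas as pd

# Toy transactions (made-up values), indexed by timestamp and ALSO
# carrying a 'dt' column. That the real frame looks like this is an assumption.
idx = pd.to_datetime(["2018-01-02 15:00", "2018-01-03 15:00"])
txn = pd.DataFrame(
    {
        "symbol": ["AAPL", "AAPL"],
        "amount": [10, -10],
        "price": [170.0, 172.0],
        "dt": list(idx),
    },
    index=idx,
)

# Mirror the lines shown in the traceback: name the index 'dt', then reset it.
t = txn.sort_index()
t.index.name = "dt"
try:
    t.reset_index()
    error_message = None
except ValueError as exc:
    # reset_index() cannot insert the index as 'dt' because that column exists.
    error_message = str(exc)

print(error_message)
```

Because the clash comes from the column name rather than from the number of rows, a single transaction would be enough to hit it, which matches the `transactions[:1]` case above.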

h55nick avatar Feb 25 '18 15:02 h55nick

@h55nick If you could fix this bug it'd be much appreciated!

twiecki avatar Feb 26 '18 11:02 twiecki

@twiecki I've got this working locally by using this version of round_trips.py: https://gist.github.com/h55nick/2847723d58e9ff3cf9ede7bcdf40c523

I'll freely admit that I did not do an in-depth dive here; my goal was simply to get it working. There are no additional specs etc., but it might be a helpful base.

h55nick avatar Feb 26 '18 14:02 h55nick

@h55nick Great, want to do a pull request?

twiecki avatar Feb 26 '18 14:02 twiecki

Hi guys. After making the fix above locally, I'm getting another issue:

gen_round_trip_stats raises a key error: KeyError: 'Column not found: returns'

Here's my attempt at getting the transaction PNL:

txn = pf.txn.make_transaction_frame(perf_df.transactions)
if txn.index.tzinfo is None:
    txn.index = txn.index.tz_localize('utc')
round_trips = pf.round_trips.extract_round_trips(txn)
round_trip_stats = pf.round_trips.gen_round_trip_stats(round_trips)
print(round_trip_stats)

I can see that round_trips has a rt_returns column but not a returns column.

My algo makes one round trip trade and I can see it when I print out round_trips.

Any ideas about this?

Thanks!

brettelliot avatar Nov 09 '18 19:11 brettelliot

To get returns you need to pass in portfolio_value, although that should be optional so the fact that it's raising an exception is a bug. PRs welcome!
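
As a rough sketch of what passing portfolio_value looks like, here is the expression from the original post applied to made-up data, using pandas only. The positions and returns values are hypothetical; the reading that dividing by (1 + return) recovers the value before that day's gain or loss is an interpretation of the formula, not something the pyfolio docs are quoted on here.

```python
import pandas as pd

# Made-up daily data: positions per asset (including cash) and portfolio returns.
dates = pd.to_datetime(["2018-01-02", "2018-01-03"])
positions = pd.DataFrame(
    {"AAPL": [5000.0, 5500.0], "cash": [5000.0, 5000.0]},
    index=dates,
)
returns = pd.Series([0.0, 0.05], index=dates)

# Same expression as in the original post: total value across columns,
# divided by (1 + that day's return).
portfolio_value = positions.sum(axis="columns") / (returns + 1)
print(portfolio_value)

# With pyfolio installed, the series would be passed as in the original post:
# rts = pf.round_trips.extract_round_trips(transactions, portfolio_value=portfolio_value)
```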

twiecki avatar Nov 12 '18 10:11 twiecki

Ah, that was it! Thanks!

brettelliot avatar Nov 14 '18 18:11 brettelliot

You could just drop the 'dt' column from transactions before calling extract_round_trips, as follows, without modifying the pyfolio code:

import pandas as pd
import pyfolio as pf

perf = pd.read_pickle('1594.pickle')  # read in perf
returns, positions, transactions = pf.utils.extract_rets_pos_txn_from_zipline(perf)

transactions.drop(['dt'], axis=1, inplace=True)

rts = pf.round_trips.extract_round_trips(transactions, portfolio_value=positions.sum(axis='columns') / (returns + 1))
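
A slightly more defensive variant of the workaround above, sketched with pandas only and made-up data. The helper name drop_dt_column is hypothetical (it is not part of pyfolio); it returns a copy instead of modifying transactions in place and does nothing if there is no 'dt' column.

```python
import pandas as pd


def drop_dt_column(transactions: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `transactions` without a 'dt' column (no-op if absent).

    Hypothetical helper for illustration; not part of pyfolio.
    """
    return transactions.drop(columns=["dt"], errors="ignore")


# Toy frame (made-up values) with both a datetime index and a 'dt' column.
idx = pd.to_datetime(["2018-01-02 15:00", "2018-01-03 15:00"])
txn = pd.DataFrame(
    {"symbol": ["AAPL", "AAPL"], "amount": [10, -10], "dt": list(idx)},
    index=idx,
)

cleaned = drop_dt_column(txn)

# The same steps that failed in the traceback now succeed on the cleaned frame.
cleaned.index.name = "dt"
flat = cleaned.reset_index()
print(list(flat.columns))
```

The cleaned frame can then be passed to extract_round_trips in place of transactions, while the original frame keeps its 'dt' column for any other code that relies on it.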

jihuang-taipei avatar Nov 17 '19 12:11 jihuang-taipei