PrefixSpan-py

How is this interpreted as a sequential database?

Open xander412 opened this issue 2 years ago • 4 comments

Actually, sequential databases look like [ [[1, 2], [1], [1, 3]], [[1, 2, 4], [3]], [[4, 5], [1], [4, 5, 6]] ], where each sequence is a list of itemsets. How can we give this as input to this algorithm?

xander412 avatar Jul 21 '22 15:07 xander412

Hey, did you figure it out? I got the same problem.

LeCarteloo avatar Jan 13 '23 11:01 LeCarteloo

No, man. It seems these are two different kinds of data, and they should be interpreted separately. I haven't come up with a workable solution.

xander412 avatar Jan 13 '23 12:01 xander412

I'm going to do the same thing: I plan to transform each itemset into a string representation, e.g. [ ["1,2", "1", "1,3"], ["1,2,4", "3"], ["4,5", "1", "4,5,6"] ]

If the order of items within an itemset can vary, just sort each itemset before converting it to a string.
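
A minimal sketch of that encoding, in case it helps. The function name `encode_itemsets` is made up for illustration and is not part of PrefixSpan-py; it sorts each itemset first so that [2, 1] and [1, 2] produce the same token.

```python
def encode_itemsets(database):
    """Turn every itemset into one canonical string token (hypothetical helper)."""
    return [
        # Sort first so the token does not depend on the original item order.
        [",".join(str(item) for item in sorted(itemset)) for itemset in sequence]
        for sequence in database
    ]


db = [
    [[1, 2], [1], [1, 3]],
    [[1, 2, 4], [3]],
    [[4, 5], [1], [4, 5, 6]],
]

encoded = encode_itemsets(db)
print(encoded)
```

The resulting list of string sequences should then be usable the same way the readme uses its integer example, assuming the library accepts any hashable item type.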

KASDmusic avatar May 01 '23 17:05 KASDmusic

The readme says "Outputs traditional single-item sequential patterns"; in other words, I don't think this implementation currently supports itemsets. As @KASDmusic suggests, you can use strings (or any hashable Python type, such as frozensets) instead of integers as your sequence items. However, each encoded itemset is then matched only as a whole, so you will not find subsequences built from subsets of those itemsets. For example, in the sequence ["4,5", "1", "4,5,6"] you will not find the subsequence ["4", "1", "5,6"], which should be a valid subsequence according to the PrefixSpan paper.
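
To make the difference concrete, here is a small sketch comparing the two notions of "subsequence". Both helper functions are hypothetical and written only for this comment; neither comes from PrefixSpan-py. The first uses itemset semantics (each pattern itemset must be a subset of a later sequence itemset, in order), the second uses the whole-token equality you get after string encoding.

```python
def is_itemset_subsequence(pattern, sequence):
    """True if each pattern itemset is a subset of a distinct sequence itemset, in order."""
    position = 0
    for wanted in pattern:
        wanted = frozenset(wanted)
        # Greedily advance to the first remaining itemset that contains `wanted`.
        while position < len(sequence) and not wanted <= frozenset(sequence[position]):
            position += 1
        if position == len(sequence):
            return False
        position += 1
    return True


def is_token_subsequence(pattern, sequence):
    """True if the pattern tokens appear in order, compared by plain equality."""
    remaining = iter(sequence)
    return all(token in remaining for token in pattern)


# Itemset semantics: {4} <= {4,5}, {1} <= {1}, {5,6} <= {4,5,6}.
itemset_match = is_itemset_subsequence([[4], [1], [5, 6]], [[4, 5], [1], [4, 5, 6]])

# String-encoded tokens: "4" never equals "4,5", so the match fails.
token_match = is_token_subsequence(["4", "1", "5,6"], ["4,5", "1", "4,5,6"])

print(itemset_match, token_match)
```

So with the string workaround, only patterns made of complete itemsets as they appear in the data can be counted.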

kittentronic avatar Apr 09 '24 12:04 kittentronic