[BUG] App cannot handle large integers

Open flystarhe opened this issue 3 years ago • 2 comments

Instructions

image

System information

  • Linux Ubuntu 20.04
  • FiftyOne installed from pip
  • FiftyOne v0.16.5, Voxel51, Inc.
  • Python 3.9.7

Describe the problem

The dataset size is 4, but the App shows 0 samples.

Code to reproduce issue

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset(
    "cifar100",
    splits=["test"],
    dataset_name="cifar100-test",
)

dataset.name = "cifar100-test"
dataset.persistent = True

session = fo.launch_app(dataset)

import fiftyone.core.utils as fou

for sample in dataset:
    sample["file_hash"] = fou.compute_filehash(sample.filepath)
    sample.save()

print(dataset)

session.dataset = dataset

from collections import Counter
from fiftyone import ViewField as F

filehash_counts = Counter(sample.file_hash for sample in dataset)
dup_filehashes = [k for k, v in filehash_counts.items() if v > 1]

print("Number of duplicate file hashes: %d" % len(dup_filehashes))

dup_view = (dataset
    .match(F("file_hash").is_in(dup_filehashes))
    .sort_by("file_hash")
)

print("Number of images that have a duplicate: %d" % len(dup_view))

session.view = dup_view

flystarhe, Sep 09 '22 09:09

Pardon the delay. I can reproduce. This is very odd, still debugging.

benjaminpkane, Sep 13 '22 22:09

These file hashes are integers that exceed the largest integer a browser/JSON number can represent exactly, so the hashes get cast to different integers in the App. I recommend using strings for hashes, and we will have to update any documentation accordingly.
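
To illustrate the cast, here is a minimal sketch that does not touch FiftyOne at all. It assumes the browser stores every JSON number as an IEEE 754 double, which is exact only for integers up to 2**53 - 1. The helper `as_js_number` and the two hash values are hypothetical and chosen only for illustration. Python's own `json` module keeps big integers exact, so the round trip through `float` stands in for what the browser does.

```python
import json

# Largest integer a double (and therefore a JavaScript number) represents exactly
MAX_SAFE_INTEGER = 2**53 - 1


def as_js_number(value: int) -> int:
    """Simulate a browser reading an integer as a double (hypothetical helper)."""
    return int(float(value))


# Two distinct, hypothetical 64-bit hash values
hash_a = 2**62 + 1
hash_b = 2**62 + 2

# Both exceed the safe range, so each is rounded to the nearest representable double
lossy_a = as_js_number(hash_a)
lossy_b = as_js_number(hash_b)

print(hash_a > MAX_SAFE_INTEGER)  # True
print(lossy_a == hash_a)          # False: the value changed in transit
print(lossy_a == lossy_b)         # True: two different hashes now look identical

# Stored as a string, the hash survives a JSON round trip unchanged
safe_a = json.loads(json.dumps(str(hash_a)))
print(safe_a == str(hash_a))      # True
```

In the reproduction script, this would amount to wrapping the hash in `str(...)` before assigning it to `sample["file_hash"]`, so that the filter the App receives matches the stored values.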

This also starts a larger discussion about supporting larger numbers in App views, so I will leave the issue open. Thanks for the report!

benjaminpkane, Sep 13 '22 23:09