laravel-datatables-export icon indicating copy to clipboard operation
laravel-datatables-export copied to clipboard

Export has duplicates

Open uchajk opened this issue 2 years ago • 11 comments

Export yields unexpected results. We see multiple duplicates in the exported excel file. The problem stems from this line: https://github.com/yajra/laravel-datatables-export/blob/4e7cbcb2029fd52f9ff6a7ae5313a42f6ff0b379/src/Jobs/DataTableExportJob.php#L122

Notice that if the filtered query is not ordered by id (or any unique column), the order of the rows might differ between the calls. I would suggest using lazyById, but then we would have to pass the name of the column we want to use for lazyById.

uchajk avatar Feb 05 '23 20:02 uchajk

Maybe we can use the value set in the ->setRowId() call?

However, I think doing so will be a change in behavior. Some of our users expect what they see on the screen would be in the same order when it was exported.

I am in favor of using lazyById but it should not be the default behavior. We can put it config I guess?

yajra avatar Feb 06 '23 03:02 yajra

I was actually thinking the opposite... When using chunking or lazy, you have to guarantee the order of rows by an unique column. If not, you take a risk of getting different order of results between the execution of the query, and therefore missing some rows and duplicating some. In fact, getFilteredQuery() could be ordered by any column chosen by the user. I was going to suggest the following solution:

$query = $dataTable->getFilteredQuery()->reorder(); $query = $query->orderBy($someUniqueColumn)->lazy($chunkSize);

The $someUniqueColumn can come from the $dataTable object, or passed down from the job. I understand that users expect order to be preserved when exporting, but I think accurate result takes precedence. Unless someone can come up with a better solution, I don't see how you can offer both.

uchajk avatar Feb 06 '23 15:02 uchajk

Can you provide snippets to reproduce the duplicates? I asked around my users and they haven't encountered any duplicates yet. Largest export so far is ~200k.

yajra avatar Feb 07 '23 00:02 yajra

I'm not allowed to share any code snippets, but I've created a demo: https://github.com/uchajk/duplicates

Be sure to run UserSeeder. We are using pgsql.

if you try to export users, you will actually see the duplicates in the excel spreadsheet. Obviously, if you lower chunk size, you will see more duplicates. Then if you try to sort by id and export again, in this case you don't see duplicates. This happens because when ordering by id, order is guaranteed across all executions of the filtered query when using lazy method. When sorting by "Updated At" which has the same values for all entries, order is different across the executions of the query.

uchajk avatar Feb 07 '23 20:02 uchajk

I tested your demo app using SQLite. I was able to filter 28 records and export yields to the same result. Filtered 1 and all 1k records, and got the expected results.

Might be pgsql specific issue? Will try to set up a pgsql later.

yajra avatar Feb 08 '23 01:02 yajra

@yajra $chunkSize has to be smaller than total number of records in filteredQuery. It's the different order between the calls of the query that results in duplicates. If chunk size is 1k, exporting 1k entries will only execute the query 1 time and there is no possibility of duplicates at all. If you were trying to export 6,752 entries with $chunkSize 1k, the query will get executed 7 times, each of the execution possibly having different order. Therefore, the entries that you processed in the first execution (0 to 999) might appear in the second execution (1000 to 1999) and you would be exporting these entries second time. Again, this could only happen if the filteredQuery is not ordered by an unique column (e.g. Updated At).

I suggest running php artisan db:seed --class=UserSeeder

uchajk avatar Feb 08 '23 02:02 uchajk

The seeder does not work on SQLite and always hits a unique constraint issue.

I tried to export 7275 records and got the expected output.

    public function query(): Builder
    {
        return DB::table('users')
                 ->where('name', 'like', '%a%');
    }

UI

image

Excel

image

yajra avatar Feb 08 '23 03:02 yajra

https://www.youtube.com/watch?v=li5RVhoRiMQ

uchajk avatar Feb 08 '23 15:02 uchajk

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Mar 11 '23 02:03 github-actions[bot]

@uchajk maybe orderFixed option would help in this case? Not tested but maybe you can add a post order defaulted to ID.

yajra avatar Mar 14 '23 03:03 yajra

@uchajk maybe orderFixed option would help in this case? Not tested but maybe you can add a post order defaulted to ID.

Yeah sounds like it might. I'll need to find some time to test this.

uchajk avatar Mar 15 '23 01:03 uchajk