AvS_FastSimpleImport

Unable to get speed on product importation

Open bijalbhavsar opened this issue 9 years ago • 14 comments

We are trying to import 1 Lac products in one request, and it takes around 10 hours to import them. Kindly suggest how we can improve performance.

Do we need to improve anything on the server to get better performance? What steps should we follow to import 100,000 records in 10 minutes?

Thanks in advance.

bijalbhavsar avatar Jan 29 '16 10:01 bijalbhavsar

We need more information for that:

  • What is "Lac"?
  • How many products are you importing?
  • Which type of products are you importing?
  • How do you start the import?
  • Are you using Partial Indexing?

avstudnitz avatar Jan 29 '16 12:01 avstudnitz

Hi Andreas,

Thank you for your response.

We have around 1,200,000 simple products to import. We created 12 CSV files with 100,000 products each, placed a script in the Magento root folder, and scheduled a cron job to run it.

We import 100,000 products at a time, and it takes approx. 10 to 13 hours to complete, which is far too long.

I am not sure what partial indexing is.

Regards, Bijal B.


bijalbhavsar avatar Jan 30 '16 18:01 bijalbhavsar

Hi Avstudnitz,

Did you get a chance to look at my concern about speeding up the import?

Awaiting your reply.

Thank You, Bijal

bijalbhavsar avatar Feb 02 '16 07:02 bijalbhavsar

Please try to answer all my questions; your answers are important for our understanding. Regarding partial indexing: there is a configuration setting at System -> Configuration -> Fast Simple Import named "Partial Indexing". Try turning that off; it will speed things up. You will have to do a full reindex afterwards. Generally, 1.2 million is a huge number of products. I don't think Magento runs well with that out of the box. I hope you have at least some indexing optimizations in place, or are using Enterprise Edition.
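The full reindex mentioned above can be done from the command line (`php shell/indexer.php reindexall` from the Magento root) or programmatically; a minimal sketch of the programmatic variant, using the standard Magento 1 indexer API:

```php
<?php
// Sketch: trigger a full reindex of every index process after the import.
// Assumes a bootstrapped Magento 1 instance (require 'app/Mage.php'; Mage::app();).
$processes = Mage::getSingleton('index/indexer')->getProcessesCollection();
foreach ($processes as $process) {
    $process->reindexEverything();
}
```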

avstudnitz avatar Feb 02 '16 07:02 avstudnitz

Hi Avstudnitz,

We already have "Partial Indexing" set to "Off". Is there any other way to speed up the process?

bijalbhavsar avatar Feb 05 '16 10:02 bijalbhavsar

Again: please answer all my questions, you haven't yet.

avstudnitz avatar Feb 07 '16 08:02 avstudnitz

@bijalbhavsar Please provide concrete code examples. Possible bottlenecks are:

  • Calling AvS_FastSimpleImport in a loop; always pass the complete array in at once.
  • If a lot of attribute options need to be created, it can take a while (I don't know if that has been fixed).
  • Please use the latest develop branch in combination with the latest Magento patches. This commit solves a memory issue and a huge slowdown: https://github.com/avstudnitz/AvS_FastSimpleImport/commit/8280983632d0f034d70d2d23cad78517d4de6393
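To illustrate the first point, a minimal sketch of the two patterns (the `$rows` variable is hypothetical; `processProductImport()` is the extension's entry point used later in this thread):

```php
<?php
// Slow: one importer call per product repeats validation and setup each time.
foreach ($rows as $row) {
    Mage::getModel('fastsimpleimport/import')->processProductImport(array($row));
}

// Fast: hand the complete array to a single importer call.
Mage::getModel('fastsimpleimport/import')->processProductImport($rows);
```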

Other than that, please answer @avstudnitz's questions. The importer can run pretty fast; the main concern is your indexing speed. If you're running CE, the importer's performance isn't the bottleneck. If you're running EE, your performance should be great.

paales avatar Feb 07 '16 17:02 paales

@avstudnitz @paales Kindly find my answers to each question below.

Q: What is "Lac"?
A: Lac = 100,000 records.

Q: How many products are you importing?
A: We need to import approx. 1,200,000 products.

Q: Which type of products are you importing?
A: Simple products.

Q: How do you start the import?
A: We use a cron job. A script reads records from a CSV file, builds arrays of 1,000 records in chunks, and each chunk is imported in a loop using the AvS_FastSimpleImport script.

Q: Are you using Partial Indexing?
A: Yes.

Let me know if you have any further questions.

bijalbhavsar avatar Feb 08 '16 05:02 bijalbhavsar

@paales - kindly find below the code snippet, which runs via the scheduled cron every 15 minutes.

    // Standalone script placed in the Magento root; bootstrap Magento first.
    require_once 'app/Mage.php';
    Mage::app();

    $range       = 100000;
    $logfilepath = Mage::getBaseDir('var') . DS . 'log' . DS . 'data' . DS . 'logs';
    $logfilename = 'feed' . DS . 'logs' . DS . 'feed_' . date("Y_m_d") . '.log';
    if (!file_exists($logfilepath)) {
        mkdir($logfilepath, 0777, true); // create the nested log directories recursively
    }

    // Map a visibility label from the feed to its Magento visibility ID.
    function getVisibilityId($visibility)
    {
        $visibilityArray = array(
            'Not Visible Individually' => 1,
            'Catalog'                  => 2,
            'Search'                   => 3,
            'Catalog, Search'          => 4,
        );
        return $visibilityArray[$visibility];
    }

    // Map a tax class label from the feed to its tax_class_id.
    function getTaxClassID($taxclass)
    {
        $taxclassArray = array(
            'None'          => 0,
            'Taxable Goods' => 2,
            'Shipping'      => 4,
        );
        return $taxclassArray[$taxclass];
    }

    $time = microtime(true);
    $path = "/feeds";

    // Pick the most recently changed feed file named "data_full_*".
    $latest_ctime    = 0;
    $latest_filename = '';
    $mainRowtitle    = array();
    $group_pricedata = array();
    $d = dir($path);
    while (false !== ($entry = $d->read())) {
        $filepath = "{$path}/{$entry}";
        if (strstr($entry, 'data_full_') !== false) {
            if (is_file($filepath) && filectime($filepath) > $latest_ctime) {
                $latest_ctime    = filectime($filepath);
                $latest_filename = $entry;
            }
        }
    }

    $file        = fopen("$path/$latest_filename", "r");
    $start_at    = $datafeed['start_at']; // $datafeed is prepared elsewhere (not shown)
    $break_value = $start_at + $range;

    if (count($datafeed) > 0) {

        try {
            $data        = array();
            $i           = 0;       // row counter, used to skip rows up to $start_at
            $ct          = 0;       // counter of imported (unique-SKU) rows
            $ij          = 0;       // chunk index; a new chunk starts every 1000 rows
            $bk          = 0;       // total rows read, used to stop at $break_value
            $skus        = array();
            $currentDate = date("Y/m/d");
            while (!feof($file)) {

                $bk++;
                if ($bk > $break_value) {
                    break;
                }
                $row = fgetcsv($file, 0, ';');
                if (($ct % 1000) == 0) {
                    if ($i == 0) {
                        // Header row: remember the column index of each title.
                        foreach ($row as $key => $titleRow) {
                            $mainRowtitle[$titleRow] = $key;
                        }
                    }
                    $ij++;
                }
                if ($i <= $start_at) { $i++; continue; }

                if (!in_array($row[$mainRowtitle['SKU']], $skus)) {
                    $skus[]      = $row[$mainRowtitle['SKU']];
                    $productName = trim($row[$mainRowtitle['Product flat name EN']]);
                    if ($productName == '') {
                        $productName = trim($row[$mainRowtitle['Vendor Name']] . ' - ' . $row[$mainRowtitle['Description']]);
                    }
                    $categoryid = $row[$mainRowtitle['Category ID']];
                    $weight     = trim($row[$mainRowtitle['Weight']]);
                    if ($weight == '') {
                        $weight = 0.00;
                    }

                    $data[$ij][] = array(
                        'store'             => 'admin',
                        '_product_websites' => 'base',
                        '_attribute_set'    => 'Default',
                        '_type'             => 'simple',
                        '_category'         => $categoryid,
                        'sku'               => trim($row[$mainRowtitle['SKU']]),
                        'has_options'       => '',
                        'name'              => $productName,
                        'vpn'               => trim($row[$mainRowtitle['VPN']]),
                        'ean'               => trim($row[$mainRowtitle['EAN0']]),
                        'price'             => trim($row[$mainRowtitle['Dealer Price']]),
                        'weight'            => $weight,
                        'status'            => '1', // always Enabled (the original ternary was constant)
                        'visibility'        => getVisibilityId('Catalog, Search'),
                        'tax_class_id'      => getTaxClassID('None'),
                        'manufacturer'      => trim($row[$mainRowtitle['Vendor Name']]),
                        'description'       => trim($row[$mainRowtitle['Description']]),
                        'short_description' => trim($row[$mainRowtitle['Description']]),
                        'qty'               => 1,
                        'is_in_stock'       => 1,
                        'product_name'      => trim($row[$mainRowtitle['Product flat name EN']]),
                        'store_id'          => 0,
                        'product_type_id'   => 'simple',
                        'product_flat_name' => trim($row[$mainRowtitle['Product flat name EN']]),
                        'product_status'    => trim($row[$mainRowtitle['Product Status']]),
                    );

                    $group_pricedata[$row[$mainRowtitle['SKU']]] = array(
                        1 => trim($row[$mainRowtitle['Price A1 incl. fees & vat']]),
                        2 => trim($row[$mainRowtitle['Price D1 incl. fees & vat']]),
                        3 => trim($row[$mainRowtitle['Price D2 incl. fees & vat']]),
                    );
                    $ct++;
                }
            }

            fclose($file);
        } catch (Exception $e) {
            // $import does not exist yet at this point, so log the exception itself.
            $message = $e->getMessage() . "\n Error Time :" . Mage::getModel('core/date')->date('Y-m-d H:i:s');
            Mage::log($message, null, $logfilename);
        }

        try {
            $totalct = 0;
            foreach ($data as $chunkData) {

                /** @var AvS_FastSimpleImport_Model_Import $import */
                $import = Mage::getModel('fastsimpleimport/import');
                $result = $import->processProductImport($chunkData);

                // import_group_price() is a custom addition, not part of the core extension.
                foreach ($chunkData as $productdata) {
                    $import->import_group_price($productdata['sku'], $group_pricedata[$productdata['sku']]);
                    $totalct++;
                }
            }
            $finalmessage = 'Total Import product : ' . $totalct . '  Elapsed time: ' . round(microtime(true) - $time, 2) . 's' . "\n";
            // Log the summary to the log file (the original passed $finalmessage as the file name).
            Mage::log($finalmessage, null, $logfilename);
            Mage::log($import->getErrorMessages(), null, $logfilename);
        } catch (Exception $e) {
            $message = $e->getMessage() . "\n Error Time :" . Mage::getModel('core/date')->date('Y-m-d H:i:s');
            Mage::log($message, null, $logfilename);
        }
    }

Let me know your views on this. We removed a few new attributes from the $data array to keep the snippet short.

bijalbhavsar avatar Feb 08 '16 06:02 bijalbhavsar

Importing only 1,000 products at a time might be the bottleneck. Try to set that number as high as possible, and if the memory runs out, increase the PHP memory limit. I'd aim for 100,000 products in one chunk.
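A minimal sketch of that advice, assuming the rows have already been collected into a single `$allRows` array (the variable name and the 4G limit are illustrative, not from the thread):

```php
<?php
ini_set('memory_limit', '4G'); // large chunks need generous memory

// One importer call per 100,000-row chunk instead of per 1,000 rows.
foreach (array_chunk($allRows, 100000) as $chunk) {
    $import = Mage::getModel('fastsimpleimport/import');
    $import->processProductImport($chunk);
}
```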

avstudnitz avatar Feb 08 '16 07:02 avstudnitz

Thanks @avstudnitz - let me try that and get back to you.

bijalbhavsar avatar Feb 08 '16 09:02 bijalbhavsar

@avstudnitz we updated the chunk size from 1,000 to 50,000, parsed a 50,000-product CSV file, and executed the script. It took 16,368.98 seconds to import 49,445 products.

Please suggest a workaround to improve the import speed.

bijalbhavsar avatar Feb 09 '16 11:02 bijalbhavsar

@bijalbhavsar Can you please run a profiler? The slowness you are experiencing isn't normal. I suggest running the import under XHProf with a small product set to debug this issue.
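For example, the import call could be wrapped in XHProf like this (a sketch; it assumes the `xhprof` PECL extension is installed and `$sampleRows` is a small hypothetical test array):

```php
<?php
// Profile CPU and memory while importing a small sample.
xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY);

$import = Mage::getModel('fastsimpleimport/import');
$import->processProductImport($sampleRows); // e.g. a few hundred rows

$profile = xhprof_disable();
// Persist the raw run for inspection in the xhprof_html UI.
file_put_contents('/tmp/import.xhprof', serialize($profile));
```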

paales avatar Feb 09 '16 11:02 paales

It takes around 4 hours to import 200K records without inventory (with indexing enabled), as we are using the Magestore inventory management extension for multiple warehouses and plants.

We found that if we enable the code for inventory importation, it takes around 10-11 hours to import 200K records with inventory.

bijalbhavsar avatar Mar 04 '16 10:03 bijalbhavsar