PagesExportImport issue with some fieldtypes

Open adrianbj opened this issue 11 months ago • 14 comments

Short description of the issue

Hi @ryancramerdesign - just wanting to make sure this isn't lost forever (I don't think there is any other issue referencing it).

Remember that I am also hacking that chunk of code for the VerifiedURL field.

https://github.com/processwire/processwire/blob/3cc76cc886a49313b4bfb9a1a904bd88d11b7cb7/wire/core/PagesExportImport.php#L973

Expected behavior

All fields should be imported.

Actual behavior

FieldtypeTextLanguage, FieldtypePageTitleLanguage, and FieldtypeVerifiedURL don't import without hacking the core code to use setAndSave().

adrianbj avatar Dec 17 '24 15:12 adrianbj

@adrianbj Are you still experiencing the issue? I just tried to duplicate here, using FieldtypePageTitleLanguage and FieldtypeTextLanguage, and it worked correctly both when updating an existing page and when creating a new page. So I'm wondering if this one might have already been fixed?

ryancramerdesign avatar Dec 20 '24 14:12 ryancramerdesign

Yep, still an issue. I just double checked without my hack in place. Please remember to read through this thread: https://processwire.com/talk/topic/28382-not-working-with-pagesexportimport/#comment-235933 which explains that the issue only shows when called via the API. It works from the admin interface.

adrianbj avatar Dec 20 '24 14:12 adrianbj

@adrianbj Just tried from the API with the same two multi-language fields, and it also worked. Maybe we're back to it just affecting VerifiedURL, so I'll have to try that. But here's the API syntax I used. First I exported a page to JSON, then edited the multi-language field values in the JSON. After running the script below, the fields were properly updated in all languages:

<?php namespace ProcessWire;
include('./index.php'); // boot PW
$json = file_get_contents(__DIR__ . '/page.json');
$pei = new PagesExportImport(); 
$pei->importJSON($json);

ryancramerdesign avatar Dec 20 '24 17:12 ryancramerdesign

Thanks for diving in.

For my purposes I need to call arrayToPage() directly:

$import = new PagesExportImport();

$options = [
    'update' => true,
    'create' => true,
    'saveOptions' => [
        'adjustName' => true,
        'quiet' => false
    ],
    'fieldNames' => $templateAllowedFields,
    'commit' => true,
    'changeTemplate' => false,
    'changeParent' => false,
    'originalHost' => isset($incomingPage['host']) ? $incomingPage['host'] : $this->wire('config')->httpHost,
    'originalRootUrl' => isset($incomingPage['url']) ? $incomingPage['url'] : $this->wire('config')->urls->root,
];

$import->arrayToPage($incomingPage, $options);

I use this for synching content between two sites and it works brilliantly for all sorts of complex fields (Table, RepeaterMatrix, etc) - VerifiedURL is the only one that fails for me without that hack.

adrianbj avatar Dec 20 '24 17:12 adrianbj

@adrianbj I just set up another test using a VerifiedURL field (name: test_verified_url) and the same $options (except the 2 URL/host ones at the bottom) and function call (arrayToPage) as you. I don't know what $incomingPage is, but I assume it's from something exported by PagesExportImport.

<?php namespace ProcessWire;
include('./index.php'); // boot PW

$options = [
    'update' => true,
    'create' => true,
    'saveOptions' => [
        'adjustName' => true,
        'quiet' => false
    ],
    'fieldNames' => [ 'title', 'test_verified_url' ],
    'commit' => true,
    'changeTemplate' => false,
    'changeParent' => false,
];

$json = file_get_contents(__DIR__ . '/page.json');
$data = json_decode($json, true);
$data = $data['pages'][0]; // narrow to just data for the 1 page
$importer = new PagesExportImport();
$page = $importer->arrayToPage($data, $options);
echo '<pre>' . print_r($page, true);

I exported a page with test_verified_url having value https://processwire.com. Then I edited the page.json file to change the URL to https://www.tripsite.com, and then ran the import. The URL was successfully updated to www.tripsite.com on the page.

I must be missing some part. What's the context where your script is running? Is it a PHP script that's booting PW, or maybe a template file, or something else?

Here's my page.json file:

{
    "type": "ProcessWire:PageArray",
    "created": "2024-12-20 13:51:33",
    "version": "3.0.242",
    "user": "ryan",
    "host": "localhost:8888",
    "pages": [
        {
            "type": "ProcessWire:Page",
            "path": "/about/our-vision/",
            "class": "ProcessWire\\Page",
            "template": "default",
            "settings": {
                "id": 3257,
                "name": "our-vision",
                "status": 1,
                "sort": 6,
                "sortfield": "sort",
                "created": "2023-08-08 11:39:00",
                "modified": "2024-12-20 13:51:02",
                "name_br": "nossa-visao",
                "status_br": 1,
                "name_es": null,
                "status_es": 1
            },
            "data": {
                "title": {
                    "default": "What Makes Us Different EN updated",
                    "br": "O que nos torna diferentes BR updated",
                    "es": "Lo que nos hace diferentes ES updated"
                },
                "test_verified_url": {
                    "status": 200,
                    "checked": 1734720662,
                    "content": null,
                    "data": "https://www.tripsite.com"
                }
            }
        }
    ],
    "fields": {
        "title": {
            "type": "FieldtypePageTitleLanguage",
            "label": "Title",
            "version": "1.0.0",
            "id": 1,
            "blankValue": "",
            "importable": true,
            "test": false,
            "returnsPageValue": true,
            "requiresExportValue": false,
            "restoreOnException": false
        },
        "test_verified_url": {
            "type": "FieldtypeVerifiedURL",
            "label": "Test verified URL",
            "version": "0.0.6",
            "id": 330,
            "blankValue": "class:VerifiedURL",
            "importable": true,
            "test": false,
            "returnsPageValue": true,
            "requiresExportValue": false,
            "restoreOnException": false
        }
    },
    "urls": {
        "root": "/pwsite/",
        "assets": "/pwsite/site/assets/"
    },
    "timer": "0.1872"
}

ryancramerdesign avatar Dec 20 '24 19:12 ryancramerdesign

Thanks @ryancramerdesign - sorry this is taking up so much of your time.

I am building up $incomingPage from the $json that is sent to the destination server/site like this:

$decodedJson = json_decode($json, true);

$pageData = $decodedJson['json'];
$pageArray = json_decode($pageData, true);
$incomingPage = $pageArray['pages']['0'];
$incomingPageData = $incomingPage['data'];

if($incomingPageData['guid'] != '') {
    $p = $this->wire()->pages->get('guid=' . $incomingPageData['guid']);
    if($p->id) {
        $incomingPage['_importToID'] = $p->id;
        $incomingPage['path'] = $p->path;
        $incomingPage['host'] = $decodedJson['host'];
    }
}

That guid field is set on the source server so that pages can be matched on the destination server. Pages also get created directly on the destination server, so the core id field often won't match.

I have a cURL request that is called inside a Pages::saved hook on the source server. It calls a URL on the destination server that is registered with $this->wire()->addHook('/sync/{operation}/{template}', ...). Everything is secured with a hash of the JSON, timestamp, etc., and a shared secret.
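The exact signing scheme is not shown in this thread. As a rough sketch of one way such a request could be secured, assuming an HMAC-SHA256 over the timestamp and JSON body with a shared secret (the helper names signPayload() and verifyPayload(), the message layout, and the 300 second window are all hypothetical, not taken from the actual sync script):

```php
<?php
// Hypothetical helpers: not part of ProcessWire or of the sync script in this thread.

/** Sign a JSON payload together with a timestamp, using a shared secret. */
function signPayload(string $json, int $timestamp, string $secret): string {
    return hash_hmac('sha256', $timestamp . '.' . $json, $secret);
}

/** Verify the signature and reject requests outside the allowed time window. */
function verifyPayload(string $json, int $timestamp, string $signature, string $secret, int $now, int $maxAge = 300): bool {
    if (abs($now - $timestamp) > $maxAge) {
        return false; // stale or future-dated request
    }
    $expected = signPayload($json, $timestamp, $secret);
    return hash_equals($expected, $signature); // constant-time comparison
}

// Source side: compute the signature that is sent along with the request.
$secret = 'example-shared-secret'; // assumption: configured on both servers
$json = '{"pages":[]}';
$timestamp = 1734724715;
$signature = signPayload($json, $timestamp, $secret);

// Destination side: verify before importing anything.
var_dump(verifyPayload($json, $timestamp, $signature, $secret, $timestamp + 10));
```

Including the timestamp in the signed message means an old request cannot be replayed with a fresh timestamp, since the signature would no longer match.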

Here is a simplified (removed lots of fields) version of the json I am processing:

   {
      "type":"ProcessWire:PageArray",
      "created":"2024-12-20 11:58:36",
      "version":"3.0.242",
      "user":"me",
      "host":"mysite.com",
      "pages":[
         {
            "type":"ProcessWire:Page",
            "path":"/l/test/",
            "class":"ProcessWirePage",
            "template":"message",
            "settings":{
               "id":32970,
               "name":"test",
               "status":1,
               "sort":1279,
               "sortfield":"sort",
               "created":"2024-12-09 14:34:14",
               "modified":"2024-12-13 13:08:06",
               "created_users_id":6026,
               "modified_users_id":41
            },
            "data":{
               "title":"Test",
               "link":{
                  "status":200,
                  "checked":1734724715,
                  "content":{
                     "title":"Google"
                  },
                  "data":"https://google.com"
               },
               "message_priority":[
                  4
               ],
               "guid":"us1-32970"
            }
         }
      ],
      "fields":{
         "title":{
            "type":"FieldtypePageTitle",
            "label":"Title",
            "version":"1.0.0",
            "id":1,
            "blankValue":"",
            "importable":true,
            "test":false,
            "returnsPageValue":true,
            "requiresExportValue":false,
            "restoreOnException":false
         },
         "link":{
            "type":"FieldtypeVerifiedURL",
            "label":"Website URL",
            "version":"0.0.6",
            "id":103,
            "blankValue":"class:VerifiedURL",
            "importable":true,
            "test":false,
            "returnsPageValue":true,
            "requiresExportValue":false,
            "restoreOnException":false
         },
         "message_priority":{
            "type":"FieldtypeOptions",
            "label":"Message Priority",
            "version":"0.0.2",
            "id":135,
            "blankValue":"class:SelectableOptionArray",
            "importable":true,
            "test":false,
            "returnsPageValue":true,
            "requiresExportValue":false,
            "restoreOnException":false
         },
         "gc_id":{
            "type":"FieldtypeText",
            "label":"GUID",
            "version":"1.0.2",
            "id":438,
            "blankValue":"",
            "importable":true,
            "test":false,
            "returnsPageValue":true,
            "requiresExportValue":false,
            "restoreOnException":false
         }
      },
      "urls":{
         "root":"/",
         "assets":"/site/assets/"
      },
      "timer":"0.1660"
   }

adrianbj avatar Dec 20 '24 21:12 adrianbj

@ryancramerdesign - this is really baffling. I have a RepeaterMatrix (RM) field on the template I am updating but if I exclude that from the import using the fieldNames option then the VerifiedURL is updated as expected.

In an attempt to debug, I inserted a bd($page); here: https://github.com/processwire/processwire/blob/fa47338eede1c7e822ef0e938ef8214aca4f5f3f/wire/core/PagesEditor.php#L467

and it gets called twice during an import. The second time it only includes the title and categories (a Page Reference field) fields. The first time it includes the values of all fields, and these are the updated values being imported, except for the VerifiedURL field, which still has the old value. It only contains the new/updated value if I either exclude the RM field or do the setAndSave() as described.

The RM field is quite complex and contains a lot of fields, but of note is that it doesn't contain the VerifiedURL field in question. Not sure that this matters though.

So, the question is, why is a RM field import breaking the update of the VerifiedURL field? Do you think there is something that might have changed for the RM or base Repeater field back in late 2022 or early 2023 that could be causing this?

adrianbj avatar Dec 27 '24 23:12 adrianbj

This probably makes no sense, but if there is already a block within the RM field and you remove it while also changing the VerifiedURL field value, then the latter is actually updated. Note that it makes no difference whether the RM is populated or not (the update fails in both of those states); it only works if the RM was populated and you are deleting all blocks.

adrianbj avatar Dec 28 '24 01:12 adrianbj

Thanks for the help debugging this. This is good new information. Just for an additional data point I wanted to double check that everything still works interactively, and that it's just the API import where the repeater seems to introduce the issue?

ryancramerdesign avatar Dec 28 '24 13:12 ryancramerdesign

@ryancramerdesign - yes, it's only via the API, and even then it doesn't seem to be easily reproducible in other environments, but it happens every time with one of my destination servers. I almost feel like it might be related to the MariaDB/MySQL version, but I can't imagine why.

I have decided to give up for now but have moved my hacky fix from PagesExportImport.php into my synching script.

My new site hack is this - note that the field name is link

$import->arrayToPage($incomingPage, $options);

if(isset($incomingPageData['link']['data'])) {
       $destinationPage->setAndSave('link', $incomingPageData['link']['data']);
}

Of some potential interest is that I couldn't do $destinationPage->setAndSave('link', $incomingPageData['link']); - I could only save the URL itself (from the data key), rather than the full array of info containing the status, checked, and content keys. But when the import doesn't include an RM field, the full array works fine 😖

If I ever have the energy to look into it again and find something I'll let you know.

adrianbj avatar Dec 30 '24 20:12 adrianbj

@ryancramerdesign - I wonder, could the importing of RM fields do something to output formatting?

In my import code I have a $pages->of(false); - maybe the RM import process is turning it back on and leaving it on which is why the import of the full link object/array doesn't work?

adrianbj avatar Dec 30 '24 21:12 adrianbj

@adrianbj I suppose it's possible. In addition to your $page->of(false); try placing a $pages->of(false); call before all of your code. That disables output formatting for any pages loaded from that point forward. It would be interesting to see if that changes anything.
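One way to test the output formatting theory would be to log the page's state right after the import. This is only a diagnostic sketch: it assumes it runs inside the existing import code, where $incomingPage and $options are already defined as in the arrayToPage() call earlier in this thread, and the log name sync-debug is a hypothetical choice.

```php
<?php namespace ProcessWire;
// Diagnostic sketch, meant to be placed in the existing import code.
// Assumes $incomingPage and $options are already defined.

wire('pages')->of(false); // output formatting off for pages loaded from here on

$import = new PagesExportImport();
$page = $import->arrayToPage($incomingPage, $options);

// Page::of() with no argument returns the current output formatting state
$state = $page->of() ? 'on' : 'off';
wire('log')->save('sync-debug', "Output formatting after import: $state (page {$page->id})");
```

If the log shows "on" only when the RM field is part of the import, that would point at the repeater import switching output formatting back on.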

ryancramerdesign avatar Dec 31 '24 19:12 ryancramerdesign

@ryancramerdesign - As I noted above, I am already using $pages->of(false);

It is in my import code, before I run arrayToPage() on the json.

It's not in my export code, but I am not sure it would help there.

I suppose an alternative to my new hack would be to do a replace on the raw json to change the VerifiedURL data to just the URL.
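As a rough sketch of that raw-JSON alternative, the decoded export could be walked and every FieldtypeVerifiedURL value reduced to its data key before it is passed on to the importer. The helper name flattenVerifiedUrls() is hypothetical, and whether this changes the import behavior described in this issue is not confirmed here.

```php
<?php
/**
 * Hypothetical helper: reduce every FieldtypeVerifiedURL value in a decoded
 * export array to its plain URL string (the "data" key). Other fields are untouched.
 */
function flattenVerifiedUrls(array $export): array {
    // Collect the names of fields whose exported type is FieldtypeVerifiedURL
    $names = [];
    foreach ($export['fields'] ?? [] as $name => $info) {
        if (($info['type'] ?? '') === 'FieldtypeVerifiedURL') {
            $names[] = $name;
        }
    }
    // Replace the array value with just the URL on every exported page
    foreach ($export['pages'] ?? [] as $i => $page) {
        foreach ($names as $name) {
            $value = $page['data'][$name] ?? null;
            if (is_array($value) && isset($value['data'])) {
                $export['pages'][$i]['data'][$name] = $value['data'];
            }
        }
    }
    return $export;
}

// Trimmed-down version of the JSON posted earlier in this thread
$json = '{"pages":[{"data":{"title":"Test","link":{"status":200,"checked":1734724715,'
      . '"content":{"title":"Google"},"data":"https://google.com"}}}],'
      . '"fields":{"title":{"type":"FieldtypePageTitle"},"link":{"type":"FieldtypeVerifiedURL"}}}';

$export = flattenVerifiedUrls(json_decode($json, true));
echo json_encode($export['pages'][0]['data'], JSON_UNESCAPED_SLASHES), "\n";
```

Working on the decoded array rather than doing a string replace on the raw JSON avoids accidentally matching a data key that belongs to some other field.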

Or perhaps a better solution might be an ___exportValue() method for VerifiedURL that forces it to only export the URL as a string. Maybe I'll try that soon, but then if there is still an issue with those Language fields, then this wouldn't work because it would need the multiple values.

adrianbj avatar Jan 01 '25 03:01 adrianbj

Ok, so I tried adding this to FieldtypeVerifiedURL.module

	public function ___exportValue(Page $page, Field $field, $value, array $options = array()) {
		if(!$value) return '';
		return $value->href;
	}

but even with just the URL being present in the JSON, it makes no difference - no change is imported. So for now I am back to the hack in this post above: https://github.com/processwire/processwire-issues/issues/2009#issuecomment-2565903593

adrianbj avatar Jan 01 '25 16:01 adrianbj