rrweb
rrweb copied to clipboard
[Bug]: inlineImages not working
Preflight Checklist
- [X] I have searched the issue tracker for a bug report that matches the one I want to file, without success.
What package is this bug report for?
rrweb
Version
2.0.0-alpha.6
Expected Behavior
I expect the record function to emit events with the images as base64 rr_dataURL attributes.
Actual Behavior
Events are emitted with the normal img src attribute unchanged and no rr_dataURL attribute. No warnings or errors in the console.
Steps to Reproduce
Using Chrome Version 112.0.5615.137 (Official Build) (arm64) Begin recording with the setting inlineImages: true, and checkoutEveryNms of some value (say 10000). Look at the events being emitted, you will notice that they do not have the rr_dataURL attribute
Testcase Gist URL
No response
Additional Information
You will notice in the Chrome network tab that every 10 seconds, the browser refetches the images on the page. See the following screenshot that traces the image fetch to rrweb-snapshot.
That line is the following:
Setting the crossOrigin attribute on the img element is causing it to re-load the image. Which then leads to a bug. The condition on line 676 if (image_1.complete && image_1.naturalWidth !== 0)
is always false, because setting crossOrigin causes the image to be reloaded.
Furthermore, I don't understand the point of setting the onload event handler. The function serializeElementNode is not asynchronous, by the time the handler is called, the function has already returned the attributes object.
Building with a locally modified version of rrweb I have tried deleting the following lines:
-
image_1.crossOrigin = 'anonymous';
-
oldValue_1 ? (attributes.crossOrigin = oldValue_1) : image_1.removeAttribute('crossorigin');
-
else image_1.onload = recordInlineImage;
And then inline images works as expected, generating outputs with the rr_dataURL
attribute as expected
This also stops the periodic refetching of every image on the page.
@econte-sprig Thanks for your PR and detailed explanation, but your change is not working with the following case, and it explains why we need to set the onload
event handler.
<html>
<body>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/rrweb.min.js"></script>
<script>
rrwebRecord({
inlineImages: true,
emit(event) {
//
},
});
setTimeout(() => {
const img = document.createElement('img');
img.setAttribute('crossorigin', 'anonymous');
img.setAttribute(
'src',
'xxx',
);
document.body.append(img);
}, 1000);
</script>
</body>
</html>
Additionally crossorigin
is necessary for recording inline images, as the image base64 string is transformed by canvas.toDataURL, which will throw an error when the domain is not clean.
If the image is without crossorigin
attribute initially, after adding the crossorigin
anonymous attribute to the image, the browser will force the use of CORS to obtain resources, and you will see a new network request.
If you want to read rr_dataURL attributes
from record emit:
- If your image is loaded before the record start: you can read it from the full snapshot(
with type: 2
) - If your image is loaded after the record start: you can read it from the increment snapshot(
with type : 3 and source: 0
)
@econte-sprig Thanks for your PR and detailed explanation, but your change is not working with the following case, and it explains why we need to set the
onload
event handler.<html> <body> <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/rrweb.min.js"></script> <script> rrwebRecord({ inlineImages: true, emit(event) { // }, }); setTimeout(() => { const img = document.createElement('img'); img.setAttribute('crossorigin', 'anonymous'); img.setAttribute( 'src', 'xxx', ); document.body.append(img); }, 1000); </script> </body> </html>
Additionally
crossorigin
is necessary for recording inline images, as the image base64 string is transformed by canvas.toDataURL, which will throw an error when the domain is not clean.If the image is without
crossorigin
attribute initially, after adding thecrossorigin
anonymous attribute to the image, the browser will force the use of CORS to obtain resources, and you will see a new network request.If you want to read
rr_dataURL attributes
from record emit:
- If your image is loaded before the record start: you can read it from the full snapshot(
with type: 2
)- If your image is loaded after the record start: you can read it from the increment snapshot(
with type : 3 and source: 0
)
But the code just fundamentally isn't working. Setting 'crossorigin' on Chrome causes it to always reload and never set rr_dataURL. Are you not seeing this behavior?
@econte-sprig Thanks for your PR and detailed explanation, but your change is not working with the following case, and it explains why we need to set the
onload
event handler.<html> <body> <script src="https://cdn.jsdelivr.net/npm/[email protected]/dist/rrweb.min.js"></script> <script> rrwebRecord({ inlineImages: true, emit(event) { // }, }); setTimeout(() => { const img = document.createElement('img'); img.setAttribute('crossorigin', 'anonymous'); img.setAttribute( 'src', 'xxx', ); document.body.append(img); }, 1000); </script> </body> </html>
Additionally
crossorigin
is necessary for recording inline images, as the image base64 string is transformed by canvas.toDataURL, which will throw an error when the domain is not clean. If the image is withoutcrossorigin
attribute initially, after adding thecrossorigin
anonymous attribute to the image, the browser will force the use of CORS to obtain resources, and you will see a new network request. If you want to readrr_dataURL attributes
from record emit:
- If your image is loaded before the record start: you can read it from the full snapshot(
with type: 2
)- If your image is loaded after the record start: you can read it from the increment snapshot(
with type : 3 and source: 0
)But the code just fundamentally isn't working. Setting 'crossorigin' on Chrome causes it to always reload and never set rr_dataURL. Are you not seeing this behavior?
can you provide me with a demo to show your case?
Sure: https://github.com/econte-sprig/rrweb-demo1
- Clone the repo
- Run
npm i
- Run
npm run start
- Press the 'Record' button to start recording with rrweb
- You will see in the network tab that every 10 seconds the images on the page are reloaded
- When you press 'Stop' you will see the JSON stringified event array logged to the console, as well as a line 'Includes Data URL attribute true/false'. That line is determined by just checking if the JSON string includes the substring 'rr_dataURL'.
At least on my Chrome 112.0.5615.137 (Official Build) (arm64) you will see that it logs false, the dataURL attribute is not present in the JSON despite setting it to inline images.
@econte-sprig Hi, thanks for your demo. It should work with the following change. And rr_dataURL
attribute is in the full snapshot with a lazy way attached to the attribute object.
const record = () => {
stopRecord.current = rrweb.record({
checkoutEveryNms: 10000,
inlineImages: true,
emit: (ev) => {
- events.current.push(JSON.stringify(ev));
+ events.current.push(ev);
},
})
};
The reload of the image is caused by the cross-origin attribute add and remove, and this attribute is necessary when transforming the image with Canvas. toDataURL
will throw an error when the canvas's bitmap is not origin clean
@econte-sprig Hi, thanks for your demo. It should work with the following change. And
rr_dataURL
attribute is in the full snapshot with a lazy way attached to the attribute object.const record = () => { stopRecord.current = rrweb.record({ checkoutEveryNms: 10000, inlineImages: true, emit: (ev) => { - events.current.push(JSON.stringify(ev)); + events.current.push(ev); }, }) };
This isn't a viable fix. RRweb is changing the attributes on the event after it is emitted. Rrweb's own pack function stringifies the events on emit and so this breaks with rrweb's provided pack function. And what if we have already sent the event to the backend by the time the event is changed? Clearly rrweb should not be changing events after emitting them.
The reload of the image is caused by the cross-origin attribute add and remove, and this attribute is necessary when transforming the image with Canvas.
toDataURL
will throw an error when the canvas's bitmap is not origin clean
The real fix is to emit a separate attribute mutation event when the reload finishes, then store that so it doesn't reload every X seconds.The event should not be changing after emitting, so when the reload finishes after setting crossOrigin='anonymous' a new event needs to be emitted rather than change an event that was previously emitted.
Otherwise, how do I know when an event is finished changing and it is safe to do something with the event (such as store it or send it to the backend)? The event gives no indication that changes are coming, so how long do I have to wait before sending the event to a backend? Am I supposed to just hold every event in memory for X seconds before sending it and just cross my fingers and hope it is done changing despite having no way to know?
This is all not to mention the complete incompatibility with the pack function.
@econte-sprig Is the following change working for you?
- <img src={logo} className="App-logo" alt="logo" />
+ <img src={logo} className="App-logo" alt="logo" crossOrigin='anonymous' />
Or change the source code of the snapshot with the following change to force a new attribute mutation?
But I don't think it is a good choice to add too many strings into the <img>
tag on the user side.
try {
canvasService!.width = image.naturalWidth;
canvasService!.height = image.naturalHeight;
canvasCtx!.drawImage(image, 0, 0);
- attributes.rr_dataURL = canvasService!.toDataURL(
- dataURLOptions.type,
- dataURLOptions.quality,
- );
+ image.setAttribute('rr_data_url', canvasService!.toDataURL(
+ dataURLOptions.type,
+ dataURLOptions.quality,
+ ))
} catch (err) {
console.warn(
`Cannot inline img src=${image.currentSrc}! Error: ${err as string}`,
);
}
This isn't a viable fix. RRweb is changing the attributes on the event after it is emitted. Rrweb's own pack function stringifies the events on emit and so this breaks with rrweb's provided pack function. And what if we have already sent the event to the backend by the time the event is changed? Clearly rrweb should not be changing events after emitting them.
You are absolutely right, this PR should help with this: https://github.com/rrweb-io/rrweb/pull/1239
Just putting together a fix PR for this issue with inline images. You can check it out here.
Let me know if you need anything, such as a more detailed repro or something, but the PR is not too involved. I can explain any of the changes if you have any questions