canvas: revamp
clean up canvas
- [x] remove `mozGetAsFile`
- [x] remove `getContext`: 2d supported/not-supported
- [x] remove `winding`, `fillText`, `strokeText` supported/not-supported
- [x] catch `0`, `null`, `undefined`, `false`, `[]`, and `""` (and of course errors) - record + display blocked, record specific block type in methods
- [x] spoofing: record untrustworthy + lie, display real result, record spoof type in methods
- [x] don't forget bypass possible with toDataURL vs toBlob
- [x] add `Object.keys(CanvasRenderingContext2D.prototype)`
- [x] what to do with `winding` - A: we don't need it, entropy manifests in other ways
- [x] remove keys: doesn't add to entropy: we already have isVer
- [ ] harden `RFP random` notation: see RFP characteristics (length, strings)
- [ ] add toDataURL spoof fingerprint, also to be used in `RFP random` notation hardening
- [ ] add `convertToBlob`?
- [ ] add OffscreenCanvas?
- ~~make canvas more sophisticated~~ NOT DOING, see make canvas smaller/faster
  - don't increase the size of the canvas but increase the size of objects, as this defines pixels more precisely
  - use color gradients
  - mathematical curves and shadows
  - multiple text snippets, each rotated, twisted, curved etc
  - example: see https://plaperdr.github.io/morellian-canvas/Preliminary%20analysis/webpage/canvas.html
- [x] make canvas tests smaller and faster
  - as long as there is some entropy, that's fine
  - the real test is checking the metric is protected and the fingerprinting characteristics
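The catch-falsy-returns item above could look something like this - an illustrative sketch, not TZP's actual code (`classifyResult` is a hypothetical helper name):

```javascript
// Hypothetical sketch: classify a tampered canvas method's return value,
// so the display can record the specific block type rather than just "blocked"
function classifyResult(value) {
  if (value === 0) return 'blocked: 0'
  if (value === null) return 'blocked: null'
  if (value === undefined) return 'blocked: undefined'
  if (value === false) return 'blocked: false'
  if (Array.isArray(value) && value.length === 0) return 'blocked: empty array'
  if (value === '') return 'blocked: empty string'
  return 'data' // a real result: proceed to hash it
}

console.log(classifyResult(null))             // "blocked: null"
console.log(classifyResult('data:image/png')) // "data"
```

Each block type then gets its own label under methods, instead of all falsy results collapsing into one bucket.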
original post
When I cleaned up the canvas section to color up lies, bypasses, methods, and add a toDataURL FP (in methods) etc, I noticed with cydec the other day that on the noise test the toDataURL image is broken.
So I added the full data for toBlob and toDataURL on the spoof test
- with cydec (this is the known test used in TZP), it is always `2be88c...`; on the TZP test (in both FF and Chrome, at least on Windows) it is always `984e23748f49e7aff3f1dab5....`
- cydec is returning `null`

Anyway: cydec is not "noise" being added, it's returning null
- no error caught
- 2 passes: they match, so not a lie yet
- known test does not match result, therefore it's a lie, so we assume noise added

This is what confuses me
```js
// temp debug
let testdata = getFilledContext().canvas.toDataURL()
if (testdata === "") {console.debug(runNo, "empty string")}
if (testdata === null) {console.debug(runNo, "null")}
if (testdata === undefined) {console.debug(runNo, "undefined")}
```

```js
// temp debug
let testdata = getKnown().canvas.toDataURL()
if (testdata === "") {console.debug("k, empty string")}
if (testdata === null) {console.debug("k, null")}
if (testdata === undefined) {console.debug("k, undefined")}
```
Why is the hash different?
- `2be88ca4242c76e8253ac62474851065032d6833` for known
- `984e23748f49e7aff3f1dab58ad7c26a649433cc....` for toDataURL
- they're both `null` - ???
https://github.com/arkenfox/TZP/commit/84b1448ccd62887cd5c5c0ab19cc6333c6ff543b
- will expand to cover them all when I clean up the rest
- still intrigued as to why the hashes differed
- do we need to handle `undefined` and empty strings? <- @abrahamjuliot
do we need to handle undefined and empty strings
That could be useful to highlight in the output, but the hash will be unique in any case. For example, `0`, `null`, `undefined`, `false`, `[]`, and `""` should not return the same hash.
I still don't see why the hashes differed between the toDataURL known and toDataURL tests. Anyhow, the hashes collected for known only tell me that the canvas is being tampered with, not how: and returning a consistent error even across browser sessions is not "noise added", but rather more entropy to be added under methods
`0`, `null`, `undefined`, `false`, `[]`, and `""`
oooh, a list 👍
currently

```
getContext      | 2d: supported
toDataURL       | hash or blocked (nulls, errors, timeouts etc)
toBlob          | hash/blocked
mozGetAsFile    | not supported or hash/blocked
getImageData    | hash/blocked
winding         | supported
isPointInPath   | hash/blocked
isPointInStroke | hash/blocked
fillText        | supported
strokeText      | supported
```
I'm going to strip out the mozGetAsFile (makes things simpler for bypasses etc), and I don't need the getContext 2d (which used to combine with webgl support): if it's not supported that will show in the hash results - right?
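That reasoning can be sketched: if every failure path feeds a distinct token into the hash input, an unsupported or blocked `getContext` shows up in the hash without needing its own supported/not-supported check. A minimal illustration with stand-in canvas objects (not TZP's code - `canvasHashInput` is a hypothetical helper):

```javascript
// Any failure mode yields a distinct string, so the resulting hash differs
// from a real canvas without an explicit support check
function canvasHashInput(canvas) {
  try {
    const ctx = canvas.getContext('2d') // blocked/unsupported -> null or throw
    if (!ctx) return 'no-2d-context'
    return canvas.toDataURL()
  } catch (e) {
    return 'error: ' + e.name
  }
}

// stand-in canvases simulating blocked configs
const noContext = { getContext: () => null }
const blocked = { getContext: () => { throw new TypeError('blocked') } }
console.log(canvasHashInput(noContext)) // "no-2d-context"
console.log(canvasHashInput(blocked))   // "error: TypeError"
```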
So my question would be: do I need `winding`, `fillText`, `strokeText` in my results? CB has options for the two Text functions, and an extension could block those inputs - but that still results in the canvas results showing that up
CB set to block only inputs fillText and strokeText, but you also have to tick getContext for anything to apply
If you do the same but untick getContext but tick say toDataURL, then everything is "supported" but toDataURL returns undefined
So my gut reaction is that `winding`, `fillText`, `strokeText` are not required for any entropy? @abrahamjuliot would you agree? Then I can just focus on five results, not ten .. does this sound like the right plan?
pretty sure I don't lose any possible entropy by reducing to five results
getContext 2d: if it's not supported that will show in the hash results - right?
Right, the hash result can detect support.
winding, fillText, strokeText
It appears these have long-standing support. Unless there is a preference that alters these, checking support might be unnecessary. If there is a preference that alters supported features, we could alternatively fingerprint functions and properties on `CanvasRenderingContext2D`.

`Object.keys(CanvasRenderingContext2D.prototype)`
^^ That sounds better .. have some 🍰
if canvas keys are blocked, it would still affect the FP of the image (if image didn't error)
Technically, `Object.keys` can be altered so that it blocks the argument `CanvasRenderingContext2D.prototype`, and this would not affect the image or cause errors. Instead, we can use a `for...in` loop to get the keys:

```js
const keys = []
for (const key in CanvasRenderingContext2D.prototype) { keys.push(key) }
```

If `CanvasRenderingContext2D.prototype` is blocked, that would affect the image and cause errors.
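To make the bypass concrete, here's a minimal sketch using a stand-in prototype (there's no `CanvasRenderingContext2D` outside the browser, and the spoof wrapper is hypothetical): a tampered `Object.keys` can lie about one specific object, while `for...in` still walks the enumerable properties directly.

```javascript
// Stand-in for CanvasRenderingContext2D.prototype (whose methods are enumerable)
const FakeProto = { fillText() {}, strokeText() {}, getImageData() {} }

// Hypothetical spoof: wrap Object.keys to hide just our target object
const realKeys = Object.keys
Object.keys = obj => (obj === FakeProto ? [] : realKeys(obj))
console.log(Object.keys(FakeProto).length) // 0 - the tampered result

// for...in dodges the wrapper and still sees the keys
const keys = []
for (const key in FakeProto) { keys.push(key) }
console.log(keys) // ["fillText", "strokeText", "getImageData"]

Object.keys = realKeys // restore
```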
So are we trying to bypass the keys tampering, or get its entropy .. or both? Will come back to it, after I tidy up the current five items with logic for blocks (including nulls, errors etc), bypass (toDataURL vs toBlob), lies, methods (for spoofing and "blocks") .. and coloring stuff up
Some observations (super unlikely settings, but then who knows)
CB settings: one way to trigger the effect (basically blocking some inputs)
- General: `block everything` (or `block input`)
- APIs: uncheck getContext, toDataURL, toBlob, getImageData
My real canvas values are
toDataURL 1684d35fa0fe81e6f4091bcf624742e99fe01e2e6841d7c7ec98534a47d1402a
toBlob 1684d35fa0fe81e6f4091bcf624742e99fe01e2e6841d7c7ec98534a47d1402a
getImageData 8ef46cfc0049d834564c9d21b8d6c60e3c9733458404de50524710253e75962d
isPointInPath be524a87ffe08cdaa4512fef9e3595783f568823225593b406c9b3c62f81807c
isPointInStroke 942591bcf2f6bd7c4c49a061a3f1ad5c5d7de9c6ee2a9fc86f79a1b665a0acf7
here's what happens
The known tests pass, and we do not pick up the lies for toBlob, toDataURL or getImageData. Those values are consistent on my PC. So let's look at CB tests
So the `fillText` is being blocked from being input (note my canvas test does not use `strokeText`). The known tests do not use text, and in fact do not do much at all, whereas the actual canvas test does a bunch of colors and shapes (with math: not sure if that math is FF entropy-worthy), and text, and winding. I assume the winding is not being applied either.
So is this a LIE or not? Initially I was like, WTF, why am I not picking this up. But now I realize it's not a lie. It's not being spoofed (it's consistent across reloads, sessions etc) and is actually revealing the correct values: it's just that some things are blocked (text, winding).
Now we might be able to pick up on the winding (not supported), but not sure on the fillText, strokeText (they are supported); but to be fair, the image already holds entropy. Will be interested to see what the canvas keys add when I'm ready.
@abrahamjuliot .. would you agree with my assessment?
^ of course what I can't detect is if someone blocks `fillText` but then randomizes with persistent noise (not sure I can get CB to behave like that): the problem is the known test .. I guess that's just a super unlikely edge case?
keys: added (unsorted for now) looking at FF nightly vs chrome stable
FF has (and chrome doesn't)
- createConicGradient, mozCurrentTransform, mozCurrentTransformInverse, mozImageSmoothingEnabled, mozTextStyle
- FYI: createConicGradient is not in stable
chrome has (and FF doesn't)
- direction, getContextAttributes, imageSmoothingQuality
Other than that, they are the same, but the order is different
thoughts
- diffs between FF releases don't really matter (we have get version), but some might be behind a pref
  - voila ... see `canvas.createConicGradient.enabled` .. but no-one would really flip these - so I think this really only enforces browser version per engine
- since it has almost zero perf cost, might as well keep it

I'm going to check older versions of FF to see how much volatility it had. And it'll be interesting to see if it changes with various extension spoofing configs (or RFP, Brave shields)
Also: winding was in the old test, and it does become unavailable in some configs: but I don't know how important that is to collect, or what blocks winding. A spoofed canvas won't reflect lack of winding, but a non-spoofed one would.
test away bro
edit: seems stable AF: RFP, brave shields, extensions (tried lots of configs) etc aren't affecting it
FYI: tested vanilla FFs (windows)
- FF60-69: `e61df00e8aa70a5cbb5533dee374b5aff116cfc5` [62 keys]
- FF70-89: `b7d5621a3a1a1b9e6fe366187e562426bea769f4` [63 keys] - added: getTransform
- FF90+: `d516bde278df1947c1d099d77b3d8ca8bbb4a4a3` [64 keys] - added: createConicGradient
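The per-browser diffs above fall out of a simple set difference; a toy sketch with just the engine-specific key names quoted in this thread (truncated lists, not a live capture of all 60+ keys):

```javascript
// Toy key lists: the engine-specific names mentioned above plus one shared key
const ffKeys = ['fillText', 'createConicGradient', 'mozCurrentTransform',
  'mozCurrentTransformInverse', 'mozImageSmoothingEnabled', 'mozTextStyle']
const chromeKeys = ['fillText', 'direction', 'getContextAttributes',
  'imageSmoothingQuality']

// keys in a but not in b
const diff = (a, b) => a.filter(k => !new Set(b).has(k))

console.log(diff(ffKeys, chromeKeys)) // FF-only: createConicGradient + the moz* props
console.log(diff(chromeKeys, ffKeys)) // chrome-only: direction, getContextAttributes, imageSmoothingQuality
```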
@abrahamjuliot (and @kkapsner : you still there buddy? is everything OK?)
In order to be more consistent with layout and alignment of columns, I changed the few items that were using SHA-256, to SHA-1: namely audio and canvas
like here in audio https://github.com/arkenfox/TZP/blob/09fdc4cb467cee746dc14af34cd112e4eeea9f2b/js/audio.js#L259-L265
I'm concerned about using SHA-1 with `crypto.subtle.digest` as it may get deprecated, or indeed not even work in some browsers (IDK about Safari: but Edge doesn't seem to: that would be the `edgeHTML` engine I assume, not the one based on Blink)
So I have two options AFAICT
- just use my own `sha1()` and rehash the sha-256 value
- work out how to convert the arrayBuffer into an array before hashing - I'm sure it's something simple
@abrahamjuliot don't you do something like my first option, by minifying the hash?
you still there buddy? is everything OK?
Yes - I'm still here. But the last months were tough. Was not ill but the whole situation was very taxing.
Yes - I'm still here
Hang in there buddy. I'm rooting for ya ❤️ . Meanwhile, have some 🍰 and 🍻
convert the arrayBuffer into an array

```js
Array.from(buffer)
// or
[...buffer]
```
minifying the hash
For section hashes and heavy arrays, I use sha-256 and slice the first 8 characters for the HTML output. I've seen some sites shorten sha-256 strings with an ellipsis separator. Some show the full hash in a pop-up title on mouse hover.
```js
const sha256hash = '89455ebb9765644fb98068ec68fbad7fcaaf2768b2cb6e1bd062eee5790c00e8'
const getDisplayHash = (sha256hash, limiter = 8, separator = '...') => {
  return sha256hash.slice(0, limiter) + separator + sha256hash.slice(-limiter)
}
getDisplayHash(sha256hash) // "89455ebb...790c00e8"
```
For sub item display hashes and a few light size metrics, I use a mini hash function.
```js
// https://stackoverflow.com/a/22429679
const hashMini = str => {
  const json = `${JSON.stringify(str)}`
  let i, len, hash = 0x811c9dc5
  for (i = 0, len = json.length; i < len; i++) {
    hash = Math.imul(31, hash) + json.charCodeAt(i) | 0
  }
  return ('0000000' + (hash >>> 0).toString(16)).substr(-8)
}
```
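A quick sanity check of the mini hash's properties (repeating the function so the snippet runs standalone): it's deterministic, always 8 hex chars, and - relevant to the falsy-value list earlier - `0`, `null`, `undefined`, `false`, `[]`, and `""` each stringify differently, so they hash differently.

```javascript
// https://stackoverflow.com/a/22429679 (same mini hash as above)
const hashMini = str => {
  const json = `${JSON.stringify(str)}`
  let i, len, hash = 0x811c9dc5
  for (i = 0, len = json.length; i < len; i++) {
    hash = Math.imul(31, hash) + json.charCodeAt(i) | 0
  }
  return ('0000000' + (hash >>> 0).toString(16)).substr(-8)
}

// deterministic, fixed width
console.log(hashMini('toDataURL') === hashMini('toDataURL')) // true
console.log(hashMini('toDataURL').length) // 8

// the falsy values stringify to "0", "null", "undefined", "false", "[]", '""'
const falsies = [0, null, undefined, false, [], '']
console.log(new Set(falsies.map(hashMini)).size) // 6 distinct hashes
```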
hmmm, getDisplayHash .. I used to do that on canvas
- e.g. `[1] 829a659daf5413... [2] 29fa351ec8e334a......`
- e.g. `noise detected [both] 4829fa829a659daf541351ec8...`
  - ^^ both totalling 64 chars to match a sha-256 length
After doing that I wasn't a fan of hiding the full hash, although the hash is meaningless in these cases
I guess my two issues are: SHA-1 doesn't work in all browsers and can get dropped at any time, and SHA-1 can have collisions
- not that I care about edgeHTML, but breakage is not professional
- so therefore, I must use SHA-256 and display a short form
- or use my internal sha1() function
Canvas and webgl should really use sha-256 given they have high entropy
Hmmm ... decisions .. decisions ... ¯\_(ツ)_/¯
I need to drink on it for a while :beers:
Hmmm .. `RFP random` characteristics
hah - https://bugzilla.mozilla.org/buglist.cgi?bug_id=1737038,1724331
@abrahamjuliot thanks for that mini function 👍
I've been on a little bit of a perf mission. Things like using map. I should use sets more too. Using `else if` where I can. Adding `break` in for loops. Using inline code instead of calling a global function. One thing I'm not sure on with perf is how async can speed things up.
Got any tips?
FYI
I think I need to be careful on what chars can be in the string passed to mini - because it uses charCodeAt?
I was calling `sha1()` 122 times (or 123 if not nightly for component shims) taking 95ms
- the fastest code is when no code at all runs: so I eliminated calculating a hash
  - e.g. a bypass one vs the fake one unless needed
- not hashing empty arrays or arrays with 'undefined'
  - e.g. when a textmetric wasn't supported
- I switched canvas to SHA-1 instead of rehashing it
  - the entropy is not that important, it's to detect protection, so if I get a collision so be it
  - if SHA-1 is ever dropped from crypto, I can switch it back via the var `isSHA`
- I also switched canvas known tests to mini (except nonFF toDataURL, toBlob still uses sha1 because I can't test for all mini values)
- I added the `mini()` which I use for simple compares (if needed)
- I moved some functions to post FP: i.e. items not in the fingerprint
  - all the iframe UA tests (a hash for each plus the summary)
  - worker and iframe tests are going to be treated as a different FP to the top level doc
currently down to 92 (8 mini, 84 sha1) taking ~80ms, but I can do more
- the sixteen domrects can be switched to minis, and only a single sha1 generated if no lies (or 4 in non-FF)
- the three computed styles suck a lot of ms - we can do the same as domrect: if no lies we only need a single sha1
- math will be replaced with the monsta test PoC: all set to go: two minis for 2-pass compares, and a single sha1 for display
out of interest, I ran the entire thing with mini and it topped out at 14ms :)
click me for details
HASH STATS: [92 times | 78 ms]
- 1 : mini : _global isError
- 1 : sha1 : _global isEngine
- 0 : mini : _prereq navA
- 0 : mini : _prereq navB
- 1 : sha1 : feature errors
- 1 : sha1 : feature widgets
- 0 : sha1 : feature math m1hash
- 0 : sha1 : feature math m6hash
- 0 : sha1 : feature math mchash
- 0 : sha1 : feature math m1hash
- 0 : sha1 : feature math m6hash
- 0 : sha1 : feature math mchash
- 0 : sha1 : feature section result
- 0 : sha1 : ua
- 0 : sha1 : ua section result
- 0 : sha1 : screen section result
- 1 : sha1 : devices speech engines
- 1 : sha1 : headers section result
- 0 : sha1 : storage section result
- 1 : sha1 : domrect
- 2 : sha1 : domrect
- 1 : sha1 : domrect
- 0 : sha1 : domrect
- 1 : sha1 : domrect
- 1 : sha1 : domrect
- 0 : sha1 : domrect
- 0 : sha1 : domrect
- 1 : sha1 : domrect
- 0 : sha1 : domrect
- 1 : sha1 : domrect
- 0 : sha1 : domrect
- 1 : sha1 : domrect
- 0 : sha1 : domrect
- 0 : sha1 : domrect
- 0 : sha1 : domrect
- 0 : sha1 : domrect section result
- 0 : sha1 : media canplay
- 0 : sha1 : media istype
- 1 : sha1 : media canplay
- 0 : sha1 : media istype
- 0 : sha1 : media section result
- 0 : sha1 : languages collation
- 0 : sha1 : languages timezone offsets
- 0 : sha1 : languages language & locale
- 0 : sha1 : languages timezone
- 2 : sha1 : languages date/time & format
- 0 : sha1 : languages geo
- 0 : sha1 : language section result
- 1 : sha1 : css colors
- 1 : sha1 : css colors
- 0 : sha1 : css colors
- 1 : sha1 : css colors
- 0 : sha1 : css system fonts
- 16 : sha1 : css computed style 0
- 7 : sha1 : css computed style 1
- 6 : sha1 : css computed style 2
- 1 : sha1 : css section result
- 0 : sha1 : misc component shims
- 9 : sha1 : misc iframe props
- 0 : sha1 : misc nav keys
- 0 : sha1 : misc section result
- 1 : sha1 : elements keys
- 0 : sha1 : elements mathml
- 0 : sha1 : elements lineheight
- 0 : sha1 : elements section result
- 0 : sha1 : devices media devices
- 0 : sha1 : devices section result
- 2 : sha1 : fonts textmetrics width
- 1 : sha1 : fonts textmetrics actualBoundingBoxAscent
- 0 : sha1 : fonts textmetrics actualBoundingBoxDescent
- 1 : sha1 : fonts textmetrics actualBoundingBoxLeft
- 1 : sha1 : fonts textmetrics actualBoundingBoxRight
- 2 : sha1 : fonts gylphs offset
- 2 : sha1 : fonts gylphs bounding
- 2 : sha1 : fonts gylphs client
- 1 : sha1 : fonts fontsScroll
- 1 : sha1 : fonts fontsOffset
- 0 : sha1 : fonts fontsClient
- 0 : sha1 : fonts fontsPixel
- 1 : sha1 : fonts fontsPixelSize
- 1 : sha1 : fonts fontsPerspective
- 0 : sha1 : fonts fontsTransform
- 0 : sha1 : fonts section result
- 0 : mini : canvas [k] todataurl
- 1 : mini : canvas [k] getimagedata
- 0 : mini : canvas [k] ispointinpath
- 0 : mini : canvas [k] ispointinstroke
- 0 : mini : canvas [k] toblob
- 0 : sha1 : canvas section result
- 0 : sha1 : audio get
- 1 : sha1 : audio copy
- 0 : sha1 : audio section result
async
There are performance gains if the functions contain asynchronous operations like setTimeout, promise-based APIs, fetch requests, etc. The async syntax is mostly a cleaner way of writing promises.
tips
I think you're ahead of this tip by moving non fingerprinting functions to post FP.
I run the `SubtleCrypto.digest()` hashing functions in one `Promise.all`, separate from the fingerprint functions, and then I perform all HTML template modifications in a final patch operation to reduce blocking code during other operations.
In short, there are 5 operations I run with their own performance time.
- Get iframe and prototype lies, then pass iframe contentWindow and lie results to fingerprinting functions
- Fingerprint (includes async operations like worker, webrtc, and voices)
- Continue final fingerprinting (compare navigator with worker results)
- Then, perform hashing
- Finally, patch the HTML template with the results. I mostly use the mini hash function here to compress sub results.
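The five operations above might be sketched like this (function names are illustrative, not the actual code): fingerprint first, run every digest concurrently in one `Promise.all`, then patch output in a single final pass.

```javascript
// Staged pipeline sketch: collect -> hash concurrently -> patch once
async function runPipeline(sections) {
  // stages 1-3: collect raw section data (sync stand-in for brevity)
  const results = sections.map(fn => fn())

  // stage 4: all digests resolve together in one Promise.all
  const hashes = await Promise.all(results.map(hashData))

  // stage 5: a single final patch step keeps the DOM writes together
  return results.map((data, i) => ({ data, hash: hashes[i] }))
}

async function hashData(data) {
  // stand-in for SubtleCrypto.digest-based hashing
  return 'hash:' + JSON.stringify(data).length
}

runPipeline([() => ({ canvas: 'ok' })]).then(console.log)
```

Keeping the digests out of the collection stage means no section blocks on hashing, and keeping the template writes in one batch avoids interleaving DOM work with measurement.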
Thanks
I'm down to 72 calls and ~50ms (I can still remove 5 math and 11 domrect, which might shave off 5+ ms - new math coming soon) - as long as you're not lying or blocking multiple methods (7 x fonts, 3 x css styles - in which case I have to compute each lie and maybe a bypass)
I'm toying with the idea of using minihashes for some sections, but it just looks weird - but it can make sense - I need to drink on it
by moving non fingerprinting functions to post FP
Yup, that made some perf more consistent as well - it's really just the perf.now test (11 x 13ms) and the iframe + workers
- I replaced the woff function and it's now in the FP - check it out, speedy AF - bonus, it picks up on a pref change when you rerun

If I was designing this from scratch again, it would be a bit different - removing the iframes and workers to run post doc makes sense when I expand them - i.e. treat the top level doc separately
timing
The only one that bugs me is that the device speech engines test gets held up somewhere
- RFP off - run outputDevices last and the perf is like 8ms; run outputDevices first and it ends up down the stack at 100ms (rerun is 8ms)
- RFP on - run outputDevices last (or pretty much not early) and the entire TZP perf goes out the window, double/triple
Useless app
updated: https://arkenfox.github.io/TZP/tests/canvasrfp.html - much faster and you can now set a size via console (see details at the top where it says click me)
You can also bypass the run buttons by using `run_checks(number)`. If there are no known matching patterns for a size (I've only bothered with 16x8 and 16x8), the summary will basically provide you with the guts of the RFP rules for that size (if you use a large enough number).
For example, before I added the rules for 16x8, I ran `run_checks(1000000)` - 1 million - and got back all 12 possible combos from the 4 toDataURL lengths and last 10 chars (the middle slice was stable) - ... BUT ... some of those were as little as 1 result from a million. Also, don't do that unless you have a grunty machine. On this 11-year-old machine, it took about a minute to run, but then basically made FF unusable for about 4 minutes.
So basically, all combos are possible: lengths x middlechars x last 10chars
Anyway, to cap a few things off: Canvas (TZP refactor) is now no longer checking for entropy - we already know entropy exists and we already protect against it, so it is pointless to try and exploit this in TB or RFP. TZP only checks for protection, and records the unprotected (known) hash or untrustworthy. If untrustworthy, it also records if RFP, and it also records persistent vs not persistent.
I might add some degree of randomization: such as channels and % of pixels changed: low, high, medium - but this is nasty when trying to draw lines if you want consistency
FWIW: I think subpixel collection is far more dangerous: font, glyph, mathml, lineheight and other objects - transformed etc, is about the worst we can get (maybe outside webgl)