String normalisation

Open neo-technology-build-agent opened this issue 3 years ago • 4 comments

Issue by legraphista Sunday Oct 08, 2017 at 20:15 GMT
Originally opened as https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/628


Any plans to support string normalisation?

NFC — Normalization Form Canonical Composition.
NFD — Normalization Form Canonical Decomposition.
NFKC — Normalization Form Compatibility Composition.
NFKD — Normalization Form Compatibility Decomposition.
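
For readers unfamiliar with the four forms, here is a minimal sketch of how they differ. It uses Python's standard `unicodedata` module purely for illustration and is not APOC or Cypher code; the helper name `normalize_all` and the sample string are made up for this example.

```python
import unicodedata


def normalize_all(text: str) -> dict:
    """Return the text in each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


# Sample input: precomposed 'é' (U+00E9) followed by the 'fi' ligature (U+FB01).
sample = "\u00e9\ufb01"
forms = normalize_all(sample)

# Print the code points so the differences are visible even when glyphs look alike.
for form, value in forms.items():
    print(form, [f"U+{ord(ch):04X}" for ch in value])
```

The canonical forms (NFC, NFD) only compose or decompose the accented letter and leave the ligature alone, while the compatibility forms (NFKC, NFKD) also replace the ligature with plain `f` and `i`.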

Thank you for the great work that has been poured into this project!

Comment by legraphista Sunday Oct 08, 2017 at 20:23 GMT


Hmm 🤔 I found the code here but when calling apoc.text.clean("test") it returns an unregistered procedure error.

When calling dbms.procedures(), the only text-related APOC procedures I get are apoc.text.phonetic and apoc.text.phoneticDelta.

I'm using the following: Neo4j 3.1.7, APOC 3.1.3.8-all

Comment by legraphista Sunday Oct 08, 2017 at 20:40 GMT


Closed in favour of #629

Comment by jexp Monday Oct 09, 2017 at 19:27 GMT


@legraphista is there a Java library that does these?

That one is a user-defined function now. You can use RETURN apoc.text.clean("test")

Comment by legraphista Tuesday Oct 10, 2017 at 09:14 GMT


Yep @jexp, I had to re-read the documentation to notice it changed from procedure to function (my bad 😞).

It does NFD as expected, but after cleaning it up it also strips all non-alphanumeric characters from the text at this line.

I think there should be an option to toggle the stripping of non-alphanumeric characters.

For example if you want to clean a sentence, you need to split it, clean it word by word then join it back together.
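
To make the request concrete, here is a rough sketch of the behaviour described above, with the stripping step behind a toggle. This is not APOC's implementation: the function names `clean` and `clean_sentence` and the `strip_non_alnum` flag are hypothetical, and the steps are inferred only from the description in this thread (NFD, then removal of non-alphanumeric characters).

```python
import unicodedata


def clean(text: str, strip_non_alnum: bool = True) -> str:
    """Decompose to NFD, drop combining marks, optionally drop non-alphanumerics.

    Hypothetical stand-in for the behaviour discussed here, not APOC's code.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks (accents) have a non-zero canonical combining class.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    if strip_non_alnum:
        return "".join(ch for ch in without_marks if ch.isalnum())
    return without_marks


def clean_sentence(sentence: str) -> str:
    """The workaround described above: split, clean each word, join again."""
    return " ".join(clean(word) for word in sentence.split())


print(clean_sentence("Crème brûlée, s'il vous plaît"))
print(clean("Crème brûlée, s'il vous plaît", strip_non_alnum=False))
```

With a toggle like this, a whole sentence could be normalised in one call while keeping its spaces and punctuation, instead of going through the split-and-join workaround.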

String normalization has been supported in regular Cypher for a while now:

  • https://neo4j.com/docs/cypher-manual/current/functions/string/#functions-normalize
  • https://medium.com/neo4j/cypher-gems-in-neo4j-5-fa270643f9b0#:~:text=Cypher%20Unicode%20Normalization

hvub, Dec 18 '24 12:12