[Feature Request] Support `isEmoji`

Open btrautmann opened this issue 1 year ago • 6 comments

Context

https://github.com/flutter/flutter/issues/142037 is the issue that spawned this one.

Request

When working with user-generated text, it can be helpful to inspect that text, filter it, etc. Text can include emojis, and this package provides ways of splitting text into grapheme clusters, each of which (as I understand it, I'm new to this) may or may not represent an emoji. It would be awesome to have an isEmoji function similar to the one provided in Swift. Is this possible? I understand that its implementation may need to be updated as new emojis are added, so it may be undesirable. But given that the linked implementation above exists, I imagine Apple has decided it's worth the extra maintenance to have the nice DX.

Example usage:

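import 'package:characters/characters.dart';

extension on String {
  /// Returns this string with every emoji grapheme cluster removed.
  String withoutEmojis() {
    final buffer = StringBuffer();
    // `characters` iterates grapheme clusters; `isEmoji` is the requested API.
    for (final cluster in characters) {
      if (!cluster.isEmoji) {
        buffer.write(cluster);
      }
    }
    return buffer.toString();
  }
}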

btrautmann avatar Jan 30 '24 15:01 btrautmann

I think you can write an isEmoji extension like this:

extension on String {
  static final _isEmojiRegex =
      RegExp(r'\p{Extended_Pictographic}', unicode: true);

  bool get isEmoji {
    return _isEmojiRegex.hasMatch(this);
  }
}

xvrh avatar Jan 30 '24 15:01 xvrh

I think you can write an isEmoji extension like this:

Unfortunately I don't think that will be exhaustive, consider the following using that implementation:

void main() {
  const family = '👨‍👩‍👦‍👦';
  print(family.isEmoji); // prints `true`

  const usa = '🇺🇸';
  print(usa.isEmoji); // prints `false`
}
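
To see why the flag slips through, it may help to look at the individual code points. The sketch below (`describeCodePoints` is a hypothetical helper, not part of any package) reports, for each code point, whether it matches `\p{Extended_Pictographic}` and `\p{Emoji_Presentation}`. The regional indicator symbols that make up a flag are not Extended_Pictographic, which would explain the `false` result above.

```dart
final extendedPictographic =
    RegExp(r'\p{Extended_Pictographic}', unicode: true);
final emojiPresentation = RegExp(r'\p{Emoji_Presentation}', unicode: true);

/// Hypothetical helper: one description line per code point of [text],
/// showing which of the two Unicode properties that code point matches.
List<String> describeCodePoints(String text) {
  return [
    for (final rune in text.runes)
      'U+${rune.toRadixString(16).toUpperCase()} '
          'ExtPict=${extendedPictographic.hasMatch(String.fromCharCode(rune))} '
          'EmojiPres=${emojiPresentation.hasMatch(String.fromCharCode(rune))}',
  ];
}

void printCodePoints(String text) {
  describeCodePoints(text).forEach(print);
}
```

Calling `printCodePoints('\u{1F1FA}\u{1F1F8}')` (the US flag) should list two code points, both with `ExtPict=false`.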

btrautmann avatar Jan 30 '24 16:01 btrautmann

Maybe with the regex like that:

  static final _isEmojiRegex = RegExp(
      r'[\p{Extended_Pictographic}\p{Emoji_Presentation}]',
      unicode: true);

xvrh avatar Jan 30 '24 16:01 xvrh

Maybe with the regex like that:

  static final _isEmojiRegex = RegExp(
      r'[\p{Extended_Pictographic}\p{Emoji_Presentation}]',
      unicode: true);

This seemingly works at least for my use case. I do think it'd be sweet to have this as an extension managed by the framework rather than something every user that needs the functionality maintains, so I won't immediately close this unless a maintainer disagrees :)
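
For anyone who wants to reuse this in the meantime, here is a minimal sketch of the combined check as a named extension. The name `EmojiCheck` is made up for illustration; a named extension can be imported from other libraries, whereas an unnamed one is only usable in the library that declares it. Two caveats, as far as I can tell: `hasMatch` returns true if any code point in the string matches, so the getter is meant to be called on a single grapheme cluster, and keycap sequences such as digit + U+FE0F + U+20E3 do not match because none of their code points carries either property.

```dart
/// Hypothetical name; based on the regex suggested in this thread.
extension EmojiCheck on String {
  static final _emojiPattern = RegExp(
    r'[\p{Extended_Pictographic}\p{Emoji_Presentation}]',
    unicode: true,
  );

  /// True if any code point in this string has one of the two properties.
  /// Intended to be called on one grapheme cluster at a time.
  bool get isEmoji => _emojiPattern.hasMatch(this);
}
```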

PS thank you @xvrh for your help 😄

btrautmann avatar Feb 02 '24 18:02 btrautmann

Do you guys have a link to where this regex pattern is documented? That is, the one using Extended_Pictographic and Emoji_Presentation. I couldn't find it mentioned in the RegExp docs, and I'm surprised you can query those. I do however think a lower-level construct would be better performance-wise than a regex. The Characters package already has the emoji-data table available, but I'm not sure if this is out of scope, as the package is mainly for iterating grapheme clusters.
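
To illustrate what a lower-level, table-based check could look like, here is a rough sketch that binary-searches a sorted list of code point ranges. The table is a tiny illustrative excerpt (two ranges), not the full emoji-data table, and all names (`sampleEmojiRanges`, `inRanges`, `containsSampleEmoji`) are hypothetical. It makes no claim about how the Characters package stores its data internally.

```dart
/// Illustrative excerpt only: sorted, non-overlapping [start, end] ranges.
/// A real table would be generated from the Unicode emoji data files.
const List<List<int>> sampleEmojiRanges = [
  [0x1F1E6, 0x1F1FF], // regional indicator symbols (used for flags)
  [0x1F600, 0x1F64F], // emoticons block
];

/// Binary search: true if [codePoint] falls inside one of [ranges].
bool inRanges(int codePoint, List<List<int>> ranges) {
  var low = 0;
  var high = ranges.length - 1;
  while (low <= high) {
    final mid = (low + high) >> 1;
    final range = ranges[mid];
    if (codePoint < range[0]) {
      high = mid - 1;
    } else if (codePoint > range[1]) {
      low = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}

/// True if any code point of [text] is covered by the sample table.
bool containsSampleEmoji(String text) =>
    text.runes.any((rune) => inRanges(rune, sampleEmojiRanges));
```

Whether this actually beats the regex would need measuring; the sketch only shows the shape of the lookup.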

erf avatar Feb 12 '24 04:02 erf

Not sure if this helps, but they're documented (IMO) pretty well here:

  • https://unicode.org/reports/tr51/#Emoji_Characters
  • https://unicode.org/reports/tr51/#Emoji_Properties_and_Data_Files

btrautmann avatar Feb 12 '24 17:02 btrautmann