New Jelly corpus

Open lynn opened this issue 5 years ago • 0 comments

I queried SE for new Jelly code since some suggestions in #69 got implemented (May 2018).

I analyzed the resulting CSV and got a new corpus, with these as the most common {2,3,4}-graphs:

2-graphs    3-graphs    4-graphs
 48 ị“       14 e€Ø       6 Ø.,U
 47 $€       11 ⁼¥Ƈ       6 ƝẠ$Ƈ
 47 iⱮ       10 UÐe       5 ŒPŒ!
 46 Œ!        9 Ạ$Ƈ       5 PŒ!€
 38 ŒP        8 QƑƇ       5 Œ!€Ẏ
 37 €Ẏ        8 €Œp       5 S⁼¥Ƈ
 35 ṣ”        7 Ø.,       5 ;⁾v`
 32 Œp        7 ;N$       4 fØDV
 32 œị        7 ƝẠ$       4 ŒDṙL
 30 $Ƈ        7 fØD       4 e€Øẹ
 29 s2        7 ẠƲƇ       4 QƑƇṪ
 29 ƑƇ        7 ƑƇṪ       4 “Ṿ;⁾
 29 ØD        7 œị⁸       4 Ṿ;⁾v
 28 Ø.        7 +2/       3 ¶TT¶
 28 ;€        7 ’Œ?       3 TT¶“
 28 ịØ        6 ʋ1#       3 S⁼ɗƇ
 26 þ`        6 ⁼ɗƇ       3 .,U$
 26 Ðḟ        6 BḊ€       3 ,U$;
 25 Æm        6 .,U       3 U$;N
 25 e€        6 ŒPŒ       3 $;N$
 25 /€        6 Ƒ$Ƈ       3 ;N$¤
 25 œc        6 œc3       3 >“</
 24 ØA        6 ị“¡       3 ØDV€
 24 ¥Ƈ        6 ”v`       3 ;@W}
 23 1#        6 ṢƑƇ       3 QƑ$Ƈ
 23 “¡        6 ‘ịØ       3 ⁵ịØJ
 23 U$        6 Z$⁺       3 ṣ”:V
 23 2¦        6 “Ṿ;       3 ỊẠƲƇ
 23 €€        6 ”iⱮ       3 _²§½
 23 ŒH        6 µÐL       3 1œị$
 23 Ɗ€        6 K€Y       3 Ṫ⁼¥Ƈ
 22 €Ø        6 ;⁾v       3 ‘ịØB
 22 +2        5 Ɗ$€       3 ŒṪŒ!
 21 €F        5 ,U$       3 Œ!ŒṖ
 21 Øa        5 PŒ!       3 !ŒṖ€
 20 €0        5 Œ!€       3 ØJiⱮ
 20 QƑ        5 !€Ẏ       3 ị⁾# 
 20 ị@        5 Ø0j       3 ṪḢƭ€
 20 Ðe        5 €0Z       3 ØDṙ1
 20 Œc        5 ØDV       3 !@#$
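
For reference, counts like the ones above can be reproduced with a sliding window over each program. This is only a minimal sketch, not the script actually used for this issue: the function names `ngram_counts` and `top_ngrams` are made up for illustration, the input is assumed to be a plain list of program strings already extracted from the CSV, and it counts every occurrence (not one per program), which may differ from how the table was built.

```python
from collections import Counter


def ngram_counts(programs, n):
    """Count every length-n substring across all programs (hypothetical helper)."""
    counts = Counter()
    for program in programs:
        # Slide a window of width n over the program's characters.
        for i in range(len(program) - n + 1):
            counts[program[i:i + n]] += 1
    return counts


def top_ngrams(programs, n, k=40):
    """Return the k most common n-graphs as (ngram, count) pairs."""
    return ngram_counts(programs, n).most_common(k)
```

Calling `top_ngrams(programs, 2)`, `top_ngrams(programs, 3)` and `top_ngrams(programs, 4)` would give the three columns. Python strings are indexed by code point, so this assumes each Jelly character is a single code point.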

Remarks:

  • Œ! and ŒP are the most common two-character atoms. I think it's OK to make code-page alternatives like § for and here.
  • e€Ø is the most common trigraph. This is followed by A or D to get masks of alphabets/digits in a string.
  • ⁼¥Ƈ is the second most common trigraph. I guess Ƒ was introduced a little later?
  • Apparently UÐe (reverse every other element) is a surprisingly common operation in golf.
  • condition + Ạ$Ƈ is common. Maybe this could be ÐẠ "filter-all".
  • QƑƇ is common: keep only elements that have no duplicates. It's like a meta version of Q. How about Œq?
  • I investigated Ø.,U and apparently this is common because people are writing Ø.,U$ to get [[0,1],[1,0]] and Ø.,U$;N$ to get [[0,1],[1,0],[0,-1],[-1,0]]. So these should both just be nilads. ØX and Øx maybe.
  • Perhaps ƝẠ$ could be ɲ, e.g. <ɲ checks if a list is strictly ascending.
  • ŒDṙL: I can see why you would want the diagonals in this order.
  • fØDV€ turns “a2b~3,f16!” into [2,3,1,6].
  • _²§½ is about point distances. I think there should be a dyad δ so that [3,3]δ[[3,4],[4,4],[5,3]] is 1,√(2),2.
  • Might as well make œc3 into a monad Œƈ.
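
To pin down what the proposed shortcuts would compute, here is a rough Python model of several of the idioms in the remarks above. It is a sketch of the intended semantics as I read them, not Jelly's implementation: all function names are hypothetical, and `reverse_every_other` assumes Ðe targets the elements at even 1-based positions.

```python
from itertools import combinations
from math import dist


def keep_duplicate_free(rows):
    """Model of QƑƇ: keep rows that are unchanged by deduplication."""
    def dedup(row):
        seen = []
        for item in row:
            if item not in seen:
                seen.append(item)
        return seen
    return [row for row in rows if dedup(row) == list(row)]


def all_neighbours(pred, xs):
    """Model of a dyad followed by ƝẠ$: pred holds for every adjacent pair."""
    return all(pred(a, b) for a, b in zip(xs, xs[1:]))


def digits_of(text):
    """Model of fØDV€: keep the digit characters and evaluate each one."""
    return [int(ch) for ch in text if ch in "0123456789"]


def unit_steps():
    """Model of Ø.,U$;N$: [0,1], its reverse, then both negated."""
    base = [[0, 1], [1, 0]]
    return base + [[-x, -y] for x, y in base]


def distances(point, others):
    """Model of the proposed δ: Euclidean distance from point to each other point."""
    return [dist(point, other) for other in others]


def triples(xs):
    """Model of œc3: all combinations of three elements."""
    return [list(c) for c in combinations(xs, 3)]


def reverse_every_other(rows):
    """Model of UÐe: reverse the rows at even 1-based positions (assumed)."""
    return [row[::-1] if i % 2 == 1 else row for i, row in enumerate(rows)]
```

For example, `all_neighbours(lambda a, b: a < b, xs)` is the strictly-ascending check, and `distances([3, 3], [[3, 4], [4, 4], [5, 3]])` gives the 1, √2, 2 from the δ example.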

lynn, Sep 03 '20 14:09