App icon indicating copy to clipboard operation
App copied to clipboard

[$500] IOS - Compose Box - App crashes when pasting a text with an image

Open IuliiaHerets opened this issue 1 year ago • 50 comments

If you haven’t already, check out our contributing guidelines for onboarding and email [email protected] to request to join our Slack channel!


Version Number: 9.0.61-0 Reproducible in staging?: Y Reproducible in production?: Y If this was caught on HybridApp, is this reproducible on New Expensify Standalone?: Y Issue was found when executing this PR: https://github.com/Expensify/App/pull/51855 Email or phone of affected tester (no customers): [email protected] Issue reported by: Applause Internal Team

Action Performed:

  1. Open IOS app
  2. Open any chat and paste this text inside of the compose box test_test
  3. Try to send the message or go back

Expected Result:

The text with the image gets sent

Actual Result:

App freezes at first and crashes after a while

Workaround:

Unknown

Platforms:

  • [ ] Android: Standalone
  • [ ] Android: HybridApp
  • [ ] Android: mWeb Chrome
  • [x] iOS: Standalone
  • [x] iOS: HybridApp
  • [ ] iOS: mWeb Safari
  • [ ] MacOS: Chrome / Safari
  • [ ] MacOS: Desktop

Screenshots/Videos

https://github.com/user-attachments/assets/27701433-691f-49fe-b122-aec2738e57e5

View all open jobs on GitHub

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~021856775016601889974
  • Upwork Job ID: 1856775016601889974
  • Last Price Increase: 2024-11-26

IuliiaHerets avatar Nov 13 '24 13:11 IuliiaHerets

Triggered auto assignment to @garrettmknight (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

melvin-bot[bot] avatar Nov 13 '24 13:11 melvin-bot[bot]

Commented on the offending issue.

garrettmknight avatar Nov 13 '24 14:11 garrettmknight

Job added to Upwork: https://www.upwork.com/jobs/~021856775016601889974

melvin-bot[bot] avatar Nov 13 '24 19:11 melvin-bot[bot]

Triggered auto assignment to Contributor-plus team member for initial proposal review - @ntdiary (External)

melvin-bot[bot] avatar Nov 13 '24 19:11 melvin-bot[bot]

@ntdiary can you take a look at this PR and see if it's causing the problem?

garrettmknight avatar Nov 13 '24 19:11 garrettmknight

@IuliiaHerets Can you please comment the exact text that was pasted inside the compose box?

badeggg avatar Nov 14 '24 09:11 badeggg

@ntdiary can you take a look at this PR and see if it's causing the problem?

@garrettmknight, I think this issue wasn't introduced by that PR. The problem is likely due to performance issues with our input component when parsing url in the text, which causes the app to become unresponsive. This is likely another regular expression catastrophic backtracking problem, similar to #34324 😂

test video:

https://github.com/user-attachments/assets/42caa88d-ebac-4eec-b066-4f86a0f27942


the origin text in the OP: ![test](https://camo.githubusercontent.com/4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d/68747470733a2f2f70696373756d2e70686f746f732f69642f313036372f3230302f333030)_test cc @badeggg

ntdiary avatar Nov 14 '24 09:11 ntdiary

Thanks @ntdiary - we'll solve here then!

garrettmknight avatar Nov 14 '24 15:11 garrettmknight

@badeggg Posted the exact text that was pasted inside the compose box. It is the same as @ntdiary mentioned ![test](https://camo.githubusercontent.com/4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d/68747470733a2f2f70696373756d2e70686f746f732f69642f313036372f3230302f333030 )_test

IuliiaHerets avatar Nov 14 '24 20:11 IuliiaHerets

Edited by proposal-police: This proposal was edited at 2024-11-21 07:37:15 UTC.

Proposal

Please re-state the problem that we are trying to solve in this issue.

App stalls when pasting specific text inside the compose box.

What is the root cause of that problem?

Regexp execution get stuck when too many possibilities before failing.

Let me show an example. Regexp (\w*)*m will get stuck when testing with 4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d:

  1. The testing will fail, since there is no char m in the string
  2. Before the failing decision, the regexp executor need go through too many possibilities
js sample code

Running this piece of code will lead stuck with both node and hermes. I am testing with `node v20.18.0` and hermes built from source with hash `662354b78a4ee8f4cf449c47c46fcaa6ca473847`, which is the current main branch.

function log() {
    if (typeof print !== 'undefined') { // in hermes
        print(...arguments)
    } else if (typeof console !== 'undefined') { // in node
        console.log(...arguments);
    }
}

log('hello there')


const markdown = `4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d`;

const regexp = /(\w+)*m/

const replacement = '===='

const start = new Date();
log('start', start)
const result = markdown.replace(regexp, replacement);
const end = new Date();
log('  end', end)

log('result', result)
python sample code

This also happens with python, specifically I tested with following code on `Python 3.10.5`:

import re
from datetime import datetime

# Sample text
text = '4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d'


# Define the pattern to search for an email address
# pattern = r'(\w+)+m'
pattern = r'(\w*)*m'

# Search for the pattern in the text
print('start: ', datetime.now())
result = re.match(pattern, text)
print('  end: ', datetime.now())

print(result)

In our project, the regexp for video test against ![test](https://camo.githubusercontent.com/4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d/68747470733a2f2f70696373756d2e70686f746f732f69642f313036372f3230302f333030)_test is slowing regexp executor in hermes. If we remove the VIDEO_EXTENSIONS part from the regexp, which is the 'match failing' part, slowing is off.

To be more convincible, here are facts:

  1. Testing the regexp for video with ![test](https://camo.githubusercontent.com/4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d/68747470733a2f2f70696373756d2e70686f746f732f69642f313036372f3230302f333030)_test in hermes cost about 30 seconds.
  2. After reducing the possibilities by setting url path char quantifier to {60,80} from *, testing cost about 9 seconds.
test code
function log() {
    if (typeof print !== 'undefined') { // in hermes
        print(...arguments)
    } else if (typeof console !== 'undefined') { // in node
        console.log(...arguments);
    }
}

log('hello there')


const markdown = `![test](https://camo.githubusercontent.com/4848d0f965f332077b77a1a0488c3e66b4769032104f4de6890bae218b4add8d/68747470733a2f2f70696373756d2e70686f746f732f69642f313036372f3230302f333030)_test`;

// full original reg for video
// const regexp = /\!(?:\[([^\][]*(?:\[[^\][]*][^\][]*)*)])?\((((((((ht|f)tps?:\/\/)([a-z0-9](?:[-a-z0-9]*[a-z0-9])?\.)*(?:[a-z0-9](?:[-a-z0-9]*[a-z0-9])?)(?:\:([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])|\b|(?=_)))(?:(?:[.,=(+$!*]|&(?:amp|#x27);)?\/(?:[-\w$@.+!*:(),=%~]|&(?:amp|#x27);)*(?:[-\w~@:%)]|&(?:amp|#x27);)|\/)*(?:(?:\?(?:[-\w$@.+!*()\/,=%{}:;\[\]\|_|~]|&(?:amp|#x27);)*)?|(?:#(?:[-\w$@.+!*()[\],=%;\/:~]|&(?:amp|#x27);)*)?)*)|((((ht|f)tps?:\/\/)?((?:www\.)?[a-z0-9](?=(?<label>[-a-z0-9]*[a-z0-9])?)\k<label>\.)+(?:XN--VERMGENSBERATUNG-PWB|XN--VERMGENSBERATER-CTB|XN--CLCHC0EA0B2G2A9GCD|XN--W4R85EL8FHU5DNRA|TRAVELERSINSURANCE|NORTHWESTERNMUTUAL|XN--XKC2DL3A5EE0H|XN--MGBERP4A5D4AR|XN--MGBAI9AZGQP6J|XN--MGBAH1A3HJKRD|XN--BCK1B9A5DRE4C|XN--5SU34J936BGSG|XN--3OQ18VL8PN36A|XN--XKC2AL3HYE2A|XN--MGBCPQ6GPA1A|XN--MGBA7C0BBN0A|XN--FZYS8D69UVGM|XN--NQV7FS00EMA|XN--MGBC0A9AZCG|XN--MGBAAKC7DVF|XN--MGBA3A4F16A|XN--LGBBAT1AD8J|XN--KCRX77D1X4A|XN--I1B6B1A6A2E|SANDVIKCOROMANT|KERRYPROPERTIES|AMERICANEXPRESS|XN--RVC1E0AM3E|XN--MGBX4CD0AB|XN--MGBI4ECEXP|XN--MGBCA7DZDO|XN--MGBBH1A71E|XN--MGBAYH7GPA|XN--MGBAAM7A8H|XN--MGBA3A3EJT|XN--JLQ61U9W7B|XN--JLQ480N2RG|XN--H2BREG3EVE|XN--FIQ228C5HS|XN--B4W605FERD|XN--80AQECDR1A|XN--6QQ986B3XL|XN--54B7FTA0CC|WEATHERCHANNEL|KERRYLOGISTICS|COOKINGCHANNEL|CANCERRESEARCH|BANANAREPUBLIC|AMERICANFAMILY|AFAMILYCOMPANY|XN--YGBI2AMMX|XN--YFRO4I67O|XN--TIQ49XQYJ|XN--H2BRJ9C8C|XN--FZC2C9E2C|XN--FPCRJ9C3D|XN--ECKVDTC9D|XN--CCKWCXETD|WOLTERSKLUWER|TRAVELCHANNEL|SPREADBETTING|LIFEINSURANCE|INTERNATIONAL|XN--QCKA1PMC|XN--OGBPF8FL|XN--NGBE9E0A|XN--NGBC5AZD|XN--MK1BU44C|XN--MGBT3DHD|XN--MGBPL2FH|XN--MGBGU82A|XN--MGBAB2BD|XN--MGB9AWBF|XN--GCKR3F0F|XN--8Y0A063A|XN--80ASEHDB|XN--80ADXHKS|XN--4DBRK0CE|XN--45BR5CYL|XN--3E0B707E|VERSICHERUNG|SCHOLARSHIPS|LPLFINANCIAL|CONSTRUCTION|XN--ZFR164B|XN--XHQ521B|XN--W4RS40L|XN--VUQ861B|XN--T60B56A|XN--SES554G|XN--S9BRJ9C|XN--ROVU88B|XN--RHQV96G|XN--Q9JYB4C|XN--PGBS0DH|XN--OTU796D|XN--NYQY26A|XN--MIX891F|XN--MGBTX2B|XN--MGBBH1A|XN--KPRY57D|XN--KPRW13D|XN--JVR189M|XN--J6W193G|XN--IMR513N|XN--HXT814E|XN--H2BRJ9C|XN--GK3AT1E|XN--GECRJ9C|XN--G2XX48C|XN--FLW351E|XN--FJQ720A|XN--FCT429K|XN--EFVY88H|XN--D1ACJ3B|XN--CZR694B|XN--CCK2B3B|XN--9KRT00A|XN--80AO21A|XN--6FRZ82G|XN--55QW42G|XN--45BRJ9C|XN--42C2D9A|XN--3HCRJ9C|XN--3DS443G|XN--3BST00M|XN--2SCRJ9C|XN--1QQW23A|XN--1CK2E1B|XN--11B4C3D|WILLIAMHILL|REDUMBRELLA|PROGRESSIVE|PRODUCTIONS|PLAYSTATION|PHOTOGRAPHY|OLAYANGROUP|MOTORCYCLES|LAMBORGHINI|KERRYHOTELS|INVESTMENTS|FOODNETWORK|ENTERPRISES|ENGINEERING|CREDITUNION|CONTRACTORS|CALVINKLEIN|BRIDGESTONE|BLOCKBUSTER|BLACKFRIDAY|BARCLAYCARD|ACCOUNTANTS|XN--Y9A3AQ|XN--WGBL6A|XN--WGBH1C|XN--UNUP4Y|XN--Q7CE6A|XN--PSSY2U|XN--O3CW4H|XN--MXTQ1M|XN--KPUT3I|XN--IO0A7I|XN--FIQZ9S|XN--FIQS8S|XN--FIQ64B|XN--CZRU2D|XN--CZRS0T|XN--CG4BKI|XN--C2BR7G|XN--9ET52U|XN--9DBQ2A|XN--90A3AC|XN--80ASWG|XN--5TZM5G|XN--55QX5D|XN--4GBRIM|XN--45Q11C|XN--3PXU8K|XN--30RR7Y|VOLKSWAGEN|VLAANDEREN|UNIVERSITY|TECHNOLOGY|TATAMOTORS|SWIFTCOVER|SCHAEFFLER|RESTAURANT|REPUBLICAN|REALESTATE|PRUDENTIAL|PROTECTION|PROPERTIES|ONYOURSIDE|NEXTDIRECT|NATIONWIDE|MITSUBISHI|MANAGEMENT|INDUSTRIES|IMMOBILIEN|HEALTHCARE|FOUNDATION|EXTRASPACE|EUROVISION|CUISINELLA|CREDITCARD|CONSULTING|CAPITALONE|BOEHRINGER|BNPPARIBAS|BASKETBALL|ASSOCIATES|APARTMENTS|ACCOUNTANT|YODOBASHI|XN--VHQUV|XN--TCKWE|XN--QXA6A|XN--P1ACF|XN--NQV7F|XN--NGBRX|XN--L1ACC|XN--J1AMH|XN--J1AEF|XN--FHBEI|XN--E1A4C|XN--D1ALF|XN--C1AVG|XN--90AIS|VACATIONS|TRAVELERS|STOCKHOLM|STATEFARM|STATEBANK|SOLUTIONS|SHANGRILA|SCJOHNSON|RICHARDLI|PRAMERICA|PASSAGENS|PANASONIC|MICROSOFT|MELBOURNE|MARSHALLS|MARKETING|LIFESTYLE|LANDROVER|LANCASTER|KUOKGROUP|INSURANCE|INSTITUTE|HOMESENSE|HOMEGOODS|HOMEDEPOT|HISAMITSU|GOLDPOINT|FURNITURE|FUJIXEROX|FRONTDOOR|FRESENIUS|FIRESTONE|FINANCIAL|FAIRWINDS|EQUIPMENT|EDUCATION|DIRECTORY|COMMUNITY|CHRISTMAS|BLOOMBERG|BARCELONA|AQUARELLE|ANALYTICS|AMSTERDAM|ALLFINANZ|ALFAROMEO|ACCENTURE|YOKOHAMA|XN--QXAM|XN--P1AI|XN--NODE|XN--90AE|WOODSIDE|VERISIGN|VENTURES|VANGUARD|TRAINING|SUPPLIES|STCGROUP|SOFTWARE|SOFTBANK|SHOWTIME|SHOPPING|SERVICES|SECURITY|SAMSCLUB|SAARLAND|RELIANCE|REDSTONE|PROPERTY|PLUMBING|PICTURES|PHARMACY|PARTNERS|OBSERVER|MORTGAGE|MERCKMSD|MEMORIAL|MCKINSEY|MASERATI|MARRIOTT|LUNDBECK|LIGHTING|JPMORGAN|ISTANBUL|IPIRANGA|INFINITI|HOSPITAL|HOLDINGS|HELSINKI|HDFCBANK|GUARDIAN|GRAPHICS|GRAINGER|GOODYEAR|FRONTIER|FOOTBALL|FIRMDALE|FIDELITY|FEEDBACK|EXCHANGE|ETISALAT|ERICSSON|ENGINEER|DOWNLOAD|DISCOVER|DISCOUNT|DIAMONDS|DEMOCRAT|DELOITTE|DELIVERY|COMPUTER|COMMBANK|CLOTHING|CLINIQUE|CLEANING|CITYEATS|CIPRIANI|CATHOLIC|CATERING|CAPETOWN|BUSINESS|BUILDERS|BUDAPEST|BRUSSELS|BROADWAY|BRADESCO|BOUTIQUE|BASEBALL|BARGAINS|BAREFOOT|BARCLAYS|ATTORNEY|ALLSTATE|AIRFORCE|ABUDHABI|ZUERICH|YOUTUBE|YAMAXUN|XFINITY|WINNERS|WINDOWS|WHOSWHO|WEDDING|WEBSITE|WEATHER|WATCHES|WANGGOU|WALMART|TRADING|TOSHIBA|TIFFANY|TICKETS|THEATRE|THEATER|TEMASEK|SYSTEMS|SURGERY|SUPPORT|STORAGE|STAPLES|SINGLES|SHIKSHA|SCIENCE|SCHWARZ|SCHMIDT|SANDVIK|SAMSUNG|REXROTH|REVIEWS|RENTALS|RECIPES|REALTOR|POLITIE|PIONEER|PHILIPS|ORIGINS|ORGANIC|OLDNAVY|OKINAWA|NEUSTAR|NETWORK|NETFLIX|NETBANK|MONSTER|MARKETS|LINCOLN|LIMITED|LECLERC|LATROBE|LASALLE|LANXESS|LACAIXA|KOMATSU|KITCHEN|JUNIPER|JEWELRY|ISMAILI|HYUNDAI|HOTMAIL|HOTELES|HOSTING|HOLIDAY|HITACHI|HANGOUT|HAMBURG|GUITARS|GROCERY|GODADDY|GENTING|GALLERY|FUJITSU|FROGANS|FORSALE|FLOWERS|FLORIST|FLIGHTS|FITNESS|FISHING|FINANCE|FERRERO|FERRARI|FASHION|FARMERS|EXPRESS|EXPOSED|DOMAINS|DIGITAL|DENTIST|CRUISES|CRICKET|COURSES|COUPONS|COUNTRY|CORSICA|COOKING|CONTACT|COMPARE|COMPANY|COMCAST|COLOGNE|COLLEGE|CLUBMED|CITADEL|CHINTAI|CHARITY|CHANNEL|CAREERS|CARAVAN|CAPITAL|BUGATTI|BROTHER|BOOKING|BESTBUY|BENTLEY|BAUHAUS|BANAMEX|AVIANCA|AUSPOST|AUDIBLE|AUCTION|ATHLETA|ANDROID|ALIBABA|AGAKHAN|ACADEMY|ABOGADO|ZAPPOS|YANDEX|YACHTS|XIHUAN|WEBCAM|WALTER|VUELOS|VOYAGE|VOTING|VISION|VIRGIN|VILLAS|VIKING|VIAJES|UNICOM|TRAVEL|TOYOTA|TKMAXX|TJMAXX|TIENDA|TENNIS|TATTOO|TARGET|TAOBAO|TAIPEI|SYDNEY|SWATCH|SUZUKI|SUPPLY|STUDIO|STREAM|SOCIAL|SOCCER|SHOUJI|SELECT|SECURE|SEARCH|SCHULE|SCHOOL|SANOFI|SAKURA|SAFETY|RYUKYU|ROGERS|ROCHER|REVIEW|REPORT|REPAIR|REISEN|REALTY|RACING|QUEBEC|PICTET|PHYSIO|PHOTOS|PFIZER|OTSUKA|ORANGE|ORACLE|ONLINE|OLAYAN|OFFICE|NOWRUZ|NORTON|NISSAY|NISSAN|NATURA|NAGOYA|MUTUAL|MUSEUM|MOSCOW|MORMON|MONASH|MOBILE|MATTEL|MARKET|MAKEUP|MAISON|MADRID|LUXURY|LONDON|LOCKER|LIVING|LEFRAK|LAWYER|LATINO|LANCIA|KOSHER|KINDLE|KINDER|KAUFEN|JUEGOS|JOBURG|JAGUAR|INTUIT|INSURE|IMAMAT|HUGHES|HOTELS|HOCKEY|HIPHOP|HERMES|HEALTH|GRATIS|GOOGLE|GLOBAL|GIVING|GEORGE|GARDEN|GALLUP|FUTBOL|FLICKR|FAMILY|EXPERT|EVENTS|ESTATE|ENERGY|EMERCK|DURBAN|DUPONT|DUNLOP|DOCTOR|DIRECT|DESIGN|DENTAL|DEGREE|DEALER|DATSUN|DATING|CRUISE|CREDIT|COUPON|CONDOS|COMSEC|COFFEE|CLINIC|CLAIMS|CIRCLE|CHURCH|CHROME|CHANEL|CENTER|CASINO|CAREER|CAMERA|BROKER|BOSTON|BOSTIK|BHARTI|BERLIN|BEAUTY|BAYERN|AUTHOR|ARAMCO|ANQUAN|AMAZON|ALSTOM|ALSACE|ALIPAY|AIRTEL|AIRBUS|AGENCY|AFRICA|ABBVIE|ABBOTT|ABARTH|YAHOO|XEROX|WORLD|WORKS|WEIBO|WEBER|WATCH|WALES|VOLVO|VODKA|VIDEO|VEGAS|UBANK|TUSHU|TUNES|TRUST|TRADE|TOURS|TOTAL|TORAY|TOOLS|TOKYO|TODAY|TMALL|TIROL|TIRES|TATAR|SWISS|SUCKS|STYLE|STUDY|STORE|STADA|SPORT|SPACE|SOLAR|SMILE|SMART|SLING|SKYPE|SHOES|SHELL|SHARP|SEVEN|SENER|SALON|RUGBY|RODEO|ROCKS|RICOH|REISE|REHAB|RADIO|QUEST|PROMO|PRIME|PRESS|PRAXI|POKER|PLACE|PIZZA|PHOTO|PHONE|PARTY|PARTS|PARIS|OSAKA|OMEGA|NOWTV|NOKIA|NINJA|NIKON|NEXUS|MOVIE|MONEY|MIAMI|MEDIA|MANGO|MACYS|LOTTO|LOTTE|LOCUS|LOANS|LIXIL|LIPSY|LINDE|LILLY|LEXUS|LEGAL|LEASE|LAMER|KYOTO|KOELN|JETZT|IVECO|IRISH|IKANO|HYATT|HOUSE|HORSE|HONDA|HOMES|GUIDE|GUCCI|GROUP|GRIPE|GREEN|GMAIL|GLOBO|GLASS|GLADE|GIVES|GIFTS|GAMES|GALLO|FORUM|FOREX|FINAL|FEDEX|FAITH|EPSON|EMAIL|EDEKA|EARTH|DUBAI|DRIVE|DELTA|DEALS|DANCE|DABUR|CYMRU|CROWN|CODES|COACH|CLOUD|CLICK|CITIC|CISCO|CHEAP|CHASE|CARDS|CANON|BUILD|BOSCH|BOATS|BLACK|BINGO|BIBLE|BEATS|BAIDU|AZURE|AUTOS|AUDIO|ARCHI|APPLE|AMICA|AMFAM|AETNA|ADULT|ACTOR|ZONE|ZERO|ZARA|YOGA|XBOX|WORK|WINE|WIKI|WIEN|WEIR|WANG|VOTO|VOTE|VIVO|VIVA|VISA|VANA|TUBE|TOYS|TOWN|TIPS|TIAA|TEVA|TECH|TEAM|TAXI|TALK|SURF|STAR|SPOT|SONY|SONG|SOHU|SNCF|SKIN|SITE|SINA|SILK|SHOW|SHOP|SHIA|SHAW|SEXY|SEEK|SEAT|SCOT|SAXO|SAVE|SARL|SALE|SAFE|RUHR|RSVP|ROOM|RMIT|RICH|REST|RENT|REIT|READ|RAID|QPON|PROF|PROD|POST|PORN|POHL|PLUS|PLAY|PINK|PING|PICS|PCCW|PARS|PAGE|OPEN|OLLO|NIKE|NICO|NEXT|NEWS|NAVY|NAME|MOTO|MODA|MOBI|MINT|MINI|MENU|MEME|MEET|MAIF|LUXE|LTDA|LOVE|LOFT|LOAN|LIVE|LINK|LIMO|LIKE|LIFE|LIDL|LGBT|LEGO|LAND|KRED|KPMG|KIWI|KDDI|JPRS|JOBS|JEEP|JAVA|ITAU|INFO|IMMO|IMDB|IEEE|ICBC|HSBC|HOST|HGTV|HERE|HELP|HDFC|HAUS|HAIR|GURU|GUGE|GOOG|GOLF|GOLD|GMBH|GIFT|GGEE|GENT|GBIZ|GAME|FUND|FREE|FORD|FOOD|FLIR|FISH|FIRE|FILM|FIDO|FIAT|FAST|FARM|FANS|FAIL|FAGE|ERNI|DVAG|DUCK|DOCS|DISH|DIET|DESI|DELL|DEAL|DCLK|DATE|DATA|CYOU|COOP|COOL|CLUB|CITY|CITI|CHAT|CERN|CBRE|CASH|CASE|CASA|CARS|CARE|CAMP|CALL|CAFE|BUZZ|BOOK|BOND|BOFA|BLUE|BLOG|BING|BIKE|BEST|BEER|BBVA|BANK|BAND|BABY|AUTO|AUDI|ASIA|ASDA|ARTE|ARPA|ARMY|ARAB|AMEX|ALLY|AKDN|AERO|ADAC|ABLE|AARP|ZIP|YUN|YOU|XYZ|XXX|XIN|WTF|WTC|WOW|WME|WIN|WED|VIP|VIN|VIG|VET|UPS|UOL|UNO|UBS|TVS|TUI|TRV|TOP|TJX|THD|TEL|TDK|TCI|TAX|TAB|STC|SRL|SPA|SOY|SKY|SKI|SFR|SEX|SEW|SES|SCB|SCA|SBS|SBI|SAS|SAP|RWE|RUN|RIP|RIO|RIL|REN|RED|QVC|PWC|PUB|PRU|PRO|PNC|PIN|PID|PHD|PET|PAY|OVH|OTT|ORG|OOO|ONL|ONG|ONE|OFF|OBI|NYC|NTT|NRW|NRA|NOW|NHK|NGO|NFL|NEW|NET|NEC|NBA|NAB|MTR|MTN|MSD|MOV|MOM|MOI|MOE|MMA|MLS|MLB|MIT|MIL|MEN|MED|MBA|MAP|MAN|LTD|LPL|LOL|LLP|LLC|LDS|LAW|LAT|KRD|KPN|KIM|KIA|KFH|JOY|JOT|JNJ|JMP|JLL|JIO|JCB|ITV|IST|INT|INK|ING|INC|IFM|ICU|ICE|IBM|HOW|HOT|HKT|HIV|HBO|GOV|GOT|GOP|GOO|GMX|GMO|GLE|GEA|GDN|GAY|GAP|GAL|FYI|FUN|FTR|FRL|FOX|FOO|FLY|FIT|FAN|EUS|ESQ|EDU|ECO|EAT|DVR|DTV|DOT|DOG|DNP|DIY|DHL|DEV|DDS|DAY|DAD|CSC|CRS|CPA|COM|CFD|CFA|CEO|CBS|CBN|CBA|CAT|CAR|CAM|CAL|CAB|BZH|BUY|BOX|BOT|BOO|BOM|BMW|BMS|BIZ|BIO|BID|BET|BCN|BCG|BBT|BBC|BAR|AXA|AWS|ART|APP|AOL|ANZ|AIG|AFL|AEG|ADS|ACO|ABC|ABB|AAA|ZW|ZM|ZA|YT|YE|WS|WF|VU|VN|VI|VG|VE|VC|VA|UZ|UY|US|UK|UG|UA|TZ|TW|TV|TT|TR|TO|TN|TM|TL|TK|TJ|TH|TG|TF|TD|TC|SZ|SY|SX|SV|SU|ST|SS|SR|SO|SN|SM|SL|SK|SJ|SI|SH|SG|SE|SD|SC|SB|SA|RW|RU|RS|RO|RE|QA|PY|PW|PT|PS|PR|PN|PM|PL|PK|PH|PG|PF|PE|PA|OM|NZ|NU|NR|NP|NO|NL|NI|NG|NF|NE|NC|NA|MZ|MY|MX|MW|MV|MU|MT|MS|MR|MQ|MP|MO|MN|MM|ML|MK|MH|MG|ME|MD|MC|MA|LY|LV|LU|LT|LS|LR|LK|LI|LC|LB|LA|KZ|KY|KW|KR|KP|KN|KM|KI|KH|KG|KE|JP|JO|JM|JE|IT|IS|IR|IQ|IO|IN|IM|IL|IE|ID|HU|HT|HR|HN|HM|HK|GY|GW|GU|GT|GS|GR|GQ|GP|GN|GM|GL|GI|GH|GG|GF|GE|GD|GB|GA|FR|FO|FM|FK|FJ|FI|EU|ET|ES|ER|EG|EE|EC|DZ|DO|DM|DK|DJ|DE|CZ|CY|CX|CW|CV|CU|CR|CO|CN|CM|CL|CK|CI|CH|CG|CF|CD|CC|CA|BZ|BY|BW|BV|BT|BS|BR|BO|BN|BM|BJ|BI|BH|BG|BF|BE|BD|BB|BA|AZ|AX|AW|AU|AT|AS|AR|AQ|AO|AM|AL|AI|AG|AF|AE|AD|AC|SJC|RNO|LAX)(?:\:([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])|\b|(?=_))(?!@(?:[a-z\d-]+\.)+[a-z]{2,}))(?:(?:[.,=(+$!*]|&(?:amp|#x27);)?\/(?:[-\w$@.+!*:(),=%~]|&(?:amp|#x27);)*(?:[-\w~@:%)]|&(?:amp|#x27);)|\/)*(?:(?:\?(?:[-\w$@.+!*()\/,=%{}:;\[\]\|_|~]|&(?:amp|#x27);)*)?|(?:#(?:[-\w$@.+!*()[\],=%;\/:~]|&(?:amp|#x27);)*)?)*)))\.(?:mp4|mov|avi|wmv|flv|mkv|webm|3gp|m4v|mpg|mpeg|ogv))\)(?![^<]*(<\/pre>|<\/code>))/gi;

// full reg for video with path char quantifier {60,80}
const regexp = /\!(?:\[([^\][]*(?:\[[^\][]*][^\][]*)*)])?\((((((((ht|f)tps?:\/\/)([a-z0-9](?:[-a-z0-9]*[a-z0-9])?\.)*(?:[a-z0-9](?:[-a-z0-9]*[a-z0-9])?)(?:\:([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])|\b|(?=_)))(?:(?:[.,=(+$!*]|&(?:amp|#x27);)?\/(?:[-\w$@.+!*:(),=%~]|&(?:amp|#x27);){60,80}(?:[-\w~@:%)]|&(?:amp|#x27);)|\/)*(?:(?:\?(?:[-\w$@.+!*()\/,=%{}:;\[\]\|_|~]|&(?:amp|#x27);)*)?|(?:#(?:[-\w$@.+!*()[\],=%;\/:~]|&(?:amp|#x27);)*)?)*)|((((ht|f)tps?:\/\/)?((?:www\.)?[a-z0-9](?=(?<label>[-a-z0-9]*[a-z0-9])?)\k<label>\.)+(?:XN--VERMGENSBERATUNG-PWB|XN--VERMGENSBERATER-CTB|XN--CLCHC0EA0B2G2A9GCD|XN--W4R85EL8FHU5DNRA|TRAVELERSINSURANCE|NORTHWESTERNMUTUAL|XN--XKC2DL3A5EE0H|XN--MGBERP4A5D4AR|XN--MGBAI9AZGQP6J|XN--MGBAH1A3HJKRD|XN--BCK1B9A5DRE4C|XN--5SU34J936BGSG|XN--3OQ18VL8PN36A|XN--XKC2AL3HYE2A|XN--MGBCPQ6GPA1A|XN--MGBA7C0BBN0A|XN--FZYS8D69UVGM|XN--NQV7FS00EMA|XN--MGBC0A9AZCG|XN--MGBAAKC7DVF|XN--MGBA3A4F16A|XN--LGBBAT1AD8J|XN--KCRX77D1X4A|XN--I1B6B1A6A2E|SANDVIKCOROMANT|KERRYPROPERTIES|AMERICANEXPRESS|XN--RVC1E0AM3E|XN--MGBX4CD0AB|XN--MGBI4ECEXP|XN--MGBCA7DZDO|XN--MGBBH1A71E|XN--MGBAYH7GPA|XN--MGBAAM7A8H|XN--MGBA3A3EJT|XN--JLQ61U9W7B|XN--JLQ480N2RG|XN--H2BREG3EVE|XN--FIQ228C5HS|XN--B4W605FERD|XN--80AQECDR1A|XN--6QQ986B3XL|XN--54B7FTA0CC|WEATHERCHANNEL|KERRYLOGISTICS|COOKINGCHANNEL|CANCERRESEARCH|BANANAREPUBLIC|AMERICANFAMILY|AFAMILYCOMPANY|XN--YGBI2AMMX|XN--YFRO4I67O|XN--TIQ49XQYJ|XN--H2BRJ9C8C|XN--FZC2C9E2C|XN--FPCRJ9C3D|XN--ECKVDTC9D|XN--CCKWCXETD|WOLTERSKLUWER|TRAVELCHANNEL|SPREADBETTING|LIFEINSURANCE|INTERNATIONAL|XN--QCKA1PMC|XN--OGBPF8FL|XN--NGBE9E0A|XN--NGBC5AZD|XN--MK1BU44C|XN--MGBT3DHD|XN--MGBPL2FH|XN--MGBGU82A|XN--MGBAB2BD|XN--MGB9AWBF|XN--GCKR3F0F|XN--8Y0A063A|XN--80ASEHDB|XN--80ADXHKS|XN--4DBRK0CE|XN--45BR5CYL|XN--3E0B707E|VERSICHERUNG|SCHOLARSHIPS|LPLFINANCIAL|CONSTRUCTION|XN--ZFR164B|XN--XHQ521B|XN--W4RS40L|XN--VUQ861B|XN--T60B56A|XN--SES554G|XN--S9BRJ9C|XN--ROVU88B|XN--RHQV96G|XN--Q9JYB4C|XN--PGBS0DH|XN--OTU796D|XN--NYQY26A|XN--MIX891F|XN--MGBTX2B|XN--MGBBH1A|XN--KPRY57D|XN--KPRW13D|XN--JVR189M|XN--J6W193G|XN--IMR513N|XN--HXT814E|XN--H2BRJ9C|XN--GK3AT1E|XN--GECRJ9C|XN--G2XX48C|XN--FLW351E|XN--FJQ720A|XN--FCT429K|XN--EFVY88H|XN--D1ACJ3B|XN--CZR694B|XN--CCK2B3B|XN--9KRT00A|XN--80AO21A|XN--6FRZ82G|XN--55QW42G|XN--45BRJ9C|XN--42C2D9A|XN--3HCRJ9C|XN--3DS443G|XN--3BST00M|XN--2SCRJ9C|XN--1QQW23A|XN--1CK2E1B|XN--11B4C3D|WILLIAMHILL|REDUMBRELLA|PROGRESSIVE|PRODUCTIONS|PLAYSTATION|PHOTOGRAPHY|OLAYANGROUP|MOTORCYCLES|LAMBORGHINI|KERRYHOTELS|INVESTMENTS|FOODNETWORK|ENTERPRISES|ENGINEERING|CREDITUNION|CONTRACTORS|CALVINKLEIN|BRIDGESTONE|BLOCKBUSTER|BLACKFRIDAY|BARCLAYCARD|ACCOUNTANTS|XN--Y9A3AQ|XN--WGBL6A|XN--WGBH1C|XN--UNUP4Y|XN--Q7CE6A|XN--PSSY2U|XN--O3CW4H|XN--MXTQ1M|XN--KPUT3I|XN--IO0A7I|XN--FIQZ9S|XN--FIQS8S|XN--FIQ64B|XN--CZRU2D|XN--CZRS0T|XN--CG4BKI|XN--C2BR7G|XN--9ET52U|XN--9DBQ2A|XN--90A3AC|XN--80ASWG|XN--5TZM5G|XN--55QX5D|XN--4GBRIM|XN--45Q11C|XN--3PXU8K|XN--30RR7Y|VOLKSWAGEN|VLAANDEREN|UNIVERSITY|TECHNOLOGY|TATAMOTORS|SWIFTCOVER|SCHAEFFLER|RESTAURANT|REPUBLICAN|REALESTATE|PRUDENTIAL|PROTECTION|PROPERTIES|ONYOURSIDE|NEXTDIRECT|NATIONWIDE|MITSUBISHI|MANAGEMENT|INDUSTRIES|IMMOBILIEN|HEALTHCARE|FOUNDATION|EXTRASPACE|EUROVISION|CUISINELLA|CREDITCARD|CONSULTING|CAPITALONE|BOEHRINGER|BNPPARIBAS|BASKETBALL|ASSOCIATES|APARTMENTS|ACCOUNTANT|YODOBASHI|XN--VHQUV|XN--TCKWE|XN--QXA6A|XN--P1ACF|XN--NQV7F|XN--NGBRX|XN--L1ACC|XN--J1AMH|XN--J1AEF|XN--FHBEI|XN--E1A4C|XN--D1ALF|XN--C1AVG|XN--90AIS|VACATIONS|TRAVELERS|STOCKHOLM|STATEFARM|STATEBANK|SOLUTIONS|SHANGRILA|SCJOHNSON|RICHARDLI|PRAMERICA|PASSAGENS|PANASONIC|MICROSOFT|MELBOURNE|MARSHALLS|MARKETING|LIFESTYLE|LANDROVER|LANCASTER|KUOKGROUP|INSURANCE|INSTITUTE|HOMESENSE|HOMEGOODS|HOMEDEPOT|HISAMITSU|GOLDPOINT|FURNITURE|FUJIXEROX|FRONTDOOR|FRESENIUS|FIRESTONE|FINANCIAL|FAIRWINDS|EQUIPMENT|EDUCATION|DIRECTORY|COMMUNITY|CHRISTMAS|BLOOMBERG|BARCELONA|AQUARELLE|ANALYTICS|AMSTERDAM|ALLFINANZ|ALFAROMEO|ACCENTURE|YOKOHAMA|XN--QXAM|XN--P1AI|XN--NODE|XN--90AE|WOODSIDE|VERISIGN|VENTURES|VANGUARD|TRAINING|SUPPLIES|STCGROUP|SOFTWARE|SOFTBANK|SHOWTIME|SHOPPING|SERVICES|SECURITY|SAMSCLUB|SAARLAND|RELIANCE|REDSTONE|PROPERTY|PLUMBING|PICTURES|PHARMACY|PARTNERS|OBSERVER|MORTGAGE|MERCKMSD|MEMORIAL|MCKINSEY|MASERATI|MARRIOTT|LUNDBECK|LIGHTING|JPMORGAN|ISTANBUL|IPIRANGA|INFINITI|HOSPITAL|HOLDINGS|HELSINKI|HDFCBANK|GUARDIAN|GRAPHICS|GRAINGER|GOODYEAR|FRONTIER|FOOTBALL|FIRMDALE|FIDELITY|FEEDBACK|EXCHANGE|ETISALAT|ERICSSON|ENGINEER|DOWNLOAD|DISCOVER|DISCOUNT|DIAMONDS|DEMOCRAT|DELOITTE|DELIVERY|COMPUTER|COMMBANK|CLOTHING|CLINIQUE|CLEANING|CITYEATS|CIPRIANI|CATHOLIC|CATERING|CAPETOWN|BUSINESS|BUILDERS|BUDAPEST|BRUSSELS|BROADWAY|BRADESCO|BOUTIQUE|BASEBALL|BARGAINS|BAREFOOT|BARCLAYS|ATTORNEY|ALLSTATE|AIRFORCE|ABUDHABI|ZUERICH|YOUTUBE|YAMAXUN|XFINITY|WINNERS|WINDOWS|WHOSWHO|WEDDING|WEBSITE|WEATHER|WATCHES|WANGGOU|WALMART|TRADING|TOSHIBA|TIFFANY|TICKETS|THEATRE|THEATER|TEMASEK|SYSTEMS|SURGERY|SUPPORT|STORAGE|STAPLES|SINGLES|SHIKSHA|SCIENCE|SCHWARZ|SCHMIDT|SANDVIK|SAMSUNG|REXROTH|REVIEWS|RENTALS|RECIPES|REALTOR|POLITIE|PIONEER|PHILIPS|ORIGINS|ORGANIC|OLDNAVY|OKINAWA|NEUSTAR|NETWORK|NETFLIX|NETBANK|MONSTER|MARKETS|LINCOLN|LIMITED|LECLERC|LATROBE|LASALLE|LANXESS|LACAIXA|KOMATSU|KITCHEN|JUNIPER|JEWELRY|ISMAILI|HYUNDAI|HOTMAIL|HOTELES|HOSTING|HOLIDAY|HITACHI|HANGOUT|HAMBURG|GUITARS|GROCERY|GODADDY|GENTING|GALLERY|FUJITSU|FROGANS|FORSALE|FLOWERS|FLORIST|FLIGHTS|FITNESS|FISHING|FINANCE|FERRERO|FERRARI|FASHION|FARMERS|EXPRESS|EXPOSED|DOMAINS|DIGITAL|DENTIST|CRUISES|CRICKET|COURSES|COUPONS|COUNTRY|CORSICA|COOKING|CONTACT|COMPARE|COMPANY|COMCAST|COLOGNE|COLLEGE|CLUBMED|CITADEL|CHINTAI|CHARITY|CHANNEL|CAREERS|CARAVAN|CAPITAL|BUGATTI|BROTHER|BOOKING|BESTBUY|BENTLEY|BAUHAUS|BANAMEX|AVIANCA|AUSPOST|AUDIBLE|AUCTION|ATHLETA|ANDROID|ALIBABA|AGAKHAN|ACADEMY|ABOGADO|ZAPPOS|YANDEX|YACHTS|XIHUAN|WEBCAM|WALTER|VUELOS|VOYAGE|VOTING|VISION|VIRGIN|VILLAS|VIKING|VIAJES|UNICOM|TRAVEL|TOYOTA|TKMAXX|TJMAXX|TIENDA|TENNIS|TATTOO|TARGET|TAOBAO|TAIPEI|SYDNEY|SWATCH|SUZUKI|SUPPLY|STUDIO|STREAM|SOCIAL|SOCCER|SHOUJI|SELECT|SECURE|SEARCH|SCHULE|SCHOOL|SANOFI|SAKURA|SAFETY|RYUKYU|ROGERS|ROCHER|REVIEW|REPORT|REPAIR|REISEN|REALTY|RACING|QUEBEC|PICTET|PHYSIO|PHOTOS|PFIZER|OTSUKA|ORANGE|ORACLE|ONLINE|OLAYAN|OFFICE|NOWRUZ|NORTON|NISSAY|NISSAN|NATURA|NAGOYA|MUTUAL|MUSEUM|MOSCOW|MORMON|MONASH|MOBILE|MATTEL|MARKET|MAKEUP|MAISON|MADRID|LUXURY|LONDON|LOCKER|LIVING|LEFRAK|LAWYER|LATINO|LANCIA|KOSHER|KINDLE|KINDER|KAUFEN|JUEGOS|JOBURG|JAGUAR|INTUIT|INSURE|IMAMAT|HUGHES|HOTELS|HOCKEY|HIPHOP|HERMES|HEALTH|GRATIS|GOOGLE|GLOBAL|GIVING|GEORGE|GARDEN|GALLUP|FUTBOL|FLICKR|FAMILY|EXPERT|EVENTS|ESTATE|ENERGY|EMERCK|DURBAN|DUPONT|DUNLOP|DOCTOR|DIRECT|DESIGN|DENTAL|DEGREE|DEALER|DATSUN|DATING|CRUISE|CREDIT|COUPON|CONDOS|COMSEC|COFFEE|CLINIC|CLAIMS|CIRCLE|CHURCH|CHROME|CHANEL|CENTER|CASINO|CAREER|CAMERA|BROKER|BOSTON|BOSTIK|BHARTI|BERLIN|BEAUTY|BAYERN|AUTHOR|ARAMCO|ANQUAN|AMAZON|ALSTOM|ALSACE|ALIPAY|AIRTEL|AIRBUS|AGENCY|AFRICA|ABBVIE|ABBOTT|ABARTH|YAHOO|XEROX|WORLD|WORKS|WEIBO|WEBER|WATCH|WALES|VOLVO|VODKA|VIDEO|VEGAS|UBANK|TUSHU|TUNES|TRUST|TRADE|TOURS|TOTAL|TORAY|TOOLS|TOKYO|TODAY|TMALL|TIROL|TIRES|TATAR|SWISS|SUCKS|STYLE|STUDY|STORE|STADA|SPORT|SPACE|SOLAR|SMILE|SMART|SLING|SKYPE|SHOES|SHELL|SHARP|SEVEN|SENER|SALON|RUGBY|RODEO|ROCKS|RICOH|REISE|REHAB|RADIO|QUEST|PROMO|PRIME|PRESS|PRAXI|POKER|PLACE|PIZZA|PHOTO|PHONE|PARTY|PARTS|PARIS|OSAKA|OMEGA|NOWTV|NOKIA|NINJA|NIKON|NEXUS|MOVIE|MONEY|MIAMI|MEDIA|MANGO|MACYS|LOTTO|LOTTE|LOCUS|LOANS|LIXIL|LIPSY|LINDE|LILLY|LEXUS|LEGAL|LEASE|LAMER|KYOTO|KOELN|JETZT|IVECO|IRISH|IKANO|HYATT|HOUSE|HORSE|HONDA|HOMES|GUIDE|GUCCI|GROUP|GRIPE|GREEN|GMAIL|GLOBO|GLASS|GLADE|GIVES|GIFTS|GAMES|GALLO|FORUM|FOREX|FINAL|FEDEX|FAITH|EPSON|EMAIL|EDEKA|EARTH|DUBAI|DRIVE|DELTA|DEALS|DANCE|DABUR|CYMRU|CROWN|CODES|COACH|CLOUD|CLICK|CITIC|CISCO|CHEAP|CHASE|CARDS|CANON|BUILD|BOSCH|BOATS|BLACK|BINGO|BIBLE|BEATS|BAIDU|AZURE|AUTOS|AUDIO|ARCHI|APPLE|AMICA|AMFAM|AETNA|ADULT|ACTOR|ZONE|ZERO|ZARA|YOGA|XBOX|WORK|WINE|WIKI|WIEN|WEIR|WANG|VOTO|VOTE|VIVO|VIVA|VISA|VANA|TUBE|TOYS|TOWN|TIPS|TIAA|TEVA|TECH|TEAM|TAXI|TALK|SURF|STAR|SPOT|SONY|SONG|SOHU|SNCF|SKIN|SITE|SINA|SILK|SHOW|SHOP|SHIA|SHAW|SEXY|SEEK|SEAT|SCOT|SAXO|SAVE|SARL|SALE|SAFE|RUHR|RSVP|ROOM|RMIT|RICH|REST|RENT|REIT|READ|RAID|QPON|PROF|PROD|POST|PORN|POHL|PLUS|PLAY|PINK|PING|PICS|PCCW|PARS|PAGE|OPEN|OLLO|NIKE|NICO|NEXT|NEWS|NAVY|NAME|MOTO|MODA|MOBI|MINT|MINI|MENU|MEME|MEET|MAIF|LUXE|LTDA|LOVE|LOFT|LOAN|LIVE|LINK|LIMO|LIKE|LIFE|LIDL|LGBT|LEGO|LAND|KRED|KPMG|KIWI|KDDI|JPRS|JOBS|JEEP|JAVA|ITAU|INFO|IMMO|IMDB|IEEE|ICBC|HSBC|HOST|HGTV|HERE|HELP|HDFC|HAUS|HAIR|GURU|GUGE|GOOG|GOLF|GOLD|GMBH|GIFT|GGEE|GENT|GBIZ|GAME|FUND|FREE|FORD|FOOD|FLIR|FISH|FIRE|FILM|FIDO|FIAT|FAST|FARM|FANS|FAIL|FAGE|ERNI|DVAG|DUCK|DOCS|DISH|DIET|DESI|DELL|DEAL|DCLK|DATE|DATA|CYOU|COOP|COOL|CLUB|CITY|CITI|CHAT|CERN|CBRE|CASH|CASE|CASA|CARS|CARE|CAMP|CALL|CAFE|BUZZ|BOOK|BOND|BOFA|BLUE|BLOG|BING|BIKE|BEST|BEER|BBVA|BANK|BAND|BABY|AUTO|AUDI|ASIA|ASDA|ARTE|ARPA|ARMY|ARAB|AMEX|ALLY|AKDN|AERO|ADAC|ABLE|AARP|ZIP|YUN|YOU|XYZ|XXX|XIN|WTF|WTC|WOW|WME|WIN|WED|VIP|VIN|VIG|VET|UPS|UOL|UNO|UBS|TVS|TUI|TRV|TOP|TJX|THD|TEL|TDK|TCI|TAX|TAB|STC|SRL|SPA|SOY|SKY|SKI|SFR|SEX|SEW|SES|SCB|SCA|SBS|SBI|SAS|SAP|RWE|RUN|RIP|RIO|RIL|REN|RED|QVC|PWC|PUB|PRU|PRO|PNC|PIN|PID|PHD|PET|PAY|OVH|OTT|ORG|OOO|ONL|ONG|ONE|OFF|OBI|NYC|NTT|NRW|NRA|NOW|NHK|NGO|NFL|NEW|NET|NEC|NBA|NAB|MTR|MTN|MSD|MOV|MOM|MOI|MOE|MMA|MLS|MLB|MIT|MIL|MEN|MED|MBA|MAP|MAN|LTD|LPL|LOL|LLP|LLC|LDS|LAW|LAT|KRD|KPN|KIM|KIA|KFH|JOY|JOT|JNJ|JMP|JLL|JIO|JCB|ITV|IST|INT|INK|ING|INC|IFM|ICU|ICE|IBM|HOW|HOT|HKT|HIV|HBO|GOV|GOT|GOP|GOO|GMX|GMO|GLE|GEA|GDN|GAY|GAP|GAL|FYI|FUN|FTR|FRL|FOX|FOO|FLY|FIT|FAN|EUS|ESQ|EDU|ECO|EAT|DVR|DTV|DOT|DOG|DNP|DIY|DHL|DEV|DDS|DAY|DAD|CSC|CRS|CPA|COM|CFD|CFA|CEO|CBS|CBN|CBA|CAT|CAR|CAM|CAL|CAB|BZH|BUY|BOX|BOT|BOO|BOM|BMW|BMS|BIZ|BIO|BID|BET|BCN|BCG|BBT|BBC|BAR|AXA|AWS|ART|APP|AOL|ANZ|AIG|AFL|AEG|ADS|ACO|ABC|ABB|AAA|ZW|ZM|ZA|YT|YE|WS|WF|VU|VN|VI|VG|VE|VC|VA|UZ|UY|US|UK|UG|UA|TZ|TW|TV|TT|TR|TO|TN|TM|TL|TK|TJ|TH|TG|TF|TD|TC|SZ|SY|SX|SV|SU|ST|SS|SR|SO|SN|SM|SL|SK|SJ|SI|SH|SG|SE|SD|SC|SB|SA|RW|RU|RS|RO|RE|QA|PY|PW|PT|PS|PR|PN|PM|PL|PK|PH|PG|PF|PE|PA|OM|NZ|NU|NR|NP|NO|NL|NI|NG|NF|NE|NC|NA|MZ|MY|MX|MW|MV|MU|MT|MS|MR|MQ|MP|MO|MN|MM|ML|MK|MH|MG|ME|MD|MC|MA|LY|LV|LU|LT|LS|LR|LK|LI|LC|LB|LA|KZ|KY|KW|KR|KP|KN|KM|KI|KH|KG|KE|JP|JO|JM|JE|IT|IS|IR|IQ|IO|IN|IM|IL|IE|ID|HU|HT|HR|HN|HM|HK|GY|GW|GU|GT|GS|GR|GQ|GP|GN|GM|GL|GI|GH|GG|GF|GE|GD|GB|GA|FR|FO|FM|FK|FJ|FI|EU|ET|ES|ER|EG|EE|EC|DZ|DO|DM|DK|DJ|DE|CZ|CY|CX|CW|CV|CU|CR|CO|CN|CM|CL|CK|CI|CH|CG|CF|CD|CC|CA|BZ|BY|BW|BV|BT|BS|BR|BO|BN|BM|BJ|BI|BH|BG|BF|BE|BD|BB|BA|AZ|AX|AW|AU|AT|AS|AR|AQ|AO|AM|AL|AI|AG|AF|AE|AD|AC|SJC|RNO|LAX)(?:\:([1-9][0-9]{0,3}|[1-5][0-9]{4}|6[0-4][0-9]{3}|65[0-4][0-9]{2}|655[0-2][0-9]|6553[0-5])|\b|(?=_))(?!@(?:[a-z\d-]+\.)+[a-z]{2,}))(?:(?:[.,=(+$!*]|&(?:amp|#x27);)?\/(?:[-\w$@.+!*:(),=%~]|&(?:amp|#x27);){60,80}(?:[-\w~@:%)]|&(?:amp|#x27);)|\/)*(?:(?:\?(?:[-\w$@.+!*()\/,=%{}:;\[\]\|_|~]|&(?:amp|#x27);)*)?|(?:#(?:[-\w$@.+!*()[\],=%;\/:~]|&(?:amp|#x27);)*)?)*)))\.(?:mp4|mov|avi|wmv|flv|mkv|webm|3gp|m4v|mpg|mpeg|ogv))\)(?![^<]*(<\/pre>|<\/code>))/gi;

const replacement = '===='

const start = new Date();
log('    start:', start)
const result = markdown.replace(regexp, replacement);
const end = new Date();
log('      end:', end)
const diff = end - start;
log('time diff:', `${diff} ms`, `${diff / 1000} s`)

log('result', result)


test video

https://github.com/user-attachments/assets/99956a3e-5582-453b-9bf6-a0900130920d

What changes do you think we should make in order to solve the problem?

Move the VIDEO_EXTENSIONS logic from the regexp to a separate match.

  1. Remove the VIDEO_EXTENSIONS part, that's changing regexp for video to:
const MARKDOWN_VIDEO_REGEX = new RegExp(
   `\\!(?:\\[([^\\][]*(?:\\[[^\\][]*][^\\][]*)*)])?\\((${UrlPatterns.MARKDOWN_URL_REGEX})\\)(?![^<]*(<\\/pre>|<\\/code>))`,
   'gi',
);

which is actually identical with regexp for image 2. add process field to video rule, that is adding:

                process: (textToProcess, replacement, shouldKeepRawInput) => {
                    const videoPart = new RegExp(`\\b\\.(?:${Constants.CONST.VIDEO_EXTENSIONS.join('|')})\\b`);
                    if (!textToProcess.match(videoPart)) {
                        return textToProcess;
                    }
                    return this.replaceTextWithExtras(textToProcess, MARKDOWN_VIDEO_REGEX, EXTRAS_DEFAULT, replacement);
                },

below video rule here

  1. To verify:
    • build expensify-common with command npm run build
    • copy file dist/ExpensiMark.js to react-native-live-markdown/parser/node_modules/expensify-common/dist/ExpensiMark.js --- inside project react-native-live-markdown
    • start ios example and verify

Notice: This solution will work fine if the user is pasting a valid video markdown, a valid image markdown or a valid link, but does not work if a user paste an invalid video markdown, an invalid image markdown or an invalid link and let the regexp executor have to go through too many possibilities before failing.

What alternative solutions did you explore? (Optional)

We can change the url regular expressions to a simple pattern to avoid 'too many possibilities before failing'.

  1. Add following code below Url.js here
const SIMPLE_URL_REGEX = '(https?:\/\/[0-9a-z\.-\/?=&#%]+)';
  1. Export it in Url.js file. That is adding SIMPLE_URL_REGEX, below Url.js here

  2. Add following code below Url.d.ts here

declare const SIMPLE_URL_REGEX: '(https?:\/\/[0-9a-z\.-\/?=&#%]+)';
  1. Add following code below Url.d.ts here
SIMPLE_URL_REGEX,
  1. Change all url related regexp usage in file ExpensiMark.ts to UrlPatterns.SIMPLE_URL_REGEX:
    • here, change allUrlPatterns.MARKDOWN_URL_REGEX to UrlPatterns.SIMPLE_URL_REGEX
    • here, changeUrlPatterns.MARKDOWN_URL_REGEX to UrlPatterns.SIMPLE_URL_REGEX
    • here, change ^${UrlPatterns.LOOSE_URL_REGEX}$|^${UrlPatterns.URL_REGEX}$ to ^${UrlPatterns.SIMPLE_URL_REGEX}$
    • here, change UrlPatterns.MARKDOWN_URL_REGEX to UrlPatterns.SIMPLE_URL_REGEX
  2. Sync the regexp change to back end to prevent similar issues like this one

Other

  • I have not opened an issue to nodejs/hermes/cpython, guess I will, although I don't think it's fixable.
  • regexp implementation:
    • nodejs rely on v8, and v8 implement regexp internally
    • hermes uses the ICU's regular expression engine
    • cpython implement regexp internally

badeggg avatar Nov 15 '24 10:11 badeggg

That is saying to replace UrlPatterns.MARKDOWN_URL_REGEX with UrlPatterns.LOOSE_URL_REGEX, which has a reasonable regexp part.

App still gets stuck. 🤔

https://github.com/user-attachments/assets/e6ad59d1-e7b4-4ec2-8219-d62102da8bda

ntdiary avatar Nov 16 '24 08:11 ntdiary

App still gets stuck.

Yeah, it's frustrating. I noticed the size of the regexp is not the cause. I guess it's because some part of the regexp 'somehow' trigger an edge case of regexp executor in hermes. I can find which part(or parts) is causing the stall, but it's definitely not the root cause.

badeggg avatar Nov 16 '24 09:11 badeggg

I have updated my proposal

badeggg avatar Nov 16 '24 13:11 badeggg

I think it would be better to provide an accurate and clear root cause, it will help us determine how to prevent similar issues in the future. :)

ntdiary avatar Nov 18 '24 14:11 ntdiary

I have updated my proposal, with a more accurate root cause.

badeggg avatar Nov 19 '24 10:11 badeggg

I have updated my proposal, with a more accurate root cause.

@badeggg, thank you! will review again about 12 hours. :)

ntdiary avatar Nov 19 '24 14:11 ntdiary

A regexp looks like (\w*)*m can lead to stuck situation

Hi, @badeggg, is this a pattern you found in expensify-common, or is it just an example?

ntdiary avatar Nov 20 '24 09:11 ntdiary

It's a pattern in expensify-common, part of regexp for video match this pattern. I will point out it latter, I'm kind busy now

badeggg avatar Nov 20 '24 09:11 badeggg

📣 It's been a week! Do we have any satisfactory proposals yet? Do we need to adjust the bounty for this issue? 💸

melvin-bot[bot] avatar Nov 20 '24 16:11 melvin-bot[bot]

I have updated my proposal. Thank you for your patience.

I also updated my opinion. The root cause is regexp execution get stuck when too many possibilities before failing, pattern (\w*)*m is just one of the cases, a simple case though. Too many possibilities is usually created in a regexp when set a quantifier inside a group and set another quantifier just after the group definition.

We can find few cases in our project here: Screenshot 2024-11-21 at 15 53 47

badeggg avatar Nov 21 '24 08:11 badeggg

Notice: This solution will work fine if the user is pasting a valid video markdown, a valid image markdown or a valid link, but does not work if a user paste an invalid video markdown, an invalid image markdown or an invalid link and let the regexp executor have to go through too many possibilities before failing.

I was just about to upload a video to present this problem. When I add a space, the app still freezes. I think it would be better if this situation could also be addressed. 🤔

https://github.com/user-attachments/assets/17b72ea5-0edb-4ca8-8a67-176664c1e842

ntdiary avatar Nov 21 '24 08:11 ntdiary

The alternative solution in my proposal can solve this problem, change the super complicated url pattern to a simple url pattern like: https?:\/\/[0-9a-z\.-\/?=&#]+. But it's a big change, guess this solution needs an internal check.

I think there is not a perfect solution.

badeggg avatar Nov 21 '24 08:11 badeggg

Hi, I have updated my proposal, just added tech details for the alternative solution.

badeggg avatar Nov 21 '24 13:11 badeggg

@badeggg, using your primary solution, a few unit tests fail. image

As for your alternative solution, I don’t think we’d want to simplify url pattern to that extent, since domain name has RFC standards. The restriction on domain name formats is intentional.

BTW, I mentioned the similar problem, and also included an article link to help understand catastrophic backtracking. Finally, Considering that we’re still relying on regex, I believe a more general and suitable approach would be to prevent meaningless backtracking (for a human).

For example, here is a regex I test:

const LOOSE_URL_WEBSITE_REGEX = `${URL_PROTOCOL_REGEX}([a-z0-9](?=(?<a>[-a-z0-9]*[a-z0-9])?)\\k<a>\\.)*(?:[a-z0-9](?=(?<b>[-a-z0-9]*[a-z0-9])?)\\k<b>)(?:\\:${ALLOWED_PORTS}|\\b|(?=_))`;

You can test it as well, maybe you can identify shortcomings in this regex or find a better solution. :)

ntdiary avatar Nov 21 '24 14:11 ntdiary

  • Guess my description "regexp execution get stuck when too many possibilities before failing" has a formal name: catastrophic backtracking.
  • I looked into the regex, the reason for catastrophic backtracking is two pairs of nesting quantifiers, we can flatten them to avoid 'too many possibilities', but few simple url patterns can not be covered.
  • From my view
    • It's almost impossible to write a single regexp to cover all possibilities of the standard defined url, while avoid catastrophic backtracking in js
    • It's not necessary for us to accurately cover all possibilities of the standard defined url
    • So I stand with the alternative solution of my proposal: use a simple regexp

badeggg avatar Nov 22 '24 09:11 badeggg

  • Guess my description "regexp execution get stuck when too many possibilities before failing" has a formal name: catastrophic backtracking.

  • I looked into the regex, the reason for catastrophic backtracking is two pairs of nesting quantifiers, we can flatten them to avoid 'too many possibilities', but few simple url patterns can not be covered.

  • From my view

    • It's almost impossible to write a single regexp to cover all possibilities of the standard defined url, while avoid catastrophic backtracking in js
    • It's not necessary for us to accurately cover all possibilities of the standard defined url
    • So I stand with the alternative solution of my proposal: use a simple regexp

Sorry, I really appreciate your continuous efforts to improve the proposal during discussions. However, at this point, I personally can't approve your proposal, as it would cause some of our expected unit tests to fail. For example, it would parse an image markdow link as video html tag, or treat url like http://-77.com and http://test?test as valid. cc @garrettmknight to see if they have any different thoughts. :)

ntdiary avatar Nov 22 '24 15:11 ntdiary

it would parse an image markdown link as video html tag

I think there is not a clear rule that define what kind of urls are video type and others are image type. The type of the content behind a url can only be determined after requested from server. We can only guess. Currently, we guess a link is video type by checking whether there is a video file extension string(e.g. .mp4) in the url string.

or treat url like http://-77.com and http://test?test as valid

Yeah, It's a concession. I think it's user's responsibility to make sure a url is valid, not us. Just when I typing, I notice that github is recognizing http://test?test as a url link string.

Or maybe wait for more proposals.

badeggg avatar Nov 23 '24 10:11 badeggg

@garrettmknight, @ntdiary Whoops! This issue is 2 days overdue. Let's get this updated quick!

melvin-bot[bot] avatar Nov 26 '24 09:11 melvin-bot[bot]

Let's give this a little more time for proposals - I'll also increase the bounty. If we don't get more proposals in a week or so, we can consider @badeggg's current solution as a viable option.

@badeggg thanks for working on it thus far - these regex issues tend to get pretty tricky.

garrettmknight avatar Nov 26 '24 13:11 garrettmknight

Upwork job price has been updated to $500

melvin-bot[bot] avatar Nov 26 '24 13:11 melvin-bot[bot]