
-erase command warns on guessing DBF encoding needlessly

Open nvkelso opened this issue 2 years ago • 1 comments

Lower priority: in 0.5.116 I now see warnings about detecting the encoding of DBF files used in -erase commands. Not a big deal, but it creates noise and confusion on the command line during a run.

Erase should be a geometry-only operation, so a warning about the DBF component seems irrelevant.

Oddly, intermediate/ne_10m_lakes_big.shp should have a CPG sidecar because it's generated by Mapshaper, but even when it is constructed as follows there isn't one:

intermediate/ne_10m_lakes_big.shp: 10m_physical/ne_10m_lakes.shp
	mkdir -p intermediate
	mapshaper -i 10m_physical/ne_10m_lakes.shp encoding=utf8 \
		-filter 'scalerank <= 0' + \
		-o intermediate/ne_10m_lakes_big.shp \

Whole thing:

mapshaper -i 10m_cultural/ne_10m_admin_0_scale_rank.shp \
		-dissolve 'sr_adm0_a3' calc='featurecla="Admin-0 country", scalerank = min(scalerank)' \
		-filter 'scalerank !== null' + \
		-join housekeeping/ne_admin_0_details_level_2_countries.dbf encoding=utf8 keys=sr_adm0_a3,ADM0_A3 fields=* \
		-each 'delete sr_adm0_a3' \
		-o 10m_cultural/ne_10m_admin_0_countries.shp \
		-erase intermediate/ne_10m_lakes_big.shp \
		-o 10m_cultural/ne_10m_admin_0_countries_lakes.shp \

[dissolve] Dissolved 4,352 features into 258 features
[filter] Retained 258 of 258 features
[join] Joined data from 261 source records to 258 target records
[join] 13/274 source records could not be joined
[join] 2/258 target records were matched by multiple source records (many-to-one relationship)
[join] Inconsistent values were found in fields [LABELRANK,TYPE,NAME,NAME_LONG,BRK_A3,BRK_NAME,BRK_GROUP,ABBREV,POSTAL,FORMAL_EN,NAME_CIAWF,NOTE_ADM0,NOTE_BRK,NAME_SORT,POP_EST,POP_RANK,POP_YEAR,GDP_MD,GDP_YEAR,ISO_A2,ISO_A2_EH,ISO_A3,ISO_A3_EH,ISO_N3,ISO_N3_EH,UN_A3,WB_A2,WB_A3,WOE_ID,WOE_ID_EH,WOE_NOTE,ADM0_ISO,ADM0_DIFF,ADM0_TLC,ADM0_A3_FR,ADM0_A3_RU,ADM0_A3_ES,ADM0_A3_CN,ADM0_A3_TW,ADM0_A3_IN,ADM0_A3_NP,ADM0_A3_PK,ADM0_A3_DE,ADM0_A3_GB,ADM0_A3_BR,ADM0_A3_PS,ADM0_A3_SA,ADM0_A3_EG,ADM0_A3_MA,ADM0_A3_PT,ADM0_A3_AR,ADM0_A3_JP,ADM0_A3_KO,ADM0_A3_VN,ADM0_A3_TR,ADM0_A3_ID,ADM0_A3_PL,ADM0_A3_GR,ADM0_A3_IT,ADM0_A3_NL,ADM0_A3_SE,ADM0_A3_BD,ADM0_A3_UA,NAME_LEN,LONG_LEN,ABBREV_LEN,MIN_ZOOM,MIN_LABEL,MAX_LABEL,LABEL_X,LABEL_Y,NE_ID,WIKIDATAID,NAME_AR,NAME_BN,NAME_DE,NAME_EN,NAME_ES,NAME_FA,NAME_FR,NAME_EL,NAME_HE,NAME_HI,NAME_HU,NAME_ID,NAME_IT,NAME_JA,NAME_KO,NAME_NL,NAME_PL,NAME_PT,NAME_RU,NAME_SV,NAME_TR,NAME_UK,NAME_UR,NAME_VI,NAME_ZH,NAME_ZHT,FCLASS_ISO,TLC_DIFF,FCLASS_TLC,FCLASS_US,FCLASS_IL,FCLASS_PS,FCLASS_MA,FCLASS_AR,FCLASS_JP,FCLASS_KO,FCLASS_VN,FCLASS_TR,FCLASS_ID,BRK_DIFF,HOMEPART,FCLASS_RU,FCLASS_CN,FCLASS_UA] during many-to-one join. Values in the first joining record were used.
[o] Wrote 10m_cultural/ne_10m_admin_0_countries.shp
[o] Wrote 10m_cultural/ne_10m_admin_0_countries.shx
[o] Wrote 10m_cultural/ne_10m_admin_0_countries.dbf
[o] Wrote 10m_cultural/ne_10m_admin_0_countries.prj
[erase] Detected DBF text encoding: utf8
Sample text containing non-ascii characters:
  بحيرة الدب العظيم             গ্রেট বেয়ার লেক
  Großer Bärensee               Μεγάλη Λίμνη των Άρκτων
  ग्रेट बियर झील                Nagy-Medve-tó
  グレートベア湖                       그레이트베어호
  Wielkie Jezioro Niedźwiedzie  Большое Медвежье озеро
  Stora Björnsjön               Büyük Ayı
  Gấu lớn                       大熊湖
  دریاچه گریت بر                ימת הדובים הגדולה
  Велике Ведмеже озеро          گریٹ بیئر لیک
  بحيرة جريت سليف               গ্রেট স্লেভ লেক
  Großer Sklavensee             Μεγάλη Λίμνη των Σκλάβων
  ग्रेट स्लाव झील               Nagy-Rabszolga-tó
  グレートスレーブ湖                     그레이트슬레이브호
  Большое Невольничье озеро     Stora Slavsjön
  Büyük Esir                    Slave Lớn
  大奴湖                           دریاچه گریت اسلیو
  ימת העבדים הגדולה             Велике Невільниче озеро
  گریٹ سلیو جھیل                خليج ماكليود
  ম্যাকলিওড বে                  Bahía McLeod
  Κόλπος ΜακΛέοντ               मैकलियोड खाड़ी
  McLeod-öböl                   マクレオド・ベイ
  맥라우드 베이                       залив Мак-Леод
  Hồ McLeod Bay                 麦克利奥德
  خلیج مک لئود                  מיד
  Маклеод-Бей                   مکلیوڈ بے
[o] Wrote 10m_cultural/ne_10m_admin_0_countries_lakes.shp
[o] Wrote 10m_cultural/ne_10m_admin_0_countries_lakes.shx
[o] Wrote 10m_cultural/ne_10m_admin_0_countries_lakes.dbf
[o] Wrote 10m_cultural/ne_10m_admin_0_countries_lakes.prj

Example taken from https://github.com/nvkelso/natural-earth-vector/blob/master/Makefile#L1009-L1020.

nvkelso avatar May 09 '22 04:05 nvkelso

v0.5.118 prevents the DBF text encoding message that sometimes appeared while running -clip and -erase.

About .cpg files: Mapshaper will use a .cpg file if it is present, but Mapshaper doesn't currently generate .cpg files.
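Since a .cpg file is used when present but is not generated on output, one possible workaround is to write the sidecar by hand after the -o step. The sketch below is only an illustration: `write_cpg` is a hypothetical helper, not part of Mapshaper, and it assumes the DBF really is UTF-8 (as the detection message in the log above reports) and that `UTF-8` is a label the consuming software recognizes.

```shell
# Hypothetical helper (not a Mapshaper feature): write a .cpg sidecar
# next to a Shapefile so that readers do not have to guess the DBF encoding.
# Usage: write_cpg path/to/file.shp [ENCODING]
write_cpg() {
  shp_path="$1"
  # Assumption: the DBF text is UTF-8 unless another label is passed in.
  encoding="${2:-UTF-8}"
  # Replace the .shp suffix with .cpg and write the encoding label,
  # without a trailing newline.
  printf '%s' "$encoding" > "${shp_path%.shp}.cpg"
}

# Example call for the intermediate file from this thread:
# write_cpg intermediate/ne_10m_lakes_big.shp
```

In the Makefile rule above, the call would go on its own recipe line after the mapshaper command, with the function defined in the same shell invocation. Whether other programs accept the same label is exactly the open question discussed below.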

I suppose it's a good idea to start outputting .cpg files. I'm just unsure whether the encoding names that Mapshaper accepts are all recognized by other popular software. For example, Mapshaper accepts iso88591; could it be that some other programs accept ISO8859-1 but not iso88591? I think there's a fair amount of testing to be done before this feature gets added.

mbloch avatar May 16 '22 04:05 mbloch