
Publishing CSV in ArcGIS Online with `publish_parameters` fails

Open cooperjaXC opened this issue 1 month ago • 2 comments

Describe the bug Publishing a CSV with non-null publish_parameters fails. Full details in my post in the ESRI community that's getting little traction.

To Reproduce Steps to reproduce the behavior:

import os, sys, arcgis, pandas as pd, time

df = pd.DataFrame({"state": ["Wyoming"]})
downloads = os.path.join(os.path.expanduser("~"), "Downloads")
tstcsvname = "test_csv"
tst_path = os.path.join(downloads, f"{tstcsvname}.csv")
df.to_csv(tst_path, index=False)

# TODO Enter your own username and password here:
uname = "<your username>"
pwrd = "<your password>"
gis = arcgis.GIS(url="https://www.arcgis.com", username=uname, password=pwrd)

# Get the folder
user = gis.users.me
target_folder = [f for f in user.folders if f.properties["id"] == "Root Folder"][0]

counter = 0
for pp in [None, {"locationType": "none"}]:
    # Check to make sure the CSV/Feature layer isn't there already.
    remote = None
    for itype in ["CSV", "Feature Service"]:
        search = gis.content.search(
            query=f'title: "{tstcsvname}", type: "{itype}"', max_items=1
        )
        if search:
            # If it is, remove it fully for a clean, fresh execution.
            remote = search[0]
            print("Deleting")
            print(remote)
            remote.delete(permanent=True)
    del remote


    # Add test data file
    props = {"title": tstcsvname, "type": "CSV"}
    job = target_folder.add(item_properties=props, file=tst_path)
    remotecsv = job.result()
    # remotecsv = gis.content.add(item_properties=props,file=tst_path)  # This loads a corrupted CSV; not sure what this is about...
    print(remotecsv)

    time.sleep(1.5)

    if not pp:
        # Try 1: Publishing works here
        # WARNING! This will use credits to geocode 1 record
        remotecsv.publish()
        print("Successful publishing without publish_parameters")
        # Whoops, that geocoded when I didn't mean it to. Let's use publish_parameters to avoid that.
    else:
        # Try 2: But if you toggle to this option, it does not.
        print("Try to publish with publish parameters")
        # pp = {"locationType": "none"}
        remotecsv.publish(publish_parameters=pp)

error:

# arcgis==2.4.1.1
Traceback (most recent call last):
  File "~\Downloads\publish_parameters_error.py", line 21, in <module>
    remotecsv.publish(publish_parameters=pp)
  File "~\Python311\site-packages\arcgis\gis\__init__.py", line 16825, in publish
    return self._publish(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Python311\site-packages\arcgis\gis\__init__.py", line 17111, in _publish
    raise Exception("Service name already exists in your org.")
Exception: Service name already exists in your org.

# arcgis==2.4.2
Traceback (most recent call last):
  File "~\Downloads\publish_parameters_error.py", line 27, in <module>
    remotecsv.publish(publish_parameters=pp)
  File "C~\Lib\site-packages\arcgis\gis\__init__.py", line 17877, in publish
    return self._publish(**params)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "~\Lib\site-packages\arcgis\gis\__init__.py", line 18164, in _publish
    raise Exception("Service name already exists in your org.")
Exception: Service name already exists in your org.

Screenshots N/A

Expected behavior The CSV should publish in AGOL as a hosted table layer without any geocoding or geometry/geography.

Platform (please complete the following information):

  • OS: Windows 11
  • Browser: Google Chrome
  • Python API Version: both 2.4.2 & 2.4.1.1

Additional context Per the contributing guidelines, I first sought advice on the community forum, but I had not received any feedback by the time I raised this issue. That post may still generate discussion; if it does, I will keep this issue and that thread in sync as necessary. The text of my original post is pasted below for convenience. Thanks in advance for the help!

See the text of that post here

I recently switched my workflow from using `gis.content.add()` to `Folder.add()`, as the future warnings have been advising. I have one workflow that publishes a non-spatial CSV as a hosted table in ArcGIS Online, which I believe was working as recently as last week with the new `arcgis==2.4.2` version. This week, however, publishing with any non-null `publish_parameters` argument fails with a false report that the service name already exists in the organization. I have tested with various datasets and item names (including item names never before used in the organization), and all have come back with this exception. Any publish parameter triggers this for me (time zone, type, location type). Publishing without the argument or with an empty dictionary does not trigger the issue, but it geocodes any location fields and costs me credits, which is exactly what setting `{"locationType": "none"}` for tables is meant to avoid.

Here's a stripped-down version that triggers the error for me. Just enter your own credentials, ensure you have arcgis==2.4.2 installed, and execute. My error messages are at the bottom of the code block, with environment specs after that. Let me know if I've missed anything. Thanks, all!


(At this point, I provided the same code snippet and error messages as are pasted above in this GH Issue)


System: Windows 11 (2 machines tested)

Python versions tested: 3.11.8, 3.12.4, & 3.12.10

`arcgis` package versions tested: 2.4.1.1 & 2.4.2

`pip freeze`:

annotated-types==0.7.0
arcgis==2.4.2
cachetools==6.2.2
certifi==2025.11.12
cffi==2.0.0
charset-normalizer==3.4.4
click==8.3.1
cloudpickle==3.1.2
colorama==0.4.6
contourpy==1.3.3
cryptography==46.0.3
cycler==0.12.1
dask==2025.2.0
fonttools==4.60.1
fsspec==2025.10.0
geomet==1.1.0
idna==3.11
jaraco.classes==3.4.0
jaraco.context==6.0.1
jaraco.functools==4.3.0
keyring==25.7.0
kiwisolver==1.4.9
locket==1.0.0
lxml==6.0.2
matplotlib==3.10.7
matplotlib-inline==0.2.1
more-itertools==10.8.0
networkx==3.5
numpy==2.3.5
oauthlib==3.3.1
packaging==25.0
pandas==2.3.3
partd==1.4.2
pillow==12.0.0
puremagic==1.30
pyarrow==20.0.0
pycparser==2.23
pydantic==2.12.4
pydantic_core==2.41.5
pylerc==4.0
pyparsing==3.2.5
pyspnego==0.12.0
python-dateutil==2.9.0.post0
pytz==2025.2
pywin32==311
pywin32-ctypes==0.2.3
PyYAML==6.0.3
requests==2.32.5
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
six==1.17.0
sspilib==0.4.0
toolz==1.1.0
traitlets==5.14.3
truststore==0.10.4
typing-inspection==0.4.2
typing_extensions==4.15.0
tzdata==2025.2
ujson==5.11.0
urllib3==2.5.0
websocket-client==1.9.0

cooperjaXC, Nov 20 '25 18:11

@cooperjaXC feed your CSV item into the analyze function to get your publish parameters. Then take the dataset and update the layerInfo's name property to something unique. This normally fixes most issues.
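A minimal sketch of that suggestion, written so it can be tried without a portal connection. It assumes the dict returned by `gis.content.analyze()` has a `publishParameters` key (as quoted later in this thread) and possibly a nested `layerInfo` dict. The helper name `make_table_publish_parameters`, the `rename_layer_info` flag, and the random suffix are hypothetical, introduced only for illustration.

```python
import copy
import re
import uuid


def make_table_publish_parameters(analyzed, base_name, rename_layer_info=False):
    """Return a modified copy of analyzed['publishParameters'] for a hosted table.

    `analyzed` is assumed to be the dict returned by gis.content.analyze().
    """
    params = copy.deepcopy(analyzed["publishParameters"])
    # Replace runs of non-word characters with "_" and append a short random
    # suffix so the resulting service name is unlikely to collide.
    scrubbed = re.sub(r"[\W_]+", "_", base_name).strip("_")
    unique_name = f"{scrubbed}_{uuid.uuid4().hex[:8]}"
    params["name"] = unique_name
    params["locationType"] = "none"  # publish as a table, no geocoding
    # Optionally rename the nested layerInfo as well (assumption: it is a dict).
    if rename_layer_info and isinstance(params.get("layerInfo"), dict):
        params["layerInfo"]["name"] = unique_name
    return params


# Stand-in for an analyze() result; a real one contains many more keys.
analyzed = {
    "publishParameters": {"type": "csv", "name": "data", "layerInfo": {"name": "data"}}
}
params = make_table_publish_parameters(analyzed, "test_csv", rename_layer_info=True)
print(params["name"])
```

With a real item, the result would be passed as `csv_item.publish(publish_parameters=params)`. The input dict is deep-copied, so the analyze result itself is left untouched.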

achapkowski, Nov 20 '25 22:11

@achapkowski, yes, this fixes the issue. Thank you much for the suggestion.

My follow-up question is: why is this fix necessary? The 'name' parameter of the publish_parameters changes the name of the sub-table of the feature layer/table being published. When one publishes a CSV as a hosted table from AGOL's GUI, the sub-table automatically inherits the name of the hosted table. The same thing happens when publishing as a hosted feature layer without any specified publish_parameters in Python; the sub-layer has the same name as the hosted feature layer itself. But when I run gis.content.analyze(item=remotecsv) in my example, I get the following publishParameters for my test_csv.csv:

{'publishParameters': {'type': 'csv', 'name': 'data',....

Even if this is expected behavior when using publish parameters in python (as the documentation pasted below for gis.Item.publish() suggests), why does it not follow the default logic of the other publishing operations? I imagine others could get tripped up by this as well. Is the default name 'data' expected? I tried to look through the gis.Item class (pasted relevant code from v2.4.2 below), but it looks like everything should be set up to follow that default logic of passing the item-to-be-published's name into the publishParameters. What am I missing here?

Thanks again @achapkowski!

Relevant snippets from the package

While debugging this, I found the relevant documentation and logic in the package source (quoted below)
# ~/Lib/site-packages/arcgis/gis/__init__.py, line 17821 (v2.4.2)
            # Publishing a Hosted Table Example

            >>> csv_item = gis.content.get('<csv item id>')
            >>> analyzed = gis.content.analyze(item=csv_item, file_type='csv')

            >>> publish_parameters = analyzed['publishParameters']
            >>> publish_parameters['name'] = 'AVeryUniqueName' # this needs to be updated
            >>> publish_parameters['locationType'] = "none" # this makes it a hosted table

            >>> published_item = csv_item.publish(publish_parameters)

# Line 17908
        if str(output_type).lower() in ["ogc", "ogcfeatureservice"]:
            output_type = "OGCFeatureService"
            file_type = "featureService"
            scrubbed = re.sub("[^a-zA-Z0-9_]+", "", self.title)
            if publish_parameters is None:
                publish_parameters = {}
            publish_parameters.update(
                {"name": publish_parameters.get("name", scrubbed)}

# Line 17980
            elif fileType in ["csv", "excel"] and not overwrite:
                res = self._gis.content.analyze(
                    item=self,
                    file_type=fileType,
                    geocoding_service=geocode_service,
                )
                publish_parameters = res["publishParameters"]
                service_name = re.sub(r"[\W_]+", "_", self["title"])
                publish_parameters.update({"name": service_name})

# Line 18120
        elif (
            fileType
            in [
                "csv",
                "excel",
            ]
            and overwrite is False
        ):  # merge users passed-in publish parameters with analyze results
            publish_parameters_orig = publish_parameters

            res = self._gis.content.analyze(
                item=self,
                file_type=fileType,
                geocoding_service=geocode_service,
            )
            publish_parameters = res["publishParameters"]
            # case for hosted tables
            if (
                "layerInfo" in publish_parameters
                and "layerInfo" in publish_parameters_orig
            ):
                # do general update
                publish_parameters.update(publish_parameters_orig)
            # case for hosted fl
            else:
                # check if layers key exist. If not, add empty array to avoid error in update
                if "layers" not in publish_parameters:
                    publish_parameters["layers"] = []
                    # csv analyze returns layerInfo rather than a layer
                    if "layerInfo" in publish_parameters:
                        publish_parameters["layers"].append(
                            publish_parameters["layerInfo"]
                        )

                # do general update and assign service name
                publish_parameters.update(publish_parameters_orig)
                if "name" in publish_parameters:
                    service_name = re.sub(r"[\W_]+", "_", publish_parameters["name"])
                else:
                    service_name = re.sub(r"[\W_]+", "_", self["title"])
                publish_parameters.update({"name": service_name})
                if not self._gis.content.is_service_name_available(
                    publish_parameters["name"], "featureService"
                ):
                    raise Exception("Service name already exists in your org.")
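To make the effect of that last branch concrete, here is a rough, offline model of the merge. It is only a sketch: `merge_publish_parameters` is a hypothetical stand-in that mirrors the excerpt above, and the `analyzed_params` dict is an assumed, heavily trimmed analyze result based on the `'name': 'data'` output quoted earlier. Under those assumptions, a user dict without a `name` key inherits `data` from the analyze result, and that is the name checked for availability.

```python
import re


def merge_publish_parameters(analyzed_params, user_params, item_title):
    """Mimic the quoted csv/excel merge branch (overwrite is False)."""
    merged = dict(analyzed_params)
    # Hosted-table case: both sides carry layerInfo, so only a plain update.
    if "layerInfo" in merged and "layerInfo" in user_params:
        merged.update(user_params)
        return merged
    # Otherwise make sure a "layers" list exists before the update.
    if "layers" not in merged:
        merged["layers"] = []
        if "layerInfo" in merged:
            merged["layers"].append(merged["layerInfo"])
    merged.update(user_params)
    # The analyze result already contains "name", so the item title is only
    # used when neither side supplies one.
    if "name" in merged:
        service_name = re.sub(r"[\W_]+", "_", merged["name"])
    else:
        service_name = re.sub(r"[\W_]+", "_", item_title)
    merged["name"] = service_name
    return merged


# Assumed, trimmed analyze result for test_csv.csv.
analyzed_params = {"type": "csv", "name": "data", "layerInfo": {"name": "data"}}

merged = merge_publish_parameters(analyzed_params, {"locationType": "none"}, "test_csv")
print(merged["name"])  # data

fixed = merge_publish_parameters(
    analyzed_params, {"locationType": "none", "name": "test_csv"}, "test_csv"
)
print(fixed["name"])  # test_csv
```

If this model is accurate, the availability check runs against `data` rather than the item title whenever the caller omits `name`, which would raise the exception in any organization that already has a service called `data`.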

Reproducible code

You'll need to replace "Try 2" in the code block above with the updated code below.

Code snippet that fixed my issue
# Replace "Try 2" in the code block above with this updated code
# Try 2: But if you toggle to this option, it does not.
print("\nTry to publish with publish parameters")
pp = {"locationType": "none"}
jlabel = 'publishParameters'
ana_result = gis.content.analyze(item=remotecsv)
print(ana_result)
for key, value in pp.items():
    # Copy each user-supplied parameter over the analyzed defaults
    ana_result[jlabel][key] = value
ana_result[jlabel]['name'] = remotecsv.title  # Grab that same name as the CSV file name
print(ana_result)
remotecsv.publish(publish_parameters=ana_result[jlabel])
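As a possible extra safeguard, the name could be checked before publishing using the same `gis.content.is_service_name_available(name, "featureService")` call that appears in the package excerpt above. The sketch below is an assumption-laden illustration: `first_available_service_name` and its numeric-suffix fallback are hypothetical, and `fake_gis` is a stand-in object so the example runs without credentials.

```python
import re
from types import SimpleNamespace


def first_available_service_name(gis, title, max_tries=20):
    """Return the scrubbed title, or title_1, title_2, ... if it is taken."""
    # Same scrubbing pattern as the quoted package code.
    base = re.sub(r"[\W_]+", "_", title)
    candidate = base
    for attempt in range(1, max_tries + 1):
        if gis.content.is_service_name_available(candidate, "featureService"):
            return candidate
        candidate = f"{base}_{attempt}"
    raise RuntimeError(f"No available service name found for {title!r}")


# Stand-in for a logged-in GIS object; a real one queries the organization.
taken = {"data", "test_csv"}
fake_gis = SimpleNamespace(
    content=SimpleNamespace(
        is_service_name_available=lambda name, service_type: name not in taken
    )
)

print(first_available_service_name(fake_gis, "test_csv"))  # test_csv_1
```

With a real connection, the returned name would replace `remotecsv.title` in the `ana_result[jlabel]['name']` assignment above.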

cooperjaXC, Nov 21 '25 23:11