clust
Type conversion errors when reading replicates file
Hello,
I have been unable to include alphanumeric text in fields 2 or 3 of the replicates file without encountering an error:
| Analysis started at: Thursday 12 May 2022 (15:44:38) |
| 1. Reading dataset(s) |
Traceback (most recent call last):
File "pandas/_libs/parsers.pyx", line 1113, in pandas._libs.parsers.TextReader._convert_tokens
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/anaconda3/envs/clust/bin/clust", line 10, in <module>
sys.exit(main())
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/__main__.py", line 102, in main
clustpipeline.clustpipeline(args.datapath, args.m, args.r, args.n, args.o, args.K, args.t,
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/clustpipeline.py", line 86, in clustpipeline
(X, replicates, Genes, datafiles) = io.readDatasetsFromDirectory(datapath, delimiter='\t| |, |; |,|;', skiprows=1, skipcolumns=1,
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/scripts/io.py", line 46, in readDatasetsFromDirectory
datafilesread = readDataFromFiles(datafileswithpath, delimiter, float, skiprows, skipcolumns, returnSkipped)
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/scripts/io.py", line 204, in readDataFromFiles
X[l] = pdreadcsv_regexdelim(datafiles[l], delimiter=delimiter, dtype=dtype, skiprows=skiprows,
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/scripts/io.py", line 239, in pdreadcsv_regexdelim
result = pd.read_csv(StringIO('\n'.join(re.sub(delimiter, '\t', str(x)) for x in f)),
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
return _read(filepath_or_buffer, kwds)
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 581, in _read
return parser.read(nrows)
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1254, in read
index, columns, col_dict = self._engine.read(nrows)
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 225, in read
chunks = self._reader.read_low_memory(nrows)
File "pandas/_libs/parsers.pyx", line 805, in pandas._libs.parsers.TextReader.read_low_memory
File "pandas/_libs/parsers.pyx", line 883, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 1026, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas/_libs/parsers.pyx", line 1119, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: could not convert string to float: 'B'
Here is the example problematic replicates file:
clust_bio_comb.txt A 1.x,1.y
clust_bio_comb.txt B 2.x,2.y
clust_bio_comb.txt C 3.x,3.y
clust_bio_comb.txt D 4.x,4.y
clust_bio_comb.txt E 5.x,5.y
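A note on reading the traceback: the failure is raised from readDatasetsFromDirectory, which reads the files in the data directory as floats with skiprows=1 and skipcolumns=1, not from the replicates reader. One plausible reading is that the replicates file itself is being picked up as a data file. The sketch below is only an approximation of that parsing step (it is not clust's code; the function names are hypothetical), useful for checking which files in a directory contain cells that cannot be converted to float.

```python
import re
from pathlib import Path

# Delimiter regex copied from the call shown in the traceback.
DELIMITER = r'\t| |, |; |,|;'


def non_numeric_cells(text, skiprows=1, skipcolumns=1):
    """Return (line_number, value) pairs for cells that float() rejects.

    Hypothetical helper: mimics skipping the header row and the first
    column, then attempts float() on every remaining cell.
    """
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if lineno <= skiprows or not line.strip():
            continue
        cells = [c for c in re.split(DELIMITER, line.strip()) if c != '']
        for cell in cells[skipcolumns:]:
            try:
                float(cell)
            except ValueError:
                bad.append((lineno, cell))
    return bad


def check_directory(datapath):
    """Map each file name in datapath to its non-numeric cells (if any)."""
    report = {}
    for path in sorted(Path(datapath).iterdir()):
        if path.is_file():
            bad = non_numeric_cells(path.read_text(errors='replace'))
            if bad:
                report[path.name] = bad
    return report


# The first two lines of the replicates file from this post.
sample = "clust_bio_comb.txt A 1.x,1.y\nclust_bio_comb.txt B 2.x,2.y\n"
print(non_numeric_cells(sample))
```

With the first line skipped as a header, the first rejected cell in the sample is 'B' on line 2, which matches the value named in the ValueError.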
If I then convert all names in fields 2 and 3 to integers, I run into another error:
| Analysis started at: Thursday 12 May 2022 (15:50:20) |
| 1. Reading dataset(s) |
Traceback (most recent call last):
File "/opt/anaconda3/envs/clust/bin/clust", line 10, in <module>
sys.exit(main())
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/__main__.py", line 102, in main
clustpipeline.clustpipeline(args.datapath, args.m, args.r, args.n, args.o, args.K, args.t,
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/clustpipeline.py", line 92, in clustpipeline
(replicatesIDs, conditions) = io.readReplicates(replicatesfile, datapath, datafiles, replicates)
File "/opt/anaconda3/envs/clust/lib/python3.10/site-packages/clust/scripts/io.py", line 125, in readReplicates
conditions[c] = line[1:]
TypeError: 'filter' object is not subscriptable
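For context on this second error: in Python 3, filter() returns a lazy iterator rather than a list, so it cannot be sliced. The sketch below reproduces the message in isolation. It assumes, without having checked clust's source, that `line` in readReplicates is the result of a filter() call; the input line is hypothetical.

```python
# Hypothetical replicates line with integer names in fields 2 and 3.
fields = "clust_bio_comb.txt 1 1,2".split(" ")

# In Python 3 this is a lazy 'filter' object, not a list.
line = filter(None, fields)

try:
    line[1:]
except TypeError as err:
    message = str(err)  # the same message as in the traceback above

# Materialising the iterator as a list makes slicing work again.
line = list(filter(None, fields))
condition = line[1:]
print(message)
print(condition)
```

If that assumption holds, the error does not depend on the contents of the replicates file, so reformatting the file would not be expected to avoid it.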
https://github.com/BaselAbujamous/clust/issues/62
@BaselAbujamous do you have any formatting recommendations to bypass these errors?
Same here, even using the example data.
Hello both. Thanks for reporting this. Have you tried the most recent release?
Hi Basel,
Yes, I used the latest version (1.17.0). I was able to run it a year ago on my old laptop; today I tried again on that laptop and got the same error.
Thanks
Hello Changyu,
The latest version is 1.18.1. The errors occurred because recent versions of some dependency packages (e.g. scipy) introduced changes that broke clust, so release 1.18.1 was patched to work around them.
Please let me know if this solves it or not.
Best, Basel
Hi Basel,
I tried to install clust with conda, but the latest version it can install is 1.17.0. I then tried sudo pip install clust, which only installs 1.18.0. Finally I tried sudo pip install clust==1.18.1, which returns the error "No matching distribution found for clust==1.18.1". Could you please help fix this?
Thanks Changyu
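When several install routes give different versions, it can help to confirm what the active Python environment actually has installed. A minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name is hypothetical, and it reports only what the package metadata records, which may differ from the version printed in a program's own banner.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version found in this environment, or None if clust is missing.
print(installed_version("clust"))
```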
Hi Basel,
I tried the fourth install method, as below:
wget https://github.com/BaselAbujamous/clust/releases/download/v1.18.1/clust-1.18.1.tar.gz
sudo tar -xvzf clust-1.18.1.tar.gz
sudo python3 clust-1.18.1/clust.py . -r Replicates.txt
I still get the same error, shown below. Please note that the output still shows the version as 1.18.0:
/===========================================================================\
| Clust |
| (Optimised consensus clustering of multiple heterogenous datasets) |
| Python package version 1.18.0 (2022) Basel Abu-Jamous |
+---------------------------------------------------------------------------+
| Analysis started at: Tuesday 13 December 2022 (14:27:34) |
| 1. Reading dataset(s) |
Traceback (most recent call last):
File "pandas/_libs/parsers.pyx", line 1124, in pandas._libs.parsers.TextReader._convert_tokens
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "clust-1.18.1/clust.py", line 6, in <module>
main(args)
File "/mnt/c/Users/cyi/clust/clust-1.18.1/clust/__main__.py", line 102, in main
clustpipeline.clustpipeline(args.datapath, args.m, args.r, args.n, args.o, args.K, args.t,
File "/mnt/c/Users/cyi/clust/clust-1.18.1/clust/clustpipeline.py", line 86, in clustpipeline
(X, replicates, Genes, datafiles) = io.readDatasetsFromDirectory(datapath, delimiter='\t| |, |; |,|;', skiprows=1, skipcolumns=1,
File "/mnt/c/Users/cyi/clust/clust-1.18.1/clust/scripts/io.py", line 46, in readDatasetsFromDirectory
datafilesread = readDataFromFiles(datafileswithpath, delimiter, float, skiprows, skipcolumns, returnSkipped)
File "/mnt/c/Users/cyi/clust/clust-1.18.1/clust/scripts/io.py", line 204, in readDataFromFiles
X[l] = pdreadcsv_regexdelim(datafiles[l], delimiter=delimiter, dtype=dtype, skiprows=skiprows,
File "/mnt/c/Users/cyi/clust/clust-1.18.1/clust/scripts/io.py", line 239, in pdreadcsv_regexdelim
result = pd.read_csv(StringIO('\n'.join(re.sub(delimiter, '\t', str(x)) for x in f)),
File "/usr/local/lib/python3.8/dist-packages/pandas/util/_decorators.py", line 211, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/util/_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py", line 950, in read_csv
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py", line 611, in _read
return parser.read(nrows)
File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/readers.py", line 1778, in read
) = self._engine.read( # type: ignore[attr-defined]
File "/usr/local/lib/python3.8/dist-packages/pandas/io/parsers/c_parser_wrapper.py", line 230, in read
chunks = self._reader.read_low_memory(nrows)
File "pandas/_libs/parsers.pyx", line 808, in pandas._libs.parsers.TextReader.read_low_memory
File "pandas/_libs/parsers.pyx", line 890, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 1037, in pandas._libs.parsers.TextReader._convert_column_data
File "pandas/_libs/parsers.pyx", line 1130, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: could not convert string to float: 'B'
The installed packages are listed below:
attrs==19.3.0
Automat==0.8.0
blinker==1.4
certifi==2019.11.28
chardet==3.0.4
Click==7.0
cloud-init==22.4.2
colorama==0.4.3
command-not-found==0.3
configobj==5.0.6
constantly==15.1.0
contourpy==1.0.6
cryptography==2.8
cycler==0.11.0
dbus-python==1.2.16
distro==1.4.0
distro-info===0.23ubuntu1
entrypoints==0.3
fonttools==4.38.0
httplib2==0.14.0
hyperlink==19.0.0
idna==2.8
importlib-metadata==1.5.0
incremental==16.10.1
Jinja2==2.10.1
joblib==1.2.0
jsonpatch==1.22
jsonpointer==2.0
jsonschema==3.2.0
keyring==18.0.1
kiwisolver==1.4.4
language-selector==0.1
launchpadlib==1.10.13
lazr.restfulclient==0.14.2
lazr.uri==1.0.3
MarkupSafe==1.1.0
matplotlib==3.6.2
more-itertools==4.2.0
netifaces==0.10.4
numpy==1.23.5
oauthlib==3.1.0
packaging==22.0
pandas==1.5.2
pexpect==4.6.0
Pillow==9.3.0
portalocker==2.6.0
pyasn1==0.4.2
pyasn1-modules==0.2.1
PyGObject==3.36.0
PyHamcrest==1.9.0
PyJWT==1.7.1
pymacaroons==0.13.0
PyNaCl==1.3.0
pyOpenSSL==19.0.0
pyparsing==3.0.9
pyrsistent==0.15.5
pyserial==3.4
python-apt==2.0.0+ubuntu0.20.4.8
python-dateutil==2.8.2
python-debian===0.1.36ubuntu1
pytz==2022.6
PyYAML==5.3.1
requests==2.22.0
requests-unixsocket==0.2.0
scikit-learn==1.2.0
scipy==1.9.3
SecretStorage==2.3.1
service-identity==18.1.0
simplejson==3.16.0
six==1.14.0
sos==4.4
ssh-import-id==5.10
systemd-python==234
threadpoolctl==3.1.0
Twisted==18.9.0
ubuntu-advantage-tools==27.12
ufw==0.36
unattended-upgrades==0.1
urllib3==1.25.8
wadllib==1.3.3
zipp==1.0.0
zope.interface==4.7.1
Solved it! The replicates file and the data files need to be specified separately, and it helps to put the data files in their own subfolder. (Make sure to give both paths, or cd to the parent folder first.)
See the example below: in the failed run all the files were together in one folder under Downloads, while in the working run the data files were in a subdirectory.
FAILED VERSION - ALL IN SAME FOLDER
pma37@dhcp-10-248-206-95 Clust_Example % clust /Users/pma37/Downloads/Clust_Example/ -r Replicates.txt
/===========================================================================\
| Clust |
| (Optimised consensus clustering of multiple heterogenous datasets) |
| Python package version 1.18.0 (2022) Basel Abu-Jamous |
+---------------------------------------------------------------------------+
| Analysis started at: Thursday 07 March 2024 (10:52:06) |
| 1. Reading dataset(s) |
Traceback (most recent call last):
File "parsers.pyx", line 1161, in pandas._libs.parsers.TextReader._convert_tokens
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.11/bin/clust", line 8, in <module>
sys.exit(main())
^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/clust/__main__.py", line 102, in main
clustpipeline.clustpipeline(args.datapath, args.m, args.r, args.n, args.o, args.K, args.t,
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/clust/clustpipeline.py", line 86, in clustpipeline
(X, replicates, Genes, datafiles) = io.readDatasetsFromDirectory(datapath, delimiter='\t| |, |; |,|;', skiprows=1, skipcolumns=1,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/clust/scripts/io.py", line 46, in readDatasetsFromDirectory
datafilesread = readDataFromFiles(datafileswithpath, delimiter, float, skiprows, skipcolumns, returnSkipped)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/clust/scripts/io.py", line 204, in readDataFromFiles
X[l] = pdreadcsv_regexdelim(datafiles[l], delimiter=delimiter, dtype=dtype, skiprows=skiprows,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/clust/scripts/io.py", line 239, in pdreadcsv_regexdelim
result = pd.read_csv(StringIO('\n'.join(re.sub(delimiter, '\t', str(x)) for x in f)),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
return _read(filepath_or_buffer, kwds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 626, in _read
return parser.read(nrows)
^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1923, in read
) = self._engine.read( # type: ignore[attr-defined]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 234, in read
chunks = self._reader.read_low_memory(nrows)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "parsers.pyx", line 838, in pandas._libs.parsers.TextReader.read_low_memory
File "parsers.pyx", line 921, in pandas._libs.parsers.TextReader._read_rows
File "parsers.pyx", line 1066, in pandas._libs.parsers.TextReader._convert_column_data
File "parsers.pyx", line 1167, in pandas._libs.parsers.TextReader._convert_tokens
ValueError: could not convert string to float: 'B'
WORKING VERSION:
pma37@dhcp-10-248-206-95 Clust_Example % clust /Users/pma37/Downloads/Clust_Example/Data -r Replicates.txt
/===========================================================================\
| Clust |
| (Optimised consensus clustering of multiple heterogenous datasets) |
| Python package version 1.18.0 (2022) Basel Abu-Jamous |
+---------------------------------------------------------------------------+
| Analysis started at: Thursday 07 March 2024 (10:52:46) |
| 1. Reading dataset(s) |
| 2. Data pre-processing |
| - Automatic normalisation mode (default in v1.7.0+). |
| Clust automatically normalises your dataset(s). |
| To switch it off, use the `-n 0` option (not recommended). |
| Check https://github.com/BaselAbujamous/clust for details. |
| - Flat expression profiles filtered out (default in v1.7.0+). |
| To switch it off, use the --no-fil-flat option (not recommended). |
| Check https://github.com/BaselAbujamous/clust for details. |
| 3. Seed clusters production (the Bi-CoPaM method) |
| 10% |
| 20% |
| 30% |
| 40% |
| 50% |
| 60% |
| 70% |
| 80% |
| 90% |
| 100% |
| 4. Cluster evaluation and selection (the M-N scatter plots technique) |
| 10% |
| 20% |
| 30% |
| 40% |
| 50% |
| 60% |
| 70% |
| 80% |
| 90% |
| 100% |
| 5. Cluster optimisation and completion |
| 6. Saving results in |
| /Users/pma37/Downloads/Clust_Example/Results_07_Mar_24_2 |
| Eigengene computation is currently not supported for multiple datasets. |
+---------------------------------------------------------------------------+
| Analysis finished at: Thursday 07 March 2024 (10:53:00) |
| Total time consumed: 0 hours, 0 minutes, and 13 seconds |
| |
\===========================================================================/
/===========================================================================\
| RESULTS SUMMARY |
+---------------------------------------------------------------------------+
| Clust received 3 datasets with 9332 unique genes. After filtering, 9329 |
| genes made it to the clustering step. Clust generated 2 clusters of |
| genes, which in total include 1601 genes. The smallest cluster includes |
| 680 genes, the largest cluster includes 921 genes, and the average |
| cluster size is 800 genes. |
+---------------------------------------------------------------------------+
| Citation |
| ~~~~~~~~ |
| When publishing work that uses Clust, please include this citation: |
| Basel Abu-Jamous and Steven Kelly (2018) Clust: automatic extraction of |
| optimal co-expressed gene clusters from gene expression data. Genome |
| Biology 19:172; doi: https://doi.org/10.1186/s13059-018-1536-8. |
+---------------------------------------------------------------------------+
| For enquiries contact: |
| Dr. Basel Abu-Jamous |
| [email protected] |
\===========================================================================/