text-classification
help
Is there any way to change the input data and the categories? Any help?
Yes, you can. Can you give more details about what exactly you want to achieve? You can load any dataset made up of text files and pass the files to CountVectorizer or TfidfVectorizer.
I want to provide a text file of sentences, where every sentence expresses one mode of persuasion (there are 3 modes of persuasion in total) and carries a label showing which mode it is, to train the model, and then detect modes of persuasion in other data.
In your case, each sentence can be a data point with its target.
My problem is that I don't know how to adapt your code to my data.
I would appreciate any help you can give.
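To make "each sentence is a data point with its target" concrete, here is a minimal sketch, not the repository's own code. It assumes a hypothetical file layout of one `label<TAB>sentence` pair per line, assumes the three modes are labelled ethos, pathos and logos, and uses made-up example sentences. TfidfVectorizer plus MultinomialNB is just one reasonable choice of pipeline.

```python
import io

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical training data: one "label<TAB>sentence" pair per line.
# In practice this would come from open("your_file.tsv", encoding="utf-8").
RAW = """\
ethos\tAs a doctor with twenty years of experience, I recommend this treatment.
ethos\tOur firm has been trusted by engineers since 1950.
pathos\tImagine the heartbreak of a child left without a home.
pathos\tYou will feel proud every time you see their smiling faces.
logos\tThe data show that costs fell by ten percent after the change.
logos\tIf the first test fails, then the second test must fail as well.
"""


def read_labelled_sentences(handle):
    """Split each non-empty line into (label, sentence) on the first tab."""
    sentences, labels = [], []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        label, sentence = line.split("\t", 1)
        sentences.append(sentence)
        labels.append(label)
    return sentences, labels


sentences, labels = read_labelled_sentences(io.StringIO(RAW))

# Each sentence is one data point; its label is the target.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(sentences, labels)

# Detect the mode of persuasion in unseen sentences.
new_sentences = ["The statistics show that errors fell after the update."]
predicted = model.predict(new_sentences)
print(list(zip(new_sentences, predicted)))
```

With only six sentences the predictions are not meaningful; a real training file would need many labelled sentences per mode.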
from sklearn.datasets import fetch_20newsgroups
twenty_train = fetch_20newsgroups(subset='train', shuffle=True)
I get the following error when I run the code above:
URLError Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\datasets\twenty_newsgroups.py in fetch_20newsgroups(data_home, subset, categories, shuffle, random_state, remove, download_if_missing)
    246 "This may take a few minutes.")
    247 cache = _download_20newsgroups(target_dir=twenty_home,
--> 248 cache_path=cache_path)
    249 else:
    250 raise IOError('20Newsgroups dataset not found')

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\datasets\twenty_newsgroups.py in _download_20newsgroups(target_dir, cache_path)
     79
     80 logger.info("Downloading dataset from %s (14 MB)", ARCHIVE.url)
---> 81 archive_path = _fetch_remote(ARCHIVE, dirname=target_dir)
     82
     83 logger.debug("Decompressing %s", archive_path)

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\datasets\base.py in _fetch_remote(remote, dirname)
    914 file_path = (remote.filename if dirname is None
    915 else join(dirname, remote.filename))
--> 916 urlretrieve(remote.url, file_path)
    917 checksum = _sha256(file_path)
    918 if remote.checksum != checksum:

C:\ProgramData\Anaconda3\lib\urllib\request.py in urlretrieve(url, filename, reporthook, data)
    245 url_type, path = splittype(url)
    246
--> 247 with contextlib.closing(urlopen(url, data)) as fp:
    248 headers = fp.info()
    249

C:\ProgramData\Anaconda3\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220 else:
    221 opener = _opener
--> 222 return opener.open(url, data, timeout)
    223
    224 def install_opener(opener):

C:\ProgramData\Anaconda3\lib\urllib\request.py in open(self, fullurl, data, timeout)
    523 req = meth(req)
    524
--> 525 response = self._open(req, data)
    526
    527 # post-process response

C:\ProgramData\Anaconda3\lib\urllib\request.py in _open(self, req, data)
    546
    547 return self._call_chain(self.handle_open, 'unknown',
--> 548 'unknown_open', req)
    549
    550 def error(self, proto, *args):

C:\ProgramData\Anaconda3\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    501 for handler in handlers:
    502 func = getattr(handler, meth_name)
--> 503 result = func(*args)
    504 if result is not None:
    505 return result

C:\ProgramData\Anaconda3\lib\urllib\request.py in unknown_open(self, req)
   1385 def unknown_open(self, req):
   1386 type = req.type
-> 1387 raise URLError('unknown url type: %s' % type)
   1388
   1389 def parse_keqv_list(l):
URLError:
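The error message is cut off above, so the URL scheme that urllib rejected is not visible. One common cause of "unknown url type" is an https URL on a Python install where the ssl module fails to load; that is an assumption here, not something the traceback confirms. The sketch below checks for that condition and, as a workaround, loads a copy of the dataset that was downloaded and extracted by hand, using scikit-learn's load_files. The helper names and the folder path are hypothetical.

```python
import http.client

from sklearn.datasets import load_files


def https_supported():
    """Return True if this interpreter can open https URLs.

    urllib only registers its HTTPS handler when http.client exposes
    HTTPSConnection, which in turn requires a working ssl module.
    """
    return hasattr(http.client, "HTTPSConnection")


def load_local_corpus(root):
    """Load text files laid out as root/<category>/<document>.

    The folder names under root become the target names.
    """
    return load_files(str(root), encoding="latin1", shuffle=True, random_state=42)


print("https supported:", https_supported())

# Hypothetical path: point this at the manually extracted training folder.
# twenty_train = load_local_corpus(r"C:\data\20news-bydate-train")
# print(twenty_train.target_names)
```

If `https_supported()` prints False, fixing the ssl setup of the Python environment should let fetch_20newsgroups download the archive itself. The returned object has `data`, `target` and `target_names` like the one from fetch_20newsgroups, so the rest of the code should work largely unchanged.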