Text analysis: how to lemmatize tokens stored in a pandas Series?
I haven't performed POS tagging. The tokens look like this:

    0    [identification, risky, customers, needs, e...
    1    [date, last, contact, critical, data, field...
I tried running three code snippets on the tokens:
```python
from nltk.stem.wordnet import WordNetLemmatizer
wordnet_lemmatizer = WordNetLemmatizer()

# Attempt 1: apply the lemmatizer directly to each element
lemmatized = tokens.apply(lambda x: wordnet_lemmatizer.lemmatize(x))

# Attempt 2: split each element on spaces and lemmatize word by word
def lemword(temp):
    temp = [wordnet_lemmatizer.lemmatize(word) for word in temp.split(" ")]
    temp = " ".join(temp)
    return temp

tokens = tokens.apply(lambda x: lemword(x))

# Attempt 3: lemmatize each element and join the results
lemmatized = " ".join([wordnet_lemmatizer.lemmatize(i) for i in tokens])
```
but got the same error each time:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/cloudera/parcels/anaconda3/lib/python3.5/site-packages/pandas/core/series.py", line 2220, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/src/inference.pyx", line 1088, in pandas.lib.map_infer (pandas/lib.c:62658)
  File "<stdin>", line 1, in <lambda>
  File "/opt/cloudera/parcels/anaconda3/lib/python3.5/site-packages/nltk/stem/wordnet.py", line 40, in lemmatize
    lemmas = wordnet._morphy(word, pos)
  File "/opt/cloudera/parcels/anaconda3/lib/python3.5/site-packages/nltk/corpus/reader/wordnet.py", line 1708, in _morphy
    if form in exceptions:
TypeError: unhashable type: 'list'
```
I also tried passing the input as a list, a tuple, and a DataFrame, and got the same error. Can anyone help me resolve this?