Python 2.7 - How to decode ASCII from a stream for analysis
I am trying to run text from the Twitter API through sentiment analysis using the TextBlob library. When I run my code, it prints one or two sentiment values and then errors out with the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 31: ordinal not in range(128)
I do not understand why this is an issue for the code to handle if it is only analyzing text. I have tried to encode the text as UTF-8 in the script. Here is the code:
    from tweepy.streaming import StreamListener
    from tweepy import OAuthHandler
    from tweepy import Stream
    import json
    import sys
    import csv
    from textblob import TextBlob

    # Variables that contain the user credentials to access the Twitter API
    access_token = ""
    access_token_secret = ""
    consumer_key = ""
    consumer_secret = ""

    # This is a basic listener that just prints received tweets to stdout.
    class StdOutListener(StreamListener):

        def on_data(self, data):
            json_load = json.loads(data)
            texts = json_load['text']
            coded = texts.encode('utf-8')
            s = str(coded)
            content = s.decode('utf-8')
            #print(s[2:-1])
            wiki = TextBlob(s[2:-1])
            r = wiki.sentiment.polarity
            print r
            return True

        def on_error(self, status):
            print(status)

    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, StdOutListener())

    # This line filters Twitter streams to capture data by the keywords: 'python', 'javascript', 'ruby'
    stream.filter(track=['dollar', 'euro'], languages=['en'])
Can anyone please help me with this situation?
Thank you in advance.
You're mixing several things together. As the error says, you are trying to decode data that is of a byte type.
json.loads will give you the data as a string (a unicode object in Python 2), and encoding that string turns it into bytes:
    texts = json_load['text']       # string
    coded = texts.encode('utf-8')   # byte
    print(coded[2:-1])
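To make the two types concrete, here is a minimal sketch that uses only the standard library. The sample payload is made up for illustration and is not taken from the question. It shows that the pound sign encodes to the bytes 0xc2 0xa3 in UTF-8, and that decoding those bytes with the ASCII codec raises the same kind of error as in the question.

```python
from __future__ import print_function
import json

# Hypothetical sample payload; a real tweet arrives from the stream as JSON.
raw = '{"text": "Euro at \\u00a30.85 against the dollar"}'

texts = json.loads(raw)["text"]   # text: unicode in Python 2, str in Python 3
coded = texts.encode("utf-8")     # bytes: the pound sign becomes 0xc2 0xa3

print(type(texts).__name__, type(coded).__name__)

# Decoding those bytes as ASCII fails on the first non-ASCII byte.
try:
    coded.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)

# Decoding with the codec that produced the bytes round-trips cleanly.
round_trip = coded.decode("utf-8")
print(round_trip == texts)
```

In Python 2 the ASCII decode can also happen implicitly, for example when a byte string is combined with a unicode string, which is why the traceback may point at code that never calls decode itself.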
So, in your script, when you tried to decode coded, you got an error about decoding byte data.
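One likely way around this is to skip the encode/decode round trip and pass the text from json.loads straight to the analyzer. The sketch below assumes the analyzer accepts Unicode text. The names tweet_text, polarity_of and analyze are hypothetical helpers introduced here, not part of tweepy or TextBlob, and the analyzer is passed in as a callable so the example runs without TextBlob installed.

```python
from __future__ import print_function
import json


def tweet_text(data):
    """Return the tweet text as Unicode, or None if the message has no 'text' key."""
    payload = json.loads(data)
    # Assumption: some stream messages may not carry a 'text' field.
    return payload.get("text")


def polarity_of(data, analyze):
    """Apply analyze (a hypothetical callable taking Unicode text) to the tweet text."""
    text = tweet_text(data)
    if text is None:
        return None
    # No encode(), str() or slicing: the text stays Unicode all the way through.
    return analyze(text)


# Stand-in analyzer (len) so the sketch is self-contained.
# With TextBlob it would be: lambda t: TextBlob(t).sentiment.polarity
sample = '{"text": "Euro at \\u00a30.85"}'
result = polarity_of(sample, len)
skipped = polarity_of('{"delete": {}}', len)
print(result, skipped)
```

Inside on_data, the same idea would replace the coded, s and content lines with a single call on the decoded text.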