Spark Streaming not working
I have a rudimentary Spark Streaming word count that is not working.
import sys
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName='streaming', master="local[*]")
scc = StreamingContext(sc, batchDuration=5)

lines = scc.socketTextStream("localhost", 9998)
words = lines.flatMap(lambda line: line.split())
counts = words.map(lambda word: (word, 1)).reduceByKey(lambda x, y: x + y)
counts.pprint()

print 'listening'
scc.start()
scc.awaitTermination()
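For reference, the flatMap/map/reduceByKey chain above is a classic word count. The same logic in plain Python (no Spark, just to show what each batch of lines should produce once the stream is actually processed):

```python
from collections import defaultdict

def word_count(lines):
    # flatMap: split every line into individual words
    words = [w for line in lines for w in line.split()]
    # map + reduceByKey: pair each word with 1, then sum the 1s per word
    counts = defaultdict(int)
    for w in words:
        counts[w] += 1
    return dict(counts)

print(word_count(["to be or", "not to be"]))
```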
In another terminal I have nc -lk 9998 running, and I pasted some text into it. Spark prints out the typical logs (no exceptions), ends up queuing a job at a weird time (45 years), and keeps printing this:
15/06/19 18:53:30 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
15/06/19 18:53:30 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (PythonRDD[7] at RDD at PythonRDD.scala:43)
15/06/19 18:53:30 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
15/06/19 18:53:35 INFO JobScheduler: Added jobs for time 1434754415000 ms
15/06/19 18:53:40 INFO JobScheduler: Added jobs for time 1434754420000 ms
15/06/19 18:53:45 INFO JobScheduler: Added jobs for time 1434754425000 ms
...
What am I doing wrong?
Spark Streaming requires multiple executor threads to work: the socket receiver occupies one thread, so with only one thread available there is nothing left to process the queued batches. Try using local[4] as the master.
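A minimal sketch of the suggested fix, changing only the master URL (assuming the script is run directly with python against a local pyspark install):

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

# local[4] gives the driver four local worker threads: one is taken by the
# socket receiver, leaving the others free to process each 5-second batch.
sc = SparkContext(appName='streaming', master="local[4]")
scc = StreamingContext(sc, batchDuration=5)

lines = scc.socketTextStream("localhost", 9998)
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda x, y: x + y))
counts.pprint()

scc.start()
scc.awaitTermination()
```

Anything at or above local[2] (or a real cluster with at least two cores) avoids the symptom above, where batches are endlessly "Added" by the JobScheduler but never completed.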