Spark Streaming not working
I have a rudimentary Spark Streaming word count that is not working.
import sys
from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName='streaming', master="local[*]")
scc = StreamingContext(sc, batchDuration=5)

lines = scc.socketTextStream("localhost", 9998)
words = lines.flatMap(lambda line: line.split())
counts = words.map(lambda word: (word, 1)).reduceByKey(lambda x, y: x + y)
counts.pprint()

print 'listening'
scc.start()
scc.awaitTermination()
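For reference, the flatMap/map/reduceByKey chain above is a classic word count. The same logic in plain Python (no Spark, just to show what each batch of lines should produce once the stream is actually processed):

```python
from collections import defaultdict

def word_count(lines):
    # flatMap: split every line into individual words
    words = [w for line in lines for w in line.split()]
    # map + reduceByKey: pair each word with 1, then sum the 1s per word
    counts = defaultdict(int)
    for w in words:
        counts[w] += 1
    return dict(counts)

print(word_count(["to be or", "not to be"]))
```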
In another terminal I have nc -lk 9998 running, and I pasted some text into it. Spark prints out the typical logs (no exceptions), ends up queuing a job at a weird time (45 years), and keeps printing this:
15/06/19 18:53:30 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
15/06/19 18:53:30 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (PythonRDD[7] at RDD at PythonRDD.scala:43)
15/06/19 18:53:30 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
15/06/19 18:53:35 INFO JobScheduler: Added jobs for time 1434754415000 ms
15/06/19 18:53:40 INFO JobScheduler: Added jobs for time 1434754420000 ms
15/06/19 18:53:45 INFO JobScheduler: Added jobs for time 1434754425000 ms
...
What am I doing wrong?
Spark Streaming requires multiple executor threads to work: the socket receiver occupies one thread, so with only one thread available there is nothing left to process the queued batches. Try using local[4] as the master.
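A minimal sketch of the suggested fix, changing only the master URL (assuming the script is run directly with python against a local pyspark install):

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext

# local[4] gives the driver four local worker threads: one is taken by the
# socket receiver, leaving the others free to process each 5-second batch.
sc = SparkContext(appName='streaming', master="local[4]")
scc = StreamingContext(sc, batchDuration=5)

lines = scc.socketTextStream("localhost", 9998)
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda x, y: x + y))
counts.pprint()

scc.start()
scc.awaitTermination()
```

Anything at or above local[2] (or a real cluster with at least two cores) avoids the symptom above, where batches are endlessly "Added" by the JobScheduler but never completed.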