scala - How to transform Array[RDD[Row]] to SchemaRDD -- OR -- how to split a SchemaRDD so that the results are SchemaRDDs?


I want to use the Pipeline implementation in MLlib. To use a pipeline, a sequence of LabeledDocument has to be passed to the pipeline as a SchemaRDD.

I create the SchemaRDD as follows:

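import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
val data = sc.textFile("/test.csv")
val parsedData = data.map { line =>
  val parts = line.split(',')
  // The first column is the label; the remaining columns are the features.
  // Vectors.dense expects doubles, so the string fields are converted first.
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts.tail.map(_.toDouble)))
}.cache()
// Relies on the implicit conversion brought in by `import sqlContext.createSchemaRDD`.
val rddSchema = parsedData.toSchemaRDD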

I want to split the new rddSchema into training (80%) and test (20%) sets. If I use randomSplit, it returns an Array[RDD[Row]] instead of SchemaRDDs.

Problem: how do I transform an Array[RDD[Row]] into a SchemaRDD?

-- or --

How do I split a SchemaRDD so that the results are SchemaRDDs?

I would appreciate any help.

I know this is old, but did you try:

// Split the typed RDD before it is converted to a SchemaRDD.
val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
val training = splits(0)
val test = splits(1)
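To spell the idea out a bit further, here is a rough sketch of two ways to end up with SchemaRDDs on both sides of the split. It assumes Spark 1.2, where SchemaRDD still exists (Spark 1.3 replaced it with DataFrame) and where LabeledPoint can be converted to a SchemaRDD. The object name SplitSchemaRDD and both helper names are made up for illustration, and the 80/20 weights come from the question rather than from the snippet above.

```scala
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{SQLContext, SchemaRDD}

// Hypothetical helper object; the names are illustrative only.
object SplitSchemaRDD {

  // Option 1: split the typed RDD first, then convert each part.
  def splitThenConvert(sqlContext: SQLContext,
                       parsedData: RDD[LabeledPoint]): (SchemaRDD, SchemaRDD) = {
    val Array(training, test) =
      parsedData.randomSplit(Array(0.8, 0.2), seed = 11L)
    // createSchemaRDD is the conversion that `import sqlContext.createSchemaRDD`
    // would otherwise apply implicitly; it is called explicitly here.
    (sqlContext.createSchemaRDD(training), sqlContext.createSchemaRDD(test))
  }

  // Option 2: split the SchemaRDD itself (which yields plain RDD[Row]s),
  // then re-attach the original schema to each part.
  def splitAndReapplySchema(sqlContext: SQLContext,
                            rddSchema: SchemaRDD): (SchemaRDD, SchemaRDD) = {
    val Array(trainingRows, testRows) =
      rddSchema.randomSplit(Array(0.8, 0.2), seed = 11L)
    (sqlContext.applySchema(trainingRows, rddSchema.schema),
     sqlContext.applySchema(testRows, rddSchema.schema))
  }
}
```

With the values from the question, something like `val (training, test) = SplitSchemaRDD.splitAndReapplySchema(sqlContext, rddSchema)` should give two SchemaRDDs that share the original schema. The first option avoids the round trip through RDD[Row] and is probably the simpler choice when the typed RDD is still at hand.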
