java - PigUnit: Unable to open iterator


I'm following the PigUnit testing example on the Apache Pig page here. I tried the code example in Eclipse using a Maven project. I added the pig and pigunit dependencies in pom.xml, and tried both the 0.14 and 0.15 versions.

Here's the PigUnit test code taken from the Apache Pig page (I enclosed it in a class, of course):

    @Test
    public void testTop2Queries() {
        String[] args = {
            "n=2",
        };

        PigTest test = new PigTest("top_queries.pig", args);

        String[] input = {
            "yahoo",
            "yahoo",
            "yahoo",
            "twitter",
            "facebook",
            "facebook",
            "linkedin",
        };

        String[] output = {
            "(yahoo,3)",
            "(facebook,2)",
        };

        test.assertOutput("data", input, "queries_limit", output);
    }

And the Pig script, also copied:

    data = LOAD 'input' AS (query:chararray);
    queries_group = GROUP data BY query;
    queries_count = FOREACH queries_group GENERATE group AS query, COUNT(data) AS total;
    queries_ordered = ORDER queries_count BY total DESC, query;
    queries_limit = LIMIT queries_ordered 2;
    STORE queries_limit INTO 'output';

However, I am encountering this result when I try Run As > JUnit Test:

    org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias queries_limit
        at org.apache.pig.PigServer.openIterator(PigServer.java:935)
        ...[truncated]
    Caused by: java.io.IOException: Couldn't retrieve job.
        at org.apache.pig.PigServer.store(PigServer.java:999)
        at org.apache.pig.PigServer.openIterator(PigServer.java:910)
        ... 28 more

This is the console output I'm getting:

    STORE queries_limit INTO 'output'; --> none
    data: {query: chararray}
    data = LOAD 'input' AS (query:chararray); --> data = LOAD 'file:/tmp/temp-820202225/tmp-1722948946' USING PigStorage('\t') AS (
        query: chararray
    );
    STORE queries_limit INTO 'output'; --> none

It looks like the Pig script is trying to load data from the local file system at 'input' instead of using the Java String[] variable 'input'.

Can anyone help with this?

Before getting to the solution, I wanted to comment on the fact that the Pig script is loading from the local disk. When PigUnit overrides a statement and you supply the data as a mock, it creates a file on the local disk and loads it. That's why you see a file being loaded. If you look at that file, you should see the data you supplied in the String array, input.
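To make that mechanism concrete, here is a rough plain-Java sketch of the idea. It is not PigUnit's actual code: the class name MockLoadSketch and both helper methods are made up for illustration, and the real temporary path and exact statement text will differ. It only shows the two steps described above: the String array is written to a temporary file, one record per line, and the LOAD for the alias is pointed at that file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

/** Illustrative stand-in only; this is NOT how PigUnit is actually implemented. */
public class MockLoadSketch {

    /** Writes each input record on its own line of a new temporary file. */
    static Path writeMockInput(String[] records) throws IOException {
        Path file = Files.createTempFile("tmp-", ".txt");
        Files.write(file, Arrays.asList(records), StandardCharsets.UTF_8);
        file.toFile().deleteOnExit();
        return file;
    }

    /** Builds a replacement LOAD statement for the alias that reads the temporary file. */
    static String overrideLoad(String alias, Path file) {
        // The schema is hard-coded to match the script in the question (an assumption).
        return alias + " = LOAD '" + file.toUri()
                + "' USING PigStorage('\\t') AS (query:chararray);";
    }

    public static void main(String[] args) throws IOException {
        String[] input = {
            "yahoo", "yahoo", "yahoo", "twitter", "facebook", "facebook", "linkedin",
        };
        Path file = writeMockInput(input);
        List<String> onDisk = Files.readAllLines(file, StandardCharsets.UTF_8);

        // The rewritten statement points at a local file, as in the console output.
        System.out.println(overrideLoad("data", file));
        // The file holds exactly the records from the String array.
        System.out.println("Records on disk: " + onDisk);
    }
}
```

So seeing a file: path in the rewritten LOAD statement is expected behaviour and is not the cause of the error.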

For those still looking for a solution to this, the following worked for me. The solution is based on Pig version 0.15 and Hadoop 2.7.1. It seems to me that you have to specify which Pig artifact you need.

    <dependency>
        <groupId>org.apache.pig</groupId>
        <artifactId>pigunit</artifactId>
        <version>${pig.version}</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.pig</groupId>
        <artifactId>pig</artifactId>
        <version>${pig.version}</version>
        <classifier>h2</classifier>
        <!-- NOTE: It is important to have this classifier. Unit tests
        will break if it doesn't exist. It gets the Pig jars for Hadoop v2. -->
    </dependency>

Here are some helpful classes on the Pig GitHub page.

PigTest implementation (good for reading the API docs): https://github.com/apache/pig/blob/trunk/test/org/apache/pig/pigunit/PigTest.java

PigUnit examples: https://github.com/apache/pig/blob/trunk/test/org/apache/pig/test/pigunit/TestPigTest.java

