java - Data inconsistency using ConcurrentHashMap


The count changes on every run for the same set of files. The following code is still not data-consistent. How do I make it thread safe? It is a simple word-count program.

package concurrenthashmapdemo;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class FileReaderTask implements Runnable {
    private String filePath;
    private String fileName;
    private ConcurrentMap<String, Integer> wordCountMap;

    public FileReaderTask(String filePath, String fileName,
            ConcurrentMap<String, Integer> wordCountMap) {
        this.filePath = filePath;
        this.fileName = fileName;
        this.wordCountMap = wordCountMap;
    }

    public void run() {
        File jobFile = new File(filePath + fileName);
        try {
            BufferedReader bReader = new BufferedReader(new FileReader(jobFile));
            String line = "";
            while ((line = bReader.readLine()) != null) {
                String[] strArray = line.split(" ");
                for (String str : strArray) {
                    if (wordCountMap.containsKey(str)) {
                        wordCountMap.replace(str.trim(),
                                wordCountMap.get(str.trim()) + 1);
                    } else {
                        wordCountMap.putIfAbsent(str.trim(), 1);
                    }
                }
            }
            //Thread.sleep(10000);
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

public class Main {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> wordCountMap = new ConcurrentHashMap<String, Integer>();
        File fileDir = new File("c://job_files");
        Thread[] threads = new Thread[fileDir.listFiles().length];
        for (int i = 0; i < threads.length; i++) {
            FileReaderTask frt = new FileReaderTask("c:/job_files/",
                    fileDir.listFiles()[i].getName(), wordCountMap);
            threads[i] = new Thread(frt);
            threads[i].start();
        }

        for (int i = 0; i < threads.length; i++) {
            try {
                threads[i].join();
            } catch (InterruptedException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }

        for (Map.Entry<String, Integer> entry : wordCountMap.entrySet()) {
            String key = entry.getKey();
            System.out.println(key + " - - " + wordCountMap.get(key));
        }
        System.out.println("main");
    }
}

The concurrent containers ensure internal consistency (for example, not adding the same key twice), but nothing protects the stored values. The code as it stands has a race condition: another thread can increment the counter between your call to get and your call to replace. The replace then puts the wrong value into the map, losing the increment performed by the other thread.
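As a minimal, self-contained sketch of that lost-update race (the class name LostUpdateDemo, the key "word", and the iteration counts are made up for illustration): two threads each perform 10,000 increments using the non-atomic get/replace pattern from the question, and the final total usually comes out below 20,000.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LostUpdateDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();
        map.put("word", 0);

        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                int old = map.get("word");       // another thread may update here...
                map.replace("word", old + 1);    // ...and its increment is then overwritten
            }
        };

        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        a.join();
        b.join();

        // Expected 20000 if increments were atomic; typically prints less.
        System.out.println(map.get("word"));
    }
}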

You need to make the increment atomic. To do this, use the version of replace that checks the value in the map is still the same before performing the replacement:

str = str.trim();
while (true) {
    Integer oldValue = wordCountMap.putIfAbsent(str, 1);
    if (oldValue != null) {
        if (wordCountMap.replace(str, oldValue, oldValue + 1))
            break; // incremented existing count
    } else {
        break; // added new count of 1
    }
}
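If you are on Java 8 or later (an assumption, since the post does not say which version is used), the same atomic increment can be written more compactly with ConcurrentMap.merge, which performs the whole check-then-act in a single atomic call:

// Java 8+ alternative sketch: merge atomically inserts 1 for a new word,
// or applies Integer::sum to combine the existing count with 1.
String word = str.trim();
wordCountMap.merge(word, 1, Integer::sum);

Because ConcurrentHashMap guarantees that merge is performed atomically, there is no window for another thread's increment to be lost, and no retry loop is needed in your own code.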
