Tuesday, April 24, 2012

Java binary search in a sorted textfile


When making the Android app 'Wordlist Pro', I needed a fast binary search for finding single words in a huge A-Z sorted wordlist file. I was looking all over the internet for a working ready-to-go solution. I could not find a well working one, so I had to make my own version out of what I found. The result was working perfectly.

This is what I ended up with.

target = the word in the file you are looking for


    public boolean binarySearch(RandomAccessFile raf, String target)
            throws IOException {

        raf.seek(0);
        int counter = 0;
        String line = raf.readLine();

        if (line.compareTo(target) == 0) {
            return true;
        }

        long low = 0;
        long high = raf.length();
        long p = -1;

        while (low < high) {

            long mid = (low + high) >>> 1;
            p = mid;
            while (p >= 0) {
                raf.seek(p);
                char c = (char) raf.readByte();
                if(c == '\n')
                    break;
                p--;
            }
            if(p < 0)
                raf.seek(0);

            line = raf.readLine();

            if( (line == null) || line.compareTo(target) < 0)
                low = mid + 1;
            else
                high = mid;
        }

        p = low;
        while (p >= 0){
            raf.seek(p);
            if(((char) raf.readByte()) == '\n')
                break;
            p--;
        }
        if(p < 0)
            raf.seek(0);

        while(true){
            line = raf.readLine();
            if(line == null || !line.equals(target))
                break;
            return true;
        }

        // Nothing found
        return false;

    }

To use it, you can do like this:

            try{
                File f = new File("MyLargeSortedFile.txt");
                if( f.isFile() && f.canRead() && (f.length() > 0) ){
                    RandomAccessFile raf = new RandomAccessFile(f, "r");
                    if(binarySearch(raf, s)){
                        // Match found.
                        raf.close();
                        return true;
                    }
                    // No Match
                    raf.close();
                }
            }catch(FileNotFoundException e){
            }catch(IOException e){
            }


Use it to whatever you like.
'+1' license applies
(press the +1 button)

/r0b

No comments:

Post a Comment