Greetings

I am implementing greedy coding in Java, namely Shannon coding.

Is it possible to write to a file and read from a file a set of bits whose count is not a multiple of 8? (Preferably without reinventing the wheel.)

My idea was to use the BitSet class: create a BitSet for each code sequence and, while reading the input file, concatenate the BitSets that are needed. Then convert the resulting large BitSet into a byte array and write it to a file. But there are a few catches:

  1. bitset.length() depends only on the highest bit set to 1 (it returns that index plus one). Since a code sequence may end with bits set to 0, the length of the code sequence has to be stored separately.
  2. The toByteArray() method "flips" the bits. If bits {0, 1, 2, 4} are set in the BitSet, the byte looks like 10111.
  3. I still have to implement concatenation myself, although that is not a big deal.
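
Both observations are easy to reproduce. The snippet below is only an illustration; the class name `BitSetQuirks` and the helper `toBinary` are made up for this example.

```java
import java.util.BitSet;

final class BitSetQuirks {
    /** Formats one byte as eight binary digits, most significant bit first. */
    static String toBinary(byte b) {
        String digits = Integer.toBinaryString(b & 0xFF);
        return String.format("%8s", digits).replace(' ', '0');
    }

    public static void main(String[] args) {
        // Code word 1010 stored with its first bit at index 0: bits 0 and 2 are set.
        BitSet code = new BitSet();
        code.set(0);
        code.set(2);
        // length() is the index of the highest set bit plus one, so the
        // trailing zero at index 3 is lost: this prints 3, not 4.
        System.out.println(code.length());

        // Bits {0, 1, 2, 4}: index 0 lands in the least significant bit of byte 0.
        BitSet bits = new BitSet();
        bits.set(0, 3); // sets indices 0, 1 and 2 (the end index is exclusive)
        bits.set(4);
        System.out.println(toBinary(bits.toByteArray()[0])); // 00010111
    }
}
```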

For now, the only way I can picture decoding is to read one or several bytes and extract the bits with shifts.
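
The shift-based decoding described above could look roughly like this. It is a minimal sketch, not a library class: `BitReader`, its methods, and the choice to read each byte starting from its most significant bit are all assumptions made for illustration.

```java
import java.util.NoSuchElementException;

final class BitReader {
    private final byte[] data;
    private final long bitLength; // number of valid bits in data
    private long position;        // index of the next bit to read

    BitReader(byte[] data, long bitLength) {
        if (bitLength < 0 || bitLength > data.length * 8L) {
            throw new IllegalArgumentException("bitLength does not fit the data");
        }
        this.data = data;
        this.bitLength = bitLength;
    }

    boolean hasNext() {
        return position < bitLength;
    }

    /** Returns the next bit (0 or 1); bytes are consumed from their most significant bit. */
    int readBit() {
        if (!hasNext()) {
            throw new NoSuchElementException("no bits left");
        }
        int current = data[(int) (position >>> 3)] & 0xFF; // byte holding the bit
        int shift = 7 - (int) (position & 7);               // offset inside that byte
        position++;
        return (current >>> shift) & 1;
    }

    /** Reads count bits (at most 32); the first bit read becomes the most significant. */
    int readBits(int count) {
        int value = 0;
        for (int i = 0; i < count; i++) {
            value = (value << 1) | readBit();
        }
        return value;
    }

    public static void main(String[] args) {
        // One byte 1011_0000 of which only the first 4 bits are meaningful.
        BitReader reader = new BitReader(new byte[] {(byte) 0b10110000}, 4);
        System.out.println(reader.readBits(3)); // 5 (binary 101)
        System.out.println(reader.readBit());   // 1
        System.out.println(reader.hasNext());   // false
    }
}
```

The reader needs the bit length from outside, which is the same "store the length separately" issue as in point 1.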

Maybe there is a more elegant solution to this problem?

    3 answers

    Regarding

    Is it possible to write to a file and read from a file a set of bits whose count is not a multiple of 8?

    No, you cannot. In fact, at the device level you cannot even read a number of bytes that is not a multiple of 512, since that is the standard sector size on a hard disk (drives are now moving to 4096). The OS buffers file reads and simply hides from the developer the work of cutting the requested range of bytes out of what was actually read.
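
    In practice this means the program pads the last byte itself and records how many bits are meaningful. A minimal sketch, assuming a 4-byte bit-count header in front of the payload; this layout and the names `BitFileFormat` and `Decoded` are invented for illustration and do not come from the question:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.BitSet;

final class BitFileFormat {
    /** Result of reading: the bits plus the number of meaningful bits. */
    static final class Decoded {
        final BitSet bits;
        final int bitLength;

        Decoded(BitSet bits, int bitLength) {
            this.bits = bits;
            this.bitLength = bitLength;
        }
    }

    /** Writes bitLength as a 4-byte header, then the bits padded with zeros to whole bytes. */
    static void write(OutputStream out, BitSet bits, int bitLength) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(bitLength);
        // toByteArray() drops trailing zero bytes, so pad back to the full size.
        byte[] raw = bits.get(0, bitLength).toByteArray();
        data.write(Arrays.copyOf(raw, (bitLength + 7) / 8));
        data.flush();
    }

    /** Reads the header, then exactly as many bytes as the header implies. */
    static Decoded read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int bitLength = data.readInt();
        byte[] payload = new byte[(bitLength + 7) / 8];
        data.readFully(payload);
        return new Decoded(BitSet.valueOf(payload), bitLength);
    }

    public static void main(String[] args) throws IOException {
        BitSet code = new BitSet();
        code.set(0);
        code.set(2); // 4-bit code word with a trailing zero bit

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(out, code, 4);
        System.out.println(out.size()); // 5: four header bytes plus one payload byte

        Decoded back = read(new ByteArrayInputStream(out.toByteArray()));
        System.out.println(back.bitLength);        // 4
        System.out.println(back.bits.equals(code)); // true
    }
}
```

    The same methods work with a FileOutputStream and FileInputStream; the in-memory streams are used here only to keep the example self-contained.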

      bitset.length() returns the index of the highest bit set to 1. Since a code sequence ends with bits set to 0, the length of the code sequence has to be stored separately.

    Use the size() method.

     The toByteArray() method "flips" the bits. If bits {0, 1, 2, 4} are set in the BitSet, the byte looks like 10111.

    If you start from a dictionary of BitSets, you can build them in whatever bit order you need.
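
    Alternatively, the order can be changed at the moment the bits are packed into bytes. The sketch below is one possible reading of that advice, not the answerer's code; `MsbFirst.toBytes` is a hypothetical helper that maps bit index 0 to the most significant bit of the first byte:

```java
import java.util.BitSet;

final class MsbFirst {
    /** Packs the first bitLength bits so that bit index 0 becomes the top bit of byte 0. */
    static byte[] toBytes(BitSet bits, int bitLength) {
        byte[] out = new byte[(bitLength + 7) / 8];
        // Visit only the set bits below bitLength; all other positions stay 0.
        for (int i = bits.nextSetBit(0); i >= 0 && i < bitLength; i = bits.nextSetBit(i + 1)) {
            out[i / 8] |= (byte) (0x80 >>> (i % 8));
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        bits.set(0, 3); // indices 0, 1 and 2
        bits.set(4);
        byte[] packed = toBytes(bits, 5);
        // Bits 1,1,1,0,1 followed by three padding zeros: 11101000
        System.out.println(Integer.toBinaryString(packed[0] & 0xFF));
    }
}
```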

    • Yes, I got carried away with the idea of bit-level writing. Actually, it does not matter much to me how many bytes and bits I write. What I would really like is to abstract away reading the required number of bytes and extracting the necessary number of bits from them. But apparently there is nothing like that in the standard libraries. - Alexander
    • BitSet bs = new BitSet(3); System.out.println(bs.size()); // 64. So size() does not fit either. - Alexander
    • You will not find anything like this in the standard collections; even BitSet is not recommended for use. There is nothing that exotic in there. Try List<Boolean> if performance is not a big concern. It has methods for everything you need: taking the required number of "bits" and getting the size of the "bit" array. - Temka too

    For my own purposes, I still ended up implementing a wrapper class around the existing BitSet class. I would also note that the .length() and .toByteArray() methods (as well as the reverse conversion from a byte array to a BitSet) behave quite sensibly: the least significant bit, to which BitSet assigns index 0, corresponds to the least significant, “rightmost” bit of the byte. It was I who needed to depart from the standard layout.

    And here is the class itself:

        import java.io.ByteArrayInputStream;
        import java.io.ByteArrayOutputStream;
        import java.io.File;
        import java.io.FileInputStream;
        import java.io.FileNotFoundException;
        import java.io.FileOutputStream;
        import java.io.IOException;
        import java.io.OutputStream;
        import java.util.BitSet;
        import java.util.logging.Level;
        import java.util.logging.Logger;

        public class MyBitSet {
            BitSet bitset;
            private int bitLength; // number of meaningful bits
            private int bitcount;  // bit counter for the temporary byte; used as the shift amount
            private int tmpByte;   // buffer byte
            private ByteArrayOutputStream baos;
            private ByteArrayInputStream bais;

            // Empty set that accumulates bits through concatenate().
            public MyBitSet() {
                bitLength = 0;
                bitcount = 7;
                baos = new ByteArrayOutputStream();
            }

            public MyBitSet(int bitSetLength) {
                this.bitset = new BitSet(bitSetLength);
                this.bitLength = bitSetLength;
            }

            public MyBitSet(BitSet bs, int length) {
                bitset = bs;
                this.bitLength = length;
            }

            // Reads the whole file; the most significant bit of each byte gets the lowest index.
            public MyBitSet(String filename) {
                bais = null;
                FileInputStream fis = null;
                try {
                    File f = new File(filename);
                    fis = new FileInputStream(f);
                    byte[] byteFile;
                    byteFile = new byte[(int) f.length()];
                    fis.read(byteFile, 0, (int) f.length());
                    this.bitset = new BitSet((int) f.length());
                    for (int i = 0; i < byteFile.length; i++) {
                        byte b = byteFile[i];
                        for (int j = 0; j < 8; j++) {
                            bitset.set(i * 8 + j, getBit(b, 7 - j));
                        }
                    }
                } catch (FileNotFoundException ex) {
                    Logger.getLogger(MyBitSet.class.getName()).log(Level.SEVERE, null, ex);
                } catch (IOException ex) {
                    Logger.getLogger(MyBitSet.class.getName()).log(Level.SEVERE, null, ex);
                } finally {
                    if (fis != null) {
                        try {
                            fis.close();
                        } catch (IOException ex) {
                            Logger.getLogger(MyBitSet.class.getName()).log(Level.SEVERE, null, ex);
                        }
                    }
                }
            }

            public long getBitLength() {
                return this.bitLength;
            }

            // Copies bits [start, end) into a new set starting at index 0.
            public MyBitSet getSubBitSet(int start, int end) {
                MyBitSet mbs = new MyBitSet(end - start);
                // j must advance together with i (the original never incremented it)
                for (int i = start, j = 0; i < end; i++, j++) {
                    mbs.bitset.set(j, this.bitset.get(i));
                }
                return mbs;
            }

            @Override
            public int hashCode() {
                return this.bitset.hashCode();
            }

            @Override
            public boolean equals(Object obj) {
                if (!(obj instanceof MyBitSet)) {
                    return false;
                }
                if (this == obj) {
                    return true;
                }
                MyBitSet set = (MyBitSet) obj;
                return this.bitset.equals(set.bitset) && this.bitLength == set.bitLength;
            }

            // Appends the first `length` bits of bs, filling each byte from its most significant bit.
            public void concatenate(BitSet bs, int length) {
                for (int i = 0; i < length; i++, bitLength++, bitcount--) {
                    int myInt = (bs.get(i)) ? 1 : 0; // true = 1, false = 0
                    tmpByte |= (myInt << (bitcount));
                    if (bitcount == 0) {
                        // the byte is full; the loop's bitcount-- then brings this back to 7
                        bitcount = 8;
                        baos.write(tmpByte);
                        tmpByte = 0;
                    }
                }
            }

            void writeToFile(String filename) {
                if (bitcount != 7) {
                    baos.write(tmpByte); // append the last bits remaining in the temporary byte
                }
                OutputStream outputStream = null;
                try {
                    outputStream = new FileOutputStream(filename);
                    baos.writeTo(outputStream);
                } catch (FileNotFoundException ex) {
                    Logger.getLogger(MyBitSet.class.getName()).log(Level.SEVERE, null, ex);
                } catch (IOException ex) {
                    Logger.getLogger(MyBitSet.class.getName()).log(Level.SEVERE, null, ex);
                } finally {
                    if (outputStream != null) {
                        try {
                            outputStream.flush();
                            outputStream.close();
                        } catch (IOException ex) {
                            Logger.getLogger(MyBitSet.class.getName()).log(Level.SEVERE, null, ex);
                        }
                    }
                }
            }

            byte[] getByteArray() {
                if (bitcount != 7) {
                    baos.write(tmpByte); // append the last bits remaining in the temporary byte
                }
                return baos.toByteArray();
            }

            private static boolean getBit(byte b, int i) {
                return ((b >> i) & 1) == 1;
            }
        }

    I have a very small library for such purposes called JBBP, which supports reading and writing bits; you can use its JBBPBitInputStream and JBBPBitOutputStream classes.