First we define a bytes array:
s = bytearray(b"Hello World") for i in s: print i ... 72 101 108 108 111 32 87 111 114 108 100
now, let's write this data to a file:
f = open('helloword.bin','wb') for i in s: f.write(struct.pack("I",i)) f.close()
Let's inspect the file created:
$ du -h points.bin 4.0K points.bin $ file points.bin points.bin: data $ less points.bin "points.bin" may be a binary file. See it anyway?
Binary file sizes
let's us write "hello world" into a text file in a text form:f = open('helloword.txt") f.write("hello world") f.close()
Once again we can inspect the file:
$ du -h helloworld.txt 4.0K helloworld.txt
Now, what happens if we make a longer binary array?
s.split() s.append(33) bytearray(b'Hello World!') for i in range(10000): s.append(33) f = open('longhelloword.bin', 'wb') import struct for i in s: f.write(struct.pack("I",i)) f.close() f = open('longhelloword.txt', 'w') hello = "Hello World!" for i in range(10000): hello = hello+"!"In a shell, examaining the file sizes:
$ du -h longhelloword.bin 40K longhelloword.bin $ du -h longhelloword.txt 12K longhelloword.txt
Wait a minute ! Why is the binary file almost 4 times bigger?
The answer is: it depends on the format specifier instruct.pack
. Namely, we used an
unsigned int
, for each character we then reserved 4 bytes!. When we saved the text, every character was assigned to the file exactly as a
char
which takes one byte only. If we repeat the above with
struct.pack("b",i)
the sizes of the file won't differ: f = open('longhellowordwithchar.bin', 'w') for i in s: f.write(struct.pack('b',i)) f.close()
and in the shell:
$ du longhellowordwithchar.bin 12 longhellowordwithchar.bin $ du longhelloword.txt 12 longhelloword.txt
Credits:
http://dabeaz.blogspot.de/2010/01/few-useful-bytearray-tricks.html
http://docs.python.org/library/struct.html#module-struct