binasc: binary/ascii file viewing/creation




1. Displaying ASCII codes for bytes in a file.

The binasc program can convert a file into an ASCII list of hexadecimal numbers, one for each byte in the input file, and can also display any printable ASCII characters that correspond to those bytes. The example output below shows the beginning of the output from binasc when it is run on the binasc program file itself. Note that the lines come in pairs: first a line giving the byte values, then a comment line showing any printable ASCII characters.

     7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 02 00 03 00 01 00 00 00 ac 
    ;    E  L  F                                                                
    
     8c 04 08 34 00 00 00 68 5e 00 00 00 00 00 00 34 00 20 00 05 00 28 00 16 00 
    ;          4           h  ^                    4                 (          
    
     15 00 06 00 00 00 34 00 00 00 34 80 04 08 34 80 04 08 a0 00 00 00 a0 00 00 
    ;                   4           4           4                               
    
     00 05 00 00 00 04 00 00 00 03 00 00 00 d4 00 00 00 d4 80 04 08 d4 80 04 08 
    ;                                                                           
    
     13 00 00 00 13 00 00 00 04 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 00 
    ;                                                                           
    
     80 04 08 00 80 04 08 78 5a 00 00 78 5a 00 00 05 00 00 00 00 10 00 00 01 00 
    ;                      x  Z        x  Z                                     
    
     00 00 78 5a 00 00 78 ea 04 08 78 ea 04 08 2c 02 00 00 38 03 00 00 06 00 00 
    ;       x  Z        x           x           ,           8                   
    
     00 00 10 00 00 02 00 00 00 04 5c 00 00 04 ec 04 08 04 ec 04 08 a0 00 00 00 
    ;                               \                                           
    
     a0 00 00 00 06 00 00 00 04 00 00 00 2f 6c 69 62 2f 6c 64 2d 6c 69 6e 75 78 
    ;                                     /  l  i  b  /  l  d  -  l  i  n  u  x 
    
     2e 73 6f 2e 32 00 00 25 00 00 00 38 00 00 00 00 00 00 00 0d 00 00 00 20 00 
    ; .  s  o  .  2        %           8                                        
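
The paired-line layout above can be approximated in a few lines of Python. This is only a sketch, not the binasc source: the function name dump_with_comments, the 25-byte line width (counted from the example output), and the choice of 0x21 to 0x7e as the printable range are assumptions made for illustration.

```python
def dump_with_comments(data, width=25):
    """Return hex lines paired with ';' comment lines (illustrative only)."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # One " xx" cell per byte, followed by a trailing space.
        hex_line = "".join(f" {b:02x}" for b in chunk) + " "
        # One 3-character cell per byte; a printable character (assumed
        # range 0x21..0x7e) sits under the second hex digit of its byte.
        cells = "".join(
            f"  {chr(b) if 0x21 <= b <= 0x7e else ' '}" for b in chunk
        )
        # The leading ';' takes the place of the first blank column.
        comment_line = ";" + cells[1:] + " "
        lines += [hex_line, comment_line]
    return lines


for line in dump_with_comments(b"\x7fELF\x01\x01\x01\x00"):
    print(line)
```

Placing each character under its own byte keeps the comment line the same length as the hex line, so the two can be read as a pair.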
    

There are two other main viewing options for the binasc command: -a for displaying only ASCII printable bytes, and -b for displaying only the hexadecimal numbers for the bytes.

The -a option displays only the ASCII-printable characters in a file. Runs of unprintable characters, which would otherwise appear as multiple spaces, are suppressed in the output. The -a option is a good way to search for text in a binary file. Here is example output using the same file as in the example shown above:

    ELF 4 h^ 4 ( 4 4 4 xZ xZ xZ x x , 8 \ /lib/ld-linux.so.2 % 8 # / 5 ! % , "
    & 7 $ 6 ) 1 + 0 - 2 3 4 ( ' * . ) p ? ` h E 1 K " ] L " n \ " | " L h U < i
    ( < > ( 8 @ ( = D > K > e , v 0 , ) E . l I l 3 y E | Q i a C \ | ' | ! !
    __gmon_start__ libg++.so.2.7.2 _DYNAMIC _GLOBAL_OFFSET_TABLE_ _init _fini
    __builtin_vec_new __builtin_delete __builtin_new __builtin_vec_delete
    __ls__7ostreamPCc __ctype_b __ctype_tolower write__7ostreamPCci
    get__7istreamRc _vt.3ios _vt.7ostream.3ios __ls__7ostreami cerr exit
    __strtod_internal __ls__7ostreamc cout strchr strcmp atexit
    libstdc++.so.2.7.2 __11fstreambasei _vt.7istream.3ios _vt.8ifstream.3ios
    __11fstreambaseiPCcii open__11fstreambasePCcii _vt.8iostream.3ios
    _vt.7fstream.3ios close__11fstreambase _._7fstream _._8ifstream
    getline__7istreamPcic read__7istreamPci hex__FR3ios __ls__7ostreaml
    endl__FR7ostream libm.so.6 libc.so.6 __libc_init_first bsearch qsort
    __strtol_internal strcpy strncpy strtok _environ __environ environ _start
    _etext _edata __bss_start _end 1 0 @ h | - ! ( ' , * + ) $ . / % # " & U S
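
The idea behind the -a view can be sketched as follows. The function name printable_words and the printable range 0x21 to 0x7e are assumptions, and the line wrapping seen in the output above is left out.

```python
def printable_words(data):
    """Keep printable ASCII, turning every other byte into a separator."""
    # Assumption: bytes 0x21..0x7e count as printable; all others act as spaces.
    chars = [chr(b) if 0x21 <= b <= 0x7e else " " for b in data]
    # split() with no argument collapses runs of spaces into single breaks.
    return " ".join("".join(chars).split())


# A few bytes taken from the start of the listing shown earlier.
sample = bytes.fromhex("7f454c460101018c040834000000685e")
print(printable_words(sample))  # ELF 4 h^
```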
    

Alternately, with the -b option, you can print only the hexadecimal codes for each byte in the file. Unlike the Unix od command, binasc does not group bytes into two-byte words when displaying them as hexadecimal numbers (such grouping swaps the order of the bytes in the output display on little-endian computers). Here is example output from the -b option, using the same file as in the previous examples:

    7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 02 00 03 00 01 00 00 00 ac 
    8c 04 08 34 00 00 00 68 5e 00 00 00 00 00 00 34 00 20 00 05 00 28 00 16 00 
    15 00 06 00 00 00 34 00 00 00 34 80 04 08 34 80 04 08 a0 00 00 00 a0 00 00 
    00 05 00 00 00 04 00 00 00 03 00 00 00 d4 00 00 00 d4 80 04 08 d4 80 04 08 
    13 00 00 00 13 00 00 00 04 00 00 00 01 00 00 00 01 00 00 00 00 00 00 00 00 
    80 04 08 00 80 04 08 78 5a 00 00 78 5a 00 00 05 00 00 00 00 10 00 00 01 00 
    00 00 78 5a 00 00 78 ea 04 08 78 ea 04 08 2c 02 00 00 38 03 00 00 06 00 00 
    00 00 10 00 00 02 00 00 00 04 5c 00 00 04 ec 04 08 04 ec 04 08 a0 00 00 00 
    a0 00 00 00 06 00 00 00 04 00 00 00 2f 6c 69 62 2f 6c 64 2d 6c 69 6e 75 78 
    2e 73 6f 2e 32 00 00 25 00 00 00 38 00 00 00 00 00 00 00 0d 00 00 00 20 00 
    00 00 15 00 00 00 00 00 00 00 07 00 00 00 0b 00 00 00 23 00 00 00 01 00 00 
    00 1d 00 00 00 14 00 00 00 16 00 00 00 0c 00 00 00 00 00 00 00 2f 00 00 00 
    0e 00 00 00 00 00 00 00 00 00 00 00 35 00 00 00 19 00 00 00 21 00 00 00 1f 
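
The difference between a byte-by-byte listing and a two-byte word listing can be seen with Python's struct module. This is a small illustration using the first four bytes of the example file; it does not call od or binasc.

```python
import struct

data = bytes.fromhex("7f454c46")

# Byte-by-byte listing, as in the -b output: file order is preserved.
per_byte = " ".join(f"{b:02x}" for b in data)

# Two-byte words as a little-endian machine reads them: the second byte
# of each pair becomes the high-order half of the word.
words = " ".join(f"{w:04x}" for w in struct.unpack("<2H", data))

print(per_byte)  # 7f 45 4c 46
print(words)     # 457f 464c
```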
    

2. Creating files by byte description

With the binasc program, you can convert an ASCII file of byte descriptions back into actual bytes by using the -c option. When using the -c option, you must specify an output file. Byte values can be written in the various formats described below.

binasc comments

    A semi-colon (;) marks the beginning of a comment which extends to the end of a line. A space (or tab) character must precede the semi-colon when the comment follows a number on a line.
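
    One way to picture the comment rule is to split a line into whitespace-separated words and stop at the first word that begins with a semi-colon. The helper name words_before_comment is hypothetical, and this simple splitter ignores the ASCII-character input form described later.

```python
def words_before_comment(line):
    """Return the whitespace-separated words that precede any comment."""
    words = []
    for word in line.split():
        # A comment starts only at a word that begins with ';'. A ';'
        # glued to the end of a number stays part of that word.
        if word.startswith(";"):
            break
        words.append(word)
    return words


print(words_before_comment("00 00 00 32   ; the byte offset of the data"))
```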

binasc hexadecimal numbers

    Hexadecimal numbers specify one byte each. They must contain no more than 2 digits and range from 00 to ff (0 to 255 in decimal notation, or -128 to 127 in signed decimal notation). Examples of valid hexadecimal numbers:
      7f 45 4c 46 1 1 1 0 0 
      8c 04 08 34 0 0 0 8 e 
      15 00 06 10 0 0 4 0 0 
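
    The hexadecimal rule (one or two hex digits per byte) can be checked with a short function. The name parse_hex_byte is hypothetical and the sketch is not taken from the binasc source.

```python
import string


def parse_hex_byte(word):
    """Convert a 1- or 2-digit hexadecimal word into a byte value (0..255)."""
    if not 1 <= len(word) <= 2 or not set(word) <= set(string.hexdigits):
        raise ValueError(f"not a one-byte hexadecimal number: {word!r}")
    return int(word, 16)


print([parse_hex_byte(w) for w in "7f 45 4c 46 1 1 1 0 0".split()])
```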
      

binasc binary numbers

    Binary numbers are recognized as numbers longer than two characters, or numbers containing a comma. A binary number is allowed to have up to 8 digits (bits), since it represents one byte in the output file. An optional comma splits the number into two parts with up to 4 bits on each side of the comma.

    For example, 0010 is the binary number equal to the decimal number "2". 0010 is equivalent to 0,0010 or 0000,0010. Note that 10 is the hexadecimal number equal to the decimal number "16" and is not the binary number equal to the decimal number "2".

    Here are some binary numbers examples:

       binary    decimal        invalid     reason
       0,0       = 0              ,0          cannot start or end with comma
       0000,0000 = 0              0000, 0000  cannot have spaces around comma
       00000000  = 0              1001010110  maximum of 8 binary digits
       1,1       = 17             10011,1000  max of 4 digits each side of comma
       0001,0001 = 17             
       010       = 2              10          interpreted as a hexadecimal number
       0,101     = 5              
       00000101  = 5
       101       = 5
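
    The rows of the table above can be reproduced with a small parser. The function name parse_binary_byte is hypothetical; the rules it applies are the ones listed in the table, not code from binasc itself.

```python
def parse_binary_byte(word):
    """Convert a binasc-style binary word into a byte value (0..255)."""
    bits = set("01")
    if "," in word:
        high, _, low = word.partition(",")
        # Each side needs 1 to 4 binary digits, so a leading or trailing
        # comma, a second comma, or a 5-digit half is rejected.
        for half in (high, low):
            if not 1 <= len(half) <= 4 or not set(half) <= bits:
                raise ValueError(f"invalid binary number: {word!r}")
        # The left half is the upper 4 bits, the right half the lower 4.
        return int(high, 2) * 16 + int(low, 2)
    # Without a comma, 3 to 8 digits; shorter words are hexadecimal.
    if not 3 <= len(word) <= 8 or not set(word) <= bits:
        raise ValueError(f"invalid binary number: {word!r}")
    return int(word, 2)


for word in ["0,0", "1,1", "0001,0001", "010", "0,101", "00000101", "101"]:
    print(word, "=", parse_binary_byte(word))
```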
    

binasc decimal numbers

    binasc decimal numbers, unlike hexadecimal or binary numbers, can fill slots of 1-4 bytes for integers, or 4 or 8 bytes for floating-point numbers. Decimal numbers may also be either positive or negative, unlike hexadecimal or binary number input.

    A decimal number starts with a quote character ('). There are two specifications which can be given just before the quote:

    1. a number in the range from 1 to 4 which specifies how many bytes an integer decimal number is to be stored in. Floating-point numbers can be either 4 or 8 bytes in size. The default size for floating-point numbers is 4 bytes if no size is specified.
    2. the symbol "u" can be given before the quote character in a decimal number to set the order in which the bytes of the number are written to the file. Without the letter "u", the most significant byte is written first (big-endian); with the letter "u", the bytes are written in reverse order (little-endian). For example, the decimal number 1234 can be represented by the two-byte hexadecimal number 0x04d2. In big-endian storage the 04 byte is written first, then the d2 byte. In little-endian storage the d2 byte is written first, then the 04 byte.
            1234 = 0x04d2       big endian:   04 d2      little endian:   d2 04
            decimal binasc representations:   '1234                      u'1234
        hexadecimal binasc representations:   04 d2                       d2 04
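
    The two byte orders for 1234 can be checked with Python's built-in int.to_bytes method, used here only as an illustration of the storage orders named above:

```python
value = 1234  # 0x04d2

big = value.to_bytes(2, "big")        # corresponds to 2'1234
little = value.to_bytes(2, "little")  # corresponds to 2u'1234

print(big.hex(" "))     # 04 d2
print(little.hex(" "))  # d2 04
```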
      

    When a byte size is not specified before the quote character, the default is 1 for integers and 4 for floating-point. Without a byte size, valid integers range from -128 to 255: 0 to 255 if the byte is later read as unsigned, or -128 to 127 if it is read as signed, so you have to know which representation was intended. If you specify a byte size of 1, you can give any integer value, but it will be truncated to fit into one byte. The maximum unsigned integer which can fill 4 bytes is 4294967295 (hexadecimal 0xffffffff).

    If a decimal number includes a period character (.) it is assumed to be a floating-point number. Floating-point numbers can be either 4 or 8 bytes. Integer numbers can be between 1 and 4 bytes, although 3-byte integers can only be positive.

    Examples of decimal numbers:

                          
         valid                     invalid 
         examples                  examples      reason
         '0      =    0               123         does not start with a quote
         '255    =  255             
         1'256   =   0 (truncated)   '256         exceeds one byte in size
         2'256   = 256
         4'44100 = 44100
         4u'453  = 453 (but bytes are written least-significant byte first)
         u4'453  = 453 (same as above)
         2'-5    = -5  (short int)   2' -5        cannot have a space around quote
         '3.1415 = 3.1415 (4-byte storage, float in the C language)
         8'3.1415 = 3.1415 (8-byte storage, double in the C language)
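
    The decimal rules can be collected into one encoder, sketched below. The name encode_decimal is hypothetical. Two behaviours are assumptions rather than documented facts: integers given with an explicit size are wrapped (truncated) to that size for every size from 1 to 4, and the restriction on 3-byte integers is not enforced.

```python
import struct


def encode_decimal(word):
    """Encode a binasc-style decimal word such as 2u'-5 into bytes."""
    prefix, quote, body = word.partition("'")
    if not quote or not body or body != body.strip():
        raise ValueError(f"not a decimal number: {word!r}")
    little = "u" in prefix                # "u" selects little-endian order
    size_text = prefix.replace("u", "", 1)
    if size_text and not size_text.isdigit():
        raise ValueError(f"bad size prefix: {word!r}")

    if "." in body:                       # a period marks a float
        size = int(size_text) if size_text else 4
        if size not in (4, 8):
            raise ValueError("floats must be 4 or 8 bytes")
        code = ("<" if little else ">") + ("f" if size == 4 else "d")
        return struct.pack(code, float(body))

    value = int(body)
    if size_text:
        size = int(size_text)
        if not 1 <= size <= 4:
            raise ValueError("integers must be 1 to 4 bytes")
    else:
        size = 1                          # default integer size
        if not -128 <= value <= 255:
            raise ValueError("value exceeds one byte in size")
    # Assumption: wrap the value into the requested width (truncation).
    wrapped = value % (1 << (8 * size))
    return wrapped.to_bytes(size, "little" if little else "big")


for word in ["'255", "1'256", "2'256", "4'44100", "4u'453", "2'-5"]:
    print(word, "->", encode_decimal(word).hex(" "))
```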
    

binasc ascii bytes

    ASCII characters can be input by preceding each with a plus (+). Each character is a separate word. For example, to place the characters "cat" into a file, the input would be "+c +a +t".
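
    A sketch of how plus-prefixed characters might be read is given below. The function name ascii_words_to_bytes is hypothetical. Treating a plus followed by a space as the space character is inferred from the soundfile comment in example 1, not from a stated rule.

```python
def ascii_words_to_bytes(text):
    """Turn words like +c +a +t into the bytes they name."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "+" and i + 1 < len(text):
            # The character right after '+' is taken literally, even if
            # it is a space (inferred behaviour, see the note above).
            out.append(ord(text[i + 1]))
            i += 2
        elif text[i].isspace():
            i += 1                        # separator between words
        else:
            raise ValueError(f"unexpected character at position {i}")
    return bytes(out)


print(ascii_words_to_bytes("+c +a +t"))  # b'cat'
```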

example 1

    The following file will compile into a NeXT/Sun soundfile with five zero-valued sound samples. This example has lots of comments.
       2e 73 6e 64      ; the magic number which identifies the type of the file
      ; .  s  n  d      ; character equivalents of the magic number digits
      
      
      00 00 00 32       ; the byte offset of the data (50 bytes precede the data)
                        ; i.e., the header contains 50 bytes
      
      00 00 00 0a       ; the number of bytes in the data (10 bytes).
      
      00 00 00 03       ; the NeXT/sun data format (3 = 16-bit Linear sound)
      
      00 00 ac 44       ; the sampling rate, which is 44100 samples/sec here
      
      00 00 00 01       ; the number of channels (1 = monophonic soundfile)
      
                        ; next comes a sound file comment:
       54 68 69 73 20 69 73 20 61 20 62 6c 61 6e 6b 20 73 6f 75 6e 64 66 69 6c 65 2e
      ; T  h  i  s     i  s     a     b  l  a  n  k     s  o  u  n  d  f  i  l  e  .
      
      ; finally the individual sample data:
      
      00 00       ; first 16-bit sample (big-endian)
      00 00       ; second 16-bit sample (big-endian)
      00 00       ; third 16-bit sample (big-endian)
      00 00       ; fourth 16-bit sample (big-endian)
      00 00       ; fifth 16-bit sample (big-endian)
      
      ; end of example soundfile.
      

    Here is a more succinct version of the previous example:

      +. +s +n +d      ; magic number (characters ".snd")
      4'50             ; header bytes (the decimal number 50 filling 4 bytes)
      4'10             ; number of bytes of sample data
      4'3              ; format
      4'44100          ; srate
      4'1              ; channels
                       ; comment:
      +T +h +i +s +  +i +s +  +a +  +b +l +a +n +k +  +s +o +u +n +d +f +i +l +e +.
      
                       ; sample data shown in various input possibilities
      00 00            ; sample 1: hexadecimal digits
      '0 '0            ; sample 2: decimal digits
      2'0              ; sample 3: decimal number 0 filling up two bytes
      0000,0000 0,0    ; sample 4: binary digits
      2u'0             ; sample 5: decimal digits filling two bytes, but
                       ; using little endian byte ordering (backward).
      ; end of example soundfile.
      
    The simplest view of the previous example:
      2e 73 6e 64 00 00 00 32 00 00 00 0a 00 00 00 03 00 00 ac 44 00 00 00 01 
      54 68 69 73 20 69 73 20 61 20 62 6c 61 6e 6b 20 73 6f 75 6e 64 66 69 6c 
      65 2e 00 00 00 00 00 00 00 00 00 00 
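
    For comparison, the same 60 bytes can be assembled with Python's struct module, where ">" selects big-endian packing. This is an independent illustration of the file layout described above; the variable names are arbitrary.

```python
import struct

comment = b"This is a blank soundfile."

# Six 4-byte header fields (24 bytes) precede the comment text.
header = b".snd" + struct.pack(
    ">5I",
    24 + len(comment),  # byte offset of the sample data (50)
    10,                 # number of bytes of sample data
    3,                  # data format (16-bit linear)
    44100,              # sampling rate
    1,                  # number of channels
)
samples = struct.pack(">5h", 0, 0, 0, 0, 0)  # five 16-bit zero samples

sound = header + comment + samples
print(len(sound))      # 60
print(sound.hex(" "))
```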
      

example 2

    Just for fun, here is a WAVE format soundfile with the same contents as the previous examples (5 zero-valued samples). Notice that most data fields in the file are little-endian forms of numbers (since Intel computers are little-endian).

      52 49 46 46 2e 00 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 01 00 
      44 ac 00 00 88 58 01 00 02 00 10 00 64 61 74 61 0a 00 00 00 00 00 00 00 
      00 00 00 00 00 00 
      
    Which is equivalent to:
      ; This is a WAVE formatted soundfile with 5 zero samples.
      +R +I +F +F           ; RIFF chunk descriptor
      4u'46                 ; size of the chunk in bytes
      +W +A +V +E           ; format is the type of RIFF that follows
      +f +m +t +            ; the "fmt " subchunk
      4u'16                 ; number of bytes total in subchunk which follow
      2u'1                  ; audio format (PCM Linear)
      2u'1                  ; number of channels
      4u'44100              ; sampling rate 44100 = 00 00 ac 44, 4u'44100 = 44 ac 00 00
      4u'88200              ; byte rate = srate * channels * bitspersample / 8.
      2u'2                  ; block align (channels * bits per sample / 8)
      2u'16                 ; bits per sample
      +d +a +t +a           ; "data" subchunk
      4u'10                 ; size of data subchunk in bytes which follows
      2u'0                  ; sample 1
      2u'0                  ; sample 2
      2u'0                  ; sample 3
      2u'0                  ; sample 4
      2u'0                  ; sample 5
      ; end of example wave file.
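
    The WAVE layout can be cross-checked the same way with Python's struct module, where "<" selects little-endian packing. This is an illustrative sketch with arbitrary variable names; it shows that the sampling rate and byte rate each occupy four bytes.

```python
import struct

channels, rate, bits = 1, 44100, 16
samples = struct.pack("<5h", 0, 0, 0, 0, 0)  # five 16-bit zero samples

fmt_chunk = b"fmt " + struct.pack(
    "<IHHIIHH",
    16,                            # bytes that follow in this subchunk
    1,                             # audio format (PCM linear)
    channels,
    rate,                          # 4-byte sampling rate
    rate * channels * bits // 8,   # byte rate
    channels * bits // 8,          # block align
    bits,                          # bits per sample
)
data_chunk = b"data" + struct.pack("<I", len(samples)) + samples

body = b"WAVE" + fmt_chunk + data_chunk
wave = b"RIFF" + struct.pack("<I", len(body)) + body

print(len(body))       # 46, the RIFF chunk size
print(wave.hex(" "))
```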
      
      

example 3

    Note that you can reverse the process of the binasc program unless you specify the -a option:

       binasc file1 > file2
       binasc -c file3 file2
       ; file1 and file3 should be the same
    
       binasc -b file1 > file2
       binasc -c file3 file2
       ; file1 and file3 should be the same
    
       binasc -a file1 > file2
       binasc -c file3 file2         ; this results in an error
    
    
    
    

    Here is the usage statement for the binasc program:

    For converting/compiling a binary file to/from an ASCII listing of   
    individual bytes of the file.                                        
                                                                         
    Usage: binasc.linux [-a | -b | -c output] input(s)
                                                                         
    Options:                                                             
       -a = output only non-space printable asci words                   
       -b = output only hexadecimal ascii numbers for each byte          
       -c output = compile binary file to output file
       -m = display the man page for the program                         
       no options = combination of -a and -b options.                    
       --options  = list of all options, aliases and defaults     
    
    
    

    View the source files: binasc.cpp , entire source

    -- Craig Stuart Sapp