Variable Length Values





Motivation for Variable Length Values

    Standard MIDI Files contain MIDI data.

    However, MIDI data by itself is not useful to store unless you also store/record a time associated with that MIDI data.

    Therefore, at the top level, Standard MIDI Files contain things called events. Each event consists of two components: a time and a MIDI message. These time/message pairs follow one after another in a MIDI file, so the data in a MIDI file goes like this in one long stream:

         time message time message time message time message time message
         time message time message time message time message time message
         time message time message time message time message time message
         time message time message time message time message time message
    

    The time value is a measurement of the time to wait before playing the next message in the stream of MIDI file data. This method of specifying the time is called delta time, where delta is a jargon term in mathematics and related fields which means "difference" -- so a delta time is a time value which specifies the duration (or time difference) between two events.

    Now note that MIDI messages usually come after each other quickly -- especially if there are chords. So the predominant time between each MIDI message in a MIDI file is going to be small.

    MIDI files, like plain MIDI data, are stored in the form of bytes. Bytes can be thought of as numbers in the range from 0 to 255, since a byte is a binary number with 8 digits, which can store numbers from 00000000 to 11111111.

    This poses a problem in binary data such as MIDI files. The MIDI file is just a stream of numbers, each of which is in the range from 0 to 255. How do you tell when a time or message starts/stops? For example, here is some data from a MIDI file (in hexadecimal notation):

      00 ff 58 04 04 02 30 08 00 ff 59 02 00 00 00 90 3c 28 81 00 90 3c 
      00 00 90 3c 1e 81 00 90 3c 00 00 90 43 2d 81 00 90 43 00 00 90 43 
      32 81 00 90 43 00 00 90 45 2d 81 00 90 45 00 00 90 45 32 81 00 90 
      45 00 00 90 43 23 82 00 90 43 00 00 90 41 32 81 00 90 41 00 00 90 
      41 2d 81 00 90 41 00 00 90 40 32 40 90 40 00 40 90 40 28 40 90 40 
      00 40 90 3e 2d 40 90 3e 00 40 90 3e 32 40 90 3e 00 40 90 3c 1e 82 
      00 90 3c 00 00 ff 2f 00
      
    How do you separate this stream of data into time/message pairs? If you don't know anything about MIDI, you are in the same position as a computer which is reading this data. In order to do the segregation into time/message pairs, you have to know the MIDI protocol and about Variable Length Values. Once you have a basic understanding of MIDI files, you should be able to distinguish the time/message pairs with hardly any effort:

      DELTA TIME   MESSAGE
      00           ff 58 04 04 02 30 08 
      00           ff 59 02 00 00 
      00           90 3c 28 
      81 00        90 3c 00 
      00           90 3c 1e 
      81 00        90 3c 00 
      00           90 43 2d 
      81 00        90 43 00 
      00           90 43 32 
      81 00        90 43 00 
      00           90 45 2d 
      81 00        90 45 00 
      00           90 45 32 
      81 00        90 45 00 
      00           90 43 23 
      82 00        90 43 00 
      00           90 41 32 
      81 00        90 41 00 
      00           90 41 2d 
      81 00        90 41 00 
      00           90 40 32 
      40           90 40 00 
      40           90 40 28 
      40           90 40 00 
      40           90 3e 2d 
      40           90 3e 00 
      40           90 3e 32 
      40           90 3e 00 
      40           90 3c 1e 
      82 00        90 3c 00 
      00           ff 2f 00
      

    Notice that time values are sometimes two bytes in length and sometimes one byte in length. Using more than one byte for the delta time implies a longer time value. Time values in a MIDI file are stored as Variable Length Values (VLV). A VLV is a number with a variable width.

    I won't explain how to segment the MIDI message data here, but here is a simple algorithm for parsing a VLV value out of a byte stream:

       If byte is greater or equal to 80h (128 decimal) then the next byte 
            is also part of the VLV,
       else byte is the last byte in a VLV.
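
    The rule above can be sketched in Python as follows. This is only a minimal illustration, not code from any MIDI library: the function name read_vlv and its pos parameter are made up for this example, and it assumes the data really does contain a complete VLV at the given position (otherwise it raises an IndexError).

```python
def read_vlv(data, pos=0):
    """Read one VLV from `data`, starting at index `pos`.

    Hypothetical helper for illustration. Returns (value, next_pos),
    where next_pos is the index of the first byte after the VLV.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        # Keep the low 7 data bits and append them to the value so far.
        value = (value << 7) | (byte & 0x7F)
        # A byte below 80h is the last byte of the VLV.
        if byte < 0x80:
            return value, pos


# One event from the example data: delta time 81 00, message 90 3c 00.
stream = bytes.fromhex("81 00 90 3c 00")
value, next_pos = read_vlv(stream)
print(value, next_pos)  # 128 2
```

    The returned position (2 here) is where the MIDI message starts, so the same function can be called again after the message has been read.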
    

    Why use VLV's? Why not use time values which are stored in a fixed number of bytes, enough to hold the maximum delta time necessary? This is possible, and perhaps preferable, since it would make things easier. Then the byte stream in a MIDI file might look like this:

      DELTA TIME    MESSAGE
      00 00 00 00   ff 58 04 04 02 30 08 
      00 00 00 00   ff 59 02 00 00 
      00 00 00 00   90 3c 28 
      

    The people who designed the structure of the Standard MIDI File were concerned about storage space. The MIDI protocol was created in the early 1980's, when personal computers often didn't have hard disks and floppy disks held 256KB of information at most. Now, in the year 1999, a 1GB hard disk is considered small (yet it could hold the contents of around 4,000 256KB floppies). We can't easily change the MIDI file format now, however, because there are lots of MIDI files and programs that read MIDI files which would become obsolete for no significant improvement in the file format other than to make it easier for humans to read (which they shouldn't be doing anyway).

    Since delta times are usually small, we can use just one byte usually to store the time. However if delta times are large, then more than one byte is needed to store the delta time. Therefore, by using VLV's, the size of a MIDI file is reduced. So, you can think of VLV's as a form of compression.
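
    To put a rough number on the savings, the sketch below counts the bytes used by the delta times of the example data (81 00 is 128, 82 00 is 256 and 40 is 64 in decimal) and compares that with the hypothetical fixed 4-byte layout shown above. The helper name vlv_length is made up for this illustration.

```python
def vlv_length(value):
    """Number of bytes needed to store `value` as a VLV (hypothetical helper)."""
    length = 1
    # Each VLV byte carries 7 data bits, so add a byte per extra 7 bits.
    while value > 0x7F:
        value >>= 7
        length += 1
    return length


# Delta times (in decimal) of the 31 events in the example data.
deltas = [0, 0, 0, 128, 0, 128, 0, 128, 0, 128, 0, 128, 0, 128, 0, 256,
          0, 128, 0, 128, 0, 64, 64, 64, 64, 64, 64, 64, 64, 256, 0]

vlv_bytes = sum(vlv_length(d) for d in deltas)
fixed_bytes = 4 * len(deltas)  # assumes 4 bytes per delta time
print(vlv_bytes, fixed_bytes)  # 41 124
```

    For this short example, the delta times take 41 bytes as VLV's instead of 124 bytes in the fixed-width layout.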

Definition of Variable Length Values

    A MIDI file Variable Length Value is stored in bytes. Each byte has two parts: 7 bits of data and 1 continuation bit. The highest-order bit is set to 1 if there is another byte of the number to follow. The highest-order bit is set to 0 if this byte is the last byte in the VLV.

    To recreate a number represented by a VLV, first remove the continuation bit from each byte, and then concatenate the leftover 7-bit groups, in order, into a single number.

    To generate a VLV from a given number, break the number up into 7 bit units and then apply the correct continuation bit to each byte.
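
    Generating a VLV could look something like the following sketch. The function name write_vlv is an assumption made for this example; it is not taken from any particular MIDI library, and it does not check the number against any upper limit.

```python
def write_vlv(value):
    """Encode a non-negative integer as VLV bytes (hypothetical helper)."""
    if value < 0:
        raise ValueError("a VLV cannot represent a negative number")
    # The last byte holds the lowest 7 bits, with continuation bit 0.
    out = [value & 0x7F]
    value >>= 7
    while value > 0:
        # Every earlier byte gets continuation bit 1 (80h added in).
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.reverse()  # the most significant 7-bit unit comes first
    return bytes(out)


print(write_vlv(0).hex())      # 00
print(write_vlv(128).hex())    # 8100
print(write_vlv(32768).hex())  # 828000
```

    The bytes are built from the least significant end and then reversed, since the number of 7-bit units is not known until the whole number has been split up.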

    In theory, you could have a very long VLV which represents a very large number; however, in the Standard MIDI File specification, the largest number allowed is 0FFFFFFF hex (268,435,455 decimal), so a VLV is at most 4 bytes long and carries at most 28 bits of data.

Examples

    The MIDI file delta-time 0h can be represented as a byte like this:
         00000000
      
    This is a variable length value with a length of one byte.

    First, separate the continuation bit from the data bits for the value:

         0  0000000
      

    What is the continuation bit? It is zero. This means that there are no more bytes to follow this byte for the delta time's value. Now look at the data bits and figure out what number they represent: 0000000 in binary notation is equal to 0 hex, or 0 decimal.

    Therefore the VLV 00000000 is equal to 0.

    
    
    

    Now look at a harder example: 81 00 (hex).

    Since 81 is larger than 80 (both in hex), you can see that the 00 byte is also part of the VLV. Let's look at 81 00 in binary notation:

      10000001 00000000
      
    Separate the continuation bit from the data bits:
      1  0000001 0  0000000
      
    Now you can see in binary notation what you can also see in hex notation -- that the first byte expects at least one more byte to follow it in the VLV. Remove the continuation bits:
       0000001  0000000
      
    This is the actual number that is represented by the VLV value 81 00, but maybe we should convert to decimal or hex so you understand it better. First, combine all of the bits into one clump:
       00000010000000
      
    I will convert first to hexadecimal, and later to decimal. For hexadecimal conversion of a binary number, arrange the digits into groups of 4, starting with the least significant (smallest, right) side of the number:
       00 0000 1000 0000
      
    There are only two digits left over on the left side, so add two more zeros to make a group of four digits:
       0000 0000 1000 0000
      
    We can use decimal notation to show why this can be done: the numbers 0045 and 45 are the same number. Now, convert the binary groupings into hexadecimal digits:
       0000 0000 1000 0000
          0    0    8    0
      
    Therefore the number is 0080 or 80 in hex notation. Convert 80 to decimal notation:
          80 hex = 8 * 16^1 + 0 * 16^0 = 8 * 16 + 0 = 128 decimal
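
    The hand conversion of 81 00 can be double-checked with a few lines of Python. This is just a sketch that mirrors the steps above using text strings of binary digits; the variable names are made up for this illustration.

```python
vlv = bytes.fromhex("81 00")

# Each byte as 8 binary digits: continuation bit followed by 7 data bits.
as_bits = [format(b, "08b") for b in vlv]
print(as_bits)    # ['10000001', '00000000']

# Drop the first (continuation) digit of each byte and join what is left.
data_bits = "".join(bits[1:] for bits in as_bits)
print(data_bits)  # 00000010000000

# Interpret the combined digits as one binary number.
value = int(data_bits, 2)
print(format(value, "x"), value)  # 80 128
```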
      
    Here is the same data as listed in the motivation section, but with the delta times converted to decimal numbers:
      DELTA TIME   MESSAGE
      0            ff 58 04 04 02 30 08 
      0            ff 59 02 00 00 
      0            90 3c 28 
      128          90 3c 00 
      0            90 3c 1e 
      128          90 3c 00 
      0            90 43 2d 
      128          90 43 00 
      0            90 43 32 
      128          90 43 00 
      0            90 45 2d 
      128          90 45 00 
      0            90 45 32 
      128          90 45 00 
      0            90 43 23 
      256          90 43 00 
      0            90 41 32 
      128          90 41 00 
      0            90 41 2d 
      128          90 41 00 
      0            90 40 32 
      64           90 40 00 
      64           90 40 28 
      64           90 40 00 
      64           90 3e 2d 
      64           90 3e 00 
      64           90 3e 32 
      64           90 3e 00 
      64           90 3c 1e 
      256          90 3c 00 
      0            ff 2f 00
      

    What are the units of measurement for delta times? In a MIDI file, the units are arbitrary, and you have to look at the header of the MIDI file to see what the units mean. For this example, the time units are 128 ticks to the quarter note, so 128 is a quarter-note duration, 256 is a half-note duration, and 64 is an eighth-note duration.
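
    As a small illustration, the delta times of this example can be turned into note durations by dividing by the number of ticks per quarter note. The sketch below assumes the value 128 has already been read from the file header; the names TICKS_PER_QUARTER and quarter_notes are made up for this example.

```python
from fractions import Fraction

TICKS_PER_QUARTER = 128  # assumed: taken from the header of this example file


def quarter_notes(delta_ticks, ticks_per_quarter=TICKS_PER_QUARTER):
    """Express a delta time as an exact number of quarter notes."""
    return Fraction(delta_ticks, ticks_per_quarter)


for ticks in (64, 128, 256):
    print(ticks, "ticks =", quarter_notes(ticks), "quarter note(s)")
# 64 ticks = 1/2 quarter note(s)
# 128 ticks = 1 quarter note(s)
# 256 ticks = 2 quarter note(s)
```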

    For the class, you can use the program vlv to convert between variable length values and the numbers that they represent (or vice-versa).

    Here are some more equivalences between VLV's and the number they represent:

            Variable length (in hex)    Real value in hex (dec)
            7F                          7F   (127)
            81 7F                       FF   (255) 
            82 80 00                    8000 (32768)
    

-- Craig Stuart Sapp