As part of a personal project, I needed to estimate the size of an MPEG-2 file (.mpg) once it had been DivXed, using the formula:
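Estimated KB = (Video bitrate + Audio bitrate) × (Runtime / 8)
(bitrates in kilobits per second, runtime in seconds; dividing by 8 converts kilobits to kilobytes)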
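As a minimal sketch, the formula translates to C# as follows. The class and method names are made up for illustration, and the units (kbps for the bitrates, seconds for the runtime) are an assumption about how the formula is meant to be used.

```csharp
using System;

public static class DivxSizeEstimator
{
    // Hypothetical helper: estimates the encoded size in kilobytes.
    // Assumes bitrates are in kilobits per second and runtime is in seconds.
    public static double EstimateKilobytes(double videoKbps, double audioKbps, double runtimeSeconds)
    {
        if (runtimeSeconds < 0)
            throw new ArgumentOutOfRangeException(nameof(runtimeSeconds));

        // (kilobits per second) * seconds / 8 = kilobytes
        return (videoKbps + audioKbps) * (runtimeSeconds / 8.0);
    }
}
```

For example, a one-hour video (3600 seconds) at 900 kbps video and 128 kbps audio works out to (900 + 128) × 450 = 462,600 KB.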
Based on that formula I thought it would be a pretty easy task. I was wrong. I just needed the result of this simple equation so I could drive a progress bar showing how far through the DivXing process the program was. It wasn’t exactly a major feature, so I didn’t expect it to take so much effort. I had all the information I needed except for one variable: the runtime of the video. I did a bit of searching to try and dig up some C# code that would give me this information, but came up pretty empty-handed. The only code I found was a C# port of part of the mpgtx project, but its duration calculation did not give even a remotely correct answer.
I expanded my search to code in any programming language that showed how to get this information, but found very little there either. Most of it was wrapped up in such convoluted projects that I’d have had more luck reverse engineering the MPEG format myself than the code I found. I did find a file on this site (MpegInfo.c in the MPEG Decoder project) that helped a little and gave me a basis to work from, but my code has ended up working very differently from it. (I think Nic’s has a minor bug in the timestamp calculation that can put it a few seconds out, which mine doesn’t appear to have.)
Of course, at this point I could have just said stuff it and not bothered with a progress bar for this part of the process. But like the nerd I am, I saw a challenge and couldn’t walk away without having a go at writing a class to parse an MPEG file myself. Several late nights later I’ve come up with something that works, and have decided others might find it useful too.
Some background info: the duration of an MPEG file isn’t stored in a header anywhere. To calculate it you have to find the first timestamp (usually 0:00:00) and the last timestamp, searching backwards from the end of the file for the latter, since scanning the whole file for it could take ages. Subtracting the first timestamp from the last gives you the duration of the video. It isn’t foolproof, as the timestamps can be affected by various problems, but it’s usually a pretty good indicator of the video length without having to parse the entire file.
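The search described above can be sketched roughly like this. It is not the code from the downloadable class: the `GopScanner` name and its methods are invented for illustration, and it assumes you have already read a chunk from the start of the file and a chunk from the end into byte arrays. It looks for the GOP start code (00 00 01 B8), scanning forwards for the first one and backwards for the last one.

```csharp
using System;

public static class GopScanner
{
    // GOP (Group Of Pictures) start code: 00 00 01 B8.
    private static readonly byte[] GopStartCode = { 0x00, 0x00, 0x01, 0xB8 };

    // Returns the index of the first GOP start code in the buffer, or -1 if none.
    public static int FindFirst(byte[] buffer)
    {
        for (int i = 0; i + GopStartCode.Length <= buffer.Length; i++)
            if (MatchesAt(buffer, i)) return i;
        return -1;
    }

    // Returns the index of the last GOP start code, scanning backwards from
    // the end of the buffer, or -1 if none.
    public static int FindLast(byte[] buffer)
    {
        for (int i = buffer.Length - GopStartCode.Length; i >= 0; i--)
            if (MatchesAt(buffer, i)) return i;
        return -1;
    }

    private static bool MatchesAt(byte[] buffer, int index)
    {
        for (int j = 0; j < GopStartCode.Length; j++)
            if (buffer[index + j] != GopStartCode[j]) return false;
        return true;
    }

    // Duration is simply the last timestamp minus the first.
    public static TimeSpan Duration(TimeSpan first, TimeSpan last)
    {
        return last - first;
    }
}
```

The timestamp itself sits in the bytes immediately after each start code. Note that the last timestamp marks the start of the final group of pictures, so the result is an estimate that can fall slightly short of the true length.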
The MPEG file has a header which stores various information such as the picture width/height, the aspect ratio, the bit rate, the frame rate, and more. The rest of the file is delimited by GOP (Group Of Pictures) stream markers, each followed by a timestamp. This data is packed into bit fields, which you can extract with some bit shifting and masking. You can mask then shift the bits, or vice versa; I masked then shifted, as I thought it was a bit more understandable that way. Working out the structure of these headers and markers is a bit difficult, but in the end I found a great source describing the MPEG headers and stream markers (I wish I had found it earlier!): http://dvd.sourceforge.net/dvdinfo/mpeghdrs.html#gop
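To make the mask-then-shift idea concrete, here is a small sketch of decoding a GOP timestamp. It is not lifted from the downloadable class, and the `GopTimeCode` name is hypothetical. It assumes the layout given in the reference above: after the start code comes a 1-bit drop-frame flag, 5 bits of hours, 6 bits of minutes, a marker bit, 6 bits of seconds and 6 bits of frame count.

```csharp
using System;

public static class GopTimeCode
{
    // Decodes the time code that follows a GOP start code (00 00 01 B8).
    // 'offset' is the index of the first byte AFTER the four start-code bytes.
    // Returns hours/minutes/seconds as a TimeSpan; the frame count within the
    // second is returned separately because converting it to time needs the frame rate.
    public static TimeSpan Decode(byte[] data, int offset, out int frames)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (offset < 0 || offset + 4 > data.Length)
            throw new ArgumentOutOfRangeException(nameof(offset));

        // Pack the four bytes into one big-endian 32-bit value.
        uint bits = ((uint)data[offset] << 24)
                  | ((uint)data[offset + 1] << 16)
                  | ((uint)data[offset + 2] << 8)
                  | (uint)data[offset + 3];

        // Mask out each field, then shift it down to the low bits.
        int hours   = (int)((bits & 0x7C000000u) >> 26); // bits 30..26
        int minutes = (int)((bits & 0x03F00000u) >> 20); // bits 25..20
        int seconds = (int)((bits & 0x0007E000u) >> 13); // bits 18..13 (bit 19 is a marker)
        frames      = (int)((bits & 0x00001F80u) >> 7);  // bits 12..7

        return new TimeSpan(hours, minutes, seconds);
    }
}
```

Masking first means each mask shows exactly which bits belong to a field; shifting first would work just as well, with smaller masks such as 0x1F and 0x3F.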
I’ve wrapped the functionality into a C# class which provides the duration along with some other properties (including picture width/height, frame rate, and aspect ratio), which you can download from here:
To use the class just instantiate it with the path to the .mpg file as a parameter to its constructor, and then you can use the properties immediately. I’ve commented it fairly well so the code should be pretty easy to follow. Feel free to use it in your own projects (it’s LGPL) and I’d love to hear about where you use it and any additions you make to it. Enjoy!
*I’ve been rather surprised by the amount of interest this has had since it’s been up – in fact it’s one of the most popular downloads from my blog. I didn’t think it would be of interest to many people, so please leave a comment on where you’ve used it – I’m really interested to know!