Sat, 14 Feb 2009

Security Hyperventilating.

Just today, David Cournapeau and Lev Givon found a bug in all versions of libsamplerate up to and including 0.1.6. The bug causes a segfault and is basically the same as a bug I thought I fixed last year that resulted in the following entry in the ChangeLog:


  2008-07-02  Erik de Castro Lopo  <erikd AT mega-nerd DOT com>

     * src/src_sinc.c
     Fix buffer overrrun bug at extreme low conversion ratios. Thanks to Russell
     O'Connor for the report.

That bug fix went into version 0.1.4 of libsamplerate which was released on July 2nd, 2008. As a result, I was contacted by computer security firm Secunia Research on July 3rd and asked the following questions:

I replied with a reasonably full explanation and stated that while this bug, if triggered, would crash the program, it was not, in my opinion, exploitable. Secunia Research was happy with my explanation and analysis and I thought that was that.

However, in November of 2008 there was a flurry of security advisories which quoted the above ChangeLog entry but got the security implications completely wrong. The curious thing was that no-one, not a single person, contacted me or the project mailing list to ask about the severity of the problem.

The earliest of these advisories I can find was dated November 3rd and appeared on the Openwall OSS-Security mailing list. However, by December 2nd, the Gentoo security team was saying:

"A remote attacker could entice a user or automated system to process a specially crafted audio file possibly leading to the execution of arbitrary code with the privileges of the user running the application."

I, on the other hand, remain convinced that this bug is not exploitable under anything other than purely theoretical circumstances.

Since I have now found another variant of the same bug I decided I should document the problem more fully than last time. The problem occurs deep in a rather complicated piece of code that looks like this:


  static void
  prepare_data (SINC_FILTER *filter, SRC_DATA *data, int half_filter_chan_len)
  {   int len = 0 ;
  
      /* Calculation of value of len elided. */
  
      len = MIN (filter->in_count - filter->in_used, len) ;
      len -= (len % filter->channels) ;
  
      memcpy (filter->buffer + filter->b_end, data->data_in + filter->in_used,
                          len * sizeof (float)) ;

The problem is that under conversion ratios near the lower bound of allowed values, the variable len can end up as a small negative value (I saw values in the range [-20, -1]). That small negative number then gets multiplied by 4 (sizeof float) and then passed as the third parameter to memcpy.

However, the third parameter to memcpy is of type size_t, which is an unsigned type that is typically the same size as a pointer (sizeof (void*)). In the C programming language, when a small negative signed value is converted to an unsigned type, it wraps around to a very large positive value very near the maximum representable unsigned value, i.e. nearly 4 Gig on 32 bit systems or nearly 2 to the power of 64 on 64 bit systems.
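The conversion described above can be reproduced in isolation. The following is a minimal sketch, not code from libsamplerate: byte_count is a hypothetical helper that mirrors the length expression passed to memcpy, and it assumes the usual situation where size_t is at least as wide as int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the expression `len * sizeof (float)`.
   sizeof yields a size_t, so the int operand is converted to size_t
   before the multiplication and a negative len wraps around. */
size_t
byte_count (int len)
{   return len * sizeof (float) ;
}

/* Illustration only: a len of -1 ends up sizeof (float) - 1 bytes short
   of the largest value a size_t can hold. */
void
show_wraparound (void)
{   assert (byte_count (-1) == SIZE_MAX - sizeof (float) + 1) ;
}
```

For the values of len I saw, in the range [-20, -1], the result is always within 80 bytes of SIZE_MAX on a system where sizeof (float) is 4.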

Due to the conversion, the memcpy tries to write to a huge chunk of memory (nearly 4 Gig on 32 bit systems, and a hell of a lot more on 64 bit systems). Under a protected memory system (Linux, Windows NT or later and Mac OSX), the only way this bug could possibly be exploitable would be if the memcpy returns.

However, since the memcpy is trying to overwrite the vast majority of the CPU's memory space, the chances of the memcpy returning are essentially zero for any sane operating system. Instead, the process containing the call into libsamplerate will attempt to write outside its allocated memory space and will therefore be terminated by the operating system with a segmentation fault.

I have now fixed the function containing the memcpy so that it checks the value of len and returns from the function if it is outside a sane range.
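The general shape of such a check looks something like the sketch below. This is not the actual fix that went into src/src_sinc.c; checked_copy, dest_space and the return codes are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not the real libsamplerate code. Copies len floats
   into dest only when len is non-negative and fits in the space available.
   Returns 0 on success and -1 if the length was rejected. */
int
checked_copy (float *dest, int dest_space, const float *src, int len)
{   assert (dest != NULL && src != NULL) ;

    /* Reject the bad length before it can be converted to size_t. */
    if (len < 0 || len > dest_space)
        return -1 ;

    memcpy (dest, src, len * sizeof (float)) ;
    return 0 ;
}
```

The important point is that the test happens while len is still a signed int, before the implicit conversion to size_t has a chance to turn a small negative number into a huge positive one.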

Now I do think that security researchers and security specialists at Linux distributions do a very important and rather thankless job. I also think that in this case some of them fell down rather badly. Secunia Research did the right thing and asked me about the implications of the bug. The others extrapolated incorrectly from very little information and got it wrong, without ever contacting me. I'm hoping that the ChangeLog entry (below)


  2009-02-14  Erik de Castro Lopo  <erikd AT mega-nerd DOT com>

      * src/src_sinc.c
      Fix a segfault which occurs when memcpy is passed a bad length parameter.
      This bug has zero security implications beyond the ability to cause a
      program hitting this bug to exit immediately with a segfault.
      See : http://www.mega-nerd.com/erikd/Blog/2009/Feb/14/index.html
      Thanks to David Cournapeau.

will prevent the kind of security hyperventilating I saw last time. The new version of libsamplerate, version 0.1.7, has just been released.

Update : 2009-02-15 11:19

David Cournapeau informs me that Lev Givon, as a user of David's libsamplerate Python bindings, reported this bug to him and that it was actually Lev that found the bug.

Posted at: 22:58 | Category: CodeHacking/SecretRabbitCode | Permalink

Sun, 11 Jan 2009

libsamplerate 0.1.5.

There is a new release of libsamplerate available here. This release contains some improvements to the optimisations that were mentioned in this blog post.

After announcing the previous optimisation work on the Secret Rabbit Code mailing list someone asked for a special case optimisation of 6 channels for doing the 5.1 surround sound format. Then there was also a request for an 8 channel special case for doing 7.1 surround.

I added the 6 channel special case immediately and then found another optimisation for the multi-channel case which uses a horrible hack called Duff's Device in the inner most loop. The results for these optimisations can be seen on the following throughput graphs:
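For anyone who has not run into Duff's Device before, here is a minimal sketch of the idea applied to a per-channel multiply-accumulate. It is an illustration only and not the inner loop from libsamplerate; mac_channels and its parameters are hypothetical names.

```c
#include <assert.h>

/* Hypothetical example. Adds coeff * in [ch] to out [ch] for every channel,
   with the loop unrolled four ways. The switch jumps into the middle of the
   do-while body so that the remainder (channels % 4) is handled on the first
   pass and every later pass does four channels. */
void
mac_channels (double coeff, const float *in, double *out, int channels)
{   int ch = 0 ;
    int passes = (channels + 3) / 4 ;

    assert (channels > 0) ;

    switch (channels % 4)
    {   case 0 : do {   out [ch] += coeff * in [ch] ; ch ++ ;
        case 3 :        out [ch] += coeff * in [ch] ; ch ++ ;
        case 2 :        out [ch] += coeff * in [ch] ; ch ++ ;
        case 1 :        out [ch] += coeff * in [ch] ; ch ++ ;
                    } while (-- passes > 0) ;
        }
}
```

The case labels deliberately fall through into one another, which is exactly why it deserves to be called a horrible hack. Some compilers will warn about the fall through.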


[Throughput graphs]

Obviously, the special cases for 1, 2, 4, and 6 channels have significant improvements, especially for 4 and 6 channels. The general multi channel code path is used for 3, 5, 7 and more channels. Interestingly, if one draws a line through the 1, 2, 4 and 6 channel through-puts and then a line through the general multi channel through-puts, they intersect at about 8 channels, suggesting that there is very little to be gained by adding a special case code path for 8 channels.

The only downside of this new release is that it uses a C construct that is part of the 1999 ISO C Standard which the Microsoft C compiler does not support. Windows users that want to compile libsamplerate should either use GNU GCC or the Intel compiler with the -c99 command line option.

Posted at: 20:47 | Category: CodeHacking/SecretRabbitCode | Permalink

Sun, 14 Dec 2008

Rabbit Optimisation.

For some time I have known that the sinc based converters in my sample rate conversion library Secret Rabbit Code might benefit from a particular optimisation which reuses the filter coefficients where possible, rather than recalculating them. The good news is that I finally managed to get around to performing this optimisation work and the results are reasonably impressive.

The old converter used what was basically a single channel converter on each channel in turn. Since part of this single channel operation involved calculating the filter coefficients on the fly it was pretty obvious that if more than one channel was being processed, it would be possible to calculate the coefficients for the first channel and reuse them for all the rest. Once that was done, the inner loop can be special cased for commonly used channel counts like 1, 2 and 4 while having other channel counts handled by a generic function that can handle any number of channels. These special cases contain straight line code where the generic function contains a loop.
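The difference between the generic function and a special case can be sketched as below. This is a simplified illustration rather than the code in src_sinc.c: the function names are hypothetical, and the coefficient is just read from an array where the real converter calculates it on the fly.

```c
#include <assert.h>

/* Hypothetical generic path: each coefficient is obtained once per filter
   tap and then applied to every channel of the interleaved input. */
void
calc_output_multi (const float *coeffs, int taps, const float *in,
                    int channels, double *out)
{   assert (channels > 0) ;

    for (int ch = 0 ; ch < channels ; ch ++)
        out [ch] = 0.0 ;

    for (int k = 0 ; k < taps ; k ++)
    {   double c = coeffs [k] ;     /* Obtained once, used for all channels. */

        for (int ch = 0 ; ch < channels ; ch ++)
            out [ch] += c * in [k * channels + ch] ;
        }
}

/* Hypothetical stereo special case: the same arithmetic, but the loop over
   channels is replaced with straight line code. */
void
calc_output_stereo (const float *coeffs, int taps, const float *in, double *out)
{   double left = 0.0, right = 0.0 ;

    for (int k = 0 ; k < taps ; k ++)
    {   double c = coeffs [k] ;

        left += c * in [2 * k] ;
        right += c * in [2 * k + 1] ;
        }

    out [0] = left ;
    out [1] = right ;
}
```

Both functions produce the same output for two channel data. The special case simply gives the compiler a fixed amount of work per tap instead of an inner loop with a run time trip count.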

As can be seen in the throughput graph for the SRC_SINC_BEST_QUALITY converter, this optimisation resulted in modest throughput improvements for 1 channel, a big improvement for 2 channels and an outrageous more than doubling of throughput for the 4 channel case. Even the generic multi-channel case, represented here by 3 and 5 channels, showed a modest improvement. It should be noted that the filter coefficients are the same for the old and new versions so the only changes in performance are due to the optimisation.


[Throughput graphs for old and new Best Sinc converter]

These results were obtained on a 3 plus year old 1.1 GHz Pentium M laptop. More recent machines should show much better absolute results but similar proportional improvements. The other sinc-based converters will also show similar proportional improvements.

A new version of Secret Rabbit Code containing this optimisation should probably be released some time during the next couple of days.

Posted at: 08:16 | Category: CodeHacking/SecretRabbitCode | Permalink

Sat, 18 Oct 2008

Foobar 2000 and the Rabbit.

Foobar 2000 (see also the Wikipedia entry) is a media player for a legacy operating system commonly known as Microsoft Windows. Secret Rabbit Code is an audio sample rate converter that I wrote, and released under the terms of the GNU GPL in 2002.

As the sole author and copyright owner of Secret Rabbit Code I have also made it available under a commercial use license (PDF) that is currently earning me a small income. However, developing Secret Rabbit Code was difficult, took a huge amount of research and the development of many prototypes which were thrown away. Now, after 6 years, that income is coming close to covering the cost of developing that initial version. It still has some way to go to cover the cost of the subsequent maintenance and the improvements I have made.

In 2005 I became aware that someone had released a binary only plugin for Foobar 2000 that used Secret Rabbit Code to do sample rate conversion. The fact that this was a binary only release was not the only problem. There was also the problem of Foobar being under a license that was not GPL compatible.

First off, I emailed the ISP in France where the binary was being hosted and asked for the download to be taken down. I also tried to track down the author of the plugin since authorship was not obvious from the download. Within a day or two I was able to track him down via the Hydrogen Audio forums.

The final result of the discussion on that forum was that I decided I would write and release a Secret Rabbit Code based plugin for Foobar. This of course involved development for Windows on Windows, a platform I rarely if ever use and which I actually prefer not to work on. However, after some 10 or 12 hours work I had a working plugin that I posted on my website. I also added a Paypal donation button on the page hoping to get paid for the time and effort I put into creating the plugin.

Unfortunately, the donations have been few and far between. Since 2005 when the plugin was released I have only had about 10 people pay the measly US$10 I am asking for. That is despite the fact that I have proven interest in this plugin. The page has had over 30000 hits since 2005, and I get an email every couple of weeks asking if there is a more recent version using later versions of Secret Rabbit Code or a version for other versions of Foobar 2000.

So here's how it stands:

Since most users of this plugin don't think I should be paid for my work creating it for use with Foobar on Windows, I currently have no intention of releasing a new version of this plugin. When and if I get a decent stream of payments for the work I have put in so far I will roll out a new version and maybe even an updated plugin for other versions of Foobar.

I will answer Foobar related emails from people who have donated or work for companies that paid for a commercial use license. The vast majority of other emails regarding the Foobar plugin will be directed to this blog post.

Posted at: 09:21 | Category: CodeHacking/SecretRabbitCode | Permalink

Sun, 30 Mar 2008

libsamplerate 0.1.3.

About a week ago I released a new version of SecretRabbitCode (aka libsamplerate).

The major change was that the new improved SINC based converters I blogged about here are now the default. There were also a couple of minor bug fixes.

The fine people at Infinitewave have now updated their test results to include the new converter and it shows Secret Rabbit Code comes very close to the best of the commercial converters in terms of quality.

Posted at: 15:11 | Category: CodeHacking/SecretRabbitCode | Permalink

Sat, 08 Mar 2008

Progress on the Rabbit.

For over three years now, I have been working on (on and off, but mostly off) a new algorithm for doing audio sample rate conversion in Secret Rabbit Code. The idea for the new algorithm has been rattling around in my head for most of that time, but the problem was always the implementation. While I am making progress it has been slow.

However, a public comparison between a large collection of converters showed that while the conversion quality of Secret Rabbit Code was good, it was nowhere near state of the art.

In order to see if I could get Secret Rabbit Code closer to state of the art quickly, I decided to revisit the existing converter during the xmas/new-year break.

The existing converter had a set of digital filters whose coefficients were generated by a small program written in GNU Octave. My first task was to convert that program to Ocaml which has become my favourite language for technical computing. I then spent quite a bit of time finding and analyzing where the filter design program was losing precision and finding workarounds. Finally, I spent even more time looking at how the different filter design parameters interact with one another and with the conversion algorithm itself.

Fortunately, all this work has paid off. The result is new versions of the SRC_SINC_MEDIUM_QUALITY and SRC_SINC_BEST_QUALITY converters. The old versions of these converters have been renamed to SRC_OLD_SINC_MEDIUM_QUALITY and SRC_OLD_SINC_BEST_QUALITY. The old versions will be removed once the new versions have been fully validated.

So far, the new converters seem to have significantly improved signal to noise ratio as can be seen from the following spectrograms (using the methodology described here). It should be obvious from these plots that the new versions of the converters have significantly fewer artifacts (the purple and blue bits) than the old converters.


[Sweep test for old mid quality converter]


[Sweep test for new mid quality converter]


[Sweep test for old high quality converter]


[Sweep test for new high quality converter]

Obviously, conversion quality is not the only criterion to evaluate sample rate converters; conversion speed can also be important in some situations. In my preliminary testing, the updated Best SINC converter runs up to 25% slower than the old one. The new best converter also uses significantly more memory than the old one. Storage of filter coefficients has gone up by a factor of 20, and is now over a megabyte for the best quality converter alone.

In the tables below I've listed the SNR, throughput speeds and bandwidths as measured by the test suite (the snr_bw_test and throughput_test programs) distributed with the code for a couple of different CPU types.

1.1 GHz Intel Pentium M (32 bit) with 2048 KB cache

  Converter Name                 SNR          Throughput             Bandwidth
  SRC_OLD_SINC_MEDIUM_QUALITY     97.46 dB     648800 samples/sec    90.68 %
  SRC_SINC_MEDIUM_QUALITY        121.33 dB     593673 samples/sec    90.55 %
  SRC_OLD_SINC_BEST_QUALITY       97.35 dB     223025 samples/sec    96.96 %
  SRC_SINC_BEST_QUALITY          145.68 dB     163735 samples/sec    96.08 %

1.8 GHz AMD Opteron 265 (64 bit) with 1024 KB cache

  Converter Name                 SNR          Throughput             Bandwidth
  SRC_OLD_SINC_MEDIUM_QUALITY     97.46 dB    1088447 samples/sec    90.68 %
  SRC_SINC_MEDIUM_QUALITY        121.33 dB    1088447 samples/sec    90.55 %
  SRC_OLD_SINC_BEST_QUALITY       97.35 dB     179116 samples/sec    96.96 %
  SRC_SINC_BEST_QUALITY          145.68 dB     187755 samples/sec    96.08 %

1.86GHz Intel Core Duo (32 bit) with 2048 KB cache

  Converter Name                 SNR          Throughput             Bandwidth
  SRC_OLD_SINC_MEDIUM_QUALITY     97.46 dB    1167840 samples/sec    90.68 %
  SRC_SINC_MEDIUM_QUALITY        121.33 dB    1042334 samples/sec    90.55 %
  SRC_OLD_SINC_BEST_QUALITY       97.35 dB     395102 samples/sec    96.96 %
  SRC_SINC_BEST_QUALITY          145.68 dB     302773 samples/sec    96.08 %

A pre-release containing these updated converters is available for download here. Once they have been tested a little more widely I intend to replace the old versions of the converters with the new, higher specification ones.

Anybody who wants to discuss this further should join the SRC mailing list and discuss it there.

Finally, once a version of Secret Rabbit Code with these new converters has been officially released I can get back to the new converter algorithm which should at least match what I have here in terms of quality but run significantly faster and use at least an order of magnitude less RAM.

Posted at: 14:50 | Category: CodeHacking/SecretRabbitCode | Permalink

Tue, 30 Jan 2007

SRC Comparison.

One of my Free Software projects is Secret Rabbit Code, aka libsamplerate, aka the Rabbit, a library for performing sample rate conversion (Wikipedia) on audio signals. Recently, a company in Canada did a comparison of a number of sample rate converters in professional audio software and also included the Rabbit in that test.

The tests were carried out by generating an input signal at a sampling rate of 96 kHz, configuring each sample rate converter to do a conversion from 96 kHz input sample rate to 44.1 kHz output sample rate, passing the input signal through each converter and capturing each converter's output. The input test signal was a sine wave which sweeps from a low frequency of about 100 Hz at the start to a frequency of 44.1 kHz at the end. Finally, a spectrogram was generated from each output signal.

The spectrogram of the output of Secret Rabbit Code's Best Sinc converter looks like this:


[Secret Rabbit Code sweep test] [Color key]

The spectrogram shows time in seconds along the x-axis and frequency in Hertz along the y-axis. The colour indicates the signal strength at each point in time and frequency, with white being the strongest signal (0 decibels) and black being the weakest signal (-180 decibels).

The tricky thing about the sample rate conversion process is that for any given sample rate fs, the highest frequency signal that can be correctly represented is at fs/2. When sample rate converting from 96 kHz to 44.1 kHz, all frequencies above half of the destination sample rate must be removed during the conversion process. Failure to do so will result in audio distortion and noise in the output signal.
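To put a number on what goes wrong, a sine wave at a frequency f between fs/2 and fs that is not removed shows up in the output at fs - f instead. The small sketch below works that out; apparent_frequency is a hypothetical helper written for this illustration and only handles frequencies from 0 up to fs.

```c
#include <assert.h>

/* Hypothetical helper. Returns the frequency at which a sine wave of
   frequency freq appears when sampled at fs with no filtering. Only
   valid for 0 <= freq <= fs. */
double
apparent_frequency (double freq, double fs)
{   assert (fs > 0.0 && freq >= 0.0 && freq <= fs) ;

    /* Anything above fs/2 folds back around fs/2. */
    return (freq <= fs / 2.0) ? freq : fs - freq ;
}
```

For example, with a destination sample rate of 44.1 kHz, a 30 kHz component that is left in the signal turns up as a 14.1 kHz tone that was never in the original.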

Looking at the spectrogram of the Rabbit's output, it's easy to see that the main sweep (in bright white) clearly goes from some low frequency at the start to 22.05 kHz (half of the output sample rate) at 5 seconds. After about 5 seconds, the input signal's sine wave frequency goes above half the destination sample rate and the Rabbit does the correct thing and almost completely removes it.

The rest of the colour in the spectrogram is an artifact of the conversion process but by referencing the colour scale, it's possible to confirm that all of these artifacts are 100 decibels below the level of the main signal. Ideally they shouldn't be there at all, but if they are, they should be as low as possible.

Anyone who has read this far can now go to the comparison page, pick any two converters and compare them. They can also confirm for themselves that although the Rabbit (Best Sinc) wasn't the best converter among the ones tested (that award would have to go to r8brain and iZotope), it certainly didn't disgrace itself either. A number of the commercial converters in expensive software packages (like Sony Vegas and Digital Performer) didn't perform all that well in comparison.

The good news is that the existence of commercial closed source converters that are better than the Rabbit gives me some incentive to come up with a better converter for inclusion in the Rabbit.

Posted at: 23:18 | Category: CodeHacking/SecretRabbitCode | Permalink