Encoding an image to sound

The purpose of this project is to encode an image into a sound that can be viewed with a spectrogram. For some time I have known that musical artists have encoded pictures into their music. The most notable of these artists is Aphex Twin. Luckily I had a copy of Windowlicker and a great visualization program, Sonic Visualiser. After looking at the images I decided it would be cool to try to encode my own images. I saw a few programs available, but decided it would be a better challenge to write my own program from scratch using Perl.


A spectrogram is a graph representing the intensity of a frequency in relation to time. Normally the frequencies are along the Y axis, with the time on the X axis. The intensity of the frequency is represented by the brightness of the color. The frequency and color can use either a linear scale or a logarithmic scale. Below is a spectrogram of a few piano chords. The audio file used can be found on Wikipedia here.

Image encoding

The idea I had to encode the image was to simply create a sine wave at a corresponding frequency to represent the Y axis, a corresponding time to represent the X axis and a corresponding amplitude to represent the pixel color intensity.
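The mapping described above can be sketched roughly as follows. This is only an illustration, not the code from the full program: the names row_to_freq and column_samples, the 200 Hz to 20000 Hz range, and the linear row-to-frequency mapping are all assumptions made for the example.

```perl
use strict;
use warnings;

# All names and constants here are assumptions for illustration,
# not values taken from the original program.
my $PI          = 4 * atan2(1, 1);
my $SAMPLE_RATE = 44100;    # samples per second
my $MIN_FREQ    = 200;      # Hz, frequency of the bottom image row
my $MAX_FREQ    = 20000;    # Hz, frequency of the top image row

# Linear map from a row index (0 = top of the image) to a frequency in Hz.
sub row_to_freq {
    my ($row, $height) = @_;
    return $MAX_FREQ if $height < 2;
    return $MAX_FREQ - ($MAX_FREQ - $MIN_FREQ) * $row / ($height - 1);
}

# Render one image column as a list of samples.
# $amps is an array ref with one amplitude per row (0 = silent).
# $start is the index of the column's first sample, so each sine keeps
# a continuous phase from one column to the next.
sub column_samples {
    my ($amps, $start, $count) = @_;
    my $height = scalar @$amps;
    my @out = (0) x $count;
    for my $row (0 .. $height - 1) {
        next unless $amps->[$row];
        my $w = 2 * $PI * row_to_freq($row, $height) / $SAMPLE_RATE;
        $out[$_] += $amps->[$row] * sin($w * ($start + $_)) for 0 .. $count - 1;
    }
    return \@out;
}
```

Each image column becomes a short slice of audio, and every pixel in that column that is not silent contributes one sine wave to the slice. Playing the columns one after another from left to right gives the time axis.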

Creating Sound

The first step to encoding an image was to learn how audio formats work. At first I tried writing a script that plays a frequency to ‘/dev/dsp’ (which is the sound card on Linux). When writing straight to /dev/dsp you are limited to a sample rate of 8000 Hz and a sample size of 8 bits. Below is a simple Perl script that plays a concert A (440 Hz). To execute it, run ‘./sin.pl > /dev/dsp’.

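#!/usr/bin/perl
use Math::Trig;   # provides the constant pi
use strict;
use POSIX;

my $sample = 8000;      # sample rate in Hz
my $frequency = 440;    # concert A
my $cycles = 6;         # whole sine cycles per loop, reduces rounding error
my $period = POSIX::floor($sample / $frequency * $cycles);

# Loop forever, writing one unsigned 8-bit sample at a time to STDOUT
while (1) {
    for (my $i = 1; $i <= $period; $i++) {
        # Center on 128 and scale by 127 so the value stays within 1..255
        my $x = 128 + sin($cycles * 2 * pi * $i / $period) * 127;
        $x = POSIX::floor($x);
        my $char = pack("C", $x);
        print $char;
    }
}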

The DSP defaults do not offer much fidelity. I needed at least the fidelity of an audio CD, which is 16 bits at 44.1 kHz. I did some searching on CPAN to find a library that would let me write wave files, but most of the audio libraries had too much overhead for what I wanted to do. Instead I looked up the file format for a ‘.wav’ and coded my own library. This library is limited to producing a 16-bit, 44.1 kHz mono wave.

#Author Evan Salazar
#Generate a .wav file for 16 bit mono PCM
use strict;
package SimpleWave;

sub genWave {

    #Get the reference to the data array
    my ($audioData) = @_;

    #This is the default sample rate
    my $samplerate = 44100;
    my $bits = 16;
    my $samples = $#{$audioData} + 1;

    my $channels = 1;

    #Do calculations for the wave headers
    my $byterate = $samplerate * $channels * $bits / 8;
    my $blockalign = $channels * $bits / 8;
    my $filesize = $samples * ($bits/8) * $channels + 36;

    #RIFF chunk
    my $riff = pack("a4Va4", "RIFF", $filesize, "WAVE");

    #Format chunk (standard 16-byte PCM layout: chunk size 16, format tag 1 = PCM)
    my $format = pack("a4VvvVVvv",
        "fmt ", 16, 1, $channels,
        $samplerate, $byterate, $blockalign, $bits);

    #Data chunk
    my $dataChunk = pack("a4V", "data", $blockalign * $samples);

    #Read audioData array; "v" packs each sample as 16 bits, little-endian
    my $data = '';
    for (my $i = 0; $i < $samples; $i++) {
        $data .= pack("v", $audioData->[$i]);
    }

    #Return a byte string of the wave
    return $riff . $format . $dataChunk . $data;
}

1;
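When a generated file looks wrong in the spectrogram, two quick checks can help. The sketch below is not part of the library above; parse_wav_header and write_binary are hypothetical helper names, and the header layout assumed is the canonical 44-byte PCM header (RIFF chunk, 16-byte format chunk, data chunk).

```perl
use strict;
use warnings;

# Decode the 44-byte header of a canonical PCM wave into a hash ref.
sub parse_wav_header {
    my ($bytes) = @_;
    my %h;
    @h{qw(riff size wave fmt fmt_size format channels
          rate byterate align bits data data_size)}
        = unpack("a4 V a4 a4 V v v V V v v a4 V", $bytes);
    return \%h;
}

# Write a byte string to disk without any newline translation.
# binmode matters on Windows, where text mode rewrites "\n" bytes
# and corrupts the sample data.
sub write_binary {
    my ($path, $bytes) = @_;
    open(my $fh, '>', $path) or die "Cannot open $path: $!";
    binmode($fh);
    print {$fh} $bytes;
    close($fh) or die "Cannot close $path: $!";
    return length $bytes;
}
```

For a 16-bit, 44.1 kHz mono file, the decoded header should show a rate of 44100, a byte rate of 88200, a block alignment of 2, and a data size equal to twice the number of samples.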

Reading a Bitmap

Luckily I found a simple bitmap reader on CPAN called Image::BMP. This is a nice lightweight library that does not depend on any external libraries or compiled code. Using this library I was able to easily load and read the bitmap data.

Encoding the Image

The first pass of my program disregarded the color data and only produced a frequency for the Y axis if the color intensity was less than half the sum of all colors. Below is an example. Note: I converted the WAV to an MP3 to conserve bandwidth; at 320 kbps not much data is lost.

Audio File: ohmpie.mp3
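A minimal sketch of that first-pass rule, under one loudly stated assumption: "half the sum of all colors" is read here as half of the largest possible channel sum (3 * 255). The names pixel_is_on and column_amplitudes are made up for the example and may not match the full program.

```perl
use strict;
use warnings;

# Assumption: "half the sum of all colors" is read as half of the largest
# possible channel sum (3 * 255).
my $THRESHOLD = (3 * 255) / 2;

# True when the pixel is dark enough to produce a tone in the first pass.
sub pixel_is_on {
    my ($red, $green, $blue) = @_;
    return ($red + $green + $blue) < $THRESHOLD ? 1 : 0;
}

# Turn one column of [R, G, B] pixels into per-row amplitudes:
# a fixed amplitude for dark pixels, silence for everything else.
sub column_amplitudes {
    my ($pixels, $amplitude) = @_;
    return [ map { pixel_is_on(@$_) ? $amplitude : 0 } @$pixels ];
}
```

With this rule every pixel is either fully on or fully off, so the result in the spectrogram is a two-tone image.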

I was really shocked to first see the image! The only tweaking I needed to do was to use a linear scale for the frequency. Also, if I selected too high an amplitude for the sine wave, clipping occurred in areas with too much black. For the image above I used an amplitude of about 1000 on a scale of 0 to 32768.
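The clipping can be reasoned about with a worst-case bound: if every sine in a column peaks at the same instant, the peaks simply add up, so N rows at amplitude A can reach N * A. The sketch below is an illustration with made-up names (safe_amplitude, clamp16), not code from the full program. The bound is conservative, since real sines at different frequencies rarely all peak together.

```perl
use strict;
use warnings;

my $MAX_16BIT = 32767;   # largest positive value of a signed 16-bit sample

# Largest per-row amplitude that can never clip, even if all $rows sines
# peak at the same instant (worst case: the peaks simply add up).
sub safe_amplitude {
    my ($rows) = @_;
    return $MAX_16BIT if $rows < 1;
    return int($MAX_16BIT / $rows);
}

# Hard-limit a sample to the signed 16-bit range before packing it.
sub clamp16 {
    my ($s) = @_;
    return  $MAX_16BIT     if $s >  $MAX_16BIT;
    return -$MAX_16BIT - 1 if $s < -$MAX_16BIT - 1;
    return int($s);
}
```

By this bound, an amplitude of 1000 is only guaranteed safe for up to 32 rows sounding at once, which is in line with clipping showing up where many rows of a column are active together.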

The next step was to add amplitude scaling to match the color intensity. For this I summed all the color channels for a given pixel and scaled it to represent the max amplitude ‘(R + G + B) / 768 * max_amplitude’. Below is a picture of me after using the scaling.

Audio File: evan.mp3

By selecting a color scheme that goes from black to white and using a linear scale for the volume, I get a very good black and white image. To prevent clipping on very dark images I added an inverse option that inverts the colors, producing a negative image.

Audio File: evanInv.mp3

You can reverse the color scheme to go from white to black to produce the regular image.
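The amplitude scaling and the inverse option can be sketched in a few lines. The formula is the ‘(R + G + B) / 768 * max_amplitude’ one quoted earlier; the function name pixel_amplitude and the way the inverse flag is passed are assumptions for illustration only.

```perl
use strict;
use warnings;

# Scale a pixel's summed color channels to a sine amplitude.
# With $invert set, the level is flipped so the image becomes a negative.
sub pixel_amplitude {
    my ($red, $green, $blue, $max_amplitude, $invert) = @_;
    my $level = ($red + $green + $blue) / 768;   # 0 up to just under 1
    $level = 1 - $level if $invert;
    return $level * $max_amplitude;
}
```

For example, a mid-grey pixel (128, 128, 128) with a maximum amplitude of 1000 gets an amplitude of 500 either way, while a black pixel gets 0 normally and 1000 when inverted.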

Full Program

Below you can view and/or download the full code to this program. Currently performance is not optimized, so don’t write me telling me it’s slow. I currently have a few ideas to speed it up. Also, for best results use a small image around 100px tall.

Download: imageEncode-0.7.tar.gz


11 Responses to Encoding an image to sound

  1. bhamlefty says:

    Great post. I made a video tutorial on how to convert an image to sound using spectral analysis mapping from metasynth’s free 5.1 demo Check it out.


  2. thehowtomac says:

    Here’s how to use metasynth to convert an image to sound from left to right. Enjoy.


  3. Jorg Mohnen says:

    We have patented the reverse of this a few years back. It takes an image and encodes a sound file as an image. VERY cool that you have tried the reverse of this now….

  4. Tamh says:

    I just ran the algorithm using Strawberry Perl on Windows (yeah, guilty me, and my job). It doesn’t seem to work; all it gives out is a scrambled mess of frequencies when I watch the resulting WAV on Sound Visualizer. I want to fix it or find what’s wrong here, because I think this article is really creative (and I want to use it to obscure some hints in a project I have).

  5. Victor X says:

    There is an old project called Bitmaps & Waves http://victorx.eu/BitmapPlayer.htm

    The program creates images of sounds (use the Fourie tab) and sounds of images (use the Inverse Fourie).

  6. Ulkreghz says:

    Thank-you for the post, good sir.

    If you’ve a Twitter account I’d be interested in discussing the code and applications for sound to image encryption with you :)

    (Arrived here via Cracked.com)

  7. Tyler says:

    I downloaded sonic v and imageEncode-0.7.tar.gz but I can’t make it work. Can someone explain this like they’re explaining it to a 5 year old?!?!?!

  8. Brian Gilman says:

    Sounds somewhat similar to dolphin language. I wonder what would happen if another dimension of spatial information was encoded.

  9. ale says:

    When I run the script I get this error message:

    Can’t locate Image/BMP.pm in @INC (@INC contains: /etc/perl /usr/local/lib/perl/5.10.1 /usr/local/share/perl/5.10.1 /usr/lib/perl5 /usr/share/perl5 /usr/lib/perl/5.10 /usr/share/perl/5.10 /usr/local/lib/site_perl .) at imageEncode line 12.
    BEGIN failed–compilation aborted at imageEncode line 12.

    any clues?

  10. First says:

    Great post!

  11. AR says:

    Download http://cpansearch.perl.org/src/DAVEOLA/Image-BMP-1.17/lib/Image/BMP.pm

    Put it to the /usr/local/lib/site_perl

    Modify imageEncode:12
    -use Image::BMP;
    +use BMP;
