Android Blog by Snowpard: Talking Pets: Development Pet for Android

Developing an entertainment application in the style of a talking pet can be divided into three main parts: voice recording, voice playback and animation. This article opens a short series on developing your own talking pet. We begin with the voice recording algorithm.

Some theory

To detect human speech or other sounds, we need to calculate the sound pressure level. It is given by the following formula:

$$L_p = 20 \log_{10}\left(\frac{p_{rms}}{p_{ref}}\right)$$

For a set of n values x_1, ..., x_n, the RMS value of the sound pressure is determined by the formula:

$$p_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$$

Here $p_{ref}$ is a constant equal to 20 µPa.
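
As a worked example of the sound pressure level calculation, here is a minimal sketch in plain Java. It assumes that the values are sound pressures in pascals; the class and method names (SplFormula, rms, level) are made up for illustration and are not part of the project described in this article.

```java
// Hypothetical helper, not part of the article's Record class.
class SplFormula {
    // Reference sound pressure: 20 µPa, expressed in pascals.
    static final double P_REF = 0.00002;

    // RMS of n values: sqrt((x1^2 + ... + xn^2) / n).
    static double rms(double[] values) {
        double sumOfSquares = 0.0;
        for (double x : values) {
            sumOfSquares += x * x;
        }
        return Math.sqrt(sumOfSquares / values.length);
    }

    // Sound pressure level in dB: 20 * log10(p_rms / p_ref).
    static double level(double pRms) {
        return 20.0 * Math.log10(pRms / P_REF);
    }

    public static void main(String[] args) {
        double pRms = rms(new double[] { 0.02, -0.02, 0.02, -0.02 });
        System.out.println("p_rms = " + pRms + " Pa");
        System.out.println("L_p   = " + level(pRms) + " dB"); // about 60 dB
    }
}
```

An RMS pressure of 0.02 Pa is 1000 times the reference pressure, which corresponds to roughly 60 dB.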

The resulting sound pressure level lets us detect human speech or other sounds: if the level is greater than a threshold value, recording starts.

This gives the following algorithm:

1. start the recording thread;
2. get a block of recorded data and calculate its sound pressure level;
3. if the level is greater than the default level, save the recorded data to the storage array;
4. when the recording ends (that is, when the sound pressure level drops below the default level), pass the data from the storage array to the playback thread;
5. play back the recording;
6. when playback ends, return to the first step.
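
The steps above can be sketched without any Android classes as a small loop over precomputed block levels. This is only an illustration under assumptions: the levels are given as an int array, the class and method names are hypothetical, and the handling of short pauses between words is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the detection loop; not the article's Record class.
class LevelDetector {
    // Returns [start, end) block index pairs for each run of blocks above the threshold.
    static List<int[]> detect(int[] levels, int threshold) {
        List<int[]> recordings = new ArrayList<>();
        int start = -1; // -1 means "not recording"
        for (int i = 0; i < levels.length; i++) {
            boolean loud = levels[i] > threshold;
            if (loud && start < 0) {
                start = i; // level rose above the threshold: start storing data
            } else if (!loud && start >= 0) {
                recordings.add(new int[] { start, i }); // level dropped: hand over for playback
                start = -1; // go back to listening
            }
        }
        if (start >= 0) {
            recordings.add(new int[] { start, levels.length }); // input ended while recording
        }
        return recordings;
    }

    public static void main(String[] args) {
        int[] levels = { 60, 62, 80, 85, 78, 61, 60, 90, 59 };
        for (int[] r : detect(levels, 73)) {
            System.out.println("record blocks " + r[0] + ".." + (r[1] - 1));
        }
    }
}
```

With a threshold of 73, the sample input produces two recordings: blocks 2..4 and block 7.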

What was used

The AudioRecord class manages the audio resources for Java applications to record audio from the audio input hardware of the platform. This is achieved by "pulling" (reading) the data from the AudioRecord object. The application is responsible for polling the AudioRecord object in time using one of the following three methods: read(byte[], int, int), read(short[], int, int) or read(ByteBuffer, int). The choice of which method to use will be based on the audio data storage format that is the most convenient for the user of AudioRecord.

Upon creation, an AudioRecord object initializes its associated audio buffer that it will fill with the new audio data. The size of this buffer, specified during the construction, determines how long an AudioRecord can record before "over-running" data that has not been read yet. Data should be read from the audio hardware in chunks of sizes inferior to the total recording buffer size.
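
To make the link between buffer size and recording time concrete, here is a small arithmetic sketch in plain Java. The class name and the sample numbers (22050 Hz, stereo, 16-bit) are assumptions chosen for illustration; a real application would take the buffer size from AudioRecord.getMinBufferSize().

```java
// Hypothetical helper for illustration only.
class BufferDuration {
    // Milliseconds of audio that fit into a PCM buffer of the given size.
    static long millis(long bufferBytes, int sampleRate, int channels, int bytesPerSample) {
        long bytesPerSecond = (long) sampleRate * channels * bytesPerSample;
        return bufferBytes * 1000L / bytesPerSecond;
    }

    public static void main(String[] args) {
        // 22050 Hz, stereo, 16-bit PCM -> 88200 bytes per second
        System.out.println(millis(8820, 22050, 2, 2) + " ms");    // 100 ms
        System.out.println(millis(1000000, 22050, 2, 2) + " ms"); // 11337 ms
    }
}
```

If the application reads from the AudioRecord less often than the first interval allows, unread data is overwritten.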

Practice

First, you need to add permissions to AndroidManifest:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.WAKE_LOCK" />

Basic constants and attributes for the class:

public static final int DEFAULT_LP = 73; // default sound pressure level threshold
public static int FREQUENCY = 24050; // preferred sample rate, Hz
public static final int CHANNEL = AudioFormat.CHANNEL_IN_STEREO; // channel configuration (mono or stereo)
public static final int ENCODING = AudioFormat.ENCODING_PCM_16BIT; // encoding
public static final int SIZE = 1000000; // maximum size of the recorded data, bytes
// fallback sample rates, Hz
private static final int[] mSampleRates = new int[] { 44100, 22050, 16000, 11025, 8000 };
// recorded data
private byte[] array = new byte[SIZE];
private int size = 0;

private int bufferSize; // buffer size
private byte[] buffer;  // buffer containing the recorded voice
 
private AudioRecord audioRecord = null;  // used to record audio data
private Timer recordTimer = new Timer();  // timer to stop recording

private Handler handler; // To send messages to the application UI thread

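// Flags describing the state of the recording thread.
// They are volatile because they are also changed from other threads (for example, by the timer task).
private volatile boolean isStop = false;
private volatile boolean isStartRecord = false;
private volatile boolean isNotFullData = false;
private volatile boolean isRecord = false;
private volatile boolean isTimerStart = false;
private volatile boolean isFirst = true;
private volatile boolean isFinish = false;
private volatile boolean isRec = false;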

private byte[] buffer_temp; // local buffer

AudioRecord writes the data from the audio hardware into the byte[] buffer array.

First, you need to initialize AudioRecord:

/**
* Searches for AudioRecord settings that are supported by the device
*/
public AudioRecord findAudioRecord() {
  DebugLog.w(TAG, "findAudioRecord()");
  // Find the minimum buffer size; a negative value is an error code
  this.bufferSize = AudioRecord.getMinBufferSize(FREQUENCY, CHANNEL, ENCODING);
  if (this.bufferSize > 0) {
   AudioRecord recorder = new AudioRecord(AudioSource.DEFAULT,
     FREQUENCY, CHANNEL, ENCODING, bufferSize);
   if (recorder.getState() == AudioRecord.STATE_INITIALIZED)
    return recorder;
   recorder.release(); // free the resources of the failed instance
  }
  // If an AudioRecord cannot be created with the preset parameters, try the standard sample rates
  for (int rate : mSampleRates) {
   try {
    this.bufferSize = AudioRecord.getMinBufferSize(rate, CHANNEL,
      ENCODING);

    if (this.bufferSize > 0) {
     FREQUENCY = rate;
     AudioRecord recorder = new AudioRecord(AudioSource.DEFAULT,
       FREQUENCY, CHANNEL, ENCODING, this.bufferSize);

     if (recorder.getState() == AudioRecord.STATE_INITIALIZED)
      return recorder;
     recorder.release(); // free the resources of the failed instance
    }
   } catch (Exception e) {
    e.printStackTrace();
   }
  }
  return null;
 }

 public Record(Handler handler) {
  DebugLog.w(TAG, "Record()");
  this.handler = handler;
  this.isNotFullData = false;
  this.isStop = false;
  this.isFirst = true;
  this.audioRecord = findAudioRecord();
  if (this.bufferSize > 0)
   this.buffer = new byte[this.bufferSize]; 
 }

Now calculate Prms:

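private double getPrms(byte[] buffer)
 {
  int n = buffer.length / 2; // number of 16-bit samples in the buffer
  double prms = 0d;
  for (int i = 0; i < n; i++) {
   short x = getShort(buffer[i * 2], buffer[i * 2 + 1]);
   prms += x * x;
  }
  
  // Divide by the number of samples: dividing by the number of bytes
  // would understate the level by about 3 dB
  prms = Math.sqrt(prms / n);
  return prms;
 }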

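The getShort() method combines the two bytes of a 16-bit sample (low byte first) into a short, which is needed to calculate the sound pressure level correctly:

private short getShort(byte argB1, byte argB2) {
  // Mask the low byte so that its sign extension does not corrupt the high byte
  return (short) ((argB1 & 0xFF) | (argB2 << 8));
 }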
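
Sign extension is easy to get wrong when assembling a short from two bytes, so the following sketch compares three ways of decoding one 16-bit PCM sample stored low byte first. The class and method names are hypothetical; the ByteBuffer variant uses only the standard java.nio API and is shown as a cross-check, not as the method used in this project.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical comparison of three ways to decode one 16-bit PCM sample (low byte first).
class PcmDecode {
    // Shift-and-mask: the low byte is masked so its sign extension cannot overwrite the high byte.
    static short masked(byte low, byte high) {
        return (short) ((low & 0xFF) | (high << 8));
    }

    // Without the mask, a low byte >= 0x80 is sign-extended and sets all the high bits.
    static short unmasked(byte low, byte high) {
        return (short) (low | (high << 8));
    }

    // The same decoding through the standard library.
    static short viaByteBuffer(byte low, byte high) {
        return ByteBuffer.wrap(new byte[] { low, high })
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort();
    }

    public static void main(String[] args) {
        byte low = (byte) 0x80, high = 0x00; // the sample value 128
        System.out.println(masked(low, high));        // 128
        System.out.println(viaByteBuffer(low, high)); // 128
        System.out.println(unmasked(low, high));      // -128
    }
}
```

The unmasked variant returns a wrong value for every sample whose low byte is 0x80 or higher, which distorts the calculated level.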

Once Prms has been calculated, we determine Lp (the sound pressure level):

private int getLp(double prms, double pref, int shum)
 {
  return (int) (20 * Math.log10(prms / pref)) + shum;
 }
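
To get a feel for what these numbers mean, the sketch below feeds a few raw 16-bit amplitudes into the same formula, using the constants that appear in this article (pref = 0.00002, shum = -80, default level 73). The class name is hypothetical, and the raw PCM amplitudes are treated as if they were pressures, so the results are relative levels rather than calibrated decibels.

```java
// Hypothetical demo; getLp() is copied from the article.
class LpThresholdDemo {
    static int getLp(double prms, double pref, int shum) {
        return (int) (20 * Math.log10(prms / pref)) + shum;
    }

    public static void main(String[] args) {
        double pref = 0.00002;
        int shum = -80;
        double[] amplitudes = { 100, 1000, 1003, 32767 };
        for (double prms : amplitudes) {
            System.out.println("prms = " + prms + " -> lp = " + getLp(prms, pref, shum));
        }
    }
}
```

With these constants, an RMS amplitude of 1000 (about 3% of the 16-bit full scale) gives exactly the default level of 73, so recording starts only for blocks that are slightly louder than that.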

The main method is run(), which is called through the Runnable interface. It checks whether recording is enabled, whether the recorded data contains speech, and whether the recording has ended.

public void run() {
  
  while (!this.isFinish)
  {
   if (!this.isStartRecord) {
    Thread.yield(); // let other threads run while waiting for the start signal
    continue;
   }
   try {
    DebugLog.w(TAG, "run()");
    int lp = 0; // current sound pressure level
    // Initialize variables for the calculation of the sound pressure level
    double prms = 0.0;
    double pref = 0.00002;
    int shum = -80; // noise correction for the data transmission channel ("shum" means "noise")
    this.array = new byte[SIZE];
    this.size = 0;
    
    boolean isListen = true;
    int count_lp = 0;
    
    if (this.audioRecord != null) {
     
     this.audioRecord.startRecording();   
     while (this.isNotFullData) {
 
      Arrays.fill(buffer, (byte) 0); // clear the buffer
      this.audioRecord.read(this.buffer, 0, this.buffer.length); // read the next block of audio data
      
       // FIXME: The first block contains start-up noise, so it is skipped.
       // Because of this, the pet may swallow the first syllable.
       if (this.isFirst) {
       this.isFirst = false;
       continue;
      }
      prms = getPrms(this.buffer);
      lp = getLp(prms, pref, shum);

      int max_lp = DEFAULT_LP;
      
      if (!isRecord) {
       // If the current sound pressure level is higher than the default level, start writing data
       if (lp > max_lp) {   
        
        DebugLog.i(TAG, "lp = " + lp);
        
        count_lp++;
        if (count_lp != 1)
        {
         this.isTimerStart = true;
         this.isRecord = true;
         this.isRec = true;
         if (null != this.recordTimer)
          this.recordTimer.cancel();
         this.recordTimer = new Timer();
         if (isListen) {
          this.handler.sendEmptyMessage(Constants.MSG_LISTEN);
          isListen = false;
         }
        }
        else
        {
         this.buffer_temp = this.buffer.clone(); // copy: buffer is overwritten by the next read()
        }
       }
      } else {
       if (lp <= max_lp) {
        /*  If the sound pressure level drops to the default level or below, start a timer that stops the recording.
         This keeps short pauses between words (up to 0.65 s) in the recording instead of
         starting playback immediately after the end of a word.
        */
        this.isRecord = false;
        if (null != this.recordTimer)
         this.recordTimer.cancel();
        TimerTask task = new TimerTask() {
         public void run() {
          isTimerStart = false;
          isNotFullData = false;
          isStop = false;
         }
        };
        this.recordTimer = new Timer();
        this.recordTimer.schedule(task, 650);
        this.isTimerStart = true;
       }
      }
      // Write the audio data to the array, if any
      if (this.isTimerStart) {
       if (this.buffer_temp!=null)
       {
        System.arraycopy(this.buffer_temp, 0, this.array, this.size, this.buffer_temp.length);
        this.size += this.buffer_temp.length;
        this.buffer_temp = null;
       }
       System.arraycopy(this.buffer, 0, this.array, this.size, this.buffer.length);
       this.size += this.buffer.length;
       // Stop when the next block would no longer fit
       if (this.size + this.buffer.length > SIZE)
        this.isNotFullData = false;
      }
     }
    }
   } catch (Exception e) {
    e.printStackTrace();
   }

   try {
    if (this.recordTimer!=null)
     this.recordTimer.cancel();
    this.isRec = false;
    if (this.audioRecord != null) {
     if (!this.isStop) {
      this.isStartRecord = false;
      
      // Send the recorded voice for playback
      Message msg = this.handler.obtainMessage();
      msg.what = Constants.MSG_PLAYBACK;
      
      Bundle bundle = new Bundle();
      bundle.putInt(Constants.DATA_SIZE, this.size); 
      bundle.putByteArray(Constants.DATA_ARRAY, this.array);
      msg.setData(bundle);
      
      this.handler.sendMessage(msg);
     }
    } 
   } catch (Exception e) {
    e.printStackTrace();
   }  
  }
 }

Sunday, September 16, 2012

Talking Pets: Development Pet for Android - Part 1
