Development of entertainment applications in the style of talking pets can be
divided into three main parts: voice recording, voice playback and animation.
This article opens a short series on developing our own talking pet. We begin
with the algorithm for voice recording.
Some theory
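To detect human speech or other sounds, we calculate the sound pressure level.
To do this, use the following formula:
$$L_p = 20 \log_{10}\left(\frac{p_{rms}}{p_{ref}}\right)$$
For a set of n values x_1, ..., x_n, the RMS value of the sound pressure is
determined by the formula:
$$p_{rms} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} x_i^2}$$
where p_ref is a constant equal to 20 µPa.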
The resulting sound pressure level is used to detect human speech or sound: if
the level is greater than the threshold value, recording starts.
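As a rough illustration of the two formulas, here is a minimal Java sketch. The class and method names (SoundLevel, rms, level, exceeds) are hypothetical and not part of the project. It also assumes that raw 16-bit samples stand in for pressure values, so the result is a relative level rather than a calibrated one.

```java
final class SoundLevel {
    // Reference sound pressure from the text: 20 µPa.
    static final double P_REF = 0.00002;

    /** Root mean square of a block of samples. */
    static double rms(short[] samples) {
        if (samples.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (short s : samples) {
            sum += (double) s * s;
        }
        return Math.sqrt(sum / samples.length);
    }

    /** Level in decibels relative to pRef: 20 * log10(pRms / pRef). */
    static double level(double pRms, double pRef) {
        return 20.0 * Math.log10(pRms / pRef);
    }

    /** True when the block is louder than the threshold; silence never is. */
    static boolean exceeds(short[] samples, double thresholdDb) {
        double pRms = rms(samples);
        return pRms > 0.0 && level(pRms, P_REF) > thresholdDb;
    }
}
```

A caller would pass each recorded block to exceeds() together with the chosen threshold and start storing data once it returns true.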
This gives the following algorithm:
- start the recording thread;
- get a block of recorded data and calculate its sound pressure level;
- if the level is greater than the default level, store the recorded data in the repository;
- at the end of the recording (the recording ends when the sound pressure level drops below the default level), pass the data from the repository to the playback thread;
- play back the recording;
- when playback ends, return to the first step.
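The steps above can be sketched as a small state machine that is independent of any audio hardware. This is only an illustration under stated assumptions: VoiceGate, accept and maxQuietBlocks are hypothetical names, and the end of a recording is detected by counting consecutive quiet blocks rather than by a timer.

```java
import java.util.ArrayList;
import java.util.List;

final class VoiceGate {
    enum State { WAITING, RECORDING }

    private final int thresholdDb;     // level that starts a recording
    private final int maxQuietBlocks;  // quiet blocks tolerated inside a recording
    private final List<short[]> stored = new ArrayList<>();
    private State state = State.WAITING;
    private int quietBlocks = 0;

    VoiceGate(int thresholdDb, int maxQuietBlocks) {
        this.thresholdDb = thresholdDb;
        this.maxQuietBlocks = maxQuietBlocks;
    }

    /**
     * Feeds one block together with its level. Returns the finished recording
     * when the utterance ends, or null while waiting or still recording.
     */
    List<short[]> accept(short[] block, int levelDb) {
        boolean loud = levelDb > thresholdDb;
        if (state == State.WAITING) {
            if (!loud) {
                return null; // nothing to record yet
            }
            state = State.RECORDING;
            quietBlocks = 0;
            stored.clear();
        }
        stored.add(block.clone()); // copy, the caller may reuse its buffer
        quietBlocks = loud ? 0 : quietBlocks + 1;
        if (quietBlocks > maxQuietBlocks) {
            List<short[]> result = new ArrayList<>(stored);
            stored.clear();
            state = State.WAITING;
            return result; // hand this over to playback
        }
        return null;
    }
}
```

The recording loop would call accept() for every block it reads and pass any non-null result to the playback thread before going back to waiting.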
What was used
The
AudioRecord class manages the audio resources for Java applications to record
audio from the audio input hardware of the platform. This is achieved by
"pulling" (reading) the data from the AudioRecord object. The
application is responsible for polling the AudioRecord object in time using one
of the following three methods: read(byte[], int, int), read(short[], int, int)
or read(ByteBuffer, int). The choice of which method to use will be based on
the audio data storage format that is the most convenient for the user of
AudioRecord.
Upon
creation, an AudioRecord object initializes its associated audio buffer that it
will fill with the new audio data. The size of this buffer, specified during
the construction, determines how long an AudioRecord can record before
"over-running" data that has not been read yet. Data should be read
from the audio hardware in chunks of sizes inferior to the total recording
buffer size.
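To get a feeling for how long a buffer of a given size lasts before it over-runs, the duration can be estimated from the audio format. The sketch below is plain arithmetic; BufferMath and its methods are hypothetical names, and any concrete buffer size passed to it is only an example value, not one reported by a device.

```java
final class BufferMath {
    /** Bytes of PCM audio produced per second for the given format. */
    static int bytesPerSecond(int sampleRateHz, int channels, int bytesPerSample) {
        return sampleRateHz * channels * bytesPerSample;
    }

    /** Milliseconds of audio that fit into a buffer of the given size. */
    static double bufferMillis(int bufferBytes, int sampleRateHz,
                               int channels, int bytesPerSample) {
        return 1000.0 * bufferBytes
                / bytesPerSecond(sampleRateHz, channels, bytesPerSample);
    }
}
```

For 16-bit stereo at 44100 Hz, for instance, bufferMillis(8192, 44100, 2, 2) comes to roughly 46 ms, so the reading thread has to poll noticeably more often than that.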
Practice
First, you need to add permissions to AndroidManifest:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.WAKE_LOCK" />
Basic constants and attributes for the class:
public static final int DEFAULT_LP = 73; // sound pressure level by default
public static int FREQUENCY = 24050; // recording frequency
public static final int CHANNEL = AudioFormat.CHANNEL_CONFIGURATION_STEREO; // mono or stereo channel
public static final int ENCODING = AudioFormat.ENCODING_PCM_16BIT; // encoding
public static final int SIZE = 1000000; // data size

// sample rates
private static final int[] mSampleRates = new int[] { 44100, 22050, 16000, 11025, 8000 };

// recorded data
private byte[] array = new byte[SIZE];
private int size = 0;

private int bufferSize; // buffer size
private byte[] buffer; // buffer containing the recorded voice
private AudioRecord audioRecord = null; // used to record audio data
private Timer recordTimer = new Timer(); // timer to stop recording
private Handler handler; // to send messages to the application UI thread

// Used to track the various states of the thread
private boolean isStop = false;
private boolean isStartRecord = false;
private boolean isNotFullData = false;
private boolean isRecord = false;
private boolean isTimerStart = false;
private boolean isFirst = true;
private boolean isFinish = false;
private boolean isRec = false;

private byte[] buffer_temp; // local buffer
The byte[] buffer field is the data buffer into which AudioRecord puts the data
from the audio hardware.
Next, you need to initialize AudioRecord:
/**
 * Searches for an AudioRecord configuration that the device supports.
 */
public AudioRecord findAudioRecord() {
    DebugLog.w(TAG, "findAudioRecord()");
    // Query the minimum buffer size for the preferred settings
    // (getMinBufferSize returns a negative error code on failure)
    this.bufferSize = AudioRecord.getMinBufferSize(FREQUENCY, CHANNEL, ENCODING);
    if (this.bufferSize > 0) {
        AudioRecord recorder = new AudioRecord(AudioSource.DEFAULT, FREQUENCY, CHANNEL, ENCODING, bufferSize);
        if (recorder.getState() == AudioRecord.STATE_INITIALIZED)
            return recorder;
        recorder.release(); // free the resources of the failed attempt
    }
    // If an AudioRecord cannot be created with the preset parameters,
    // try the other sample rates until one works
    for (int rate : mSampleRates) {
        try {
            this.bufferSize = AudioRecord.getMinBufferSize(rate, CHANNEL, ENCODING);
            if (this.bufferSize > 0) {
                FREQUENCY = rate;
                AudioRecord recorder = new AudioRecord(AudioSource.DEFAULT, FREQUENCY, CHANNEL, ENCODING, this.bufferSize);
                if (recorder.getState() == AudioRecord.STATE_INITIALIZED)
                    return recorder;
                recorder.release(); // free the resources of the failed attempt
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    return null;
}

public Record(Handler handler) {
    DebugLog.w(TAG, "Record()");
    this.handler = handler;
    this.isNotFullData = false;
    this.isStop = false;
    this.isFirst = true;
    this.audioRecord = findAudioRecord();
    if (this.bufferSize > 0)
        this.buffer = new byte[this.bufferSize];
}
Now calculate Prms:
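private double getPrms(byte[] buffer) {
    int samples = buffer.length / 2; // two bytes per 16-bit sample
    if (samples == 0)
        return 0d;
    double prms = 0d;
    for (int i = 0; i < samples; i++) {
        short x = getShort(buffer[i * 2], buffer[i * 2 + 1]);
        prms += x * x;
    }
    // Divide by the number of samples (not bytes), as in the RMS formula
    prms = Math.sqrt(prms / samples);
    return prms;
}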
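The getShort() method combines two bytes (low byte first) into one 16-bit
sample, so that the sound pressure level is calculated from whole samples:
private short getShort(byte argB1, byte argB2) {
    // Mask the low byte so that its sign bit does not overwrite the high byte
    return (short) ((argB1 & 0xFF) | (argB2 << 8));
}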
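One detail worth checking when combining bytes by hand is sign extension: in Java a byte is signed, so a low byte of 0x80 or above becomes a negative int unless it is masked with 0xFF. The sketch below shows the masked conversion next to an alternative based on java.nio.ByteBuffer. PcmBytes and its methods are hypothetical names, and little-endian byte order is an assumption about the recorded data.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmBytes {
    /** Combines a low and a high byte into one signed 16-bit sample. */
    static short toShort(byte low, byte high) {
        // Without "& 0xFF" a negative low byte would be sign-extended
        // and fill the upper bits with ones.
        return (short) ((low & 0xFF) | (high << 8));
    }

    /** Decodes a whole block of little-endian bytes into samples. */
    static short[] toSamples(byte[] bytes) {
        short[] samples = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }
}
```

Decoding the whole block once with toSamples() avoids calling a per-sample helper inside the RMS loop, at the cost of one extra array.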
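Once Prms is calculated, determine Lp (the sound pressure level):
private int getLp(double prms, double pref, int shum) {
    // A completely silent block has no defined level; report the lowest value
    // so that the int cast of -Infinity plus shum cannot overflow
    if (prms <= 0)
        return Integer.MIN_VALUE;
    return (int) (20 * Math.log10(prms / pref)) + shum;
}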
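A level function of this shape needs care with an all-zero block: log10(0) is negative infinity, casting that to int yields Integer.MIN_VALUE, and adding a negative offset then wraps around to a large positive number that would pass any threshold check. The sketch below contrasts the two behaviours. LevelEdgeCase and its method names are hypothetical.

```java
final class LevelEdgeCase {
    /** Level computed with no special handling of silence. */
    static int unguarded(double prms, double pref, int offset) {
        return (int) (20 * Math.log10(prms / pref)) + offset;
    }

    /** Same computation, but silence maps to the lowest possible level. */
    static int guarded(double prms, double pref, int offset) {
        if (prms <= 0) {
            return Integer.MIN_VALUE;
        }
        return (int) (20 * Math.log10(prms / pref)) + offset;
    }
}
```

With the guard in place, a comparison such as lp > DEFAULT_LP stays false for silent input.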
The main method is run(), which is called through the Runnable interface. It
checks whether recording should start, whether the incoming data contains
speech, and whether the recording has ended.
public void run() {
    while (!this.isFinish) {
        if (!this.isStartRecord)
            continue;
        try {
            DebugLog.w(TAG, "run()");
            int lp = 0; // current sound pressure level
            // Initialize variables for the calculation of the sound pressure level
            double prms = 0.0;
            double pref = 0.00002;
            int shum = -80; // background noise of the data transmission channel
            this.array = new byte[SIZE];
            this.size = 0;
            boolean isListen = true;
            int count_lp = 0;
            if (this.audioRecord != null) {
                this.audioRecord.startRecording();
                while (this.isNotFullData) {
                    Arrays.fill(buffer, (byte) 0); // fill the array with zeros
                    this.audioRecord.read(this.buffer, 0, this.buffer.length); // read the next block of audio
                    // FIXME: the first block read contains noise, so it is skipped.
                    // Because of this, the character may swallow the first syllable.
                    if (this.isFirst) {
                        this.isFirst = false;
                        continue;
                    }
                    prms = getPrms(this.buffer);
                    lp = getLp(prms, pref, shum);
                    int max_lp = DEFAULT_LP;
                    if (!isRecord) {
                        // If the current sound pressure level is higher than
                        // the default level, begin writing data
                        if (lp > max_lp) {
                            DebugLog.i(TAG, "lp = " + lp);
                            count_lp++;
                            if (count_lp != 1) {
                                this.isTimerStart = true;
                                this.isRecord = true;
                                this.isRec = true;
                                if (null != this.recordTimer)
                                    this.recordTimer.cancel();
                                this.recordTimer = new Timer();
                                if (isListen) {
                                    this.handler.sendEmptyMessage(Constants.MSG_LISTEN);
                                    isListen = false;
                                }
                            } else {
                                // Keep a copy: buffer itself is overwritten by the next read
                                this.buffer_temp = this.buffer.clone();
                            }
                        }
                    } else {
                        if (lp <= max_lp) {
                            /*
                             * If the sound pressure level is not above the default
                             * level, start the timer that stops recording. This keeps
                             * short pauses between words (0.65 s) in the recording
                             * instead of starting playback right after a word ends.
                             */
                            this.isRecord = false;
                            if (null != this.recordTimer)
                                this.recordTimer.cancel();
                            TimerTask task = new TimerTask() {
                                public void run() {
                                    isTimerStart = false;
                                    isNotFullData = false;
                                    isStop = false;
                                }
                            };
                            this.recordTimer = new Timer();
                            this.recordTimer.schedule(task, 650);
                            this.isTimerStart = true;
                        }
                    }
                    // Write the audio data to the array, if any
                    if (this.isTimerStart) {
                        if (this.buffer_temp != null) {
                            System.arraycopy(this.buffer_temp, 0, this.array, this.size, this.buffer_temp.length);
                            this.size += this.buffer_temp.length;
                            this.buffer_temp = null;
                        }
                        System.arraycopy(this.buffer, 0, this.array, this.size, this.buffer.length);
                        this.size += this.buffer.length;
                        // Stop when the next block would no longer fit
                        if (this.size + this.buffer.length > SIZE)
                            this.isNotFullData = false;
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        try {
            if (this.recordTimer != null)
                this.recordTimer.cancel();
            this.isRec = false;
            if (this.audioRecord != null) {
                if (!this.isStop) {
                    this.isStartRecord = false;
                    // Play back the recorded voice
                    Message msg = this.handler.obtainMessage();
                    msg.what = Constants.MSG_PLAYBACK;
                    Bundle bundle = new Bundle();
                    bundle.putInt(Constants.DATA_SIZE, this.size);
                    bundle.putByteArray(Constants.DATA_ARRAY, this.array);
                    msg.setData(bundle);
                    this.handler.sendMessage(msg);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Links
- The source code of this project can be downloaded here (only while the project is ongoing): zip