This article wires the whole audio/video recording pipeline together: capture audio and video, encode both streams, and package them into an MP4 file.

Audio capture: AudioRecord
Video capture: Camera preview callback YUV data
Encoding: MediaCodec
Muxing into MP4: MediaMuxer

The work is split across a few threads:
1. audioThread: audio capture and encoding
2. videoThread: video encoding
3. muxerThread: muxing

Sample code: Kotlin

The complete code is on github (the address is given at the end); the example Activity is Camera1PreviewActivity.

Some validation is missing from the code, for example checking that the device supports the chosen preview format; as mentioned in the previous article, make sure your device actually supports whatever you configure.

Common pitfalls are listed at the end; when the code does not run correctly, check whether you have made one of those mistakes.
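
Before looking at each piece, here is a minimal sketch of how the threads might be started and stopped. This is hypothetical glue code: the real start/stop logic is in Camera1PreviewActivity on github, and names such as startRecord()/stopRecord() are my own.

    //Hypothetical sketch inside the Activity, not the project's exact code
    private fun startRecord() {
        isRecording = true                  //preview callback starts queueing frames
        videoThread = VideoEncodeThread()
        audioThread = AudioThread()
        muxerThread = MuxThread(applicationContext)
        muxerThread?.start()                //waits until both MediaFormats are captured
        videoThread.start()
        audioThread.start()
    }

    private fun stopRecord() {
        isRecording = false                 //AudioThread's read loop ends and sends EOS
        videoExit = true                    //VideoEncodeThread drains its queue and sends EOS
        MuxThread.muxExit = true            //MuxThread stops writing and releases the muxer
    }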

1. Initialize and open the camera

The preview UI uses a SurfaceView; camera preview itself was covered in the previous article, so I won't repeat it here.

  private fun initView() {
        surfaceView = findViewById(com.example.mediastudyproject.R.id.surface_view)
        surfaceView.holder.addCallback(object : SurfaceHolder.Callback2 {
            override fun surfaceRedrawNeeded(holder: SurfaceHolder?) {
            }

            override fun surfaceChanged(
                holder: SurfaceHolder?,
                format: Int,
                width: Int,
                height: Int
            ) {
                isSurfaceAvailiable = true
                this@Camera1PreviewActivity.holder = holder
            }

            override fun surfaceDestroyed(holder: SurfaceHolder?) {
                isSurfaceAvailiable = false
                mCamera?.stopPreview()
                //Cancel the preview callback set earlier; otherwise, after the app closes and the camera is released, the callback keeps firing and an exception is thrown
                mCamera?.setPreviewCallback(null)
                mCamera?.release()
                mCamera = null
            }

            override fun surfaceCreated(holder: SurfaceHolder?) {
                isSurfaceAvailiable = true
                this@Camera1PreviewActivity.holder = holder
                thread {
                	//Turn on the camera
                    openCamera(Camera.CameraInfo.CAMERA_FACING_BACK)
                }
            }
        })
    }

Camera parameter settings

 /**
     * Initialize and open the camera; the rear camera is opened by default here
     */
    private fun openCamera(cameraId: Int) {
        mCamera = Camera.open(cameraId)
        mCamera?.run {
            setPreviewDisplay(holder)
            setDisplayOrientation(WindowDegree.getDegree(this@Camera1PreviewActivity))

            var cameraInfo = Camera.CameraInfo()
            Camera.getCameraInfo(cameraId, cameraInfo)
            Log.i("camera1", "camera direction ${cameraInfo.orientation}")


            val parameters = parameters

            parameters?.run {

                //Auto exposure left my preview far too dark, so push exposure compensation to the maximum
                exposureCompensation = maxExposureCompensation

                //Lock auto white balance if the device supports locking it
                autoWhiteBalanceLock = isAutoWhiteBalanceLockSupported


                //Set preview size
                appropriatePreviewSizes = getAppropriatePreviewSizes(parameters)
                setPreviewSize(appropriatePreviewSizes?.width!!, appropriatePreviewSizes?.height!!)

                //set focus mode
                val supportedFocusModes = supportedFocusModes
                if (supportedFocusModes.contains(Camera.Parameters.FOCUS_MODE_AUTO)) {
                	//FOCUS_MODE_AUTO focuses only once per Camera.autoFocus() call
                	//For continuous focus you would have to trigger autoFocus() repeatedly, so it is not called here
                	//One way is to re-post a message from a Handler; a sketch follows the openCamera code below
                    focusMode = Camera.Parameters.FOCUS_MODE_AUTO
                }
                previewFormat = ImageFormat.NV21
            }

			//When releasing camera resources, remember to call setPreviewCallback(null) to remove this callback
            setPreviewCallback { data, camera ->
            	//isRecording is the recording flag; while it is set, every preview frame is queued for the encoder
                if (isRecording) {
                    if (data != null) {
                        Log.i("camera1", "Get video data ${data.size}")
                        Log.i("camera1", "Whether the video thread is   $videoThread")
                        videoThread.addVideoData(data)
                    }
                }

            }
			//start preview
            startPreview()
        }
    }
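
As noted in the focus-mode comments above, FOCUS_MODE_AUTO only focuses once per Camera.autoFocus() call. A minimal sketch of continuous focus driven by a Handler, inside the Activity (the 2-second interval and the function names are assumptions of mine, not part of the project):

    //Hypothetical sketch: re-trigger autofocus periodically while previewing
    private val focusHandler = Handler(Looper.getMainLooper())
    private val focusRunnable = object : Runnable {
        override fun run() {
            mCamera?.autoFocus { success, _ ->
                Log.i("camera1", "autofocus result $success")
            }
            focusHandler.postDelayed(this, 2000)   //focus again in 2 seconds
        }
    }

    private fun startContinuousFocus() = focusHandler.post(focusRunnable)
    private fun stopContinuousFocus() = focusHandler.removeCallbacks(focusRunnable)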

To keep the article from getting too long, some code is omitted, for example getAppropriatePreviewSizes(parameters); you can view the full source on github. A rough sketch of what it does is shown below.
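
The sketch: pick one of the sizes the device actually reports. The target resolution and the selection rule here are my assumptions; the real implementation is on github.

    //Hypothetical sketch: choose a supported preview size close to a target resolution.
    //Only sizes from getSupportedPreviewSizes() are safe to pass to setPreviewSize().
    private fun getAppropriatePreviewSizes(parameters: Camera.Parameters): Camera.Size? {
        val targetWidth = 1280
        val targetHeight = 720
        return parameters.supportedPreviewSizes.minByOrNull { size ->
            Math.abs(size.width - targetWidth) + Math.abs(size.height - targetHeight)
        }
    }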

2. Video processing thread
The preview frames arrive as NV21, a YUV format that the Camera1 API can return (Camera2 does not support it). The encoder works best with NV12, so the frames have to be converted. The video thread's job is therefore: take the queued frames and convert NV21 to NV12 -> encode to H.264 -> hand the output to the muxer.

/**
 * For brevity this is an inner class declared directly in the Activity; extract it into its own file if you want cleaner code.
 **/
  inner class VideoEncodeThread : Thread() {
  		//Frames from the preview callback are queued here
        private val videoData = LinkedBlockingQueue<ByteArray>()


        fun addVideoData(byteArray: ByteArray) {
            videoData.offer(byteArray)
        }


        override fun run() {
            super.run()
            //Create a MediaFormat for encoding, posted below
            initVideoFormat()
            
			//Create video encoder MediaCodec
            videoCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC)
            videoCodec!!.configure(videoMediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
            videoCodec!!.start()
            //Keep encoding queued frames until the exit flag is set
            while (!videoExit) {

                val poll = videoData.poll()
                if (poll != null) {
                    encodeVideo(poll, false)
                }
            }

            //Send encoding end flag
            encodeVideo(ByteArray(0), true)
            //Note the release of resources
            videoCodec!!.release()
            Log.i("camera1", "video release")
        }
    }

Initialize MediaFormat

    private fun initVideoFormat() {
        videoMediaFormat =
            MediaFormat.createVideoFormat(
                MediaFormat.MIMETYPE_VIDEO_AVC,
                appropriatePreviewSizes!!.width,
                appropriatePreviewSizes!!.height
            )
        //Color format; COLOR_FormatYUV420Flexible was added in Android 5.0 (API 21). See the capability-check sketch after this function to verify device support
        videoMediaFormat.setInteger(
            MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Flexible
        )
        //set frame rate
        videoMediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30)
        //set bit rate
        videoMediaFormat.setInteger(
            MediaFormat.KEY_BIT_RATE,
            appropriatePreviewSizes!!.width * appropriatePreviewSizes!!.height * 5
        )
        //Key frame interval, in seconds
        videoMediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5)
    }
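
Since not every device supports every color format (or preview size), it can be worth checking what the device's AVC encoders actually report. This check is not part of the original project, just a sketch of how one might verify it:

    //Hypothetical sanity check: log the color formats supported by the device's AVC encoders
    private fun logAvcEncoderColorFormats() {
        MediaCodecList(MediaCodecList.REGULAR_CODECS).codecInfos
            .filter { it.isEncoder && it.supportedTypes.contains(MediaFormat.MIMETYPE_VIDEO_AVC) }
            .forEach { info ->
                val caps = info.getCapabilitiesForType(MediaFormat.MIMETYPE_VIDEO_AVC)
                Log.i("camera1", "${info.name} color formats: ${caps.colorFormats.joinToString()}")
            }
    }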

Video encoding (synchronous mode)

   private fun encodeVideo(data: ByteArray, isFinish: Boolean) {
        val videoArray = ByteArray(data.size)
        if (!isFinish) {
        	//Convert NV21 to NV12: the only difference is the interleaved chroma plane, VUVU... in NV21 versus UVUV... in NV12
        	//The full helper is on github; a sketch follows this function
            NV21toI420SemiPlanar(
                data,
                videoArray,
                appropriatePreviewSizes!!.width,
                appropriatePreviewSizes!!.height
            )
        }
        val videoInputBuffers = videoCodec!!.inputBuffers
        var videoOutputBuffers = videoCodec!!.outputBuffers


        //TIME_OUT_US is 10000 microseconds (0.01 s). It was originally 1 s, which caused dropped video frames
        //and, in bad cases, no audio at all, so this timeout must not be set too high
        val index = videoCodec!!.dequeueInputBuffer(TIME_OUT_US)

        if (index >= 0) {
            val byteBuffer = videoInputBuffers[index]
            byteBuffer.clear()
            byteBuffer.put(videoArray)
            if (!isFinish) {
                videoCodec!!.queueInputBuffer(index, 0, videoArray.size, System.nanoTime()/1000, 0)
            } else {
                videoCodec!!.queueInputBuffer(
                    index,
                    0,
                    0,
                    System.nanoTime()/1000,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )

            }
            val bufferInfo = MediaCodec.BufferInfo()
            Log.i("camera1", "coding video  $index to write buffer ${videoArray?.size}")

            var dequeueIndex = videoCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)

			//The MediaFormat handed to MediaMuxer should be the encoder output format captured here; once captured it never needs to change
			//If you use a different MediaFormat, MediaMuxer is very likely to throw a "failed to stop" exception when it is closed
            if (dequeueIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                if (MuxThread.videoMediaFormat == null)
                    MuxThread.videoMediaFormat = videoCodec!!.outputFormat
            }

            if (dequeueIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                videoOutputBuffers = videoCodec!!.outputBuffers
            }

            while (dequeueIndex >= 0) {
                val outputBuffer = videoOutputBuffers[dequeueIndex]
                //The codec-config buffer (SPS/PPS) is already contained in the MediaFormat captured above, so it is not written to the muxer here
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_CODEC_CONFIG != 0) {
                    bufferInfo.size = 0
                }
                //Add encoded data to the queue and wait for Muxer to write
                if (bufferInfo.size != 0) {
                    muxerThread?.addVideoData(outputBuffer, bufferInfo)
                }
                Log.i(
                    "camera1",
                    "after encoding video $dequeueIndex buffer.size ${bufferInfo.size} buff.position ${outputBuffer.position()}"
                )
                videoCodec!!.releaseOutputBuffer(dequeueIndex, false)
                //Stop once the end-of-stream flag is reached
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM != 0) {
                    break
                } else{
                    dequeueIndex = videoCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)
                }
            }
        }
    }
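
The NV21toI420SemiPlanar() helper is on github. The conversion itself only swaps each V/U pair in the interleaved chroma plane; a sketch, assuming the standard NV21/NV12 layouts:

    //Sketch of NV21 (Y plane + interleaved VU) to NV12 (Y plane + interleaved UV).
    //Both arrays must be width * height * 3 / 2 bytes.
    private fun nv21ToNv12(nv21: ByteArray, nv12: ByteArray, width: Int, height: Int) {
        val ySize = width * height
        System.arraycopy(nv21, 0, nv12, 0, ySize)   //the Y plane is identical
        var i = ySize
        while (i < nv21.size - 1) {
            nv12[i] = nv21[i + 1]                   //U
            nv12[i + 1] = nv21[i]                   //V
            i += 2
        }
    }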

3. Audio thread
The audio thread follows the same pattern as the video thread: capture PCM data with AudioRecord -> encode it to AAC -> queue it for the muxer. Since the flow is almost identical, the individual steps are not explained again in detail.

Prepare AudioRecord recording

    inner class AudioThread : Thread() {
        private val audioData = LinkedBlockingQueue<ByteArray>()


        fun addAudioData(byteArray: ByteArray) {
            audioData.offer(byteArray)
        }

        override fun run() {
            super.run()
            prepareAudioRecord()
        }
    }

 /**
     * Prepare and start AudioRecord
     */
    private fun prepareAudioRecord() {
        initAudioFormat()

        audioCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC)

        audioCodec!!.configure(audioMediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
        audioCodec!!.start()

		//Create the AudioRecord; the configuration constants live in AudioConfig and minSize comes from the system's minimum buffer size calculation, see github (and the sketch after this function)
        audioRecorder = AudioRecord(
            MediaRecorder.AudioSource.MIC, AudioConfig.SAMPLE_RATE,
            AudioConfig.CHANNEL_CONFIG, AudioConfig.AUDIO_FORMAT, minSize
        )


        if (audioRecorder!!.state == AudioRecord.STATE_INITIALIZED) {

            audioRecorder?.run {
                startRecording()

			  
                val byteArray = ByteArray(SAMPLES_PER_FRAME)
                var read = read(byteArray, 0, SAMPLES_PER_FRAME)
                while (read > 0 && isRecording) {
                    Log.i("camera1", "audio read $read")

					//A presentation timestamp is captured for each read; getPTSUs() returns the current system time in microseconds
                    encodeAudio(byteArray, read, getPTSUs())


                    //If each read uses minSize (the calculated minimum buffer size), the muxed file plays back with
                    //no sound and wrong timestamps, probably because that much data exceeds one encoder input frame.
                    //Still to be investigated; reads of 1024 or 2048 bytes both play back correctly
                    read = read(byteArray, 0, SAMPLES_PER_FRAME)

                }

                audioRecorder!!.release()
                //Send EOS encoding end message
                encodeAudio(ByteArray(0), 0, getPTSUs())
                Log.i("camera1", "audio release")
                audioCodec!!.release()
            }
        }
    }
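
AudioConfig, minSize, SAMPLES_PER_FRAME, getPTSUs() and initAudioFormat() are only referenced above; the real definitions are on github. A sketch of what they might look like (the sample rate, channel setup, bit rate and frame size below are my assumptions, not values taken from the project):

    //Hypothetical sketch of the audio configuration used above
    object AudioConfig {
        const val SAMPLE_RATE = 44100                            //assumed sample rate
        const val CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO   //assumed mono input
        const val AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT  //assumed 16-bit PCM
    }

    //Assumed read size; the notes at the end say 1024 or 2048 bytes both work
    private val SAMPLES_PER_FRAME = 2048

    //Minimum recorder buffer size reported by the system (still used to create the AudioRecord)
    private val minSize = AudioRecord.getMinBufferSize(
        AudioConfig.SAMPLE_RATE, AudioConfig.CHANNEL_CONFIG, AudioConfig.AUDIO_FORMAT
    )

    //Presentation timestamp in microseconds, as expected by queueInputBuffer()
    private fun getPTSUs(): Long = System.nanoTime() / 1000

    private fun initAudioFormat() {
        audioMediaFormat = MediaFormat.createAudioFormat(
            MediaFormat.MIMETYPE_AUDIO_AAC, AudioConfig.SAMPLE_RATE, 1
        ).apply {
            setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC)
            setInteger(MediaFormat.KEY_BIT_RATE, 96000)          //assumed AAC bit rate
            setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, minSize)
        }
    }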

Audio encoding (synchronous mode)

    /**
     * @param audioArray PCM data read from AudioRecord
     * @param read number of valid bytes in audioArray (0 signals end of stream)
     * @param timeStamp presentation timestamp in microseconds
     */
    private fun encodeAudio(audioArray: ByteArray?, read: Int, timeStamp: Long) {
        val index = audioCodec!!.dequeueInputBuffer(TIME_OUT_US)
        val audioInputBuffers = audioCodec!!.inputBuffers

        if (index >= 0) {
            val byteBuffer = audioInputBuffers[index]
            byteBuffer.clear()
            byteBuffer.put(audioArray, 0, read)
            if (read != 0) {
                audioCodec!!.queueInputBuffer(index, 0, read, timeStamp, 0)
            } else {
                audioCodec!!.queueInputBuffer(
                    index,
                    0,
                    read,
                    timeStamp,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )

            }


            val bufferInfo = MediaCodec.BufferInfo()
            Log.i("camera1", "coding audio  $index to write buffer ${audioArray?.size}")
            var dequeueIndex = audioCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)
            if (dequeueIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                if (MuxThread.audioMediaFormat == null) {
                    MuxThread.audioMediaFormat = audioCodec!!.outputFormat
                }
            }
            var audioOutputBuffers = audioCodec!!.outputBuffers
            if (dequeueIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                audioOutputBuffers = audioCodec!!.outputBuffers
            }
            while (dequeueIndex >= 0) {
                val outputBuffer = audioOutputBuffers[dequeueIndex]
                Log.i(
                    "camera1",
                    "after encoding audio $dequeueIndex buffer.size ${bufferInfo.size} buff.position ${outputBuffer.position()}"
                )
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_CODEC_CONFIG != 0) {
                    bufferInfo.size = 0
                }
                if (bufferInfo.size != 0) {
                    Log.i("camera1","audio timestamp  ${bufferInfo.presentationTimeUs /1000}")
                    muxerThread?.addAudioData(outputBuffer, bufferInfo)
                }

                audioCodec!!.releaseOutputBuffer(dequeueIndex, false)
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM != 0) {
                      break
                } else {
                    dequeueIndex = audioCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)
                }
            }
        }

    }

The process is basically the same as video encoding

4. MediaMuxer muxing thread

The MediaMuxer work is pulled out into its own thread class. Its job: create a MediaMuxer -> wait for the audio and video MediaFormats and add the corresponding tracks -> start the muxer -> drain the queues and write the encoded samples.

class MuxThread(val context: Context) : Thread() {
    private val audioData = LinkedBlockingQueue<EncodeData>()
    private val videoData = LinkedBlockingQueue<EncodeData>()

    companion object {
        var muxIsReady = false
        var audioMediaFormat: MediaFormat? = null
        var videoMediaFormat: MediaFormat? = null
        var muxExit = false
    }

    private lateinit var mediaMuxer: MediaMuxer
    fun addAudioData(byteBuffer: ByteBuffer, bufferInfo: MediaCodec.BufferInfo) {
        audioData.offer(EncodeData(byteBuffer, bufferInfo))
    }

    fun addVideoData(byteBuffer: ByteBuffer, bufferInfo: MediaCodec.BufferInfo) {
        videoData.offer(EncodeData(byteBuffer, bufferInfo))
    }


    private fun initMuxer() {

        val file = File(context.filesDir, "muxer.mp4")
        if (!file.exists()) {
            file.createNewFile()
        }
        mediaMuxer = MediaMuxer(
            file.path,
            MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4
        )

        audioAddTrack = mediaMuxer.addTrack(audioMediaFormat)
        videoAddTrack = mediaMuxer.addTrack(videoMediaFormat)
        //Note that adding tracks must be done before start
        mediaMuxer.start()
        muxIsReady = true

    }

    private fun muxerParamtersIsReady() = audioMediaFormat != null && videoMediaFormat != null


    override fun run() {
        super.run()
		//Busy-wait until the encoder threads have captured both the audio and video MediaFormats
        while (!muxerParamtersIsReady()) {
        }

		//Initialize the muxer, add the audio and video tracks, and start muxing
        initMuxer()
        Log.i("camera1", "current record status $isRecording ")
        while (!muxExit) {
            if (audioAddTrack != -1) {
                if (audioData.isNotEmpty()) {
                    val poll = audioData.poll()
                    Log.i("camera1", "mix write audio ${poll.bufferInfo.size} ")
                    mediaMuxer.writeSampleData(audioAddTrack, poll.buffer, poll.bufferInfo)

                }
            }
            if (videoAddTrack != -1) {
                if (videoData.isNotEmpty()) {
                    val poll = videoData.poll()
                    Log.i("camera1", "mix write video ${poll.bufferInfo.size} ")
                    mediaMuxer.writeSampleData(videoAddTrack, poll.buffer, poll.bufferInfo)

                }
            }
        }

		//Writing is finished; stop and release the muxer
        mediaMuxer.stop()
        mediaMuxer.release()
        Log.i("camera1", "synth unleashed")
        Log.i("camera1", "no audio written ${audioData.size}")
        Log.i("camera1", "no video written ${videoData.size}")
    }
}
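
EncodeData itself is not shown in the article. Because the encoder threads call releaseOutputBuffer() right after queueing a sample, the real class presumably copies the data; a sketch under that assumption (the actual implementation is on github):

    //Hypothetical sketch: copy the encoded bytes and BufferInfo so the codec buffer can be released
    class EncodeData(source: ByteBuffer, info: MediaCodec.BufferInfo) {
        val bufferInfo = MediaCodec.BufferInfo().apply {
            set(0, info.size, info.presentationTimeUs, info.flags)
        }
        val buffer: ByteBuffer = ByteBuffer.allocate(info.size).also { copy ->
            source.position(info.offset)
            source.limit(info.offset + info.size)
            copy.put(source)
            copy.flip()
        }
    }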

That covers the main flow. Below are the points to pay attention to, which are also the places where this code most easily goes wrong.

1. For audio capture and encoding, do not use the calculated minimum buffer size as the read size, otherwise the muxed file plays back with no sound; reading and encoding 1024 or 2048 bytes at a time gives correct results.

2. In MediaCodec encoding, the timeout used when dequeuing buffers must not be too long, otherwise the encoded video skips frames and, in bad cases, the audio disappears entirely.

3. The MediaFormat handed to MediaMuxer should be the encoder output format captured during MediaCodec encoding, obtained the way the code above shows (on INFO_OUTPUT_FORMAT_CHANGED); otherwise specific data may be missing and MediaMuxer throws an exception when it is stopped.

4. MediaMuxer's audio and video tracks must be added before start() is called.

5. The preview callback set with Camera.setPreviewCallback must be removed with setPreviewCallback(null) when the Camera resources are released; otherwise, after release() is called, the callback keeps firing and an exception is reported saying the Camera is still in use.

6. The preview size must be one the system supports; with Camera1 the supported sizes come from parameters.getSupportedPreviewSizes(). Setting a preview size the system does not support is likely to cause problems when recording.

Project github address, code in Camera1PreviewActivity

