What's that Noise? : Working with sound on Android

Sound is diverse. Its nature is to take so many forms that no solid rules can draw lines between them. I have always been an admirer of sound.

I have spent a few weeks implementing sound acquisition and processing on Android, and have come up with something to begin with.

Presenting:



Shh..Silence 

An application that monitors the sound level of an environment and plays a 'Shhhh...' when the noise level passes a limit.

You can download it here:

Get it on Google Play

For the end user it is a harmless application, but from an engineering point of view, the internals are a window to a ton of possibilities.

The application monitors sound in the following manner:
  1. Acquire the microphone
  2. Configure (see the setup sketch after this list):
    • Sample rate
    • Mono/stereo
    • Encoding format
    • Buffer size
  3. Calculate the average over the buffer
  4. Compare the obtained value with a threshold
  5. Trigger when above the threshold
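
Steps 1 and 2 map onto Android's AudioRecord API. Below is a minimal setup sketch; the sample rate, channel, and encoding values are illustrative choices rather than the app's exact configuration, and it assumes the RECORD_AUDIO permission has already been granted:

     import android.media.AudioFormat;
     import android.media.AudioRecord;
     import android.media.MediaRecorder;

     public class MicSetup {
         // Illustrative configuration: 44.1 kHz mono 16-bit PCM is widely
         // supported, but the app's actual values may differ
         private static final int SAMPLE_RATE = 44100;
         private static final int CHANNEL = AudioFormat.CHANNEL_IN_MONO;
         private static final int ENCODING = AudioFormat.ENCODING_PCM_16BIT;

         public static AudioRecord acquireMicrophone() {
             // Ask the framework for the smallest buffer it can service reliably
             int bufferSize = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL, ENCODING);
             AudioRecord audio = new AudioRecord(MediaRecorder.AudioSource.MIC,
                     SAMPLE_RATE, CHANNEL, ENCODING, bufferSize);
             audio.startRecording();  // begin filling the internal buffer
             return audio;
         }
     }
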
The pipeline is simple when dealing only with the average. Here's a code snippet for step 3:

     private void readAudioBuffer() {
         try {
             short[] buffer = new short[bufferSize];
             int bufferReadResult = 1;
             if (audio != null) {
                 // Fill the buffer with raw PCM samples from the microphone
                 bufferReadResult = audio.read(buffer, 0, bufferSize);
                 double sumLevel = 0;
                 for (int i = 0; i < bufferReadResult; i++) {
                     // Sum sample magnitudes: signed samples average out to ~0,
                     // so take the absolute value per sample, not of the final sum
                     sumLevel += Math.abs(buffer[i]);
                 }
                 lastLevel = sumLevel / bufferReadResult;
             }
         } catch (Exception e) {
             e.printStackTrace();
         }
     }
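
Steps 4 and 5 then reduce to comparing lastLevel against a tunable threshold. A minimal sketch, assuming lastLevel is updated by readAudioBuffer() above; NOISE_THRESHOLD and playShh() are hypothetical names used only for illustration:

     // Illustrative threshold; the real value would be tuned empirically
     private static final double NOISE_THRESHOLD = 2000.0;

     private void checkLevel() {
         if (lastLevel > NOISE_THRESHOLD) {
             playShh();  // hypothetical helper that plays the 'Shhhh...' clip
         }
     }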
    

The most intriguing part is the buffer filled by audio.read(): it contains the sequence of numbers that depict the sound received by the microphone, and bufferReadResult tells you how many of them are valid. What needs to be done next is a matter of requirement. By extracting audio features such as mel-frequency cepstral coefficients (MFCCs), the application can be stretched to the domains of audio classification, speech recognition, user identification, and keyword detection.

Implementing ML/DL on Android has become easier than ever with TensorFlow's lightweight framework, TensorFlow Lite. The next step is to develop an application that uses TensorFlow for the purpose of classifying sounds.
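
As a pointer in that direction, here is a minimal sketch of running inference with the TensorFlow Lite Java interpreter; the model file, feature vector, and output shape (10 classes) are illustrative assumptions, not a working sound classifier:

     import org.tensorflow.lite.Interpreter;

     import java.io.File;

     public class SoundClassifier {
         private final Interpreter interpreter;

         public SoundClassifier(File modelFile) {
             // modelFile points to a .tflite model trained elsewhere (assumption)
             interpreter = new Interpreter(modelFile);
         }

         public float[] classify(float[] features) {
             // Shapes are assumptions: a 1 x N feature vector in,
             // a 1 x 10 vector of class scores out
             float[][] input = new float[][] { features };
             float[][] output = new float[1][10];
             interpreter.run(input, output);
             return output[0];
         }
     }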

The Optimist sees the potential in a seed

Kudos.

An Infinite Point of Possibilities : Intel's Open3D Library

Intel has recently launched its open-source library for 3D data processing, Open3D [research paper by Qian-Yi Zhou, Jaesik Park, and Vladlen Koltun].


Open3D is an open-source library that supports rapid development of software that deals with 3D data. The Open3D frontend exposes a set of carefully selected data structures and algorithms in both C++ and Python. The backend is highly optimized and is set up for parallelization. Open3D was developed from a clean slate with a small and carefully considered set of dependencies. It can be set up on different platforms and compiled from source with minimal effort. The code is clean, consistently styled, and maintained via a clear code review mechanism. Open3D has been used in a number of published research projects and is actively deployed in the cloud.

The library enables developers to work with 3D models and point clouds.
Open3D has the following features:

• Basic 3D data structures
• Basic 3D data processing algorithms
• Scene reconstruction
• Surface alignment
• 3D visualization

With Open3D, RGBD images (images with 3 color components and a depth component) can be converted into 3D models. Here's a Python code snippet to achieve just that:

     import sys

     # Make the compiled py3d module importable before importing it
     sys.path.append("../Open3D/build/lib/")
     import py3d
     import matplotlib.pyplot as plt

     print("Read Redwood dataset")
     # Load the color and depth frames of the Redwood test data
     color_raw = py3d.read_image("/home/<username>/Open3D/build/lib/TestData/RGBD/color/00000.jpg")
     depth_raw = py3d.read_image("/home/<username>/Open3D/build/lib/TestData/RGBD/depth/00000.png")
     # Combine the two frames into a single RGBD image
     rgbd_image = py3d.create_rgbd_image_from_color_and_depth(
         color_raw, depth_raw)
     print(rgbd_image)

     # Show the grayscale and depth channels side by side
     plt.subplot(1, 2, 1)
     plt.title('Redwood grayscale image')
     plt.imshow(rgbd_image.color)
     plt.subplot(1, 2, 2)
     plt.title('Redwood depth image')
     plt.imshow(rgbd_image.depth)
     plt.show()

     # Back-project the RGBD image into a point cloud using default intrinsics
     pcd = py3d.create_point_cloud_from_rgbd_image(rgbd_image,
             py3d.PinholeCameraIntrinsic.prime_sense_default)
     # Flip the cloud so it is not upside down in the visualizer
     pcd.transform([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
     py3d.draw_geometries([pcd])
     print("Writing ply file")
     py3d.write_point_cloud("Redwood.ply", pcd)
    

The result obtained is a colored point cloud of the Redwood scene, rendered in Open3D's visualizer.


Open3D has been developed keeping in mind the computations required for solving 3-dimensional geometry and the need for parallelization for faster turnaround times. It has an inbuilt visualizer that enables developers to visually examine their work and manipulate it using pan and rotate controls, along with a dozen more options such as lighting, changing point size, and toggling the mesh wireframe.
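
For instance, the point cloud written to Redwood.ply above can be reloaded and inspected in that visualizer at any time. A minimal sketch, assuming the same py3d module as in the snippet above:

     import py3d

     # Reload the saved point cloud and open it in the interactive visualizer
     pcd = py3d.read_point_cloud("Redwood.ply")
     py3d.draw_geometries([pcd])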

The open-source community has always accelerated the development of advanced tools and libraries, and I look forward to the community scaling this one ahead too.

Limitations only exist if you let them

<An attempt to integrate the 3D model into a webpage using WebVR is in progress; stay tuned for an update to this post>

Peace Out.