What I learnt making Android games from scratch





There are two ways you can build something:

1, Build on top of an established system
2, Build from scratch

While building on top is the obvious practice for software that sells, it is building from scratch that gives you a deeper understanding.

I have spent the last 3 years, on and off, building games from scratch for Android (using Canvas, MediaPlayer, bitmaps, threads, classes, etc.), and here are my experiences:


1, The seemingly easiest problems are the toughest ones:


A lot of the time, problems seem easy because you do not understand them properly. Building games involves running into dozens of such cases on a regular basis, where your assessment mismatches reality, which is often disappointing. For me, around 80% of these issues involved threads. You see, when a sprite moves on screen independently of the game environment, that animation is running on its own thread. I often found myself back at the drawing board trying to add one more level of interaction between the sprite thread and the main game thread.

So before you go ez-pz on a problem, make sure you are not wading into muddy waters.


2, The documentation is never complete:


This covers everything: the documentation for Android, for an audio library you are using and, surprisingly, for your very own project. When you are dealing with a lot of parallel ideas, just commenting the code is not enough. Keeping a running document while you are building the game is the most mundane yet the most effective of decisions. Your future self or your peers will thank you for it.


3, Things on the internet disappear:


You must have come across the statement: "If it's on the internet, it's there forever". Sounds good, but there's another side to it. The web is constantly changing. This means that while a lot of content is coming into existence, a fraction of it also blinks out, never to be seen again. This largely affects the mid-tier knowledge that ties low-level details to high-level abstract implementations (looking at you, bitmaps). Documentation can again save the day by bundling this knowledge into the project.

Another example is the volatility of third-party dependencies. Remember Parse by Facebook? Or TinyPic? No, right? They shut down! So no matter what company is behind a service (it was Facebook, goddammit!), it can disappear. Staying on top of your inbox, however, saves you the hassle of cases like LOSING ALL YOUR DOCUMENTATION DIAGRAMS TO THE TINYPIC SHUTDOWN! You get the idea of what I am implying here.


4, Don't fool yourself:


So your game is good, right? You sure, buddy? You know how all parents think their newborn is the cutest of all the newborns that have ever existed?! (I mean, come on, peeps! Clearly my future kid is the cutest!)

The point is, just because you invested a lot of time in it doesn't mean it will turn out great. (This is not about kids anymore, but who am I to judge?)

Being rationally critical of yourself is one of the first steps to building cool things ahead. So if your game sucks big time, it's time to toss it in the bin! (Or get to fixing it if you are that stubborn.)

Remember, you are in this to learn. Honouring critical feedback is one of those learnings. This brings us to the next point.


5, Honour Feedback:


You will be tempted to ask others to play your game, and they might not like it. Don't let this get to your head and, above all, never take it personally. In my experience, giving a critical review is one of the most caring acts; what blocks us from reaping the benefit is how we take it. Think about it: if they didn't care about the project, why would they tell you what is wrong with it? Thinking this way requires a major shift in perception, and the benefits break major mental bounds for everything you do going ahead.

6, Enjoy the struggle!


No, seriously! It's a game, after all, right? And while you might not have an audience in sight for your project when you begin, you can always pivot to monetise! The struggle of building is repaid by the joy of someone appreciating your work. A lot of top titles started out as hobby side projects, and they were definitely not their authors' first projects.



------------------------------------------------------------------------------------------------
If you liked or disliked what you just read, let me know in the comments.

P.S. I do coffee if you pay.

Follow me on Twitter here: @Sanjeev_309
Connect on LinkedIn here: LinkedIn

What does a CNN see? : Visualising hidden layers of a Neural Network

Deep Learning has made remarkable progress over the past few years, with quick transitions from the discovery of new methods to their industrial implementation. While frameworks and libraries have made creating and working with deep architectures easy, practitioners often know rather little about the internal states of the process. This post is an attempt to find out what composes a neural network and what a convolutional neural network sees in an input.

The code is publicly available on my GitHub.

The architecture of the network we will work on is as follows:

Input
Convolution (5 x 5)
MaxPooling
Convolution (5 x 5)
MaxPooling
FullyConnected

The model is trained on the popular MNIST dataset with the following parameters (a sketch of the model definition is given just after the list):

batch_size = 50
learning_rate = 0.001
epochs = 400
Optimiser = Adam
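
Putting the architecture and hyperparameters together, here is a minimal sketch of how such a network might be defined in TensorFlow 1.x; the filter counts (32 and 64), the padding and the layer names are my own illustrative choices rather than values taken from the actual project:

    import tensorflow as tf

    # Placeholders for 28x28 grayscale MNIST images and their labels
    x = tf.placeholder(tf.float32, [None, 28, 28, 1], name='x')
    y = tf.placeholder(tf.int64, [None], name='y')

    # Convolution (5 x 5) -> MaxPooling -> Convolution (5 x 5) -> MaxPooling -> FullyConnected
    conv1 = tf.layers.conv2d(x, filters=32, kernel_size=5, padding='same',
                             activation=tf.nn.relu, name='conv1')
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
    conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same',
                             activation=tf.nn.relu, name='conv2')
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)
    logits = tf.layers.dense(tf.layers.flatten(pool2), 10, name='fc')

    # Cross-entropy loss optimised with Adam at the learning rate listed above
    loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=logits)
    train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)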

After training, we load the layer to visualise and pass a sample input through the input layer. The function then runs a session for that layer given the input and returns the response of every filter that comprises the layer.
This is done using TensorFlow's session.run(), which returns the layer's output when the layer tensor is passed in as the fetch.
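
As a rough sketch of that step (reusing the placeholder and layer names from the sketch above; the helper function and the 8-column plot grid are my own choices), fetching and plotting a layer's filter responses looks something like this:

    import math
    import matplotlib.pyplot as plt

    def visualise_layer(sess, layer, x_placeholder, sample):
        # Evaluate the chosen layer for a single 28x28 sample
        activations = sess.run(layer,
                               feed_dict={x_placeholder: sample.reshape(1, 28, 28, 1)})
        num_filters = activations.shape[-1]
        cols = 8
        rows = int(math.ceil(num_filters / float(cols)))
        for i in range(num_filters):
            plt.subplot(rows, cols, i + 1)
            plt.imshow(activations[0, :, :, i], cmap='gray')
            plt.axis('off')
        plt.show()

    # e.g. visualise_layer(sess, conv1, x, sample_digit)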

Sample Input:



The results for the sample input are the following visualisations, which are plotted using matplotlib.

Hidden Layer 1:

Open in new window for full scale


 Hidden Layer 2:

Open in new window for full scale

The number of plots corresponds to the increasing number of filters as we go deeper into the network.
The depth also determines how much finer the details sought by the filters become. This can be seen by comparing what Hidden Layer 1 sees with what Hidden Layer 2 sees, since each plot shows how the input stimulates that filter.

The height of your accomplishments equals the depth of your convictions.

Stats:
TensorFlow 1.8
Jupyter notebook
Ubuntu 17.10

What's that Noise? : Working with sound on Android

Sound is diverse. Its nature is to take so many forms that solid rules cannot draw lines between them. I have always been an admirer of sound.

I have spent a few weeks implementing sound acquisition and processing on Android and have come up with something to begin with.

Presenting:



Shh..Silence 

An application that monitors the sound level of its environment and plays a 'Shhhh...' when noise levels pass a limit.

You can download it here:

Get it on Google Play
Google Play and the Google Play logo are trademarks of Google LLC.

For the end user, it is a harmless application, but from an engineering point of view, the internals are a window to a ton of possibilities.

The application monitors sound in the following manner:
  1. Acquire the microphone
  2. Configure:
    • Sample rate
    • Mono/Stereo
    • Encoding format
    • Buffer size
  3. Calculate the average over the buffer
  4. Compare the obtained value with a threshold
  5. Trigger when above the threshold

The pipeline is simple when dealing only with the average. Here's a code snippet for step 3:

    private void readAudioBuffer() {
        try {
            short[] buffer = new short[bufferSize];
            int bufferReadResult = 1;
            if (audio != null) {
                // Fill the buffer with raw audio samples from the AudioRecord instance
                bufferReadResult = audio.read(buffer, 0, bufferSize);
                // Step 3: average the samples over the buffer
                double sumLevel = 0;
                for (int i = 0; i < bufferReadResult; i++) {
                    sumLevel += buffer[i];
                }
                lastLevel = Math.abs((sumLevel / bufferReadResult));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    

The most intriguing part is the buffer filled by audio.read(): it holds a sequence of numbers that depict the sound received by the microphone, while bufferReadResult tells us how many samples were actually read. From there, it is a matter of requirement what needs to be done next. By extracting audio features such as Mel-frequency cepstral coefficients (MFCCs), the application can be stretched into the domains of Audio Classification, Speech Recognition, User Identification and Keyword Detection.
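
As an illustration of that feature-extraction idea, here is a minimal sketch in Python using librosa, the sort of thing you might prototype on a desktop before porting it to Android; the library, the file name and the parameters are my own choices and not something the app itself uses:

    import librosa

    # Load a recorded clip as a mono waveform at 16 kHz (file name is illustrative)
    y, sr = librosa.load("sample.wav", sr=16000)

    # 13 Mel-frequency cepstral coefficients per analysis frame
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    print(mfcc.shape)  # (13, number_of_frames)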

Implementing ML/DL on Android has become easier than ever using TensorFlow's lightweight mobile framework. The next step is to develop an application that uses TensorFlow to classify sounds.

The Optimist sees the potential in a seed

Kudos.

An Infinite Point of Possibilities : Intel's Open3D Library

Intel has recently launched its open-source library for 3D data processing, Open3D [research paper by Qian-Yi Zhou, Jaesik Park and Vladlen Koltun].

*not the official logo, only for personal representation

Open3D is an open-source library that supports rapid development of software that deals with 3D data. The Open3D frontend exposes a set of carefully selected data structures and algorithms in both C++ and Python. The backend is highly optimized and is set up for parallelization. Open3D was developed from a clean slate with a small and carefully considered set of dependencies. It can be set up on different platforms and compiled from source with minimal effort. The code is clean, consistently styled, and maintained via a clear code review mechanism. Open3D has been used in a number of published research projects and is actively deployed in the cloud.

The library enables developers to work with 3D models and point clouds.
Open3D has the following features:

• Basic 3D data structures
• Basic 3D data processing algorithms
• Scene reconstruction
• Surface alignment
• 3D visualization

With Open3D, RGBD images (images with 3 color components and a depth component) can be converted into 3D models. Here's a Python code snippet to achieve just that:

    import sys
    sys.path.append("../Open3D/build/lib/")  # make the py3d module importable
    import py3d
    import matplotlib.pyplot as plt

    print("Read Redwood dataset")
    color_raw = py3d.read_image("/home/<username>/Open3D/build/lib/TestData/RGBD/color/00000.jpg")
    depth_raw = py3d.read_image("/home/<username>/Open3D/build/lib/TestData/RGBD/depth/00000.png")

    # Combine the color and depth images into a single RGBD image
    rgbd_image = py3d.create_rgbd_image_from_color_and_depth(color_raw, depth_raw)
    print(rgbd_image)

    plt.subplot(1, 2, 1)
    plt.title('Redwood grayscale image')
    plt.imshow(rgbd_image.color)
    plt.subplot(1, 2, 2)
    plt.title('Redwood depth image')
    plt.imshow(rgbd_image.depth)
    plt.show()

    # Back-project the RGBD image into a point cloud using the default PrimeSense intrinsics
    pcd = py3d.create_point_cloud_from_rgbd_image(rgbd_image,
                                                  py3d.PinholeCameraIntrinsic.prime_sense_default)
    # Flip the cloud so it is not upside down in the visualiser
    pcd.transform([[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]])
    py3d.draw_geometries([pcd])

    print("Writing ply file")
    py3d.write_point_cloud("Redwood.ply", pcd)
    

The result obtained is as follows:


Open3D has been developed keeping in mind the computations required for solving 3-dimensional geometry and the need for parallelization for faster turnaround times. It has a built-in visualiser that lets developers visually examine their work and manipulate it using pan and rotate controls, along with a dozen more options such as lighting, changing the point size and toggling the mesh wireframe.

The open-source community has always accelerated the development of advanced tools and libraries. I am looking forward to the community scaling this one up too.

Limitations only exist if you let them
<There is an attempt underway to integrate the 3D model into a webpage using WebVR; stay tuned for a post update>

Peace Out.

Project AlphaUI : Computer Vision and Virtual Menu Navigation

We have always sought new ways to interact with computers. From typed commands to automatic speech recognition, the aim is to make the interaction feel natural to us, as if we were interacting not with a computer but with a human.

Project AlphaUI

AlphaUI is a virtual menu interface that lets you interact naturally with the GUI on display. It works by capturing live frames from a webcam and, through image processing, finding out where in the given space the user is pointing.



The program is written in C++ using the OpenCV 3.1.0 library and performs a series of image-processing operations on each frame, from which the relevant information is extracted.

To demonstrate the project, I have used my earlier computer vision project, the Automatic Face Recognition System. The AlphaUI interface is built on top of the face recognition system with a custom GUI that ties both projects together. The functional responses of the interface have been disabled for the demo. Any developer can define their own GUI in the same way for a system that requires user interaction.

Screenshots of the system:

The AlphaUI interface

Ball tracked continuously by the system

Touchless Interaction with the interface


The system can be trained on any object of interest provided it is distinct in color (read: HSV segmentation). Training is done by repeatedly marking all over the object with the mouse pointer. This step has to be done only once, or whenever you need to use a new marker; the values are saved to a text file and reused on the next run.
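
For a sense of what that looks like in code, here is a minimal Python/OpenCV sketch of the same HSV-segmentation-and-tracking idea. The project itself is written in C++ with OpenCV 3.1.0, and the HSV bounds below are placeholder values; in the real system they come from the one-time training step and are read back from the text file:

    import cv2
    import numpy as np

    LOWER_HSV = np.array([100, 120, 70])   # assumed lower bound for the marker color
    UPPER_HSV = np.array([130, 255, 255])  # assumed upper bound

    cap = cv2.VideoCapture(0)              # webcam
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, LOWER_HSV, UPPER_HSV)  # segment the marker by color
        m = cv2.moments(mask)
        if m["m00"] > 0:
            # Centroid of the segmented blob = where the user is pointing
            cx, cy = int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])
            cv2.circle(frame, (cx, cy), 8, (0, 255, 0), 2)
        cv2.imshow("AlphaUI-style tracking", frame)
        if cv2.waitKey(1) & 0xFF == 27:    # Esc to quit
            break
    cap.release()
    cv2.destroyAllWindows()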

Disclaimer:
This project was done about a year ago but never saw daylight until now. What would you do with the possibilities of this project? Do comment and let me know.

Step By Step One Goes Very Far

Used:
Ubuntu 16.04
Code::Blocks IDE
OpenCV 3.1.0 : C++