Tag Archives: Big Data

Data, data everywhere… Prediction Four

I started this blog series with my concerns about the amount of data that exists today. You can read the blog here.

After that post I put on an astrologer's hat and started making predictions…😀. My first prediction can be read here.

My second prediction can be read here.

My third prediction can be read here.

You have a variety of mobile covers available today…

Have you seen a mobile cover that lets you cover the cameras on your mobile phone at will?


I don’t think so… I predict mobile covers that let users close or cover these cameras at will (under their own control)…

You would have heard of mobile signal jammers (without your knowledge, a jammer distorts mobile signals so that your phone cannot connect to make calls or send SMS)… something similar will come along to distort the camera's pictures when it is not in use, or we will get a mobile cover (not from the phone manufacturer but from someone else… or maybe from the phone manufacturers themselves) that can completely cover the phone's cameras (front and back)…

You might ask… what does this have to do with data…

Again, this is because today you allow apps to take pictures and record video without knowing when they will do it on their own… who knows whether, at a particular time of day, they just wake up and start recording videos or taking pictures…

I have already seen many colleagues at my workplace putting stickers over their laptop cameras… what they don’t realize is that in their pocket or hand they have a mobile phone with dual cameras whose clear, high-quality lenses can take HD-quality photos and videos….


I am not against companies collecting this data and deciphering these images or videos… but I am against vendors doing this without our permission… now you will say these companies ask for our permission and we have already agreed to it, but my view is that we agreed only once, and after that, just because we allowed it, they can misuse the data under that license agreement…

In this series I have one more prediction, about voice commands being used to control apps… I have had this in my head for a long time… but now I see multiple bloggers writing about this as another intrusion into our private lives… again, we know this and we have agreed to it… but my issue is that we don’t have any control over when these devices listen and when they don’t…

So my last prediction is coming soon, and I will conclude this series with it…

Thanks for reading… share it if you like/agree… 😀

Data, data everywhere… Prediction Three

I started this blog series with my concerns about the amount of data that exists today. You can read the blog here.

After that post I put on an astrologer's hat and started making predictions…😀. My first prediction can be read here.

My second prediction can be read here.

My next prediction is in the area of mobile apps. Again, this is very much in line with what I have been raising throughout this blog series.


Mobile apps today ask for various permissions to use a number of device features. Once you allow them, you later don’t even know what you allowed. You have to dig hard (going through various taps and screens) to find these permissions and then disable them if you don’t want these apps to use them.

My prediction is that, as in a browser, everything will be disabled by default. App vendors will not be able to ask you to accept these permissions at install time either. Rather, when an app starts using a particular feature, it will need to ask permission each and every time. What I mean is that whenever an app tries to use a feature, it explicitly asks for that permission and has to say exactly how the permission will be used. In addition, soon after use, the permission goes back to disabled mode.

As mentioned in my previous posts, this is going to be tedious for customers/users. But this is something the OS can take care of: if for any reason users find tapping on these permission dialogs every time tedious, they have the option of enabling a permission for a period of time (say a week or a month). After the allowed period expires, it goes back to disabled mode.

The “Remember me” functionality on many websites nowadays does offer to remember the username/password for a limited period, rather than forever (for a particular allowed browser).
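
To make the idea a bit more concrete, here is a minimal sketch of how such time-limited permissions could work. Everything in it is my own assumption for illustration: the `PermissionStore` class, its method names and the app/permission names are hypothetical and do not correspond to any real mobile OS API.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class PermissionStore:
    """Hypothetical OS-side store: a permission is denied unless a live grant exists."""

    def __init__(self) -> None:
        # Maps (app, permission) to the moment the grant expires.
        self._grants: Dict[Tuple[str, str], datetime] = {}

    def grant(self, app: str, permission: str, duration: timedelta, now: datetime) -> None:
        """Allow `permission` for `app` until `now + duration`."""
        self._grants[(app, permission)] = now + duration

    def is_allowed(self, app: str, permission: str, now: datetime) -> bool:
        """Return True only while a grant exists and has not expired."""
        expiry = self._grants.get((app, permission))
        if expiry is None:
            return False  # disabled by default
        if now >= expiry:
            # The allowed period is over, so fall back to disabled mode.
            del self._grants[(app, permission)]
            return False
        return True


store = PermissionStore()
start = datetime(2024, 1, 1)
store.grant("photo_app", "gallery", timedelta(weeks=1), now=start)
```

With this sketch, the app is allowed to read the gallery three days after the grant, but the same check eight days after the grant is refused, and a permission that was never granted (say the camera) is refused from the start.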


Assume you allowed a particular app to read your gallery (pictures, videos, etc.). You really don’t have any clue how and when the app reads these files from your phone. Nowadays every company is trying to be intelligent enough to decipher an image, the content of a video and so on, quite easily. Some pictures in your gallery will be very personal in nature, and you don’t have any clue what meaning these apps are extracting from them. Also, these files could be read and then stored in the vendor's digital asset library forever (for various purposes).

I am not saying that these apps will use these gallery items in a bad way, but if someone gets access to this content and starts using it in an unethical manner, it's going to be really troubling for you and me.

Recently I read (somewhere, I don’t remember where) that a popular company gave its employees (mainly in the support area) access to a huge amount of personal data, along with real-time information on where its customers travel (where they start a journey, where they end it, what time of day the journey took place and so on), and they then used this information in an unethical manner for their own benefit.

The above is just one such incident, and I am sure there are many more out there. With more and more data about you available to these app vendors, such misuse is bound to become a common problem. Also, you have voluntarily given access to this data without much thought, and because of this you cannot raise any complaints, but I am sure this will have huge repercussions for you that you are not aware of.

Think of such negative scenarios as well when you click the “Allow” button on your phone from now on. I am sure there are many positive aspects to sharing this data, but over a period of time there will be many negative aspects that could haunt you.

Mind you, it's very hard to wipe your digital footprint. Suppose you selected a particular picture and clicked delete. This in no way means that the vendor will actually delete that picture. They could just mark the picture as “ready for delete” and leave it.
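
This pattern is usually called a soft delete. Here is a minimal sketch of the difference between what you see and what the vendor may still hold. The `PhotoStore` class and its method names are hypothetical, made up only to illustrate the point; I am not claiming any particular vendor works this way.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Photo:
    photo_id: str
    data: bytes
    marked_for_delete: bool = False  # the "ready for delete" flag


class PhotoStore:
    """Hypothetical vendor-side store that supports both kinds of delete."""

    def __init__(self) -> None:
        self._photos: Dict[str, Photo] = {}

    def upload(self, photo_id: str, data: bytes) -> None:
        self._photos[photo_id] = Photo(photo_id, data)

    def soft_delete(self, photo_id: str) -> None:
        """Hide the photo from the user but keep the bytes on the server."""
        self._photos[photo_id].marked_for_delete = True

    def hard_delete(self, photo_id: str) -> None:
        """Actually remove the photo from storage."""
        del self._photos[photo_id]

    def visible_to_user(self) -> List[str]:
        return sorted(p.photo_id for p in self._photos.values() if not p.marked_for_delete)

    def retained_by_vendor(self) -> List[str]:
        return sorted(self._photos)
```

After a `soft_delete`, the picture disappears from `visible_to_user()` but is still listed by `retained_by_vendor()`. Only a `hard_delete` removes it from both, and as a user you cannot tell which of the two the vendor performed.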

With AI (Artificial Intelligence) and ML (Machine Learning) getting stronger day by day, I am sure the big names in technology would want to find out why you deleted that particular image, apply some intelligence to it, and then use that to haunt you with something down the line.

These are just my thoughts. Yes, I might be overthinking here…🤓

Data, data everywhere…feeling a bit uncomfortable

Recently one of my colleagues came to me and said that he searched for something on a popular search engine, and after that everything he did online (browsing other sites, social media, etc.) seemed to know about it and started showing content similar to what he had searched for.

Even though the site domain varied, other sites knew what he had searched for and started showing very personalized content (yes, I do know that if a site uses AdSense, Google will have already figured out what to show so that the user actually clicks on these advertisements). How is this possible? Do these sites with different domains share data with each other? Doesn't the rule that one domain does not share anything with another domain hold here? Isn’t that very basic browser security?

Another colleague once told me about a similar incident in which she was looking for a piece of furniture. She had a clear picture in mind of what she wanted. She used image search on one of the popular search engines. Unfortunately she couldn’t find what she was looking for and gave up.

A few hours later she was browsing some of the famous social media sites and BOOM. These sites started showing exactly the image she was looking for. The exact furniture piece that she was looking for.

Do these sites sell personal data to each other and earn money..😀.
Both incidents can be seen in a positive sense, in that my colleagues were indeed getting more relevant content for what they were looking for.

BUT…they were both skeptical and fearful about how much each of those sites knows about you as a person.
Most of these sites capture so much data from you without your knowledge: the so-called behavioral data (what you browsed, when you browsed, which areas of the site you clicked, touched or even looked at), most of the data about your machine (which operating system, system details, etc.) and browser (which browser, version, which features are available and so on), along with data such as location, which you have given full access to without knowing much about the privacy issues.

In the near future I am sure these big sites could be consulted for a person’s good conduct certificate (which sites you visit, at what time of day you browse, what your browsing traits are at different times and so on). Also, by looking at such data these big sites could predict in advance whether he/she has a criminal tendency or other such traits that are very hard to detect by looking at someone's face. For example: recently this person has started looking at some undesirable sites and has also been searching for content showing certain negative traits.

The data collected never gets erased, even after you die, and could still be used and linked to your children’s accounts to predict their behaviour and other personal characteristics. Even though I laugh while I write this, they could link parents' and children’s accounts and state some characteristics of a kid well in advance. If the father shows criminal traits, the child could also show similar traits in the future.. :). Sorry, I am taking this too far.

Have I started to make you think…if so, my post is a success. Let me know your views.

I am going to write a few more posts on the same topic and also predict certain things that will become the norm going forward.

You would already know about cookie policies…😀. Don’t laugh…

Cookies are just one storage mechanism in the browser…heard of local storage…session storage…IndexedDB….?

No one asks for permission when they want to write to these storage mechanisms…you yourself have already given them permission to write to your disk…the so-called data which they will use later on…I am not really saying it’s bad…but what’s the point of a cookie policy…these aspects should also be regulated…I guess. Just a thought…


If you would like to read some of the predictions made, please follow the links below:

For Prediction One click here.

For Prediction Two click here.

For Prediction Three click here.

For Prediction Four click here.

A thought on browser and its tracking can be read here.

Apache Sqoop – Data Lake for Enterprises Book

Apache Sqoop is one of the primary frameworks for this capability; it has been widely used because it is part of the Hadoop ecosystem and has been very dominant in this area. Apache Sqoop is one of the main technologies used to transfer data between structured data stores, such as RDBMSs and traditional data warehouses, and Hadoop. Apache Hadoop finds it very hard to talk to these traditional stores, and Sqoop makes that integration easy. Sqoop helps with bulk transfer of data from these stores and also integrates easily with Hadoop-based systems like Apache Oozie, Apache HBase and Apache Hive.

Apache Sqoop can be employed for many of the data transfer requirements in a Data Lake that has HDFS as the main storage for incoming data from various systems. The points below give some of the cases where Apache Sqoop makes the most sense:

  • For regular batch and micro-batch transfer of data between RDBMS and Hadoop (HDFS/Hive/HBase), use Apache Sqoop. Apache Sqoop is one of the main and most widely used technologies in the data acquisition layer.
  • For transferring data from NoSQL data stores like MongoDB and Cassandra into the Hadoop file system.
  • For enterprises with a good number of applications whose stores are based on RDBMS, Sqoop is one of the best options for transferring data into the Data Lake.
  • Hadoop is a de facto standard for storing massive data. Sqoop makes it easy to transfer data into HDFS from a traditional database.
  • Use Sqoop when batch processing is acceptable and performance is required, as it is able to split and parallelize data transfer.
  • Sqoop has a concept of connectors, so if your enterprise has diverse business applications with different data stores, Sqoop is an ideal choice.
Figure: Capability of Apache Sqoop in a Data Lake
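
To give a feel for what "split and parallelize" means in the list above, here is a simplified sketch in Python. It is not Sqoop's actual code: it only illustrates the general idea of dividing the range of a numeric key column into contiguous chunks, one per parallel task, which is roughly what a Sqoop import does with its `--split-by` column and `--num-mappers` setting. The function names and the `id` column are my own assumptions.

```python
from typing import List, Tuple


def split_ranges(min_id: int, max_id: int, num_splits: int) -> List[Tuple[int, int]]:
    """Divide the inclusive range [min_id, max_id] into contiguous inclusive chunks."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    if max_id < min_id:
        raise ValueError("max_id must not be smaller than min_id")
    total = max_id - min_id + 1
    num_splits = min(num_splits, total)  # never create empty chunks
    base, extra = divmod(total, num_splits)
    ranges = []
    lo = min_id
    for i in range(num_splits):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first chunks
        hi = lo + size - 1
        ranges.append((lo, hi))
        lo = hi + 1
    return ranges


def split_where_clauses(column: str, min_id: int, max_id: int, num_splits: int) -> List[str]:
    """Build one WHERE condition per chunk; each parallel task would read only its own rows."""
    return [
        f"{column} >= {lo} AND {column} <= {hi}"
        for lo, hi in split_ranges(min_id, max_id, num_splits)
    ]
```

For example, `split_ranges(1, 100, 4)` gives four chunks of 25 ids each, so four tasks can pull their share of the table at the same time. Note that an even split of the key range only balances the work when the key values are spread fairly evenly.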

Chapter 5 of the book “Data Lake for Enterprises” covers both the theoretical and coding aspects of Apache Sqoop in the context of developing an enterprise-grade Data Lake.

More details on the book can be found here.
