Ecommerce Exploratory Data Analysis Basics: How to Gather Extensive Info For Your Store

by Jessica Day May 24th, 2022

Using data is great. All the secrets of commerce lie within. All the ways in which you can make your business successful are contained in the mass of data you have before you. Whatever your ecommerce strategy contains, you should make sure it is based on this array of sound and current data. 

What you need is a method for presenting and making sense of this raw data. Put another way, you need a way to organize it. And this is an important aspect of life in general, as Pooh Bear’s creator explains.

“Organizing is what you do before you do something, so that when you do it, it is not all mixed up.” - A. A. Milne

Exploratory data analysis is a way to do just this: to prepare data for initial interpretation, so ‘it is not all mixed up’. The patterns and findings that are then revealed will give your operation the boost it needs. 

Let’s see what it actually is and how it can help your business. Then we’ll see ways to put it into action. 

What is Exploratory Data Analysis?


Raw data is, by definition, a bit of an untamed jumble. That’s not to say it’s not valuable. Quite the opposite. Hidden inside are the secrets to customer acquisition and customer retention. Think of it as an uncut diamond, just dug up and a bit grubby to behold. It needs a degree of processing in order to achieve its potential. So it is with raw data. 

The idea is to put the data into a format in which its salient characteristics reveal themselves. This can often be achieved using statistical graphs or other imagery. The important thing is to assess the data so that the main characteristics and key relationships within are brought out and analyzed. 

How Does Ecommerce Exploratory Data Analysis Help Businesses?

By grouping sales data into a format that readily makes sense to the observer, exploratory data analysis can help businesses in the following ways:

  • It facilitates effective decision-making based on empirical evidence.
  • It shows up where data is inconsistent, which means invalid data is intercepted before it impacts the business any further.
  • It allows for test modeling of data at an early stage in the process. This can save a lot of money and time on projects before things are allowed to develop too far. 

For example, suppose your company has just decided to institute a bring-your-own-device policy and you want to see how productivity is affected. 

Performing an exploratory data analysis will give you the data you need in a format that you want, at a stage early enough to be able to respond in a timely fashion. 

You may decide to extend the policy, abandon it, or tweak it a little. Exploratory data analysis gives you the means to come to the right decision, enabling you to prevent costly mistakes. Many social media disasters, for instance, can be averted in this manner. 

Steps in Ecommerce Exploratory Data Analysis

Data Context

It’s very important to know exactly what the data refers to, rather than it being just a series of disembodied figures with no representative meaning. It may be the case that in front of you lies a pile of data with some relationships emerging straightaway. But it’s only when one attaches the context that these relationships begin to make sense. 

For instance, let’s say the data you’re looking at has been put into a preliminary graph. So far so good. But it’s very academic. You need more. You need to know what the data means. Once you have the context, you can start to make some meaningful observations. 

Let’s say that you’ve produced an app, and the data refers to the app store rating for it. The data can tell you a certain amount, for sure. But you also need to know if there’s any significant age or geographical skew on the population doing the rating. If the app store is based in a market you’re not targeting then it might not be as crucial to look to boost the ratings there as it would be in a target area.  

[Figure: Bappy Day's Burger Bar Sales (image created by writer)]

Or let’s say the data refers to sales of vegan burgers from your city center outlet. With this in mind, you can immediately infer relationships with greater meaning. For instance, you can see that there’s a grouping of sales in the middle of the day. No surprise there - that’s lunchtime in my book. But you also see a mid-morning bulge. It would be worth having a closer look at that. 

Exploratory data analysis has helped by putting the data into a readily processed form (a graph), which makes it easier to understand. Adding context has then made the patterns meaningful. In the burger example, the company gains a real benefit: it can improve its customer experience by staffing up for that bulge in demand. 
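To make the burger example concrete, here is a minimal sketch of how sales could be grouped by hour so that the lunchtime peak and the mid-morning bulge stand out. It uses only Python's standard library. The timestamps and the `sales_per_hour` helper are made up for illustration, not taken from a real till system.

```python
from collections import Counter
from datetime import datetime

# Hypothetical till records: one timestamp per burger sold (illustrative data only).
sale_times = [
    "2022-05-02 10:15", "2022-05-02 10:40", "2022-05-02 10:55",
    "2022-05-02 12:05", "2022-05-02 12:20", "2022-05-02 12:45",
    "2022-05-02 12:50", "2022-05-02 13:10", "2022-05-02 15:30",
]

def sales_per_hour(timestamps):
    """Count how many sales fall in each hour of the day."""
    hours = (datetime.strptime(t, "%Y-%m-%d %H:%M").hour for t in timestamps)
    return Counter(hours)

hourly = sales_per_hour(sale_times)

# Print a crude text histogram, one row per hour that had at least one sale.
for hour in sorted(hourly):
    print(f"{hour:02d}:00  {'#' * hourly[hour]}  ({hourly[hour]})")
```

Even a rough chart like this is enough to show which hours deserve a closer look before deciding on staffing.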

Data Cleaning

Remember that diamond?


Part of the treatment that makes it so valuable is the cleaning process, by which unwanted particles are eradicated. 

It’s the same with data cleaning. Data won’t usually come to you in a sparklingly immaculate condition. It invariably has some anomalies. The larger and more complex the data set, the greater the likelihood that it will contain some potentially misleading elements. 

Going back to the vegan burger bar, it might seem to be an initial observation that there’s a glut of free burgers given out at the end of the day. That’s odd. 

What your data analysis needs is some additional information (akin to the context element we discussed earlier). In this case, it’s that voucher deals and other freebies are time consuming to enter into the till so the staff bundle them up and enter them at the end of the day when it’s quiet.

This then results in data that’s no use to you and will only skew trading pattern statistics. For this reason, it’s best to clean the data so as to exclude this part from your calculation. 
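As a rough sketch of that cleaning step, the snippet below drops the zero-price entries before any trading-pattern statistics are calculated. The record layout (time entered, price paid) and the `drop_batched_freebies` helper are assumptions made for this example. A real till export will look different.

```python
from datetime import time

# Hypothetical till entries: (time entered, price paid).
# Zero-price rows stand for vouchers and freebies keyed in at closing time.
entries = [
    (time(10, 30), 6.50),
    (time(12, 15), 6.50),
    (time(12, 40), 7.00),
    (time(21, 50), 0.00),  # freebie entered at end of day
    (time(21, 52), 0.00),  # freebie entered at end of day
    (time(21, 55), 6.50),  # genuine late sale, should be kept
]

def drop_batched_freebies(rows):
    """Keep only rows with a positive price, so freebies do not skew the pattern."""
    return [(entered, price) for entered, price in rows if price > 0]

cleaned = drop_batched_freebies(entries)
print(f"kept {len(cleaned)} of {len(entries)} entries")
```

Filtering on price rather than on time of day keeps the genuine late sale in the data, which a blanket "ignore everything after 9pm" rule would lose.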

Incidentally, this is an area that can cause problems in affiliate marketing relationships. If you are using data derived from an affiliate marketing sales funnel, you need to be sure that you have understood all the factors surrounding the data, so that the data is valid and reliable. 

Should there be an anomaly in the way the data is collected that sets it apart from the way other affiliates gather it, you need to know, so as to allow for or remove that data altogether. 

Spellings etc

Data cleaning can also be used to correct spellings (this is very important, as a misspelled word might not register in the analysis), as well as to look out for duplicates and other errors that might affect the subsequent data analysis. Some businesses use a third party for this task. Moreover, many operations are using AI assistive techniques to ease the burden. 
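Here is one possible way to handle misspellings and duplicates, sketched with Python's standard `difflib` module. The product list, the order rows, and the 0.8 similarity cut-off are all assumptions chosen for illustration. A real catalogue would need its own tuning.

```python
from difflib import get_close_matches

# Hypothetical list of correctly spelled product names.
KNOWN_PRODUCTS = ["lentil burger", "black bean burger", "pepper burger"]

# Hypothetical order rows: (order id, product name as typed).
orders = [
    (101, "Lentil Burger"),
    (102, "lentl burger"),
    (102, "lentl burger"),      # exact duplicate of the previous row
    (103, "black bean burgr"),
    (104, "milkshake"),         # not in the known list, left as typed
]

def canonical_name(raw, known=KNOWN_PRODUCTS, cutoff=0.8):
    """Map a typed name to the closest known product, if it is similar enough."""
    cleaned = raw.strip().lower()
    matches = get_close_matches(cleaned, known, n=1, cutoff=cutoff)
    return matches[0] if matches else cleaned

def clean_orders(rows):
    """Correct spellings, then drop rows that repeat an earlier (id, product) pair."""
    seen = set()
    result = []
    for order_id, name in rows:
        fixed = (order_id, canonical_name(name))
        if fixed not in seen:
            seen.add(fixed)
            result.append(fixed)
    return result

for row in clean_orders(orders):
    print(row)
```

Correcting spellings before looking for duplicates matters here: two rows that differ only by a typo would otherwise both survive.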


To take one of the most widespread problems, that of missing data, one can seek to tackle it through several means. One example is what to do when you know you have missing values. The missing data can be categorized as Totally Random Missing, Random Missing, and Not Random Missing (more commonly called missing completely at random, missing at random, and missing not at random). 

What these groupings mean is that the variable can be missing as a totally random occurrence, or (partially or totally) as a result of the influence of other variables. 

Why does this matter? Because if it’s one of the last two groupings, you may be able to deal with it by appraising the variables that seem to be responsible for the omission. If it’s the first, then you may be better off just accepting the omission and moving on. 

To give an example, you’re trying to assess the effectiveness of a particular landing page design. You want to see in particular how likely people are to click on a certain link. You notice that some of the data seems to be missing in that there’s no trace of what some people clicked on. 

It might be the case that this is simply random happenstance. However, if you find out that one of the links is not reporting the way it should, you can conclude that these customers were clicking on that link but it was going unreported. This means you can reinstate some of the missing data accordingly. 
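A quick check for whether missing values look random is to compare the missing rate across the values of some other variable. The sketch below assumes, purely for illustration, that each visit also records the page the visitor arrived at next. The field names and the `missing_rate_by` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical visit log: the click is sometimes unrecorded (None),
# but the page the visitor reached next is always known.
visits = [
    {"clicked_link": "pricing", "next_page": "/pricing"},
    {"clicked_link": None,      "next_page": "/demo"},
    {"clicked_link": "pricing", "next_page": "/pricing"},
    {"clicked_link": None,      "next_page": "/demo"},
    {"clicked_link": "blog",    "next_page": "/blog"},
    {"clicked_link": None,      "next_page": "/demo"},
    {"clicked_link": "blog",    "next_page": "/blog"},
]

def missing_rate_by(rows, group_key, value_key):
    """Return the share of rows with a missing value, per group."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[value_key] is None:
            missing[group] += 1
    return {group: missing[group] / totals[group] for group in totals}

rates = missing_rate_by(visits, "next_page", "clicked_link")
for page, rate in rates.items():
    print(f"{page}: {rate:.0%} of clicks missing")
```

If the missing clicks pile up in one group, as they do for `/demo` in this made-up log, the gap is unlikely to be random and the link feeding that page is worth inspecting.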

Outlier Detection

An important part of data cleaning is to look at the extreme figures and assess their veracity. If they’re valid, then they make for some interesting analysis. If they’re not, then you can either overlook them or replace them with whatever valid data is available. 

For instance, if an ecommerce company is moving into mobile commerce, it will want to know what impact this has had on its product sales. One surprising stat is that sales of berets have gone up by 500% since the company launched its mobile-friendly landing page. 

If this is valid, this will mean that berets seem to be on the cusp of being this year’s big hat. Alas, on inspection, it seems there’s a bug in the data report and beret sales have been somewhat exaggerated. Oh well, maybe next year. 
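One common way to surface figures like that beret stat is the interquartile range (IQR) rule, sketched below with Python's `statistics` module. The growth percentages are invented for the example, and the 1.5 multiplier is the conventional default rather than anything specific to this scenario.

```python
from statistics import quantiles

# Hypothetical sales growth (in percent) per product since the mobile launch.
growth = {
    "beanie": 12, "cap": 8, "fedora": 15, "beret": 500,
    "bucket hat": 10, "visor": 5, "trilby": 9,
}

def iqr_outliers(values_by_name, k=1.5):
    """Return the names whose value lies more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values_by_name.values(), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [name for name, v in values_by_name.items() if v < low or v > high]

flagged = iqr_outliers(growth)
print("check these before trusting them:", flagged)
```

The rule only flags a figure as unusual. Whether it is a genuine trend or a reporting bug still has to be settled by inspecting the source data.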

Get Stuck into Exploratory Data Analysis

So, now you know what the data is all about and you’ve got rid of the parts that were just getting in the way, you have before you a treasure chest of useful information. How do you open that chest and make its contents immediately apparent, all the better to benefit your fabulous ecommerce operation?

Ways to Display

There are several different techniques you can use. Statistical graphs are a very popular and universally understood technique. They give an excellent representation of ranking as well as a good idea of the exact figures concerned. 

Another one is a word cloud. Imagine that your burger restaurant is doing well, but you want to tap into some of the current vegan buzzwords. Getting a focus group together throws up some commonly used expressions, which can be put into pictorial form. Something like this:

[Figure: word cloud of vegan buzzwords]

This is a great way of emphasizing the big hitting terms. There’s no missing that people are talking most about lentils, peppers, fruits and black beans. Perhaps it might be worth utilizing one or more of these ingredients in your next menu review. 
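Behind any word cloud sits a simple frequency count: the more often a term appears, the bigger it is drawn. The sketch below shows only that counting step, using the standard library. The focus group responses and the stopword list are invented for illustration, and drawing the cloud itself would need a separate plotting tool.

```python
import re
from collections import Counter

# Hypothetical focus group responses.
responses = [
    "Love the lentils and black beans",
    "More peppers please, and lentils",
    "Fruits and lentils for breakfast",
    "Black beans with peppers",
]

# Assumed list of filler words to ignore.
STOPWORDS = {"the", "and", "more", "please", "for", "with", "love"}

def term_counts(texts, stopwords=STOPWORDS):
    """Count single words across all texts, ignoring case and stopwords."""
    words = []
    for text in texts:
        words.extend(re.findall(r"[a-z]+", text.lower()))
    return Counter(w for w in words if w not in stopwords)

counts = term_counts(responses)
for term, n in counts.most_common(4):
    print(f"{term}: {n}")
```

Note that this simple version counts single words, so "black beans" is tallied as "black" and "beans" separately. Handling phrases would take an extra step.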

Other methods include pinning customer locations on a map so as to demonstrate geographical spread. Again, this is a method of exploratory data analysis that’s immediately effective: the briefest inspection reveals a great deal of information. 

Let’s envision a company selling phone systems for enterprises spread across a wide territory. A location-based exploratory data analysis (known as a geo-map) will show a business in the clearest form where most of the customers are based, which may help when deciding where to focus most of the company’s efforts. 


Areas with a high concentration of customers will need a good deal of attention on servicing those phone systems. Areas with a low concentration of customers may warrant more sales focus in order to tap all this potential. 
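The numbers behind such a map can be as simple as a customer count per region. The sketch below tallies customers and labels each region for servicing or sales focus. The customer list, the region names, and the cut-off of three customers are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical customer records: (customer name, region).
customers = [
    ("Acme Ltd", "North"), ("Birch & Co", "North"), ("Cedar Inc", "North"),
    ("Delta LLC", "North"), ("Elm Group", "South"), ("Fern Labs", "East"),
    ("Gorse plc", "East"),
]

# Assumed cut-off: regions with at least this many customers count as high concentration.
HIGH_CONCENTRATION = 3

def focus_by_region(rows, threshold=HIGH_CONCENTRATION):
    """Label each region 'servicing' (many customers) or 'sales' (few customers)."""
    counts = Counter(region for _, region in rows)
    return {
        region: ("servicing" if n >= threshold else "sales")
        for region, n in counts.items()
    }

focus = focus_by_region(customers)
for region, label in focus.items():
    print(f"{region}: focus on {label}")
```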

Grouping Techniques

Alongside deciding on the best way to display the data, pay some attention to the manner in which you elect to group it. This is especially key when you have a business of some complexity. Let’s say you’re using a Shopify ERP solution, which incorporates within it a wealth of product and customer data. 

You want to get some meaningful insights into your sales data. You can group it by product or customer. This will tell you a certain amount. But don’t forget you can also group by geographical location and time of day/week/month/year the sale took place, as well as more specific groupings, for instance sales of barbecue equipment around summer holiday dates. 

You can also group products alongside return data, so as to highlight which products are being sent back more than the others. 
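As a small sketch of that last grouping, the snippet below sets units returned against units sold for each product and ranks products by return rate. The product names and figures are invented, and the `return_rates` helper is hypothetical rather than part of any particular ecommerce platform.

```python
from collections import Counter

# Hypothetical totals for one season: units sold and units returned per product.
sold = Counter({"grill": 40, "tongs": 25, "apron": 10})
returned = Counter({"grill": 2, "apron": 4})

def return_rates(sold_counts, returned_counts):
    """Return the share of sold units that came back, per product."""
    return {
        product: returned_counts[product] / units
        for product, units in sold_counts.items()
        if units > 0
    }

rates_by_product = return_rates(sold, returned)

# Rank products from highest to lowest return rate.
ranked = sorted(rates_by_product.items(), key=lambda item: item[1], reverse=True)
for product, rate in ranked:
    print(f"{product}: {rate:.0%} returned")
```

Ranking by rate rather than by raw return count stops a best seller from looking like a problem product simply because it sells more.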


So, it’s clear that exploratory data analysis can be a very useful tool to an ecommerce operation. There are a huge number of techniques we haven’t mentioned here (for instance log-transformation and Pearson correlation), and it’s possible to get extremely technical extremely quickly. 

But don’t let that put you off. Exploratory data analysis doesn’t have to be overly complex to deliver some very helpful insights into your ecommerce operation. So, what are you waiting for?


