silvia, December 1, 2009

“How many visitors are coming to my site?”

Dear Web Analytics beginner,

Welcome to my second post. Just to refresh your memory, last week I started an online diary of my first Web Analytics experience using the Self Hosted Edition of Logaholic Web Analytics. In my previous post I talked about targets and key performance indicators, the basic underpinning of any e-marketing strategy, and the first step in getting meaningful Web Analytics for your site.

This week I’m going to talk about incoming traffic to your site and how this is reflected in the reports in your Web Analytics software.

The truth is out there

Let’s start on the Logaholic Dashboard, the “Today Overview”. In this overview report you’ll see references to Visitors per day and per month, per page, per keyword and so on. It’s important to remember that the dashboard overview is just a summary of the statistics that are gathered in other reports. With some Web Analytics solutions, such as Logaholic, the dashboard is customizable, which means that depending on your interests, you would place different reports on it. The most typical information is usually displayed on the dashboard, and that might seem like all the info you need, but in fact it’s just the tip of the iceberg.

But first of all, you need to ask yourself a simple question: “How many visitors are coming to my site?” In order for us to be able to answer this question, we need to define what visitors are.

The Web Analytics Association defines a Unique Visitor as:

The number of inferred individual people (filtered for spiders and robots), within a designated reporting time frame, with activity consisting of one or more visits to a site. Each individual is counted only once in the unique visitor measure for the reporting period.

As most of you know, we all try to attract human visitors to our site: people who are interested in our product or service and whom we are trying to convert to customers. So what is all this talk about robots?

Humans vs Spiders and Robots

Your Web Analytics software does not count every ‘user’ as a ‘visitor’, by which we mean a real person as defined above. There are also software programs, called “crawlers”, “bots” or “spiders”, which are used by, for example, the search engines to crawl (scan) the pages of websites and index them so that they show up in relevant search results. GoogleBot is an example of such a crawler. There are also less noble crawlers, like the ones that harvest email addresses or are otherwise up to no good, but that is a different topic. In any case, crawlers play quite an important role in your site’s visibility and SEO. However, they are less relevant to your bottom line: you can only sell to humans, not robots. So when looking at your visitor count, you will want crawlers to be excluded from most reports.

If you are interested specifically in crawler activity, you might want to look at the “Most active crawlers” report in the Incoming traffic section. Other reports to check out are the “All Traffic” reports, which show how much of your traffic is generated by humans and crawlers together. All other reports in Logaholic where the term “visitors” is used exclude crawlers and show only counts of human visitors.
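
To make the human/crawler split a bit more concrete, here is a minimal Python sketch of one common way to separate the two: looking for tell-tale words in the user agent string. This is not how Logaholic does it internally (I don't know its exact method); the marker list and the function names `is_crawler` and `split_traffic` are made up for illustration, and real tools use much longer, regularly updated lists.

```python
# Hypothetical marker list, for illustration only; real analytics tools
# ship far longer lists and keep them up to date.
CRAWLER_MARKERS = ("googlebot", "bingbot", "slurp", "spider", "crawler")


def is_crawler(user_agent):
    """Return True if the user agent string contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def split_traffic(hits):
    """Split a list of (ip, user_agent) hits into human hits and crawler hits."""
    humans, crawlers = [], []
    for ip, user_agent in hits:
        if is_crawler(user_agent):
            crawlers.append((ip, user_agent))
        else:
            humans.append((ip, user_agent))
    return humans, crawlers


# Made-up sample hits (the IP addresses are from documentation ranges).
hits = [
    ("203.0.113.5", "Mozilla/5.0 (Windows NT 6.1) Firefox/3.5"),
    ("192.0.2.44", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("203.0.113.9", "Mozilla/5.0 (Macintosh) Safari/531.9"),
]

humans, crawlers = split_traffic(hits)
print(len(humans), "human hits,", len(crawlers), "crawler hits")
```

A crawler that does not announce itself in its user agent slips through a check like this, which is one more reason to treat visitor counts as an approximation.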

Counting the beans

Now that we’ve figured out how your Web Analytics program defines a visitor on your site, it would be a good idea to find out how it does the counting.

Which brings us to the real icky bit. There are basically three industry standard methods that can be used to count visitors.

  • By the visitor’s IP address
  • By a combination of visitor IP address and user agent
  • By Cookies

All of these methods have their own drawbacks and none of them is very accurate. So, if you thought Web Analytics was an exact science, think again. It’s just an approximation of reality, and it’s a good idea to keep that in mind whenever you are working with Web Analytics reports.

In short, IP addresses are inaccurate because multiple people can share the same IP address (via a proxy), which can cause the number of visitors to be under-reported. On the other hand, many web users get a new IP address each time they log on to the internet, which can cause the number of visitors to be over-reported.

Cookies, on the other hand, do not change when the user goes offline, so they are potentially a better method of counting. But users can refuse to accept cookies, or they can delete them, in which case they get a new cookie on their next visit and are counted again. Also, cookies are browser specific, so one person visiting with two different browsers will be counted as two visitors. All this causes over-reporting.

The combination of IP address and user agent (a string that contains information about the user’s browser type) is often used when cookies are not available. It is more accurate than the IP address alone because it reduces the proxy/IP sharing effect: people behind the same proxy are told apart as long as their browsers differ. But other than that it suffers from the same over-reporting effects as IP addresses and, like cookies, it is browser specific.
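
To see how the three methods can disagree on the same traffic, here is a small Python sketch. The data is invented: four real people, three of whom share an office proxy, and one home user who got a new IP address and also cleared their cookies between two sessions. The `Hit` record and the `count_by_*` functions are my own illustrative names, not anything from Logaholic.

```python
from collections import namedtuple

# One record per visit; cookie_id is None if the browser refused the cookie.
Hit = namedtuple("Hit", ["ip", "user_agent", "cookie_id"])


def count_by_ip(hits):
    """Count distinct IP addresses."""
    return len({h.ip for h in hits})


def count_by_ip_and_agent(hits):
    """Count distinct (IP address, user agent) pairs."""
    return len({(h.ip, h.user_agent) for h in hits})


def count_by_cookie(hits):
    """Count distinct cookie IDs, ignoring hits that carry no cookie."""
    return len({h.cookie_id for h in hits if h.cookie_id is not None})


hits = [
    # Three colleagues behind one office proxy; two use the same browser.
    Hit("198.51.100.7", "Firefox", "c1"),
    Hit("198.51.100.7", "Firefox", "c2"),
    Hit("198.51.100.7", "Safari", "c3"),
    # One home user, two sessions: new IP address and a cleared cookie.
    Hit("203.0.113.20", "Chrome", "c4"),
    Hit("203.0.113.99", "Chrome", "c5"),
]

print("By IP address:       ", count_by_ip(hits))            # 3
print("By IP and user agent:", count_by_ip_and_agent(hits))  # 4
print("By cookie:           ", count_by_cookie(hits))        # 5
```

Four people, three different answers. In this example the IP/user agent count lands on 4 only because one under-count (two colleagues with the same browser) and one over-count (the changed IP address) happen to cancel out.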

In Logaholic, you can choose which counting method you want to use on the Advanced tab of the Edit profile screen.

Cookies are usually the default if you are using JavaScript tags to collect data. If you are using log files, you’ll probably want to use IP address or IP/user agent, unless you have a tracking cookie that is included in the log file.

Don’t forget the time

Finally, we have time to consider. When we say Unique Visitors we are referring to a unique count of some identifier within a date range.

So, if there are 100 unique visitors on one day and 50 unique visitors the next day, but 10 people visited on both days, the total number of unique visitors for this two-day period is 140, not 150: the 10 returning people are counted only once.
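
If you like to see the arithmetic spelled out, the same example can be written as a few lines of Python using sets. The visitor IDs are made up; in practice they would be whatever identifier your counting method uses (IP address, IP/user agent or cookie).

```python
# Day 1: 100 visitors, numbered 1 to 100.
day1 = {"visitor-%d" % n for n in range(1, 101)}

# Day 2: 50 visitors, numbered 91 to 140, so 91 to 100 came on both days.
day2 = {"visitor-%d" % n for n in range(91, 141)}

returning = day1 & day2   # people seen on both days
two_days = day1 | day2    # everyone seen at least once in the period

print(len(day1), len(day2))   # 100 50
print(len(returning))         # 10
print(len(two_days))          # 140, not 150
```

This is why you cannot simply add up daily unique visitor counts to get a weekly or monthly figure: the software has to recount the identifiers over the whole date range.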

Now you know

So, now that you know a bit more about the number of visitors to your site, we will look at traffic sources and keywords next week, which will be a nice wrap up to our beginner’s level, before we continue to explore the more advanced features of your Web Analytics software.

Meanwhile, if you have any questions or comments, please leave them below!
