Web Analytics

Created by Simon Casselle, Yuhui Feng, and Daniel Lim

What is Web Analytics?

Web analytics is the measurement, collection, analysis, and reporting of web data for the purposes of understanding and optimizing web usage.1 Web analytics is not simply a way to measure the popularity of a given web page; it is a tool for the owner of the site. It is used to analyze visitors’ behavior on a web page. By analyzing that behavior, the owner of the domain can make changes to the site to make it more appealing and, hopefully, more profitable.

[Figure: Basic steps of the web analytics process]

History

1990: World Wide Web created2
On December 25th, 1990, Tim Berners-Lee created the World Wide Web, which used HTML documents and allowed a web client and a web server to communicate with one another.

1996: Hit Counters
In 1996, hit counters (also known as web counters) were created to keep track of the number of hits and visitors a website received.

1997: JavaScript Tags
JavaScript tagging was created in 1997 as another method of data collection. Once web pages began to include more than text, such as images or sound clips, the number of hits the server recorded was no longer the same as the number of pages requested by the user, because each embedded file generated its own hit.

2005: Google buys Urchin and Launches Google Analytics
Google, an American multinational technology company that specializes in Internet-related services and products (including online advertising technologies, a search engine, cloud computing, hardware, and software like Google Maps), purchased Urchin and launched Google Analytics, which became the biggest web analytics service on the market, with a focus on quantitative analysis.

2012: Google launches Universal Analytics
Universal Analytics allowed users to be tracked through user IDs that connected their activity across different kinds of electronic devices, such as a phone or a laptop. Even offline behavior could be monitored, so customer data collected without access to the Internet became available for analysis.

2016: Machine Learning on Mobile
Google Analytics incorporated machine learning into its analytics.

Click this external link for more details of Web Analytics History

Methods of Conducting Web Analytics

Logfile Analysis

Log file analysis is when the owner of a website tracks who visits their site, what they request, and when.3 The data is read from the log file at the server level and copied into reports immediately. Using log file analysis, the vendor can learn which browsers are used, how long users spend on the site, and the site's peak hours.
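As a rough illustration of the idea, the sketch below parses a few server log lines and counts hits per hour, hits per browser string, and bytes sent per IP address. It assumes the widely used "combined" log format; the sample lines, the `parse_line` and `summarize` helper names, and the bot name are made up for this example, and a real server may be configured to log different fields.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern for the "combined" log format (an assumption; real servers may differ).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Turn one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def summarize(lines):
    """Count hits per hour, hits per user agent, and bytes sent per IP."""
    hits_per_hour = Counter()
    hits_per_agent = Counter()
    bytes_per_ip = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in the expected format
        hits_per_hour[entry["time"].hour] += 1
        hits_per_agent[entry["agent"]] += 1
        bytes_per_ip[entry["ip"]] += entry["size"]
    return hits_per_hour, hits_per_agent, bytes_per_ip


# Hypothetical sample data, invented for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2018:13:55:36 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/62.0"',
    '203.0.113.5 - - [10/Oct/2018:13:56:01 +0000] "GET /about.html HTTP/1.1" '
    '200 1200 "http://example.com/index.html" "Mozilla/5.0 (Windows NT 10.0) Firefox/62.0"',
    '198.51.100.7 - - [10/Oct/2018:20:15:42 +0000] "GET /index.html HTTP/1.1" '
    '200 2326 "-" "ExampleBot/1.0"',
]

hits_per_hour, hits_per_agent, bytes_per_ip = summarize(SAMPLE_LOG)
peak_hour, peak_hits = hits_per_hour.most_common(1)[0]
print("Peak hour:", peak_hour, "with", peak_hits, "hits")
```

Note that the robot's request appears in the counts alongside the human visitor's, and that the bytes column makes bandwidth per visitor easy to total.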

Types of Log files

  • Error Log File

A file created during data processing to hold data known to contain errors. It is usually printed after processing is complete so that the errors can be corrected.4

  • Event Log File

A file that logs events. Typical operations on it include querying events, subscribing to events, archiving event logs, and managing event metadata. Event logs are now popular in blockchain applications, where they provide a user with a simple list of blockchain-related transactions. A typical approach comprises one or more of the following: receiving an event log with events that occurred during operation of the computer, generating a hash value for the event log, adding details of the event log and the hash value as a transaction to a distributed blockchain, and storing the event log in a file store or server. Windows computers have a version of this built into the system, which can be viewed through the Event Viewer.5

  • Data Log File

A record of data collected and stored over a period of time in order to analyze specific trends or to record the data-based events and actions of a system, network, or IT environment.6 The data collected can be used to answer questions and provide assistance. For examples, click here and see "Simple AI uses" under the "Real Life Applications of AI" header.

Positives for Logfile Analysis

  • The vendor can track the bandwidth used by visitors on their web pages, as well as their downloads.
  • Log files also record every visitor to a webpage, including robots, visitors with older web browsers, and visitors who delete cookies.
  • With log file analysis, vendors own all of the data and have full control of the reports. They can manipulate the data and move it wherever they want, which lets them switch from one log file software program to another without needing to convert the data.

Negatives for Logfile Analysis

  • Browser caching can distort the data collected. When a web browser serves a page from its cache, the request never reaches the server, so no log entry is created for that event. Because not every interaction with a webpage is seen, it is difficult to recognize unique information about the visitor.
  • Log file analysis does not have the ability to see the type of device the user is using to access the webpage.
  • Log file analysis software has a higher upfront cost because the analysis is done away from the webpage host location. The software that reads the log files is usually purchased rather than created by the webpage owner, and the webpage owner needs to take this cost into consideration when deciding which method of web analytics to use.

Page Tagging

Page tagging is a data collection method that uses web bugs. A web bug, an invisible snippet of JavaScript code in a web page, tags a visitor with a cookie and sends data about the visitor's activity and history back to a central server.
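The collection-server side of page tagging can be sketched roughly as follows. This is a minimal illustration, not how any particular analytics product works: the cookie name `visitor_id`, the query parameters `page` and `ref`, and the `handle_beacon` function are all hypothetical names chosen for this example. The function receives the cookie header and query string that a tag would send, reuses the visitor's ID if a cookie is present, and otherwise issues a new one.

```python
import uuid
from http.cookies import SimpleCookie
from urllib.parse import parse_qs

COOKIE_NAME = "visitor_id"  # hypothetical cookie name used for this sketch


def handle_beacon(cookie_header, query_string):
    """Process one request sent by a page tag to the collection server.

    Returns (record, set_cookie). set_cookie is the value for a Set-Cookie
    header when a new visitor is tagged, or None for a returning visitor.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")

    set_cookie = None
    if COOKIE_NAME in cookies:
        # Returning visitor: reuse the ID stored in the cookie.
        visitor_id = cookies[COOKIE_NAME].value
    else:
        # New visitor: tag them with a fresh random ID.
        visitor_id = uuid.uuid4().hex
        new_cookie = SimpleCookie()
        new_cookie[COOKIE_NAME] = visitor_id
        new_cookie[COOKIE_NAME]["path"] = "/"
        set_cookie = new_cookie[COOKIE_NAME].OutputString()

    params = parse_qs(query_string)
    record = {
        "visitor_id": visitor_id,
        "page": params.get("page", [""])[0],      # assumed parameter name
        "referrer": params.get("ref", [""])[0],   # assumed parameter name
    }
    return record, set_cookie


# First visit: no cookie yet, so the server issues one.
first_record, first_cookie = handle_beacon("", "page=/home")
# Second visit: the browser sends the cookie back, so the same ID is reused.
second_record, second_cookie = handle_beacon(
    COOKIE_NAME + "=" + first_record["visitor_id"], "page=/about"
)
print(first_record["visitor_id"] == second_record["visitor_id"])
```

Because the ID lives in the cookie rather than in the IP address, two people behind the same address receive different IDs, while a visitor who deletes the cookie is counted as new on the next visit.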

Positives of Page Tagging

  • Vendors can view information about a visitor’s computer. This clearer picture of their customers allows webpage owners to create a better site based on their customers' demographics.
  • Page tagging bypasses web browser caching issues and lets you see the hit even when your web server does not. Browsers save sites that visitors visit often so that they load more quickly; with other methods of web analysis this, unfortunately, gives inaccurate data about who is on the page. With page tagging, the web bug is on each particular page, which makes visits easier to track and circumvents the caching of web pages.
  • Multiple visits from the same dynamic IP address can be broken down into unique visitors. This means webpage owners do not have to treat a single computer used by multiple people as one visitor; they can create a different profile for each person based on their individual interaction with the webpage.

Negatives of Page Tagging

  • Cannot track the bandwidth of the visitor. Because page tagging embeds web bugs on a particular page, it cannot see or gather how much data is being downloaded by the visitor.
  • Very difficult to reprocess the data. Because the web bug gathers unique data as the guest visits, it is hard to recreate the set of circumstances that led to the initial visit.
  • The data is stored on someone else’s servers. The web bugs are placed on the pages, but the data they gather is not processed locally. It must be housed on a server, most commonly someone else’s.

Hybrid Methods

Geolocation Tracking

Click Analytics

  • Analytics that focuses on where a user clicks on a webpage. This metric is most commonly used by high-traffic sites to determine how to appeal to users better, e.g. by shrinking the size of text or making links larger.
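One simple way to summarize where users click is to divide the page into a grid and count the clicks that land in each cell. The sketch below assumes clicks are already available as (x, y) pixel coordinates; the `click_heatmap` function, the 100-pixel cell size, and the sample coordinates are illustrative choices, not part of any particular tool.

```python
from collections import Counter


def click_heatmap(clicks, cell_size=100):
    """Count clicks per grid cell; each cell is cell_size x cell_size pixels."""
    return Counter((x // cell_size, y // cell_size) for x, y in clicks)


# Hypothetical click positions in pixels, invented for illustration.
clicks = [(120, 40), (130, 55), (640, 480), (125, 90)]

heatmap = click_heatmap(clicks)
hottest_cell, hottest_count = heatmap.most_common(1)[0]
print("Most clicked cell:", hottest_cell, "with", hottest_count, "clicks")
```

Cells with many clicks suggest which links or elements attract attention, and cells with none may point to content that users overlook.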

Customer Life Cycle Analytics

  • Ties all the actions a user performs on a webpage together as one single event/metric rather than separating them into individual metrics.

On-Site Web Analytics

  • Definition: On-site web analytics tools measure the actual visitor traffic coming to your website. They are able to track visitors' engagement and interactions.
  • Example: Google Analytics, Adobe SiteCatalyst, and WebTrends. These tools are able to track the engagement and activities of visitors. For commercial purposes they can track the following data: engagement rate (how long a person stays on your web page), bounce rate (when a person leaves your website within 30 seconds), event tracking (allows you to track other activities on your website), annotations (allow you to view a traffic report for a past period), visitor flow (gives you a clear picture of the pages visited and the sequence in which they were visited), page load time (the longer the load time, the higher the bounce rate), and behavior (lets you know page views and time spent on the website).
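Two of the metrics above can be computed with a few lines of code. The sketch below uses the 30-second definition of a bounce given in the list (other tools may define a bounce differently) and assumes that each session is already summarized as a duration in seconds. The function names and the sample durations are made up for illustration.

```python
BOUNCE_THRESHOLD_SECONDS = 30  # definition of a bounce used on this page


def bounce_rate(session_durations, threshold=BOUNCE_THRESHOLD_SECONDS):
    """Share of sessions that ended within `threshold` seconds."""
    if not session_durations:
        return 0.0
    bounces = sum(1 for seconds in session_durations if seconds <= threshold)
    return bounces / len(session_durations)


def average_time_on_site(session_durations):
    """Mean session length in seconds."""
    if not session_durations:
        return 0.0
    return sum(session_durations) / len(session_durations)


# Hypothetical session lengths in seconds, invented for illustration.
durations = [12, 45, 300, 8]

print("Bounce rate:", bounce_rate(durations))                    # 0.5
print("Average time on site:", average_time_on_site(durations))  # 91.25
```

Here two of the four sessions lasted 30 seconds or less, so half of the visits count as bounces.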

Positives of On-Site Web Analytics:

  • On-site web analytics measures actual visitors and their interests.
  • On-site web analytics is accessible to any website, whether its traffic is large or small.
  • On-site web analytics is relatively cheap compared with off-site web analytics. (Google Analytics is free to use!)

Negatives of On-Site Web Analytics:

  • The demographic information available through on-site web analytics is limited.
  • On-Site Web Analytics cannot track competitors or their websites.
  • Visitors have the choice to stop or delete cookies to prevent tracking.

Off-Site Web Analytics

  • Definition: Off-site web analytics tools measure your potential visitors. They let you see how your website compares to others at a macro level.
  • Example: Companies such as comScore and Nielsen NetRatings apply the panel method. They install monitoring software on users' computers to collect web activity data. The number of participants varies from tens to hundreds of thousands of people, and the majority of them are based in the US. A company like comScore has around 2 million participants worldwide, more than half of whom are based in the US. Most of the participants are home users. The panel data collected is used to provide an estimate of the behavior of the total web population.

Positives of Off-Site Web Analytics:

  • More demographic information is available through off-site web analytics.
  • Off-site web analytics can track competitors or their websites.
  • Off-site web analytics can track data based on your web presence. A website is not required.

Negatives of Off-Site Web Analytics:

  • Off-site web analytics goes through a huge amount of data, so it takes longer to analyze.
  • Off-site web analytics is only suitable for high-traffic websites (more than 1 million visits per month).
  • Data is mostly US-centric which means most of the users are in the US.
  • Off-site web analytics is more expensive than On-site web analytics.

On-Site vs. Off-Site Analytics

On-site and off-site web analytics each have their own advantages and disadvantages. We will first look at on-site web analytics. Its three advantages are its ability to measure actual visitors and their interests accurately, its accessibility for any website, and its relatively low cost.7 An example of relatively cheap on-site web analytics is Google Analytics. Its three disadvantages are that the information is limited demographically, that it cannot track competitors or their websites, and that visitors can stop or delete cookies to prevent tracking.
Off-site web analytics also has its share of advantages and disadvantages. Its three main advantages are that its information is more available demographically, that it can track competitors and their websites, and that it can track data based on your web presence. However, there are four disadvantages: much of its data is inferred, which may lead to accuracy issues; it is only suitable for high-traffic websites (more than 1 million visits per month); its data is US-centric (which means there could be a bias towards US customers); and, ultimately, its cost.

Click on the external link for more details on onsite and offsite Web Analytics.

Common Source of Confusion in Web Analytics

Hotel Problem8

                      | Day 1  | Day 2  | Day 3  | Total Unique Visitors
Room 1                | Daniel | Daniel | Yuhui  | 2 Unique Visitors
Room 2                | Yuhui  | Simon  | Simon  | 2 Unique Visitors
Total Unique Visitors | 2      | 2      | 2      | ???

The problem that web analytics has in tracking the number of unique visitors is called the “hotel problem”.9 The hotel problem occurs when the sum of the unique visitors for each day in a period is not the same as the unique visitors for the period as a whole. If we look at the chart, the hotel has Room 1 and Room 2, and it has two unique visitors on each of the 3 days. If you add up the unique visitors per day, you get 6 people. However, if we look at the unique visitors for the 3 days as a whole, there are only 3, because the same 3 people each used the hotel twice. There is no solution for this: unique visitor counts for separate days cannot simply be added together.

Most web analytics programs calculate statistics based on the date range you select. The program will not show “repeat unique visitors” when looking at the week as a whole, so a person who visited the site more than once is counted only once. To put it simply, if I visit the website (or hotel) on Monday, Tuesday, and Wednesday, I show up as one unique visitor for the week as a whole, but as one unique visitor on each of the three days, which adds up to three if the daily figures are summed.
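The arithmetic behind the hotel problem can be checked with sets. The sketch below uses the guests from the table above; the variable and function names are chosen only for this illustration.

```python
# Guests per day, taken from the hotel table (Room 1 and Room 2 combined).
visits = {
    "Day 1": ["Daniel", "Yuhui"],
    "Day 2": ["Daniel", "Simon"],
    "Day 3": ["Yuhui", "Simon"],
}


def daily_unique_counts(visits_by_day):
    """Number of distinct visitors on each individual day."""
    return {day: len(set(names)) for day, names in visits_by_day.items()}


def period_unique_count(visits_by_day):
    """Number of distinct visitors over the whole period."""
    return len(set().union(*visits_by_day.values()))


daily = daily_unique_counts(visits)
print(sum(daily.values()))          # 6: daily uniques added together
print(period_unique_count(visits))  # 3: uniques for the period as a whole
```

The two numbers differ because a set removes repeat visitors only within the range it is built over, which is why the unique visitor count has to be recomputed for each date range instead of being added up from smaller ranges.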

Too Many Metrics

Today many marketing teams use web metrics for marketing strategy and business purposes.10 However, many web metrics are not relevant to your business goals, so it is essential to be selective about which metrics you use. Examples of good metrics for a page are visitors, top pages, and pageviews. With the creation of newer technology, like Augmented Reality and Virtual Reality, companies are using software to visualize these metrics and data to gain a better understanding of how they affect their business. Software companies like Looker and IBM Analytics are spearheading the innovation.

First Party vs. Third Party Cookies

Most companies use their own cookies to track our data; these are first-party cookies. Cookies set by a domain other than the site being visited, such as an external analytics or advertising service, are third-party cookies. However, many browsers and anti-spyware applications block or remove third-party cookies, including the cookies used for web analytics. When third-party cookies are removed, the data collected is incomplete.

Click this external link to learn about the other challenges in Web Analytics

Links to other presentations

Artificial Intelligence

Augmented Reality

Block Chain

Life-logging

Mirror Worlds

Virtual World

Bibliography
1. Bizouati, Michael. “A Brief History of Web Analytics | UX & Usability | Web Optimization.” ClickTale, 5 July 2017, www.clicktale.com/resources/blog/a-brief-history-of-web-analytics/.
2. Clifton, Brian. “Noise or Music?” On-Site Versus Off-Site Web Analytics – 2. How They Work, brianclifton.com/blog/2015/08/20/on-site-versus-off-site-analytics-which-is-accurate/.
3. J. Fan, F. Han, H. Liu. Challenges of big data analysis. National Science Review, 1 (2) (2014), pp. 293-314.
4. M. Schroeck, R. Shockley, J. Smart, D. Romero-Morales, P. Tufano. Analytics: The real-world use of big data. How innovative enterprises extract value from uncertain data. IBM Institute for Business Value (2012). Retrieved from http://www-03.ibm.com/systems/hu/resources/the_real_word_use_of_big_data.pdf
5. Lord, Nate. What is Security Analytics? Learn about the Use Cases and Benefits of Security Analytics Tools. 12 September 2018. 28 October 2018. <https://digitalguardian.com/blog/what-security-analytics-learn-about-use-cases-and-benefits-security-analytics-tools>.
6. W. He, S. Zha, L. Li. Social media competitive analysis and text mining: A case study in the pizza industry. International Journal of Information Management, 33 (3) (2013), pp. 464-472.
7. J. Heidemann, M. Klier, F. Probst. Online social networks: A survey of a global phenomenon. Computer Networks, 56 (18) (2012), pp. 3866-3878.
8. Kallarakkal, Seby. “Log File Analysis Vs Page Tagging” https://www.nabler.com/articles/log-file-analysis-versus-page-tagging.asp
9. Gopal, Sowparnika. “What are Log Files?” https://www.nabler.com/articles/log-files.asp
10. Teixeira, Joe. “The ‘Hotel Problem’ – Revisited.” The Analytics and Site Intelligence Blog @ MoreVisibility, MoreVisibility, 16 Sept. 2014, www.morevisibility.com/blogs/analytics/the-hotel-problem-revisited.html.
11. “Top 6 Web Analytics Issues and Concerns.” Web Analytics Guy Doing It Right, 7 May 2018, webanalyticsguy.com/2018/05/07/top-6-web-analytics-issues-and-concerns/.
12. WAA Standards Committee. "Web analytics definitions." Washington DC: Web Analytics Association (2008)
13. “Web Analytics.” Wikipedia, Wikimedia Foundation, 23 Nov. 2018, en.wikipedia.org/wiki/Web_analytics.
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License