CHAPTER 1 INTRODUCTION
The purpose of web metrics, or measurements, is to support better decisions in risk management, reliability forecasting, cost estimation, project scheduling, and overall software quality improvement. GUI code has characteristics that make typical software metrics, such as lines of code, cyclomatic complexity, and other static or dynamic metrics, impractical: they may fail to distinguish a complex GUI from a simple one. There are several ways of evaluating a GUI, including formal, heuristic, and manual testing. Other classifications of user evaluation techniques include predictive and experimental. Unlike typical software, some of those evaluation techniques depend solely on users and may never be automated or calculated numerically. GUI structural metrics are those metrics that depend on the static architecture of the user interface. The main contribution that distinguishes this research is its focus on generating GUI metrics automatically (i.e., through a tool, without user interference). This paper elaborates on those earlier metrics, suggests new ones, and evaluates them in relation to the execution and verification process. Some open source projects were selected for the experiments. Most of the selected projects are relatively small; however, GUI complexity is not always in direct relation to size or other code complexity metrics.
1.1 Problem Introduction
Evaluators of software applications and websites strive to ensure that their software meets quality standards relative to others. They use metric tools and methods to obtain software characteristics and compare them with other software applications or with standards. For the approach to be practical, those attributes should be gathered automatically through tools. Web applications have characteristics that make their maintenance expensive, including heterogeneity, speed of evolution, and dynamic code generation.
1.1.1 Motivation
A key element of any web site engineering process is metrics. Web metrics are used to better understand the attributes of the web pages we create and, most importantly, to assess the quality of the engineered web product or of the process used to build it. The study of websites is a relatively new practice compared to quality management, which makes the task of measuring website quality very important. Since metrics are a crucial source of information for decision making, a large number of web metrics have been proposed to compare the structural quality of web pages. The Internet and websites are emerging media and service avenues that require quality improvements for better customer service, a wider user base, and the betterment of humankind. E-business is emerging, and websites are not just a medium for communication; they are also products for providing services. Measurement is a key issue for the survival of any organization. Therefore, measuring and evaluating websites for quality, and better understanding the key issues related to website engineering, is very important. Surveying and classifying previous work in a particular field has several benefits: i) it helps organise a given body of knowledge; ii) it provides results that can help identify gaps that need to be filled; iii) it provides a categorization that can also be applied or adapted to other surveys; and iv) it provides a classification and summary of results that may benefit researchers who wish to carry out meta-analyses. All the previous work identified above utilizes statistical techniques to produce an end result, while this report uses fuzzy logic in order to generalize the end result.
Alagappan et al. [2009] studied website usability and performance through utility availability and websites' visual appearance. They studied website contents, domains, and navigability and their impact on usability, along with website-user metrics such as traffic analysis in terms of number of hits and users' behavior. Mendes et al. [2001] proposed a prediction model for estimating design and authoring effort in website applications. The study was based on a student class who were taught and then given an assignment to build a website after receiving similar training; website attributes were studied alongside development time, looking for correlations. Sanjeev et al. [2008] focused on evaluating hypermedia applications of websites in terms of reliability, usability, maintainability, and effort estimation. Victor et al. [2008] used data mining and visualization to analyze and study web data, developing and using a web mining tool called WET. Carsten et al. [2005] introduced an efficiency metric (the Guidance Performance Indicator, GPI) that tries to evaluate the degree to which a goal-driven website meets its goals in terms of user acceptance; this is accomplished by modeling desired user behaviour patterns. There are several other papers related to this subject. The focus of those papers is on selecting one or more particular metrics to propose or evaluate; examples of metrics that receive more focus than others include usability, navigability, accessibility, and performance. Similarly, our paper focuses on structural metrics of websites: those metrics related to the structure of the websites and to the number and size of their components, such as pages, images, and forms.
Literature survey: gives a brief of all the papers studied during the course of the project; 14 papers were undertaken for study. System Design and Methodology: defines the method of data collection and the methodology used to evaluate the websites and verify the results. Implementation and Result: describes basic software requirements, assumptions, and dependencies, along with implementation details of the rule base through MATLAB. Conclusion: gives the generalized end result and a comparison of the work done with previous work.
Web technologies and applications are becoming increasingly important in the information systems world. One of the main problems of web development is its short life span, owing to the ever-changing web, which can result in a lack of quality. A good mechanism for controlling the quality of a web-based application (and hence of a web site) is the use of web metrics. It is important to measure the attributes of the software quantitatively and qualitatively in order to understand and enhance it. The first step of any measurement is to define the attributes to be measured. A number of quality attributes have been defined by different researchers for analysis.
2.2 Measures of size and complexity for website content, by Yamada et al. [1995]
Yamada proposed size metrics to measure authoring and maintenance problems.
Interface Shallowness: measures the cognitive load on users. It assumes that applications are structured hierarchically, that each level corresponds to a cognitive layer, and that moving from one layer to another increases the cognitive load on users.
Downward Compactness: structural complexity of reaching the nth node from the root.
Downward Navigability: measures hypermedia navigability; an easily navigable hypermedia application has a shallow interface layer from the root to the nth node and is compact from the root (that is, it is structurally simple to reach the nth node from the root).
Web pages: number of Web pages in an application.
Home pages: number of major entry points to the Web application.
Leaf nodes: number of Web pages in an application that have no siblings.
Hidden nodes: number of Web pages excluded from the main navigation buttons.
Depth: number of Web pages on the second level that have siblings.
Application paragraph count: sum of PPC over all Web pages in an application.
Delivered images: number of unique images used by the Web application.
Audio files: number of unique audio files used in a Web application.
Application movies: sum of PMs over all the Web pages in an application.
3D objects: number of files (incl. 3D objects) used in a Web application.
Virtual worlds: number of files (incl. virtual worlds) used in a Web application.
External hyperlinks: number of unique URLs in the Web application.
Web page:
Actions: number of independent actions implemented using JavaScript, ActiveX, etc.
Page paragraph count (PPC): number of paragraphs in a Web page.
Word count: number of words in a Web page.
Navigational structures: number of different navigational structures in a Web page.
Page movies (PM): number of movie files used in a Web page.
Interconnectivity: number of URLs that link to other pages in the same application.
Media:
Image size (IS): computed as width * height.
Image composites: number of layers from which the final image was created.
Language versions: number of image versions that must be produced to accommodate different languages or cultural priorities.
Duration: summed duration of all sequences within an audio file.
Audio sequences: number of sequences within an audio file.
Imported images: number of graphics images imported into an audio file.
Program:
Lines of source code: number of lines of code in a program/script.
McCabe cyclomatic complexity: structural complexity of a program/script.
Stratum: measures to what degree the application is organised for directed reading.
Link Generality: measures whether the link applies to a single instance or to multiple instances.
Web application:
Page count: number of HTML or SHTML files.
Media count: number of unique media files.
Program count: number of CGI scripts, JavaScript files, and Java applets.
Total page allocation: space (Mbytes) allocated for all HTML or SHTML pages.
Total media allocation: space (Mbytes) allocated for all media files.
Total code length: number of lines of code for all programs.
Reused media count: number of reused or modified media files.
Reused program count: number of reused or modified programs.
Total reused media allocation: space (Mbytes) allocated for all reused media files.
Total reused code length: number of lines of code for all reused programs.
Code comment length: number of comment lines in all programs.
Reused code length: number of reused lines of code in all programs.
Reused comment length: number of reused comment lines in all programs.
Total page complexity: average number of different types of media used, excluding text.
Connectivity: number of internal links, not including dynamically generated links.
Connectivity density: computed as Connectivity divided by Page count.
Cyclomatic complexity: computed as (Connectivity - Page count) + 2.
Page allocation: allocated space (Kbytes) of an HTML or SHTML file.
Page complexity: number of different types of media used on a page, not including text.
Graphic complexity: number of graphics media.
Audio complexity: number of audio media.
Video complexity: number of video media.
Animation complexity: number of animations.
Scanned image complexity: number of scanned images.
Page linking complexity: number of links.
Media duration: duration (minutes) of audio, video, and animation.
Media allocation: size (Kbytes) of a media file.
Program code length: number of lines of code in a program.
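As an illustration of the connectivity-based measures above, the sketch below computes Connectivity density (Connectivity / Page count) and the cyclomatic complexity variant ((Connectivity - Page count) + 2) from a site's internal link list. The page names and links are invented for the example:

```python
# Hypothetical site: pages and internal (non-dynamic) links between them
pages = ["index", "about", "products", "contact"]
links = [("index", "about"), ("index", "products"),
         ("index", "contact"), ("products", "contact"),
         ("about", "index")]

connectivity = len(links)                      # internal links only
page_count = len(pages)                        # HTML/SHTML files
connectivity_density = connectivity / page_count
cyclomatic = (connectivity - page_count) + 2   # (Connectivity - Page count) + 2

print(connectivity_density, cyclomatic)  # 1.25 3
```

The same counts could equally be produced by a crawler; only the two formulas matter here.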
Complexity:
Number of links into a Web page: number of incoming links (internal or external).
Number of links out of a Web page: number of outgoing links (internal or external).
Web page complexity: complexity of a Web page based upon its number of words, its combined number of incoming and outgoing links, and its number of non-textual elements.
COSMIC-FFP = COmmon Software Measurement International Consortium - Full Function Points.
Number of lines: number of XML, SGML, HTML, and query-language lines.
Number of Web components: number of applets, agents, etc.
Number of graphics files: number of templates, images, pictures, etc.
Number of scripts: number of scripts for visual language, audio, motion, etc.
2.12 Empirical validation of web metrics for improving the quality of web pages, by Ruchika et al. [2011]
Ruchika et al. used 15 parameters to evaluate websites:
Number of words: total words on a page.
Body text words: words that are body (vs. display) text.
Number of links: links on a page.
Embedded links: links embedded in text on a page.
Wrapped links: links spanning multiple lines.
Within-page links: links to other areas of the same page.
Number of !s: exclamation points on a page.
Page title length: words in the page title.
Page size: total bytes for the page and images.
Number of graphics: total images on a page.
Text emphasis: total emphasized text.
Number of lists: lists on a page.
Frames: use of frames.
Number of tables: number of tables present on a web page.
Emphasized body text: total emphasized body text.
2.13 Summary
This literature shows that from 1996 onwards the majority of size metrics were geared towards Web applications rather than hypermedia applications, illustrating a shift in focus not only by the research community but also by practitioners. Most size metrics were aimed at cost estimation, except for those proposed between 1992 and 1996. Recent work shows that complexity size metrics do not seem to be as important as functionality and length size metrics. This may be due to the motivation behind the proposition of such metrics; however, it may also point towards a change in the characteristics of Web applications developed in the past compared to those developed today. Many Web applications are becoming dynamic applications, where pages are generated on the fly. This may indicate that looking at an application's structure, represented by its links, ceases to be as important as in the past, which also explains the gradual exclusion of complexity size metrics from recent literature.
CHAPTER 3 SYSTEM DESIGN AND METHODOLOGY
This chapter presents the system design and the various methodologies formulated to identify the criteria for evaluating websites. The method includes a survey of different websites, identification of the parameters used to evaluate a website, and the design and development of the web metric tool.
The data is collected through a survey of 20 students, a web parameter calculator, an online parameter extractor, and the Webby Awards. The aim is to determine whether a website is excellent, very good, good, or bad on the basis of average usage and the other parameters undertaken.
1. Hits
Hits were the buzzword of the late-90s dot-com boom. Few realized that a hit counted any request to the web server, including images, not just pages. Pages with many images therefore received many more hits than plain-text sites, making hits a fairly inaccurate and unfair metric until a better measure of web popularity was adopted: the pageview metric (covered below).
2. Web Counters
This was another craze of the dot-com boom. Websites all over the internet showed little counters boasting how many hits they had received since who knows when (that day? that month? that year? they rarely said). Even funnier were the people who put counters on their personal websites only to be sad to see that no one was visiting. Any site that still has a web counter needs to be quietly pulled aside and laid to rest.
Firefox (or the other way around if the website has a tech audience). Knowing this, what can the metrics do to improve the website?
5. Pageviews
This tells the number of views each page of a website is getting; in particular, it shows how a website fares over time. A view counts as one loading of a page. It is still considered a very important metric, but the increasing number of Flash/AJAX-built websites, and the increase in online video, mean fewer pageviews are counted even though the same amount of content is being viewed. Therefore, it is not as good an indicator of website popularity as it used to be.
6. Visits
A visit is recorded when someone arrives at a website and starts looking at pages. A visit can consist of many pageviews, or just one. It is not as good or as interesting a metric as unique visitors or pageviews, as it sits somewhere between the two.
7. Unique Visitors
A unique visitor counts the number of distinct people visiting (making visits to) a website in a particular time period, usually one day. A unique visitor can make many visits, each containing many pageviews. This is still one of the best metrics to use for a website, as it tells you the number of different people visiting a website on a daily basis, a great indicator of site popularity (more advanced analysts use daily, weekly, and monthly unique visitor metrics too).
8. Referrers
Referrers tell you all the places from which people are finding and visiting a website. If the server does not know where visitors are coming from, marketing efforts are hard to judge, as is deciding where to spend additional money.
9. Average Time Spent
The Average Time Spent (ATS) metric indicates the amount of time a visitor spends on a website and its pages. It is usually a good indicator of the quality of a website (depending on the type of website): the longer the ATS, usually, the better. However, a long ATS can also indicate a bad website experience, with people unable to find what they are looking for. It is best combined with the bounce rate and exit pages (see below) to get a more accurate picture of the quality of a website's content. Also, average time spent does not take into account the last page viewed (there is no way of knowing when the visitor closed the browser or walked away), so blog home pages suffer from this.
10. Bounce Rate
Bounce rate measures the share of people who, upon arriving at a website, immediately leave; it is therefore a great indicator of the quality of a website. Bounce rate is the percentage of single-page visits out of all entrance visits for an individual page. In particular, it is very revealing to check the bounce rate for paid search keywords: spend more on the keywords with low bounce rates, and cut the keywords with high bounce rates. A bounce rate below 40% for a page is considered good.
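The bounce-rate definition above (single-page visits divided by entrance visits for a page) can be sketched over a hypothetical visit log; the entrance-page names and visit counts below are invented for illustration:

```python
# Hypothetical visit log; each visit records its entrance page and how
# many pages were viewed before the visitor left.
visits = [
    {"entrance": "landing-a", "pages_viewed": 1},
    {"entrance": "landing-a", "pages_viewed": 4},
    {"entrance": "landing-a", "pages_viewed": 1},
    {"entrance": "landing-b", "pages_viewed": 1},
    {"entrance": "landing-b", "pages_viewed": 2},
]

def bounce_rate(visits, page):
    """Percentage of entrance visits to `page` that viewed only that page."""
    entered = [v for v in visits if v["entrance"] == page]
    bounces = sum(1 for v in entered if v["pages_viewed"] == 1)
    return 100.0 * bounces / len(entered)

print(round(bounce_rate(visits, "landing-a"), 1))  # 66.7
print(round(bounce_rate(visits, "landing-b"), 1))  # 50.0
```

By this sketch, "landing-a" would be a candidate for cutting paid-keyword spend, while "landing-b" sits comfortably under the 40% threshold mentioned above only if its rate drops further.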
3. Number of links
This is the total number of links on a web page, calculated by counting the links present on the page.
4. Embedded links
These are the links embedded in the running text of the web page.
5. Wrapped links
These are links that span multiple lines, calculated by counting the number of links that wrap onto more than one line.
7. Number of !s
Exclamation points on a page, calculated by counting the total number of ! marks on the page.
9. Number of graphics
This refers to the total number of images on a page, calculated by counting the images present on the page.
10. Page size
This refers to the total size of the web page and can be found in the properties of the web page.
11. Number of lists
This metric is calculated by counting the total number of ordered and unordered lists present on a web page.
12. Number of tables
This metric answers the question: how many tables are used in making a web page?
13. Frames
This metric is calculated by checking whether a web page contains frames or not.
14. Text emphasis
This metric is calculated by analyzing the web page and counting the total number of words that are in bold, italics, or capitals.
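Several of the parameters above (links, images, lists, exclamation points, frames, word counts) can be obtained by parsing a page's HTML. The following is a minimal sketch using Python's standard html.parser, not the report's actual PHP extractor; the sample HTML string is invented:

```python
from html.parser import HTMLParser

class ParamExtractor(HTMLParser):
    """Counts a few of the structural parameters described above."""
    def __init__(self):
        super().__init__()
        self.links = 0; self.images = 0; self.tables = 0
        self.lists = 0; self.frames = False; self.words = 0; self.bangs = 0
    def handle_starttag(self, tag, attrs):
        if tag == "a": self.links += 1
        elif tag == "img": self.images += 1
        elif tag == "table": self.tables += 1
        elif tag in ("ol", "ul"): self.lists += 1
        elif tag in ("frame", "frameset", "iframe"): self.frames = True
    def handle_data(self, data):
        self.words += len(data.split())
        self.bangs += data.count("!")

html = '<p>Welcome!</p><a href="/a">a</a><a href="/b">b</a><ul><li>x</li></ul><img src="i.png">'
p = ParamExtractor(); p.feed(html)
print(p.links, p.images, p.lists, p.bangs)  # 2 1 1 1
```

A real extractor would fetch the page first and also handle wrapped and embedded links, which require inspecting the rendered layout rather than the raw markup.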
The values of the parameters identified above are calculated through three sources:
Student survey: students' feedback was taken on satisfaction, i.e. whether the website served their purpose or not.
Web parameter extractor: parameters identified above, such as number of words, head tags, and body tags, are calculated through this tool, developed by us in PHP.
Online parameter extractor: some of the parameters were calculated through an online parameter extractor, www.digsitevalue.org.
The raw parameter values are normalized using min-max normalization: B = (A - C) / (D - C), where B = normalized value, A = value of the parameter, D = maximum value of the range, and C = minimum value of the range.
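Assuming the normalization used is standard min-max scaling (B = (A - C) / (D - C), consistent with the variable roles listed above), a one-line sketch with an invented link count and range:

```python
def normalize(a, c, d):
    """Min-max normalization: B = (A - C) / (D - C), mapping a raw
    parameter value A from the range [C, D] into [0, 1]."""
    return (a - c) / (d - c)

# e.g. a page with 120 links, where the observed range across sites is 10..260
print(normalize(120, 10, 260))  # 0.44
```

Values at the range endpoints map to exactly 0 and 1, which is what the fuzzification step expects as input.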
Fig. 3.1 - Fuzzy system components: knowledge base (data base and rule base), fuzzification module, and inference engine.
After the fuzzification process, there is a fuzzy set for each output variable that needs defuzzification. The input to the defuzzification process is the aggregate output fuzzy set, and the output is a single crisp number.
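A common way to reduce the aggregate set to one crisp number is centroid defuzzification. The sketch below uses invented membership values sampled on [0, 1]; it is an illustration, not the report's actual MATLAB rule base:

```python
def centroid_defuzzify(xs, mu):
    """Centroid (center-of-gravity) defuzzification:
    crisp = sum(x * mu(x)) / sum(mu(x)) over the sampled universe."""
    num = sum(x * m for x, m in zip(xs, mu))
    den = sum(mu)
    return num / den

# Aggregate output membership function sampled at 11 points on [0, 1]
xs = [i / 10 for i in range(11)]
mu = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0, 0.9, 0.6, 0.3, 0.1, 0.0]
print(centroid_defuzzify(xs, mu))  # symmetric set, so the centroid is 0.5
```

Because the sampled set is symmetric about 0.5, the crisp output lands exactly at 0.5; a set skewed toward "high" would pull the centroid upward accordingly.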
Fig. 3.2 - Fuzzy Inference System (inputs: navigation, usability, flexibility, efficiency, satisfaction; one crisp output).
This chapter describes basic software requirements, assumptions, and dependencies, along with implementation details of the rule base through MATLAB. It also contains screenshots of the interfaces of the parameter value extractor. The test cases used to test the application and the results thus obtained are then described.
4.1 Software Requirements
WAMP: web application server, needed to run the system as a server and execute the PHP applications.
PHP: server-side scripting language.
MySQL: database, used to perform the database-related tasks in the web applications.
4.3 Constraints
4.4 Implementation Details
The parameter value extractor is designed and developed in PHP in order to calculate the values of the selected parameters.
4.4.4 Results
Table 4.3 shows the category-wise website evaluation.
The objective of this project was to measure the usability of a website by designing and developing web metrics. A tool has been developed that works on a fuzzy logic implementation in MATLAB. The figure above shows the final output after the implementation of fuzzy logic. The results shown have been categorized by website category.
CHAPTER 5 CONCLUSION
The web metrics tool provides knowledge of the usability of a particular website. This project takes various parameters from different websites, identifies them, and provides a normalized form of the data. In this project, fuzzy logic has been implemented in MATLAB 7.10 considering five input criteria, i.e. navigation, usability, flexibility, efficiency, and satisfaction, and a crisp value for the output is generated.
5.1 Performance Evaluation
The performance of the websites has been evaluated based on the following criteria:
Navigation: a website's navigation system is like a road map to all the different areas and information contained within the website. The number of links determines the navigation ease experienced by the intended user.
Usability: an approach to making web sites easy to use for an end-user without requiring any specialized training, for example, pressing a button to perform some action.
Flexibility: enabling the user to browse the website anywhere, on any browser, with ease.
Efficiency: a measure of how well a website does what it should do. It can be checked by calculating and comparing the response time of each website.
Satisfaction: the user's satisfaction, i.e. whether visiting the website was useful to them.
On the basis of these criteria, a range is provided for each category that decides the usability of the web site. A checklist of criteria is also used, and a survey of various websites has been conducted by students. The rule base prepared for analyzing the websites contains all the mentioned criteria along with their respective ranges, i.e. HIGH, MEDIUM, and LOW. The output according to these criteria is produced by the fuzzy logic implemented in MATLAB 7.10.
5.2 Results
Since we have implemented fuzzy logic in MATLAB to generate the result on the basis of the criteria used, the output gives a much clearer understanding of the usability of the websites. The results have been classified into overlapping ranges: Very high: 0.85-1; High: 0.6-0.9; Medium: 0.35-0.65; Low: 0.1-0.4; Very low: 0-0.15. Usability is predicted based on these output values.
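Because the bands overlap, a crisp defuzzified value near a boundary can fall into two adjacent categories, reflecting the fuzzy boundaries between them. A simple sketch of this mapping (the band limits are the report's; the test values are invented):

```python
# Overlapping output bands for the crisp usability value
RANGES = {
    "Very low":  (0.00, 0.15),
    "Low":       (0.10, 0.40),
    "Medium":    (0.35, 0.65),
    "High":      (0.60, 0.90),
    "Very high": (0.85, 1.00),
}

def usability_labels(value):
    """Return every band whose range contains the crisp output value."""
    return [name for name, (lo, hi) in RANGES.items() if lo <= value <= hi]

print(usability_labels(0.5))   # ['Medium']
print(usability_labels(0.62))  # ['Medium', 'High']
```

A value such as 0.62 is legitimately both Medium and High; a final single label could be chosen by distance to the band midpoints, or left fuzzy as here.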
REFERENCES
1. Botafogo, R., Rivlin, A.E. and Shneiderman, B., "Structural Analysis of Hypertexts: Identifying Hierarchies and Useful Metrics", ACM TOIS, 10(2), 1992, pp. 143-179.
2. Bray, T., "Measuring the Web", Proc. Fifth WWW Conference, May 6-10, Paris, France, 1996, http://www5conf.inria.fr/fich_html/papers/P9/Overview.html.
3. Baskaran Alagappan, Murugappan Alagappan, and S. Danishkumar, "Web Metrics based on Page Features and Visitors' Web Behavior", Second International Conference on Computer and Electrical Engineering, December 2009, Dubai, UAE.
4. Cleary, D., "Web-based development and functional size measurement", Proc. IFPUG 2000 Conference, 2000.
5. Cowderoy, A.J.C., Donaldson, A.J.M., and Jenkins, J.O., "A metrics framework for multimedia creation", Proc. 5th IEEE Metrics Symposium, Maryland, USA, 1998.
6. Cowderoy, A.J.C., "Measures of size and complexity for web-site content", Proc. 11th ESCOM Conference, Munich, Germany, 2000, pp. 423-431.
7. Carsten Stolz, Maximilian Viermetz, and Michal Skubacz, "Guidance Performance Indicator - Web Metrics for Information Driven Web Sites", Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence (WI'05), 2005, pp. 186-192.
8. Claudia Ehmke and Stephanie Wilson, "Identifying Web Usability Problems from Eye-Tracking Data", Proceedings of the 21st British CHI Group Annual Conference on HCI 2007: People and Computers XXI, pp. 119-128.
9. Chang Jinling and Xia Guoping, "Comprehensive Evaluation of E-commerce Website Based on Concordance Analysis", Proceedings of the IEEE International Conference on e-Business Engineering (ICEBE'05), Beijing, China, October 2005.
10. E. Ghosheh, S. Black, and J. Qaddour, "Design metrics for web application maintainability measurement", Proceedings of the IEEE/ACS International Conference on Computer Systems and Applications, March-April 2004, Doha, Qatar, pp. 778-784.
11. E. Mendes, N. Mosley, and S. Counsell, "Web metrics - estimating design and authoring effort", IEEE Multimedia, 8(1), 2001, pp. 50-57.
12. Emilia Mendes, Nile Mosley, and Steve Counsell, "Early Web Size Measures and Effort Prediction for Web Costimation", Proceedings of the 9th International Software Metrics Symposium (METRICS'03), 2003, p. 18.
13. Fletcher, T., MacDonell, S.G., and Wong, W.B.L., "Early Experiences in Measuring Multimedia Systems Development Effort", in Multimedia Technology and Applications, Hong Kong: Springer-Verlag, 1997, pp. 211-220.
14. Hatzimanikatis, A.E., Tsalidis, C.T., and Christodoulakis, D., "Measuring the Readability and Maintainability of Hyperdocuments", Journal of Software Maintenance, Research and Practice, 7, 1995, pp. 77-90.
15. Junhua Wu and Baowen Xu, "A Method to Support Web Evolution by Modeling Static Structure and Dynamic Behavior", Proceedings of the International Conference on Computer Engineering and Technology (ICCET), 2009, Vol. 2, pp. 458-462.
16. J. Conallen, Building Web Applications with UML (The Addison-Wesley Object Technology Series), Addison-Wesley, Second Edition, 2003.
17. Jinling Chang, "Usability Evaluation of B2C E-commerce Website in China", The Sixth Wuhan International Conference on E-Business (WHICEB 2007), 2007, pp. 53-59.
18. Mendes, E., and Mosley, N., "Web Metrics and Development Effort Prediction", Proc. ACOSM 2000, 2000.
19. Mendes, E., Hall, W., and Harrison, R., "Applying measurement principles to improve hypermedia authoring", NRHM, 5, Taylor Graham Publishers, 1999, pp. 105-132.
20. Mendes, E., Mosley, N., and Counsell, S., "Web Metrics - Estimating Design and Authoring Effort", IEEE Multimedia, Special Issue on Web Engineering, Jan.-Mar. 2001, pp. 50-57.
21. Mendes, E., Mosley, N., and Counsell, S., "Investigating Early Web Size Measures for Web Costimation", Proc. EASE 2003 Conference, Keele University, 2003.
22. Paul Warren, Craig Gaskell, and Cornelia Boldyreff, "Preparing the Ground for Website Metrics Research", Proceedings of the 3rd International Workshop on Web Site Evolution (WSE'01), 2001, p. 78.
23. Reifer, D.J., "Web development: estimating quick-to-market software", IEEE Software, Nov./Dec. 2000, pp. 57-64.
24. Rollo, T., "Sizing e-commerce", Proc. ACOSM 2000, Sydney, Australia, 2000.
25. Seoyoung Hong and Jinwoo Kim, "Architectural criteria for website evaluation: conceptual framework and empirical validation", Behaviour & Information Technology, 23(5), 2004, pp. 337-357.
26. Sanjeev Dhawan and Rakesh Kumar, "Analyzing Performance of Web-based Metrics for Evaluating Reliability and Maintainability of Hypermedia Applications", Proceedings of the Third International Conference on Broadband Communications, Information Technology & Biomedical Applications, 2008, pp. 376-383.
27. Victor Pascual-Cid, "An Information Visualization System for the Understanding of Web Data", IEEE Symposium on Visual Analytics Science and Technology (VAST '08), 2008, pp. 183-184.
28. V.R. Basili and D.M. Weiss, "A Methodology for Collecting Valid Software Engineering Data", IEEE Transactions on Software Engineering, SE-10(6), 1984, pp. 728-738.
29. Yamada, S., Hong, J., and Sugita, S., "Development and Evaluation of Hypermedia for Museum Education: Validation of Metrics", ACM Transactions on Computer-Human Interaction, 2(4), 1995, pp. 284-307.
30. Yuming Zhou, Hareton Leung, and Pinata Winoto, "MNav: A Markov Model-Based Web Site Navigability Measure", IEEE Transactions on Software Engineering, 33(12), 2007, pp. 869-890.