CASE STUDY: Group 7 – North African tourist destinations

Mathematics and Statistics in Hospitality Business

Context:

Your consulting team has been asked to analyse data collected in July 2019 for samples of accommodations in six North African tourist destinations.

As part of your research, you should analyse the following variables: the ACCOMMODATION PRICE (in CHF) and the ACCOMMODATION’S DISTANCE TO CENTRE (in km). You are also asked to conduct a correlation and regression study between the ACCOMMODATION’S DISTANCE TO CENTRE (in km) and the ACCOMMODATION’S LOCATION SCORE.

Some important information about the data:

The accommodations’ distances to centre, the accommodations’ location scores and the accommodation prices were extracted from the Booking.com website.
The accommodations’ location scores are based on the average customer rating of each accommodation’s proximity to popular places for eating, shopping, nightlife and sightseeing, given on a scale from 1 to 10, with 1 indicating the lowest and 10 the highest result.
All the accommodations are of at least 4-star standard.
All the data needed for these studies can be found in the Excel file:
G07 – NORTH AFRICAN TOURIST DESTINATIONS.xlsx

Each team member should analyse the data from one tourist destination with the corresponding set of hotels (from the Excel file provided).

Variable: ACCOMMODATION PRICE (in CHF)

Summarise this variable using a grouped frequency distribution and present it on a histogram.
Calculate measures of central tendency (mean, median, mode) for this variable.
Calculate measures of spread (range, variance, standard deviation and coefficient of variation) for this variable.
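The three price tasks above can be cross-checked outside Excel. The following is a minimal Python sketch, not the course's prescribed method: the `prices` values are hypothetical (they are not taken from the case-study Excel file), the class start and width passed to the helper `grouped_frequencies` are arbitrary choices, and the text bars only stand in for a proper histogram.

```python
import statistics

# Hypothetical nightly prices in CHF (NOT taken from the case-study Excel file).
prices = [80, 95, 110, 120, 120, 135, 150, 160, 185, 245]


def grouped_frequencies(values, start, width):
    """Count values in equal-width classes [lower, upper)."""
    n_classes = int((max(values) - start) // width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[int((v - start) // width)] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Grouped frequency distribution; class start and width are assumed choices.
table = grouped_frequencies(prices, start=50, width=50)
for lower, upper, count in table:
    print(f"{lower:>4} to under {upper:<4} {count:>3}  {'#' * count}")

# Measures of central tendency.
mean = statistics.mean(prices)
median = statistics.median(prices)
modes = statistics.multimode(prices)  # list, since there may be several modes

# Measures of spread (sample versions, i.e. divisor n - 1).
value_range = max(prices) - min(prices)
variance = statistics.variance(prices)
st_dev = statistics.stdev(prices)
cv = st_dev / mean * 100  # coefficient of variation in percent

print(f"mean={mean:.2f} median={median:.2f} modes={modes}")
print(f"range={value_range:.2f} variance={variance:.2f} "
      f"st.dev={st_dev:.2f} CV={cv:.2f}%")
```

The modal class can be read from the row of `table` with the largest count. Results for a real data set depend on the class limits chosen, so they should agree with the classes used for the histogram in the report.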
Variable: ACCOMMODATION’S DISTANCE TO CENTRE (in km)

Calculate measures of central tendency (mean, median, mode) for this variable.
Calculate measures of spread (range, variance, standard deviation and coefficient of variation) for this variable.
Find the five-number summary for the boxplot. Calculate the lower limit and the upper limit for the outliers. Identify whether there are any outliers and, if so, list them in the summary table with the results.
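The boxplot task above can be sketched in Python as follows. Several points are assumptions rather than requirements of the brief: the `distances` values are hypothetical, quartiles use the "inclusive" interpolation method (which should correspond to Excel's QUARTILE.INC), outlier limits use the common 1.5 × IQR rule, and "Min/Max for boxplot" is read as the smallest and largest values that are not outliers. If the course defines any of these differently, the course definition takes precedence.

```python
import statistics

# Hypothetical distances to centre in km (NOT taken from the case-study Excel file).
distances = [0.3, 0.5, 0.8, 1.0, 1.2, 1.5, 2.0, 2.4, 3.1, 9.5]

# Quartiles with the "inclusive" method (assumed to match Excel's QUARTILE.INC).
q1, q2, q3 = statistics.quantiles(distances, n=4, method="inclusive")

# Outlier limits using the 1.5 * IQR rule (an assumed convention).
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Values outside the limits are outliers; the rest determine the whisker ends.
outliers = [d for d in distances if d < lower_limit or d > upper_limit]
inside = [d for d in distances if lower_limit <= d <= upper_limit]
box_min, box_max = min(inside), max(inside)

print(f"Min={box_min:.2f} Q1={q1:.2f} Q2={q2:.2f} Q3={q3:.2f} Max={box_max:.2f}")
print(f"Lower limit={lower_limit:.2f} Upper limit={upper_limit:.2f}")
print(f"Outliers: {outliers}")
```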
Variables: ACCOMMODATION’S DISTANCE TO CENTRE (in km) and ACCOMMODATION’S LOCATION SCORE

In your analysis, use the variable ACCOMMODATION’S DISTANCE TO CENTRE (in km) as the independent variable and ACCOMMODATION’S LOCATION SCORE as the dependent variable.
Present the underlying data on a scatter diagram.
Calculate the correlation coefficient r, regression coefficients a and b, and the coefficient of determination R².
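The correlation and regression calculations above can be illustrated with a short Python sketch. It assumes the regression line is written as y = a + b·x, with a as the intercept and b as the slope; the brief does not state this convention explicitly. The `distance` and `score` values are hypothetical and are not taken from the case-study Excel file.

```python
import math

# Hypothetical data (NOT taken from the case-study Excel file).
distance = [0.5, 1.0, 1.5, 2.0, 3.0]   # x: distance to centre in km
score = [9.4, 9.0, 8.9, 8.3, 7.9]      # y: location score (1 to 10)

n = len(distance)
mean_x = sum(distance) / n
mean_y = sum(score) / n

# Sums of squares and cross-products of deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance, score))
s_xx = sum((x - mean_x) ** 2 for x in distance)
s_yy = sum((y - mean_y) ** 2 for y in score)

# Least-squares line y = a + b*x (assumed convention: a = intercept, b = slope).
b = s_xy / s_xx
a = mean_y - b * mean_x

# Pearson correlation coefficient and coefficient of determination.
r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2

print(f"r={r:.2f} a={a:.2f} b={b:.4f} R2={r_squared:.2f}")
```

For simple linear regression with one independent variable, R² equals the square of r, which gives a quick consistency check on results obtained in Excel.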
Important: All the results of the calculations from the individual tasks should be summarised in ONE table (provided on the next page). Present results with two decimals, except coefficient b, which should be presented with four decimals. The table belongs in the first section of the summary report, Data and Results Presentation, together with the histogram and the scatter diagram prepared for your data set.

For the details concerning the report structure, please refer to the project outline.
The table must clearly indicate which team member analysed each tourist destination.

As only the content of your summary report will be graded, you must present all the results of the calculations from the Excel file in the Word document. The Excel file MUST NOT be submitted and WILL NOT be graded.

Table to include in the summary report (in exactly this format AND order; all the results should fit on ONE PAGE in your final report):

Columns: MARRAKECH | CASABLANCA | FES | CAIRO | SHARM EL SHEIKH | TANGIER

Student name
Variable: ACCOMMODATION PRICE (in CHF)
Mode(s)/Modal class
Median
Mean
Shape of distribution
Range
Sample variance
Sample st. deviation
Coefficient of variation
Variable: ACCOMMODATION’S DISTANCE TO CENTRE (in km)
Mode(s)/Modal class
Median
Mean
Shape of distribution
Range
Sample variance
Sample st. deviation
Coefficient of variation
Five-number summary
Min for boxplot
Q1
Q2
Q3
Max for boxplot
Lower limit for outliers
Upper limit for outliers
Outliers (values)
Correlation and Regression
r
a
b (4 decimals)
R²

In the ANALYSIS AND DISCUSSION section, students should compare and review their individual results and address the following issues:

ACCOMMODATION PRICE (in CHF):

In which tourist destination is the ACCOMMODATION PRICE (in CHF) typically the lowest and in which is it the highest, and how can this analysis be linked to the shape of the underlying distributions?

In which tourist destination is the variability of the ACCOMMODATION PRICE (in CHF) the largest, and which measure of spread would be the most appropriate to analyse this variable?

ACCOMMODATION’S DISTANCE TO CENTRE (in km):

In which tourist destination is the ACCOMMODATION’S DISTANCE TO CENTRE (in km) typically the lowest and in which is it the highest?

Which measure of central tendency should be used for each tourist destination to represent its typical ACCOMMODATION’S DISTANCE TO CENTRE (in km)?

In which tourist destination is the variability of the ACCOMMODATION’S DISTANCE TO CENTRE (in km) the largest, and which measure of spread would be the most appropriate to analyse this variable?

Present all boxplots on one diagram and use them to analyse this variable: what similarities and differences can you notice?

ACCOMMODATION’S DISTANCE TO CENTRE (in km) and ACCOMMODATION’S LOCATION SCORE:

Compare the scatter diagrams and the values of the correlation coefficient, r, and analyse in which tourist destination the ACCOMMODATION’S DISTANCE TO CENTRE (in km) has the greatest impact on the ACCOMMODATION’S LOCATION SCORE.

Analyse whether the intercepts from the regression equations are useful for the practical analysis of the underlying data.

How would you interpret the differences/similarities in the values of the slopes from the regression equations calculated for the underlying data?

What can influence the accuracy of a forecast made using the regression equations calculated for the underlying data? Comment on factors other than the ACCOMMODATION’S DISTANCE TO CENTRE (in km) that may have an impact on the variability in the ACCOMMODATION’S LOCATION SCORE.

In the RECOMMENDATIONS section, propose practical recommendations arising from the data analysis.

