Paper Assignment of Applied Probability and Statistics in Data Analytics

Data Visualization and Statistical Modeling Exercise
In this assignment, you will use business knowledge, statistical modeling methods, and data visualization skills to analyze sales data from an online store. You can use any statistical tool you like, e.g. Excel, Python, or R.
The dataset records the store's sales revenue and its marketing spending on each marketing channel from January 2013 to December 2014. The dataset has 105 rows and 16 columns.
Explanation of some columns:
Holiday: 1 means holiday, 0 means non-holiday.
PROMOTION: the scale of the promotion.
SP: ad spending (cost) in each marketing channel, e.g. TV, email, paid search, online display, product search.
IMP: impressions, i.e. the total number of times the ad is viewed by a visitor or displayed on a web page in each marketing channel.
Data file: Data for signature assignment_BA502.xlsx
1. Open the Excel file and check the correlation between each ‘IMP’ (impressions) and ‘SP’ (spending).
2. Create a visualization to show the distribution of ‘Sales’. What is the trend of sales over the months?
3. Create a visualization of the relationship between each ‘IMP’ and ‘SP’. What is the trend of each ‘IMP’ and ‘SP’?
4. Segment ‘AVERAGE_PRICE’ and ‘MEDIA_SPEND_of_competitor’ by ‘Holiday’. Does ‘MEDIA_SPEND_of_competitor’ have an impact on ‘AVERAGE_PRICE’? (hint: you can use an Excel PivotTable to solve this question)
5. Segment ‘Sales’ by ‘PROMOTION’. Does ‘PROMOTION’ boost ‘Sales’?
6. Fit a linear regression model and check the coefficients for ‘IMP’ and ‘SP’. Summarize the output of the linear regression model, e.g. p-values and R-squared.
7. Optional question: Analyze the contribution of each marketing channel to sales and calculate the ROI (Return on Investment) of each marketing channel. (hint: research the marketing mix modeling method)
Length: 5 pages plus 1 cover page.
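For students working in Python, the correlation check in question 1 and the regression in question 6 could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the numbers are made up and do not come from the assignment dataset, the variable names (`tv_spend`, `tv_impressions`, `sales`) are hypothetical, and the regression uses a single predictor for brevity. A full answer would use the real columns with several predictors, typically through a statistics library that also reports p-values.

```python
from math import sqrt

# Made-up illustrative values, NOT taken from the assignment dataset.
# The variable names are hypothetical stand-ins for one channel's columns.
tv_spend = [10.0, 12.0, 15.0, 11.0, 18.0, 20.0]               # an 'SP' column
tv_impressions = [100.0, 125.0, 149.0, 112.0, 181.0, 198.0]   # an 'IMP' column
sales = [52.0, 58.0, 66.0, 54.0, 75.0, 81.0]                  # 'Sales'


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simple_ols(xs, ys):
    """Least-squares fit of ys = intercept + slope * xs, with R-squared."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Question 1 style check: correlation between impressions and spending.
corr_imp_sp = pearson(tv_impressions, tv_spend)

# Question 6 style check: regress sales on spending (one predictor only).
slope, intercept, r_squared = simple_ols(tv_spend, sales)

print(f"corr(IMP, SP) = {corr_imp_sp:.3f}")
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r_squared:.3f}")
```

With one predictor, R-squared equals the squared Pearson correlation between the predictor and the response, which is a quick sanity check on the fit. With several ‘IMP’ and ‘SP’ columns entering the model together, that shortcut no longer applies and the coefficients must be estimated jointly.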