Application platforms now generate massive amounts of data every day, and in-depth analysis of that data is becoming increasingly valuable. The field draws on mathematics, statistics, computer science, and other disciplines, and is well worth further study. This article walks through a simple data analysis scenario to review the common libraries (pandas, matplotlib, etc.) and the introductory knowledge involved in Python data analysis. For several designated mobile phone brands, monthly search data is fetched from the Baidu Index website over a date range, and a chart comparing their search trends is then drawn.
1) Key points
a) Generate the date range (using the date_range function of pandas);
b) For each specified date (year and month), obtain the search volume of every brand in the mobile phone brand list (using requests);
c) Construct a DataFrame (focusing on the data, index, and columns parameters) and plot it with matplotlib.
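Points a) and c) can be tried without any network access. The following is a minimal sketch, not the article's full script: the brand names BRAND_A and BRAND_B and all numbers are placeholders invented for illustration, and the month-start frequency 'MS' is one possible choice for getting one timestamp per month.

```python
import pandas as pd

# Month-start frequency ('MS') yields one timestamp per calendar month.
months = pd.date_range(start='2014-01-01', end='2014-04-01', freq='MS')
labels = [m.strftime('%Y-%m') for m in months]

# Placeholder numbers for illustration only; these are NOT real Baidu Index values.
rows = [
    [100, 80],
    [110, 85],
    [105, 90],
    [120, 95],
]

# data: one row per month; index: row labels; columns: one column per brand.
frame = pd.DataFrame(data=rows, index=labels, columns=['BRAND_A', 'BRAND_B'])
print(frame)
```

Each inner list becomes one row, so the number of rows must match the length of index and the width of every row must match the length of columns.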
2) Python Code
#!/usr/bin/python3
# -*- coding: UTF-8 -*-
import requests
import pandas as pd
from pandas import DataFrame
from matplotlib import pyplot as plt


def get_indices(year, month, brands):
    """Return the search index of each brand for the given year and month."""
    uri = 'http://index.baidu.com/Interface/Newwordgraph/getTopBrand?i=2&datetype=m&year=' + year + '&no=' + month
    r = requests.get(uri, timeout=10)
    if 200 == r.status_code:
        # Map brand name -> index value from the JSON response.
        brand_indices = {data['name']: data['value'] for data in r.json()['data']['data']}
        return [int(brand_indices[brand]) for brand in brands]
    # Keep the row as wide as `brands` so the DataFrame stays rectangular.
    return [float('nan')] * len(brands)


if '__main__' == __name__:
    brands = ['IPHONE', 'OPPO', 'LG', 'HTC', 'VIVO']
    # One 'YYYY-MM' label per month, from 2014-01 through 2017-10.
    year_months = [date.strftime('%Y-%m')
                   for date in pd.date_range(start='20140101', end='20171001', freq='MS')]
    data = []
    for year_month in year_months:
        year, month = year_month.split('-')
        data.append(get_indices(year, month, brands))
    # Rows are months, columns are brands.
    frame = DataFrame(data, index=year_months, columns=brands)
    frame.plot()
    plt.title('Search Trends Of Mobile Phone Brands')
    plt.show()
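The parsing step inside get_indices can be checked offline. The sketch below assumes only the nesting that the script itself reads (a list of name/value items under data.data). The sample payload is hypothetical and its values are made up, and the helper name extract_indices is introduced here purely for illustration. Unlike the script, it returns None for a brand that is missing from the response instead of raising a KeyError.

```python
import json

# Hypothetical payload mimicking the nesting the script reads:
# payload['data']['data'] is a list of {'name': ..., 'value': ...} items.
# The values are invented for illustration.
sample_text = (
    '{"data": {"data": ['
    '{"name": "IPHONE", "value": "1200"}, '
    '{"name": "OPPO", "value": "800"}'
    ']}}'
)


def extract_indices(text, brands):
    """Return one value per brand, in order; None where a brand is absent."""
    payload = json.loads(text)
    by_name = {item['name']: item['value'] for item in payload['data']['data']}
    return [int(by_name[b]) if b in by_name else None for b in brands]


row = extract_indices(sample_text, ['IPHONE', 'OPPO', 'LG'])
print(row)  # [1200, 800, None]
```

Returning a placeholder for absent brands keeps every row the same width, which the DataFrame constructor needs when columns is given.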
3) Result output