The dataset contains hourly temperature readings for roughly ten years, from 2006-04-01 00:00:00.000 +0200 to 2016-09-09 23:00:00.000 +0200. It corresponds to Finland, a country in Northern Europe. You can download the dataset from this Google Drive link: https://drive.google.com/open?id=1ScF_1a-bkHi1qe8Rn78uxK6_5QwUD9Bu
Aim: we need to find out whether the average apparent temperature for a given month, say April, has increased from 2006 to 2016, and whether the average humidity for the same period has increased as well.
Step 1: Import libraries
Pandas and Matplotlib
Step 2: Load the dataset
Weather history dataset: https://www.kaggle.com/muthuj7/weather-dataset
Step 3: Read the data by printing its head
Step 4: Print the shape, columns, dtypes, info, and describe output
Step 5: Clean the data
Step 6: Check whether the data is proper
Step 7: Visualize the data
Print head of the data
The head() function returns the first five records of the data by default. You can pass the number of records you want to see to head(); here I read just the first five.
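As a minimal sketch of loading the data and looking at its head, the snippet below reads a tiny in-memory stand-in instead of the real file. The column names are assumptions based on the Kaggle dataset, the values are made up, and with the downloaded file you would pass its path to read_csv instead of sample_csv.

```python
import io

import pandas as pd

# Tiny stand-in for the downloaded CSV. Column names are assumed to match
# the Kaggle file; the values are made up for illustration only.
sample_csv = io.StringIO(
    "Formatted Date,Apparent Temperature (C),Humidity\n"
    "2006-04-01 00:00:00.000 +0200,7.4,0.89\n"
    "2006-04-01 01:00:00.000 +0200,7.2,0.86\n"
    "2006-04-01 02:00:00.000 +0200,9.4,0.89\n"
    "2006-04-01 03:00:00.000 +0200,5.9,0.83\n"
    "2006-04-01 04:00:00.000 +0200,6.9,0.83\n"
    "2006-04-01 05:00:00.000 +0200,7.1,0.85\n"
)

df = pd.read_csv(sample_csv)

first_five = df.head()    # default: first 5 rows
first_three = df.head(3)  # explicit row count

print(first_five)
```

The stand-in has six rows, so head() leaves the last one out, while head(3) keeps only the first three.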
Print the shape, columns, dtypes, info, and describe output
The shape attribute returns the dimensions of the DataFrame as a tuple of integers: the number of rows followed by the number of columns.
The DataFrame is the primary data structure of pandas. Its columns attribute returns the column labels of the given DataFrame.
The dtypes attribute returns the data type of each column, for example float64 for columns of decimal numbers.
The info() function prints a concise summary of a dataset, including the index dtype, the column dtypes, and the number of non-null values.
describe() is used to view basic statistical details such as percentiles, mean, and std of a dataset or a series of numeric values.
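The inspection calls described above can be sketched on a small hand-built DataFrame. The column names are assumed to mirror the weather dataset and the values are made up.

```python
import pandas as pd

# Small stand-in frame; values are made up for illustration.
df = pd.DataFrame(
    {
        "Apparent Temperature (C)": [7.4, 7.2, 9.4, 5.9],
        "Humidity": [0.89, 0.86, 0.89, 0.83],
        "Summary": ["Partly Cloudy", "Partly Cloudy", "Mostly Cloudy", "Partly Cloudy"],
    }
)

print(df.shape)    # (rows, columns) tuple
print(df.columns)  # column labels
print(df.dtypes)   # data type of each column
df.info()          # prints index dtype, column dtypes, non-null counts

summary = df.describe()  # statistics for the numeric columns only
print(summary)
```

Note that shape, columns, and dtypes are attributes, so they take no parentheses, while info() and describe() are methods.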
Cleaning the Data
Here, I dropped the unwanted columns from the dataset.
We can use the drop() method to remove unwanted columns.
The drop() function drops the specified labels from rows or columns: you remove rows or columns by giving the label names and the corresponding axis.
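A minimal sketch of dropping columns follows. Which columns count as unwanted is an assumption here: the article does not list them, so the two dropped names are only examples, and the values are made up.

```python
import pandas as pd

# Stand-in frame with two columns we pretend not to need (assumed names).
df = pd.DataFrame(
    {
        "Formatted Date": ["2006-04-01 00:00:00.000 +0200", "2006-04-01 01:00:00.000 +0200"],
        "Apparent Temperature (C)": [7.4, 7.2],
        "Humidity": [0.89, 0.86],
        "Wind Speed (km/h)": [14.1, 14.3],
        "Daily Summary": ["Partly cloudy throughout the day.", "Partly cloudy throughout the day."],
    }
)

unwanted = ["Wind Speed (km/h)", "Daily Summary"]

# columns=... is equivalent to passing the labels with axis=1.
df = df.drop(columns=unwanted)

print(df.columns.tolist())
```

drop() returns a new DataFrame by default, which is why the result is assigned back to df.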
Check whether the dataset is proper
The Formatted Date column is stored as text rather than as dates, so we need to convert it to datetime objects. For this we can use the to_datetime() function. Then set Formatted Date as the index with the set_index() function.
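The conversion can be sketched as below. The explicit format string is an assumption that matches the timestamp layout quoted at the top of the article, and utc=True is one way to handle the mixed +0200 and +0100 offsets; the values are made up.

```python
import pandas as pd

# Stand-in rows with the timestamp layout quoted in the article.
df = pd.DataFrame(
    {
        "Formatted Date": [
            "2006-04-01 00:00:00.000 +0200",
            "2006-04-01 01:00:00.000 +0200",
            "2006-11-01 00:00:00.000 +0100",
        ],
        "Apparent Temperature (C)": [7.4, 7.2, 3.1],
        "Humidity": [0.89, 0.86, 0.93],
    }
)

# Parse the text into timezone-aware timestamps, normalised to UTC so that
# summer-time and winter-time offsets end up on one common clock.
df["Formatted Date"] = pd.to_datetime(
    df["Formatted Date"], format="%Y-%m-%d %H:%M:%S.%f %z", utc=True
)

# Use the parsed timestamps as the row index.
df = df.set_index("Formatted Date")

print(df.index)
```

Because the timestamps are converted to UTC, local midnight on 2006-04-01 becomes 22:00 on 2006-03-31, which is worth keeping in mind when grouping by month.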
Since we have been given hourly data, we need to resample it to monthly data. After resampling, we work with the monthly averages of apparent temperature and humidity.
Apparent temperature: the temperature as perceived by humans. The term is most commonly applied to the perceived outdoor temperature.
Humidity: the concentration of water vapor present in the air. Water vapor, the gaseous state of water, is generally invisible to the human eye. Humidity indicates the likelihood of precipitation, dew, or fog being present.
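The hourly-to-monthly resampling mentioned above can be sketched as follows, using made-up hourly values. The "MS" alias (month start) is my choice for this sketch because it is accepted across pandas versions; the alias for month-end labels differs between versions.

```python
import pandas as pd

# Four made-up hourly readings: two in April 2006, two in May 2006.
index = pd.to_datetime(
    ["2006-04-10 00:00", "2006-04-10 01:00", "2006-05-10 00:00", "2006-05-10 01:00"],
    utc=True,
)
hourly = pd.DataFrame(
    {
        "Apparent Temperature (C)": [10.0, 12.0, 16.0, 18.0],
        "Humidity": [0.80, 0.90, 0.70, 0.60],
    },
    index=index,
)

# Group the rows by calendar month and average each group.
# Each output row is labelled with the first day of its month.
monthly = hourly.resample("MS").mean()

print(monthly)
```

resample() only works on a datetime-like index, which is why the date column has to be parsed and set as the index first.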
Plotting the dataset for the past ten years for all months
Here I use Matplotlib to visualize the last ten years of records month by month.
First, import the Matplotlib library under an alias, then select the type of plot that suits the dataset. Here I chose a line plot, which is what Matplotlib's plot function draws by default.
Now I visualize January from 2006 to 2016, and likewise February, March, April, and so on up to December.
The code is the same for every month from January to December; just change the variable name and the month number in the index filter: index.month==1 is January, index.month==2 is February, and index.month==12 is December.
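The per-month filtering and plotting can be sketched like this. The helper plot_month is hypothetical (it is not from the original code), the monthly frame is a made-up stand-in for the resampled data, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the monthly-resampled data.
index = pd.to_datetime(
    ["2006-04-01", "2006-05-01", "2007-04-01", "2007-05-01", "2008-04-01"], utc=True
)
monthly = pd.DataFrame(
    {
        "Apparent Temperature (C)": [11.0, 17.0, 10.5, 16.2, 11.8],
        "Humidity": [0.85, 0.65, 0.82, 0.70, 0.80],
    },
    index=index,
)


def plot_month(monthly, month, title):
    """Hypothetical helper: line plot of one calendar month across the years."""
    # Keep only the rows whose index falls in the requested month (1 to 12).
    one_month = monthly[monthly.index.month == month]
    fig, ax = plt.subplots(figsize=(10, 4))
    for column in one_month.columns:
        ax.plot(one_month.index.year, one_month[column], marker="o", label=column)
    ax.set_title(title)
    ax.set_xlabel("Year")
    ax.legend()
    return one_month, ax


april, ax = plot_month(monthly, 4, "April averages (made-up values)")
```

Calling plot_month(monthly, 1, ...) would give January, plot_month(monthly, 12, ...) December, and a loop over range(1, 13) would cover the whole year.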
The visualizations for January to December are shown below.
From September to March there are large differences in apparent temperature but no change in humidity. From April to August there are only minor changes in temperature, again with no difference in humidity. In particular, November, December, and January show larger changes than the other months.
In this article, we learned how to analyze meteorological data using Matplotlib. In future articles, we will discuss meteorological data using seaborn, Plotly, and Tableau.