1. What dataset will you use for
your final report? (describe your dataset, include a link to it and
claim it at the URL above).
The dataset I have chosen for
my final assignment is from Stats Canada. I will be using information
collected on energy
consumption throughout Canada. This data was last modified on March 18,
2016, and is recorded every two years. For my assignment I will be looking at
the 2011 and 2013 data.
2. Describe the dataset. What
kind of data does it contain?
The dataset contains information
on household energy consumption by income across Canada and shows data for each
province. Energy is listed as total energy consumption and is also broken
down by type of energy: natural gas, electricity and heating oil. For each
income bracket there is a calculated percentage for each type of energy used.
All energy usage is measured in gigajoules
and is also listed as gigajoules per household.
The data also contains the number of
households. In some cases the number of households multiplied by gigajoules
per household does not equal total gigajoules, and that is because some
households use more than one type of energy.
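
The check described above can be sketched in a few lines of Python. This is only a minimal sketch: the field names (`households`, `gj_per_household`, `total_gj`), the 1% tolerance and the two sample rows are my own assumptions for illustration, not values or column names from the Stats Canada table.

```python
def implied_total_gj(households, gj_per_household):
    """Total energy implied by a household count and a per-household average."""
    return households * gj_per_household


def find_mismatches(rows, rel_tol=0.01):
    """Return the labels of rows whose reported total differs from the
    implied total by more than rel_tol (a fraction of the reported total)."""
    flagged = []
    for row in rows:
        implied = implied_total_gj(row["households"], row["gj_per_household"])
        if abs(implied - row["total_gj"]) > rel_tol * row["total_gj"]:
            flagged.append(row["label"])
    return flagged


# Hypothetical rows for illustration only, NOT figures from the dataset.
rows = [
    {"label": "row A", "households": 1000, "gj_per_household": 100.0, "total_gj": 100000.0},
    {"label": "row B", "households": 1000, "gj_per_household": 100.0, "total_gj": 120000.0},
]

print(find_mismatches(rows))  # prints ['row B']
```

Rows flagged this way would be the ones to look at more closely for households counted under more than one energy type.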
Data provided is for years 2011
and 2013.
3. Is there anything about your
data that you don’t understand? (i.e. what a column heading means). How
will you find this out?
The first thing I question about
this data is how heating oil and natural gas are measured in gigajoules. I will
look at the conversion rates of natural gas and heating oil to electric energy to
determine any potential inaccuracies in the recorded data.
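
To make the conversion question concrete, here is a minimal sketch of how different energy types can be put on the same gigajoule scale. Only the electricity factor is exact (1 kWh = 3.6 MJ). The natural gas and heating oil factors are rough, assumed energy contents that vary with fuel composition, so they would need to be replaced with whatever factors Stats Canada actually uses. The function and constant names are hypothetical.

```python
import math

# Exact by definition: 1 kWh = 3.6 MJ = 0.0036 GJ.
GJ_PER_KWH = 3.6 / 1000

# ASSUMED approximate energy contents, for illustration only.
GJ_PER_M3_NATURAL_GAS = 0.038      # roughly 38 MJ per cubic metre
GJ_PER_LITRE_HEATING_OIL = 0.0385  # roughly 38.5 MJ per litre

FACTORS = {
    "kWh": GJ_PER_KWH,
    "m3_natural_gas": GJ_PER_M3_NATURAL_GAS,
    "litre_heating_oil": GJ_PER_LITRE_HEATING_OIL,
}


def to_gigajoules(amount, unit):
    """Convert an amount in the given unit to gigajoules."""
    if unit not in FACTORS:
        raise ValueError(f"unknown unit: {unit}")
    return amount * FACTORS[unit]


# 1,000 kWh of electricity is 3.6 GJ.
assert math.isclose(to_gigajoules(1000, "kWh"), 3.6)
```

With the real factors in place, the same function could be used to compare the published gigajoule figures against volumes reported elsewhere.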
In some cases the data is listed
as F or E. According to the page I retrieved the data from, F means the data was
too unreliable to be published and E means that particular data should be used with
caution.
I plan to reach out to a contact
person with Stats Canada to find out what makes information too
unreliable to be published, as opposed to “Income not stated.”
4. What are some questions you
hope to answer with your data? List at least three. (you don’t need the answers at this
point)
The following are questions I plan to answer with
this dataset:
- How do energy costs scale with income? For example, what percentage of its income does a household earning $20,000 spend on energy compared to a household earning $150,000?
- Which province has the highest total energy consumption? Lowest? I would also like to show the highest/lowest for each energy type: natural gas, electricity and heating oil.
- How does 2011 compare to 2013?
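
The three questions above come down to a few simple calculations, sketched here. Everything in this block is an assumption made for illustration: the function names are hypothetical, the province labels are placeholders, and none of the numbers come from the dataset. Note also that the dataset reports consumption in gigajoules, so the first calculation would need energy price data from another source.

```python
def energy_share_of_income(energy_cost, income):
    """Percentage of household income spent on energy."""
    return 100.0 * energy_cost / income


def extremes(total_gj_by_province):
    """Return (highest, lowest) province by total energy consumption."""
    highest = max(total_gj_by_province, key=total_gj_by_province.get)
    lowest = min(total_gj_by_province, key=total_gj_by_province.get)
    return highest, lowest


def percent_change(old, new):
    """Percentage change from an earlier value to a later one."""
    return 100.0 * (new - old) / old


# Hypothetical figures, NOT taken from the Stats Canada table.
print(energy_share_of_income(2000, 20000))    # prints 10.0
print(energy_share_of_income(4500, 150000))   # prints 3.0

totals = {"Province A": 500.0, "Province B": 120.0, "Province C": 310.0}
print(extremes(totals))                       # prints ('Province A', 'Province B')

print(percent_change(100.0, 110.0))           # prints 10.0
```

The same `extremes` function could be run once per energy type, and `percent_change` once per province, to compare 2011 with 2013.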
I think that your narrowing of the data to two specific years is a great idea! I would be interested to hear what StatsCan has to say about the difference between "unreliable" and "Income not stated." My guess would be that unreliable was an answer that didn't seem plausible.
A question I would have is: What factors contribute to data change between the years?
Looking forward to your final project!
I think it's very interesting that for each income there's a calculated percentage for each type of energy used. I'm looking forward to your final project, because I think it would be very interesting to see how the energy costs scale with income.
As Heather mentioned, it's great to narrow the data to two specific years. As we discussed in class, Stats Can can be unreliable, so I look forward to your comments on the website.
Hi Mel, this is a very relevant and interesting data set to analyze! I like the questions you are asking of the data, and your results will be helpful in considering current and future energy consumption patterns for households, and costs.
As others have stated, I am curious to see what Stats Can will say about "unreliable data" - if you are able to contact them! Best of luck!
Excellent dataset to inquire about, and I love how you have a great sense to question the legitimacy of certain data components. In our household, utility bills can really fluctuate, but I think a lot of it could have to do with the type of energy consumption in different seasons.
Great questions! I feel like high income households would definitely use more energy in comparison to low income households, as the increased land would inevitably lead to increased numbers of TVs and other appliances. It will be interesting to see where the data does not support the hypothesis.
Hi! The topic is very interesting, and you explain the dataset very clearly. The results of your final report will be very meaningful, as they will help householders clearly understand their energy use.