Sunday, 16 October 2016

Data Update #1

1. What dataset will you use for your final report? (describe your dataset, include a link to it and claim it at the URL above).

The dataset I have chosen for my final assignment is from Stats Canada. I will be using information collected on energy consumption throughout Canada. The data was last modified on March 18, 2016, and is recorded every two years. For my assignment I will be looking at the 2011 and 2013 data.

2. Describe the dataset. What kind of data does it contain?

The dataset contains information on household energy consumption by income across Canada, with data for each province. Energy is listed as total energy consumption and is also broken down by type: natural gas, electricity and heating oil. For each income bracket there is a calculated percentage for each type of energy used.

All energy usage is measured in gigajoules and is further listed as gigajoules per household.

The data also contains the number of households. In some cases the number of households multiplied by gigajoules per household does not equal total gigajoules, because some households use more than one type of energy.
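To make the mismatch concrete, here is a small sketch of one way it could arise. It assumes the per-household figure for each energy type is averaged only over the households that use that type, which I have not confirmed with Stats Canada, and every number in it is made up for illustration.

```python
# All figures are hypothetical, NOT Statistics Canada values.
# Assumption: GJ per household is averaged over users of each energy type.

def implied_total_gj(households, gj_per_household):
    """Total gigajoules implied by a household count and a per-household average."""
    return households * gj_per_household

# energy type -> (households using it, average GJ per using household)
by_type = {
    "electricity": (1000, 40.0),
    "natural gas": (600, 80.0),
    "heating oil": (100, 70.0),
}
total_households = 1000  # in this toy example every household uses electricity

# Summing type by type gives the total energy used.
total_gj = sum(implied_total_gj(n, gj) for n, gj in by_type.values())

# Households are counted once per energy type they use, so this exceeds 1000.
households_counted = sum(n for n, _ in by_type.values())

overall_gj_per_household = total_gj / total_households

print(total_gj, households_counted, overall_gj_per_household)
```

Because a household that uses two energy types is counted under both, the per-type household counts add up to more than the real number of households, so the rows cannot simply be multiplied and added without checking how each average was defined.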

Data provided is for years 2011 and 2013.

3. Is there anything about your data that you don’t understand? (i.e. what a column heading means). How will you find this out?

The first thing I question about this data is how heating oil and natural gas are measured in gigajoules. I will look at the conversion rates of natural gas and heating oil to electric energy to determine any potential inaccuracies in the recorded data.
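As a rough sketch of how such a conversion works: each fuel has an energy content per unit, and multiplying by the quantity gives gigajoules. The kilowatt-hour factor is exact (1 kWh = 3.6 megajoules). The natural gas and heating oil factors below are approximate, illustrative values that I am assuming for the example; they are not necessarily the factors Stats Canada used, and the quantities are made up.

```python
# Energy content per unit, in gigajoules.
GJ_PER_KWH = 0.0036                # exact: 1 kWh = 3.6 MJ
GJ_PER_M3_NATURAL_GAS = 0.038      # approximate assumption; varies with gas composition
GJ_PER_LITRE_HEATING_OIL = 0.0385  # approximate assumption

def to_gigajoules(quantity, gj_per_unit):
    """Convert a quantity of fuel or electricity to gigajoules."""
    return quantity * gj_per_unit

# Hypothetical annual household usage, for illustration only.
electricity_gj = to_gigajoules(10_000, GJ_PER_KWH)             # 10,000 kWh
natural_gas_gj = to_gigajoules(2_000, GJ_PER_M3_NATURAL_GAS)   # 2,000 cubic metres
heating_oil_gj = to_gigajoules(1_500, GJ_PER_LITRE_HEATING_OIL)  # 1,500 litres

print(round(electricity_gj, 2), round(natural_gas_gj, 2), round(heating_oil_gj, 2))
```

Putting all three energy types in the same unit is what makes them comparable, so any inaccuracy in the conversion factors would carry through to the totals.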

In some cases the data is listed as F or E. According to the page I retrieved the data from, F means the data was too unreliable to be published and E means that particular value should be used with caution.

I plan to reach out to a contact person at Stats Canada to find out what makes data too unreliable to be published, as opposed to “Income not stated.”

4. What are some questions you hope to answer with your data? List at least three. (you don’t need the answers at this point)

The following are questions I plan to answer with this dataset:
  1. How do energy costs scale with income? For example: what percentage of income does a household earning $20,000 spend on energy compared to a household earning $150,000?
  2. Which province has the highest total energy consumption? The lowest? I would also like to show the highest and lowest for each energy type: natural gas, electricity and heating oil.
  3. How does 2011 compare to 2013?
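The first and third questions come down to simple arithmetic, sketched below. Since the dataset reports consumption in gigajoules rather than dollars, the first question needs a price per gigajoule; the price and the usage figures here are hypothetical placeholders, not values from the dataset.

```python
def energy_share_of_income(gj_used, price_per_gj, income):
    """Percentage of household income spent on energy."""
    return 100.0 * gj_used * price_per_gj / income

def percent_change(old, new):
    """Percentage change from an earlier value to a later one."""
    return 100.0 * (new - old) / old

PRICE_PER_GJ = 25.0  # hypothetical price in dollars per gigajoule

# Hypothetical: both households use 100 GJ per year.
low_income_share = energy_share_of_income(100.0, PRICE_PER_GJ, 20_000)
high_income_share = energy_share_of_income(100.0, PRICE_PER_GJ, 150_000)

# Hypothetical: consumption of 100 GJ in 2011 and 95 GJ in 2013.
change_2011_to_2013 = percent_change(100.0, 95.0)

print(low_income_share, round(high_income_share, 2), change_2011_to_2013)
```

With identical usage, the lower-income household spends a much larger share of its income on energy, which is the kind of pattern the real data can confirm or contradict.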

7 comments:

  1. I think that your narrowing of the data to two specific years is a great idea! I would be interested to hear what StatsCan has to say about the difference between "unreliable" and "Income not stated"; my guess would be that unreliable was an answer that didn't seem plausible.
    A question I would have is: What factors contribute to data change between the years?
    Looking forward to your final project!

  2. I think it's very interesting that for each income there's a calculated percentage for each type of energy used. I'm looking forward to your final project, because I think it would be very interesting to see how the energy costs scale with income.

  3. As Heather mentioned, it's great to narrow the data to two specific years. As we discussed in class, Stats Can can be unreliable, so I look forward to your comments on the website.

  4. Hi Mel, this is a very relevant and interesting data set to analyze! I like the questions you are asking of the data, and your results will be helpful in considering current and future household energy consumption patterns and costs.
    As others have stated, I am curious to see what Stats Can will say about "unreliable data" - if you are able to contact them! Best of luck!

  5. Excellent dataset to inquire about, and I love how you have a great sense to question the legitimacy of certain data components. In our household, utility bills can really fluctuate, but I think a lot of it could have to do with the type of energy consumption in different seasons.

  6. Great questions! I feel like high income households would definitely use more energy in comparison to low income households, as the increased land would inevitably lead to increased numbers of TVs and other appliances. It will be interesting to see where the data does not match the hypothesis.

  7. Hi! The topic is very interesting, and you explain the dataset very clearly. The results of your final report will be very meaningful; they will help householders clearly understand their energy use.
