Goals:
Results:
In this dataset, I found that gender has only a modest influence on having a healthy BMI: 27.22% of men are considered to have a “healthy” BMI, compared to 28.09% of women, a difference of 0.87 percentage points.
Learning Lessons:
Overall, I wasn’t too surprised to discover that women were considered “healthier” than men in US society. What surprised me the most was how hard the data was to navigate. While completing this assignment, I realized how easily data can be misunderstood. For instance, I tried running a for loop on the ESTIMATE column, but it wouldn’t run all the way through because of a data issue I couldn’t identify. It also took me a long time to understand what the data was trying to measure. The data website gave only a vague description of the kind and size of the population that the dataset covers. It took a series of trial-and-error calculations to find a meaningful statistic. Having the skills to track down dataset errors is crucial, as I imagine there will be plenty of instances in my future career where a dataset cannot be processed automatically because of dirty data. As I encounter more datasets, the statistic that data scientists spend 60% of their time cleaning dirty data makes more and more sense to me.
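The loop problem described above can be illustrated with a minimal Python sketch. It assumes the issue was blank or non-numeric entries in the ESTIMATE column, which is only a guess, since the actual cause was never identified. The sample rows, the GROUP column, and the `parse_estimates` helper are hypothetical and are not taken from the CDC file.

```python
import csv
import io

# Made-up sample rows for illustration only; the real CDC file has a
# different layout and different values.
SAMPLE_CSV = """GROUP,ESTIMATE
A,10.5
B,20.0
C,
D,*
"""


def parse_estimates(text):
    """Split rows into numeric ESTIMATE values and rows that could not be parsed."""
    values = []
    skipped = []
    for row in csv.DictReader(io.StringIO(text)):
        raw = (row.get("ESTIMATE") or "").strip()
        try:
            values.append(float(raw))
        except ValueError:
            # Blank cells and symbols such as "*" land here instead of
            # stopping the whole loop.
            skipped.append(row)
    return values, skipped


values, skipped = parse_estimates(SAMPLE_CSV)
print("numeric values:", values)
print("rows skipped:", len(skipped))
```

Keeping the skipped rows, rather than silently dropping them, makes it possible to inspect afterwards what the dirty entries actually look like.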
Filezilla URL:
https://misdemo.temple.edu/tup16404/PROPOINTS/2402Project.html
JSON URL:
https://data.cdc.gov/api/views/3nzu-udr9/rows.json?accessType=DOWNLOAD
CSV: