-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the assignment and an answer sheet to submit (in Word format).
Here is the data file you’ll need [Groceries.csv].
Another note: Make sure you’ve included ALL the attachments (check the assignment ins […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the exercise.
And here are the supporting files. Remember, download them to your computer by right-clicking and selecting Save As…
The R script you’ll need: aRules.r
The data file you’ll need: Bank.csv
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the exercise.
And here is the answer key.
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Leave your response as a comment on this post by the beginning of class on April 18, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your o […]
-
The most important takeaway is the mindset used in data analytics. I learned that I needed a logical understanding of how a system works, so that I could wrap my head around all the concepts. This covers learning the programs needed in class, relating content to your life and adopting new ideas in general. I realized the mindset’s importance recently because of the internships I’ve been communicating with. Every internship focuses on a logical and analytical mindset to achieve their goals and learn new jobs.
-
One of the significant concepts that I have taken away from this Data Analytics course is understanding computer language and the benefits of tools such as SQL and R Studio. As a Strategic Communications major, it was really interesting learning different commands for SQL and leveraging them to extract information from different databases. Initially, I thought some of the concepts were tedious, specifically for the assignments; however, I have noticed an improvement in my problem-solving skills after having to reconfigure and readjust my approaches to the assignments. In a job interview, I would explain what I’ve learned by discussing the programs and the benefits that they have in helping businesses make decisions using the data they already have.
-
The most important takeaway that I learned from this class was how to utilize software like MySQL and R Studio, along with the architecture of an organization and how data travels through it. MySQL and R Studio are both software that I’m sure I will see in the future, so it was definitely important to get some background knowledge about them and how to use them. Also, learning about how data is transferred from the beginning stages of data entry to data analysis was helpful because I’m interested in either the data analyst or system analyst field. I would tell a future employer that I learned how to create data models, transactional databases, and analytical databases along with utilizing software such as MySQL and R Studio to create and browse databases.
-
The most important takeaway that I took from this class is the experience of learning and using SQL and RStudio. The way I would explain to my employer why having this knowledge is important is by explaining that I can take cluttered information and make it into information that is more beneficial for the company. As a Marketing major, I am able to use the information that is given to market directly towards customers within a demographic area, or even by the products they have purchased. Retaining this knowledge will definitely help me in the long run.
-
The most important takeaway from this course is my new understanding of the power of data. Not only was I able to gain familiarity with several tools for data analytics, I was also able to gain an appreciation and an interest for how data analysis can benefit any company or organization. I would explain this to an employer by giving examples of certain assignments that I’ve had, and give details about skills I’ve learned in SQL and R. In addition, I would just explain my general understanding about the role data analysis should play in the company I’m interviewing with.
-
The most important takeaway that I took from this class is the importance that data analytics plays in all parts of our lives. From supermarkets to retail shopping, from large corporations to small businesses, from in-store retailers to online purchases, data analysis is used to make everyday business decisions that could potentially increase revenues. Taking a look at the statistical importance of different business decisions using RStudio has shown me firsthand why our particular skill set as MIS majors is so in demand.
-
Before I attended the first class, or even before I chose the MIS major, I never really knew what data was about. So for me, the most essential takeaway from this Data Analytics course was learning the importance of big data. This course guided me on the path to think more critically and analytically when approaching a problem. To my future employer at a job interview, I would explain that I’ve learned how to seek relevant information to solve business problems by manipulating data using software such as MySQL, R Studio, and Excel.
-
The two biggest takeaways that I have from this course are 1) a preliminary understanding of the data analysis workflow/cycle and 2) my understanding of the “data-speak” behind basic analytical tools such as R and SQL. While not an MIS major/minor, I am glad I took this class because so much is rooted in data now. I hope to work in government and policy and with the civic tech movement, and I think that being able to talk in terms of data analysis will be helpful in interviews. It will set me apart as someone versed not only in policy-making but also in data-driven policy-making.
-
This class really showed me how data works and how I can use it for my benefit. Understanding the different types of data, and how it is transformed into historical data through the ETL process, was very beneficial. It is also important to differentiate among data to find what is most relevant. Something that I enjoyed using was the SQL database, in which data relevancy was important to create a good database with efficient results. Next, RStudio was a great tool to use when browsing through large databases. RStudio can save us a lot of time, so that alone can solve many issues if we are pressed for time. Finally, there is data visualization, which I believe is one of the most important elements because it is all about presenting our data. If it isn’t done clearly or in the right way, our work might just mean nothing to others. I would use SQL, RStudio, and data visualization to explain what I learned to my future employer.
-
The most important things that I would take away from this class would be my knowledge from using different analytical tools such as SQL, R and RStudio. I think that these particular things were most interesting and opened my mind to new and exciting technologies. In a job interview I would explain to my employer that I am proficient in using SQL, both pulling data out of a database and putting data into the database. I would also explain how I was familiar with RStudio as well as analyzing the data from the decision trees and clustering graphs and charts. Learning more about Excel will also better prepare me for my future job as an MIS major no matter what job I will have in the future. Everything that I’ve learned in Data Analytics this semester was new and interesting and I can’t wait to use this knowledge and apply it to future job positions.
-
Everything in this course was important because everything links together. I believe that learning the feel of some of the software we used was a good way to understand how this software works. MySQL, R Studio and Excel are among the many ways we can manage data, which is very important.
I learned in my data analytics class that we have different types of data and different types of software that help us find the data we need. I also learned how we can make better decisions using data, and how we can put data into graphs so that any random person can look at a graph and understand what is on it. That is how we are able to connect both worlds of IT and business together!
-
The most important aspect of the course was learning how to adjust to information technology as a language, and being able to apply the language to a business situation. For example, using MySQL and RStudio took a degree of adjusting to technology language, but once I understood it, I was able to tell a story with the data. If an employer asked what I have learned from this course, I would simply say I can understand, interpret, and communicate information technology language much more fluently now.
-
The most important takeaway from this course for me was through the use of Excel and SQL. These tools will come in handy when your boss gives you a spreadsheet filled with a million rows of data. I would explain to my potential employer that our Excel and SQL assignments simulated questions that we may come across while working in the field such as which customers spent the highest amount of money in a given month or year? Through these tools, we can analyze and make sense of collected data we are given.
-
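The kind of question mentioned in the comment above ("which customers spent the highest amount of money in a given month?") can be sketched as a grouped SQL query. This is only a minimal illustration, not one of the course assignments: the `orders` table, its columns, and the sample rows are all hypothetical, and it uses Python's built-in `sqlite3` module rather than the MySQL setup used in class, so the date function (`strftime`) is SQLite-specific.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL)"
    )
    # Made-up sample rows, for illustration only.
    rows = [
        ("Ana", "2016-03-02", 40.0),
        ("Ana", "2016-03-15", 35.0),
        ("Ben", "2016-03-09", 60.0),
        ("Ben", "2016-04-01", 500.0),
        ("Cal", "2016-03-20", 10.0),
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def top_spenders(conn, month, limit=1):
    """Return (customer, total) pairs for one 'YYYY-MM' month, largest first."""
    query = """
        SELECT customer, SUM(amount) AS total_spent
        FROM orders
        WHERE strftime('%Y-%m', order_date) = ?
        GROUP BY customer
        ORDER BY total_spent DESC
        LIMIT ?
    """
    return conn.execute(query, (month, limit)).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(top_spenders(demo, "2016-03", limit=2))
```

The same idea in MySQL would keep the `GROUP BY` / `ORDER BY ... DESC` / `LIMIT` structure and swap the month filter for a MySQL date function.
-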
The most important takeaway for me is learning how to code in SQL. As I have found out, SQL is used everywhere and by putting that on my resume, I can stand out from the crowd. I would explain to a future employer that I am proficient in writing SQL code and can do basic functions such as selecting specific data from a database as well as modifying database tables.
-
I feel that the most important takeaway from this class was actually learning to work with data! In MIS 2101, when the Data Analytics Challenge became a requirement, I felt entirely overwhelmed by it because I did not know there were accessible tools that could help me interpret the data. Now that I’ve been introduced to tools like MySQL and RStudio, I feel that working with data has not only become much easier, but it has become interesting as well.
In a job interview, I would explain the work I’ve done with data analytics in those particular programs. The experience would definitely look good and they would know that I’m someone who would be able to handle a company’s data!
-
Among all the MIS classes that I have taken, this is by far the most interesting and my personal favorite. I like how I was taught the various aspects of data presentation and visualization and what goes on before and after data is processed into meaningful information. Software is usually tough to grasp (as observed in the MIS3501 classes); however, I believe that the flow and pace of this class helped in understanding the fundamentals of data and eventually working with high-end software (such as MySQL and R Studio) and seeing the possibilities of creating meaningful information.
-
After taking this course, my ability to understand and interpret the way data and information (as well as the difference between the two) is created and made, has increased. After gaining experience with SQL, it is a lot easier to grasp the concepts of creating tables, retrieving and editing data – I also learned the importance of doing so in order to make more efficient decisions. Data and information are key elements in today’s world and society let alone a staple of the business world (every department can benefit from the better decision-making that efficient displays of data can provide us with).
-
The most important takeaway for me in this course would be having gained the overall knowledge of how data analytics works. I’ve learned the various ways to extract meaningful data from databases and use that to find out specific information about a problem or situation. I would explain to a future employer that I now have the skills to use different systems, such as MySQL and R Studio, to extract meaningful data that would ultimately help out the company. I would provide specific examples of the in-class exercises and assignments that we have done to further explain what exactly I was able to do to analyze and extract different information.
-
The most important takeaway for me from this course was learning how to use and interpret MySQL and RStudio. Another important skill that I learned was how to analyze data through the use of pivot tables created in Excel. I learned a little bit about pivot tables in the Introduction to Microsoft Excel course, but I feel that I was able to understand it more from examples that we completed in class. I would explain to a future employer that I am proficient in both MySQL and RStudio, and able to analyze large data and translate it into something that the average person would understand.
-
I learned a whole lot in this course. The greatest benefit in my eyes was learning the basics of “coding” and how to use multiple different database systems. Being able to discuss both SQL and R should greatly benefit me in a future interview. I have not worked with PHP, HTML, Java or other programming languages yet, but I am relatively certain that this class has properly prepared me for when I do need to learn them.
-
The most important thing that I got out of this class was that working with data is fun. Whether it is working with SQL, creating visualizations, or using pivot tables, you can uncover so many things from data. Tools such as R and SQL will definitely be tools I will work with a lot in the future. Data analytics will be important for companies to gain an advantage in the future.
-
One of the biggest takeaways this semester is learning the power of analyzing data. I appreciate the exercises we used in Excel because Excel is used throughout the business world. The company I will be interning at this summer uses SQL, so learning the basics of SQL in this class gives me a solid foundation in the language and will prepare me for the type of work I will be doing this summer.
-
The takeaways from this course I find most important are Excel and SQL. During the internship interviews I had this semester, every firm asked me about my Excel proficiency. Learning the framework of SQL in class, then furthering my learning independently, did help me in my interviews. Having a solid base of Excel and SQL allowed me to accept my internship at TD Bank this summer doing IT Audit.
-
The biggest takeaway from this course for me was learning just how many ways data analytics can improve decision making on a daily basis. This class, along with learning about data analytics in 2101, was what ultimately inspired me to add a double major in MIS. The most crucial technical skills that I’ve learned and will be able to take advantage of in my future career are how to utilize MySQL Workbench in order to create, modify and query databases, as well as how to utilize R Studio to determine relationships and patterns in the data. I have also learned how to utilize Excel even more efficiently, by being able to create pivot tables and augment data variables to see the exact correlations that I am interested in seeing. In addition to learning how to read and get information, this course has also taught me how to better display and present information to a co-worker, superior, or potential client through using proper data visualization techniques. This knowledge is also completely relevant for Marketing purposes, and therefore I will be able to use my combined knowledge in both fields to optimize my effectiveness when working in a future potential internship and career.
-
The most important takeaway from the course for me was the use of SQL and Excel: extracting information, manipulating data, and updating databases with functions that other users often have no familiarity with. I would tell an employer that I have gained a coveted set of skills working with big data through course assignments. This set of skills has better prepared me for working in an analytical role with big data and making decisions based on the information found.
-
When looking back at the class I think I value a number of things that I learned. However, more than anything else, I found learning how to manipulate and read data in multiple different ways takes the highest level of importance. Now that I know how to input and output data in more than one way, I have no reason not to believe that this class could possibly be the reason why I land a future internship or job position.
-
The most important takeaway from this class is learning how to utilize data and make it into something that we can use to predict things, and being able to manage tons and tons of data and turn that data into information. I would tell an employer that I learned how to handle and manipulate data and turn it into something useful that can be used for the benefit of the company.
-
The most important takeaway from this course is understanding well the relation between data and information. The tools we practiced using, like MySQL, pivot tables, the R language and basic ERD diagrams, are all designed for easy and fast data processing. But all of these tools depend on understanding the material and translating the information into simple commands for the tools. You need to read the instructions, practice more and write basic commands to get correct answers. If I am in a job interview in the future, I would tell my employers about how I can quickly learn, understand and use these data processing applications in different working environments.
-
This course has given me great insight into how data analytics is used in business and other sectors. A main takeaway from this course is the skill set needed to utilize MySQL and R Studio. These programs are very helpful in managing and analyzing data for a wide range of uses. I feel that I have more expertise that I can add to my knowledge bank for my career.
-
The biggest takeaway from this class for me was learning various ways of analyzing data for different purposes. This allows me to see the data from a different perspective and gives me an idea of how big data is analyzed in a systematic way. Also, the knowledge that I learned from this class will help my professional career because it enhanced my computer skills. And I believe that knowing how to use RStudio and MySQL can help my career in the insurance industry because the insurance industry relies heavily on the use of big data.
-
I most enjoyed learning how to use the “hot” technology that all companies use to sort, analyze, and visualize their data, specifically: SQL and R. Both of these new skills I’ve developed will help me succeed in my internship role this summer, and in upcoming employment opportunities. I would explain to an employer in an interview that this class gave me the tools needed to better understand how data works, and the inspiration it gave me to learn more about all aspects of “big data.”
-
Everything that I learned in this class was valuable, from SQL to analyzing data in R. What we learned in SQL really helped me with my other MIS courses that also used SQL. My favorite part of the class was using RStudio to analyze statistical data. I especially liked the clustering assignment because it produced various visuals to analyze the data. I also enjoyed learning data visualization principles. We see manipulation of data in graphics all the time for various companies such as Verizon, which markets itself by using a map to show Verizon users. I would tell a future employer that I have experience analyzing data in RStudio and manipulating databases in SQL.
-
Learning SQL and R was very important in this class as it will be very beneficial in the future to not only have experience with these specific systems, but also to have experience learning new systems in general. Use of these systems helped with data visualization which brought the course together from what we learned in the beginning until now. Being able to analyze big data with these systems will be helpful in the future working in the IT field.
-
For me, the most important takeaway from Data Analytics would be learning SQL, pivot tables in Excel and R/RStudio. The tech industry is getting bigger and bigger and there is an explosion of the amount of data that is stored in servers all over the world. Companies need the people to fill in those high demand jobs and knowledge in database management or data software is a huge plus in this field. Knowing how to sift through piles of data is huge for anyone looking to land a tech job either in the US or looking to work all over the world. Many of these database management jobs can be done remotely which gives you the option to travel around the world while you work on your laptop. No matter what, data is huge everywhere.
I would explain that because of my knowledge of SQL, pivot tables in Excel and R/RStudio, I am able to organize massive amounts of data for a company, which can help its business and financial decisions. In SQL I’ve learned how data can be read, written and manipulated. Tables can be generated, dropped, changed and altered easily through commands. I’ve learned about the table structures and how the databases are contained on servers which can be accessed remotely from anywhere in the world. MySQL (along with other DBMSs) is crucial to data storage because of the exponentially increasing amount of data. With pivot tables in Excel, I am able to extract, transform and load differently formatted data into easier-to-read tables to generate critical information faster and more easily. Data can be extracted in Excel, similar to SQL, for easier interpretation and better business decisions in the future. Excel can be used in conjunction with SQL for further data analysis. R and RStudio are great tools for data modeling which can be used to get into the nitty-gritty details of your data. With R and RStudio, data can be observed with varying levels of granularity, which is important if you want to find the intricate details of the data.
-
Personally, I believe the most important takeaway from this course is understanding how data can be transformed and analyzed to help solve everyday business problems in short- and long-term scenarios. Learning to use different systems like SQL, Excel, and R/RStudio opened my eyes and expanded my knowledge of how easily data can be recorded and retrieved for business purposes. Gaining experience with these different software tools prepared me for real-life business situations where I can use data to my advantage.
-
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the assignment and an answer sheet to submit (in Word format).
Here is the data file you’ll need [Jeans.csv].
Another note: Make sure you’ve included ALL the attachments (check the assignment ins […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here’s the class capture for 3/30 – Decision Trees.
CC 33016
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the assignment. It is optional!
If you are an MIS major, you will receive extra credit and Professional Achievement Points for a successfully completed assignment.
Even if you are not an MIS major, […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the exercise.
And here are the supporting files. Remember, download them to your computer by right-clicking and selecting Save As…
The R script you’ll need: Clustering.r
The data file you’ll need: […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Leave your response as a comment on this post by the beginning of class on April 11, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your op […]
-
2) I would advise choosing predictor variables that have the most impact. These include income, age, whether or not the person has children, mortgage, and others of the same caliber.
-
A business question regarding your own career that you could answer using a decision tree could be what university you’re going to attend. The data needed to be collected would include if you want to attend college in the first place, where the university is, how much money you’re willing to spend, and what major you’re interested in. This would be a somewhat lengthy decision tree needing a lot of information, but it could be done.
-
1. As a general example, I can imagine a company using a decision tree to answer the question of, “Should we buy or build our technology?” They’d need to collect data such as: unique or common situation, ability to build vs. not able to, cost to maintain vs. cost to buy, etc. This is one small way to demonstrate the many uses of a decision tree.
-
A business question that could be answered by using a decision tree is “Is the job candidate qualified for the available position?” The data that would need to be collected in this type of decision tree would be whether the candidate attended college. If yes, then what degree. Some other data that would need to be collected is whether the candidate has prior experience. If yes, then how many years of experience, etc. It would really depend on what type of position it was to determine what data needed to be collected. If it were a tech consultant job, data that could be collected is whether the candidate had customer service experience.
-
A business question such as whether or not a company should invest in a new technology can be answered with the help of a decision tree. The data that would need to be collected would be the cost of the technology, benefits, the amount of time it would take to implement, and how much capital the company has. With the analysis of the collected data, a company should have a general idea of whether or not they should invest in a new technology.
-
Colleges could use decision trees to take a better look at applicants for upcoming school years. The data could include GPA, SAT and ACT scores, gender, ethnicity, family income etc. This could be used to determine who of the applicants to accept and who to reject.
-
What advice would you give someone regarding how to select the right predictor variables for a decision tree analysis?
I would advise them to break down the problem description clearly, so that they can find their predictor variables easily.
-
A retail company could use decision trees for determining product roll-out strategies. Data which would be included in the tree is previous sales data for similar product categories, geographical and demographic data for the potential new regions in which a product would be rolled out, and previously existing customer data that the company has, in order to forecast sales in a specific region.
-
1) Name and describe a business question that you could answer using a decision tree. What data would you collect to perform the analysis? Don’t use an example we’ve covered in class.
A business question that could be answered using a decision tree is whether or not someone should be accepted for auto insurance in a certain group. The tree can use age, location, income, and other attributes to place the person into a certain insurance class. If they do not get accepted into a certain class, they will then have to go to the government insurance pool, which can be very expensive.
-
A business question that could be answered using a decision tree is whether a company should merge with another company. The aspects that we could analyze using a decision tree would be revenue per quarter, number of employees, number of warehouses, number of products produced, and the cost per product output. Other aspects that could be analyzed that are not financial or number based are the culture of the company and popularity of the brand.
-
One decision that could be made using decision trees is whether or not a restaurant chain should open in a new location. The manager will be able to analyze the data to determine if the company has the capital available in order to jump start this new restaurant location, and whether or not that particular location would be conducive to generating profit. Data gathered for the decision tree should include company history, customer history, information about the new location (such as traffic patterns, area demographics, crime rates, etc.), and information about the current financial state of the company.
-
1) A possible real-world problem that could leverage the power of decision trees would be predicting the outcomes of high-risk youth in inner city schools. I know this isn’t necessarily business-related but I think that much of what we learn in this class can be applied to areas outside of business, which is why I like this class. For this problem, you could utilize data on students such as test scores, suspension records, attendance, retention, and other metrics in order to develop a decision tree to predict their likelihood to graduate, their likelihood to get involved with the juvenile justice system, and other possible outcomes for inner city youth. Maybe we already do this, but if we don’t, I think this data would be incredibly useful in helping identify students who need additional supports ahead of time.
-
Business question: what will be my expected return on investment in the derivatives market in a specific position and economic condition?
Variables needed: volatility, time, interest rate, dividend yield (y/n), market price, strike price, probability of price up-move, probability of price down-move.
This situation can be examined in multi-period or single-period binomial trees that predict the value of a derivative at a specific time in the future given the above variables. With this information, a trader can then price the derivative accordingly and determine the probability of his/her position/portfolio being profitable.
-
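As a rough illustration of the binomial-tree idea in the comment above, here is a minimal sketch that prices a European call option. It makes several assumptions that are not in the comment: it uses the common Cox-Ross-Rubinstein parameterization, it derives the up-move probability from the interest rate and volatility (a risk-neutral probability) instead of taking it as an input, and it treats the dividend yield as a continuous rate rather than a yes/no flag. The function name and parameters are made up for the example.

```python
import math


def binomial_call_price(spot, strike, rate, volatility, years, steps,
                        dividend_yield=0.0):
    """Price a European call on a recombining binomial tree (CRR sketch)."""
    dt = years / steps
    up = math.exp(volatility * math.sqrt(dt))   # size of an up-move
    down = 1.0 / up                             # size of a down-move
    # Risk-neutral probability of an up-move over one step.
    p_up = (math.exp((rate - dividend_yield) * dt) - down) / (up - down)
    discount = math.exp(-rate * dt)

    # Option payoffs at expiry; index j counts the number of up-moves.
    values = [
        max(spot * up ** j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]

    # Step backward through the tree, discounting the expected value.
    for _ in range(steps):
        values = [
            discount * (p_up * values[j + 1] + (1.0 - p_up) * values[j])
            for j in range(len(values) - 1)
        ]
    return values[0]


if __name__ == "__main__":
    # Hypothetical inputs: stock at 100, strike 100, 5% rate, 20% volatility.
    print(binomial_call_price(100.0, 100.0, 0.05, 0.20, 1.0, steps=500))
```

Setting `steps=1` gives the single-period tree; larger values give the multi-period version, and the result settles toward a stable price as the number of steps grows.
-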
A business question that could be answered using a decision tree would be the quantity of products that a store sells. A store like Dick’s Sporting Goods, which sells a variety of products, could use a decision tree to break down the previous year’s sales and determine which products are most popular based on gender, color, and time of year. For example, a pair of Curry or Lebron basketball shoes would probably sell more than a certain toddler shoe since there’s more of a market for Curry’s or Lebron’s, so Dick’s would be smart to have more of them on hand than toddler shoes.
-
The best advice I could think of is to first clearly understand and think through what you want to analyze. Then write down all of the variables that might relate to what you are trying to analyze. After that, you start to choose the most important and most relevant variables and get rid of all others. Variables that don’t have a significant impact on your analysis are probably unnecessary, meaning that they shouldn’t be used.
-
After doing research, I realized that lots of financial institutions and companies in various industries use R to carry out functions and produce various data related to their work and research. Examples include finding standard deviations, averages, maximum and minimum returns, and even various qualitative information as well. This kind of work will also include decision trees, which help in various decision-making processes. One example would be: what size of investment is the company capable of making (big or small)? And according to that, what kind of investments should we look into?
-
A business question that you could answer using a decision tree would be what type of car someone might buy. Car dealers would be able to analyze a person’s age, gender, marital status, and income level to make this prediction through the use of a decision tree. For example, a married man who has three children would probably be more interested in a safe, child-friendly SUV than a sporty, smaller car.
-
An example of a business question that could be answered using R and decision trees is whether or not to run a particular ad campaign. The data to collect could include variables such as whether an ad campaign is needed, the cost of the campaign vs. the company’s advertising budget, the time it would take to create the ad vs. when the company needs it to air, and the projected sales impact of running the ad. You could also use a decision tree to filter which consumers fit the company’s target market model in order to determine where to focus primary advertising efforts.
-
1) An auto insurance company could use decision trees to predict how likely it is that someone will get into an accident based on age, gender, whether they live in a rural or urban environment, and much more. This information could help the company set rates based on that risk.
-
A great business question that could be answered using decision tree analysis would be insurance companies deciding on who to insure or not and what prices to charge which individuals. Decision trees provide accurate percentages and outcomes which can be a great tool for calculating risk. Also, risk managers could use these percentages to correlate the prices that they should charge their customers based on the amount of risk that is involved in insuring the individuals.
-
A great business question would be, how many of our customers in our loyalty program received food assistance from the government? What are the three main products customers purchase the most and what promotional deals can we offer to help support them? The best data that should be collected to perform the analysis is: age, gender, residence, family size, income, and frequently purchased items. To help an employee determine the most important variables, I would encourage them to think about other variables that would influence a customer receiving government assistance: income, family size and gender are all important variables in determining this.
-
When choosing variables for a decision tree analysis, what really matters is knowing whether those variables would affect the decision tree in any way. Some variables might have no effect on the tree, so those variables are useless to it. The variables that do have an effect must be included for more accurate decision making. We always want a better and more accurate analysis.
-
1) A business question that could be answered using decision trees is which salespeople a company should keep, fire, and promote. The data collected should include sales rate, average sales, average items per order, and other measures of productivity. The company can also take office ranking and sales activity throughout the year into account.
-
A business question that we can answer using a decision tree is whether we want to continue a product line. We can base it on sales, whether it is profitable, the average order, how many were bought this year, and who bought it.
-
A business question for colleges/universities that could be answered using a decision tree could be “Which prospective students should receive a scholarship?” The data that would be collected to perform the analysis would include (but not be limited to): ethnicity, gender, family income, age, declared/undeclared major, high school GPA, and SAT score. Analyzing this data would determine whether the candidate gets accepted or not.
-
My business question that can be answered using a decision tree would be whether or not a designer should release a new piece to their collection, such as sunglasses, wallets, leather goods, etc. They would want to examine the age of potential customers, the income of those customers, gender, and loyalty to the designer (whether they are a repeat customer or not).
-
What advice would you give someone regarding how to select the right predictor variables for a decision tree analysis?
When it comes to the predictor variables, you want variables that will accurately help provide outcomes. With that being said, when the decision tree is made it is crucial that the predictor variables play a part in the decision making process to determine the probabilities. A predictor variable should only be chosen if it can aid in determining a probability.
-
An example of a business question that could be answered using R and decision trees is whether or not a new computer would be popular. The data needed to perform the analysis might be: size, quality, shape, price, weight, function, memory storage capacity, speed, etc. There are so many good predictors that we can use to determine the outcome, popular or not popular.
-
My advice to those determining which predictor values to use for a decision tree would be to choose the variables that greatly divide the data first. You would then incrementally choose lesser differentiators until you reach the leaf node values. The order of the predictor values would be different depending on what you are trying to decide.
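One way to make "variables that greatly divide the data" concrete is information gain: how much knowing a predictor reduces uncertainty about the outcome. The following is only a minimal sketch in Python (the course itself uses R); the tiny loan dataset, the column names, and the helper functions are all invented for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, predictor, outcome):
    """Reduction in outcome entropy after splitting rows on one predictor."""
    base = entropy([r[outcome] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[predictor], []).append(r[outcome])
    # Weighted average entropy of the groups produced by the split
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def rank_predictors(rows, predictors, outcome):
    """Order predictors from the one that divides the data most to least."""
    return sorted(predictors,
                  key=lambda p: information_gain(rows, p, outcome),
                  reverse=True)

# Hypothetical data: income separates the outcome perfectly, gender not at all
rows = [
    {"income": "high", "gender": "F", "default": "no"},
    {"income": "high", "gender": "M", "default": "no"},
    {"income": "low", "gender": "F", "default": "yes"},
    {"income": "low", "gender": "M", "default": "yes"},
]

ranking = rank_predictors(rows, ["gender", "income"], "default")
```

In this made-up data, income would be the first split and gender would add nothing, which mirrors the idea of choosing the strongest differentiator first and lesser ones afterward.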
-
Name and describe a business question that you could answer using a decision tree. What data would you collect to perform the analysis?
My example is whether or not a company should use a new application or system for internal management. The analysis would test compatibility with devices (like laptops, smartphones, computers and multimedia), price (for installation and permissions), privacy and security (safe = additional firewall, special keychain or authorization), complexity (easy or hard to use), and range (limitations).
-
1.) A business decision that could be made using a decision tree is whether to stay in your current role at your place of employment or explore other career opportunities. Data that can be collected to make this decision includes income, daily/weekly hours, culture, benefits, and location.
-
In order to select predictor values in a decision tree it is easiest for me to work backwards. Think about the end goal. What is it that you want to find? From there, you can then backtrack in the data. How will your desired outcome get extracted from the larger data set?
-
An example question I thought of is whether people like to ski or snowboard more, and what makes that so. We have learned that some factors we thought would never play a part in statistical patterns actually do, so there could be some indicators we would never expect for this question. Some could include income, a choice between two other hobbies (would you rather play basketball or hockey?), and what temperature you keep your house set to (snowboarders may enjoy colder temperatures than skiers). These could all be right or wrong, but the point is that some indicators could tell a story that we never thought could work.
-
1. Name and describe a business question that you could answer using a decision tree. What data would you collect to perform the analysis? A business can use a decision tree to see if a person is insurable for health insurance. By looking at different factors, you can see if a person is too much of a risk. This will help companies choose good risks and avoid catastrophic risks.
-
1) A business decision that could be made using a decision tree could be selecting and adjusting types of food to be placed on a restaurant menu. Data that can be collected and utilized for this decision would be highest selling dishes, costs, popularity, taste and ingredients. These factors will help determine which foods will sell better depending on the time of year.
-
1) A business could use decision trees for many things, for example deciding whether to market a new drug in a certain geographic location, based on whether the return on advertising would make it worthwhile to deploy marketing in that area. Data used for this analysis would be age, gender, income, average usage of the drug or similar drugs, etc.
-
Most Hollywood movies are shown in almost every theater around the US. Smaller indie films are only shown in select theaters around the country so a decision tree can be used to decide whether or not a customer will see a certain movie depending on their location, age, sex, neighborhood cluster or other demographics.
-
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the assignment and Assignment #7 – Decision Trees in R ANSWER SHEET (in Word format).
Here is the data file you’ll need [BankLoan.csv].
Note: If you try to open this file in Excel, you’ll get two e […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is In-Class Exercise #12 – Decision Tree Induction Using R.
And here are the supporting files. Remember, download them to your computer by right-clicking and selecting Save As…
The R script you’ll nee […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
As we discussed, you are responsible for watching the Intro to R Mix that can be found here: https://mix.office.com/watch/1vfw1it1t0rjc
You should watch this mix and download R & R Studio to your PC before c […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Assignment #6
Here is the assignment and an answer sheet to submit (in Word format). Here is the data file you’ll need [OnTimeAirport-Jan14.csv].
Please email your assignment by the start of class on A […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here is the exercise.
And here are the supporting files. When you download them to your computer, I would suggest creating a special folder to hold your R files. You can download easiest by right-clicking a […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Leave your response as a comment on this post by the beginning of class on March 30, 2016. Remember, it only needs to be three or four sentences. For these weekly questions, I’m mainly interested in your op […]
-
A common theme among companies that use R is banking. Banks like Bank of America and ANZ, which is the fourth largest bank in Australia, are using R for reporting and credit risk analysis. Social media sites like Facebook and Twitter have also used R to analyze status updates and to monitor user experience on the site.
-
Companies within the finance and insurance industries use R to develop new trading, pricing, and optimization strategies to increase returns and minimize risk. As an example, the Lloyd’s of London insurance market uses R to model the potential costs associated with catastrophes such as hurricanes and earthquakes.
-
Using statistical analysis and with the help of RStudio, scientists are researching how to create rechargeable alkali-ion batteries. Considering that alkali-ion batteries are the dominant source of energy storage today, this research is helping make huge leaps in high-density storage. The researchers in this article are dealing with complex chemical equations that are made easier to analyze with the help of R and RStudio.
-
Zillow is a website that lists houses to rent/buy. Zillow uses R to create and update the estimates of prices. Once you find the average prices of houses in an area, you can reasonably come up with a price that is common for a house in the same area.
http://blog.revolutionanalytics.com/2014/05/companies-using-r-in-2014.html -
Kickstarter, a social funding program where individuals can donate cash to get a worthy project going, uses R to interpret, interact with, and visualize the funding. One of their charts, generated by R, shows the rapid decrease in the number of days it takes a project to accumulate $5 million. With such rapid growth in kick-started projects, charts like this let the company recognize what these projects have accomplished.
-
R is used at Facebook to generate unique graphs based off of user interactions.
http://www.fastcompany.com/3030063/why-the-r-programming-language-is-good-for-business -
Companies such as Twitter use R for data visualization. For example, Twitter uses R to create a geocoded map showing where heavy traffic from users who use geotags comes from. This map also shows how dense the usage of geotags is in certain areas.
http://blog.revolutionanalytics.com/2013/05/the-arteries-of-the-world-in-tweets.html
-
Facebook uses R for Exploratory Data Analysis, Experimental Analysis, Big-Data Visualization, Human Resources, and user behavior analysis related to status updates and profile pictures. Facebook is a company that deals with a lot of data, more than 500 terabytes a day, and R is widely used at Facebook to visualize and analyze that data. Applications of R at Facebook include user behavior, content trends, human resources and even graphics for the IPO prospectus.
-
http://www.inside-r.org/blogs/2014/06/11/more-companies-using-r-uber-and-cultureamp
Uber uses R for statistical analysis. For example, Uber was able to find that since its introduction in Seattle, DUI incidents have decreased 10%.
-
Google is one company that uses R to produce better results for companies using Google’s advertising products. They use R for regression models, a statistical technique used at Google to evaluate the factors that lead to user satisfaction with Google products. They also use R to determine the effectiveness of display ads for their customers.
-
After some research, I discovered that R is used by Bank of America in their financial modeling. In particular, the company likes R for its visualization tools and data-crunching capabilities. The article even reports that the Vice President of Bank of America thinks, “R makes our mundane tables stand out.”
http://blog.revolutionanalytics.com/2014/06/bank-of-america-uses-r-for-reporting.html
-
I decided to check out some cool R packages. R is extremely powerful in data analysis and with the right packages you can accomplish an extensive amount of data analysis. I looked at a package called “vcd”. VCD stands for “Visualizing Categorical Data”. The package contains visualization techniques, data sets, summary and inference code aimed at categorical data, or data that can be separated into groups. The package places emphasis on grid graphics.
-
A team at Cornell University used an R package called “sphet” to estimate and test spatial models with heteroskedastic innovations. R is used to take datasets containing variables and observations and generate conclusions (spatial matrices) that improve efficiency and consistency. It provides a tool for estimating variance-covariances of events to reduce error. This has many applicable uses throughout business and technology (e.g., manufacturing efficiency).
https://cran.r-project.org/web/packages/sphet/vignettes/sphet.pdf -
The National Weather Service uses R at its River Forecast Centers to generate graphics for flood forecasting. They can predict the time of year that rivers will flood and prepare for it.
http://www.revolutionanalytics.com/companies-using-r -
The dating site OkCupid uses R to identify trends about the love lives of a typical OkCupid member. OkCupid is considered the “Google of online dating,” with 3.5 million active members. The team of company co-founder Christian Rudder uses R to visualize big data quickly, something they couldn’t do with Excel. Rudder stated, “R lets us get a ‘zoomed-out’ view of what’s going on with the data, which helps us decide quickly if the tack we’re taking with the data is yielding something interesting.” One way the company has used R is to recognize patterns that compare the dating habits of gay and straight members.
-
One of my favorite news sites is Nate Silver’s fivethirtyeight.com. They make visualizations on everything from sports, to politics, to when most people arrive at a party. These visualizations are insightful ways to view large amounts of data in neat graphs.
-
Google uses R to measure the effectiveness of the ads on their site. It runs hundreds of tests a month using R to try to determine what factors lead to an effective ad and to see how the ad affects future purchase decisions. It’s really cool to see that a company as large as Google makes use of the same program we are learning.
http://blog.revolutionanalytics.com/2011/08/google-r-effective-ads.html
-
The New York Times has been using R since 2009 to support its development and presentation of data analysis and data visualization. The NYT uses the R language to develop and implement its data journalism on the website as well as in the newspaper. A New York Times graphics editor and R pioneer described R in a recent podcast as “the greatest software on Earth.”
-
Uber has become an extremely popular transportation service over the last few years. In doing my research of R, I found out that Uber uses it for statistical analysis. An example of how it uses R would be tracking how much DUI rates decrease in areas that offer their service.
http://blog.revolutionanalytics.com/2014/06/more-companies-using-r.html -
http://blog.revolutionanalytics.com/2012/04/r-at-the-consumer-financial-protection-bureau.html
The Consumer Financial Protection Bureau uses R because recent college graduates are already using it, so they work on a platform they are already used to. Another reason is that the Mac and PC versions are similar. The majority of college students use a Mac, so once they are out of college they at least have the knowledge of R to navigate it effectively.
-
One of my favorite apps right now is Zillow. Zillow uses R to produce statistical predictive products. Zillow predicts full estimated costs of renting or purchasing real estate through the use of R. Zillow and its use of R allows its users to make better data driven decisions.
http://strataconf.com/stratany2012/public/schedule/detail/26345 -
After doing some research, I read that the United States uses R in its national weather programs and reports. For example, the National Weather Service uses R’s statistical analysis tools to forecast events like flooding or blizzards. As inaccurate as these predictions can frequently be, R helps to minimize errors. R also helps them generate graphics representing current real-time forecasts and storm trajectories.
-
John Deere uses R for reliable time series modeling and geospatial analysis.
R is used by John Deere for several purposes, including forecasting demand for equipment, forecasting crop yields, and even optimizing the build order on the production line that produces the tractors. This is especially important because John Deere provides forecasts to more than half the world’s food supply. The results are integrated with Excel and SAP.
http://blog.revolutionanalytics.com/2012/11/video-how-john-deere-uses-r.html
-
The National Institute of Standards and Technology (NIST) used the R language in BP’s oil leak case. To help the US government give an accurate and effective response to the public, NIST used R to run an uncertainty analysis that supplied estimates to the government so it could better control the scale and scope of the leaking oil.
-
After some research, I discovered how Facebook’s team used R to analyze users’ status updates. Basically, they categorized words according to the 68 categories of the Linguistic Inquiry and Word Count dictionary to see how frequently each category was mentioned.
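The kind of category counting described above can be sketched in a few lines. This is only an illustration in Python, not Facebook's actual R code: the two categories, their word lists, and the sample statuses are invented stand-ins, not the real Linguistic Inquiry and Word Count dictionary.

```python
from collections import Counter
import re

# Hypothetical mini-dictionary: category name -> words that belong to it
CATEGORIES = {
    "positive_emotion": {"happy", "love", "great"},
    "work": {"job", "boss", "meeting"},
}

def category_counts(statuses, categories=CATEGORIES):
    """Count how often words from each category appear across all statuses."""
    counts = Counter()
    for status in statuses:
        # Lowercase the text and pull out simple word tokens
        for word in re.findall(r"[a-z']+", status.lower()):
            for name, vocab in categories.items():
                if word in vocab:
                    counts[name] += 1
    return counts

# Made-up status updates for demonstration
statuses = ["Love my new job!", "Great meeting with the boss today", "So happy"]
counts = category_counts(statuses)
```

Dividing each count by the total number of words would turn these raw tallies into the per-category frequencies the comment mentions.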
http://blog.revolutionanalytics.com/2010/12/analysis-of-facebook-status-updates.html -
Google uses R to calculate ROI on advertising campaigns, to predict economic activity, to analyze the effectiveness of TV ads, and to make online advertising more effective. For example, Google has just released a new package for R: CausalImpact. This package allows Google to resolve the classical conundrum: how can we assess the impact of an intervention when we cannot know what would have happened if we had not run the campaign? In effect, the package gives Google a “virtual” control.
http://blog.revolutionanalytics.com/2014/09/google-uses-r-to-calculate-roi-on-advertising-campaigns.html
-
Google uses R and has created a new package for it. This package enables them to look at the effects of an advertising campaign on website clicks. Their infographic is simple to read, so a user who is not familiar with this type of data can easily assess the information with a short explanation.
-
Facebook uses R for a variety of functions. They are “Exploratory Data Analysis, Experimental Analysis, Big-Data Visualization, Human Resources, and user behaviour analysis related to status updates and profile pictures.” This allows Facebook to stay on top of current trends and predict future ones.
http://blog.revolutionanalytics.com/2014/05/companies-using-r-in-2014.html
-
While researching companies that use R and RStudio in their business operations, I found that many “big name” companies like Facebook, Bank of America, Google, and the New York Times use the software to carry out various processes. However, what surprised me the most was finding out that Microsoft also uses it to carry out some of its online Xbox functions. Given Microsoft’s size and expertise, I was surprised that it did not create its own software tailor-made to its processes and activities.
-
R is used by various companies all over the world, including one of the largest motor companies, Ford. R is a huge part of the revolution at Ford, which is using it for various data science objectives. These include breaking down data silos between analytic groups within Ford and understanding how drivers are using electric cars, based on opt-in telemetry data from the cars themselves.
http://blog.revolutionanalytics.com/2014/11/ford-uses-r-for-data-driven-decision-making.html -
After doing research, I realized that lots of financial institutions and companies in various industries use R to carry out analyses and produce data related to their work and research. Examples include finding standard deviations, averages, maximum and minimum returns, and even various qualitative information.
-
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here’s the link to today’s class capture: CC 32316
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
Here’s a little Excel function table that will walk us through some Excel review:
Excel Function Example
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 7 months ago
The slide deck for Intro to R has been posted under Slide Decks.
Make sure you have this handy for class and take additional notes!
Here’s the BaseballAnalysis.r script and the 2009BaseballTeamStats.csv d […]
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 8 months ago
Hi – here’s the class capture for today: CC 3/21/16
-
Amy Lavin wrote a new post on the site MIS2502 Spring 2016 8 years, 8 months ago
Here’s the link: CC 3/18/16