Weekly Discussion Question: Week 11

Leave your response as a comment on this post by the beginning of class next week. Remember, it only needs to be a few sentences.

Answer one of the following, based on the video we watched in class where Andreas Weigend talked about the proliferation of new forms of social and personal data:

  1. Which of these issues from earlier in the course is most important when considering use of this data: (1) Access versus Accuracy, (2)  Information versus Knowledge, (3) Choosing KPIs? Defend your answer.
  2. What are the challenges in integrating this “external” social data with existing data from a company’s internal systems?
If you want to re-watch the video, you can view it here:

22 Responses to “Weekly Discussion Question: Week 11”

  • Integrating external social data with existing data is difficult due to lack of standardization. Data can come in many different formats and it can be difficult to resolve the discrepancies. For example, a name may be recorded as John A. Doe in one system and John Doe in another. Another issue facing external data is accuracy. The accuracy of the data is not guaranteed to be 100% and can corrupt the internal system.

  • Integrating “external” social data with existing data from a company’s internal systems raises several challenges. The major challenge is how to map external records to internal records. Even if both records have IDs, we can’t assume that the two IDs are the same. Another issue is the data format, e.g. a date might be dd/mm/yyyy or mm/dd/yyyy. Some of these issues can be resolved using code, e.g. date format, but others, e.g. IDs, are more difficult to solve, will require more complex solutions, and might even require human interaction.
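As a minimal sketch of the date-format case mentioned in the comment above, the snippet below converts dates from two sources into a single ISO 8601 form. The source names and the `SOURCE_DATE_FORMATS` mapping are hypothetical; the sketch assumes each source's format is already known, because a value such as 03/04/2011 cannot be disambiguated from the string alone.

```python
from datetime import datetime

# Hypothetical mapping: each source system is assumed to declare its date format.
SOURCE_DATE_FORMATS = {
    "internal_crm": "%m/%d/%Y",   # mm/dd/yyyy
    "external_feed": "%d/%m/%Y",  # dd/mm/yyyy
}


def normalize_date(raw: str, source: str) -> str:
    """Parse a date using the source's declared format and return yyyy-mm-dd."""
    fmt = SOURCE_DATE_FORMATS[source]
    # strptime raises ValueError if the value does not match the declared format,
    # which surfaces bad records instead of silently storing a wrong date.
    return datetime.strptime(raw.strip(), fmt).date().isoformat()


if __name__ == "__main__":
    print(normalize_date("03/04/2011", "internal_crm"))
    print(normalize_date("03/04/2011", "external_feed"))
```

The same input string yields two different dates depending on the source, which is why the format has to travel with the data rather than be guessed at load time.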

  • Beth DiCamillo:

    Integrating external social data is difficult for several reasons. First, because the social data comes from so many sources, there is no standard for it. It would be very difficult to combine all the data sets, especially with the sheer volume of data and its continued growth. Second, you don’t know the quality of the data because it is outside the control of the company. How can you be sure that Facebook’s systems were not hacked and generating junk data, or that users’ mobile devices were reporting correct GPS information, for example? Compatibility of the social data could also be an issue: is it in a format that is easily convertible to the company’s format?

  • Art K:

    The challenge of integrating new social data is combining it with existing data and mining the total data set in a fashion that brings value. First, it is highly unlikely that the new data, such as location information, will seamlessly integrate into existing data sets. There will be a major effort to pull these together into a common format and database. Once ETL is accomplished, the data has to be analyzed to bring value or profit. Finding the economic patterns and using those to offer products and services to users is unproven with the new data sets available. For example, how do a customer’s past purchases relate to where they spend their time? While correlation would be very useful for marketing, the tremendous amount of data available makes finding patterns a challenge.

  • Amy V:

    Accuracy vs. Access: While discussing time scales, Weigend makes a statement that essentially says data is valuable if it informs the decision. If the time scale of sampling matches the process, the data will be more accurate in informing the decision. His example was weather. There isn’t a need to capture data in seconds for the weather in San Diego since it doesn’t change that much, but having the data captured in seconds is helpful in determining the winner of a sprint race or measuring an earthquake.

  • Mamta:

    The biggest problem that I see in integrating the “external” social data with existing data from a company’s internal systems is that of formatting and standardization. Data accuracy will also be a problem: because social data might not be accurate, its accuracy has to be checked before incorporating it with internal data, which raises the issue of the quality of the new data. Besides standardization, format, and accuracy, incorporating external data downloaded from outside sources might create a data corruption problem. Furthermore, the amount of external social data that is currently available will be a challenge by itself, in that one has to decide what data to keep and what to discard. Another challenge will be determining who created the data and assessing whether it’s trustworthy enough to use.

  • There are several challenges that will be faced when trying to integrate social data with data from a company’s internal system. First, it will be difficult to determine where the data came from since there are multiple sources. Second, the standardization between the two data sets is not common and may pose difficulties when trying to integrate them together. Third, we would have to examine which type of company we’re integrating the data with. The two data sets could have completely different demographics and finding a commonality between the two might be most difficult.

  • I see several challenges in integrating external social network data with a company’s existing internal database. Two that come to mind: First, much of social network data is inaccurate for a variety of reasons, such as the human tendencies to exaggerate and forget. Second, merging free-form, extemporaneous social network data into a structured corporate database likely involves a lot of manual effort to interpret the data consistently and accurately convert it into a useful, structured format.

  • Karen Padula:

    When integrating external data into an existing system, some of the challenges include matching formats, data types, and field specifications. For example, if the name in the existing system is formatted as first name, last name and the data you are integrating just has a single name field with the first and last name combined, this creates formatting issues that need to be addressed prior to integration. The same applies to field types and field specifications, such as field length.
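To make the combined-name example above concrete, here is a rough sketch of splitting a single name field into first and last name before integration. The function name `split_full_name` is hypothetical, and the rule it applies (first token is the first name, last token is the last name, anything in between is dropped) is a simplifying assumption that will mishandle multi-word surnames and suffixes, so real data would likely still need manual review.

```python
def split_full_name(full_name: str) -> tuple[str, str]:
    """Split a combined name field into (first_name, last_name).

    Assumption: the first whitespace-separated token is the first name and
    the last token is the last name; middle names and initials are dropped.
    """
    parts = full_name.split()
    if not parts:
        return ("", "")
    if len(parts) == 1:
        # Only one token available, so the last name stays empty.
        return (parts[0], "")
    return (parts[0], parts[-1])


if __name__ == "__main__":
    print(split_full_name("John A. Doe"))
    print(split_full_name("John Doe"))
```

Under this rule, "John A. Doe" and "John Doe" reduce to the same pair, which helps matching but also shows how two genuinely different people could be merged by mistake.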

  • Steve Borland:

    Access vs. Accuracy is the most important issue.
    Generally, you know what your KPIs are. If you don’t, they will smack you right in the kisser. You won’t let that happen twice.

    Information vs Knowledge is just process. Working backwards, if you know what your KPIs are, then what you need to know should be evident. Since knowledge is interpreted information, then you know what information (or collection of data) that you need.

    Access and Accuracy is about the data. Everything starts with the data. Since data and technology have a relatively short lifespan, access is a moving target. If you can get people to willingly provide the data, then you don’t have to chase it, and your predictive models are easier and more accurate.

  • Monica A. Mason:

    Some of the challenges in integrating “external” social data with existing data from a company’s internal systems include complexity of the data, volume of data, standardization issues, accuracy issues, creating structure out of unstructured data, and identifying key social networks and linking them back to the internal systems. Also, data from social networks, like tag comments, would have to be integrated with survey answers, submitted ideas, questions, suggestions, and complaints from a company’s internal system.

  • Thomas Walsh:

    Key Performance Indicators are the most important part when choosing data. The selection of KPIs determines which data will be collected. Business decisions will be based on these KPIs. The Accuracy and Access of data is almost irrelevant if the wrong KPI is selected. The information and knowledge will also be useless if the data is not important to the business.

  • Which of these issues from earlier in the course is most important when considering use of this data: (1) Access versus Accuracy, (2) Information versus Knowledge, (3) Choosing KPIs? Defend your answer.

    Access versus accuracy is probably the most important issue dealing with the use (or new uses) of social and personal data. While knowing what data to look for (KPIs) and being able to turn facts and raw data into information and then knowledge is important, first having access to social or personal data, or being able to confirm its accuracy, offers significant advantages, so that the information can be turned into something useful. For example, Weigend talks about how an insurance company befriends a patient on Facebook to use her posts as a determinant for covering her illness. Without access to this data, the insurance company couldn’t make that determination.

  • There are many challenges in integrating “external social” data with existing data from a company’s internal systems. Some that come to mind: Compliance of external data with internal data: different businesses can use different terms for similar entities or fields, and different sets of formats can be in use. Completeness of external data: the data that we get from an external source may or may not have all the essential elements needed to fill in a field. This will affect the overall quality as well as the accuracy of the provided data. Ownership of external data: who owns the data, and how do we validate the relevance of sources? Standardization: a common process for data interfaces.

  • Dipti Dighe:

    In my opinion, transformation of social data into a format that is usable for statistical analysis and that can be matched with the internal data is the major problem. Social data may come through informal channels, e.g. via a status update on Facebook. Building suitable applications to extract meaningful and discrete data from this informal channel can be a tedious job. Even when standard formats are available, the quality of this social data in terms of accuracy may differ from that of the internal company data, which can degrade the quality of the overall data used in the company’s decision making.

    As for the first question, in my opinion, information versus knowledge is the most important issue. The quality and availability (access) of social data primarily depends on the people who generate the social data, their frequency of using a social networking application, and their willingness to use networking tools. Thus, although important, access and accuracy of social data may be out of an organization’s control. Formulating KPIs is achievable once organizations define the purpose of using social data. What remains in the scope of an organization is the important job of harnessing information from whatever social data is collected, analyzing it, developing business strategies, and creating value from the data.

  • Aaron Grant:

    Since we are tasked to defend just one, I think Accuracy vs. Access is the biggest problem. One of the biggest problems with these new forms of data collection is that their participatory structure is generally opt-in in nature. Data is gathered for Progressive, the car insurance company, but the people participating would by nature skew toward safer drivers and people who drive less. Here we see Progressive making a decision to get very accurate data about their customer base, but data that only represents a small percentage of their customers. Weigend mentions building models that encourage people to contribute rather than penalize them. In that way companies can try to mitigate the tradeoffs between accuracy and access, but the problem will still exist.

  • There are many challenges in integrating external data with existing data. They are as follows:
    1. Different formatting: There is a possibility that the two data sets are stored in different formats.
    2. Quality of external data: External data is often erroneous, and combining external data with internal data often aggravates the problem.
    3. Integration problem: It will be difficult to integrate external data with existing data, as new rows have to be generated and records need to be updated as per the external data.
    4. Cost efficiency: It can be a very costly process.

  • John Walker:

    Some of the challenges in incorporating “external” social data with existing data revolve around privacy and ethical concerns. Ethical issues come into play with examples such as insurance companies monitoring social behavior to calculate reimbursement and premiums. Privacy is important because of the legal issues that arise once social data becomes public and whether it can be used in arenas such as marketing. Another challenge involves technology advancements that enable data to be “real-time.” Real-time data can incorporate constant monitoring with GPS or constant awareness of work being performed at a workstation. Although this may be an attempt to increase business efficiency, integrating this type of data may lower business morale and limit “water cooler” type innovation at companies. Other, more structural data problems include the high quantity of free text in social data and the lack of standardization of format and quality.

  • Shivani:

    I think Weigend made a great point that even though technology keeps changing, what is more important is how we adapt to the changes brought by technological innovations and how they affect our daily lives. He mentioned things like Yelp, Facebook, Twitter and many more social networks, and I feel they have greatly transformed our social era. This is kind of a combination of the information versus knowledge and access versus accuracy issues, because the information has been there for a long time, but how we use and apply that information in our daily routine is what has changed. The increased accessibility of information through the internet, smart phones and phone applications has allowed us to be more acute about things like our location (Foursquare), personal mood/behavior (Facebook and Twitter) and many more things. Therefore, it is our adaptability to change with the emerging technology that has really changed the face of the innovations.

  • Don Lee:

    Answering question number 2, the challenge in integrating external data with existing data is standardizing the format differences between the data sets. The quality of the data is also questionable, since the accuracy of the data is not being “monitored” by the organization. The whole data import/export process could also be an issue, since the system owner for the social network site is different from the organization.

  • External social data has many issues; the most common is standardization of the data. The data within the company is housed internally, is standard, and has integrity. Unfortunately, when you receive data from an outside source it is difficult to vouch for the integrity of the data. There could be mistakes and misreads caused by problems within the data.

  • Tisha McKinney:

    Some of the many challenges in integrating external data with internal data are the source from which the external data comes and the quality of the data. Second is the difference in formats between the external data and the internal data. Consolidating the data into one format can be very time-consuming. As a result, the data you are consolidating can change or become outdated.
