Importance of good quality data and DATA ANALYSIS for Research Methods

INTRODUCTION

Conducting a survey is often a useful way of finding something out, especially when ‘human factors’ are under investigation. Although surveys often investigate subjective issues, a well-designed survey should produce quantitative, rather than qualitative, results. That is, the results should be expressed numerically, and be capable of rigorous analysis. The data obtained from a study may or may not be in numerical or quantitative form, that is, in the form of numbers.
If they are not in numerical form, then we can still carry out qualitative analyses based on the experiences of the individual participants. If they are in numerical form, then we typically start by working out some descriptive statistics to summarise the pattern of findings. These descriptive statistics include measures of central tendency within a sample (e.g., mean) and measures of the spread of scores within a sample (e.g., range). Another useful way of summarising the findings is by means of graphs and figures. Several such ways of summarising the data are discussed later on in this chapter.
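As a rough illustration of the descriptive statistics mentioned above, the following Python sketch computes a measure of central tendency and two measures of spread. The function name `describe` and the recall scores are made up for this example; they do not come from any study discussed here.

```python
import statistics

def describe(scores):
    """Return basic descriptive statistics for a list of numeric scores."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),        # central tendency
        "median": statistics.median(scores),    # central tendency
        "range": max(scores) - min(scores),     # spread
        "stdev": statistics.stdev(scores),      # spread (sample standard deviation)
    }

# Hypothetical recall scores from ten participants (invented data).
recall_scores = [12, 15, 9, 14, 11, 13, 15, 10, 12, 14]
summary = describe(recall_scores)
print(summary)
```

The range is easy to read but depends only on the two most extreme scores, which is one reason the standard deviation is usually reported alongside it.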
In any study, two things might be true: (1) there is a difference (the experimental hypothesis), or (2) there is no difference (the null hypothesis). Various statistical tests have been devised to permit a decision between the experimental and null hypotheses on the basis of the data. Decision making based on a statistical test is open to error, in that we can never be sure whether we have made the correct decision. However, certain standard procedures are generally followed, and these are discussed in this chapter. Finally, there are important issues relating to the validity of the findings obtained from a study.
One reason why the validity of the findings may be limited is that the study itself was not carried out in a properly controlled and scientific fashion. Another reason why the findings may be partially lacking in validity is that they cannot readily be applied to everyday life, a state of affairs that occurs most often with laboratory studies.

THE NEED FOR DATA

Most research projects need data in order to answer a proposed research problem. The data that need to be acquired, and the sources of such data, must be identified as a matter of utmost importance.
No amount or depth of subsequent data analysis can make up for an original lack of data quantity or quality. Research problems and objectives (or hypotheses) need to be very carefully constructed and clearly defined, as they dictate the data that need to be obtained and analyzed in order to successfully address the objectives themselves. In addition, the quantity of data, their qualities, and how they are sampled and measured, have implications for the choice and effectiveness of the data analysis techniques used in subsequent analysis.
• Most research requires data and data analysis. Data acquisition is of utmost importance and considerable effort should be made to obtain or generate good data.
• Good data are data whose characteristics enable the research objectives to be met.
• Data of poor quality or undesirably low quantity will lead to unsatisfactory data analysis and vague results.
• The characteristics of the data, particularly their type, quantity, and how they were sampled, constrain the choice of data analysis techniques able to be used on the data.
• Data analysis can only be as good as the original data allow.

Developing Conceptual Frameworks for DATA collection
Experience suggests that when developing the research questions it is very beneficial to also diagram the problem or topic. This is often called a conceptual framework. According to Miles and Huberman (1994), “A conceptual framework explains, either graphically or in narrative form [diagrams are much preferred], the main things to be studied – the key factors, constructs or variables – and the presumed relationships among them.” (p. 18) A diagram of the topic can be worth more than 10,000 words. The task here is to create a diagram of the topic that includes clearly defined variables (independent, dependent, etc.) along with the relationships of those variables and key factors that influence the variables and the relationships. This task is often done in conjunction with the development of the research questions and it is an iterative process.

THE IMPORTANCE OF GOOD QUALITY DATA

Scientific research is used by academics in a wide range of academic disciplines such as social sciences, public health, biostatistics, education, social work, public administration, and business administration; and by practitioners engaged in marketing, commerce, and industry.
Data are the basis for all scientific research. Collecting good quality data plays a vital role in supplying objective information for the problems under study so that some analytical understanding of the problems, and hence solutions, can be obtained. Making decisions on the basis of poor quality data is risky and may lead to disastrous results, as the situation may be distorted and hence all subsequent analyses and decision making will rest on shaky ground.

WHAT IS THE IMPORTANCE OF DATA ANALYSIS?

To say that data analysis is important to businesses would be an understatement. In fact, no business can survive without analyzing available data. Visualize the following situations:
• A pharma company is performing trials on a number of patients to test its new drug to fight cancer. The number of patients under the trial is well over 500.
• A company wants to launch a new variant of its existing line of fruit juice. It wants to carry out a survey analysis and arrive at some meaningful conclusion.
• The sales director of a company knows that there is something wrong with one of its successful products, but hasn’t yet carried out any market research data analysis. How and what does he conclude?
These situations are indicative enough to conclude that data analysis is the lifeline of any business. Whether one wants to arrive at some marketing decisions or fine-tune a new product launch strategy, data analysis is the key to all the problems. Rather than asking what is important about data analysis, one might ask what is not important about it. Merely analyzing data isn’t sufficient from the point of view of making a decision. How one interprets the analyzed data is more important. Thus, data analysis is not a decision-making system, but a decision-supporting system.
Analysis

By the time the researcher gets to the analysis of collected data, most of the really difficult work has been done. It’s much more difficult to: define the research problem; develop and implement a sampling plan; conceptualize, operationalize and test your measures; and develop a design structure. If you have done this work well, the analysis of the data is usually a fairly straightforward affair. In most social research the data analysis involves three major steps, done in roughly this order:
• Cleaning and organizing the data for analysis (Data Preparation)
• Describing the data (Descriptive Statistics)
• Testing Hypotheses and Models (Inferential Statistics)
Data Preparation involves checking or logging the data in; checking the data for accuracy; entering the data into the computer; transforming the data; and developing and documenting a database structure that integrates the various measures.
Descriptive Statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data. With descriptive statistics you are simply describing what is, what the data show.
Inferential Statistics investigate questions, models and hypotheses. In many cases, the conclusions from inferential statistics extend beyond the immediate data alone. For instance, we use inferential statistics to try to infer from the sample data what the population thinks. Or, we use inferential statistics to make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance in this study. Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what’s going on in our data.
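To make the idea of judging whether an observed difference between groups “might have happened by chance” more concrete, here is a minimal Python sketch of a permutation test. This is one illustrative technique chosen for the example, not a procedure prescribed by the text; the function name `permutation_p_value`, its parameters, and the scores are assumptions invented for illustration.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often a mean difference at least as large as the observed
    one arises when group labels are shuffled at random (two-sided)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend group membership is arbitrary
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)

# Hypothetical test scores for a program group and a comparison group.
program = [78, 85, 90, 72, 88, 95, 81, 84]
comparison = [70, 75, 80, 68, 77, 82, 73, 79]
print(permutation_p_value(program, comparison))
```

A small value suggests the observed difference would rarely arise from random relabelling alone. It says nothing about why the groups differ, which is a separate question.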
In most research studies, the analysis section follows these three phases of analysis. Descriptions of how the data were prepared tend to be brief and to focus on only the more unique aspects to your study, such as specific data transformations that are performed. The descriptive statistics that you actually look at can be voluminous. In most write-ups, these are carefully selected and organized into summary tables and graphs that only show the most relevant or important information.
Usually, the researcher links each of the inferential analyses to specific research questions or hypotheses that were raised in the introduction, or notes any models that were tested that emerged as part of the analysis. In most analysis write-ups it’s especially critical to not “miss the forest for the trees.” If you present too much detail, the reader may not be able to follow the central line of the results. Often extensive analysis details are appropriately relegated to appendices, reserving only the most critical analysis summaries for the body of the report itself.

Conclusion Validity
Of the four types of validity (internal validity, construct validity, external validity and conclusion validity), conclusion validity is undoubtedly the least considered and most misunderstood. That’s probably due to the fact that it was originally labeled ‘statistical’ conclusion validity and you know how even the mere mention of the word statistics will scare off most of the human race! In many ways, conclusion validity is the most important of the four validity types because it is relevant whenever we are trying to decide if there is a relationship in our observations (and that’s one of the most basic aspects of any analysis).
Perhaps we should start with an attempt at a definition: Conclusion validity is the degree to which conclusions we reach about relationships in our data are reasonable. For instance, if we’re doing a study that looks at the relationship between socioeconomic status (SES) and attitudes about capital punishment, we eventually want to reach some conclusion. Based on our data, we may conclude that there is a positive relationship, that persons with higher SES tend to have a more positive view of capital punishment while those with lower SES tend to be more opposed.
Conclusion validity is the degree to which the conclusion we reach is credible or believable. Although conclusion validity was originally thought to be a statistical inference issue, it has become more apparent that it is also relevant in qualitative research. For example, in an observational field study of homeless adolescents the researcher might, on the basis of field notes, see a pattern that suggests that teenagers on the street who use drugs are more likely to be involved in more complex social networks and to interact with a more varied group of people.
Although this conclusion or inference may be based entirely on impressionistic data, we can ask whether it has conclusion validity, that is, whether it is a reasonable conclusion about a relationship in our observations. Whenever you investigate a relationship, you essentially have two possible conclusions: either there is a relationship in your data or there isn’t. In either case, however, you could be wrong in your conclusion. You might conclude that there is a relationship when in fact there is not, or you might infer that there isn’t a relationship when in fact there is (but you didn’t detect it!). So, we have to consider all of these possibilities when we talk about conclusion validity. It’s important to realize that conclusion validity is an issue whenever you conclude there is a relationship, even when the relationship is between some program (or treatment) and some outcome. In other words, conclusion validity also pertains to causal relationships. How do we distinguish it from internal validity, which is also involved with causal relationships? Conclusion validity is only concerned with whether there is a relationship.
For instance, in a program evaluation, we might conclude that there is a positive relationship between our educational program and achievement test scores: students in the program get higher scores and students not in the program get lower ones. Conclusion validity is essentially whether that relationship is a reasonable one or not, given the data. But it is possible that we will conclude that, while there is a relationship between the program and outcome, the program didn’t cause the outcome.
Perhaps some other factor, and not our program, was responsible for the outcome in this study. For instance, the observed differences in the outcome could be due to the fact that the program group was smarter than the comparison group to begin with. Our observed posttest differences between these groups could be due to this initial difference and not be the result of our program. This issue — the possibility that some other factor than our program caused the outcome — is what internal validity is all about.
So, it is possible that in a study we can conclude that our program and outcome are related (conclusion validity) and also conclude that the outcome was caused by some factor other than the program (i.e., we don’t have internal validity). There are several key reasons why reaching conclusions about relationships is so difficult. One major problem is that it is often hard to see a relationship because our measures or observations have low reliability; they are too weak relative to all of the ‘noise’ in the environment.
Another issue is that the relationship we are looking for may be a weak one and seeing it is a bit like looking for a needle in the haystack. Sometimes the problem is that we just didn’t collect enough information to see the relationship even if it is there. All of these problems are related to the idea of statistical power and so we’ll spend some time trying to understand what ‘power’ is in this context. One of the most interesting introductions to the idea of statistical power is given in the ‘OJ’ Page which was created by Rob Becker to illustrate how the decision a jury has to reach (guilty vs. not guilty) is similar to the decision a researcher makes when assessing a relationship. The OJ Page uses the infamous OJ Simpson murder trial to introduce the idea of statistical power and illustrate how manipulating various factors (e.g., the amount of evidence, the “effect size”, and the level of risk) affects the validity of the verdict. Finally, we need to recognize that we have some control over our ability to detect relationships, and we’ll conclude with some suggestions for improving conclusion validity.
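The three factors just mentioned (amount of evidence, effect size, and level of risk) can be explored with a small calculation. The sketch below is an approximation under stated assumptions: two groups of equal size, a two-sided comparison of means, and a normal approximation to the test statistic. The function name `approx_power` and its parameters are hypothetical, introduced only for this illustration.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the standardized mean difference (difference in means
    divided by the common standard deviation); the normal approximation
    and equal group sizes are assumptions of this sketch.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)       # cutoff set by the level of risk
    shift = effect_size * (n_per_group / 2) ** 0.5   # signal grows with effect size and n
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

# More evidence, a larger effect, or a higher tolerated risk all raise power.
for n in (20, 64, 100):
    print(n, round(approx_power(0.5, n), 3))
```

When the effect size is zero the function returns alpha itself, which matches the idea that the significance level is the chance of “seeing” a relationship that is not there.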
A threat to conclusion validity is a factor that can lead you to reach an incorrect conclusion about a relationship in your observations. You can essentially make two kinds of errors about relationships:
• conclude that there is no relationship when in fact there is (you missed the relationship or didn’t see it)
• conclude that there is a relationship when in fact there is not (you’re seeing things that aren’t there!)
Most threats to conclusion validity have to do with the first problem. Why?
Maybe it’s because it’s so hard in most research to find relationships in our data at all that seeing things that aren’t there is not as big or frequent a problem; we tend to have more problems finding the needle in the haystack. So, I’ll divide the threats by the type of error they are associated with.

Finding no relationship when there is one (or, “missing the needle in the haystack”)

When you are looking for the needle in the haystack, you essentially have two basic problems: the tiny needle and too much hay. You can view this as a signal-to-noise ratio problem.
The “signal” is the needle — the relationship you are trying to see. The “noise” consists of all of the factors that make it hard to see the relationship. There are several important sources of noise, each of which is a threat to conclusion validity. One important threat is low reliability of measures. This can be due to many factors including poor question wording, bad instrument design or layout, illegibility of field notes, and so on. In studies where you are evaluating a program you can introduce noise through poor reliability of treatment implementation.
If the program doesn’t follow the prescribed procedures or is inconsistently carried out, it will be harder to see relationships between the program and other factors like the outcomes. Noise that is caused by random irrelevancies in the setting can also obscure your ability to see a relationship. In a classroom context, the traffic outside the room, disturbances in the hallway, and countless other irrelevant events can distract the researcher or the participants. The types of people you have in your study can also make it harder to see relationships.
The threat here is due to random heterogeneity of respondents. If there is a very diverse group of respondents, they are likely to vary more widely on your measures or observations. Some of their variety may be related to the phenomenon you are looking at, but at least part of it is likely to just constitute individual differences that are irrelevant to the relationship being observed. All of these threats add variability into the research context and contribute to the “noise” relative to the signal of the relationship you are looking for.
But noise is only one part of the problem. We also have to consider the issue of the signal, that is, the true strength of the relationship. There is one broad threat to conclusion validity that tends to subsume or encompass all of the noise-producing factors above and also takes into account the strength of the signal, the amount of information you collect, and the amount of risk you’re willing to take in making a decision about whether a relationship exists. This threat is called low statistical power. Because this idea is so important in understanding how we make decisions about relationships, we have a separate discussion of statistical power.

Finding a relationship when there is not one (or “seeing things that aren’t there”)

In anything but the most trivial research study, the researcher will spend a considerable amount of time analyzing the data for relationships. Of course, it’s important to conduct a thorough analysis, but most people are well aware of the fact that if you play with the data long enough, you can often “turn up” results that support or corroborate your hypotheses.
In more everyday terms, you are “fishing” for a specific result by analyzing the data repeatedly under slightly differing conditions or assumptions. In statistical analysis, the researcher attempts to determine the probability that a finding is a “real” one or could have been a “chance” finding. In fact, we often use this probability to decide whether to accept the statistical result as evidence that there is a relationship. In the social sciences, researchers often use the rather arbitrary value known as the 0.05 level of significance to decide whether their result is credible or could be considered a “fluke.” Essentially, the value 0.05 means that, if there were no real relationship, a result like the one you got could be expected to occur by chance about 5 times out of every 100 times you run the statistical analysis. The probability assumption that underlies most statistical analyses assumes that each analysis is “independent” of the other. But that may not be true when you conduct multiple analyses of the same data. For instance, let’s say you conduct 20 statistical tests and for each one you use the 0.05 level criterion for deciding whether you are observing a relationship. For each test, the odds are 5 out of 100 that you will see a relationship even if there is not one there (that’s what it means to say that the result could be “due to chance”). Odds of 5 out of 100 are equal to the fraction 5/100, which is also equal to 1 out of 20. Now, in this example, you conduct 20 separate analyses. Let’s say that you find that of the twenty results, only one is statistically significant at the 0.05 level. Does that mean you have found a statistically significant relationship?
If you had only done the one analysis, you might conclude that you’ve found a relationship in that result. But if you did 20 analyses, you would expect to find one of them significant by chance alone, even if there is no real relationship in the data. We call this threat to conclusion validity fishing and the error rate problem. The basic problem is that you were “fishing” by conducting multiple analyses and treating each one as though it was independent. Instead, when you conduct multiple analyses, you should adjust the error rate (i.e., significance level) to reflect the number of analyses you are doing.
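The arithmetic behind the 20-test example can be checked with a short Python sketch. The text does not say how the error rate should be adjusted; the Bonferroni adjustment used below (dividing the significance level by the number of tests) is one common choice, shown here as an assumption. The function names are hypothetical, and the calculation treats the tests as independent.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n_tests independent tests
    when no real relationship exists."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under the Bonferroni adjustment."""
    return alpha / n_tests

alpha, n_tests = 0.05, 20

# On average, one of the 20 tests comes out "significant" by chance alone.
expected_false_positives = alpha * n_tests

# Chance that at least one test is a false positive without any adjustment.
unadjusted = familywise_error_rate(alpha, n_tests)

# Stricter per-test level, and the overall error rate it produces.
adjusted_alpha = bonferroni_alpha(alpha, n_tests)
adjusted = familywise_error_rate(adjusted_alpha, n_tests)

print(expected_false_positives, round(unadjusted, 3), adjusted_alpha, round(adjusted, 3))
```

With no adjustment, the chance of at least one spurious “finding” across the 20 tests is well above one half; with the adjusted per-test level it falls back below the original 0.05.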
The bottom line here is that you are more likely to see a relationship when there isn’t one when you keep reanalyzing your data and don’t take that fishing into account when drawing your conclusions.

DATA Analysis Problems that can lead to either conclusion error

Every analysis is based on a variety of assumptions about the nature of the data, the procedures you use to conduct the analysis, and the match between these two. If you are not sensitive to the assumptions behind your analysis you are likely to draw erroneous conclusions about relationships.
In quantitative research we refer to this threat as the violated assumptions of statistical tests. For instance, many statistical analyses assume that the data are distributed normally — that the population from which they are drawn would be distributed according to a “normal” or “bell-shaped” curve. If that assumption is not true for your data and you use that statistical test, you are likely to get an incorrect estimate of the true relationship. And, it’s not always possible to predict what type of error you might make — seeing a relationship that isn’t there or missing one that is.
There are assumptions, some of which we may not even realize, behind our qualitative methods. For instance, in interview situations we may assume that the respondent is free to say anything s/he wishes. If that is not true — if the respondent is under covert pressure from supervisors to respond in a certain way — you may erroneously see relationships in the responses that aren’t real and/or miss ones that are. The threats listed above illustrate some of the major difficulties and traps that are involved in one of the most basic of research tasks — deciding whether there is a relationship in your data or observations.
So, how do we attempt to deal with these threats? The researcher has a number of strategies for improving conclusion validity through minimizing or eliminating the threats described above.

Data Analysis Process / Steps:
1. Qualitative analysis of data: Recording experiences and meanings.
2. Interpretations of interviews, case studies, and observations: Some of the problems involved in drawing conclusions from non-experimental studies.
3. Content analysis: Studying the messages contained in media and communications.
4. Quantitative analysis: Descriptive statistics: What to do with all those numbers and percentages at the end of the study.
5. Data presentation and statistical tests: When to use a chart or a graph. Which statistical test to choose and why.
6. Issues of experimental and ecological validity: Does your study test what you say it does? Has it any relevance to real life?
7. Writing up a practical: Presenting research results.

DATA Preparation

Data Preparation involves checking or logging the data in; checking the data for accuracy; entering the data into the computer; transforming the data; and developing and documenting a database structure that integrates the various measures.
Logging the Data

In any research project you may have data coming from a number of different sources at different times:
• mail survey returns
• coded interview data
• pretest or posttest data
• observational data
In all but the simplest of studies, you need to set up a procedure for logging the information and keeping track of it until you are ready to do a comprehensive data analysis. Different researchers differ in how they prefer to keep track of incoming data. In most cases, you will want to set up a database that enables you to assess at any time what data is already in and what is still outstanding.
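As a minimal sketch of such a logging database, the Python example below uses the standard-library `sqlite3` module with an in-memory database. The table name `data_log`, the column names, the helper functions, and the participant IDs and dates are all hypothetical choices made for illustration; any of the programs mentioned in this section could serve the same purpose.

```python
import sqlite3

# In-memory database for illustration; a real project would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_log (
        participant_id TEXT NOT NULL,
        source         TEXT NOT NULL,
        received_on    TEXT,
        PRIMARY KEY (participant_id, source)
    )
    """
)

def expect(participant_id, source):
    """Register an item we expect to receive (received_on stays NULL)."""
    conn.execute(
        "INSERT INTO data_log (participant_id, source) VALUES (?, ?)",
        (participant_id, source),
    )

def log_received(participant_id, source, received_on):
    """Record the date on which an expected item arrived."""
    conn.execute(
        "UPDATE data_log SET received_on = ? "
        "WHERE participant_id = ? AND source = ?",
        (received_on, participant_id, source),
    )

def outstanding():
    """List (participant_id, source) pairs that have not arrived yet."""
    rows = conn.execute(
        "SELECT participant_id, source FROM data_log "
        "WHERE received_on IS NULL ORDER BY participant_id, source"
    )
    return rows.fetchall()

# Two hypothetical participants, each owing a mail survey and a pretest.
for pid in ("P01", "P02"):
    for src in ("mail_survey", "pretest"):
        expect(pid, src)

log_received("P01", "mail_survey", "2024-03-01")
log_received("P02", "pretest", "2024-03-04")
print(outstanding())
```

Registering expected items up front is the design choice that makes “what is still outstanding” a simple query rather than a manual count.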
You could do this with any standard computerized database program (e.g., Microsoft Access, Claris FileMaker), although this requires familiarity with such programs. Or you can accomplish this using standard statistical programs (e.g., SPSS, SAS, Minitab, Datadesk) and running simple descriptive analyses to get reports on data status. It is also critical that the data analyst retain the original data records (returned surveys, field notes, test protocols, and so on) for a reasonable period of time. Most professional researchers will retain such records for at least 5-7 years.
For important or expensive studies, the original data might be stored in a data archive. The data analyst should always be able to trace a result from a data analysis back to the original forms on which the data was collected. A database for logging incoming data is a critical component in good research record-keeping.

Checking the Data For Accuracy

As soon as data is received you should screen it for accuracy. In some circumstances doing this right away will allow you to go back to the sample to clarify any problems or errors.
There are several questions you should ask as part of this initial data screening:
• Are the responses legible/readable?
• Are all important questions answered?
• Are the responses complete?
• Is all relevant contextual information included (e.g., date, time, place, researcher)?
In most social research, quality of measurement is a major issue. Assuring that the data collection process does not contribute inaccuracies will help assure the overall quality of subsequent analyses.

Developing a Database Structure
The database structure is the manner in which you intend to store the data for the study so that it can be accessed in subsequent data analyses. You might use the same structure you used for logging in the data or, in large complex studies, you might have one structure for logging data and another for storing it. As mentioned above, there are generally two options for storing data on computer — database programs and statistical programs. Usually database programs are the more complex of the two to learn and operate, but they allow the analyst greater flexibility in manipulating the data.
In every research project, you should generate a printed codebook that describes the data and indicates where and how it can be accessed. Minimally the codebook should include the following items for each variable:
• variable name
• variable description
• variable format (number, date, text)
• instrument/method of collection
• date collected
• respondent or group
• variable location (in database)
• notes
The codebook is an indispensable tool for the analysis team. Together with the database, it should provide comprehensive documentation that enables other researchers who might subsequently want to analyze the data to do so without any additional information.

QUALITATIVE ANALYSIS OF DATA

There is an important distinction between quantitative research and qualitative research. In quantitative research, the information obtained from the participants is expressed in numerical form. Studies in which we record the number of items recalled, reaction times, or the number of aggressive acts are all examples of quantitative research. In qualitative research, on the other hand, the information obtained from participants is not expressed in numerical form.
The emphasis is on the stated experiences of the participants and on the stated meanings they attach to themselves, to other people, and to their environment. Those carrying out qualitative research sometimes make use of direct quotations from their participants, arguing that such quotations are often very revealing. There has been rapid growth in the use of qualitative methods since the mid-1980s. This is due in part to increased dissatisfaction with the quantitative or scientific approach that has dominated psychology for the past 100 years.
Coolican (1994) discussed a quotation from Reason and Rowan (1981), which expresses that dissatisfaction very clearly:
“There is too much measurement going on. Some things which are numerically precise are not true; and some things which are not numerical are true. Orthodox research produces results which are statistically significant but humanly insignificant; in human inquiry it is much better to be deeply interesting than accurately boring.”
Many experimental psychologists would regard this statement as being clearly an exaggeration. “Orthodox research” with its use of the experimental method has transformed our understanding of attention, perception, learning, memory, reasoning, and so on. However, qualitative research is of clear usefulness within some areas of social psychology, and it can shed much light on the motivations and values of individuals. As a result, investigators using interviews, case studies, or observations often make use of qualitative data, although they do not always do so. Investigators who collect qualitative data use several different kinds of analysis, and so only general indications of what can be done with such data will be presented here.
However, there would be general agreement among such investigators with the following statement by Patton (1980; cited in Coolican, 1994): The cardinal principle of qualitative analysis is that causal relationships and theoretical statements be clearly emergent from and grounded in the phenomena studied. The theory emerges from the data; it is not imposed on the data. How do investigators use this principle? One important way is by considering fully the categories spontaneously used by the participants before the investigators develop their own categories.
An investigator first of all gathers together all the information obtained from the participants. This stage is not always entirely straightforward. For example, if we simply transcribe tape recordings of what our participants have said, we may be losing valuable information. Details about which words are emphasised, where the speaker pauses, and when the speaker speeds up or slows down should also be recorded, so that we can understand fully what he or she is trying to communicate. The investigator then arranges the items of information (e.g., statements) into various groups in a preliminary way.
If a given item seems of relevance to several groups, then it is included in all of them. Frequently, the next step is to take account of the categories or groupings suggested by the participants themselves. The final step is for the investigator to form a set of categories based on the information obtained from the previous steps. However, the investigator is likely to change some of the categories if additional information comes to light. Qualitative investigators are not only interested in the number of items or statements falling into each category.
Their major concern is usually in the variety of meanings, attitudes, and interpretations found within each category. For example, an investigator might study attitudes towards A-level psychology by carrying out interviews with several A-level students. One of the categories into which their statements were then placed might be “negative attitudes towards statistics”. A consideration of the various statements in this category might reveal numerous reasons why A-level psychology students dislike statistics! When qualitative researchers report their findings, they will often include some raw data (e.g., direct quotations from participants) as well as analyses of the data based on categories. In addition, they often indicate how their hypotheses changed during the course of the investigation.

Evaluation

Qualitative analysis is often less influenced than is quantitative analysis by the biases and theoretical assumptions of the investigator. In addition, it offers the prospect of understanding the participants in a study as rounded individuals in a social context. This contrasts with quantitative analysis, in which the focus is often on rather narrow aspects of behaviour.
The greatest limitation of the qualitative approach is that the findings that are reported tend to be unreliable and hard to replicate. Why is this so? The qualitative approach is subjective and impressionistic, and so the ways in which the information is categorised and then interpreted often differ considerably from one investigator to another. There are various ways in which qualitative researchers try to show that their findings are reliable (Coolican, 1994). Probably the most satisfactory approach is to see whether the findings obtained from a qualitative analysis can be replicated.
This can be done by comparing the findings from an interview study with those from an observational study. Alternatively, two different qualitative researchers can conduct independent analyses of the same qualitative data, and then compare their findings. Qualitative researchers argue that the fact that they typically go through the “research cycle” more than once helps to increase reliability. Thus, for example, the initial assumptions and categories of the researcher are checked against the data, and may then be changed.
After that, the new assumptions and categories are checked against the data. Repeating the research cycle is of value in some ways, but it does not ensure that the findings will have high reliability. INTERPRETATION OF INTERVIEWS, CASE STUDIES, AND OBSERVATIONS Qualitative analyses as discussed in the previous section are carried out in several different kinds of studies. They are especially common in interviews, case studies, and observational studies, although quantitative analyses have often been used in all three types of studies.
Some of the advantages and limitations of these types of studies are discussed in the Research methods: Psychological enquiry chapter. What we will do in this section is to consider the interpretation of interviews, case studies, and observations. Interviews Interviews vary considerably in terms of their degree of structure. In general terms, unstructured interviews (e. g. non-directive or informal) lend themselves to qualitative analyses, whereas structured interviews lend themselves to quantitative analysis. As Coolican (1994) pointed out, there are various skills that interviewers need in order to obtain valuable data.
These skills involve establishing a good understanding with the person being interviewed, adopting a non-judgemental approach, and developing effective listening skills. Cardwell et al. (1996) illustrated the value of the interview approach by discussing the work of Reicher and Potter (1985) on a riot in the St Paul’s area of Bristol in April 1980. Many of the media reports on the riot were based on the assumption that those involved in the riot were behaving in a primitive and excessively emotional way. Unstructured interviews with many of those involved indicated that in fact they had good reasons for their actions.
They argued that they were defending their area against the police, and they experienced strong feelings of solidarity and community spirit. This interpretation was supported by the fact that very little of the damage affected private homes in the area.

Evaluation

There are various problems involved in interpreting interview information. First, there is the problem of social desirability bias (social desirability bias: the tendency to provide socially desirable rather than honest answers on questionnaires and in interviews).
Most people want to present themselves in the best possible light, so they may provide socially desirable rather than honest answers to personal questions. This problem can be handled by the interviewer asking additional questions to establish the truth. Second, the data obtained from an interview may reveal more about the social interaction processes between the interviewer and the person being interviewed (the interviewee) than about the interviewee’s thought processes and attitudes. Third, account needs to be taken of the self-fulfilling prophecy (self-fulfilling prophecy: the tendency for someone’s expectations about another person to lead to the fulfillment of those expectations). For example, suppose that a therapist expects his or her patient to behave very anxiously. This expectation may cause the therapist to treat the patient in such a way that the patient starts to behave in the expected fashion.

CONTENT ANALYSIS

Content analysis: a method involving the detailed study of, for example, the output of the media, speeches, and literature.
Content analysis is used when originally qualitative information is reduced to numerical terms. Content analysis started off as a method for analysing messages in the media, including articles published in newspapers, speeches made by politicians on radio and television, various forms of propaganda, and health records. More recently, the method of content analysis has been applied more widely to almost any form of communication. As Coolican (1994, p. 108) pointed out: The communications concerned were originally those already published, but some researchers conduct content analysis on materials which they ask people to produce, such as essays, answers to interview questions, diaries, and verbal protocols (detailed records). One of the types of communication that has often been studied by content analysis is television advertising. For example, McArthur and Resko (1975) carried out a content analysis of American television commercials. They found that 70% of the men in these commercials were shown as experts who knew a lot about the products being sold.
In contrast, 86% of the women in the commercials were shown only as product users. There was another interesting gender difference: men who used the products were typically promised improved social and career prospects, whereas women were promised that their family would like them more. The first stage in content analysis is that of sampling, or deciding what to select from what may be an enormous amount of material. For example, when Cumberbatch (1990) carried out a study on over 500 advertisements shown on British television, there were two television channels showing advertisements.
Between them, these two channels were broadcasting for about 15,000 hours a year, and showing over 250,000 advertisements. Accordingly, Cumberbatch decided to select only a sample of advertisements taken from prime-time television over a two-week period. The issue of sampling is an important one. For example, television advertisers target their advertisements at particular sections of the population, and so arrange for the advertisements to be shown when the relevant groups are most likely to be watching television. As a result, advertisements for beer are more likely to be shown during a football match than a programme about fashion.
By focusing on prime-time television, Cumberbatch (1990) tried to ensure that he was studying advertisements designed to have general appeal. The other key ingredient in content analysis is the construction of the coding units into which the information is to be categorised. In order to form appropriate coding units, the researcher needs to have considerable knowledge of the kinds of material to be used in the content analysis. He or she also needs to have one or more clear hypotheses, because the selection of coding units must be such as to permit these hypotheses to be tested effectively.
The coding can take many forms. The categories used can be very specific (e. g. use of a given word) or general (e. g. theme of the communication). Instead of using categories, the coders may be asked to provide ratings. For example, the apparent expertise of those appearing in television advertisements might be rated on a 7-point scale. Another form of coding involves ranking items, or putting them in order. For example, the statements of politicians could be ranked in terms of the extent to which they agreed with the facts.
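Once each item has been assigned to a coding category, the qualitative material can be reduced to frequencies and percentages. The following is a minimal sketch of that tallying step only. The category labels and the ten coding decisions are hypothetical, invented for illustration, and do not come from any of the studies mentioned above.

```python
from collections import Counter

# Hypothetical coding decisions: one category label per character
# appearing in an advertisement. These are illustrative values only.
coded_units = [
    "expert", "product user", "expert", "expert", "product user",
    "expert", "product user", "expert", "expert", "product user",
]


def tally(codes):
    """Return {category: (frequency, percentage)} for a list of category labels."""
    counts = Counter(codes)
    total = len(codes)
    return {category: (n, 100 * n / total) for category, n in counts.items()}


summary = tally(coded_units)
for category, (n, pct) in sorted(summary.items()):
    print(f"{category}: {n} ({pct:.0f}%)")
```

The same function would work for ratings or ranks treated as labels, although those forms of coding would usually be summarised with the descriptive statistics discussed later in the chapter.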
Evaluation One of the greatest strengths of content analysis is that it provides a way of extracting information from a wealth of real-world settings. The media influence the ways we think and feel about issues, and so it is important to analyse media communications in detail. Content analysis can reveal issues of concern. The greatest limitation of content analysis is that it is often very hard to interpret the findings. There are also problems of interpretation with other communications such as personal diaries or essays.
Diaries or essays may contain accurate accounts of what an individual does, thinks, and feels. On the other hand, individuals may provide deliberately distorted accounts in order to protect their self-esteem, to make it appear that their lives are more exciting than is actually the case, and so on. Another problem is that the selection and scoring of coding units can be rather subjective. The coding categories that are used need to reflect accurately the content of the communication, and each of the categories must be defined as precisely as possible.
QUANTITATIVE ANALYSIS : DESCRIPTIVE STATISTICS Suppose that we have carried out an experiment on the effects of noise on learning with three groups of nine participants each. One group was exposed to very loud noise, another group to moderately loud noise, and the third group was not exposed to noise at all. What they had learned from a book chapter was assessed by giving them a set of questions, producing a score between 0 and 20. What is to be done with the raw scores? There are two key types of measures that can be taken whenever we have a set of scores from participants in a given condition.
First, there are measures of central tendency, which provide some indication of the size of average or typical scores. Second, there are measures of dispersion, which indicate the extent to which the scores cluster around the average or are spread out. Various measures of central tendency and of dispersion are considered next. Measures of central tendency Measures of central tendency describe how the data cluster together around a central point. There are three main measures of central tendency: the mean; the median; and the mode.
Mean (12): the mean, or average, is strictly the average calculated over an entire population, and is therefore a parameter. The average in a sample is both a descriptive statistic and the best estimate of the population mean. It is best described in a form such as: “a mean of 22.5 indicates that the average score in this sample, and by inference in the population as a whole, is 22.5”. The mean in each group or condition is calculated by adding up all the scores in a given condition, and then dividing by the number of participants in that condition.
Suppose that the scores of the nine participants in the no-noise condition are as follows: 1, 2, 4, 5, 7, 9, 9, 9, 17. The mean is given by the total, which is 63, divided by the number of participants, which is 9. Thus, the mean is 7. The main advantage of the mean is the fact that it takes all the scores into account. This generally makes it a sensitive measure of central tendency, especially if the scores resemble the normal distribution, which is a bell-shaped distribution in which most scores cluster fairly close to the mean.
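The calculation of the mean for the no-noise condition can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above; the function name `mean` is our own choice.

```python
# Scores of the nine participants in the no-noise condition (from the text).
no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]


def mean(scores):
    """Add up all the scores, then divide by the number of participants."""
    return sum(scores) / len(scores)


total = sum(no_noise)            # 63
mean_no_noise = mean(no_noise)   # 63 / 9 = 7.0
print(total, len(no_noise), mean_no_noise)
```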
However, the mean can be very misleading if the distribution differs markedly from the normal and there are one or two extreme scores in one direction. Suppose that eight people complete one lap of a track in go-karts. For seven of them, the times taken (in seconds) are as follows: 25, 28, 29, 29, 34, 36, and 42. The eighth person’s go-kart breaks down, and so the driver has to push it around the track. This person takes 288 seconds to complete the lap. This produces an overall mean of 64 seconds. This is clearly misleading, because no-one else took even close to 64 seconds to complete one lap.

[Figure: the nine no-noise scores (1, 2, 4, 5, 7, 9, 9, 9, 17), showing the mean (total 63 ÷ 9 = 7), the median (7), and the mode (9).]

Median

Median: the middle score out of all participants’ scores in a given condition. Another way of describing the general level of performance in each condition is known as the median.
If there is an odd number of scores, then the median is simply the middle score, having an equal number of scores higher and lower than it. In the example with nine scores in the no-noise condition (1, 2, 4, 5, 7, 9, 9, 9, 17), the median is 7. Matters are slightly more complex if there is an even number of scores. In that case, we work out the mean of the two central values. For example, suppose that we have the following scores in size order: 2, 5, 5, 7, 8, 9. The two central values are 5 and 7, and so the median is (5 + 7)/2 = 6. The main advantage of the median is that it is unaffected by a few extreme scores, because it focuses only on scores in the middle of the distribution.
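The two rules for the median (odd and even numbers of scores) can be sketched as follows. The sketch also applies both the mean and the median to the go-kart lap times given earlier, to show how differently the two measures respond to one extreme score. The function name `median` is our own.

```python
def median(scores):
    """Middle score if the count is odd; mean of the two central scores if even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]          # odd number of scores
even_set = [2, 5, 5, 7, 8, 9]                    # even number of scores
lap_times = [25, 28, 29, 29, 34, 36, 42, 288]    # go-kart times in seconds

median_no_noise = median(no_noise)               # 7
median_even = median(even_set)                   # (5 + 7) / 2 = 6.0
median_laps = median(lap_times)                  # (29 + 34) / 2 = 31.5
mean_laps = sum(lap_times) / len(lap_times)      # 511 / 8 = 63.875

print(median_no_noise, median_even, median_laps, mean_laps)
```

The mean lap time of about 64 seconds is pulled far away from the typical times by the single breakdown, whereas the median of 31.5 seconds is not.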
It also has the advantage that it tends to be easier than the mean to work out. The main limitation of the median is that it ignores most of the scores, and so it is often less sensitive than the mean. In addition, it is not always representative of the scores obtained, especially if there are only a few scores.

Mode

The final measure of central tendency is the mode. This is simply the most frequently occurring score. In the example of the nine scores in the no-noise condition, this is 9. The mode is useful where other measures of central tendency are meaningless, for example when calculating the number of children in the average family. It would be unusual to have 0.4 or 0.6 of a child!
The main advantages of the mode are that it is unaffected by one or two extreme scores, and that it is the easiest measure of central tendency to work out. In addition, it can still be worked out even when some of the extreme scores are not known. However, its limitations generally outweigh these advantages. The greatest limitation is that the mode tends to be unreliable. For example, suppose we have the following scores: 4, 4, 6, 7, 8, 8, 12, 12, 12. The mode of these scores is 12. If just one score changed (a 12 becoming a 4), the mode would change to 4! Another limitation is that information about the exact values of the scores obtained is ignored in working out the mode. This makes it a less sensitive measure than the mean.
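The instability of the mode is easy to demonstrate. The sketch below is one possible way of finding the mode in Python; it returns a list, an assumption made here so that a tie between two or more values can be reported rather than hidden.

```python
from collections import Counter


def modes(scores):
    """Return every score that occurs with the highest frequency, in ascending order."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(score for score, n in counts.items() if n == highest)


no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
original = [4, 4, 6, 7, 8, 8, 12, 12, 12]
changed = [4, 4, 4, 6, 7, 8, 8, 12, 12]   # one 12 has become a 4

print(modes(no_noise))   # [9]
print(modes(original))   # [12]
print(modes(changed))    # [4]
```

Changing a single score moves the mode from the top of the distribution to the bottom, which is why it is regarded as unreliable.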
A final limitation is that it is possible for there to be more than one mode. Levels of measurement From what has been said so far, we have seen that the mean is the most generally useful measure of central tendency, whereas the mode is the least useful. However, we need to take account of the level of measurement when deciding which measure of central tendency to use (the various levels are discussed further on p. 15 of this chapter). At the interval and ratio levels of measurement, each added unit represents an equal increase. For example, someone who hits a target four times out of ten has done twice as well as someone who hits it twice out of ten.
Below this is the ordinal level of measurement, in which we can only order, or rank, the scores from highest to lowest. At the lowest level, there is the nominal level, in which the scores consist of the numbers of participants falling into various categories. The mean should only be used when the scores are at the interval level of measurement. The median can be used when the data are at the interval or ordinal level. The mode can be used when the data are at any of the three levels. It is the only one of the three measures of central tendency that can be used with nominal data. Measures of dispersion The mean, median, and mode are all measures of central tendency.
It is also useful to work out what are known as measures of dispersion, such as the range, interquartile range, variation ratio, and standard deviation. These measures indicate whether the scores in a given condition are similar to each other or whether they are spread out.

Range

The simplest of these measures is the range, which can be defined as the difference between the highest and the lowest score in any condition. In the case of the no-noise group (1, 2, 4, 5, 7, 9, 9, 9, 17), the range is 17 – 1 = 16. The main advantages of the range as a measure of dispersion are that it is easy to calculate and that it takes full account of extreme values.
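As a small illustration, the range of the no-noise scores can be computed directly from the definition. The sketch also recomputes the range with the highest score left out, to show how much a single score can matter. The function name `score_range` is our own.

```python
def score_range(scores):
    """Difference between the highest and the lowest score."""
    return max(scores) - min(scores)


no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
without_top_score = [s for s in no_noise if s != 17]

range_all = score_range(no_noise)                  # 17 - 1 = 16
range_without = score_range(without_top_score)     # 9 - 1 = 8
print(range_all, range_without)
```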
The main weakness of the range is that it can be greatly influenced by one score which is very different from all of the others. In the example, the inclusion of the participant scoring 17 increases the range from 8 to 16. The other important weakness of the range is that it ignores all but two of the scores, and so is likely to provide an inadequate measure of the general spread or dispersion of the scores around the mean or median.

Interquartile range

Interquartile range: the spread of the middle 50% of an ordered or ranked set of scores. For example, suppose that we have the following set of scores: 4, 5, 6, 6, 7, 8, 8, 9, 11, 11, 14, 15, 17, 18, 18, 19.
There are 16 scores, which can be divided into the bottom 25% (4), the middle 50% (8), and the top 25% (4). The middle 50% of scores start with 7 and run through to 15. The upper boundary of the interquartile range lies between 15 and 17, and is given by the mean of these two values, i. e. 16. The lower boundary of the interquartile range lies between 6 and 7, and is their mean, i. e. 6.5. The interquartile range is the difference between the upper and lower boundaries, i. e. 16 – 6.5 = 9.5.

[Figure: Interquartile range. The 16 ordered scores divided into the bottom 25% (4 scores), the middle 50% (8 scores), and the top 25% (4 scores).]

The interquartile range has the advantage over the range that it is not influenced by a single extreme score. As a result, it is more likely to provide an accurate reflection of the spread or dispersion of the scores. It has the disadvantage that it ignores information from the top and the bottom 25% of scores.
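The method just described can be sketched as follows. The sketch assumes, as in the example, that the number of scores divides exactly into four equal parts, and it raises an error otherwise. Statistical packages use a variety of other conventions for locating quartiles, so their answers may differ slightly from the one obtained here.

```python
def interquartile_range(scores):
    """Return (lower boundary, upper boundary, interquartile range).

    Each boundary is the mean of the two scores on either side of the cut
    between the outer 25% and the middle 50% of the ordered scores.
    """
    ordered = sorted(scores)
    n = len(ordered)
    if n % 4 != 0:
        raise ValueError("this sketch assumes the number of scores is a multiple of 4")
    quarter = n // 4
    lower = (ordered[quarter - 1] + ordered[quarter]) / 2
    upper = (ordered[n - quarter - 1] + ordered[n - quarter]) / 2
    return lower, upper, upper - lower


scores = [4, 5, 6, 6, 7, 8, 8, 9, 11, 11, 14, 15, 17, 18, 18, 19]
lower, upper, iqr = interquartile_range(scores)
print(lower, upper, iqr)   # 6.5 16.0 9.5
```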
For example, we could have two sets of scores with the same interquartile range, but with more extreme scores in one set than in the other. The difference in spread or dispersion between the two sets of scores would not be detected by the interquartile range.

Variation ratio

Another simple measure of dispersal is the variation ratio. This can be used when the mode is the chosen measure of central tendency. The variation ratio is defined simply as the proportion of the scores obtained which are not at the modal value (i. e. the value of the mode). The variation ratio for the no-noise condition discussed earlier (scores of 1, 2, 4, 5, 7, 9, 9, 9, 17), where the mode is 9, is as follows: number of non-modal scores ÷ total number of scores = 6/9 = 0.67. The advantages of the variation ratio are that it is not affected by extreme values, and that it is very easy to calculate. However, it is a very limited measure of dispersal, because it ignores most of the data. In particular, it takes no account of whether the non-modal scores are close to, or far removed from, the modal value. Thus, the variation ratio can only provide a very approximate measure of dispersal.

Standard deviation

The most generally useful measure of dispersion is the standard deviation. It is harder to calculate than the range or variation ratio, but generally provides a more accurate measure of the spread of scores.
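Returning briefly to the variation ratio, its definition translates directly into code. The sketch below is a minimal illustration using the no-noise scores; the function name `variation_ratio` is our own.

```python
from collections import Counter


def variation_ratio(scores):
    """Proportion of scores that are not at the modal value."""
    counts = Counter(scores)
    modal_count = max(counts.values())
    return (len(scores) - modal_count) / len(scores)


no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
ratio = variation_ratio(no_noise)   # 6 non-modal scores out of 9
print(round(ratio, 2))              # 0.67
```

Note that the result would be the same however far the six non-modal scores lay from 9, which is the weakness described above.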
However, you will be pleased to learn that many calculators allow the standard deviation to be worked out rapidly and effortlessly. The first step is to work out the mean of the sample. This is given by the total of all of the participants’ scores (ΣX = 130; the symbol Σ means the sum of) divided by the number of participants (N = 13). Thus, the mean is 10. The second step is to subtract the mean in turn from each score (X – M). The calculations are shown in the fourth column. The third step is to square each of the scores in the fourth column (X – M)². The fourth step is to work out the total of all the squared scores, Σ(X – M)². This comes to 136. The fifth step is to divide the result of the fourth step by one less than the number of participants, N – 1 = 12. This gives us 136 divided by 12, which equals 11.33.
This is known as the variance, which is in squared units. Finally, we use a calculator to take the square root of the variance. This produces a figure of 3.37; this is the standard deviation. The method for calculating the standard deviation that has just been described is used when we want to estimate the standard deviation of the population. If we want merely to describe the spread of scores in our sample, then the fifth step involves dividing the result of the fourth step by N. What is the meaning of this figure for the standard deviation? We expect about two-thirds of the scores in a sample to lie within one standard deviation of the mean.
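The five steps can be sketched in Python. The thirteen individual scores of the worked example are not reproduced in this text, so the sketch applies the same steps to the nine no-noise scores instead, and then separately checks the arithmetic of the worked example from its stated totals (136 and N – 1 = 12). The function name and the `sample` flag are our own choices.

```python
import math
import statistics


def standard_deviation(scores, sample=True):
    """Standard deviation by the five steps described in the text.

    sample=True divides by N - 1 (estimating the population value);
    sample=False divides by N (describing the sample itself).
    """
    n = len(scores)
    m = sum(scores) / n                              # step 1: the mean
    deviations = [x - m for x in scores]             # step 2: X - M
    squared = [d ** 2 for d in deviations]           # step 3: (X - M) squared
    sum_of_squares = sum(squared)                    # step 4: total of the squares
    variance = sum_of_squares / (n - 1 if sample else n)   # step 5
    return math.sqrt(variance)                       # final step: square root


no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
sd_estimate = standard_deviation(no_noise)                   # divides by N - 1
sd_descriptive = standard_deviation(no_noise, sample=False)  # divides by N

# Check of the worked example from its stated totals only.
worked_example_sd = math.sqrt(136 / 12)

# How many no-noise scores lie within one standard deviation of the mean?
m = sum(no_noise) / len(no_noise)
within_one_sd = sum(1 for x in no_noise if abs(x - m) <= sd_estimate)

print(round(sd_estimate, 2), round(sd_descriptive, 2), round(worked_example_sd, 2))
print(within_one_sd, "of", len(no_noise))
```

For the no-noise scores, six of the nine scores fall within one standard deviation of the mean, which is the two-thirds that we would expect. Python's standard library offers `statistics.stdev` and `statistics.pstdev` for the N – 1 and N versions respectively, and these can be used to cross-check a hand calculation.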
Take the example in which the mean is 10.0: one standard deviation above the mean is 13.366 and one standard deviation below the mean is 6.634. In fact, 61.5% of the scores lie between those two limits, which is only slightly below the expected percentage. The standard deviation has special relevance in relation to the so-called normal distribution. The normal distribution is a bell-shaped curve in which there are as many scores above the mean as below it. Intelligence (or IQ) scores in the general population provide an example of a normal distribution. Other characteristics such as height and weight also form roughly a normal distribution.
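For a perfectly normal distribution, the proportion of scores lying within a given number of standard deviations of the mean is fixed, and can be computed rather than looked up. The sketch below uses the error function from Python's `math` module; the function name `share_within` is our own. Percentages quoted from printed tables, such as 68.26% and 95.44%, can differ from the computed values in the last digit because of rounding in the tables.

```python
import math


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


shares = {k: share_within(k) for k in (1, 2, 3)}
for k, proportion in shares.items():
    print(f"within {k} standard deviation(s): {100 * proportion:.2f}%")
```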
Most of the scores in a normal distribution cluster fairly close to the mean, and there are fewer and fewer scores as you move away from the mean in either direction. In a normal distribution, 68.26% of the scores fall within one standard deviation of the mean, 95.44% fall within two standard deviations, and 99.73% fall within three standard deviations. The standard deviation takes account of all of the scores and provides a sensitive measure of dispersion. As we have seen, it also has the advantage that it describes the spread of scores in a normal distribution with great precision. The most obvious disadvantage of the standard deviation is that it is much harder to work out than the other measures of dispersion.

DATA PRESENTATION
Information about the scores in a sample can be presented in several ways. If it is presented in a graph or chart, this may make it easier for people to understand what has been found, compared to simply presenting information about the central tendency and dispersion. We will shortly consider some examples. The key point to remember is that all graphs and charts should be clearly labelled and presented so that the reader can rapidly make sense of the information contained in them. Frequency polygon One way of summarising these data is in the form of a frequency polygon. Frequency polygon: a graph showing the frequencies with which different scores are obtained by the participants in a study.
This is a simple form of chart in which the scores from low to high are indicated on the x or horizontal axis and the frequencies of the various scores (in terms of the numbers of individuals obtaining each score) are indicated on the y or vertical axis. The points on a frequency polygon should only be joined up when the scores can be ordered from low to high. In order for a frequency polygon to be most useful, it should be constructed so that most of the frequencies are neither very high nor very low. The frequencies will be very high if the width of each class interval (the categories used to summarise frequencies) on the x axis is too broad (e. g. covering 20 seconds), and the frequencies will be very low if each class interval is too narrow (e. g. covering only 1 or 2 seconds). Each point in a frequency polygon should be placed in the middle of its class interval. There is a technical point that needs to be made here (Coolican, 1994). Suppose that we include all times between 53 and 57 seconds in the same class interval. As we have only measured running times to the nearest second, this class interval will cover actual times between 52.5 and 57.5 seconds. In this case, the mid-point of the class interval (55 seconds) is the same whether we take account of the actual measurement interval (52.5–57.5 seconds) or adopt the simpler approach of focusing on the lowest and highest recorded times in the class interval (53 and 57 seconds, respectively). When the two differ, it is important to use the actual measurement interval.

Histogram

A similar way of describing these data is by means of a histogram. (Histogram: a graph in which the frequencies with which different scores are obtained by the participants in a study are shown by rectangles of different heights.) In a histogram, the scores are indicated on the horizontal axis and the frequencies are shown on the vertical axis.
In contrast to a frequency polygon, however, the frequencies are indicated by rectangular columns. These columns are all the same width but vary in height in accordance with the corresponding frequencies.
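Grouping scores into class intervals, and finding the mid-point of each interval from its actual measurement limits, can be sketched as below. The running times are hypothetical values invented for illustration, and the interval width of 5 seconds and starting value of 48 are assumed choices. The output is a rough text version of a histogram, with one `#` per participant.

```python
from collections import Counter

# Hypothetical running times in seconds, recorded to the nearest second.
times = [48, 51, 53, 54, 55, 55, 56, 57, 58, 59, 61, 62, 64, 66, 71]

WIDTH = 5    # width of each class interval in seconds (assumed)
START = 48   # lowest recorded value of the first interval (assumed)


def class_interval(score):
    """Return (lowest, highest) recorded values of the interval containing score."""
    low = START + ((score - START) // WIDTH) * WIDTH
    return low, low + WIDTH - 1


def midpoint(interval):
    """Mid-point based on the actual measurement interval.

    Times recorded to the nearest second extend half a second beyond
    the lowest and highest recorded values.
    """
    low, high = interval
    return ((low - 0.5) + (high + 0.5)) / 2


frequencies = Counter(class_interval(t) for t in times)

# Every class interval is shown, including any with no scores in it.
low = START
while low <= max(times):
    interval = (low, low + WIDTH - 1)
    n = frequencies[interval]   # a Counter returns 0 for an empty interval
    print(f"{interval[0]}-{interval[1]} (mid {midpoint(interval):.0f}): {'#' * n}")
    low += WIDTH
```

Changing `WIDTH` to 1 or to 20 shows the problem of intervals that are too narrow or too broad: the frequencies become uniformly very low or very high, and the shape of the distribution is lost.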
As with frequency polygons, it is important to make sure that the class intervals are not too broad or too narrow. All class intervals are represented, even if there are no scores in some of them. Class intervals are indicated by their mid-point at the centre of the columns. Histograms are clearly rather similar to frequency polygons. However, frequency polygons are sometimes preferable when you want to compare two different frequency distributions. The information contained in a histogram is interpreted in the same way as the information in a frequency polygon.

Bar chart

Bar chart: a graph showing the frequencies with which the participants in a study fall into different categories.
Frequency polygons and histograms are suitable when the scores obtained by the participants can be ordered from low to high. In more technical terms, the data should be either interval or ratio (see next section). However, there are many studies in which the scores are in the form of categories rather than ordered scores; in other words, the data are nominal. For example, 50 people might be asked to indicate their favourite leisure activity. Suppose that 15 said going to a party, 12 said going to the pub, 9 said watching television, 8 said playing sport, and 6 said reading a good book. These data can be displayed in the form of a bar chart. In a bar chart, the categories are shown along the horizontal axis, and the frequencies are indicated on the vertical axis.
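The leisure-activity data can be turned into a rough text bar chart as follows. This is only a sketch: the bars are drawn sideways for convenience in plain text, whereas a conventional bar chart places the categories along the horizontal axis. The categories are arranged in descending order of popularity, and the function name `text_bar_chart` is our own.

```python
# Favourite leisure activities of 50 people (frequencies from the text).
leisure = {
    "going to a party": 15,
    "going to the pub": 12,
    "watching television": 9,
    "playing sport": 8,
    "reading a good book": 6,
}


def text_bar_chart(frequencies):
    """Return one text bar per category, most popular first."""
    width = max(len(name) for name in frequencies)
    ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    return [f"{name:<{width}} | {'#' * n} {n}" for name, n in ordered]


chart = text_bar_chart(leisure)
print("\n".join(chart))
```

Each bar starts at zero, so the lengths of the bars are directly proportional to the frequencies.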
In contrast to the data contained in histograms, the categories in bar charts cannot be ordered numerically in a meaningful way. However, they can be arranged in ascending (or descending) order of popularity. Another difference from histograms is that the rectangles in a bar chart do not usually touch each other. The scale on the vertical axis of a bar chart normally starts at zero. However, it is sometimes convenient for presentational purposes to have it start at some higher value. If that is done, then it should be made clear in the bar chart that the lower part of the vertical scale is missing. The columns in a bar chart often represent frequencies.
However, they can also represent means or percentages for different groups (Coolican, 1994). How should we interpret the information in a bar chart? In the present example, a bar chart makes it easy to compare the popularity of different leisure activities. We can see at a glance that going to a party was the most popular leisure activity, whereas reading a good book was the least popular.

STATISTICAL TESTS

The various ways in which the data from a study can be presented are all useful in that they give us convenient and easily understood summaries of what we have found. However, to have a clearer idea of what our findings mean, it is generally necessary to carry out one or more statistical tests.
The first step in choosing an appropriate statistical test is to decide whether your data were obtained from an experiment in which some aspect of the situation (the independent variable) was manipulated in order to observe i