What evidence is there to suggest that the outcomes of developmental research are dependent upon the methodologies it employs, and not just the people it studies?

Methodology refers not only to the specific methods and techniques involved in collecting and analysing data, but to the logic underpinning the decisions that researchers are required to make throughout the process, from hypothesis-generation to interpretation of findings (Kothari, 2004; Mouton & Marais, 1988). It will be argued that these decisions not only influence developmental research outcomes profoundly, but give rise to methodologies that are “intrinsically fallible” (Mouton & Marais, 1988, p. 16) and thus threaten validity. Although reliability can also be influenced by methodology, it is generally easier to assess (Aneshensel & Phelan, 2006), and the empirical evidence cited demonstrates it is much less pertinent to the present discussion. Similarly, aspects of methodology that can be reasonably prevented from influencing outcomes will not be discussed. For example, although order-effects in repeated-measures research designs can bias outcomes, they can be “minimised by means of counter-balancing” (Breakwell, Smith, & Wright, 2012, p. 49).

Outcomes will be used here to refer to the results obtained from research and the conclusions drawn, because it will be argued that both aspects are affected by methodological decisions, primarily those concerning what should be studied, who should be included and how the research should be conducted. However, as the decision about what to study influences which data are collected and which techniques and methods are used (Kothari, 2004), the primary impact of deciding what to study is on the subsequent decisions of who to study and how to accomplish it.

Every sampling strategy involves some form of selection bias and, therefore, deciding who to study primarily affects the outcomes (and validity) of research because the sample may not be representative of the parent population (Pastorino & Doyle-Portillo, 2011). For example, Kohlberg (1969) developed a popular model to explain moral development which was derived from the outcomes of case analyses and interviews conducted with young males. However, controversy arose from research demonstrating that young females scored significantly lower on tests of moral judgement based on Kohlberg’s model (Donenberg & Hoffman, 1988). Subsequently, Gilligan (1982) found the way young people responded to moral dilemmas differed between genders. Therefore, according to Gilligan, the outcomes of Kohlberg’s research were biased because of his decision to study males exclusively, and this reduced the validity of his research.

Validity is also questioned when generalising cross-culturally as there may be considerable differences in the populations. For example, Kitayama, Markus, Matsumoto, and Norasakkunkit (1997) found an association between personal achievement and positive emotion in American students yet they found an association between negative emotion and personal achievement in Japanese students.

Even generalising within a country can be problematic in developmental research. For example, a longitudinal study by Salvas et al. (2011) investigated aggression in Canadian children, yet the sample was taken from only one geographic region (Quebec). Furthermore, only 13 percent of the participants’ families had an annual household income between $40,000 and $49,999 when the national mean at the time was $46,893 (Statistics Canada, 1998). If aggression does not follow the same developmental trajectory irrespective of socio-economic factors, these sampling decisions will have introduced bias into the results.

Arguably, sample bias is especially acute in developmental research owing to the small sample sizes typically imposed by difficulties in recruiting participants, which increase the likelihood that samples will be unrepresentative. However, using a small sample does not necessarily mean the outcomes will be affected. Gilligan’s (1982) theory of gender-differences in moral judgement was based on a very small sample size but has since been validated by considerably larger studies (e.g., Donenberg & Hoffman, 1988). Similarly, Amsterdam (1972) found the crucial age when children started recognising themselves (in the mirror) was 20 to 24 months and, although her research was based on a relatively modest sample, her results have been replicated reliably by studies utilising larger samples (e.g., Lewis & Brooks-Gunn, 1979).
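The link between sample size and representativeness can be illustrated with a short simulation. The sketch below is not drawn from any of the studies cited; the population of scores (mean of about 100, standard deviation of about 15), the sample sizes of 10 and 200, and the function name `spread_of_sample_means` are all hypothetical choices made for illustration. It repeatedly draws random samples and measures how far the sample means scatter around the population mean.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, n_samples=2000, seed=0):
    """Standard deviation of the sample mean across repeated random samples."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


# Hypothetical population of 10,000 scores (assumed mean 100, SD 15).
_rng = random.Random(42)
population = [_rng.gauss(100, 15) for _ in range(10_000)]

small = spread_of_sample_means(population, sample_size=10)
large = spread_of_sample_means(population, sample_size=200)

print(f"Spread of sample means with n=10:  {small:.2f}")
print(f"Spread of sample means with n=200: {large:.2f}")
```

Under these assumptions the means of small samples scatter far more widely than those of large samples, so any single small sample is more likely to misrepresent the population. A small sample that happens to fall close to the population value, as later replications suggest Amsterdam’s (1972) did, remains entirely possible.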

As well as the larger sample, Lewis and Brooks-Gunn’s (1979) study additionally used photographs and videotapes to test self-recognition and the results generally corroborated the validity of Amsterdam’s (1972) mirror-method. However, other evidence suggests the way in which researchers conduct their studies can (and often does) have a major influence on outcomes.

Decisions about how to conduct a research study can be conceptualised as either micro-level decisions (e.g. the practical aspects of the method), or macro-level decisions (e.g. the research design). Because these decisions are often based on previous research, problems with validity can affect subsequent research. For example, the validity of Kohlberg’s (1969) research has already been criticised because of sample bias, but what about the validity of the research that influenced it? Piaget’s theory of cognitive development, particularly the role of egocentrism (a child’s inability to take another’s perspective), was a major influence in Kohlberg’s stage-based, moral-development model (Donenberg & Hoffman, 1988). Piaget and Inhelder’s (1956) research into egocentrism exemplifies how micro-level decisions can influence the outcomes and the validity of developmental research and, therefore, the validity of subsequent research that draws upon its conclusions.

Piaget and Inhelder (1956) tested egocentrism using a model of three mountains that participants viewed from one angle whilst being required to identify the perspective of another person (or a doll) viewing the model from a different angle. To communicate the answer, each participant was given photographs of the model and was required to identify the one that matched the other’s perspective. Four-year-olds consistently failed the task whereas children aged seven and eight consistently passed. The results appeared to confirm Piaget’s theory that children enter the concrete operational stage at around age seven, only then acquiring the cognitive faculties to manipulate information logically (e.g. to identify another person’s visual-spatial perspective).

If the results of Piaget and Inhelder’s (1956) study were not simply a product of the methodology, and were solely dependent on the participants, children much younger than seven should find it difficult to take the visual-spatial perspective of another. However, Borke (1975) showed that children as young as three can pass such tests if changes are made to the method.

Borke (1975) designed her own version of Piaget and Inhelder’s (1956) perspective study, but placed the mountain model on top of a revolving turntable. Children were required to rotate the display until it matched the “view” of a toy figure. According to Borke, this made it easier for young children to communicate their understanding of another’s perspective as it did not involve the “transition from a three-dimensional display to a two-dimensional picture” (p. 243). Borke also included a familiarisation trial to ensure children understood the task. These small methodological changes resulted in four-year-olds being able to give correct responses 67% of the time.

Furthermore, Borke (1975) conducted trials with more age-appropriate models and found children aged three passed the task at least 80% of the time. Similarly, Hughes and Donaldson (1979) devised a test of egocentrism that required children to hide a doll from the view of a toy policeman and reported three-year-olds were able to pass the task with 90% accuracy because the task was familiar to them.

Therefore, Borke (1975) and Hughes and Donaldson (1979) obtained considerably different results from Piaget and Inhelder (1956) simply by making relatively minor alterations to the method, suggesting that outcomes of developmental research are reliant on methodologies. This further calls into question the validity of Kohlberg’s (1969) moral-development model that adopted egocentrism (and the associated developmental stage) as a crucial factor in the theory that underpinned it.

Another methodological decision affecting validity is the choice of outcome measure. Fosco, Stormshak, Dishion, and Winter (2012) investigated the role of parental monitoring in predicting adolescent problem behaviour. However, to determine how closely adolescents were monitored by their parents, a survey was designed that contained questions about whether parents knew where their children were most of the time and whether their children kept any secrets from them. Obviously, the parents might have thought they knew the true answers, but this might not have been the case. Moreover, parents may not have answered honestly owing to concerns about how they would be judged. Therefore, the outcome measure might not have given a valid representation of the actual degree of parental monitoring each child received, thus undermining the validity of the outcomes. Furthermore, the researchers often employed the interview method for collecting data, and developmental research (e.g., Quas et al., 2007; Thompson, Clarke-Stewart, & Lepore, 1997) has demonstrated that interviewees can be heavily influenced by a biased interviewer.

Rosenthal and Fode (1963) demonstrated that researchers can also influence outcomes in more subtle ways. Student experimenters were asked to conduct a maze-learning experiment with rats. Some were told their rat was intelligent; others were told it was unintelligent. This was a deception as all rats came from the same population and were randomly assigned to experimenters. Despite the experimenters showing no signs of cheating or biasing the experiments, a statistically significant increase in performance was shown by the “intelligent” rats; the experimenters were unaware they had influenced the results.

In other fields, these experimenter effects can be overcome by the double-blind experiment in which neither the experimenter nor the participant is aware of the experimental condition to which the participant has been randomly assigned (Kantowitz, Roediger, & Elmes, 2014). However, in developmental research, participants are often assigned to groups based on a subject-variable such as school year (e.g., Krauss & Glucksberg, 1969) which prevents the true experiment (and thus the double-blind design) from being viable because of the potential influence from confounding variables. Therefore, although the macro-level decision about which research design to use can have a considerable influence on the outcomes, developmental researchers are often limited in their choice. This choice is also restricted by ethical issues. For example, Fosco et al. (2012) could not have chosen an experimental design and randomly assigned adolescents to be either closely monitored by their parents or not. Therefore, the methodology (and thus its impact on results) is often dictated by the nature of developmental research.

The “choice” of research design also has considerable influence over the conclusions that researchers make. For example, Eron, Huesmann, Lefkowitz, and Walder (1972) conducted a longitudinal study of 427 youths and found that watching violent television programmes in early childhood was associated with aggressive behaviour ten years later. However, owing to the nature of correlational research, causality (that is, that violence on television caused aggression) could not be legitimately established from the study. The conclusion was merely that there was an observed association because the research design prevented full control over confounding variables. The study also demonstrates an especially problematic issue associated with longitudinal research: attrition (Goodwin, 2009). Of the original 875 youths, 448 could not be located for the follow-up study ten years after they were first measured. The remaining 427 thus comprised the entire sample. If there were characteristics or circumstances that influenced the likelihood of participants being unavailable ten years later, this would have created a selective sample that might not have been representative and thus reduced the validity of the outcomes, especially if those characteristics or circumstances were related to aggression (e.g. risk of imprisonment).
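The way selective attrition can distort a follow-up sample can be sketched with a simple simulation. Only the starting cohort size of 875 comes from the study described above. The standardised aggression scores, the dropout probabilities of 0.8 and 0.2, and the function name `simulate_attrition` are hypothetical assumptions chosen purely for illustration, and nothing here reproduces the actual data or analysis of Eron et al. (1972).

```python
import random
import statistics


def simulate_attrition(n=875, seed=1):
    """Simulate a cohort in which dropout depends on the variable of interest.

    Returns the aggression scores of the full cohort and of the participants
    who remain available at follow-up.
    """
    rng = random.Random(seed)
    full, retained = [], []
    for _ in range(n):
        aggression = rng.gauss(0, 1)  # hypothetical standardised score
        # Assumed dropout rule: more aggressive participants are harder
        # to locate at follow-up (e.g. owing to risk of imprisonment).
        p_dropout = 0.8 if aggression > 0 else 0.2
        full.append(aggression)
        if rng.random() >= p_dropout:
            retained.append(aggression)
    return full, retained


full, retained = simulate_attrition()

print(f"Original cohort: {len(full)}, retained at follow-up: {len(retained)}")
print(f"Mean aggression, full cohort:     {statistics.mean(full):.2f}")
print(f"Mean aggression, retained sample: {statistics.mean(retained):.2f}")
```

Under this assumed dropout rule, the retained sample has a noticeably lower mean aggression score than the cohort it came from, even though roughly half of the participants remain. Whether anything comparable occurred in the real study cannot be determined from the figures reported here.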

Although the conclusions drawn from a study can be limited by the research design, deciding how to interpret the findings is often subjective and, arguably, is the most influential methodological decision. For example, Jusczyk and Aslin (1995) and Saffran, Aslin, and Newport (1996) investigated whether infants could detect repeated words in fluent speech by timing how long infants listened to speech-strings that contained either unfamiliar words or familiar words introduced during a familiarisation procedure. Jusczyk and Aslin found infants listened longer to familiar words, which they concluded was because infants recognised the words. The methodology Saffran et al. used differed in several aspects (e.g. using a speech synthesizer rather than a real voice) and, contrary to Jusczyk and Aslin’s results, Saffran et al. found that infants listened longer to unfamiliar words (which again demonstrates how research methods can influence outcomes). However, Saffran et al. attributed their results to a novelty preference and thus concluded their study had also demonstrated word-recognition.

The aforementioned examples provide firm evidence that the outcomes of developmental research are dependent on the methodologies employed, not just the people studied. The impact this has on validity raises concerns about the influence developmental research has on society, as it has informed family interventions designed to reduce problem behaviour, influenced public policy in regard to age-restrictions on mediated violence, and informed educational policy in regard to moral and cognitive development.

Although it has been demonstrated that methodology does not always impact validity, it has also been shown that it often does. Arguably, the main threat to the validity of developmental research is due to the restricted use of the controlled, experimental design and, therefore, developmental researchers should take extra measures to control for confounding variables and exercise caution about conclusions and generalisations made. In fact, Mouton and Marais (1988) call for all researchers to have critical awareness of how their assumptions and decisions influence the outcomes (as is common practice in qualitative research).

However, just as reliability can be established by replicating the methodologies of previous research (and checking that similar results are obtained), it has been demonstrated that validity can be established through follow-up research that utilises different methodologies from the studies preceding it.

References

Amsterdam, B. (1972). Mirror self‐image reactions before age two. Developmental psychobiology, 5(4), 297-305.

Aneshensel, C. S., & Phelan, J. C. (2006). Handbook of the Sociology of Mental Health: Springer.

Borke, H. (1975). Piaget’s mountains revisited: Changes in the egocentric landscape. Developmental Psychology, 11(2), 240-243. doi: 10.1037/h0076459

Breakwell, G. M., Smith, J. A., & Wright, D. B. (2012). Research Methods in Psychology: SAGE Publications.

Donenberg, G., & Hoffman, L. (1988). Gender differences in moral development. Sex Roles, 18(11-12), 701-717. doi: 10.1007/BF00288055

Eron, L. D., Huesmann, L. R., Lefkowitz, M. M., & Walder, L. O. (1972). Does television violence cause aggression? American Psychologist, 27(4), 253.

Fosco, G. M., Stormshak, E. A., Dishion, T. J., & Winter, C. E. (2012). Family relationships and parental monitoring during middle school as predictors of early adolescent problem behavior. Journal of Clinical Child & Adolescent Psychology, 41(2), 202-213.

Gilligan, C. (1982). In a different voice: Harvard University Press.

Goodwin, C. J. (2009). Research In Psychology: Methods and Design: John Wiley & Sons.

Hughes, M., & Donaldson, M. (1979). The use of hiding games for studying the coordination of viewpoints. Educational Review, 31(2), 133-140.

Jusczyk, P. W., & Aslin, R. N. (1995). Infants’ detection of the sound patterns of words in fluent speech. Cognitive Psychology, 29(1), 1-23.

Kantowitz, B., Roediger, H., & Elmes, D. (2014). Experimental Psychology: Cengage Learning.

Kitayama, S., Markus, H. R., Matsumoto, H., & Norasakkunkit, V. (1997). Individual and collective processes in the construction of the self: self-enhancement in the United States and self-criticism in Japan. Journal of personality and social psychology, 72(6), 1245.

Kohlberg, L. (1969). The cognitive-developmental approach. In D. A. Goslin (Ed.), Handbook of socialization theory and research. Chicago: Rand McNally.

Kothari, C. (2004). Research methodology: methods and techniques: New Age International.

Krauss, R. M., & Glucksberg, S. (1969). The development of communication: Competence as a function of age. Child development, 255-266.

Lewis, M., & Brooks-Gunn, J. (1979). Social cognition and the acquisition of self: Plenum Publishing Company Limited.

Mouton, J., & Marais, H. C. (1988). Basic Concepts in the Methodology of the Social Sciences: Human Sciences Research Council.

Pastorino, E., & Doyle-Portillo, S. (2011). What is Psychology?: Cengage Learning.

Piaget, J., & Inhelder, B. (1956). Child’s Conception of Space. New York: Norton.

Quas, J. A., Malloy, L. C., Melinder, A., Goodman, G. S., D’Mello, M., & Schaaf, J. (2007). Developmental differences in the effects of repeated interviews and interviewer bias on young children’s event memory and false reports. Developmental Psychology, 43(4), 823-837. doi: 10.1037/0012-1649.43.4.823

Rosenthal, R., & Fode, K. L. (1963). The effect of experimenter bias on the performance of the albino rat. Behavioral Science, 8(3), 183-189.

Saffran, J. R., Aslin, R. N., & Newport, E. (1996). Statistical learning by 8-month-old infants. Science, 274(5294), 1926-1928.

Salvas, M.-C., Vitaro, F., Brendgen, M., Lacourse, É., Boivin, M., & Tremblay, R. E. (2011). Interplay between Friends’ Aggression and Friendship Quality in the Development of Child Aggression during the Early School Years. Social Development, 20(4), 645-663. doi: 10.1111/j.1467-9507.2010.00592.x

Statistics Canada. (1998). Survey of Consumer Finances, 1998 [Canada]: Household Income, Facilities, and Equipment [public use microdata file]. Ottawa, Ontario: Statistics Canada, Household Surveys Division [producer and distributor].

Thompson, W. C., Clarke-Stewart, K. A., & Lepore, S. J. (1997). What did the janitor do? Suggestive interviewing and the accuracy of children’s accounts. Law and Human Behavior, 21(4), 405.