Neuroscientific Study of Religious and Spiritual Phenomena: A Field Analysis
With the rapidly expanding field of research exploring religious and spiritual phenomena, there have been many perspectives regarding the validity, importance, relevance, and need for such research. There is also the ultimate issue of how such research should be interpreted with regard to epistemological questions. The best way to evaluate this field is to determine the methodological issues that currently affect the field and explore how best to address such issues so that future investigations can be as robust as possible and make this body of research more mainstream.
It should also be mentioned at the outset that the overall study of religious and spiritual phenomena requires, at its root, an analysis of very complex, very compelling, and very subjective experiences. Hence cognitive neuroscience offers one of the most important fields of study for exploring such phenomena. It is unfortunate that various perspectives regarding this research are often uninformed or misinformed regarding the nature and potential results of such research. Perspectives range from the highly religious to the highly materialistic, with concerns ranging from the research “writing off” religious experience on one hand to its being unimportant on the other. These criticisms miss several major issues, such as the challenge that cognitive neuroscience faces in exploring arguably some of the most complex mental phenomena human beings have. With many current studies on emotion, laughter, morality, and happiness being reported almost daily, there should also be substantial information regarding complex human experiences vis-à-vis the study of religious and spiritual phenomena. Furthermore, it is a great challenge to science to develop appropriate definitions, measurement tools, and methods in order to study such phenomena. The results from these studies will also provide important mechanistic data that may help elucidate any potential health effects, positive or negative, that are associated with religion and spirituality. The religious and spiritual perspectives also stand to gain tremendously from this interaction since the results might help towards a new and deeper understanding of these phenomena. This research may also lead to a better understanding not only of specific types of experiences, but of a wide range of phenomena that pertain to religion including love, altruism, charity, forgiveness, worship, theology, and epistemology. However, it should be stated that such leaps must be made carefully, fully acknowledging the dynamic relationship between science and religion.
While some of these applications are still far off, the potential benefits should be obvious.
This paper will review four dimensions of this area of research with a critical perspective on methodology and statistical analysis. The four dimensions as they relate to the neuroscientific study of religious and spiritual phenomena are: 1) Appropriate measures and definitions; 2) Subject selection and comparison groups; 3) Study design and biostatistics; and 4) Theological and epistemological implications.
1. Measurement and Definition of Spirituality and Religiousness
One of the most important issues related to the measurement of religious and spiritual phenomena has to do with correlating subjective and objective measures. For example, if a particular type of meditation reduces blood pressure or is associated with changes in cerebral metabolism, it is critical to know what was actually experienced by the individual.
In some sense, the most important measures of religious and spiritual phenomena are those that pertain to the subjective nature of the experience. When any person has a religious or spiritual experience, they can usually try to describe it in terms of various cognitive, behavioral, and emotional parameters. Furthermore, a person will usually define the experience as “spiritual,” which distinguishes the experience from others which are regarded as “non-spiritual.” The issue of measuring the subjective nature of these phenomena is akin to opening the mysterious “black box” in which something is happening, but it is not immediately observable by an outside investigator. The problem becomes more difficult when trying to compare experiences across individuals and across cultures. A spiritual experience for a Jew may be vastly different from a spiritual experience for a Hindu. Furthermore, there is likely to be a continuum of experiences ranging from barely perceptible to absolutely mystical (d’Aquili and Newberg, 1993). The question for any researcher is how to qualitatively ascertain the subjective component of such experiences. Is there a way to quantify and compare these subjective feelings and thoughts individuals have regarding their spiritual experiences? It is difficult to develop adequate scales to measure spirituality and religiousness. Many existing scales are difficult to find in the literature, especially when they are reported in non-scientific journals that are not typically cited or referenced in literature reviews (Larson, Swyers, and McCullough, 1998).
A number of attempts have been reported in the literature to develop a self-reporting scale that measures the subjective nature of a particular religious or spiritual phenomenon. The book Measures of Religiosity (Hill and Hood, 1999) provides fertile ground for various scales and questionnaires that assess everything from a person’s feeling of commitment to awe to hope to the direct apprehension of God. Some have been assessed for validity and reliability, which is critical if these scales are to have any use in future research studies. Validity means that the results return information about what the scale is supposed to measure (Patten, 2000). For example, a valid scale of a feeling of hopefulness would ask questions regarding the amount of hope a person has. If this scale did not address hope, but rather happy emotional responses, it would not be a valid measure of hope. Reliability assesses whether the scale, when given to the same person at different time points, yields roughly the same results (Patten, 2000). While it is important to assess the reliability and validity of scales, this is particularly problematic with regard to religious and spiritual phenomena. The reason for this difficulty is the problem with defining these terms. If someone defines spiritual as a feeling of “awe” and another defines it as a feeling of “oneness,” what types of questions should be used to assess spirituality? A questionnaire that asks about feelings of awe might not truly be measuring spirituality and therefore, until clear and operational definitions of spirituality and religiousness can be determined, there will always be the potential problem of developing valid scales. Reliability is also a problem since spirituality and religiousness can be very consistent or widely variable within an individual.
Thus, they might subjectively feel differently at different time points and therefore, the reliability of any scale with the intention to measure spirituality is always problematic.
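The notion of reliability described above can be made concrete with a small sketch. One common way to express test-retest reliability is the Pearson correlation between scores from two administrations of the same scale to the same people. The sketch below assumes this approach; the function name and the scores are hypothetical and are not drawn from any of the scales cited here.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)

# Hypothetical spirituality-scale totals for six people, measured twice.
time1 = [12, 18, 25, 30, 22, 15]
time2 = [14, 17, 27, 29, 20, 16]

# A value near 1 suggests the scale ranks people consistently over time.
retest_r = pearson_r(time1, time2)
print(f"test-retest r = {retest_r:.2f}")
```

A low correlation in such a calculation would not by itself show that the scale is poor, since, as noted above, the underlying state may genuinely change between time points.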
Another problem with individual scales is whether they are useful across traditions and cultures. For example, many of the scales that are referenced in Measures of Religiosity are Christian-based, and therefore, may not be useful for evaluating Jewish or Buddhist perspectives. Fortunately, there are other scales which either have a more universal quality or at least can be modified to accommodate other perspectives. However, this might bring into question the validity and reliability of such scales in different contexts.
There is another interesting problem with scales that attempt to measure the subjective nature of spiritual or religious phenomena. This arises from the fact that most scales of spirituality and religiousness require the individual to respond in terms of psychological, affective, or cognitive processes. Thus, questions are phrased: How did it make you feel? What sensory experiences did you have? What did you think about your experience? On one hand, such measures are very valuable to individuals interested in exploring the neural correlates of such experiences because psychological, affective, and cognitive elements can usually be related to specific brain structures or function. However, the problem with phrasing questions in this way is that one never actually escapes the neurocognitive perspective to get at something that might be “truly” spiritual. It might be suggested that the only way in which an investigator can reach something which is truly spiritual would be through a process of elimination in which all other factors (i.e., cognitive, emotional, sensory) are eliminated through the analysis, leaving only the spiritual components of the experience. In other words, the most interesting result from a brain scan of someone in prayer would be to find no significant change in the brain during the time that the individual has the most profound spiritual experience.
As described above, part of the problem with developing adequate measures is ensuring that they measure what they claim to measure. A subjective scale designed to measure the degree of an individual’s religiosity needs to focus on the things which make someone religious. However, this first requires a clear definition of religiousness and spirituality. Furthermore, these definitions must be operationalized so that any measure or study can have a firm enough grasp to actually measure something (Koenig, 1998; Koenig, McCullough, and Larson, 2001). To that end, it is important to avoid narrow definitions that might impede research and also to avoid broad definitions that cannot be measured. For example, definitions of religion that pertain to a single God would eliminate almost two billion Hindu and Buddhist individuals from analysis. On the other hand, a definition of religiousness that is too broad might end up including many bizarre experiences and practices such as cults or devil worship.
One final issue, which is related to problems with definitions, is that there are so many approaches to religious and spiritual phenomena that it is often difficult to generalize from one study to another. Some scholars have pointed out that one type of meditation practice may be very different from other types, or one type of experience might be substantially different from other types (Andresen, 2000; Andresen and Forman, 2000). It is certainly critical to ensure that any study clearly states the specific practices, sub-practices, and traditions involved. Furthermore, changes in the brain associated with one type of meditative practice may not be specifically related to a different type of practice. Of course, the dynamic nature of this body of research may also provide new ways of categorizing certain practices or experiences so that one can address the question of whether different types of meditation truly are different, or are only experienced as different.
Objective Measures of Spirituality
Objective measures of religious and spiritual phenomena that pertain to the neurosciences include a variety of physiological and neurophysiological measures. Recent advances in fields such as psychoneuroendocrinology and psychoneuroimmunology address the important interrelationship between the brain and body. Any thoughts or feelings perceived in the brain ultimately have effects on the functions throughout the body. While this can complicate measures as well as introduce confounding factors, this integrated approach allows for a more thorough analysis of religious and spiritual phenomena (Newberg and Iversen, 2003). Several types of measures which have already been reported in the literature include measures of autonomic nervous system activity. These are the most common approaches to specific religious and spiritual practices such as meditation or prayer. A number of studies have revealed changes in blood pressure and heart rate associated with such practices (Sudsuang, Chentanez, and Veluvan, 1991; Jevning, Wallace, and Beidebach, 1992; Koenig, McCullough, and Larson, 2001). It is interesting that the actual changes may be quite complex, involving both a relaxation as well as an arousal response. Early work by Gellhorn and Kiely (1972) developed a model of the physiological processes involved in meditation based almost exclusively on autonomic nervous system (ANS) activity, which while somewhat limited, indicated the importance of the ANS during such experiences. These authors suggested that intense stimulation of either the sympathetic or parasympathetic system, if continued, could ultimately result in simultaneous discharge of both systems (what might be considered a “breakthrough” of the other system).
Several studies have demonstrated predominant parasympathetic activity during meditation associated with decreased heart rate and blood pressure, decreased respiratory rate, and decreased oxygen metabolism (Sudsuang, Chentanez, and Veluvan, 1991; Jevning, Wallace, and Beidebach, 1992; Travis, 2001). However, a recent study of two separate meditative techniques suggested a mutual activation of parasympathetic and sympathetic systems by demonstrating an increase in the variability of heart rate during meditation (Peng, Mietus, Liu, et al., 1999). The increased variation in heart rate was hypothesized to reflect activation of both arms of the autonomic nervous system. This notion also fits the characteristic description of meditative states in which there is a sense of overwhelming calmness as well as significant alertness. Also, the notion of mutual activation of both arms of the ANS is consistent with recent developments in the study of autonomic interactions (Hugdahl, 1996).
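To illustrate what “variability of heart rate” means quantitatively, the following sketch computes two standard time-domain indices from a series of beat-to-beat (RR) intervals: SDNN, the standard deviation of the intervals, and RMSSD, the root mean square of successive differences. These indices are chosen here only for illustration and are not necessarily the measures used in the study cited above; the interval values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def sdnn(rr_ms):
    """Sample standard deviation of RR intervals (milliseconds)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    return stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (milliseconds)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return sqrt(mean([d * d for d in diffs]))

# Hypothetical RR intervals in milliseconds for one subject.
baseline = [800, 810, 795, 805, 800, 790]
meditation = [760, 880, 740, 900, 770, 860]

for label, series in (("baseline", baseline), ("meditation", meditation)):
    print(f"{label}: SDNN = {sdnn(series):.1f} ms, RMSSD = {rmssd(series):.1f} ms")
```

Larger values in the second condition would correspond to the kind of increased variability described above, although real recordings would require artifact removal and far longer series.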
Measures of hormone and immune function have more recently been explored, especially as an adjunct measure to various clinical outcomes (O’Halloran, Jevning, Wilson, et al., 1985; Walton, Pugh, Gelderloos, and Macrae, 1995; Tooley, Armstrong, Norman, and Sali, 2000; Infante, Torres-Avisbal, Pinel, et al., 2001). Thus, if a hypothetical study showed that the practice of meditation results in reductions in cancer rates, then it might be valuable to measure the immunological and hormonal status of the individuals to determine the physiological basis of the effect. Certain cancers are related to abnormalities in immune function (e.g., leukemia or lymphoma) or hormonal function (e.g., breast and prostate cancer). It is also important to note that alterations in various hormones and immune functions may be related to specific changes in brain function. Further, this interaction can be bidirectional. Thus, certain brain states may enhance hormonal status, but these hormonal states may in turn affect brain function. This can particularly be observed in women with premenstrual syndrome, but there are other circumstances in which various neurohormones can alter emotional, cognitive, and behavioral states.
Neurophysiological changes associated with religious and spiritual states can be obtained through a number of techniques that each have their own advantages and disadvantages. In general, the primary requirement is that the methodology evaluates functional changes in the brain. However, there are many ways of measuring such functional changes. Early studies of meditation practices made substantial use of electroencephalography (EEG), which measures electrical activity in the brain (Banquet, 1973; Hirai, 1974; Hebert and Lehmann, 1977; Corby, Roth, Zarcone, and Kopell, 1978). EEG is valuable because it is relatively non-invasive and has very good temporal resolution. In other words, the instant that an individual achieves a certain state, the EEG should change accordingly. For this reason, it has continued to be useful in the evaluation of specific meditation states (Lehmann et al., 2001; Aftanas and Golocheikine, 2002; Travis and Arenander, 2004). The major problem with EEG is that spatial resolution is very low, so that any change can only be localized over very broad areas of the brain. Another problem is that analysis can be difficult because of the extensive amount of recordings that are made during any session. However, EEG may be particularly valuable to include in studies employing functional imaging techniques since the EEG may help to signal certain states, or at the very least, ensure that the individual being studied has not fallen asleep.
Functional neuroimaging of religious and spiritual phenomena might become one of the most important techniques since the results have physiological, clinical, and potentially philosophical relevance to the issues related to such phenomena. To date, brain imaging studies have utilized positron emission tomography (PET), single photon emission computed tomography (SPECT), and functional magnetic resonance imaging (fMRI). In general, such techniques can measure functional changes in the brain in pathological conditions, in response to pharmacological interventions, and during various activation states. Activation states have included sensory stimulation (e.g., visual, auditory), motor function and coordination, language, and higher cognitive functions (e.g., concentration) (Newberg and Alavi, 1996). The changes that can be measured include more general physiological processes such as cerebral blood flow and metabolism, in addition to many aspects of the neurotransmitter systems. For example, the serotonin, dopamine, opiate, benzodiazepine, glutamate, and acetylcholine systems have all been evaluated in a number of brain states (Newberg and Alavi, 2003; Warwick, 2004; Kennedy and Zubieta, 2004).
While functional neuroimaging studies have contributed greatly to the understanding of the human brain, they each have their own advantages and limitations with respect to evaluating religious and spiritual phenomena. Functional MRI primarily measures changes in cerebral blood flow. In general, this is a valid method for measuring cerebral activity since a brain region that is activated during a specific task will experience a concomitant increase in blood flow. This coupling of blood flow and activity provides a method for observing which parts of the brain have increased activity (increased blood flow) and decreased activity (decreased blood flow). Functional MRI has several advantages. It has very good spatial resolution and can be coregistered with an anatomical MRI scan that can be obtained in the same imaging session. This allows for a very accurate determination of the specific areas of the brain that are activated. Functional MRI also has very good temporal resolution so that many images can be obtained over very short periods of time, as short as a second. Thus, if a subject were asked to perform 10 different prayers sequentially while in the MRI, the differences in blood flow could be detected in each of those 10 prayer states. Finally, fMRI does not involve any radioactive exposure. The disadvantages are that images must be obtained while the subject is in the scanner and the scanner can make up to 100 decibels of noise. This can be very distracting when individuals are performing various spiritual practices such as meditation or prayer. However, several investigators have successfully utilized fMRI by having subjects practice their meditation technique at home while listening to a tape of the fMRI noise so that they become acclimated to the environment (Lazar, Bush, Gollub, Fricchione, Khalsa, and Benson, 2000). The MRI noise can also affect brain activity, particularly in the auditory cortex.
Functional MRI also relies on a tight coupling between cerebral blood flow and actual brain activity, which while a reasonable assumption, is not true in all cases. Well known examples in which brain activity and blood flow are not coupled include stroke, head injury, and pharmacological interventions (Newberg and Alavi, 2003). However, a detailed evaluation of this coupling in all brain states has not been performed. One final disadvantage is that at the present moment, fMRI cannot be used to evaluate individual neurotransmitter systems.
PET and SPECT imaging also have advantages and disadvantages for studying religious and spiritual phenomena. The advantages include relatively good spatial resolution for PET (comparable to fMRI), with slightly worse resolution for SPECT imaging. PET and SPECT images can also be coregistered with anatomical MRI, but this must be obtained during a separate session and therefore, matching the scans is more difficult. PET and SPECT both require the injection of a radioactive tracer so radioactivity is involved, although usually the exposure is fairly low. Depending on the radioactive tracer used, a variety of functional parameters can be measured including blood flow, metabolism (which more accurately depicts cerebral activity), and many different neurotransmitter components. The ability to measure these neurotransmitter systems is unique to PET and SPECT imaging. Such tracers can measure either state or trait responses. It should also be mentioned that some of the more common radioactive materials such as fluorodeoxyglucose (which measures glucose metabolism) can be injected through an existing intravenous catheter when the subject is not in the scanner. This allows for a more conducive environment for performing practices such as meditation and prayer. This tracer becomes “locked” in the brain during the injection period, so the person can be scanned after completing the practice and the scan will still reflect changes associated with the practice (Herzog, 1990-1991; Newberg, Alavi, Baime, Pourdehnad, Santanna, and d’Aquili, 2001). A major drawback to PET and SPECT imaging, in addition to the radioactive exposure, is that these techniques have generally poor temporal resolution. Depending on the tracer, the temporal resolution can be as good as several minutes and as bad as several hours or even days. PET or SPECT would be very difficult to use in order to study 10 different prayer states in the same session.
However, 2 or 3 states might be measured in the same imaging session if the appropriate radiopharmaceutical is used (Lou, Kjaer, Friberg, Wildschiodtz, Holm, Nowak, 1999). The result of this discussion is that depending on the goals of the study, various neuroimaging techniques might be better or worse.
There are other more global problems that affect the ability to interpret the results of all functional imaging studies. The most important of these is how to be certain what is actually being measured physiologically and how it compares to various subjective experiences. There are already potential problems addressing what a particular scan finding means in terms of the actual activity state of the brain. For example, it is not clear what will be observed if there is increased activity in a group of inhibitory neurons. Would that result in increased or decreased cerebral activity as measured by PET or fMRI? The bigger problem is trying to compare the physiological changes observed to the subjective state. With regard to religious and spiritual experience, it is not possible to intervene at some “peak” experience to ask the person what they are feeling. Therefore, if a person undergoes fMRI during a meditation session and they have a peak experience, how will the researcher know which scan findings relate to it? In addition, there are typically a number of changes in the brain with varying degrees of strength. It is not clear what degree of change should be considered a relevant change (e.g., 10% or 20%). From a statistical perspective, analyzing images has a number of problems including how to compare images across subjects and conditions and how to take into account the problems of multiple comparisons both in terms of activation states and also in terms of individual brain regions. A program called statistical parametric mapping (SPM) has been developed which can be used to evaluate various images and works by normalizing the images, coregistering the images and then analyzing them pixel by pixel for significant changes (Friston, Holmes, Worsley, Poline, Frith, and Frackowiak, 1995). This is a very conservative statistical approach because of the problem with multiple comparisons and therefore subtle changes may be missed.
Of course, the question still arises as to whether changes observed which are not significant in SPM are still clinically relevant. Furthermore, in the study of religious and spiritual states, it may be important to evaluate subjects on an individual basis since such states may be highly variable phenomenologically across subjects.
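The multiple comparisons problem mentioned above can be illustrated with a short worked example. The sketch below uses the Bonferroni correction, which is the simplest such adjustment and is shown here only as an illustration; it is not a description of how SPM itself corrects for multiple comparisons. The function names, the number of pixels, and the p-values are all hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests

def surviving_tests(p_values, alpha):
    """Indices of tests that remain significant after Bonferroni correction."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < cutoff]

# With 100 uncorrected tests at 0.05, a false positive is almost certain.
print(f"uncorrected FWER, 100 tests: {familywise_error_rate(0.05, 100):.3f}")

# Hypothetical image with 100,000 pixels tested one by one.
print(f"corrected per-pixel threshold: {bonferroni_threshold(0.05, 100_000):.1e}")

# Hypothetical p-values for five regions; only very small ones survive.
p_values = [0.04, 0.001, 0.2, 0.009, 0.03]
print("regions surviving correction:", surviving_tests(p_values, 0.05))
```

The example shows why a corrected analysis is conservative: a region with p = 0.04 would pass an uncorrected test but does not survive once the number of comparisons is taken into account.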
In spite of these limitations, neuroimaging studies have been successfully utilized to evaluate specific spiritual and meditative practices. There are currently six known studies which have spanned the different neuroimaging techniques (Herzog, 1990-1991; Lou, Kjaer, Friberg, Wildschiodtz, Holm, and Nowak, 1999; Lazar, Bush, Gollub, Fricchione, Khalsa, and Benson, 2000; Newberg, Alavi, Baime, Pourdehnad, Santanna, and d’Aquili, 2001; Kjaer, Bertelsen, Piccini, Brooks, Alving, and Lou, 2002; Newberg, Pourdehnad, Alavi, and d’Aquili, 2003). Interestingly, there appears to be some coherence among their findings, with the frontal lobes, parietal lobes, thalamus, and limbic system appearing to form a network associated with such practices. It may be that the variety of different types of practices activate a network of brain structures in relatively similar ways. It is also interesting that there do seem to be some differences that correlate well with the variations among the approaches. One study also measured changes in the dopamine system and found increased activity during meditation-related practices (Kjaer, Bertelsen, Piccini, Brooks, Alving, and Lou, 2002). Thus, our understanding continues to improve as more studies are performed. Future studies will certainly be necessary to more thoroughly evaluate the neurophysiological changes that occur in the brain during various religious and spiritual phenomena.
2. Subject Selection and Comparison Groups
Another interesting methodological issue in the study of religious and spiritual phenomena is to determine who are the most appropriate subjects to study and who should represent the comparison group(s). The issue of whom to study with regard to religious and spiritual phenomena depends somewhat on the definition of the phenomena. Obviously, if a researcher wanted to evaluate physiological changes during meditation, there would be thousands of different possible groups to consider studying. It is important to determine which elements of a particular practice or experience are of most interest. The more specific a researcher wants to be in terms of the phenomena, the more focused the subject group will be. For example, if a researcher wanted to study the physiological effects of the Rosary, the group would have to consist of those individuals who practice the Rosary. If the focus is on feelings of unity, there may be many different practices that could be chosen, or perhaps the study group will consist of individuals from many different backgrounds. An important issue in this regard is the level of expertise or proficiency of the individuals being studied. In the case of meditative practices, there could be varied results between novice, experienced, and master level individuals. These differences could be related to whether more novice individuals can perform the practice in a manner that is similar to their usual level of practice while under the scrutiny of the researcher. For example, the noise of an MRI scanner may result in a novice not being able to perform their practice of meditation adequately while a more experienced individual may have less of a problem with the distraction. Thus, the difference observed might be related to the fact that only one of them successfully performed the practice rather than to true differences between the practices. It is also important to select individuals that have similar socioeconomic and health backgrounds.
If Franciscan nuns are less likely to smoke, then their brain scans might differ from a group of other individuals who do smoke.
The other major issue in terms of subject selection relates to the comparison or control groups. One possibility, which is frequently employed in functional neuroimaging studies, is that the individual acts as their own comparison. Studies of various meditative practices typically compare the meditation state to the subject’s own baseline waking state. Others have suggested that a more appropriate comparison would be a state in which they are doing a task that is similar, but has no specific spiritual meaning. For example, one study explored whether different mantras (some spiritual, some not) have different effects on the brain electrical activity during meditation (Telles, Nagarathna, and Nagendra, 1998). Another issue with regard to using subjects as their own comparison involves excluding other factors that are part of the practice. Thus, a practice that involves burning incense would be better compared to a baseline state in which incense is also used. Otherwise, the primary change observed would be in the olfactory regions of the brain and may have nothing to do with the spiritual practice. Similarly, if a practice requires the eyes to be open (e.g., reading prayers), then the baseline state should have the subject with their eyes open, or possibly even reading non-religious texts. Some studies have looked at such differences and have found distinctions in cerebral activity depending on whether a subject was reading a religious or non-religious text (Azari, 2001). Other types of practices might also be used as comparisons including artistic and creative practices, athletic events, or cognitive and visuo-spatial tasks. Comparison groups could be other practitioners in the same tradition, but with different levels of expertise, or practitioners in other traditions in which similar practices are performed.
These groups might help to determine longitudinal effects of various spiritual practices, but factors such as age and health might interfere with the interpretation of such studies.
Placebo groups are another important problem with the study of religious and spiritual phenomena. It is not clear what a placebo would represent in many cases since most people who are spiritual know whether or not they are actually performing their spiritual practice. Placebo groups in this case more likely will represent other tasks that resemble the spiritual one, but are lacking the specific spiritual content. Thus if reading a prayer is going to be studied, then reading a non-religious text would represent a reasonable comparison group.
3. Study Design and Biostatistical Analysis
Based on the above review of the existing literature and the proposed operational definition of spiritual experience, there are at least seven neuroscientific paradigms which can readily contribute to the initial operationalization of spiritual experience (Larson, Swyers, and McCullough, 1998). These paradigms include: 1) the neurophysiology of spiritual interventions, 2) spiritual interventions associated with psychopharmacological agents, 3) drug-induced spiritual experiences, 4) neuropathologic and psychopathologic spiritual experiences, 5) spiritual experiential development in infants, children and adolescents, and 6) physical and psychological therapeutic interventions. After these study designs are considered, the biostatistical issues with such studies can be reviewed.
The Neurophysiology of Spiritual Interventions
The first paradigm involves an experimental spiritual intervention such as prayer or meditation with a concomitant psychological and spiritual evaluation. This will help to define and delineate the nature of the spiritual intervention itself. These psychological and spiritual measures can then be compared to simultaneously derived neurobiological parameters, such as electroencephalographic activity, cerebral blood flow, cerebral metabolism, and neurotransmitter activity. Such