`vignettes/form_prereg2D_v1.Rmd`

This vignette shows the Preregistration Template for Secondary Data Analysis form. It can be initialized as follows:

```
initialized_prereg2D_v1 <-
  preregr::prereg_initialize(
    "prereg2D_v1"
  );
```

After this, content can be specified with preregr::prereg_specify() or preregr::prereg_justify(). To check the next field(s) for which content still has to be specified, use preregr::prereg_next_item().
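As a rough sketch of that workflow, the snippet below initializes the form, fills in two items, and asks for the next open item. It assumes that `preregr::prereg_specify()` accepts item identifiers (here `title` and `authors`, as listed further down in this vignette) as named arguments. The variable names `form_id` and `prereg` are only illustrative, and the block is skipped when preregr is not installed.

```r
# Identifier of the form shown in this vignette
form_id <- "prereg2D_v1"

if (requireNamespace("preregr", quietly = TRUE)) {
  # Start an empty preregistration based on the form
  prereg <- preregr::prereg_initialize(form_id)

  # Assumption: items are addressed by their identifiers as named arguments
  prereg <- preregr::prereg_specify(
    prereg,
    title = "Do religious people follow the golden rule?",
    authors = "Josiah Carberry (JC); Pomona Sprout (PS)"
  )

  # Show which item still needs content
  preregr::prereg_next_item(prereg)
}
```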

The form’s metadata is:

| field | content |
|---|---|
| title | Preregistration of secondary data analysis: A template and tutorial |
| author | Olmo R. van den Akker, Sara J. Weston, Lorne Campbell, William J. Chopik, Rodica Ioana Damian, Pamela E. Davis-Kean, Andrew N. Hall, Jessica E. Kosie, Elliott Kruse, Jerome Olsen, Stuart J. Ritchie, K. D. Valentine, Anna E. van ’t Veer, Marjan Bakker |
| version | 1.0 |
| comments | Please cite the associated paper when using this preregistration template (see https://doi.org/10.15626/MP.2020.2625) |

The form is defined as follows (use preregr::form_show() to show the form in the console, instead):

```
preregr::form_knit(
  "prereg2D_v1"
);
```

Here we present a preregistration template for the analysis of secondary data and provide guidance for its effective use. We are aware that the number of questions (25) in the template may be overwhelming but it is important to note that not every question is relevant for every preregistration. Our aim was to be inclusive and cover all bases in light of the diversity of secondary data analyses. Even though none of the questions are mandatory, we do believe that an elaborate preregistration is preferable over a concise preregistration simply because it restricts more researcher degrees of freedom. We therefore recommend that authors answer as many questions in as much detail as possible. And, if questions are not applicable, it would be good practice to also specify why this is the case so that readers can assess your reasoning.

Effectively preregistering a study is challenging and can take a lot of time but, like Nosek et al. (2019) and many others, we believe it can improve the interpretability, verifiability and rigor of your studies and is therefore more than worth it if you want both yourself and others to have more confidence in your research findings.

The current template is merely one building block toward a more effective preregistration infrastructure and, given the ongoing developments in this area, will be a work in progress for the foreseeable future. Any feedback is therefore greatly appreciated. Please send any feedback to the corresponding author, Olmo van den Akker (ovdakker@gmail.com).

Title

title

Provide the working title of your study.

*Example*: Do religious people follow the golden rule?
Assessing the link between religiosity and prosocial behavior using data
from the Wisconsin Longitudinal Study.

Authors

authors

Name the authors of this preregistration.

*Example*: Josiah Carberry (JC) – ORCID iD: https://orcid.org/0000-0002-1825-0097;
Pomona Sprout (PS) – Personal webpage: https://en.wikipedia.org/wiki/Hogwarts_staff#Pomona_Sprout

Research questions

research_questions

List each research question included in this study.

*Example*: RQ1 = Are more religious people more prosocial than
less religious people? RQ2 = Does the relationship between religiosity
and prosociality differ for people with different religious
affiliations?

Hypotheses

hypotheses

Please provide the hypotheses of your secondary data analysis. Make sure they are specific and testable, and make it clear what your statistical framework is (e.g., Bayesian inference, NHST). In case your hypothesis is directional, do not forget to state the direction. Please also provide a rationale for each hypothesis.

*Example*: “Do to others as you would have them do to you”
(Luke 6:31). This golden rule is taught by all major religions, in one
way or another, to promote prosociality (Parliament of the World’s
Religions, 1993). Religious prosociality is the idea that religions
facilitate behavior that is beneficial for others at a personal cost
(Norenzayan & Shariff, 2008). The encouragement of prosocial
behavior by religious teachings appears to be fruitful: a considerable
amount of research shows that religion is positively related to
prosocial behavior (e.g., Friedrichs, 1960; Koenig, McGue, Krueger,
& Bouchard, 2007; Morgan, 1983). For instance, religious people have
been found to give more money to, and volunteer more frequently for,
charitable causes than their non-religious counterparts (e.g., Grønbjerg
& Never, 2004; Lazerwitz, 1962; Pharoah & Tanner, 1997). Also,
the more important people viewed their religion, the more likely they
were to do volunteer work (Youniss, McLellan, & Yates, 1999). Based
on the above we expect that religiosity is associated with prosocial
behavior in our sample as well. To assess this prediction, we will test
the following hypotheses using a null hypothesis significance testing
framework:

H0(1) = In men and women who graduated from Wisconsin high schools in
1957, there is no association between religiosity and prosociality.

H1(1) = In men and women who graduated from Wisconsin high schools in 1957,
there is a positive association between religiosity and prosociality.

Dataset

dataset

Name and describe the dataset(s), and if applicable, the subset(s) of the data you plan to use. Useful information to include here is the type of data (e.g., cross-sectional or longitudinal), the general content of the questions, and some details about the respondents. In the case of longitudinal data, information about the survey’s waves is useful as well.

*Example*: To answer our research questions we will use a
dataset from the Wisconsin Longitudinal Study (WLS; Herd, Carr, &
Roan, 2014). The WLS provides long-term data on a random sample of all
the men and women who graduated from Wisconsin high schools in 1957. The
WLS involves twelve waves of data. Six waves were collected from the
original participants or their parents (1957, 1964, 1975, 1992, 2004,
and 2011), four were collected from a selected sibling (1977, 1994,
2005, and 2011), one from the spouse of the original participant (2004),
and one from the spouse of the selected sibling (2006). The questions
vary across waves and are related to domains as diverse as
socio-economic background, physical and mental health, and psychological
makeup. We will use the subset consisting of the 1957 graduates who
completed the follow-up 2003-2005 wave of the WLS dataset because it
includes specific modules on religiosity and volunteering.

Openness of data

dataset_open

Specify the extent to which the dataset is open or publicly available. Make note of any barriers to accessing the data, even if it is publicly available.

*Example*: The dataset we will use is publicly available, but
you need to formally agree to acknowledge the funding source for the
Wisconsin Longitudinal Study, to cite the data release in any
manuscripts, working papers, or published articles using these data, and
to inform WLS about any published papers for use in the WLS bibliography
and for reporting purposes. To do this you need to submit some
information about yourself on the website (https://www.ssc.wisc.edu/wlsresearch/data/downloads/).
You will then receive an email with a download link.

Access to data

data_access

How can the data be accessed? Provide a persistent identifier or link if the data are available online, or give a description of how you obtained the dataset.

*Example*: The data can be accessed by going to the following
link and searching for the variables that are specified in Q12 of this
preregistration: https://www.ssc.wisc.edu/wlsresearch/documentation/browse/?label=&variable=&wave_108=on&searchButton=Search

Date(s) data were accessed

data_date

Specify the date of download and/or access for each author.

*Example*: PS: Downloaded 12 February 2019; Accessed 12
February 2019. JC: Downloaded 3 January 2019 (estimated); Accessed 12
February 2019. We will use the data accessed by JC on 12 February 2019
for our statistical analyses.

Data collection

data_collection

If the data collection procedure is well documented, provide a link to that information. If the data collection procedure is not well documented, describe, to the best of your ability, how data were collected.

*Example*: The WLS data was and is being collected by the
University of Wisconsin Survey Center for use by the research community.
The origins of the WLS can be traced back to a state-sponsored
questionnaire administered during the spring of 1957 at all Wisconsin
high schools to students in their final year. Therefore, the dataset
constitutes a specific sample not necessarily representative of the
United States as a whole. Most panel members were born in 1939, and the
sample is broadly representative of white, non-Hispanic American men and
women who completed at least a high school education. A flowchart for
the data collection can be found here: https://www.ssc.wisc.edu/wlsresearch/about/flowchart/cor459d7.pdf

Data codebook

data_codebook

Some studies offer codebooks to describe their data. If such a codebook is publicly available, link to it here or upload the document. If not, provide other available documentation. Also provide guidance on what parts of the codebook or other documentation are most relevant.

*Example*: The codebook for the dataset we use can be found
here: https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k.
We will mainly use questions from the mail survey about religion and
spirituality, and the phone survey on volunteering, but will also use
some questions from other modules (see the answer to Q12).

Manipulated variable(s)

var_manipulated

If you are going to use any manipulated variables, identify them here. Describe the variables and the levels or treatment arms of each variable (note that this is not applicable for observational studies and meta-analyses). If you are collapsing groups across variables this should be explicitly stated, including the relevant formula. If your further analysis is contingent on a manipulation check, describe your decision rules here.

*Example*: Not applicable.

Measured variable(s)

var_measured

If you are going to use measured variables, identify them here. Describe both outcome measures as well as predictors and covariates and label them accordingly. If you are using a scale or an index, state the construct the scale/index represents, which items the scale/index will consist of, how these items will be aggregated, and whether this aggregation is based on a recommendation from the study codebook or validation research. When the aggregation of the items is based on exploratory factor analysis (EFA) or confirmatory factor analysis (CFA), also specify the relevant details (EFA: rotation, how the number of factors will be determined, how best fit will be selected; CFA: how loadings will be specified, how fit will be assessed, which residual variance terms will be correlated). If you are using any categorical variables, state how you will code them in the statistical analyses.

*Example*: Religiosity (IV): Religiosity is measured using a
newly created scale with a subset of items from the Religion and
Spirituality module of the 2004 mail survey (described here: https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gmail_religion).
The scale includes general questions about how religious/spiritual the
individual is and how important religion/spirituality is to them.
Importantly, the questions are not specific to a particular denomination
and are on the same response scale. The specific variables are as
follows:

1. il001rer: How religious are you?
2. il002rer: How spiritual are you?
3. il003rer: How important is religion in your life?
4. il004rer: How important is spirituality in your life?
5. il005rer: How important was it, or would it have been if you had children, to send your children for religious or spiritual instruction?
6. il006rer: How closely do you identify with being a member of a religious group?
7. il007rer: How important is it for you to be with other people who are the same religion as you?
8. il008rer: How important do you think it is for people of your religion to marry other people who are the same religion?
9. il009rer: How strongly do you believe that one should stick to a particular faith?
10. il010rer: How important was religion in your home when you were growing up?
11. il011rer: When you have important decisions to make in your life, how much do you rely on your religious or spiritual beliefs?
12. il012rer: How much would your spiritual or religious beliefs influence your medical decisions if you were to become gravely ill?

The levels of all of these variables are indicated by a
Likert scale with the following options: (1) Not at all; (2) Not very;
(3) Somewhat; (4) Very; (5) Extremely, as well as ‘System Missing’ (the
participant did not provide an answer) and ‘Refused’ (the participant
refused to answer the question). Variables il006rer, il008rer, and
il012rer additionally include the option ‘Don’t know’ (the participant
stated that they did not know how to answer the question). We will use
the average score (after omitting non-numeric and ‘Don’t know’
responses) on the twelve variables as a measure of religiosity. This
average score is constructed by ourselves and was not already part of
the dataset. Prosociality (DV): In line with previous research (Konrath,
Fuhrel-Forbis, Lou, & Brown, 2012), we will use three measures of
prosociality that measure three aspects of engagement in other-oriented
activities (see Brookfield, Parry, & Bolton, 2018 for the link
between prosociality and volunteering). The prosociality variables come
from the Volunteering module of the 2004 phone survey. The codebook of
that module can be found here: https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gvol.
The three measures of prosociality we will use are:

1. gv103re: Did the graduate do volunteer work in the last 12 months? This dichotomous variable assesses whether or not the participant has engaged in any volunteering activities in the last 12 months. The levels of this variable are yes/no. Yes will be coded as ‘1’, no will be coded as ‘0’.
2. gv109re: Number of graduate’s other volunteer activities in the past 12 months. This variable is a summary index providing a quantitative measure of the participant’s volunteering activities. Scores on this variable range from 1 to 5 and reflect the number of the previous five questions to which the participant answered YES. The previous five questions assess whether or not the participant volunteered at any of the following organization types: (1) religious organizations; (2) school or educational organization; (3) political group or labor union; (4) senior citizen group or related organization; (5) other national or local organizations. For each of these questions the answer ‘yes’ is coded as 1 and the answer ‘no’ is coded as 0.
3. gv111re: How many hours did the graduate volunteer during a typical month in the last 12 months? This is a numerical variable that provides information on how many hours per month, on average, the participant volunteered.

The three variables will be treated as separate measures in the dataset and do not require
manual aggregation.

Number of Siblings (Covariate): We will include the participant’s number of siblings as a control variable because many religious families are large (Pew Research Center, 2015) and it can be argued that cooperation and trust arise more naturally in larger families because of the larger number of social interactions in those families. To measure participants’ number of siblings we used the variable gk067ss: The total number of siblings ever born from the 2004 phone survey Siblings module (see https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gsib). This is a numerical variable with the possibility for the participant to state “I don’t know”. At the interview participants were instructed to include “siblings born alive but no longer living, as well as those alive now and to include step-brothers and step-sisters and children adopted by their parents.”

Agreeableness (Covariate): We will include the summary score for agreeableness (ih009rec, see https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gmail_values) in the analysis as a control variable because a previous study (on the same dataset, see the answer to Q18) we were involved in showed a positive association between agreeableness and prosociality. Because previous research also indicates a positive association between agreeableness and religiosity (Saroglou, 2002) we need to include agreeableness as a control variable to disentangle the influence of religiosity on prosociality and the influence of agreeableness on prosociality. The variable ih009rec is a sum score of the variables ih003rer-ih008rer (To what extent do you agree that you see yourself as someone who is talkative / is reserved [reverse coded] / is full of energy / tends to be quiet [reverse coded] / who is sometimes shy or inhibited [reverse coded] / who generates a lot of enthusiasm). All of these were scored from 1 to 6 (1 = “agree strongly”, 2 = “agree moderately”, 3 = “agree slightly”, 4 = “disagree slightly”, 5 = “disagree moderately”, 6 = “disagree strongly”), while participants could also refuse to answer the question. If a participant refused to answer one of the questions, that participant’s score was not included in the sum score variable ih009rec.
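The religiosity score described in this example could be built along the following lines. This is a minimal sketch on made-up responses for three of the twelve items. The text labels used for the non-numeric codes (`"REFUSED"`, `"DON'T KNOW"`) and the helper name `to_numeric` are assumptions, and the real WLS files may encode these responses differently.

```r
# Toy responses for three respondents (hypothetical values and missing codes)
items <- data.frame(
  il001rer = c("4", "REFUSED", "2"),
  il002rer = c("5", "3", "DON'T KNOW"),
  il003rer = c("3", "3", NA),
  stringsAsFactors = FALSE
)

# Only the five Likert options count as valid numeric answers
valid_codes <- as.character(1:5)

# Convert valid answers to numbers; everything else becomes NA
to_numeric <- function(x) {
  out <- rep(NA_real_, length(x))
  is_valid <- x %in% valid_codes
  out[is_valid] <- as.numeric(x[is_valid])
  out
}

numeric_items <- as.data.frame(lapply(items, to_numeric))

# Average over the items that have a numeric answer
religiosity <- rowMeans(numeric_items, na.rm = TRUE)
print(religiosity)
```

Averaging with `na.rm = TRUE` keeps respondents who skipped some items. Whether a minimum number of answered items should be required is a separate decision that this sketch does not make.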

Inclusion and exclusion criteria

inclusion

Which units of analysis (respondents, cases, etc.) will be included or excluded in your study? Taking these inclusion/exclusion criteria into account, indicate the (expected) sample size of the data you’ll be using for your statistical analyses to the best of your knowledge. In the next few questions, you will be asked to refine this sample size estimation based on your judgments about missing data and outliers.

*Example*: Initially, the WLS consisted of 10,317
participants. As we are not interested in a specific group of Wisconsin
people, we will not exclude any participants from our analyses. However,
only 7,265 participants filled out the questions on prosociality and the
number of siblings in the phone survey and only 6,845 filled out the
religiosity items in the mail survey (Herd et al., 2014). This
corresponds to a response rate of 73% and 69% respectively. Because we
do not know whether the participants that did the mail survey also did
the phone survey, our minimum expected sample size is 10,317 * 0.73 *
0.69 = 5,197.

Missing data

missing

What do you know about missing data in the dataset (i.e., overall missingness rate, information about differential dropout)? How will you deal with incomplete or missing data? Based on this information, provide a new expected sample size.

*Example*: The WLS provides a documented set of missing codes.
In Table 1 (see https://doi.org/10.15626/MP.2020.2625) you can find
missingness information for every variable we will include in the
statistical analyses. ‘System missing’ refers to the number of
participants that did not or could not complete the questionnaire.
‘Partial interview’ refers to the number of participants that did not
get that particular question because they were only partially
interviewed. The rest of the codes are self-explanatory. Importantly,
some respondents refused to answer the religiosity questions. These
respondents apparently felt strongly about these questions, which could
indicate that they are either very religious or very anti-religious. If
that is the case, the respondent’s propensity to respond is directly
associated with their level of religiosity and the data is missing
not at random (MNAR). Because it is not possible to test the stringent
assumptions of the modern techniques for handling MNAR data we will
resort to simple listwise deletion. It must be noted that this may bias
our data as we may lose respondents who are very religious or
anti-religious. However, we believe this bias to be relatively harmless
given that our sample still includes many respondents that provided
extreme responses to the items about the importance of the different
facets of religion (see https://www.ssc.wisc.edu/wlsresearch/documentation/waves/?wave=grad2k&module=gmail_religion).
Moreover, because our initial sample size is very large, statistical
power is not substantially compromised by omitting these respondents.
That being said, we will extensively discuss any potential biases
resulting from missing data in the limitations section of our paper.
Employing listwise deletion leads to an expected minimum number of
10,317 * 0.30 * 0.70 * 0.64 = 1,387 participants for the binary logistic
regression, and an expected minimum number of 10,317 * 0.24 * 0.70 *
0.64 = 1,109 (gv109re) and 10,317 * 0.23 * 0.70 * 0.64 = 1,063 (gv111re)
for the linear regressions.
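The expected sample sizes in this example can be recomputed with a few lines of R. The sketch below only reproduces the arithmetic as written. The names `outcome_proportion` and `shared_factors` are illustrative labels, and the meaning of each factor follows from Table 1 of the associated paper rather than from this code.

```r
# Initial WLS sample size as stated in the example
initial_n <- 10317

# First factor of each calculation, one per outcome variable
outcome_proportion <- c(gv103re = 0.30, gv109re = 0.24, gv111re = 0.23)

# The two factors that are shared by all three calculations
shared_factors <- c(0.70, 0.64)

# Multiply through and round to whole participants
expected_n <- round(initial_n * outcome_proportion * prod(shared_factors))
print(expected_n)
```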

Outliers

outliers

If you plan to remove outliers, how will you define what a statistical outlier is in your data? Please also provide a new expected sample size. Note that this will be the definitive expected sample size for your study and you will use this number to do any power analyses.

*Example*: The dataset probably does not involve any invalid
data since the dataset has been previously ‘cleaned’ by the WLS data
controllers and any clearly unreasonably low or high values have been
removed from the dataset. However, to be sure we will create a box and
whisker plot for all continuous variables (the dependent variables
gv109re and gv111re, the covariate gk067ss, and the scale for
religiosity) and remove any data point that appears to be more than 1.5
times the IQR away from the 25th and 75th percentile. Based on normally
distributed data, we expect that 2.1% of the data points will be removed
this way, leaving 1,358 out of 1,387 participants for the binary
regression with gv103re as the outcome variable and 1,086 out of 1,109
participants, and 1,041 out of 1,063 participants for the linear
regressions with gv109re and gv111re as the outcome variables,
respectively.
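A minimal sketch of the box-plot rule from this example is shown below, applied to a made-up vector of volunteering hours, followed by the sample size arithmetic. The function name `is_outlier` and the toy values are assumptions for illustration only.

```r
# Flag values more than 1.5 * IQR below the 25th or above the 75th percentile
is_outlier <- function(x) {
  q <- unname(quantile(x, probs = c(0.25, 0.75), na.rm = TRUE))
  iqr <- q[2] - q[1]
  x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
}

# Toy data with one extreme value
hours <- c(1, 2, 3, 4, 5, 100)
flagged <- is_outlier(hours)
hours_clean <- hours[!flagged]
print(hours_clean)

# Expected sample sizes after removing the anticipated 2.1% of data points
n_after_missing <- c(gv103re = 1387, gv109re = 1109, gv111re = 1063)
n_after_outliers <- round(n_after_missing * (1 - 0.021))
print(n_after_outliers)
```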

Sampling weights

sampling_weights

Are there sampling weights available with this dataset? If so, are you using them or are you using your own sampling weights?

*Example*: The WLS dataset does not include sampling weights
and we will not use our own sampling weights as we do not seek to make
any claims that are generalizable to the national population.

Previous work

previous_work

List the publications, working papers (in preparation, unpublished, preprints), and conference presentations (talks, posters) you have worked on that are based on the dataset you will use. For each work, list the variables you analyzed, but limit yourself to variables that are relevant to the proposed analysis. If the dataset is longitudinal, also state which wave of the dataset you analyzed. Importantly, some of your team members may have used this dataset, and others may not have. It is therefore important to specify the previous works for every co-author separately. Also mention relevant work on this dataset by researchers you are affiliated with, as their knowledge of the data may have spilled over to you. When the provider of the data also has an overview of all the work that has been done using the dataset, link to that overview.

*Example*: Both authors (PS and JC) have previously used the
Graduates 2003-2005 wave to assess the link between Big Five personality
traits and prosociality. The variables we used to measure the Big Five
personality traits were ih001rei (extraversion), ih009rei
(agreeableness), ih017rei (conscientiousness), ih025rei (neuroticism),
and ih032rei (openness). The variables we used to measure prosociality
were ih013rer (“To what extent do you agree that you see yourself as
someone who is generally trusting?”), ih015rer (“To what extent do you
agree that you see yourself as someone who is considerate to almost
everyone?”), and ih016rer (“To what extent do you agree that you see
yourself as someone who likes to cooperate with others?”). We presented
the results at the ARP conference in St. Louis in 2013 and we are
currently finalizing a manuscript based on these results. Additionally,
a senior graduate student in JC’s lab used the Graduates 2011 wave for
exploratory analyses on depression. She linked depression to alcohol use
and general health indicators. She did not look at variables related to
religiosity or prosociality. Her results have not yet been submitted
anywhere. An overview of all publications based on the WLS data can be
found here: https://www.ssc.wisc.edu/wlsresearch/publications/pubs.php?topic=ALL.

Prior knowledge

prior_knowledge

What prior knowledge do you have about the dataset that may be relevant for the proposed analysis? Your prior knowledge could stem from working with the data first-hand, from reading previously published research, or from codebooks. Also provide any relevant knowledge of subsets of the data you will not be using. Provide prior knowledge for every author separately.

*Example*: In a previous study (mentioned in Q17) we used
three prosociality variables (ih013rer, ih015rer, and ih016rer) that may
be related to the prosociality variables we use in this study. We found
that ih013rer, ih015rer, and ih016rer are positively associated with
agreeableness (ih009rec). Because previous research (on other datasets)
shows a positive association between agreeableness and religiosity
(Saroglou, 2002) agreeableness may act as a confounding variable. To
account for this we will include agreeableness in our analysis as a
control variable. We did not find any associations between prosociality
and the other Big Five variables.

Statistical model

model

For each hypothesis, describe the statistical model you will use to test the hypothesis. Include the type of model (e.g., ANOVA, multiple regression, SEM) and the specification of the model. Specify any interactions and post-hoc analyses and remember that any test not included here must be labeled as an exploratory test in the final paper.

*Example*: Our first hypothesis will be tested using three
analyses since we use three variables to measure prosociality. For each,
we will run a directional null hypothesis significance test to see
whether a positive effect exists of religiosity on prosociality. For the
first outcome (gv103re: Did the graduate do volunteer work in the last
12 months?) we will run a logistic regression with religiosity, the
number of siblings, and agreeableness as predictors. For the second and
third outcomes (gv109re: Number of graduate’s other volunteer activities
in the past 12 months; gv111re: How many hours did the graduate
volunteer during a typical month in the last 12 months?) we will run two
separate linear regressions with religiosity, the number of siblings,
and agreeableness as predictors. The code we will use for all these
analyses can be found at https://osf.io/e3htr.
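To make the model specification concrete, here is a hedged sketch of the three regressions on simulated data. It is not the authors' analysis code (that is at the OSF link given in the example). The data frame `wls_toy`, all simulated values, and the helper `one_sided_p` are assumptions introduced for illustration.

```r
set.seed(2019)
n <- 500

# Simulated stand-ins for the predictors (hypothetical distributions)
wls_toy <- data.frame(
  religiosity = runif(n, min = 1, max = 5),
  gk067ss = rpois(n, lambda = 3),
  ih009rec = sample(6:36, n, replace = TRUE)
)

# Simulated stand-ins for the three outcomes
wls_toy$gv103re <- rbinom(n, size = 1, prob = plogis(-1 + 0.3 * wls_toy$religiosity))
wls_toy$gv109re <- pmin(rpois(n, lambda = 0.4 * wls_toy$religiosity), 5)
wls_toy$gv111re <- rexp(n, rate = 1 / (2 + wls_toy$religiosity))

# Logistic regression for the dichotomous outcome
fit_any <- glm(gv103re ~ religiosity + gk067ss + ih009rec,
               family = binomial(), data = wls_toy)

# Linear regressions for the two numerical outcomes
fit_count <- lm(gv109re ~ religiosity + gk067ss + ih009rec, data = wls_toy)
fit_hours <- lm(gv111re ~ religiosity + gk067ss + ih009rec, data = wls_toy)

# One-sided p-value for a positive effect of one predictor
one_sided_p <- function(fit, term = "religiosity") {
  statistic <- coef(summary(fit))[term, 3]  # z value (glm) or t value (lm)
  if (inherits(fit, "glm")) {
    pnorm(statistic, lower.tail = FALSE)
  } else {
    pt(statistic, df = df.residual(fit), lower.tail = FALSE)
  }
}

p_values <- c(
  gv103re = one_sided_p(fit_any),
  gv109re = one_sided_p(fit_count),
  gv111re = one_sided_p(fit_hours)
)
print(p_values)
```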

Effect size

effect_size

If applicable, specify a predicted effect size or a minimum effect size of interest for all the effects tested in your statistical analyses.

*Example*: For the logistic regression with ‘Did the graduate
do volunteer work in the last 12 months?’ as the outcome variable, our
minimum effect size of interest is an odds ratio of 1.05. This means that a
one-unit increase on the religiosity scale would be associated with a
1.05 factor change in odds of having done volunteering work in the last
12 months versus not having done so. For the linear regressions with
‘The number of graduate’s volunteer activities in the last 12 months’
and ‘How many hours did the graduate volunteer during a typical month in
the last 12 months?’ as the outcome variables, the minimum regression
coefficients of interest of the religiosity variables are 0.05 and 0.5,
respectively. This means that a one-unit increase in the religiosity
scale would be associated with 0.05 extra volunteering activities in the
last 12 months and with 0.5 more hours of volunteering work in the last
12 months. All of these smallest effect sizes of interest are based on
our own intuition. To make comparisons possible between the effects in
our study and similar effects in other studies the unstandardized linear
regression coefficients will be transformed into standardized regression
coefficients using the following formula: $\beta_i = B_i (s_i / s_y)$, where $B_i$
is the unstandardized regression coefficient of independent variable $i$,
and $s_i$ and $s_y$ are the standard deviations of the independent and
dependent variable, respectively.

*Comments*: A predicted effect size is
ideally based on a representative preliminary study or meta-analytical
result. If those are not available, it is also possible to use your own
intuition. For advice on setting a minimum effect size of interest, see
Lakens, Scheel, & Isager (2018) and Funder and Ozer (2019).
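The standardization formula from the example can be checked numerically. The sketch below uses simulated data (all variable names and values are assumptions) and compares the formula-based coefficients with those from a regression on z-scored variables, which should agree.

```r
set.seed(42)
n <- 200

# Simulated predictors and outcome (hypothetical values)
toy <- data.frame(
  religiosity = rnorm(n, mean = 3, sd = 1),
  siblings = rpois(n, lambda = 3)
)
toy$hours <- 2 + 0.5 * toy$religiosity + 0.1 * toy$siblings + rnorm(n)

# Unstandardized coefficients B_i
fit <- lm(hours ~ religiosity + siblings, data = toy)
B <- coef(fit)[c("religiosity", "siblings")]

# Standard deviations s_i of the predictors and s_y of the outcome
s_x <- sapply(toy[c("religiosity", "siblings")], sd)
s_y <- sd(toy$hours)

# Standardized coefficients: beta_i = B_i * (s_i / s_y)
beta <- B * s_x / s_y
print(beta)

# Cross-check: refit the model after z-scoring every variable
toy_z <- as.data.frame(scale(toy))
fit_z <- lm(hours ~ religiosity + siblings, data = toy_z)
print(coef(fit_z)[c("religiosity", "siblings")])
```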

Power

power

Present the statistical power available to detect the predicted effect size(s) or the smallest effect size(s) of interest, OR present the accuracy that will be obtained for estimation. Use the sample size after updating for missing data and outliers, and justify the assumptions and parameters used (e.g., give an explanation of why anything smaller than the smallest effect size of interest would be theoretically or practically unimportant).

*Example*: The sample size after updating for missing data and
outliers is 1,358 for the logistic regression with gv103re as the
outcome variable, and 1,086 and 1,041 for the linear regressions with
gv109re and gv111re as the outcome variables, respectively. For all
three analyses this corresponds to a statistical power of approximately
1.00 when assuming our minimum effect sizes of interest. For the linear
regressions we additionally assumed the variance explained by the
predictor to be 0.2 and the residual variance to be 1.0 (see figure
below for the full power analysis of the regression with the lowest
sample size). For the logistic regression we assumed an intercept of
-1.56 corresponding to a situation where half of the participants have
done volunteer work in the last year (see the R-code for the full power
analysis at https://osf.io/f96rn).

Inference criteria

inference_criteria

What criteria will you use to make inferences? Describe the information you will use (e.g. specify the p-values, effect sizes, confidence intervals, Bayes factors, specific model fit indices), as well as cut-off criteria, where appropriate. Will you be using one- or two-tailed tests for each of your analyses? If you are comparing multiple conditions or testing multiple hypotheses, will you account for this, and if so, how?

*Example*: We will make inferences about the association
between religiosity and prosociality based on the p-values and the size
of the regression coefficients of the religiosity variable in the three
main regressions. We will conclude that a regression analysis supports
our hypothesis if both the p-value is smaller than .01 and the
regression coefficient is larger than our minimum effect size of
interest. We chose an alpha of .01 to account for the fact that we do a
test for each of the three regressions (0.05/3, rounded down). If the
conditions above hold for all three regressions, we will conclude that
our hypothesis is fully supported, if they hold for one or two of the
regressions we will conclude that our hypothesis is partially supported,
and if they hold for none of the regressions we will conclude that our
hypothesis is not supported.
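The decision rule in this example can be written down as a small function. This is a sketch under assumptions: the name `classify_support` is hypothetical, and estimates are expected on the same scale as the corresponding minimum effect of interest (for instance an odds ratio for the logistic regression). The input values in the usage lines are made up.

```r
# Classify overall support from per-regression p-values and estimates
classify_support <- function(p_values, estimates, minimum_effects, alpha = 0.01) {
  stopifnot(
    length(p_values) == length(estimates),
    length(estimates) == length(minimum_effects)
  )
  # A regression supports the hypothesis only if both conditions hold
  supported <- p_values < alpha & estimates > minimum_effects
  n_supported <- sum(supported)
  if (n_supported == length(supported)) {
    "fully supported"
  } else if (n_supported > 0) {
    "partially supported"
  } else {
    "not supported"
  }
}

# Minimum effects of interest from the example: odds ratio 1.05, B = 0.05, B = 0.5
minimum_effects <- c(1.05, 0.05, 0.5)

# Made-up results for the three regressions
result_all <- classify_support(c(0.001, 0.004, 0.002), c(1.20, 0.08, 0.9), minimum_effects)
result_some <- classify_support(c(0.001, 0.200, 0.002), c(1.20, 0.08, 0.9), minimum_effects)
result_none <- classify_support(c(0.300, 0.200, 0.002), c(1.20, 0.08, 0.4), minimum_effects)
print(c(result_all, result_some, result_none))
```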

Assumptions

assumptions

What will you do should your data violate assumptions, your model not converge, or some other analytic problem arise?

*Example*: When (a) the distribution of the number of volunteering
hours (gv111re) is significantly non-normal according to the
Kolmogorov-Smirnov test (Massey, 1951), and/or (b) the linearity
assumption is violated (i.e., the points are asymmetrically distributed
around the diagonal line when plotting observed versus the predicted
values), we will log-transform the variable.
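The normality check in this example might look as follows. This is only a sketch: the simulated `hours` values and the 0.05 threshold for the Kolmogorov-Smirnov test are assumptions (the example does not state a threshold), and standardizing with the sample mean and standard deviation makes the test approximate.

```r
set.seed(7)

# Simulated, strictly positive and right-skewed volunteering hours (hypothetical)
hours <- rlnorm(300, meanlog = 1, sdlog = 1)

# Kolmogorov-Smirnov test of the standardized values against a standard normal
ks <- ks.test(as.numeric(scale(hours)), "pnorm")
print(ks$p.value)

# Assumed threshold for calling the distribution significantly non-normal
ks_alpha <- 0.05

# Log-transform only when the test indicates non-normality
hours_for_model <- if (ks$p.value < ks_alpha) log(hours) else hours
```

A plain `log()` requires strictly positive values, so real data containing zero hours would need a separate, explicitly preregistered decision.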

Sensitivity

sensitivity

Provide a series of decisions about evaluating the strength, reliability, or robustness of your focal hypothesis test. This may include within-study replication attempts, additional covariates, cross-validation efforts (out-of-sample replication, split/hold-out sample), applying weights, selectively applying constraints in an SEM context (e.g., comparing model fit statistics), overfitting adjustment techniques used (e.g., regularization approaches such as ridge regression), or some other simulation/sampling/bootstrapping method.

*Example*: To assess the sensitivity of our results to our
selection criterion for outliers, we will run an additional analysis
without removing any outliers.

Exploratory

exploratory

If you plan to explore your dataset to look for unexpected differences or relationships, describe those tests here, or add them to the final paper under a heading that clearly differentiates this exploratory part of your study from the confirmatory part.

*Example*: As an exploratory analysis, we will test the
relationship between scores on the religiosity scale and prosociality
after adjusting for a variety of social, educational, and cognitive
covariates that are available in the dataset. We have no specific
hypotheses about which covariates will attenuate the
religiosity-prosociality relation most substantially, but we will use
this exploratory analysis to generate hypotheses to test in other,
independent datasets.

*Comments*: Whereas it is not presently the norm to
preregister exploratory analyses, it is often good to be clear about
which variables will be explored (if any), for example, to differentiate
these from the variables for which you have specific predictions or to
plan ahead about how to compute these variables.