I published this post a little while ago on my previous blog. Just to have some continuity, I am reposting it onto this blog.
The current post investigates motivations for the alternation pattern between the comparative modal constructions had better (1) and better (2) (van der Auwera and De Wit 2010, Denison and Cort 2010, Mitchell 2003, Collins 2009).
(1) ‘If you know where she is, you had better tell me at once …’ (BB-cF022211)
(2) ‘ … I won’t be working for you either. We’ll be working together. You better be straight about that from the beginning.’ (BB-cF022152)
The post is meant as a more robust and statistically supported continuation of the pilot study conducted by van der Auwera and De Wit (2010). These authors tentatively suggest that the deontic expressions of advisability or optativity under study can be distinguished from each other in a number of ways, viz. animacy of a third person Advisee (animate vs. inanimate), variety (UK vs. US), type of subject (pro)noun as Advisee, and register (conversational vs. not conversational).
In summary, the following hypotheses can be highlighted.
had better is confined to narrative data, whereas better is confined to conversational data. The latter is especially so for UK English.
had better correlates with the Advisee (3SG/PL), whereas better correlates with the Advisee (1SG, 1PL or 2SG/PL).
better is not more characteristic of UK English than US English.
In the following, the alternation between had better and better will be further investigated. I have left ‘d better out of consideration for three reasons. Firstly, simple and multiple logistic regressions of the kind used here are conducted on data with binary outcomes, so only two variants can be compared. Secondly, better and had better are furthest removed from each other in both frequency (at least in my dataset) and, according to van der Auwera and De Wit, in use (with better being largely confined to third persons and conversational data). Thirdly, leaving out ‘d better keeps this study as straightforward as possible.
I used the Collins Harper Online Wordbanks Corpus (553 171 489 tokens) to investigate written and spoken UK and US English. Again to keep the study straightforward, I have tried to keep my sample small but representative. Firstly, I selected only written material from magazines and from books; the spoken data contains various unspecified genres. Secondly, all data date from 2002 to 2005, a period which I believe reflects the synchronic state of Present Day English. I found 71 instances of better and 243 instances of had better.
This study is primarily quantitative. As mentioned above, I have conducted logistic regression tests in the statistical program R. Overall, a logistic regression test provides an ‘analysis and prediction of a dichotomous outcome’ (Peng, Lee and Ingersoll 2002, p. 3). In this paper, I used two popular commands to do this: a glm (generalised linear model) for the main part of the analysis and an lrm (logistic regression model), which offers more insight into the fit of the model and its predictive power. I have tried to conform to the conventions set out by Peng, Lee and Ingersoll (2002). My data consists of 71 deontic sentences with the short better + Infinitive construction and 243 deontic sentences with the longer had better + Infinitive construction (cf. (1) and (2)). The success level of the outcome variable (Obsid) is had better, the failure level is better. To test van der Auwera and De Wit’s hypotheses (cf. supra), I chose the following predictors, which they tentatively suggested could be significant: Country (UK or US), Conversational (Yes or No), Advisee (1SG, 1PL, 2SG/PL or 3SG/PL) and Animacy (Yes or No).
3.2 Methodological issue
A methodological issue was brought to my attention after running the glm command for the first time. The predictor Animacy is actually only relevant with respect to third person advisees, because, as mentioned, the other persons can only be animate (presuming we do not personify objects and refer to them as ‘you’, but I did not find that in the data anyway). I thus had to compute a combined predictor (Advisee2) for Animacy and third person Advisee using the commands in Appendix 1. When applying Advisee2 in another glm (fit2), I found that only the predictor Country and the Advisee2 level 3SG/PL/Animate were significant (Appendix 2). Another model (d.glm), omitting Advisee/Advisee2 altogether, gave yet other results: this time, all of the predictors were significant. This indicated that the predictor Advisee (and Advisee2 for that matter) was not decisive for the alternation pattern between better and had better. An ANOVA test comparing the fits of the two models indicated that continuing with the model without the Advisee(2) predictor (d.glm) would be better. The drop1 measure (a chi-square test) further confirmed that all the predictors in my second model were significant (Appendix 3).
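The commands from Appendix 1 are not reproduced in this post, so the following is only a minimal Python sketch of the idea behind the combined predictor, not the R code I actually ran. The function name `combine_advisee` and the level label `3SG/PL/Inanimate` are assumptions made for illustration; only `3SG/PL/Animate` appears in the model output discussed above.

```python
def combine_advisee(advisee, animate):
    """Fold Animacy into Advisee, but only for third person advisees.

    First and second persons are always animate in the data, so their
    Advisee level is kept unchanged.
    """
    if advisee == "3SG/PL":
        # Hypothetical label for the inanimate level (not named in the post).
        return "3SG/PL/Animate" if animate else "3SG/PL/Inanimate"
    return advisee


# A few made-up rows: (Advisee, Animacy as a boolean).
rows = [("1SG", True), ("2SG/PL", True), ("3SG/PL", True), ("3SG/PL", False)]
advisee2 = [combine_advisee(advisee, animate) for advisee, animate in rows]
print(advisee2)
```

The point of the recoding is that Animacy no longer enters the model as a separate column that is constant for three of the four Advisee levels.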
van der Auwera and De Wit’s hunch about three of their predictors was correct. What the output shows about the predictors, however, is different from the suggested tendencies (treated in section 5.1). The output of d.glm (Appendix 4 and tables 2 and 3 below) reveals that all of the predictors were significant, with Country p<.05, Conversational p<.05 and Animacy p≈0.
As already mentioned, the success level is had better. The estimates reveal the following findings:
1. US English is less likely than UK English to use had better: the estimate of CountryUS is roughly -0.61, a negative value on the log-odds scale.
2. Conversational instances are similarly less likely to use the had better variant: the estimate of ConversationalYes is -0.63.
3. Inanimate subject (pro)nouns are much less likely to occur with had better: the estimate is -1.17.
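Because glm reports these estimates on the log-odds scale, they are easier to read once exponentiated into odds ratios. The snippet below is a small Python sketch of that arithmetic (my analysis itself was done in R); the dictionary keys are just labels, and the label for the Animacy estimate is an assumption since the post does not give the coefficient name.

```python
import math

# Estimates as reported above (log-odds scale, success level = had better).
estimates = {
    "CountryUS": -0.61,
    "ConversationalYes": -0.63,
    "Animacy (inanimate)": -1.17,  # label assumed; coefficient name not given
}


def odds_ratio(log_odds):
    """Convert a logistic regression estimate into an odds ratio."""
    return math.exp(log_odds)


odds_ratios = {name: round(odds_ratio(b), 2) for name, b in estimates.items()}
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio}")
```

An odds ratio below 1 means that the odds of had better are lower for that level than for the reference level, with the other predictors held constant. It does not by itself say that the probability of had better is low in absolute terms.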
A first model diagnostic can at this stage be made: the residual deviance (310.96 on 310 degrees of freedom) shows that there is no sign of under- or overdispersion. The residual deviance does not differ much from the degrees of freedom and this confirms that the model has a good fit.
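The comparison can be made explicit by dividing the residual deviance by its degrees of freedom. This is only a Python sketch of the arithmetic with the two figures quoted above; the variable names are mine.

```python
# Figures reported in the glm output above.
residual_deviance = 310.96
residual_df = 310

# A ratio close to 1 is what the paragraph above reads as
# "no sign of under- or overdispersion".
dispersion_ratio = residual_deviance / residual_df
print(round(dispersion_ratio, 3))
```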
From the output of the lrm, we derive that the deviance of the model with 3 predictors is significantly smaller than that of the intercept-only model because p ≈ 0. The likelihood ratio chi-square (L.R.) is not very high (24.73), but it is high enough to show that the fitted model is a better fit than the intercept-only model. C and Dxy assess the classificatory quality of the model (Speelman 2011). My model does not have ‘decent’ classificatory quality because Dxy (.424) is lower than .6 and C is lower than .8 (the value I recorded, .36, does not match the Dxy, which implies a C of about .71 since Dxy = 2(C − 0.5), but both fall short of .8).
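Both lrm figures can be checked by hand. The Python sketch below computes the upper-tail probability for the likelihood ratio chi-square and converts Dxy into C. It assumes 3 degrees of freedom (three binary predictors), which the post implies but does not state, and it uses the closed-form chi-square tail for 3 degrees of freedom together with the relation Dxy = 2(C − 0.5). The function names are mine.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def c_from_dxy(dxy):
    """Concordance index C implied by Somers' Dxy, using Dxy = 2 * (C - 0.5)."""
    return 0.5 + dxy / 2


p_value = chi2_sf_3df(24.73)  # assumes 3 degrees of freedom
c_index = c_from_dxy(0.424)

print(f"p = {p_value:.2e}")
print(f"C = {c_index:.3f}")
```

Under these assumptions the p-value is far below .001, and a Dxy of .424 corresponds to a C of about .71, which is still below the .8 threshold.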
One of the many good things about logistic regression is that you can distinguish partial effects, which are the effects of an individual predictor when the other predictors are controlled for (see Appendix 5). This gives a much clearer view of the actual effect of a predictor. For my data, the partial effect commands revealed that (1) the UK has a probability of roughly 85% that had better will be used when controlled for Animacy and Conversational, (2) the UK has a probability of roughly 85% that had better will occur in non-conversational instances and (3) the UK has a probability of roughly 75% that the subject (pro)noun will be animate when using had better.
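To illustrate what ‘controlled for the other predictors’ means, the Python sketch below shifts a baseline probability by one estimate on the odds scale. It is purely illustrative: it assumes that the roughly 85% for the UK is the baseline with the other predictors held at fixed levels, and it applies the CountryUS estimate of -0.61 to it. The resulting US figure is not taken from my R output.

```python
import math


def shift_probability(baseline_prob, log_odds_estimate):
    """Apply a logistic regression estimate to a baseline probability.

    The baseline is converted to odds, multiplied by the odds ratio
    exp(estimate), and converted back to a probability.
    """
    odds = baseline_prob / (1 - baseline_prob)
    new_odds = odds * math.exp(log_odds_estimate)
    return new_odds / (1 + new_odds)


uk_prob = 0.85           # approximate partial effect reported above
country_us = -0.61       # estimate of CountryUS from d.glm
us_prob = shift_probability(uk_prob, country_us)
print(round(us_prob, 2))
```

Under those assumptions, the same configuration in US English would still favour had better, only less strongly than in UK English.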
5. Discussion and conclusion
5.1 (Dis)confirming hypotheses
Hypothesis 1: had better is confined to narrative data, whereas better is confined to conversational data. The latter is especially so for UK English.
Recalling finding 2 above, that conversational instances are not likely to occur with had better, we see that the first hypothesis is confirmed (Conversational p<.05). We can also see that, proportionally, UK English makes more use of better than US English in conversational data. An explanation for this confinement to conversational data could be the level of informativity that is associated with better constructions.
Hypothesis 2: had better correlates with the Advisee (3SG/PL), whereas better correlates with the Advisee (1SG, 1PL or 2SG/PL).
Section 3.2 explained that the Advisee predictor was not significant for predicting which of the two variants is used. When addressing the hypothesis, then, it becomes apparent that, compared to US English, UK English prefers the use of 1SG and 1PL with had better and that the use of 3SG/PL/A+IA is preferred in the US with better, whereas in UK English there is a more balanced distribution. The partial effects test (section 4.1) also confirms that the UK has a probability of roughly 75% that the subject (pro)noun will be animate when using had better.
What d.glm did reveal, however, was finding 3. That is, there is a very low probability for the subject (pro)noun to be inanimate when using had better. This might seem counterintuitive given the higher overall frequency of had better. However, it could be indicative of a specialised use of better with inanimate subjects. Further research on a larger set of data will be needed to get to the core of this.
Hypothesis 3: better is less characteristic of UK English than US English.
The predictor Country was a significant predictor and indeed it predicts that better is more characteristic of US English. This was already shown by finding 1 above. In the mosaic plot, which is sadly not available to you, had better covers a larger area relative to better for the UK than it does for the US. The reason for this might lie in the assumption that the better constructions have been used for a longer period of time in US English than in UK English. A more widespread distribution would then not be surprising. This claim, however, is not mine to make because I do not have access to data that would confirm it.
In conclusion, van der Auwera and De Wit’s pilot study had the right intuitions about the modal comparatives had better and better. My larger dataset showed that there is indeed a difference between UK and US English, with a preference for better in US English. Furthermore, conversational data was found to correlate with the use of better. In one respect, however, van der Auwera and De Wit’s analysis turned out to be irrelevant for the choice between the two variants. My analysis illustrated that the grammatical person and number of the subject of the sentence is in fact not significant for the choice between either variant. Instead, it is the predictor Animacy that is most decisive: there is a very low probability for the subject (pro)noun to be inanimate when using had better.
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. 1999. Longman grammar of spoken and written English. London: Longman.
Collins, P. 2009. Modals and quasi-modals. Amsterdam & New York: Rodopi.
Denison, D. and Cort, A. 2010. Better as a Verb. In: Davidse, K., Vandelanotte, L. and Cuyckens, H. (eds.). Subjectification, Intersubjectification and Grammaticalization. Berlin, New York: Walter de Gruyter, pp. 349-383.
Hopper, P.J. and Traugott, E.C. 2003. Grammaticalization. Cambridge: Cambridge University Press.
Leech, G. 2003. Modality on the move: The English modal auxiliaries 1961-1992. In: Facchinetti, R., Krug, M. and Palmer, F. (eds.). Modality in contemporary English. Berlin & New York: Walter de Gruyter, pp. 225-240.
Mitchell, K. 2003. Had better and might as well: On the margins of modality. In: Facchinetti, R., Krug, M. and Palmer, F. (eds.). Modality in contemporary English. Berlin & New York: Walter de Gruyter, pp. 131-149.
Peng, C.J., Lee, K.L. and Ingersoll, G.M. 2002. An Introduction to Logistic Regression Analysis and Reporting. The Journal of Educational Research 96(1), pp. 3-14.
Speelman, D. 2011. Logistic Regression in Corpus Linguistics.
van der Auwera, J. and De Wit, A. 2010. The English Comparative Modals. In: Cappelle, B. and Wada, N. (eds.). Distinctions in English Grammar. Offered to Renaat Declerck. Tokyo: Kaitakusha, pp. 127-147.
 This is for obvious reasons only relevant for third person subjects. Inanimate (IA) will only refer to 3SG/PL.