‘Better start work.’ – ‘Yes, you better had.’ — an alternation study.


I published this post a little while ago on my previous blog. Just to have some continuity, I am reposting it onto this blog.

1. Introduction

The current post investigates motivations for the alternation pattern between the comparative modal constructions had better (1) and better (2) (van der Auwera and De Wit 2010, Denison and Cort 2010, Mitchell 2003, Collins 2009).

(1) ‘If you know where she is, you had better tell me at once …’ (BB-cF022211)

(2) ‘… I won’t be working for you either. We’ll be working together. You better be straight about that from the beginning.’ (BB-cF022152)

The post is meant as a more robust and statistically supported continuation of the pilot study conducted by van der Auwera and De Wit (2010). These authors tentatively suggest that the researched deontic expressions of advisability or optativity can be distinguished from each other in a number of ways, viz. 3rd person Advisee is animate vs. inanimate, UK vs. US, type of subject (pro)noun as Advisee, conversational vs. not conversational.

2. Hypotheses

In summary, the following hypotheses can be highlighted.

Hypothesis 1:

had better is confined to narrative data, whereas better is confined to conversational data. The latter holds especially for UK English.

Hypothesis 2:

had better correlates with the Advisee (3SG/PL), whereas better correlates with the Advisee (1SG, 1PL or 2SG/PL).

Hypothesis 3:

better is less characteristic of UK English than US English.

3. Methodology

3.1 Preliminaries

In the following, the alternation between had better and better will be further investigated. I have left ’d better out of consideration for two main reasons. First, simple and multiple logistic regressions of the kind used here are conducted on data with binary outcomes. Second, better and had better are furthest removed from each other in both frequency (at least in my dataset) and – according to van der Auwera and De Wit – in use (with better being largely confined to third persons and conversational data). A further reason for leaving out ’d better is to keep this study as straightforward as possible.

I used the Collins Harper Online Wordbanks Corpus (553 171 489 tokens) to investigate written and spoken UK and US English. Again to keep the study straightforward, I have tried to keep my sample small but representative: Firstly, I selected only written material from magazines and from books. Spoken data contains various unspecified genres. Secondly, all data ranges from 2002 to 2005, a set which I believe will highlight the synchronic condition of Present Day English. I found 71 instances of better and 243 instances of had better.

This study is primarily quantitative. As mentioned above, I have conducted logistic regression tests in the statistical program R. Overall, a logistic regression test provides an ‘analysis and prediction of a dichotomous outcome’ (Peng, Lee and Ingersoll 2002, p. 3). In this paper, I used two popular commands to do this: glm (generalised linear model) for the main part of the analysis and lrm (logistic regression model), which offers more insight into the fit of the model and its predictive power. I have tried to conform to the conventions set out by Peng, Lee and Ingersoll (2002). My data consists of 71 deontic sentences with the short better + Infinitive construction and 243 deontic sentences with the longer had better + Infinitive construction (cf. (1) and (2)). The success level of the outcome variable (Obsid) is had better, the failure level is better. To test van der Auwera and De Wit’s hypotheses (cf. supra), I chose the predictors that they tentatively suggested could be significant: Country (UK or US), Conversational (Yes or No), Advisee (1SG, 1PL, 2SG/PL or 3SG/PL) and Animacy (Yes or No)[1].
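As a quick sanity check on the two counts, the baseline log-odds of had better can be worked out by hand. The sketch below is in Python rather than R, uses only the counts reported in this post, and the variable names are my own; it shows what an intercept-only logistic model would estimate before any predictor is added.

```python
import math

# Counts reported in this post
n_had_better = 243  # success level
n_better = 71       # failure level

total = n_had_better + n_better
p_had_better = n_had_better / total   # baseline proportion of had better
odds = n_had_better / n_better        # baseline odds of had better
log_odds = math.log(odds)             # intercept of an intercept-only logistic model

print(f"n = {total}, P(had better) = {p_had_better:.3f}, "
      f"odds = {odds:.2f}, log-odds = {log_odds:.2f}")
```

Any predictor estimate in the models discussed below is a shift on this same log-odds scale.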

3.2 Methodological issue

A methodological issue came to my attention after running the glm command for the first time. The predictor Animacy is actually only relevant for third person advisees, because, as mentioned, the other persons can only be animate (presuming we do not personify objects and refer to them as ‘you’, but I did not find that in the data anyway). I thus had to compute a combined predictor (Advisee2) for Animacy and third person Advisee using the commands in Appendix 1. When applying Advisee2 in another glm (fit2), I found that only the predictor Country and the level 3SG/PL/Animate of the combined predictor Advisee2 were significant (Appendix 2). Another model (d.glm), omitting Advisee/Advisee2 altogether, gave yet other results: this time, all of the predictors were significant. This indicated that the predictor Advisee (and Advisee2 for that matter) was not decisive for the alternation pattern between better and had better. An ANOVA test comparing the fits of the two models indicated that continuing with the model without the Advisee(2) predictor (d.glm) would be better. The drop1 measure (a chi-square test) further confirmed that all the predictors in this second model were significant (Appendix 3).
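The idea behind the combined predictor can be sketched in a few lines. This is not the R code from Appendix 1, only a Python illustration of the same logic: animacy is distinguished for third persons only. The function name and the level name 3SG/PL/Inanimate are my own assumptions (the post only attests 3SG/PL/Animate).

```python
def combine_advisee_animacy(advisee: str, animate: bool) -> str:
    """Return the combined level; animacy only matters for third persons."""
    if advisee == "3SG/PL":
        # Hypothetical level names, modelled on '3SG/PL/Animate'
        return f"{advisee}/{'Animate' if animate else 'Inanimate'}"
    return advisee  # first and second persons are always animate

# A few made-up rows, purely to show the mapping
rows = [("1SG", True), ("2SG/PL", True), ("3SG/PL", True), ("3SG/PL", False)]
advisee2 = [combine_advisee_animacy(a, anim) for a, anim in rows]
print(advisee2)
```

The result is a single factor with five levels instead of two partly redundant predictors.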

4. Results

4.1 General

1. van der Auwera and De Wit’s hunch about three of their predictors was correct. What the output shows about the predictors, however, differs from the suggested tendencies (treated in section 5.1). The output of d.glm (Appendix 4 and tables 2 and 3 below) reveals that all of the predictors were significant, with Country p<.05, Conversational p<.05 and Animacy p≈0.

As already mentioned, the success level is had better. The estimates reveal the following findings:

1. US English is less likely than UK English to use had better: the estimate of CountryUS is roughly -0.61 on the log-odds scale.

2. Conversational instances are about equally less likely to use the had better variant: the estimate of ConversationalYes is -0.63.

3. Inanimate subject (pro)nouns are much less likely than animate ones to occur with had better: the estimate is -1.17.
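Because glm estimates are on the log-odds scale, exponentiating them gives odds ratios, which are often easier to read. A minimal Python sketch using the three estimates quoted above; the label for the animacy coefficient is my own, since the post does not give its exact name.

```python
import math

# Estimates as quoted in the findings above (log-odds scale)
estimates = {
    "CountryUS": -0.61,
    "ConversationalYes": -0.63,
    "Animacy (inanimate)": -1.17,  # label is an assumption
}

# exp(estimate) = factor by which the odds of had better change
odds_ratios = {name: math.exp(b) for name, b in estimates.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```

An odds ratio below 1 means the odds of had better go down relative to the reference level; it does not by itself say that the probability of had better is low.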

A first model diagnostic can at this stage be made: the residual deviance (310.96 on 310 degrees of freedom) shows that there is no sign of under- or overdispersion. The residual deviance does not differ much from the degrees of freedom and this confirms that the model has a good fit.
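The dispersion check amounts to simple arithmetic, sketched here in Python. The parameter count (an intercept plus three binary predictors) is my assumption; it is consistent with the 310 degrees of freedom reported.

```python
n_obs = 71 + 243          # observations in the dataset
n_params = 1 + 3          # assumed: intercept + three binary predictors
df_residual = n_obs - n_params

residual_deviance = 310.96
dispersion_ratio = residual_deviance / df_residual

print(f"df = {df_residual}, deviance / df = {dispersion_ratio:.3f}")
```

A ratio close to 1 is the pattern described above. With ungrouped binary outcomes this ratio is only a rough heuristic, so it is best read together with the other diagnostics.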

2. From the output of the lrm, we derive that the deviance of the model with 3 predictors is significantly smaller than that of the intercept-only model because p ≈ 0. The likelihood ratio chi-square (L.R.) is not very high (24.73), but it still means that the fitted model is a better fit than the intercept-only model. C and Dxy assess the classificatory quality of the model (Speelman 2011). My model does not have ‘decent’ classificatory quality because Dxy (.424) is lower than .6 and C (.36) is lower than .8.
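Two of these numbers can be rechecked by hand. The Python sketch below assumes that the likelihood ratio test has 3 degrees of freedom (one per binary predictor) and uses the closed-form upper tail of the chi-square distribution for exactly 3 degrees of freedom. It also uses the standard identity Dxy = 2(C - 0.5) that links the two classification measures. The function names are my own.

```python
import math

def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

def c_from_dxy(dxy: float) -> float:
    """Concordance index C implied by Somers' Dxy, via Dxy = 2 * (C - 0.5)."""
    return dxy / 2 + 0.5

lr_chi2 = 24.73                      # L.R. statistic reported by lrm
p_value = chi2_sf_df3(lr_chi2)       # assumes 3 degrees of freedom
print(f"p = {p_value:.2e}")

print(f"C implied by Dxy = .424: {c_from_dxy(0.424):.3f}")
```

Under that identity the thresholds of .6 for Dxy and .8 for C are the same cut-off, and a Dxy of .424 goes with a C of about .712, so the C value is worth double-checking against the lrm output.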

3. One of the many good things about logistic regression is that you can distinguish partial effects, which are the effects of an individual predictor when controlled for the other predictors (see Appendix 5). This gives a much clearer view of the actual effect of a predictor. For my data, the partial effect commands revealed that (1) the UK has a probability of roughly 85% that had better will be used when controlled for Animacy and Conversational, (2) the UK has a probability of roughly 85% that had better will occur in non-conversational instances and (3) the UK has a probability of roughly 75% that the subject (pro)noun will be animate when using had better.
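To see how a probability and an estimate fit together, the logit and inverse logit functions can be applied by hand. The Python sketch below is only an illustration, not the partial effects command from Appendix 5: it assumes that the roughly 85% for the UK is a predicted probability with the other predictors held fixed, and then shifts it by the CountryUS estimate of -0.61.

```python
import math

def logit(p: float) -> float:
    """Probability -> log-odds."""
    return math.log(p / (1 - p))

def inv_logit(x: float) -> float:
    """Log-odds -> probability."""
    return 1 / (1 + math.exp(-x))

p_uk = 0.85                      # rounded partial-effect probability for the UK
country_us = -0.61               # estimate of CountryUS (log-odds scale)

# Assumption: switching Country from UK to US adds the estimate on the log-odds scale
p_us = inv_logit(logit(p_uk) + country_us)
print(f"P(had better | US), other predictors unchanged: {p_us:.3f}")
```

Under these assumptions the US probability still comes out well above 50%, which shows that a negative estimate lowers the probability of had better without making had better the rarer variant.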

5. Discussion and conclusion

5.1 (Dis)confirming hypotheses

Hypothesis 1: had better is confined to narrative data, whereas better is confined to conversational data. The latter holds especially for UK English.

Returning to finding 2 above, that conversational instances are less likely to occur with had better, we see that the first hypothesis is confirmed (Conversational p<.05). We can also see that, proportionally, UK English makes more use of better than US English in conversational data. An explanation for this confinement to conversational data could be the level of informativity that is associated with better constructions.

Hypothesis 2: had better correlates with the Advisee (3SG/PL), whereas better correlates with the Advisee (1SG, 1PL or 2SG/PL).

Section 3.2 explained that the Advisee predictor was not significant for predicting either of the studied variants. Turning to the hypothesis, then, it becomes apparent that, compared to US English, UK English prefers the use of 1SG and 1PL with had better and that the use of 3SG/PL/A+IA is preferred in the US with better, whereas in UK English there is a more balanced distribution. The partial effects test (section 4.1) also confirms that the UK has a probability of roughly 75% that the subject (pro)noun will be animate when using had better.

What d.glm did reveal, however, was finding 3: inanimate subject (pro)nouns are much less likely than animate ones to occur with had better. This might seem counterintuitive given the higher frequency of had better. However, it could be indicative of a specialised use of better with inanimate subjects. Further research on a larger dataset will be needed to get to the core of this.

Hypothesis 3: better is less characteristic of UK English than US English.

The predictor Country was significant and indeed it predicts that better is more characteristic of US English. This was already shown by finding 1 above. In the mosaic plot, which is sadly not available to you, the area covered by had better for UK is larger than for better in comparison to that covered for US. The reason for this might lie in the assumption that the better constructions have been used for a longer period of time in US English than in UK English. A more widespread distribution would then not be surprising. This claim, however, is not mine to make because I do not have access to data that would confirm it.

5.2 Conclusion

In conclusion, van der Auwera and De Wit’s pilot study had the right intuitions about the modal comparatives had better and better. My larger dataset showed that there is indeed a difference between UK and US English, with a preference for better in US English. Furthermore, conversational data was found to correlate with the use of better. In one respect, however, van der Auwera and De Wit’s analysis turned out to be irrelevant for the choice between the two variants. My analysis illustrated that the grammatical person and number of the subject of the sentence is in fact not significant for the choice between the variants. Instead, it is the predictor Animacy which is most determining: inanimate subject (pro)nouns are much less likely to occur with had better.


Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. 1999. Longman grammar of spoken and written English. London: Longman.

Collins, P. 2009. Modals and quasi-modals. Amsterdam & New York: Rodopi.

Denison, D. and Cort, A. 2010. Better as a Verb. In: Davidse, K., Vandelanotte, L. and Cuyckens, H. (eds.). Subjectification, Intersubjectification and Grammaticalization. Berlin, New York: Walter de Gruyter, pp. 349-383.

Hopper, P.J. and Traugott, E.C. 2003. Grammaticalization. Cambridge: Cambridge University Press.

Leech, G. 2003. Modality on the move: The English modal auxiliaries 1961-1992. In: Facchinetti, R., Krug, M. and Palmer, F. (eds.). Modality in contemporary English. Berlin & New York: Walter de Gruyter, pp. 225-240.

Mitchell, K. 2003. Had better and might as well: On the margins of modality. In: Facchinetti, R., Krug, M. and Palmer, F. (eds.). Modality in contemporary English. Berlin & New York: Walter de Gruyter, pp. 131-149.

Peng, C.J., Lee, K.L. and Ingersoll, G.M. 2002. An Introduction to Logistic Regression Analysis and Reporting. The Journal of Educational Research 96(1), pp. 3-14.

Speelman, D. 2011. Logistic Regression in Corpus Linguistics.

van der Auwera, J. and De Wit, A. 2010. The English Comparative Modals. In: Cappelle, B. and Wada, N. (eds.). Distinctions in English Grammar. Offered to Renaat Declerck. Tokyo: Kaitakusha, pp. 127-147.

[1] This is for obvious reasons only relevant for third person subjects. Inanimate (IA) will only refer to 3SG/PL.


‘Possibly’ means that we’re not going to the game.

I published this post a little while ago on my previous blog. Just to have some continuity, I am reposting it onto this blog.

Much research has been done on modality and modal adverbs. As to the epistemic domain, it is often said that ‘possibly’ denotes a 50% likelihood that something is true and that ‘probably’ expresses high likelihood. As such, ‘possibly’ has been related to the modal verb ‘may/might’. While watching a film (The Pursuit of Happyness) the other day, I could not help but be fascinated by the following dialogue (bold is emphasis in speech, cursive are the modal markers):


Father: And maybe we’re going to the game.

Son: Where are we going now?

Father: Just to see someone about my job.

Son: I don’t understand.

Father: You don’t understand what?

Son: Are we going to the game?

Father: I said possibly we’re going to the game. You know what possibly means?

Son: Like probably?

Father: No, probably means there’s a good chance that we’re going to the game. And possibly means we might or we might not. What does probably mean?

Son: It means we have a good chance.

Father: And what does possibly mean?

Son: I know what possibly means.

Father: What does it mean?

Son: It means that we’re not going to the game.

This little dialogue is a beautiful reflection of the subtle nuances of the English modal system, where ‘maybe’, ‘possibly’ and ‘might’ are linked and opposed to the present progressive tense and the adverb ‘probably’. Even more, where ‘possibly’ is related to the negative. I don’t have that much to say about this – because I should actually be writing a paper – but thought it was a particularly interesting bit of the film to share with you interested readers!


Eroding semiotics – the time smileys lost noses and eyes

Language change is a funny thing. It can cause frequent items to erode and go from a fully lexical, and thus understandable and retrievable, item to an altogether weird amalgamation of sounds. Take the standard example we’ve all grown tired of: ‘I do not know’ > ‘I dunno’, or ‘It does not matter’ > ‘Dun matte(r)’. Yes, us speakers are funny creatures, but it is not only in speech that such reductive processes are found (as has been frequently shown).

The smiley in written language is now common practice for nearly all computer users – even for less talented users like myself. But a while ago, something struck me as so odd that it kept bothering me in the back of my mind. Now I have finally formed some theory as to what was going on. It was during some frequent email correspondence that I noticed that my correspondent ‘must have had a fawlty keyboard’. After all, every time this person used a smiley, only part of it would show, viz. the bracket ‘)’ or ‘(’. But then I noticed another thing: the person’s colon made sporadic appearances, when introducing lists for example. That was proof: the use of the colon wasn’t lost on this person, it was just not necessary anymore for smileys. Indeed, the smiley seems to have gone through the following reductive process, through which the poor semiotic has lost most of its features, namely its eyes and nose:

‘this is a joke/funny’ or ‘I feel happy’    >    :-)    >    :)    >    )

I started wondering: if not due to a fawlty keyboard, why would one want to gradually get rid of what makes a smiley a smiley? Of course this seems to happen to all items in noteworthy change. So what’s more? Well, time is money, so economy jumped to mind straight away. Thank you capitalism. Now, we cannot forget about routinisation. We’re all so used to seeing the symbols that make up a smiley, that our interpretative skills would not have to make much of an effort to understand the mere bracket. Next, though, another – perhaps trivial – thing dawned on me: Microsoft Office (and I assume other such computer software programs). Yes, indeed; whenever we want to save a file or a document, giving it an appropriate name which expresses our happiness about finishing our latest blogpost (to mention but a random example), the bugger won’t let us use symbols, including the beloved colon! Is this part of the breeding ground for the development of the smiley?

I agree, the latter is quite speculative, but maybe the importance of computer-language shouldn’t be underestimated. This little example shows again that our spoken means of expressing (feelings) have been moulded to the constraints of our time: ‘time is money’ (economy), ‘we’re creatures of habit’ (routinisation) and ‘the confines of limited virtual space’ (the computer).

Most interesting perhaps is that this applies not only to the written realm but is also expanding into the spoken realm, where people have been attested to say ‘lol’, for instance. From a functional perspective, this could be quite revealing about communicative behaviour. Is this just a sign of the integration of these computer-induced acronyms, or do they serve other functions too? I can think of a few for ‘lol’ that I have come across. Sometimes, it seems as if we use it as a mitigator in order not to threaten someone else’s face. In other words, when we say ‘lol’ (well, I don’t, actually…), sometimes we don’t mean that we’re laughing out loud (because then we would probably do so) but we mean ‘you’re actually not being that funny, but I’ll say lol to accommodate you’. In other circumstances, though, the acronym is a reinforcer of our hearty laugh: ‘you are actually being funny’. So, are we reinforcing the good old laugh (and other facial and linguistic expressions) with a new paradigm of acronyms and featureless smiles? In reality, do we actually poke our tongues out? I have definitely seen it done more often.

Sadly, these speculations are actually outside of my field of study, but I thought it was quite an interesting thing to note. Discourse analysts are probably already heavily working on the smile(y) as I type ).

ICAME – Disappearances and Failures in Language Change

Wednesday, I was lucky enough to attend, free of charge, a pre-conference workshop of the ICAME conference in Leuven, organised by the University of Leuven and the University of Namur. Despite the fact that I had to choose one of five very interesting workshops, I was soon – evidently – drawn towards WS3 on Disappearances and Failures in Language Change. Hendrik De Smet and Peter Petré hosted, introduced and concluded five most inspiring talks by Florian Dolberg, Malte Rosemeyer, Marianne Hundt, Mathieu Fraikin & Peter Petré and Stefan Diemer.

It was Hendrik De Smet and Peter Petré’s aim to raise awareness that the contemporary focus on success stories in language change, most notably in grammaticalisation studies, does not (and indeed should not) constitute the whole picture. ‘There are two sides to the coin of language change’ resonated clearly in the talks and the call for papers. Grammaticalisation studies (and studies on language change at large) have focused mainly on stories of emancipation and entrenchment, leading to a perhaps wrong conception that grammaticalisation inevitably implies conventionalisation and routinisation. This, however, is not entirely true, a point that is more often neglected than remarked upon. Accepting the above also has its repercussions for the importance of frequency: although frequency is certainly a prerequisite of sorts for grammaticalisation (which does not necessarily lead to entrenchment), frequency in itself cannot be seen as conclusive or explanatory, nor as a direct cause of success stories in language change. Additionally, it seems to me that the same principles of intralinguistic competition, system pressure and extralinguistic pressure underlie both the creation and the dissolution of linguistic items and paradigms.

The workshop was particularly interesting to me because I am currently conducting research on a seemingly disappearing modal paradigm in the English language, viz. the comparative modal better-constructions (‘had better’, ‘’d better’, ‘better’). Making the topic even more interesting are conflicting views on the modal paradigm: on the one hand, we witness that the well-established ‘traditional’ modal verbs are actually also declining; on the other hand, Krug (2000) documents newly emerging modals (gonna, wanna, gotta, etc.) acting as replacements for the older modal verbs. The question of where the better constructions belong, and whether they will eventually assimilate to the emerging modals or disappear like the traditional modal verbs, now seems more pressing than ever. So, we are witnessing something like this:

modal verbs – decline

better constructions – decline        will there be analogy? or dismissal?

emerging modals – rise                what is the role of entrenchment and frequency?

Malte Rosemeyer, with his usage-based account on English and Spanish be + pp, raised some interesting points on persistence and entrenchment. His talk got me thinking; if the modal verb paradigm is declining, and even the less conventionalized modal verb replacements (‘competitors’) are experiencing a similar decline, could this mean that competition is lost or loosened between grammatical modal expressions? Going a few speculative steps further: could this mean that the entrenchment of the modal paradigm is slowly becoming less important to the English speaker, and why?

Stefan Diemer ended the talks on a positive note and gave all of us modality lovers hope: linguistic items can stage a comeback! I wonder when the modals (or anything like them) will strike back… Must they, need they, should they, can they, will they?