Computer-Based Testing Vs. Paper-Based Testing

What are the advantages and disadvantages, and what is the future of language testing?

The situation now

On the 25th of February 2011 the BBC Today Programme invited Isabel Nisbet, the outgoing chief of the UK-based qualifications watchdog, Ofqual, to come on and discuss the issue of computer-based testing (CBT). Ms Nisbet stated that the general attitude to CBT had been that it was too difficult a topic to raise, because of the number of pitfalls, complications and sources of opposition. However, as outgoing chief she had decided to ‘move it off the too difficult pile’ and try to get the subject addressed because she felt it was an important issue. Her explanation was that because children do much of their learning and exploring digitally, they should be assessed in the same way. Her exact words were ‘In the future, how things are tested should match how people learn and how they act.’ This echoes one of the most important issues in language testing, raised by Bachman and Palmer (1996): tests should resemble the real thing for which they are testing. This is authenticity, one of their criteria for test usefulness. Because the way people, especially younger generations, interact with the world is largely going to be through a computer, testing and assessment should reflect that. This is certainly worth discussing, as it will undoubtedly have international repercussions across all areas of education.

The advantages

CBT allows for more accurate, secure, rapid and controlled test administration at every stage: from students sitting the test, to tests being marked and results being published, all the way through to researching the data and evaluating the test. This is perhaps something critics of CBT would argue against, but I think any scepticism on this point stems from a mistrust of technology rather than a genuine belief that paper-based testing (PBT) is actually better in these respects. As long as the computers are reliable and secure, there is no reason to doubt the claim that CBT is far superior here. I will address the problems in the next section.

Another great advantage is the one voiced by Nisbet: the authenticity of the tests, and the fact that they reflect the real-world situations in which the skills being tested are used. In language testing, this is referred to as the Target Language Use (TLU) domain. It also applies to fields other than language testing. If, upon graduation, you mainly compose emails in French to colleagues and rarely write postcards on paper, then the test you sat to graduate should reflect this. For my GCSEs I wrote a postcard in French as part of the test. I remember I wrote a nice little postcard and then turned the page, only to discover to my horror that there was a whole other page of blank space in which to write the ‘postcard’. I was incredibly angry about this, because postcards are short. The test didn’t even match what a ‘postcard’ was in reality. If I were taking that test today, I would be equally annoyed if I were asked to compose an ‘email’ and in fact I was writing it with a pen. I have seen many examples of this inauthenticity, caused by writing in the wrong medium, in test preparation courses that I have used as a teacher. This lack of authenticity not only damages the students by not testing them in the context in which they will use their skills, but also damages the face validity of the test itself, which could lead to resentment and loss of motivation.

In addition, by administering a test on the computer, paper printing is almost entirely eliminated. This could reduce administration costs as well as environmental impact. Of course, that is assuming the institution does not have to buy computers especially for the test. Also, because computers can mark any objective sections (where each item has a clear right or wrong answer) almost instantaneously, there is no longer any need to pay humans to work through marking grids. This speeds up results and feedback, cuts costs and of course improves the accuracy of marking.
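
To make the point about objective marking concrete, here is a minimal sketch in Python. It is only an illustration: the function name `mark_objective`, the question ids and the answer format are all my own assumptions, and it is not taken from any real testing platform. It simply compares each response with an answer key, ignoring case and surrounding spaces.

```python
def mark_objective(answer_key: dict[str, str], responses: dict[str, str]) -> dict:
    """Mark objective items by comparing each response with the answer key.

    Hypothetical helper for illustration only: a missing response counts as wrong,
    and comparison ignores case and leading/trailing whitespace.
    """
    per_item = {
        question: responses.get(question, "").strip().lower() == correct.strip().lower()
        for question, correct in answer_key.items()
    }
    return {
        "score": sum(per_item.values()),  # True counts as 1, False as 0
        "total": len(answer_key),
        "per_item": per_item,
    }


# Example: three multiple-choice items, one answered correctly, one wrongly, one skipped
key = {"q1": "b", "q2": "d", "q3": "a"}
student = {"q1": "B ", "q2": "c"}
result = mark_objective(key, student)
print(result["score"], "out of", result["total"])
```

Anything beyond this (partial credit, free-text answers, essays) is where human markers, or far more sophisticated software, would still be needed.
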
Other studies have investigated the difference between CB and PB tests (see further reading). Most of them conclude that CBT is advantageous for students and test administrators alike. So, why the opposition to CBT? What are the dangers and what is holding us back?

The disadvantages

The main concerns are expense, technical issues, the worry that CBT takes something away from PBT, and becoming too dependent on computers, all set against the savings in paper and administration costs.
Earlier I mentioned a possible mistrust of technology which deters both students and institutions from implementing CBT. I have experience of this myself, so I don’t want to come across as a blind technology advocate claiming that we should do away with all paper-based tests. I remember coaching Diego, a Spanish student, for his TOEFL iBT. The system we used was an internal practice test whose server recorded audio from the students’ microphones. We were in the UK and the server was in the US. On top of that, we had a very slow connection, and as we were in a very built-up area internet contention was also very high. For this reason, at least two or three times in every practice test we did, a few students would lose their speaking test answers. Many students were irritated by this poor and unreliable technology, especially because they were paying a lot for their courses. I always used to say ‘it won’t happen when you take the real test’. Sadly, it did happen to Diego. The computer on which he was working during the real test shut down in the middle of the exam. He was not allowed to re-take and had to pay again to do the test.

Of course, another problem here is that human error can never be completely accounted for when using computers. Diego was the kind of student who was often on the receiving end of inexplicable technical errors. He once kicked the power switch at the socket by mistake and shut down a neighbour’s computer. However, part of setting up computers for use in class is ensuring that the computers are secure and the workstation is appropriate (i.e. power switches cannot be knocked and cables cannot be pulled out accidentally). Therein lies the problem: CBT does not eliminate human error, and the line between computer error and human error is very fine. In addition, computers do often go wrong, especially older machines and public computer terminals which have hundreds of different users. Administering even thirty or so networked public machines is a full-time job. Computer viruses, bad configuration, faulty hardware and unreliable internet connections all contribute to this. On top of that, students and teachers need training to use the machines and the specific software. So, computers are no panacea in education.

However, let me point out that humans are just as prone to error. Test results get lost in the post, and teachers take essays home to mark and never return them. Then there is the look on a student’s face when you hand out everyone’s homework except theirs, but they swear they handed it in. Yes, there are plenty of reasons to look for a more secure and reliable alternative to paper-based tests.

These problems can be managed as long as there is a skilled and reliable technician on hand before the test (to check that the machines are correctly set up and maintained) and during the test. Also, test instructions and support on the use of the computers must be clear and easy to follow regardless of the student’s level of computer or language ability. The connections through which answers and results are sent must be secure and reliable. Where possible, the machines should be up to date and regularly serviced to avoid malfunctions. These things apply not just to CBT but to any use of computers; however, where the stakes are higher (as with an institutionalised language test) the need to ensure they are all in place increases.

The future

The last ten or fifteen years have seen computers fall in price and rise significantly in reliability and power. Most schools, universities and private language institutions in developed countries around the world are well equipped to offer students computer and internet access. For this reason, it seems fitting that people such as Isabel Nisbet should raise the issue of introducing computer-based testing to replace traditional paper-based tests on a national scale. As with many technological advances, language teaching may well be in the vanguard of this conversion to CBT; however, that means it is more likely the technology will be tried and tested by the time it takes over. It is not likely to happen overnight, or even within the next five years. What is certain is that large institutionalised tests will have to offer students the option of taking the test on the computer, and gradually the PBTs will be phased out. Already, TOEFL iBT is doing this, with TOEFL PBT fast becoming obsolete. However, ETS has had to redesign much of the test and the way it is administered in order to make the most of the computer-based format. For this reason I have a lot of respect for the TOEFL iBT. TOEIC, by contrast, remains purely paper-based. The Cambridge ESOL suite offers both PBT and CBT versions of all its tests, except IELTS. IELTS can be taken on the computer only in the Delhi centre, although it is likely the IELTS CBT will be rolled out worldwide very soon (please continue to check www.engnet-education.com for updates and news on this).

Conclusion

It is a gradual process, much like the way CDs replaced cassette tapes. Of course, language teaching is one of those rare professions where teachers around the world can still be seen running to and from the staffroom clutching pre-wound cassettes while the rest of the world is using MP3s. There are exceptions of course, but my point is that there will not be an overnight switch to CBT even if there is a national, government-level initiative. For this reason it is important to continue researching and promoting the use of CBT, because it does reflect the way things are going. It is true that we use computers more and more for communication, and as such our tests should reflect that.

Further reading

Bachman, L.F., Palmer, A.S., 1996. Language Testing in Practice. Oxford University Press, Oxford.

Extended Bibliography

Bachman, L. 1990: Fundamental considerations in language testing. Oxford University Press, Oxford.

Bachman, L.F., Palmer, A.S., 1996. Language Testing in Practice. Oxford University Press, Oxford.

Bachman, L. 2001 Designing and developing useful language tests in Elder, C. Brown, A. Grove, E. Hill, K. Iwashita, N. Lumley, T. McNamara, T. O’Loughlin, K. (Eds.) Studies in Language Testing 11: Experimenting with uncertainty. Essays in honour of Alan Davies. Cambridge University Press, Cambridge

Brown, H. Douglas. 2004 Language Assessment: Principles and Classroom Practices Longman : New York

Chalhoub-Deville, Micheline & Deville, Craig 2005 A look back at and forward to what language testers measure In Hinkel, Eli (Ed.) Handbook of research in second language teaching and learning Routledge

Chapelle, Carol A., Enright, Mary K. & Jamieson, Joan M. (Eds.) 2008 Building a Validity Argument for the Test of English as a Foreign Language Routledge, New York

Douglas, Dan 2001 Three problems in testing language for specific purposes: Authenticity, specificity and inseparability in Elder, C. Brown, A. Grove, E. Hill, K. Iwashita, N. Lumley, T. McNamara, T. O’Loughlin, K. (Eds.) Studies in Language Testing 11: Experimenting with uncertainty. Essays in honour of Alan Davies. Cambridge University Press, Cambridge

Downey, R. Farhady, H. Present-Thomas, R. Suzuki, M. Van Moere, A. 2008 Evaluation of the Usefulness of the Versant for English Test: A Response Language Assessment Quarterly, Volume 5, Issue 2 April 2008, pages 160 – 167

Fulcher 2000 The ‘communicative’ legacy in language testing System, 28 (4), p.483-497, Dec 2000 doi:10.1016/S0346-251X(00)00033-6

Hughes, A. 1989 Testing for Language Teachers Cambridge University Press, Cambridge

Lewkowicz, J.A., 1997. Authenticity for whom? Does authenticity really matter? In: Huhta, A., Kohonen, V., Kurki-Suonio, L., Luoma, S. (Eds.), Current Developments and Alternatives in Language Assessment. Jyväskylä University, Finland, pp. 165-184.

Lewkowicz, J.A., 2000. Authenticity in language testing: some outstanding questions. Language Testing 17 (1), 43-64. Cambridge University Press DOI: 10.1080/15434300801934744

Lynch, Brian K. 2003 Language Assessment and Programme Evaluation Edinburgh University Press, Edinburgh

McNamara 2006 Validity in Language Testing: The Challenge of Sam Messick’s Legacy Language Assessment Quarterly, Volume 3, Issue 1 January 2006 , pages 31 – 51 DOI: 10.1207/s15434311laq0301_3

Messick, S., 1989. Validity. In: Linn, R.L. (Ed.), Educational Measurement. Macmillan, New York, pp. 13 -103.

Morrow, K., 1979. Communicative language testing: revolution or evolution? In: Brumfit, C.J., Johnson, K. (Eds.), The Communicative Approach to Language Teaching. Oxford University Press, Oxford, pp. 143-159.

North, Brian. 2007 Expanded set of C1 & C2 Descriptors http://www.coe.int/T/DG4/Portfolio/?L=E&M=/documents_intro/Data_bank_descriptors.html

O’Malley, J. Michael & Valdez Pierce, Lorraine 1996 Authentic Assessment for English Language Learners: Practical Approaches for Teachers Longman

Phakiti, Aek. 2008 Construct validation of Bachman and Palmer’s (1996) strategic competence model over time in EFL reading tests Language Testing; 25; 237 DOI: 10.1177/0265532207086783

Popham, W. James 1990, Modern Educational Measurement Prentice Hall, Englewood

Stoynoff & Chapelle 2005 ESOL Tests and Testing Teachers of English to Speakers of Other Languages, Inc. Maryland

Stoynoff, Stephen 2009 State-of-the-Art Article Recent developments in language assessment and the case of four large-scale tests of ESOL Language. Teaching. (2009), 42:1, 1–40 Cambridge University Press DOI:10.1017/S0261444808005399

Widdowson, Henry 1979: Explorations in applied linguistics. Oxford University Press, Oxford

Widdowson, Henry 2001 Communicative language testing: the art of the possible in Elder, C. Brown, A. Grove, E. Hill, K. Iwashita, N. Lumley, T. McNamara, T. O’Loughlin, K. (Eds.) Studies in Language Testing 11: Experimenting with uncertainty. Essays in honour of Alan Davies Cambridge University Press, Cambridge
Widdowson, Henry 1983 Learning Purpose and Language Use. Oxford University Press, Oxford.

Adapting Authentic Materials from the Web

This post is all about resources which are invaluable to language teachers: authentic materials. There are so many authentic resources which can be adapted for use in the classroom that it may seem daunting to know where to start, or how best to alter the materials to get them classroom-ready. In this post I will introduce a few examples and tools which make this job easy, and some general ideas about how to reduce the teacher’s workload without losing the personal touch that comes from adapting materials for your own classroom.

Structure

If you look at any good coursebook or set of learning materials, you will notice that they have a strong structure with clearly labelled sections which students and teachers can both identify straight away. Although the content in these sections changes from unit to unit, the sections are laid out in the same way and offer a similar range of tasks and activities. This has obvious benefits, especially when you are designing teaching materials which you will reuse with other classes or which other teachers will use. This principle of having a strong and clear structure should also help you to speed up the adapting process. A good example of adapted authentic materials is the set provided by onestopenglish.com, based on articles from the British newspaper The Guardian. Below is an example:

Link to the original page

Straight away you can see that there are clear sections and each one contains a specific task. Also note that from week to week these activities vary only in content; for the most part the types of task change very little. This makes the writing process a lot easier: you already know what will go in your worksheet, and so do your students. Of course, variation of task types is a good thing, but this can be achieved with different worksheets and lessons. If you are writing a series of such materials it is best to have a strong structure, but don’t be too rigid about it or the worksheets will become stale.

Elaboration Theory

Research such as that into the Involvement Load Hypothesis or Cognitive Load Theory has suggested that increasing the difficulty of a task, and the ‘brain power’ it requires, helps learners remember the content and can lead to longer-term retention of the target language. Elaboration Theory utilises this by making the tasks in a learning interaction or worksheet gradually more complicated, thus increasing the chances that the learner will acquire the target language. These theories are easy to incorporate into your worksheets.

Source Material

No matter how good your tasks and activities, they are only ever as good as the source material. When choosing the source material which you are going to adapt, there are a few things you might want to consider before making the final selection. These are authenticity, relevance, curriculum fit and potential for further learning. Peacock (1997) found that authentic materials were more motivating for students, even lower-level students, than unauthentic materials. However, there has been a lot of debate over what counts as authentic and what doesn’t. Henry Widdowson (1990) makes the distinction between ‘authentic’ materials and ‘genuine’ materials. Here, authentic materials are originally written for non-learners of the language (proficient or L1 speakers) and used in the same way in class with the learners. Genuine materials have been adapted from authentic materials in order to emphasise linguistic components for learning. Both of these are good things, but obviously when choosing the source material we must also consider the difficulty it may pose to our learners. There are a number of ways of doing this, but one quick and easy way is to use the Flesch–Kincaid readability test, which is built into Microsoft Word and can be run on any text you have in there. It produces a score which you can use to roughly guess how hard the text will be, based on the average sentence length and the average number of syllables per word.
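
If you would rather not paste every text into Word, the same scores can be estimated in a few lines of Python. The sketch below uses the published Flesch Reading Ease and Flesch–Kincaid Grade Level formulas; the syllable counter is a rough vowel-group heuristic of my own (an assumption, not what Word does internally), so expect the numbers to differ slightly from Word’s. The function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (a crude heuristic, not a dictionary lookup)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a final silent 'e' (as in "made") as non-syllabic, but keep "-le" (as in "table")
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return {
        # Higher reading ease = easier text
        "reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # Grade level approximates a US school grade; higher = harder text
        "grade_level": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }


easy = readability("The cat sat on the mat. The dog ran to me.")
hard = readability("Institutional implementation of computerised assessment "
                   "necessitates considerable infrastructural investment.")
print(easy["grade_level"] < hard["grade_level"])
```

Treat the output as a rough guide only: the formulas know nothing about word frequency, idiom or topic familiarity, so a short text full of rare vocabulary can still score as ‘easy’.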

I will expand on this idea further in a later article, but please feel free to contact me or use the comments below to discuss.

For me, the main thing I look for when choosing materials to adapt for class is whether they interest me or not. If they do, I am more likely to be able to get my class interested. At the end of the day, you may have to use a number of objective criteria to fine-tune your choices, but the main decision will be subjective, based on your own teaching preferences, and this is a good thing. Teaching and adapting materials for your class are highly personal, and if they are not, the lessons you teach will fail because of it.

Further Reading

Widdowson, H. (1990) Aspects of Language Teaching. Oxford: Oxford University Press

Peacock, M. (1997) The Effect of Authentic Materials on the Motivation of EFL Learners. English Language Teaching Journal 51

Feedback

Sending out student feedback is essential. Good materials design needs to incorporate student feedback. Also, after assessments and tests, it is vital that students receive personalised feedback in a supportive way.

Below is a training video which I made to help teachers automate student assessment feedback through MS Word’s built-in Mail Merge features.

VLE or LMS?

Many people ask me about the difference between an LMS and VLE, and also CMS and LCMS. Although you might find articles and posts that state otherwise, I believe that there is an important distinction between LMS and VLE, and I would also use the term CMS to mean something different. Let’s start with a definition of each.

LMS stands for Learning Management System. For me these are primarily for training, rather than education. They are often connected to mandatory CPD (continuing professional development) and generally tend to be used internally rather than being client-facing or used in education. Having said this, JoomlaLMS clearly calls itself an LMS, and in my view it fits better with the description of a VLE. So, as you see, the two terms are used interchangeably. I would like to create a distinction here for clarity, nonetheless.

VLE stands for Virtual Learning Environment. These are often characterised by constructivist pedagogical principles and are used as a place to collaborate and extend discussions rather than merely hosting trackable learning objects. Many VLEs and LMSs have the same features, but the emphasis and the way they are used distinguish them. It is possible to use Moodle, for example, purely for behaviourist mechanical drills and compliance training, and thus it becomes an LMS through the way it is used.

The reason I am making this distinction is that I still see a lot of ambiguity about the terminology in eLearning, perhaps due to its relative infancy as a discipline. I have seen institutions make the wrong choice when considering commercial LMSs and VLEs and I blame the lack of precise definition for this. In language teaching as well, we are often in the rearguard when it comes to implementing new technology, and thus many institutions fall into the trap of simply buying or creating a load of online grammar and vocabulary drills which have been authored as eLearning and then making this available to their students as the final and finished component of their eLearning implementation.

Now, I am not saying this is bad or that we shouldn’t provide such resources for our students. What I am saying, though, is that this is not much different from a glorified practice book. While the online format means greater access and the possibility of Flash animations and embedded video/audio, at the end of the day these are still drills which are useful primarily for test preparation, but not for helping students to acquire communicative competence. No matter how good such activities look, they still mostly fall under the category of Behaviouristic CALL. With small adjustments, it is possible to expand the eLearning platform into the realms of communicative and collaborative CALL. For example, one of the tasks for students on the VLE should be to introduce themselves on the forum. Moodle supports collaborative wikis, which are ideal for group projects and can be given as assignments or class work. There are also blogs, which can be created for free and allow comments and following. These are great ways to get the class working together on projects and have the advantage of showing students ways to continue learning and practising in authentic ways after their course has finished. Another idea would be to have a high scores table or similar, which gives students the option of posting their best scores on a game and challenging other students. This should of course be optional, but works very well for more competitive students; smart.fm is a brilliant example of this.

VLEs do not have to contain all the content within them either. They should provide links to outside content and encourage students to source their own materials. On our VLE we have a side block which shows the latest RSS feeds from the BBC Learning English site, which also keeps the site contemporary.

I would love to hear what you are doing at your school and if you have any questions or ideas please share below and keep the discussion going!

When to (and when not to) use tech in class

The question of when to use technology in class and, equally importantly, when not to use it sadly gets left out of many of the discussions around new learning technologies. Unfortunately, a lot of the choices about tech in class come from a top-down implementation. So, your school gets a load of new interactive whiteboards. They give you a one-hour training session, remove all the old whiteboards and say ‘off you go then’. Questioning their practicality often gets you branded as ‘negative’ or even ‘anti-progressive’.
Happily, there are those who dare to ask questions about this approach to instructional technology. People like Mike Levy, Phil Hubbard and Greg Kessler (among others) have voiced their concern over ‘tech for tech’s sake’, and this is coming from leading CALL experts and advocates. Interactive whiteboards, for example, can’t do everything that normal ones can. You can’t have more than one person writing on one at the same time, so if you are doing a spelling race or something like that you won’t be able to use it. A lot of great software and apps are being released at an amazing rate, but all too often they are put into use without prior evaluation. As CALL practitioners we need to ask ourselves: is this useful? How so? When would this be useful and when would it not? These questions are not dissimilar to the questions teachers ask themselves when planning or evaluating any resource for a lesson. You don’t need to be an expert to conduct this kind of evaluation either.

A good example is a Blended Learning Lesson Plan I wrote myself for use in my institution. I was thinking about this lesson from a very top-down perspective, I’m sorry to say. I was concerned our Moodle forums were underused, so I thought ‘how can I get these forums to be used in class?’ I created a lesson plan where the whole class is taken into the computer room and forced to use the forum to post a response to something.
Not only did this mean that the forums were used a lot for the hour of the class and then never again, it was also questionable pedagogically. Why make people communicate over a forum when they are in the same room as each other? In the pecking order of communication, face to face is always best.

Forums are powerful collaboration tools, but the point is to allow asynchronous sharing of knowledge. The same lesson applied to learners who are separated by time and space would be great, but not if they could just as easily have spoken to each other.
We are at a stage now where technology is so ubiquitous that we are not always so keen to implement it for its own sake. We need to critically evaluate the new item, see if it works, decide what it is good for and what it is not so good for.

I would like to invite you to post your comments about any new piece of technology you have used in class. Was it useful? What can it do well? What are its limitations?

CALL Teacher Education

There is a brilliant book entitled ‘Teacher Education in CALL’ (Hubbard and Levy eds. 2006) which details the current state of CALL teacher education – some of the predominant findings are that there is not enough CALL Teacher Education going on as part of INSET or PRESET training, and even when CALL is part of Teacher Education programs it is often considered unsatisfactory in terms of preparing teachers to actually use CALL applications in class.

In the TESLCA-L List-Serv I started a post about CALL and Autonomy and was soon contacted through the list by Greg Kessler, a researcher and CALL Teacher Education Specialist who contributed to the ‘Teacher Education in CALL’ book. The post took on a slightly new purpose then, focusing on CALL Teacher Education and how this can feed into Autonomy Training.

I decided I would expand the idea by adding a post here. By joining the free mailing list TESLCA-L you can read the archived postings and also add to them, or alternatively leave a comment here on this blog about the subject.

We are particularly interested in:

  • any CALL preparation courses you have taken
  • your attitudes towards CALL use and CALL Teacher Education
  • any experiences you have had while trying to integrate CALL into your classes

We look forward to reading your postings!

Second Life

There are some amazing resources emerging regarding the use of Second Life for language learning. There are in-world virtual schools dedicated to a range of languages, most notably English, Spanish and French. There are also groups which are dedicated to language learning, such as:

  • EDUNATION – This is the island set up by the Consultants-e. It’s a great place and there is a lot going on.
  • CALICO – This is a Ning Social network for the CALICO/EUROCALL groups’ Virtual Worlds Special Interest Group
  • AVALON – A group funded by the Lifelong Learning Program. There are some great events and discussions here.

These are just a few. If you are not already in Second Life then I would recommend that you go in and have a look for yourself. Flying around in the virtual world can be quite demanding on your computer if you don’t have a good graphics card, but I would recommend it nonetheless. It might be some time before the computer labs in schools catch up enough to fully support entire classes using SL, but to be honest that’s not how I see it going. For example, I went in last night and found an island where Japanese people hang out. I went up to a couple of guys and introduced myself, then tried as hard as I could to follow the conversation and join in using VOIP. This is a great way to practise authentic communication with real speakers. The value of this is particularly apparent if you are learning a language in a Foreign Language context (i.e. there aren’t many speakers of the target language in your country).

Has anyone else had any experiences in Second Life? How did you feel when you were in there? Can you recommend any good places or groups?

For anyone interested, my avatar’s name is Richard Spiritor.

Autonomy

Here is an updated Bibliography now listing only articles that deal with both Autonomy and CALL or Technology. This list was updated thanks to comments by Steve (see below).

If anyone has experience using technology for autonomous language learning or experience with Online Self-Access Centres (OSAC), please add a comment to the discussion below.


Using VLEs

How many of us are now using VLEs to provide students with access to supplementary resources, lecture notes, additional or useful information and links to other repositories?

Many university level lecturing jobs and teaching posts are asking for experience with eLearning. This page aims to provide a forum for discussions about how best to use these resources for Language Teaching.
