Thursday, December 12, 2013

Podcast about an essay from the JPU corpus

Sunday, September 22, 2013

JPU Corpus blog logs 14,000 page views

Since its launch in 2007, this JPU Corpus blog has been viewed over 14,000 times. Those are no earth-shattering stats, but I am still impressed by the dedication of linguists and other colleagues from around the world who visit the site.

A run-down of the top ten countries:

Wednesday, June 26, 2013

This Month's JPU Gem

From time to time, I feature an essay from the corpus. Way back in the 1990s, this is what a student wrote about virtual reality.


Corpus Legend

I began collecting student scripts in 1992, and by 1999 the JPU Corpus incorporated over 300 essays and research papers.

At the time, it was the largest European corpus of English as a Foreign Language.

My PhD study is based on this corpus. You can read it online at Google Books.

Three of the five subcorpora are available here and may be used for linguistic study.

I would love to hear from you. Please write to me at my email address.


Besides colleagues in Hungary, scholars from Austria, Brazil, Canada, China, Estonia, Germany, Japan, Lithuania, Singapore, Slovenia, South Korea, Sweden, Taiwan, Thailand, the UK and the US have written and talked about the JPU corpus and my dissertation:

Thursday, June 12, 2008

The Best Corpus Project

Students in my Corpus Linguistics course have worked on some exciting projects. Here, I make available the best, by Kiss Ilona. First, an excerpt:

I started to develop my corpus by downloading articles on air travel from internet tourism magazines. As tourism is a rather wide issue I had to encompass a narrower scope and focused on travel by air. I assumed that analysing a reliable amount of data with the help of frequency ranges would help me with the selection of the most appropriate vocabulary items to be tested. The amount of data proved to be a crucial point of my investigation. I tried to find evidence to support my hypothesis that with a representative amount of data I would be able to select the most important and most frequent words of a specialised lexis in specialised texts. Thus I have collected twenty-six carefully selected professional articles from different websites comprising approximately thirteen thousand tokens.

You can get the full text, too.
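
For readers who would like to try the kind of frequency-based word selection the excerpt describes, here is a minimal sketch in Python. It is not the procedure used in the project itself: the function names, the simple tokenizing rule and the two sample sentences are all assumptions made for illustration, and a real study would load its own collected articles instead.

```python
import re
from collections import Counter

# Assumed tokenizing rule: runs of letters, optionally with one apostrophe part
# (so "don't" stays one token). Real corpus work may need something richer.
TOKEN_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its word tokens as a list."""
    return TOKEN_PATTERN.findall(text.lower())


def frequency_list(texts, top_n=10):
    """Count tokens across all texts.

    Returns (total number of tokens, list of the top_n (word, count) pairs).
    """
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    total = sum(counts.values())
    return total, counts.most_common(top_n)


if __name__ == "__main__":
    # Hypothetical stand-ins for downloaded air travel articles.
    sample_articles = [
        "The flight was delayed, so the airline rebooked every passenger.",
        "Each passenger must check in before the flight departs.",
    ]
    total, top = frequency_list(sample_articles, top_n=5)
    print(f"{total} tokens")
    for word, count in top:
        print(f"{word}\t{count}")
```

In a small specialised corpus the top of such a list is usually dominated by function words like "the", so candidates for a vocabulary test would normally be picked after setting those aside or comparing against a general reference corpus.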