Asier Alcázar

==================================================

Linguistics, Computational Linguistics, Corpus Linguistics

 

Department of Linguistics

University of Southern California

Grace Ford Salvatori Hall 301

Los Angeles, CA 90089-1693

 

Education     

Research areas    

Publications   

Software   

Service   

Organizations

 

 

 

You can e-mail me at: alcazar@usc.edu

==================================================         

Education

      University of Southern California, Los Angeles (USA)

 

• PhD Candidate in Linguistics

• MS in Computational Linguistics (Department of Linguistics, Department of Computer Science, Information Sciences Institute)

• MA in Linguistics, Hispanic Linguistics

 

Universidad de Deusto, Bilbao (Spain)

 

• BA in English Philology, Universidad de Deusto

Research areas

Syntax and its Interfaces with Semantics and Morphology, Phase Theory, Tense and Aspect, Unaccusativity, Case Theory, Non-finite Clauses; Semi-supervised Ontology Building, Applications of Natural Language Processing for Linguistic Research; Weight as a Processing Constraint in Syntactic Variation; Sociolinguistics and Discourse, Corpus Creation.

 

Languages: Basque, Spanish, Italian, Romance, Historical Romance, Latin


List of publications

 

Refereed journal articles

 

2006          The interpretation of imperfective aspect in Basque and its implications for our traditional classification of verbs. Journal of Basque Linguistics. (in press)

 

2003          Two Paradoxes in the Interpretation of Imperfective Aspect and the Progressive. Journal of Cognitive Science 4.1: 79-105.

 

 

Other refereed publications

 

2006          with Mario Saltarelli. The quirky case of participial clauses. Romance Languages and Linguistic Theory 2005: Selected papers from ‘Going Romance’ 2005, Utrecht, ed. Sergio Baauw, Frank Drijkoningen, Ivo van Ginneken and Haike Jacobs. Amsterdam: John Benjamins. (to appear)

 

2006          Against an ontological commitment to unergative verbs. Proceedings of the 40th Meeting of the Chicago Linguistic Society: the Main Session. (in press)

 

2006          with Roberto Mayoral-Hernández. Sociolinguistic factors conditioning the ordering of adverbial expressions in Spanish: A computationally extended corpus analysis. Selected Papers from the 36th Linguistic Symposium on Romance Languages, Rutgers, New Brunswick, March 2006. (to appear)

 

2006          A deceptive case of split-intransitivity in Basque. Selected Proceedings of the International Symposium on the Typology of Argument Structure and Grammatical Relations in Languages Spoken in Europe and North and Central Asia (LENCA-2), ed. Bernard Comrie, Valery Solovyev, and Pirkko Suihkonen. Amsterdam: John Benjamins. (to appear)

 

2006          with Mario Saltarelli. Argument structure of participial clauses: the unaccusative phase. Selected Proceedings of the Hispanic Linguistics Symposium 2006, University of Western Ontario. (to appear)

 

2003          The Imperfective Paradox of Basque. USC Working Papers in Linguistics 1. University of Southern California, Los Angeles.

 

 

Edited books

 

2006          with Roberto Mayoral Hernández and Michal Temkin Martínez. Proceedings of WECOL 2004, University of Southern California, Los Angeles. (in press)

 

2006          with Irene Barberia, Rebeka Campos Astorkiza and Susana Huidobro. Proceedings of BIDE 2004, Universidad de Deusto, Bilbao. (in press)

                        With a co-authored introduction: ‘A new relay of linguists’

 

 

Published conference proceedings and book articles [not refereed]

 

2006          Transitive intransitives: Basque Unergatives Revisited. Proceedings of the 4th Cambridge Postgraduate Conference in Language Research. Cambridge University, Cambridge. (to appear)

 

2006          Defining transitivity and intransitivity: Split-intransitive languages and the Unaccusative Hypothesis. A Festschrift for Larry Trask, ed. José I. Hualde & Joseba Lakarra. Bilbao: University of the Basque Country. (to appear)

 

2006          Towards linguistically searchable text. Proceedings of BIDE 2005, Universidad de Deusto, Bilbao. (in press)

 

2004          A Note on the Typological Classification of Basque. Working Papers of the Linguistics Circle 17. University of Victoria, Victoria. 1-10.

 

2003          Verb Classes and Aspectual Interpretation in Basque. Proceedings of Console XI. Università degli Studi di Padova, Padua, Italy.

 

2002          On the Correlate between Case Assignment and Verbal Form in Basque. Proceedings of the 5th Durham Postgraduate Conference in Theoretical and Applied Linguistics, University of Durham, Durham. 1-10.

 

 

Unpublished presentations at conferences, workshops and symposia

 

2007          with Mario Saltarelli. Zanuttini’s Hypothesis: Participial Constructions Revisited. Linguistic Society of America Annual Meeting, Anaheim (California). Session: Syntax: Tense & aspect, 05-Jan-2007, 11:00.

 

2007          with Roberto Mayoral Hernández. A corpus analysis of weight and unaccusativity in Spanish. Linguistic Society of America Annual Meeting, Anaheim (California). Session: Corpus-based investigations, 06-Jan-2007, 09:30.

 

2006          with Mario Saltarelli. The case of participial clauses: Italian vs. Romance. Linguistic Society of America Annual Meeting, Albuquerque (New Mexico).

 

2006          The typology of absolute constructions. Hagit Borer’s Morphology Discussion Group. Department of Linguistics, University of Southern California.

 

2005          with Joseba Abaitua. A brief history of machine translation. Computer-Assisted Translation. San Sebastian International Summer Courses. Palacio Miramar, San Sebastián (Spain). June 21-22, 2005.

 

2003          Two paradoxes on the interpretation of Imperfective aspect and the progressive. It's about time: Theoretical and experimental perspectives on tense, aspect, modality and events, Linguistic Society of America Linguistic Institute, Michigan State University, July 18-19, 2003.


Software

Published software/resources

 

2006          Consumer Eroski Parallel Corpus

 

I turned the Consumer Eroski magazine (http://revista.consumer.es), which publishes press articles written in Spanish together with their translations into Basque, Catalan and Galician, into a parallel corpus. The corpus has approximately 1.3 million words per language, for a combined total of 5.2 million words. It is aligned at the sentence level and is accessible online via Universidade de Vigo and Universidad de Deusto.

 

http://sli.uvigo.es/CLUVI/ (public access)          www.deli.deusto.es (research intranet)

 

The corpus is unusual in that it covers the four major languages spoken in Spain, including Basque, in a standard educated register. It can serve as a contemporary reference corpus and as a basis for research in computational linguistics and corpus linguistics. The European Constitution does not yet exist as a parallel corpus for these languages.

 

 

2004-05     Mundo-Hispano Search Interface

 

An application written in Java that uses the Google application programming interface (API) to run multiple searches and break down the search results by Spanish-speaking country, using the geographical location of the server as an approximation.

 

Hablamos Juntos (http://www.hablamosjuntos.org/) is a $30 million initiative to improve patient-provider communication among Hispanics with limited English proficiency. The application is used by professional translators and HJ staff to survey and assess the cultural adequacy of translation practice from the US standard in healthcare to the 20 national varieties of Spanish represented by a heterogeneous population.

       

 

Unpublished software/utilities

 

      2005-06     Suite of utilities for working with online corpora (programmed in Python)

 

This suite facilitates the use of the online corpora of the Royal Academy of the Spanish Language (www.rae.es): the Corpus de Referencia del Español Actual (CREA, or Modern Spanish Reference Corpus: http://corpus.rae.es/creanet.html) and the Corpus Diacrónico del Español (CORDE, or Spanish Historical Corpus: http://corpus.rae.es/cordenet.html). The suite features:

 

• Automatic query and data extraction

• Automatic conversion from paragraph to relevant sentence and context

• Automatic annotation of corpus metadata: author (gender), media source, country, topic, publisher, year

• Tool for manual annotation

• Automatic collation of databases with manually annotated data

• Automatic generation of SPSS annotation files to assess significance
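
The paragraph-to-sentence step listed above can be illustrated with a minimal sketch. This is not the code of the suite: the function name `sentence_with_context`, the `window` parameter and the naive punctuation-based split are all assumptions made for the example.

```python
import re


def sentence_with_context(paragraph, keyword, window=1):
    """Return each sentence containing `keyword`, plus `window` neighbours on each side."""
    # Naive split: sentence-final punctuation followed by whitespace (an assumption;
    # real corpus text needs abbreviation handling).
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    hits = []
    for i, sentence in enumerate(sentences):
        if pattern.search(sentence):
            hits.append({
                "before": sentences[max(0, i - window):i],
                "match": sentence,
                "after": sentences[i + 1:i + 1 + window],
            })
    return hits


# Invented example paragraph, not taken from CREA or CORDE.
paragraph = "Llovía mucho. Siempre llegaba tarde a casa. Nadie decía nada."
for hit in sentence_with_context(paragraph, "siempre"):
    print(hit["before"], hit["match"], hit["after"])
```

Keeping the neighbouring sentences separate from the matching one makes it straightforward to store the match and its context in different database columns for later manual annotation.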

 

 

2005-06     Suite of utilities for corpus creation (programmed in Python)

 

The suite comprises the software I developed to create the Consumer Eroski Parallel Corpus. It features the following capabilities:

 

• Web module for automatic download and storage of raw text

• Text cleanser to eliminate everything but text

• Sentence extractor to organize the corpus at the sentence level (sensitive to punctuation idiosyncrasies of Basque, Catalan, Galician and Spanish)

• Sentence tokenizer utility to interface with SVMtool (Spanish tagger)

• Utility to create input files for Moore’s bilingual sentence aligner

• Utility to decode output files for Moore’s bilingual sentence aligner

• Utility to derive corpus statistics
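
As an illustration of what a punctuation-sensitive sentence extractor involves, here is a minimal sketch under stated assumptions. It is not the extractor used for the corpus: the abbreviation list is a small hypothetical sample, and the two merge rules (known abbreviation, lowercase continuation) are only one plausible heuristic.

```python
import re

# Hypothetical sample of abbreviations that end in a period but do not end a sentence.
ABBREVIATIONS = {"sr.", "sra.", "dr.", "dra.", "etc.", "pág.", "núm."}

# Candidate boundary: sentence-final punctuation followed by whitespace.
BOUNDARY = re.compile(r"(?<=[.!?…])\s+")


def _continues(previous, piece):
    """True if `piece` should be glued back onto `previous`."""
    last_word = previous.split()[-1].lower()
    # Either the previous chunk ended in an abbreviation, or the next chunk
    # starts in lowercase (so the punctuation was sentence-internal).
    return last_word in ABBREVIATIONS or piece[0].islower()


def split_sentences(text):
    """Split `text` into sentences, merging false boundaries."""
    text = text.strip()
    if not text:
        return []
    sentences = []
    for piece in BOUNDARY.split(text):
        if sentences and _continues(sentences[-1], piece):
            sentences[-1] = f"{sentences[-1]} {piece}"
        else:
            sentences.append(piece)
    return sentences


print(split_sentences("El Sr. García llegó tarde. ¿Dónde estaba? Nadie lo sabe."))
```

Because the check relies on `str.islower()` rather than a fixed list of capital letters, it treats accented and language-specific capitals alike; inverted marks such as ¿ and ¡ count as sentence openers since they are not lowercase letters.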

 

See corpus online:  

 

http://sli.uvigo.es/CLUVI/    (public access)        www.deli.deusto.es      (research intranet)

 

 

      2005-06     Linguistic Search Interface (programmed in Python)

 

An advanced search interface that supports parallel corpora. The interface features the operators listed below. A frequency breakdown is available when the search seeks patterns.

 

• Exact search

• Boolean And, Or & Not

• Combined Boolean operator search

• Word distance supported search

• Part-of-speech tag supported search

• Combined Word distance and Part-of-speech tag supported search

• Verb search (e.g. nonfinite, subjunctive)

• Morpheme search (e.g. prefixes)

• Power search (predefined: e.g. relatives, absolutes, indirect questions…)

• Regular expression search

• Chain search: any of the above iteratively
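
The combined word distance and part-of-speech search, together with the frequency breakdown, might look roughly like the sketch below. It is an illustration only: the corpus representation (lists of `(word, tag)` pairs), the tag labels `DET`, `NOUN` and `VERB`, and the helper names are hypothetical and do not describe the actual interface.

```python
from collections import Counter


def has_word(word):
    """Predicate: token's word form equals `word` (case-insensitive)."""
    return lambda token: token[0].lower() == word.lower()


def has_tag(tag):
    """Predicate: token's part-of-speech tag equals `tag`."""
    return lambda token: token[1] == tag


def distance_search(sentences, first, second, max_dist):
    """Yield (sentence_index, i, j) where token i satisfies `first`, token j
    satisfies `second`, and j follows i by at most `max_dist` tokens."""
    for s_idx, sentence in enumerate(sentences):
        for i, token in enumerate(sentence):
            if not first(token):
                continue
            for j in range(i + 1, min(i + max_dist + 1, len(sentence))):
                if second(sentence[j]):
                    yield s_idx, i, j


# Invented toy corpus with hypothetical tags.
corpus = [
    [("el", "DET"), ("gato", "NOUN"), ("come", "VERB"), ("pescado", "NOUN")],
    [("la", "DET"), ("niña", "NOUN"), ("lee", "VERB")],
]

# A determiner followed by a verb within two tokens.
hits = list(distance_search(corpus, has_tag("DET"), has_tag("VERB"), 2))

# Frequency breakdown of the verbs that matched the pattern.
breakdown = Counter(corpus[s][j][0] for s, _, j in hits)
print(hits, breakdown)
```

Expressing each condition as a predicate means word, tag and distance constraints can be mixed freely, and the output of one search can be fed to another, which is one way a chain search could be built.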

 

 

      2006          Weight calculator for Spanish (programmed in Python)  

 

Given a string of one or more words in Spanish, this utility calculates the weight of the string in words, syllables and phonemes. It is used in research to determine (i) whether the weight of a constituent affects its syntactic position, and (ii) whether some weight measures are better than others.
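
The three weight measures can be approximated from spelling alone, as in the rough sketch below. This is not the calculator itself: the syllable rule (count vowel groups, splitting on hiatus) and the phoneme rule (collapse digraphs, drop silent letters) are simplifying assumptions, and all function names are hypothetical.

```python
import re

STRONG = set("aeoáéó")          # strong vowels
WEAK = set("iuü")               # unstressed weak vowels
WEAK_ACCENTED = set("íú")       # stressed weak vowels always break a diphthong
VOWELS = STRONG | WEAK | WEAK_ACCENTED
SILENT_U = re.compile(r"([qg])u([eiéí])")   # que, qui, gue, gui


def count_syllables(word):
    """Approximate syllable count: one per vowel group, plus one per hiatus."""
    word = SILENT_U.sub(r"\1\2", word.lower())
    if word.endswith("y"):       # final y behaves like the vowel i (muy, hoy, y)
        word = word[:-1] + "i"
    count, prev = 0, None
    for ch in word:
        if ch not in VOWELS:
            prev = None
            continue
        hiatus = prev is not None and (
            (prev in STRONG and ch in STRONG)
            or ch in WEAK_ACCENTED
            or prev in WEAK_ACCENTED
        )
        if prev is None or hiatus:
            count += 1
        prev = ch
    return count


def count_phonemes(word):
    """Approximate phoneme count from orthography."""
    word = SILENT_U.sub(r"\1\2", word.lower())
    word = re.sub(r"ch|ll|rr", "C", word)   # digraphs count as one phoneme
    word = word.replace("h", "")            # remaining h is silent
    word = word.replace("x", "ks")          # x counted as two phonemes
    return sum(1 for ch in word if ch.isalpha())


def weight(text):
    """Weight of a string as words, syllables and phonemes."""
    words = re.findall(r"[^\W\d_]+", text)
    return {
        "words": len(words),
        "syllables": sum(count_syllables(w) for w in words),
        "phonemes": sum(count_phonemes(w) for w in words),
    }


print(weight("el libro rojo"))
```

Returning the three counts side by side makes it easy to compare the measures against each other when testing whether one predicts constituent position better than the others.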

 

 

Resource/Product Demonstrations

 

Demonstration of Consumer Eroski Parallel Corpus and Linguistic Search Interface

 

2005           DELi Computational Linguistics Group, Universidad de Deusto (Bilbao, Spain)

 

2005          Eroski Foundation Headquarters (Elorrio, Spain)

 

2005          Computer-Assisted Translation. International Summer Courses of the University of the Basque Country. Palacio Miramar (San Sebastian, Spain)

 

2005          Xavier Gómez Guinovart, Director of SLI Computational Linguistics Group, Universidade de Vigo (Vigo, Spain)

 

Demonstration of Mundo-Hispano Search Interface

 

2005          Webcast conference call for the online application. Hablamos Juntos National Program Office, Tomás Rivera Policy Institute (Los Angeles, California)

 

2004          Webcast conference call for the offline prototype. Hablamos Juntos National Program Office, Tomás Rivera Policy Institute (Los Angeles, California)


Service

      For the Linguistics Department, University of Southern California

 

      2006          Graduate Students in Linguistics Constitution Revision Committee

 

2005          Student-Faculty liaison

 

2005          Co-edited proceedings of Western Conference in Linguistics 2004

 

2004          Co-organized Western Conference in Linguistics 2004

 

2004          Co-created Parallel Session in Hispanic Linguistics for Western Conference in Linguistics 2004

 

      2003-04     President, Hispanic Linguistics Student Association

 

      2003          Co-author of constitution for Hispanic Linguistics Student Association

 

      2003          Co-founded Hispanic Linguistics Student Association

 

      2003          Co-founded Computational Linguistics Student Association

 

      2001          Co-organized West Coast Conference in Formal Linguistics XX

 

 

      For the English Department, Universidad de Deusto (Bilbao, Spain)

 

2004-06     Liaison for the Universidad de Deusto Study Abroad Program

 

      2005          Co-edited Proceedings of Bilbao-Deusto Student Conference 2004

 

      2004          Co-organized Bilbao-Deusto Student Conference 2004

 

      2004          Reviewer for Bilbao-Deusto Student Conference 2004

 

2003          Co-founded Bilbao-Deusto Student Conference in Linguistics

 


Professional organizations

      Linguistic Society of America

 

Association for Computational Linguistics

 

      American Association for the Advancement of Science

