anderssoegaard.github.io

Anders Søgaard

Professor in Natural Language Processing and Machine Learning, Dept. of Computer Science, University of Copenhagen.

News

I have been awarded a Google Focused Research Award for a project on multilingual semantic parsing using multi-task reinforcement learning. I am also writing a book on cross-lingual learning for Morgan & Claypool with Ivan Vulic (Cambridge), Manaal Faruqui (Google), and Sebastian Ruder (INSIGHT).

Research publications

2018

Ruder, Sebastian; Vulic, Ivan; Søgaard, Anders. 2018. A survey of cross-lingual word embedding models. Journal of Artificial Intelligence Research (JAIR). To appear. Arxiv

Hartmann, Mareike; Kementchedjhieva, Yova; Søgaard, Anders. 2018. Why is unsupervised alignment of English embeddings from different algorithms so hard? Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018. Brussels, Belgium.

Ruder, Sebastian; Cotterell, Ryan; Kementchedjhieva, Yova; Søgaard, Anders. 2018. A discriminative latent-variable model for bilingual lexicon induction. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018. Brussels, Belgium.

de Lhoneux, Miryam; Bjerva, Johannes; Augenstein, Isabelle; Søgaard, Anders. 2018. Parameter sharing between dependency parsers for related languages. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018. Brussels, Belgium.

Gonzalez-Garduno, Ana Valeria; Augenstein, Isabelle; Søgaard, Anders. 2018. A strong baseline for question relevancy ranking. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018. Brussels, Belgium.

Kementchedjhieva, Yova; Ruder, Sebastian; Cotterell, Ryan; Søgaard, Anders. 2018. Generalizing Procrustes Analysis for Better Bilingual Dictionary Induction. Conference on Computational Natural Language Learning (CoNLL). Brussels, Belgium.

Barrett, Maria; Bingel, Joachim; Hollenstein, Nora; Rei, Marek; Søgaard, Anders. 2018. Sentence classification with human attention. Conference on Computational Natural Language Learning (CoNLL). Brussels, Belgium.

Rønning, Ola; Hardt, Daniel; Søgaard, Anders. 2018. Ellipsis resolution in neural networks. EMNLP Workshop on Analyzing and interpreting neural networks for NLP. Brussels, Belgium.

Kerinec, Emma; Braud, Chloe; Søgaard, Anders. 2018. When does deep multi-task learning work for loosely related document classification tasks? EMNLP Workshop on Analyzing and interpreting neural networks for NLP. Brussels, Belgium.

Søgaard, Anders; de Lhoneux, Miryam; Augenstein, Isabelle. 2018. Nightmare at test time: How punctuation prevents parsers from generalizing. EMNLP Workshop on Analyzing and interpreting neural networks for NLP. Brussels, Belgium.

Søgaard, Anders; Ruder, Sebastian; Vulic, Ivan. 2018. On the limitations of unsupervised bilingual dictionary induction. The 56th Annual Meeting of the Association for Computational Linguistics (ACL). Melbourne, Australia.

Bollmann, Marcel; Bingel, Joachim; Søgaard, Anders. 2018. Multi-task learning for historical text normalization: Size matters. Deep Learning Approaches for Low Resource Natural Language Processing (ACL). Melbourne, Australia.

Kann, Katharina; Bjerva, Johannes; Augenstein, Isabelle; Plank, Barbara; Søgaard, Anders. 2018. Character-level Supervision for Low-resource POS Tagging. Deep Learning Approaches for Low Resource Natural Language Processing (ACL). Melbourne, Australia.

Hartmann, Mareike; Søgaard, Anders. 2018. Limitations of cross-lingual learning from image search. 3rd Workshop on Representation Learning for NLP (ACL). Melbourne, Australia.

Bingel, Joachim; Paetzold, Gustavo; Søgaard, Anders. 2018. Lexi: A tool for adaptive, personalized text simplification. The 27th International Conference on Computational Linguistics (COLING), Demo Session. Santa Fe, New Mexico.

Augenstein, Isabelle; Ruder, Sebastian; Søgaard, Anders. 2018. Multi-task learning of pairwise sequence classification tasks over disparate label spaces. North American Chapter of the Association for Computational Linguistics (NAACL). New Orleans, Louisiana. Arxiv

Rønning, Ola; Hardt, Daniel; Søgaard, Anders. 2018. Sluice resolution without hand-crafted features over brittle syntax. North American Chapter of the Association for Computational Linguistics (NAACL). New Orleans, Louisiana.

Barrett, Maria; Frermann, Lea; Gonzalez-Garduno, Ana Valeria; Søgaard, Anders. 2018. Unsupervised induction of linguistic categories with records of reading, speaking, and writing. North American Chapter of the Association for Computational Linguistics (NAACL). New Orleans, Louisiana.

Rei, Marek; Søgaard, Anders. 2018. Zero-shot sequence labeling through transfer learning. North American Chapter of the Association for Computational Linguistics (NAACL). New Orleans, Louisiana.

Gonzalez-Garduno, Ana Valeria; Søgaard, Anders. 2018. Learning to predict readability using eye-movement data from natives and learners. The 32nd AAAI Conference on Artificial Intelligence (AAAI). New Orleans, Louisiana.

Søgaard, Anders. 2018. What I think when I think about treebanks. 16th International Workshop on Treebanks and Linguistic Theories. Prague, Czech Republic. PDF

Pedersen, Bolette; Nimb, Sanni; Søgaard, Anders; Hartmann, Mareike; Olsen, Sussi. 2018. A Danish FrameNet Lexicon and an Annotated Corpus Used for Training and Evaluating a Semantic Frame Classifier. Language Resources and Evaluation Conference (LREC) 2018. Miyazaki, Japan.

2017

Felbo, Bjarke; Mislove, Alan; Søgaard, Anders; Rahwan, Iyad; Lehmann, Sune. 2017. Using millions of emoji occurrences to pretrain any-domain models for detecting emotion, sentiment, and sarcasm. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2017. Copenhagen, Denmark. PDF

Braud, Chloe; Lacroix, Ophelie; Søgaard, Anders. 2017. Does syntax help discourse segmentation? Not so much. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2017. Copenhagen, Denmark. PDF

Gonzalez-Garduno, Ana; Søgaard, Anders. 2017. Using gaze to predict text readability. 12th Workshop on Innovative Use of NLP for Building Educational Applications, EMNLP 2017. Copenhagen, Denmark. PDF

Braud, Chloe; Søgaard, Anders. 2017. Is writing style predictive of scientific fraud? Workshop on Stylistic Variation, EMNLP 2017. Copenhagen, Denmark. PDF

Salehi, Bahar; Søgaard, Anders. 2017. Evaluating hypotheses in geolocation on a very large sample of Twitter. Workshop on Noisy User-generated Text (W-NUT), EMNLP 2017. Copenhagen, Denmark. PDF

Salehi, Bahar; Hovy, Dirk; Hovy, Eduard; Søgaard, Anders. 2017. Huntsville, hospitals, and hockey teams: Names can reveal your location. Workshop on Noisy User-generated Text (W-NUT), EMNLP 2017. Copenhagen, Denmark. PDF

Søgaard, Anders. 2017. Using hyperlinks to train multilingual partial parsers. The 15th International Conference on Parsing Technologies (IWPT). Pisa, Italy. PDF

Ruder, Sebastian; Bingel, Joachim; Augenstein, Isabelle; Søgaard, Anders. 2017. Learning what to share between loosely related tasks. Arxiv

Hartmann, Mareike; Søgaard, Anders. 2017. Limitations of cross-lingual learning from image search. Arxiv

Bollmann, Marcel; Bingel, Joachim; Søgaard, Anders. 2017. Learning attention for historical text normalization by learning to pronounce. The 55th Annual Meeting of the Association for Computational Linguistics (ACL). Vancouver, Canada. PDF

Braud, Chloe; Lacroix, Ophelie; Søgaard, Anders. 2017. Cross-lingual and cross-domain discourse segmentation of entire documents. The 55th Annual Meeting of the Association for Computational Linguistics (ACL). Vancouver, Canada. PDF

Augenstein, Isabelle; Søgaard, Anders. 2017. Multi-task learning of keyphrase boundary classification. The 55th Annual Meeting of the Association for Computational Linguistics (ACL). Vancouver, Canada. PDF

Felbo, Bjarke; Mislove, Alan; Søgaard, Anders; Rahwan, Iyad; Lehmann, Sune. 2017. Using millions of emoji occurrences to pretrain any-domain models for detecting emotion, sentiment and sarcasm. 2nd Workshop on Representation Learning for NLP, ACL 2017. Vancouver, Canada.

Nimb, Sanni; Olsen, Sussi; Pedersen, Bolette; Braasch, Anna; Sørensen, Nicolai; Søgaard, Anders. 2017. From thesaurus to framenet and annotated text. Electronic Lexicography in the 21st Century (eLex). Leiden, the Netherlands. PDF

Gonzalez-Garduño, Ana Valeria; Søgaard, Anders. 2017. Using gaze data to evaluate text readability: a multi-task learning approach. 19th European Conference on Eye Movements (ECEM). Wuppertal, Germany.

Søgaard, Anders. 2017. Spikes as regularizers. European Symposium on Artificial Neural Networks, Computational Intelligence, and Machine Learning. Brugge, Belgium. Arxiv

Bingel, Joachim; Søgaard, Anders. 2017. Identifying beneficial task relations for multi-task learning in deep neural networks. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Valencia, Spain. PDF

Agic, Zeljko; Plank, Barbara; Søgaard, Anders. 2017. Cross-lingual tagger evaluation without test data. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Valencia, Spain. PDF

Schlichtkrull, Michael; Søgaard, Anders. 2017. Cross-lingual dependency parsing with late decoding for truly low-resource parsing. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Valencia, Spain. PDF

Braud, Chloe; Coavoux, Maximin; Søgaard, Anders. 2017. Cross-lingual RST discourse parsing. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Valencia, Spain. PDF

Levy, Omer; Søgaard, Anders; Goldberg, Yoav. 2017. A strong baseline for learning cross-lingual word embeddings from sentence alignments. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Valencia, Spain. PDF

Alonso, Hector Martinez; Plank, Barbara; Søgaard, Anders. 2017. Parsing universal dependencies without training. The 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Valencia, Spain. PDF

2016

Braud, Chloe; Plank, Barbara; Søgaard, Anders. 2016. Multi-view and multi-task training of RST discourse parsers. The 26th International Conference on Computational Linguistics (COLING). Osaka, Japan.

Bollmann, Marcel; Søgaard, Anders. 2016. Improving historical spelling normalization with bi-directional LSTMs and multi-task learning. The 26th International Conference on Computational Linguistics (COLING). Osaka, Japan.

Barrett, Maria; Keller, Frank; Søgaard, Anders. 2016. Cross-lingual transfer of correlations between parts of speech and gaze features. The 26th International Conference on Computational Linguistics (COLING). Osaka, Japan.

Søgaard, Anders. 2016. Spikes as regularizers. NIPS Workshop on Computing with Spikes. Barcelona, Spain.

Agic, Zeljko; Johannsen, Anders; Plank, Barbara; Martinez, Hector; Schluter, Natalie; Søgaard, Anders. 2016. Multi-lingual projection for parsing truly low resource languages. Transactions of the Association for Computational Linguistics (TACL) 4: 301-312.

Søgaard, Anders. 2016. Evaluating word embeddings with fMRI and eye-tracking. RepEval, The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany. Nominated for Best Paper Award.

Bingel, Joachim; Barrett, Maria; Søgaard, Anders. 2016. Extracting token-level signals of syntactic processing from fMRI - with an application to POS induction. The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Søgaard, Anders; Goldberg, Yoav. 2016. Deep multi-task learning with low level tasks supervised at lower layers. The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Plank, Barbara; Søgaard, Anders; Goldberg, Yoav. 2016. Multilingual part-of-speech tagging with bidirectional long short-term memory models and auxiliary loss. The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Bingel, Joachim; Søgaard, Anders. 2016. Text simplification as tree labeling. The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Johannsen, Anders; Agic, Zeljko; Søgaard, Anders. 2016. Joint part-of-speech and dependency projection from multiple sources. The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Barrett, Maria; Bingel, Joachim; Keller, Frank; Søgaard, Anders. 2016. Weakly supervised part-of-speech tagging using eye-tracking data. The 54th Annual Meeting of the Association for Computational Linguistics (ACL). Berlin, Germany.

Klerke, Sigrid; Goldberg, Yoav; Søgaard, Anders. 2016. Improving sentence compression by learning to predict gaze. North American Chapter of the Association for Computational Linguistics (NAACL). San Diego, CA. Received Best Short Paper Award.

Jørgensen, Anna; Hovy, Dirk; Søgaard, Anders. 2016. Learning a POS tagger for AAVE-like language. North American Chapter of the Association for Computational Linguistics (NAACL). San Diego, CA.

Pedersen, Bolette; Braasch, Anna; Johannsen, Anders; Alonso, Hector Martinez; Nimb, Sanni; Olsen, Sussi; Søgaard, Anders; Sørensen, Nicolai Hartvig. 2016. The SemDaX corpus - sense annotations with scalable sense inventories. Language Resources and Evaluation Conference (LREC) 2016. Portoroz, Slovenia.

Jørgensen, Anna; Søgaard, Anders. 2016. A test suite for evaluating POS taggers across varieties of English. WWW Workshop on NLP for Informal Text. Montreal, Canada.

Søgaard, Anders. 2016. Biases we live by. Nordisk Tidsskrift for Informationsvidenskab og Kulturformidling 5(1): 31-35.

2015

Søgaard, Anders. 2015. Empirical Gaussian priors for cross-lingual transfer learning. NIPS Workshop on Transfer and Multi-task Learning. Montreal, Canada.

Barrett, Maria; Agic, Zeljko; Søgaard, Anders. 2015. The Dundee Treebank. Treebanks and Linguistic Theories 14. Warsaw, Poland.

Johannsen, Anders; Martinez, Hector; Søgaard, Anders. 2015. Any-language frame semantic parsing. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2015. Lisbon, Portugal.

Klerke, Sigrid; Barrett, Maria; Castilho, Sheila; Søgaard, Anders. 2015. Reading metrics for estimating task efficiency with MT output. EMNLP Workshop on Cognitive Aspects of Computational Language Learning. Lisbon, Portugal.

Barrett, Maria; Søgaard, Anders. 2015. Using reading behavior to predict grammatical functions. EMNLP Workshop on Cognitive Aspects of Computational Language Learning. Lisbon, Portugal. PDF

Søgaard, Anders; Agic, Zeljko; Martinez, Hector; Plank, Barbara; Bohnet, Bernd; Johannsen, Anders. 2015. Inverted indexing for cross-lingual NLP. The 53rd Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China.

Hovy, Dirk; Søgaard, Anders. 2015. Tagging performance correlates with author age. The 53rd Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China.

Schluter, Natalie; Søgaard, Anders. 2015. Unsupervised extractive summarization via coverage maximization with syntactic and semantic concepts. The 53rd Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China.

Agic, Zeljko; Hovy, Dirk; Søgaard, Anders. 2015. If all you have is a bit of the Bible: Learning POS taggers for truly low-resource languages. The 53rd Annual Meeting of the Association for Computational Linguistics (ACL). Beijing, China.

Johannsen, Anders; Hovy, Dirk; Søgaard, Anders. 2015. Cross-lingual syntactic variation over age and gender. The 19th Conference on Computational Natural Language Learning (CoNLL). Beijing, China.

Plank, Barbara; Martinez Alonso, Hector; Agic, Zeljko; Merkler, Danijela; Søgaard, Anders. 2015. Do dependency parsing metrics correlate with human judgments? The 19th Conference on Computational Natural Language Learning (CoNLL). Beijing, China.

Barrett, Maria; Søgaard, Anders. 2015. Reading behavior predicts syntactic categories. The 19th Conference on Computational Natural Language Learning (CoNLL). Beijing, China.

Wulff, Julie; Søgaard, Anders. 2015. Learning finite state word representations for unsupervised Twitter adaptation of POS taggers. ACL 2015 Workshop on Noisy User-generated Text (W-NUT). Beijing, China.

Jørgensen, Anna Katrine; Hovy, Dirk; Søgaard, Anders. 2015. Challenges of studying and processing dialects in social media. ACL 2015 Workshop on Noisy User-generated Text (W-NUT). Beijing, China.

Søgaard, Anders. 2015. Where is language? ACL 2015 Workshop on Noisy User-generated Text (W-NUT). Beijing, China.

Martinez Alonso, Hector; Plank, Barbara; Skjærholt, Arne; Søgaard, Anders. 2015. Learning to parse with IAA-weighted loss. North American Chapter of the Association for Computational Linguistics (NAACL). Denver, CO.

Gouws, Stephan; Søgaard, Anders. 2015. Simple task-specific bilingual word embeddings. North American Chapter of the Association for Computational Linguistics (NAACL). Denver, CO.

Hovy, Dirk; Plank, Barbara; Martinez Alonso, Hector; Søgaard, Anders. 2015. Mining for unambiguous instances to adapt POS taggers to new domains. North American Chapter of the Association for Computational Linguistics (NAACL). Denver, CO.

Plank, Barbara; Martinez Alonso, Hector; Søgaard, Anders. 2015. Non-canonical language is not harder to annotate than canonical language. The 9th Linguistic Annotation Workshop (LAW IX), NAACL. Denver, CO.

Hovy, Dirk; Johannsen, Anders; Søgaard, Anders. 2015. User review sites as a resource for large-scale sociolinguistic studies. The 24th International World Wide Web Conference (WWW). Florence, Italy.

Martinez, Hector; Plank, Barbara; Johannsen, Anders; Søgaard, Anders. 2015. Active learning for sense annotation. The 20th Nordic Conference on Computational Linguistics. Vilnius, Lithuania.

Martinez, Hector; Johannsen, Anders; Olsen, Sussi; Nimb, Sanni; Sørensen, Nicolai Hartvig; Braasch, Anna; Søgaard, Anders; Pedersen, Bolette Sandford. 2015. Supersense tagging for Danish. The 20th Nordic Conference on Computational Linguistics. Vilnius, Lithuania.

Søgaard, Anders; Plank, Barbara; Martinez, Hector. 2015. Using frame semantics for knowledge extraction from Twitter. The 29th Conference on Artificial Intelligence (AAAI). Austin, TX.

Barrett, Maria; Søgaard, Anders. 2015. Modeling eye movements when reading microblogs. The 20th AAAI/SIGAI Doctoral Consortium, collocated with The 29th Conference on Artificial Intelligence (AAAI). Austin, TX.

2014

Plank, Barbara; Johannsen, Anders; Søgaard, Anders. 2014. Importance weighting and unsupervised domain adaptation of POS taggers: a negative result. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2014. Doha, Qatar.

Plank, Barbara; Hovy, Dirk; McDonald, Ryan; Søgaard, Anders. 2014. Adapting taggers to Twitter using not-so-distant supervision. The 25th International Conference on Computational Linguistics (COLING). Dublin, Ireland.

Søgaard, Anders; Plank, Barbara; Hovy, Dirk. 2014. Selection bias, label bias, and bias in ground truth (tutorial). The 25th International Conference on Computational Linguistics (COLING). Dublin, Ireland.

Johannsen, Anders; Hovy, Dirk; Martinez, Hector; Søgaard, Anders. 2014. More or less supervised super-sense tagging of Twitter. The 3rd Joint Conference on Lexical and Computational Semantics (*SEM). Dublin, Ireland. Received Best Paper Award.

Søgaard, Anders; Johannsen, Anders; Plank, Barbara; Hovy, Dirk; Martinez, Hector. 2014. What is in a p-value in NLP? The 18th Conference on Computational Natural Language Learning (CoNLL). Baltimore, MD.

Plank, Barbara; Hovy, Dirk; Søgaard, Anders. 2014. Linguistically debatable or just plain wrong? The 52nd Annual Meeting of the Association for Computational Linguistics (ACL). Baltimore, MD.

Hovy, Dirk; Plank, Barbara; Søgaard, Anders. 2014. Experiments with a crowd-sourced re-annotation of a POS tagging dataset. The 52nd Annual Meeting of the Association for Computational Linguistics (ACL). Baltimore, MD.

Plank, Barbara; Hovy, Dirk; Søgaard, Anders. 2014. Learning part-of-speech taggers with inter-annotator agreement loss. The 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL). Gothenburg, Sweden. Received Best Long Paper Award.

Fromreide, Hege; Søgaard, Anders. 2014. NER in tweets using a small crowdsourced dataset. The 9th International Conference on Natural Language Processing (PolTAL). Warsaw, Poland.

Schluter, Natalie; Elming, Jakob; Søgaard, Anders. 2014. Tree approximations of semantic parsing problems. SemEval 2014. Dublin, Ireland.

Hovy, Dirk; Plank, Barbara; Søgaard, Anders. 2014. When POS datasets do not add up: Combatting sample bias. Language Resources and Evaluation Conference (LREC) 2014. Reykjavik, Iceland.

Pedersen, Bolette; Nimb, Sanni; Olsen, Sussi; Søgaard, Anders; Sørensen, Nicolai. 2014. Semantic annotation of the Danish CLARIN Reference Corpus. 10th Joint ACL-ISO Workshop on Interoperable Semantic Annotation, LREC. Reykjavik, Iceland.

Fromreide, Hege; Søgaard, Anders. 2014. Crowdsourcing and annotating NER for Twitter #drift. Language Resources and Evaluation Conference 2014. Reykjavik, Iceland.

2013

Søgaard, Anders. 2013. Semi-supervised learning and domain adaptation for NLP. Morgan & Claypool.

Søgaard, Anders; Martinez, Hector; Elming, Jakob; Johannsen, Anders. 2013. Using crowdsourcing to get representations based on regular expressions. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2013. Seattle, WA.

Matthies, Franz; Søgaard, Anders. 2013. With blinkers on: Robust prediction of eye movements across readers. Conference on Empirical Methods in Natural Language Processing (EMNLP) 2013. Seattle, WA.

Søgaard, Anders. 2013. Part-of-speech tagging with antagonistic adversaries. The 51st Annual Meeting of the Association for Computational Linguistics (ACL). Sofia, Bulgaria.

Elming, Jakob; Johannsen, Anders; Klerke, Sigrid; Lapponi, Emanuele; Alonso, Hector Martinez; Søgaard, Anders. 2013. Down-stream effects of tree-to-dependency conversions. North American Chapter of the Association for Computational Linguistics (NAACL). Atlanta, GA.

Søgaard, Anders. 2013. Zipfian corruptions for robust POS tagging. North American Chapter of the Association for Computational Linguistics (NAACL). Atlanta, GA.

Søgaard, Anders. 2013. Estimating effect size across datasets. North American Chapter of the Association for Computational Linguistics (NAACL). Atlanta, GA.

Johannsen, Anders; Søgaard, Anders. 2013. Cross-domain answer ranking using importance sampling. The 6th International Joint Conference on Natural Language Processing (IJCNLP). Nagoya, Japan.

Johannsen, Anders; Søgaard, Anders. 2013. Disambiguating explicit discourse connectives without oracles. The 6th International Joint Conference on Natural Language Processing (IJCNLP). Nagoya, Japan.

Klerke, Sigrid; Søgaard, Anders. 2013. Simple readable sub-sentences. The 51st Annual Meeting of the Association for Computational Linguistics (ACL), Student Research Workshop. Sofia, Bulgaria.

Søgaard, Anders. 2013. An empirical study of differences between conversion schemes and annotation guidelines. International Conference on Dependency Linguistics 2013. Prague, the Czech Republic.

Klerke, Sigrid; Elbro, Carsten; Søgaard, Anders. 2013. Tracking readability in eye movements. 17th European Conference on Eye Movements. Lund, Sweden.

2012

Søgaard, Anders. 2012. Unsupervised dependency parsing without training. Natural Language Engineering 18(1):187-203.

Søgaard, Anders; Johannsen, Anders. 2012. Robust learning in random subspaces: equipping NLP for OOV effects. The 24th International Conference on Computational Linguistics (COLING). Mumbai, India.

Søgaard, Anders; Wulff, Julie. 2012. An empirical study of non-lexical extensions to delexicalized transfer. The 24th International Conference on Computational Linguistics (COLING). Mumbai, India.

Søgaard, Anders. 2012. Mining wisdom. Computational Linguistics in Literature, North American Chapter of the Association for Computational Linguistics (NAACL). Montreal, Canada.

Søgaard, Anders. 2012. Two baselines for unsupervised dependency parsing. Workshop on Inducing Linguistic Structure, North American Chapter of the Association for Computational Linguistics (NAACL). Montreal, Canada.

Johannsen, Anders; Martinez, Hector; Klerke, Sigrid; Søgaard, Anders. 2012. EMNLP@CPH: Is frequency all there is to simplicity? SemEval-2012, 1st Joint Conference on Lexical and Computational Semantics. Montreal, Canada.

Nisbeth, Niklas; Søgaard, Anders. 2012. Parser combination under sample bias. SPLeT, Language Resources and Evaluation Conference. Istanbul, Turkey.

Klerke, Sigrid; Søgaard, Anders. 2012. DSim, a Danish parallel corpus for text simplification. The 8th International Conference on Language Resources and Evaluation. Istanbul, Turkey.

Søgaard, Anders; Kristiansen, Søren Lind. 2012. Using hybrid logic for querying dependency treebanks. Linguistic Issues in Language Technology 7(5).

Plank, Barbara; Søgaard, Anders. 2012. Parsing the web as covariate shift. First Workshop on Syntactic Analysis of Non-Canonical Language, NAACL 2012. Montreal, Canada.

Plank, Barbara; Søgaard, Anders. 2012. Experiments in newswire-to-law adaptation of graph-based dependency parsers. 8° Convegno Nazionale dell’Associazione Italiana di Scienze della Voce. Rome, Italy.

2011

Søgaard, Anders. 2011. An O(|G|n^6) time extension of inversion transduction grammars. Machine Translation 25(4):291-315.

Søgaard, Anders; Haulrich, Martin. 2011. Sentence-level instance-weighting for graph-based and transition-based dependency parsing. The 12th International Conference on Parsing Technologies (IWPT). Dublin, Ireland.

Søgaard, Anders. 2011. Semi-supervised condensed nearest neighbor for part-of-speech tagging. The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT). Portland, Oregon. Nominated for Best Short Paper Award.

Søgaard, Anders. 2011. Data point selection for cross-language adaptation of dependency parsers. The 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT). Portland, Oregon.

Johannsen, Anders; Martinez, Hector; Rishøj, Christian; Søgaard, Anders. 2011. Frustratingly hard compositionality prediction. Distributional Semantics and Compositionality, the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT). Portland, Oregon.

Søgaard, Anders. 2011. From ranked words to dependency trees: two-stage unsupervised non-projective dependency parsing. TextGraphs-6: Graph-based Methods for Natural Language Processing, the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL-HLT). Portland, Oregon. Errata: The assertion that the algorithm described in the paper does not guarantee projectivity (p. 65) is not true.

Rishøj, Christian; Søgaard, Anders. 2011. Factored translation using unsupervised word clusters. The 6th Workshop on Statistical Machine Translation, Conference on Empirical Methods in Natural Language Processing (EMNLP). Edinburgh, Scotland.

Søgaard, Anders. 2011. Using graphical models for PP attachment. The 18th Nordic Conference on Computational Linguistics. Riga, Latvia.

Søgaard, Anders. 2011. Learning grammatical functions in a realistic way. The 3rd Conference of the Scandinavian Association for Language and Cognition. Copenhagen, Denmark.

Søgaard, Anders. 2011. Cable television; Fanzines; Functionalist theories; Journalism. In Marcel Danesi (ed.), Encyclopedia of Media and Communication. Toronto: University of Toronto Press.

2010

Søgaard, Anders; Rishøj, Christian. 2010. Semi-supervised dependency parsing using generalized tri-training. The 23rd International Conference on Computational Linguistics (COLING). Beijing, China.

Søgaard, Anders. 2010. Simple semi-supervised training of part-of-speech taggers. The 48th Annual Meeting of the Association for Computational Linguistics (ACL). Uppsala, Sweden.

Søgaard, Anders; Rishøj, Christian. 2010. The effect of semi-supervised learning on parsing long distance dependencies in German and Swedish. The 7th International Conference on Natural Language Processing (IceTAL). Reykjavik, Iceland.

Søgaard, Anders; Johannsen, Anders. 2010. Robust semi-supervised and ensemble-based methods in word sense disambiguation. The 7th International Conference on Natural Language Processing (IceTAL). Reykjavik, Iceland.

Kristiansen, Søren Lind; Søgaard, Anders. 2010. Querying dependency treebanks in hybrid logic. Hybrid Logic and Applications, The 25th Annual IEEE Symposium on Logic in Computer Science (LICS). Edinburgh, Scotland.

Søgaard, Anders. 2010. Can inversion transduction grammars generate hand alignments? The 14th Annual Conference of the European Association for Machine Translation (EAMT). St. Raphael, France. Errata: The numbers in Figure 4 are incorrect. Email me for correct numbers.

Søgaard, Anders; Haulrich, Martin. 2010. On the derivation perplexity of treebanks. Treebanks and Linguistic Theories 9. Riga, Latvia.

2009

Søgaard, Anders; Lange, Martin. 2009. Polyadic dynamic logics for HPSG parsing. Journal of Logic, Language and Information 18(2): 159-198. (ERIH Category: A)

Søgaard, Anders; Wu, Dekai. 2009. Empirical lower bounds on translation unit error rate for the full class of inversion transduction grammars. The 11th International Conference on Parsing Technologies (IWPT). Paris, France.

Søgaard, Anders; Kuhn, Jonas. 2009. Using a maximum entropy-based tagger to improve a very fast vine parser. The 11th International Conference on Parsing Technologies (IWPT). Paris, France.

Søgaard, Anders; Kuhn, Jonas. 2009. Empirical lower bounds on alignment error rates in syntax-based machine translation. The 3rd Workshop on Syntax and Structure in Statistical Translation, North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT) 2009. Boulder, Colorado.

Søgaard, Anders. 2009. On the complexity of alignment problems in two synchronous grammar formalisms. The 3rd Workshop on Syntax and Structure in Statistical Translation, North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT) 2009. Boulder, Colorado.

Søgaard, Anders; Østerskov, Stine. 2009. On definitions of consciousness. Journal of Consciousness Studies.

Søgaard, Anders. 2009. Compound constructions: a reply to Bundgaard et al. Semiotica: Journal of the International Association for Semiotic Studies 169(1): 163-169. (ERIH Category: A)

Søgaard, Anders. 2009. Ensemble-based POS tagging of Italian. The 11th Conference of the Italian Association for Artificial Intelligence, EVALITA. Reggio Emilia, Italy.

Søgaard, Anders; Rishøj, Christian. 2009. Vine parsing augmented Italian treebanks. The 11th Conference of the Italian Association for Artificial Intelligence, EVALITA. Reggio Emilia, Italy.

Søgaard, Anders. 2009. A linear time extension of deterministic pushdown automata. The 17th Nordic Conference on Computational Linguistics. Odense, Denmark.

Søgaard, Anders. 2009. Cubic time querying of treebanks for nonlocal multicomponent tree-adjoining grammar and head-driven phrase structure grammar. The 17th Nordic Conference on Computational Linguistics. Odense, Denmark.

Søgaard, Anders; Haugereid, Petter. 2009. Introduction. In Anders Søgaard and Petter Haugereid (eds.), Typed feature structure grammars. Berlin: Peter Lang.

Søgaard, Anders. 2009. From unordered context-free grammar to polysize HPSG without moving. In Anders Søgaard and Petter Haugereid (eds.), Typed feature structure grammars. Berlin: Peter Lang.

2008

Søgaard, Anders. 2008. Range concatenation grammars for translation. The 22nd International Conference on Computational Linguistics (COLING). Manchester, England. Pp. 103-106.

Søgaard, Anders. 2008. On the weak generative capacity of weighted context-free grammars. The 22nd International Conference on Computational Linguistics (COLING). Manchester, England. Pp. 99-102.

Maier, Wolfgang; Søgaard, Anders. 2008. Treebanks and mild context-sensitivity. The 13th Conference on Formal Grammar. Hamburg, Germany. Pp. 61-76.

Søgaard, Anders. 2008. Learning context-sensitive synchronous rules. The 13th Annual Conference of the European Association for Machine Translation (EAMT). Hamburg, Germany. Pp. 170-175.

2007

Søgaard, Anders; Haugereid, Petter. 2007. A tractable typed feature structure grammar for Mainland Scandinavian. Nordic Journal of Linguistics 30(1): 87-128. (ERIH Category: B)

Søgaard, Anders. 2007. The grammaticalization and disappearance of adpositions in nominal compounds. California Linguistic Notes 17(2): 1-24.

Søgaard, Anders; Lichte, Timm; Maier, Wolfgang. 2007. On the complexity of linguistically motivated extensions of tree-adjoining grammar. Recent Advances in Natural Language Processing 2007. Borovets, Bulgaria.

Søgaard, Anders. 2007. Operations on polyadic structures. Model-theoretic syntax @ 10, the 19th European Summer School on Logic, Language and Information. Dublin, Ireland.

Søgaard, Anders. 2007. Complexity of conceptual integration. The 10th International Cognitive Linguistics Conference. Krakow, Poland.

Søgaard, Anders. 2007. Polynomial charts for totally unordered languages. Proceedings of the 16th Nordic Conference of Computational Linguistics. Tartu, Estonia. Pp. 183-190.

Søgaard, Anders. 2007. Propositional and first order verification of linguistic structures. Proceedings of the 2nd International Workshop on Typed Feature Structure Grammars. Tartu, Estonia. Pp. 17-24.

Søgaard, Anders; Nimb, Sanni. 2007. A typed account of adverbial quantifiers. Proceedings of the 2nd International Workshop on Typed Feature Structure Grammars. Tartu, Estonia. Pp. 53-61.

Søgaard, Anders. 2007. From unordered context-free grammar to polysize HPSG without moving. Proceedings of the 1st International Workshop on Typed Feature Structure Grammars. Aalborg, Denmark. Pp. 17-30.

Søgaard, Anders. 2007. Mathematical properties of natural language and mathematical properties of linguistic theories. The 21st Grammar in Focus. Lund, Sweden.

2006

Søgaard, Anders. 2006. Computational semantics as reasoning about programs. South Asian Language Review 16, Special Issue on Computational Semantics.

Søgaard, Anders. 2006. The semantics of possession in natural language and knowledge representation. Journal of Universal Language 6(2): 85-115.

Søgaard, Anders. 2006. Model generation in a dynamic environment. Takashi Washio, Akito Sakurai, Satoshi Tojo and Makoto Yokoo (eds.), New Frontiers in Artificial Intelligence. Berlin: Springer. Pp. 126-133.

Søgaard, Anders. 2006. Unification-based grammars and complexity classes. 8. Konferenz zur Verarbeitung natürlicher Sprache. Konstanz, Germany. Pp. 137-142.

Søgaard, Anders. 2006. Embodied construction grammar as layered modal languages. Proceedings of The Joint Human Language Technology Conference and the North American Chapter of the Association of Computational Linguistics 2006, Third International Workshop on Scalable Natural Language Understanding. New York, New York. Pp. 65-72.

Søgaard, Anders. 2006. Logical investigations on the adequacy of certain feature-based theories of natural language. Proceedings of The Joint Human Language Technology Conference and the North American Chapter of the Association of Computational Linguistics 2006, Doctoral Consortium. New York, New York. Pp. 239-242.

2005

Søgaard, Anders. 2005. Update semantics for HPSG grammars. H. Holmboe (ed.), Nordisk Sprogteknologi 2005. Copenhagen: Museum Tusculanum. Pp. 167-72.

Søgaard, Anders. 2005. Compounding theories and linguistic diversity. Zygmunt Frajzyngier, Adam Hodges and David S. Rood (eds.), Linguistic diversity and language theories. Amsterdam: John Benjamins. Pp. 319-37.

Søgaard, Anders; Haugereid, Petter. 2005. A brief documentation of a computational HPSG grammar specifying (most of) the common subset of linguistic types for Danish, Norwegian and Swedish. H. Holmboe (ed.), Nordisk Sprogteknologi 2004. Copenhagen: Museum Tusculanum. Pp. 247-56.

Søgaard, Anders. 2005. Where does the meaning of compounds and possessives come from? A contrastive view. The 3rd International Conference in Contrastive Semantics and Pragmatics. Shanghai, China.

Søgaard, Anders. 2005. Computing sense and reference. Computing and Philosophy 2005. Västerås, Sweden.

Søgaard, Anders; Haugereid, Petter. 2005. Functionality in grammar design. Stefan Werner (ed.), Proceedings of The 15th Nordic Conference of Computational Linguistics. Joensuu: University of Joensuu Electronic Publications in Linguistics and Language Technology, vol. 1. Pp. 193-202.

Søgaard, Anders. 2005. Model generation in a dynamic environment. Proceedings of Logic and Engineering of Natural Language Semantics 2005. Kitakyushu, Japan.

Søgaard, Anders. 2005. Extending the HPSG Grammar Matrix with richer lexical semantics. Proceedings of The 3rd International Workshop on Generative Approaches to the Lexicon. Geneva, Switzerland. [PDF]

2004

Søgaard, Anders. 2004. A compound matrix. Proceedings of The 11th International Conference on Head-Driven Phrase Structure Grammar (HPSG). Leuven, Belgium.

Søgaard, Anders. 2004. On appropriateness as a relation between ontologies. International Conference on Formal Ontology in Information Systems. Torino, Italy.

Søgaard, Anders. 2004. K-structure: (a prerequisite for) an interlingua. Papers from the 20th Scandinavian Conference of Linguistics. Helsinki, Finland.

Søgaard, Anders; Haugereid, Petter. 2004. The noun phrase in Mainland Scandinavian. The 3rd Meeting of the Scandinavian Network of Grammar Engineering and Machine Translation. Gothenburg, Sweden.

2003

Søgaard, Anders. 2003. A compound algorithm. Computational Linguistics in the Netherlands 2003. Antwerpen, Belgium.

Søgaard, Anders. 2003. Compounding and the generative lexicon. Sprache, Wissen, Wissenschaft. Munich, Germany.