Welcome to the GU ANNIS Web Interface

For questions about restricted corpora, including how to obtain a login, see this page. For larger or flat-annotated corpora, see also our CQP web interface.

Please select a corpus from the list below to enter.

Multilayer Corpora

  • Georgetown University Multilayer Corpus (GUM)
     eng-us / 64,005 / 76
          
  • OntoNotes 3.0 - WSJ section (OntoNotes)  
     eng-us / 370,789 / 597
       
  • OntoNotes 5.0 Coref Section (OntoNotes5_coref)  
     eng-us / 1,590,885 / 3,393
      
  • OntoNotes 5.0 Dependencies (OntoNotes5_dep)  
     eng-us / 2,589,499 / 12,721
     
  • The Potsdam Commentary Corpus Sampler (pcc2)
     deu / 399 / 2
         
Treebanks

  • Arabic Treebank (Buckwalter vocalized) (arabic.treebank)  
     ara / 177,950 / 734
     
  • Arabic Tree Test (arabic.tree.test)
     ara / 11 / 1
     
  • Chinese Treebank 9.0 (Chinese Treebank 9.0)  
     zho / 2,287,073 / 3,726
     
  • English Web Treebank (English.Web.Treebank)  
     eng-us / 272,779 / 1,174
     
  • Foreebank En - English Web Support Forum Treebank (Foreebank-en)  
     eng / 15,613 / 1
     
  • Foreebank Fr - French Web Support Forum Treebank (Foreebank-fr)  
     fra / 19,667 / 1
     
  • Open American National Corpus - Manually Annotated Subcorpus (Court Transcripts) (MASC_court)
     eng-us / 37,756 / 39
     
  • Switchboard Telephone Conversation Constituent Corpus (switchboard_const)
     eng-us / 1,095,089 / 646
     
  • Switchboard Telephone Conversation Dependency Corpus (Switchboard (dep))
     eng-us / 1,287,379 / 649
     
  • The Tiger Treebank version 2 (tiger2)  
     deu / 888,578 / 1,971
     
  • Spanish Universal Dependency Treebank 2.0 (unidep.es)
     spa / 375,180 / 369
     
  • Japanese Universal Dependency Treebank 2.0 (unidep.jp)
     jpn / 80,172 / 80
     
  • Wall Street Journal Dependency Corpus (Wall Street Journal (dep))  
     eng-us / 1,173,766 / 2,312
     
  • Wall Street Journal Constituent Treebank (wsj.const_ptb)  
     eng-us / 1,209,785 / 2,235
     
  • CALLHOME Mandarin Telephone Conversation Treebank (zh.callhome.tb)  
     zho / 108,531 / 41
     
  • Xinhua Mandarin News Treebank (zh.xinhua.tb)  
     zho / 106,934 / 325
     
Historical Corpora

  • Penn Parsed Corpus of Early Modern English - Helsinki Subcorpus (PPCEME_helsinki)  
     eng-eme / 627,993 / 147
     
  • Penn Parsed Corpus of Early Modern English - Penn Subcorpus 1 (PPCEME_penn1)  
     eng-eme / 636,421 / 152
     
Parallel Corpora

  • SMULTRON Parallel Treebank Sampler (SMULTRON_Banana)
     eng-us,deu / 3,782 / 2
      
Learner Corpora

  • CityU Corpus of Essay Drafts of English Language Learners (cityu-2007-08A)  
     eng-L2 / 600,031 / 1,018
      
  • CityU Corpus of Essay Drafts of English Language Learners (cityu-2007-08B)  
     eng-L2 / 1,173,329 / 1,696
      
  • CityU Corpus of Essay Drafts of English Language Learners (cityu-2008-09A)  
     eng-L2 / 3,428,414 / 3,872
      
  • CityU Corpus of Essay Drafts of English Language Learners (cityu-2008-09B)  
     eng-L2 / 2,151,821 / 4,046
      
  • CityU Corpus of Essay Drafts of English Language Learners (cityu-2009-10B)  
     eng-L2 / 424,841 / 532
      
  • The MERLIN corpus - L2 Czech (MERLIN_Czech)
     cze-L2 / 79,969 / 441
      
  • The MERLIN corpus - L2 German (MERLIN_German)
     deu-L2 / 154,335 / 1,033
      
  • The MERLIN corpus - L2 Italian (MERLIN_Italian)
     ita-L2 / 107,211 / 813
      
Miscellaneous Corpora

  • VU Amsterdam Metaphor Corpus (VUAMC)
     eng-uk / 238,905 / 117
      
Hausa Corpora

  • SFB632 A5 Hausa News Corpus (a5.hausa.news)  
     hau / 2,017 / 4

  • SFB632 A5 Hausa Film Corpus [Umarnin Uwa] (a5.hausa.umarnin.uwa_V2)
     hau / 10,194 / 47

Coptic SCRIPTORIUM Corpora

  • Apophthegmata Patrum (apophthegmata.patrum)
     cop / 7,911 / 52
       
  • Besa - Letters (besa.letters)
     cop / 2,296 / 3
     
  • Documentary Papyri (doc.papyri)  
     cop / 290 / 3
     
  • Sahidica Bible - 1 Corinthians (sahidica.1corinthians)
     cop / 12,471 / 16
     
  • Sahidica Bible - Mark (sahidica.mark)
     cop / 20,185 / 16
     
  • Sahidica Coptic New Testament (sahidica.nt)
     cop / 233,906 / 259
     
  • Shenoute - Acephalous 22 (shenoute.a22)
     cop / 7,589 / 4
     
  • Shenoute - Abraham our Father (shenoute.abraham.our.father)
     cop / 7,670 / 7
     
  • Shenoute - I See Your Eagerness (shenoute.eagerness)
     cop / 18,353 / 17
     
  • Shenoute - Not Because a Fox Barks (shenoute.fox)  
     cop / 2,814 / 1
      