C58 : the Greek Discourse Relations Corpus
What we do
C58's annotated texts
You can have full access to C58's annotated texts via the brat server we set up at the Department of Linguistics of the Aristotle University of Thessaloniki.
The relevant link is here.
Apart from browsing our annotated texts by double-clicking on each text, you can use brat's search utilities to extract information about our dataset. Information about thematic roles and grammatical aspect is displayed when you place the mouse over verbs and their arguments. Moreover, brat provides an export utility in a customized format for further analysis of C58.
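As a starting point for such further analysis, the sketch below reads exported annotations into Python objects. It assumes the export follows brat's standard standoff layout (tab-separated lines whose IDs start with T for text spans and R for binary relations); the actual customized export may differ. The span label "Segment" and the sample lines are invented for illustration, while "Elab" is taken from the annotation scheme on this page.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TextBound:
    id: str
    label: str
    start: int
    end: int
    text: str


@dataclass
class Relation:
    id: str
    label: str
    arg1: str
    arg2: str


def parse_ann(content: str) -> Tuple[Dict[str, TextBound], List[Relation]]:
    """Parse text-bound (T) and relation (R) lines; other line types are skipped."""
    spans: Dict[str, TextBound] = {}
    relations: List[Relation] = []
    for line in content.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # Layout: T1<TAB>Label start end<TAB>covered text
            label, offsets = fields[1].split(" ", 1)
            # Discontinuous spans look like "0 5;10 15"; keep the outer bounds.
            parts = offsets.replace(";", " ").split()
            text = fields[2] if len(fields) > 2 else ""
            spans[ann_id] = TextBound(ann_id, label, int(parts[0]), int(parts[-1]), text)
        elif ann_id.startswith("R"):
            # Layout: R1<TAB>Label Arg1:T1 Arg2:T2
            label, arg1, arg2 = fields[1].split()
            relations.append(
                Relation(ann_id, label, arg1.split(":", 1)[1], arg2.split(":", 1)[1])
            )
    return spans, relations


# Invented sample, not taken from the corpus.
sample = (
    "T1\tSegment 0 12\tFirst clause\n"
    "T2\tSegment 13 26\tsecond clause\n"
    "R1\tElab Arg1:T1 Arg2:T2\n"
)

spans, relations = parse_ann(sample)
for rel in relations:
    print(rel.label, "|", spans[rel.arg1].text, "|", spans[rel.arg2].text)
```

Keeping spans in a dictionary keyed by ID makes it straightforward to resolve the two arguments of each relation back to their text.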
Based partly on Asher & Lascarides' (2003) theory of discourse relations, SDRT (Segmented Discourse Representation Theory), we used the brat annotation tool as our platform for annotating GDRC with discourse relations. During the annotation procedure, various issues related to the nature of discourse relations came up concerning SDRT's existing inventory of discourse relations.
Apart from the annotation of discourse relations, C58 was annotated for:
- argument structure (partly based on VerbNet's thematic roles),
- grammatical aspect.
These annotations will form the basis for further analysis of the interplay between intersentential and intrasentential factors.
In the second phase of the project, we trained an automatic text classification model to detect Elaboration and Commentary in C58's dataset (cf. the papers section).
In the near future, we plan to apply various data-analytic techniques to reveal patterns in how authors choose to unfold their stories in various journalistic genres.
C58's annotation scheme
Discourse Relations
Alter = Alternation
Instance
Precond = Precondition
Purpose
Nar = Narration
Elab = Elaboration
Expl = Explanation
Comm = Commentary
Paral = Parallel
Contin = Continuation
Res = Result
Contr = Contrast
Back = Background
Topic
Thematic Roles
Act = Actor
Ag = Agent
Pt = Patient
Exp = Experiencer
Stim = Stimulus
Caus = Causer
Asset
Attr = Attribute
Benef = Beneficiary
Loc = Location
Dest = Destination
Source
Extent
Inst = Instrument
Mater-Prod = Material-Product
Prod = Product
Pred = Predicate
Recip = Recipient
Theme
Time
Topic
Co_Theme
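For anyone processing the labels programmatically, the two inventories above can be written as lookup tables. This is a minimal sketch: the dictionary names and the expand helper are hypothetical, and it assumes that the labels stored in the annotation files match the abbreviations listed here.

```python
# Abbreviation -> full name, transcribed from the lists above.
DISCOURSE_RELATIONS = {
    "Alter": "Alternation", "Instance": "Instance", "Precond": "Precondition",
    "Purpose": "Purpose", "Nar": "Narration", "Elab": "Elaboration",
    "Expl": "Explanation", "Comm": "Commentary", "Paral": "Parallel",
    "Contin": "Continuation", "Res": "Result", "Contr": "Contrast",
    "Back": "Background", "Topic": "Topic",
}

THEMATIC_ROLES = {
    "Act": "Actor", "Ag": "Agent", "Pt": "Patient", "Exp": "Experiencer",
    "Stim": "Stimulus", "Caus": "Causer", "Asset": "Asset", "Attr": "Attribute",
    "Benef": "Beneficiary", "Loc": "Location", "Dest": "Destination",
    "Source": "Source", "Extent": "Extent", "Inst": "Instrument",
    "Mater-Prod": "Material-Product", "Prod": "Product", "Pred": "Predicate",
    "Recip": "Recipient", "Theme": "Theme", "Time": "Time", "Topic": "Topic",
    "Co_Theme": "Co_Theme",
}


def expand(label: str, table: dict) -> str:
    """Return the full name for a label, or the label itself if it is unknown."""
    return table.get(label, label)


print(expand("Elab", DISCOURSE_RELATIONS))
print(expand("Benef", THEMATIC_ROLES))
```

The two tables are kept separate because Topic appears both as a discourse relation and as a thematic role.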
Below is a snapshot of one of the annotated texts with C58's annotation scheme.
The above example in English
P1: "At the same time, the defence will try to exploit testimonies like those given the previous day,"
P2: "when the eyewitnesses recognised the accused but, on the other hand, in a different role than the one imputed by the indictment."
P3: "And, since certain accusations have been imputed by confessions, the defence will bring back the request for higher officers of the Greek Police and doctors to be called as witnesses."
P4: "Though only a few believed the rumors about the tortures,"
P5: "the truth is that there is no more room left for the defence."