Discovering the Lexical Features of a Language
Eric Brill, Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104; email: brill@unagi.cis.upenn.edu
1 Introduction
This paper examines the possibility of automatically discovering the lexical fe…
found that part-of-speech statistics as well as general text statistics (e.g., average sentence length) are more effective than the traditional bag-of-words representation when classifying documents from multiple domains. This supports the notion that we can use non-lexical features to c…
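As a concrete illustration of such non-lexical features, here is a minimal Python sketch (not from the paper; the particular feature set and the function name `non_lexical_features` are invented for illustration) that computes a few general text statistics of the kind mentioned above:

```python
# Minimal sketch (not the paper's code): deriving simple non-lexical text
# statistics that could complement a bag-of-words representation.
import re

def non_lexical_features(text):
    """Return a few general text statistics for a document."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"\w+", text.lower())
    avg_sentence_len = len(tokens) / max(len(sentences), 1)
    avg_word_len = sum(len(t) for t in tokens) / max(len(tokens), 1)
    type_token_ratio = len(set(tokens)) / max(len(tokens), 1)
    return {
        "avg_sentence_length": avg_sentence_len,
        "avg_word_length": avg_word_len,
        "type_token_ratio": type_token_ratio,
    }

print(non_lexical_features("Short sentence. A slightly longer second sentence follows here."))
```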
a novel second-order distributional similarity measure, and the positive effect is especially relevant for out-of-domain data. Our findings suggest that selectional preferences have potential for improving a full system for Semantic Role Labeling.
1 Introduction
Semantic Role Labeling (SRL) systems usual…
for the i-th feature in cluster k are estimated in the maximum-entropy framework as in the baseline model. However, the mixture weights w_k can be optimized directly towards the translation evaluation metric, such as BLEU (Papineni et al., 2002), along with other usual costs (e.g. language model scores…
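Read as a worked equation, the model this excerpt describes can be sketched as a log-linear mixture; the notation below (feature functions h_{k,i}, per-cluster weights λ_{k,i}, mixture weights w_k, and additional costs g_j with weights μ_j) is a reconstruction from the excerpt, not the paper's own formulation:

```latex
% Sketch only: a log-linear mixture over K clusters, with per-cluster feature
% weights \lambda_{k,i} estimated as in a maximum-entropy baseline, mixture
% weights w_k tuned directly towards BLEU, and the other usual costs g_j
% (e.g. the language model score) weighted by \mu_j.
\begin{equation}
  \mathrm{score}(e \mid f) \;=\;
    \sum_{k=1}^{K} w_k \sum_{i} \lambda_{k,i}\, h_{k,i}(e, f)
    \;+\; \sum_{j} \mu_j\, g_j(e, f)
\end{equation}
```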
results in Section 3 shows that all of these figures are higher than their computer tutoring counterparts. With respect to predictive accuracy, Table 10 shows our results for the agreed data. A comparison with Tables 4-6 shows that overall, the human-human data yields increased performance across all fe…
(like ACE), where we have a lot of relation types with relatively small amounts of annotated data. Our system certainly benefits from features derived from parse trees, but it is not inextricably linked to them. Even using very simple lexical features, we obtained high precision extra…
al. (2009) or Shriberg et al. (1998). They include: F0 statistics (mean, stdev, max, min) computed over the whole utterance and over the last 200ms; slopes computed from a linear regression to the F0 contour (over the whole utterance and last 200ms); initial and final slope values output from the stylizer…
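A rough Python sketch of how a few of these F0 features could be computed from a frame-level contour; the frame rate, the handling of the final 200 ms, and the function name `f0_features` are assumptions, and a real pipeline would also need to handle unvoiced frames:

```python
# Hedged sketch (not the authors' code): F0 statistics and regression slopes
# over the whole utterance and over the last 200 ms of a frame-level contour.
import numpy as np

def f0_features(f0, frame_ms=10):
    """f0: 1-D array of F0 values (Hz), one per frame of `frame_ms` ms."""
    f0 = np.asarray(f0, dtype=float)
    t = np.arange(len(f0)) * frame_ms / 1000.0        # frame times in seconds
    n_last = int(200 / frame_ms)                      # frames in the last 200 ms
    slope_all = np.polyfit(t, f0, 1)[0]               # Hz per second
    slope_last = np.polyfit(t[-n_last:], f0[-n_last:], 1)[0]
    return {
        "f0_mean": f0.mean(), "f0_std": f0.std(),
        "f0_max": f0.max(), "f0_min": f0.min(),
        "slope_utterance": slope_all, "slope_last_200ms": slope_last,
    }

print(f0_features([120, 122, 125, 124, 126, 130, 128, 127, 125, 124,
                   122, 121, 119, 118, 117, 116, 115, 114, 113, 112,
                   111, 110, 109, 108, 107]))
```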
B3, we also tried extending the features based on word IDs to those based on n-gram IDs, where n = 1, 2, 3. This greatly increased the number of lexical features but did not improve learning performance, most likely due to the limited amounts of training data coupled with the sparsity of…
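To make the n-gram ID features concrete, here is a small Python sketch (the feature-string format is invented); the sparsity issue the excerpt mentions shows up as the rapidly growing number of distinct feature keys:

```python
# Illustrative sketch: turning a token sequence into sparse n-gram ID features
# for n = 1, 2, 3.
from collections import Counter

def ngram_features(tokens, max_n=3):
    """Map each observed n-gram (n <= max_n) to its count in the sequence."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats["NGRAM_%d=%s" % (n, "_".join(tokens[i:i + n]))] += 1
    return feats

print(ngram_features("the lexical features of the data".split()))
```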
describe their DIRT system for extraction of paraphrase-like inference rules.
5 Evaluation
We selected a subset of the verbs annotated in the OntoNotes project (Chen, 2007) that had at least 50 instances. The resulting data set consisted of 46,577 instances of 217 verbs. The predominant sense baseline…
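For reference, a predominant-sense baseline of the kind mentioned here can be sketched as follows; the data format of (verb, sense) pairs and the function name are assumptions for illustration, not the authors' code:

```python
# Hedged sketch: a predominant-sense baseline assigns every test instance of
# a verb the sense that was most frequent for that verb in the training data.
from collections import Counter, defaultdict

def predominant_sense_baseline(train, test):
    """train/test: lists of (verb, sense) pairs; returns baseline accuracy."""
    counts = defaultdict(Counter)
    for verb, sense in train:
        counts[verb][sense] += 1
    predominant = {v: c.most_common(1)[0][0] for v, c in counts.items()}
    correct = sum(1 for v, s in test if predominant.get(v) == s)
    return correct / len(test)

train = [("call", "communicate"), ("call", "communicate"), ("call", "name")]
test = [("call", "communicate"), ("call", "name")]
print(predominant_sense_baseline(train, test))  # 0.5
```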
…quence and dependencies between variables, including the DDAs. Bui et al. (2009) were especially interested in whether DGMs better exploit non-lexical features. Fernández et al. (2008) obtained much more value from lexical than non-lexical features (and indeed n…
OBJECTIVES
The study is intended to achieve the following objectives:
- To examine the discourse features of school announcements in English in terms of their layout, lexical features, s…
they are trained and tested on different data sets but they achieve accuracy in a similar range. Of these systems, only the DAPPER system (De Felice and Pulman, 2008; De Felice and Pulman, 2009; De Felice, 2009) uses a parser, the C&C parser (Clark and Curran, 2007), to determine the head an…
It is a chance for us to explore some linguistic features, in terms of lexical and syntactic features, in English competition law and Vietnamese competition law to find out the si…
choice of many optional tags that can be associated with each DA. To deal with this problem, we used various scaled-down versions of the original tagset. … relations. We define … as the tag of the most recent spurt before … that is produced by Y and addresses X. This definition will help our multi-party analyses…
of overlapping features. SVMs learn binary classifiers, but the method can be extended to multi-class classification (Allwein et al., 2000; Kudo and Matsumoto, 2000). SVMs have been successfully applied to many NLP tasks since (Joachims, 1998), and specifically for base phrase chunking (Kud…
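One common way to extend binary SVMs to the multi-class case is a one-vs-rest reduction; the sketch below uses scikit-learn's LinearSVC purely for illustration and is not the specific construction of the cited papers:

```python
# Sketch of one-vs-rest multi-class classification built from binary SVMs.
import numpy as np
from sklearn.svm import LinearSVC

def train_one_vs_rest(X, y):
    """Train one binary SVM per class; return {label: classifier}."""
    return {c: LinearSVC().fit(X, (y == c).astype(int)) for c in np.unique(y)}

def predict_one_vs_rest(models, X):
    """Pick, per example, the class whose binary SVM scores highest."""
    labels = sorted(models)
    scores = np.column_stack([models[c].decision_function(X) for c in labels])
    return [labels[i] for i in scores.argmax(axis=1)]

X = np.array([[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.2], [0.5, 0.5]])
y = np.array(["NP", "NP", "VP", "VP", "PP"])
models = train_one_vs_rest(X, y)
print(predict_one_vs_rest(models, X))
```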
are labeled in the TDT-2 corpus and classifies them either as a story boundary or non-boundary. We form lexical chains from the TDT-2 ASR outputs by linking repeated words. Since words may also repeat across different stories, we limit the maximum distance between consecutive words within the…
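A simplified Python sketch of this chaining scheme; the `max_gap` threshold and the way chains are broken are assumptions used only to illustrate the idea of limiting the distance between consecutive repetitions of a word:

```python
# Rough sketch of the chaining idea: link repeated words into chains, breaking
# a chain when the gap between consecutive occurrences exceeds a limit.
def lexical_chains(tokens, max_gap=50):
    """Group positions of each repeated word into chains; a new chain starts
    whenever consecutive occurrences are more than `max_gap` tokens apart."""
    positions = {}
    for i, w in enumerate(tokens):
        positions.setdefault(w, []).append(i)
    chains = []
    for w, occs in positions.items():
        chain = [occs[0]]
        for p in occs[1:]:
            if p - chain[-1] <= max_gap:
                chain.append(p)
            else:
                if len(chain) > 1:
                    chains.append((w, chain))
                chain = [p]
        if len(chain) > 1:
            chains.append((w, chain))
    return chains

print(lexical_chains("a b a c a b d a".split(), max_gap=3))
```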
III. NEW DESIGN FEATURES
Steel Design Strength & Drift Controlled Optimization
- The American Petroleum Institute - API-RP 2A LRFD 1997
- The American Petroleum Institute - API-RP 2A WSD 2000
- Latticed Transmission Structures - ASCE-10-97 2000
- The Uniform Building Code - UBC-ASD 1997
- The…
According to Quirk [19, p.6], word classes are generally divided into two broad groups: lexical categories and non-lexical categories.
Lexical Categories   Example
Noun                 N
Verb                 V
Adjectiv…
THE PRAGMATIC FEATURES OF VIETNAMESE COMPANY SLOGANS
4.6.1 SHOWING MARKETING STRATEGIES OF A COMPANY
4.6.2 SHOWING A COMMITMENT OR A PROMISE OF THE COMPANY TO THE CUSTOMERS
4.6.3 SHOWING…
7 Analysis
In this section, we conduct some case studies to show how the proposed models improve translation accuracy by looking into the differences that they make on translation hypotheses. Table 6 displays a translation example which shows the difference between the baseline and the system enhanced with…