{"id":78,"date":"2008-10-30T15:47:32","date_gmt":"2008-10-30T13:47:32","guid":{"rendered":"http:\/\/elanguage.net\/blogs\/booknotices\/?p=78"},"modified":"2008-10-30T15:48:38","modified_gmt":"2008-10-30T13:48:38","slug":"c-oral-rom-integrated-reference-corpora-for-spoken-romance-languages","status":"publish","type":"post","link":"https:\/\/journals.linguisticsociety.org\/booknotices\/?p=78","title":{"rendered":"C-ORAL-ROM: Integrated reference corpora for spoken Romance languages."},"content":{"rendered":"<p class=\"MsoNormal\" style=\"margin-left: 18pt; text-indent: -18pt; line-height: 150%\"><strong><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">C-ORAL-ROM:<\/span><\/strong><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\"> Integrated reference corpora for spoken Romance languages. Ed. 
by <strong>Emanuela Cresti<\/strong> and <strong>Massimo Moneglia<\/strong>. (Studies in corpus linguistics 15.) Amsterdam: John Benjamins, 2005. Pp. xvii, 304, DVD. ISBN <a href=\"http:\/\/www.worldcat.org\/isbn\/902722286X\">902722286X<\/a>. $144 (Hb).<\/span><\/p>\n<p class=\"MsoNormal\" style=\"margin: 6pt 0cm 18pt; text-align: right; line-height: 150%\" align=\"right\"><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">Reviewed by <a href=\"http:\/\/www.fsu.edu\/~modlang\/divisions\/spanish\/gonzalez.html\"><strong>Carolina Gonz\u00e1lez<\/strong><\/a>, <em>Florida State University<\/em><\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: 150%\"><em><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">C-ORAL-ROM<\/span><\/em><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\"> presents corpora of spontaneous speech of French, Italian, Portuguese, and Spanish collected in Europe by researchers following the same guidelines. The volume, a collaborative effort, outlines the history of the project and the conventions and methodological issues relevant to its completion.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: 150%\"><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">Ch. 1 deals with the C-ORAL-ROM resource in general. 
The corpora consist of approximately 300,000 words for each of the four languages and include recordings and texts from a wide variety of contexts, genres, and dialogue structures. Available on the accompanying DVD and through the ELDA Catalogue (<a href=\"http:\/\/www.elda.fr\/\">http:\/\/www.elda.fr<\/a>), the corpora are presented in a multimedia format that includes both textual and acoustic information. The textual information, which follows the CHAT format (MacWhinney 1994), is prosodically tagged and annotated for part of speech. A key feature of C-ORAL-ROM is text-to-speech alignment, which depends on the identification of each utterance in the resource through prosodic cues. The resource provides text-to-speech synchronization of roughly 130 hours of spontaneous speech.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: 150%\"><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">Chs. 2\u20135 focus on the subcorpora for each language, and Ch. 6 discusses the important role of the utterance, defined as an \u2018expression marked by a prosodic terminal break\u2019 (210), in speech-corpora analysis. Finally, the appendix briefly presents the results of the external evaluation of the prosodic annotation used in the project.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: 150%\"><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">The DVD offers several tools. The corpus metadata provides metalinguistic information for each language sample. 
Glossaries are included for Italian regional forms and Spanish nonstandard forms. Text-to-speech alignment is provided through a demo version of the WinPitch Corpus (\u00a9 Philippe Martin), where recordings can be listened to and analyzed acoustically with the help of waveforms, spectrograms, and pitch tracking. This is especially helpful for prosodic analysis. A text search engine is also provided, through a demo version of Contextes (1.1.0) (\u00a9 Jean V\u00e9ronis). Every match returned for word or lemma searches includes a partial context; the script where the match appears can be opened with a single click. Frequency lists for words and lemmas for each of the subcorpora are also included on the DVD, together with tables and comparative diagrams of relevant linguistic measures and strategies in each language.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: 150%\"><span style=\"font-size: 11pt; line-height: 150%; font-family: Arial\" lang=\"EN-US\">Overall, this is a great resource for researchers in the areas of Romance linguistics, corpus linguistics, syntax, second language acquisition, and speech and prosody research. The operation of the DVD and the tools included on it is quite straightforward. The exception is the WinPitch Corpus, for which a troubleshooting section and additional information on its operation would be welcome. An online tutorial for this program is announced at <a href=\"http:\/\/lablita.dit.unifi.it\/coralrom\/\">http:\/\/lablita.dit.unifi.it\/coralrom\/<\/a>. 
Finally, it is unfortunate that one of the key options in Contextes, playing the context for each match returned through the search function, is not supported in the demo version distributed on the DVD.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>C-ORAL-ROM: Integrated reference corpora for spoken Romance languages. Ed. by Emanuela Cresti and Massimo Moneglia. (Studies in corpus linguistics 15.) Amsterdam: John Benjamins, 2005. Pp. xvii, 304, DVD. ISBN 902722286X. $144 (Hb). Reviewed by Carolina Gonz\u00e1lez, Florida State University C-ORAL-ROM presents corpora of spontaneous speech of French, Italian, Portuguese, and Spanish collected in Europe by 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[],"tags":[],"_links":{"self":[{"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=\/wp\/v2\/posts\/78"}],"collection":[{"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=78"}],"version-history":[{"count":0,"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=\/wp\/v2\/posts\/78\/revisions"}],"wp:attachment":[{"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=78"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=78"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/journals.linguisticsociety.org\/booknotices\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=78"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}